

slide-1
SLIDE 1

PhD thesis proposal

Presented by: Leonel D. Rozo C.

Advisors: Carme Torras, Pablo Jiménez

Barcelona, Spain

September 29th, 2008

Robot coaching of manipulation tasks using haptics and vision

slide-2
SLIDE 2

Outline

1. Objectives
2. State of the art
3. Expected contributions
4. Work planning
5. Resources
6. Conclusions

slide-3
SLIDE 3

Objectives

• Main objective
  • To provide robots with manipulation skills acquired from demonstrated examples given by a human who acts as a coach.

slide-4
SLIDE 4

Objectives

• Specific objectives
  • To analyze (and adapt) different learning algorithms based on robot learning by demonstration, with the aim of finding those that best suit the features of manipulation tasks.
    • Incremental learning
    • Fast learning
    • Robust learning
  • To identify the relevant features of manipulation tasks from sensory information, with the aim of including them as input to the learning stage.
    • What to imitate?

slide-5
SLIDE 5

Objectives

• To develop a set-up where robot learning of manipulation tasks by demonstration will take place. It will be composed of a robot (the learner) teleoperated through a haptic device driven by a human user (the coach).
• To fuse haptic and visual information to improve and speed up the learning stage.

slide-6
SLIDE 6

State of the art

Introduction

• Why should robots learn?
• Two main approaches exist for endowing robots with learning capabilities:
  • Self-learning
  • Learning from examples

slide-7
SLIDE 7

State of the art

LbD – History and concepts

• Learning by demonstration
  • Symbolic approaches
    • Exact reproduction of the demonstrated task (playback)
    • State-action-state representation
    • Unsuitable approach when uncertainty appears
    • If-then rules

(A. Billard et al. 2008)

slide-8
SLIDE 8

State of the art

LbD – History and concepts

• Inclusion of machine learning in programming by demonstration
  • Supervised methods
    • A training dataset composed of labelled inputs and desired outputs is given.
    • Goal: given a new input, predict its corresponding output.
    • Some methods are:
      • Artificial neural networks
      • Decision trees
      • Bayesian statistics
      • Gaussian process regression
      • Nearest neighbour
      • Support vector machines
  • Unsupervised methods
    • An input dataset is presented, but no feedback about it is given.
    • Goal: to find a representation of particular input patterns that reflects the statistical structure of the overall collection of input patterns.
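As a minimal sketch of the supervised setting, the nearest-neighbour method listed above can be written in a few lines. The function name, the demonstration data and the gripper labels are hypothetical and chosen only for illustration; they do not come from the proposal.

```python
import math


def nearest_neighbour_predict(train_inputs, train_outputs, query):
    """Return the output paired with the training input closest to `query`."""
    best_index = min(
        range(len(train_inputs)),
        key=lambda i: math.dist(train_inputs[i], query),
    )
    return train_outputs[best_index]


# Hypothetical labelled demonstrations: (x, y) position -> gripper command.
demo_inputs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
demo_outputs = ["open", "close", "open"]

# Predict the output for a new, unseen input.
prediction = nearest_neighbour_predict(demo_inputs, demo_outputs, (0.9, 0.1))
```

The training set is the labelled input/output pairs, and prediction for a new input simply reuses the output of the closest stored example.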

slide-9
SLIDE 9

State of the art

LbD – History and concepts

• Imitation learning
  • What is imitation?
    • Biological inspiration: from an act witnessed, learn to do an act (Thorndike).
    • Robotics: imitation takes place when an agent learns a behaviour from observing the execution of that behaviour by a teacher (Bakker and Kuniyoshi, 1996).
  • Current challenges

(P. Bakker & Y. Kuniyoshi, 1996)

slide-10
SLIDE 10

State of the art

LbD – History and concepts

• Movement primitives (MP)
  • Inductive approach
    • MPs are sequences of actions that accomplish a complete goal-directed behaviour and allow a compact state-action representation (S. Schaal, 1999).

slide-11
SLIDE 11

State of the art

LbD – History and concepts

• Movement primitives (MP)
  • Biological inspiration
    • A behaviour-based control approach (Mataric)
      • Use a control system based on a set of behaviours (MPs), which are real-time processes that take inputs from sensors or other behaviours and send output commands to effectors or to other behaviours in the system.
    • How to interpret and understand observed behaviours?
    • How to integrate the perception and motion control systems to reconstruct what was observed?

(Computational Neuroscience and Humanoid Robotics Department, ATR laboratories)

slide-12
SLIDE 12

State of the art

LbD – History and concepts

• Control policies
  • The motor control problem can be conceived as finding a task-specific control policy.
  • Imitation learning can be defined as the problem of how control policies can be learned by observing a demonstration:
    • Imitation by direct policy learning
    • Imitation by learning policies from demonstrated trajectories
    • Imitation by model-based policy learning

[Figure labels: motor commands, policy, states, algorithm parameters]
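To make direct policy learning concrete, here is a minimal sketch that fits a policy mapping states to motor commands from demonstrated state-command pairs. It assumes a one-dimensional state and a linear policy, which is a simplification for illustration only; the function names and the demonstration data are hypothetical.

```python
def fit_linear_policy(states, commands):
    """Least-squares fit of u = gain * x + bias to demonstrated (x, u) pairs."""
    n = len(states)
    mean_x = sum(states) / n
    mean_u = sum(commands) / n
    cov_xu = sum((x - mean_x) * (u - mean_u) for x, u in zip(states, commands))
    var_x = sum((x - mean_x) ** 2 for x in states)
    gain = cov_xu / var_x
    bias = mean_u - gain * mean_x
    return gain, bias


# Hypothetical demonstration in which the coach applies u = -2 x + 1.
demo_states = [0.0, 0.5, 1.0, 1.5, 2.0]
demo_commands = [-2.0 * x + 1.0 for x in demo_states]

# The fitted gain and bias play the role of the policy parameters.
gain, bias = fit_linear_policy(demo_states, demo_commands)


def policy(state):
    """Learned control policy: maps a state to a motor command."""
    return gain * state + bias
```

Once fitted, `policy` can be queried at states that were never demonstrated, which is what distinguishes a learned policy from simple playback.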

slide-13
SLIDE 13

State of the art

LbD – History and concepts

• What to imitate? – Learning invariances over demonstrations
  • Finding the features of the task that are relevant to its reproduction
    • Those that appear most repeatedly across different demonstrations of the task, i.e., the invariants in time (Billard et al., 2004)
  • Categorization of human actions (Dillman, 2004):
    • Performative
    • Commenting
    • Commanding

[Figure labels: imitation task, observation process, execution process (Dillman, 2004)]
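One simple way to look for invariants across demonstrations is to flag the features whose values vary little from one demonstration to the next at every time step. The sketch below assumes time-aligned demonstrations of equal length and uses a fixed variance threshold; both are simplifying assumptions, and the data and names are hypothetical rather than taken from the cited works.

```python
from statistics import pvariance


def invariant_features(demonstrations, threshold):
    """Return indices of features whose variance across demonstrations
    stays at or below `threshold` at every time step.

    `demonstrations[d][t][f]` is feature f at time step t of demonstration d.
    """
    n_steps = len(demonstrations[0])
    n_features = len(demonstrations[0][0])
    invariant = []
    for f in range(n_features):
        worst = max(
            pvariance([demo[t][f] for demo in demonstrations])
            for t in range(n_steps)
        )
        if worst <= threshold:
            invariant.append(f)
    return invariant


# Hypothetical demonstrations: feature 0 is repeated consistently,
# feature 1 changes arbitrarily between demonstrations.
demos = [
    [(0.0, 0.10), (1.0, 0.50), (2.0, 0.90)],
    [(0.0, 0.40), (1.1, 0.20), (2.0, 0.30)],
    [(0.1, 0.80), (1.0, 0.90), (1.9, 0.10)],
]
relevant = invariant_features(demos, threshold=0.01)
```

Features that pass the test are candidates for what the robot should reproduce; the rest can be treated as free.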

slide-14
SLIDE 14

State of the art

LbD – History and concepts

• Improving imitation learning
  • A task learned by imitation can be improved, corrected or refined in two ways:
    • By using reinforcement learning
      • The given demonstrations confine the search in the state-action space to a smaller subspace, so RL focuses on the areas indicated by the demonstration data.
      • This approach is based on a self-improvement process, in which the robot improves the learned skill by interacting with its environment.

(A. Billard et al. 2008)

slide-15
SLIDE 15

State of the art

LbD – History and concepts

• By using active teaching
  • The action learned by imitation is corrected or refined with the teacher's support.
  • The information goes from the teacher to the robot.
  • The information flow is bidirectional because a social activity is being carried out.

S. Calinon and A. Billard. What is the teacher's role in robot programming by demonstration? Toward benchmarks for improved learning. Interaction Studies, 8(3):441-464, 2007.

slide-16
SLIDE 16

State of the art

LbD – History and concepts

• Incremental learning
  • Whenever new data are generated, they should be included in the learning framework:
    • New demonstrations
    • Corrections
    • Refinements
  • It is necessary to work with learning algorithms that meet at least the following requirements:
    • Online learning
    • Inexpensive computations
    • Robustness to the interference problem
    • Fast learning in high-dimensional state-action spaces
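The online-learning and inexpensive-computation requirements can be illustrated with recursive least squares, which updates a linear model one sample at a time without storing past data. This is a generic textbook technique used here as an example, not the algorithm proposed in the thesis; the class name and the data are hypothetical.

```python
class RecursiveLeastSquares:
    """Online linear regression: one cheap update per new sample."""

    def __init__(self, n_features, initial_covariance=1e6):
        self.weights = [0.0] * n_features
        # A large initial covariance means "no prior knowledge".
        self.P = [
            [initial_covariance if i == j else 0.0 for j in range(n_features)]
            for i in range(n_features)
        ]

    def predict(self, features):
        return sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, target):
        n = len(features)
        # P * phi, reused for the gain and the covariance update.
        p_phi = [
            sum(self.P[i][j] * features[j] for j in range(n)) for i in range(n)
        ]
        denom = 1.0 + sum(features[i] * p_phi[i] for i in range(n))
        gain = [v / denom for v in p_phi]
        error = target - self.predict(features)
        self.weights = [w + g * error for w, g in zip(self.weights, gain)]
        self.P = [
            [self.P[i][j] - gain[i] * p_phi[j] for j in range(n)]
            for i in range(n)
        ]


# Hypothetical data stream generated by y = 3 x + 0.5, fed sample by sample.
model = RecursiveLeastSquares(n_features=2)
for step in range(10):
    x = float(step)
    model.update([x, 1.0], 3.0 * x + 0.5)
```

Each update costs a fixed amount of work regardless of how many samples were seen before, so new demonstrations, corrections and refinements can be folded in as they arrive.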

slide-17
SLIDE 17

State of the art

LbD – History and concepts

• Locally weighted learning
  • LWL methods approximate nonlinear functions by means of piecewise linear models
  • Memory-based
    • Locally weighted regression (LWR)
    • Locally weighted partial least squares (LWPLS)
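A minimal sketch of memory-based locally weighted regression for a one-dimensional input is given below. It assumes a Gaussian kernel and a fixed, hand-picked bandwidth; the function name and the sine-wave data are hypothetical and only serve to show a nonlinear function being approximated by a local linear fit around each query.

```python
import math


def lwr_predict(xs, ys, query, bandwidth):
    """Fit a weighted straight line around `query` and evaluate it there."""
    # Gaussian kernel: stored samples close to the query get the largest weights.
    weights = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    cov = sum(
        w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys)
    )
    var = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    slope = cov / var
    return mean_y + slope * (query - mean_x)


# Hypothetical stored samples of a nonlinear function (the "memory").
xs = [i * 0.1 for i in range(63)]
ys = [math.sin(x) for x in xs]

estimate = lwr_predict(xs, ys, query=1.0, bandwidth=0.2)
```

All training samples are kept and a new local model is fitted for every query, which is what makes the method memory-based. The bandwidth is one of the open parameters that has to be chosen by hand in this sketch.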

slide-18
SLIDE 18

State of the art

LbD – History and concepts

• Non-memory-based
  • Receptive field weighted regression (RFWR)
  • Locally weighted projection regression (LWPR)
• LWPR is an incremental learning algorithm that can deal with high-dimensional data streams. In addition, it is computationally cheap and numerically robust.
• SHORTCOMING: too many open parameters have to be tuned manually.

(S. Schaal & C. Atkeson, 1998) (S. Vijayakumar & S. Schaal, 2000)

slide-19
SLIDE 19

State of the art

LbD – History and concepts

• LWL-based Bayesian learning
  • These methods address the problem of manually tuning the open parameters in LWL algorithms
  • Bayesian locally weighted regression (BLWR)
    • It treats all open parameters probabilistically and learns the appropriate local regime for each linearization problem, based on the LWR approach.
    • It is a Bayesian formulation of spatially local adaptive kernels for LWR.
  • Randomly varying coefficient (RVC)
    • Probabilistic method based on the paradigm of Bayesian probabilistic online learning
    • It treats each open parameter in LWPR as a probability distribution
• Gaussian processes
  • Sparse online Gaussian processes (SOGP)
• Incremental GMM
  • Direct update method
    • It is based on the temporal coherence properties of data streams.
    • Smooth variation in time is assumed, so that the GMM parameters can be adjusted when new data are observed.
    • The problem is reformulated for a generic observation of multiple datapoints.
  • Generative method
    • It uses Expectation-Maximization performed on data generated by GMR.

slide-20
SLIDE 20

State of the art

LbD – History and concepts

• Coaching
  • It can be divided into two processes:
    • Imitation learning
      • Observation
      • Execution
    • Active teaching
      • Observation and evaluation
      • Corrections and refinements
  • It allows ...
    • to acquire new knowledge
    • to focus attention on relevant task features
    • to give a strategy for correction
    • to help to iteratively define the characteristics of a successful outcome

(A. Billard et al. 2008)

slide-21
SLIDE 21

State of the art

LbD – Entire systems

Systems based on vision
• Tasks: manipulation tasks, playing air hockey, gestures, human motion
• Methods: optimization criteria, Bayesian methods, HMM, PCA, Gaussian processes

slide-22
SLIDE 22

State of the art

LbD – Entire systems

• Learning basketball officials' signals
  • Motion sensors
  • Preprocessing stage using PCA
  • Actions are encoded probabilistically using GMM
  • GMR is applied to reconstruct a general form of the signals

S. Calinon and A. Billard. Incremental learning of gestures by imitation in a humanoid robot. 2007
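The GMR step can be sketched as follows for a one-dimensional time input and a one-dimensional output. The sketch assumes that a GMM over (time, value) pairs has already been fitted; the component parameters below are hypothetical placeholders, not values from the cited work.

```python
import math


def gaussian_pdf(value, mean, variance):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (value - mean) ** 2 / variance) / math.sqrt(
        2.0 * math.pi * variance
    )


def gmr(t, components):
    """Expected output at time `t` given a GMM over (t, x) pairs.

    Each component is a dict with keys: prior, mean_t, mean_x, var_t, cov_tx.
    """
    # Responsibility of each component for the query time.
    resp = [
        c["prior"] * gaussian_pdf(t, c["mean_t"], c["var_t"]) for c in components
    ]
    total = sum(resp)
    # Blend the per-component conditional means with those responsibilities.
    return sum(
        (r / total)
        * (c["mean_x"] + c["cov_tx"] / c["var_t"] * (t - c["mean_t"]))
        for r, c in zip(resp, components)
    )


# Hypothetical two-component model of a gesture signal.
components = [
    {"prior": 0.5, "mean_t": 0.0, "mean_x": 0.0, "var_t": 1.0, "cov_tx": 0.5},
    {"prior": 0.5, "mean_t": 2.0, "mean_x": 4.0, "var_t": 1.0, "cov_tx": 0.5},
]
midpoint = gmr(1.0, components)
```

Evaluating `gmr` over a grid of time values yields a smooth generalized trajectory from the probabilistic encoding.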

slide-23
SLIDE 23

State of the art

LbD – Entire systems

Systems based on haptics
• Tasks: assembly tasks, virtual environments
• Methods: optimization criteria, fuzzy logic, HMM, LWR, neural networks

slide-24
SLIDE 24

State of the art

LbD – Entire systems

• Virtual environments
  • Learning the peg-in-hole insertion task
    • Virtual scene where the user manipulates the peg by moving a haptic device
    • A preprocessing stage based on Dillman's criteria was carried out to remove noise from the training data
    • The position and orientation of the peg, and the forces/torques generated, compose the input training data
    • An HMM was applied to identify and estimate the contact states
    • During the physical implementation, LWR was used to learn the trajectory in each state of the insertion procedure

(S. Dong & F. Naghdy, 2007)
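Estimating contact states with an HMM can be illustrated by Viterbi decoding over a sequence of discretized force readings. The two states, the two observation symbols and all probabilities below are hypothetical and chosen only to make the example runnable; they are not the model used by Dong and Naghdy.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence (log-space Viterbi)."""
    best = [
        {
            s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
            for s in states
        }
    ]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at this time step.
            prev, score = max(
                ((p, best[-1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            scores[s] = score + math.log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Backtrack from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical contact model: low or high measured force per time step.
states = ["free", "contact"]
start_p = {"free": 0.9, "contact": 0.1}
trans_p = {
    "free": {"free": 0.8, "contact": 0.2},
    "contact": {"free": 0.1, "contact": 0.9},
}
emit_p = {
    "free": {"low": 0.9, "high": 0.1},
    "contact": {"low": 0.2, "high": 0.8},
}
contact_sequence = viterbi(
    ["low", "low", "high", "high"], states, start_p, trans_p, emit_p
)
```

With the states segmented in this way, a separate trajectory model could then be learned for each contact state.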

slide-25
SLIDE 25

State of the art

LbD – Entire systems

Teleoperated systems
• Tasks: manipulation tasks, grasping tasks, playing soccer, robot-assisted surgery
• Methods: decision trees, nearest neighbor, Bayesian methods, manifold learning, neural networks, Gaussian processes, LWL, LWPR, scaffolding

slide-26
SLIDE 26

State of the art

LbD – Entire systems

• Haptically guided teleoperation for learning manipulation tasks
  • A mobile robot manipulator with a camera placed on its end-effector
  • The human user teleoperates the robot through a haptic device
    • He/she is haptically guided using information provided by the vision system
  • Hierarchical feedforward neural networks are used in the learning stage
    • Backpropagation
• Learning to play soccer
  • A robot dog learns a variety of tasks related to robot soccer
  • Training data are composed of vision and proprioceptive information
  • The learning process is carried out using:
    • LWPR
    • SOGP

(A. Howard & C. Park, 2007) (D. Grollman & O. Jenkins, 2008)

slide-27
SLIDE 27

State of the art

LbD – Entire systems

Column groups: Vision (Robot, Coach), Haptics (Robot, Coach)

Bentivegna et al. (2004): X X(p)
Bentivegna et al. (2002): X X
Calinon et al. (2007-1): X X
Calinon et al. (2007-2): X X
Calinon et al. (2007-3): X X X(p)
Chen et al. (2002): X X X
Dong et al. (2007): X X X
Fernández et al. (2003): X X
Grollman et al. (2008): X X X(p)
Howard et al. (2007): X X X(p) X
Jenkins et al. (2006): X X
Kaiser et al. (1996): X X
Kang et al. (1995): X X X
Kuniyoshi et al. (1994): X X
Lockerd et al. (2004): X X
Mayer et al. (2008): X X X
Peters II et al. (2003): X X(p)
Riley et al. (2006): X X
Shon et al. (2007): X X

slide-28
SLIDE 28

Expected contributions

• To exploit the fusion of haptic and visual information in robot learning by demonstration
  • Providing the robot with haptic and vision senses to improve and speed up the learning stage
    • Haptics and vision will be the information sources for generating the training data
  • Providing the human user with haptic feedback to obtain or improve a bidirectional information flow
    • Better samples of the tasks
  • Providing the coach with virtual guides through haptic feedback
    • The demonstrations will be restricted to those that are relevant to the task

slide-29
SLIDE 29

Expected contributions

• To develop a learning algorithm that complies with all the learning requirements that manipulation tasks demand
  • Both haptic and vision information will be taken into account in the input training data
    • What to imitate?
  • Providing a suitable incremental learning algorithm for the coaching structure
    • When new samples are given
    • When corrections and refinements are carried out
    • How to imitate?

slide-30
SLIDE 30

Work planning

1. To develop a setup composed of a robot teleoperated by a human provided with visual and haptic feedback
2. To review the state of the art on learning by demonstration
3. To study and implement incremental learning algorithms
4. To propose an incremental learning algorithm for the Coaching framework
5. To study and implement robot Coaching in manipulation tasks
6. To propose a suitable robot Coaching framework for learning manipulation skills
7. To carry out experimental tests
8. To write and defend the PhD thesis

slide-31
SLIDE 31

Resources

slide-32
SLIDE 32

Conclusions

• Most learning by demonstration frameworks do not allow the demonstrated task to be improved
  • Reinforcement learning
  • Active teaching
• Incremental learning methods are necessary to achieve a suitable coaching framework
• It is necessary to take the manipulation task features into account
  • Fast learning
  • Robust learning
• Few works have studied and integrated vision and haptics in a perception system for both the robot and the coach
  • Improving the bidirectional information flow
  • Relevant variables in manipulation can be taken into account (forces, torques, positions, velocities, etc.)