Programming by Demonstration: Some recent challenges (Aude G. Billard)



SLIDE 1

http://lasa.epfl.ch

Programming by Demonstration: Some recent challenges

Aude G. Billard LASA Laboratory EPFL

SLIDE 2

Generalizing: Learning a control law

Learn a control law that ensures the robot reaches the target even when perturbed, and that it follows a particular dynamics.

SLIDE 3

Learn a control law from examples

Time-invariant dynamical system (DS) with a stable attractor:

$\dot{x} = f(x)$

$\dot{x}^* = f(x^*) = 0$

$x^*$: target

[Figure: axes $x_1$, $x_2$]
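As an illustration of what a time-invariant DS with a stable attractor buys you, here is a minimal sketch. It assumes a hand-picked linear law $f(x) = A(x - x^*)$; the matrix `A`, the target, the step size, and the perturbation are all hypothetical values chosen for the example and do not come from the slides.

```python
import numpy as np

def f(x, x_star, A):
    """Time-invariant control law: the velocity depends only on the current state."""
    return A @ (x - x_star)

def rollout(x0, x_star, A, dt=0.01, steps=2000, push_at=None, push=None):
    """Integrate x_dot = f(x) with forward Euler, optionally pushing the state once."""
    x = np.array(x0, dtype=float)
    for t in range(steps):
        if push_at is not None and t == push_at:
            x = x + push                      # external perturbation
        x = x + dt * f(x, x_star, A)
    return x

# Hypothetical target and dynamics; A + A^T is negative definite here.
x_star = np.array([1.0, 2.0])
A = np.array([[-2.0, 1.0],
              [-1.0, -2.0]])

x_end = rollout([-3.0, 4.0], x_star, A)
x_end_pushed = rollout([-3.0, 4.0], x_star, A, push_at=300, push=np.array([2.0, -1.5]))
print(x_end, x_end_pushed)
```

Because the law has no explicit time dependence, the perturbed rollout needs no re-planning: it simply continues from wherever the push left the state and still converges to the target.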

SLIDE 4

Learn a control law from examples

Make $N$ observations of the state of the system: $\{x^i, \dot{x}^i\}$, $i = 1 \ldots N$.

Learn $p(x, \dot{x}) = \sum_{k=1}^{K} w_k \, \mathcal{N}(x, \dot{x}; \mu^k, \Sigma^k)$: joint density (mixture of Gaussians) describing the distribution of velocity in state space.

[Figure: axes $x_1$, $x_2$]
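To make the joint density concrete, the following sketch evaluates a mixture of Gaussians over the stacked vector $(x, \dot{x})$. The function names and the single-component example are assumptions made for illustration; fitting the weights, means, and covariances from data (for example with EM) is not shown.

```python
import numpy as np

def gaussian_pdf(z, mu, sigma):
    """Density of a multivariate Gaussian N(z; mu, sigma)."""
    d = len(mu)
    diff = z - mu
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(sigma))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(sigma, diff)) / norm)

def joint_density(x, x_dot, weights, means, covs):
    """p(x, x_dot) = sum_k w_k N([x, x_dot]; mu_k, Sigma_k)."""
    z = np.concatenate([x, x_dot])
    return sum(w * gaussian_pdf(z, m, c) for w, m, c in zip(weights, means, covs))

# Hypothetical one-component model over a 1-D state and its 1-D velocity.
p_at_mean = joint_density(np.array([0.0]), np.array([0.0]),
                          [1.0], [np.zeros(2)], [np.eye(2)])
print(p_at_mean)
```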

SLIDE 5

Learn a control law from examples

[Figure: axes $x_1$, $x_2$]

SLIDE 6

Learn a control law from examples

$p(x) \sim \mathcal{N}(x; \mu, \Sigma)$

[Figure: axes $x_1$, $x_2$]

SLIDE 7

Learn a control law from examples

Compute (analytical expression for f):

$f(x) = E\{p(\dot{x} \mid x)\} = \sum_{k=1}^{K} h_k(x)\,(A_k x + b_k)$

[Figure: axes $x_1$, $x_2$]
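The analytical expression for $f$ can be sketched with standard Gaussian mixture regression. The block below assumes the usual conditioning formulas, $A_k = \Sigma^k_{\dot{x}x}(\Sigma^k_{xx})^{-1}$ and $b_k = \mu^k_{\dot{x}} - A_k \mu^k_x$, with $h_k(x)$ the normalized responsibility of component $k$ for the state $x$. The function name and the example parameters are hypothetical.

```python
import numpy as np

def gmr_velocity(x, weights, means, covs):
    """f(x) = sum_k h_k(x) (A_k x + b_k), from a joint mixture over [x, x_dot]."""
    d = len(x)
    resp, local = [], []
    for w, mu, sigma in zip(weights, means, covs):
        mu_x, mu_xd = mu[:d], mu[d:]
        s_xx, s_xdx = sigma[:d, :d], sigma[d:, :d]
        A = s_xdx @ np.linalg.inv(s_xx)        # local linear gain
        b = mu_xd - A @ mu_x                   # local offset
        diff = x - mu_x
        dens = np.exp(-0.5 * diff @ np.linalg.solve(s_xx, diff)) / np.sqrt(
            (2.0 * np.pi) ** d * np.linalg.det(s_xx))
        resp.append(w * dens)                  # unnormalized h_k(x)
        local.append(A @ x + b)
    h = np.array(resp) / np.sum(resp)          # mixing weights h_k(x), summing to 1
    return h @ np.array(local)

# Hypothetical single component: velocity negatively correlated with position.
sigma = np.array([[1.0, -0.5],
                  [-0.5, 1.0]])
v = gmr_velocity(np.array([2.0]), [1.0], [np.zeros(2)], [sigma])
print(v)
```

With one component the responsibilities reduce to $h_1(x) = 1$, so the result is the plain linear law $A_1 x + b_1$.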

SLIDE 8

Learn a control law from examples

Compute (analytical expression for f):

$f(x) = E\{p(\dot{x} \mid x)\} = \sum_{k=1}^{K} h_k(x)\,(A_k x + b_k)$

Determine the parameters of the density by solving a constrained optimization problem (maximize the likelihood under stability constraints).

Stability constraints in terms of the Gaussian parameters, for all $k = 1, \ldots, K$ (with the target $x^*$ taken as the origin):

(a) $\mu^k_{\dot{x}} - \Sigma^k_{\dot{x}x} (\Sigma^k_{xx})^{-1} \mu^k_x = 0$

(b) $\Sigma^k_{\dot{x}x} (\Sigma^k_{xx})^{-1} + (\Sigma^k_{xx})^{-1} (\Sigma^k_{\dot{x}x})^T \prec 0$

[Figure: axes $x_1$, $x_2$]

Khansari and Billard, SEDS, IEEE TRO 2011
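A small checker makes the stability constraints tangible. This sketch assumes the sufficient conditions stated in the SEDS paper, namely $b_k = -A_k x^*$ and $A_k + A_k^T$ negative definite for every component, with $A_k$ and $b_k$ obtained from the Gaussian parameters by standard conditioning. It only verifies a given parameter set; the constrained likelihood maximization itself is not shown, and the function name is hypothetical.

```python
import numpy as np

def seds_conditions_hold(means, covs, d, x_star, tol=1e-8):
    """Check b_k = -A_k x* and A_k + A_k^T negative definite for all components."""
    for mu, sigma in zip(means, covs):
        A = sigma[d:, :d] @ np.linalg.inv(sigma[:d, :d])
        b = mu[d:] - A @ mu[:d]
        if not np.allclose(b, -A @ x_star, atol=tol):
            return False                        # attractor is not at x*
        if np.max(np.linalg.eigvalsh(A + A.T)) >= 0.0:
            return False                        # motion may diverge
    return True

# Hypothetical 1-D examples with the target at the origin.
x_star = np.zeros(1)
stable = [np.array([[1.0, -0.5], [-0.5, 1.0]])]     # A = -0.5
unstable = [np.array([[1.0, 0.5], [0.5, 1.0]])]     # A = +0.5
ok = seds_conditions_hold([np.zeros(2)], stable, 1, x_star)
bad = seds_conditions_hold([np.zeros(2)], unstable, 1, x_star)
print(ok, bad)
```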

SLIDE 9

Generalizing: Learning a control law

Other examples of complex dynamics that can be estimated through SEDS.

Stability at attractor

[Figure: axes $x_1$, $x_2$]

Khansari and Billard, SEDS, IEEE TRO 2011

SLIDE 10

Generalizing: Learning a control law

Other examples of complex dynamics that can be estimated through SEDS.

Stability at attractor

[Figure: axes $x_1$, $x_2$]

Khansari and Billard, SEDS, IEEE TRO 2011

SLIDE 11

Learning motion with non-zero velocity at target

Extend the SEDS model with modulation of the speed at the target.

Kronander, Khansari and Billard, IROS 2011, JTSC Best Paper Award

SLIDE 12

Learning coupling across dynamical systems

Learn stable control laws for the arm and for the fingers separately. Couple the two systems to allow adequate adaptation to perturbations.

Shukla and Billard, RSS 2011

SLIDE 13

Learning coupling across dynamical systems

Learn stable control laws for the arm and for the fingers separately. Couple the two systems to allow adequate adaptation to perturbations.

Shukla and Billard, RSS 2011

SLIDE 14

Catching Objects in Flight

SLIDE 15

What to Imitate?

Learning a skill is more than simply replaying a trajectory. It requires understanding what the skill is. To learn this, the teacher must show several demonstrations so that the learner can generalize across the set of examples.

Billard et al., Rob. and Aut. Systems, 2005; Calinon et al., IEEE SMC 2007

SLIDE 16

How to Imitate?

[Figure: Demonstrator, Imitator, "?"]

→ Find the closest solution according to some cost function

SLIDE 17

Key idea: The world is uncertain; learn about its uncertainty through probabilistic modeling of information.

Statistical model of the data:

$p(x, \dot{x}) = \sum_{i=1}^{K} w_i \, \mathcal{N}(x, \dot{x}; \mu^i, \Sigma^i)$

$E\{p(\dot{x} \mid x)\}$: the expectation gives a reference trajectory.

$\mathrm{var}\{p(\dot{x} \mid x)\}$: computing the variance provides crucial information.

[Figure: axes $x$, $\dot{x}$]
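For a single Gaussian component, both quantities have a closed form, which the sketch below computes. It assumes one joint Gaussian over $(x, \dot{x})$ rather than the full mixture (for a mixture, the component means and variances must additionally be blended by their responsibilities). The function name and the numbers are hypothetical.

```python
import numpy as np

def condition_velocity(x, mu, sigma):
    """Return E{x_dot | x} and var{x_dot | x} for one joint Gaussian over [x, x_dot]."""
    d = len(x)
    gain = sigma[d:, :d] @ np.linalg.inv(sigma[:d, :d])
    mean = mu[d:] + gain @ (x - mu[:d])           # reference velocity
    var = sigma[d:, d:] - gain @ sigma[:d, d:]    # remaining uncertainty
    return mean, var

# Hypothetical 1-D state with correlation 0.5 between position and velocity.
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
mean, var = condition_velocity(np.array([2.0]), np.zeros(2), sigma)
print(mean, var)
```

The conditional variance is smaller than the marginal variance of the velocity whenever position and velocity are correlated, which is what makes it informative about how tightly the demonstrations constrain the motion at a given state.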

SLIDE 18

Generalizing

Computing the variance provides crucial information. The variance
→ provides a notion of the feasible space of solutions
→ is used to compute a new path when the context changes

Generalizing: generate new trajectories that depart from the reference trajectory while remaining within the total variance.

[Figure: axes $x$, $\dot{x}$]

SLIDE 19

Generalizing

The variance
→ provides a notion of the feasible space of solutions
→ is used to compute a new path when the context changes

Cost function:

$H(\dot{x}) = \|\dot{x} - \hat{\dot{x}}\|$, where $\hat{\dot{x}} = E\{p(\dot{x} \mid x)\}$

$\min H(\dot{x})$ under the constraint $\dot{x} = J\dot{\theta}$ (inverse kinematics)

[Figure: axes $x$, $\dot{x}$]

SLIDE 20

Generalizing

The variance
→ provides a notion of the feasible space of solutions
→ is used to compute a new path when the context changes

Cost function:

$H(\dot{x}) = \|\dot{x} - \hat{\dot{x}}\|$

$\min H(\dot{x})$ under the constraint $\dot{x} = J\dot{\theta}$ (inverse kinematics)

[Figure: axes $x$, $\dot{x}$; annotation: Impossible solution]

SLIDE 21

Generalizing

The variance
→ provides a notion of the feasible space of solutions
→ is used to compute a new path when the context changes

Cost function:

$H(\dot{x}) = (\dot{x} - \hat{\dot{x}})^T \, \Sigma(\dot{x})^{-1} \, (\dot{x} - \hat{\dot{x}})$, where $\Sigma(\dot{x}) = \mathrm{var}\{p(\dot{x} \mid x)\}$

$\min H(\dot{x})$ under the constraint $\dot{x} = J\dot{\theta}$ (inverse kinematics)

[Figure: axes $x$, $\dot{x}$]
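One way to read the variance-weighted cost is as a weighted least-squares problem: substitute the constraint $\dot{x} = J\dot{\theta}$ into the cost and solve for the joint velocities. The sketch below does this by whitening with the Cholesky factor of the covariance. It is an illustration under that reading, not necessarily the solver used in the cited work; the Jacobian, reference velocity, and covariance are hypothetical.

```python
import numpy as np

def weighted_ik(J, x_dot_ref, sigma):
    """Minimize (J q - x_ref)^T Sigma^{-1} (J q - x_ref) over joint velocities q."""
    L = np.linalg.cholesky(sigma)          # sigma = L L^T
    Jw = np.linalg.solve(L, J)             # whitened Jacobian
    rw = np.linalg.solve(L, x_dot_ref)     # whitened reference velocity
    q, *_ = np.linalg.lstsq(Jw, rw, rcond=None)
    return q

# Hypothetical robot with one joint moving both task dimensions equally,
# so the reference velocity (1, 0) cannot be tracked exactly.
J = np.array([[1.0],
              [1.0]])
x_dot_ref = np.array([1.0, 0.0])
sigma = np.diag([0.01, 1.0])               # dimension 0 was demonstrated consistently

q_weighted = weighted_ik(J, x_dot_ref, sigma)
q_plain = weighted_ik(J, x_dot_ref, np.eye(2))
print(q_weighted, q_plain)
```

With the identity covariance the error is split evenly between the two task dimensions; with the learned covariance the solver concentrates the error in the dimension where the demonstrations varied most.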

SLIDE 22

Adaptive Grasping

Grasping is usually solved by searching for the optimal placement of the fingers on an object.

SLIDE 23

Adaptive Grasping

Grasping is usually solved by searching for the optimal placement of the fingers on an object. Knowing the extent to which one can adapt this grasp is useful for safe manipulation.

Learn how to comply with external perturbations while maintaining a firm grasp.

Sauser, Argall and Billard, Autonomous Robots, 2012

SLIDE 24

Adaptive Grasping

Teaching through tactile sensing

SLIDE 25

Adaptive Grasping

Teaching through tactile sensing

SLIDE 26

Adaptive Grasping

Learn a probabilistic mapping $p(s, \phi, \theta)$ between the contact signature of the object (normal force $\phi$ and tactile response $s$) and the fingers' posture $\theta$.

SLIDE 27

Adaptive Grasping

Learn a probabilistic mapping $p(s, \phi, \theta)$ between the contact signature of the object (normal force $\phi$ and tactile response $s$) and the fingers' posture $\theta$.

Make $N$ observations of the state of the system: $\xi^i = \{s^i, \phi^i, \theta^i\} \in \mathbb{R}^{47}$, $i = 1 \ldots N$.

$p(\xi) = \sum_{k=1}^{K} w_k \, \mathcal{N}(\xi; \mu^k, \Sigma^k)$: joint density (mixture of Gaussians) describing the observations and the correlation across the variables of the system.

The model can be used to predict the appropriate joint posture when perceiving a change in contact signature:

$\hat{\theta} = E\{p(\theta \mid s, \phi)\}$, $\hat{s} = E\{p(s \mid \theta, \phi)\}$

SLIDE 28

Adaptive Grasping

$\hat{\theta} = E\{p(\theta \mid s, \phi)\}$, $\hat{s} = E\{p(s \mid \theta, \phi)\}$

SLIDE 29

Adaptive Grasping

After Training

SLIDE 30

Adaptive Grasping

Another Example

SLIDE 31

Adaptive Grasping

Another Example

SLIDE 32

Adaptive Manipulation

Teaching through teleoperation, using an interface for direct joint motion transfer (Xsens motion sensors)

Refining knowledge using a tactile interface (5 touchpads mounted on the robot's arm and wrist)

SLIDE 33

Adaptive Manipulation

Reuse: avoid re-learning a new task from scratch when the new task bears similarities with the old task.

SLIDE 34

Adaptive Manipulation

Reuse preserves the variability learned in the previous task. This may be a drawback → use tactile feedback to adapt this variability locally.

[Figure panels: Before Reuse, After Reuse]

SLIDE 35

Adaptive Manipulation

Reuse: One more example

SLIDE 36

Teaching robots to be less stiff

Being stiff is not always good → How to teach a robot to relax…

Low stiffness when carrying the liquid
High stiffness when pouring the liquid

Kronander and Billard, ICRA 2012

SLIDE 37

Teaching robots to be less stiff

Being stiff is not always good → How to teach a robot to relax…

Shaking the robot: a natural method to teach a robot to relax.

SLIDE 38

Teaching robots to be less stiff

PD control law to follow a desired trajectory $\tilde{x}$:

$u = -K_t (x_t - \tilde{x}_t) - D (\dot{x}_t - \dot{\tilde{x}}_t)$, with $D \sim \sqrt{K_t}$ (critically damped)

Adjust the stiffness $K_t$ in the term $K_t (x_t - \tilde{x}_t)$ at each time step:

Record the perturbation $\Delta x$ from the current position $x_t$. Set the stiffness profile inversely proportional to the variance of the perturbation (the more variation, the less stiff):

Covariance matrix: $\Sigma = \Delta x \, \Delta x^T$

Eigenvalue decomposition: $\Sigma = U \Lambda U^T \Rightarrow K_t \sim U \Lambda^{-1} U^T$
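The stiffness rule (stiffness inversely proportional to the variance of the recorded perturbation, computed through an eigenvalue decomposition of its covariance) can be sketched as follows. The scale factor and the clipping bounds `k_min` and `k_max` are assumptions added so the example stays numerically safe when an eigenvalue is close to zero; they are not taken from the slides, and the function name is hypothetical.

```python
import numpy as np

def stiffness_from_perturbations(dx, k_scale=1.0, k_min=0.1, k_max=1000.0):
    """Build a stiffness matrix K ~ U Lambda^{-1} U^T from perturbation samples.

    dx: array of shape (n_samples, d), displacements recorded while the
    teacher shakes the robot.
    """
    dx = np.asarray(dx, dtype=float)
    sigma = dx.T @ dx / len(dx)                 # covariance of the perturbation
    lam, U = np.linalg.eigh(sigma)              # sigma = U diag(lam) U^T
    k_diag = np.clip(k_scale / np.maximum(lam, 1e-12), k_min, k_max)
    return U @ np.diag(k_diag) @ U.T

# Hypothetical shaking data: large motion along axis 0, small along axis 1.
dx = np.array([[1.0, 0.1],
               [-1.0, -0.1],
               [1.0, -0.1],
               [-1.0, 0.1]])
K = stiffness_from_perturbations(dx)
print(K)
```

The robot ends up compliant along the direction in which it was shaken and stiff along the direction in which it was left alone.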

SLIDE 39

Teaching robots to be less stiff

SLIDE 40

Teaching robots to be less stiff

After training, the robot complies naturally where compliance is required and remains stiff where stiffness is required.

SLIDE 41

Learning from Bad Demonstrations

  • Search around the demonstrations
  • Reproduce only parts where all demonstrators agreed
  • Avoid regions with high uncertainty

High coherence across trials → high confidence
Little coherence across trials → low confidence

[Figure: axes: angular position (radians) of the robot's wrist, velocity]

Grollman and Billard, ICRA 2012, Best Paper Award Cognitive Robotics
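The idea that coherence across trials indicates confidence can be illustrated with a very simple proxy: the pointwise spread of the demonstrated velocities on a shared grid of wrist positions. This is only a sketch of the intuition, not the method of the cited paper; the function name, the threshold `max_std`, and the data are hypothetical.

```python
import numpy as np

def confident_mask(trials, max_std):
    """Return the mean demonstration and a mask of samples where the trials agree.

    trials: array of shape (n_trials, n_samples), velocities sampled at the
    same positions in every demonstration.
    """
    trials = np.asarray(trials, dtype=float)
    mean = trials.mean(axis=0)
    spread = trials.std(axis=0)          # low spread = high coherence
    return mean, spread <= max_std

# Hypothetical data: the demonstrators agree on the first two samples only.
trials = np.array([[1.0, 2.0, 0.0],
                   [1.1, 2.0, 3.0],
                   [0.9, 2.0, -3.0]])
mean, mask = confident_mask(trials, max_std=0.5)
print(mean, mask)
```

A reproduction would then follow the mean where the mask is true and search elsewhere, which matches the three bullet points in spirit.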

SLIDE 42

Learning from Bad Demonstrations

  • Search around the demonstrations
  • Reproduce only parts where all demonstrators agreed
  • Avoid regions with high uncertainty

High coherence across trials → high confidence
Little coherence across trials → low confidence

[Figure: axes: angular position (radians) of the robot's wrist, velocity]

SLIDE 43

Conclusion

Learning from human demonstration is foremost generalizing

  • Learning a generic control law
  • Learning feasible regions of the state space

Observing human demonstration is not sufficient to perform the task

  • Extracting key features from demonstrations
  • Using these to adapt the trajectory

Demonstrations do not need to be perfect solutions to the task
→ Learning from bad demonstrations provides crucial information on what is key to performing the task.
→ It is more useful to know several feasible solutions to the task than a single optimal one.

SLIDE 44

The Lab