SLIDE 1

Representing Movement Primitives as Implicit Dynamical Systems learned from Multiple Demonstrations

Robert Krug and Dimitar Dimitrov

Center for Applied Autonomous Sensor Systems (AASS), Örebro University, Sweden

robert.krug@oru.se

Robert Krug ICAR 2013 1 / 12

SLIDE 3

Dynamical Movement Primitives (DMP) [Ijspeert et al., 2002]

Feedback controllers in joint/task space ...

... formulated as one dynamical system per DoF:

$\dot{x}(t) = f(x(t), s(t))$

Common phase variable $s(t)$ to synchronize DoF

"On-the-fly" motion profile generation:

$x(t) = \int_0^t f(x(\tau), s(\tau))\, d\tau$
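The "on-the-fly" generation above amounts to integrating the dynamical system forward in time from the current state. The following is a minimal sketch of that idea, not the authors' implementation. The forward-Euler scheme, the phase law s' = -phase_decay * s, the function names `rollout` and `spring_to_goal`, and all gain values are assumptions made for illustration.

```python
def rollout(f, x0, s0, dt, steps, phase_decay=2.0):
    """Forward-Euler rollout of x' = f(x, s), starting from x0.

    The phase law s' = -phase_decay * s is an assumption; the slides only
    state that a common phase variable s(t) exists.
    """
    x, s = list(x0), s0
    trajectory = [tuple(x)]
    for _ in range(steps):
        dx = f(x, s)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        s += dt * (-phase_decay * s)
        trajectory.append(tuple(x))
    return trajectory


def spring_to_goal(x, s, goal=1.0, stiffness=25.0, damping=10.0):
    """Hypothetical f for one DoF with state x = (q, q_dot).

    A critically damped spring pulls q toward goal; s is accepted but
    unused because no learned input is modeled in this sketch.
    """
    q, q_dot = x
    return [q_dot, stiffness * (goal - q) - damping * q_dot]


trajectory = rollout(spring_to_goal, x0=[0.0, 0.0], s0=1.0, dt=0.001, steps=5000)
final_q, final_q_dot = trajectory[-1]
```

Because the state is re-read at every step, a disturbance applied to `x` during the loop would simply be integrated from the new state, which is the property the slides refer to as feedback control.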

SLIDE 4

Motivation

Outline

1. Motivation

2. Concept

3. Results

4. Contributions & Outlook

SLIDE 7

Motivation

Why use primitive motion controllers?

Generate desired motions for a platform with many DoF

Controllers $\dot{x} = f(x, s)$ are state policies

Replaces explicit planning

Disturbance compensation

Time synchronization of arbitrarily many DoF

Motions resemble demonstrations

Simple implementation

Shadow Hand & Arm with 24 DoF

SLIDE 11

Motivation

What’s the problem?

DMP [Ijspeert et al., 2002]: Stable spring excited by a learned control input u

$\dot{x}(t) = f(x, s) = \underbrace{A x(t)}_{\text{spring}} + \underbrace{B u(s; p)}_{\text{learned } p}, \quad x = \begin{bmatrix} q \\ \dot{q} \end{bmatrix} \in \mathbb{R}^2$

Problem: One-shot learning → undesirable behavior in regions not covered by the demonstration

Solution: Capture different dynamics from multiple demonstrations [Ude et al., 2010][Forte et al., 2012]

Presented approach → locally optimal combination:

$\dot{x}(t) = A x(t) + B \sum_{d=1}^{D} \lambda_d(t)\, u_d(s; p_d)$
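For a single DoF, one integration step of the combined system above might look like the sketch below. It is an illustration under stated assumptions, not the presented method: the spring matrices A = [[0, 1], [-stiffness, -damping]] and B = [0, 1]^T, the gain values, the forward-Euler discretization, and the function name `blended_step` are all hypothetical. The weights and the learned inputs u_d are taken as given.

```python
def blended_step(x, lams, inputs, dt, stiffness=25.0, damping=10.0):
    """One forward-Euler step of x' = A x + B * sum_d lam_d * u_d.

    x      -- state (q, q_dot) of one DoF
    lams   -- combination weights lam_d, one per demonstration
    inputs -- learned control inputs u_d evaluated at the current phase
    A and B are the assumed spring matrices named in the lead-in.
    """
    q, q_dot = x
    u = sum(lam * u_d for lam, u_d in zip(lams, inputs))
    q_ddot = -stiffness * q - damping * q_dot + u
    return (q + dt * q_dot, q_dot + dt * q_ddot)


# Two demonstrations, equally weighted, starting at rest in the origin.
next_state = blended_step((0.0, 0.0), lams=(0.5, 0.5), inputs=(2.0, 4.0), dt=0.1)
```

With D = 1 and a weight of 1 this reduces to the single-demonstration DMP form shown earlier on the slide.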

SLIDE 15

Concept

Re-compute the dynamical system online

Optimize combination of pre-learned control inputs at each time step k ...

$\dot{x}[k] = A x[k] + B \sum_{d=1}^{D} \lambda_d[k]\, u_d[k]$

... by minimizing a distance criterion between current and demonstrated states

States evolve "in between" demonstrations ...

... or get "pulled" onto them with dynamics governed by A

Encodes different dynamics

First step towards Model Predictive Control with state constraints
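The slides do not spell out the distance criterion, so the sketch below only illustrates the general idea for D = 2 demonstrations. It assumes the weights are non-negative, sum to one, and minimize the squared Euclidean distance between the current state and the weighted combination of the two demonstrated states at step k. Under those assumptions the minimizer has a closed form (projection onto the line segment between the demonstrated states). The function name `blend_weights_two_demos` is hypothetical.

```python
def blend_weights_two_demos(x, x_demo_1, x_demo_2):
    """Return (lam_1, lam_2) with lam_1 + lam_2 = 1 and both >= 0 that
    minimize ||x - (lam_1 * x_demo_1 + lam_2 * x_demo_2)||^2.

    This criterion is an assumption made for illustration only.
    """
    diff = [a - b for a, b in zip(x_demo_1, x_demo_2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Both demonstrations coincide at this step; any split is optimal.
        return 0.5, 0.5
    # Unconstrained minimizer along the segment, then clipped to [0, 1].
    lam_1 = sum((xi - bi) * di for xi, bi, di in zip(x, x_demo_2, diff)) / denom
    lam_1 = min(1.0, max(0.0, lam_1))
    return lam_1, 1.0 - lam_1


inside = blend_weights_two_demos([0.5, 0.0], [1.0, 0.0], [0.0, 0.0])
beyond = blend_weights_two_demos([2.0, 0.0], [1.0, 0.0], [0.0, 0.0])
```

A state lying between the two demonstrations receives mixed weights, while a state outside the segment gets all weight on the nearer demonstration. This mirrors the "in between" versus "pulled onto" behavior described on the slide.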

SLIDE 16

Concept

How does it work?

SLIDE 18

Results

Generalization in simulation

SLIDE 19

Results

Disturbance rejection in simulation

SLIDE 20

Results

Evaluation on the Shadow Robot platform

Grasp motions recorded with a sensorized glove ...

... and used to learn primitive controllers for the Shadow Hand

SLIDE 21

Results

Evaluation on the Shadow Robot platform

SLIDE 24

Contributions & Outlook

To sum up . . .

Contributions:

Learn motion controllers from multiple demonstrations ...

... and form a (locally) optimal combination to generate movements

Allows encoding of fundamentally different dynamics

Predictable behavior without explicit, costly motion planning!

Future work:

Optimize over a time window → Model Predictive Control

Incorporate spatial & temporal state space constraints (obstacle avoidance ...)

Reactive on-line planning & control scheme [Anderson et al., 2012]

SLIDE 25

Contributions & Outlook

That’s it . . .

SLIDE 26

References

Anderson, S., Karumanchi, S., and Iagnemma, K. (2012). Constraint-based planning and control for safe, semi-autonomous operation of vehicles. In IEEE Intelligent Vehicles Symposium, pages 383-388.

Forte, D., Gams, A., Morimoto, J., and Ude, A. (2012). On-line motion synthesis and adaptation using a trajectory database. Robotics and Autonomous Systems, 60(10):1327-1339.

Ijspeert, A., Nakanishi, J., and Schaal, S. (2002). Movement imitation with nonlinear dynamical systems in humanoid robots. In Proc. of the IEEE Int. Conf. on Robotics and Automation, volume 2, pages 1398-1403.

Ude, A., Gams, A., Asfour, T., and Morimoto, J. (2010). Task-specific generalization of discrete and periodic dynamic movement primitives. IEEE Transactions on Robotics, 26(5):800-815.
