Incremental Learning of Robot Dynamics using Random Features


SLIDE 1

Cognitive Humanoids Laboratory

  • Dept. of Robotics, Brain and Cognitive Sciences

Italian Institute of Technology

Incremental Learning of Robot Dynamics using Random Features

Arjan Gijsberts, Giorgio Metta

SLIDE 2

general setting

  • learn incrementally

– because the world is non-stationary (concept drift)

  • learn efficiently

– real-time (hard) constraints

  • we’d like to learn

– accurately (with theoretical guarantees)
– autonomously (little prior programming)

SLIDE 3

specific setting

  • learning body dynamics

– to compute external forces
– to implement compliant control

  • so far this relied on a priori models (e.g. CAD models)

– but we’d like to avoid that

[figure: arm instrumented with a six-axis F/T sensor and an inertial sensor]

SLIDE 4

…so

SLIDE 5

some incremental learning methods

  • LWPR [Vijayakumar et al., 2005]
  • Kernel Recursive Least Squares [Engel et al., 2004]
  • Local Gaussian Processes [Nguyen-Tuong et al., 2009]
  • Sparse Online GPR [Csató and Opper, 2002]

typical problems (not everywhere):

  • high per-sample complexity (slow learning)
  • increasing or unpredictable computational requirements

  • limited theoretical support and understanding
SLIDE 6
our method
  • linear ridge regression as base algorithm

– efficient, elegant, effective
– theoretically well-studied

  • possible extensions for non-linear regression and incremental updates

 

$$f(\mathbf{x}) = \mathbf{w}^T \mathbf{x}$$

$$\min_{\mathbf{w}}\; J = \tfrac{1}{2}\,\lVert X\mathbf{w} - \mathbf{y} \rVert_2^2 + \tfrac{\lambda}{2}\,\lVert \mathbf{w} \rVert_2^2$$

$$\mathbf{w} = (X^T X + \lambda I)^{-1} X^T \mathbf{y}$$
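The closed-form ridge solution w = (XᵀX + λI)⁻¹Xᵀy is a few lines of linear algebra. A minimal numpy sketch (function and variable names are illustrative, not from the talk):

```python
import numpy as np

def ridge_fit(X, y, lam=1e-3):
    """Closed-form ridge regression: w = (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# toy check: recover a known linear map from noiseless data
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w = ridge_fit(X, y, lam=1e-6)
print(np.allclose(w, w_true, atol=1e-3))  # True
```

Once XᵀX and Xᵀy are accumulated, solving the d×d system costs the same regardless of how many samples were seen.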

SLIDE 7
our method in 3 easy steps
  • kernel trick
  • approximate kernel
  • make it incremental

Rahimi, A. & Recht, B. (2008)

   

$$f(\mathbf{x}) = \sum_{i=1}^{m} c_i\, k(\mathbf{x}_i, \mathbf{x})$$

$$\mathbf{c} = (K + \lambda I)^{-1} \mathbf{y}$$

$$k(\mathbf{x}_i, \mathbf{x}_j) = E_{\mathbf{w}}\!\left[ z_{\mathbf{w}}(\mathbf{x}_i)^T z_{\mathbf{w}}(\mathbf{x}_j) \right] \approx \frac{1}{D} \sum_{d=1}^{D} z_{\mathbf{w}_d}(\mathbf{x}_i)^T z_{\mathbf{w}_d}(\mathbf{x}_j)$$

$$z_{\mathbf{w}}(\mathbf{x}) = \left[\, \cos(\mathbf{w}^T \mathbf{x}),\; \sin(\mathbf{w}^T \mathbf{x}) \,\right]$$

$$\mathbf{w} = (Z^T Z + \lambda I)^{-1} Z^T \mathbf{y}$$

+ Cholesky rank-1 update
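The kernel approximation of Rahimi & Recht (2008) replaces k(x, x′) with an inner product of explicit random Fourier features z_w(x) = [cos(wᵀx), sin(wᵀx)]. A hedged sketch for the RBF kernel (the bandwidth sigma and the Gaussian sampling of w are the standard choices for this kernel, not taken from the slides):

```python
import numpy as np

def random_fourier_features(X, W):
    """z(x) = [cos(w_d^T x), sin(w_d^T x)]_{d=1..D} / sqrt(D); then
    z(x) @ z(x') is an unbiased estimate of the RBF kernel value."""
    P = X @ W.T                               # (n, D) projections w_d^T x
    return np.hstack([np.cos(P), np.sin(P)]) / np.sqrt(W.shape[0])

rng = np.random.default_rng(0)
d, D, sigma = 5, 2000, 1.0
W = rng.standard_normal((D, d)) / sigma       # w ~ N(0, I/sigma^2): RBF spectral density
x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
z = random_fourier_features(np.vstack([x1, x2]), W)
approx = z[0] @ z[1]
exact = np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma ** 2))
print(abs(approx - exact) < 0.1)
```

As D grows, the inner product concentrates around the exact kernel value, which is the computation-vs-accuracy trade-off governed by the number of random features.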

SLIDE 8

features

  • O(1) update complexity w.r.t. # training samples
  • exact batch solution after each update
  • dimensionality of feature mapping trades computation for approximation accuracy
  • O(n²) time and space complexity per update w.r.t. dimensionality of the feature mapping

  • easy to understand/implement (few lines of code)
  • not exclusively for dynamics/robotics learning!
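The "few lines of code" claim can be made concrete: maintain the Cholesky factor R of ZᵀZ + λI and the vector b = Zᵀy, rank-1-update R for each new sample, and recover w with two triangular solves. This is an illustrative sketch of that idea (not the authors' code), using the standard Givens-style rank-1 Cholesky update:

```python
import numpy as np
from scipy.linalg import solve_triangular

def chol_update(R, z):
    """In-place rank-1 update of upper-triangular R so that
    R^T R becomes R^T R + z z^T (standard Givens sweep, O(D^2))."""
    z = z.astype(float).copy()
    n = z.size
    for k in range(n):
        r = np.hypot(R[k, k], z[k])
        c, s = r / R[k, k], z[k] / R[k, k]
        R[k, k] = r
        if k + 1 < n:
            R[k, k + 1:] = (R[k, k + 1:] + s * z[k + 1:]) / c
            z[k + 1:] = c * z[k + 1:] - s * R[k, k + 1:]

class IncrementalRidge:
    """Ridge regression in feature space with O(D^2)-per-sample updates."""
    def __init__(self, dim, lam=1e-3):
        self.R = np.sqrt(lam) * np.eye(dim)   # R^T R = lam * I initially
        self.b = np.zeros(dim)
    def update(self, z, y):
        chol_update(self.R, z)                # R^T R += z z^T
        self.b += y * z                       # b += y * z
    def weights(self):
        # solve (R^T R) w = b via two triangular solves
        tmp = solve_triangular(self.R, self.b, trans='T')
        return solve_triangular(self.R, tmp)

# sanity check: incremental solution equals the exact batch solution
rng = np.random.default_rng(1)
Z, y, lam = rng.standard_normal((50, 4)), rng.standard_normal(50), 0.1
model = IncrementalRidge(4, lam)
for z_i, y_i in zip(Z, y):
    model.update(z_i, y_i)
w_batch = np.linalg.solve(Z.T @ Z + lam * np.eye(4), Z.T @ y)
print(np.allclose(model.weights(), w_batch))
```

Because the update never touches past samples, the per-sample cost is constant in the number of training samples, matching the O(1) claim above.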
SLIDE 9
SLIDE 10

batch experiments

  • 3 inverse dynamics datasets: Sarcos, Simulated Sarcos, Barrett [Nguyen-Tuong et al., 2009]

  • approximately 15k training and 5k test samples
  • comparison with LWPR, GPR, LGP, Kernel RR
  • RFRR with 500, 1000, 2000 random features
  • hyperparameter optimization by exploiting functional similarity with GPR (log marginal likelihood optimization)

SLIDE 11

batch error on 7-DOF Sarcos arm

SLIDE 12

prediction time

SLIDE 13

incremental experiments

  • two large-scale inverse dynamics datasets from the “James” and iCub humanoids (4-DOF)
  • realistic scenario: initial 15k training and remaining approx. 200k and 80k test samples
  • RFRR with 200, 500, 1000 random features
  • RFRR uses training samples only for hyperparameter optimization
  • comparison with batch Kernel RR (identical hyperparameters)

SLIDE 14

batch vs. incremental

SLIDE 15

verification (learning dynamics)

SLIDE 16

verification: time

SLIDE 17

 

 

verification: reaching

$$(x, y, z)_{CE} = M(u_l, v_l, u_r, v_r, T, V_s, V_g)$$

fixation point (to learn) from image coordinates and eye configuration

SLIDE 18

verification

SLIDE 19

affordances (learning objects)

SLIDE 20

learning object behavior

SLIDE 21
SLIDE 22

conclusions

  • incremental learning is advantageous when models cannot be assumed stationary
  • ridge regression with kernel approximation and exact update rule for efficient incremental learning
  • RFRR has an O(1) time and space complexity per update (suitable for hard real-time)
  • number of random features regulates the computation vs. accuracy tradeoff

SLIDE 23

sponsors

  • EU Commission projects:

– RobotCub, grant FP6-004370, http://www.robotcub.org
– CHRIS, grant FP7-215805, http://www.chrisfp7.eu
– ITALK, grant FP7-214668, http://italkproject.org
– Poeticon, grant FP7-215843, http://www.poeticon.eu
– Robotdoc, grant FP7-ITN-235065, http://www.robotdoc.org
– Roboskin, grant FP7-231500, http://www.roboskin.eu
– Xperience, grant FP7-270273, http://www.xperience.org
– EFAA, grant FP7-270490, http://notthereyet.eu

  • More information: http://www.iCub.org