Developmental approaches to sensorimotor and linguistic learning in robotics - PowerPoint PPT Presentation

SLIDE 1

Developmental approaches to sensorimotor and linguistic learning in robotics

Pierre-Yves Oudeyer
Equipe-projet INRIA-ENSTA FLOWERS
http://flowers.inria.fr
http://www.pyoudeyer.com

SLIDE 2

Behavioural and Cognitive Development in Human Infants

  • Multi-skill acquisition
  • Autonomous, open-ended development

SLIDE 3

An innate cerebral and morphological equipment …

Innate motivational system that fosters spontaneous but organized exploration (intrinsic motivation / curiosity-driven exploration).

Motor primitives that constrain the space of motor commands and gestures: e.g. muscles are not controlled individually and independently, oscillators, …

Sensori detectors and trackers that allow the baby to bootstrap its attentional and emotional systems: e.g. movement, high pitch, faces, …

Sensorimotor reflexes: e.g. eye tracking of moving objects, closing hands when objects are touched, …

Morphological properties that facilitate the control of the body, …

SLIDE 4

… built within a maturational program …

e.g. myelination/myelinogenesis progressively building brain regions, connecting them together and to muscles, progressively increasing the resolution of the senses and of motor control, …

SLIDE 5

… in a structured physical and social environment … then continuously extended thanks to a generic learning and developmental system

SLIDE 6

Developmental Psychology and Biology ↔ Developmental and Social Robotics

Functional inspiration / Functional modelling

Object of study: the architecture of sensorimotor and social development
→ Learning algorithms are only a component

(Weng et al., 2001, Science) (Lungarella et al., 2006, Conn. Sc.) (Oudeyer, 2011, Encycl. Lear. Sc.)

Study how to build developmental machines. Understand human development better.

SLIDE 7

Learning models for robot motor skill acquisition

Models of the self/body

Movements <-> Effects

SLIDE 8

Learning models for robot motor skill acquisition

Models of physical interaction with objects

Movements <-> Effects

SLIDE 9

Learning models for robot motor skill acquisition

Models of tool use

Movements <-> Effects

SLIDE 10

Learning models for robot motor skill acquisition

[Diagram: forward and inverse models mapping the space of controllers Π (elements x1, x2, ..., xi) to the task space / space of effects Y (elements y1, y2, ..., yi), with reachable subspace Yr]

y_i(C_i, (s_1, a_1, ..., s_n, a_n)_{π_i}) ∈ R^n
x_i = (C_i, π_i),  π_i : S ⊂ R^n → A ⊂ R^l

  • High dimensions
  • High volume
  • Stochasticity
  • Redundancy
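
As a concrete toy instance of these forward and inverse models (the planar 2-link arm and the nearest-neighbour inversion below are illustrative assumptions, not the architecture on the slide), a forward model can be learnt from motor babbling samples and inverted by lookup in effect space:

```python
import numpy as np

# Toy 2-joint arm: the forward model maps joint angles (controller space Pi)
# to hand position (task space Y). We collect samples by random motor
# babbling, then invert the model by nearest-neighbour search in Y.

def arm_forward(angles):
    """Ground-truth forward kinematics of a planar 2-link arm (link length 1)."""
    a1, a2 = angles
    x = np.cos(a1) + np.cos(a1 + a2)
    y = np.sin(a1) + np.sin(a1 + a2)
    return np.array([x, y])

rng = np.random.default_rng(0)
commands = rng.uniform(-np.pi, np.pi, size=(2000, 2))   # samples in Pi
effects = np.array([arm_forward(c) for c in commands])  # observed effects in Y

def inverse_model(goal):
    """Return the stored command whose observed effect is closest to the goal."""
    i = np.argmin(np.linalg.norm(effects - goal, axis=1))
    return commands[i]

goal = np.array([1.0, 1.0])
cmd = inverse_model(goal)
reached = arm_forward(cmd)
print(np.linalg.norm(reached - goal))  # small reaching error
```

Even this naive inversion illustrates the redundancy problem the slide lists: many commands map to the same effect, and only the reachable subspace Yr of Y can ever be attained.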

SLIDE 11

Motor synergies/primitives

Humans: muscular synergies (Rossignol, 1996), CPGs (Ijspeert et al., 2005)
Robots:

  • DMP formalism
  • Recurrent neural nets
  • GMR
  • Splines + vector fields

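
A minimal sketch of one such motor primitive, in the spirit of the DMP formalism (the gains, basis functions and 1-D setup are illustrative assumptions, not the exact formulation cited on the slide): a damped spring pulls the state toward a goal, and a phase-driven forcing term shapes the trajectory.

```python
import numpy as np

# One-dimensional dynamic movement primitive (DMP): a spring-damper pulling
# toward the goal g, shaped by a learnable forcing term driven by a
# canonical phase variable s that decays from 1 to 0.

def rollout(g, y0, weights, centers, widths, tau=1.0, dt=0.01, steps=100):
    k, d = 25.0, 10.0          # spring and damping gains (critically damped)
    y, v, s = y0, 0.0, 1.0
    traj = []
    for _ in range(steps):
        psi = np.exp(-widths * (s - centers) ** 2)        # RBF basis over phase
        f = s * (g - y0) * psi @ weights / (psi.sum() + 1e-10)
        a = (k * (g - y) - d * v + f) / tau               # transformation system
        v += a * dt
        y += v * dt
        s += -2.0 * s * dt / tau                          # canonical system decay
        traj.append(y)
    return np.array(traj)

centers = np.linspace(0, 1, 10)
widths = np.full(10, 50.0)
# With zero weights the primitive reduces to a smooth point-to-point reach:
traj = rollout(g=1.0, y0=0.0, weights=np.zeros(10), centers=centers, widths=widths)
print(traj[-1])  # converges toward the goal g = 1.0
```

The `weights` vector is the low-dimensional parameterization that makes such primitives convenient for exploration and learning.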
SLIDE 12

Exploring and learning multiple models and skills in a developmental robot

The Playground Experiment (Oudeyer et al., 2007, IEEE Trans. Ev. Comp.)

Motor primitives (controller spaces Π1 … Π4): bashing parameterized primitive, biting parameterized primitive, head-turn parameterized primitive, vocalizing parameterized primitive.

Sensori primitives (task spaces Y1 … Y5): movement sensori primitive, visual pattern sensori primitive, mouth-touch sensori primitive, leg-touch sensori primitive, sound-pitch sensori primitive.

SLIDE 13

[Diagram: families of parameterized primitives πi,1, πi,2, ..., πi,j forming controller spaces Πi, and task spaces Yl (elements yl,1, yl,2, ..., yl,i) with reachable subspaces Yl,r]

Multiple families of motor primitives = multiple controller spaces
Multiple families of sensori primitives = multiple task spaces
+ Operators for projecting/combining motor primitives (including dimensionality reduction or increase)
+ Operators for projecting/combining sensori primitives

M1, M2, M3, ..., Mi: mechanisms for self-generation of problems = models to be learnt

Innate equipment + (social) learning → explore and learn

SLIDE 14

Active exploration and learning

What models (M1, M2, ..., Mi) to generate, explore and learn, and in what order, given:

  • High inhomogeneities in the mathematical properties of the mappings
  • Diversity of complexity, dimensionality/volume, learnability, and level of noise
  • Some are trivial, some others unlearnable
  • Some may be non-stationary
  • Life-time is severely limited: the set of learnable models cannot be learnt entirely during a lifetime

→ The goal is that learnt models can be reused to solve efficiently (prediction or control) problems that are initially unknown to the learner and drawn, e.g., uniformly from a space of problems relevant in the environment in which the robot exists.

SLIDE 15

Technical challenges

→ Problem generation: fixed or adaptive set of problems? Adaptive boundaries for a given problem? How to control the growth of complexity (inside and across problems)?

→ Problem selection: what problems to focus on? How to build a useful learning curriculum?

→ Which measure of interestingness? Standard approaches to active learning will fail (most often do worse than random), i.e. approaches based on sampling where uncertainty is high, density approaches, or approaches based on analytic hypotheses about the learning algorithm or the data (e.g. when using GPs).

(Whitehead, 1991; Linden and Weber, 1993; Thrun, 1995; Sutton, 1990; Cohn et al., 1996; Brafman and Tennenholtz, 2002; Strehl and Littman, 2006; Szita and Lorincz, 2008)

→ In particular, it is very difficult to evaluate the information gain analytically; it rather needs to be evaluated empirically, but then how?

→ If there are interactions between self-generated problems, then sequential decision optimization is needed → Intrinsically Motivated Reinforcement Learning (IMRL; Barto et al., 2004; Schmidhuber, 1991).

SLIDE 16

Developmental psychology / Neurosciences

White (1959), Berlyne (1960), Csikszentmihalyi (1996); Dayan and Belleine (2002), Kakade and Dayan (2002), Horvitz (2000)

Child development: intrinsic motivation and mechanisms of spontaneous exploration
  • Activities of intermediate complexity, as evaluated empirically, are intrinsically rewarding
  • Mechanisms for regulating the growth of complexity: the importance of starting small

In robots: the search for intermediate complexity
  • Models: IAC, R-IAC, SAGG-RIAC, McSAGG (Oudeyer et al., 2005; Oudeyer et al., 2007; Baranes and Oudeyer, 2009; Baranes and Oudeyer, 2010a,b)
  • Algorithmic aspects and qualitative modelling of sensorimotor development
  • Intermediate complexity ≈ maximal learning progress as evaluated empirically

SLIDE 17

Parameterized space of problems/models

Interestingness = empirical measure of learning progress. Stochastic choice of problem according to a probability proportional to learning progress. Recursive splitting of the problem space, optimizing the difference in learning progress. IAC (2007), R-IAC (2009), SAGG-RIAC (2010).
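
The selection rule above, sampling problems with probability proportional to empirical learning progress, can be sketched in a few lines. The toy regions, simulated error curves and window sizes below are illustrative assumptions, not the IAC/R-IAC implementation:

```python
import numpy as np

# Learning-progress-based problem selection (in the spirit of IAC/R-IAC):
# each region keeps its history of prediction errors; its interestingness
# is the empirical decrease of error (learning progress), and regions are
# sampled with probability proportional to that progress.

rng = np.random.default_rng(1)

class Region:
    def __init__(self, learnable):
        self.errors = []
        self.learnable = learnable   # toy flag: does practice reduce error?

    def practice(self):
        n = len(self.errors)
        # Simulated prediction error: decays with practice if learnable,
        # stays flat (pure noise floor) otherwise.
        err = 1.0 / (1 + n) if self.learnable else 1.0
        self.errors.append(err)

    def progress(self):
        if len(self.errors) < 10:
            return 1.0  # optimistic initialization: force initial sampling
        recent, older = self.errors[-5:], self.errors[-10:-5]
        return max(np.mean(older) - np.mean(recent), 0.0)

regions = [Region(learnable=True), Region(learnable=False)]
counts = [0, 0]
for _ in range(300):
    p = np.array([r.progress() for r in regions]) + 1e-6
    i = rng.choice(len(regions), p=p / p.sum())
    regions[i].practice()
    counts[i] += 1
print(counts)  # the learnable region attracts most of the exploration
```

Note how the unlearnable region is quickly abandoned: its error never decreases, so its learning progress (and sampling probability) collapses, which is the behaviour uncertainty-based active learning does not give.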

SLIDE 18

Active regulation of the growth of complexity in exploration

IAC: Oudeyer, P-Y., Kaplan, F. and Hafner, V. (2007); R-IAC: Baranes and Oudeyer (2009). Related work: Schmidhuber (1991, 2006)

Optimizing learning progress, i.e. the decrease of prediction errors (derivative).

The IAC/R-IAC (Intelligent Adaptive Curiosity) architecture(s) make no assumption on the regression algorithm used as "Predictor" (e.g. it can be an SVM, a GP, or non-parametric).

SLIDE 19

http://playground.csl.sony.fr

(Oudeyer, Kaplan, Hafner, 2007, IEEE Trans. Evol. Comp.) Here a classic non-parametric regressor is used (Schaal and Atkeson, 1994)

SLIDE 20

Self-organization of developmental patterns, with universals and diversity, and discovery of communication

Infant and Child Dev. / Frontiers in Neuroscience, 2007 / Connection Science, 2006

SLIDE 21

Active learning of single high-dimensional models

[Diagram, as on Slide 10: forward and inverse models mapping the space of controllers Π (elements x1, x2, ..., xi) to the task space / space of effects Y, with reachable subspace Yr]

y_i(C_i, (s_1, a_1, ..., s_n, a_n)_{π_i}) ∈ R^n
x_i = (C_i, π_i),  π_i : S ⊂ R^n → A ⊂ R^l

SLIDE 22

Teleological exploration in human infants

SLIDE 23

SAGG-RIAC (Self-Adaptive Goal Generation RIAC)

Active goal self-generation + active goal-directed learning

(SSA, Schaal and Atkeson, 1994). SAGG (Baranes and Oudeyer, IROS 2010; IEEE ICDL/Epirob 2011). Competence-based models (Oudeyer and Kaplan, Frontiers in Neurorobotics, 2008).
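
The goal-babbling loop behind this idea can be sketched as follows. The 1-D plant, the nearest-neighbour inverse model and the local perturbation are illustrative assumptions, not the SAGG-RIAC algorithm itself (which additionally selects goals by competence progress):

```python
import numpy as np

# Goal babbling in the spirit of SAGG-RIAC: instead of sampling motor
# commands, sample goals in task space, reach for them with the current
# inverse model, and refine the model with whatever effect was produced.

rng = np.random.default_rng(0)

def forward(command):
    """Unknown plant: only a sub-interval of the task space is reachable."""
    return np.tanh(command)          # effects limited to (-1, 1)

commands, effects = [0.0], [forward(0.0)]   # bootstrap database

for _ in range(500):
    goal = rng.uniform(-2, 2)        # self-generated goal (maybe unreachable)
    # Inverse model: start from the command whose known effect is closest...
    i = int(np.argmin(np.abs(np.array(effects) - goal)))
    cmd = commands[i] + 0.1 * rng.standard_normal()   # ...and perturb locally
    commands.append(cmd)
    effects.append(forward(cmd))     # learn from the attempt even if goal missed

# After babbling, the reachable part of the task space is densely covered.
cover = np.array(effects)
print(cover.min(), cover.max())
```

Sampling in the (low-dimensional) task space rather than the (high-dimensional, redundant) command space is what makes this strategy scale, as the experiments on the following slides illustrate.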

SLIDE 24

Example: developmental learning of locomotion

The motor primitive: a CPG. 8 joints × 3 parameters = motor primitive M with 24 dimensions.
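
As a rough sketch of such a parameterization (the sinusoidal waveform and parameter ranges are assumptions for illustration; the slide does not specify the CPG's equations):

```python
import numpy as np

# A minimal parameterized CPG in the spirit of the slide: each of the 8
# joints follows a sinusoid with its own amplitude, phase and offset,
# giving a 24-dimensional motor-primitive parameter space (8 joints x 3).

N_JOINTS = 8

def cpg_angles(params, t, freq=1.0):
    """params: array of shape (8, 3) = (amplitude, phase, offset) per joint."""
    amp, phase, offset = params[:, 0], params[:, 1], params[:, 2]
    return amp * np.sin(2 * np.pi * freq * t + phase) + offset

rng = np.random.default_rng(0)
params = rng.uniform([-0.5, 0.0, -0.2], [0.5, 2 * np.pi, 0.2], size=(N_JOINTS, 3))

# One second of target joint angles sampled at 100 Hz:
trajectory = np.array([cpg_angles(params, t) for t in np.linspace(0, 1, 100)])
print(trajectory.shape)  # one angle per joint per time step
```

Exploration then happens in the 24-dimensional `params` space rather than over raw torque sequences.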

SLIDE 25

Explore the consequences of one's movements

[Diagram: motor primitive M, joint angles θ; 3 DOF per leg]

The sensori primitive: translation (u) + rotation (v) of the COM.
SLIDE 26

Learnt skills

The robot can re-use its curiosity-driven learnt forward and inverse models to reach any particular location in its field of view. Note: here the forward and inverse models are learnt actively using a local learning algorithm (Incremental Local Online Gaussian Mixture Regression, ILO-GMR; Cederborg et al., 2010).

SLIDE 27

Faster learning and better performance in generalization

[Plot: reaching error (distance) vs. number of actions (time steps, up to 10000) for SAGG-RIAC, SAGG-Random, ACTUATOR-Random and ACTUATOR-RIAC]

SLIDE 28

Maturational constraints

Humans: maturation of the sensorimotor system, e.g. the cephalo-caudal and proximo-distal law (Eyre, 2003; Berthier et al., 1999)

Robots: Baranes, A., Oudeyer, P-Y., 2011, IEEE ICDL 2011
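
The coupling of maturation with exploration, constraining exploration to a released sub-volume of the command space that grows as competence improves, can be sketched as follows (the release schedule and the competence proxy are illustrative assumptions, not the McSAGG mechanism):

```python
import numpy as np

# Maturational constraints coupled to learning: exploration starts in a
# small sub-volume of the 24-D command space, and the admissible range is
# progressively released when recent competence is high.

rng = np.random.default_rng(0)

maturation = 0.1                 # fraction of each command range released
competence_history = []

def sample_command(dim=24):
    """Sample only within the currently released portion of [-1, 1]^dim."""
    return rng.uniform(-maturation, maturation, size=dim)

for step in range(1000):
    cmd = sample_command()       # this command would drive the robot here
    competence = 1.0 - np.exp(-0.01 * step)       # toy competence curve
    competence_history.append(competence)
    # Release more of the command space when recent competence is high:
    if len(competence_history) >= 10 and np.mean(competence_history[-10:]) > maturation:
        maturation = min(1.0, maturation + 0.01)

print(maturation)  # the explorable volume has grown with competence
```

The bi-directional coupling the next slides stress is visible even in this toy loop: maturation gates what can be explored, while measured competence gates how fast maturation proceeds.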

SLIDE 29

McSAGG-RIAC: Experimental Results

[Plot: reaching error (distance) vs. number of actions (up to 10000) for SAGG-RIAC, SAGG-Random, ACTUATOR-Random, ACTUATOR-RIAC, McSAGG-RIAC Out, and McSAGG-RIAC In & Out]

SLIDE 30

Importance of the bi-directional coupling between maturation and active learning

SLIDE 31

Social guidance

Learning by demonstration and imitation (Schaal et al., Billard et al., Asfour and Dillman, Lopes et al., Demiris et al., …). Mirror neurons (Gallese et al., 1996).

Humans: social guidance in the Zone of Proximal Development (Vygotski, ZPD)
Robots: coupling of social guidance and intrinsically motivated learning

SLIDE 32

SGIM: Experimental Results

SLIDE 33

SGIM: Experimental Results

(Nguyen et al., IEEE ICDL/Epirob 2011)

SLIDE 34

How can a robot learn novel visually grounded words from a human?

Problem: how to teach a robot to recognize new visual objects associated with new words?

SLIDE 35

Just a matter of making efficient statistics over multimodal observations?

No! It is also a matter of collecting data that is good enough, through adequate human-robot interaction.

SLIDE 36

The crucial role of joint attention

Humans heavily use social cues to coordinate social interaction and realize « joint attention », which allows the child learner to collect good training data.

SLIDE 37

Shall we mimic human-human natural mechanisms for ensuring human-robot joint attention (e.g. use of pointing, gaze direction, « waving », …)?

SLIDE 38

Maybe not … as hinted by the Wizard of Oz

Even with human intelligence, the sensorimotor apparatus of a robot is so different from that of humans that it is very difficult to use social cues such as pointing or waving (for example, the big difference in field of view makes it very difficult for a non-engineer human teacher to understand what the robot is seeing).

SLIDE 39

Introducing mediator interfaces

Allowing organisms that do not share the same tools for perception and action to still manage to communicate.

SLIDE 40

Developing novel human-robot interfaces based on mediator objects

SLIDE 41

Mediator interfaces

(Rouanet et al., SIGGRAPH 2010) (Rouanet, Danieau and Oudeyer, 2011, HRI 2011) (Rouanet et al., 2009, Humanoids 2009)

SLIDE 42

  • Cap Sciences, Bordeaux
  • 107 participants: 77 men, 30 women
  • Age: 10 to 76 (M = 26.3)

→ Using well-designed interfaces/interaction schemes allows the robot to collect much better training data and to improve its learning dramatically (the increase is much higher than the difference between a naive and a sophisticated statistical learning approach for a given dataset).

SLIDE 43

Families of developmental constraints allowing for versatile sensorimotor development

Humans → Robots:
  • Intrinsically motivated exploration → Active learning algorithms
  • Muscular synergies → Function bases for constraining movement
  • Eco-adapted morphology → Bio-inspired morphology
  • Myelination → Models of maturational constraints
  • Cognitive biases for inference and abstraction → Algorithms for inference and abstraction building
  • Socially guided exploration → Techniques for learning through social interaction

SLIDE 44

Thank you!
http://flowers.inria.fr
http://www.pyoudeyer.com

SLIDE 45

References (1)

  • Baranes, A., Oudeyer, P-Y. (2010) "Intrinsically Motivated Goal Exploration for Active Motor Learning in Robots: a Case Study", in Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2010), Taipei, Taiwan.
  • Baranes, A., Oudeyer, P-Y. (2011) "The Interaction of Maturational Constraints and Intrinsic Motivations in Active Motor Development", in Proceedings of the IEEE International Conference on Development and Learning, Frankfurt, Germany.
  • Baranes, A., Oudeyer, P-Y. (2009) "R-IAC: Robust intrinsically motivated exploration and active learning", IEEE Transactions on Autonomous Mental Development, 1(3), pp. 155-169.
  • A. Barto, S. Singh, and N. Chentanez (2004) "Intrinsically motivated learning of hierarchical collections of skills," in Proceedings of the 3rd International Conference on Development and Learning (ICDL 2004).
  • D. Berlyne (1960) Conflict, Arousal and Curiosity. New York: McGraw-Hill.
  • N. E. Berthier, R. Clifton, D. McCall, and D. Robin (1999) "Proximodistal structure of early reaching in human infants," Exp Brain Res.
  • R. I. Brafman and M. Tennenholtz (2001) "R-max: A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning". In IJCAI'01.
  • Cederborg, T., Ming, L., Baranes, A., Oudeyer, P-Y. (2010) "Incremental Local Online Gaussian Mixture Regression for Imitation Learning of Multiple Tasks", in Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2010), Taipei, Taiwan.
  • Cohn, D., Ghahramani, Z., and Jordan, M. (1996) "Active learning with statistical models", J. Artif. Intell. Res., vol. 4, pp. 129-145.
  • M. Csikszentmihalyi (1991) Flow: the Psychology of Optimal Experience. New York: Harper Perennial.

SLIDE 46

References (2)

  • P. Dayan and W. Belleine (2002) "Reward, motivation and reinforcement learning," Neuron, vol. 36, pp. 285-298.
  • J. A. Eyre (2003) Development and plasticity of the corticospinal system in man. Hindawi Publishing Corporation.
  • S. Kakade and P. Dayan (2002) "Dopamine: Generalization and bonuses", Neural Netw., vol. 15, pp. 549-559.
  • J.-C. Horvitz (2000) "Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events," Neuroscience, vol. 96, no. 4, pp. 651-656.
  • Linden, A. and Weber, F. (1993) "Implementing inner drive through competence reflection". In Proceedings of the Second International Conference on From Animals to Animats 2: Simulation of Adaptive Behavior, MIT Press, pp. 321-326.
  • Ly, O., Oudeyer, P-Y. (2010) « Acroban the humanoid: Playful and compliant physical child-robot interaction », SIGGRAPH'2010 Emerging Technologies.
  • Nguyen, M., Baranes, A., Oudeyer, P-Y. (2011) « Bootstrapping Intrinsically Motivated Learning with Human Demonstrations », in Proceedings of the IEEE International Conference on Development and Learning, Frankfurt, Germany.
  • Oudeyer, P-Y., Kaplan, F. and Hafner, V. (2007) "Intrinsic Motivation Systems for Autonomous Mental Development", IEEE Transactions on Evolutionary Computation, 11(2), pp. 265-286.
  • Rouanet, P., Oudeyer, P-Y. and Filliat, D. (2009) "An integrated system for teaching new visually grounded words to a robot for non-expert users using a mobile device", in Proceedings of the IEEE-RAS International Conference on Humanoid Robots.
  • Rouanet, P., Oudeyer, P-Y., and Filliat, D. (2010) « Using mediator objects to easily and robustly teach visual objects to a robot », in ACM SIGGRAPH Posters.
  • Rouanet, P., Danieau, F. and Oudeyer, P-Y. (2011) « A robotic game to evaluate interfaces used to show and teach visual objects to a robot in real world condition », in Proceedings of the 6th ACM/IEEE International Conference on Human-Robot Interaction (HRI 2011), Lausanne, Switzerland.


SLIDE 47

References (3)

  • J. Schmidhuber (1991) "Curious model-building control systems," in Proc. Int. Joint Conf. Neural Netw., Singapore, vol. 2, pp. 1458-1463.
  • J. Schmidhuber (2006) "Developmental Robotics, Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts". Connection Science, 18(2): 173-187.
  • S. Schaal and C. G. Atkeson (1994) "Robot juggling: an implementation of memory-based learning," Control Systems Magazine, pp. 57-71.
  • R. S. Sutton (1990) "Integrated architectures for learning, planning, and reacting based on approximating dynamic programming," in Proc. 7th Int. Conf. Mach. Learn., Washington DC, pp. 216-224.
  • A. L. Strehl, C. Mesterharm, M. L. Littman, H. Hirsh (2006) "Experience-efficient learning in associative bandit problems". ICML 2006: 889-896.
  • I. Szita, A. Lorincz (2008) "The many faces of optimism: a unifying approach". In Proceedings of ICML 2008, pp. 1048-1055.
  • S. Thrun and K. Möller (1992) "Active exploration in dynamic environments," in J. Moody, S. Hanson, and R. Lippmann, Eds., Proc. Adv. Neural Info. Process. Syst. 4, Denver, CO.
  • S. Thrun (1995) "Exploration in active learning," in Handbook of Brain Science and Neural Networks, M. Arbib, Ed. Cambridge, MA: MIT Press, pp. 381-384.
  • R. White (1959) "Motivation reconsidered: The concept of competence," Psychol. Rev., vol. 66, pp. 297-333.
  • S. Whitehead (1991) "A Study of Cooperative Mechanisms for Faster Reinforcement Learning", Univ. Rochester, Rochester, NY, Tech. Rep. TR-365.

SLIDE 48

Morphological constraints

Humans: role of the vertebral column and of the flexibility of the body (Ceccato and Cazalets, 2009; Institut de Neurosciences Cognitives et Intégratives d'Aquitaine)

Robots: (Ly and Oudeyer, SIGGRAPH 2010, Emerging Technologies)
  • Modelling and experimentation of the trunk;
  • Extension to the legs.

SLIDE 49

Morphological constraints on learning to walk: the role of compliance and of the vertebral column

Acroban (Olivier Ly):

  • Compliant structure that can absorb and store energy (elastic tendons, springs, motors)
  • Semi-passive torso with a multi-articulated vertebral column
  • Generic balancing motor primitive
  • Walking as a self-perturbation
  • A « self-organized » human-robot interface, allowing one to guide the robot intuitively by taking it by the hand

Ly, O., Oudeyer, P-Y. (2010) Acroban the humanoid: Playful and compliant physical child-robot interaction, SIGGRAPH'2010 Emerging Technologies. Videos: http://flowers.inria.fr/acroban.php

SLIDE 50

FLOWERS, INRIA-ENSTA ParisTech

Permanent members: Pierre-Yves Oudeyer (INRIA CR1, scientific lead), Manuel Lopes (INRIA CR2), David Filliat (MdC, ENSTA), Freek Stulp (MdC, ENSTA), Alexander Gepperth (MdC, ENSTA)

Administrative assistant: Nathalie Robin

Engineers: Jérôme Béchu (INRIA), Paul Fudal (INRIA), Haylee Fogg (INRIA)

Postdocs: Stéphane Bazeille (ENSTA postdoc), Thomas Degris, Clément Moulin-Frier

PhD students: Adrien Baranes (INRIA), Pierre Rouanet (INRIA), Thomas Cederborg (INRIA), Mai Nguyen (INRIA), Matthieu Lapeyre (INRIA), Jonathan Grizou (INRIA), Olivier Mangin (Bourse AMX, Polytechnique), Fabien Benureau (Bourse ENS Lyon), Islem Jebari (ENSTA), Natalia Lyubova (ENSTA), Alexandre Chapoulie (ENSTA)

SLIDE 51

Interdisciplinary collaborations (FLOWERS)

Linguistics
Louis ten Bosch, Radboud University, The Netherlands (invited professor) (models of the discovery of motor invariants, NMF approach). Benjamin Bergen, USC, US (cognitive linguistics, models of meaning representation and of affordances).

Cognitive and integrative neurosciences
Jacqueline Gottlieb, Columbia University, NY, US (intrinsic motivations, visual attention). J-R. Cazalets, Inst. Neur. Int. de Bordeaux (Acroban, physiology of the vertebral column).

Statistical inference and machine learning
Marc Toussaint, FU Berlin, Germany (probabilistic inference for decision-making and planning); Rich Sutton, Univ. Alberta, Canada (intrinsic motivation and RL); INRIA Alea, Pierre Del Moral, François Caron (Monte-Carlo methods, informal collaboration); Andrew Barto, Univ. Mass., US (RL and options theory).

Developmental psychology
IMClever European project on intrinsically motivated cumulative learning (intrinsic motivations). Philippe Rochat, Emory State University, US (discovery of body maps). Linda Smith, Indiana University, US (acquisition of symbolic communication).

Robotics
  • O. Sigaud, V. Padois, ISIR, Univ. Paris VI (operational space control)
  • F. Chaumette (LAGADIC) (ROMEO 2/PAL, robot grasper in an assistive context)
  • P. Rives (AROBAS) (SLAM)
  • M. Cakmak, Georgia Tech, US (human-robot interaction and learning)
  • Stefan Schaal, USC, US (dynamic motor primitives)
  • Companies: GOSTAI, Aldebaran Robotics, Robot Studio

Ergonomics and human factors
INRIA Iparla (interfaces). Institut de Cognitique, Bordeaux (evaluation of the interfaces).

Mechanics
Alexandre Lasserre, mechanics lab, Bordeaux (mechanical design, Acroban).

  • Collaborations already started
  • Collaborations already planned (ERC, ANR, …)
SLIDE 52

Societal and economic dimension

« As I look at the trends that are now starting to converge, I can envision a future in which robotic devices will become a nearly ubiquitous part of our day-to-day lives » (Bill Gates, Scientific American, January 2007)

  • Personal assistance
  • Ageing society
  • Education, comfort and play

Challenge: interaction and interfaces (usability and social acceptance) and adaptation (learning)

SLIDE 53

Experimental platforms

iCub (with ISIR, RobotCub Open Call in 2007), Nao, Acroban. Simulators: Webots and V-REP. All programmed within the URBI middleware framework.

SLIDE 54

A simple example of how R-IAC works

Redundant arm (DOF 1, DOF 2), 1-pixel camera. Perceived space.

Evolution of the exploration focus over time: noise is avoided, and « simple » regions are explored before complicated ones.