Learning and Sharing Knowledge for Robots
Ashutosh Saxena, CEO, Brain of Things
Interact learn share
Prepare affogato: “Take some coffee in a cup. Add ice cream of your choice. Finally add raspberry syrup to the mixture.”
Sung, Jin, Saxena: Robobarista. Misra et al., Tell Me Dave
Sung, Jin, Saxena: Robobarista, ISRR 2015
Robot
[Koppula & Saxena et al. 2011] [Socher et al. 2011] [Yao et al. 2012] [Farabet et al. 2013] [Wu et al. 2014] …
[Figure: scale from 0% to 100%, with labels stove, microwave, fridge]
Natural Language Understanding
[Walter et al. 2013] [Beetz et al. 2011] [Misra, Sung, Saxena 2014]
Moveto ($x) Grasp ($x) Toggle ($x, on) …
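The instruction templates above each take an object variable `$x` that has to be bound to something in the environment. A minimal sketch of that binding step follows; the `bind` helper, the object names, and the hand-written plan are illustrative assumptions, not part of the Tell Me Dave system.

```python
from string import Template

# Parameterized primitives as listed on the slide; "$x" is the object slot.
PRIMITIVES = {
    "moveto": Template("Moveto ($x)"),
    "grasp": Template("Grasp ($x)"),
    "toggle": Template("Toggle ($x, on)"),
}


def bind(plan):
    """Fill the $x slot of each primitive with a concrete object name."""
    return [PRIMITIVES[name].substitute(x=obj) for name, obj in plan]


# Hypothetical grounding of "Take some coffee in a cup" (assumed, not learned).
plan = [
    ("moveto", "cup"),
    ("grasp", "cup"),
    ("moveto", "coffee_machine"),
    ("toggle", "coffee_machine"),
]
instructions = bind(plan)
for step in instructions:
    print(step)
```

In a learned system the plan itself would be inferred from language and the environment; here it is written by hand only to show how the templates are instantiated.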
Manipulation Planning
[Muelling et al. 2010] [Maitin-Shepard et al. 2010]
- Motion Planners
  PRM* / RRT* [Karaman & Frazzoli 2011]
  CHOMP [Ratliff et al. 2009]
  trajopt [Schulman et al. 2013]
- Learning-based Approaches
  Markov Decision Process (MDP)
  Inverse reinforcement learning (IRL) [Ng & Russell 2000; Abbeel & Ng 2004]
  Max Margin Planning [Ratliff et al. 2006]
“Cat”
[Figure: segmentation panels labeled ground truth, image, FCN-8s]
Long et al. CVPR’15 & many more
Krizhevsky et al. NIPS’12 & many more
“Hello world” “Hallo welt”
Sutskever et al. NIPS’14 & many more
A group of people shopping at an
Vinyals et al. CVPR’15 & many more
“Hello world”
Hinton et al. IEEE speech’12 & many more
X Y
Hierarchical RNN: Du et al. CVPR’15
Recursive Neural Networks: Socher et al. ICML’11
Structured prediction with deep neural networks: Chen et al. ICLR’15; Zhang et al. CVPR’14; Tompson et al. SIGGRAPH’14; Chen et al. ICML’15; Zheng et al. ICCV’15
Combining structure with deep neural networks helps
Robots interact with the world, humans and Internet
Interact learn share
Learn shared models and representations
Update the knowledge in the cloud (RoboBrain)
Human object interaction
Koppula et al. T. PAMI’15; Li et al. ECCV’08; Gupta et al. T. PAMI’09
Manipulating food: DeepMPC. Lenz, Knepper, Saxena, RSS 2014
Appliance Manipulation: RoboBarista, Sung et al., 2015
Robot Language: Tell Me Dave, RSS’13. Tellex et al. Abbeel et al., UC Berkeley
Homes and Humans: Brain of Things, 2015
- Large variety of required movements
- Large variety of
- Large variety in environments
similar manner
Even if the robot has not seen the object before, prior motion plan can be re-used on new objects.
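One way to picture this re-use: if a trajectory is stored relative to the object part it manipulates, it can be replayed on a new part by changing only the part's pose. The sketch below assumes 2D waypoints and a hypothetical `transfer` helper; it illustrates the idea and is not the Robobarista method.

```python
import math


def transfer(waypoints, part_xy, part_angle):
    """Map waypoints from a part's local frame into the world frame.

    Each waypoint is rotated by the part's orientation and then shifted
    by the part's position (a rigid 2D transform).
    """
    c, s = math.cos(part_angle), math.sin(part_angle)
    px, py = part_xy
    return [(px + c * x - s * y, py + s * x + c * y) for x, y in waypoints]


# A straight pull recorded relative to a lever on a previously seen object.
local_traj = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]

# The same motion on a new object whose lever sits elsewhere, rotated 90 degrees.
world_traj = transfer(local_traj, part_xy=(1.0, 2.0), part_angle=math.pi / 2)
for point in world_traj:
    print(point)
```

The stored motion never changes; only the pose of the matched part does, which is what makes transfer to unseen objects plausible when a similar part can be found.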
- 116 objects (250 parts)
- 1225 crowd-sourced trajectories
[Figure: accuracy (%) at DTW-MT < 10 for chance, LSSVM + kinematic structure [50], similar task + weighting [13], our model without noise-handling, and our model]
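The accuracy threshold above is stated in terms of DTW-MT, a trajectory measure based on dynamic time warping. As a rough illustration of the underlying idea only, here is plain DTW between two waypoint sequences; the `dtw` function and the Euclidean point cost are assumptions and do not reproduce the DTW-MT definition.

```python
import math


def dtw(a, b):
    """Classic dynamic-time-warping cost between two point sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
slower = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # same path, extra pause
shifted = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]             # same shape, offset by 1

print(dtw(reference, slower))
print(dtw(reference, shifted))
```

The point of warping is that a trajectory executed at a different speed is not penalized, while a trajectory that follows a different path is.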
RoboBrain knowledge learned from:
Brain4Cars, Jain et al.
Training example
“Memory”, learning to fuse, learning to anticipate
Jain et al. ICRA’16
Method                            Precision   Recall   Time-to-maneuver (s)
Chance                            20.0        20.0     --
SVM (Morris et al. IVS’11)        43.7        37.7     1.20
SimpleRNN                         78.0        71.1     3.15
FusionRNN (Jain et al. ICRA’16)   84.5        77.1     3.58
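The slide reports precision and recall separately. For a single-number comparison, the standard F1 score (harmonic mean of the two) can be computed from the reported values; F1 itself is not given on the slide, so the numbers printed below are derived here, not quoted.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
results = {
    "Chance": (20.0, 20.0),
    "SVM": (43.7, 37.7),
    "SimpleRNN": (78.0, 71.1),
    "FusionRNN": (84.5, 77.1),
}

scores = {name: f1(p, r) for name, (p, r) in results.items()}
for name, score in scores.items():
    print(f"{name}: F1 = {score:.1f}")
```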
Jain et al. ICCV’15, ICRA’16, IJRR’16*
DeepMPC, Lenz, Knepper and Saxena, RSS 2014
Douillard et al. ISRR’07; Li et al. ECCV’08; Lezama et al. CVPR’11; Fragkiadaki et al. ICCV’15; Koppula et al. RSS’13; Jain et al. ICCV’15; Brendel et al. ICCV’11; … and many more
Human object interaction (Koppula et al. IJRR’13)
Tracking (Li et al. ECCV’08)
Modeling human motion (Taylor et al. NIPS’06)
[Figure: network with input, hidden, and output layers; inputs are inside features, outside features, and driver states]
Maneuver anticipation (Jain et al. ICCV’15)
* Scalable and flexible
* Generic and principled
* End-to-end trainable
Structural-RNN: Deep Learning on Spatio-Temporal Graphs. Jain, Zamir, Savarese, Saxena. In CVPR, 2016.
1. Factor-graph parameterization
2. Semantically partitioning the nodes
3. Modeling each factor function with RNN
4. Wiring the RNNs to form structural-RNN
Factor graph
factor function and parameters.
Factor graph
Why share factors? Incorporate priors, compactness, flexibility, generalization
Generalization to changes in environment and task
spatio-temporal interactions
Structural-RNN is a bipartite graph
Algorithm 1 From spatio-temporal graph to S-RNN
Input G = (V, ES, ET), CV = {V1, ..., VP}
Output S-RNN graph GR = ({REm}, {RVp}, ER)
1: Semantically partition edges CE = {E1, ..., EM}
2: Find factor components {ΨVp, ΨEm} of G
3: Represent each ΨVp with a nodeRNN RVp
4: Represent each ΨEm with an edgeRNN REm
5: Connect {REm} and {RVp} to form a bipartite graph.
   (REm, RVp) ∈ ER iff ∃v ∈ Vp, u ∈ V s.t. (u, v) ∈ Em
Return GR = ({REm}, {RVp}, ER)
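The steps of Algorithm 1 can be sketched as plain graph bookkeeping, with no neural networks involved yet. The function name `build_srnn`, the rule of grouping edges by edge kind and by the classes of their endpoints, and the toy human/object graph are all assumptions made for illustration.

```python
from collections import defaultdict


def build_srnn(node_class, edges):
    """Derive the bipartite S-RNN wiring from a spatio-temporal graph.

    node_class: dict mapping each node to its semantic class (partition CV).
    edges: iterable of (u, v, kind), kind being "spatial" or "temporal".
    Returns (edge_rnns, node_rnns, wiring).
    """
    # Step 1: partition edges (assumed rule: same kind, same endpoint classes).
    edge_groups = defaultdict(list)
    for u, v, kind in edges:
        pair = tuple(sorted((node_class[u], node_class[v])))
        edge_groups[(kind, pair)].append((u, v))

    # Steps 2-4: one nodeRNN per node class, one edgeRNN per edge group.
    node_rnns = sorted(set(node_class.values()))
    edge_rnns = sorted(edge_groups)

    # Step 5: wire an edgeRNN to a nodeRNN iff some edge in its group
    # touches a node of that class (edges treated as undirected).
    wiring = sorted({
        (key, node_class[n])
        for key, group in edge_groups.items()
        for u, v in group
        for n in (u, v)
    })
    return edge_rnns, node_rnns, wiring


node_class = {"h": "human", "o1": "object", "o2": "object"}
edges = [
    ("h", "o1", "spatial"), ("h", "o2", "spatial"), ("o1", "o2", "spatial"),
    ("h", "h", "temporal"), ("o1", "o1", "temporal"), ("o2", "o2", "temporal"),
]
edge_rnns, node_rnns, wiring = build_srnn(node_class, edges)
print(len(edge_rnns), "edgeRNNs,", len(node_rnns), "nodeRNNs")
for link in wiring:
    print(link)
```

Adding a third object to this toy graph changes neither the number of RNNs nor the wiring, which is one way to read the compactness and generalization arguments for sharing factors.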
Forward-pass for human node v
Forward-pass for object node w
Parameter sharing through structured feature space: Input in […] is […]
Forward-pass for v
Forward-pass for u
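A rough sketch of how a forward pass for the human node could be wired: edgeRNN outputs are concatenated with the node's own features and fed to the nodeRNN. The `TinyRNN` cell, the layer sizes, and the choice to sum the features of all human-object edges before the shared edgeRNN are assumptions made for illustration; the architecture on the slides uses FC and LSTM layers.

```python
import math
import random


class TinyRNN:
    """Minimal tanh recurrent cell: h_new = tanh(W x + U h)."""

    def __init__(self, n_in, n_hidden, rng):
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.U = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
        self.h = [0.0] * n_hidden

    def step(self, x):
        self.h = [
            math.tanh(sum(w * xi for w, xi in zip(w_row, x))
                      + sum(u * hi for u, hi in zip(u_row, self.h)))
            for w_row, u_row in zip(self.W, self.U)
        ]
        return self.h


rng = random.Random(0)
edge_rnn_ho = TinyRNN(2, 3, rng)         # shared by every human-object edge
edge_rnn_hh = TinyRNN(2, 3, rng)         # human temporal edge
node_rnn_h = TinyRNN(2 + 3 + 3, 4, rng)  # human node


def human_forward(node_feat, ho_edge_feats, hh_edge_feat):
    """One time step for the human node."""
    # Summing keeps the input size fixed for any number of objects (assumed).
    summed = [sum(col) for col in zip(*ho_edge_feats)]
    e_ho = edge_rnn_ho.step(summed)
    e_hh = edge_rnn_hh.step(hh_edge_feat)
    # The nodeRNN sees node features plus both edgeRNN outputs.
    return node_rnn_h.step(node_feat + e_ho + e_hh)


out = human_forward([0.1, 0.2], [[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
print(out)
```

Because the edge features are pooled before the shared edgeRNN, the same parameters serve a scene with two objects or with five.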
Ashesh Jain
connections between nodeRNNs and edgeRNNs
the output labels
Modeling human motion
Taylor et al. NIPS’06, CVPR’10; Sutskever et al. NIPS’09; Fragkiadaki et al. ICCV’15
Human object interaction
Koppula et al. T. PAMI’15; Li et al. ECCV’08; Gupta et al. T. PAMI’09
Maneuver anticipation
Oliver et al. IVS’00; Morris et al. IVS’11; Jain et al. ICCV’15 (AIO-HMM)
Jain et al. ICCV’15; Jain et al. ICRA’16
Modeling human motion
Taylor et al. NIPS’06, CVPR’10; Sutskever et al. NIPS’09; Fragkiadaki et al. ICCV’15
Corresponding Structural-RNN
Learns from raw-joint values
[Figure: nodeRNN and edgeRNN architectures built from FC and LSTM layers, shown at time steps t and t+1]
Flexibility in designing edgeRNNs and nodeRNNs
Figure 1: User study with five users. Each user was shown 36 forecasted motions equally divided across four activities (walking, eating, smoking, discussion) and three algorithms (S-RNN, ERD, LSTM-3LR). The plot shows the number of bad, neutral, and good motions forecasted by each algorithm.
Activity, Affordance
Corresponding Structural-RNN
S-RNN improves object affordance anticipation by 44 percentage points (F1-score).
Unlike a CRF, S-RNN does not make any Markov assumptions.
(joint detection and anticipation) further improves the performance.
                                 Detection F1-score                      Anticipation F1-score
Method                           Sub-activity (%)  Object affordance (%)  Sub-activity (%)  Object affordance (%)
Koppula et al. RSS’13, PAMI’15   80.4              81.5                   37.9              36.7
S-RNN w/o edgeRNN                82.2              82.1                   64.8              72.4
S-RNN                            83.2              88.7                   62.3              80.7
S-RNN (multi-task)               82.4              91.1                   65.6              80.9
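The 44% improvement quoted for object affordance anticipation can be checked against the table. The short calculation below uses only the two anticipation F1 values from the table and shows that the figure corresponds to the absolute difference in F1, not to a relative change.

```python
# Anticipation F1 for object affordance (%), copied from the table above.
baseline = 36.7  # Koppula et al.
srnn = 80.7      # S-RNN

absolute_gain = srnn - baseline
relative_gain = 100 * absolute_gain / baseline

print(f"absolute gain: {absolute_gain:.1f} points")
print(f"relative gain: {relative_gain:.0f}%")
```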
Knowledge engine
Misra et al. RSS’14; Tellex’s lab (Brown University); Jain et al. ICCV’15; Jain et al. ISRR’13; Sung et al. ISRR’15
Knowledge graph for sharing learned concepts
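To make the idea of a shared knowledge graph concrete, here is a minimal sketch of a store of (subject, relation, object) facts that several robots could write to and query. The `KnowledgeGraph` class, the relation names, and the example facts are invented for illustration and do not describe the RoboBrain API or its contents.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory store of (subject, relation, object) facts."""

    def __init__(self):
        self._facts = defaultdict(list)

    def add(self, subject, relation, obj, source):
        """Record a fact together with the project or robot that learned it."""
        self._facts[subject].append((relation, obj, source))

    def query(self, subject, relation=None):
        """Return facts about a subject, optionally filtered by relation."""
        return [
            fact for fact in self._facts.get(subject, [])
            if relation is None or fact[0] == relation
        ]


kg = KnowledgeGraph()
# Made-up facts, each tagged with a hypothetical source.
kg.add("cup", "has_affordance", "graspable", source="robot_a")
kg.add("cup", "can_hold", "coffee", source="robot_b")
kg.add("microwave", "has_part", "door_handle", source="robot_a")

print(kg.query("cup"))
print(kg.query("cup", relation="has_affordance"))
```

Keeping the source with each fact is one simple way to let different projects contribute to, and later audit, a common graph.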