Opening the pod bay doors building intelligent agents that can - PowerPoint PPT Presentation

Augmented state spaces States: S = S_e × S_r. Actions: A = A_e ∪ A_r. Transitions: T : S × A → S. Goal: find a policy S × X → A
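The product state space and the union of action sets can be made concrete with a small sketch. Everything here is an assumption for illustration: the grid world, the `State` fields, and the single reading action `advance` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass

# Hypothetical action sets; only the go_forward/turn_* names echo the slides.
ENV_ACTIONS = {"go_forward", "turn_left", "turn_right"}  # A_e
READ_ACTIONS = {"advance"}                                # A_r (assumed)

HEADINGS = ["N", "E", "S", "W"]
STEP = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}


@dataclass(frozen=True)
class State:
    x: int          # environment state S_e = (x, y, heading)
    y: int
    heading: str
    cursor: int     # reading state S_r: index of the sentence being read


def transition(s: State, a: str) -> State:
    """T : S x A -> S. Environment actions change S_e, reading actions change S_r."""
    if a == "go_forward":
        dx, dy = STEP[s.heading]
        return State(s.x + dx, s.y + dy, s.heading, s.cursor)
    if a == "turn_left":
        h = HEADINGS[(HEADINGS.index(s.heading) - 1) % 4]
        return State(s.x, s.y, h, s.cursor)
    if a == "turn_right":
        h = HEADINGS[(HEADINGS.index(s.heading) + 1) % 4]
        return State(s.x, s.y, h, s.cursor)
    if a == "advance":
        return State(s.x, s.y, s.heading, s.cursor + 1)
    raise ValueError(f"unknown action: {a}")


def run(s: State, actions: list) -> State:
    """Apply a mixed sequence of environment and reading actions."""
    for a in actions:
        s = transition(s, a)
    return s


final = run(State(0, 0, "N", 0), ["go_forward", "advance", "turn_left", "advance"])
```

A policy over this space would map a state (plus the instruction text) to one element of `ENV_ACTIONS | READ_ACTIONS`; the sketch only shows the transition structure.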

Augmented state spaces: training Training: max_θ p(action | text, state; θ). Evaluation: max_action p(action | text, state; θ). max E_{state | θ} R(action | state) [Branavan et al., ACL ’09]

clear the two long columns, and then the row

Augmented state spaces: better training Training: max_θ p(action | text, state; θ). Evaluation: max_action p(action | text, state; θ). max E_{state | θ} R(action | state)

Learning the reading state Move into the living room. Go forward then face the sofa. go_forward turn_left turn_left go_forward turn_right

Learning the reading state Key idea: move “reading state” into the hidden state of an RNN. [Mei et al., AAAI ’16]

Learning the reading state Training: max_θ p(action | text, state; θ). Evaluation: max_action p(action | text, state; θ). max E_{state | θ} R(action | state)

human: Walk past hall table. Walk into bedroom. Make left at table clock. Wait at bathroom door threshold.

Approach 2: predicting constraints

Actions, goals, constraints Find a table next to a chair. go_forward go_forward turn_left go_forward turn_left

Actions, goals, constraints [Find] [a table] [next to] [a chair]. go_forward go_forward turn_left go_forward turn_left

Actions, goals, constraints [Find] [a table] [next to] [a chair].

Actions, goals, constraints Key idea: predict constraints rather than action sequences, and let a planner do the rest of the work.
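The division of labour described here can be sketched as follows: a (hypothetical) parser outputs a goal constraint, and a generic planner searches for actions that satisfy it. The grid, the object positions, and the breadth-first planner are all assumptions made for illustration, not the method of any cited paper.

```python
from collections import deque

HEADINGS = ["N", "E", "S", "W"]
STEP = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}

# Hypothetical world: object name -> grid cell.
OBJECTS = {"table": (2, 2), "chair": (3, 2)}


def adjacent(a, b):
    """True when two cells are one step apart (Manhattan distance 1)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1


def satisfied(state):
    """Assumed constraint for 'Find a table next to a chair': stand on such a table's cell."""
    x, y, _ = state
    return (x, y) == OBJECTS["table"] and adjacent(OBJECTS["table"], OBJECTS["chair"])


def successors(state, width, height):
    """Yield (action, next_state) pairs for a state (x, y, heading)."""
    x, y, h = state
    i = HEADINGS.index(h)
    yield "turn_left", (x, y, HEADINGS[(i - 1) % 4])
    yield "turn_right", (x, y, HEADINGS[(i + 1) % 4])
    dx, dy = STEP[h]
    nx, ny = x + dx, y + dy
    if 0 <= nx < width and 0 <= ny < height:
        yield "go_forward", (nx, ny, h)


def plan(start, goal_test, width, height):
    """Breadth-first search for a shortest action sequence reaching a goal state."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, actions = frontier.popleft()
        if goal_test(state):
            return actions
        for action, nxt in successors(state, width, height):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, actions + [action]))
    return None  # no state satisfies the constraint


actions = plan((0, 0, "N"), satisfied, width=4, height=4)
```

The language model only has to get the constraint right; the low-level action sequence falls out of search.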

Predicting constraints [Find] [a table] [next to] [a chair]. [Figure: graph with nodes x1, x2, x3, x5, x6. Successive builds mark candidate groundings for the phrases (x1? x3? x4?, then x6? x5?, then x6? adj x5?), then unlabeled slots (? ? ?), and finally the label types obj? rel? obj?]

Learning a constraint parser max_θ p(labels | text, graph; θ) [Find] [a table] [next to] [a chair]. [Figure: graph with nodes x1, x2, x3, x5, x6; labels x6? adj x5?, label types obj? rel? obj?]

Inferring constraints max_labels p(labels | text, graph; θ) [Find] [a table] [next to] [a chair]. [Figure: graph with nodes x1, x2, x3, x5, x6; labels x6? adj x5?, label types obj? rel? obj?]

Inferring constraints max_labels p(labels | text, graph; θ) [Put] [the cup] [on] [the table]. [Figure: graph with nodes x1, x2, x3, x5, x6; labels x6? adj x5?, label types obj? rel? obj?] [Tellex et al., NCAI ’11]

Logical constraint languages max_θ p(constraint | text; θ). max_constraint p(constraint | text; θ). Find a table next to a chair. at(x1), table(x1), next_to(x1, x2), chair(x2)
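To see what a logical constraint buys at execution time, here is a minimal sketch that grounds the conjunction table(x1), next_to(x1, x2), chair(x2) in a toy world by enumerating variable bindings. The world, the entity names, and the `PREDICATES` table are hypothetical; at(x1) is treated as marking x1 as the goal location rather than as a predicate to check.

```python
from itertools import product

# Hypothetical world: entity id -> type and grid position.
WORLD = {
    "e1": {"type": "table", "pos": (0, 0)},
    "e2": {"type": "chair", "pos": (5, 5)},
    "e3": {"type": "table", "pos": (4, 5)},
}


def _next_to(world, a, b):
    (ax, ay), (bx, by) = world[a]["pos"], world[b]["pos"]
    return abs(ax - bx) + abs(ay - by) == 1


# Assumed predicate inventory, keyed by the names used in the constraint.
PREDICATES = {
    "table": lambda world, a: world[a]["type"] == "table",
    "chair": lambda world, a: world[a]["type"] == "chair",
    "next_to": _next_to,
}


def bindings(constraints, world):
    """Return every assignment of entities to variables that satisfies all constraints."""
    variables = sorted({v for _, *args in constraints for v in args})
    results = []
    for values in product(sorted(world), repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(PREDICATES[name](world, *(env[v] for v in args))
               for name, *args in constraints):
            results.append(env)
    return results


constraint = [("table", "x1"), ("next_to", "x1", "x2"), ("chair", "x2")]
solutions = bindings(constraint, WORLD)
goal_entity = solutions[0]["x1"]  # at(x1): the agent should end up at this entity
```

Exhaustive enumeration is exponential in the number of variables, which is acceptable for a two-variable example but not a claim about how any cited system does inference.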

Logical constraint languages (a) chair: λx.chair(x). (b) hall: λx.hall(x). (c) the chair: ιx.chair(x). (d) you: you. (e) blue hall: λx.hall(x) ∧ blue(x). (f) chair in the intersection: λx.chair(x) ∧ intersect(ιy.junction(y), x). (g) in front of you: λx.in_front_of(you, x). [Figure: each expression is shown with the map positions it denotes.] [Artzi et al., TACL ’13]

Constraints without logic Find a table next to a chair. go_forward turn_left turn_left go_forward turn_right [Figure: panoramic action space ("go towards this direction!") versus low-level visuomotor space ("go forward", "turn left") for the instruction "… Turn left and go towards the sofa ..."]

Constraints without logic Key idea: use freeform learned potential functions rather than symbolic constraints [Andreas & Klein, EMNLP ’16]

Constraints without logic Find a table next to a chair. alignment [Objectives on the slide, only partly recoverable: max f(plan, alignment | text; θ), appearing twice with subscripts θ, plan, and alignment, and a sum ∑ f(plan′, alignment′ | text; θ)] [Figure: panoramic action space ("go towards this direction!") versus low-level visuomotor space ("go forward", "turn left") for the instruction "… Turn left and go towards the sofa ..."]

Constraints without logic Clear the columns, then the row

Constraints without logic Clear the columns, then the row (no “column”!)

[Janner et al., TACL ’18]

Our toolkit so far

Instruction following Act in complex environments: with expressive policies that condition on instructions and observations. Track progress over time: in the underlying state space or RNN state. Plan ahead and reason about outcomes: with a symbolic planner or learned cost function.

What else can we do?

Application: instruction generation

Instruction following Move into the living room. Go forward then face the sofa. go_forward turn_left turn_left go_forward turn_right [Figure: panoramic action space versus low-level visuomotor space for the instruction "… Turn left and go towards the sofa ..."]

Instruction generation go_forward turn_left turn_left go_forward turn_right Move into the living room. Go forward then face the sofa. [Same figure; on the slide the title "Instruction following" is changed to "Instruction generation"]

Prediction action sequences find a sofa go_forward turn_left turn_left go_forward turn_right [Same figure]

Instruction generation Key idea: a good instruction gets readers to their goal with high probability (whatever the training data says!)

Instruction generation Max posterior probability: max_text p(text | plan; θ) (“how do people describe this?”)

Instruction generation Max posterior probability: max_text p(text | plan; θ) (“how do people describe this?”). Min Bayes risk: max_text p(plan | text; θ) (“how do I make people do this?”)
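The two selection rules can disagree, which a toy example makes visible. The candidate sentences and every probability below are invented for illustration; `speaker` stands in for p(text | plan; θ) and `listener` for p(plan | text; θ), neither of which is a real trained model.

```python
# Hypothetical speaker model: p(text | plan = "turn_left").
speaker = {
    "I will make a turn.": 0.6,
    "I will turn left at the brick intersection.": 0.4,
}

# Hypothetical listener model: p(plan | text) for each candidate text.
listener = {
    "I will make a turn.": {"turn_left": 0.5, "turn_right": 0.5},
    "I will turn left at the brick intersection.": {"turn_left": 0.95, "turn_right": 0.05},
}


def map_instruction(speaker_probs):
    """First rule: pick the text people are most likely to produce for the plan."""
    return max(speaker_probs, key=speaker_probs.get)


def listener_instruction(plan, candidates, listener_probs):
    """Second rule: pick the text most likely to make a listener recover the plan."""
    return max(candidates, key=lambda text: listener_probs[text].get(plan, 0.0))


by_speaker = map_instruction(speaker)
by_listener = listener_instruction("turn_left", list(speaker), listener)
```

With these made-up numbers the speaker-only rule prefers the shorter, ambiguous sentence, while the listener-based rule prefers the one that pins down the plan.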

Reasoning about outcomes max_text p(plan | text; θ). “I will make a turn.” Instruction follower

Reasoning about outcomes max_text p(plan | text; θ). “I will make a turn.” Listener

Reasoning about outcomes max_text p(plan | text; θ). “I will go straight through.” Listener

Reasoning about outcomes max_text p(plan | text; θ). “I will turn left at the brick intersection.” Listener [Fried et al., NAACL ’18]

Reasoning about belief I will turn left at the brick intersection. [Frank & Goodman, Trends in Cog. Sci. ’12]

speaker: Walk past the dining room table and chairs and wait there. listener: Walk past the dining room table and chairs and take a right into the living room. Stop once you are on the rug. human: Turn right and walk through the kitchen. Go right into the living room and stop by the rug.

Application: machine teaching

Instructions as scaffolds for RL

Instructions as parameter-tying schemes

Instructions as parameter-tying schemes Environment states S_e, reading states S_r. Environment actions A_e, reading actions A_r. Go forward then face the sofa. s1, s3, turn_right. [Figure: panoramic action space ("go towards this direction!") versus low-level visuomotor space ("turn left", "go forward") for the instruction "… Turn left and go towards the sofa ..."]

Instructions as parameter-tying schemes go north, go east, go south; go north, go east, go north, …; go north, go north, go west

Go north. Go east. Go north. [Andreas et al., ICML ’17]
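One way to read "parameter tying" is that every occurrence of the same instruction step, in any task, points at the same learnable module. The sketch below illustrates only that sharing pattern; `SubPolicy`, `SketchLibrary`, and the update counter standing in for parameters are hypothetical and are not taken from the cited paper.

```python
class SubPolicy:
    """Stand-in for a learnable sub-policy; `updates` plays the role of its parameters."""

    def __init__(self, name):
        self.name = name
        self.updates = 0

    def update(self):
        # A real system would take a gradient step here.
        self.updates += 1


class SketchLibrary:
    """Maps each instruction step to one shared SubPolicy, created on first use."""

    def __init__(self):
        self.modules = {}

    def module(self, step):
        if step not in self.modules:
            self.modules[step] = SubPolicy(step)
        return self.modules[step]

    def policy_for(self, steps):
        """Compose a task policy as the sequence of shared modules named by its steps."""
        return [self.module(step) for step in steps]


library = SketchLibrary()
task_a = library.policy_for(["go north", "go east", "go south"])
task_b = library.policy_for(["go north", "go north", "go west"])

# Training on task A touches the "go north" module that task B also uses.
for sub_policy in task_a:
    sub_policy.update()
```

Because `task_a[0]` and `task_b[0]` are the same object, anything learned about "go north" in one task is immediately available in the other.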

Learning interactively from corrections

Supervision s1 s3 turn_right go_forward s4 s0

Conditioning on the past Push the chair against the wall. go_forward grasp turn_left go_forward release

Opening the pod bay doors: building intelligent agents that can interpret, generate and learn from natural language. Jacob Andreas, MIT / Microsoft. web.mit.edu/jda/www / @jacobandreas. Following natural language instructions
