Program Guided Agent
ICLR 2020 (Spotlight)
Shao-Hua Sun, Te-Lin Wu, Joseph J. Lim

Follow an Instruction to Solve a Complex Task
Recipe: cooking fried rice
Stir-fry the onions until tender, and repeat this for garlic and carrots, if you have soy sauce, add some. Pour 2/3 cups the whisked eggs into the stir-fried and scramble.

Natural Language Instruction
Recipe: cooking fried rice
Stir-fry the onions until tender, and repeat this for garlic and carrots, if you have soy sauce, add some. Pour 2/3 cups the whisked eggs into the stir-fried and scramble.
Ambiguities in Language
• Scoping
• Coreferences
• Entities
Bahdanau et al. in ICLR 2019
Misra et al. “Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction” in EMNLP 2018
Anderson et al. “Vision-and-language navigation: Interpreting visually-grounded navigation instructions in real environments” in CVPR 2018
Misra et al. “Mapping Instructions and Visual Observations to Actions with Reinforcement Learning” in EMNLP 2017
Hermann et al. “Grounded Language Learning in a Simulated 3D World” in arXiv 2017

Program
Function: cooking fried rice
for item in [onions, garlic, carrots]: if is_there("soy sauce"): add("soy sauce", "pot") while not tender(item): stir_fry(item) pour(whisked("eggs"), "pot", 0.66) scramble("eggs")
Advantages of Programs
• Explicit scoping
• Resolved Coreferences
• Resolved Entities

Problem Formulation
[Figure: panels labeled Program, State, and Execution; the State and Execution panels carry the labels x3, x1, x0]

Exemplar Instructions: Programs and Natural Language Instructions
Program 1: def Task(): if is_there[River]: mine(Wood) build_bridge() if agent[Iron] < 3: mine(Iron) place(Iron, 2, 3) else: goto(4, 2) while env[Gold] > 0: mine(Gold)
Program 2: def Task(): if is_there[River]: build_bridge() place(Gold, 3, 4) if agent[Gold] == 13: while agent[Gold] <= 12: place(Gold, 8, 3) if agent[Iron] >= 8: place(Wood, 2, 4) elif env[Gold] <= 10: sell(Iron)

End-to-end Learning Baseline
[Figure: architecture diagram. Labels: Program OR NL Instruction, Program Interpreter, Perception Module, Query, Response, Goal, State, Policy, Action, Environment. Example program: def run(): while env[Gold] > 0: mine(Gold) if is_there[River]: build_bridge() place(Wood, 2, 3)]

Program Guided Agent
[Figure: architecture diagram. Labels: Program, Program Interpreter, Perception Module, Query, Response, Goal, State, Policy, Action, Environment. Example program: def run(): while env[Gold] > 0: mine(Gold) if is_there[River]: build_bridge() place(Wood, 2, 3)]

Program Interpreter
Parses a given program into 3 categories:
• Subtasks (actions): what the agent should perform
• Perception: information from the environment
• Control flow: decides which subtasks to call according to the perceived information
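The three categories above can be illustrated with a minimal sketch of an interpreter loop. This is not the authors' implementation: the `Subtask`, `If`, and `While` node types, the `interpret` function, and the toy `world` dictionary are all hypothetical names introduced here, and the nesting of the example program is only one possible reading of the slide. Control flow is handled by the interpreter itself, perception queries go to a `perceive` callback, and subtasks go to an `act` callback that stands in for the policy.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Subtask:
    goal: str  # symbolic subtask, e.g. "mine(Gold)"


@dataclass
class If:
    query: str  # symbolic perception query, e.g. "is_there[River]"
    then: List["Node"]
    orelse: List["Node"] = field(default_factory=list)


@dataclass
class While:
    query: str
    body: List["Node"]


Node = Union[Subtask, If, While]


def interpret(program: List[Node],
              perceive: Callable[[str], bool],
              act: Callable[[str], None],
              max_subtasks: int = 100) -> List[str]:
    """Walk the program and return the subtasks dispatched, in order."""
    trace: List[str] = []

    def run(block: List[Node]) -> None:
        for node in block:
            if isinstance(node, Subtask):
                # Subtask: hand the goal to the policy stand-in.
                if len(trace) >= max_subtasks:
                    raise RuntimeError("subtask budget exceeded")
                act(node.goal)
                trace.append(node.goal)
            elif isinstance(node, If):
                # Control flow: the branch depends on a perception answer.
                run(node.then if perceive(node.query) else node.orelse)
            elif isinstance(node, While):
                while perceive(node.query):
                    run(node.body)
            else:
                raise TypeError(f"unknown node: {node!r}")

    run(program)
    return trace


# Toy world (hypothetical): two pieces of gold and a river.
world = {"gold": 2, "river": True}


def perceive(query: str) -> bool:
    answers = {
        "env[Gold] > 0": world["gold"] > 0,
        "is_there[River]": world["river"],
    }
    return answers[query]


def act(goal: str) -> None:
    if goal == "mine(Gold)":
        world["gold"] -= 1


program = [
    While("env[Gold] > 0", [Subtask("mine(Gold)")]),
    If("is_there[River]", [Subtask("build_bridge()")]),
    Subtask("place(Wood, 2, 3)"),
]
trace = interpret(program, perceive, act)
print(trace)
```

Keeping `perceive` and `act` as callbacks mirrors the modular split on the slide: the interpreter never inspects the state itself, so either module could be swapped out without touching the control-flow logic.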

Perception Module
Extracts environmental information for choosing a path in a program
• Input
  • Query: a symbolically represented query (e.g. is_there[River])
  • State s: environment map and agent inventory status
• Output
  • Predicted answer to the query (e.g. True/False)
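The input/output contract above can be sketched with a hand-written stand-in. The slide describes a module that predicts the answer; the code below instead computes it exactly from a toy state, purely to show the interface. The function name `answer_query`, the grid-of-strings map, and the inventory dictionary are assumptions made for this example, and only the query forms that appear on the slides (`is_there[X]`, `env[X] op n`, `agent[X] op n`) are handled.

```python
import operator
import re
from typing import Dict, List

# Comparison operators seen in the slide programs.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq}

# Longer operators come first so "<=" is not matched as "<".
QUERY = re.compile(
    r"^(is_there|env|agent)\[\s*(\w+)\s*\]\s*(?:(<=|>=|==|<|>)\s*(\d+))?$"
)


def answer_query(query: str,
                 env_map: List[List[str]],
                 inventory: Dict[str, int]) -> bool:
    """Answer a symbolic query from the state (oracle stand-in, not learned)."""
    match = QUERY.match(query.strip())
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    kind, item, op, number = match.groups()
    count_in_env = sum(row.count(item) for row in env_map)
    if kind == "is_there":
        if op is not None:
            raise ValueError("is_there takes no comparison")
        return count_in_env > 0
    if op is None:
        raise ValueError(f"{kind} query needs a comparison")
    # env[...] counts items on the map; agent[...] counts items held.
    count = count_in_env if kind == "env" else inventory.get(item, 0)
    return OPS[op](count, int(number))


env_map = [["Grass", "River"],
           ["Gold", "Gold"]]
inventory = {"Iron": 2}

print(answer_query("is_there[River]", env_map, inventory))
print(answer_query("env[ Gold ] > 0", env_map, inventory))
print(answer_query("agent[Iron] < 3", env_map, inventory))
```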

Policy
Takes low-level actions in the environment to fulfill a subtask
• Input
  • Symbolically represented subtask (goal) g
  • State s
• Output
  • Predicted action distribution
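To make "predicted action distribution" concrete, here is a small sketch that maps a goal and a state to probabilities over actions. It is a hand-written heuristic, not a trained policy: the four-action set, the coordinate convention, the `action_distribution` name, and the softmax over negative Manhattan distance are all assumptions chosen for illustration, with the goal taken to be a `goto(x, y)` subtask and the state reduced to the agent's position.

```python
import math
from typing import Dict, Tuple

# Hypothetical low-level action set: name -> (dx, dy).
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def action_distribution(goal_xy: Tuple[int, int],
                        agent_xy: Tuple[int, int],
                        temperature: float = 1.0) -> Dict[str, float]:
    """Softmax over actions, scoring each by distance to the goal after it."""
    scores = {}
    for name, (dx, dy) in ACTIONS.items():
        nx, ny = agent_xy[0] + dx, agent_xy[1] + dy
        distance = abs(goal_xy[0] - nx) + abs(goal_xy[1] - ny)
        scores[name] = -distance / temperature
    # Subtract the maximum score for numerical stability.
    top = max(scores.values())
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Subtask goto(4, 2) with the agent at (1, 2).
dist = action_distribution(goal_xy=(4, 2), agent_xy=(1, 2))
best = max(dist, key=dist.get)
print(best, dist)
```

Returning a distribution rather than a single action keeps the interface compatible with sampling-based training, which is the usual reason a policy exposes probabilities.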

Result

Conclusion
• Specify tasks using programs
  Program: def Task(): if is_there[River]: mine(Wood) build_bridge() if agent[Iron] < 3: mine(Iron) place(Iron, 2, 3) else: goto(4, 2) while env[Gold] > 0: mine(Gold)
• Leverage the structure of programs with a modular framework
  [Figure: the Program Guided Agent architecture, with Program Interpreter, Perception Module, Policy, and Environment]

Thank You for Your Attention
