Dialog as a Vehicle for Lifelong Learning of Grounded Language - PowerPoint PPT Presentation

Summary • Jointly improving a semantic parser and dialog policy from human interactions is more effective than improving either alone. • The training procedure needs to enable changes in components to be propagated to each other for joint learning to be effective. 42

Outline • Background • Integrating Learning of Dialog Strategies and Semantic Parsing (Padmakumar et.al., 2017) • Opportunistic Active Learning for Grounding Natural Language Descriptions (Thomason et. al., 2017) • Learning a Policy for Opportunistic Active Learning (Padmakumar et. al., 2018) • Dialog Policy Learning for Joint Clarification and Active Learning Queries (Padmakumar and Mooney, in submission) • Summary • New Directions (Padmakumar and Mooney, RoboDial 2020) 43

Opportunistic Active Learning for Grounding Natural Language Descriptions [Thomason et. al., 2017] Bring the blue mug from Alice’s office Semantic Grounding Understanding Dialog Policy Natural Where should I bring a Language blue mug from? Generation 44

Opportunistic Active Learning • A framework for incorporating active learning queries into test time interactions. • Agent asks locally convenient questions during an interactive task to collect labeled examples for supervised learning. • Questions may not be useful for the current interaction but expected to help future tasks. 45

Opportunistic Active Learning Bring the blue mug from Alice’s office Blue? 46

Opportunistic Active Learning Bring the blue mug from Alice’s office Would you use the word “blue” to refer to this object? Yes 47

Opportunistic Active Learning Bring the blue mug from Alice’s office bring( ,3502) Heavy? Tall? 48

Opportunistic Active Learning Bring the blue mug from Alice’s office Would you use the word “ tall ” to refer to this object? Yes 49

Opportunistic Active Learning Query for labels most likely to improve the model. ? 50

Opportunistic Active Learning Why ask off-topic queries? • Robot may have good models for on-topic concepts. • No useful on-topic queries. • Some off-topic concepts may be more important because they are used in more interactions. 51

Opportunistic Active Learning - Challenges Some other object might be a better candidate for the question Purple? 52

Opportunistic Active Learning - Challenges The question interrupts another task and may be seen as unnatural Bring the blue mug from Alice’s office Would you use the word “ tall ” to refer to this object? 53

Opportunistic Active Learning - Challenges The information needs to be useful for a future task. Red? 54

Object Retrieval Task 55

Object Retrieval Task • User describes an object in the active test set • Robot needs to identify which object is being described 56

Object Retrieval Task • Robot can ask questions about objects on the sides to learn object attributes 57

Two Types of Questions 58

Two Types of Questions 59

Experimental Conditions A yellow water bottle • Baseline (on-topic) - the robot can only ask about “yellow”, “water” and “bottle” • Inquisitive (on and off topic) - the robot can ask about any concept it knows, possibly “red” or “heavy” 60

Results • Inquisitive robot performs better at understanding object descriptions. • Users find the robot more comprehending, fun and usable in a real-world setting, when it is opportunistic. 61

Learning a Policy for Opportunistic Active Learning [Padmakumar et. al., 2018] Bring the blue mug from Alice’s office Semantic Grounding Understanding Dialog Policy Natural Where should I bring a Language blue mug from? Generation 63

Opportunistic Active Learning Bring the blue mug from Alice’s office Would you use the word “ tall ” to refer to this object? Yes 64

Dialog Policy Learning Bring the blue mug from Alice’s office bring( ,3502) Heavy? Tall? 65

Learning a Policy for Opportunistic Active Learning Learn a dialog policy that decides how many and which questions to ask to improve grounding models. 66

Learning a Policy for Opportunistic Active Learning To learn an effective policy, the agent needs to learn – To identify good queries in the opportunistic setting. – When a guess is likely to be successful. – To trade off between model improvement and task completion. 67

Task Setup Target Description 68

Task Setup 69

Task Setup 70

Grounding Model A white umbrella {white, umbrella} white/ not white SVM Pretrained CNN umbrella/ not umbrella SVM 71

Opportunistic Active Learning • Agent starts with no classifiers. • Labeled examples are acquired through questions and used to train the classifiers. • Agent needs to learn a policy to balance active learning with task completion. 72

MDP Model Action: Dialog Agent ● Label query: <yellow, train_1> State: ● Label query: <yellow, train_2> ● Target description ● … ● Active train and test ● Label query: <white, train_1> Max correct guesses objects Reward : ● Label query: <white, train_2> with short dialogs ● Agent’s perceptual ● ... classifiers ● Example Query: yellow ● Example query: white User ● ... ● Guess 73

Challenges Action: Dialog Agent ● Label query: <yellow, train_1> State: ● Label query: <yellow, train_2> ● Target description ● … ● Active train and test ● Label query: <white, train_1> Max correct guesses objects Reward : ● Label query: <white, train_2> with short dialogs ● Agent’s perceptual ● ... classifiers ● Example Query: yellow ● Example query: white User ● ... ● Guess How to represent classifiers for policy learning? 74

Challenges Action: Dialog Agent ● Label query: <yellow, train_1> State: ● Label query: <yellow, train_2> ● Target description ● … ● Active train and test ● Label query: <white, train_1> Max correct guesses objects Reward : ● Label query: <white, train_2> with short dialogs ● Agent’s perceptual ● ... classifiers ● Example Query: yellow ● Example query: white User ● ... ● Guess How to handle a variable and growing action space? 75

Tackling challenges • Features based on active learning metrics – Representing classifiers • Featurize state-action pairs – Variable number of actions and classifiers • Sampling a beam of promising queries – Large action space 76

Feature Groups • Query features - Active learning metrics used to determine whether a query is useful • Guess features - Features that use the predictions and confidences of classifiers to determine whether a guess will be correct 77

Experiment Setup • Policy learning using REINFORCE. • Baseline - A hand-coded dialog policy that asks a fixed number of questions selected using the sampling distribution that provides candidates to the learned policy. 78

Experiment Phases • Initialization - Collect experience using the baseline to initialize the policy. • Training - Improve the policy from on-policy experience. • Testing - Policy weights are fixed, and we run a new set of interactions, starting with no classifiers, over an independent test set with different predicates. 79

Results ● Systems evaluated on dialog success rate and average dialog length. 80

Results ● Systems evaluated on dialog success rate and average dialog length. ● We prefer high success rate and low dialog length (top left corner) 81

Results ● Learned policy is Learned more successful than the baseline, while also using shorter dialogs on average. Static 82

Results ● If we ablate either Learned group of features, the success rate drops considerably - Query but dialogs are also much shorter. - Guess Static ● In both cases, the system chooses to ask very few queries. 83

Summary • We can learn a dialog policy that learns to acquire knowledge of predicates through opportunistic active learning. • The learned policy is more successful at object retrieval than a static baseline, using fewer dialog turns on average. 84

Outline • Dialog Policy Learning for Joint Clarification and Active Learning Queries – Dialog Policy Learning for Joint Clarification and Active Learning Queries (Padmakumar and Mooney, in submission) – Human Evaluation – Extension to Joint Embedding Based Grounding Model 86

Dialog Policy Learning for Joint Clarification and Active Learning Queries [Padmakumar and Mooney, Bring the blue mug in submission] from Alice’s office Semantic Grounding Understanding Dialog Policy Natural Where should I bring a Language blue mug from? Generation 87

Previous Work Bring the blue mug from Alice’s office bring( ,3502) Heavy? Tall? 88

This Work Bring the blue mug from Alice’s office bring(●,3502) Heavy? Tall? 89

This Work Bring the blue mug from Alice’s office What should I bring? Would you use the word “tall” to refer to this object? 90

Dialog Policy Learning for Joint Clarification and Active Learning Queries Opportunistic Clarification Active Learning This Work Dialog Policy Learning 91

Dialog Policy Learning for Joint Clarification and Active Learning Queries Learn a dialog policy to trade off - • Model improvement with opportunistic active learning to better understand future commands • Clarification to better understand and complete the current command 92

Attribute Based Clarification: Motivation Bring the blue mug from Alice’s office bring(●, 3502) What should I bring? 93

Attribute Based Clarification: Motivation Bring the blue mug from Alice’s office What should I bring? The blue coffee mug What should I bring? 94

Attribute Based Clarification: Motivation Bring the blue mug from Alice’s office Is this the object I should bring? No Is this the object I should bring? 95

Attribute Based Clarification: Motivation [Das, et. al., 2017] [De Vries et. al., 2017] 96

Attribute Based Clarification • More specific than a new description. • More general than showing each possible object. • Provide ground truth answers to questions for training in simulation. • Attribute - any property that can be used in a description - categories, colors, shapes, domain specific properties. 97

Attribute Based Clarification: Motivation Bring the blue mug from Alice’s office Is the object I should bring a cup? 98

Task Setup • Motivated by an online shopping application • Use clarifications to help refine search queries • Use active learning to improve the model retrieving images. 99

Dataset • We simulate dialogs using the iMaterialist Fashion Attribute dataset. • Images have associated product titles and are annotated with binary labels for 228 attributes. • Attributes: Dress, Shirt, Red, Blue, V-Neck, Pleats, ... 100

Dialog as a Vehicle for Lifelong Learning of Grounded Language - PowerPoint PPT Presentation

Dialog as a Vehicle for Lifelong Learning of Grounded Language Understanding Systems Aishwarya Padmakumar Doctoral Dissertation Defense 1 Grounded Language Understanding Mapping natural language to real-world entities Bring the blue mug

Advanced NLU & Dialog Models Ling575 Spoken Dialog Systems April 21, 2016 Roadmap

Continually Improving Grounded Natural Language Understanding through Human-Robot Dialog Jesse

Speech Processing 15-492/18-492 Spoken Dialog Systems Advanced Concepts in Dialog Spoken Dialog

Response-based Learning for Grounded Grounded SMT Riezler, Machine Translation Simianer, Haas

Lear Learning M ning Multi ulti-Moda Modal l Grounded Lingu Grounded Linguistic istic

Dialog as a Vehicle for Lifelong Learning Aishwarya Padmakumar, Raymond J. Mooney Department of

Lifelong Learning CS 330 Plan for Today The lifelong learning problem statement Basic approaches

30/06/2017 Universit de La Rochelle 1. Lifelong Learning in France 2. The Blue Biotechnology

AI DIALOG SEARCH news services Josef Krupi ka Michal Svoboda Goals dialog system

Dialog Models 11-716 September 18, 2003 Thomas Harris What is a (dialog) model? A model is

Dialog Management EE596/LING580 -- Conversational Artificial Intelligence Hao Cheng University

Wrapping Up Ling575 Spoken Dialog Systems June 5, 2013 Roadmap Overview Distinctive

SDS: ASR, NLU, & VXML Ling575 Spoken Dialog April 14, 2016 Roadmap Dialog System

V2X Technology Date: 11 th May 2013 What is V2X? V2X V2X - Vehicle to Vehicle or Vehicle to

Regret Bounds for Lifelong Learning Pierre Alquier Groupe de Travail de Machine learning du CMLA

Outline Introduction Definition History Features When should Grounded Theory be used? Types

Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition from a Domain Adaptation

Interpreting Interpretations: Organizing Attribution Methods by Criteria Zifan Wang, Piotr

What Can Neural Networks Teach us about Language? Graham Neubig a2-dlearn 11/18/2017 Supervised

Lake Tahoe West Science Symposium Day 1: Tuesday May 19, 9:00 am 2:00 pm Day 2: Friday May

EEE 6503 LASER T HEORY C HAPTER -7:: F AST P ULSE P RODUCTION C HAPTER -8:: N ONLINEAR O PTICS

Abnormal Uterine Bleeding: Evaluation of Premenopausal Women Vanessa Jacoby, MD, MAS Assistant

O N S UBNORMAL F LOATING P OINT AND A BNORMAL T IMING Marc Andrysco, David Kohlbrenner, Keaton

Anomaly Analysis and Diagnosis for Co-located Datacenter Workloads in the Alibaba Cluster

Sambuz

Useful Links

Newsletter

Mail Us

Dialog as a Vehicle for Lifelong Learning of Grounded Language - PowerPoint PPT Presentation

Dialog as a Vehicle for Lifelong Learning of Grounded Language Understanding Systems Aishwarya Padmakumar Doctoral Dissertation Defense 1 Grounded Language Understanding Mapping natural language to real-world entities Bring the blue mug

Advanced NLU &amp; Dialog Models Ling575 Spoken Dialog Systems April 21, 2016 Roadmap

Continually Improving Grounded Natural Language Understanding through Human-Robot Dialog Jesse

Speech Processing 15-492/18-492 Spoken Dialog Systems Advanced Concepts in Dialog Spoken Dialog

Response-based Learning for Grounded Grounded SMT Riezler, Machine Translation Simianer, Haas

Lear Learning M ning Multi ulti-Moda Modal l Grounded Lingu Grounded Linguistic istic

Dialog as a Vehicle for Lifelong Learning Aishwarya Padmakumar, Raymond J. Mooney Department of

Lifelong Learning CS 330 Plan for Today The lifelong learning problem statement Basic approaches

30/06/2017 Universit de La Rochelle 1. Lifelong Learning in France 2. The Blue Biotechnology

AI DIALOG SEARCH news services Josef Krupi ka Michal Svoboda Goals dialog system

Dialog Models 11-716 September 18, 2003 Thomas Harris What is a (dialog) model? A model is

Dialog Management EE596/LING580 -- Conversational Artificial Intelligence Hao Cheng University

Wrapping Up Ling575 Spoken Dialog Systems June 5, 2013 Roadmap Overview Distinctive

SDS: ASR, NLU, &amp; VXML Ling575 Spoken Dialog April 14, 2016 Roadmap Dialog System

V2X Technology Date: 11 th May 2013 What is V2X? V2X V2X - Vehicle to Vehicle or Vehicle to

Regret Bounds for Lifelong Learning Pierre Alquier Groupe de Travail de Machine learning du CMLA

Outline Introduction Definition History Features When should Grounded Theory be used? Types

Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition from a Domain Adaptation

Interpreting Interpretations: Organizing Attribution Methods by Criteria Zifan Wang, Piotr

What Can Neural Networks Teach us about Language? Graham Neubig a2-dlearn 11/18/2017 Supervised

Lake Tahoe West Science Symposium Day 1: Tuesday May 19, 9:00 am 2:00 pm Day 2: Friday May

EEE 6503 LASER T HEORY C HAPTER -7:: F AST P ULSE P RODUCTION C HAPTER -8:: N ONLINEAR O PTICS

Abnormal Uterine Bleeding: Evaluation of Premenopausal Women Vanessa Jacoby, MD, MAS Assistant

O N S UBNORMAL F LOATING P OINT AND A BNORMAL T IMING Marc Andrysco, David Kohlbrenner, Keaton

Anomaly Analysis and Diagnosis for Co-located Datacenter Workloads in the Alibaba Cluster

Sambuz

Useful Links

Newsletter

Mail Us

Advanced NLU & Dialog Models Ling575 Spoken Dialog Systems April 21, 2016 Roadmap

SDS: ASR, NLU, & VXML Ling575 Spoken Dialog April 14, 2016 Roadmap Dialog System