182
Outline
Morning program Preliminaries Semantic matching Learning to rank Entities Afternoon program Modeling user behavior Generating responses Recommender systems Industry insights Q & A
183
Outline
Morning program Preliminaries Semantic matching Learning to rank Entities Afternoon program Modeling user behavior Generating responses
One-shot dialogues Open-ended dialogues (chit-chat) Goal-oriented dialogues Alternatives to RNNs Resources
Recommender systems Industry insights Q & A
184
Generating responses
Tasks
- Question Answering
- Summarization
- Query Suggestion
- Reading Comprehension / Wiki Reading
- Dialogue Systems
  - Goal-Oriented
  - Chit-Chat
185
Generating responses
Example Scenario for machine reading task
Sandra went to the kitchen. Fred went to the kitchen. Sandra picked up the milk. Sandra traveled to the office. Sandra left the milk. Sandra went to the bathroom.
- Where is the milk now? A: office
- Where is Sandra? A: bathroom
- Where was Sandra before the office? A: kitchen
186
Generating responses
Example Scenario for machine reading task
Sandra went to the kitchen. Fred went to the kitchen. Sandra picked up the milk. Sandra traveled to the office. Sandra left the milk. Sandra went to the bathroom.
- Where is the milk now? A: office
- Where is Sandra? A: bathroom
- Where was Sandra before the office? A: kitchen
I’ll be going to Los Angeles shortly. I want to book a flight. I am leaving from Amsterdam. I want the return flight to be early morning. I don’t have any extra luggage. I wouldn’t mind extra leg room.

- What does the user want? A: Book a flight
- Where is the user flying from? A: Amsterdam
- Where is the user going to? A: Los Angeles
187
Generating responses
What is Required?
- The model needs to remember the context
- It needs to know what to look for in the context
- Given an input, the model needs to know where to look in the context
- It needs to know how to reason using this context
- It needs to handle changes in the context
A Possible Solution:

- Hidden states of RNNs have memory: run an RNN over the context and use its representation to map questions to answers/responses. This will not scale, as RNN states cannot capture long-term dependencies: vanishing gradients, limited state size.
188
Generating responses
Teaching Machine to Read and Comprehend
[Hermann et al., 2015]
189
Generating responses
Neural Networks with Memory
- Memory Networks
  - End2End MemNNs
  - Key-Value MemNNs
- Neural Turing Machines
- Stack/List/Queue Augmented RNNs
190
Generating responses
End2End Memory Networks [Sukhbaatar et al., 2015]
191
Generating responses
End2End Memory Networks [Sukhbaatar et al., 2015]
- Share the input and output embeddings or not?
- What to store in memories: individual words, word windows, full sentences?
- How to represent the memories? Bag-of-words? RNN reading of words? Characters?
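To make these design choices concrete, here is a minimal single-hop sketch of the end-to-end memory network attention in numpy, using bag-of-words memory representations and input/question embedding sharing; the vocabulary size, dimensions, and word ids are toy assumptions, not the paper's configuration.

```python
import numpy as np

# Minimal single-hop end-to-end memory network sketch
# [Sukhbaatar et al., 2015] with bag-of-words memories.
rng = np.random.default_rng(0)
d, vocab = 16, 20
A = rng.normal(size=(vocab, d))   # input (memory) embedding
C = rng.normal(size=(vocab, d))   # output embedding
B = A                             # question embedding, shared with A here

def bow(word_ids, E):
    """Bag-of-words sentence representation: sum of word embeddings."""
    return E[word_ids].sum(axis=0)

memories = [[0, 1, 2], [3, 4], [5, 6, 7]]   # sentences as word-id lists
question = [3, 8]

m = np.stack([bow(s, A) for s in memories])  # memory vectors m_i
c = np.stack([bow(s, C) for s in memories])  # output vectors c_i
u = bow(question, B)                         # question vector

scores = m @ u
p = np.exp(scores - scores.max())            # softmax attention over memories
p /= p.sum()
o = p @ c                                    # weighted sum of output vectors
answer_repr = o + u                          # fed to a final softmax over answers
```

Multi-hop versions repeat the attention step, feeding `o + u` back in as the next question vector.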
192
Generating responses
Attentive Memory Networks [Kenter and de Rijke, 2017]
Framing the task of conversational search as a general machine reading task.
193
Generating responses
Key-Value Memory Networks
Example: for a KB triple [subject, relation, object], the key could be [subject, relation] and the value [object], or vice versa. [Miller et al., 2016]
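A sketch of this key-value lookup for KB triples, with keys encoding [subject, relation] and values encoding [object]. The one-hot "embeddings" keep the example deterministic; a real key-value memory network learns these vectors.

```python
import numpy as np

# Key-value memory lookup in the style of [Miller et al., 2016].
# Toy triples; one-hot embeddings stand in for learned ones.
triples = [("paris", "capital_of", "france"),
           ("berlin", "capital_of", "germany")]
words = sorted({w for t in triples for w in t})
embed = {w: np.eye(len(words))[i] for i, w in enumerate(words)}

keys = np.stack([embed[s] + embed[r] for s, r, _ in triples])   # [subject, relation]
values = np.stack([embed[o] for _, _, o in triples])            # [object]

q = embed["paris"] + embed["capital_of"]      # encoded question
p = np.exp(keys @ q)
p /= p.sum()                                  # attention over keys
o = p @ values                                # value reading returned to the model
predicted = triples[int(np.argmax(p))][2]
```

The question matches key 0 most strongly, so the attention reads out the value for "france".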
194
Generating responses
WikiReading [Hewlett et al., 2016, Kenter et al., 2018]
Task is based on Wikipedia data (datasets available in English, Turkish, and Russian).

- Categorical properties: a relatively small number of possible answers (e.g., instance of, gender, country).
- Relational properties: rare or totally unique answers (e.g., date of birth, parent, capital).
195
Generating responses
WikiReading
- Answer Classification: encode the document and question, then use a softmax classifier to assign a probability to each of the top 50k answers (limited answer vocabulary).
  - Models: Sparse BoW Baseline, Averaged Embeddings, Paragraph Vector, LSTM Reader, Attentive Reader, Memory Network.
  - Models with RNNs and attention generally work better, especially on relational properties.
- Answer Extraction (labeling/pointing): for each word in the document, compute the probability that it is part of the answer.
  - Independent of the answer vocabulary, but the answer must be mentioned in the document.
  - RNN Labeler: shows a complementary set of strengths, performing better on relational properties than on categorical ones.
- Sequence to Sequence: encode the query and document, then decode the answer as a sequence of words or characters.
  - Models: Basic seq2seq, Placeholder seq2seq, Basic Character seq2seq.
  - Unifies classification and extraction in one model: a greater degree of balance between relational and categorical properties.
196
Outline
Morning program Preliminaries Semantic matching Learning to rank Entities Afternoon program Modeling user behavior Generating responses
One-shot dialogues Open-ended dialogues (chit-chat) Goal-oriented dialogues Alternatives to RNNs Resources
Recommender systems Industry insights Q & A
197
Generating responses
Dialogue systems
Dialogues/conversational agents/chat bots
Open-ended dialogues
- ELIZA
- Twitterbots
- Alexa / Google Home / Siri / Cortana
Goal-oriented dialogues
- Restaurant finding
- Hotel reservations
- Set an alarm clock
- Order a pizza
- Play music
- Alexa / Google Home / Siri / Cortana
Is this IR?
198
Generating responses
Dialogue systems
Chit-chat bots
[Figure: user: “Hello how are you” → machine: “I am fine thanks”]
Straightforward seq-to-seq [Vinyals and Le, 2015]. ([Sordoni et al., 2015] is a precursor, but without RNN-to-RNN coupling or LSTMs.)
Same idea, but with attention [Shang et al., 2015]
199
Generating responses
Dialogue systems
Limitations
- ’Wrong’ optimization criterion
- Generic responses
- No way to incorporate world knowledge
- No model of conversation
  - Inconsistency
  - No memory of what was said earlier on
Human: what is your job?
Machine: i’m a lawyer.
Human: what do you do?
Machine: i’m a doctor.

Example from [Vinyals and Le, 2015]
Evaluation
- Perplexity?
- BLEU/METEOR?
- Nice overview: How NOT To Evaluate Your Dialogue System [Liu et al., 2016].
- Open problem...
200
Generating responses
Dialogue systems
Three solutions

- More consistency in dialogue with a hierarchical network
- Less generic responses with a different optimization function
- More natural responses with GANs
201
Generating responses
Dialogue systems
Hierarchical seq-to-seq [Serban et al., 2016]. Main evaluation metric: perplexity.
202
Generating responses
Dialogue systems
Avoid generic responses
Usually: optimize the log likelihood of the predicted utterance given the previous context:

$C_{LL} = \arg\max_{u_t} \log p(u_t \mid \text{context}) = \arg\max_{u_t} \log p(u_t \mid u_0 \ldots u_{t-1})$

To avoid repetitive/boring answers ("I don't know"), use the maximum mutual information between the previous context and the predicted utterance [Li et al., 2015]:

$C_{MMI} = \arg\max_{u_t} \log \frac{p(u_t, \text{context})}{p(u_t)\, p(\text{context})} = [\text{derivation, next page} \ldots] = \arg\max_{u_t} \, (1-\lambda) \log p(u_t \mid \text{context}) + \lambda \log p(\text{context} \mid u_t)$
203
Generating responses
Dialogue systems
Bayes’ rule:

$\log p(u_t \mid \text{context}) = \log \frac{p(\text{context} \mid u_t)\, p(u_t)}{p(\text{context})}$
$\log p(u_t \mid \text{context}) = \log p(\text{context} \mid u_t) + \log p(u_t) - \log p(\text{context})$
$\log p(u_t) = \log p(u_t \mid \text{context}) - \log p(\text{context} \mid u_t) + \log p(\text{context})$

$C_{MMI} = \arg\max_{u_t} \log \frac{p(u_t, \text{context})}{p(u_t)\, p(\text{context})}$
$= \arg\max_{u_t} \log \frac{p(u_t \mid \text{context})\, p(\text{context})}{p(u_t)\, p(\text{context})}$
$= \arg\max_{u_t} \log \frac{p(u_t \mid \text{context})}{p(u_t)}$
$= \arg\max_{u_t} \log p(u_t \mid \text{context}) - \log p(u_t)$ ← weird: minus a language model score
$= \arg\max_{u_t} \log p(u_t \mid \text{context}) - \lambda \log p(u_t)$ ← introduce $\lambda$; crucial step, without this it wouldn’t work
$= \arg\max_{u_t} \log p(u_t \mid \text{context}) - \lambda (\log p(u_t \mid \text{context}) - \log p(\text{context} \mid u_t) + \log p(\text{context}))$
$= \arg\max_{u_t} (1-\lambda) \log p(u_t \mid \text{context}) + \lambda \log p(\text{context} \mid u_t)$ (the $\lambda \log p(\text{context})$ term is constant in $u_t$ and can be dropped)

(More is needed to get it to work. See [Li et al., 2015] for more details.)
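In practice the MMI objective is often applied by reranking n-best candidate responses. A minimal sketch, assuming we already have log-probabilities from a forward (context → response) and a backward (response → context) model; the candidate strings and scores are made up for illustration.

```python
def mmi_rerank(candidates, lam=0.5):
    """Rerank candidates by (1 - lam) * log p(u|context) + lam * log p(context|u).

    `candidates` is a list of (utterance, logp_forward, logp_backward) tuples,
    with the two log-probabilities assumed to come from a forward and a
    backward seq2seq model respectively."""
    scored = [(u, (1 - lam) * f + lam * b) for u, f, b in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# A generic reply often has a high forward score but says little about the
# context; the backward term penalizes it.
candidates = [
    ("i don't know", -1.0, -9.0),
    ("the milk is in the office", -3.0, -2.0),
]
best, _ = mmi_rerank(candidates, lam=0.5)[0]
```

With λ = 0.5 the generic reply scores 0.5·(−1) + 0.5·(−9) = −5.0, the specific one −2.5, so the specific response wins.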
204
Generating responses
Generative adversarial network for dialogues
- Discriminator network: a classifier deciding whether an utterance is real or generated
- Generator network: generates a realistic utterance

[Figure: the generator’s output and real data both feed into the discriminator, which outputs p(real / generated)]
Original GAN paper [Goodfellow et al., 2014]. Conditional GANs, e.g. [Isola et al., 2016].
205
Generating responses
Generative adversarial network for dialogues
- Discriminator network: a classifier deciding whether an utterance is real or generated
- Generator network: generates a realistic utterance
See [Li et al., 2017] for more details.
[Figure: given the provided context “Hello how are you”, the generator produces “I am fine thanks”; the discriminator scores p(x = real) vs. p(x = generated)]
Code available at https://github.com/jiweil/Neural-Dialogue-Generation
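The adversarial signal can be sketched with a toy stand-in: a logistic discriminator trained on feature vectors, where "real" and "generated" utterances are just two Gaussian clusters. This is only a schematic of the training signal, not the RNN-based model of [Li et al., 2017].

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, w, b):
    """Logistic classifier: returns p(x = real)."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

dim = 8
w, b = rng.normal(size=dim), 0.0
real = rng.normal(loc=1.0, size=(64, dim))    # stand-in for real utterances
fake = rng.normal(loc=-1.0, size=(64, dim))   # stand-in for generated ones

lr = 0.5
for _ in range(100):
    # Ascend the discriminator objective: log D(real) + log(1 - D(fake)).
    p_real, p_fake = discriminator(real, w, b), discriminator(fake, w, b)
    grad_w = real.T @ (1 - p_real) / len(real) - fake.T @ p_fake / len(fake)
    grad_b = np.mean(1 - p_real) - np.mean(p_fake)
    w, b = w + lr * grad_w, b + lr * grad_b

# The generator's reward for an utterance x is D(x): higher means the
# discriminator believes it is human-generated.
p_real_mean = discriminator(real, w, b).mean()
p_fake_mean = discriminator(fake, w, b).mean()
```

In the dialogue setting, D(x) is fed back to the generator as a reward via policy gradient, since sampled words are not differentiable.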
206
Generating responses
Dialogue systems
Open-ended dialogue systems
- Very cool, current problem
- Very hard
- Many problems
  - Training data
  - Evaluation
  - Consistency
  - Persona
  - ...
207
Outline
Morning program Preliminaries Semantic matching Learning to rank Entities Afternoon program Modeling user behavior Generating responses
One-shot dialogues Open-ended dialogues (chit-chat) Goal-oriented dialogues Alternatives to RNNs Resources
Recommender systems Industry insights Q & A
208
Generating responses
Goal-oriented
Idea
- Closed domain
  - Restaurant reservations
  - Finding movies
- Have a dialogue system find out what the user wants
Challenges
- Training data
- Keeping track of dialogue history
- Handling of out-of-domain words or requests
- Going beyond task-specific slot filling
- Intermingling live API calls, chit-chat, information requests, etc.
- Evaluation
  - Solving the task
  - Naturalness
  - Tone of voice
  - Speed
  - Error recovery
209
Generating responses
Goal-oriented as seq2seq
Memory network [Bordes and Weston, 2017]
- Simulated dataset
- Finite set of things the bot can say (because of the way the dataset is constructed)
- Memory networks
- Training: next-utterance prediction
- Evaluation: response-level and dialogue-level

Restaurant knowledge base, i.e., a table, queried by API calls. Each row is a restaurant:

- cuisine (10 choices, e.g., French, Thai)
- location (10 choices, e.g., London, Tokyo)
- price range (cheap, moderate, or expensive)
- rating (from 1 to 8)

For words of relevant entity types, add a trainable entity vector.
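A hypothetical sketch of the kind of KB table and API call behind such a task; the field names follow the slide, while the restaurant entries and the `api_call` helper are made up for illustration.

```python
# Toy restaurant KB: each row is a restaurant with the slide's four fields.
restaurants = [
    {"name": "resto_paris_1", "cuisine": "French", "location": "Paris",
     "price": "expensive", "rating": 7},
    {"name": "resto_tokyo_1", "cuisine": "Thai", "location": "Tokyo",
     "price": "cheap", "rating": 3},
]

def api_call(cuisine, location, price):
    """Return KB rows matching the slots gathered during the dialogue."""
    return [r for r in restaurants
            if r["cuisine"] == cuisine
            and r["location"] == location
            and r["price"] == price]

# After the bot has filled the cuisine, location, and price slots:
results = api_call("Thai", "Tokyo", "cheap")
```

The bot's job reduces to filling these slots from the dialogue and issuing the call; the returned rows are then added to the dialogue memory.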
210
Generating responses
Goal-oriented as reinforcement learning
A typical reinforcement learning system:
- States S
- Actions A
- State transition function: T : S × A → S
- Reward function: R : S × A × S → ℝ
- Policy: π : S → A

An RL system needs an environment to interact with (e.g., real users).

Typically [Shah et al., 2016]:

- States: the agent’s interpretation of the environment: a distribution over user intents, dialogue acts, and slots and their values
  - intent(buy ticket)
  - inform(destination=Atlanta)
  - ...
- Actions: possible communications, usually designed as a combination of dialogue act tags, slots, and possibly slot values
  - request(departure date)
  - ...
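The pieces above can be sketched as a minimal interaction loop. All state names, actions, and the trivial tabular policy are illustrative assumptions, not from a specific system.

```python
# Minimal RL view of a dialogue agent: states, actions, transition,
# reward, and a fixed tabular policy (a real agent would learn π).
STATES = ["need_destination", "need_date", "done"]
ACTIONS = ["request(destination)", "request(departure date)", "book"]

def transition(state, action):
    """T: S x A -> S. Asking the right question advances the dialogue."""
    if state == "need_destination" and action == "request(destination)":
        return "need_date"
    if state == "need_date" and action == "request(departure date)":
        return "done"
    return state  # wrong question: no progress

def reward(state, action, next_state):
    """R: S x A x S -> R. Reward only on completing the task."""
    return 1.0 if next_state == "done" and state != "done" else 0.0

policy = {"need_destination": "request(destination)",
          "need_date": "request(departure date)",
          "done": "book"}

state, total = "need_destination", 0.0
for _ in range(3):
    action = policy[state]
    nxt = transition(state, action)
    total += reward(state, action, nxt)
    state = nxt
```

In a learned system the policy would be updated (e.g., by Q-learning or policy gradient) from the accumulated reward, and the environment would be a user or user simulator.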
211
Generating responses
Goal-oriented as reinforcement learning
Restaurant finding [Wen et al., 2017]:

- Neural belief tracking: a distribution over the possible values of a set of slots
- Delexicalisation: swap slot values for a generic token (e.g., Chinese, Indian, Italian → FOOD TYPE)

Movie finding [Dhingra et al., 2017]:

- Simulated user
- Soft attention over the database
- Neural belief tracking:
  - a multinomial distribution for every column over its possible values
  - an RNN whose input is the dialogue so far and whose output is a softmax over possible column values

Reward based on finding the right KB entry.
212
Generating responses
Goal-oriented
Goal-oriented models
- Currently works primarily in very small domains
- How about multiple speakers?
- Not clear what kind of architecture is best
- Reinforcement learning might be the way to go (?)
- Open research area...
213
Outline
Morning program Preliminaries Semantic matching Learning to rank Entities Afternoon program Modeling user behavior Generating responses
One-shot dialogues Open-ended dialogues (chit-chat) Goal-oriented dialogues Alternatives to RNNs Resources
Recommender systems Industry insights Q & A
214
Generating responses
Alternatives to RNNs
RNNs are:

- Well-studied
- A robust, tried-and-trusted method for sequence tasks

However, RNNs have several drawbacks:

- They take time to train
- They are expensive to unroll for many steps
- They are not very good at capturing long-term dependencies

Can we do better?

- WaveNet
- ByteNet
- Transformer
215
Generating responses
Alternatives to RNNs: WaveNet
WaveNet was originally introduced for a text-to-speech task (i.e., generating realistic audio waves). We model

$p(\mathbf{x}) = \prod_{t=1}^{T} p(x_t \mid x_1, \ldots, x_{t-1})$

- Stack of convolutional layers; no pooling layers.
- The output of the model has the same time dimensionality as the input.
- The output is a categorical distribution over the next value $x_t$ (softmax layer), optimized to maximize the log-likelihood of the data w.r.t. the parameters.

Based on the idea of dilated causal convolutions. [van den Oord et al., 2016]
216
Generating responses
Alternatives to RNNs: WaveNet
Causal convolutions
[Figure: a stack of causal convolution layers: input, three hidden layers, output]
[van den Oord et al., 2016]
217
Generating responses
Alternatives to RNNs: WaveNet
Dilated causal convolutions
[Figure: dilated causal convolutions: input, hidden layers with dilation 1, 2, and 4, output with dilation 8]
“At training time, the conditional predictions for all timesteps can be made in parallel because all timesteps of ground truth x are known. When generating with the model, the predictions are sequential: after each sample is predicted, it is fed back into the network to predict the next sample.”
[van den Oord et al., 2016]
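The core operation can be sketched in a few lines: a 1-D causal convolution with kernel size 2, where `output[t]` depends only on `input[t]` and `input[t - dilation]`, never on the future. The single channel and absent nonlinearity are simplifying assumptions.

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """1-D dilated causal convolution, kernel size 2, single channel:
    out[t] = w[0] * x[t - dilation] + w[1] * x[t], with zero left-padding."""
    x = np.asarray(x, dtype=float)
    pad = np.concatenate([np.zeros(dilation), x])  # left-pad enforces causality
    return w[0] * pad[:len(x)] + w[1] * x

x = [1.0, 2.0, 3.0, 4.0]
h = dilated_causal_conv(x, w=(1.0, 1.0), dilation=1)   # h[t] = x[t-1] + x[t]
# -> [1.0, 3.0, 5.0, 7.0]
```

Stacking layers with dilations 1, 2, 4, 8, ... doubles the receptive field per layer, which is how WaveNet covers long contexts without recurrence.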
218
Generating responses
Alternatives to RNNs: ByteNet
[Figure: ByteNet: a dilated convolutional decoder over target tokens t stacked on top of a dilated convolutional encoder over source tokens s]
[Kalchbrenner et al., 2016]
219
Generating responses
Alternatives to RNNs: Transformer
- Positional encoding added to the input embeddings
- Key-value attention
- Multi-head self-attention
- The encoder attends over its own states
- The decoder alternates between
  - attending over its own inputs/states
  - attending over encoder states at the same level
[Vaswani et al., 2017]
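The building block underlying all of these attention variants is scaled dot-product attention, softmax(QKᵀ/√d_k)V [Vaswani et al., 2017]. A minimal numpy sketch (single head, no masking, random toy inputs):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # rows sum to 1
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 query positions, d_k = 4
K = rng.normal(size=(5, 4))   # 5 key/value positions
V = rng.normal(size=(5, 4))
out = attention(Q, K, V)      # shape (3, 4): one value mixture per query
```

Self-attention sets Q, K, and V to (projections of) the same sequence; the decoder additionally masks future positions, and multi-head attention runs several such maps in parallel.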
220
Generating responses
Alternatives to RNNs: Transformer
[Figure: the Transformer encoder and decoder, each a stack of layers 1 ... N]
221
Outline
Morning program Preliminaries Semantic matching Learning to rank Entities Afternoon program Modeling user behavior Generating responses
One-shot dialogues Open-ended dialogues (chit-chat) Goal-oriented dialogues Alternatives to RNNs Resources
Recommender systems Industry insights Q & A
222
Generating responses
Resources: datasets
Open-ended dialogue
- Opensubtitles [Tiedemann, 2009]
- Twitter: http://research.microsoft.com/convo/
- Weibo: http://www.noahlab.com.hk/topics/ShortTextConversation
- Ubuntu Dialogue Corpus [Lowe et al., 2015]
- Switchboard: https://web.stanford.edu/~jurafsky/ws97/
- Coarse Discourse (Google Research): https://research.googleblog.com/2017/05/coarse-discourse-dataset-for.html
Goal-oriented dialogues
- MISC: A data set of information-seeking conversations [Thomas et al., 2017]
- Maluuba Frames: http://datasets.maluuba.com/Frames
- Loqui Human-Human Dialogue Corpus: https://academiccommons.columbia.edu/catalog/ac:176612
- bAbI (Facebook Research): https://research.fb.com/downloads/babi/
Machine reading
- bAbI QA (Facebook Research): https://research.fb.com/downloads/babi/
- QA Corpus [Hermann et al., 2015]: https://github.com/deepmind/rc-data/
- WikiReading (Google Research): https://github.com/google-research-datasets/wiki-reading
223
Generating responses
Resources: source code
- End-to-end memory network: https://github.com/facebook/MemNN
- Attentive Memory Networks: https://bitbucket.org/TomKenter/attentive-memory-networks-code
- Hierarchical NN [Serban et al., 2016]: https://github.com/julianser/hed-dlg, https://github.com/julianser/rnn-lm
- GAN for dialogues: https://github.com/jiweil/Neural-Dialogue-Generation
- RL for dialogue agents [Dhingra et al., 2017]: https://github.com/MiuLab/KB-InfoBot
- Transformer network: https://github.com/tensorflow/tensor2tensor