
S9276: Towards Open-Domain Conversational AI, by Yun-Nung (Vivian) Chen



  1. S9276: Towards Open-Domain Conversational AI. Yun-Nung (Vivian) Chen 陳縕儂, http://vivianchen.idv.tw

  2. Iron Man (2008): What can machines achieve now or in the future?

  3. Language Empowering Intelligent Assistant (MIULAB, NTU): Apple Siri (2011), Google Now (2012), Microsoft Cortana (2014), Amazon Alexa/Echo (2014), Facebook M & Bot (2015), Google Assistant (2016), Google Home (2016), Apple HomePod (2017)

  4. Why Natural Language?
     • Global Digital Statistics (2018 January), with year-on-year growth: Total Population 7.59B; Internet Users 4.02B (7%); Active Social Media Users 3.20B (13%); Unique Mobile Users 5.14B (4%); Active Mobile Social Users 2.96B (14%)
     • Device input is evolving towards speech, the more natural and convenient modality.

  5. Why and When We Need?
     • Social Chit-Chat: “I want to chat” (Turing Test: talk like a human)
     • Task-Oriented Dialogues: “I have a question” (information consumption); “I need to get this done” (task completion); “What should I do?” (decision support)
     • Examples: What is today’s agenda? What does GTC stand for? Book me the flight ticket from Taipei to San Francisco. Reserve a table at Din Tai Fung for 5 people, 7PM tonight. Is GTC good to attend?

  6. Intelligent Assistants: Task-Oriented

  7. Conversational Agents: Chit-Chat and Task-Oriented

  8. Task-Oriented Dialogue Systems: JARVIS (Iron Man’s Personal Assistant), Baymax (Personal Healthcare Companion)

  9. Task-Oriented Dialogue System (Young, 2000) http://rsta.royalsocietypublishing.org/content/358/1769/1389.short
     • Speech Recognition: Speech Signal → Hypothesis “are there any action movies to see this weekend”
     • Language Understanding (LU): Domain Identification, User Intent Detection, Slot Filling. Takes the hypothesis or Text Input “Are there any action movies to see this weekend?” and produces the Semantic Frame request_movie (genre=action, date=this weekend)
     • Dialogue Management (DM): Dialogue State Tracking (DST), Dialogue Policy. Connected to Backend Action / Knowledge Providers; produces System Action/Policy request_location
     • Natural Language Generation (NLG): Text response “Where are you located?”

  10. Task-Oriented Dialogue System (Young, 2000): pipeline diagram repeated from slide 9.

  11. Semantic Frame Representation
      • Requires a domain ontology: early connection to backend
      • Contains core content (intent, a set of slots with fillers)
      • Restaurant Domain (slots: price, type, location): “find me a cheap taiwanese restaurant in oakland” → find_restaurant(price=“cheap”, type=“taiwanese”, location=“oakland”)
      • Movie Domain (slots: genre, year, director): “show me action movies directed by james cameron” → find_movie(genre=“action”, director=“james cameron”)
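The two frames on this slide can be written down as a small data structure. This is only a minimal sketch: the `SemanticFrame` class, its field names, and the `render` helper are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticFrame:
    """Hypothetical container: a domain, one intent, and slot -> filler pairs."""
    domain: str
    intent: str
    slots: dict = field(default_factory=dict)

    def render(self) -> str:
        # Write the frame the way the slide does: intent(slot="value", ...)
        args = ", ".join(f'{name}="{value}"' for name, value in self.slots.items())
        return f"{self.intent}({args})"


# "find me a cheap taiwanese restaurant in oakland"
restaurant_frame = SemanticFrame(
    domain="restaurant",
    intent="find_restaurant",
    slots={"price": "cheap", "type": "taiwanese", "location": "oakland"},
)

# "show me action movies directed by james cameron"
movie_frame = SemanticFrame(
    domain="movie",
    intent="find_movie",
    slots={"genre": "action", "director": "james cameron"},
)

print(restaurant_frame.render())
print(movie_frame.render())
```

Slots that the utterance does not mention (such as `year` in the movie example) are simply absent from the dictionary.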

  12. Backend Database / Ontology
      • Domain-specific table
      • Target and attributes: movie name, theater, rating, date, time
      • Functionality
        • Information access: find specific entries
        • Task completion: find the row that satisfies the constraints
      • Example table:
        Movie Name     Theater    Rating  Date        Time
        Iron Man Last  Taipei A1  8.5     2018/10/31  09:00
        Iron Man Last  Taipei A1  8.5     2018/10/31  09:25
        Iron Man Last  Taipei A1  8.5     2018/10/31  10:15
        Iron Man Last  Taipei A1  8.5     2018/10/31  10:40
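Both functions of the backend (information access and task completion) amount to filtering the domain table by constraints. A minimal sketch, assuming the table is held in memory as a list of dictionaries; `SHOWTIMES` and `find_rows` are hypothetical names, and the rows are the ones shown on the slide.

```python
# Rows copied from the slide's example table.
SHOWTIMES = [
    {"movie_name": "Iron Man Last", "theater": "Taipei A1", "rating": 8.5,
     "date": "2018/10/31", "time": "09:00"},
    {"movie_name": "Iron Man Last", "theater": "Taipei A1", "rating": 8.5,
     "date": "2018/10/31", "time": "09:25"},
    {"movie_name": "Iron Man Last", "theater": "Taipei A1", "rating": 8.5,
     "date": "2018/10/31", "time": "10:15"},
    {"movie_name": "Iron Man Last", "theater": "Taipei A1", "rating": 8.5,
     "date": "2018/10/31", "time": "10:40"},
]


def find_rows(table, constraints):
    """Return every row whose columns equal all of the given constraints."""
    return [
        row for row in table
        if all(row.get(column) == value for column, value in constraints.items())
    ]


# Information access: all entries for one movie.
all_showings = find_rows(SHOWTIMES, {"movie_name": "Iron Man Last"})

# Task completion: the single row that satisfies every constraint.
booking = find_rows(SHOWTIMES, {"theater": "Taipei A1", "time": "10:15"})

print(len(all_showings), booking)
```

A real system would query a database instead, but the constraint dictionary plays the same role as the slots of a semantic frame.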

  13. Task-Oriented Dialogue System (Young, 2000): pipeline diagram repeated from slide 9.

  14. Language Understanding (LU): Pipelined: 1. Domain Classification → 2. Intent Classification → 3. Slot Filling

  15. 1. Domain Identification: Requires Predefined Domain Ontology
      • User: find a good eating place for taiwanese food
      • Intelligent Agent with Organized Domain Knowledge (Database): Movie DB, Restaurant DB, Taxi DB
      • Classification!

  16. 2. Intent Detection: Requires Predefined Schema
      • User: find a good eating place for taiwanese food
      • Intelligent Agent with Restaurant DB intents: FIND_RESTAURANT, FIND_PRICE, FIND_TYPE, ...
      • Classification!

  17. 3. Slot Filling: Requires Predefined Schema
      • User: find a good eating place for taiwanese food
      • Tags: O O B-rating O O O B-type O (Sequence Labeling)
      • Semantic Frame: FIND_RESTAURANT, rating=“good”, type=“taiwanese”
      • Intelligent Agent query on Restaurant DB: SELECT restaurant { rest.rating=“good” rest.type=“taiwanese” }
      • Restaurant DB:
        Restaurant  Rating  Type
        Rest 1      good    Taiwanese
        Rest 2      bad     Thai
        :           :       :
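The step from the tag sequence to the semantic frame can be sketched as follows. This is an illustrative decoder only, assuming the usual B-/I-/O tagging convention; `iob_to_slots` is a hypothetical helper, not code from the talk.

```python
def iob_to_slots(tokens, tags):
    """Collect B-/I- tagged spans into a slot-name -> filler dictionary.

    If the same slot name starts twice, the later span overwrites the earlier one.
    """
    slots = {}
    current = None  # name of the slot whose span is currently open
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + token
        else:
            # "O" (or an I- tag that does not continue the open span) closes it.
            current = None
    return slots


tokens = "find a good eating place for taiwanese food".split()
tags = ["O", "O", "B-rating", "O", "O", "O", "B-type", "O"]
slots = iob_to_slots(tokens, tags)
print(slots)
```

Together with the intent FIND_RESTAURANT, the resulting dictionary supplies the constraints for the database query.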

  18. Slot Tagging (Yao+, 2013; Mesnil+, 2015) http://131.107.65.14/en-us/um/people/gzweig/Pubs/Interspeech2013RNNLU.pdf; http://dl.acm.org/citation.cfm?id=2876380
      • Variations: a. RNNs with LSTM cells; b. Input, sliding window of n-grams; c. Bi-directional LSTMs
      • Figure panels: (a) LSTM, (b) LSTM-LA, (c) bLSTM. Each maps word inputs w_0 … w_n through hidden states h_0 … h_n to slot tags y_0 … y_n; the bLSTM panel has forward and backward hidden states.

  19. Slot Tagging (Kurata+, 2016; Simonnet+, 2015) http://www.aclweb.org/anthology/D16-1223
      • Encoder-decoder networks
        • Leverages sentence level information
      • Attention-based encoder-decoder
        • Use of attention (as in MT) in the encoder-decoder network
        • Attention is estimated using a feed-forward network with input: h_t and s_t at time t
      • Figures: encoder-decoder over inputs w_0 … w_n with hidden states h_0 … h_n and tags y_0 … y_n; attention variant with decoder states s_0 … s_n and context c_i computed over h_0 … h_n
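The attention step can be sketched in plain Python. The slide only states that attention is estimated by a feed-forward network over h_t and s_t; the specific one-hidden-layer form (a tanh layer followed by a dot product with `v`), the function names, and the toy weights below are assumptions made for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(encoder_states, decoder_state, w_h, w_s, v):
    """Score each encoder state against the decoder state, then average.

    score_j = v . tanh(w_h h_j + w_s s)   (assumed feed-forward scorer)
    context = sum_j softmax(score)_j * h_j
    """
    projected_s = matvec(w_s, decoder_state)
    scores = []
    for h in encoder_states:
        hidden = [math.tanh(a + b) for a, b in zip(matvec(w_h, h), projected_s)]
        scores.append(sum(vk * hk for vk, hk in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * h[k] for w, h in zip(weights, encoder_states)) for k in range(dim)]
    return weights, context


# Toy values, chosen only to make the example runnable.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, -0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attention_context(encoder_states, decoder_state, identity, identity, [1.0, 1.0])
print(weights, context)
```

The context vector is then fed to the decoder alongside its own state when predicting the tag for the current word.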

  20. Joint Semantic Frame Parsing
      • Sequence-based (Hakkani-Tur et al., 2016): slot filling and intent prediction in the same output sequence
      • Parallel (Liu and Lane, 2016): intent prediction and slot filling are performed in two branches
      • Example (RNN with weights U, W, V and hidden states h_{t-1}, h_t, h_{t+1}, h_{T+1}): input “taiwanese food please EOS” → B-type O O (Slot Filling), FIND_REST (Intent Prediction)
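The sequence-based formulation can be illustrated by how the training labels are laid out. This sketch assumes the intent label is attached to the final EOS token, which is how the slide's example reads; `encode_joint` and `decode_joint` are hypothetical helper names.

```python
EOS = "EOS"


def encode_joint(tokens, slot_tags, intent):
    """Build one input/output sequence pair carrying both slots and intent.

    The slot tags label the words; the intent label is placed on the
    appended EOS token (assumed layout, following the slide's example).
    """
    if len(tokens) != len(slot_tags):
        raise ValueError("tokens and slot_tags must have the same length")
    return tokens + [EOS], slot_tags + [intent]


def decode_joint(labels):
    """Split a predicted label sequence back into slot tags and intent."""
    return labels[:-1], labels[-1]


inputs, labels = encode_joint(
    ["taiwanese", "food", "please"], ["B-type", "O", "O"], "FIND_REST"
)
slot_tags, intent = decode_joint(labels)
print(inputs, labels)
```

In the parallel formulation, by contrast, the slot tags and the intent would come from two separate output branches rather than one label sequence.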

  21. Joint Model Comparison
      Model                        Attention Mechanism  Intent-Slot Relationship
      Joint bi-LSTM                X                    Δ (Implicit)
      Attentional Encoder-Decoder  √                    Δ (Implicit)
      Slot Gate Joint Model        √                    √ (Explicit)

  22. Slot-Gated Joint SLU (Goo+, 2018)
      • Architecture: word sequence x_1 … x_4 → BLSTM → slot attention (c_i^S) and intent attention (c^I) → slot gate → slot sequence y_1^S … y_4^S and intent y^I
      • Slot Gate: $g = \sum v \cdot \tanh(c_i^S + W \cdot c^I)$
      • Slot Prediction: $y_i^S = \mathrm{softmax}(W^S (h_i + g \cdot c_i^S) + b^S)$
      • $g$ will be larger if slot and intent are better related
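A small numeric sketch of the slot gate, reading the slide's equations as g = Σ v · tanh(c_i^S + W · c^I) and y_i^S = softmax(W^S (h_i + g · c_i^S) + b^S). That reading, the function names, and the toy vectors are assumptions; only the gate and the gated feature h_i + g · c_i^S are computed here, not the final softmax layer.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def slot_gate(slot_context, intent_context, w, v):
    """Scalar gate g = sum_k v_k * tanh(c_i^S[k] + (W c^I)[k])."""
    projected = matvec(w, intent_context)
    return sum(
        vk * math.tanh(ck + pk) for vk, ck, pk in zip(v, slot_context, projected)
    )


def gated_slot_features(hidden, slot_context, gate):
    """Input to the slot softmax layer: h_i + g * c_i^S."""
    return [h + gate * c for h, c in zip(hidden, slot_context)]


# Toy vectors: one intent context agrees with the slot context, one opposes it.
identity = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]
slot_context = [1.0, 0.5]
g_related = slot_gate(slot_context, [1.0, 0.5], identity, v)
g_unrelated = slot_gate(slot_context, [-1.0, -0.5], identity, v)
features = gated_slot_features([0.2, 0.3], slot_context, g_related)
print(g_related, g_unrelated, features)
```

With these toy values the gate is larger when the intent context agrees with the slot context, which matches the slide's remark about g.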

  23. Contextual LU: Domain Identification → Intent Prediction → Slot Filling
      • Single turn:
        D: communication
        I: send_email
        U: just sent email to bob about fishing this weekend
        S: O O O O B-contact_name O B-subject I-subject I-subject
        → send_email(contact_name=“bob”, subject=“fishing this weekend”)
      • Multiple turns:
        U1: send email to bob
        S1: B-contact_name (on “bob”)
        → send_email(contact_name=“bob”)
        U2: are we going to fish this weekend
        S2: B-message I-message I-message I-message I-message I-message I-message
        → send_email(message=“are we going to fish this weekend”)

  24. Contextual LU
      • User utterances are highly ambiguous in isolation
      • Example (Restaurant Booking domain):
        User: Book a table for 10 people tonight.
        System: Which restaurant would you like to book a table for?
        User: Cascal, for 6. (Is “6” the #people or the time?)
