Building A User-Centric and Content-Driven Socialbot, Hao Fang - PowerPoint PPT Presentation
  1. Building A User-Centric and Content-Driven Socialbot. Hao Fang. Committee: Mari Ostendorf (Chair), Hannaneh Hajishirzi, Leah M. Ceccarelli (GSR), Eve Riskin, Yejin Choi, Geoffrey Zweig

  2. Agenda o Background o Sounding Board System – 2017 Alexa Prize Winner o A Graph-Based Document Representation for Dialog Control o Multi-Level Evaluation for Socialbot Conversations o Summary and Future Directions

  3. Agenda o Background o Sounding Board System – 2017 Alexa Prize Winner o A Graph-Based Document Representation for Dialog Control o Multi-Level Evaluation for Socialbot Conversations o Summary and Future Directions

  4. Sci-Fi Movies

  5. Daily Life

  6. Types of Conversational AI. Socialbots “converse coherently and engagingly with humans on popular topics and current events.” Dimensions of dialog systems: o Task Definition: task-oriented vs. non-task-oriented o Domain Coverage: single-domain / multi-domain / open-domain o Dialog Initiative: system-initiative / user-initiative / mixed-initiative

  7. Socialbot Applications o Entertainment, education, healthcare, companionship, … o A conversational gateway to online content [diagram: the socialbot as a conversational user interface]

  8. Agenda o Background o Sounding Board System – 2017 Alexa Prize Winner o A Graph-Based Document Representation for Dialog Control o Multi-Level Evaluation for Socialbot Conversations o Summary and Future Directions

  9. Design Objectives. User-Centric: • Users can control the dialog flow and switch topics at any time • Bot responses are adapted to acknowledge user reactions. Content-Driven: • Content covers the wide range of user interests • Dialog strategies to lead or contribute to the dialog flow

  10. 2017 Alexa Prize Finals

  11. [image-only slide]

  12. Dialog Control for Many Miniskills? Conversation Activities (Miniskills): o Greet o List Topics o Tell Fun Facts o Tell Jokes o Tell Headlines o Discuss Movies o Personality Test o …

  13. Hierarchical Dialog Management o Dialog Context Tracker: dialog state, topic/content/miniskill history, user personality o Master Dialog Manager: miniskill polling; topic and miniskill backoff o Miniskill Dialog Managers: miniskill dialog control as a finite-state machine; retrieve content & build response plan
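The hierarchy on this slide can be sketched in a few lines of Python: each miniskill runs its own finite-state machine, and a master manager polls the miniskills and backs off when none can handle the user's act. All class and method names here (MiniskillFSM, MasterDialogManager, etc.) are illustrative, not Sounding Board's actual API.

```python
class MiniskillFSM:
    """A miniskill's dialog control as a finite-state machine."""

    def __init__(self, name, transitions, start="init"):
        self.name = name
        self.state = start
        # transitions: {(state, user_act): (next_state, response)}
        self.transitions = transitions

    def can_handle(self, user_act):
        return (self.state, user_act) in self.transitions

    def step(self, user_act):
        self.state, response = self.transitions[(self.state, user_act)]
        return response


class MasterDialogManager:
    """Polls miniskills; backs off to a fallback when none can handle the act."""

    def __init__(self, miniskills, fallback="Let's try another topic."):
        self.miniskills = miniskills
        self.fallback = fallback

    def respond(self, user_act):
        for skill in self.miniskills:   # miniskill polling
            if skill.can_handle(user_act):
                return skill.step(user_act)
        return self.fallback            # topic/miniskill backoff


facts = MiniskillFSM("fun_facts", {
    ("init", "request_fact"): ("told", "Here is a fun fact about space."),
    ("told", "continue"): ("told", "Here is another fun fact."),
})
dm = MasterDialogManager([facts])
print(dm.respond("request_fact"))  # Here is a fun fact about space.
print(dm.respond("dance"))         # Let's try another topic.
```

Keeping each miniskill's control logic inside its own FSM is what lets the system define dialog control for many miniskills in a consistent way.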

  14. Social Chat Knowledge. An important type of social chat knowledge is online content. How can we organize content to facilitate dialog control? We need a framework that allows dialog control to be defined in a consistent way.

  15. Knowledge Graph o Nodes: content post (fact, movie, news article, …) and topic (entity or generic topic) o Relational edges between post and topic: topic mention (NER, noun phrase extraction), category tag (Reddit meta-information), movie name/genre/director/actor (IMDB) o Dialog Control: move along edges. [Example graph: the post “UT Austin and Google AI use machine learning on data from NASA's Kepler Space Telescope to discover an eighth planet circling a distant star,” linked by topic-mention edges to Google and AI and by category-tag edges to science and astronomy]
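A minimal sketch of this graph, using the slide's Kepler example: posts and topics are nodes, typed edges link them, and one dialog-control move is a hop along an edge. The data structure and function names are illustrative assumptions, not the thesis implementation.

```python
from collections import defaultdict

edges = defaultdict(list)   # node -> [(edge_type, neighbor)]

def add_edge(a, edge_type, b):
    """Add an undirected, typed edge so moves work in both directions."""
    edges[a].append((edge_type, b))
    edges[b].append((edge_type, a))

post = ("UT Austin and Google AI use machine learning on data from "
        "NASA's Kepler Space Telescope to discover an eighth planet.")
add_edge(post, "topic_mention", "Google")     # from NER / noun phrase extraction
add_edge(post, "topic_mention", "AI")
add_edge(post, "category_tag", "science")     # from Reddit meta-information
add_edge(post, "category_tag", "astronomy")

def neighbors(node, edge_type=None):
    """One dialog-control move: follow edges from the current node."""
    return [b for (t, b) in edges[node] if edge_type is None or t == edge_type]

# Move from a topic to related posts, or from a post to its topics:
print(len(neighbors("Google")))           # 1 (the Kepler post)
print(neighbors(post, "category_tag"))    # ['science', 'astronomy']
```

With this shape, "move along edges" covers both topic-to-content moves (suggest a post about the current topic) and content-to-topic moves (pivot to a related topic).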

  16. Agenda o Background o Sounding Board System – 2017 Alexa Prize Winner o A Graph-Based Document Representation for Dialog Control o Multi-Level Evaluation for Socialbot Conversations o Summary and Future Directions

  17. Motivation for a Graph-Based Document Representation o Dialog control defined based on moves on the graph: lead the conversation; handle user initiatives o Challenges for unstructured documents (e.g., news articles): o Storytelling: not all sentences are equally interesting to a listener; need to figure out a coherent presenting order o Question Answering & Asking: answer questions about the document; need a smooth transition between sentences o Subject Entity: handle entity-based information-seeking requests o Opinion Comment: handle opinion-seeking requests

  18. Graph-Based Document Representation. [Diagram: a storytelling chain Sent 1 → Sent 2 → Sent 3 → Sent 4, with comment edges to opinions (Opinion 1, Opinion 2), subject edges to entities (Entity 1–3), and answer edges to questions (Question 1–3)]

  19. Document Representation Construction. Pipeline: Text Pre-processing (tokenization, sentence split, sentence filtering) → Sentence Node Creation → Entity Node Creation → Subject Edge Creation → Storytelling Chain Creation → Question Generation → Comment Collection. NLP Tools: part-of-speech tagging, constituency parsing, named entity recognition, entity linking, coreference resolution, dependency parsing.
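The pipeline can be sketched as a sequence of graph-building passes over a document. A real implementation would call the NLP tools listed on the slide (POS tagging, parsing, NER, entity linking, coreference); here each pass is a crude stdlib stand-in (regex sentence split, capitalized words as "entities") purely so the data flow is visible.

```python
import re

def build_document_graph(text):
    graph = {"sentences": [], "entities": set(), "subject_edges": []}

    # Text pre-processing: tokenization, sentence split, sentence filtering.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Sentence node creation (real filtering would drop uninteresting sentences).
    graph["sentences"] = sentences

    # Entity node creation + subject edge creation. Stand-in for NER/parsing:
    # capitalized words as entities, the first one in a sentence as its subject.
    for i, sent in enumerate(sentences):
        caps = re.findall(r"\b[A-Z][a-zA-Z]+\b", sent)
        graph["entities"].update(caps)
        if caps:
            graph["subject_edges"].append((i, caps[0]))

    # Storytelling chain creation: default to article order here; the learned
    # ranker described on the following slides would select/reorder sentences.
    graph["chain"] = list(range(len(sentences)))
    return graph

g = build_document_graph("Sprint throttled Skype. The study found this.")
print(g["chain"])   # [0, 1]
```

Question generation and comment collection would then hang question and opinion nodes off the sentence nodes in the same graph dict.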

  20. Storytelling Chain Creation o Problem formulation: context sentence sequence (c_1, c_2, …, c_L); candidate sentence set {s_1, s_2, …, s_N}, the next N sentences following c_L in the article; candidate sentence chain (s_j | c_1, c_2, …, c_L), annotated with a binary label o Data collection: 550 news articles; train/validation/test split 3/1/1 based on article ID o Number of candidate sentence chains: L=1, N=4: 662 positive / 1,538 negative; L=2, N=3: 865 positive / 1,064 negative
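Following that formulation, the candidate chains can be enumerated directly from an article: take a context of L consecutive sentences and pair it with each of the N sentences that follow. The binary labels came from human annotation in the thesis; this sketch (function name and all) only shows the enumeration, not the labeling.

```python
def candidate_chains(article_sentences, L, N):
    """Yield (context, candidate) pairs: context (c_1..c_L), candidate s_j
    drawn from the N sentences following c_L in the article."""
    for start in range(len(article_sentences) - L - N + 1):
        context = tuple(article_sentences[start:start + L])
        candidates = article_sentences[start + L:start + L + N]
        for cand in candidates:
            yield context, cand   # each pair then gets a binary label via annotation

article = ["s1", "s2", "s3", "s4", "s5"]
pairs = list(candidate_chains(article, L=1, N=4))
print(len(pairs))   # 4: context (s1,) paired with s2..s5
```

This is consistent with the slide's counts having roughly N candidates per context position, with positives and negatives determined by annotators rather than by position alone.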

  21. Model and Features o Model: binary logistic regression, used for ranking sentences given c_1, c_2, …, c_L o input: candidate sentence chain (s_j | c_1, c_2, …, c_L) o output: probability score p(s_j | c_1, c_2, …, c_L) ∈ [0, 1] o Features o SentImportance: r(s_j), from TextRank unsupervised summarization on the document o SentDistance: d(s_j | c_1, …, c_L) = SentIndex(s_j) − SentIndex(c_L) o SentEmbedding: e(s_j), from pre-trained BERT o ChainEmbedding: e(s_j | c_1, …, c_L)
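The ranking step itself is simple: logistic regression maps each candidate's feature vector to a probability, and the highest-scoring candidate continues the chain. This sketch uses made-up feature values and hand-set weights for two scalar features; the thesis learns the weights and adds BERT sentence/chain embeddings.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(features, weights, bias=0.0):
    """p(s_j is a good continuation | c_1..c_L), in [0, 1]."""
    z = bias + sum(weights[k] * v for k, v in features.items())
    return sigmoid(z)

# Illustrative weights: importance helps, distance from the context hurts.
weights = {"sent_importance": 2.0, "sent_distance": -0.5}

candidates = {
    "s2": {"sent_importance": 0.8, "sent_distance": 1},  # next sentence, important
    "s5": {"sent_importance": 0.9, "sent_distance": 4},  # important but far away
}
ranked = sorted(candidates, key=lambda s: score(candidates[s], weights),
                reverse=True)
print(ranked[0])   # s2
```

The evaluation on the next slides then asks how often this top-ranked candidate carries a positive label.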

  22. Test Set Results. [Bar chart: % of cases where the highest-ranked sentence has a positive label, for SentDistance, SentEmbedding, SentImportance, ChainEmbedding, and All features, under the L=1, N=4 and L=2, N=3 settings; values range from 54.7 to 73.7] Takeaway: the next sentence in the article is not always a good continuation.

  23. Test Set Results (same chart). Takeaway: sentence embedding alone may capture some features related to importance/style (e.g., length, informativeness).

  24. Test Set Results (same chart). Takeaway: sentence importance (document context) is very useful.

  25. Test Set Results (same chart). Takeaway: dialog context is important as the chain gets longer (annotated gains of +2.7 for L=1, N=4 and +4.4 for L=2, N=3).

  26. Test Set Results (same chart). Takeaway: using all features (2,050-dimensional) overfits for L=2 (1,239 training samples).

  27. Question Generation. Pipeline (with tools/criteria): Dependency Parsing (Universal Dependencies) → Dependent Selection for Answer (question interestingness/importance) → Question Type Classification (hand-crafted decision tree) → Clause/Question Planning (template-based planning) → Clause/Question Realization (dependency-based realization). [Diagram: a sentence node yielding Question 1 and Question 2]
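The last two stages can be illustrated with a toy template table: given a clause plan (constituents selected from the dependency parse) and a question type from the decision tree, realization fills the template for that type. The templates, slot names, and example plan below are all hypothetical stand-ins for the dependency-based realization the thesis actually uses.

```python
# Per-question-type templates; slots come from the clause plan.
TEMPLATES = {
    "what": "do you want to know what {subject} {verb}?",
    "who": "do you want to know who {verb} {object}?",
    "whether": "do you want to know whether {subject} {verb} {object}?",
}

def realize_question(question_type, clause_plan):
    """Fill the template for the classified question type from the clause plan.
    str.format(**plan) simply ignores slots the template does not use."""
    return TEMPLATES[question_type].format(**clause_plan)

# Toy clause plan extracted from "..., the study found":
plan = {"subject": "the study", "verb": "found", "object": "the result"}
print(realize_question("what", plan))
# do you want to know what the study found?
```

Realizing from dependency paths rather than surface constituents is what lets the same plan feed several question types (what, whether, who, why, …).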

  28. Question Generation (example). Sentence: “Among leading U.S. carriers, Sprint was the only one to throttle Skype, the study found.” [Diagram: dependency parse with relations such as root, ccomp, nsubj, xcomp, nmod; constituents selected into a clause plan via paths like /root, /root/nsubj, /root/ccomp; question type chosen from {what, whether, who, why, …}]

  29. Evaluation of Generated Questions o As a transition clause for introducing Sent 2 given Sent 1: “do you want to know ______?” o 4 question generation methods: generic (“more about this article”), constituency-based (Heilman, 2011), dependency-based, human-written o Human judgments on question pairs (A, B, cannot tell): 134 sentences, 5 judgments per pair

  30. Overall Quality: dependency-based outperforms constituency-based, but does not achieve “human performance.” [Stacked bar charts of win/tie/loss judgments vs. the generic and human baselines, for the constituency- and dependency-based methods]

  31. Informativeness: the dependency-based method generates much more informative questions (better than human). [Stacked bar charts of win/tie/loss judgments vs. the generic and human baselines]

  32. Transition Smoothness: dialog context is important! [Stacked bar charts of win/tie/loss judgments vs. the generic and human baselines]
