Building A User-Centric and Content-Driven Socialbot
Hao Fang
Committee: Mari Ostendorf (Chair), Hannaneh Hajishirzi, Eve Riskin, Yejin Choi, Geoffrey Zweig, Leah M. Ceccarelli (GSR)
Agenda: Background; Sounding Board System; 2017 Alexa Prize
Example article: "UT Austin and Google AI use machine learning on data from NASA's Kepler Space Telescope to discover an eighth planet circling a distant star." (Category tag: astronomy; topic mentions: science, AI, Google)
Graph-Based Document Representation — node types: Subject, Entity, Opinion, Comment; supports Storytelling and Question Answering & Asking.
Storytelling Chain: subject → comment → answer
NLP Tools: Sentence Splitting, Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Dependency Parsing, Constituency Parsing, Coreference Resolution, Entity Linking, Sentence Filtering
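The sentence-filtering step at the end of this stack can be illustrated with simple heuristics. A minimal Python sketch with made-up rules (length bounds and unresolved-reference starts are assumptions for illustration; the actual system relies on the full NLP pipeline above):

```python
import re

# Hypothetical heuristics for the "Sentence Filtering" stage: discard
# sentences that are unlikely to stand alone in a dialog. The real system
# uses the full NLP stack (POS tags, NER, parses, coreference), not regexes.
BAD_STARTS = ("however", "but", "he ", "she ", "they ", "this ", "it ")

def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def filter_sentences(sentences: list[str], min_tokens: int = 5,
                     max_tokens: int = 40) -> list[str]:
    """Keep sentences of reasonable length that do not open with an
    unresolved reference or discourse connective."""
    kept = []
    for s in sentences:
        n = len(s.split())
        if not (min_tokens <= n <= max_tokens):
            continue
        if s.lower().startswith(BAD_STARTS):
            continue
        kept.append(s)
    return kept
```

The point of the sketch is only the shape of the stage: split, then drop sentences a listener could not interpret out of context.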
Candidates: the next N sentences following t_M (the last sentence of the current chain) in the article; each candidate receives a binary label (good vs. bad continuation).
[Figure: Number of Candidate Sentence Chains by binary label — positive vs. negative counts: 662, 865, 1538, 1064]
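The candidate-gathering step described above can be sketched as follows; `make_candidates` and its signature are illustrative, not the thesis code:

```python
def make_candidates(article: list[str], chain_end_idx: int,
                    n: int = 4) -> list[tuple[int, str]]:
    """Return the next n sentences after the chain's last sentence t_M as
    (index, sentence) candidates. Each candidate would then receive a
    binary label: is it a good continuation of the storytelling chain?"""
    start = chain_end_idx + 1
    return list(enumerate(article[start:start + n], start=start))
```

With L=1, N=4 or L=2, N=3 (the settings in the results below), every chain position contributes up to N labeled candidates.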
Baseline: TextRank (unsupervised summarization)
A pre-trained BERT is used for ranking candidate sentences given the chain t_1, t_2, …, t_M.
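One way to use such a model: score each candidate by its similarity to the chain t_1, …, t_M. A minimal sketch assuming precomputed sentence vectors (e.g., pooled BERT states); the thesis's actual ranker combines several feature groups, so this only illustrates the chain-embedding idea:

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two dense sentence vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv + 1e-9)

def rank_candidates(chain_embs: list[list[float]],
                    cand_embs: list[list[float]]) -> list[int]:
    """Score each candidate by its mean cosine similarity to the chain
    sentences t_1..t_M (vectors assumed to come from a pre-trained
    encoder); return candidate indices sorted best-first."""
    scores = [sum(cosine(c, t) for t in chain_embs) / len(chain_embs)
              for c in cand_embs]
    return sorted(range(len(scores)), key=lambda i: -scores[i])
```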
% of cases where the highest-ranked sentence has a positive label:

Feature set       L=1, N=4   L=2, N=3
SentDistance        54.7       62.3
SentEmbedding       62.1       69.3
SentImportance      63.2       71.9
ChainEmbedding      64.8       73.7
All                 66.3       70.2

The next sentence is not always a good continuation (see the SentDistance baseline).
Sentence embedding alone may capture some features about importance and style (e.g., length, informativeness).
Sentence importance (document context) is very useful.
Dialog context matters more as the chain gets longer: ChainEmbedding improves over SentEmbedding by +2.7 points (L=1) and +4.4 points (L=2).
Using all features (2050-dimensional) overfits for L=2, which has only 1239 training samples.
Question generation pipeline: Universal Dependencies parsing → question interestingness/importance (hand-crafted decision tree) → template-based planning → dependency-based realization
Example: "Among leading U.S. carriers, Sprint was the only one to throttle Skype, the study found."
[Dependency parse: ROOT plus relations amod, compound, nsubj, det, xcomp, ccomp, mark, nmod, case, cop, punct]
Clause plan — constituents keyed by dependency path: /root/nsubj (study), /root/ccomp (one), /root (found); e.g., what + /root/nsubj (study) + /root (found).
Question type: what, whether, who, why, …
Do you want to know _____?
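The realization step fills this template from the clause plan's constituents. A hypothetical sketch (`realize_question` and the constituent dictionary mirror the slide's example; the thesis's planner and realizer are more elaborate):

```python
def realize_question(qtype: str, constituents: dict[str, str]) -> str:
    """Fill the 'Do you want to know ...?' template from clause-plan
    constituents keyed by dependency path (e.g., /root, /root/nsubj).
    The keys used here are assumptions based on the slide's example."""
    subj = constituents["/root/nsubj"]   # e.g., "the study"
    verb = constituents["/root"]         # e.g., "found"
    return f"Do you want to know {qtype} {subj} {verb}?"

plan = {"/root/nsubj": "the study", "/root": "found"}
question = realize_question("what", plan)
# e.g., "Do you want to know what the study found?"
```

Because the plan addresses constituents by dependency path rather than by surface position, the same template generalizes across sentences with different word order.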
Human evaluation, win / tie / loss (%) against human-written questions (two criteria):
  Constituency: 35 / 6 / 59    Dependency: 52 / 4 / 44
  Constituency: 18 / 9 / 73    Dependency: 44 / 7 / 49
The dependency-based method outperforms the constituency-based one, but does not achieve "human performance."
Win / tie / loss (%) against human-written questions (two criteria):
  Constituency: 62 / 3 / 35    Dependency: 76 / 3 / 21
  Constituency: 40 / 9 / 51    Dependency: 56 / 7 / 37
The dependency-based method generates much more informative questions (better than human).
Win / tie / loss (%) against human-written questions (two criteria):
  Constituency: 22 / 5 / 73    Dependency: 38 / 4 / 58
  Constituency: 14 / 7 / 79    Dependency: 38 / 5 / 57
Dialog context is important!
Dialogue acts: AskQuestion, RequestHelpOrRepeat, ProposeTopic, AcceptTopic, RejectTopic, FollowAndNonNegative, InterestedInContent, NotInterestedInContent, PositiveToContent, NegativeToContent, PositiveToBot, NegativeToBot.
For each act, two per-conversation statistics are computed — the number of turns with that act (num) and the percentage of turns (pct) — and each is correlated (Pearson r) with the conversation-level user rating.
[Figure: Pearson r per act and statistic, axis ticks at 0.1 and 0.2; some acts show negative correlation.]
Reference: conversation length alone has r = 0.15.
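The per-act statistics and their correlations can be computed as follows; this is a sketch with a hand-rolled Pearson r and hypothetical function names, not the analysis code from the thesis:

```python
import math

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation r between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def act_rating_correlations(conversations: list[dict[str, int]],
                            ratings: list[float]) -> dict[str, dict[str, float]]:
    """For each dialogue act, correlate its per-conversation turn count
    (num) and turn percentage (pct) with the conversation-level rating.
    `conversations` maps act name -> number of turns with that act."""
    acts = sorted({a for conv in conversations for a in conv})
    result = {}
    for act in acts:
        nums = [float(conv.get(act, 0)) for conv in conversations]
        totals = [sum(conv.values()) or 1 for conv in conversations]
        pcts = [x / t for x, t in zip(nums, totals)]
        result[act] = {"num": pearson(nums, ratings),
                       "pct": pearson(pcts, ratings)}
    return result
```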
It is a good sign when the user follows the conversation flow while the bot is the primary speaker. Takeaway: design, learn, and maintain engaging conversation flows (which is not the same as system initiative).
AskQuestion and ProposeTopic slightly impact user ratings in the negative direction. Takeaway: improve the bot's ability to handle user questions and topic requests.
[Figure: Pearson r (axis 0.2–0.4) for conversation-level rating prediction: NumTurns baseline, Linear, Subdialog BiLSTM — the latter two both learned from conversation-level rating regression.]
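The NumTurns baseline regresses the rating on a single feature. A minimal closed-form least-squares sketch (the Linear and Subdialog BiLSTM models in the thesis use richer turn-level features; this only shows the baseline's shape):

```python
def fit_ols(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Closed-form simple least squares for y ≈ a * x + b, e.g.
    x = number of turns, y = conversation-level user rating."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def predict(a: float, b: float, x: float) -> float:
    """Predict a rating from the fitted slope and intercept."""
    return a * x + b
```

That a one-feature baseline already correlates with ratings (longer conversations tend to be rated higher) is why the learned models must be compared against it.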
[Figure: Spearman ρ (axis 0.1–0.4), within-conversation vs. cross-conversation, for NumTurns, Linear, and Subdialog BiLSTM.] The BiLSTM may learn features about the user from surrounding context (+.17 within conversation).
Acknowledgments (partial list): Lu, Yi Luan, Kevin Lybarger, Alex Marin, Julie Medero, Farah Nadeem, Nicole Nichols, Sining Sun, Trang Tran, Ellen Wu, Victoria Zayats