Intelligent Chatbot on WeChat
WeChat AI NLP 2017.05.09
WeChat is the leading mobile social network in China.
In 6 years, WeChat has gained…
Data: Tencent Financial Report
than 1 hour on WeChat every day
Data: Penguin Intelligence
WeChat is not just a mobile messaging app. It’s a new lifestyle, connecting people with people, services, devices and more.
WeChat Overview
The WeChat Lifestyle
Red Pocket (Jan 27 – Feb 01): 46 Billion
Emoji (Jan 27 – Feb 01): 16 Billion
Voice Call (Jan 27 – Jan 28): 2.1 Billion minutes
Chinese New Year 2017
Powered by WeChat
Messaging (can be automated)
Account management
Service Accounts
China Merchants Bank case
Over 10 million followers
Open an account
Pay bill/loan
Receive payment notifications
Receive CRM promotions
Messaging
Account management
Service Accounts
China Southern Airlines case
Buy tickets
Check in
Choose seats
Flight status update
Frequent flyer services
…
4.3 (2012.10): Voice Search
4.5 (2013.2): Voice Reminder, Shake Music
5.0 (2013.8): Voice Input, Scan Cover/Word
5.2 (2013.10): Voice to Text
5.3 (2014.1): Shake TV
5.4 (2014.6): Scan Cards, Smart Open Platform
6.0 (2014.12): Voice Print
6.2 (2015.4): Data Mining
Now: Amber Platform, BOTs Platform
WeChat AI
Growth in 6 years
WeChat Amber Platform
Highlights Applications
Experiments – GoogLeNet
Speech Recognition
Features Applications
grammar
s speech recognition
Infrastructure
– Key personal information extraction and verification
– Classify tens of millions of images daily
– Supports 3 levels and 1,000 categories
– CNN/RNN/LSTM
– End-to-end deep learning
Applications Algorithms
Chatbot Examples
service, information, knowledge, etc.
[Figure: chatbot architecture. Labels: Question; Question Parsing; Question Understanding Output; Rule Match; QnA; Chitter Chat Model; Answer Candidates; Answer Ranking; Answer; Context; Knowledge Base.]
[Figure: extended chatbot architecture. Labels: Question; Question Parsing; Question Understanding Output; Rule Match; QnA; Document Content; Chitter Chat Model; Answer Candidates; Answer Ranking; Answer; Sentiment Analysis; Sentiment Analysis Output; Context; Personalization; Knowledge Graph. Some components are marked "Under development".]
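The architecture labels suggest that several modules (Rule Match, QnA, Chitter Chat Model) each propose answer candidates, which an Answer Ranking step then reduces to one answer. The following is only a minimal sketch of that idea: the `Candidate` type, the three toy modules, their rules, and their fixed scores are all hypothetical and are not taken from the slides.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    source: str   # which module proposed the answer
    text: str     # the proposed answer
    score: float  # hypothetical confidence used for ranking


def rule_match(question: str) -> List[Candidate]:
    # Hypothetical exact-match rules.
    rules = {"hello": "Hello! How can I help you?"}
    key = question.strip().lower()
    return [Candidate("rule", rules[key], 1.0)] if key in rules else []


def qna(question: str) -> List[Candidate]:
    # Hypothetical knowledge base looked up by keyword containment.
    knowledge_base = {"opening hours": "We are open 9:00-18:00."}
    text = question.lower()
    return [Candidate("qna", a, 0.8) for k, a in knowledge_base.items() if k in text]


def chit_chat(question: str) -> List[Candidate]:
    # Stand-in for a chat model: always offers a low-confidence fallback.
    return [Candidate("chat", "Tell me more.", 0.1)]


Generator = Callable[[str], List[Candidate]]


def answer(question: str,
           generators: Sequence[Generator] = (rule_match, qna, chit_chat)) -> Optional[str]:
    # Collect candidates from every module, then rank by score.
    candidates = [c for gen in generators for c in gen(question)]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.score).text
```

In this sketch the ranking is a plain maximum over module-assigned scores; a real ranker would presumably also use the Context and Knowledge Base components that the figure names.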
Conversational Chatbot
How can I be happy? Why am I so busy?
Question Understanding:
(master is busy)
he doing?)
(being not obsessive)
how?)
Knowledge Representation:
ial certificates, executed in the mainland, and to be used in Hong Kong Special Administrative Region
the Consular Department of the Ministry of Foreign Affairs of the People's Republic of China
个正确的东西,否则心在烦恼(affliction)中时是很难转动的。要不断培养自己的发心(bodhicitta-samutpada),让它越来越宽广,越来越清净,烦恼自然就越来越少。恨(hatred)也好,念(obsession)也好,都是妄想(delusion),消耗心力、迷障未来。
(…a correct thing; otherwise, when the mind is caught in affliction, it is very hard to turn it around. Keep cultivating your aspiration (bodhicitta-samutpada) so that it grows ever broader and purer, and afflictions will naturally become fewer. Hatred and obsession alike are delusions that drain the mind's energy and cloud the future.)
Answer Generation: avoid trivial and boring answers
(busy now)
(take your time)
(see you later)
(dogs are cute) 是很可爱 (yes, they are cute)
[Figure: inputs x0 … xn pass through an Embedding Layer to vectors V0 … Vn, with hidden states h0 … hn.]
[Figure: inputs x0 … x8 pass through an Embedding Layer to vectors V0 … V8.]
Input:
q: current query
c: context
Output:
q': current query after anaphora resolution
H: replace pronouns in the current query with noun phrases in the context
About 5% of the total queries
Examples:
C1: 你是陈奕迅粉丝吗? (Are you a fan of Eason Chan?)
C2: 更喜欢张学友 (I like Jacky Cheung more)
q: 为什么更喜欢他? (Why do you like him more?)
q': 为什么更喜欢张学友 (Why do you like Jacky Cheung more?)

C1: 你住哪儿? (Where do you live?)
C2: 不二寺。 (Bu’er Temple)
q: 那在哪儿? (Where is it?)
q': 不二寺在哪儿? (Where is Bu’er Temple?)
Model Building: Anaphora Resolution
Context: 陈奕迅 粉丝 更 喜欢 张学友 (Eason Chan / fan / more / like / Jacky Cheung)
Query: 为什么 更 喜欢 他 (why / more / like / him)
$\max P(\text{张学友} \mid \text{为什么更喜欢他})$
“他” (him) = “张学友” (Jacky Cheung)
q' = 为什么更喜欢张学友
Example:
C1: 你是陈奕迅粉丝吗?
C2: 更喜欢张学友
q: 为什么更喜欢他?
q': 为什么更喜欢张学友
caused by the mistakes of entity tagging
A bad case:
C1: 你认识贤三吗? (Do you know 贤三?)
C2: 当然认识。 (Of course I do.)
q: 他是你什么人? (What is he to you?)
q': 三是你什么人? (What is 三 to you?)
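The substitution step described above (pick a noun phrase from the context, then replace the pronoun in the query) can be sketched as follows. This is an illustration under assumptions, not the presenters' model: the pronoun list, the function names, and the recency-based scorer that stands in for the learned probability are all hypothetical.

```python
from typing import Callable, Sequence

# Illustrative pronoun list; the slides do not say which pronouns are handled.
PRONOUNS = ("他", "她", "它", "那")


def resolve_anaphora(query: str,
                     candidates: Sequence[str],
                     score: Callable[[str, str], float]) -> str:
    """Replace the first pronoun in `query` with the best-scoring candidate."""
    pronoun = next((p for p in PRONOUNS if p in query), None)
    if pronoun is None or not candidates:
        return query  # nothing to resolve
    best = max(candidates, key=lambda cand: score(cand, query))
    return query.replace(pronoun, best, 1)


def recency_score(candidates: Sequence[str]) -> Callable[[str, str], float]:
    """Toy stand-in for P(candidate | query): later mentions score higher."""
    rank = {cand: i for i, cand in enumerate(candidates)}
    return lambda cand, query: float(rank[cand])


mentions = ["陈奕迅", "张学友"]  # noun phrases tagged in the context
print(resolve_anaphora("为什么更喜欢他", mentions, recency_score(mentions)))
# prints 为什么更喜欢张学友
```

The bad case follows directly from this design: if the entity tagger yields 三 instead of 贤三 as the only candidate, the same substitution produces 三是你什么人.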
Input:
q: current query
c: context
Output:
q': current query after query completion
H: complete the current query with information in the context
About 15% of the total queries
Examples:
C1: 那你会发表情包吗? (Can you send emojis?)
C2: 一般不发 (Usually I don’t send emojis)
q: 为什么? (Why?)
q': 为什么不发表情包 (Why not send emojis?)

C1: 讲个故事给我听 (Tell me a story)
C2: 等我学会了给你讲哦。 (I’ll tell you a story once I learn how to)
q: 我等着 (I’m waiting)
q': 我等着听故事 (I’m waiting for the story)
Model Building: Anaphora Resolution
Training Sample:
C1: 讲个故事给我听
C2: 等我学会了给你讲哦。
q: 我等着
q': 我等着听故事
Xian’er Mechanical Monk by 11%
[Figure: model with input x (讲 个 故 事 给 我 听 _E_ 等 ...) and output y (我 等 着 听 故 ...).]
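The training sample and the x / y labels suggest that the dialog turns are split into characters, joined with an `_E_` separator to form the input x, and that the completed query q' forms the target y. A minimal sketch of building such a pair is given below. The function name is hypothetical, and appending the current query after the context turns is an assumption, since the figure only shows the start of x.

```python
from typing import List, Sequence, Tuple

END_OF_TURN = "_E_"  # separator token as it appears in the figure


def build_sample(context: Sequence[str],
                 query: str,
                 completed_query: str) -> Tuple[List[str], List[str]]:
    """Return (x, y): character tokens of the dialog and of the completed query."""
    x: List[str] = []
    for turn in list(context) + [query]:  # assumption: the query follows the context
        x.extend(turn)                    # one token per character
        x.append(END_OF_TURN)
    x.pop()                               # no separator after the final turn
    y = list(completed_query)
    return x, y


x, y = build_sample(["讲个故事给我听", "等我学会了给你讲哦"], "我等着", "我等着听故事")
print(" ".join(x[:9]))  # 讲 个 故 事 给 我 听 _E_ 等
print(" ".join(y))      # 我 等 着 听 故 事
```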
Selected Results
你去问问师父喜欢你吗 (Go ask your master whether he likes you)
不会的,问你师父去 (No; go ask your master)
什么时候问必要 (When is it necessary to ask)
Query Completion Results in Real Dialogs
Does your master like you?
Need to ask him
ask
Ask your master if he likes you
Sentence Similarity Computation
Unsupervised word embedding approach is not good enough
Sentence 0 | Sentence 1 | Similarity based on Embedding | Similar Enough?
你是谁 (Who are you?) | 我是谁 (Who am I?) | 0.93 | No
我爱你 (I love you) | 你爱我 (You love me) | 0.89 | No
吃饭了吗 (Have you had lunch?) | 吃饭了 (Just had lunch) | 0.84 | No
你干嘛的 (What is your job?) | 你干嘛呢 (What are you doing?) | 0.93 | No
有轮回吗? (Is reincarnation true?) | 轮回有结束吗 (Will the cycle of life end?) | 0.73 | No
会不会轮回 (Will reincarnation happen?) | 会不会轮回结束 (Will reincarnation end?) | 0.84 | No
随喜您 (You did it well) | 您做的很好 (You did it well) | 0.20 | Yes
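One way to see why sentences built from the same words score as near-identical is to average word vectors and take the cosine. The sketch below uses that scheme with made-up three-dimensional vectors. Both the averaging and the vectors are assumptions for illustration; the slides do not say how the embedding-based similarity was computed.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical toy embeddings, one vector per character.
TOY_VECTORS: Dict[str, List[float]] = {
    "我": [1.0, 0.0, 0.2],
    "爱": [0.1, 1.0, 0.3],
    "你": [0.9, 0.1, 0.4],
}


def sentence_vector(tokens: Sequence[str], vectors: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of all tokens (word order is discarded)."""
    dims = len(next(iter(vectors.values())))
    total = [0.0] * dims
    for token in tokens:
        for i, value in enumerate(vectors[token]):
            total[i] += value
    return [value / len(tokens) for value in total]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity(s0: Sequence[str], s1: Sequence[str]) -> float:
    return cosine(sentence_vector(s0, TOY_VECTORS), sentence_vector(s1, TOY_VECTORS))


# 我爱你 (I love you) and 你爱我 (you love me) contain the same characters,
# so their averaged vectors coincide and the similarity is 1.0 up to rounding.
print(round(similarity(list("我爱你"), list("你爱我")), 6))
```

Because averaging ignores order, swapping subject and object cannot lower the score, which is consistent with the high similarities reported for pairs that are not similar enough.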
Supervised Learning for Sentence Similarity
Feature Embedding Model
unigrams
bi-grams
word pairs from two sentences
edit operations
什么 含义 (what meaning) vs. 什么 意思 (what meaning)
match-什么-什么
replace-含义-意思
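Feature strings such as match-什么-什么 and replace-含义-意思 can be derived from an alignment of the two token sequences. The sketch below uses Python's `difflib.SequenceMatcher` as a stand-in aligner. The slides do not say which alignment algorithm was used, and the function names and the `uni-` / `bi-` / `delete-` / `insert-` prefixes are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Sequence


def edit_features(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Turn the alignment of two token lists into edit-operation features."""
    features: List[str] = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left, right = "+".join(a[i1:i2]), "+".join(b[j1:j2])
        if tag == "equal":
            features.extend(f"match-{tok}-{tok}" for tok in a[i1:i2])
        elif tag == "replace":
            features.append(f"replace-{left}-{right}")
        elif tag == "delete":
            features.append(f"delete-{left}")
        else:  # tag == "insert"
            features.append(f"insert-{right}")
    return features


def ngram_features(tokens: Sequence[str]) -> List[str]:
    """Unigram and bi-gram features for one sentence."""
    unigrams = [f"uni-{t}" for t in tokens]
    bigrams = [f"bi-{x}-{y}" for x, y in zip(tokens, tokens[1:])]
    return unigrams + bigrams


print(edit_features(["什么", "含义"], ["什么", "意思"]))
# prints ['match-什么-什么', 'replace-含义-意思']
```

Each feature string would then be mapped to an embedding and fed to a classifier; that part is not sketched here because the slides give no detail on it.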
RNN for sentence similarity
Question 0 Question 1
Model | Accuracy
Unsupervised word embedding | 0.63
RNN + cosine similarity | 0.65
RNN + MLP | 0.6878
CNN + MLP | 0.6968
RNN + Tensor | 0.728
Feature Embedding | 0.75
220,000 sentence pairs for training; 20,000 for testing
[Figure: Translation maps one sentence in Language A to one sentence in Language B; Response Generation maps an input sentence to a response sentence.]
A diverter is developed to generate the mechanism distribution of an input post
knowledge graph article content
knowledge management model tuning chatbot customization
Voice & Audio; Natural Language Processing; Machine Learning; Image & Video
AI@tencent.com niucheng@tencent.com Beijing, Guangzhou, Shenzhen, Palo Alto
Machine Translation
[Figure: speed on 4 GPUs (samples/sec) by batch_size, comparing amber, mxnet, and tf.]
WeChat A.I. NLP