
SLIDE 1

Generative Deep Neural Networks for Dialogue

Presented By Shantanu Kumar Adapted from slides by Iulian Vlad Serban

SLIDE 2

What are Dialogue Systems?

Computer system that can converse like a human with another human while making sense

Types of Dialogue
Open Domain
Task Oriented

SLIDE 3

Applications of Dialogue Systems

Technical Support
Product enquiry
Website navigation
HR helpdesk
Error diagnosis
IVR system in Call Centres
Entertainment
IoT interface
Virtual Assistants
Siri, Cortana, Google Assistant
Assistive technology
Simulate human conversations

SLIDE 4

How do we build such a system?

SLIDE 5

Traditional Pipeline models

SLIDE 6

End-To-End models with DL

[Figure: Dialogue Context -> Neural Network -> Response]

SLIDE 7

End-To-End models with DL

[Figure labels: Knowledge Database, Actions]

SLIDE 8

What is a good Chatbot?

The responses should be

Grammatical
Coherent
In Context
Ideally non-Generic responses

SLIDE 9

How can we learn the model?

Unsupervised Learning (Generative Models)
Maximise likelihood w.r.t. words
Supervised Learning
Maximise likelihood w.r.t. annotated labels
Reinforcement Learning
Learning from real users
Learning from simulated users
Learning with given reward function

SLIDE 10

Generative Dialogue Modeling

Decomposing Dialogue Probability
Decomposing Utterance Probability
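
The equations for these two decompositions did not survive in the slide text. A standard autoregressive form is sketched below. The notation is an assumption, not taken from the slide: a dialogue has $M$ utterances $U_1, \dots, U_M$, utterance $U_m$ has $N_m$ tokens $w_{m,1}, \dots, w_{m,N_m}$, and $\theta$ denotes the model parameters.

```latex
% Dialogue probability: each utterance is conditioned on all previous utterances
P_\theta(U_1, \dots, U_M) = \prod_{m=1}^{M} P_\theta(U_m \mid U_{<m})

% Utterance probability: each token is conditioned on the previous tokens
% of the same utterance and on all previous utterances
P_\theta(U_m \mid U_{<m}) = \prod_{n=1}^{N_m} P_\theta(w_{m,n} \mid w_{m,<n}, U_{<m})
```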

SLIDE 11

Maximising likelihood on fixed corpora

Imitating human dialogues

Generative Dialogue Modeling

SLIDE 12

Models proposed with three inductive biases

Long-term memory
Recurrent units used (GRU)
High-level compositional structure
Hierarchical structure
Multi resolution representation (MRRNN paper)
Representing uncertainty and ambiguity
Latent variables (MRRNN and VHRED)

Generative Dialogue Modeling

SLIDE 13

Hierarchical Recurrent Encoder-Decoder (HRED)

Encoder RNN
For encoding each utterance independently into an utterance vector

Context RNN
For encoding the topic/context of the dialogue up till the current utterance using utterance vectors

Decoder RNN
For predicting the next utterance

Akshay: Can be applied to arbitrary lengths
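
The three-RNN structure above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the presented implementation: a tanh RNN cell stands in for the GRU, weights are random and untrained, the decoder is conditioned simply by starting from the context state, and all names (`SimpleRNN`, `TinyHRED`, `next_token_probs`) are hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SimpleRNN:
    """Plain tanh RNN cell; an assumption standing in for the GRU on the slides."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.w_in = make_matrix(hidden_dim, input_dim, rng)
        self.w_rec = make_matrix(hidden_dim, hidden_dim, rng)
        self.hidden_dim = hidden_dim

    def run(self, inputs, h0=None):
        """Consume a sequence of any length and return the final hidden state."""
        h = list(h0) if h0 is not None else [0.0] * self.hidden_dim
        for x in inputs:
            a, b = matvec(self.w_in, x), matvec(self.w_rec, h)
            h = [math.tanh(ai + bi) for ai, bi in zip(a, b)]
        return h


class TinyHRED:
    def __init__(self, vocab_size, emb_dim=8, utt_dim=16, ctx_dim=16, seed=0):
        rng = random.Random(seed)
        self.embeddings = make_matrix(vocab_size, emb_dim, rng)
        self.encoder = SimpleRNN(emb_dim, utt_dim, rng)   # tokens -> utterance vector
        self.context = SimpleRNN(utt_dim, ctx_dim, rng)   # utterance vectors -> context
        self.decoder = SimpleRNN(emb_dim, ctx_dim, rng)   # context -> next utterance
        self.w_out = make_matrix(vocab_size, ctx_dim, rng)

    def encode_context(self, dialogue):
        # Each utterance is encoded independently, then the context RNN
        # runs over the resulting utterance vectors.
        utt_vectors = [self.encoder.run([self.embeddings[t] for t in utt])
                       for utt in dialogue]
        return self.context.run(utt_vectors)

    def next_token_probs(self, dialogue, response_prefix):
        """Softmax distribution over the next token of the response."""
        ctx = self.encode_context(dialogue)
        h = self.decoder.run([self.embeddings[t] for t in response_prefix], h0=ctx)
        logits = matvec(self.w_out, h)
        top = max(logits)
        exps = [math.exp(l - top) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Token ids are hypothetical; utterances and dialogues may have any length.
model = TinyHRED(vocab_size=20)
dialogue = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
probs = model.next_token_probs(dialogue, response_prefix=[0])
```

Because every component just loops over its input sequence, the same model handles dialogues with any number of utterances, which is the point raised in the comment above.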

SLIDE 14

Hierarchical Recurrent Encoder-Decoder (HRED)

SLIDE 15

Bidirectional HRED

Encoder RNN -> Bidirectional
Forward and Backward RNNs combined to get fixed length representation

Concat last state of each RNN
Concat of L2 pooling over temporal dimension
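
The two ways of combining the forward and backward chains can be sketched as follows. This assumes that "L2 pooling over the temporal dimension" means the root mean square of each hidden unit across time steps; the function names and the hidden-state values are hypothetical.

```python
import math


def l2_pool(states):
    """Root mean square of each hidden unit across time steps (assumed definition)."""
    n = len(states)
    dim = len(states[0])
    return [math.sqrt(sum(h[d] ** 2 for h in states) / n) for d in range(dim)]


def combine_last_states(forward_states, backward_states):
    """Option 1: concatenate the last hidden state of each RNN."""
    return forward_states[-1] + backward_states[-1]


def combine_l2_pooled(forward_states, backward_states):
    """Option 2: concatenate the L2-pooled states of each RNN."""
    return l2_pool(forward_states) + l2_pool(backward_states)


# Hypothetical hidden states, listed in the order each RNN produced them
# (the backward RNN reads the utterance right to left).
forward_states = [[3.0, 0.0], [4.0, 1.0]]
backward_states = [[0.0, 2.0], [0.0, 2.0]]

last = combine_last_states(forward_states, backward_states)
pooled = combine_l2_pooled(forward_states, backward_states)
```

Both options give a vector of fixed length (twice the hidden size) regardless of how many tokens the utterance has.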

Hierarchical Recurrent Encoder-Decoder (HRED)

SLIDE 16

Hierarchical Recurrent Encoder-Decoder (HRED)

Bootstrapping

Initialising with Word2Vec embeddings
Trained on Google News dataset
Pre-training on SubTle Q-A dataset
5.5M Q-A pairs
Converted to 2-turn dialogue

D = {U1 = Q, U2 = A}

Barun Akshay Prachi Dinesh Gagan

Prachi: 2 stage training

SLIDE 17

Dataset - MovieTriples dataset

Open Domain - Wide variety of topics covered
Names and Numbers replaced with <person> and <number> tokens
Vocab of 10K most popular tokens
Special <continued-utterance> and <end-of-utterance> tokens to capture breaks

Gagan, Rishabh, Dinesh: Why only triples?
Anshul: Split train/val on movies?
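
The preprocessing steps listed on this slide can be sketched roughly as below. Several details are assumptions rather than facts about the dataset: lowercasing, whitespace tokenisation, the number regex, and the `<unk>` token for out-of-vocabulary words. Replacing names with `<person>` would need a named-entity tagger and is not shown.

```python
import re
from collections import Counter

# Assumed pattern: integers and simple decimals such as "12.04".
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def normalise(utterance):
    """Lowercase, replace numbers with <number>, split on whitespace."""
    return NUMBER_RE.sub("<number>", utterance.lower()).split()


def build_vocab(dialogues, max_size=10_000):
    """Keep the max_size most frequent tokens across all dialogues."""
    counts = Counter(tok
                     for dialogue in dialogues
                     for utt in dialogue
                     for tok in normalise(utt))
    return {tok for tok, _ in counts.most_common(max_size)}


def to_tokens(utterance, vocab, unk="<unk>"):
    """Map rare tokens to unk and mark the end of the utterance."""
    tokens = [tok if tok in vocab else unk for tok in normalise(utterance)]
    return tokens + ["<end-of-utterance>"]


# Tiny hypothetical corpus; the vocabulary is capped at 3 to show the effect.
dialogues = [["I have 2 cats", "you have 10 cats"]]
vocab = build_vocab(dialogues, max_size=3)
tokens = to_tokens("I have 3 dogs", vocab)
```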

SLIDE 18

Dialogue Modeling

Ubuntu Dialog Corpus

Goal-driven: Users resolve technical problems
~0.5M dialogues

Twitter Dialog Corpus

Open-domain: Social chit-chat
~0.75M dialogues in Train, 100K for Val and Test
Average of 6.27 utterances and 94 tokens per dialogue

SLIDE 19

Example - Ubuntu Corpus

User: Hello! Recently I updated to ubuntu 12.04 LTS and I am unsatisfied by its performance. I am facing a bug since the upgrade to 12.04 LTS. Can anyone help??????????

SLIDE 20

Example - Ubuntu Corpus

User: Hello! Recently I updated to ubuntu 12.04 LTS and I am unsatisfied by its performance. I am facing a bug since the upgrade to 12.04 LTS. Can anyone help??????????
Expert: You need to give more details on the issue.

SLIDE 21

Example - Ubuntu Corpus

User: Hello! Recently I updated to ubuntu 12.04 LTS and I am unsatisfied by its performance. I am facing a bug since the upgrade to 12.04 LTS. Can anyone help??????????
Expert: You need to give more details on the issue.
User: Every time I login it gives me "System Error" pop up. It is happing since I upgraded to 12.04.

SLIDE 22

Example - Ubuntu Corpus

User: Hello! Recently I updated to ubuntu 12.04 LTS and I am unsatisfied by its performance. I am facing a bug since the upgrade to 12.04 LTS. Can anyone help??????????
Expert: You need to give more details on the issue.
User: Every time I login it gives me "System Error" pop up. It is happing since I upgraded to 12.04.
Expert: Send a report, or cancel it.

SLIDE 23

Example - Ubuntu Corpus

User: Hello! Recently I updated to ubuntu 12.04 LTS and I am unsatisfied by its performance. I am facing a bug since the upgrade to 12.04 LTS. Can anyone help??????????
Expert: You need to give more details on the issue.
User: Every time I login it gives me "System Error" pop up. It is happing since I upgraded to 12.04.
Expert: Send a report, or cancel it.
User: I have already done that but after few min, it pops up again...

SLIDE 24

Example - Twitter Corpus

Speakers: Person A, Person B

Hanging out in the library for the past couple hours makes me feel like I'll do great on this test!
@smilegirl400 wow, what a nerd lol jk haha =p what!? you changed your bio =(
@smileman400 Do you like my bio now? I feel bad for changing it but I like change. =P
@smilegirl400 yes I do =) It definitely sums up who you are lisa.
Yay! you still got me =)

SLIDE 25

Evaluation Metric

Word Perplexity
Measures the probability of generating the exact reference utterance

Word error-rate
Number of words in the dataset the model has predicted incorrectly divided by the total number of words in the dataset.

Penalises diversity [Akshay]
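
Both metrics can be written in a few lines. This is a sketch under assumptions: perplexity is taken as the exponential of the average negative log-probability of the reference tokens, the word error-rate compares one predicted word per reference position, and the function names and example values are hypothetical.

```python
import math


def word_perplexity(token_probs):
    """Perplexity from the probabilities the model assigns to each reference token."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)


def word_error_rate(predicted, reference):
    """Fraction of positions where the predicted word differs from the reference."""
    if len(predicted) != len(reference):
        raise ValueError("expected one prediction per reference word")
    wrong = sum(p != r for p, r in zip(predicted, reference))
    return wrong / len(reference)


# A model that spreads its probability evenly over 4 words at every step
# has a perplexity of about 4.
ppl = word_perplexity([0.25, 0.25, 0.25, 0.25])

# One wrong word out of four gives an error rate of 0.25.
wer = word_error_rate(["i", "am", "fine", "."], ["i", "am", "good", "."])
```

The second example also shows why this error rate penalises diversity: "fine" is a perfectly valid alternative to "good", yet it counts as an error.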

Barun Akshay Dinesh Rishabh Arindam Anshul

SLIDE 26

Word Perplexity
Can only be used with generative models
Given an utterance, what is the probability?

How do we evaluate given an output utterance?

Multi-modal output
Space of possible valid utterances is huge
Human annotation is expensive and slow

Evaluation Metric

SLIDE 27

How do we evaluate given an output utterance?

Multi-modal output
Space of possible valid utterances is huge
Human annotation is expensive and slow

Automatic Evaluation Metrics

Word overlap measure (BLEU, ROUGE, Levenshtein dist.)
Embedding based measures
Poor correlation with Human annotation
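
As one concrete word-overlap measure, a word-level Levenshtein distance can be sketched as below. The function name and the example sentences are hypothetical; the second example hints at why such measures can correlate poorly with human judgement.

```python
def word_levenshtein(hypothesis, reference):
    """Minimum number of word insertions, deletions and substitutions."""
    hyp, ref = hypothesis.split(), reference.split()
    prev = list(range(len(ref) + 1))  # distance from empty hypothesis
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Near-identical wording scores as close ...
close = word_levenshtein("i do not know", "i don't know")

# ... while a plausible response with different words scores as maximally far.
far = word_levenshtein("restart the display manager", "try rebooting your machine")
```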

Evaluation Metric

SLIDE 28

Results

Lack of error analysis

SLIDE 29

MAP Output

Most probable last utterance
Found using beam search for better approximation
Generic responses observed
Stochastic sampling gives more diverse dialogues

Nupur: MAP vs Stochastic Sampling
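
The contrast between MAP decoding and stochastic sampling can be illustrated on a toy model. Everything here is an assumption made for illustration: the bigram-style probability table `TOY_MODEL`, the token set, and the function names. Beam search approximates the most probable output, while sampling draws each token from the model distribution.

```python
import math
import random

# Hypothetical toy model: next-token probabilities depend only on the previous token.
TOY_MODEL = {
    "<s>": {"i": 0.6, "the": 0.4},
    "i": {"don't": 0.7, "upgraded": 0.3},
    "don't": {"know": 0.9, "care": 0.1},
    "know": {"</s>": 1.0},
    "care": {"</s>": 1.0},
    "upgraded": {"</s>": 1.0},
    "the": {"bug": 0.5, "kernel": 0.5},
    "bug": {"</s>": 1.0},
    "kernel": {"</s>": 1.0},
}


def beam_search(model, beam_size=2, max_len=10):
    """Approximate the most probable sequence by keeping beam_size partial candidates."""
    beams = [(0.0, ["<s>"])]
    finished = []
    for _ in range(max_len):
        candidates = [(logp + math.log(p), seq + [tok])
                      for logp, seq in beams
                      for tok, p in model[seq[-1]].items()]
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for logp, seq in candidates[:beam_size]:
            (finished if seq[-1] == "</s>" else beams).append((logp, seq))
        if not beams:
            break
    return max(finished or beams, key=lambda c: c[0])[1]


def sample(model, rng, max_len=10):
    """Draw one sequence by sampling each token from the model distribution."""
    seq = ["<s>"]
    while seq[-1] != "</s>" and len(seq) < max_len:
        tokens, probs = zip(*model[seq[-1]].items())
        seq.append(rng.choices(tokens, weights=probs)[0])
    return seq


map_output = beam_search(TOY_MODEL)
rng = random.Random(0)
samples = [sample(TOY_MODEL, rng) for _ in range(200)]
```

In this toy table the highest-probability sequence is the generic "i don't know", whereas repeated sampling also produces the less likely, more specific sequences.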

SLIDE 30

SLIDE 31

Extensions

Model

[Barun][Rishabh] Attention model during decoding for long contexts

[Prachi] Dialogue systems with multiple participants
Different decoders for each participant?
Order of speaking
[Rishabh] Incorporating outside knowledge using KB

SLIDE 32

Extensions

Data

[Akshay][Surag] Use bigger datasets like Reddit for dialogue
[Rishabh] Using film dialogue scripts from films like "Ek ruka hua fasla" might be useful.

[Barun] Artificially scoring generic responses
[Surag] Prune generic responses from training data

SLIDE 33

Extensions

[Prachi] Automatic generation of dialogue for movie given storyline and character description

[Gagan] Pre-train word embeddings on SubTle
[Arindam] RL is the best bet to avoid generic responses
[Arindam] Adversarial evaluation
[Arindam] Train additional context to add consistency?

SLIDE 34

Thank You