SLIDE 1

Adversarial Learning for Neural Dialogue Generation

Jiwei Li, Will Monroe, Tianlin Shi, Alan Ritter, and Dan Jurafsky. EMNLP’17. Presented by Yiren Wang (CS546, Spring 2018)

SLIDE 2

Main Contributions

  • Goal
  • End-to-end neural dialogue generation system
  • To produce sequences that are indistinguishable from human-generated dialogue utterances

  • Main Contributions
  • Adversarial training approach for response generation
  • Cast the task in a reinforcement learning framework.

SLIDE 3

Outline

  • Model Architecture
  • Adversarial Reinforcement Learning:
  • Adversarial REINFORCE
  • Reward for Every Generation Step (REGS)
  • Teacher Forcing
  • Overall Algorithm (Pseudocode)
  • Experiment Results
  • Summary

SLIDE 4

Adversarial Model

  • Overall Architecture

SLIDE 5

Generative Model

  • Model: Standard Seq2Seq model with attention mechanism
  • Input: dialogue history x
  • Output: response y

(Sutskever et al., 2014; Jean et al., 2014)
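As a concrete illustration, here is a minimal PyTorch sketch of such an attention Seq2Seq generator; the layer sizes, single-layer GRUs, and dot-product attention are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Seq2SeqGenerator(nn.Module):
    """Encodes a dialogue history x and decodes a response y step by step."""
    def __init__(self, vocab_size=10000, hidden_size=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.encoder = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.decoder = nn.GRU(2 * hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, y):
        # x: (batch, src_len), y: (batch, tgt_len) token ids
        enc_out, h = self.encoder(self.embed(x))
        logits = []
        for t in range(y.size(1)):
            emb = self.embed(y[:, t:t + 1])                      # (batch, 1, H)
            # Dot-product attention over the encoder states
            attn = F.softmax(torch.bmm(emb, enc_out.transpose(1, 2)), dim=-1)
            ctx = torch.bmm(attn, enc_out)                       # (batch, 1, H)
            _, h = self.decoder(torch.cat([emb, ctx], dim=-1), h)
            logits.append(self.out(h[-1]))                       # next-token logits
        return torch.stack(logits, dim=1)                        # (batch, tgt_len, vocab)
```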

SLIDE 6

Discriminative Model

  • Model: binary classifier
  • Hierarchical encoder + 2-class softmax
  • Input: dialogue utterances {x, y}
  • Output: label indicating whether generated by human or by machine

  • Q+({x, y}) (by human)
  • Q−({x, y}) (by machine)
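A minimal sketch of this discriminator, again with illustrative sizes; the word-level and context-level GRUs stand in for the paper's hierarchical encoder:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalDiscriminator(nn.Module):
    """Scores a dialogue {x, y} as human- or machine-generated."""
    def __init__(self, vocab_size=10000, hidden_size=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.utt_rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)  # word level
        self.ctx_rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)  # utterance level
        self.cls = nn.Linear(hidden_size, 2)  # 2-class softmax: {human, machine}

    def forward(self, utterances):
        # utterances: (batch, n_utts, utt_len) token ids for the dialogue {x, y}
        b, n, l = utterances.shape
        _, h = self.utt_rnn(self.embed(utterances.view(b * n, l)))
        utt_vecs = h[-1].view(b, n, -1)        # one vector per utterance
        _, h = self.ctx_rnn(utt_vecs)
        # Column 0 is log Q+({x, y}) (human), column 1 is log Q-({x, y}) (machine)
        return F.log_softmax(self.cls(h[-1]), dim=-1)
```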

SLIDE 7

Adversarial REINFORCE

  • Policy Gradient Training
  • Discriminator score is used as reward for generator
  • Generator is trained to maximize the expected reward

SLIDE 8

Policy Gradient Training

J(θ) = E_{y ∼ p(y | x)}[ Q+({x, y}) ]

(the gradient of this expectation is approximated by the likelihood ratio)

SLIDE 9

Policy Gradient Training

∇J(θ) ≈ [ Q+({x, y}) − b({x, y}) ] ∇ log p(y | x)

(approximated by the likelihood ratio: Q+({x, y}) is the discriminator’s classification score; b({x, y}) is a baseline value that reduces the variance of the estimate while keeping it unbiased; the resulting gradient drives the policy updates in the parameter space)
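In code, one vanilla REINFORCE update could look like the sketch below; gen.sample (returning a sampled response with its per-step log-probabilities) and pack (building the discriminator's {x, y} input tensor) are assumed helpers, and the constant baseline is an illustrative simplification:

```python
import torch

def reinforce_step(gen, disc, optimizer, x, baseline=0.5):
    # Sample a response; log_probs holds log p(y_t | x, y_<t) per step
    y, log_probs = gen.sample(x)                # assumed helper
    with torch.no_grad():
        # Q+({x, y}): the discriminator's human-class probability
        reward = disc(pack(x, y)).exp()[:, 0]   # pack(): assumed helper
    # Vanilla REINFORCE: the same (reward - baseline) scales every step
    loss = -((reward - baseline).unsqueeze(1) * log_probs).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward.mean().item()
```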

SLIDE 10

Problem with vanilla REINFORCE

  • Expectation of reward is approximated by only one sample
  • Reward associated with the sample is used for all actions

Input:   What’s your name?
Human:   I am John
Machine: I don’t know   (negative reward)

SLIDE 11

Problem with vanilla REINFORCE

  • Expectation of reward is approximated by only one sample
  • Reward associated with the sample is used for all actions

Input:   What’s your name?
Human:   I am John
Machine: I don’t know   (one negative reward for the whole sequence)

Machine: I (neutral reward) don’t know (negative reward)

SLIDE 12

Reward for Every Generation Step (REGS)

  • Strategies
  • Monte Carlo (MC) Search
  • Training Discriminator For Rewarding Partially Decoded Sequences

SLIDE 13

Strategy I: Monte Carlo (MC) Search

  • Repeats sampling N times
  • Average score is the reward

SLIDE 14

Strategy I: Monte Carlo (MC) Search

  • Repeats sampling N times
  • Average score is the reward

(figure: average reward over the N sampled rollouts)

SLIDE 15

Strategy I: Monte Carlo (MC) Search

  • Repeats sampling N times
  • Average score is the reward


√ More accurate ✘ Time consuming

(figure: average reward over the N sampled rollouts)
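A sketch of the MC search reward for one partially decoded prefix; gen.sample_continuation (rolling the prefix out to a full response) and pack are assumed helpers:

```python
import torch

def mc_reward(gen, disc, x, prefix, n_rollouts=5):
    # Complete the partial response `prefix` N times and average the scores
    scores = []
    for _ in range(n_rollouts):
        y_full = gen.sample_continuation(x, prefix)           # assumed helper
        with torch.no_grad():
            scores.append(disc(pack(x, y_full)).exp()[:, 0])  # Q+({x, y})
    return torch.stack(scores).mean(dim=0)                    # reward for this prefix
```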

SLIDE 16

Strategy II: Reward Partially Decoded Sequences

  • Break generated sequences into partial subsequences
  • Sample one positive and one negative subsequence


√ Time efficient ✘ Less accurate

(figure: the discriminator assigns a score to each partially-generated sequence)
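A sketch of the corresponding discriminator update: one positive (human) and one negative (machine) partial subsequence are sampled per dialogue, as the slide describes; pack is the assumed input-building helper from above:

```python
import random
import torch
import torch.nn.functional as F

def regs_disc_step(disc, optimizer, x, y_human, y_machine):
    # Sample one positive and one negative prefix length
    t_pos = random.randint(1, y_human.size(1))
    t_neg = random.randint(1, y_machine.size(1))
    logp_pos = disc(pack(x, y_human[:, :t_pos]))      # pack(): assumed helper
    logp_neg = disc(pack(x, y_machine[:, :t_neg]))
    # Class 0 = human, class 1 = machine (matches the log-softmax above)
    human = torch.zeros(logp_pos.size(0), dtype=torch.long)
    machine = torch.ones(logp_neg.size(0), dtype=torch.long)
    loss = F.nll_loss(logp_pos, human) + F.nll_loss(logp_neg, machine)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```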

SLIDE 17

Unstable Training

✘ Generator only indirectly exposed to the gold-standard target

  • When the generator deteriorates:
  • The discriminator does an excellent job distinguishing – from +
  • The generator only knows its generated sequences are bad
  • But it loses track of what is good and how to push itself towards good outputs
  • The loss of reward signals leads to a breakdown in training
SLIDE 18

Teacher Forcing

  • Discriminator:
  • assigns a reward of 1 to the human responses
  • Generator:
  • uses this reward to update itself on human-generated examples

“having a teacher intervene and force it to generate true responses”

√ more direct access to the gold-standard targets
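Because the human response gets a fixed reward of 1, this update reduces to an ordinary maximum-likelihood step on the gold-standard target, sketched below with the generator interface assumed earlier:

```python
import torch.nn.functional as F

def teacher_forcing_step(gen, optimizer, x, y_human):
    # Feed the gold response in and predict each next token (reward fixed at 1)
    logits = gen(x, y_human[:, :-1])
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           y_human[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```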

SLIDE 19

Overall Algorithm
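A hedged sketch of how the pieces above fit together; the exact interleaving schedule of generator and discriminator updates is an assumption, not the paper's pseudocode:

```python
def adversarial_training(gen, disc, g_opt, d_opt, batches, n_iters=10000):
    """One possible schedule: alternate G-steps and D-steps each iteration."""
    for i, (x, y_human) in zip(range(n_iters), batches):
        # G-step: policy gradient with the discriminator score as reward
        reinforce_step(gen, disc, g_opt, x)
        # Teacher forcing: direct exposure to the gold-standard target
        teacher_forcing_step(gen, g_opt, x, y_human)
        # D-step: human responses are positive, sampled responses negative
        y_machine, _ = gen.sample(x)                  # assumed helper
        regs_disc_step(disc, d_opt, x, y_human, y_machine)
```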

SLIDE 20

Results

SLIDE 21

Summary

  • Adversarial training for response generation
  • Cast the model in the framework of reinforcement learning
  • Discriminator: Turing test
  • Generator: trained to maximize the reward from discriminator

SLIDE 22

Thanks!