A Reinforcement Learning Framework for Natural Question Generation - PowerPoint PPT Presentation

A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators
Presenter: Ji, Lu
Zhihao Fan (1), Zhongyu Wei (1), Siyuan Wang (1), Yang Liu (2), Xuanjing Huang (3)
(1) School of Data Science, Fudan University, China
(2) Liulishuo Company
(3) School of Computer Science, Fudan University, China

Outline § Introduction § Framework § Experiment § Conclusion

Natural question generation
§ Generating a natural question that can potentially engage a human in starting a conversation (Mostafazadeh et al., 2016).
[Mostafazadeh et al., 2016] Nasrin Mostafazadeh, Ishan Misra, Jacob Devlin, Margaret Mitchell, Xiaodong He, and Lucy Vanderwende. 2016. Generating natural questions about an image. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.

Existing approaches
Approaches:
• Retrieval-based models
• Seq2Seq and its variants, which aim to better fit the labeled data with the NLL loss
Limitations:
• Naturalness is not emphasized in these models
• They have no knowledge of unnatural questions
• It is hard to measure progress in generating natural questions

Comparing Questions in VQA and VQG
§ VQA questions are much simpler and can easily be answered using information taken directly from the source image.
§ VQG questions are more complex, and their answers are not trivial.

Comparing Questions in VQA and VQG
§ Questions from VQA are treated as negative samples for VQG, so that the generator can be trained in an adversarial learning fashion.

Contributions
§ We treat question generation as a language generation task with specific attributes in terms of content and linguistics, i.e., interesting and human-written.
§ For the human-written attribute, we use a generative adversarial network (GAN) to learn a dynamic discriminator that distinguishes human-written questions from machine-generated questions.
§ For the natural attribute, we use questions from VQA as negative samples and questions from VQG as positive samples to train a static discriminator.

Framework

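Structure for Question Distribution
§ An overall domain $\mathcal{D}$ covers all questions.
§ According to the linguistic attribute, we split $\mathcal{D}$ into two antithetic domains: $\mathcal{D}_{m}$ (machine generated) and $\mathcal{D}_{h}$ (human written).
§ According to the content attribute (natural), we further split $\mathcal{D}_{h}$ into two antithetic domains: $\mathcal{D}_{n+}$ (natural) and $\mathcal{D}_{n-}$ (descriptive).
• $\mathcal{D} = \mathcal{D}_{m} \cup \mathcal{D}_{h}$, $\mathcal{D}_{h} = \mathcal{D}_{n+} \cup \mathcal{D}_{n-}$
• $\mathcal{D}_{n+} \subset \mathcal{D}_{h} \subset \mathcal{D}$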

Bi-discriminator Configuration
§ Dynamic discriminator $D_d$ is proposed to distinguish human-written questions (domain $\mathcal{D}_{h}$) from machine-generated questions (domain $\mathcal{D}_{m}$).
§ It is used to guide the generator to produce questions closer to samples from the domain $\mathcal{D}_{h}$.
§ $L_{D_d} = -\mathbb{E}_{Q \sim \mathcal{D}_{m}} \log\left(1 - D_d(Q)\right) - \mathbb{E}_{Q \sim \mathcal{D}_{h}} \log D_d(Q)$
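The dynamic discriminator loss on this slide is the usual binary cross-entropy between machine-generated and human-written samples. The following is a minimal sketch of that computation, assuming the discriminator has already produced a probability in (0, 1) for each question; the function name, the `eps` guard, and averaging over each batch to estimate the expectations are illustrative choices, not taken from the slides.

```python
import math


def dynamic_discriminator_loss(scores_machine, scores_human, eps=1e-12):
    """Estimate the dynamic discriminator loss from two batches of scores.

    scores_machine: discriminator outputs D(Q) for machine-generated questions.
    scores_human:   discriminator outputs D(Q) for human-written questions.
    eps is a small constant (an assumption) that guards against log(0).
    """
    # -E[log(1 - D(Q))] over machine-generated questions
    loss_machine = -sum(math.log(1.0 - s + eps) for s in scores_machine) / len(scores_machine)
    # -E[log D(Q)] over human-written questions
    loss_human = -sum(math.log(s + eps) for s in scores_human) / len(scores_human)
    return loss_machine + loss_human


# An undecided discriminator (0.5 everywhere) versus a confident, correct one.
undecided = dynamic_discriminator_loss([0.5, 0.5], [0.5, 0.5])
confident = dynamic_discriminator_loss([0.1, 0.2], [0.9, 0.8])
print(undecided > confident)  # prints True
```

A discriminator that scores human-written questions high and machine-generated ones low drives this loss toward zero, which is what makes its output usable as a training signal for the generator.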

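Bi-discriminator Configuration
§ Static discriminator $D_s$ is proposed to distinguish natural questions (domain $\mathcal{D}_{n+}$) from descriptive questions (domain $\mathcal{D}_{n-}$).
§ $p_t(Q, I) = \begin{cases} P(Q \in \mathcal{D}_{n+} \mid I), & Q \in \mathcal{D}_{n+} \mid I \\ P(Q \in \mathcal{D}_{n-} \mid I), & Q \in \mathcal{D}_{n-} \mid I \end{cases}$
§ $L_{D_s} = -\left(1 - p_t(Q, I)\right)^{\gamma} \log p_t(Q, I)$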
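The static discriminator loss weights the log-probability of the true class by one minus that probability raised to a power, so confidently correct examples contribute little. The sketch below illustrates this under two assumptions that the slides do not state: the classifier is binary, so the descriptive probability is one minus the natural probability, and the exponent is a tunable parameter, here called `gamma` with an arbitrary default of 2.0.

```python
import math


def static_discriminator_loss(p_natural, is_natural, gamma=2.0, eps=1e-12):
    """Loss for one (question, image) pair.

    p_natural:  predicted probability that the question is natural given the image.
    is_natural: True if the question truly comes from the natural domain.
    gamma:      exponent on (1 - p_t); the default is an assumption.
    """
    # p_t is the probability assigned to the true class
    # (assumes P(descriptive | I) = 1 - P(natural | I)).
    p_t = p_natural if is_natural else 1.0 - p_natural
    return -((1.0 - p_t) ** gamma) * math.log(p_t + eps)


# A well-classified natural question is penalized far less than an uncertain one.
easy = static_discriminator_loss(0.9, is_natural=True)
hard = static_discriminator_loss(0.6, is_natural=True)
print(easy < hard)  # prints True
```

With `gamma=0` the weighting term disappears and the expression reduces to the ordinary negative log-likelihood of the true class.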

Optimize with Reinforcement Learning
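As a rough illustration of how discriminator outputs can act as a reward for the generator, the sketch below uses the standard REINFORCE surrogate loss. The slides do not say how the two discriminator scores are combined, so the weighted sum in `combined_reward`, the `weight` parameter, and the constant `baseline` are all hypothetical choices made only for this example.

```python
def combined_reward(dynamic_score, static_score, weight=0.5):
    """Mix the two discriminator scores into one scalar reward.

    The weighted sum and the default weight are assumptions for illustration.
    """
    return weight * dynamic_score + (1.0 - weight) * static_score


def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE surrogate loss for one sampled question.

    token_log_probs: log-probabilities the generator assigned to each sampled token.
    Minimizing this loss raises the likelihood of the sampled question when its
    reward exceeds the baseline, and lowers it otherwise.
    """
    return -(reward - baseline) * sum(token_log_probs)


# A sampled two-token question scored by both discriminators.
reward = combined_reward(dynamic_score=0.8, static_score=0.4)
loss = reinforce_loss([-1.0, -0.5], reward, baseline=0.1)
print(round(loss, 2))  # prints 0.75
```

In a real training loop the log-probabilities would come from the generator and the loss would be differentiated with respect to its parameters; here they are plain numbers so the arithmetic can be checked by hand.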

Dataset
§ The MSCOCO part of Visual Question Generation (VQG) is used.
§ It contains 2500, 1250 and 1250 images for training, validation and testing respectively.
§ Each image is accompanied by 5 natural questions produced by human annotators.
§ VQA is used to train the static discriminator.
§ For each image in VQA, three questions are collected.
§ VQA contains about 80000, 40000 and 80000 images for training, validation and testing respectively.

Models for Comparison
§ KNN: Retrieves a question from those of similar images.
§ Img2Seq: Generates a question from image features in Seq2Seq fashion.
§ Img2Seq (pre-train): Img2Seq pre-trained on VQA.
§ MIXER-BLEU-4: Optimizes BLEU-4 directly with RL and curriculum learning.
§ Reinforce (D_d): Uses the dynamic discriminator D_d to guide the training of the generator.
§ Reinforce (D_s): Uses the static discriminator D_s to guide the training of the generator.
§ Reinforce (D_d + D_s): Our proposed model.

Automatic Evaluation
Model                    Corpus BLEU-4   BLEU-4   METEOR   ROUGE    CIDEr
KNN                      37.062          19.799   22.413   52.324   50.199
Img2Seq                  36.744          21.028   23.125   54.089   51.171
Img2Seq (pre-train)      37.522          22.106   23.877   53.310   54.076
MIXER-BLEU-4             41.674          24.808   24.382   57.777   60.527
Reinforce (D_d)          38.945          24.420   24.665   56.196   59.513
Reinforce (D_s)          40.063          25.237   25.492   57.503   61.745
Reinforce (D_d + D_s)    41.098          26.265   25.634   57.679   63.388

Human Evaluation
§ 200 images are sampled.
§ Questions from different systems are presented for annotation.
§ 2 annotators rate the questions on a 3-level scale.
§ 3 is the most interesting.
Model                    # of 1   # of 2   # of 3   Avg score
KNN                      214      120      66       1.63
Img2Seq                  182      147      71       1.72
MIXER-BLEU-4             153      172      75       1.81
Reinforce (D_d)          167      153      80       1.78
Reinforce (D_d + D_s)    149      160      91       1.86
Ground-Truth             50       70       271      2.55
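The average score in the human evaluation can be read as the mean grade over all ratings a system received. The short sketch below shows that arithmetic for the first row of the table (KNN); the function name is illustrative, and treating the average as a count-weighted mean of the grades 1, 2 and 3 is an assumption about how the column was computed.

```python
def average_grade(count_1, count_2, count_3):
    """Mean grade, given how many ratings fell on each of the three levels."""
    total = count_1 + count_2 + count_3
    return (1 * count_1 + 2 * count_2 + 3 * count_3) / total


# KNN row: 214 ratings of 1, 120 ratings of 2, 66 ratings of 3
# (400 ratings = 200 images x 2 annotators).
knn_avg = average_grade(214, 120, 66)
print(f"{knn_avg:.2f}")  # prints 1.63
```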

Examples

Conclusion and Future Work
• We propose a reinforcement learning framework for natural question generation that incorporates two discriminators to account for two specific attributes of natural questions.
• It can easily be generalized to other attributes.
• It relies on a labeled dataset to train the discriminators.
• An unsupervised approach is needed.

A Reinforcement Learning Framework for Natural Question Generation Using Bi-discriminators
For more information, please contact 1430080043@fudan.edu.cn
http://www.sdspeople.fudan.edu.cn/zywei/
Zhihao Fan (1), Zhongyu Wei (1), Siyuan Wang (1), Yang Liu (2), Xuanjing Huang (3)
(1) School of Data Science, Fudan University, China
(2) Liulishuo Company
(3) School of Computer Science, Fudan University, China
