SLIDE 1 Presenter: Ji, Lu
A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators
Zhihao Fan1, Zhongyu Wei1, Siyuan Wang1, Yang Liu2, Xuanjing Huang3
1 School of Data Science, Fudan University, China 2 Liulishuo Company 3 School of Computer Science, Fudan University, China
SLIDE 2
Outline
§ Introduction § Framework § Experiment § Conclusion
SLIDE 3 Natural question generation
§ Generating a natural question which can potentially engage a human in starting a conversation (Mostafazadeh et al., 2016)
[Mostafazadeh et al., 2016] Nasrin Mostafazadeh, Ishan Misra, Jacob Devlin, Margaret Mitchell, Xiaodong He, and Lucy Vanderwende. 2016. Generating natural questions about an image. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.
SLIDE 4 Existing approaches
Approaches:
- Retrieval based model
- Seq2Seq and its variants, which aim to better fit the labeled data with an NLL loss.
Limitations:
- Naturalness is not emphasized in these models
- No knowledge about unnatural questions
- Hard to identify the progress in generating natural questions
SLIDE 5
Comparing Questions in VQA and VQG
§ VQA questions are much simpler and can be easily answered using information from the source image directly.
§ VQG questions are more complex and their answers are not trivial.
SLIDE 6
Comparing Questions in VQA and VQG
§ Regard VQA questions as negative samples for VQG to train the generator in an adversarial learning fashion
SLIDE 7
Contributions
§ Consider question generation as a language generation task with specific attributes in terms of content and linguistics, i.e., interesting and human-written.
§ For the human-written attribute, we use a generative adversarial network (GAN) to learn a dynamic discriminator that distinguishes human-written questions from machine-generated questions.
§ For the natural attribute, we use questions from VQA as negative samples and questions from VQG as positive samples to train a static discriminator.
SLIDE 8
Outline
§ Introduction § Framework § Experiment § Conclusion
SLIDE 9
Framework
SLIDE 10 Structure for Question Distribution
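§ $\mathcal{Q}$: an overall domain for all the questions.
§ According to the linguistic attribute, we split $\mathcal{Q}$ into two antithetic domains $\mathcal{Q}_m$ (machine generated) and $\mathcal{Q}_h$ (human written).
§ According to the content attribute natural, we further split $\mathcal{Q}_h$ into two antithetic domains $\mathcal{Q}_n^{+}$ (natural) and $\mathcal{Q}_n^{-}$ (descriptive).
- $\mathcal{Q} = \mathcal{Q}_m \cup \mathcal{Q}_h$, $\mathcal{Q}_h = \mathcal{Q}_n^{+} \cup \mathcal{Q}_n^{-}$
- $\mathcal{Q}_n^{+} \subset \mathcal{Q}_h \subset \mathcal{Q}$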
SLIDE 11
Bi-discriminator Configuration
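§ Dynamic discriminator $D_d$ is proposed to distinguish human-written questions from machine-generated questions.
§ It is used to guide the generator to produce questions closer to samples from the human-written domain $\mathcal{Q}_h$.
§ $L_{D_d} = -\mathbb{E}_{Q \sim \mathcal{Q}_m} \log\big(1 - D_d(Q)\big) - \mathbb{E}_{Q \sim \mathcal{Q}_h} \log D_d(Q)$, where $\mathcal{Q}_m$ is the machine-generated domain.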
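A minimal sketch of how the dynamic discriminator loss above could be estimated on a batch, reading it as the usual GAN discriminator objective: the expectation of log(1 - D(Q)) over machine-generated questions plus the expectation of log D(Q) over human-written questions, both negated. The function name, the `eps` smoothing term, and the toy scores are assumptions for illustration, not part of the slides.

```python
import math

def dynamic_discriminator_loss(scores_machine, scores_human, eps=1e-12):
    """Monte Carlo estimate of the discriminator loss.

    scores_machine: discriminator outputs D(Q) in (0, 1) for machine-generated questions.
    scores_human:   discriminator outputs D(Q) in (0, 1) for human-written questions.
    eps is a hypothetical smoothing constant that avoids log(0).
    """
    # -E_{Q ~ machine}[log(1 - D(Q))]: machine questions should score near 0.
    machine_term = -sum(math.log(1.0 - s + eps) for s in scores_machine) / len(scores_machine)
    # -E_{Q ~ human}[log D(Q)]: human questions should score near 1.
    human_term = -sum(math.log(s + eps) for s in scores_human) / len(scores_human)
    return machine_term + human_term

# Toy scores: a discriminator that separates the two domains well vs. one that confuses them.
good = dynamic_discriminator_loss([0.1, 0.2], [0.9, 0.8])
bad = dynamic_discriminator_loss([0.9, 0.8], [0.1, 0.2])
print(good < bad)
```

The loss is lower when machine-generated questions receive low scores and human-written questions receive high scores, which is the behavior the discriminator is trained toward.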
SLIDE 12
Bi-discriminator Configuration
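§ Static discriminator $D_s$ is proposed to distinguish natural questions from descriptive questions.
§ $p_t(Q, I) = \begin{cases} P(Q \in \mathcal{Q}_n^{+} \mid I), & Q \in \mathcal{Q}_n^{+} \mid I \\ P(Q \in \mathcal{Q}_n^{-} \mid I), & Q \in \mathcal{Q}_n^{-} \mid I \end{cases}$, where $\mathcal{Q}_n^{+}$ and $\mathcal{Q}_n^{-}$ are the natural and descriptive domains.
§ $L_{D_s} = -(1 - p_t(Q, I))^{\gamma} \log p_t(Q, I)$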
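A minimal sketch of the static discriminator loss, reading the formula above as a focal-style loss: p_t is the probability the discriminator assigns to the true class of the question (natural or descriptive), and the factor (1 - p_t) raised to an exponent down-weights examples that are already classified confidently. The default `gamma=2.0`, the `eps` term, and the assumption that the descriptive probability is one minus the natural probability are illustrative choices, not values given in the slides.

```python
import math

def p_t(p_natural, is_natural):
    """Probability assigned to the true class.

    p_natural:  predicted probability that the question is natural for the image.
    is_natural: True for a natural (VQG) question, False for a descriptive (VQA) one.
    Assumes the two classes are complementary, so P(descriptive) = 1 - P(natural).
    """
    return p_natural if is_natural else 1.0 - p_natural

def focal_loss(p_natural, is_natural, gamma=2.0, eps=1e-12):
    """-(1 - p_t)^gamma * log(p_t); gamma and eps are assumed hyperparameters."""
    pt = p_t(p_natural, is_natural)
    return -((1.0 - pt) ** gamma) * math.log(pt + eps)

easy = focal_loss(0.9, True)   # confidently correct: strongly down-weighted
hard = focal_loss(0.6, True)   # barely correct: contributes more to the loss
print(easy < hard)
```

With `gamma=0.0` the weighting factor becomes 1 and the expression reduces to the ordinary cross-entropy term for the true class.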
SLIDE 13
Optimize with Reinforcement Learning
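A toy sketch of a REINFORCE-style update in which discriminator scores act as the reward. Everything concrete here is an assumption made for illustration: the policy is a one-step softmax over three fixed candidate questions rather than a token-level generator, the reward is the plain sum of the two discriminator scores, and the baseline is the mean reward. The slides do not specify these details.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_step(logits, rewards, lr=0.5, baseline=0.0, rng=random):
    """One REINFORCE update for a toy one-step policy over candidate questions."""
    probs = softmax(logits)
    # Sample one candidate question from the current policy.
    action = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    advantage = rewards[action] - baseline
    # Gradient of log pi(action) w.r.t. the logits is one_hot(action) - probs.
    return [
        x + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]

# Hypothetical discriminator scores for three candidate questions.
d_dynamic = [0.2, 0.3, 0.9]  # "looks human-written"
d_static = [0.1, 0.4, 0.8]   # "looks natural"
rewards = [a + b for a, b in zip(d_dynamic, d_static)]  # assumed combination
baseline = sum(rewards) / len(rewards)                  # assumed baseline

rng = random.Random(0)
logits = [0.0, 0.0, 0.0]
for _ in range(500):
    logits = reinforce_step(logits, rewards, baseline=baseline, rng=rng)
final_probs = softmax(logits)
print(final_probs)
```

In this toy setting the policy shifts probability toward the candidate that both discriminators score highest, which is the intended effect of using the discriminators as a reward signal.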
SLIDE 14
Outline
§ Introduction § Framework § Experiment § Conclusion
SLIDE 15 Dataset
§ MSCOCO part of Visual Question Generation (VQG)
- Contains 2500, 1250 and 1250 images for training, validation and testing, respectively.
- Each image is accompanied by 5 natural questions produced by human annotators.
§ VQA is used to train the static discriminator.
- For each image in VQA, three questions are collected.
- Contains about 80000, 40000 and 80000 images for training, validation and testing, respectively.
SLIDE 16
Models for Comparison
§ KNN: Retrieves a question from those of similar images.
§ Im2Seq: Generates a question from image features in a Seq2Seq fashion.
§ Im2Seq-pre-train: Im2Seq pre-trained on VQA.
§ MIXER-BLEU-4: Optimizes BLEU-4 directly with RL and curriculum learning.
§ Reinforce-$D_d$: Uses the dynamic discriminator $D_d$ to guide the training of the generator.
§ Reinforce-$D_s$: Uses the static discriminator $D_s$ to guide the training of the generator.
§ Reinforce-$D_d$+$D_s$: Our proposed model.
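A minimal sketch of the retrieval baseline described above: given features for a query image, return the questions attached to the most similar training images. The use of cosine similarity, the three-dimensional toy features, and the example questions are assumptions for illustration; the slides do not say which features or distance the baseline uses.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_retrieve(query_feat, train_set, k=1):
    """Return the questions of the k training images most similar to the query."""
    ranked = sorted(train_set, key=lambda item: cosine(query_feat, item[0]), reverse=True)
    return [question for _, question in ranked[:k]]

# Hypothetical (image feature, question) pairs.
train_set = [
    ([1.0, 0.0, 0.2], "What breed is the dog?"),
    ([0.0, 1.0, 0.1], "Who won the game?"),
    ([0.9, 0.1, 0.3], "Where is the dog going?"),
]
result = knn_retrieve([1.0, 0.05, 0.25], train_set, k=1)
print(result)
```

A retrieval model of this kind can only return questions that already exist in the training set, which is one reason generation models are compared against it.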
SLIDE 17 Automatic Evaluation
Model                    BLEU-4   Corpus BLEU-4   METEOR   ROUGE    CIDEr
KNN                      37.062   19.799          22.413   52.324   50.199
Im2Seq                   36.744   21.028          23.125   54.089   51.171
Im2Seq-pre-train         37.522   22.106          23.877   53.310   54.076
MIXER-BLEU-4             41.674   24.808          24.382   57.777   60.527
Reinforce-$D_d$          38.945   24.420          24.665   56.196   59.513
Reinforce-$D_s$          40.063   25.237          25.492   57.503   61.745
Reinforce-$D_d$+$D_s$    41.098   26.265          25.634   57.679   63.388
SLIDE 18 Human Evaluation
Model                    # of 1   # of 2   # of 3   Avg score
KNN                      214      120      66       1.63
Im2Seq                   182      147      71       1.72
MIXER-BLEU-4             153      172      75       1.81
Reinforce-$D_d$          167      153      80       1.78
Reinforce-$D_d$+$D_s$    149      160      91       1.86
Ground-Truth             50       70       271      2.55
§ 200 images are sampled
§ Questions from different systems are presented for annotation
§ 2 annotators rate each question on a 3-level scale
§ 3 is the most interesting
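A short worked check of how the average score follows from the grade counts, assuming it is the count-weighted mean of the grades 1, 2 and 3. With 200 images and 2 annotators, each system receives 400 ratings. The function name is hypothetical.

```python
def average_score(n1, n2, n3):
    """Count-weighted mean of the grades 1, 2 and 3."""
    total = n1 + n2 + n3
    return (1 * n1 + 2 * n2 + 3 * n3) / total

first_row_avg = average_score(214, 120, 66)   # 652 / 400
proposed_avg = average_score(149, 160, 91)    # 742 / 400
print(first_row_avg, proposed_avg)
```

The first row (214, 120, 66) gives 652 / 400 = 1.63, matching the table, and the proposed model's row gives 742 / 400 = 1.855, listed as 1.86. The human-written reference row as transcribed (50, 70, 271) sums to 391 ratings rather than 400, so it does not reproduce its listed average of 2.55 exactly.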
SLIDE 19
Examples
SLIDE 20
Outline
§ Introduction § Framework § Experiment § Conclusion
SLIDE 21 Conclusion and Future Work
- We propose a reinforcement learning framework for natural question generation that incorporates two discriminators to take two specific attributes of natural questions into consideration.
- It can be generalized to other attributes easily.
- It relies on a labeled dataset to train the discriminators.
- An unsupervised approach is needed.
SLIDE 22 For more information, please contact 1430080043@fudan.edu.cn http://www.sdspeople.fudan.edu.cn/zywei/