SLIDE 1

A FRAMEWORK FOR AUTOMATIC QUESTION GENERATION FROM TEXT USING DEEP REINFORCEMENT LEARNING

SCAI 2019, 12/08/2019. Vishwajeet Kumar (1,2,3), Ganesh Ramakrishnan (2), Yuan-Fang Li (3)

(1) IITB-Monash Research Academy, (2) IIT Bombay, (3) Monash University

SLIDE 2

OUTLINE

  • Introduction & motivation
  • The generator-evaluator framework
  • Evaluation
  • Conclusion
SLIDE 3

WHEN / WHERE / WHY DO WE ASK QUESTIONS?

  • Organisation: policies, product & service documentation, patents, meeting minutes, FAQ, …

  • Education: reading comprehension assessment
  • Healthcare: clinical notes
  • Technology: chatbots, customer support, …
SLIDE 4

THE QUESTION GENERATION TASK

  • Goal: automatically generating questions from sentences or paragraphs


  • Challenges
  • Questions must be well-formed
  • Questions must be relevant
  • Questions must be answerable
SLIDE 5

MOTIVATION

  • QG: a (relatively) recent task, cast as a Seq2Seq problem
  • RNN-based models with attention perform well for short sentences
  • However, for longer text they perform poorly
  • Cross-entropy loss may make the training process brittle: the exposure bias problem

SLIDE 6

EXAMPLE GENERATED QUESTIONS

MODEL                            QUESTION
Seq2Seq with cross-entropy loss  what year was new york named ?
Copy-aware seq2seq               what year was new new amsterdam named ?
GE (Seq2Seq with BLEU)           what year was new york founded ?

Example text: “new york city traces its roots to its 1624 founding as a trading post by colonists of the dutch republic and was named new amsterdam in 1626 .”

SLIDE 7

TO BE MORE SPECIFIC

  • QG performance is evaluated using discrete metrics like BLEU, ROUGE, etc., not cross-entropy loss
  • Need for a mechanism to deal with relatively rare and important words
  • Need to handle the word repetition problem while decoding
SLIDE 8

OUTLINE

  • Introduction & motivation
  • The generator-evaluator framework
  • Evaluation
  • Conclusion
SLIDE 9

A GENERATOR-EVALUATOR FRAMEWORK FOR QG

  • Generator (semantics)
  • Identifies pivotal answers (Pointer Networks)
  • Recognises contextually important keywords (Copy)
  • Avoids redundancy (Coverage)
  • Evaluator (structure)
  • Optimises conformity towards ground-truth questions
  • Reinforcement learning with performance metrics as rewards (sketched below)

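To make the division of labour concrete, a minimal sketch of the training loop this slide describes follows. All names here (`generator`, `evaluator` and their methods) are hypothetical stand-ins for illustration, not the authors' code.

```python
# High-level sketch of the generator-evaluator loop (hypothetical API):
# the generator handles semantics (pivotal answers, copying, coverage),
# while the evaluator scores structural conformity to the gold question
# and feeds that score back as a reward.
def train_generator_evaluator(generator, evaluator, corpus, epochs=10):
    for _ in range(epochs):
        for sentence, gold_question in corpus:
            answer = generator.pick_pivotal_answer(sentence)    # pointer network
            question = generator.decode(sentence, answer)       # copy + coverage
            reward = evaluator.score(question, gold_question)   # e.g. BLEU / ROUGE-L
            generator.reinforce(reward)                         # policy-gradient update
```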
SLIDE 10

REINFORCEMENT LEARNING FOR QG

[Figure: the generator emits question words from the context vector; the evaluator scores them with BLEU, ROUGE-L, METEOR, etc., and the resulting reward drives the generator's parameter update.]
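As an illustration of the loop in the figure, below is a minimal REINFORCE-style update in PyTorch, assuming a `generator` that can sample a question and return per-token log-probabilities (a hypothetical interface, not the paper's implementation). The reward here is sentence-level BLEU via NLTK; the paper also uses ROUGE-L, METEOR and other rewards.

```python
import torch
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smooth = SmoothingFunction().method1

def rl_step(generator, optimizer, sentence, gold_question):
    """One policy-gradient step; `gold_question` is a list of tokens."""
    # Sample a question from the current policy, keeping per-token log-probs.
    tokens, log_probs = generator.sample(sentence)  # hypothetical interface
    # The evaluator's reward: a sentence-level metric over the sampled tokens.
    reward = sentence_bleu([gold_question], tokens, smoothing_function=smooth)
    # REINFORCE: scale the sequence log-likelihood by the reward and ascend.
    loss = -reward * torch.stack(log_probs).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```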

SLIDE 11

ARCHITECTURE

[Figure: the generator combines a Bi-LSTM sentence encoder, a Pointer Network answer encoder, and an LSTM question decoder; the attention distribution, context vector, word coverage vector and copy probability (Pcg) are merged with the vocabulary distribution into the final output distribution. The evaluator computes the reward by comparing sampled questions (Y_samples) against the gold question (Y_gold) from the training data.]
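The "final distribution" box combines generating and copying; below is a sketch of that step in the style of a standard pointer-generator network with coverage, which the figure's components correspond to. Tensor names and shapes are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def final_distribution(p_vocab, attention, src_ids, p_gen):
    """Mix the vocabulary distribution with the copy distribution.

    p_vocab:   (batch, vocab)    decoder softmax over the output vocabulary
    attention: (batch, src_len)  attention over source positions
    src_ids:   (batch, src_len)  vocabulary ids of source tokens (LongTensor)
    p_gen:     (batch, 1)        probability of generating vs. copying
    """
    # Copy mass: scatter attention weights onto the ids of the source words,
    # letting rare but important source tokens be emitted verbatim.
    copy_dist = torch.zeros_like(p_vocab).scatter_add(
        1, src_ids, (1.0 - p_gen) * attention)
    return p_gen * p_vocab + copy_dist

def coverage_loss(attention, coverage):
    # coverage = running sum of past attention; penalising its overlap with
    # the current attention discourages re-attending (and re-copying) the
    # same source words, mitigating word repetition while decoding.
    return torch.sum(torch.min(attention, coverage), dim=1).mean()
```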

SLIDE 12

REWARD FUNCTIONS

  • General rewards
  • BLEU, GLEU, METEOR, ROUGE-L
  • DAS: decomposable-attention-based similarity, which accounts for the variability in how a question can be phrased
  • QG-specific rewards
  • QSS: degree of overlap between generated question & source sentence
  • ANSS: degree of overlap between predicted answer & gold answer
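The slide defines QSS and ANSS only as degrees of overlap; below is a minimal sketch of one plausible instantiation, word-overlap F1. The exact formula is an assumption for illustration, not a quote from the paper.

```python
from collections import Counter

def overlap_f1(predicted, reference):
    """Degree of lexical overlap between two token lists, as an F1 score."""
    common = Counter(predicted) & Counter(reference)  # min counts per token
    matched = sum(common.values())
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)
    recall = matched / len(reference)
    return 2 * precision * recall / (precision + recall)

# QSS: overlap between the generated question and the source sentence.
qss = overlap_f1("what year was new york founded ?".split(),
                 "new york city traces its roots to its 1624 founding".split())
# ANSS: overlap between the predicted answer and the gold answer.
anss = overlap_f1("new amsterdam".split(), "new amsterdam".split())
```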
SLIDE 13

OUTLINE

  • Introduction & motivation
  • The generator-evaluator framework
  • Evaluation
  • Conclusion
SLIDE 14

EVALUATION: DATASET & BASELINES

  • Dataset: SQuAD
  • Train: 70,484
  • Valid: 10,570
  • Test: 11,877
  • Baselines
  • Learning to ask (L2A): vanilla Seq2Seq model (ACL’17)
  • NQGLC: Seq2Seq + ground-truth answer encoding (NAACL’18)
  • AutoQG: Seq2Seq + answer prediction (PAKDD’18)
  • SUM: RL-based summarisation (ICLR’18)

SLIDE 15

AUTOMATIC EVALUATION

MODEL               BLEU-1  BLEU-2  BLEU-3  BLEU-4   METEOR   ROUGE-L
L2A                  43.21   24.77   15.93   10.60    16.39    38.98
AutoQG               44.68   26.96   18.18   12.68    17.86    40.59
NQGLC                -       -       -      (13.98)  (18.77)  (42.72)
SUM_BLEU             11.20    3.50    1.21    0.45     6.68    15.25
SUM_ROUGE            11.94    3.95    1.65    0.08    26.61    16.17
GE_BLEU              46.84   29.38   20.33   14.47    19.08    41.07
GE_BLEU+QSS+ANSS     46.59   29.68   20.79   15.04    19.32    41.73
GE_DAS               44.64   28.25   19.63   14.07    18.12    42.07
GE_DAS+QSS+ANSS      46.07   29.78   21.43   16.22    19.44    42.84
GE_GLEU              45.20   29.22   20.79   15.26    18.98    43.47
GE_GLEU+QSS+ANSS     47.04   30.03   21.15   15.92    19.05    43.55
GE_ROUGE             47.01   30.67   21.95   16.17    19.85    43.90
GE_ROUGE+QSS+ANSS    48.13   31.15   22.01   16.48    20.21    44.11

SLIDE 16

HUMAN EVALUATION

MODEL                SYNTAX          SEMANTICS       RELEVANCE
                     SCORE   KAPPA   SCORE   KAPPA   SCORE   KAPPA
L2A                  39.2    0.49    39      0.49    29      0.40
AutoQG               51.5    0.49    48      0.78    48      0.50
GE_BLEU              47.5    0.52    49      0.45    41.5    0.44
GE_BLEU+QSS+ANSS     82      0.63    75.3    0.68    78.33   0.46
GE_DAS               68      0.40    63      0.33    41      0.40
GE_DAS+QSS+ANSS      84      0.57    81.3    0.60    74      0.47
GE_GLEU              60.5    0.50    62      0.52    44      0.41
GE_GLEU+QSS+ANSS     78.3    0.68    74.6    0.71    72      0.40
GE_ROUGE             69.5    0.56    68      0.58    53      0.43
GE_ROUGE+QSS+ANSS    79.3    0.52    72      0.41    67      0.41

SLIDE 17

OUTLINE

  • Introduction & motivation
  • The generator-evaluator framework
  • Evaluation
  • Conclusion
SLIDE 18

CONCLUSION

  • A generator-evaluator framework for question generation from text
  • Takes into account both semantics & structure
  • Proposes novel reward functions
  • Evaluation shows state-of-the-art performance
SLIDE 19

ANY QUESTIONS?

THANK YOU!

SLIDE 20

REFERENCES

  • Xinya Du, Junru Shao, and Claire Cardie. Learning to ask: Neural question generation for reading comprehension. In ACL, volume 1, pages 1342–1352, 2017.
  • Vishwajeet Kumar, Kireeti Boorla, Yogesh Meena, Ganesh Ramakrishnan, and Yuan-Fang Li. Automating reading comprehension by generating question and answer pairs. In PAKDD, 2018.
  • Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. SQuAD: 100,000+ questions for machine comprehension of text. In EMNLP, pages 2383–2392. ACL, November 2016.
  • Linfeng Song, Zhiguo Wang, Wael Hamza, Yue Zhang, and Daniel Gildea. Leveraging context information for natural question generation. In NAACL, pages 569–574, 2018.
  • Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. In ICLR, 2018.
SLIDE 21

SOME MORE EXAMPLES


Text: “critics such as economist paul krugman and u.s. treasury secretary timothy geithner have argued that the regulatory framework did not keep pace with financial innovation, such as the increasing importance of the shadow banking system, derivatives and off-balance sheet financing.”

MODEL     QUESTION
AutoQG    who argued that the regulatory framework was not keep to take pace with financial innovation ?
GE_BLEU   what was the name of the increasing importance of the shadow banking system ?
GE_DAS    what was the main focus of the problem with the shadow banking system ?
GE_GLEU   what was not keep pace with financial innovation ?
GE_ROUGE  what did paul krugman and u.s. treasury secretary disagree with ?

SLIDE 22

– https://en.wikipedia.org/wiki/Warsaw

“Legislative power in Warsaw is vested in a unicameral Warsaw City Council (Rada Miasta), which comprises 60 members. Council members are elected directly every four years. Like most legislative bodies, the City Council divides itself into committees which have the oversight of various functions of the city government.”

1. How many members are in the Warsaw City Council?
2. How often are the Rada Miasta elected?
3. The City Council divides itself into what?