SLIDE 1

A FRAMEWORK FOR AUTOMATIC QUESTION GENERATION FROM TEXT USING DEEP REINFORCEMENT LEARNING

SCAI 2019, 12/08/2019. Vishwajeet Kumar (1,2,3), Ganesh Ramakrishnan (2), Yuan-Fang Li (3)

(1) IITB-Monash Research Academy, (2) IIT Bombay, (3) Monash University

SLIDE 2

OUTLINE

  • Introduction & motivation
  • The generator-evaluator framework
  • Evaluation
  • Conclusion
SLIDE 3

WHEN / WHERE / WHY DO WE ASK QUESTIONS?

  • Organisation: policies, product & service documentation, patents, meeting minutes, FAQ, …

  • Education: reading comprehension assessment
  • Healthcare: clinical notes
  • Technology: chatbots, customer support, …
SLIDE 4

THE QUESTION GENERATION TASK

  • Goal: automatically generating questions from sentences or paragraphs


  • Challenges
  • Questions must be well-formed
  • Questions must be relevant
  • Questions must be answerable
SLIDE 5

MOTIVATION

  • QG: a (relatively) recent task, cast as a Seq2Seq problem
  • RNN-based models with attention perform well for short sentences
  • However, for longer text they perform poorly
  • Cross-entropy loss may make the training process brittle: the exposure bias problem

SLIDE 6

EXAMPLE GENERATED QUESTIONS

MODEL                            QUESTION
Seq2Seq with cross-entropy loss  what year was new york named ?
Copy-aware seq2seq               what year was new new amsterdam named ?
GE (Seq2Seq with BLEU)           what year was new york founded ?

Example text: “new york city traces its roots to its 1624 founding as a trading post by colonists of the dutch republic and was named new amsterdam in 1626 .”

SLIDE 7

TO BE MORE SPECIFIC

  • QG performance is evaluated using discrete metrics like BLEU, ROUGE, etc., not cross-entropy loss
  • Need for a mechanism to deal with relatively rare and important words
  • Need to handle the word repetition problem while decoding
SLIDE 8

OUTLINE

  • Introduction & motivation
  • The generator-evaluator framework
  • Evaluation
  • Conclusion
SLIDE 9

A GENERATOR-EVALUATOR FRAMEWORK FOR QG

  • Generator (semantics)
  • Identifies pivotal answers (Pointer Networks)
  • Recognises contextually important keywords (Copy)
  • Avoids redundancy (Coverage)
  • Evaluator (structure)
  • Optimises conformity towards ground-truth questions
  • Reinforcement learning with performance metrics as rewards (sketched below)

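To make the division of labour concrete, a minimal sketch of the training loop this slide describes follows. All names here (`generator`, `evaluator` and their methods) are hypothetical stand-ins for illustration, not the authors' code.

```python
# High-level sketch of the generator-evaluator loop (hypothetical API):
# the generator handles semantics (pivotal answers, copying, coverage),
# while the evaluator scores structural conformity to the gold question
# and feeds that score back as a reward.
def train_generator_evaluator(generator, evaluator, corpus, epochs=10):
    for _ in range(epochs):
        for sentence, gold_question in corpus:
            answer = generator.pick_pivotal_answer(sentence)    # pointer network
            question = generator.decode(sentence, answer)       # copy + coverage
            reward = evaluator.score(question, gold_question)   # e.g. BLEU / ROUGE-L
            generator.reinforce(reward)                         # policy-gradient update
```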
SLIDE 10

REINFORCEMENT LEARNING FOR QG

[Figure: the generator emits question words from the context vector; the evaluator scores them with BLEU, ROUGE-L, METEOR, etc., and the resulting reward drives the generator's parameter update.]
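As an illustration of the loop in the figure, below is a minimal REINFORCE-style update in PyTorch, assuming a `generator` that can sample a question and return per-token log-probabilities (a hypothetical interface, not the paper's implementation). The reward here is sentence-level BLEU via NLTK; the paper also uses ROUGE-L, METEOR and other rewards.

```python
import torch
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smooth = SmoothingFunction().method1

def rl_step(generator, optimizer, sentence, gold_question):
    """One policy-gradient step; `gold_question` is a list of tokens."""
    # Sample a question from the current policy, keeping per-token log-probs.
    tokens, log_probs = generator.sample(sentence)  # hypothetical interface
    # The evaluator's reward: a sentence-level metric over the sampled tokens.
    reward = sentence_bleu([gold_question], tokens, smoothing_function=smooth)
    # REINFORCE: scale the sequence log-likelihood by the reward and ascend.
    loss = -reward * torch.stack(log_probs).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```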

SLIDE 11

ARCHITECTURE

[Figure: the generator combines a Bi-LSTM sentence encoder, a Pointer Network answer encoder, and an LSTM question decoder; the attention distribution, context vector, word coverage vector and copy probability (Pcg) are merged with the vocabulary distribution into the final output distribution. The evaluator computes the reward by comparing sampled questions (Y_samples) against the gold question (Y_gold) from the training data.]
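The "final distribution" box combines generating and copying; below is a sketch of that step in the style of a standard pointer-generator network with coverage, which the figure's components correspond to. Tensor names and shapes are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def final_distribution(p_vocab, attention, src_ids, p_gen):
    """Mix the vocabulary distribution with the copy distribution.

    p_vocab:   (batch, vocab)    decoder softmax over the output vocabulary
    attention: (batch, src_len)  attention over source positions
    src_ids:   (batch, src_len)  vocabulary ids of source tokens (LongTensor)
    p_gen:     (batch, 1)        probability of generating vs. copying
    """
    # Copy mass: scatter attention weights onto the ids of the source words,
    # letting rare but important source tokens be emitted verbatim.
    copy_dist = torch.zeros_like(p_vocab).scatter_add(
        1, src_ids, (1.0 - p_gen) * attention)
    return p_gen * p_vocab + copy_dist

def coverage_loss(attention, coverage):
    # coverage = running sum of past attention; penalising its overlap with
    # the current attention discourages re-attending (and re-copying) the
    # same source words, mitigating word repetition while decoding.
    return torch.sum(torch.min(attention, coverage), dim=1).mean()
```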

SLIDE 12

REWARD FUNCTIONS

  • General rewards
  • BLEU, GLEU, METEOR, ROUGE-L
  • DAS: decomposable-attention-based similarity, which accounts for the variability in how a question can be phrased
  • QG-specific rewards
  • QSS: degree of overlap between generated question & source sentence
  • ANSS: degree of overlap between predicted answer & gold answer
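The slide defines QSS and ANSS only as degrees of overlap; below is a minimal sketch of one plausible instantiation, word-overlap F1. The exact formula is an assumption for illustration, not a quote from the paper.

```python
from collections import Counter

def overlap_f1(predicted, reference):
    """Degree of lexical overlap between two token lists, as an F1 score."""
    common = Counter(predicted) & Counter(reference)  # min counts per token
    matched = sum(common.values())
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)
    recall = matched / len(reference)
    return 2 * precision * recall / (precision + recall)

# QSS: overlap between the generated question and the source sentence.
qss = overlap_f1("what year was new york founded ?".split(),
                 "new york city traces its roots to its 1624 founding".split())
# ANSS: overlap between the predicted answer and the gold answer.
anss = overlap_f1("new amsterdam".split(), "new amsterdam".split())
```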
SLIDE 13

OUTLINE

  • Introduction & motivation
  • The generator-evaluator framework
  • Evaluation
  • Conclusion
SLIDE 14

EVALUATION: DATASET & BASELINES

  • Dataset: SQuAD
  • Train: 70,484
  • Valid: 10,570
  • Test: 11,877
  • Baselines
  • Learning to ask (L2A): vanilla Seq2Seq model (ACL’17)
  • NQGLC: Seq2Seq + ground-truth answer encoding (NAACL’18)
  • AutoQG: Seq2Seq + answer prediction (PAKDD’18)
  • SUM: RL-based summarisation (ICLR’18)

SLIDE 15

AUTOMATIC EVALUATION

MODEL               BLEU-1  BLEU-2  BLEU-3  BLEU-4   METEOR   ROUGE-L
L2A                  43.21   24.77   15.93   10.60    16.39    38.98
AutoQG               44.68   26.96   18.18   12.68    17.86    40.59
NQGLC                -       -       -      (13.98)  (18.77)  (42.72)
SUM_BLEU             11.20    3.50    1.21    0.45     6.68    15.25
SUM_ROUGE            11.94    3.95    1.65    0.08    26.61    16.17
GE_BLEU              46.84   29.38   20.33   14.47    19.08    41.07
GE_BLEU+QSS+ANSS     46.59   29.68   20.79   15.04    19.32    41.73
GE_DAS               44.64   28.25   19.63   14.07    18.12    42.07
GE_DAS+QSS+ANSS      46.07   29.78   21.43   16.22    19.44    42.84
GE_GLEU              45.20   29.22   20.79   15.26    18.98    43.47
GE_GLEU+QSS+ANSS     47.04   30.03   21.15   15.92    19.05    43.55
GE_ROUGE             47.01   30.67   21.95   16.17    19.85    43.90
GE_ROUGE+QSS+ANSS    48.13   31.15   22.01   16.48    20.21    44.11

SLIDE 16

HUMAN EVALUATION

MODEL                SYNTAX          SEMANTICS       RELEVANCE
                     SCORE   KAPPA   SCORE   KAPPA   SCORE   KAPPA
L2A                  39.2    0.49    39      0.49    29      0.40
AutoQG               51.5    0.49    48      0.78    48      0.50
GE_BLEU              47.5    0.52    49      0.45    41.5    0.44
GE_BLEU+QSS+ANSS     82      0.63    75.3    0.68    78.33   0.46
GE_DAS               68      0.40    63      0.33    41      0.40
GE_DAS+QSS+ANSS      84      0.57    81.3    0.60    74      0.47
GE_GLEU              60.5    0.50    62      0.52    44      0.41
GE_GLEU+QSS+ANSS     78.3    0.68    74.6    0.71    72      0.40
GE_ROUGE             69.5    0.56    68      0.58    53      0.43
GE_ROUGE+QSS+ANSS    79.3    0.52    72      0.41    67      0.41

SLIDE 17

OUTLINE

  • Introduction & motivation
  • The generator-evaluator framework
  • Evaluation
  • Conclusion
SLIDE 18

CONCLUSION

  • A generator-evaluator framework for question generation from text
  • Takes into account both semantics & structure
  • Proposes novel reward functions
  • Evaluation shows state-of-the-art performance
SLIDE 19

ANY QUESTIONS?

THANK YOU!

SLIDE 20

REFERENCES

  • Xinya Du, Junru Shao, and Claire Cardie. Learning to ask: Neural question generation for reading comprehension. In ACL, volume 1, pages 1342–1352, 2017.
  • Vishwajeet Kumar, Kireeti Boorla, Yogesh Meena, Ganesh Ramakrishnan, and Yuan-Fang Li. Automating reading comprehension by generating question and answer pairs. In PAKDD, 2018.
  • Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. SQuAD: 100,000+ questions for machine comprehension of text. In EMNLP, pages 2383–2392. ACL, November 2016.
  • Linfeng Song, Zhiguo Wang, Wael Hamza, Yue Zhang, and Daniel Gildea. Leveraging context information for natural question generation. In NAACL, pages 569–574, 2018.
  • Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. In ICLR, 2018.
SLIDE 21

SOME MORE EXAMPLES


Text: “critics such as economist paul krugman and u.s. treasury secretary timothy geithner have argued that the regulatory framework did not keep pace with financial innovation, such as the increasing importance of the shadow banking system, derivatives and off-balance sheet financing.”

MODEL     QUESTION
AutoQG    who argued that the regulatory framework was not keep to take pace with financial innovation ?
GE_BLEU   what was the name of the increasing importance of the shadow banking system ?
GE_DAS    what was the main focus of the problem with the shadow banking system ?
GE_GLEU   what was not keep pace with financial innovation ?
GE_ROUGE  what did paul krugman and u.s. treasury secretary disagree with ?

SLIDE 22

– https://en.wikipedia.org/wiki/Warsaw

“Legislative power in Warsaw is vested in a unicameral Warsaw City Council (Rada Miasta), which comprises 60 members. Council members are elected directly every four years. Like most legislative bodies, the City Council divides itself into committees which have the oversight of various functions of the city government.”

1. How many members are in the Warsaw City Council?
2. How often are the Rada Miasta elected?
3. The City Council divides itself into what?