Automating reading comprehension by generating question and answer - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Automating reading comprehension by generating question and answer pairs

Vishwajeet Kumar¹, Kireeti Boorla², Ganesh Ramakrishnan², Yuan-Fang Li³

¹IITB-Monash Research Academy, India  ²IIT Bombay, India  ³Monash University, Australia

slide-2
SLIDE 2

Automatic question and answer generation

A system to automatically generate questions and answers from text.

Sample text: "Sachin Tendulkar received the Arjuna Award in 1994 for his outstanding sporting achievement, the Rajiv Gandhi Khel Ratna award in 1997..."

Questions:

  • 1. When did Sachin Tendulkar receive the Arjuna Award?
    Ans: 1994
  • 2. Which award did Sachin Tendulkar receive in 1994 for his outstanding sporting achievement?
    Ans: Arjuna Award
  • 3. When did Sachin Tendulkar receive the Rajiv Gandhi Khel Ratna Award?
    Ans: 1997

slide-3
SLIDE 3

Motivation

Sachin Ramesh Tendulkar is a former Indian cricketer and captain, widely regarded as one of the greatest batsmen of all time. He took up cricket at the age of eleven, made his Test debut on 15 November 1989 against Pakistan in Karachi at the age of sixteen, and went on to represent Mumbai domestically and India internationally for close to twenty-four years...

How would someone tell that you have read this text?


slide-6
SLIDE 6

Why is this problem challenging?

  • The question must be relevant to the text.
  • The answer must be unambiguous.
  • The question must be challenging and well formed.


slide-9
SLIDE 9

Existing Work

Template Based [Mazidi and Nielsen, 2014, Mostow and Chen, 2009]

  • Uses crowd-sourced templates such as "What is X?"

Syntax Based [Heilman, 2011]

  • Rules for declarative-to-interrogative sentence transformation.
  • Only syntax is considered, not semantics.
  • Relies heavily on NLP tools.

Vanilla Seq2Seq for Question Generation [Du et al., 2017]

  • First approach to question generation from text using neural networks.
  • Uses a vanilla Seq2Seq model for question generation.

slide-10
SLIDE 10

Some other related work

Generate a question given a fact/triple from a KB/ontology. Example: <Fires Creek, contained by, Nantahala National Forest> ⇒ Which forest is Fires Creek in?

Template based [Seyler et al., 2015]

  • Assumption: facts are present in a domain-dependent knowledge base.
  • Generates questions from facts using templates.

Factoid question generation using RNNs [Serban et al., 2016]

  • Proposes generating factoid questions from Freebase triples (subject, relation, object).
  • Embeds the fact using KG embedding techniques such as TransE.


slide-13
SLIDE 13

Limitations of previous approaches

  • Mostly rule-based or template-based.
  • Do not generate the answer corresponding to the question.
  • Use an overly simple set of linguistic features.


slide-16
SLIDE 16

Our contribution

  • A pointer-network-based method for automatic answer selection.
  • A sequence-to-sequence model with attention, augmented with a rich set of linguistic features and answer encoding.


slide-21
SLIDE 21

Automatic question and answer generation using seq2seq model with pointer network

Figure 1: High-level architecture of our question generation model. Pipeline: Sentence Encoder (producing a thought vector for the sentence) → Answer Selection (Named Entity Selection, Pointer Network) → Answer and Features Encoding → Question Decoder. Example: from the sentence "Donald Trump is the current President of the United States of America.", the pivotal answer "Donald Trump" is selected and the question "Who is the current president of United States of America?" is generated.


slide-25
SLIDE 25

Named Entity Selection

  • Sentence S = (w_1, w_2, ..., w_n) is encoded using a 2-layer LSTM network into hidden states H = (h^s_1, h^s_2, ..., h^s_n).
  • For each named entity NE = (n_i, ..., n_j), create a representation R = <h^ne_mean>.
  • R is fed to an MLP along with <h^s_n; h^s_mean> to get the probability of the named entity being the pivotal answer, i.e. the most relevant answer to ask a question about:

    P(NE_i | S) = softmax(R_i · W + B)

where h^s_n is the final state, h^s_mean is the mean of all activations, and h^ne_mean is the mean of the activations in the NE span (h^s_i, ..., h^s_j).
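The scoring step above can be sketched in numpy. This is a minimal illustration, not the paper's implementation: the MLP is collapsed to a single linear layer (W, B), and the concatenation order <h^ne_mean; h^s_n; h^s_mean> is an assumption.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def select_pivotal_answer(H, ne_spans, W, B):
    """Score each named-entity span as the pivotal answer.

    H        : (n, d) encoder hidden states h^s_1..h^s_n
    ne_spans : list of (i, j) index ranges, one per named entity
    W, B     : linear scoring parameters (stand-in for the MLP)
    """
    h_final = H[-1]          # h^s_n, the final state
    h_mean = H.mean(axis=0)  # h^s_mean, mean of all activations
    scores = []
    for i, j in ne_spans:
        h_ne_mean = H[i:j + 1].mean(axis=0)  # mean over the NE span
        R = np.concatenate([h_ne_mean, h_final, h_mean])
        scores.append(R @ W + B)
    # softmax over the candidate named entities -> P(NE_i | S)
    return softmax(np.array(scores))

rng = np.random.default_rng(0)
d = 8
H = rng.normal(size=(10, d))
W = rng.normal(size=3 * d)
B = 0.0
probs = select_pivotal_answer(H, [(0, 1), (4, 6)], W, B)
print(probs)  # a distribution over the two candidate NEs
```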


slide-27
SLIDE 27

Answer selection using Pointer networks

  • Given encoder hidden states H = (h_1, h_2, ..., h_n), the probability of generating the output sequence O = (o_1, o_2, ..., o_m) is:

    P(O|S) = ∏ P(o_i | o_1, o_2, ..., o_{i−1}; H)

  • The probability distribution is modeled as:

    u_i = v^T tanh(W_e H + W_d D_i)   (1)
    P(O|S) = softmax(u_i)   (2)
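Equations (1)-(2) amount to one attention-style scoring pass per decoding step, producing a distribution over input positions. A minimal numpy sketch, with all weight shapes chosen purely for illustration:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def pointer_step(H, d_i, We, Wd, v):
    """One decoding step of a pointer network.

    H   : (n, hidden) encoder states h_1..h_n
    d_i : (hidden,) decoder state D_i at step i
    Returns a distribution over the n input positions, i.e. which
    input token the pointer selects next (eqs. 1-2).
    """
    u = np.tanh(H @ We.T + d_i @ Wd.T) @ v  # u_i: one score per input position
    return softmax(u)

rng = np.random.default_rng(1)
n, h, a = 6, 4, 5             # input length, hidden size, attention size
H = rng.normal(size=(n, h))
d_i = rng.normal(size=h)
We = rng.normal(size=(a, h))  # projects encoder states
Wd = rng.normal(size=(a, h))  # projects the decoder state
v = rng.normal(size=a)
p = pointer_step(H, d_i, We, Wd, v)
print(p)  # probability of pointing at each of the 6 input tokens
```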

slide-28
SLIDE 28

Sentence: "Donald Trump is the President."  Question: "Who is Donald Trump?"

Each token is tagged with its POS tag, named entity tag, and dependency label, e.g. Donald Trump|NNP|PERSON|nsubj, is|VBZ|O|cop, the|DT|O|det, President|NNP|O|root, ?|.|O|punct.

Figure 2: Question generation

slide-29
SLIDE 29

Features and Answer Encoding

  • POS tag, named entity tag, and dependency label are used as linguistic features.
  • A rich set of linguistic features helps the model learn better generalized transformation rules.
  • The dependency label is the label of the edge connecting each word to its parent in the dependency tree.
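One common way to realize such feature encoding is to concatenate the word vector with one-hot tag features; a sketch under that assumption (the tag inventories are abbreviated and all names are illustrative, not the paper's):

```python
import numpy as np

# Illustrative tag inventories; real systems use the full Penn Treebank,
# NER, and dependency-label inventories.
POS_TAGS = ["NNP", "VBZ", "DT", "WP", "."]
NE_TAGS = ["PERSON", "O"]
DEP_LABELS = ["nsubj", "cop", "det", "root", "punct"]

def one_hot(value, inventory):
    vec = np.zeros(len(inventory))
    vec[inventory.index(value)] = 1.0
    return vec

def encode_token(word_emb, pos, ne, dep):
    """Feature-encoded word embedding w_t: the word vector concatenated
    with one-hot POS, named-entity, and dependency-label features."""
    return np.concatenate([
        word_emb,
        one_hot(pos, POS_TAGS),
        one_hot(ne, NE_TAGS),
        one_hot(dep, DEP_LABELS),
    ])

emb = np.zeros(50)  # stand-in for a pretrained word embedding
w_t = encode_token(emb, "NNP", "PERSON", "nsubj")
print(w_t.shape)  # (62,) = 50 + 5 + 2 + 5
```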



slide-34
SLIDE 34

Sentence Encoder

  • BiLSTM to capture both the left and the right context:

    →ĥ_t = f(→W w_t + →V →ĥ_{t−1} + →b),   ←ĥ_t = f(←W w_t + ←V ←ĥ_{t+1} + ←b)   (3)

  • ĥ_t = g(U h_t + c) = g(U [→ĥ_t, ←ĥ_t] + c)   (4)

where ĥ_t is the thought vector, W, V, and U ∈ R^(n×m) are trainable parameters, and w_t ∈ R^(p×q×r) is the feature-encoded word embedding at time step t.
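Equations (3)-(4) can be sketched with plain tanh RNN cells standing in for the LSTM cells used in the paper (a deliberate simplification; all shapes are illustrative):

```python
import numpy as np

def birnn_encode(X, Wf, Vf, bf, Wb, Vb, bb, U, c):
    """Bidirectional encoder per eqs. (3)-(4), with simple tanh RNN
    cells in place of LSTM cells.

    X : (T, d) feature-encoded word embeddings w_1..w_T
    Returns (T, out) thought vectors, one per time step.
    """
    T = X.shape[0]
    h = Wf.shape[0]
    fwd = np.zeros((T, h))
    bwd = np.zeros((T, h))
    prev = np.zeros(h)
    for t in range(T):                 # left-to-right pass, eq. (3)
        prev = np.tanh(Wf @ X[t] + Vf @ prev + bf)
        fwd[t] = prev
    prev = np.zeros(h)
    for t in reversed(range(T)):       # right-to-left pass, eq. (3)
        prev = np.tanh(Wb @ X[t] + Vb @ prev + bb)
        bwd[t] = prev
    # eq. (4): combine both directions into the thought vector
    concat = np.concatenate([fwd, bwd], axis=1)
    return np.tanh(concat @ U.T + c)

rng = np.random.default_rng(2)
T, d, h, out = 5, 3, 4, 6
params = [rng.normal(size=s) for s in
          [(h, d), (h, h), (h,), (h, d), (h, h), (h,), (out, 2 * h), (out,)]]
H = birnn_encode(rng.normal(size=(T, d)), *params)
print(H.shape)  # (5, 6): one thought vector per token
```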

slide-35
SLIDE 35

Question Decoder

  • 2-layer LSTM network.
  • Decoder:

    P(Q|S; θ) = softmax(W_s tanh(W_r [h_t, c_t] + b))   (5)

  • Beam search with beam_size = 3 to decode the question.
  • The decoder is integrated with a suitably modified attention mechanism to handle the rare-word problem.

where W_s and W_r are weight vectors and tanh is the activation function.
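The beam-search decoding can be sketched as follows. The toy `step_logits` function stands in for the real attention-augmented LSTM decoder and is purely illustrative: it deterministically prefers token `len(prefix) + 1` and emits end-of-sequence (token 0) after three tokens.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def beam_search(step_logits, beam_size=3, max_len=4, eos=0):
    """Toy beam-search decoder. `step_logits(prefix)` returns a logit
    vector over the vocabulary given the tokens decoded so far.
    Keeps the `beam_size` highest log-probability prefixes each step."""
    beams = [([], 0.0)]  # (token prefix, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix and prefix[-1] == eos:
                candidates.append((prefix, score))  # finished hypothesis
                continue
            logp = np.log(softmax(step_logits(prefix)))
            for tok in np.argsort(logp)[-beam_size:]:
                candidates.append((prefix + [int(tok)], score + logp[tok]))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0][0]

def step_logits(prefix):
    logits = np.zeros(5)
    logits[0 if len(prefix) >= 3 else len(prefix) + 1] = 5.0
    return logits

decoded = beam_search(step_logits)
print(decoded)  # [1, 2, 3, 0]
```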



slide-44
SLIDE 44

Attention Mechanism

Attention distribution:

  e^t_i = v^T tanh(W_eh h_i + W_sh s_t + b_att)   (6)
  a^t = softmax(e^t)   (7)

Context vector:

  c*_t = Σ_i a^t_i h_i   (8)

Probability distribution over the vocabulary:

  P_vocab = softmax(W_v [s_t, c*_t] + b_v)   (9)

Overall loss:

  LOSS = (1/T) Σ^T_{t=0} −log P_vocab(word_t)   (10)

W_eh, W_sh, and b_att are learnable model parameters; W_v and b_v are trainable parameters.
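Equations (6)-(10) can be sketched in numpy for a single decoding step; the shapes and random parameters are illustrative only:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_step(H, s_t, Weh, Wsh, b_att, v, Wv, bv):
    """Eqs. (6)-(9): attend over encoder states H given decoder state
    s_t, then produce a distribution over the output vocabulary."""
    e = np.tanh(H @ Weh.T + s_t @ Wsh.T + b_att) @ v    # (6) scores e^t_i
    a = softmax(e)                                      # (7) attention weights
    c = a @ H                                           # (8) context vector c*_t
    return softmax(Wv @ np.concatenate([s_t, c]) + bv)  # (9) P_vocab

def nll_loss(step_probs, target_ids):
    """Eq. (10): mean negative log-likelihood of the target words."""
    return -np.mean([np.log(p[w]) for p, w in zip(step_probs, target_ids)])

rng = np.random.default_rng(3)
n, h, a_dim, vocab = 7, 4, 5, 9
H = rng.normal(size=(n, h))
s_t = rng.normal(size=h)
Weh = rng.normal(size=(a_dim, h))
Wsh = rng.normal(size=(a_dim, h))
b_att = rng.normal(size=a_dim)
v = rng.normal(size=a_dim)
Wv = rng.normal(size=(vocab, 2 * h))
bv = rng.normal(size=vocab)
p_vocab = attention_step(H, s_t, Weh, Wsh, b_att, v, Wv, bv)
loss = nll_loss([p_vocab], [2])
print(p_vocab.shape, loss > 0)
```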

slide-45
SLIDE 45

Human evaluation results

System               | p1 (%) | p2 (%) | p3 (%)
---------------------|--------|--------|-------
QG [Du et al., 2017] | 51.6   | 48     | 52.3
QG+F                 | 59.6   | 57     | 64.6
QG+F+NE              | 57     | 52.6   | 67
QG+GAE               | 44     | 35.3   | 50.6
QG+F+AES             | 51     | 47.3   | 55.3
QG+F+AEB             | 61     | 60.6   | 71.3
QG+F+GAE             | 63     | 61     | 67

Table 1: Human evaluation results on Ste. Parameters: p1: percentage of syntactically correct questions; p2: percentage of semantically correct questions; p3: percentage of relevant questions.

Abbreviations: F: features; NE: named entity selection; AES: sequence pointer network; AEB: boundary pointer network; GAE: ground truth answer encoding.

blue ⇒ different alternatives for encoding the pivotal answer. green ⇒ set of linguistic features that can be optionally added to any model.

slide-46
SLIDE 46

Automatic evaluation results

Model                | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR | ROUGE-L
---------------------|--------|--------|--------|--------|--------|--------
QG [Du et al., 2017] | 39.97  | 22.39  | 14.39  |  9.64  | 14.34  | 37.04
QG+F                 | 41.89  | 24.37  | 15.92  | 10.74  | 15.854 | 37.762
QG+F+NE              | 41.54  | 23.77  | 15.32  | 10.24  | 15.906 | 36.465
QG+GAE               | 43.35  | 24.06  | 14.85  |  9.40  | 15.65  | 37.84
QG+F+AES             | 43.54  | 25.69  | 17.07  | 11.83  | 16.71  | 38.22
QG+F+AEB             | 42.98  | 25.65  | 17.19  | 12.07  | 16.72  | 38.50
QG+F+GAE             | 46.32  | 28.81  | 19.67  | 13.85  | 18.51  | 41.75

blue ⇒ different alternatives for encoding the pivotal answer. green ⇒ set of linguistic features that can be optionally added to any model.

slide-47
SLIDE 47

Some sample questions generated


slide-48
SLIDE 48

Conclusion

  • We introduced a novel two-stage process to generate question-answer pairs from text.
  • We proposed an automatic answer selection technique using pointer networks.
  • We incorporated an attention mechanism into the decoder to handle the rare-word problem.

slide-49
SLIDE 49

Questions?


slide-50
SLIDE 50

References I

Du, X., Shao, J., and Cardie, C. (2017). Learning to ask: Neural question generation for reading comprehension. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1342–1352.

Heilman, M. (2011). Automatic factual question generation from text. PhD thesis, Carnegie Mellon University.

Mazidi, K. and Nielsen, R. D. (2014). Linguistic considerations in automatic question generation. In ACL (2), pages 321–326.

slide-51
SLIDE 51

References II

Mostow, J. and Chen, W. (2009). Generating instruction automatically for the reading strategy of self-questioning. In AIED, pages 465–472.

Serban, I. V., García-Durán, A., Gulcehre, C., Ahn, S., Chandar, S., Courville, A., and Bengio, Y. (2016). Generating factoid questions with recurrent neural networks: The 30M factoid question-answer corpus. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 588–598.

Seyler, D., Berberich, K., and Weikum, G. (2015). Question generation from knowledge graphs. PhD thesis, Universität des Saarlandes, Saarbrücken.