
Simple and Effective Multi-Paragraph Reading Comprehension - PowerPoint PPT Presentation



  1. Simple and Effective Multi-Paragraph Reading Comprehension Christopher Clark and Matt Gardner

  2. Neural Question Answering
     Question: “What color is the sky?”
     Passage: “Air is made mainly from molecules of nitrogen and oxygen. These molecules scatter the blue colors of sunlight more effectively than the green and red colors. Therefore, a clean sky appears blue.”

  3. Fast Progress on Paragraph Datasets [Chart: accuracy on SQuAD 1.1, June 2016 through June 2018]

  4. What Next?

  5. Open Question Answering
     Question: “What color is the sky?” → Document Retrieval → Relevant Text → Model → Answer Span: “Blue”

  6. Challenge: Scaling Models to Documents
     • Modern reading comprehension models have many layers and parameters
     • The trend is continuing in this direction, for example with the use of large language models
     • Efficiency drops as paragraph length increases, due to long RNN chains or transformer/self-attention modules
     • This limits the models to processing short paragraphs

  7. Two Possible Approaches
     • Pipelined systems: select a single paragraph from the input, and run the model on that paragraph
     • Confidence systems: run the model on many paragraphs from the input, and have it assign a confidence score to its result on each paragraph (e.g., 0.83, 0.68, 0.29)

  8. This Work
     • Improved pipeline method: improve several of the key design decisions that arise when training on document-level data
     • Improved confidence method: study ways to train models to produce correct confidence scores

  9. Pipeline Method: Paragraph Selection
     • Train a shallow linear model to select the best paragraphs
     • Features include TF-IDF, word occurrences, and the paragraph's position within the document
     • If there is just one document, TF-IDF alone is effective
     • Improves the chance of selecting an answer-containing paragraph from 83.0 to 85.1 on TriviaQA Web
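
A minimal sketch of the paragraph-ranking step, using scikit-learn's TfidfVectorizer as a stand-in for the slide's linear ranker (the actual selector also uses word-occurrence and position features, so the function name and the top_k value here are illustrative assumptions):

```python
# Hypothetical TF-IDF paragraph ranker; not the authors' exact linear model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_paragraphs(question, paragraphs, top_k=4):
    """Return the top_k paragraphs most similar to the question under TF-IDF."""
    vectorizer = TfidfVectorizer(stop_words="english")
    para_vecs = vectorizer.fit_transform(paragraphs)   # fit on the document's paragraphs
    question_vec = vectorizer.transform([question])    # embed the question in the same space
    scores = cosine_similarity(question_vec, para_vecs)[0]
    order = sorted(range(len(paragraphs)), key=lambda i: -scores[i])
    return [paragraphs[i] for i in order[:top_k]]
```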

  10. Pipeline Method: Noisy Supervision
     Document-level data can be expected to be distantly supervised:
     Question: Which British general was killed at Khartoum in 1885?
     Passage: In February 1884 Gordon returned to the Sudan to evacuate Egyptian forces. Rebels broke into the city, killing Gordon and the other defenders. The British public reacted to his death by acclaiming ‘Gordon of Khartoum’, a saint. However, historians have since suggested that Gordon defied orders and….
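
Because only the question/answer pair is known, answer spans have to be found by string matching, which is what makes the labels noisy. A small sketch of that distant-labelling step (the helper name and tokenized inputs are assumptions, not the paper's preprocessing code):

```python
def distant_answer_spans(paragraph_tokens, answer_aliases):
    """Mark every token span matching an answer alias as a (possibly noisy) label."""
    tokens = [t.lower() for t in paragraph_tokens]
    spans = []
    for alias in answer_aliases:
        alias_tokens = alias.lower().split()
        n = len(alias_tokens)
        for start in range(len(tokens) - n + 1):
            if tokens[start:start + n] == alias_tokens:
                spans.append((start, start + n - 1))  # inclusive token indices
    return spans

# In the Khartoum example, every mention of "Gordon" would be labelled,
# even though only one sentence actually answers the question.
```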

  11. Pipeline Method: Noisy Supervision
     • Need a training objective that can handle multiple (noisy) answer spans
     • Use the summed objective from Kadlec et al. (2016), which optimizes the log of the summed probability of all answer spans
     • Remains agnostic to how probability mass is distributed among the answer spans
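
A sketch of that summed objective in PyTorch, assuming the usual start/end logit factorization of span scores (tensor and function names are illustrative):

```python
import torch
import torch.nn.functional as F

def summed_objective(start_logits, end_logits, answer_spans):
    """Negative log of the summed probability of every marked answer span."""
    log_p_start = F.log_softmax(start_logits, dim=-1)
    log_p_end = F.log_softmax(end_logits, dim=-1)
    # log P(span) = log P(start) + log P(end); probabilities are summed over
    # all (noisy) spans, so the model is free to concentrate mass on any of them.
    span_log_probs = torch.stack(
        [log_p_start[s] + log_p_end[e] for s, e in answer_spans]
    )
    return -torch.logsumexp(span_log_probs, dim=0)
```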

  12. Pipeline Method: Model
     • Construct a fast, competitive model
     • Use some key ideas from prior work: bidirectional attention, self-attention, character embeddings, variational dropout
     • Also add learned tokens marking document and paragraph starts
     • < 5 hours to train for 26 epochs on SQuAD

  13. Confidence Methods
     • We can derive confidence scores from the logit scores given to each span by the model, i.e., the scores given before the softmax operator is applied
     • Without re-training, this can work poorly
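
As an illustration, such a naive confidence score can be read straight off the un-normalized span logits, roughly as below (the max_len limit and the exhaustive span loop are assumptions about a typical decoder, not the paper's code):

```python
import torch

def best_span_with_logit_confidence(start_logits, end_logits, max_len=17):
    """Return (start, end, score), where score is the raw pre-softmax span logit.

    Using this score to compare answers across paragraphs is exactly the
    setting where, without re-training, calibration tends to be poor.
    """
    best = (0, 0, float("-inf"))
    for s in range(start_logits.size(0)):
        for e in range(s, min(s + max_len, end_logits.size(0))):
            score = (start_logits[s] + end_logits[e]).item()
            if score > best[2]:
                best = (s, e, score)
    return best
```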

  14. Example from SQuAD
     Question: “When is the Members Debate held?”
     Model Extraction: “..majority of the Scottish electorate voted for it in a referendum to be held on 1 March 1979 that represented at least...”
     Correct Answer: “Immediately after Decision Time a “Members Debate” is held, which lasts for 45 minutes...”

  15. Learning Well-Calibrated Confidence Scores
     • Train the model on both answer-containing and non-answer-containing paragraphs, and use a modified objective function
     • Merge: concatenate sampled paragraphs together
     • No-Answer: process paragraphs independently, and allow the model to place probability mass on a “no-answer” output
     • Sigmoid: assign an independent probability to each span using the sigmoid operator
     • Shared-Norm: process paragraphs independently, but compute the span probability across spans in all paragraphs
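
A sketch of the Shared-Norm idea: each paragraph is encoded on its own, but the softmax denominator is shared across all paragraphs sampled for the question, so span scores remain comparable between paragraphs (offsets and variable names are illustrative, not the released code):

```python
import torch
import torch.nn.functional as F

def shared_norm_loss(start_logits_per_para, end_logits_per_para, gold_spans):
    """start/end logits: list of 1-D tensors, one per independently encoded paragraph.
    gold_spans: list of (paragraph_index, start, end) answer locations."""
    # Concatenating the logits makes the softmax normalize over every span
    # in every paragraph of the question.
    offsets = [0]
    for logits in start_logits_per_para:
        offsets.append(offsets[-1] + logits.size(0))
    log_p_start = F.log_softmax(torch.cat(start_logits_per_para), dim=-1)
    log_p_end = F.log_softmax(torch.cat(end_logits_per_para), dim=-1)
    span_log_probs = torch.stack(
        [log_p_start[offsets[p] + s] + log_p_end[offsets[p] + e]
         for p, s, e in gold_spans]
    )
    # Combined with the summed objective over the (noisy) gold spans.
    return -torch.logsumexp(span_log_probs, dim=0)
```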

  16. Results

  17. Datasets
     • TriviaQA: a dataset of trivia questions and related documents found by web search
     • Includes three settings: Web (a single document for each question), Wiki (multiple Wikipedia documents for each question), and Unfiltered (multiple documents for each question)
     • SQuAD: Turker-generated questions about Wikipedia articles
     • We use the questions paired with the entire article
     • Manual annotation shows most (90%) of the questions are answerable given the document they were generated from

  18. Pipeline Method: Results on TriviaQA Web
     Baseline implementation:
     • Uses BiDAF as the model
     • Selects paragraphs by truncating documents
     • Selects answer spans randomly
     • 72.14 EM / 81.05 F1 on SQuAD
     • 78.58 EM / 85.83 F1 with contextualized word embeddings (Peters et al., 2017)
     EM on TriviaQA Web (bar chart): TriviaQA Baseline 41.08, Our Baseline 50.21, +TF-IDF 53.41, +Sum 56.22, +TF-IDF +Sum 57.2, +Model +TF-IDF +Sum 61.1

  19. TriviaQA Leaderboard (Exact Match Scores)

     Model                                                Web-All   Web-Verified   Wiki-All   Wiki-Verified
     Best leaderboard entry ("mingyan")                     68.65          82.44      66.56           74.83
     Leaderboard entry ("dirkweissen")                      64.60          67.46      77.63           72.77
     Shared-Norm (Ours)                                     66.37          79.97      63.99           67.98
     Dynamic Integration of Background Knowledge
       (Weissenborn et al., 2017a)                          50.56          63.20      48.64           53.42
     Neural Cascades (Swayamdipta et al., 2017)             53.75          63.20      51.59           58.90
     MnemonicReader (Hu et al., 2017)                       46.65          56.96      46.94           54.45
     SMARNET (Chen et al., 2017)                            51.11          40.87      42.41           50.51

  20. Error Analysis
     • Manually annotated 200 errors made by the TriviaQA Web model
     • 40.5% are due to noise or lack of context in the relevant documents
     • Of the remaining….

  21. Error breakdown of the remaining cases (pie chart): Sentence Reading 35%, Answer indirectly stated 20%, Paragraph Reading 18%, Document Coreference 14%, Part of answer extracted 7%, Missing background knowledge 6%

  22. Building an Open Question Answering System
     • Use Bing web search and a Wikipedia entity linker to locate relevant documents
     • Extract the top 12 paragraphs, as found using the linear paragraph ranker
     • Use the model trained for TriviaQA Unfiltered to find the final answer
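
Putting the pieces together, the open QA loop described on this slide looks roughly like the following (search_web, rank_paragraphs, and qa_model are hypothetical stand-ins for the Bing/Wikipedia retrieval, the linear paragraph ranker, and the TriviaQA Unfiltered model):

```python
def answer_question(question, search_web, rank_paragraphs, qa_model, top_k=12):
    # 1. Retrieve candidate documents (web search + Wikipedia entity linking).
    documents = search_web(question)
    paragraphs = [p for doc in documents for p in doc.split("\n\n")]
    # 2. Keep the top-k paragraphs according to the paragraph ranker.
    candidates = rank_paragraphs(question, paragraphs, top_k=top_k)
    # 3. Run the reading-comprehension model on each candidate and keep the
    #    answer span with the highest confidence score.
    best_answer, best_score = None, float("-inf")
    for paragraph in candidates:
        answer, score = qa_model(question, paragraph)
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer
```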

  23. Demo

  24. Curated TREC Results (accuracy, bar chart): YodaQA with Bing (Baudis, 2015) 37.18, YodaQA (Baudis, 2015) 34.26, DrQA + DS (Chen et al., 2017a) 25.7, S-Norm (ours) 53.31

  25. Thank You
     Demo: https://documentqa.allenai.org/
     Github: https://github.com/allenai/document-qa
