Phrase-Indexed Question Answering : A New Challenge for Scalable - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension

Minjoon Seo, Tom Kwiatkowski, Ankur Parikh, Ali Farhadi, Hannaneh Hajishirzi

slide-2
SLIDE 2

Question Answering?

slide-3
SLIDE 3

Document (context): “Barack Obama (1961-present) was the 44th President of the United States.”

Question: When was Obama born?

Model → 1961

slide-4
SLIDE 4

Document (context): “Barack Obama (1961-present) was the 44th President of the United States.”

Question: When was Obama born?

Model → 1961

Extractive

slide-5
SLIDE 5

Extractive QA Datasets

  • SQuAD (Rajpurkar et al., 2016)
  • NewsQA (Trischler et al., 2016)
  • TriviaQA (Joshi et al., 2017)
  • QuAC (Choi et al., 2018)
  • CoQA (Reddy, Chen & Manning, 2018)
  • HotpotQA (Yang et al., 2018)
  • And more…
slide-6
SLIDE 6

Open-domain QA?

slide-7
SLIDE 7

Document (context): “Barack Obama (1961-present) was the 44th President of the United States.”

Question: When was Obama born?

Model → 1961

slide-8
SLIDE 8

Question: When was Obama born?

Model → 1961

slide-9
SLIDE 9

4 Million documents 3 Billion tokens

0.1s / doc * 4M docs = 400,000s ≈ 4.6 days!

slide-10
SLIDE 10

Pipelined: Information Retrieval (TF-IDF, BM25, LSA) → Model

When was Obama born? → 1961

Choi et al., 2017; Chen et al., 2017; Clark & Gardner, 2017
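The retrieval stage of such a pipeline can be sketched in a few lines. This is a minimal illustration of TF-IDF ranking with cosine similarity, not the retriever used in the cited systems. The three-document corpus is a toy (only the first sentence comes from the slides), the smoothed IDF formula is one common variant, and names such as `build_index` and `retrieve` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and treat every non-alphanumeric character as a separator.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def build_index(docs):
    # Compute a smoothed IDF per term and one sparse TF-IDF vector per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return idf, vectors


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(question, idf, vectors):
    # Terms that never occur in the corpus are dropped from the query vector.
    q_vec = {t: tf * idf[t] for t, tf in Counter(tokenize(question)).items() if t in idf}
    scores = [cosine(q_vec, v) for v in vectors]
    best = max(range(len(vectors)), key=scores.__getitem__)
    return best, scores


docs = [
    "Barack Obama (1961-present) was the 44th President of the United States.",
    "Paris is the capital of France.",              # toy document
    "The Pacific is the largest ocean on Earth.",   # toy document
]
idf, vectors = build_index(docs)
best_doc, scores = retrieve("When was Obama born?", idf, vectors)
print(best_doc, scores)
```

Only the top-ranked document is handed to the reader model, so whatever the retriever misses is invisible to the reader.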

slide-11
SLIDE 11

Information Retrieval (TF-IDF, BM25, LSA) → Model

When was Obama born? → 1961

Wrong document!

Choi et al., 2017; Chen et al., 2017; Clark & Gardner, 2017

slide-12
SLIDE 12

Error propagation: Information Retrieval (TF-IDF, BM25, LSA) → Model

When was Obama born? → 1911

Wrong document! Wrong answer!

Choi et al., 2017; Chen et al., 2017; Clark & Gardner, 2017

slide-13
SLIDE 13

Ideally…

slide-14
SLIDE 14

Information Retrieval (TF-IDF, BM25, LSA) → Model

When was Obama born? → 1961

slide-15
SLIDE 15

Model

When was Obama born? 1961

?

End-to-end & elegant… But how?

slide-16
SLIDE 16

Solution: Index phrases!

slide-17
SLIDE 17

[-3, 0.1, …] [0.3, -0.2, …] [0.5, 0.1, …] [0.7, -0.4, …] [0.5, 0.0, …] [3.3, -2.2, …]

When was Obama born? Nearest neighbor search

[0.5, 0.1, …]

Document Indexing

  • Locality Sensitive Hashing
  • aLSH (Shrivastava & Li, 2014)
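A brute-force version of the search step might look like the sketch below. It assumes the first two visible components of the vectors on this slide as toy 2-D embeddings and treats [0.5, 0.1] as the question vector; both choices are illustrative assumptions, and the function names are hypothetical. It also shows why the choice of similarity matters: Euclidean nearest neighbor and maximum inner product can disagree, which is the setting asymmetric LSH (Shrivastava & Li, 2014) was designed for.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def squared_l2(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_by_l2(query, index):
    # Exact nearest neighbor under Euclidean distance (linear scan).
    return min(range(len(index)), key=lambda i: squared_l2(query, index[i]))


def nearest_by_inner_product(query, index):
    # Exact maximum inner product search (linear scan).
    return max(range(len(index)), key=lambda i: dot(query, index[i]))


# Toy index: truncated 2-D versions of the vectors shown on the slide.
index = [
    [-3.0, 0.1],
    [0.3, -0.2],
    [0.5, 0.1],
    [0.7, -0.4],
    [0.5, 0.0],
    [3.3, -2.2],
]
query = [0.5, 0.1]  # assumed question embedding

l2_hit = nearest_by_l2(query, index)
mips_hit = nearest_by_inner_product(query, index)
print(l2_hit, mips_hit)
```

A linear scan costs one comparison per indexed phrase; locality sensitive hashing trades exactness for sublinear lookup.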
slide-18
SLIDE 18

“Barack Obama (1961-present) was the 44th President of the United States.”

Phrase encoding: Barack Obama … (1961-present … 44th President … United States.

Question encoding: When was Obama born? / Who is the 44th President of the U.S.?

Nearest neighbor search

slide-19
SLIDE 19

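Model (a: phrase, q: question, d: document):

$$\hat{a} = \operatorname*{argmax}_{a} F_\theta(a, q, d)$$

Decompose:

$$\hat{a} = \operatorname*{argmax}_{a} G_\theta(q) \cdot H_\theta(a, d)$$

G_θ: Question encoder

H_θ: Phrase encoder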

slide-20
SLIDE 20

Decomposability is a strong constraint

slide-21
SLIDE 21

Phrase-Indexed QA (PIQA) Challenge

  • Open-domain QA is hard to set up or evaluate
  • Instead, benchmark on existing datasets (e.g. SQuAD)
  • Create two models:
      • Phrase (document) encoder
      • Question encoder
  • Phrase encoder must be question-agnostic, and the question encoder document-agnostic
  • Answer must be obtained via nearest neighbor search (NNS)
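The constraints above can be made concrete with a toy sketch. The two encoders here are hand-written stand-ins (two features: "looks like a year" and "looks like a name"), not the learned models from the talk, and `MAX_PHRASE_LEN` and all function names are hypothetical. What the sketch is meant to show is the interface: the phrase encoder never sees the question, the question encoder never sees the document, and the answer comes only from a search over precomputed phrase vectors.

```python
import re

MAX_PHRASE_LEN = 2  # hypothetical cap on phrase length, in tokens


def enumerate_phrases(document):
    # Every contiguous token span up to MAX_PHRASE_LEN is a candidate answer.
    tokens = re.findall(r"\w+", document)
    for start in range(len(tokens)):
        last = min(start + MAX_PHRASE_LEN, len(tokens))
        for end in range(start + 1, last + 1):
            yield tokens[start:end]


def encode_phrase(phrase_tokens):
    # Question-agnostic: depends only on the phrase itself.
    is_year = len(phrase_tokens) == 1 and phrase_tokens[0].isdigit() and len(phrase_tokens[0]) == 4
    is_name = all(t[0].isupper() for t in phrase_tokens)
    return [float(is_year), float(is_name)]


def encode_question(question):
    # Document-agnostic: depends only on the question's first word.
    first_word = question.split()[0].lower()
    return {"when": [1.0, 0.0], "who": [0.0, 1.0]}.get(first_word, [0.0, 0.0])


def build_index(document):
    # Offline step: runs once, before any question is known.
    return [(" ".join(p), encode_phrase(p)) for p in enumerate_phrases(document)]


def answer(question, index):
    # Online step: one question encoding plus a maximum inner product search.
    q_vec = encode_question(question)
    best_phrase, _ = max(index, key=lambda item: sum(a * b for a, b in zip(q_vec, item[1])))
    return best_phrase


document = "Barack Obama (1961-present) was the 44th President of the United States."
index = build_index(document)
prediction = answer("When was Obama born?", index)
print(prediction)
```

Because `build_index` takes no question argument, the index can be computed once and reused for every question.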
slide-22
SLIDE 22

PI-SQuAD Evaluation

slide-23
SLIDE 23

Is it too easy or too hard?

slide-24
SLIDE 24

SQuAD v1.1

  • BERT (Devlin et al., 2018): 92% F1
  • SA+ELMo (Peters et al., 2018): 86% F1
  • SA+ELMo (Seo et al., 2018), phrase-indexed: 64% F1
  • Feature-based (Rajpurkar et al., 2018): 50% F1

Decomposability gap: between the two SA+ELMo entries

slide-25
SLIDE 25

SQuAD v1.1

  • BERT (Devlin et al., 2018): 92% F1
  • SA+ELMo (Peters et al., 2018): 86% F1
  • Sparse+SA+ELMo, phrase-indexed: 70% F1
  • Match-LSTM (Wang & Jiang, 2017), first neural model: 68% F1
  • SA+ELMo (Seo et al., 2018), phrase-indexed: 64% F1
  • Feature-based (Rajpurkar et al., 2018): 50% F1
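The F1 numbers above are token-overlap scores between a predicted span and a reference answer. A simplified sketch of that metric follows; the official SQuAD evaluation additionally strips punctuation and articles and takes the maximum over several reference answers, which this version omits. The function name `token_f1` is an assumption.

```python
from collections import Counter


def token_f1(prediction, ground_truth):
    # Harmonic mean of token-level precision and recall (case-insensitive).
    pred_tokens = prediction.lower().split()
    gold_tokens = ground_truth.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


exact = token_f1("1961", "1961")
partial = token_f1("in 1961", "1961")
wrong = token_f1("1911", "1961")
print(exact, partial, wrong)
```

A prediction that contains the answer plus extra tokens gets partial credit, while a wrong span such as "1911" scores zero.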

slide-26
SLIDE 26

Phrase Representation Learning

  • Not just about scalability, but also about comprehension
  • Standalone representations of phrases (document)

PIQA can be viewed as:

  • A phrase embedding evaluation method
      • Sentence embedding in SNLI (Bowman et al., 2015)
  • Constructing a memory of knowledge
      • Memory Networks (Weston et al., 2014)
slide-27
SLIDE 27

According to the American Library Association, this makes …

… tasked with drafting a European Charter of Human Rights, …

Named Entities

slide-28
SLIDE 28

The LM engines were successfully test-fired and restarted, …

Steam turbines were extensively applied …

Lexical & Syntactic Similarity

slide-29
SLIDE 29

… primarily accomplished through the ductile stretching and thinning.

… directly derived from the homogeneity or symmetry of space …

Syntactic Similarity

slide-30
SLIDE 30

Demo on my Macbook

Corpus size: 300k tokens (SQuAD dev set)

16 CPUs: 100s+

GPU: 10s+

slide-31
SLIDE 31

A lot of things to do

  • Closing the gap due to the decomposability constraint
      • BERT (Devlin et al., 2018)?
  • Reducing index storage (100TB+ for Wikipedia)
      • Reducing phrase embedding dimension (1024)
  • Extending to open-domain QA
  • Analyzing phrase representations
  • And more!
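A back-of-the-envelope estimate shows how an index of this size can arise. The slides do not say how the 100TB+ figure was derived, so the sketch below rests on stated assumptions: the 3 billion token corpus from the earlier slide, 1024-dimensional vectors stored as 4-byte floats, and a hypothetical number of indexed phrases per token.

```python
TOKENS = 3_000_000_000   # corpus size from the earlier slide
DIM = 1024               # phrase embedding dimension from the slide
BYTES_PER_FLOAT = 4      # assumption: float32 storage


def index_size_tb(phrases_per_token):
    # Dense index size in terabytes (1 TB = 1e12 bytes).
    # phrases_per_token is a hypothetical count of candidate spans starting at each token.
    return TOKENS * phrases_per_token * DIM * BYTES_PER_FLOAT / 1e12


one_per_token = index_size_tb(1)
nine_per_token = index_size_tb(9)
print(one_per_token, nine_per_token)
```

Under these assumptions a single vector per token already needs about 12 TB, and roughly nine candidate phrases per token pass 100 TB. Halving the embedding dimension would halve the total, which is why the two items sit together on this list.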
slide-32
SLIDE 32

http://pi-qa.com

Thank you!