Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension


  1. Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension. Minjoon Seo, Tom Kwiatkowski, Ankur Parikh, Ali Farhadi, Hannaneh Hajishirzi

  2. Question Answering?

  3. Document (context): “Barack Obama (1961-present) was the 44th President of the United States.” Question: When was Obama born? → Model → Answer: 1961

  4. Extractive. Document (context): “Barack Obama (1961-present) was the 44th President of the United States.” Question: When was Obama born? → Model → Answer: 1961 (a span of the document)

  5. Extractive QA Datasets • SQuAD (Rajpurkar et al., 2016) • NewsQA (Trischler et al., 2016) • TriviaQA (Joshi et al., 2017) • QuAC (Choi et al., 2018) • CoQA (Reddy, Chen & Manning, 2018) • HotpotQA (Yang et al., 2018) • And more…

  6. Open-domain QA?

  7. Document (context): “Barack Obama (1961-present) was the 44th President of the United States.” Question: When was Obama born? → Model → Answer: 1961

  8. Question: When was Obama born? → Model → Answer: 1961

  9. 4 million documents, 3 billion tokens. 0.1 s/doc × 4M docs = 400,000 s ≈ 4.6 days!
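
The cost on this slide is a simple product of corpus size and per-document reading time. A minimal sketch of that arithmetic, using the slide's figures (4 million documents, 0.1 s per document) and assuming the documents are read sequentially for every question; the function name `full_scan_days` is hypothetical:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def full_scan_days(num_docs: int, seconds_per_doc: float) -> float:
    """Days needed to run the reader over every document once, one after another."""
    return num_docs * seconds_per_doc / SECONDS_PER_DAY


# Figures taken from the slide; sequential processing is an assumption.
days = full_scan_days(num_docs=4_000_000, seconds_per_doc=0.1)
print(f"{days:.1f} days per question")
```

Whatever the exact rounding, the result is on the order of days per question, which is why reading the whole corpus at query time is impractical.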

  10. Pipelined (Choi et al., 2017; Chen et al., 2017; Clark & Gardner, 2017). Question: When was Obama born? → Information Retrieval (TF-IDF, BM25, LSA) → Model → Answer: 1961

  11. Choi et al., 2017; Chen et al., 2017; Clark & Gardner, 2017. Question: When was Obama born? → Information Retrieval (TF-IDF, BM25, LSA) → Wrong document! → Model → Answer: 1961

  12. Choi et al., 2017; Chen et al., 2017; Clark & Gardner, 2017. Question: When was Obama born? → Information Retrieval (TF-IDF, BM25, LSA) → Wrong document! → Model → Wrong answer! 1911. Error propagation.

  13. Ideally…

  14. Question: When was Obama born? → Information Retrieval (TF-IDF, BM25, LSA) → Model → Answer: 1961

  15. Question: When was Obama born? → ? → Model → Answer: 1961. End-to-end & elegant… But how?

  16. Solution: Index phrases!

  17. Document indexing: phrases are stored as vectors [-3, 0.1, …], [0.3, -0.2, …], [0.5, 0.1, …], [0.7, -0.4, …], [0.5, 0.0, …], [3.3, -2.2, …]. Question: When was Obama born? → [0.5, 0.1, …] → Nearest neighbor search over the index: - Locality Sensitive Hashing - aLSH (Shrivastava & Li, 2014) - …
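
A minimal sketch of the lookup this slide describes, assuming Euclidean distance and an exhaustive scan rather than the hashing-based methods the slide lists. The vectors are only the first two components shown on the slide (the remaining dimensions are elided there), and the function name `nearest_neighbor` is hypothetical:

```python
import math
from typing import List, Sequence


def nearest_neighbor(index: List[Sequence[float]], query: Sequence[float]) -> int:
    """Return the position of the indexed vector closest to `query` (Euclidean distance)."""
    return min(range(len(index)), key=lambda i: math.dist(index[i], query))


# Toy index: the first two components of each vector shown on the slide.
phrase_index = [
    (-3.0, 0.1),
    (0.3, -0.2),
    (0.5, 0.1),
    (0.7, -0.4),
    (0.5, 0.0),
    (3.3, -2.2),
]
question_vector = (0.5, 0.1)

best = nearest_neighbor(phrase_index, question_vector)
print(best, phrase_index[best])
```

An exhaustive scan costs time linear in the number of indexed phrases; approximate methods such as Locality Sensitive Hashing trade some accuracy for sublinear lookup.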

  18. Document: “Barack Obama (1961-present) was the 44th President of the United States.” Phrase encoding: “Barack Obama …”, “… (1961-present …”, “… 44th President …”, “… United States.” Question encoding: “Who is the 44th President of the U.S.?”, “When was Obama born?” Each question is matched to a phrase by nearest neighbor search.

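  19. Model over phrase $a$, question $q$, and document $d$: $\hat{a} = \operatorname{argmax}_{a} F_{\theta}(a, q, d)$. Decompose: $\hat{a} = \operatorname{argmax}_{a} G_{\theta}(q) \cdot H_{\theta}(a, d)$, where $G_{\theta}$ is the question encoder and $H_{\theta}$ is the phrase encoder.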
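
A minimal sketch of what the decomposition buys, assuming inner-product scoring between a question vector and precomputed phrase vectors. The hand-written vectors and the keyword rule below are toy stand-ins for trained encoders, and all names are hypothetical:

```python
from typing import Dict, Tuple

Vector = Tuple[float, float]

# Phrase side: computed once, offline, without ever seeing a question.
# Toy values standing in for the output of a trained phrase encoder.
PHRASE_VECTORS: Dict[str, Vector] = {
    "Barack Obama": (1.0, 0.0),
    "1961": (0.0, 1.0),
}


def encode_question(question: str) -> Vector:
    """Question side: toy rule standing in for a trained question encoder."""
    return (0.0, 1.0) if question.lower().startswith("when") else (1.0, 0.0)


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def answer(question: str) -> str:
    """Pick the phrase whose stored vector has the largest inner product with the question."""
    q = encode_question(question)
    return max(PHRASE_VECTORS, key=lambda phrase: dot(q, PHRASE_VECTORS[phrase]))


print(answer("When was Obama born?"))
print(answer("Who is the 44th President of the U.S.?"))
```

Because the phrase vectors do not depend on the question, only the question has to be encoded at query time; the rest is a search over the stored vectors.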

  20. Decomposability is a strong constraint

  21. Phrase-Indexed QA (PIQA) Challenge • Open-domain QA is hard to set up or evaluate • Instead, benchmark on existing datasets (e.g. SQuAD) • Create two models: • Phrase (document) encoder • Question encoder • Phrase encoder must be question-agnostic, and vice versa • Answer must be obtained via nearest neighbor search (NNS)

  22. PI-SQuAD Evaluation

  23. Is it too easy or too hard?

  24. SQuAD v1.1 (red color is phrase-indexed): BERT (Devlin et al., 2018), 92% F1; SA+ELMo (Peters et al., 2018), 86% F1; SA+ELMo (Seo et al., 2018), 64% F1; Feature-based (Rajpurkar et al., 2018), 50% F1. Decomposability gap: between SA+ELMo at 86% F1 and SA+ELMo at 64% F1.

  25. SQuAD v1.1 (red color is phrase-indexed): BERT (Devlin et al., 2018), 92% F1; SA+ELMo (Peters et al., 2018), 86% F1; Sparse+SA+ELMo, 70% F1; Match-LSTM (Wang & Jiang, 2017), first neural model, 68% F1; SA+ELMo (Seo et al., 2018), 64% F1; Feature-based (Rajpurkar et al., 2018), 50% F1

  26. Phrase Representation Learning • Not just about scalability, but also about comprehension • Standalone representations of phrases (document). PIQA can be viewed as: • A phrase embedding evaluation method • Sentence embedding in SNLI (Bowman et al., 2015) • Constructing a memory of knowledge • Memory Networks (Weston et al., 2014)

  27. Named Entities: “According to the American Library Association, this makes …” / “… tasked with drafting a European Charter of Human Rights, …”

  28. Lexical & Syntactic Similarity: “The LM engines were successfully test-fired and restarted, …” / “Steam turbines were extensively applied …”

  29. Syntactic Similarity: “… primarily accomplished through the ductile stretching and thinning.” / “… directly derived from the homogeneity or symmetry of space …”

  30. Demo on my MacBook. Corpus size: 300k tokens (SQuAD dev set). 16 CPUs: 100s+. GPU: 10s+.

  31. A lot of things to do • Closing the gap due to the decomposability constraint • BERT (Devlin et al., 2018)? • Reducing index storage (100TB+ for Wikipedia) • Reducing phrase embedding dimension (1024) • Extending to open-domain QA • Analyzing phrase representations • And more!
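
A back-of-the-envelope sketch of why index storage is listed as an open problem. The embedding dimension (1024) is from this slide and the 3 billion token count is taken from the earlier corpus-size slide; storing dense 32-bit floats and the number of phrase vectors per token are assumptions, and the function name `index_size_terabytes` is hypothetical:

```python
def index_size_terabytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Size of a dense vector index in terabytes (1 TB = 10**12 bytes)."""
    return num_vectors * dim * bytes_per_value / 1e12


NUM_TOKENS = 3_000_000_000  # corpus size from the earlier slide
EMBEDDING_DIM = 1024        # phrase embedding dimension from this slide

# Assumption: exactly one stored vector per token, 4 bytes per value.
one_vector_per_token = index_size_terabytes(NUM_TOKENS, EMBEDDING_DIM)
print(f"{one_vector_per_token:.1f} TB with one vector per token")
```

Under these assumptions the size grows linearly with both the embedding dimension and the number of candidate phrases indexed per token, so reducing either one shrinks the index proportionally.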

  32. http://pi-qa.com Thank you!
