Large Scale QA-SRL Parsing Nicholas FitzGerald, Julian Michael, - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Large Scale QA-SRL Parsing

Nicholas FitzGerald, Julian Michael, Luheng He, and Luke Zettlemoyer ACL 2018 http://qasrl.org/

slide-2
SLIDE 2

John surreptitiously ate the burrito at 2am.

Subject Manner Verb Object Time

Semantic Role Labelling

slide-3
SLIDE 3

John surreptitiously ate the burrito at 2am.

Subject Manner Verb Object Time

Semantic Role Labelling

  • Applied to improve state-of-the-art in NLP tasks such as Question Answering [Shen 2007] and Machine Translation [Liu and Gildea, 2010]

slide-4
SLIDE 4

John surreptitiously ate the burrito at 2am.

Subject Manner Verb Object Time

Semantic Role Labelling

  • Applied to improve state-of-the-art in NLP tasks such as Question Answering [Shen 2007] and Machine Translation [Liu and Gildea, 2010]
  • Commonly used interface to facilitate Data Exploration and Information Extraction [Stanovsky et al 2018] [Chiticariu et al. 2018]

slide-5
SLIDE 5

John surreptitiously ate the burrito at 2am.

Subject Manner Verb Object Time

Semantic Role Labelling

  • Applied to improve state-of-the-art in NLP tasks such as Question Answering [Shen 2007] and Machine Translation [Liu and Gildea, 2010]
  • Commonly used interface to facilitate Data Exploration and Information Extraction [Stanovsky et al 2018] [Chiticariu et al. 2018]

  • Considerable interest in general-purpose SRL parsers
slide-6
SLIDE 6

QA-SRL

John surreptitiously ate the burrito at 2am.

Who ate something? How was something eaten? What was eaten? When was something eaten?

Subject Manner Verb Object Time

[He et al. 2015]

slide-7
SLIDE 7

QA-SRL

John surreptitiously ate the burrito at 2am.

Who ate something? How was something eaten? What was eaten? When was something eaten?

Subject Manner Verb Object Time

[He et al. 2015]

QA-SRL 1.0

  • Small dataset
  • Trained annotators
  • Only explored sub-problems
slide-8
SLIDE 8

Goal

A high-quality, large-scale parser for QA-SRL

slide-9
SLIDE 9

In 1950 Alan M. Turing published "Computing machinery and intelligence" in Mind, in which he proposed that machines could be tested for intelligence using questions and answers.

published

  • When was something published? In 1950
  • Who published something? Alan M. Turing
  • What was published? “Computing Machinery and Intelligence”
  • Where was something published? in Mind

proposed

  • When did someone propose something? In 1950
  • Who proposed something? Alan M. Turing
  • What did someone propose? that machines could be tested for intelligence using questions and answers
  • What did someone propose something in? “Computing Machinery and Intelligence”

tested

  • What can be tested? machines
  • What can something be tested for? intelligence
  • How can something be tested? using questions and answers

using

  • What was being used? questions and answers
  • Why was something being used? tested for intelligence

slide-10
SLIDE 10

Challenges

  • 1. Scale up QA-SRL data annotation
slide-12
SLIDE 12

Challenges

  • 1. Scale up QA-SRL data annotation

75k sentence dataset in 9 days

slide-13
SLIDE 13

Challenges

  • 1. Scale up QA-SRL data annotation
  • 2. Train a QA-SRL Parser

In the video, the perpetrators never appeared to look at the camera.

appeared

  • Where didn’t someone appear to do something? In the video
  • Who didn’t appear to do something? the perpetrators
  • When did someone appear? never
  • What didn’t someone appear to do? look at the camera / to look at the camera

look

  • Where didn’t someone look at something? In the video
  • Who didn’t look? the perpetrators
  • What didn’t someone look at? the camera

slide-14
SLIDE 14

Challenges

  • 1. Scale up QA-SRL data annotation
  • 2. Train a QA-SRL Parser
  • 3. Improve Recall

Overgenerate, then validate: +11% data, +2% F-score

slide-15
SLIDE 15

In 1950 Alan M. Turing published "Computing machinery and intelligence" in Mind, in which he proposed that machines could be tested for intelligence using questions and answers.

published

  • When was something published? In 1950
  • Who published something? Alan M. Turing
  • What was published? “Computing Machinery and Intelligence”
  • Where was something published? in Mind

proposed

  • When did someone propose something? In 1950
  • Who proposed something? Alan M. Turing
  • What did someone propose? that machines could be tested for intelligence using questions and answers
  • What did someone propose something in? “Computing Machinery and Intelligence”

tested

  • What can be tested? machines
  • What can something be tested for? intelligence
  • How can something be tested? using questions and answers

using

  • What was being used? questions and answers
  • Why was something being used? tested for intelligence

slide-16
SLIDE 16

Large-scale QA-SRL Parsing

  • 1. Scale up QA-SRL data annotation
  • 2. Train a QA-SRL Parser
  • 3. Improve Recall
slide-17
SLIDE 17

Easier Annotation

  • UCCA [Abend and Rappoport 2013]: ~6k sentences, 4 trained annotators
  • Semantic Proto-roles [Reisinger et al. 2015]: ~7k sentences, MTurk
  • Groningen Meaning Bank [Basile et al. 2012]: ~40k sentences, trained annotators / GWAP
  • QA-SRL 1.0 [He et al. 2015]: ~3k sentences, trained annotators
  • QA-SRL 2.0: 75k sentences, MTurk

slide-18
SLIDE 18

QA-SRL

Questions:

Slots: Wh, Aux, Subj, Verb, Obj, Prep, Obj2

  • Wh: Who, What, Where, When, Why, How
  • Aux: ∅, did, didn’t, might, will, …
  • Subj: ∅, someone, something
  • Verb: ∅, stem, past, past participle, present
  • Obj: ∅, someone, something
  • Prep: ∅, on, to, by, from, …
  • Obj2: ∅, someone, something

slide-19
SLIDE 19

QA-SRL

Questions:

John surreptitiously ate the burrito at 2am.

Slots: Wh, Aux, Subj, Verb, Obj, Prep, Obj2

  • Wh: Who, What, Where, When, Why, How
  • Aux: ∅, did, didn’t, might, will, …
  • Subj: ∅, someone, something
  • Verb: ∅, stem, past, past participle, present
  • Obj: ∅, someone, something
  • Prep: ∅, on, to, by, from, …
  • Obj2: ∅, someone, something

slide-20
SLIDE 20

Slots: Wh, Aux, Subj, Verb, Obj, Prep, Obj2

  • Wh: Who, What, Where, When, Why, How
  • Aux: ∅, did, didn’t, might, will, …
  • Subj: ∅, someone, something
  • Verb: ∅, stem, past, past participle, present
  • Obj: ∅, someone, something
  • Prep: ∅, on, to, by, from, …
  • Obj2: ∅, someone, something

QA-SRL

Questions:

John surreptitiously ate the burrito at 2am.

Who ate something?

slide-21
SLIDE 21

Slots: Wh, Aux, Subj, Verb, Obj, Prep, Obj2

  • Wh: Who, What, Where, When, Why, How
  • Aux: ∅, did, didn’t, might, will, …
  • Subj: ∅, someone, something
  • Verb: ∅, stem, past, past participle, present
  • Obj: ∅, someone, something
  • Prep: ∅, on, to, by, from, …
  • Obj2: ∅, someone, something

QA-SRL

Questions:

John surreptitiously ate the burrito at 2am.

Who ate something? What did someone eat?

slide-22
SLIDE 22

QA-SRL

Questions:

John surreptitiously ate the burrito at 2am.

Who ate something? What did someone eat? …
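The slot template can be turned into a question string by filling each slot in order and skipping the empty ones. The following is a minimal sketch of that idea, not the authors' implementation: the function name `render_question`, the slot keys, and the small inflection table for "eat" are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Slot order as shown on the slide.
SLOTS = ("wh", "aux", "subj", "verb", "obj", "prep", "obj2")

# Hypothetical inflection table for one verb; a real system would need a lexicon.
EAT_FORMS = {
    "stem": "eat",
    "past": "ate",
    "past participle": "eaten",
    "present": "eats",
}


def render_question(slots: Dict[str, Optional[str]], verb_forms: Dict[str, str]) -> str:
    """Join the non-empty slots into a question string.

    The "verb" slot holds the name of a form (e.g. "past"), which is
    looked up in verb_forms; every other slot holds literal text.
    A missing key or None stands for the empty slot.
    """
    words = []
    for name in SLOTS:
        value = slots.get(name)
        if value is None:
            continue
        if name == "verb":
            value = verb_forms[value]
        words.append(value)
    return " ".join(words) + "?"


q1 = render_question({"wh": "Who", "verb": "past", "obj": "something"}, EAT_FORMS)
q2 = render_question(
    {"wh": "What", "aux": "did", "subj": "someone", "verb": "stem"}, EAT_FORMS
)
```

Here `q1` and `q2` reproduce the two example questions on the slide.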

slide-23
SLIDE 23

John surreptitiously ate the burrito at 2am.

QA-SRL

Questions:

Who ate something? What did someone eat? …

Answers:

John

slide-24
SLIDE 24

John surreptitiously ate the burrito at 2am.

QA-SRL

Questions:

Who ate something? What did someone eat? …

Answers:

John the burrito

slide-25
SLIDE 25

Annotation Pipeline

x

John surreptitiously ate the burrito at 2am

Predicate detection

Identify verbs with POS + heuristics

slide-26
SLIDE 26

Annotation Pipeline

x

John surreptitiously ate the burrito at 2am

  • Predicate detection: Identify verbs with POS + heuristics
  • Question annotation: One worker writes as many QA-SRL questions as possible, and provides the answer

slide-27
SLIDE 27

Annotation Pipeline

x

John surreptitiously ate the burrito at 2am

  • Predicate detection: Identify verbs with POS + heuristics
  • Question annotation: One worker writes as many QA-SRL questions as possible, and provides the answer
  • Validation: 2 workers are shown the questions, and provide answers or mark them as invalid

slide-28
SLIDE 28

Question Annotation

  • Efficiency
  • Recall
slide-29
SLIDE 29

Question Annotation

  • Efficiency
  • Autocomplete
slide-30
SLIDE 30

Question Annotation

  • Efficiency
  • Autocomplete
  • Recall
  • Autosuggest
slide-31
SLIDE 31

Question Annotation

  • Efficiency
  • Autocomplete
  • Recall
  • Autosuggest
  • Financial Incentives
slide-32
SLIDE 32

Validation Interface

slide-33
SLIDE 33

Dataset

  • 1 annotator provides questions
  • 2 annotators validate -> 3 spans / question
  • Question invalid if any annotator marks invalid
  • Additional 3 validators for small dense dev and test set
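The aggregation rule in the bullets above (one writer plus two validators, and a question rejected as soon as anyone marks it invalid) can be sketched as follows. This is an illustrative reading of the slide, not the released code; the name `aggregate_question` and the use of `None` for an "invalid" judgment are assumptions.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # half-open token offsets (an assumed convention)


def aggregate_question(
    writer_span: Span, validator_answers: List[Optional[Span]]
) -> Tuple[bool, List[Span]]:
    """Return (is_valid, answer_spans) for one written question.

    validator_answers has one entry per validator: an answer span, or
    None when that validator marked the question invalid.
    """
    # Any single "invalid" judgment rejects the question.
    if any(answer is None for answer in validator_answers):
        return False, []
    # Otherwise keep the writer's span plus one span per validator.
    spans = [writer_span] + [a for a in validator_answers if a is not None]
    return True, spans
```

With two validators, a valid question ends up with three answer spans, matching the "3 spans / question" figure on the slide.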
slide-34
SLIDE 34

Dataset

  • [He et al 2015]: 3000 sentences, several weeks, ~50c / verb, 2.43 questions / verb
  • This work: 75k sentences, 9 days, 33c / verb, 2.05 questions / verb

slide-35
SLIDE 35

In 1950 Alan M. Turing published "Computing machinery and intelligence" in Mind, in which he proposed that machines could be tested for intelligence using questions and answers.

published

  • Who published something? Alan M. Turing
  • What was published? “Computing Machinery and Intelligence”
  • When was something published? In 1950
  • Where was something published? in Mind

proposed

  • Who proposed something? Alan M. Turing
  • What did someone propose? that machines could be tested for intelligence using questions and answers
  • When did someone propose something? In 1950
  • What did someone propose something in? “Computing Machinery and Intelligence”

tested

  • What can be tested? machines
  • What can something be tested for? intelligence
  • How can something be tested? using questions and answers

using

  • What was being used? questions and answers
  • Why was something being used? tested for intelligence

slide-38
SLIDE 38

Large-scale QA-SRL Parsing

  • 1. Scale up QA-SRL data annotation
  • 2. Train a QA-SRL Parser
  • 3. Improve Recall
slide-39
SLIDE 39

QA-SRL Parser

[He et al 2018] [He et al 2017]

slide-41
SLIDE 41

QA-SRL Parser

  • Unlabeled Argument Detection
  • Question generation
slide-42
SLIDE 42

QA-SRL Parsing

x

John surreptitiously ate the burrito at 2pm

slide-43
SLIDE 43

QA-SRL Parsing

x

John surreptitiously ate the burrito at 2pm

Predicate detection

1

slide-44
SLIDE 44

QA-SRL Parsing

x

John surreptitiously ate the burrito at 2pm

Argument detection

“John” “surreptitiously” “the burrito” “at 2pm”

Predicate detection

1

slide-45
SLIDE 45

QA-SRL Parsing

x

John surreptitiously ate the burrito at 2pm

Argument detection

“John” “surreptitiously” “the burrito” “at 2pm”

Question generation

“Who ate something?” “How did someone eat something?” “What did someone eat?” “When did someone eat something?”

Predicate detection

1

slide-46
SLIDE 46

QA-SRL Parsing

x

John surreptitiously ate the burrito at 2pm

Argument detection

“John” “surreptitiously” “the burrito” “at 2pm”

Question generation

“Who ate something?” “How did someone eat something?” “What did someone eat?” “When did someone eat something?”

Predicate detection

1

Automatic Heuristics (same as data)

slide-47
SLIDE 47

QA-SRL Parsing

x

John surreptitiously ate the burrito at 2pm

Argument detection

“John” “surreptitiously” “the burrito” “at 2pm”

Question generation

“Who ate something?” “How did someone eat something?” “What did someone eat?” “When did someone eat something?”

Predicate detection

1

Automatic Heuristics (same as data)

  • 1. BIO Model
  • 2. Span-based Model
slide-48
SLIDE 48

QA-SRL Parsing

x

John surreptitiously ate the burrito at 2pm

Predicate detection

1

Automatic Heuristics (same as data)

Argument detection

“John” “surreptitiously” “the burrito” “at 2pm”

  • 1. BIO Model
  • 2. Span-based Model

Question generation

“Who ate something?” “How did someone eat something?” “What did someone eat?” “When did someone eat something?”

  • 1. Local
  • 2. Sequential
slide-49
SLIDE 49

Argument Detection - BIO Model

John surreptitiously ate the burrito at 2pm 1

  • Alternating Bi-LSTM with Highway Connections and Recurrent Dropout [He et al 2017]

  • Input includes predicate indicator
slide-50
SLIDE 50

Argument Detection - BIO Model

John surreptitiously ate the burrito at 2pm

B B O B I B I

MLP+Softmax

1

slide-51
SLIDE 51

Argument Detection - BIO Model

John surreptitiously ate the burrito at 2pm

B B O B I B I

MLP+Softmax

1

“John” “surreptitiously” “the burrito” “at 2pm”
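Reading argument spans off a BIO tag sequence is a simple left-to-right scan. The sketch below shows one common way to do it; the function name `bio_to_spans` and the choice to treat a stray "I" as the start of a span are assumptions, not details taken from the talk.

```python
from typing import List


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Convert B/I/O tags into the list of tagged argument strings."""
    spans = []
    start = None  # index where the currently open span began, if any
    for i, tag in enumerate(tags):
        if tag == "B":
            # A new span begins; close the one that was open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i))
            start = None
        elif tag == "I" and start is None:
            # Assumed fallback: a stray "I" opens a span.
            start = i
    if start is not None:
        spans.append((start, len(tags)))
    return [" ".join(tokens[s:e]) for s, e in spans]


tokens = "John surreptitiously ate the burrito at 2pm".split()
tags = "B B O B I B I".split()
arguments = bio_to_spans(tokens, tags)
```

Applied to the tags on the slide, `arguments` holds the four spans listed there.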

slide-52
SLIDE 52

Argument Detection - Span Model

John surreptitiously ate the burrito at 2pm 1

Form a representation of every possible span

slide-53
SLIDE 53

Argument Detection - Span Model

John surreptitiously ate the burrito at 2pm

“John surreptitiously”

1

Form a representation of every possible span

slide-54
SLIDE 54

Argument Detection - Span Model

John surreptitiously ate the burrito at 2pm

“John” “John surreptitiously”

1

Form a representation of every possible span

slide-55
SLIDE 55

Argument Detection - Span Model

John surreptitiously ate the burrito at 2pm

“John” “John surreptitiously” “the burrito” “surreptitiously ate the” “the burrito at” “at 2pm”

1

Form a representation of every possible span

slide-56
SLIDE 56

Argument Detection - Span Model

John surreptitiously ate the burrito at 2pm

“John” “John surreptitiously” “the burrito” “surreptitiously ate the” “the burrito at” “at 2pm”

0.9 0.1 0.2 0.8 0.1 0.75

MLP+ sigmoid

1

slide-57
SLIDE 57

Argument Detection - Span Model

John surreptitiously ate the burrito at 2pm

“John” “John surreptitiously” “the burrito” “surreptitiously ate the” “the burrito at” “at 2pm”

0.9 0.1 0.2 0.8 0.1 0.75

MLP+ sigmoid

1

Tunable Threshold
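The span model scores every candidate span independently and keeps those above a threshold, so the threshold trades precision against recall. A minimal sketch follows; the probabilities are made-up illustrative values rather than the ones on the slide, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # half-open token offsets [start, end)


def enumerate_spans(n_tokens: int) -> List[Span]:
    """All contiguous spans of a sentence with n_tokens tokens."""
    return [(i, j) for i in range(n_tokens) for j in range(i + 1, n_tokens + 1)]


def select_spans(span_probs: Dict[Span, float], threshold: float = 0.5) -> List[Span]:
    """Keep every span whose predicted probability reaches the threshold."""
    return sorted(span for span, prob in span_probs.items() if prob >= threshold)


# Illustrative (assumed) sigmoid outputs for a few spans of a 7-token sentence.
toy_probs = {(0, 1): 0.9, (0, 2): 0.1, (3, 5): 0.8, (5, 7): 0.4}
default_spans = select_spans(toy_probs, threshold=0.5)
lenient_spans = select_spans(toy_probs, threshold=0.3)
```

Lowering the threshold from 0.5 to 0.3 admits the extra span `(5, 7)`, which is the mechanism later used to overgenerate candidates.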

slide-58
SLIDE 58

Argument Detection

  • 4 layer Alternating Bi-LSTM with Highway Connections and Recurrent Dropout [He et al 2017]

  • Trained to maximize log-likelihood
slide-59
SLIDE 59

Argument Detection

Span F-score

  • BIO: Exact Match 72.2, IOU >= 0.5 83.1
  • Span (t=0.5): Exact Match 81.3, IOU >= 0.5 85.8
  • Span (t=t*): Exact Match 82.2, IOU >= 0.5 88.1
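The two matching criteria, exact match and IOU >= 0.5, differ only in how a predicted span is compared with a gold span. The sketch below shows one plausible way to compute a span F-score under either criterion; the exact matching procedure used in the talk is not specified on the slide, so the precision and recall definitions here are assumptions.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open token offsets [start, end)


def span_iou(a: Span, b: Span) -> float:
    """Intersection over union of two token spans."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def is_exact(a: Span, b: Span) -> bool:
    return a == b


def is_iou_match(a: Span, b: Span) -> bool:
    return span_iou(a, b) >= 0.5


def span_f_score(
    predicted: List[Span], gold: List[Span], match: Callable[[Span, Span], bool]
) -> float:
    """F1 where a span counts as correct if it matches any span on the other side."""
    if not predicted or not gold:
        return 0.0
    precision = sum(any(match(p, g) for g in gold) for p in predicted) / len(predicted)
    recall = sum(any(match(p, g) for p in predicted) for g in gold) / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

A prediction that is off by one token fails the exact criterion but can still count under IOU, which is why the IOU column is consistently higher.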

slide-60
SLIDE 60

Question Generation

Slots: Wh, Aux, Subj, Verb, Obj, Prep, Obj2

  • Wh: Who, What, Where, When, Why, How
  • Aux: ∅, did, didn’t, might, will, …
  • Subj: ∅, someone, something
  • Verb: ∅, stem, past, past participle, present
  • Obj: ∅, someone, something
  • Prep: ∅, on, to, by, from, …
  • Obj2: ∅, someone, something

slide-61
SLIDE 61

Question Generation

  • Two models:
  • Local slot prediction
  • Sequential Model (LSTM)

Slots: Wh, Aux, Subj, Verb, Obj, Prep, Obj2

  • Wh: Who, What, Where, When, Why, How
  • Aux: ∅, did, didn’t, might, will, …
  • Subj: ∅, someone, something
  • Verb: ∅, stem, past, past participle, present
  • Obj: ∅, someone, something
  • Prep: ∅, on, to, by, from, …
  • Obj2: ∅, someone, something

slide-62
SLIDE 62

Question Generation - Local

John surreptitiously ate the burrito at 2pm

“John” “surreptitiously” “the burrito” “at 2pm”

1

slide-63
SLIDE 63

Question Generation - Local

Wh Aux Subj Verb Obj1 Prep Obj2

“the burrito”

Who What Where …

“What”

slide-64
SLIDE 64

Question Generation - Local

Wh Aux Subj Verb Obj1 Prep Obj2

“the burrito”

is did might …

“What” “did”

slide-65
SLIDE 65

Question Generation - Local

Wh Aux Subj Verb Obj1 Prep Obj2

“the burrito” “What” “did” “someone” “past-tense” Ø Ø Ø

slide-66
SLIDE 66

Question Generation - Sequential

Wh Aux Subj Verb Obj1 Prep Obj2

“the burrito” “What”

slide-67
SLIDE 67

Question Generation - Sequential

Wh Aux Subj Verb Obj1 Prep Obj2

“the burrito” “What” “What” “did”

slide-68
SLIDE 68

Question Generation - Sequential

Wh Aux Subj Verb Obj1 Prep Obj2

“the burrito” “What” “What” “did” “did” “someone” “past-tense” “Ø” “Ø” “someone” “past-tense” “Ø” “Ø” “Ø”
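The difference between the two question generators is what each slot prediction can see: the local model picks every slot independently, while the sequential model conditions each slot on the slots already chosen. The sketch below contrasts the two decoding loops using plain dictionaries of scores; the scoring functions stand in for the neural models and are purely hypothetical, and greedy decoding is an assumption.

```python
from typing import Callable, Dict, List, Sequence

Scores = Dict[str, float]  # candidate slot value -> score


def decode_local(slot_scores: Sequence[Scores]) -> List[str]:
    """Pick the best value for each slot independently of the others."""
    return [max(scores, key=scores.get) for scores in slot_scores]


def decode_sequential(step: Callable[[List[str]], Scores], n_slots: int) -> List[str]:
    """Greedy decoding: each slot is scored given the slots chosen so far."""
    prefix: List[str] = []
    for _ in range(n_slots):
        scores = step(prefix)
        prefix.append(max(scores, key=scores.get))
    return prefix


def toy_step(prefix: List[str]) -> Scores:
    """Stand-in for a sequential model; the scores are invented."""
    if not prefix:
        return {"What": 0.7, "Who": 0.3}
    if prefix[-1] == "What":
        return {"did": 0.9, "∅": 0.1}
    return {"∅": 1.0}


local_out = decode_local([{"What": 0.7, "Who": 0.3}, {"did": 0.6, "∅": 0.4}])
sequential_out = decode_sequential(toy_step, n_slots=3)
```

Because `toy_step` receives the whole prefix, a sequential model can rule out slot combinations that a purely local model might produce.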

slide-69
SLIDE 69

Question Generation

  • 4 layer Alternating Bi-LSTM with Highway Connections and Recurrent Dropout [He et al 2017]

  • Trained to maximize log-likelihood
slide-70
SLIDE 70

Evaluation Questions

  • Paraphrasing

“Who ate something?” “Who was something eaten by?”

slide-71
SLIDE 71

Evaluation Questions

  • Paraphrasing

“Who ate something?” “Who was something eaten by?”

  • Given gold span:
  • Slot-level accuracy
  • Exact Match (full question)
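Both metrics compare a predicted question with the gold question slot by slot. The sketch below shows how they might be computed when questions are stored as 7-tuples of slot values; that representation and the function names are assumptions for illustration.

```python
from typing import Optional, Sequence, Tuple

# One entry per slot (Wh, Aux, Subj, Verb, Obj, Prep, Obj2); None = empty slot.
Question = Tuple[Optional[str], ...]


def slot_accuracy(predicted: Sequence[Question], gold: Sequence[Question]) -> float:
    """Fraction of individual slots predicted correctly, over all questions."""
    correct = 0
    total = 0
    for pred_q, gold_q in zip(predicted, gold):
        correct += sum(p == g for p, g in zip(pred_q, gold_q))
        total += len(gold_q)
    return correct / total if total else 0.0


def exact_match_rate(predicted: Sequence[Question], gold: Sequence[Question]) -> float:
    """Fraction of questions whose slots all match the gold question."""
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


active = ("Who", None, None, "past", "something", None, None)
passive = ("Who", "was", "something", "past participle", None, "by", None)
```

The `active` and `passive` tuples encode the two paraphrases on the slide: they ask about the same argument, yet share only two of seven slots, so both metrics penalize a correct paraphrase.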
slide-72
SLIDE 72

Question Generation

  • Local: Slot Accuracy 83.2, Exact Match 44.2
  • Sequential: Slot Accuracy 82.9, Exact Match 47.2

slide-73
SLIDE 73

Full Parsing Accuracy

Exact match F-score (Span & Question), t=0.5

  • Span + Local: 40.6
  • Span + Sequential: 42.4

slide-74
SLIDE 74

Large-scale QA-SRL Parsing

  • 1. Scale up QA-SRL data annotation
  • 2. Train a QA-SRL Parser
  • 3. Improve Recall
slide-75
SLIDE 75

In 1950 Alan M. Turing published "Computing machinery and intelligence" in Mind, in which he proposed that machines could be tested for intelligence using questions and answers.

published

  • Who published something? Alan M. Turing
  • What was published? “Computing Machinery and Intelligence”
  • When was something published? In 1950
  • Where was something published? in Mind

proposed

  • Who proposed something? Alan M. Turing
  • What did someone propose? that machines could be tested for intelligence using questions and answers
  • When did someone propose something? In 1950
  • What did someone propose something in? “Computing Machinery and Intelligence”

tested

  • What can be tested? machines
  • What can something be tested for? intelligence
  • How can something be tested? using questions and answers

using

  • What was being used? questions and answers
  • Why was something being used? tested for intelligence

slide-76
SLIDE 76

In 1950 Alan M. Turing published "Computing machinery and intelligence" in Mind, in which he proposed that machines could be tested for intelligence using questions and answers.

published

  • Who published something? Alan M. Turing
  • What was published? “Computing Machinery and Intelligence”
  • When was something published? In 1950
  • Where was something published? in Mind

proposed

  • Who proposed something? Alan M. Turing
  • What did someone propose? that machines could be tested for intelligence using questions and answers
  • When did someone propose something? In 1950
  • What did someone propose something in? “Computing Machinery and Intelligence”

tested

  • What can be tested? machines
  • What can something be tested for? intelligence
  • How can something be tested? using questions and answers

using

  • What was being used? questions and answers
  • Why was something being used? tested for intelligence

Impacts training and evaluation

slide-77
SLIDE 77

In 1950 Alan M. Turing published "Computing machinery and intelligence" in Mind, in which he proposed that machines could be tested for intelligence using questions and answers.

published

  • Who published something? Alan M. Turing
  • What was published? “Computing Machinery and Intelligence”
  • When was something published? In 1950
  • Where was something published? in Mind

proposed

  • Who proposed something? Alan M. Turing
  • What did someone propose? that machines could be tested for intelligence using questions and answers
  • When did someone propose something? In 1950
  • What did someone propose something in? “Computing Machinery and Intelligence”

tested

  • What can be tested? machines
  • What can something be tested for? intelligence
  • How can something be tested? using questions and answers

using

  • What was being used? questions and answers
  • Why was something being used? tested for intelligence

Fill in Gaps

slide-78
SLIDE 78

Annotation Pipeline

x

John surreptitiously ate the burrito at 2pm

  • Predicate detection: Identify verbs with POS + heuristics
  • Question generation: One worker writes as many QA-SRL questions as possible, and provides the answer
  • Validation: Workers are shown the questions, and provide answers or mark them as invalid

slide-79
SLIDE 79

Annotation Pipeline

x

John surreptitiously ate the burrito at 2pm

  • Predicate detection: Identify verbs with POS + heuristics
  • Question generation: One worker writes as many QA-SRL questions as possible, and provides the answer
  • Validation: Workers are shown the questions, and provide answers or mark them as invalid

QA-SRL Parser

slide-80
SLIDE 80

Data Expansion

  • Overgenerate questions with low span threshold
  • Validate
  • +46,715 valid QA-pairs (~11% after filtering)
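The expansion loop described above can be sketched as a filter over parser output: keep candidates whose span probability clears a deliberately low threshold, drop those already in the dataset, and keep only what validation accepts. Every name here (`expand_data`, `validate`, the tuple layout) is hypothetical, and the `validate` callback stands in for the crowdsourced validation step.

```python
from typing import Callable, Iterable, List, Set, Tuple

QAPair = Tuple[str, Tuple[int, int]]  # (question text, answer span)


def expand_data(
    candidates: Iterable[Tuple[QAPair, float]],
    existing: Set[QAPair],
    low_threshold: float,
    validate: Callable[[QAPair], bool],
) -> List[QAPair]:
    """Return new QA pairs that pass the low threshold and validation."""
    new_pairs = []
    seen = set(existing)
    for qa, span_prob in candidates:
        # Overgenerate: accept low-probability spans, but skip known pairs.
        if span_prob < low_threshold or qa in seen:
            continue
        seen.add(qa)
        if validate(qa):
            new_pairs.append(qa)
    return new_pairs
```

In this sketch, deduplication happens before validation so that no pair already in the dataset is sent to annotators a second time.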
slide-81
SLIDE 81

Data Expansion

  • Span Detection (F-score): Original 83.7, Expanded 84.6
  • Question Generation (Exact Match): Original 50.5, Expanded 50.8
  • Full Parsing (Exact Match): Original 47.2, Expanded 49.1

slide-82
SLIDE 82

Evaluation

  • Exact Match for Question Generation is overly harsh
  • Paraphrasing
  • Missing Questions
  • Penalizes correct predictions missing from data

“Who ate something?” “Who was something eaten by?”

slide-83
SLIDE 83

Human Evaluation

  • Validate model predictions with 6 annotators
  • Generated Question valid if 5 out of 6 annotators provided answers
  • Predicted span correct if exactly matches any annotator’s answer
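The two human-evaluation rules above translate directly into small predicates. This is a sketch under the assumption that each annotator's judgment is stored as an answer span, or `None` when the annotator marked the question invalid; the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]


def question_is_valid(answers: List[Optional[Span]], min_answered: int = 5) -> bool:
    """A question is valid if enough annotators provided an answer."""
    return sum(answer is not None for answer in answers) >= min_answered


def span_is_correct(predicted: Span, answers: List[Optional[Span]]) -> bool:
    """A predicted span is correct if it equals any annotator's answer."""
    return any(answer == predicted for answer in answers if answer is not None)


judgments = [(3, 5), (3, 5), (3, 6), None, (3, 5), (3, 5)]
```

For `judgments`, five of six annotators answered, so the question counts as valid, and a predicted span of `(3, 6)` counts as correct because one annotator gave exactly that answer.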

slide-84
SLIDE 84

Human Eval - Questions

SpanSeq + Expand SpanSeq SpanLocal

Decreasing t

slide-85
SLIDE 85

Human Eval - Questions

82.64

SpanSeq + Expand SpanSeq SpanLocal

Decreasing t

slide-86
SLIDE 86

Human Eval - Arguments

77.71

SpanSeq + Expand SpanSeq SpanLocal

slide-87
SLIDE 87

Example Output

Some of the vegetarians he met were members of the Theosophical Society, which had been founded in 1875 to further universal brotherhood, and which was devoted to the study of Buddhist and Hindu literature.

met

  • Who met someone? Some of the vegetarians
  • Who met? he
  • What did someone meet? members of the Theosophical Society

founded

  • What had been founded? members of the Theosophical Society / the Theosophical Society
  • When was something founded? in 1875
  • Why has something been founded? to further universal brotherhood

devoted

  • What was devoted to something? members of the Theosophical Society
  • What was something devoted to? the study of Buddhist and Hindu literature

slide-88
SLIDE 88

Conclusion

  • Large crowdsourced dataset of QA-SRL annotations
  • High quality QA-SRL Parser
  • Techniques for data expansion
slide-89
SLIDE 89

Code and Data available at:
 http://qasrl.org

slide-90
SLIDE 90

Code and Data available at:
 http://demo.qasrl.org

slide-91
SLIDE 91

Thank You!

http://qasrl.org/