Using Sentence-Level LSTM Language Models for Script Inference

SLIDE 1

Using Sentence-Level LSTM Language Models for Script Inference

Karl Pichotta and Raymond J. Mooney The University of Texas at Austin

  • ACL 2016, Berlin

1

SLIDE 2

Event Inference: Motivation

  • Suppose we want to build a Question Answering system…

2

SLIDE 3

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested?


3

SLIDE 4

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested?


4

SLIDE 5

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested?


5

SLIDE 6

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested? Very probably!


6

SLIDE 7

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested? Very probably!
  • …But this needs to be inferred.

7

SLIDE 8

Event Inference: Motivation

  • Question answering requires inference of probable implicit events.

  • We’ll investigate such event inference systems.

8

SLIDE 9

Outline

  • Background & Methods
  • Experiments
  • Conclusions

9

SLIDE 10

Outline

  • Background & Methods
  • Experiments
  • Conclusions

10

SLIDE 11

Outline

  • Background & Methods
  • Event Sequence Learning & Inference
  • Sentence-Level Language Models

11

SLIDE 12

Outline

  • Background & Methods
  • Event Sequence Learning & Inference
  • Sentence-Level Language Models

12

SLIDE 13

Event Sequence Learning

  • [Schank & Abelson 1977] gave a non-statistical

account of scripts (events in sequence).

  • [Chambers & Jurafsky (ACL 2008)] provided a

statistical model of (verb, dependency) events.

  • A recent body of work focuses on learning statistical

models of event sequences [e.g. P. & Mooney (AAAI 2016)].

  • Events are, for us, verbs with multiple NP arguments.
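
To make this concrete, here is a tiny illustrative sketch (not the authors' code) of such an event as a verb plus noun-phrase arguments; the field names are assumptions:

```python
# Illustrative event representation: a verb with NP arguments
# (subject, direct object, prepositional argument). Field names are assumed.
from collections import namedtuple

Event = namedtuple("Event", ["verb", "subj", "dobj", "prep_arg"])

# "Jim jumped from the plane."  ->  jumped(jim, from plane)
e = Event(verb="jumped", subj="jim", dobj=None, prep_arg=("from", "plane"))
print(e)
```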

13

SLIDE 14

Event Sequence Learning

14

[Pipeline diagram] Millions of Documents → NLP Pipeline (Syntax, Coreference) → Millions of Event Sequences → Train a Statistical Model

SLIDE 15

Event Sequence Inference

15

[Pipeline diagram] New Test Document → NLP Pipeline (Syntax, Coreference) → Single Event Sequence → Query Trained Statistical Model → Inferred Probable Events

SLIDE 16

Event Sequence Inference

16

[Pipeline diagram] New Test Document → Single Event Sequence → Query Trained Statistical Model → Inferred Probable Events

SLIDE 17

Event Sequence Inference

17

[Pipeline diagram] New Test Document → Single Text Sequence → Query Trained Statistical Model → Inferred Probable Events

SLIDE 18

Event Sequence Inference

18

[Pipeline diagram] New Test Document → Single Text Sequence → Query Trained Statistical Model → Inferred Probable Text

SLIDE 19

Event Sequence Inference

19

[Pipeline diagram] New Test Document → Single Text Sequence → Query Trained Statistical Model → Inferred Probable Text → Parse Events from Text

SLIDE 20

Event Sequence Inference

20

[Pipeline diagram] New Test Document → Single Text → Query Trained Statistical Model → Inferred Probable Text → Parse Events from Text

What if we use raw text as our event representation?

SLIDE 21

Outline

  • Background & Methods
  • Event Sequence Learning
  • Sentence-Level Language Models

21

SLIDE 22

Outline

  • Background & Methods
  • Event Sequence Learning
  • Sentence-Level Language Models

22

SLIDE 23

Sentence-Level Language Models

  • [Kiros et al. NIPS 2015]: “Skip-Thought Vectors”
  • Encode whole sentences into low-dimensional vectors…

  • …trained to decode previous/next sentences.

23

SLIDE 24

Sentence-Level Language Models

24

[Diagram] An RNN encodes the word sequence for sentence i; decoder RNNs are trained to produce the word sequences for the previous sentence (i-1) and the next sentence (i+1).
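
A minimal sketch of this setup in PyTorch (assumed sizes and names, not the authors' implementation): an LSTM encodes the words of sentence i into a fixed state, and a decoder LSTM is trained to emit the words of the following sentence.

```python
# Minimal sketch, not the authors' code: an LSTM encoder-decoder that reads
# sentence i and is trained to produce sentence i+1. Sizes are illustrative.
import torch.nn as nn

class NextSentenceLSTM(nn.Module):
    def __init__(self, vocab_size=50000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src_ids, tgt_in_ids):
        # Encode the word sequence of sentence i into a fixed-size state...
        _, state = self.encoder(self.embed(src_ids))
        # ...then decode sentence i+1 conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.embed(tgt_in_ids), state)
        return self.out(dec_out)  # logits over the next sentence's tokens
```

Training minimizes cross-entropy between these logits and the gold words of sentence i+1.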

SLIDE 25

Sentence-Level Language Models

  • [Kiros et al. 2015] use sentence embeddings for other tasks.
  • We use them directly for inferring text.
  • Central Question: How well can sentence-level language models infer events?

25

SLIDE 26

Outline

  • Background & Methods
  • Event Sequence Learning
  • Sentence-Level Language Models


26

SLIDE 27

Outline

  • Background & Methods
  • Experiments
  • Conclusions


27

SLIDE 28

Outline

  • Background & Methods
  • Experiments
  • Task Setup
  • Results

28

SLIDE 29

Systems

  • Two Tasks:
  • Inferring Events from Events

  • Inferring Text from Text



 


29

SLIDE 30

Systems

  • Two Tasks:
  • Inferring Events from Events


…and optionally expanding into text.

  • Inferring Text from Text


…and optionally parsing into events.
 


30

SLIDE 31

Systems

  • Two Tasks:
  • Inferring Events from Events


…and optionally expanding into text.

  • Inferring Text from Text


…and optionally parsing into events.

  • How do these tasks relate to each other?

31

SLIDE 32

Event Systems

32

Predict an event from a sequence of events:

jumped(jim, from plane); opened(he, parachute) → LSTM → landed(jim, on ground) → LSTM → “Jim landed on the ground.”

≈ [P. & Mooney (2016)]

SLIDE 33

Text Systems

33

Predict text from text:

“Jim jumped from the plane and opened his parachute.” → LSTM → “Jim landed on the ground.” → Parser → landed(jim, on ground)

≈ [Kiros et al. 2015]

SLIDE 34

Outline

  • Background & Methods
  • Experiments
  • Task Setup
  • Results

34

SLIDE 35

Outline

  • Background & Methods
  • Experiments
  • Task Setup
  • Results

35

SLIDE 36

Experimental Setup

  • Train + Test on English Wikipedia.
  • LSTM encoder-decoders trained with batch SGD with momentum.

  • Parse events with Stanford CoreNLP.
  • Events are verbs with head noun arguments.
  • Evaluate on Event Prediction & Text Prediction.
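
As a rough illustration of the event-extraction step (the dependency-triple format and relation labels below are assumptions for the sketch, not CoreNLP's actual API):

```python
# Rough sketch: collapse dependency triples (head verb, relation, dependent
# noun) into events, i.e. verbs with their head-noun arguments.
from collections import defaultdict

def extract_events(dep_triples):
    args = defaultdict(dict)
    for head_verb, rel, dep_noun in dep_triples:
        if rel in ("nsubj", "dobj") or rel.startswith("nmod"):
            args[head_verb][rel] = dep_noun
    return [(verb, rels) for verb, rels in args.items()]

print(extract_events([("jumped", "nsubj", "Jim"), ("jumped", "nmod:from", "plane")]))
# [('jumped', {'nsubj': 'Jim', 'nmod:from': 'plane'})]
```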

36

SLIDE 37

Predicting Events: Evaluation

  • Narrative Cloze [Chambers & Jurafsky 2008]: Hold out an event, judge a system on inferring it.
  • Accuracy: “For what percentage of the documents is the top inference the gold standard answer?”
  • Partial credit: “What is the average percentage of the components of argmax inferences that are the same as in the gold standard?”
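
A small sketch of the partial-credit idea as stated above (my reading of the slide, not the official evaluation code):

```python
# Partial credit: fraction of event components (verb + arguments) in the top
# inference that match the held-out gold event.
def partial_credit(predicted, gold):
    # predicted, gold: equal-length tuples like (verb, subj, dobj, prep_arg)
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

print(partial_credit(("landed", "jim", None, "ground"),
                     ("landed", "jim", None, "plane")))  # 0.75
```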

37

SLIDE 38

Predicting Events: Systems

  • Most Common: Always guess the most common event.

  • e1 -> e2: events to events.
  • t1 -> t2 -> e2: text to text to events.

38

SLIDE 39

Results: Predicting Events

39

[Bar charts] Accuracy (%) and Partial Credit (%) for the Most common, e1 -> e2, and t1 -> t2 -> e2 systems. Values shown on the charts: Accuracy 0.2, 2, 2.3; Partial Credit 26.5, 26.7, 30.3.

SLIDE 40

Predicting Text: Evaluation

  • BLEU: Geometric mean of modified n-gram precisions.

  • Word-level analog to Narrative Cloze.
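
For reference, a sentence-level BLEU score can be computed with an off-the-shelf implementation such as NLTK's (the toy sentences below are made up):

```python
# BLEU: geometric mean of modified n-gram precisions (NLTK also applies a
# brevity penalty). Toy single-reference example.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "jim landed on the ground".split()
hypothesis = "he landed safely on the ground".split()
score = sentence_bleu([reference], hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(round(score, 3))  # higher = more n-gram overlap with the reference
```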

40

SLIDE 41

Predicting Text: Systems

  • t1 -> t1: Copy/paste a sentence as its predicted successor.

  • e1 -> e2 -> t2: events to events to text.
  • t1 -> t2: text to text.

41

SLIDE 42

Results: Predicting Text

42

[Bar charts] BLEU and 1-BLEU for the t1 -> t1, e1 -> e2 -> t2, and t1 -> t2 systems. Values shown on the charts: BLEU 0.34, 1.88, 5.2; 1-BLEU 19.9, 22.6, 30.9.

SLIDE 43

Takeaways

  • In LSTM encoder-decoder event prediction…
  • Raw text models predict events about as well as event models.
  • Raw text models predict tokens better than event models.

43

SLIDE 44

Example Inferences

  • Input: “White died two days after Curly Bill shot him.”
  • Gold: “Before dying, White testified that he thought the pistol had accidentally discharged and that he did not believe that Curly Bill shot him on purpose.”

  • Inferred: “He was buried at <UNK> Cemetery.”

44

SLIDE 45

Example Inferences

  • Input: “As of October 1, 2008, <UNK> changed its company name to Panasonic Corporation.”
  • Gold: “<UNK> products that were branded ‘National’ in Japan are currently marketed under the ‘Panasonic’ brand.”

  • Inferred: “The company’s name is now <UNK>.”

45

SLIDE 46

Conclusions

  • For inferring events in text, text is about as good a representation as events (and doesn’t require a parser!).
  • Relation of sentence-level LM inferences to other NLP tasks is an exciting open question.

46

SLIDE 47

Thanks!

47