Using Sentence-Level LSTM Language Models for Script Inference

SLIDE 1

Using Sentence-Level LSTM Language Models for Script Inference

Karl Pichotta and Raymond J. Mooney The University of Texas at Austin

  • ACL 2016, Berlin

1

SLIDE 2

Event Inference: Motivation

  • Suppose we want to build a Question Answering system…

2

SLIDE 3

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested?


3

SLIDE 4

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested?


4

SLIDE 5

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested?


5

SLIDE 6

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested? Very probably!


6

SLIDE 7

Event Inference: Motivation

  • The Convention ordered the arrest of Robespierre.…

Troops from the Commune, under General Coffinhal, arrived to free the prisoners and then marched against the Convention itself.
 
 –Wikipedia

  • Was Robespierre arrested? Very probably!
  • …But this needs to be inferred.

7

SLIDE 8

Event Inference: Motivation

  • Question answering requires inference of probable implicit events.

  • We’ll investigate such event inference systems.

8

SLIDE 9

Outline

  • Background & Methods
  • Experiments
  • Conclusions

9

SLIDE 10

Outline

  • Background & Methods
  • Experiments
  • Conclusions

10

SLIDE 11

Outline

  • Background & Methods
  • Event Sequence Learning & Inference
  • Sentence-Level Language Models

11

SLIDE 12

Outline

  • Background & Methods
  • Event Sequence Learning & Inference
  • Sentence-Level Language Models

12

SLIDE 13

Event Sequence Learning

  • [Schank & Abelson 1977] gave a non-statistical

account of scripts (events in sequence).

  • [Chambers & Jurafsky (ACL 2008)] provided a

statistical model of (verb, dependency) events.

  • A recent body of work focuses on learning statistical

models of event sequences [e.g. P. & Mooney (AAAI 2016)].

  • Events are, for us, verbs with multiple NP arguments.
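
To make this concrete, here is a tiny illustrative sketch (not the authors' code) of such an event as a verb plus noun-phrase arguments; the field names are assumptions:

```python
# Illustrative event representation: a verb with NP arguments
# (subject, direct object, prepositional argument). Field names are assumed.
from collections import namedtuple

Event = namedtuple("Event", ["verb", "subj", "dobj", "prep_arg"])

# "Jim jumped from the plane."  ->  jumped(jim, from plane)
e = Event(verb="jumped", subj="jim", dobj=None, prep_arg=("from", "plane"))
print(e)
```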

13

SLIDE 14

Event Sequence Learning

14

[Pipeline diagram] Millions of Documents → NLP Pipeline (Syntax, Coreference) → Millions of Event Sequences → Train a Statistical Model

SLIDE 15

Event Sequence Inference

15

[Pipeline diagram] New Test Document → NLP Pipeline (Syntax, Coreference) → Single Event Sequence → Query Trained Statistical Model → Inferred Probable Events

SLIDE 16

Event Sequence Inference

16

[Pipeline diagram] New Test Document → Single Event Sequence → Query Trained Statistical Model → Inferred Probable Events

SLIDE 17

Event Sequence Inference

17

[Pipeline diagram] New Test Document → Single Text Sequence → Query Trained Statistical Model → Inferred Probable Events

SLIDE 18

Event Sequence Inference

18

[Pipeline diagram] New Test Document → Single Text Sequence → Query Trained Statistical Model → Inferred Probable Text

SLIDE 19

Event Sequence Inference

19

[Pipeline diagram] New Test Document → Single Text Sequence → Query Trained Statistical Model → Inferred Probable Text → Parse Events from Text

SLIDE 20

Event Sequence Inference

20

[Pipeline diagram] New Test Document → Single Text → Query Trained Statistical Model → Inferred Probable Text → Parse Events from Text

What if we use raw text as our event representation?

SLIDE 21

Outline

  • Background & Methods
  • Event Sequence Learning
  • Sentence-Level Language Models

21

SLIDE 22

Outline

  • Background & Methods
  • Event Sequence Learning
  • Sentence-Level Language Models

22

SLIDE 23

Sentence-Level Language Models

  • [Kiros et al. NIPS 2015]: “Skip-Thought Vectors”
  • Encode whole sentences into low-dimensional vectors…

  • …trained to decode previous/next sentences.

23

SLIDE 24

Sentence-Level Language Models

24

[Diagram] An RNN encodes the word sequence for sentence i; decoder RNNs are trained to produce the word sequences for the previous sentence (i-1) and the next sentence (i+1).
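
A minimal sketch of this setup in PyTorch (assumed sizes and names, not the authors' implementation): an LSTM encodes the words of sentence i into a fixed state, and a decoder LSTM is trained to emit the words of the following sentence.

```python
# Minimal sketch, not the authors' code: an LSTM encoder-decoder that reads
# sentence i and is trained to produce sentence i+1. Sizes are illustrative.
import torch.nn as nn

class NextSentenceLSTM(nn.Module):
    def __init__(self, vocab_size=50000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src_ids, tgt_in_ids):
        # Encode the word sequence of sentence i into a fixed-size state...
        _, state = self.encoder(self.embed(src_ids))
        # ...then decode sentence i+1 conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.embed(tgt_in_ids), state)
        return self.out(dec_out)  # logits over the next sentence's tokens
```

Training minimizes cross-entropy between these logits and the gold words of sentence i+1.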

SLIDE 25

Sentence-Level Language Models

  • [Kiros et al. 2015] use sentence embeddings for other tasks.
  • We use them directly for inferring text.
  • Central Question: How well can sentence-level language models infer events?

25

SLIDE 26

Outline

  • Background & Methods
  • Event Sequence Learning
  • Sentence-Level Language Models


26

SLIDE 27

Outline

  • Background & Methods
  • Experiments
  • Conclusions


27

SLIDE 28

Outline

  • Background & Methods
  • Experiments
  • Task Setup
  • Results

28

SLIDE 29

Systems

  • Two Tasks:
  • Inferring Events from Events

  • Inferring Text from Text



 


29

SLIDE 30

Systems

  • Two Tasks:
  • Inferring Events from Events


…and optionally expanding into text.

  • Inferring Text from Text


…and optionally parsing into events.
 


30

SLIDE 31

Systems

  • Two Tasks:
  • Inferring Events from Events


…and optionally expanding into text.

  • Inferring Text from Text


…and optionally parsing into events.

  • How do these tasks relate to each other?

31

SLIDE 32

Event Systems

32

Predict an event from a sequence of events:

jumped(jim, from plane); opened(he, parachute) → LSTM → landed(jim, on ground) → LSTM → “Jim landed on the ground.”

≈ [P. & Mooney (2016)]

SLIDE 33

Text Systems

33

Predict text from text:

“Jim jumped from the plane and opened his parachute.” → LSTM → “Jim landed on the ground.” → Parser → landed(jim, on ground)

≈ [Kiros et al. 2015]

SLIDE 34

Outline

  • Background & Methods
  • Experiments
  • Task Setup
  • Results

34

SLIDE 35

Outline

  • Background & Methods
  • Experiments
  • Task Setup
  • Results

35

SLIDE 36

Experimental Setup

  • Train + Test on English Wikipedia.
  • LSTM encoder-decoders trained with batch SGD with momentum.

  • Parse events with Stanford CoreNLP.
  • Events are verbs with head noun arguments.
  • Evaluate on Event Prediction & Text Prediction.
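
As a rough illustration of the event-extraction step (the dependency-triple format and relation labels below are assumptions for the sketch, not CoreNLP's actual API):

```python
# Rough sketch: collapse dependency triples (head verb, relation, dependent
# noun) into events, i.e. verbs with their head-noun arguments.
from collections import defaultdict

def extract_events(dep_triples):
    args = defaultdict(dict)
    for head_verb, rel, dep_noun in dep_triples:
        if rel in ("nsubj", "dobj") or rel.startswith("nmod"):
            args[head_verb][rel] = dep_noun
    return [(verb, rels) for verb, rels in args.items()]

print(extract_events([("jumped", "nsubj", "Jim"), ("jumped", "nmod:from", "plane")]))
# [('jumped', {'nsubj': 'Jim', 'nmod:from': 'plane'})]
```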

36

SLIDE 37

Predicting Events: Evaluation

  • Narrative Cloze [Chambers & Jurafsky 2008]: Hold out an event, judge a system on inferring it.
  • Accuracy: “For what percentage of the documents is the top inference the gold standard answer?”
  • Partial credit: “What is the average percentage of the components of argmax inferences that are the same as in the gold standard?”
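
A small sketch of the partial-credit idea as stated above (my reading of the slide, not the official evaluation code):

```python
# Partial credit: fraction of event components (verb + arguments) in the top
# inference that match the held-out gold event.
def partial_credit(predicted, gold):
    # predicted, gold: equal-length tuples like (verb, subj, dobj, prep_arg)
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

print(partial_credit(("landed", "jim", None, "ground"),
                     ("landed", "jim", None, "plane")))  # 0.75
```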

37

SLIDE 38

Predicting Events: Systems

  • Most Common: Always guess the most common event.

  • e1 -> e2: events to events.
  • t1 -> t2 -> e2: text to text to events.

38

SLIDE 39

Results: Predicting Events

39

[Bar charts] Accuracy (%) and Partial Credit (%) for the Most common, e1 -> e2, and t1 -> t2 -> e2 systems. Values shown on the charts: Accuracy 0.2, 2, 2.3; Partial Credit 26.5, 26.7, 30.3.

SLIDE 40

Predicting Text: Evaluation

  • BLEU: Geometric mean of modified n-gram precisions.

  • Word-level analog to Narrative Cloze.
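
For reference, a sentence-level BLEU score can be computed with an off-the-shelf implementation such as NLTK's (the toy sentences below are made up):

```python
# BLEU: geometric mean of modified n-gram precisions (NLTK also applies a
# brevity penalty). Toy single-reference example.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "jim landed on the ground".split()
hypothesis = "he landed safely on the ground".split()
score = sentence_bleu([reference], hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(round(score, 3))  # higher = more n-gram overlap with the reference
```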

40

SLIDE 41

Predicting Text: Systems

  • t1 -> t1: Copy/paste a sentence as its predicted successor.

  • e1 -> e2 -> t2: events to events to text.
  • t1 -> t2: text to text.

41

SLIDE 42

Results: Predicting Text

42

[Bar charts] BLEU and 1-BLEU for the t1 -> t1, e1 -> e2 -> t2, and t1 -> t2 systems. Values shown on the charts: BLEU 0.34, 1.88, 5.2; 1-BLEU 19.9, 22.6, 30.9.

SLIDE 43

Takeaways

  • In LSTM encoder-decoder event prediction…
  • Raw text models predict events about as well as event models.
  • Raw text models predict tokens better than event models.

43

SLIDE 44

Example Inferences

  • Input: “White died two days after Curly Bill shot him.”
  • Gold: “Before dying, White testified that he thought the pistol had accidentally discharged and that he did not believe that Curly Bill shot him on purpose.”

  • Inferred: “He was buried at <UNK> Cemetery.”

44

SLIDE 45

Example Inferences

  • Input: “As of October 1, 2008, <UNK> changed its company name to Panasonic Corporation.”
  • Gold: “<UNK> products that were branded ‘National’ in Japan are currently marketed under the ‘Panasonic’ brand.”

  • Inferred: “The company’s name is now <UNK>.”

45

SLIDE 46

Conclusions

  • For inferring events in text, text is about as good a representation as events (and doesn’t require a parser!).
  • Relation of sentence-level LM inferences to other NLP tasks is an exciting open question.

46

SLIDE 47

Thanks!

47