
slide-1
SLIDE 1

Statistical Script Learning with Recurrent Neural Nets

Karl Pichotta Dissertation Proposal December 17, 2015

1

slide-2
SLIDE 2

Motivation

  • Following the Battle of Actium, Octavian invaded Egypt. As he approached Alexandria, Antony's armies deserted to Octavian on August 1, 30 BC.

  • Did Octavian defeat Antony?

2

slide-3
SLIDE 3

Motivation

  • Following the Battle of Actium, Octavian invaded Egypt. As he approached Alexandria, Antony's armies deserted to Octavian on August 1, 30 BC.

  • Did Octavian defeat Antony?

3

slide-4
SLIDE 4

Motivation

  • Antony’s armies deserted to Octavian ⇒ Octavian defeated Antony

  • Not simply a paraphrase rule!
  • Need world knowledge.

4

slide-5
SLIDE 5

Scripts

  • Scripts: models of events in sequence.
  • Events don’t appear in text randomly, but according to world dynamics.

  • Scripts try to capture these dynamics.
  • Enable automatic inference of implicit events, given events in text (e.g. Octavian defeated Antony).

5

slide-6
SLIDE 6

Research Questions

  • How can Neural Nets improve automatic inference of events from documents?
  • Which models work best empirically?
  • Which types of explicit linguistic knowledge are

useful?

6

slide-7
SLIDE 7

Outline

  • Background
  • Completed Work
  • Proposed Work
  • Conclusion

7

slide-8
SLIDE 8

Outline

  • Background
  • Statistical Scripts
  • Recurrent Neural Nets




8

slide-9
SLIDE 9

Background: Statistical Scripts

  • Statistical Scripts: Statistical Models of Event Sequences.
  • Non-statistical scripts date back to the 1970s [Schank & Abelson 1977].
  • Statistical script learning is a small-but-growing subcommunity [e.g. Chambers & Jurafsky 2008].
  • Model the probability of an event given prior events.

9

slide-10
SLIDE 10

Background: Statistical Script Learning

10

Millions of Documents → NLP Pipeline (Syntax, Coreference) → Millions of Event Sequences → Train a Statistical Model

slide-11
SLIDE 11

Background: Statistical Script Inference

11

New Test Document → NLP Pipeline (Syntax, Coreference) → Single Event Sequence → Query Trained Statistical Model → Inferred Probable Events

slide-12
SLIDE 12

Background: Statistical Scripts

  • Central Questions:
  • What is an “Event?” (Part 1 of completed work)
  • Which models work well? (Part 2 of completed work)

  • How to evaluate?
  • How to incorporate into end tasks?

12

slide-13
SLIDE 13

Outline

  • Background
  • Statistical Scripts
  • Recurrent Neural Nets




13

slide-14
SLIDE 14

Background: RNNs

  • Recurrent Neural Nets (RNNs): Neural Nets with cycles in computation graph.
  • RNN Sequence Models: Map inputs x1, …, xt to outputs o1, …, ot via learned latent vector states z1, …, zt (sketch below).

14
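Concretely, one step of a simple recurrence of this kind might look like the following NumPy sketch; the weight names and shapes are illustrative, not from the proposal:

    import numpy as np

    def softmax(v):
        e = np.exp(v - v.max())
        return e / e.sum()

    def rnn_step(x_t, z_prev, W_xz, W_zz, W_zo, b_z, b_o):
        # New latent state z_t from the current input and previous state,
        # then an output distribution o_t read off the state.
        z_t = np.tanh(W_xz @ x_t + W_zz @ z_prev + b_z)
        o_t = softmax(W_zo @ z_t + b_o)
        return z_t, o_t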

slide-15
SLIDE 15

Background: RNNs

[Figure: simple recurrent network; Elman 1990]

15

slide-16
SLIDE 16

Background: RNNs

  • Hidden Unit can be arbitrarily complicated, as long as we can calculate gradients!

16
slide-17
SLIDE 17

Background: LSTMs

  • Long Short-Term Memory (LSTM): More complex hidden RNN unit [Hochreiter & Schmidhuber, 1997].

  • Explicitly addresses two issues:
  • Vanishing Gradient Problem.
  • Long-Range Dependencies.

17

slide-18
SLIDE 18

Background: LSTM

gt = tanh(Wx,g xt + Wz,g zt−1 + bg)
it = σ(Wx,i xt + Wz,i zt−1 + bi)
ft = σ(Wx,f xt + Wz,f zt−1 + bf)
ot = σ(Wx,o xt + Wz,o zt−1 + bo)
mt = ft ⊙ mt−1 + it ⊙ gt
zt = ot ⊙ tanh(mt)

18
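These updates transcribe directly into NumPy; a minimal sketch (the dictionary-of-weights layout is just for readability):

    import numpy as np

    def sigmoid(v):
        return 1.0 / (1.0 + np.exp(-v))

    def lstm_step(x_t, z_prev, m_prev, W, b):
        g = np.tanh(W['xg'] @ x_t + W['zg'] @ z_prev + b['g'])  # candidate values
        i = sigmoid(W['xi'] @ x_t + W['zi'] @ z_prev + b['i'])  # input gate
        f = sigmoid(W['xf'] @ x_t + W['zf'] @ z_prev + b['f'])  # forget gate
        o = sigmoid(W['xo'] @ x_t + W['zo'] @ z_prev + b['o'])  # output gate
        m_t = f * m_prev + i * g    # memory cell: gated blend of old and new
        z_t = o * np.tanh(m_t)      # hidden state exposed to the next step
        return z_t, m_t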
slide-19
SLIDE 19

Background: LSTMs

  • LSTMs successful for many hard NLP tasks recently:
  • Machine Translation [Kalchbrenner and Blunsom 2013, Bahdanau et al. 2015].
  • Captioning Images/Videos [Donahue et al. 2015, Venugopalan et al. 2015].
  • Language Modeling [Sundermeyer et al. 2012, Kim et al. 2016].
  • Question Answering [Hermann et al. 2015, Gao et al. 2015].

19

slide-20
SLIDE 20

Outline

  • Background
  • Completed Work
  • Proposed Work
  • Conclusion

20

slide-21
SLIDE 21

Outline

  • Background
  • Completed Work
  • Multi-Argument Events
  • RNN Scripts
21
slide-22
SLIDE 22

Outline

  • Background
  • Completed Work
  • Multi-Argument Events
  • RNN Scripts

22

slide-23
SLIDE 23

Events

  • To model “events,” we need a formal definition.
  • For us, it will be variations of “verbs with participants.”
23

slide-24
SLIDE 24
Pair Events

  • Other Methods use (verb, dependency) pair events [Chambers & Jurafsky 2008; 2009; Jans et al. 2012; Rudinger et al. 2015].
  • An event is a (vb, dep) pair: a Verb plus the Syntactic Dependency an entity holds to it.
  • Captures how an entity relates to a verb.

24

slide-25
SLIDE 25

Pair Events

  • Napoleon remained married to Marie Louise, though she did not join him in exile on Elba and thereafter never saw her husband again.

N.: (remain_married, subj), (not_join, obj), (not_see, obj)
M.L.: (remain_married, prep), (not_join, subj), (not_see, subj)

  • …Doesn’t capture interactions between entities.

25

slide-26
SLIDE 26

Multi-Argument Events

[P. & Mooney, EACL 2014]

  • Use more complex events with multiple entities.
  • Learning is more complicated…
  • …But inferred events are quantitatively better.

26

slide-27
SLIDE 27
Multi-Argument Events

  • We represent events as tuples: v(es, eo, ep), with a Verb v, Subject Entity es, Object Entity eo, and Prepositional Entity ep.
  • Entities may be null (“·”).
  • Entities have only coreference information.

27

slide-28
SLIDE 28

Multi-Argument Events

  • Napoleon remained married to Marie Louise, though she did not join him in exile on Elba and thereafter never saw her husband again.

remain_married(N, ·, to ML); not_join(ML, N, ·); not_see(ML, N, ·)

  • Incorporate entities into events as variables.
  • Captures pairwise interaction between entities.

28
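As a sketch, this tuple representation might be coded like so (names are illustrative; “·” marks a null argument, as on the slide):

    from collections import namedtuple

    NULL = '·'  # null argument marker
    Event = namedtuple('Event', ['verb', 'subj', 'obj', 'prep'])

    # The Napoleon / Marie Louise sequence from above:
    events = [
        Event('remain_married', 'N', NULL, 'ML'),
        Event('not_join', 'ML', 'N', NULL),
        Event('not_see', 'ML', 'N', NULL),
    ]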

slide-29
SLIDE 29

Entity Rewriting

remain_married(N, ·, to ML) not_join(ML, N, ·) not_see(ML, N, ·)

  • not_join(x, y, ·) should predict not_see(x, y, ·) for all x, y.
  • During learning, canonicalize co-occurring events:
  • Rename variables to a small fixed set.
  • Add co-occurrences of all consistent rewritings of the events (sketch below).

29
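A minimal sketch of the canonicalization step, assuming events are tuples of a verb plus entity arguments as above:

    def canonicalize(events, null='·'):
        # Rename entity variables to a small fixed set, in order of first
        # mention, so counts generalize across entity names.
        # (Assumes few distinct entities per window.)
        names = {}
        out = []
        for verb, *args in events:
            renamed = []
            for a in args:
                if a == null:
                    renamed.append(null)
                else:
                    if a not in names:
                        names[a] = 'xyzuvw'[len(names)]
                    renamed.append(names[a])
            out.append((verb, *renamed))
        return out

    # canonicalize([('not_join', 'ML', 'N', '·'), ('not_see', 'ML', 'N', '·')])
    # -> [('not_join', 'x', 'y', '·'), ('not_see', 'x', 'y', '·')]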

slide-30
SLIDE 30

Learning & Inference

  • Learning: From large corpus, count N(a,b), the number of times event b occurs after event a with at most two intervening events (“2-skip bigram” counts; sketch below).

  • Inference: Infer event b at timestep t according to:

S(b) = Σ_{i=1..t} log P(b | a_i) + Σ_{i=t+1..ℓ} log P(a_i | b)

  • First term: prob. of b following the events before t.
  • Second term: prob. of b preceding the events after t.

30

[Jans et al. 2012]
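Both steps fit in a short sketch, assuming events are hashable tuples; the add-alpha smoothing for the conditional probabilities is illustrative, not the exact estimator used:

    from collections import defaultdict
    from math import log

    def skip_bigram_counts(sequences, k=2):
        # N[a][b]: times event b occurs after event a with at most
        # k intervening events.
        N = defaultdict(lambda: defaultdict(int))
        for seq in sequences:
            for i, a in enumerate(seq):
                for b in seq[i + 1 : i + k + 2]:
                    N[a][b] += 1
        return N

    def log_prob(N, a, b, vocab_size, alpha=1.0):
        # Add-alpha smoothed estimate of log P(b | a).
        return log((N[a][b] + alpha) / (sum(N[a].values()) + alpha * vocab_size))

    def score(b, seq, t, N, vocab_size):
        # S(b): b following the events before position t,
        # plus b preceding the events after t.
        return (sum(log_prob(N, a, b, vocab_size) for a in seq[:t]) +
                sum(log_prob(N, b, a, vocab_size) for a in seq[t:]))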

slide-31
SLIDE 31

Evaluation

  • “Narrative Cloze” (Chambers & Jurafsky, 2008): from an unseen document, hold one event out, try to infer it given remaining document.
  • “Recall at k” (Jans et al., 2012): make k top inferences, calculate recall of held-out events (sketch below).
  • We evaluate on a number of metrics, but only present one here for clarity (different results are comparatively similar).

31
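For example, recall at k over a set of cloze instances might be computed as in this sketch:

    def recall_at_k(ranked, held_out, k=10):
        # Fraction of cloze instances whose held-out event appears
        # among the model's top-k inferences.
        hits = sum(gold in preds[:k] for preds, gold in zip(ranked, held_out))
        return hits / len(held_out)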

slide-32
SLIDE 32

Experiments

  • Train on 1.1M NYT articles (Gigaword).
  • Use Stanford Parser/Coref.

32

slide-33
SLIDE 33

Results: Pair Events

Recall at 10 for inferring (verb, dependency) events:

Unigram: 0.297
Single-Protagonist: 0.282
Joint: 0.336

33

slide-34
SLIDE 34

Results: Multi-Argument Events

Recall at 10 for inferring multi-argument events:

Unigram: 0.216
Multi-Protagonist: 0.209
Joint: 0.245

34

slide-35
SLIDE 35

Outline

  • Background
  • Completed Work
  • Multi-Argument Events
  • RNN Scripts

35

slide-36
SLIDE 36

Co-occurrence Model Shortcomings

  • The co-occurrence-based method has shortcomings:
  • “x married y” and “x is married to y” are unrelated events.
  • Nouns are ignored (she sits on the chair vs. she sits on the board of directors).
  • Relative position of events in sequence is ignored (only one notion of co-occurrence).

36

slide-37
SLIDE 37

LSTM Script models

[P. & Mooney, AAAI 2016]

  • Feed event sequences into LSTM sequence model.
  • To infer events, have the model generate likely events from sequence.

  • Can input noun info, coref info, or both.

37

slide-38
SLIDE 38

LSTM Script models

  • In April 1866 Congress again passed the bill. Johnson again vetoed it.

[pass, congress, bill, in, april]; [veto, johnson, it, ·, ·]

38

slide-39
SLIDE 39

LSTM Script models

  • In April 1866 Congress again passed the bill. Johnson again vetoed it.

[pass, congress, bill, in, april]; [veto, johnson, it, ·, ·]

39

slide-40
SLIDE 40

LSTM Script models

  • In April 1866 Congress again passed the bill. Johnson again vetoed it.

[Diagram: the two events encoded component-by-component as (v, es, eo, ep, p) sequences, with entity arguments carrying coreference information.]

40

slide-41
SLIDE 41

LSTM Script models

  • Train on English Wikipedia.
  • Run Stanford parser, coref; extract sequences of events.
  • Train LSTM using Batch Stochastic Gradient Descent with Momentum.
  • Minimize cross-entropy loss of predictions.
  • Backpropagate error through layers and through time.
  • To infer new events, just have the LSTM generate the next five outputs with highest probability, using beam search (sketch below).

41
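A sketch of that beam-search step. The model interface here (step(state, last_token) returning a new state and (token, log-prob) pairs) is hypothetical, standing in for the trained LSTM:

    import heapq

    def beam_search(model, state, steps=5, width=10):
        # Each hypothesis: (cumulative log-prob, tokens so far, model state).
        beam = [(0.0, [], state)]
        for _ in range(steps):
            candidates = []
            for logp, toks, st in beam:
                new_st, dist = model.step(st, toks[-1] if toks else None)
                for tok, tok_logp in dist:
                    candidates.append((logp + tok_logp, toks + [tok], new_st))
            beam = heapq.nlargest(width, candidates, key=lambda c: c[0])
        return beam  # highest-probability next outputs first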

slide-42
SLIDE 42

Results: Predicting Verbs & Coreference Info

Recall at 25 for inferring verbs & coref info:

Unigram: 0.101
Joint: 0.124
LSTM coref: 0.145
LSTM coref+noun: 0.152

42

slide-43
SLIDE 43

Results: Predicting Verbs & Nouns

Recall at 25 for inferring verbs & nouns:

Unigram: 0.025
Joint: 0.037
LSTM noun: 0.054
LSTM coref+noun: 0.061

43

slide-44
SLIDE 44

Human Evaluations

  • Solicit judgments on individual inferences on Amazon Mechanical Turk.
  • Have annotators rate inferences from 1–5 (or mark “Nonsense,” scored 0).
  • More interpretable.

44

slide-45
SLIDE 45

Results: Crowdsourced Eval

Filtered human judgments of top inferences (5 max):

Random: 0.87
Joint Entity: 2.87
Joint Noun: 2.21
LSTM Entity: 3.08
LSTM Noun: 3.67

45

slide-46
SLIDE 46

Annotator Examples

46

As a result, during the October municipal election, serious violence broke out on polling day, with shots exchanged by competing mobs.

Random: “appeal to X” (2.7)
Joint Ent: “X has a X” (0.3)
Joint Noun: “known as X” (3.3)
LSTM Ent: “X has a X” (0.3)
LSTM Noun: “X was arrested” (4.3)
slide-47
SLIDE 47

Annotator Examples

47

Today the remaining community has shrunk to about 50 mostly elderly people. The Kehila Kedosha Yashan Synagogue remains locked, only opened for visitors on request. Emigrant Romaniotes return every summer and open the old synagogue.

Random: “all of the X’s men were lost” (2.0)
Joint Ent: “X found” (2.0)
Joint Noun: “X wrote” (2.0)
LSTM Ent: “build a X” (1.7)
LSTM Noun: “synagogue was closed” (3.0)
slide-48
SLIDE 48

Generating “Stories”

  • Can generate “stories” by starting with <S> beginning-of-sequence pseudo-event.
  • Sample from distribution of initial event components (first verb).
  • Take sample as first-step input, sample distribution of next components.
  • Repeat until sampling </S> end-of-sequence token (sketch below).

48
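A sketch of that sampling loop, again assuming a hypothetical model interface (initial_state(), step(), and a vocab attribute):

    import numpy as np

    def sample_story(model, max_len=100):
        # Sample event components until the end-of-sequence token.
        state, tokens = model.initial_state(), ['<S>']
        while tokens[-1] != '</S>' and len(tokens) < max_len:
            state, probs = model.step(state, tokens[-1])  # distribution over vocab
            tokens.append(np.random.choice(model.vocab, p=probs))
        return [t for t in tokens if t not in ('<S>', '</S>')]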

slide-49
SLIDE 49

Generated “Stories”

Generated event tuples → English descriptions:

(bear, ·, ·, kingdom, into) → Born into a kingdom, …
(attend, she, brown, graduation, after) → …she attended Brown after graduation.
(earn, she, master, university, from) → She earned her Masters from the University.
(admit, ·, she, university, to) → She was admitted to a University.
(receive, she, bachelor, university, from) → She had received a bachelors from a University.
(involve, ·, she, production, in) → She was involved in the production.
(represent, she, company, ·, ·) → She represented the company.

49

slide-50
SLIDE 50

Outline

  • Background
  • Completed Work
  • Proposed Work
  • Conclusion

50

slide-51
SLIDE 51

Research Questions

  • How can Neural Nets improve automatic inference of events from documents?
  • Which models work best empirically?
  • Which types of explicit linguistic knowledge are

useful?

51

slide-52
SLIDE 52

Outline

  • Background
  • Completed Work
  • Proposed Work
  • Conclusion

52

slide-53
SLIDE 53

Outline

  • Background
  • Completed Work
  • Proposed Work




  • Better Models
  • Better Events
  • Discourse Relations
  • Bonus
  • Coreference
  • Question-Answering

53

slide-54
SLIDE 54

Outline

  • Background
  • Completed Work
  • Proposed Work




  • Better Models
  • Better Events
  • Discourse Relations
  • Bonus
  • Coreference
  • Question-Answering

54

slide-55
SLIDE 55

Better Models

  • Other Neural Approaches may work better than LSTM.

55

slide-56
SLIDE 56

Better Models (1/3)

  • Different kinds of RNN:
  • Gated Recurrent Units (GRUs) [Cho et al. 2014]
  • Grid LSTM [Kalchbrenner et al. 2015]
  • Gated Feedback Recurrent Units [Chung et al. 2015]
  • Replacing one black box with another.

56

slide-57
SLIDE 57

Better Models (2/3)

  • Convolutional Neural Networks (CNNs):
  • Learn 1D convolution operators to apply to event sequences (sketch below).
  • Arrive ultimately at vector predicting next event(s).
  • Recent success with NLP classification tasks [Kalchbrenner, Grefenstette, & Blunsom 2014; Kim 2014; Zhang, Zhao, & LeCun 2015].

57
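As a sketch, one such learned filter amounts to a 1-D convolution over event embeddings followed by max-over-time pooling (shapes illustrative):

    import numpy as np

    def conv1d_feature(event_vecs, filt, bias=0.0):
        # event_vecs: (T, d) sequence of event embeddings; filt: (w, d) filter.
        # Returns one pooled feature: max activation over all length-w windows.
        T, _ = event_vecs.shape
        w = filt.shape[0]
        acts = [np.tanh(np.sum(event_vecs[i:i + w] * filt) + bias)
                for i in range(T - w + 1)]
        return max(acts)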

slide-58
SLIDE 58

Better Models (3/3)

  • Attention-based Models: contain explicit notion of where in input is most predictively useful (where to “pay attention”).
  • Recently shown to be useful in NLP tasks (Bahdanau et al. 2015, Hermann et al. 2015).

58

slide-59
SLIDE 59

Better Models (3/3)

Churchill had suffered a mild stroke while on holiday in the south of France in the summer of 1949. The strain of carrying the Premiership and Foreign Office contributed to his stroke at 10 Downing Street after dinner on the evening of 23 June 1953. Despite being partially paralysed down one side, he presided over a Cabinet meeting the next morning without anybody noticing his incapacity. Thereafter his condition deteriorated, and it was thought that he might not survive the weekend. Had Eden been fit, Churchill's premiership would most likely have been over. News of this was kept from the public and from Parliament, who were told that Churchill was suffering from exhaustion. He went to his country home, Chartwell, to recuperate, and by the end of June he astonished his doctors by being able, dripping with perspiration, to lift himself upright from his chair. He joked that news of his illness had chased the trial of the serial killer John Christie off the front pages. Churchill was still keen to pursue a meeting with the Soviets and was open to the idea of a reunified Germany. He refused to condemn the Soviet crushing of East Germany, commenting on 10 July 1953 that "The Russians were surprisingly patient about the disturbances in East Germany". He thought this might have been the reason for the removal of Beria. Churchill returned to public life in October 1953 to make a speech at the Conservative Party conference at Margate.

59

slide-60
SLIDE 60

Better Models (3/3)

[Same Churchill passage as the previous slide.]

60

slide-61
SLIDE 61

Better Models (3/3)

[Same Churchill passage as the previous slide.]

61

slide-62
SLIDE 62

Better Models (3/3)

  • An attention-based script system would add an explicit distribution of predictive utility over observed events (sketch below).

62
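A sketch of one way to realize this: bilinear attention scores over the vectors of observed events, normalized into a distribution and used to form a weighted summary. The bilinear form is an assumption for illustration, not a committed design:

    import numpy as np

    def attend(event_vecs, query, W):
        # event_vecs: (T, d) observed-event vectors; query: (d,); W: (d, d).
        # Returns (attention weights over events, weighted summary vector).
        scores = event_vecs @ W @ query          # one score per observed event
        weights = np.exp(scores - scores.max())  # softmax into a distribution
        weights /= weights.sum()
        return weights, weights @ event_vecs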

slide-63
SLIDE 63

Outline

  • Background
  • Completed Work
  • Proposed Work




  • Better Models
  • Better Events
  • Discourse Relations
  • Bonus
  • Coreference
  • Question-Answering

63

slide-64
SLIDE 64

Raw Text v. Linguistics

64

Only raw text ← ? → Some Linguistic Structure ← ? → Arbitrarily complex NLP

slide-65
SLIDE 65

Better Events

  • Events in P. & Mooney (2016) are v(es, eo, ep, p) 5-tuples.
  • This throws away a lot of important information. Importance is an empirical question, but should be investigated.
  • We will investigate a number of ways to add information to events (enumerated next…).

65

slide-66
SLIDE 66

Better Events (1/4)

  • Fixed-arity events throw away multiple Prepositional Phrases.
  • In 1697, Peter the Great traveled incognito to Europe on an 18-month journey with a large Russian delegation to seek the aid of the European monarchs.
  • Is presently: (travel, peter, ·, (in 1697)).
  • Could be something like: (travel, peter, ·, (in 1697), (to europe), (on journey), (with delegation)).

66

slide-67
SLIDE 67

Better Events (2/4)

  • Many important modifiers that aren’t grammatically prepositions.
  • King Frederick William I nearly executed his son for desertion.
  • Without “nearly” we make drastically wrong inferences!

67

slide-68
SLIDE 68

Better Events (3/4)

  • Head Nouns of Arguments are Insufficient:
  • Martin Luther wrote to his bishop protesting the sale of indulgences.
  • If “Sale of Indulgences” is just “sale”…
  • …we can’t conclude “Luther disapproved of indulgences.”

68

slide-69
SLIDE 69

Better Events (4/4)

  • Nominal (noun) events are very common:
  • In the years following his death, a series of civil wars tore Alexander’s empire apart.

  • Noun events are crucial for inferring events.

69

slide-70
SLIDE 70

Discourse Relations

  • Relations between events are important.
  • The Roman cavalry won an early victory by swiftly routing the Carthaginian horses.
  • Because the local authorities had forbidden students from forming organizations, Princip and other members of Young Bosnia met in secret.
  • Connectives express relations between events, and are likely useful for event inference.

70

slide-71
SLIDE 71

Discourse Parsers

  • Off-the-shelf Discourse Parsers have been trained to annotate discursively important relations between spans of text.
  • Trained on one of two Discourse treebanks [RST Treebank and Penn Discourse Treebank].
  • Label spans of text as being, e.g., causally or temporally related.

71

slide-72
SLIDE 72

Using Discourse Parsers

  • Could incorporate shallow discourse structure into events (i.e. “these two events are related by this discourse relation”).
  • Could also hypothetically incorporate discourse structure directly into structure of Neural Net.
  • RST parses are trees; could use recently-introduced Tree-LSTMs [Tai et al. 2015] on the topology.

72

slide-73
SLIDE 73

Induced Connectives

  • Could also induce a closed class of connectives (e.g. “before,” “because of,” …) in an unsupervised manner.
  • A number of ways to integrate into RNN sequence model.

73

slide-74
SLIDE 74

Outline

  • Background
  • Completed Work
  • Proposed Work




  • Better Models
  • Better Events
  • Discourse Relations
  • Bonus
  • Coreference
  • Question-Answering

74

slide-75
SLIDE 75
Scripts for Coreference

  • Voltaire, pretending to work in Paris as an assistant to a notary, spent much of his time writing poetry. When his father found out, he sent Voltaire to study law. Nevertheless, he continued to write…
  • Unifying “he” and “Voltaire” could be done with script knowledge.
  • Script information can improve coreference.

75

slide-76
SLIDE 76
Scripts for Coreference

  • A number of conceivable ways to incorporate scripts into ML-based coreference engines:
  • Add script probability, assuming a coreference decision is made, as feature to coref system.
  • Add probability of sequence of all events involving an entity as a feature.

76

slide-77
SLIDE 77
Scripts for Question-Answering

  • Scripts are intuitively useful for Question-Answering Systems.
  • Add confident script inferences to Knowledge Base about document.
  • Would allow inferences about implicit events.

77

slide-78
SLIDE 78

Conclusion

  • LSTMs do much better than Markov-like statistical script systems.

  • We propose:
  • Using better neural models.
  • Better events.
  • More discourse-awareness.
  • Trying to improve coref.
  • Improving question-answering.

78

slide-79
SLIDE 79

Thanks!

79