SLIDE 1

RECURSIVE DEEP MODELS FOR SEMANTIC COMPOSITIONALITY1

Zhicong Lu
DGP Lab
luzhc@dgp.toronto.edu

1Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher Manning, Andrew Ng and Christopher Potts. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. Conference on Empirical Methods in Natural Language Processing (EMNLP 2013).

SLIDE 2

RECURSIVE DEEP MODELS FOR SEMANTIC COMPOSITIONALITY

OVERVIEW

▸ Background
▸ Stanford Sentiment Treebank
▸ Recursive Neural Models
▸ Experiments

SLIDE 3

BACKGROUND

SENTIMENT ANALYSIS

▸ Identify and extract subjective information
▸ Crucial to business intelligence, stock trading, …


1Adapted from: http://www.rottentomatoes.com/

SLIDE 4

BACKGROUND

RELATED WORK

▸ Semantic Vector Spaces
  ▸ Distributional similarity of single words (e.g., tf-idf)
  ▸ Do not capture the differences between antonyms
▸ Neural word vectors (Bengio et al., 2003)
  ▸ Unsupervised
  ▸ Capture distributional similarity
  ▸ Need fine-tuning for sentiment detection

SLIDE 5

BACKGROUND

RELATED WORK

▸ Compositionality in Vector Spaces
  ▸ Captures two-word compositions
  ▸ Has not been validated on larger corpora
▸ Logical Form
  ▸ Maps sentences to logical form
  ▸ Could only capture sentiment distributions using separate mechanisms beyond the currently used logical forms

SLIDE 6

BACKGROUND

RELATED WORK

▸ Deep Learning
  ▸ Recursive auto-associative memories
  ▸ Restricted Boltzmann machines, etc.

SLIDE 7

BACKGROUND

SENTIMENT ANALYSIS AND BAG-OF-WORDS MODELS1

▸ Most methods use bag of words + linguistic features/processing/lexica
▸ Problem: such methods can't distinguish differences in sentiment caused by word order (see the sketch below):
  ▸ + white blood cells destroying an infection
  ▸ - an infection destroying white blood cells
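To make the limitation concrete, here is a minimal sketch (the two phrases are from the slide; everything else is illustrative) showing that a pure bag-of-words representation assigns identical features to both phrases, so no downstream classifier can separate them:

from collections import Counter

pos = "white blood cells destroying an infection"
neg = "an infection destroying white blood cells"

# A bag-of-words model discards word order, so both phrases map to the
# exact same multiset of tokens and therefore get the same prediction.
assert Counter(pos.split()) == Counter(neg.split())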


1Adapted from Richard Socher’s slides: https://cs224d.stanford.edu/lectures/CS224d-Lecture10.pdf

SLIDE 8

BACKGROUND

SENTIMENT DETECTION AND BAG-OF-WORDS MODELS1

▸ Sentiment detection seems easy for some cases
▸ Detection accuracy for longer documents reaches 90%
▸ Many easy cases, such as horrible or awesome
▸ For a dataset of single-sentence movie reviews (Pang and Lee, 2005), accuracy never reached >80% for >7 years
▸ Hard cases require actual understanding of negation and its scope + other semantic effects


1Adapted from Richard Socher’s slides: https://cs224d.stanford.edu/lectures/CS224d-Lecture10.pdf

SLIDE 9

BACKGROUND

TWO MISSING PIECES FOR IMPROVING SENTIMENT DETECTION

▸ Large and labeled compositional data
  ▸ Sentiment Treebank
▸ Better models for semantic compositionality
  ▸ Recursive Neural Networks

SLIDE 10

RECURSIVE DEEP MODELS FOR SEMANTIC COMPOSITIONALITY

STANFORD SENTIMENT TREEBANK


1Adapted from http://nlp.stanford.edu/sentiment/treebank.html

SLIDE 11

STANFORD SENTIMENT TREEBANK

DATASET

▸ 215,154 phrases labeled via Amazon Mechanical Turk
▸ Parse trees of 11,855 sentences from movie reviews
▸ Allows for a complete analysis of the compositional effects of sentiment in language

SLIDE 12

STANFORD SENTIMENT TREEBANK

FINDINGS

▸ Stronger sentiment often builds up in longer phrases, and the majority of the shorter phrases are neutral
▸ The extreme values were rarely used, and the slider was not often left in between the ticks

SLIDE 13

STANFORD SENTIMENT TREEBANK

BETTER DATASET HELPED1

▸ Performance improved by 2-3%
▸ Hard negation cases are still mostly incorrect
▸ Need a more powerful model


Positive/negative full sentence classification

1Adapted from Richard Socher’s slides: https://cs224d.stanford.edu/lectures/CS224d-Lecture10.pdf

SLIDE 14

RECURSIVE NEURAL MODELS

RECURSIVE NEURAL MODELS


Example of the Recursive Neural Tensor Network accurately predicting 5 sentiment classes, very negative to very positive (– –, –, 0, +, + +), at every node of a parse tree and capturing the negation and its scope in this sentence.

SLIDE 15

RECURSIVE NEURAL MODELS

RECURSIVE NEURAL MODELS

▸ RNN: Recursive Neural Network
▸ MV-RNN: Matrix-Vector RNN
▸ RNTN: Recursive Neural Tensor Network

SLIDE 16

RECURSIVE NEURAL MODELS

OPERATIONS IN COMMON

▸ Word vector representations
▸ Classification

▸ Word vectors: d-dimensional, initialized randomly from U(−r, r) with r = 0.0001
▸ Word embedding matrix L: all word vectors stacked together, trained jointly with the compositionality models
▸ Posterior probability over labels given a word vector a: y^a = softmax(W_s a), where W_s is the sentiment classification matrix (sketched below)
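A minimal NumPy sketch of these shared operations; the dimensionality d = 25, the vocabulary size, and the word index are illustrative assumptions, not values from the slides:

import numpy as np

d, vocab_size, n_classes = 25, 10000, 5   # d and vocab size are illustrative
r = 0.0001
rng = np.random.default_rng(0)

# Word embedding matrix L: one d-dimensional column per vocabulary word,
# initialized from U(-r, r); trained jointly with the composition model.
L = rng.uniform(-r, r, size=(d, vocab_size))

# Sentiment classification matrix W_s maps a node vector to 5 class scores.
W_s = rng.uniform(-r, r, size=(n_classes, d))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

a = L[:, 42]              # vector of an arbitrary word (index 42 is made up)
y_a = softmax(W_s @ a)    # posterior over the 5 sentiment labels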

SLIDE 17

RECURSIVE NEURAL MODELS

RECURSIVE NEURAL MODELS1

▸ Focused on compositional representation learning of hierarchical structure, features and prediction
▸ Different combinations of
  ▸ Training Objective
  ▸ Composition Function
  ▸ Tree Structure


1Adapted from Richard Socher’s slides: https://cs224d.stanford.edu/lectures/CS224d-Lecture10.pdf

SLIDE 18

RECURSIVE NEURAL MODELS

STANDARD RECURSIVE NEURAL NETWORK

▸ Composition function: p = f(W[b; c]) (sketched below)
  ▸ f = tanh: the standard element-wise nonlinearity
  ▸ W (of size d × 2d): the main parameter to learn
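A sketch of this composition step in NumPy; d = 25 and the random child vectors are illustrative assumptions:

import numpy as np

d = 25
rng = np.random.default_rng(0)
W = rng.uniform(-0.0001, 0.0001, size=(d, 2 * d))  # main parameter to learn

def compose(b, c):
    # p = f(W [b; c]), with f = tanh applied element-wise
    return np.tanh(W @ np.concatenate([b, c]))

b, c = rng.standard_normal(d), rng.standard_normal(d)  # child vectors
p = compose(b, c)   # parent vector, reused as a child further up the tree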

SLIDE 19

RECURSIVE NEURAL MODELS

MV-RNN: MATRIX-VECTOR RNN

▸ Composition function: p = f(W[Cb; Bc]), P = W_M[B; C] (sketched below)
  ▸ Every word and phrase is represented by both a vector and a matrix
  ▸ Each child's matrix transforms its sibling's vector before composition


Adapted from Richard Socher’s slides: https://cs224d.stanford.edu/lectures/CS224d-Lecture10.pdf
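A NumPy sketch of the MV-RNN composition under these definitions; the dimensions and the identity initialization of the word matrices are illustrative assumptions:

import numpy as np

d = 25
rng = np.random.default_rng(0)
W = rng.standard_normal((d, 2 * d)) * 0.01     # vector composition weights
W_M = rng.standard_normal((d, 2 * d)) * 0.01   # matrix composition weights

def compose_mv(b, B, c, C):
    # p = f(W [Cb; Bc]): each child's matrix modifies the sibling's vector
    p = np.tanh(W @ np.concatenate([C @ b, B @ c]))
    # P = W_M [B; C]: the parent's matrix, for use further up the tree
    P = W_M @ np.vstack([B, C])
    return p, P

b, c = rng.standard_normal(d), rng.standard_normal(d)
B, C = np.eye(d), np.eye(d)   # word matrices, initialized at the identity
p, P = compose_mv(b, B, c, C)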

SLIDE 20

RECURSIVE NEURAL MODELS

RECURSIVE NEURAL TENSOR NETWORK

▸ More expressive than previous RNNs
▸ Basic idea: allow more interactions of vectors


▸ Composition function: p = f([b; c]^T V^{[1:d]} [b; c] + W[b; c]) (sketched below)
  ▸ The tensor V can directly relate the input vectors
  ▸ Each slice V^[k] of the tensor captures a specific type of composition
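A NumPy sketch of the RNTN composition; the einsum computes one bilinear form per tensor slice (dimensions and initialization are illustrative assumptions):

import numpy as np

d = 25
rng = np.random.default_rng(0)
V = rng.standard_normal((d, 2 * d, 2 * d)) * 0.01  # d slices, each 2d x 2d
W = rng.standard_normal((d, 2 * d)) * 0.01

def compose_rntn(b, c):
    x = np.concatenate([b, c])
    # k-th output: x^T V^[k] x, i.e., one bilinear interaction per slice
    tensor_term = np.einsum('i,kij,j->k', x, V, x)
    return np.tanh(tensor_term + W @ x)

b, c = rng.standard_normal(d), rng.standard_normal(d)
p = compose_rntn(b, c)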

SLIDE 21

RECURSIVE NEURAL MODELS

TENSOR BACKPROP THROUGH STRUCTURE

▸ Minimizing cross-entropy error (plus L2 regularization): E(θ) = −Σ_i Σ_j t_j^i log y_j^i + λ‖θ‖²
▸ Standard softmax error vector: δ^{i,s} = (W_s^T (y^i − t^i)) ⊗ f′(x^i), with ⊗ the element-wise product
▸ Update for each slice: each node with stacked children [b; c] contributes ∂E/∂V^[k] = δ_k [b; c][b; c]^T (sketched below)
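A sketch of these three quantities at a single tree node, assuming f = tanh; the shapes and random inputs are illustrative assumptions:

import numpy as np

d, n_classes = 25, 5
rng = np.random.default_rng(0)
W_s = rng.standard_normal((n_classes, d)) * 0.01

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = rng.standard_normal(d)       # node's pre-activation
p = np.tanh(x)                   # node vector
t = np.eye(n_classes)[3]         # one-hot target label

y = softmax(W_s @ p)
cross_entropy = -np.sum(t * np.log(y))          # error at this node

# Softmax error vector, with f'(x) = 1 - tanh(x)^2:
delta = (W_s.T @ (y - t)) * (1 - np.tanh(x) ** 2)

# This node's contribution to the gradient of every tensor slice:
bc = rng.standard_normal(2 * d)                 # children stacked, [b; c]
dV = np.einsum('k,i,j->kij', delta, bc, bc)     # dE/dV[k] = delta_k [b;c][b;c]^T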

SLIDE 22

RECURSIVE NEURAL MODELS

TENSOR BACKPROP THROUGH STRUCTURE

▸ Main backprop rule to pass error down from the parent: δ^{down} = (W^T δ^{com} + S) ⊗ f′([b; c]), where S = Σ_k δ_k^{com} (V^[k] + (V^[k])^T)[b; c] (sketched below)
▸ Add the errors from the parent and from the current node's softmax
▸ The full derivative for slice V^[k] is the sum of its derivatives at every node
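A NumPy sketch of this downward pass at one node; the parameter shapes and random inputs are illustrative assumptions, and the two halves of the result go to the left and right child:

import numpy as np

d = 25
rng = np.random.default_rng(0)
W = rng.standard_normal((d, 2 * d)) * 0.01
V = rng.standard_normal((d, 2 * d, 2 * d)) * 0.01

def error_down(delta_com, bc):
    # S = sum_k delta_k (V[k] + V[k]^T) [b; c]: the tensor's share of the error
    S = np.einsum('k,kij,j->i', delta_com, V + V.transpose(0, 2, 1), bc)
    # (W^T delta_com + S) * f'([b; c]); for f = tanh, f' = 1 - f^2, so the
    # derivative can be computed from the child activations themselves
    return (W.T @ delta_com + S) * (1 - bc ** 2)

delta_com = rng.standard_normal(d)   # parent error plus this node's softmax error
bc = np.tanh(rng.standard_normal(2 * d))        # children's vectors, [b; c]
delta_children = error_down(delta_com, bc)      # split into halves for b and c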

SLIDE 23

EXPERIMENTS

RESULTS ON TREEBANK

▸ Fine-grained and Positive/Negative results

SLIDE 24

EXPERIMENTS

NEGATION RESULTS

SLIDE 25

EXPERIMENTS

NEGATION RESULTS

▸ Negating Positive

SLIDE 26

EXPERIMENTS

NEGATION RESULTS

▸ Negating Negative
▸ When negative sentences are negated, the overall sentiment should become less negative, but not necessarily positive
▸ Positive activation should increase

SLIDE 27

EXPERIMENTS

Examples of n-grams for which the RNTN predicted the most positive and most negative responses

SLIDE 28

EXPERIMENTS

Average ground truth sentiment of top 10 most positive n-grams at various n. RNTN selects more strongly positive phrases at most n-gram lengths compared to other models.

SLIDE 29

EXPERIMENTS

DEMO

▸ http://nlp.stanford.edu:8080/sentiment/rntnDemo.html
▸ Stanford CoreNLP