SLIDE 1

Understanding Idiomatic Language using Neural Networks

Ling 575, Group 1: Josh Tanner, Paige Finkelstein, Wes Rose, Elena Khasanova, and Daniel Campos. February 20th, 2020

SLIDE 2

Roadmap

  • Overall Introduction
  • Evaluating NLM and Lexical Composition (Wes)
  • Q&A
  • Idioms and Neural Networks (Daniel)
  • Q&A
  • Our Group Project: NLM Understanding of Idioms
  • Q&A


SLIDE 3

Roadmap

  • Overall Introduction
  • Evaluating NLM and Lexical Composition (Wes)
  • Q&A
  • Idioms and Neural Networks (Daniel)
  • Q&A
  • Our Group Project: NLM Understanding of Idioms
  • Q&A


SLIDE 4

SLIDE 5

The Principle of Compositionality

  • “The meaning of a complex expression is determined by its structure and the meanings of its constituents.” (https://plato.stanford.edu/entries/compositionality/)
  • Given any complex expression e in a language L, lexical semantics and syntax determine the semantics of e. (A formal sketch follows below.)
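One way to write the principle more formally; this is our own shorthand (the double-bracket "meaning of" notation is not from the slide), given only as a hedged sketch:

```latex
% Minimal formal reading of the principle of compositionality.
% Requires \usepackage{stmaryrd} for \llbracket / \rrbracket.
% For a complex expression e with constituents c_1, ..., c_n and syntactic
% structure syn(e), the meaning of e is some function of the constituents'
% meanings and that structure:
\[
  \llbracket e \rrbracket \;=\; f\bigl(\mathrm{syn}(e),\ \llbracket c_1 \rrbracket, \dots, \llbracket c_n \rrbracket\bigr)
\]
% Idioms such as "carry on" are the problem cases: no obvious f recovers the
% idiomatic meaning from \llbracket carry \rrbracket and \llbracket on \rrbracket alone.
```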

SLIDE 6

The Principle of Compositionality

  • “The meaning of a complex expression is determined by its structure and the meanings of its constituents.” (https://plato.stanford.edu/entries/compositionality/)
  • Given any complex expression e in a language L, lexical semantics and syntax determine the semantics of e.

Is this always true?

SLIDE 7

Difficulties with Compositionality

Keep Calm and Carry On?

  • keep: to cause to remain in a given place, situation, or condition
  • calm: free from agitation, excitement, or disturbance
  • carry: to move while supporting; to convey by direct communication; to contain and direct the course of
  • on: used as a function word to indicate the location of something; to indicate a source of attachment or support; to indicate a time frame during which something takes place; to indicate manner of doing something

Example from Schwartz et al. Definitions from m-w.com

SLIDE 8

Difficulties with Compositionality

“The tea is heating up” vs. “The argument is heating up”

  • heat up: to become warm or hot; to excite

Which meaning to select?

Example from Schwartz et al. Definitions from m-w.com

SLIDE 9

Difficulties with Compositionality

  • Meaning Shift
  • The meaning of the phrase departs from the meaning of its constituent words
  • E.g. Carry on, guilt trip, pain in the neck
  • Common in multi-word expressions
  • Implicit meaning
  • A meaning resulting from composition that requires world knowledge
  • E.g. hot argument vs. hot tea, olive oil vs. baby oil.

Schwartz et al.

SLIDE 10

Difficulties with Compositionality

  • Meaning Shift
  • The meaning of the phrase departs from the meaning of its constituent words
  • E.g. Carry on, guilt trip, pain in the neck
  • Implicit meaning
  • A meaning resulting from composition that requires world knowledge
  • E.g. hot argument vs. hot tea, olive oil vs. baby oil.

Schwartz et al.

How do you think Neural Networks will handle these?

SLIDE 11

Goals of the paper: 1) Define an evaluation suite for lexical composition for NLP models

  • Based on meaning shift and implicit meaning

2) Evaluate some common word representations using this suite

  • Word2Vec, GloVe, fasttext, ELMo, OpenAI GPT, BERT
SLIDE 12

Food for Thought

  • Would you expect Neural Networks to do better with Meaning Shift or Implicit Meaning?
  • What do you think of the tasks that were chosen? Should any tasks be added or expanded?
  • How can we improve NLP applications to handle these phenomena?
  • (How do humans handle them?)
SLIDE 13

Classification Models

Overview of Methodology

  • Train 6 classification models, one for each of 6 types of word representations
  • For 6 tasks, test each of these models; compare to each other and to baselines

Word representations: Word2Vec, GloVe, fasttext, ELMo, GPT, BERT. Lexical composition tasks: Verb-Particle Construction, Light Verb Construction, Noun Compound Literality, Noun Compound Relations, Adjective Noun Attributes, Identifying Phrase Types. Baseline models: Human, MajorityALL, Majority1, Majority2.

SLIDE 14

Overview of Methodology

Task: Verb-Particle Construction, Light Verb Construction, Noun Compound Literality, Noun Compound Relations, Adjective Noun Attributes, Identifying Phrase Types

Classification Model: Word2Vec, GloVe, fasttext, ELMo, GPT, BERT, plus Human, Majority_ALL, Majority_1, and Majority_2 baselines

SLIDE 15

Classification Models

Overview of Methodology

  • Train 6 classification models, one for each of 6 types of word representations
  • For 6 tasks, test each of these models; compare to each other and to baselines

Word representations: Word2Vec, GloVe, fasttext, ELMo, GPT, BERT. Lexical composition tasks: Verb-Particle Construction, Light Verb Construction, Noun Compound Literality, Noun Compound Relations, Adjective Noun Attributes, Identifying Phrase Types. Baseline models: Human, MajorityALL, Majority1, Majority2.

SLIDE 16

Classification Models

  • Embed-Encode-Predict

Input Sentence -> Embed (pre-trained representation) -> Encode (transform the embedding) -> Predict (perform classification)

SLIDE 17

Classification Models

  • Embed-Encode-Predict

Input Sentence -> Embed (pre-trained representation) -> Encode (transform the embedding) -> Predict (perform classification)

SLIDE 18

Classification Model: Embed (Word Representations)

Global Embeddings

  • Word2Vec
  • Using Skip-Gram
  • GloVe
  • fasttext

Contextual Embeddings

  • ELMo
  • OpenAI GPT
  • BERT

(Use top layer or a learned scalar mix; a rough sketch follows below)
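As a rough illustration of the contextual side only (not the paper's actual setup), here is a minimal sketch of pulling per-token vectors from BERT with the Hugging Face transformers library and combining layers with a learned scalar mix; the checkpoint name and the ScalarMix module are our own choices, not the authors'.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative checkpoint; the paper used the original ELMo / GPT / BERT releases.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

class ScalarMix(torch.nn.Module):
    """Learned softmax-weighted average of all hidden layers (one scalar per layer)."""
    def __init__(self, num_layers):
        super().__init__()
        self.weights = torch.nn.Parameter(torch.zeros(num_layers))
        self.gamma = torch.nn.Parameter(torch.ones(1))

    def forward(self, layers):                     # layers: list of (batch, seq, dim) tensors
        w = torch.softmax(self.weights, dim=0)
        return self.gamma * sum(wi * layer for wi, layer in zip(w, layers))

sentence = "How many Englishmen gave in to their emotions like that?"
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

top_layer = outputs.hidden_states[-1]              # option 1: top layer only
mix = ScalarMix(len(outputs.hidden_states))
mixed = mix(list(outputs.hidden_states))           # option 2: scalar mix of all layers
print(top_layer.shape, mixed.shape)                # both (1, num_wordpieces, 768)
```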

SLIDE 19

Classification Models

  • Embed-Encode-Predict

Input Sentence -> Embed (pre-trained representation) -> Encode (transform the embedding) -> Predict (perform classification)

SLIDE 20

Classification Model: Encode

biLM
  • Encode the embedded sequence using a biLSTM
  • U = biLSTM(V)

Att
  • Encode the embedded sequence using self-attention
  • u_i = [v_i ; Σ_j a_i,j · v_j]

None
  • Don’t encode the embedded text; use the embeddings as they are
  • U = V

Input to the encode layer is the sequence of pretrained embeddings V = <v_1, …, v_n>; output is U = <u_1, …, u_n>. (A minimal sketch of the three options follows below.)
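A minimal PyTorch sketch of the three encode options as we read them from this slide; the dimensions, module names, and the exact form of the attention scores are our assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Turn pretrained embeddings V = <v_1..v_n> into U = <u_1..u_n> ('bilm', 'att', or 'none')."""
    def __init__(self, dim, mode="bilm"):
        super().__init__()
        self.mode = mode
        if mode == "bilm":
            self.lstm = nn.LSTM(dim, dim // 2, bidirectional=True, batch_first=True)
        elif mode == "att":
            self.query = nn.Linear(dim, dim)       # simple dot-product self-attention

    def forward(self, V):                          # V: (batch, n, dim)
        if self.mode == "bilm":
            U, _ = self.lstm(V)                    # u_i from a biLSTM over the sequence
            return U
        if self.mode == "att":
            scores = torch.matmul(self.query(V), V.transpose(1, 2))   # a_{i,j} before softmax
            a = torch.softmax(scores, dim=-1)
            context = torch.matmul(a, V)           # sum_j a_{i,j} * v_j
            return torch.cat([V, context], dim=-1) # u_i = [v_i ; sum_j a_{i,j} v_j]
        return V                                   # 'none': use the embeddings as they are (U = V)

U = Encoder(768, mode="att")(torch.randn(2, 7, 768))
print(U.shape)                                     # (2, 7, 1536) for the 'att' encoder
```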

SLIDE 21

Classification Models

  • Embed-Encode-Predict

Input Sentence -> Embed (pre-trained representation) -> Encode (transform the embedding) -> Predict (perform classification)

SLIDE 22

Classification Model: Predict

  • Takes output U from the Encode layer and passes it to a feed-forward neural network classifier
  • Represent a “span” of text by concatenating its end-point vectors
  • E.g. u_i,…,i+k = [u_i ; u_i+k]
  • X = [u_i ; u_i+k ; u'_1 ; u'_l]
  • u'_1 and u'_l may be empty; for some tasks, a 2nd span is needed.
  • X is passed into the classifier
  • Classifier output is a softmax over all categories (a minimal sketch follows below)
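A minimal sketch of the predict step described above: concatenate the span's end-point vectors (plus an optional second span) and feed the result to a small feed-forward classifier with a softmax output. The hidden size and the zero-padding for a missing second span are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SpanClassifier(nn.Module):
    """X = [u_i ; u_{i+k} ; u'_1 ; u'_l] -> feed-forward -> softmax over categories."""
    def __init__(self, dim, num_classes, hidden=100):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(4 * dim, hidden), nn.ReLU(), nn.Linear(hidden, num_classes))

    def forward(self, U, span1, span2=None):
        i, k = span1
        parts = [U[:, i], U[:, k]]                       # end-points of the first span
        if span2 is not None:                            # optional second span for some tasks
            j, l = span2
            parts += [U[:, j], U[:, l]]
        else:                                            # pad with zeros when there is no second span
            parts += [torch.zeros_like(U[:, 0])] * 2
        X = torch.cat(parts, dim=-1)
        return torch.softmax(self.ff(X), dim=-1)

U = torch.randn(2, 7, 768)                               # encoded sentence from the previous step
probs = SpanClassifier(768, num_classes=2)(U, span1=(1, 3))
print(probs.shape)                                       # (2, 2)
```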
SLIDE 23

Classification Models

Overview of methodology

  • Train 6 classification models, one for each of 6 types of word representations
  • For 6 tasks, test each of these models; compare to each other and to baselines

Word representations: Word2Vec, GloVe, fasttext, ELMo, GPT, BERT. Lexical composition tasks: Verb-Particle Construction, Light Verb Construction, Noun Compound Literality, Noun Compound Relations, Adjective Noun Attributes, Identifying Phrase Types. Baseline models: Human, MajorityALL, Majority1, Majority2.

SLIDE 24

Baselines

Human Baseline
  • Used Amazon Mechanical Turk
  • Classified 100 examples for each task
  • Worker agreement of 80% - 87%

Majority Baselines
  • MajorityALL: assign the most common label in the training set to all test items
  • Majority1: for each test item, assign the most common label in the training set for items with the same 1st constituent
  • Majority2: for each test item, assign a label based on the final constituent

(A small sketch of the majority baselines follows below.)
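A minimal sketch of the three majority baselines as we understand them from this slide; the field names ("first", "last", "label") are illustrative assumptions about how the data might be stored.

```python
from collections import Counter, defaultdict

def majority_baselines(train, test):
    """train/test: lists of dicts like {"first": "give", "last": "in", "label": "VPC"}."""
    overall = Counter(ex["label"] for ex in train).most_common(1)[0][0]

    by_first, by_last = defaultdict(Counter), defaultdict(Counter)
    for ex in train:
        by_first[ex["first"]][ex["label"]] += 1
        by_last[ex["last"]][ex["label"]] += 1

    def most_common(counter):
        # Back off to the MajorityALL label when the constituent was never seen in training.
        return counter.most_common(1)[0][0] if counter else overall

    majority_all = [overall for _ in test]
    majority_1 = [most_common(by_first[ex["first"]]) for ex in test]  # same 1st constituent
    majority_2 = [most_common(by_last[ex["last"]]) for ex in test]    # same final constituent
    return majority_all, majority_1, majority_2
```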

SLIDE 25

Classification Models

Overview of methodology

  • Train 6 classification models, one for each of 6 types of word representations
  • For 6 tasks, test each of these models; compare to each other and to baselines

Word representations: Word2Vec, GloVe, fasttext, ELMo, GPT, BERT. Lexical composition tasks: Verb-Particle Construction, Light Verb Construction, Noun Compound Literality, Noun Compound Relations, Adjective Noun Attributes, Identifying Phrase Types. Baseline models: Human, MajorityALL, Majority1, Majority2.

SLIDE 26

Lexical Composition Tasks

  Task Name                    Meaning Shift?   Implicit Meaning?
  Verb-Particle Construction         X
  Light Verb Construction            X
  Noun Compound Literality           X
  Noun Compound Relations                               X
  Adjective Noun Attributes                             X
  Identifying Phrase Type            X                  X

SLIDE 27

Lexical Composition Tasks

  Task Name                    Meaning Shift?   Implicit Meaning?
  Verb-Particle Construction         X
  Light Verb Construction            X
  Noun Compound Literality           X
  Noun Compound Relations                               X
  Adjective Noun Attributes                             X
  Identifying Phrase Type            X                  X

SLIDE 28

Task 1: Verb Particle Construction

Task: Given a (verb, preposition) pair from a sentence, is it a verb particle construction? (Is the verb's meaning changed by the preposition?)
Dataset: 1,348 tagged sentences from the BNC (Tu and Roth 2012)
Pipeline: Data -> Classification Model -> Yes / No

  • Example (Yes): “How many Englishmen gave in to their emotions like that?”
  • Example (No): “It is just this denial of anything beyond what is directly given in experience that marks Berkeley out as an empiricist.”

SLIDE 29

Task 1: Verb Particle Construction

                             VPC Classification (Acc)
  Majority Baseline          23.6
  Best Global Embedding      60.5
  Best Contextual Embedding  90.0
  Human Baseline             93.8

SLIDE 30

Lexical Composition Tasks

  Task Name                    Meaning Shift?   Implicit Meaning?
  Verb-Particle Construction         X
  Light Verb Construction            X
  Noun Compound Literality           X
  Noun Compound Relations                               X
  Adjective Noun Attributes                             X
  Identifying Phrase Type            X                  X

SLIDE 31

Task 2: Light Verb Construction

Task: Can the meaning of the verb-noun construction be derived primarily from the meaning of its noun object?
Dataset: 2,162 tagged sentences from the BNC (Tu and Roth 2011)
Pipeline: Data -> Classification Model -> Yes / No

  • Example (Yes): “I’ve arranged for you to have a look at his file in our library.”
  • Example (No): “He had a look of childish bewilderment on his face.”

SLIDE 32

Task 2: Light Verb Construction

                             VPC (Acc)   LVC (Acc)
  Majority Baseline          23.6        43.7
  Best Global Embedding      60.5        74.6
  Best Contextual Embedding  90.0        82.5
  Human Baseline             93.8        83.8

SLIDE 33

Lexical Composition Tasks

  Task Name                    Meaning Shift?   Implicit Meaning?
  Verb-Particle Construction         X
  Light Verb Construction            X
  Noun Compound Literality           X
  Noun Compound Relations                               X
  Adjective Noun Attributes                             X
  Identifying Phrase Type            X                  X

SLIDE 34

Task 3: Noun Compound Literality

Task: Given a sentence with a {noun1, noun2} compound, is each of the nouns literal or non-literal?
Dataset: 90 annotated examples from ukWaC (Reddy et al. 2011 [6]); 3,096 literal examples from Tratz [7] and the PTB-WSJ
Pipeline: Data -> Classification Model -> {Y/N, Y/N}

  • Example ({no, no}): “AND tickets for an air boat ride in the Everglades. Wow! Still on cloud nine.” [6]
  • Example ({no, yes}): “Could you also include your snail mail address so I can send you a 1999 New Zealand Calendar in Appreciation?” [1]

SLIDE 35

Task 3: Noun Compound Literality

                             VPC (Acc)   LVC (Acc)   NC Literality (Acc)
  Majority Baseline          23.6        43.7        72.5
  Best Global Embedding      60.5        74.6        80.4
  Best Contextual Embedding  90.0        82.5        91.3
  Human Baseline             93.8        83.8        91.0

SLIDE 36

Lexical Composition Tasks

  Task Name                    Meaning Shift?   Implicit Meaning?
  Verb-Particle Construction         X
  Light Verb Construction            X
  Noun Compound Literality           X
  Noun Compound Relations                               X
  Adjective Noun Attributes                             X
  Identifying Phrase Type            X                  X

SLIDE 37

Task 4: Noun Compound Relations

Task: Given a sentence with a {noun1, noun2} compound and a paraphrase p, does p describe the semantic relation between noun1 and noun2?
Dataset: from SemEval 2013 [10]: 356 noun compounds, annotated with 12,446 paraphrases (Hendrickx et al., 2013)
Pipeline: Data -> Classification Model -> Y/N

  • Example (Yes): {“Vietnam has a US$900 million trade surplus in car parts, totaling US$4.4 billion of car part exports”; “replacement part bought for car”}
  • Example (No): {“an appendage (or outgrowth) is an external body part, or natural prolongation, that protrudes from an organism's body”; “replacement part bought for body”}

SLIDE 38

Task 4: Noun Compound Relations

                             VPC (Acc)   LVC (Acc)   NC Literality (Acc)   NC Relations (Acc)
  Majority Baseline          23.6        43.7        72.5                  50.0
  Best Global Embedding      60.5        74.6        80.4                  51.2
  Best Contextual Embedding  90.0        82.5        91.3                  54.3
  Human Baseline             93.8        83.8        91.0                  77.8

SLIDE 39

Lexical Composition Tasks

  Task Name                    Meaning Shift?   Implicit Meaning?
  Verb-Particle Construction         X
  Light Verb Construction            X
  Noun Compound Literality           X
  Noun Compound Relations                               X
  Adjective Noun Attributes                             X
  Identifying Phrase Type            X                  X

SLIDE 40

Task 5: Adjective Noun Attributes

Task: Given a sentence s with an Adjective-Noun combination AN paired with an attribute AT: is AT implicitly conveyed in AN?
Dataset: HeiPLAS [8] with 1,589 annotated examples from WordNet (Hartung 2015)
Pipeline: Data -> Classification Model -> Y/N

  • Example (Yes): {“Heat traps are valves or loops of pipe installed on the cold water inlet and hot water outlet pipes on water heaters”, temperature}
  • Example (No): {“A hot argument takes place between Sanjana and her father, and she runs away to Charan”, temperature}

SLIDE 41

Task 5: Adjective Noun Attributes

                             VPC (Acc)   LVC (Acc)   NC Literality (Acc)   NC Relations (Acc)   AN Attributes (Acc)
  Majority Baseline          23.6        43.7        72.5                  50.0                 50.0
  Best Global Embedding      60.5        74.6        80.4                  51.2                 53.8
  Best Contextual Embedding  90.0        82.5        91.3                  54.3                 65.1
  Human Baseline             93.8        83.8        91.0                  77.8                 86.4

SLIDE 42

Lexical Composition Tasks

  Task Name                    Meaning Shift?   Implicit Meaning?
  Verb-Particle Construction         X
  Light Verb Construction            X
  Noun Compound Literality           X
  Noun Compound Relations                               X
  Adjective Noun Attributes                             X
  Identifying Phrase Type            X                  X

SLIDE 43

Task 6: Identifying Phrase Type

Task: Given a sentence s with words {w1, w2, …, wn}, output a sequence of BIO labels, one for each word wi. For each word wi: is it part of a phrase, and if so, what is the phrase type? (A tiny BIO example follows below.)
Dataset: STREUSLE corpus [9], based on the reviews section of the English Web Treebank (Schneider and Smith 2015)
Pipeline: Data {w1, …, wn} -> Classification Model -> {t1, …, tn}
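A tiny illustration of BIO tagging over a sentence containing a multiword expression; the tag inventory here (B-MWE / I-MWE / O) is a simplification of the STREUSLE labels, used only to show the output format.

```python
# One tag per token: B- opens a phrase, I- continues it, O is outside any phrase.
tokens = ["She", "let", "the", "cat", "out", "of", "the", "bag", "yesterday"]
tags   = ["O", "B-MWE", "I-MWE", "I-MWE", "I-MWE", "I-MWE", "I-MWE", "I-MWE", "O"]

for tok, tag in zip(tokens, tags):
    print(f"{tok:10s} {tag}")
```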

SLIDE 44

Task 6: Identifying Phrase Type

(Scores are accuracy, except Phrase Type, which is F1.)

                             VPC     LVC     NC Lit.   NC Rel.   AN Attr.   Phrase Type
  Majority Baseline          23.6    43.7    72.5      50.0      50.0       26.6
  Best Global Embedding      60.5    74.6    80.4      51.2      53.8       44.0
  Best Contextual Embedding  90.0    82.5    91.3      54.3      65.1       64.8
  Human Baseline             93.8    83.8    91.0      77.8      86.4       -

SLIDE 45

Model Performance on Two Phenomena

(Scores are accuracy, except Phrase Type, which is F1.)

                               VPC     LVC     NC Lit.   NC Rel.   AN Attr.   Phrase Type
  Majority Baseline            23.6    43.7    72.5      50.0      50.0       26.6
  Best Global Embedding        60.5    74.6    80.4      51.2      53.8       44.0
  Best Contextual Embedding    90.0    82.5    91.3      54.3      65.1       64.8
  Human Baseline               93.8    83.8    91.0      77.8      86.4       -
  Best Model - Human Baseline  -3.8    -1.3    +0.3      -23.5     -21.3      -

Phenomenon: Meaning Shift (VPC, LVC, NC Literality); Implicit Meaning (NC Relations, AN Attributes); Both (Phrase Type).

SLIDE 46

Extra Analysis Tasks (If Time)

SLIDE 47

Best Encodings and Layers

Input Sentence -> Embed (pre-trained representation) -> Encode (transform the embedding) -> Predict (perform classification)

  • Used BiLSTM, Self-Attention (att), or unmodified embeddings
  • For contextual representations: top layer or learned scalar mix

SLIDE 48

Best Encodings and Layers

SLIDE 49

Analysis of Meaning Shift

(Scores are accuracy, except Phrase Type, which is F1.)

                               VPC     LVC     NC Lit.   NC Rel.   AN Attr.   Phrase Type
  Majority Baseline            23.6    43.7    72.5      50.0      50.0       26.6
  Best Global Embedding        60.5    74.6    80.4      51.2      53.8       44.0
  Best Contextual Embedding    90.0    82.5    91.3      54.3      65.1       64.8
  Human Baseline               93.8    83.8    91.0      77.8      86.4       -
  Best Model - Human Baseline  -3.8    -1.3    +0.3      -23.5     -21.3      -

Phenomenon: Meaning Shift (VPC, LVC, NC Literality); Implicit Meaning (NC Relations, AN Attributes); Both (Phrase Type).

SLIDE 50

Meaning Shift: Verb-Particle Classification

                               VPC Classification (Acc)
  Majority Baseline            23.6
  Best Global Embedding        60.5
  Best Contextual Embedding    90.0
  Human Baseline               93.8
  Best Model - Human Baseline  -3.8

Best Performer: BERT + All + Att

Do BERT embeddings really have all of the information necessary? Ablation task (a rough sketch follows below):
  • Choose several ambiguous verb-preposition pairs
  • Compute the BERT representation for each example of each pair
  • Project the representations into 2D space using t-SNE
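A minimal sketch of this kind of ablation: take the contextual vector of the verb token in sentences containing an ambiguous verb-preposition pair and project the vectors with t-SNE. The helper below, the checkpoint, and the two toy sentences are our assumptions; a real run would use many sentences per pair.

```python
import numpy as np
import torch
from sklearn.manifold import TSNE
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def token_vector(sentence, word):
    """Contextual vector of the first wordpiece of `word` in `sentence` (illustrative, not the paper's code)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]          # (num_wordpieces, 768)
    pieces = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    idx = pieces.index(tokenizer.tokenize(word)[0])         # position of the target word's first piece
    return hidden[idx].numpy()

examples = [("How many Englishmen gave in to their emotions?", "gave", "VPC"),
            ("The speech was given in the main hall.", "given", "literal")]   # toy examples only
vectors = np.stack([token_vector(s, w) for s, w, _ in examples])

coords = TSNE(n_components=2, perplexity=1.0).fit_transform(vectors)          # 2D projection
for (x, y), (_, _, label) in zip(coords, examples):
    print(label, round(float(x), 2), round(float(y), 2))
```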

SLIDE 51

Meaning Shift: Verb-Particle Classification

SLIDE 52

Meaning Shift: Non-literality as Rare Sense

Spelling Bee: “competition”? or “the process or activity of writing or naming the letters of a word”?

SLIDE 53

Meaning Shift: Non-literality as Rare Sense

  • Can word embeddings be used for “word sense induction”?
  • Sample target words that appear in literal and non-literal examples
  • Use the contextualized word embeddings in these examples to predict the best substitute for the target word (a rough sketch follows below)
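One rough way to probe this (our sketch, not necessarily the paper's exact procedure): mask the target word and let BERT's masked-language-model head rank substitutes; literal and non-literal contexts should prefer different fillers.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def top_substitutes(sentence_with_mask, k=5):
    """Return BERT's top-k fillers for the [MASK] position (illustrative probe)."""
    enc = tokenizer(sentence_with_mask, return_tensors="pt")
    mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = mlm(**enc).logits[0, mask_pos]
    top_ids = logits.topk(k, dim=-1).indices[0]
    return tokenizer.convert_ids_to_tokens(top_ids.tolist())

print(top_substitutes("Still on cloud [MASK] after the concert."))      # non-literal context for 'nine'
print(top_substitutes("The plane flew through a thick [MASK] layer."))  # literal context for 'cloud'
```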

SLIDE 54

Meaning Shift: Non-literality as Rare Sense

SLIDE 55

Analysis of Implicit Meaning

(Scores are accuracy, except Phrase Type, which is F1.)

                               VPC     LVC     NC Lit.   NC Rel.   AN Attr.   Phrase Type
  Majority Baseline            23.6    43.7    72.5      50.0      50.0       26.6
  Best Global Embedding        60.5    74.6    80.4      51.2      53.8       44.0
  Best Contextual Embedding    90.0    82.5    91.3      54.3      65.1       64.8
  Human Baseline               93.8    83.8    91.0      77.8      86.4       -
  Best Model - Human Baseline  -3.8    -1.3    +0.3      -23.5     -21.3      -

Phenomenon: Meaning Shift (VPC, LVC, NC Literality); Implicit Meaning (NC Relations, AN Attributes); Both (Phrase Type).

SLIDE 56

Analysis of Implicit Meaning

  • Where does the knowledge of the implicit meaning originate?
  • Is it encoded in the phrase in question?
  • Or, is it encoded explicitly in the context sentence around the phrase?
  • Why is the performance so bad?
  • Could it be that the models are learning the probability of the paraphrases alone, without regard to the original phrase?

3 Ablation Tests

SLIDE 57

Analysis of Implicit Meaning

3 Ablation Tests (a minimal sketch of the three input variants follows below)

Original phrase in context: “Today, the house has become a wine bar or bistro called Barokk”

Test 1 (-phrase):
  • Mask the phrase in the context sentence
  • “Today, the house has become a something or bistro called Barokk”

Test 2 (-context):
  • Replace the context sentence with the phrase itself
  • “wine bar”

Test 3 (-context + phrase):
  • Omit the context sentence altogether; provide only the paraphrase
  • “bar where people drink wine”

Take each modified context sentence and evaluate on the NC Relations and AN Attributes tasks.
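A minimal sketch of how the three input variants could be built from a (context sentence, phrase, paraphrase) triple; the "something" masking token follows the example above, and everything else here is an assumption about the data layout.

```python
def ablation_variants(context, phrase, paraphrase):
    """Build the -phrase, -context, and -context+phrase inputs for one example."""
    return {
        "original": context,
        "-phrase": context.replace(phrase, "something"),   # mask the phrase in the context sentence
        "-context": phrase,                                 # the phrase by itself
        "-context+phrase": paraphrase,                      # only the paraphrase
    }

print(ablation_variants(
    "Today, the house has become a wine bar or bistro called Barokk",
    "wine bar",
    "bar where people drink wine",
))
```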

SLIDE 58

Analysis of Implicit Meaning

SLIDE 59

Summary – “Still a pain in the neck”

  • Understanding the meanings of phrases is not straightforward
  • Meaning Shift and Implicit Meaning
  • 6 tasks were developed to evaluate model understanding of these phenomena
  • 6 pre-trained language models were evaluated on these tasks
  • The models do pretty well with meaning shift; they struggle with implicit meaning

SLIDE 60

References for pain in the neck

1. Stanford Encyclopedia of Philosophy - https://plato.stanford.edu/entries/compositionality/
2. Merriam-Webster - https://www.merriam-webster.com/dictionary
3. Tu and Roth (2012) - https://www.aclweb.org/anthology/S12-1010/
4. Tu and Roth (2011) - https://www.aclweb.org/anthology/W11-0807/
5. ukWaC corpus - https://www.sketchengine.eu/ukwac-british-english-corpus/
6. Reddy et al. (2011) - https://www.aclweb.org/anthology/I11-1024/
7. Tratz (2011) - http://digitallibrary.usc.edu/cdm/ref/collection/p15799coll3/id/176191
8. Hartung (2015) - https://archiv.ub.uni-heidelberg.de/volltextserver/20013/
9. Schneider and Smith (2015) - https://www.aclweb.org/anthology/N15-1177/
10. Hendrickx et al. (2013) - https://www.aclweb.org/anthology/S13-2025/
SLIDE 61

Questions / Comments?

  • Would you expect Neural Networks to do better with Meaning Shift or Implicit Meaning?
  • According to the results: meaning shift. Is this surprising?
  • What do you think of the tasks that were chosen? Should any tasks be added or expanded?
  • Our group is interested in examining idioms more closely
  • How can we improve NLP applications to handle these phenomena?
  • (How do humans handle them?)
  • For implicit meaning / world knowledge, see Vered Schwartz's Treehouse talk
SLIDE 62

Roadmap

  • Overall Introduction
  • Evaluating NLM and Lexical Composition (Wes)
  • Q&A
  • Idioms and Neural Networks (Daniel)
  • Q&A
  • Our Group Project: Idiom paraphrase evaluation
  • Q&A


SLIDE 63

Overview

  • Probing task focused on model performance with selective dataset pruning.
  • What kind of features in an idiom’s vector are exploited by a NN in classification of idioms vs. literals?
  • Hypothesis #1: the network could be using the idea of concreteness vs. abstractness to identify idioms as compared to literal phrases.
  • Hypothesis #2: the network uses ambiguity as a factor, with the idea being that idioms are more ambiguous on average than literal language.

SLIDE 64

Definitions

  • “metaphors (e.g., my job is a jail) reflect a transparent mapping from concrete examples in a source domain (e.g., the physical confinement of a jail) to the abstract concept in the target domain (e.g., the psychological constraints and tediousness of a job)” - Senaldi et al. 2019
  • “idioms (e.g. buy the farm ‘to pass away’, shoot the breeze ‘to chat idly’) synchronically appear as a heterogeneous class of semantically non-compositional multiword units that all exhibit greater lexicosyntactic rigidity, proverbiality and emotional valence with respect to literal expressions.”

SLIDE 65

ELI5 aka Explain like I’m 5

  • Metaphors map something real to a non-specific representation in a specific domain.
  • Idioms are a broad range of multiword units which map to some literal expression and only work if the complete structure is preserved (e.g. my heart aches != my myocardium hurts)

SLIDE 66

Related Work

  • Neural networks using Word2Vec do well on classification of metaphors. Bizzoni et al. (2017a)
  • In exploring the cosine distance between the nouns and the learned metaphorical representation, the authors found the NN leveraged the concrete -> abstract shift, mapping to established linguistic knowledge of metaphors.
  • They also do well on classifying idioms! Bizzoni et al. (2017b)
  • In Italian.
  • Entire phrase is treated as one token (e.g. spill the beans is one token)
  • No exploration of what kind of shift is happening in the NN
  • Idioms, like metaphors, tend to be used to convey abstract concepts and are, generally speaking, less concrete in meaning with respect to literals (Citron et al., 2016)

SLIDE 67

Guiding Question

  • What kind of features in an idiom’s vector are exploited by a NN in classification of idioms vs. literals?

SLIDE 68

Dataset

  • 174 Italian and 120 English idiomatic and literal verb-noun constructions
  • Italian
  • 87 randomly chosen Italian verbal idioms from idiomatic dictionaries
  • Extract usage from the itWaC corpus
  • Some rare (63 occurrences), others common (15,784)
  • 87 literal-only verb phrases selected randomly that matched the idiomatic distribution
  • English
  • 120 VN idiomatic and literal expressions from the COCA corpus
  • 60 idioms and 60 literals following a similar procedure to the Italian
  • Annotation of correctness and ambiguity
  • Linguistics students/researchers
  • Rating for abstractness/concreteness (1-7)
  • Rating for plausibility of the literal usage of the VN (1-7)
  • Rating for ambiguity (1-7)
  • Literals rated as more concrete
  • 4.84 average for Italian literals
  • 3.16 average for Italian idioms
  • 6.20 average for English literals
  • 2.43 average for English idioms
SLIDE 69

Method

  • Train Word2Vec and fastText embeddings on a custom corpus and use the vectors in a classifier to classify idiom vs. literal
  • Test model performance with training on various subsets of the dataset
  • Trained on the complete dataset (with random subsamples)
  • Trained with concrete literals removed from the training set
  • Removed all literals with concreteness > 5
  • If the NN relied on differences in concreteness between literals and idioms, model performance should drop greatly here
  • Trained with semantically ambiguous idioms removed from the training data
  • Removed all idioms with an average ambiguity > 5
  • Since idioms can have literal and idiomatic representations, the model should learn a more varied distribution.

SLIDE 70

Training

  • fastText and Word2Vec trained on itWaC (Italian) and COCA (English)
  • 300 dimensions
  • SkipGram
  • 5-word window
  • 10 negative samples
  • Classifier on idiom or not (a minimal sketch follows below)
  • 3 hidden layers
  • 300 -> 12 -> 8 -> 1
  • Sigmoid activation
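A minimal PyTorch sketch matching the layer sizes listed above (300 -> 12 -> 8 -> 1, sigmoid activations, sigmoid output for idiom vs. literal); the training loop, the optimizer, and the way 300-d phrase vectors are obtained from Word2Vec/fastText are our assumptions, not Senaldi et al.'s released code.

```python
import torch
import torch.nn as nn

class IdiomClassifier(nn.Module):
    """Layer sizes 300 -> 12 -> 8 -> 1 as listed on the slide; output = P(idiom)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(300, 12), nn.Sigmoid(),
            nn.Linear(12, 8), nn.Sigmoid(),
            nn.Linear(8, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

model = IdiomClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

# Stand-in data: 300-d phrase vectors (e.g. averaged Word2Vec/fastText word vectors) with 0/1 labels.
X = torch.randn(64, 300)
y = torch.randint(0, 2, (64, 1)).float()

for _ in range(10):                      # tiny illustrative training loop
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```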
SLIDE 71

Results

  • The NN is likely exploiting a difference in concreteness/ambiguity of expression.
  • When NNs are trained to spot idioms they exploit underlying semantic features.
  • Suggest further annotation of idioms to explore what NNs are learning.

SLIDE 72

Roadmap

  • Overall Introduction
  • Evaluating NLM and Lexical Composition (Wes)
  • Q&A
  • Idioms and Neural Networks (Daniel)
  • Q&A
  • Our Group Project: NLM Understanding of Idioms
  • Q&A


SLIDE 73

Our Research: NLM Understanding of Idioms

SLIDE 74

Overview

  • Broad research question: can NLMs understand idioms and their underlying meaning?
  • What do contextual representations of idioms capture in vector space? Do they approximate the non-idiomatic meaning?

SLIDE 75

Dataset

  • To evaluate, we create a custom corpus
  • 1,000 idioms curated from the SLIDE dataset
  • Selected for min length > 3 and variation in concreteness, abstractness, etc.
  • 2,000 idioms in context of regular language usage (Reddit comments), with paraphrases and non-paraphrases (2 of each per sample) created by our team.

SLIDE 76

Idiom Paraphrase Evaluation

Given Sentence 1 and Sentence 2: can a pretrained language model tell us if Sentence 1 is a paraphrase of Sentence 2?

Two probing tasks: Classification and Vector Similarity

SLIDE 77

Idiom Paraphrase - Classification

Pipeline: Paraphrase List -> Find Context Sentences -> Original Sentences, Literal Paraphrases, Literal Non-paraphrases -> Build Sentence Pairs -> Linear Classifier -> Paraphrase?

SLIDE 78

Classification - Variations

2 areas of variation, with 2 options each: which embeddings to use, and which BERT to use. (A rough sketch of the two embedding options follows below.)

Which embeddings to use?
  • Feed s1 through BERT, feed s2 through BERT, and combine embeddings1 and embeddings2
  • Feed s1 + s2 through BERT together and use the CLS token

Which BERT to use?
  • BERT Base with no fine-tuning
  • BERT fine-tuned on paraphrase detection (but not on idioms)
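A minimal sketch of the two embedding options using the Hugging Face transformers library as a stand-in (the project may use a different stack); the mean-pooling in option A and the checkpoint name are illustrative assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")   # or a paraphrase-finetuned checkpoint
bert = AutoModel.from_pretrained("bert-base-uncased")

def encode_separately(s1, s2):
    """Option A: run each sentence through BERT and combine the two pooled vectors."""
    vecs = []
    for s in (s1, s2):
        enc = tokenizer(s, return_tensors="pt")
        with torch.no_grad():
            vecs.append(bert(**enc).last_hidden_state.mean(dim=1).squeeze(0))   # mean-pool the tokens
    return torch.cat(vecs)                                                       # feed this to a linear classifier

def encode_jointly(s1, s2):
    """Option B: feed the sentence pair together and use the [CLS] token."""
    enc = tokenizer(s1, s2, return_tensors="pt")
    with torch.no_grad():
        return bert(**enc).last_hidden_state[:, 0].squeeze(0)                    # [CLS] position

x_a = encode_separately("She let the cat out of the bag", "She revealed the secret")
x_b = encode_jointly("She let the cat out of the bag", "She revealed the secret")
print(x_a.shape, x_b.shape)   # (1536,), (768,)
```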

SLIDE 79

Idiom Paraphrase – Vector Similarity

Pipeline: Paraphrase List -> Find Context Sentences -> Original Sentences, Literal Paraphrases, Literal Non-paraphrases -> Vector 1, Vector 2, Vector 3 -> Compare distances between output vectors

SLIDE 80

Vector Similarity – Details

  • Choice of non-paraphrases is important – lexical overlap

  Original: “She let the cat out of the bag”
  Paraphrase: “She revealed the secret”
  Non-paraphrase (with lexical overlap): “The cat jumped out of the bag”

SLIDE 81

Vector Similarity – Variations

  • Which vectors to look at?
  • Representation of the full sentence, or selected words?
  • “cat” vs. “secret”
  • Which layer(s) of BERT?
  • What comparison measure to use? (A cosine-similarity sketch follows below.)
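A minimal sketch of the vector-similarity probe: compare an idiom sentence against its paraphrase and a lexically overlapping non-paraphrase with cosine similarity. The layer choice, mean-pooling, and checkpoint are illustrative assumptions, not the group's final design.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

def sentence_vector(sentence, layer=-1):
    """Mean-pooled token vectors from one BERT layer (one of many possible choices)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).hidden_states[layer]
    return hidden.mean(dim=1).squeeze(0)

original       = sentence_vector("She let the cat out of the bag")
paraphrase     = sentence_vector("She revealed the secret")
non_paraphrase = sentence_vector("The cat jumped out of the bag")

# If BERT captures the idiomatic meaning, the first similarity should be the larger one.
print(F.cosine_similarity(original, paraphrase, dim=0).item())
print(F.cosine_similarity(original, non_paraphrase, dim=0).item())
```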
SLIDE 82

Summary – Idiom Paraphrase Experiment

  • Two Tasks:
  • Classifier: given two sentences, where one contains an idiom, classify as paraphrase or not paraphrase.
  • Vector Similarity: gather BERT representations for variations on sentences with idioms (true paraphrases, false paraphrases); compare the distances between these vectors.
  • Look for trends in results: does vector similarity relate to classifier success?

SLIDE 83

Questions?


SLIDE 84

Thanks!

Group 1: Josh Tanner, Paige Finkelstein, Wes Rose, Elena Khasanova, and Daniel Campos