Jointly Learning Word and Phrase Embeddings Using Neural Networks and Implicit Tensor Factorization - PowerPoint PPT Presentation

SLIDE 1

Jointly Learning Word and Phrase Embeddings Using Neural Networks and Implicit Tensor Factorization

Kazuma Hashimoto, Tsuruoka Laboratory, University of Tokyo

19/06/2015 Talk@UCL Machine Reading Lab.

SLIDE 2: Self Introduction
  • Name

– Kazuma Hashimoto (橋本 和真 in Japanese)
– http://www.logos.t.u-tokyo.ac.jp/~hassy/

  • Affiliation

– Tsuruoka Laboratory, University of Tokyo

  • April 2015 – present: Ph.D. student
  • April 2013 – March 2015: Master’s student

– National Centre for Text Mining (NaCTeM)

  • Research Interest

– Word/phrase/document embeddings and their applications

SLIDE 3: Today’s Agenda
  • 1. Background

– Word and Phrase Embeddings

  • 2. Jointly Learning Word and Phrase Embeddings

– General Idea

  • 3. Our Methods Focusing on Transitive Verb Phrases

– Word Prediction (EMNLP 2014)
– Implicit Tensor Factorization (CVSC 2015)

  • 4. Experiments and Results
  • 5. Summary

SLIDE 4: Today’s Agenda
  • 1. Background

– Word and Phrase Embeddings

  • 2. Jointly Learning Word and Phrase Embeddings

– General Idea

  • 3. Our Methods Focusing on Transitive Verb Phrases

– Word Prediction (EMNLP 2014)
– Implicit Tensor Factorization (CVSC 2015)

  • 4. Experiments and Results
  • 5. Summary

SLIDE 5: Assigning Vectors to Words
  • Word: String → Index → Vector
  • Why vectors?

– Word similarities can be measured using distance metrics on the vectors (e.g., cosine similarity)

[Figure: embedding words in a vector space; related words (“animal”, “mouse”, “rat”; “disease”, “disorder”; “trigger”, “cause”) cluster together]
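As a minimal sketch (not from the slides), word similarity via the cosine of embedding vectors looks like this; the 4-dimensional toy vectors are made-up values for illustration only:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy embeddings, for illustration only.
embeddings = {
    "mouse": np.array([0.9, 0.8, 0.1, 0.0]),
    "rat":   np.array([0.8, 0.9, 0.2, 0.1]),
    "cause": np.array([0.1, 0.0, 0.9, 0.8]),
}

print(cosine_similarity(embeddings["mouse"], embeddings["rat"]))    # high: similar words
print(cosine_similarity(embeddings["mouse"], embeddings["cause"]))  # low: unrelated words
```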

SLIDE 6: Approaches to Word Representations
  • Two approaches using large corpora (see Baroni+ (2014) for a systematic comparison):

– Count-based approach

  • e.g., reducing the dimension of a word co-occurrence matrix using SVD

– Prediction-based approach

  • e.g., predicting words from their contexts using neural networks

  • We focus on the prediction-based approach

– Why?

SLIDE 7: Learning Word Embeddings
  • Prediction-based approaches usually

– parameterize the word embeddings
– learn them based on co-occurrence statistics

  • Embeddings of words appearing in similar contexts get close to each other

[Figure: the SkipGram model (Mikolov+, 2013) in word2vec; given the text data “… the prevalence of drunken driving and accidents caused by drinking …”, context words are predicted using the target word’s embedding]
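A minimal sketch of one SkipGram-with-negative-sampling update, in the spirit of word2vec; the toy vocabulary, dimensionality, and learning rate below are assumptions, not values from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "prevalence", "of", "drunken", "driving",
         "and", "accidents", "caused", "by", "drinking"]
dim = 50
W_in = rng.normal(scale=0.1, size=(len(vocab), dim))   # target-word embeddings
W_out = rng.normal(scale=0.1, size=(len(vocab), dim))  # context-word embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_update(target, context, negatives, lr=0.025):
    """Push the true context word's score up, sampled negatives' scores down."""
    t = W_in[target]
    grad_t = np.zeros_like(t)
    for c, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        g = sigmoid(t @ W_out[c]) - label  # d(logistic loss)/d(score)
        grad_t += g * W_out[c]
        W_out[c] -= lr * g * t
    W_in[target] -= lr * grad_t

# e.g., predict the context word "drunken" from the target word "driving".
sgns_update(vocab.index("driving"), vocab.index("drunken"),
            negatives=rng.integers(0, len(vocab), size=5))
```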

SLIDE 8: Task-Oriented Word Embeddings
  • Learning word embeddings for relation classification

– To appear at CoNLL 2015 (just advertising)

SLIDE 9: Beyond Word Embeddings
  • Treating phrases and sentences as well as words

– gaining much attention recently!

[Figure: embedding phrases in a vector space; “make payment” and “pay money” lie close together]

SLIDE 10: Approaches to Phrase Embeddings
  • Element-wise addition/multiplication (Lapata+, 2010)

– w(sentence) = Σ_j w(x_j)

  • Recursive autoencoders (Socher+, 2011; Hermann+, 2013)

– Using parse trees
– w(parent) = g(w(left child), w(right child))

  • Tensor/matrix-based methods

– w(adj noun) = M(adj) w(noun) (Baroni+, 2010)
– M(verb) = Σ_i w(subj_i) w(obj_i)ᵀ, summed over the verb’s subject-object pairs in a corpus (Grefenstette+, 2011)

  • M(subj, verb, obj) = (w(subj) w(obj)ᵀ) ⊙ M(verb)
  • w(subj, verb, obj) = w(subj) ⊙ (M(verb) w(obj)) (Kartsaklis+, 2012)
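A small sketch contrasting two of the composition methods above; the vectors, the verb matrix, and the dimensionality are random placeholders, not learned parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4  # tiny dimensionality, for illustration only

w_subj, w_obj = rng.normal(size=dim), rng.normal(size=dim)
M_verb = rng.normal(size=(dim, dim))  # one matrix per transitive verb

# Element-wise addition (Lapata+, 2010): w(phrase) = sum of word vectors
additive = w_subj + w_obj

# Copy-subject (Kartsaklis+, 2012): w(svo) = w(subj) ⊙ (M(verb) w(obj))
copy_subject = w_subj * (M_verb @ w_obj)
```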

SLIDE 11: Which Word Embeddings are the Best?
  • Co-occurrence matrix + SVD
  • C&W (Collobert+, 2011)
  • RNNLM (Mikolov+, 2013)
  • SkipGram/CBOW (Mikolov+, 2013)
  • vLBL/ivLBL (Mnih+, 2013)
  • Dependency-based SkipGram (Levy+, 2014)
  • GloVe (Pennington+, 2014)

Which word embeddings should we use for which composition methods?

→ Joint learning

SLIDE 12: Today’s Agenda
  • 1. Background

– Word and Phrase Embeddings

  • 2. Jointly Learning Word and Phrase Embeddings

– General Idea

  • 3. Our Methods Focusing on Transitive Verb Phrases

– Word Prediction (EMNLP 2014)
– Implicit Tensor Factorization (CVSC 2015)

  • 4. Experiments and Results
  • 5. Summary

SLIDE 13: Co-Occurrence Statistics of Phrases
  • Word co-occurrence statistics → word embeddings
  • How about phrase embeddings?

– Phrase co-occurrence statistics!

Example:

“The importer made payment in his own domestic currency”
“The businessman pays his monthly fee in yen”

Similar contexts → similar meanings?

SLIDE 14: How to Identify Phrase-Word Relations?
  • Using Predicate-Argument Structures (PAS)

– Enju parser (Miyao+, 2008)

  • Analyzes relations between phrases and words

[Figure: PAS analysis of “The importer made payment in his own domestic currency”; the verb and the preposition act as predicates, and the NPs/VP are their arguments]

SLIDE 15: Today’s Agenda
  • 1. Background

– Word and Phrase Embeddings

  • 2. Jointly Learning Word and Phrase Embeddings

– General Idea

  • 3. Our Methods Focusing on Transitive Verb Phrases

– Word Prediction (EMNLP 2014)
– Implicit Tensor Factorization (CVSC 2015)

  • 4. Experiments and Results
  • 5. Summary

SLIDE 16: Why Transitive Verb Phrases?
  • Meanings of transitive verbs are affected by their arguments (e.g., run, make)

→ A good target for testing composition models

[Figure: “make” changes meaning with its object: “make payment” ≈ “pay”, “make money” ≈ “earn”, “make use (of)” ≈ “use”]

SLIDE 17: Possible Application: Semantic Search
  • Embedding subject-verb-object tuples in a vector space

– Semantic similarities between SVO tuples can be used!

SLIDE 18: Training Data from Large Corpora
  • Focusing on the role of prepositional adjuncts

– Prepositional adjuncts complement the meanings of verb phrases → they should be useful

[Figure: large corpora (English Wikipedia, BNC, etc.) are parsed and the parses are simplified into predicate-argument training data]

How to model the relationships between predicates and arguments?

SLIDE 19: Today’s Agenda
  • 1. Background

– Word and Phrase Embeddings

  • 2. Jointly Learning Word and Phrase Embeddings

– General Idea

  • 3. Our Methods Focusing on Transitive Verb Phrases

– Word Prediction (EMNLP 2014)
– Implicit Tensor Factorization (CVSC 2015)

  • 4. Experiments and Results
  • 5. Summary

SLIDE 20: Word Prediction Model (like word2vec)
  • Predicting words in predicate-argument tuples

[Figure: for the tuple “[importer make payment] in currency”, the model predicts the word “currency” and penalizes a negative sample such as “furniture”]

q = tanh(h_arg1^prep ⊙ w_arg1 + h_pred^prep ⊙ w_pred)

Here w_pred and w_arg1 are the embeddings of the predicate and its argument, the h^prep vectors are role-specific weights conditioned on the preposition, and q is the feature vector for the word prediction. The cost function is a ranking loss, e.g., max(0, 1 - s(currency) + s(furniture)). This model is the PAS-CLBLM.
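A hedged sketch of the reconstructed scoring and cost above; the dot-product scorer s(·) over output word embeddings is an assumption, and all values are random placeholders rather than trained parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 8

# Embeddings of the predicate and its first argument.
w_pred, w_arg1 = rng.normal(size=dim), rng.normal(size=dim)
# Role-specific weight vectors conditioned on the preposition "in".
h_pred_in, h_arg1_in = rng.normal(size=dim), rng.normal(size=dim)
# Output embeddings of candidate words (assumed dot-product scorer).
w_currency, w_furniture = rng.normal(size=dim), rng.normal(size=dim)

# q = tanh(h_arg1^prep ⊙ w_arg1 + h_pred^prep ⊙ w_pred)
q = np.tanh(h_arg1_in * w_arg1 + h_pred_in * w_pred)

def s(w_out):
    return q @ w_out  # plausibility of predicting this word

# Ranking cost: max(0, 1 - s(currency) + s(furniture))
cost = max(0.0, 1.0 - s(w_currency) + s(w_furniture))
```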

SLIDE 21: How to Compute SVO Embeddings?
  • Two methods:

– (a) assigning a vector to each SVO tuple
– (b) composing SVO embeddings

[Figure: (a) the whole tuple “[importer make payment]” has its own parameterized vector; (b) the tuple embedding is composed from the subj, verb, and obj vectors]
SLIDE 22: Today’s Agenda
  • 1. Background

– Word and Phrase Embeddings

  • 2. Jointly Learning Word and Phrase Embeddings

– General Idea

  • 3. Our Methods Focusing on Transitive Verb Phrases

– Word Prediction (EMNLP 2014)
– Implicit Tensor Factorization (CVSC 2015)

  • 4. Experiments and Results
  • 5. Summary

SLIDE 23: Weakness of PAS-CLBLM
  • Only element-wise vector operations

– Pros: fast training
– Cons: poor interaction between predicates and arguments

  • Interactions between predicates and arguments are important for transitive verbs

[Figure: “make payment” ≈ “pay”, “make money” ≈ “earn”, “make use (of)” ≈ “use”]

SLIDE 24: Focusing on Tensor-Based Approaches
  • Tensor/matrix-based approaches (noun: vector)

– Adjective: matrix (Baroni+, 2010)
– Transitive verb: matrix (Grefenstette+, 2011; Van de Cruys+, 2013)

[Figure: a third-order tensor of plausibility scores, e.g., PMI(importer, make, payment) = 0.31, is approximated with a verb matrix and given (pre-trained) subject and object vectors]

SLIDE 25: Implicit Tensor Factorization (1)
  • Parameterizing

– Predicate matrices
– Argument embeddings

[Figure: the same tensor picture, now over (predicate, argument 1, argument 2); nothing is given in advance: both the predicate matrices and the argument embeddings are learned]

SLIDE 26: Implicit Tensor Factorization (2)
  • Calculating plausibility scores

– Using predicate matrices & argument embeddings

[Figure: each tensor entry is a bilinear score computed from a predicate matrix and two argument embeddings]

T(i, j, k) = w(j)ᵀ M(i) w(k)

where M(i) is the matrix of predicate i, and w(j), w(k) are the embeddings of arguments j and k.
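A minimal sketch of the bilinear score above, with random placeholder parameters in place of learned ones:

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 8

M_in = rng.normal(size=(dim, dim))  # matrix of the predicate "in"
w_svo = rng.normal(size=dim)        # embedding of argument 1: "[importer make payment]"
w_currency = rng.normal(size=dim)   # embedding of argument 2: "currency"

def plausibility(M_pred, w_arg1, w_arg2):
    """Tensor entry T(i, j, k) = w(j)^T M(i) w(k)."""
    return float(w_arg1 @ M_pred @ w_arg2)

score = plausibility(M_in, w_svo, w_currency)
```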

SLIDE 27: Implicit Tensor Factorization (3)
  • Learning model parameters

– Using a plausibility judgment task

  • Observed tuple: (i, j, k)
  • Collapsed tuple: (i’, j, k), (i, j’, k), (i, j, k’)

– Negative sampling (Mikolov+, 2013)

[Equation: the cost function, following negative sampling, discriminates observed tuples from collapsed ones]
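Since the cost function itself is lost in extraction, here is one common negative-sampling objective consistent with the slide, assuming a logistic loss over the observed tuple and its collapsed variants; the exact form in the paper may differ:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_cost(score_observed, scores_collapsed):
    """Assumed logistic form: raise the plausibility of the observed tuple
    (i, j, k) and lower that of collapsed tuples (i', j, k), (i, j', k), (i, j, k')."""
    cost = -np.log(sigmoid(score_observed))
    for s_neg in scores_collapsed:
        cost -= np.log(sigmoid(-s_neg))
    return float(cost)
```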

SLIDE 28: Example
  • Discriminating between observed and collapsed ones

(i, j, k) = (in, importer make payment, currency)
(i’, j, k) = (on, importer make payment, currency)
(i, j’, k) = (in, child eat pizza, currency)
(i, j, k’) = (in, importer make payment, furniture)

SLIDE 29: How to Compute SVO Embeddings?
  • Two methods:

– (a) assigning a vector to each SVO tuple
– (b) composing SVO embeddings

[Figure: as before, (a) a parameterized vector per tuple versus (b) composed vectors; here the composition uses parameterized verb matrices and the copy-subject function (Kartsaklis+, 2012)]

SLIDE 30: Why the Copy-Subject Function?
  • The function is presented in Kartsaklis+ (2012)

– Using verb matrices as in Grefenstette+ (2011)

  • Our verb matrices are related to those of Grefenstette+ (2011)
  • The function can compute both (see the sketch below)

– verb-object phrase embeddings
– subject-verb-object phrase embeddings
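A sketch of both uses of the copy-subject function, with random placeholder parameters standing in for the learned verb matrix and word vectors:

```python
import numpy as np

rng = np.random.default_rng(4)
dim = 8
M_make = rng.normal(size=(dim, dim))  # learned matrix for the verb "make"
w_importer, w_payment = rng.normal(size=dim), rng.normal(size=dim)

# Verb-object phrase embedding: w(vo) = M(verb) w(obj)
w_make_payment = M_make @ w_payment

# Subject-verb-object phrase embedding (copy-subject):
# w(svo) = w(subj) ⊙ (M(verb) w(obj))
w_importer_make_payment = w_importer * w_make_payment
```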

SLIDE 31: Today’s Agenda
  • 1. Background

– Word and Phrase Embeddings

  • 2. Jointly Learning Word and Phrase Embeddings

– General Idea

  • 3. Our Methods Focusing on Transitive Verb Phrases

– Word Prediction (EMNLP 2014)
– Implicit Tensor Factorization (CVSC 2015)

  • 4. Experiments and Results
  • 5. Summary

SLIDE 32: Experimental Settings
  • Training corpus: English Wikipedia

– SVO data: 23.6 million instances
– SVO-preposition-noun data: 17.3 million instances

  • Parameter initialization: random values
  • Optimization: mini-batch AdaGrad (Duchi+, 2011; sketched below)
  • Embedding dimensionality

– PAS-CLBLM: 200
– Tensor method: 50

  • The number of model parameters in PAS-CLBLM is slightly larger than in the tensor method
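A minimal sketch of one AdaGrad step (Duchi+, 2011); the learning rate and epsilon values are assumptions, not the ones used in the experiments:

```python
import numpy as np

def adagrad_update(param, grad, sum_sq, lr=0.05, eps=1e-8):
    """Per-coordinate learning rates shrink with the accumulated squared gradients."""
    sum_sq += grad ** 2                              # running sum of squared gradients
    param -= lr * grad / (np.sqrt(sum_sq) + eps)     # scaled gradient step
    return param, sum_sq
```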

SLIDE 33: Examples of Learned SVO Embeddings
  • Case 1: assigning a vector to each SVO tuple

Adjuncts seem to be helpful in learning the meanings of verb phrases.
However, this approach omits the information about individual words!

SLIDE 34: Examples of Learned SVO Embeddings
  • Case 2: composing SVO embeddings

[Figure: composed SVO embeddings. Tensor (CVSC 2015): more flexible! PAS-CLBLM (EMNLP 2014): strongly enhances the head word]

SLIDE 35: Multiple Meanings in Verb Matrices
  • In the latest approach, the learned verb matrices capture multiple meanings

SLIDE 36: Verb Sense Disambiguation Task
  • Measuring semantic similarities of verb pairs taking the same subjects and objects (Grefenstette+, 2011)

– Evaluation: Spearman’s rank correlation between similarity scores and human ratings

Verb pair with subj&obj | Human rating
student write name / student spell name | 7
child show sign / child express sign | 6
system meet criterion / system visit criterion | 1
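A sketch of the evaluation; the model similarity scores are made-up numbers paired with the human ratings from the table above:

```python
from scipy.stats import spearmanr

# Hypothetical model similarities for the three verb pairs in the table.
model_scores = [0.81, 0.74, 0.12]
human_ratings = [7, 6, 1]

rho, _ = spearmanr(model_scores, human_ratings)
print(f"Spearman's rho = {rho:.3f}")
```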

SLIDE 37: Results
  • State-of-the-art results on the disambiguation task

– Prepositional adjuncts improve the results

  • How about other kinds of adjuncts?

Method | Spearman’s ρ
Tensor (only verb data) | 0.480
Tensor (verb and preposition data) | 0.614
PAS-CLBLM (this experiment) | 0.374
Milajevs+, 2014 | 0.456
Hashimoto+, 2014 | 0.422

Future work: improving real-world applications using the method

SLIDE 38: Today’s Agenda
  • 1. Background

– Word and Phrase Embeddings

  • 2. Jointly Learning Word and Phrase Embeddings

– General Idea

  • 3. Our Methods Focusing on Transitive Verb Phrases

– Word Prediction (EMNLP 2014)
– Implicit Tensor Factorization (CVSC 2015)

  • 4. Experiments and Results
  • 5. Summary

SLIDE 39: Summary
  • Word and phrase embeddings are jointly learned using large corpora parsed by syntactic parsers

– The tensor-based method is suitable for verb sense disambiguation
– Adjuncts are useful in learning verb phrases

  • Future directions:

– improving the embedding methods
– applying them to real-world NLP applications

  • What kind of information should be captured?
