Word Embeddings (CS 6956: Deep Learning for NLP)

SLIDE 1

CS 6956: Deep Learning for NLP

Word Embeddings

SLIDE 2

Overview

  • Representing meaning
  • Word embeddings: Early work
  • Word embeddings via language models
  • Word2vec and GloVe
  • Evaluating embeddings
  • Design choices and open questions




SLIDE 7

Representing meaning


cat, dog, tiger, table

What do words mean? How do they get their meaning?

Perhaps more pertinent for modeling language: How can we represent the meaning of words in a form that is computationally flexible?


SLIDE 10

Words are atomic symbols

The strings cat, tiger, dog, and table are different from each other. If we systematically replace all words with unique identifiers, does their meaning change?

Think about substituting cat with uniq-id-1, table with uniq-id-53, … As long as we are consistent in our substitution, sentence meaning would not be harmed.

So how do we represent word meaning in a way that is grounded in the way words are used by everyone? Various perspectives exist.
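The substitution argument can be sketched in code. Mapping word types to opaque ids (the ids below are made up for illustration) preserves which positions share a word, which is all the symbols by themselves tell us:

```python
# Sketch: consistently replacing word types with opaque ids preserves the
# structure of a sentence (which positions share a word type), even though
# the symbols themselves are arbitrary. The id scheme is hypothetical.
sent = "the cat sat on the table".split()

ids = {}
encoded = [ids.setdefault(w, f"uniq-id-{len(ids)}") for w in sent]

print(encoded)
# Positions 0 and 4 still share a symbol, just as the two "the" tokens did
print(encoded[0] == encoded[4])  # True
```

Nothing about the ids themselves encodes meaning; only the pattern of reuse survives, which motivates grounding meaning in usage instead.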


SLIDE 13

An ontology: E.g., WordNet


The meaning of words: Perspective 0

Synonyms/hypernyms (ordered by estimated frequency) of noun cat (8 senses):

  • Sense 1: cat, true cat => feline, felid
  • Sense 2: guy, cat, hombre, bozo => man, adult male
  • Sense 3: Cat => gossip, gossiper, gossipmonger, rumormonger, rumourmonger, newsmonger
  • Sense 4: kat, khat, qat, quat, cat, Arabian tea, African tea => stimulant, stimulant drug, excitant
  • Sense 5: cat-o'-nine-tails, cat => whip
  • Sense 6: Caterpillar, cat => tracked vehicle
  • Sense 7: big cat, cat => feline, felid
  • Sense 8: computerized tomography, computed tomography, CT, computerized axial tomography, computed axial tomography, CAT => X-raying, X-radiation

Such a taxonomy shows hypernymy relationships between words

  • A high precision resource
    – Typically manually built
    – Hard to keep up-to-date: new words enter our lexicon, and words change meaning over time
  • Does not necessarily reflect how words are used in real life
    – Perhaps related to the previous concern
  • Various methods exist for computing similarities between words using such an ontology
    – E.g., using distances in the hypernym hierarchy, such as the Wu & Palmer similarity measure
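The Wu & Palmer measure can be sketched on a toy hypernym tree (a hypothetical mini-taxonomy invented here, not real WordNet): similarity is twice the depth of the lowest common subsumer (LCS) divided by the sum of the two words' depths.

```python
# Wu & Palmer similarity on a toy hypernym tree:
#   wup(a, b) = 2 * depth(LCS(a, b)) / (depth(a) + depth(b))
# The taxonomy below is hypothetical, chosen to mirror the slide's examples.

PARENT = {  # child -> hypernym
    "animal": "entity", "artifact": "entity",
    "feline": "animal", "canine": "animal", "furniture": "artifact",
    "cat": "feline", "tiger": "feline", "dog": "canine", "table": "furniture",
}

def path_to_root(word):
    """Return [word, ..., root], walking up the hypernym chain."""
    path = [word]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def wup_similarity(a, b):
    pa, pb = path_to_root(a), path_to_root(b)
    depth = lambda w: len(path_to_root(w))  # the root has depth 1
    lcs = next(n for n in pa if n in set(pb))  # lowest common subsumer
    return 2 * depth(lcs) / (depth(a) + depth(b))

print(wup_similarity("cat", "tiger"))  # 0.75: share the close hypernym "feline"
print(wup_similarity("cat", "dog"))    # 0.5:  only share "animal"
print(wup_similarity("cat", "table"))  # 0.25: only share the root "entity"
```

Note that the ordering cat~tiger > cat~dog > cat~table falls out of the tree structure alone, which is exactly what such ontology-based measures exploit.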


SLIDE 17

The distributional hypothesis

Words that occur in the same context have similar meanings

– Zelig Harris, J. R. Firth
– Firth (1957): “You shall know a word by the company it keeps”

  • The key idea: To characterize the meaning of a word, we need to characterize the distribution of its context

  • What context?

– Commonly interpreted as neighboring words in text
– Could be syntactic/semantic/discourse/pragmatic/… context


The meaning of words: Perspective 1

Example contexts of the word day:
– John sleeps during the day and works at night
– Mary starts her day with a cup of coffee
– He starts his day with an angry look at his inbox

SLIDE 18

The distributional hypothesis

Words that occur in the same context have similar meanings

– Zelig Harris, J. R. Firth
– Firth (1957): “You shall know a word by the company it keeps”

  • The key idea: To characterize the meaning of a word, we need to characterize the distribution of its context

  • What context?

Commonly interpreted as neighboring words in text, but could be syntactic/semantic/discourse/pragmatic/… context.


The meaning of words: Perspective 1

We will see more about context soon.
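A minimal sketch of the distributional idea, taking neighboring words as the context (the toy corpus and window size below are illustrative assumptions): each word is represented by counts of the words that appear near it.

```python
# Represent each word by the distribution of its neighbors within a
# +/- 2 word window. Corpus and window size are toy choices for illustration.
from collections import Counter, defaultdict

corpus = [
    "john sleeps during the day and works at night".split(),
    "mary starts her day with a cup of coffee".split(),
]

window = 2
context_counts = defaultdict(Counter)
for sent in corpus:
    for i, word in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                context_counts[word][sent[j]] += 1

# "day" is now characterized by the company it keeps
print(context_counts["day"].most_common())
```

Counting is the simplest instantiation; the embedding methods discussed later can be seen as learned, compressed versions of such context distributions.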

SLIDE 19

Symbolic vs. Distributed representations

  • The words cat, tiger, dog and table are symbols
  • Just knowing the symbols does not tell us anything about what they mean. For example:

1. Cats and tigers are conceptually closer to each other than to dogs or tables
2. Cats, tigers, and dogs are closer to each other than to tables

  • What we need: A representation scheme that inherently captures similarities between similar objects


The meaning of words: Perspective 2


SLIDE 21

Symbolic vs. Distributed representations

For example: Think about feature representations


The meaning of words: Perspective 2

One-hot vectors for Cat, Dog, Tiger, and Table do not capture inherent similarities: distances or dot products between any pair of words are all equal.
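This can be checked directly. With one-hot vectors for a hypothetical four-word vocabulary, every distinct pair has dot product 0 and Euclidean distance sqrt(2), so the representation says nothing about which words are alike:

```python
# One-hot vectors for a toy 4-word vocabulary: every distinct pair of words
# has the same dot product (0) and the same Euclidean distance (sqrt(2)).
import numpy as np

vocab = ["cat", "dog", "tiger", "table"]
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

print(one_hot["cat"] @ one_hot["tiger"])  # 0.0, same as cat-table
print(one_hot["cat"] @ one_hot["table"])  # 0.0
print(np.linalg.norm(one_hot["cat"] - one_hot["dog"]))  # sqrt(2) for any pair
```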

SLIDE 22

Symbolic vs. Distributed representations

Distributed representations capture similarities better

– Think of them as vector-valued representations that can coalesce superficially distinct objects


The meaning of words: Perspective 2

Dense (often lower dimensional) vector representations of Cat, Dog, Tiger, and Table can capture similarities better.
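A sketch with made-up dense vectors (the numbers and the 3-d space are invented for illustration, not learned embeddings): unlike one-hot vectors, cosine similarity now reflects the intended conceptual closeness.

```python
# Hypothetical 3-d dense vectors; the coordinates are hand-picked so that
# similar words get similar vectors, as learned embeddings would.
import numpy as np

emb = {
    "cat":   np.array([0.9, 0.8, 0.0]),
    "tiger": np.array([0.9, 0.1, 0.0]),
    "dog":   np.array([0.1, 0.8, 0.0]),
    "table": np.array([0.0, 0.1, 0.9]),
}

def cos(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# cat is closest to tiger, then dog, then table
print(cos(emb["cat"], emb["tiger"]) > cos(emb["cat"], emb["dog"]))  # True
print(cos(emb["cat"], emb["dog"]) > cos(emb["cat"], emb["table"]))  # True
```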

SLIDE 23

Word embeddings (or word vectors)

A mapping from words to a vector space

– Could be a fixed mapping that is context independent (word2vec, GloVe, etc.)

We will see these very soon

– Could be a parameterized mapping that is context dependent (ELMo, BERT, etc)

We will see these later in the semester

A first step in any neural network model for textual inputs

– First, convert words to vectors, then attend to the task you want to solve
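As a sketch, a context-independent embedding is just a lookup table: a word id selects a row of a matrix. Here the matrix is random, whereas in practice it is learned; the vocabulary and dimension 50 are illustrative assumptions:

```python
# Sketch of the "first step": embed each word by looking up a row of a
# |V| x d matrix. The matrix below is random for illustration; in a real
# model it is a learned parameter.
import numpy as np

vocab = {"cat": 0, "dog": 1, "table": 2}  # toy vocabulary
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 50))     # |V| x d embedding matrix

def embed(word):
    """Map a word to its d-dimensional vector."""
    return E[vocab[word]]

print(embed("cat").shape)  # (50,)
```

The rest of the network then operates on these vectors rather than on the original symbols.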


SLIDE 24

Perspectives on word embeddings

1. They capture distributional semantics: Embeddings are low-dimensional vectors constructed by appealing to the distributional hypothesis.
2. They are distributed representations of words: The embedding dimensions represent underlying aspects of meaning, and words are characterized by membership in these latent dimensions.
3. They provide features: Word embeddings are widely used, convenient learned feature representations.
