Word, Sense and Contextualized Embeddings: Vector Representations of Meaning in NLP


SLIDE 1

Word, Sense and Contextualized Embeddings: Vector Representations of Meaning in NLP

Jose Camacho-Collados

Cardiff University, 18 March 2019

SLIDE 2

Outline

❖ Background
  ➢ Vector Space Models (word embeddings)
  ➢ Lexical resources
❖ Sense representations
  ➢ Knowledge-based: NASARI, SW2V
  ➢ Contextualized: ELMo, BERT
❖ Applications

SLIDE 3

Word vector space models

Words are represented as vectors: semantically similar words are close in the vector space.

SLIDE 4

Neural networks learn word vector representations from text corpora -> word embeddings
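Concretely, this is the word2vec family of models; a minimal sketch (toy corpus; gensim assumed as the implementation):

```python
# Minimal sketch: learning word embeddings from a (toy) corpus with gensim.
from gensim.models import Word2Vec

corpus = [
    ["he", "withdrew", "money", "from", "the", "bank"],
    ["the", "bank", "remained", "closed", "yesterday"],
    ["a", "nice", "spot", "by", "the", "bank", "of", "the", "river"],
]

# skip-gram (sg=1), 100-dimensional vectors; min_count=1 only because the corpus is tiny
model = Word2Vec(corpus, vector_size=100, window=3, sg=1, min_count=1)

# semantically similar words should end up close in the vector space
print(model.wv.most_similar("bank", topn=3))
```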

SLIDE 5

Why word embeddings?

Embedded vector representations:

  • are compact and fast to compute
  • preserve important relational information between words (actually, meanings)
  • are geared towards general use

SLIDE 6

Applications for word representations

  • Syntactic parsing (Weiss et al. 2015)
  • Named Entity Recognition (Guo et al. 2014)
  • Question Answering (Bordes et al. 2014)
  • Machine Translation (Zou et al. 2013)
  • Sentiment Analysis (Socher et al. 2013)

… and many more!

SLIDE 7

AI goal: language understanding

SLIDE 8

Limitations of word embeddings

  • Word representations cannot capture ambiguity. For instance, bank.

SLIDES 9–11

Problem 1: word representations cannot capture ambiguity

SLIDE 12

Word representations and the triangular inequality

Example from Neelakantan et al. (2014): a single vector for plant, with pollen and refinery as neighbors.

SLIDE 13

Word representations and the triangular inequality

Example from Neelakantan et al. (2014): two sense vectors, plant1 (near pollen) and plant2 (near refinery).
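The figure's point can be made concrete: if similarity comes from distance in a single space, the triangle inequality forces d(pollen, refinery) <= d(pollen, plant) + d(plant, refinery), so one vector for the ambiguous word plant drags two unrelated words together. A toy numeric sketch (all values invented):

```python
# Toy illustration (invented 2-D vectors) of the triangle-inequality problem.
import numpy as np

def dist(a, b):
    return np.linalg.norm(a - b)

plant    = np.array([0.5, 0.5])   # one vector squeezed between two senses
pollen   = np.array([1.0, 0.0])   # relates to the living-organism sense
refinery = np.array([0.0, 1.0])   # relates to the industrial sense

print(dist(pollen, plant), dist(refinery, plant))  # both ~0.71: plant is "close" to each
print(dist(pollen, refinery))                      # <= their sum, so at most ~1.41

# With one sense vector per meaning, the coupling disappears:
plant1, plant2 = np.array([0.9, 0.1]), np.array([0.1, 0.9])
print(dist(pollen, plant1), dist(refinery, plant2))  # each sense near its own neighborhood
```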

SLIDE 14

Limitations of word representations

  • They cannot capture ambiguity. For instance, bank.
  • They neglect rare senses and infrequent words.
  • Word representations do not exploit knowledge from existing lexical resources.

SLIDE 15

Motivation: Model senses instead

If only words:

He withdrew money from the bank.

SLIDES 16–17

Motivation: Model senses instead

If only words:

He withdrew money from the bank.

bank#1 bank#2

SLIDE 18

NASARI: a Novel Approach to a Semantically-Aware Representation of Items

http://lcl.uniroma1.it/nasari/

SLIDE 19

Key goal: obtain sense representations

SLIDE 20

Key goal: obtain sense representations

We want to create a separate representation for each entry of a given word.

SLIDE 21

Idea

Encyclopedic knowledge + Lexicographic knowledge (WordNet)

SLIDE 22

Idea

Encyclopedic knowledge + Lexicographic knowledge (WordNet) + Information from text corpora

SLIDE 23

WordNet

SLIDE 24

WordNet

Main unit: synset (concept)

  • electronic device: {television, telly, television set, tv, tube, tv set, idiot box, boob tube, goggle box}
  • the middle of the day: {noon, twelve noon, high noon, midday, noonday, noontide}

Each synset groups the word senses that share one meaning.
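Synsets like the two above can be inspected directly with NLTK's WordNet interface; a minimal sketch (assumes nltk with the wordnet corpus downloaded):

```python
# Minimal sketch: listing the synsets (and glosses) a word belongs to.
from nltk.corpus import wordnet as wn

for synset in wn.synsets("television"):
    print(synset.name(), "->", synset.definition())

# a synset groups the word senses (lemmas) that share one meaning
noon = wn.synsets("midday")[0]
print(noon.lemma_names())  # e.g. ['noon', 'twelve_noon', 'high_noon', 'midday', ...]
```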

SLIDE 25

WordNet semantic relations

Centered on the synset {plant, flora, plant life}: "(botany) a living organism lacking the power of locomotion"

  • Hypernymy (is-a): {organism, being} ("a living thing that has (or can develop) the ability to act or function independently")
  • Hyponymy (has-kind): {houseplant} ("any of a variety of plants grown indoors for decorative purposes")
  • Meronymy (part-of): {hood, cap} ("a protective covering that is part of a plant")
  • Domain: {botany} ("the branch of biology that studies plants")
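The same interface exposes these relations; a minimal sketch centered on the {plant, flora, plant life} synset (the identifier plant.n.02 is an assumption about the installed WordNet version):

```python
# Minimal sketch: navigating WordNet relations from one synset.
from nltk.corpus import wordnet as wn

plant = wn.synset("plant.n.02")   # assumed id for "(botany) a living organism ..."
print(plant.hypernyms())          # is-a, e.g. [Synset('organism.n.01')]
print(plant.hyponyms()[:5])       # has-kind, e.g. houseplant
print(plant.part_meronyms())      # part-of, e.g. the hood/cap covering
```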

SLIDE 26

Knowledge-based Representations (WordNet)

  • X. Chen, Z. Liu, M. Sun: A Unified Model for Word Sense Representation and Disambiguation (EMNLP 2014)
  • S. Rothe and H. Schütze: AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes (ACL 2015)
  • M. Faruqui, J. Dodge, S. K. Jauhar, C. Dyer, E. Hovy, N. A. Smith: Retrofitting Word Vectors to Semantic Lexicons (NAACL 2015)
  • S. K. Jauhar, C. Dyer, E. Hovy: Ontologically Grounded Multi-sense Representation Learning for Semantic Vector Space Models (NAACL 2015)
  • M. T. Pilehvar and N. Collier: De-Conflated Semantic Representations (EMNLP 2016)

SLIDE 27

Wikipedia

SLIDE 28

Wikipedia

High coverage of named entities and specialized concepts from different domains.

SLIDES 29–30

Wikipedia hyperlinks

SLIDE 31

Thanks to an automatic mapping algorithm, BabelNet integrates Wikipedia and WordNet, among other resources (Wiktionary, OmegaWiki, Wikidata…).

Key feature: multilinguality (271 languages).

SLIDE 32

BabelNet

Concepts and entities

SLIDE 33

BabelNet

It follows the same structure as WordNet: synsets are the main units.

SLIDE 34

BabelNet

In this case, synsets are multilingual.

SLIDE 35

NASARI (Camacho-Collados et al., AIJ 2016)

Goal: build vector representations for multilingual BabelNet synsets.

How? We exploit the Wikipedia semantic network and the WordNet taxonomy to construct a subcorpus (contextual information) for any given BabelNet synset.

SLIDE 36

Pipeline

Process of obtaining contextual information for a BabelNet synset, exploiting the BabelNet taxonomy and Wikipedia as a semantic network.

SLIDES 37–38

Three types of vector representations

  • Lexical (dimensions are words)
  • Unified (dimensions are multilingual BabelNet synsets)
  • Embedded (latent dimensions)

SLIDE 39

Human-interpretable dimensions

plant (living organism): organism#1, table#3, tree#1, leaf#1, soil#2, carpet#2, food#2, garden#2, dictionary#3, refinery#1
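Such a vector can be pictured as a sparse mapping from interpretable dimensions to weights; the weights below are invented for illustration:

```python
# Toy sketch of a lexical (human-interpretable) vector: dimensions are
# word senses, values are relevance weights (all numbers invented).
plant_living_organism = {
    "organism#1": 0.92, "tree#1": 0.81, "leaf#1": 0.77,
    "soil#2": 0.64, "garden#2": 0.58, "food#2": 0.41,
}
tree = {"organism#1": 0.88, "tree#1": 0.95, "leaf#1": 0.83, "soil#2": 0.50}

def sparse_dot(u, v):
    # similarity computed directly over shared, interpretable dimensions
    return sum(w * v[d] for d, w in u.items() if d in v)

print(sparse_dot(plant_living_organism, tree))
```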

SLIDES 40–41

Three types of vector representations

  • Lexical (dimensions are words)
  • Unified (dimensions are multilingual BabelNet synsets)
  • Embedded: low-dimensional vectors exploiting word embeddings obtained from text corpora

Word and synset embeddings share the same vector space!

SLIDE 42

Embedded vector representation

Closest senses
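Because the embedded vectors share the space of word embeddings, "closest senses" are just nearest neighbors. A minimal sketch (the file name and sense-key format are hypothetical; assumes the embedded vectors from http://lcl.uniroma1.it/nasari/ saved in word2vec text format):

```python
# Hypothetical sketch: nearest-neighbor queries over NASARI embedded vectors.
from gensim.models import KeyedVectors

# file name and key format are assumptions, not the official release layout
vectors = KeyedVectors.load_word2vec_format("nasari_embedded_english.txt", binary=False)
print(vectors.most_similar("plant_(living_organism)", topn=5))
```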

SLIDES 43–44

SW2V (Mancini et al., CoNLL 2017)

A word is the surface form of a sense: we can exploit this intrinsic relationship for jointly training word and sense embeddings.

How? By updating the representations of the word and its associated senses interchangeably.

SLIDES 45–52

SW2V: Idea

Given as input a corpus and a semantic network:

1. Use the semantic network to link each word to its associated senses in context: He withdrew money from the bank.
2. Use a neural network where the updates of word and sense embeddings are linked, exploiting virtual connections.

[Figure: the context words He, withdrew, money, the, from and the target word bank; the training error is propagated to both word and sense embeddings through the virtual connections.]

SLIDE 53

SW2V: Idea

Given as input a corpus and a semantic network:

1. Use the semantic network to link each word to its associated senses in context.
2. Use a neural network where the updates of word and sense embeddings are linked, exploiting virtual connections.

In this way it is possible to learn word and sense/synset embeddings jointly in a single training run.

SLIDE 54

Full architecture of W2V (Mikolov et al., 2013)

E = -log p(w_t | W_t)

SLIDE 55

Full architecture of SW2V

Words and associated senses are used both as input and output (W_t: context words; S_t: their associated senses).

E = -log p(w_t | W_t, S_t) - Σ_{s ∈ S_t} log p(s | W_t, S_t)
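A toy numpy sketch of this joint objective (my own simplification, not the released SW2V code): the hidden state averages context word and sense embeddings, and the loss adds one term for the target word and one per associated sense:

```python
# Toy sketch (random toy parameters) of an SW2V-style joint objective.
import numpy as np

V_w, V_s, d = 100, 50, 16              # word vocab, sense vocab, dimension
W_in  = np.random.randn(V_w, d) * .1   # input word embeddings
S_in  = np.random.randn(V_s, d) * .1   # input sense embeddings
W_out = np.random.randn(V_w, d) * .1   # output word embeddings
S_out = np.random.randn(V_s, d) * .1   # output sense embeddings

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sw2v_loss(ctx_words, ctx_senses, target_word, target_senses):
    # hidden state: average of context word AND sense embeddings
    h = np.vstack([W_in[ctx_words], S_in[ctx_senses]]).mean(axis=0)
    # E = -log p(w_t | W_t, S_t) - sum_{s in S_t} log p(s | W_t, S_t)
    loss = -np.log(softmax(W_out @ h)[target_word])
    for s in target_senses:
        loss -= np.log(softmax(S_out @ h)[s])
    return loss

print(sw2v_loss([1, 2, 3], [0, 4], target_word=7, target_senses=[5]))
```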

SLIDE 56

Word and sense connectivity: example 1

Ten closest word and sense embeddings to the sense company (military unit)

SLIDE 57

Word and sense connectivity: example 2

Ten closest word and sense embeddings to the sense school (group of fish)

SLIDE 58

Applications of knowledge-based sense representations

  • Taxonomy Learning (Espinosa-Anke et al., EMNLP 2016)
  • Open Information Extraction (Delli Bovi et al., EMNLP 2015)
  • Lexical entailment (Nickel & Kiela, NIPS 2017)
  • Word Sense Disambiguation (Rothe & Schütze, ACL 2015)
  • Sentiment analysis (Flekova & Gurevych, ACL 2016)
  • Lexical substitution (Cocos et al., SENSE 2017)
  • Computer vision (Young et al., ICRA 2017)

...

SLIDE 59

Applications

❖ Domain labeling/adaptation
❖ Word Sense Disambiguation
❖ Downstream NLP applications (e.g. text classification)

SLIDE 60

Domain labeling (Camacho-Collados and Navigli, EACL 2017)

Annotate each concept/entity with its corresponding domain of knowledge. To this end, we use the Wikipedia featured articles page, which includes 34 domains and a number of Wikipedia pages associated with each domain (Biology, Geography, Mathematics, Music, etc.).

SLIDE 61

Domain labeling

Wikipedia featured articles

SLIDE 62

Domain labeling

How to associate a concept with a domain?

1. Learn a NASARI vector for the concatenation of all Wikipedia pages associated with a given domain.
2. Exploit the semantic similarity between knowledge-based vectors and graph properties of the lexical resources.
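Step 2 can be pictured as nearest-domain classification by cosine similarity; a toy sketch (vectors invented, threshold assumed, graph properties omitted):

```python
# Toy sketch: assign a concept the domain with the most similar vector.
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

domain_vectors = {            # assumed precomputed, one per domain (step 1)
    "Biology":     np.array([0.9, 0.1, 0.0]),
    "Mathematics": np.array([0.0, 0.2, 0.9]),
}

def label_domain(concept_vector, threshold=0.5):
    domain, score = max(
        ((d, cosine(concept_vector, v)) for d, v in domain_vectors.items()),
        key=lambda pair: pair[1],
    )
    return domain if score >= threshold else None  # abstain below threshold

print(label_domain(np.array([0.8, 0.2, 0.1])))  # -> Biology
```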

SLIDE 63

BabelDomains (Camacho-Collados and Navigli, EACL 2017)

As a result: a unified resource with information about domains of knowledge.

BabelDomains is available for BabelNet, Wikipedia and WordNet at http://lcl.uniroma1.it/babeldomains, and is already integrated into BabelNet (online interface and API).

SLIDE 64

BabelDomains

Example domains: Physics and astronomy, Computing, Media

SLIDE 65

Domain filtering for supervised distributional hypernym discovery
(Espinosa-Anke et al., EMNLP 2016; Camacho-Collados and Navigli, EACL 2017)

Task: given a term, predict its hypernym(s), e.g. apple is a -> fruit.
Model: distributional supervised system based on the transformation matrix of Mikolov et al. (2013).
Idea: training data filtered by domain of knowledge.
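The transformation-matrix model can be sketched as a least-squares linear map from term vectors to hypernym vectors (toy data; domain filtering would simply restrict the training pairs):

```python
# Toy sketch of the Mikolov et al. (2013) transformation-matrix idea:
# learn a linear map M sending term vectors to hypernym vectors.
import numpy as np

d = 50
rng = np.random.default_rng(0)
X = rng.standard_normal((200, d))          # term embeddings (training pairs)
M_true = rng.standard_normal((d, d)) * .1
Y = X @ M_true                             # their hypernyms' embeddings

# closed-form least squares: M = argmin ||X M - Y||^2
M, *_ = np.linalg.lstsq(X, Y, rcond=None)

# at test time: map a term vector, then nearest-neighbor search
apple = rng.standard_normal(d)             # stand-in for the "apple" vector
predicted_hypernym_vec = apple @ M         # search candidates near this point
```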

SLIDE 66

Domain filtering for supervised distributional hypernym discovery

Results on the hypernym discovery task for five domains (domain-filtered vs. non-filtered training data).

Conclusion: filtering training data by domain proves to be clearly beneficial.

SLIDES 67–69

Word Sense Disambiguation

Kobe, which is one of Japan's largest cities, [...]

[Figure: candidate senses of "Kobe"; the incorrect sense is marked with an X and the city sense is selected.]

SLIDE 70

Word Sense Disambiguation (Camacho-Collados et al., AIJ 2016)

Basic idea: select the sense which is semantically closest to the semantic representation of the whole document (global context).
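A toy sketch of this idea (vectors invented): represent the document, e.g. as the centroid of its word vectors, and pick the candidate sense with the highest cosine similarity:

```python
# Toy sketch: WSD as similarity between sense vectors and a document vector.
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def disambiguate(candidate_senses, document_vector):
    # candidate_senses: {sense_id: sense_vector} for the target word
    return max(candidate_senses,
               key=lambda s: cosine(candidate_senses[s], document_vector))

senses = {"bank#finance": np.array([0.9, 0.1]),
          "bank#river":   np.array([0.1, 0.9])}
doc = np.array([0.2, 0.8])            # e.g. centroid of the document's word vectors
print(disambiguate(senses, doc))      # -> bank#river
```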

SLIDE 71

Word Sense Disambiguation on textual definitions
(Camacho-Collados et al., LREC 2016; LREV 2018)

Combination of a graph-based disambiguation system (Babelfy) with NASARI to disambiguate the concepts and named entities of over 35M definitions in 256 languages.

Sense-annotated corpus freely available at http://lcl.uniroma1.it/disambiguated-glosses/

SLIDE 72

Context-rich WSD

Interchanging the positions of the king and a rook. -> castling (chess)

SLIDE 73

Context-rich WSD

Interchanging the positions of the king and a rook. -> castling (chess)

Additional English context:
  • Castling is a move in the game of chess involving a player's king and either of the player's original rooks.
  • A move in which the king moves two squares towards a rook, and the rook moves to the other side of the king.

SLIDES 74–75

Context-rich WSD

Interchanging the positions of the king and a rook. -> castling (chess)

Multilingual context (definitions of castling across languages):
  • English: Castling is a move in the game of chess involving a player's king and either of the player's original rooks.
  • English: A move in which the king moves two squares towards a rook, and the rook moves to the other side of the king.
  • French: Manœuvre du jeu d'échecs
  • German: Spielzug im Schach, bei dem König und Turm einer Farbe bewegt werden
  • Spanish: El enroque es un movimiento especial en el juego de ajedrez que involucra al rey y a una de las torres del jugador.
  • Czech: Rošáda je zvláštní tah v šachu, při kterém táhne zároveň král a věž.
  • Turkish: Rok İngilizce'de kaleye rook denmektedir.
  • Norwegian: Rokade er et spesialtrekk i sjakk.
  • Greek: Το ροκέ είναι μια ειδική κίνηση στο σκάκι που συμμετέχουν ο βασιλιάς και ένας από τους δυο πύργους.

SLIDE 76

Context-rich WSD exploiting parallel corpora (Delli Bovi et al., ACL 2017)

Applying the same method to provide high-quality sense annotations from parallel corpora (Europarl): 120M+ sense annotations for 21 languages. http://lcl.uniroma1.it/eurosense/

Extrinsic evaluation: improved performance of a standard supervised WSD system using this automatically sense-annotated corpus.

SLIDES 77–81

Towards a seamless integration of senses in downstream NLP applications (Pilehvar et al., ACL 2017)

Question: What if we apply WSD and inject sense embeddings into a standard neural classifier?

Problems:

  • WSD is not perfect -> Solution: high-confidence disambiguation
  • WordNet lacks coverage -> Solution: use of Wikipedia
slide-82
SLIDE 82

82

Tasks: Topic categorization and sentiment analysis (polarity detection)

Topic categorization: Given a text, assign it a topic (e.g. politics, sports, etc.). Polarity detection: Predict the sentiment of the sentence/review as either positive or negative.

slide-83
SLIDE 83

83

Classification model

Standard CNN classifier inspired by Kim (2014)
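A minimal PyTorch sketch of a Kim (2014)-style CNN text classifier (hyperparameters assumed; the embedding layer could be initialized with word or sense embeddings):

```python
# Sketch: parallel convolutions over the embedded sequence,
# max-over-time pooling, then a linear layer (Kim, 2014 style).
import torch
import torch.nn as nn

class KimCNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, n_filters=100,
                 widths=(3, 4, 5), n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)  # could load pre-trained vectors
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, w) for w in widths])
        self.fc = nn.Linear(n_filters * len(widths), n_classes)

    def forward(self, tokens):                         # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)           # (batch, emb_dim, seq_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))       # (batch, n_classes)

logits = KimCNN(vocab_size=10_000)(torch.randint(0, 10_000, (8, 50)))
```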

SLIDES 84–85

Sense-based vs. word-based: conclusions

Sense-based is better than word-based... when the input text is large enough.

SLIDE 86

Why does the input text size matter?

  • Word sense disambiguation works better in larger texts (Moro et al. 2014; Raganato et al. 2017)
  • Disambiguation increases sparsity
slide-87
SLIDE 87

87

Contextualized word embeddings ELMo/BERT

Peters et al. (NAACL 2018) Devlin et al. (NAACL 2019)

slide-88
SLIDE 88

88

Contextualized word embeddings ELMo/BERT

SLIDES 89–91

Contextualized word embeddings: ELMo/BERT

Like word embeddings, they are learned by leveraging language models over massive amounts of text.

New: each word vector depends on the context; it is dynamic.

Important improvements in many NLP tasks.

SLIDES 92–94

Contextualized word embeddings: ELMo/BERT (examples)

He withdrew money from the bank. -> 0.25, 0.32, -0.1, …
The bank remained closed yesterday. -> 0.22, 0.30, -0.08, … (similar to the vector above)
We found a nice spot by the bank of the river. -> -0.8, 0.01, 0.3, …
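A minimal sketch of extracting such contextualized vectors (assumes the Hugging Face transformers library and the bert-base-uncased checkpoint; any layer could be used, here the last):

```python
# Sketch: compare BERT's contextual vectors for "bank" across sentences.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]      # (seq_len, 768)
    idx = enc.input_ids[0].tolist().index(tok.convert_tokens_to_ids("bank"))
    return hidden[idx]

a = bank_vector("He withdrew money from the bank.")
b = bank_vector("The bank remained closed yesterday.")
c = bank_vector("We found a nice spot by the bank of the river.")

cos = torch.nn.functional.cosine_similarity
print(cos(a, b, dim=0), cos(a, c, dim=0))  # the financial pair should score higher
```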

SLIDES 95–96

How well do these models capture "meaning"?

Good enough for many applications, but with room for improvement. No noticeable improvements in:

➢ Winograd Schema Challenge (requires commonsense reasoning): BERT ~65% vs. humans ~95%
➢ Word-in-Context Challenge (requires abstracting the notion of sense): BERT ~65% vs. humans ~85%

SLIDE 97

For more information on meaning representations (embeddings):

❖ ACL 2016 Tutorial on "Semantic representations of word senses and concepts": http://josecamachocollados.com/slides/Slides_ACL16Tutorial_SemanticRepresentation.pdf
❖ EACL 2017 workshop on "Sense, Concept and Entity Representations and their Applications": https://sites.google.com/site/senseworkshop2017/
❖ NAACL 2018 Tutorial on "Interplay between lexical resources and NLP": https://bitbucket.org/luisespinosa/lr-nlp/
❖ "From Word to Sense Embeddings: A Survey on Vector Representations of Meaning" (JAIR 2018): https://www.jair.org/index.php/jair/article/view/11259

SLIDE 98

Thank you! Questions please!

camachocolladosj@cardiff.ac.uk
@CamachoCollados
josecamachocollados.com