Word embeddings - PowerPoint PPT Presentation

Word embeddings

Reminder: Embeddings (not Word Embeddings)

An embedding is a “lookup table”.
Formalism:
● Index of a word: w_i
● Embedding table (lookup matrix): V
● Embedding: e_i
● e_i = V(w_i)
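As a minimal sketch of the lookup above, assuming NumPy and a made-up three-word vocabulary (the words, the dimension and the random values are illustrative, not taken from the slides), e_i = V(w_i) is simply a row selection:

```python
import numpy as np

# Hypothetical toy vocabulary: word -> index w_i (illustrative only).
vocab = {"cat": 0, "litter": 1, "milk": 2}
embedding_dim = 4

rng = np.random.default_rng(0)
# V is the lookup matrix: one row per vocabulary index.
V = rng.normal(size=(len(vocab), embedding_dim))

def embed(word):
    """Return e_i = V(w_i) by selecting row w_i of V."""
    w_i = vocab[word]
    return V[w_i]

e_cat = embed("cat")
```

Nothing here is specific to words: any object that can be given an integer index can be embedded the same way, which is presumably why the slide says "not Word Embeddings".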

Representation of a word
Different possibilities:
● One-hot vector
  ○ Cat: [0,0,… 0, 1 ,0,0,0,0,0,0,0,0,0…]
● Context vector
  ○ Cat: [ 1 ,0,… 0, 0 ,0,0,0, 1 ,0,0, 1 ,0,0…]
(Dimension labels shown on the slide: feline, litter, milk, cat)
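The two representations can be sketched as follows. The vocabulary, the two sentences and the window size are assumptions chosen for illustration; the context vector here is binary, like the one on the slide.

```python
import numpy as np

# Hypothetical vocabulary and corpus (illustrative only).
vocab = ["feline", "cat", "litter", "milk", "hammer"]
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Vector of zeros with a single 1 at the word's own index."""
    v = np.zeros(len(vocab), dtype=int)
    v[index[word]] = 1
    return v

def context_vector(word, sentences, window=2):
    """Mark every vocabulary word seen within `window` tokens of `word`."""
    v = np.zeros(len(vocab), dtype=int)
    for tokens in sentences:
        for pos, tok in enumerate(tokens):
            if tok != word:
                continue
            lo, hi = max(0, pos - window), min(len(tokens), pos + window + 1)
            for j in range(lo, hi):
                if j != pos and tokens[j] in index:
                    v[index[tokens[j]]] = 1
    return v

sentences = [
    ["the", "cat", "drinks", "milk"],
    ["a", "feline", "cat", "litter"],
]
cat_one_hot = one_hot("cat")
cat_context = context_vector("cat", sentences)
```

The one-hot vector only says which word it is; the context vector says which words it appears with, so words used in similar contexts get similar vectors.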

Context vectors
Source: “Chap. 15: Vector Semantics.” Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, by Dan Jurafsky and James H. Martin, Dorling Kindersley Pvt, Ltd., 2014.

Context vectors
● Very large vectors (the size of the vocabulary)
● Contain many zeros
● We therefore look for a way to reduce the dimensionality, for:
  ○ Memory efficiency
  ○ Ease of use with classifiers
  ○ Fewer parameters
  ○ Some dimensions may overlap

Singular value decomposition
Source: “Chap. 15: Vector Semantics.” Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, by Dan Jurafsky and James H. Martin, Dorling Kindersley Pvt, Ltd., 2014.

We keep the top k singular values
Source: “Chap. 15: Vector Semantics.” Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, by Dan Jurafsky and James H. Martin, Dorling Kindersley Pvt, Ltd., 2014.

We then use only the matrix W
Source: “Chap. 15: Vector Semantics.” Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, by Dan Jurafsky and James H. Martin, Dorling Kindersley Pvt, Ltd., 2014.
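The three SVD slides can be sketched with NumPy as below. The co-occurrence matrix X and the choice k = 2 are made up for illustration; the names W, s and C are assumed to correspond to the factorization X = W · diag(s) · C shown in the cited figures.

```python
import numpy as np

# Hypothetical word-by-context co-occurrence counts (4 words, 5 contexts).
X = np.array([
    [2.0, 0.0, 1.0, 0.0, 3.0],
    [1.0, 0.0, 2.0, 0.0, 2.0],
    [0.0, 4.0, 0.0, 1.0, 0.0],
    [0.0, 3.0, 0.0, 2.0, 0.0],
])

# Thin SVD: X = W @ diag(s) @ C, singular values in decreasing order.
W, s, C = np.linalg.svd(X, full_matrices=False)

# Keep only the top k singular values and the matching columns of W.
k = 2
W_k = W[:, :k]  # one k-dimensional embedding per word (row)

# Rank-k reconstruction, only to check how much information was kept.
X_k = W_k @ np.diag(s[:k]) @ C[:k, :]
reconstruction_error = np.linalg.norm(X - X_k)
```

Each row of W_k is a dense vector of length k instead of a sparse vector the size of the vocabulary. Some implementations scale the columns by the singular values (W_k * s[:k]); whether the slides do so is not stated.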

Neural network methods

GloVe
“A weighted least squares regression model”
The idea is to predict the co-occurrence count X_ij (or its log) of the words w_i and w_j.
Similar to Word2Vec (and to FastText).

GloVe (figure: co-occurrence matrix in which the cell for words w_i and w_j holds the count 25)

GloVe
Loss = v(w_i) · v(w_j) + b_i + b_j - log(X_ij)
(In GloVe this residual is squared and weighted by a function of X_ij, hence “weighted least squares”.)

GloVe
Loss = v(w_i) · v(w_j) + b_i + b_j - log(25)
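A sketch of the loss for the single pair with X_ij = 25 follows. The vectors and biases are made up. The weighting function and its defaults (x_max = 100, alpha = 0.75) come from the GloVe paper rather than from these slides, and the slides show only the residual inside the square.

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Weighting f(X_ij) from the GloVe paper: damps rare pairs, caps frequent ones."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_pair_loss(v_i, v_j, b_i, b_j, x_ij):
    """Weighted squared residual for one co-occurring pair (requires x_ij > 0)."""
    residual = v_i @ v_j + b_i + b_j - np.log(x_ij)
    return glove_weight(x_ij) * residual ** 2

# Hypothetical vectors and biases (illustrative values only).
v_i = np.array([0.5, 1.0, -0.5])
v_j = np.array([1.0, 2.0, 0.0])
loss = glove_pair_loss(v_i, v_j, b_i=0.1, b_j=0.2, x_ij=25)
```

Training sums this quantity over all pairs with a non-zero count and adjusts the vectors and biases so that the dot product approaches log(X_ij).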

Word2Vec
2 algorithms:
● Skip-Gram
● CBOW (Continuous Bag of Words)

Word2Vec

CBOW - Negative Sampling (figure labels: litter, cat, softmax)

CBOW - Negative Sampling (figure labels: coffee, cat, litter, apple, leaf)

CBOW - Negative Sampling: score between a word w and a context C

CBOW - How to obtain a score: dot product between v_C and v_w

CBOW - How to obtain a score (character n-grams, as in FastText): where = <wh, whe, her, ere, re>, <where>
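The decomposition of "where" can be reproduced with a few lines. This is a sketch with n fixed to 3 to match the slide; the function name and parameters are illustrative, and FastText itself typically uses a range of n-gram lengths.

```python
def char_ngrams(word, n_min=3, n_max=3):
    """Character n-grams of `word` with boundary markers, plus the whole word."""
    marked = f"<{word}>"  # "<" and ">" mark the start and end of the word
    grams = [
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    ]
    if marked not in grams:
        grams.append(marked)  # special sequence for the full word
    return grams

where_grams = char_ngrams("where")
print(where_grams)
```

The boundary markers are what distinguish the trigram "her" inside "where" from the word "<her>" itself.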

CBOW - Phrase Representations: v(New) + v(York) ≈ Boston?

CBOW - Phrase Representations: v(New) + v(York) ≈ Yikes?

CBOW - Phrase Representations: New York => New_York
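One way to decide which word pairs to merge into a single token is the count-based score used in the Word2Vec phrase paper, score(a, b) = (count(a b) - delta) / (count(a) × count(b)). The sketch below assumes that score; the corpus, delta and threshold are made up, and the slides do not say which criterion they use.

```python
from collections import Counter

def merge_phrases(sentences, delta=1, threshold=0.1):
    """Join adjacent word pairs whose co-occurrence score exceeds `threshold`."""
    unigrams = Counter(tok for s in sentences for tok in s)
    bigrams = Counter(pair for s in sentences for pair in zip(s, s[1:]))

    def is_phrase(a, b):
        score = (bigrams[(a, b)] - delta) / (unigrams[a] * unigrams[b])
        return score > threshold

    merged = []
    for s in sentences:
        out, i = [], 0
        while i < len(s):
            if i + 1 < len(s) and is_phrase(s[i], s[i + 1]):
                out.append(f"{s[i]}_{s[i + 1]}")  # e.g. New_York
                i += 2
            else:
                out.append(s[i])
                i += 1
        merged.append(out)
    return merged

# Hypothetical corpus (illustrative only).
corpus = [
    ["i", "love", "New", "York"],
    ["New", "York", "is", "big"],
    ["she", "left", "New", "York"],
    ["we", "love", "big", "cities"],
]
merged_corpus = merge_phrases(corpus)
```

After merging, New_York gets its own embedding, so its meaning no longer has to be approximated by v(New) + v(York).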

FastText demo

FastText recap: … the little cat jumps on … (original: “… le petit chat saute sur …”)

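FastText recap
● Context words:
  ○ c_1 “the” (le): [-2.2, 2.3, 2.4]
  ○ c_2 “little” (petit): [-0.2, -1.3, 0.4]
  ○ c_3 “jumps” (saute): [-3.2, 1.3, 0.5]
  ○ c_4 “on” (sur): [-3.2, 1.3, 0.5]
● The context vectors are combined into C
● Target word w_i “cat” (chat): [0.2, 1.3, 3.4]
● Negative sampling:
  ○ score(C, w_i): positive example (+)
  ○ score(C, n_i): negative example (-), with n_i “hammer” (marteau): [1.2, -1.3, -3.4]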
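The recap can be worked through numerically with the vectors from the slide. Two points are assumptions here: the context vectors are combined by averaging (the slide does not say whether it sums or averages), and the loss is the usual negative-sampling log-sigmoid loss with a single negative sample.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Context vectors from the slide, in the order they appear.
context = np.array([
    [-2.2, 2.3, 2.4],   # c_1: "le" (the)
    [-0.2, -1.3, 0.4],  # c_2: "petit" (little)
    [-3.2, 1.3, 0.5],   # c_3: "saute" (jumps)
    [-3.2, 1.3, 0.5],   # c_4: "sur" (on)
])
w_i = np.array([0.2, 1.3, 3.4])    # target word "chat" (cat)
n_i = np.array([1.2, -1.3, -3.4])  # negative sample "marteau" (hammer)

# Assumption: the context representation C is the mean of the context vectors.
C = context.mean(axis=0)

# Scores are dot products between the context and each candidate word.
pos_score = C @ w_i
neg_score = C @ n_i

# Negative-sampling loss: push the positive score up, the negative score down.
loss = -np.log(sigmoid(pos_score)) - np.log(sigmoid(-neg_score))
```

With these numbers the true word scores well above the sampled negative, so the loss is already small.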

ELMo: we will see it in the section on language models.

Sentence vectors

How to obtain the representation of a sentence?
● Take the average of the word embeddings
● Use an idea similar to Skip-Gram!
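The first option, averaging, can be sketched as below. The two-dimensional embeddings are made up, and skipping unknown words is one possible design choice among others.

```python
import numpy as np

# Hypothetical word embeddings (illustrative values only).
embeddings = {
    "cat": np.array([1.0, 2.0]),
    "sleeps": np.array([3.0, 0.0]),
}

def sentence_vector(tokens, embeddings):
    """Average the embeddings of the known words in `tokens`."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no known word in the sentence")
    return np.mean(vectors, axis=0)

sent = sentence_vector(["the", "cat", "sleeps"], embeddings)
```

Averaging ignores word order, which is one motivation for the second option on the slide.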

Skip-Thought Vectors
Basic idea:
● Given a triplet of sentences (s_{i-1}, s_i, s_{i+1})
  ○ Encode the sentence s_i
  ○ Generate the sentences s_{i-1} and s_{i+1}

Skip-Thought Vectors

Skip-Thought Vectors: language models

Skip-Thought Vectors: in the end, this is what we use!

Skip-Thought Vectors
Probability of having generated the next sentence
Probability of having generated the previous sentence
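In the Skip-Thought paper these two probabilities are combined by summing the log-probabilities of every token of the next and previous sentences, given the encoding of s_i. The sketch below assumes the decoders have already produced per-token probabilities; the function name and the numbers are hypothetical.

```python
import math

def skip_thought_objective(next_token_probs, prev_token_probs):
    """Sum of log-probabilities of the tokens of s_{i+1} and s_{i-1}, given s_i."""
    log_p_next = sum(math.log(p) for p in next_token_probs)
    log_p_prev = sum(math.log(p) for p in prev_token_probs)
    return log_p_next + log_p_prev

# Hypothetical per-token probabilities assigned by the two decoders.
objective = skip_thought_objective([0.5, 0.25], [0.5])
```

Training maximizes this quantity, which forces the encoding of s_i to carry enough information to predict its neighbouring sentences.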
