USING PSEUDO-SENSES FOR IMPROVING THE EXTRACTION OF SYNONYMS FROM WORD EMBEDDINGS
Olivier Ferret

CONTEXT AND OBJECTIVES
• Context
  • semantic specialization of word embeddings
  • most approaches follow Retrofitting [Faruqui et al., 2015]
    • a priori set of lexical semantic relations
    • bring word vectors closer if they are part of similarity relations (synonymy, lexical association, ...)
    • move them away from each other if they are part of dissimilarity relations (antonymy, ...)
• Objectives of Pseudofit
  • improving word embeddings for semantic similarity without a priori lexical relations

PRINCIPLES: GENERAL PERSPECTIVE
• Theoretical hypothesis
  • homogeneous corpus C
  • equal split of C into 2 parts: C1 and C2
  • distributional representation of a word w from a corpus C = distrep_C(w) = set of contexts
  • distrep_C1(w) = distrep_C2(w)
• In practice
  • distrep_C1(w) ≠ distrep_C2(w)
• Hypothesis
  • differences between distrep_C1(w) and distrep_C2(w) are contingent
  • bringing distrep_C1(w) and distrep_C2(w) closer → more general (and better) distributional representation of w

PRINCIPLES: IMPLEMENTATION
• Distributional representations
  • dense representations: Skip-Gram [Mikolov et al., 2013]
• Notion of pseudo-sense
  • 2 sub-corpora → 2 representation spaces
    • require projection into a shared space → source of disturbances
  • instead, 1 corpus but 2 pseudo-senses for each word
  • pseudo-sense
    • arbitrarily split the occurrences of a word into two or more subsets
• Overall process
  • generation of distributional contexts for pseudo-senses
  • turning pseudo-sense contexts into dense representations
  • convergence of pseudo-word representations → more general word representation

REPRESENTATIONS OF PSEUDO-WORDS
• Generation of contexts
  • 2 successive occurrences of a word → 2 different pseudo-senses
  • 3 representations / word
    • 2 pseudo-senses + word itself → for each occurrence, generation of contexts for the current pseudo-sense + word
    • « frequency trick »: adding the representation of the word → avoiding the impact of having half the occurrences for each pseudo-sense

  Example: A policeman_1 was arrested by another policeman_2.

  TARGET      CONTEXT        TARGET        CONTEXT    TARGET        CONTEXT
  policeman   a              policeman_1   a          policeman_2   another
  policeman   be             policeman_1   be         policeman_2   by
  policeman   arrest (x2)    policeman_1   arrest     policeman_2   arrest
  policeman   by (x2)        policeman_1   by
  policeman   another

• Building of dense representations
  • word2vecf [Levy & Goldberg, 2014]
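The context-generation step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `pseudo_sense_contexts`, the `word_k` naming of pseudo-senses and the symmetric window of 3 tokens are assumptions (the window size is only inferred from the policeman example). Input sentences are assumed to be already lemmatized.

```python
from collections import Counter

def pseudo_sense_contexts(sentences, targets, window=3, n_senses=2):
    """Count (target, context) pairs for each word and its pseudo-senses.

    Successive occurrences of a target word are assigned to pseudo-senses
    in round-robin order; every context is emitted both for the current
    pseudo-sense and for the word itself (the "frequency trick").
    """
    pairs = Counter()
    seen = Counter()  # occurrences seen so far, per target word
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok not in targets:
                continue
            sense = f"{tok}_{seen[tok] % n_senses + 1}"  # hypothetical naming
            seen[tok] += 1
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                pairs[(tok, tokens[j])] += 1    # word itself
                pairs[(sense, tokens[j])] += 1  # current pseudo-sense
    return pairs

# Lemmatized form of "A policeman was arrested by another policeman."
sentence = ["a", "policeman", "be", "arrest", "by", "another", "policeman"]
pairs = pseudo_sense_contexts([sentence], {"policeman"})
for (target, context), count in sorted(pairs.items()):
    print(target, context, count)
```

With these assumptions the sketch yields the same pairs as the policeman example, including the doubled counts for `arrest` and `by` on the word itself. Pairs of this kind, written one per line, are the sort of input that word2vecf is trained on.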

CONVERGENCE OF PSEUDO-WORD REPRESENTATIONS
• Principles
  • 3 representations / word w: v (word); v1, v2 (pseudo-senses)
  • v, v1 and v2: supposed to be semantically equivalent → 3 similarity relations: (v, v1), (v, v2) and (v1, v2)
  • application of a semantic specialization method for word embeddings to v, v1 and v2 with the similarity relations between them
  • final representation for w: v after its « specialization »
• Implementation
  • specialization method: PARAGRAM [Wieting et al., 2015]
    • comparable to Retrofitting but includes an automatically generated repelling component
    • for each target word to specialize, selection of a repelling word, either randomly or according to their dissimilarity
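To make the attract/repel idea concrete, the sketch below builds the three similarity relations for a word and evaluates a simplified margin-based hinge loss for one relation. It is only an illustration of the general shape of such objectives, not the exact PARAGRAM objective: the regularization term that keeps vectors close to their initial values is omitted, the repelling words are passed in rather than selected, and the names `similarity_pairs`, `hinge_loss`, the margin value and the toy vectors are all assumptions.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_pairs(word, n_senses=2):
    """All similarity relations among a word and its pseudo-senses."""
    names = [word] + [f"{word}_{k}" for k in range(1, n_senses + 1)]
    return list(combinations(names, 2))

def hinge_loss(emb, pair, repellers, margin=0.4):
    """Loss is zero once the pair is closer than each repeller by `margin`."""
    x1, x2 = pair
    t1, t2 = repellers  # one repelling word per member of the pair
    sim = cosine(emb[x1], emb[x2])
    return (max(0.0, margin - sim + cosine(emb[x1], emb[t1]))
            + max(0.0, margin - sim + cosine(emb[x2], emb[t2])))

# Toy 2-dimensional vectors, invented for illustration only.
emb = {
    "policeman": [1.0, 0.0],
    "policeman_1": [0.8, 0.6],
    "policeman_2": [0.6, 0.8],
    "banana": [0.0, 1.0],
}
for pair in similarity_pairs("policeman"):
    print(pair, round(hinge_loss(emb, pair, ("banana", "banana")), 3))
```

Minimizing a loss of this form over the three relations pulls v, v1 and v2 together while pushing each away from its repelling word; the specialized v is then kept as the final representation of the word.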

INTRINSIC EVALUATION
• Experimental setup
  • 1 billion lemmatized words randomly selected from the Annotated English Gigaword corpus [Napoles et al., 2012] at the level of sentences
  • word embeddings built with the best parameters from [Baroni et al., 2014]
  • focus on nouns
• Word similarity evaluation
  • Spearman’s rank correlation between human judgments and similarity between vectors for 3 representative datasets of word pairs

                    SimLex-999   MEN    MTurk-771
  INITIAL              49.5      78.3     65.6
  Pseudofit            51.2      79.9     68.0
  Retrofitting         49.6      77.4     65.0
  Counter-fitting      49.5      77.2     64.9
  (values × 100)
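The word similarity protocol amounts to ranking word pairs twice, once by human score and once by cosine between vectors, and correlating the two rankings. The sketch below is a dependency-free illustration; the helper names, the toy vectors and the human scores are invented and do not come from SimLex-999, MEN or MTurk-771.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data, invented for illustration only.
emb = {"car": [1.0, 0.0], "automobile": [0.9, 0.1],
       "road": [0.6, 0.8], "banana": [0.0, 1.0]}
judged_pairs = [("car", "automobile", 9.0), ("car", "road", 5.0), ("car", "banana", 1.0)]

human = [score for _, _, score in judged_pairs]
model = [cosine(emb[a], emb[b]) for a, b, _ in judged_pairs]
rho = spearman(human, model)
print(round(100 * rho, 1))
```

Only the ordering of the pairs matters here, which is why embeddings with very different similarity scales can still be compared on the same dataset.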

SYNONYM EXTRACTION
• Evaluation framework
  • Gold Standard: WordNet’s synonyms
    • 2.9 synonyms / word
  • evaluated words = 11,481 nouns
    • frequency > 20
  • for each evaluated noun, retrieval of its 100 nearest neighbors
    • neighbors ranked from most similar (Cosine) to least similar
  • Information Retrieval (IR) paradigm
    • evaluated word ≡ query; neighbors ≡ docs
    • IR measures: MAP, R-precision, precision@{1,2,5}

               R-prec.   MAP     P@1     P@2     P@5
  INITIAL       13.0     15.2    18.3    13.1     7.7
  Pseudofit     +2.5     +3.3    +3.0    +2.5    +1.8
  (values × 100; Pseudofit row given as differences from INITIAL)
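The IR measures used in this evaluation are straightforward to compute once each evaluated word has a ranked neighbor list and a gold set of synonyms. The sketch below uses the standard textbook definitions; whether the original evaluation normalizes average precision by the full gold set or only by the synonyms actually retrieved is not stated, so dividing by the size of the gold set is an assumption, and the neighbor list and gold set are invented.

```python
def precision_at_k(ranked, gold, k):
    """Fraction of the top-k neighbors that are gold synonyms."""
    return sum(1 for n in ranked[:k] if n in gold) / k

def r_precision(ranked, gold):
    """Precision at rank R, where R is the number of gold synonyms."""
    return precision_at_k(ranked, gold, len(gold))

def average_precision(ranked, gold):
    """Mean of the precision values at each rank where a synonym is found."""
    hits, total = 0, 0.0
    for rank, neighbor in enumerate(ranked, start=1):
        if neighbor in gold:
            hits += 1
            total += hits / rank
    return total / len(gold)

def mean_average_precision(runs):
    """MAP over (ranked neighbors, gold synonyms) pairs, one per evaluated word."""
    return sum(average_precision(r, g) for r, g in runs) / len(runs)

# Toy query, invented for illustration only.
ranked = ["cop", "officer", "thief", "constable", "car"]
gold = {"cop", "constable", "bobby"}

print(precision_at_k(ranked, gold, 1))
print(precision_at_k(ranked, gold, 5))
print(r_precision(ranked, gold))
print(average_precision(ranked, gold))
```

Because the gold standard lists only about 2.9 synonyms per word, P@5 is bounded well below 100 for most queries, which helps explain the low absolute values in the table.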

SENTENCE SIMILARITY
• Evaluation task
  • Semantic Textual Similarity: STS Benchmark dataset [Cer et al., 2017]
  • Pearson correlation between human judgments and similarity between sentences for a set of reference sentence pairs
• Computation of sentence similarity
  • strong baseline approach based on word embeddings
  • sentence representation: elementwise addition of the embeddings of the content words of the sentence
  • use of Pseudofit [max,fus-max-pooling] embeddings, defined for nouns, verbs and adjectives
  • sentence similarity: Cosine between sentence representations

                                      ρ × 100
  INITIAL                               63.2
  Pseudofit [max,fus-max-pooling]       66.0
  Best baseline (Cer et al., 2017)      56.5
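The sentence similarity baseline described on this slide reduces to summing word vectors and taking a cosine. A minimal sketch follows, assuming lemmatized input and treating any token without an embedding as a word to skip; the function names and the toy vectors are invented for illustration.

```python
import math

def sentence_vector(tokens, emb):
    """Elementwise sum of the embeddings of the tokens found in `emb`."""
    dim = len(next(iter(emb.values())))
    vec = [0.0] * dim
    for tok in tokens:
        if tok in emb:  # tokens without an embedding are skipped
            vec = [a + b for a, b in zip(vec, emb[tok])]
    return vec

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a sentence with no known word has no direction
    return dot / (norm_u * norm_v)

def sentence_similarity(tokens_a, tokens_b, emb):
    return cosine(sentence_vector(tokens_a, emb), sentence_vector(tokens_b, emb))

# Toy 2-dimensional vectors, invented for illustration only.
emb = {"man": [1.0, 0.0], "play": [0.0, 1.0],
       "guitar": [1.0, 2.0], "violin": [2.0, 1.0]}

s1 = ["a", "man", "play", "guitar"]
s2 = ["a", "man", "play", "violin"]
sim = sentence_similarity(s1, s2, emb)
print(round(sim, 3))
```

Summation ignores word order entirely, so any gain from better embeddings carries over directly to the sentence score.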

CONCLUSIONS AND PERSPECTIVES
• To sum up
  • Pseudofit: method for improving word embeddings towards semantic similarity without external semantic relations
  • method based on the convergence of several representations built from the same corpus → more general representation
  • successful intrinsic and extrinsic evaluations for word similarity, synonym extraction and sentence similarity
• Research directions
  • transposition of Pseudofit to several corpora → link with research on meta-embeddings and ensembles of word embeddings

Commissariat à l’énergie atomique et aux énergies alternatives Institut List | CEA SACLAY NANO-INNOV | BAT. 861 – PC142 91191 Gif-sur-Yvette Cedex - FRANCE www-list.cea.fr Établissement public à caractère industriel et commercial | RCS Paris B 775 685 019
