SLIDE 1 Modeling lexical semantic shifts during ad-hoc coordination
Alexandre Kabbach¹,² Aurélie Herbelot² 18.05.2020 – GeCKo 2020
¹University of Geneva ²CIMeC – University of Trento
SLIDE 2
Problem
SLIDE 3
Conceptual variability and communication
Speakers form conceptual representations for words based on different background experiences (Connell and Lynott, 2014). How can speakers nonetheless communicate with one another if the words they utter do not refer to the exact same concepts?
SLIDE 5
Coordination: a possible solution?
Speakers coordinate with one another during each communication instance in order to settle on specific word meanings (Clark, 1992, 1996). In doing so, they contextualize their generic conceptual representations during communication.
SLIDE 6 Question
How can we integrate coordination into standard Distributional Semantic Models (DSMs; Turney and Pantel, 2010; Clark, 2012; Erk, 2012; Lenci, 2018)? Two problems:
1. DSMs do not distinguish background linguistic stimuli from active coordination in their acquisition process
2. DSMs consider conceptual representations to remain invariant during communication
SLIDE 7
Proposal
SLIDE 8 Model
We distinguish background experience from ad-hoc coordination in a standard count-based PPMI-weighted DSM:
- background experience = the corpus data fed to the DSM
- ad-hoc coordination = singular vector sampling in the SVD
We replace the variance-preservation bias in the SVD of the DSM with an explicit coordination bias, sampling the set of d singular vectors that maximizes the correlation with a particular similarity dataset (MEN or SimLex).
SLIDE 12 Assumptions
1. a single DSM can capture different kinds of semantic relations from the same corpus, so that a collection of possible meaning spaces can coexist within the same set of data
2. aligning similarity judgments across sets of word pairs provides a reasonable approximation of ad-hoc coordination between two speakers who originally disagree and ultimately converge to a form of agreement with respect to some lexical decision
SLIDE 15 Results
1. replacing the variance-preservation bias with an explicit sampling bias actually reduces the variability across models generated from different corpora
2. DSMs generated from different corpora can be aligned in different ways. Alignment does not necessarily equate to conceptual agreement but, in some cases, to mere compatibility, so that coordinating one's conceptual spaces might simply be the cooperative act of avoiding conflict rather than reaching full agreement
SLIDE 18
Model
SLIDE 19
PPMI-weighted DSM
PMI(w, c) = log [ P(w, c) / (P(w) · P(c)) ]
PPMI(w, c) = max(PMI(w, c), 0)
W = U · Σ · V⊤    W_d = U_d · Σ_d^α,  α ∈ [0, 1]
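As a concrete illustration, here is a minimal dense-matrix sketch of the pipeline above; the actual models are built from large sparse co-occurrence matrices with a truncated SVD, so the function names and details here are illustrative only.

```python
import numpy as np

def ppmi(counts):
    """PPMI-weight a word-by-context co-occurrence count matrix."""
    total = counts.sum()
    p_wc = counts / total                             # joint P(w, c)
    p_w = counts.sum(axis=1, keepdims=True) / total   # marginal P(w)
    p_c = counts.sum(axis=0, keepdims=True) / total   # marginal P(c)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
    pmi[~np.isfinite(pmi)] = 0.0                      # zero counts -> PMI of 0
    return np.maximum(pmi, 0.0)                       # PPMI = max(PMI, 0)

def reduce_svd(W, d=300, alpha=0.0):
    """W_d = U_d * Sigma_d^alpha over the top-d singular vectors."""
    U, S, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :d] * S[:d] ** alpha                  # alpha = 0 discards singular values
```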
SLIDE 20 Singular vector sampling
W_d = U_d · Σ_d^α,  α ∈ [0, 1]
Replace the variance-preservation bias with the following add-reduce algorithm (see the sketch below):
- add: iterate over all singular vectors and select only those that increase performance on a given lexical similarity dataset
- reduce: iterate over the set of added singular vectors and remove all those whose removal does not degrade performance on that dataset
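The following is a minimal sketch of add-reduce, assuming the singular vectors sit in the columns of u, that pairs maps each dataset word pair to row indices, and that gold holds the human similarity scores; evaluate and add_reduce are hypothetical names, and a real run would additionally use the 5-fold splits from the experiments.

```python
import numpy as np
from scipy.stats import spearmanr

def evaluate(dims, u, pairs, gold):
    """Spearman correlation between model cosines restricted to the
    sampled dimensions `dims` and the gold similarity scores."""
    sub = u[:, dims]
    preds = [sub[i] @ sub[j] / (np.linalg.norm(sub[i]) * np.linalg.norm(sub[j]) + 1e-12)
             for i, j in pairs]
    return spearmanr(preds, gold).correlation

def add_reduce(u, pairs, gold):
    dims, best = [], -np.inf
    # add: keep a singular vector only if it increases performance
    for k in range(u.shape[1]):
        score = evaluate(dims + [k], u, pairs, gold)
        if score > best:
            dims, best = dims + [k], score
    # reduce: drop vectors whose removal does not degrade performance
    for k in list(dims):
        remaining = [d for d in dims if d != k]
        if remaining:
            score = evaluate(remaining, u, pairs, gold)
            if score >= best:
                dims, best = remaining, score
    return dims
```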
SLIDE 24
Conceptual similarity
We model structural similarity between two DSMs as the minimized Root Mean Square Error (RMSE) between them:
RMSE(A, B) = √( (1/|A|) · Σ_{i=1..|A|} ||a_i − b_i||² )
Models are aligned using absolute orientation with scaling (Dev et al., 2018), which minimizes the RMSE while applying a cosine-similarity-preserving linear transformation (rotation + scaling).
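Here is a sketch of the alignment step, assuming A and B are two DSMs whose i-th rows are the vectors of the same word; align_rmse is a hypothetical name, and this is the classical orthogonal Procrustes solution with scaling rather than a verbatim reimplementation of Dev et al. (2018).

```python
import numpy as np

def align_rmse(A, B):
    """Map B onto A with the orthogonal transform R and scale s minimizing
    ||A - s * B @ R||_F (both cosine-preserving), then return the RMSE."""
    U, S, Vt = np.linalg.svd(B.T @ A)
    R = U @ Vt                                   # optimal orthogonal transform
    s = S.sum() / np.linalg.norm(B) ** 2         # optimal scale
    B_aligned = s * B @ R
    return np.sqrt(np.mean(np.sum((A - B_aligned) ** 2, axis=1)))
```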
SLIDE 27
Experimental setup: corpora
Corpus   Word Count   Details
OANC     17M          Open American National Corpus
WIKI07   19M          0.7% of the English Wikipedia
ACL      58M          ACL Anthology Reference Corpus
WIKI2    53M          2% of the English Wikipedia
BNC      113M         British National Corpus
WIKI4    106M         4% of the English Wikipedia
WIKI     2,600M       Full English Wikipedia of January 20, 2019
Table 1: Corpora used to generate DSMs
SLIDE 28 Experimental setup: lexical similarity
1. MEN (Bruni et al., 2014): a relatedness dataset containing 3,000 word pairs. It expresses topical association (e.g. cat and meow are deemed related)
2. SimLex-999 (Hill et al., 2015): a similarity dataset containing 999 word pairs. It expresses categorical similarity (e.g. cat and dog might be considered similar by virtue of being members of the same category)
These two datasets encode possibly incompatible semantic constraints, and it is theoretically impossible to perfectly fit both of the meaning spaces they encode with a single DSM (e.g. "chicken–rice" has a similarity score of 0.68 in MEN but 0.14 in SimLex).
SLIDE 32
Results
SLIDE 33 No variance-preservation bias means better DSMs
                 WIKI07       OANC         WIKI2        ACL          WIKI4        BNC          WIKI
SVD-TOP (α = 1)  0.61         0.60         0.66         0.26         0.66         0.70         0.67
SVD-TOP (α = 0)  0.65         0.66         0.70         0.37         0.72         0.75         0.74
SVD-SEQ          0.65 ± 0.02  0.66 ± 0.01  0.70 ± 0.02  0.55 ± 0.02  0.71 ± 0.01  0.76 ± 0.01  0.76 ± 0.00
Table 2: Spearman correlation on MEN for DSMs generated from different corpora. SVD-TOP are PPMI-weighted count-based models reduced by selecting the top 300 singular vectors, with (α = 1) or without (α = 0) singular values. SVD-SEQ results are generated via our sampling algorithm and averaged across test sets using 5-fold cross-validation.
SLIDE 34 No variance-preservation bias means better DSMs
                 WIKI07       OANC         WIKI2        ACL          WIKI4        BNC          WIKI
SVD-TOP (α = 1)  0.27         0.19         0.30         0.10         0.31         0.31         0.31
SVD-TOP (α = 0)  0.31         0.23         0.34         0.15         0.36         0.37         0.37
SVD-SEQ          0.27 ± 0.08  0.22 ± 0.06  0.32 ± 0.03  0.24 ± 0.04  0.36 ± 0.05  0.40 ± 0.07  0.44 ± 0.05
Table 3: Spearman correlation on SimLex for DSMs generated from different corpora. SVD-TOP are PPMI-weighted count-based models reduced by selecting the top 300 singular vectors, with (α = 1) or without (α = 0) singular values. SVD-SEQ results are generated via our sampling algorithm and averaged across test sets using 5-fold cross-validation.
SLIDE 35 No variance-preservation bias means more compact DSMs
                WIKI07    OANC      WIKI2    ACL       WIKI4     BNC       WIKI
SVD-TOP         300       300       300      300       300       300       300
SVD-SEQ-MEN     124 ± 10  175 ± 8   130 ± 7  308 ± 21  175 ± 11  128 ± 8   198 ± 16
SVD-SEQ-SIMLEX  55 ± 9    216 ± 21  121 ± 8  205 ± 29  136 ± 10  133 ± 11  185 ± 6
Table 4: Comparing dimensionality (number of selected singular vectors) between TOP and SEQ models. Dimensionality for SEQ models is averaged across 5-fold test set results.
SLIDE 36
Different dimensions encode different semantic phenomena
        MEN                                    SimLex
        median    mean        90%              median     mean        90%
WIKI07  103 ± 16  845 ± 216   2653 ± 1363      595 ± 257  2012 ± 366  6454 ± 787
OANC    135 ± 31  687 ± 163   1803 ± 930       905 ± 403  2274 ± 487  6921 ± 1146
WIKI2   117 ± 15  687 ± 119   1285 ± 1071      390 ± 117  1515 ± 234  5471 ± 861
ACL     601 ± 53  1205 ± 107  2981 ± 445       910 ± 80   1925 ± 122  5842 ± 701
WIKI4   119 ± 13  426 ± 113   626 ± 143        398 ± 76   1290 ± 185  4321 ± 93
BNC     110 ± 22  436 ± 179   843 ± 448        394 ± 59   1280 ± 104  3810 ± 525
WIKI    185 ± 41  513 ± 135   1023 ± 318       657 ± 108  1259 ± 160  3160 ± 69
Table 5: Average median, mean and 90th percentile of sampled dimension indexes on MEN and SimLex over 10 shuffled runs
SLIDE 37 Coordination is an interactive process
[Plot: RMSE (y-axis, 40–60) against singular vector index (x-axis, 0–10,000) for OANC–WIKI07, ACL–WIKI2, BNC–WIKI4]
Figure 1: Evolution of RMSE for aligned bins of 30 consecutive singular vectors sampled across [0, 10,000] for aligned corpora of different domains but similar size.
[Plot: RMSE (y-axis, 10–60) against singular vector index (x-axis, 0–10,000) for WIKI07–WIKI2, WIKI07–WIKI4, WIKI2–WIKI4]
Figure 2: Evolution of RMSE for aligned bins of 30 consecutive singular vectors sampled across [0, 10,000] for aligned corpora of similar domains but different size.
SLIDE 38 Agreement versus compatibility
Two models may be aligned if they have similar components, but also if they have dissimilar components, provided those components do not conflict. The notions of agreement, compatibility and conflict can be defined via the absolute Pearson correlation r. Example (a toy numerical sketch follows below):
A = (1, 1, 1, 1)   B = (.9, .9, .9, .9)   C = (1, 1, 1, 1)
- RMSE(A, B) ∼ RMSE(B, C) ∼ RMSE(A, C) ≈ 0; but
- r(A, B) = 1 while r(A, C) = 0.3
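To make the compatibility/agreement split concrete, here is a toy sketch with made-up vectors (not the A, B, C of the slide): two tiny spaces that align perfectly under an orthogonal transform (RMSE ≈ 0, i.e. compatible) while their raw components remain far from the r = 1 of full agreement.

```python
import numpy as np

def align_rmse(A, B):
    """RMSE after absolute orientation with scaling (see earlier sketch)."""
    U, S, Vt = np.linalg.svd(B.T @ A)
    B_aligned = (S.sum() / np.linalg.norm(B) ** 2) * B @ (U @ Vt)
    return np.sqrt(np.mean(np.sum((A - B_aligned) ** 2, axis=1)))

A = np.array([[1.,  1.], [1., -1.]])   # model A: two word vectors in 2-D
C = np.array([[1., -1.], [1.,  1.]])   # model C: same geometry, swapped axes

print(align_rmse(A, C))                          # ~0.0: the spaces are compatible
print(np.corrcoef(A.ravel(), C.ravel())[0, 1])   # ~-0.33: far from agreement (r = 1)
```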
SLIDE 41 Beyond similarity: conceptual compatibility
[Plot: RMSE (y-axis, 40–60) against singular vector index (x-axis, 0–10,000) for OANC–WIKI07, ACL–WIKI2, BNC–WIKI4]
Figure 3: Evolution of RMSE for aligned bins of 30 consecutive singular vectors sampled across [0, 10,000] for aligned corpora of different domains but similar size.
Figure 4: Evolution of RMSE with log of average absolute Pearson correlation for aligned bins of 30 consecutive singular vectors sampled across [0, 10,000] on OANC and WIKI07.
SLIDE 42
Summary
SLIDE 43 Summary
1. replacing the variance-preservation bias with an explicit sampling bias actually reduces the variability across models generated from different corpora
2. DSMs generated from different corpora can be aligned in different ways. Alignment does not necessarily equate to conceptual agreement but, in some cases, to mere compatibility, so that coordinating one's conceptual spaces might simply be the cooperative act of avoiding conflict rather than reaching full agreement
3. the number of compatible subspaces across the SVD largely exceeds the number of agreeing ones, so that speakers can only ever be expected to agree to some extent
SLIDE 46
Questions?
SLIDE 47 Cognitive plausibility 1/3
- DSMs stand in the long tradition of learning theories which argue that humans are excellent at capturing statistical regularities in their environments (Anderson and Schooler, 1991)
- PPMI-based weighting captures the informativity between words and contexts rather than raw co-occurrence counts, in line with learning theories which emphasize that contingency, not contiguity, drives the learning of associations between stimuli (Rescorla and Wagner, 1972; Murdock, 1982)
SLIDE 48 Cognitive plausibility 2/3
- Dimensionality reduction in DSMs models the transition from episodic to semantic memory, formalized as the generalization of observed concrete instances of word-context co-occurrences to higher-order representations potentially capturing more fundamental conceptual relations (Landauer and Dumais, 1997)
- Humans apply dimensionality reduction as a data compression mechanism in order to facilitate encoding, memory and overall processing (Edelman, 1999)
SLIDE 49 Cognitive plausibility 3/3
- The cognitive plausibility of transformational alignment-based similarity is more delicate, for we merely use it as a proxy for modeling coordination. Two speakers will never gain access to each other's conceptual space, and as such the minimization of the RMSE between two DSMs remains a conceptual tool with no psychological reality