The Role of Dimensionality Reduction in Distributional Semantics - PowerPoint PPT Presentation

The Role of Dimensionality Reduction in Distributional Semantics or: having fun with matrix algebra Stefan Evert Technische Universität Darmstadt, Germany evert@linglit.tu-darmstadt.de Leuven Statistics Days 8 June 2012 Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 1 / 50

Outline Outline Introduction Definitions and notation Sparse high-dimensional models Dimensionality reduction Singular value decomposition (SVD) Interpretations of SVD Alternatives to SVD A case study Outlook and discussion Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 2 / 50

Introduction Definitions and notation Outline Introduction Definitions and notation Sparse high-dimensional models Dimensionality reduction Singular value decomposition (SVD) Interpretations of SVD Alternatives to SVD A case study Outlook and discussion Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 3 / 50

Introduction Definitions and notation General definition of DSMs A distributional semantic model (DSM) is a scaled and/or transformed co-occurrence matrix M , such that each row m represents the distribution of a target term across contexts. get see use hear eat kill knife 0.027 -0.024 0.206 -0.022 -0.044 -0.042 cat 0.031 0.143 -0.243 -0.015 -0.009 0.131 dog -0.026 0.021 -0.212 0.064 0.013 0.014 boat -0.022 0.009 -0.044 -0.040 -0.074 -0.042 cup -0.014 -0.173 -0.249 -0.099 -0.119 -0.042 pig -0.069 0.094 -0.158 0.000 0.094 0.265 banana 0.047 -0.139 -0.104 -0.022 0.267 -0.042 Term = word form, lemma, phrase, morpheme, word pair, . . . Targets = rows (terms whose distribution is represented) Features = columns (individual contexts or collocates) Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 4 / 50

Introduction Definitions and notation Notation: term-context matrix Frequency matrix F œ R k · n ( term-context row vectors f i œ R n ) y n h i p a o p e s a t o l t k d a a n l c i t r o i l h a a e e e l P B P K B F F S T f T · · · · · · cat 10 10 7 – – – – 1 f T W · · · · · · X dog – 10 4 11 – – – 2 W X . W X animal 2 15 10 2 – – – . W X F = . W X time 1 – – – 2 1 – W . X . W X . reason – 1 – – 1 4 1 U V f T cause – – – 2 1 2 6 · · · · · · k e ff ect – – – 1 – 1 – Interpretation as collection of row vectors : I F = ( f ij ) , where f ij = ( f i ) j = frequency count of target term t i in context c j (wrt. context tokens , here: Wikipedia articles) Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 5 / 50

Introduction Definitions and notation Notation: term-term matrix Cooccurrence matrix M œ R k · n ( term-term row vectors m i œ R n ) t n a n t r i d a o y e d l p l l p e e e l i l m k x r a e i b k e i t f i l S T m T · · · · · · cat 83 17 7 37 – 1 – 1 m T W · · · · · · X dog 561 13 30 60 1 2 4 2 W X . W X animal 42 10 109 134 13 5 5 . W X M = . W X time 19 9 29 117 81 34 109 W . X . W X . reason 1 – 2 14 68 140 47 U V m T cause – 1 – 4 55 34 55 · · · · · · k e ff ect – – 1 6 60 35 17 Interpretation as collection of row vectors : I M = ( m ij ) , where m ij = ( m i ) j = cooccurrence frequency of target term t i with feature term τ j (a collocate of t i ) Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 6 / 50

Introduction Definitions and notation DSM parameters Corpus with linguistic annotation Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 8 / 50

Introduction Definitions and notation DSM parameters Corpus with linguistic annotation » Term-context vs. term-term matrix Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 8 / 50

Introduction Definitions and notation DSM parameters Corpus with linguistic annotation » Term-context vs. term-term matrix » Type & size of context Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 8 / 50

Introduction Definitions and notation DSM parameters Corpus with linguistic annotation » Term-context vs. term-term matrix » Type & size of context » Feature scaling Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 8 / 50

Introduction Definitions and notation DSM parameters Corpus with linguistic annotation » Term-context vs. term-term matrix » Type & size of context » Feature scaling » Similarity/distance measure & normalisation Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 8 / 50

Introduction Definitions and notation DSM parameters Corpus with linguistic annotation » Term-context vs. term-term matrix » Type & size of context » Feature scaling » Similarity/distance measure & normalisation » Dimensionality reduction Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 8 / 50

Introduction Definitions and notation DSM parameters Corpus with linguistic annotation » Term-context vs. term-term matrix » Type & size of context » Feature scaling » Similarity/distance measure & normalisation » Dimensionality reduction » Semantic distance, nearest neighbours, semantic maps, . . . Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 8 / 50

Introduction Definitions and notation DSM parameters Corpus with linguistic annotation » Term-context vs. term-term matrix » Type & size of context » Feature scaling » Similarity/ distance measure & normalisation » Dimensionality reduction » Semantic distance, nearest neighbours, semantic maps, . . . Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 8 / 50

Introduction Definitions and notation Geometric interpretation and semantic distance Two dimensions of English V − Obj DSM I row vector m dog 120 describes usage of word dog in the 100 corpus knife I can be seen as ● 80 coordinates of point use in n -dimensional 60 Euclidean space R n I illustrated for two 40 dimensions: boat ● 20 get and use dog ● cat I m dog = ( 115 , 10 ) ● 0 0 20 40 60 80 100 120 get Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 9 / 50

Introduction Definitions and notation Geometric interpretation and semantic distance Two dimensions of English V − Obj DSM I similarity = spatial 120 proximity (Euclidean metric) 100 I location depends on knife frequency of noun ● 80 ( f dog ¥ 2 . 7 · f cat ) use 60 40 boat d = 57.5 ● 20 dog d = 63.3 ● cat ● 0 0 20 40 60 80 100 120 get Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 10 / 50

Introduction Definitions and notation Geometric interpretation and semantic distance Two dimensions of English V − Obj DSM I similarity = spatial 120 proximity (Euclidean metric) 100 I location depends on knife frequency of noun ● 80 ( f dog ¥ 2 . 7 · f cat ) use I direction more 60 important than 40 location boat ● 20 dog ● cat ● 0 0 20 40 60 80 100 120 get Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 11 / 50

Introduction Definitions and notation Geometric interpretation and semantic distance Two dimensions of English V − Obj DSM I similarity = spatial 120 proximity (Euclidean metric) 100 I location depends on knife frequency of noun ● 80 ( f dog ¥ 2 . 7 · f cat ) ● use I direction more 60 important than 40 location boat ● I normalise “length” ● 20 dog Î m dog Î of vector ● cat ● ● ● 0 0 20 40 60 80 100 120 get Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 12 / 50

Introduction Definitions and notation Geometric interpretation and semantic distance Two dimensions of English V − Obj DSM I similarity = spatial 120 proximity (Euclidean metric) 100 I location depends on knife frequency of noun ● 80 ( f dog ¥ 2 . 7 · f cat ) ● use I direction more 60 α = 54.3 ° important than 40 location boat ● I normalise “length” ● 20 dog Î m dog Î of vector ● cat ● ● ● I or use angle α as 0 distance measure 0 20 40 60 80 100 120 get Stefan Evert (TU Darmstadt) Dimensionality Reduction for DSM wordspace.collocations.de 12 / 50

The Role of Dimensionality Reduction in Distributional Semantics - PowerPoint PPT Presentation

The Role of Dimensionality Reduction in Distributional Semantics or: having fun with matrix algebra Stefan Evert Technische Universitt Darmstadt, Germany evert@linglit.tu-darmstadt.de Leuven Statistics Days 8 June 2012 Stefan Evert (TU

STAT 209 Dimensionality Reduction November 26, 2019 Colin Reimer Dawson 1 / 24 Dimensionality

Dimensionality Reduction Alexandros Tantos Assistant Professor Aristotle University of

Investigating Dimensionality Dimensionality Dimensionality with with Investigating

WIKIPEDIA ARTICLE GROUP 9 Contents Article Overview 1. Dimensionality Reduction 2.

Nonlinear Dimensionality Reduction Donovan Parks Overview Direct visualization vs.

Dimensionality Reduction Algorithms (and how to interpret their output) Dalya Baron (Tel Aviv

Exploring Multivariate Data with Clustering and Dimensionality Reduction Marco Baroni Practical

Applied Machine Learning Dimensionality reduction using PCA Siamak Ravanbakhsh COMP 551 (Fall

Preprocessing and Dimensionality Reduction J er emy Fix CentraleSup elec

DIMENSIONALITY REDUCTION DIMENSIONALITY REDUCTION MATTHIEU BLOCH April 21, 2020 1 / 26

Probabilistic Dimensionality Reduction Neil D. Lawrence University of Sheffield Facebook, London

Kernel-Based Dimensionality Reduction Methods on Synthesized and Facial Image Data Jonathan L.

Distributional Semantics The unsupervised modeling of meaning on a large scale Tim Van de Cruys

Spatial Data: Dimensionality Reduction CS444 Techniques, Lecture 3 In this subfield, we think

Spatial Data: Dimensionality Reduction CSC444 Techniques In this subfield, we think of a data

Dimensionality Reduction INFO-4604, Applied Machine Learning University of Colorado Boulder

The DSM data matrix DSM data are given as a term-term or term-context matrix: get see use hear

Profiling Transmissivity and Contamination in Fractures Intersecting Boreholes Claire Tiedeman,

Algorithmic Game Theory Bercea Multicast and Network Formation Games Introduction Potential

On Definability for Model Counting Jean-Marie Lagniez 1 , Emmanuel Lonca 1 and Pierre Marquis 1 , 2

Mechanisms of Meaning Autumn 2010 Raquel Fernndez Institute for Logic, Language &

Logic Roland Meyer TU Kaiserslautern Summer Term 2014 Roland Meyer (TU Kaiserslautern) Logic

Poster Session 1 11h 12:30h A gentle introduction by Prof. Enrique Alba Articles in section

Steiner Triple Systems Lucia Moura School of Electrical Engineering and Computer Science

The Role of Dimensionality Reduction in Distributional Semantics - PowerPoint PPT Presentation

The Role of Dimensionality Reduction in Distributional Semantics or: having fun with matrix algebra Stefan Evert Technische Universitt Darmstadt, Germany evert@linglit.tu-darmstadt.de Leuven Statistics Days 8 June 2012 Stefan Evert (TU

STAT 209 Dimensionality Reduction November 26, 2019 Colin Reimer Dawson 1 / 24 Dimensionality

Dimensionality Reduction Alexandros Tantos Assistant Professor Aristotle University of

Investigating Dimensionality Dimensionality Dimensionality with with Investigating

WIKIPEDIA ARTICLE GROUP 9 Contents Article Overview 1. Dimensionality Reduction 2.

Nonlinear Dimensionality Reduction Donovan Parks Overview Direct visualization vs.

Dimensionality Reduction Algorithms (and how to interpret their output) Dalya Baron (Tel Aviv

Exploring Multivariate Data with Clustering and Dimensionality Reduction Marco Baroni Practical

Applied Machine Learning Dimensionality reduction using PCA Siamak Ravanbakhsh COMP 551 (Fall

Preprocessing and Dimensionality Reduction J er emy Fix CentraleSup elec

DIMENSIONALITY REDUCTION DIMENSIONALITY REDUCTION MATTHIEU BLOCH April 21, 2020 1 / 26

Probabilistic Dimensionality Reduction Neil D. Lawrence University of Sheffield Facebook, London

Kernel-Based Dimensionality Reduction Methods on Synthesized and Facial Image Data Jonathan L.

Distributional Semantics The unsupervised modeling of meaning on a large scale Tim Van de Cruys

Spatial Data: Dimensionality Reduction CS444 Techniques, Lecture 3 In this subfield, we think

Spatial Data: Dimensionality Reduction CSC444 Techniques In this subfield, we think of a data

Dimensionality Reduction INFO-4604, Applied Machine Learning University of Colorado Boulder

The DSM data matrix DSM data are given as a term-term or term-context matrix: get see use hear

Profiling Transmissivity and Contamination in Fractures Intersecting Boreholes Claire Tiedeman,

Algorithmic Game Theory Bercea Multicast and Network Formation Games Introduction Potential

On Definability for Model Counting Jean-Marie Lagniez 1 , Emmanuel Lonca 1 and Pierre Marquis 1 , 2

Mechanisms of Meaning Autumn 2010 Raquel Fernndez Institute for Logic, Language &amp;

Logic Roland Meyer TU Kaiserslautern Summer Term 2014 Roland Meyer (TU Kaiserslautern) Logic

Poster Session 1 11h 12:30h A gentle introduction by Prof. Enrique Alba Articles in section

Steiner Triple Systems Lucia Moura School of Electrical Engineering and Computer Science

Mechanisms of Meaning Autumn 2010 Raquel Fernndez Institute for Logic, Language &