The Role of Dimensionality Reduction in Distributional Semantics

  1. The Role of Dimensionality Reduction in Distributional Semantics
     or: having fun with matrix algebra
     Stefan Evert, Technische Universität Darmstadt, Germany
     evert@linglit.tu-darmstadt.de
     Leuven Statistics Days, 8 June 2012
     Running footer: Stefan Evert (TU Darmstadt), Dimensionality Reduction for DSM, wordspace.collocations.de

  2. Outline
     Introduction
     Definitions and notation
     Sparse high-dimensional models
     Dimensionality reduction
     Singular value decomposition (SVD)
     Interpretations of SVD
     Alternatives to SVD
     A case study
     Outlook and discussion

  3. Introduction: Definitions and notation

  4. General definition of DSMs
     A distributional semantic model (DSM) is a scaled and/or transformed co-occurrence matrix $M$, such that each row $\mathbf{m}$ represents the distribution of a target term across contexts.

                  get     see     use    hear     eat    kill
     knife      0.027  -0.024   0.206  -0.022  -0.044  -0.042
     cat        0.031   0.143  -0.243  -0.015  -0.009   0.131
     dog       -0.026   0.021  -0.212   0.064   0.013   0.014
     boat      -0.022   0.009  -0.044  -0.040  -0.074  -0.042
     cup       -0.014  -0.173  -0.249  -0.099  -0.119  -0.042
     pig       -0.069   0.094  -0.158   0.000   0.094   0.265
     banana     0.047  -0.139  -0.104  -0.022   0.267  -0.042

     Term = word form, lemma, phrase, morpheme, word pair, ...
     Targets = rows (terms whose distribution is represented)
     Features = columns (individual contexts or collocates)

  5. Notation: term-context matrix
     Frequency matrix $F \in \mathbb{R}^{k \times n}$ (term-context row vectors $\mathbf{f}_i \in \mathbb{R}^n$), with rows $\mathbf{f}_1^T, \mathbf{f}_2^T, \ldots, \mathbf{f}_k^T$:

             Felidae  Pet  Feral  Bloat  Philosophy  Kant  Back pain
     cat          10   10      7      –           –     –          –
     dog           –   10      4     11           –     –          –
     animal        2   15     10      2           –     –          –
     time          1    –      –      –           2     1          –
     reason        –    1      –      –           1     4          1
     cause         –    –      –      2           1     2          6
     effect        –    –      –      1           –     1          –

     Interpretation as a collection of row vectors: $F = (f_{ij})$, where $f_{ij} = (\mathbf{f}_i)_j$ is the frequency count of target term $t_i$ in context $c_j$ (with respect to context tokens, here: Wikipedia articles).
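As a rough illustration of how such a frequency matrix could be counted, here is a minimal Python/NumPy sketch. The three "articles", their texts, and the helper name `term_context_matrix` are invented for this example and are not taken from the slides; tokenisation is a plain whitespace split.

```python
import numpy as np

# Hypothetical toy contexts: the titles echo the slide, the texts are made up.
contexts = {
    "Felidae": "the cat family includes the domestic cat and many a wild cat",
    "Pet": "a dog or a cat kept as a pet is a companion animal",
    "Kant": "pure reason asks whether every effect has a cause",
}
targets = ["cat", "dog", "animal", "reason", "cause", "effect"]


def term_context_matrix(targets, contexts):
    """Count how often each target term occurs in each context (document)."""
    row = {term: i for i, term in enumerate(targets)}
    F = np.zeros((len(targets), len(contexts)), dtype=int)
    for j, text in enumerate(contexts.values()):
        for token in text.lower().split():
            if token in row:
                F[row[token], j] += 1  # f_ij: count of term t_i in context c_j
    return F


F = term_context_matrix(targets, contexts)
print(F)
```

Each row of `F` is one vector $\mathbf{f}_i$; with real data the matrix would be far larger and mostly zero, so a sparse representation would be the more realistic choice.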

  6. Notation: term-term matrix
     Co-occurrence matrix $M \in \mathbb{R}^{k \times n}$ (term-term row vectors $\mathbf{m}_i \in \mathbb{R}^n$), with rows $\mathbf{m}_1^T, \mathbf{m}_2^T, \ldots, \mathbf{m}_k^T$:

               breed  tail  feed  kill  important  explain  likely
     cat          83    17     7    37          –        1       –
     dog         561    13    30    60          1        2       4
     animal       42    10   109   134         13        5       5
     time         19     9    29   117         81       34     109
     reason        1     –     2    14         68      140      47
     cause         –     1     –     4         55       34      55
     effect        –     –     1     6         60       35      17

     Interpretation as a collection of row vectors: $M = (m_{ij})$, where $m_{ij} = (\mathbf{m}_i)_j$ is the co-occurrence frequency of target term $t_i$ with feature term $\tau_j$ (a collocate of $t_i$).
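One common way to obtain term-term counts is a symmetric word window around each target token. The slide does not say which context type was used, so the window approach, the window size, the toy sentence, and the function name `cooccurrence_matrix` below are all assumptions made for this sketch.

```python
import numpy as np


def cooccurrence_matrix(tokens, targets, features, window=2):
    """Count feature terms within +/- `window` tokens of each target token."""
    row = {term: i for i, term in enumerate(targets)}
    col = {term: j for j, term in enumerate(features)}
    M = np.zeros((len(targets), len(features)), dtype=int)
    for pos, token in enumerate(tokens):
        if token not in row:
            continue
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for ctx_pos in range(lo, hi):
            neighbour = tokens[ctx_pos]
            if ctx_pos != pos and neighbour in col:
                M[row[token], col[neighbour]] += 1  # m_ij for (t_i, tau_j)
    return M


# Hypothetical toy corpus, invented for illustration only.
tokens = "the dog will kill the cat if you feed the dog".split()
M_toy = cooccurrence_matrix(tokens, ["cat", "dog"], ["kill", "feed"], window=2)
print(M_toy)
```

Here "kill" falls inside the window of both "cat" and the first "dog", while "feed" is only near the second "dog". Changing `window` changes the counts, which is why context type and size appear among the DSM parameters.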

  7-15. DSM parameters (one diagram, built up step by step over nine slides)
     Corpus with linguistic annotation
     → Term-context vs. term-term matrix
     → Type & size of context
     → Feature scaling
     → Similarity/distance measure & normalisation
     → Dimensionality reduction
     → Semantic distance, nearest neighbours, semantic maps, ...
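To make some of these parameter choices concrete, the sketch below runs a small part of the chain on the co-occurrence counts from the term-term matrix on slide 6. The specific choices are assumptions for illustration, not the settings used in the talk: dashes in the table are read as zero counts, feature scaling is `log(1 + x)`, rows are normalised to unit Euclidean length, and similarity is the cosine. The dimensionality reduction step is left out here.

```python
import numpy as np

targets = ["cat", "dog", "animal", "time", "reason", "cause", "effect"]

# Counts from the term-term matrix on slide 6; "–" is ASSUMED to mean 0.
M = np.array([
    [83, 17, 7, 37, 0, 1, 0],
    [561, 13, 30, 60, 1, 2, 4],
    [42, 10, 109, 134, 13, 5, 5],
    [19, 9, 29, 117, 81, 34, 109],
    [1, 0, 2, 14, 68, 140, 47],
    [0, 1, 0, 4, 55, 34, 55],
    [0, 0, 1, 6, 60, 35, 17],
], dtype=float)

# Feature scaling (assumed choice): dampen large counts with log(1 + x).
scaled = np.log1p(M)

# Normalisation: bring every row vector to unit Euclidean length.
unit = scaled / np.linalg.norm(scaled, axis=1, keepdims=True)

# Similarity measure: for unit vectors, the dot product is the cosine.
cosine = unit @ unit.T


def nearest_neighbours(term, n=3):
    """Return the n most similar target terms, best first."""
    i = targets.index(term)
    order = np.argsort(-cosine[i])
    return [(targets[j], float(cosine[i, j])) for j in order if j != i][:n]


print(nearest_neighbours("cat"))
```

With these settings the closest neighbour of "cat" is "dog". Other scaling functions or distance measures can reorder the neighbours, which is the point of treating each step as a parameter.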

  16. Geometric interpretation and semantic distance
      [Figure: two dimensions of the English V-Obj DSM; scatter plot of knife, boat, dog and cat, x-axis "get", y-axis "use", both from 0 to 120]
      - row vector $\mathbf{m}_{\text{dog}}$ describes usage of the word dog in the corpus
      - can be seen as coordinates of a point in $n$-dimensional Euclidean space $\mathbb{R}^n$
      - illustrated for two dimensions: get and use
      - $\mathbf{m}_{\text{dog}} = (115, 10)$

  17-20. Geometric interpretation and semantic distance (continued, built up over four slides)
      [Figure: the same plot of knife, boat, dog and cat, x-axis "get", y-axis "use"; marked are the Euclidean distances d = 57.5 and d = 63.3, the vectors rescaled to equal length, and the angle α = 54.3°]
      - similarity = spatial proximity (Euclidean metric)
      - location depends on frequency of noun ($f_{\text{dog}} \approx 2.7 \cdot f_{\text{cat}}$)
      - direction more important than location
      - normalise "length" $\|\mathbf{m}_{\text{dog}}\|$ of vector
      - or use angle $\alpha$ as distance measure
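The three ideas on these slides (Euclidean distance, length normalisation, angle) can be checked numerically. In the sketch below only the coordinates of dog, (115, 10), come from the text. The coordinates of cat, boat and knife are not given in the transcript; the values used are assumptions, chosen so that they reproduce the reported d = 57.5, d = 63.3 and α = 54.3°. Which pairs of points those figures refer to on the original plot is likewise an assumption.

```python
import numpy as np

# Only dog = (115, 10) is stated on the slides. The other coordinates are
# ASSUMED values that are consistent with the reported distances and angle.
points = {
    "dog": np.array([115.0, 10.0]),
    "cat": np.array([52.0, 4.0]),
    "boat": np.array([59.0, 23.0]),
    "knife": np.array([51.0, 84.0]),
}


def euclidean(u, v):
    """Euclidean distance between two points."""
    return float(np.linalg.norm(u - v))


def normalise(v):
    """Rescale a vector to unit Euclidean length."""
    return v / np.linalg.norm(v)


def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))


d_dog_cat = euclidean(points["dog"], points["cat"])
d_dog_boat = euclidean(points["dog"], points["boat"])
alpha = angle_deg(points["knife"], points["cat"])

# After normalisation only the direction of the vectors matters.
d_unit = euclidean(normalise(points["dog"]), normalise(points["cat"]))

print(round(d_dog_cat, 1), round(d_dog_boat, 1), round(alpha, 1), round(d_unit, 3))
```

Under these assumed coordinates, dog and cat are far apart in raw Euclidean terms because dog is the more frequent noun, yet they point in almost the same direction, so their distance nearly vanishes once both vectors are normalised.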
