From unsupervised induction of linguistic structures from text towards applications in deep learning
Alexander Panchenko
May 28, 2018
In close collaboration with …
Andrei Kutuzov, Eugen Ruppert, Fide Marten, Nikolay Arefyev, Steffen Remus, Martin Riedl, Hubert Naets, Maria Pelevina, Anastasiya Lopukhina, Konstantin Lopukhin
In collaboration with …
Motivation
Levels of Linguistic Analysis
[Figure: the major levels of linguistic structure. Image source: https://commons.wikimedia.org/wiki/File:Major_levels_of_linguistic_structure.svg]
Linguistic Structures and Graphs
(Written) language is a symbolic system. At the semantic level: typed, weighted graphs of concepts:
- Co-occurrence networks
- Lexical databases, e.g. WordNet
- Thesauri, e.g. NLM
- Ontologies, e.g. DBpedia
- Associative networks, e.g. the Edinburgh Associative Thesaurus
- …
Semantic Graphs

The brave new world of Deep Learning
''Anti-connectivism'' and end-to-end learning: symbolic representations aren't needed; a word-embedding lookup (at most).
Graph Matrix Duality
The adjacency matrix A is dual with the corresponding graph G. The vector-matrix multiply A^T x is dual with breadth-first search.
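To make the duality concrete, here is a minimal sketch (assuming an unweighted directed graph stored as a NumPy adjacency matrix; the frontier vector x plays the role of the BFS queue):

```python
import numpy as np

# Adjacency matrix of a small directed graph: A[i, j] = 1 iff there is an edge i -> j.
A = np.array([[0, 1, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])

def bfs_levels(A, source):
    """Breadth-first search expressed as repeated vector-matrix products."""
    n = A.shape[0]
    x = np.zeros(n, dtype=bool)
    x[source] = True            # current frontier
    visited = x.copy()
    level = {source: 0}
    depth = 0
    while x.any():
        depth += 1
        # A.T @ x reaches all successors of the current frontier in one product.
        x = (A.T @ x).astype(bool) & ~visited
        visited |= x
        for v in np.flatnonzero(x):
            level[int(v)] = depth
    return level

print(bfs_levels(A, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}
```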
May 28, 2018 From unsupervised induction of linguistic structures to applications in deep learning, A. Panchenko 12/109
1 Learn interpretable symbolic structures from text in an
unsupervised way, which are more complex than words.
2 Represent the learned structures in a vector space. 3 Use the vector representations instead/in addition to word
embedding the deep learning applications. Lookup of word senses, frames, etc.
4 More complex structures could improve performance, but
also provide better interpretability of the deep learning models.
Motivation
Goal: Linguistic Structures in DL
May 28, 2018 From unsupervised induction of linguistic structures to applications in deep learning, A. Panchenko 12/109
1 Learn interpretable symbolic structures from text in an
unsupervised way, which are more complex than words.
2 Represent the learned structures in a vector space. 3 Use the vector representations instead/in addition to word
embedding the deep learning applications. Lookup of word senses, frames, etc.
4 More complex structures could improve performance, but
also provide better interpretability of the deep learning models.
Motivation
Goal: Linguistic Structures in DL
May 28, 2018 From unsupervised induction of linguistic structures to applications in deep learning, A. Panchenko 12/109
1 Learn interpretable symbolic structures from text in an
unsupervised way, which are more complex than words.
2 Represent the learned structures in a vector space. 3 Use the vector representations instead/in addition to word
embedding the deep learning applications. Lookup of word senses, frames, etc.
4 More complex structures could improve performance, but
also provide better interpretability of the deep learning models.
Motivation
Goal: Linguistic Structures in DL
May 28, 2018 From unsupervised induction of linguistic structures to applications in deep learning, A. Panchenko 12/109
1 Learn interpretable symbolic structures from text in an
unsupervised way, which are more complex than words.
2 Represent the learned structures in a vector space. 3 Use the vector representations instead/in addition to word
embedding the deep learning applications. Lookup of word senses, frames, etc.
4 More complex structures could improve performance, but
also provide better interpretability of the deep learning models.
Motivation
Goal: Linguistic Structures in DL
Overview
Inducing word sense representations:
- word sense embeddings via retrofitting [Pelevina et al., 2016, Remus & Biemann, 2018];
- inducing synsets [Ustalov et al., 2017b, Ustalov et al., 2017a, Ustalov et al., 2018b];
- inducing semantic classes [Panchenko et al., 2018b].
Making induced senses interpretable [Panchenko et al., 2017b, Panchenko et al., 2017c].
Linking induced word senses to lexical resources [Panchenko, 2016, Faralli et al., 2016, Panchenko et al., 2017a, Biemann et al., 2018].
A shared task on word sense induction [Panchenko et al., 2018a, Arefyev et al., 2018].
Inducing semantic frames [Ustalov et al., 2018a]: inducing FrameNet-like structures using multi-way clustering.
Learning graph/network embeddings [ongoing joint work with Andrei Kutuzov and Chris Biemann]: how to represent induced networks/graphs effectively and efficiently, so that they can be used in deep learning architectures.
Inducing word sense representations
Word vs sense embeddings

Related work
Related work: knowledge-based
AutoExtend [Rothe & Schütze, 2015]
(* image on the slide is reproduced from the original paper)
Related work: knowledge-free
AdaGram [Bartunov et al., 2016]: multiple vector representations θ for each word:

p(Y, Z, \beta \mid X, \alpha, \theta) = \prod_{w=1}^{V} \prod_{k=1}^{\infty} p(\beta_{wk} \mid \alpha) \prod_{i=1}^{N} \Big[ p(z_i \mid x_i, \beta) \prod_{j=1}^{C} p(y_{ij} \mid z_i, x_i, \theta) \Big]

where z_i is a hidden variable: the sense index of word x_i in context C; \alpha is a meta-parameter controlling the number of senses.
See also: [Neelakantan et al., 2014] and [Li and Jurafsky, 2015].
Related work: word sense induction
Word sense induction (WSI) based on graph clustering: [Lin, 1998], [Pantel and Lin, 2002], [Widdows and Dorow, 2002], Chinese Whispers [Biemann, 2006], [Hope and Keller, 2013].
Related work: Chinese Whispers #1
[Image source: http://ic.pics.livejournal.com/blagin_anton/33716210/2701748/2701748_800.jpg]
Related work: Chinese Whispers #2
Iterative formulation [Biemann, 2006]; vector formulation [Biemann, 2006] (see the sketch below).
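The iterative formulation is compact enough to sketch in a few lines of Python (a minimal version, assuming an undirected weighted graph given as a dict of dicts; the original algorithm in [Biemann, 2006] additionally randomizes tie-breaking and offers several edge-weighting schemes):

```python
import random
from collections import defaultdict

def chinese_whispers(graph, iterations=20, seed=0):
    """graph: {node: {neighbor: edge_weight}}; returns {node: cluster_label}."""
    rng = random.Random(seed)
    labels = {node: node for node in graph}      # each node starts in its own class
    nodes = list(graph)
    for _ in range(iterations):
        rng.shuffle(nodes)                       # process nodes in random order
        for node in nodes:
            scores = defaultdict(float)
            for neighbor, weight in graph[node].items():
                scores[labels[neighbor]] += weight
            if scores:                           # adopt the strongest class among neighbors
                labels[node] = max(scores, key=scores.get)
    return labels

g = {"table": {"desk": 1.0, "chair": 0.5},
     "desk": {"table": 1.0, "chair": 0.7},
     "chair": {"table": 0.5, "desk": 0.7},
     "data": {"figures": 1.0},
     "figures": {"data": 1.0}}
print(chinese_whispers(g))  # two clusters: furniture vs. data
```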
Sense embeddings using retrofitting
RepL4NLP@ACL'16 [Pelevina et al., 2016], LREC'18 [Remus & Biemann, 2018]
Prior methods: induce an inventory by clustering word instances, or use existing sense inventories.
Our method: input: word embeddings; output: word sense embeddings; word sense induction by clustering of word ego-networks.
From word embeddings to sense embeddings:
1. Learning word vectors: text corpus → word vectors.
2. Calculating the word similarity graph: word vectors → word similarity graph.
3. Word sense induction: word similarity graph → sense inventory.
4. Pooling of word vectors: sense inventory + word vectors → sense vectors.
Word sense induction using ego-network clustering (see the sketch below).
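A minimal sketch of steps 3-4 (assumptions: a gensim KeyedVectors model `wv`; the chinese_whispers function defined above serves as the graph clustering algorithm; sense vectors are obtained by mean pooling of cluster members):

```python
import numpy as np
from collections import defaultdict

def induce_sense_vectors(wv, word, topn=50, ego_topn=10):
    """Cluster the ego-network of `word` and pool member vectors into sense vectors."""
    neighbors = [w for w, _ in wv.most_similar(word, topn=topn)]
    # Ego-network: nearest neighbors of the target word, connected to each other
    # when they are mutually similar; the target word itself is excluded.
    graph = {n: {} for n in neighbors}
    for n in neighbors:
        for m, sim in wv.most_similar(n, topn=ego_topn):
            if m in graph and m != n:
                graph[n][m] = sim
    labels = chinese_whispers(graph)             # from the sketch above
    clusters = defaultdict(list)
    for node, label in labels.items():
        clusters[label].append(node)
    # One sense vector per cluster: the mean of the member word vectors.
    return {f"{word}#{i}": np.mean([wv[w] for w in members], axis=0)
            for i, members in enumerate(clusters.values())}
```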
Neighbours of word and sense vectors:

Vector | Nearest neighbours
table | tray, bottom, diagram, bucket, brackets, stack, basket, list, parenthesis, cup, saucer, pile, playfield, bracket, pot, drop-down, cue, plate
table#0 | leftmost#0, column#1, tableau#1, indent#1, bracket#3, pointer#0, footer#1, cursor#1, diagram#0, grid#0
table#1 | pile#1, stool#1, tray#0, basket#0, bowl#1, bucket#0, box#0, cage#0, saucer#3, mirror#1, pan#1, lid#0
[Figure: word and sense embeddings of the words iron and vitamin.] LREC'18 [Remus & Biemann, 2018]
Word Sense Disambiguation (a sketch follows the list):
1. Context extraction: use context words around the target word.
2. Context filtering: based on each context word's relevance for disambiguation.
3. Sense choice in context: maximise the similarity between the context vector and a sense vector.
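A minimal sketch of the sense choice step (assuming the sense vectors induced above and a gensim KeyedVectors model for the context words; context filtering is reduced here to dropping out-of-vocabulary words):

```python
import numpy as np

def disambiguate(wv, sense_vectors, context_words):
    """Pick the sense whose vector is most similar to the mean context vector."""
    in_vocab = [w for w in context_words if w in wv]   # crude context filtering
    context = np.mean([wv[w] for w in in_vocab], axis=0)
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(sense_vectors, key=lambda s: cosine(sense_vectors[s], context))

# senses = induce_sense_vectors(wv, "table")
# print(disambiguate(wv, senses, "put the cup on the kitchen".split()))
```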
Unsupervised WSD, SemEval'13, RepL4NLP [Pelevina et al., 2016]:

Model | Jacc. | Tau | WNDCG | F.NMI | F.B-Cubed
AI-KU (add1000) | 0.176 | 0.609 | 0.205 | 0.033 | 0.317
AI-KU | 0.176 | 0.619 | 0.393 | 0.066 | 0.382
AI-KU (remove5-add1000) | 0.228 | 0.654 | 0.330 | 0.040 | 0.463
Unimelb (5p) | 0.198 | 0.623 | 0.374 | 0.056 | 0.475
Unimelb (50k) | 0.198 | 0.633 | 0.384 | 0.060 | 0.494
UoS (#WN senses) | 0.171 | 0.600 | 0.298 | 0.046 | 0.186
UoS (top-3) | 0.220 | 0.637 | 0.370 | 0.044 | 0.451
La Sapienza (1) | 0.131 | 0.544 | 0.332 | – | –
La Sapienza (2) | 0.131 | 0.535 | 0.394 | – | –
AdaGram, α = 0.05, 100 dim | 0.274 | 0.644 | 0.318 | 0.058 | 0.470
w2v | 0.197 | 0.615 | 0.291 | 0.011 | 0.615
w2v (nouns) | 0.179 | 0.626 | 0.304 | 0.011 | 0.623
JBT | 0.205 | 0.624 | 0.291 | 0.017 | 0.598
JBT (nouns) | 0.198 | 0.643 | 0.310 | 0.031 | 0.595
TWSI (nouns) | 0.215 | 0.651 | 0.318 | 0.030 | 0.573
Unsupervised WSD on SemEval'13 (RepL4NLP [Pelevina et al., 2016]) is comparable to the state of the art, incl. sense embeddings.
Semantic relatedness, LREC'2018 [Remus & Biemann, 2018]:

Dataset | AutoExtend | AdaGram | SGNS | SGNS+senses | GloVe | GloVe+senses | Sympat | Sympat+senses | LSA-bow | LSA-bow+senses | LSA-hal | LSA-hal+senses | Paragram-SL | Paragram-SL+senses
SimLex999 | 0.45 | 0.29 | 0.44 | 0.46 | 0.37 | 0.41 | 0.54 | 0.55 | 0.30 | 0.39 | 0.27 | 0.38 | 0.68 | 0.64
MEN | 0.72 | 0.67 | 0.77 | 0.78 | 0.73 | 0.77 | 0.53 | 0.68 | 0.67 | 0.70 | 0.71 | 0.74 | 0.77 | 0.80
SimVerb | 0.43 | 0.27 | 0.36 | 0.39 | 0.23 | 0.30 | 0.37 | 0.45 | 0.15 | 0.22 | 0.19 | 0.28 | 0.53 | 0.53
WordSim353 | 0.58 | 0.61 | 0.70 | 0.69 | 0.61 | 0.65 | 0.47 | 0.62 | 0.67 | 0.66 | 0.59 | 0.63 | 0.72 | 0.73
SimLex999-N | 0.44 | 0.33 | 0.45 | 0.50 | 0.39 | 0.47 | 0.48 | 0.55 | 0.32 | 0.46 | 0.34 | 0.44 | 0.68 | 0.66
MEN-N | 0.72 | 0.68 | 0.77 | 0.79 | 0.76 | 0.80 | 0.57 | 0.74 | 0.71 | 0.73 | 0.73 | 0.76 | 0.78 | 0.81
Synset induction
ACL'17 [Ustalov et al., 2017b]. Examples of extracted synsets:

Size | Synset
2 | {decimal point, dot}
3 | {gullet, throat, food pipe}
4 | {microwave meal, ready meal, TV dinner, frozen dinner}
5 | {objective case, accusative case, oblique case, object case, accusative}
6 | {radio theater, dramatized audiobook, audio theater, radio play, radio drama, audio play}
Outline of the 'Watset' method (local-global fuzzy graph clustering):
1. Learn word embeddings from a background corpus and compute word similarities.
2. Build an ambiguous weighted graph from a synonymy dictionary, weighted by the word similarities.
3. Local clustering (word sense induction): cluster the ego-network of each word to obtain a sense inventory.
4. Disambiguate the neighbours of each sense, yielding a disambiguated weighted graph.
5. Global clustering (synset induction): cluster the disambiguated graph into synsets.
[Figure: F-scores of CW, MCL, MaxMax, ECO, CPM, and Watset against four gold standards: WordNet (English), BabelNet (English), RuWordNet (Russian), and YARN (Russian).]
Sample of the induced sense inventory:

Word Sense | Local Sense Cluster: Related Senses | Hypernyms
mango#0 | peach#1, grape#0, plum#0, apple#0, apricot#0, watermelon#1, banana#1, coconut#0, pear#0, fig#0, melon#0, mangosteen#0, … | fruit#0, food#0, …
apple#0 | mango#0, pineapple#0, banana#1, melon#0, grape#0, peach#1, watermelon#1, apricot#0, cranberry#0, pumpkin#0, mangosteen#0, … | fruit#0, crop#0, …
Java#1 | C#4, Python#3, Apache#3, Ruby#6, Flash#1, C++#0, SQL#0, ASP#2, Visual Basic#1, CSS#0, Delphi#2, MySQL#0, Excel#0, Pascal#0, … | programming language#3, language#0, …
Python#3 | PHP#0, Pascal#0, Java#1, SQL#0, Visual Basic#1, C++#0, JavaScript#0, Apache#3, Haskell#5, .NET#1, C#4, SQL Server#0, … | language#0, technology#0, …
Sample of the induced semantic classes:

ID | Global Sense Cluster: Semantic Class | Hypernyms
1 | peach#1, banana#1, pineapple#0, berry#0, blackberry#0, grapefruit#0, strawberry#0, blueberry#0, mango#0, grape#0, melon#0, orange#0, pear#0, plum#0, raspberry#0, watermelon#0, apple#0, apricot#0, pumpkin#0, mangosteen#0, … | vegetable#0, fruit#0, crop#0, ingredient#0, food#0, …
2 | C#4, Basic#2, Haskell#5, Flash#1, Java#1, Pascal#0, Ruby#6, PHP#0, Ada#1, Oracle#3, Python#3, Apache#3, Visual Basic#1, ASP#2, Delphi#2, SQL Server#0, CSS#0, AJAX#0, JavaScript#0, .NET#1, … | programming language#3, technology#0, language#0, format#2, app#0
Induction of semantic classes:
1. Word sense induction from a text corpus → induced word senses.
2. Representing senses with ego networks → sense ego-networks.
3. Sense graph construction → global sense graph.
4. Clustering of word senses → global sense clusters.
5. Labeling sense clusters with hypernyms: noisy hypernyms → cleansed hypernyms → semantic classes.
Filtering noisy hypernyms with semantic classes, LREC'18 [Panchenko et al., 2018b]:
[Figure: for the sense cluster {apple#2, mango#0, pear#0, mangosteen#0, …} with hypernyms {fruit#1, food#0}, the semantic class removes a wrong hypernym (city#2) and adds a missing one (food#0).]
Global sense clustering: http://panchenko.me/data/joint/nodes20000-layers7
Filtering of a noisy hypernymy database with semantic classes, LREC'18 [Panchenko et al., 2018b]:

Model | Precision | Recall | F-score
Original Hypernyms (Seitner et al., 2016) | 0.475 | 0.546 | 0.508
Semantic Classes (coarse-grained) | 0.541 | 0.679 | 0.602
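A minimal sketch of the filtering idea (my reading of the approach, not the authors' exact procedure: hypernyms frequent across the members of a semantic class are kept and propagated to every member, rare ones are dropped; `min_share` is a hypothetical threshold):

```python
from collections import Counter

def filter_hypernyms(sense_clusters, noisy_hypernyms, min_share=0.3):
    """sense_clusters: {class_id: [sense, ...]}; noisy_hypernyms: {sense: {hypernym, ...}}."""
    cleansed = {}
    for class_id, senses in sense_clusters.items():
        counts = Counter(h for s in senses for h in noisy_hypernyms.get(s, ()))
        # Keep hypernyms shared by a sufficient fraction of the class members;
        # every member inherits them, which removes wrong and adds missing labels.
        common = {h for h, c in counts.items() if c / len(senses) >= min_share}
        for s in senses:
            cleansed[s] = common
    return cleansed

clusters = {1: ["apple#2", "mango#0", "pear#0", "mangosteen#0"]}
noisy = {"apple#2": {"fruit#1", "city#2"}, "mango#0": {"fruit#1", "food#0"},
         "pear#0": {"fruit#1", "food#0"}, "mangosteen#0": {"fruit#1"}}
print(filter_hypernyms(clusters, noisy))  # city#2 is dropped, food#0 is shared
```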
Making induced senses interpretable
Knowledge-based sense representations are interpretable.
Most knowledge-free sense representations are uninterpretable.
Hypernymy prediction in context, EMNLP'17 [Panchenko et al., 2017b].
Evaluation on 11,702 sentences and 863 words with an average polysemy of 3.1. WSD model accuracy:

Inventory | Features | Hypers | HyperHypers
Word Senses | Random | 0.257 | 0.610
Word Senses | MFS | 0.292 | 0.682
Word Senses | Cluster Words | 0.291 | 0.650
Word Senses | Context Words | 0.308 | 0.686
Super Senses | Random | 0.001 | 0.001
Super Senses | MFS | 0.001 | 0.001
Super Senses | Cluster Words | 0.174 | 0.365
Super Senses | Context Words | 0.086 | 0.188
Linking induced senses to resources
LREC'16 [Panchenko, 2016], ISWC'16 [Faralli et al., 2016], SENSE@EACL'17 [Panchenko et al., 2017a], NLE'18 [Biemann et al., 2018]

Construction of a proto-conceptualization (PCZ):
1. Induce a graph of semantically related words from a text corpus.
2. Word sense induction → word sense inventory.
3. Disambiguation of neighbours → graph of related senses.
4. Labeling senses with hypernyms → labeled word senses (the PCZ).
Linking the proto-conceptualization to a lexical resource (LR: WordNet, BabelNet, …):
5. Construction of sense feature representations.
6. Linking induced senses to senses of the LR → partially linked senses.
7. Typing of the unmapped induced senses → enriched lexical resource.
Examples of linked senses for the word ``python'' (AdaGram induced senses vs BabelNet senses):

Word | AdaGram sense | BabelNet sense | AdaGram BoW | BabelNet BoW
python | 2 | bn:01713224n | perl, php, java, smalltalk, ruby, lua, tcl, scripting, javascript, bindings, binding, programming, coldfusion, actionscript, net, … | language, programming, pythonista, python programming, python3, python2, level, computer, pythonistas, python3000
python | 1 | bn:01157670n | monty, circus, spamalot, python, magoo, muppet, snoopy, featurette, disney, tunes, tune, classic, shorts, short, apocalypse, … | monty, comedy, monty python, british, monte, monte python, troupe, pythonesque, foot, artist, record, surreal, terry, …
python | 3 | bn:00046456n | spectacled, unicornis, snake, giant, caiman, leopard, squirrel, crocodile, horned, cat, mole, elephant, opossum, pheasant, … | molurus, indian, boa, tigris, tiger python, rock, tiger, indian python, reptile, python molurus, indian rock python, coluber, …
python | 4 | bn:01157670n | circus, fly, flying, dusk, lizard, moth, unicorn, puff, adder, vulture, tyrannosaurus, zephyr, badger, … | monty, comedy, monty python, british, monte, monte python, troupe, pythonesque, foot, artist, record, surreal, terry, …
python | 1 | bn:00473212n | monty, circus, spamalot, python, magoo, muppet, snoopy, featurette, disney, tunes, tune, classic, shorts, short, apocalypse, … | pictures, monty, python monty pictures, limited, company, python pictures limited, kingdom, picture, serve, director, …
python | 1 | bn:03489893n | monty, circus, spamalot, python, magoo, muppet, snoopy, featurette, disney, tunes, tune, classic, shorts, short, apocalypse, … | film, horror, movie, clabaugh, richard, monster, century, direct, snake, python movie, television, giant, natural, language, for-tv, …
Enriched sense representations:

Model | Representation of the sense "disk (medium)"
WordNet | memory, device, floppy, disk, hard, disk, disk, computer, science, computing, diskette, fixed, disk, floppy, magnetic, disc, magnetic, disk, hard, disc, storage, device
WordNet + Linked | recorder, disk, floppy, console, diskette, handset, desktop, iPhone, iPod, HDTV, kit, RAM, Discs, Blu-ray, computer, GB, microchip, site, cartridge, printer, tv, VCR, Disc, player, LCD, software, component, camcorder, cellphone, card, monitor, display, burner, Web, stereo, internet, model, iTunes, turntable, chip, cable, camera, iphone, notebook, device, server, surface, wafer, page, drive, laptop, screen, pc, television, hardware, YouTube, dvr, DVD, product, folder, VCR, radio, phone, circuitry, partition, megabyte, peripheral, format, machine, tuner, website, merchandise, equipment, gb, discs, MP3, hard-drive, piece, video, storage device, memory device, microphone, hd, EP, content, soundtrack, webcam, system, blade, graphic, microprocessor, collection, document, programming, battery, keyboard, HD, handheld, CDs, reel, web, material, hard-disk, ep, chart, debut, configuration, recording, album, broadcast, download, fixed disk, planet, pda, microfilm, iPod, videotape, text, cylinder, cpu, canvas, label, sampler, workstation, electrode, magnetic disc, catheter, magnetic disk, Video, mobile, cd, song, modem, mouse, tube, set, ipad, signal, substrate, vinyl, music, clip, pad, audio, compilation, memory, message, reissue, ram, CD, subsystem, hdd, touchscreen, electronics, demo, shell, sensor, file, shelf, processor, cassette, extra, mainframe, motherboard, floppy disk, lp, tape, version, kilobyte, pacemaker, browser, Playstation, pager, module, cache, DVD, movie, Windows, cd-rom, e-book, valve, directory, harddrive, smartphone, audiotape, technology, hard disk, show, computing, computer science, Blu-Ray, blu-ray, HDD, HD-DVD, scanner, hard disc, gadget, booklet, copier, playback, TiVo, controller, filter, DVDs, gigabyte, paper, mp3, CPU, dvd-r, pipe, cd-r, playlist, slot, VHS, film, videocassette, interface, adapter, database, manual, book, channel, changer, storage
Evaluation of linking accuracy: [Figure].

Evaluation of enriched representations based on WSD: [Figure].
Shared task on word sense induction
A shared task on WSI
An ACL SIGSLAV sponsored shared task on word sense induction (WSI) for the Russian language. More details: https://russe.nlpub.org/2018/wsi
A lexical sample WSI task
Given a target word, e.g. ``bank'', and contexts where the word occurs, e.g.:
- ``river bank is a slope beside a body of water''
- ``bank is a financial institution that accepts deposits''
- ``Oh, the bank was robbed. They took about a million dollars.''
- ``bank of Elbe is a good and popular hangout spot complete with good food and fun''
you need to group the contexts by sense (see the sketch below):
- ``river bank is a slope beside a body of water''; ``bank of Elbe is a good and popular hangout spot complete with good food and fun''
- ``bank is a financial institution that accepts deposits''; ``Oh, the bank was robbed. They took about a million dollars.''
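A minimal baseline for such a task (one of many possible approaches, not a participant system: represent each context by the average of its word vectors and cluster the contexts, here with scikit-learn's KMeans and an assumed number of senses k=2):

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_contexts(wv, contexts, k=2):
    """Group contexts of a target word into k induced senses."""
    def embed(text):
        vecs = [wv[w] for w in text.lower().split() if w in wv]
        return np.mean(vecs, axis=0)
    X = np.vstack([embed(c) for c in contexts])
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

contexts = [
    "river bank is a slope beside a body of water",
    "bank is a financial institution that accepts deposits",
    "Oh, the bank was robbed. They took about a million dollars.",
    "bank of Elbe is a good and popular hangout spot",
]
# labels = cluster_contexts(wv, contexts, k=2)  # e.g. [0, 1, 1, 0]
```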
Three datasets: based on Wikipedia (wiki-wiki), based on the RNC (bts-rnc), and based on dictionary glosses (active-dict); samples from each. [Figures]
jamsic: sense induction
1. Get the neighbors of a target word, e.g. ``bank'': lender, river, citybank, slope, …
2. Get words similar to ``bank'' and dissimilar to ``lender'': river, slope, land, …
3. Compute distances to ``lender'' and ``river'' (see the sketch below).
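Steps 1-2 map directly onto the gensim KeyedVectors API (a sketch; the model path is hypothetical, and the vector arithmetic in most_similar with a negative example is what separates the two senses):

```python
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)  # hypothetical path

# 1. Neighbors of the target word.
neighbors = [w for w, _ in wv.most_similar("bank", topn=5)]          # e.g. lender, river, ...

# 2. Similar to "bank" and dissimilar to its first neighbor.
second_sense = [w for w, _ in wv.most_similar(positive=["bank"], negative=["lender"], topn=5)]

# 3. Assign a context word to the closer of the two sense indicators.
for context_word in ["deposit", "shore"]:
    print(context_word, wv.similarity(context_word, "lender"), wv.similarity(context_word, "river"))
```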
Induction of semantic frames
FrameNet: frame ``Kidnapping''
Frame induction as triclustering
ACL'2018 [Ustalov et al., 2018a]. Example of an LU tricluster corresponding to the ``Kidnapping'' frame from FrameNet:

FrameNet Role | Slot | Lexical Units (LU)
Perpetrator | Subject | kidnapper, alien, militant
FEE | Verb | snatch, kidnap, abduct
Victim | Object | son, people, soldier, child
SVO triple elements
An SVO triple graph
[Figure: a nearest-neighbor graph over SVO triples such as officer|chair|committee, mayor|lead|city, chairman|lead|company, director|head|department, president|head|government, and General|command|department; edges connect distributionally similar triples.]
Triframes frame induction
Input: an embedding model v ∈ V → ⃗v ∈ R^d, a set of SVO triples T ⊆ V³, the number of nearest neighbors k ∈ N, a graph clustering algorithm Cluster.
Output: a set of triframes F.

S ← {t → ⃗t ∈ R^{3d} : t ∈ T}  (embed each triple, e.g. by concatenation)
E ← {(t, t′) ∈ T² : t′ ∈ NN_k^S(⃗t), t ≠ t′}  (k-nearest-neighbor graph over triples)
F ← ∅
for all C ∈ Cluster(T, E) do
  f_s ← {s ∈ V : (s, v, o) ∈ C}
  f_v ← {v ∈ V : (s, v, o) ∈ C}
  f_o ← {o ∈ V : (s, v, o) ∈ C}
  F ← F ∪ {(f_s, f_v, f_o)}
return F
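A minimal sketch of this procedure in Python (assumptions: a gensim KeyedVectors model `wv`, the chinese_whispers function above as the Cluster algorithm, and scikit-learn's NearestNeighbors for the kNN graph):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def triframes(wv, triples, k=10):
    """Induce triframes from (subject, verb, object) triples."""
    # Embed each triple as the concatenation of its three word vectors (R^{3d}).
    triples = [t for t in triples if all(w in wv for w in t)]
    X = np.vstack([np.concatenate([wv[s], wv[v], wv[o]]) for s, v, o in triples])
    # kNN graph over triples; edge weights are (1 - cosine distance).
    nn = NearestNeighbors(n_neighbors=k + 1, metric="cosine").fit(X)
    dist, idx = nn.kneighbors(X)
    graph = {i: {} for i in range(len(triples))}
    for i, (drow, irow) in enumerate(zip(dist, idx)):
        for d, j in zip(drow[1:], irow[1:]):     # skip self at position 0
            graph[i][int(j)] = 1.0 - d
    labels = chinese_whispers(graph)             # from the sketch above
    frames = {}
    for i, label in labels.items():
        fs, fv, fo = frames.setdefault(label, (set(), set(), set()))
        s, v, o = triples[i]
        fs.add(s); fv.add(v); fo.add(o)
    return list(frames.values())                 # [(subjects, verbs, objects), ...]
```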
Examples of extracted frames:

Frame #848. Subjects: Company, firm, company. Verbs: buy, supply, discharge, purchase, expect. Objects: book, supply, house, land, share, company, grain, which, item, product, ticket, work, this, equipment, House, it, film, water, something, she, what, service, plant, time.

Frame #849. Subjects: student, scientist, we, pupil, member, company, man, nobody, you, they, US, group, it, people, Man, user, he. Verbs: do, test, perform, execute, conduct. Objects: experiment, test.

Frame #3207. Subjects: people, we, they, you. Verbs: feel, seek, look, search. Objects: housing, inspiration, gold, witness, partner, accommodation, Partner.
Evaluation datasets and settings

Dataset | # instances | # unique | # clusters
FrameNet Triples | 99,744 | 94,170 | 383
Poly. Verb Classes | 246 | 110 | 62

Quality measures: nmPU (normalized modified purity) and niPU (normalized inverse purity).
Results: comparison to the state of the art
[Figure: F1-scores for verbs, subjects, objects, and frames.]
Graph embeddings
Text: sparse symbolic representation
[Image source: https://www.tensorflow.org/tutorials/word2vec]

Graph: sparse symbolic representation
From a survey on graph embeddings [Hamilton et al., 2017]: embedding a graph into a vector space; learning with an ``autoencoder''; some established approaches. [Figures]
Graph embeddings using similarities
A submitted joint work with Andrei Kutuzov and Chris Biemann. Given a tree (V, E):

Leacock-Chodorow (LCH) similarity measure:
sim(v_i, v_j) = -\log \frac{\mathrm{shortest\_path\_distance}(v_i, v_j)}{2h}

Jiang-Conrath (JCN) similarity measure:
sim(v_i, v_j) = \frac{2 \ln P_{lcs}(v_i, v_j)}{\ln P(v_i) + \ln P(v_j)}
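For instance, LCH over an explicit tree can be computed directly with networkx (a small sketch; h is the depth of the taxonomy, and the toy edges below are hypothetical):

```python
import math
import networkx as nx

def lch_similarity(tree, vi, vj, h):
    """Leacock-Chodorow similarity over an undirected tree, as on the slide."""
    d = nx.shortest_path_length(tree, vi, vj)
    return -math.log(max(d, 1) / (2 * h))        # guard against log(0) for vi == vj

tree = nx.Graph([("entity", "object"), ("object", "artifact"),
                 ("artifact", "instrumentality"), ("artifact", "structure")])
print(lch_similarity(tree, "instrumentality", "structure", h=4))  # -log(2/8) ~ 1.386
```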
path2vec (arxiv.org/abs/1808.05611): Approximating Structural Node Similarities with Node Embeddings:

J = \frac{1}{|T|} \sum_{(v_i, v_j) \in T} \left( \vec{v}_i \cdot \vec{v}_j - sim(v_i, v_j) \right)^2

where sim(v_i, v_j) is the value of a `gold' similarity measure between a pair of nodes v_i and v_j; \vec{v}_i is the embedding of a node; T is a training batch.
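A minimal sketch of this objective in PyTorch (assumptions: nodes indexed 0..n-1, training pairs with precomputed gold similarities; the embeddings are trained from scratch):

```python
import torch

n, dim = 1000, 64
emb = torch.nn.Embedding(n, dim)
opt = torch.optim.Adam(emb.parameters(), lr=1e-3)

def train_step(pairs, gold_sims):
    """pairs: LongTensor [batch, 2]; gold_sims: FloatTensor [batch]."""
    vi, vj = emb(pairs[:, 0]), emb(pairs[:, 1])
    pred = (vi * vj).sum(dim=1)                 # dot product v_i . v_j
    loss = ((pred - gold_sims) ** 2).mean()     # J from the slide
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

pairs = torch.randint(0, n, (128, 2))
gold = torch.rand(128)                          # e.g. precomputed JCN or LCH values
print(train_step(pairs, gold))
```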
Speedup: graph vs embeddings
Computation of 82,115 pairwise similarities:

Model | Running time
LCH in NLTK | 30 sec.
JCN in NLTK | 6.7 sec.
FSE embeddings | 0.713 sec.
path2vec and other float vectors | 0.007 sec.
Results: goodness of fit
Spearman correlation scores with WordNet similarities on SimLex999 noun pairs (selection of synsets):

Model | JCN-SemCor | JCN-Brown | LCH
WordNet | 1.0 | 1.0 | 1.0
node2vec | 0.655 | 0.671 | 0.724
DeepWalk | 0.775 | 0.774 | 0.868
FSE | 0.830 | 0.820 | 0.900
path2vec | 0.917 | 0.914 | 0.934
Results: SimLex999 dataset
Spearman correlations with human SimLex999 noun similarities:

Model | Correlation
Raw WordNet JCN-SemCor | 0.487
Raw WordNet JCN-Brown | 0.495
Raw WordNet LCH | 0.513
node2vec [Grover & Leskovec, 2016] | 0.450
DeepWalk [Perozzi et al., 2014] | 0.533
FSE [Subercaze et al., 2015] | 0.556
path2vec JCN-SemCor | 0.526
path2vec JCN-Brown | 0.487
path2vec LCH | 0.522
[Figure: correlation plots for JCN (left) and LCH (right).]
Results: word sense disambiguation
WSD: each column lists all the possible synsets for the corresponding word. [Figure]
F1 scores on the Senseval-2 word sense disambiguation task:

Model | F-measure
WordNet JCN-SemCor | 0.620
WordNet JCN-Brown | 0.561
WordNet LCH | 0.547
node2vec [Grover & Leskovec, 2016] | 0.501
DeepWalk [Perozzi et al., 2014] | 0.528
FSE [Subercaze et al., 2015] | 0.536
path2vec (batch size 20 / 50 / 100):
JCN-SemCor | 0.543 / 0.543 / 0.535
JCN-Brown | 0.538 / 0.515 / 0.542
LCH | 0.540 / 0.535 / 0.536
Improved model and results
The objective is extended with terms that tie each node to a random adjacent node:

L = \frac{1}{|T|} \sum_{(v_i, v_j) \in T} \left( \left( \vec{v}_i \cdot \vec{v}_j - sim(v_i, v_j) \right)^2 + \vec{v}_i \cdot \vec{v}_{in} + \vec{v}_j \cdot \vec{v}_{jm} \right)

where sim(v_i, v_j) is the value of a `gold' similarity measure between a pair of nodes v_i and v_j; \vec{v}_i is the embedding of a node; T is a training batch; v_{in} is a random adjacent node of v_i (and v_{jm} of v_j).
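Continuing the path2vec sketch above, only the loss changes (a sketch under the assumption that neighbor indices are sampled per pair; the sign convention follows the slide):

```python
def train_step_regularized(pairs, gold_sims, vin_idx, vjm_idx):
    """vin_idx / vjm_idx: LongTensor [batch] of random neighbors of v_i / v_j."""
    vi, vj = emb(pairs[:, 0]), emb(pairs[:, 1])
    pred = (vi * vj).sum(dim=1)
    # Extra terms from the slide: dot products with random adjacent nodes.
    reg = (vi * emb(vin_idx)).sum(dim=1) + (vj * emb(vjm_idx)).sum(dim=1)
    loss = ((pred - gold_sims) ** 2 + reg).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```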
Conclusion

Vectors + Graphs = ♡
Take home messages
- We can induce word senses, synsets, semantic classes, and semantic frames in a knowledge-free way using graph clustering and distributional models.
- We can make the induced word senses interpretable in a knowledge-free way, with hypernyms, images, and definitions.
- We can link induced senses to lexical resources to improve the performance of WSD and to enrich lexical resources with emerging senses.
- We can represent language graphs using graph embeddings in deep neural models.
Natural Language Engineering journal
A special issue on informing neural architectures for NLP with linguistic and background knowledge, with Ivan Vulić and Simone Paolo Ponzetto: goo.gl/A76NGX

Thank you! Questions?
References

Arefyev, N., Ermolaev, P., & Panchenko, A. (2018). How much does a word weigh? Weighting word embeddings for word sense induction. arXiv preprint arXiv:1805.09209.

Bartunov, S., Kondrashkin, D., Osokin, A., & Vetrov, D. (2016). Breaking sticks and ambiguities with adaptive skip-gram. In Artificial Intelligence and Statistics (pp. 130-138).

Biemann, C. (2006). Chinese whispers: an efficient graph clustering algorithm and its application to natural language processing problems. In Proceedings of the first workshop on graph based methods for natural language processing (pp. 73-80). Association for Computational Linguistics.

Biemann, C., Faralli, S., Panchenko, A., & Ponzetto, S. P. (2018). A framework for enriching lexical semantic resources with distributional semantics. In Journal of Natural Language Engineering (pp. 56-64). Cambridge Press.

Faralli, S., Panchenko, A., Biemann, C., & Ponzetto, S. P. (2016). Linked disambiguated distributional semantic networks. In International Semantic Web Conference (pp. 56-64). Springer.

Grover, A. & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 855-864). ACM.

Hamilton, W. L., Ying, R., & Leskovec, J. (2017). Representation learning on graphs: Methods and applications. IEEE Data Engineering Bulletin, September 2017.

Panchenko, A. (2016). Best of both worlds: Making word sense embeddings interpretable. In LREC.

Panchenko, A., Faralli, S., Ponzetto, S. P., & Biemann, C. (2017a). Using linked disambiguated distributional networks for word sense disambiguation. In Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications (pp. 72-78). Valencia, Spain: Association for Computational Linguistics.

Panchenko, A., Lopukhina, A., Ustalov, D., Lopukhin, K., Arefyev, N., Leontyev, A., & Loukachevitch, N. (2018a). RUSSE'2018: A shared task on word sense induction for the Russian language. arXiv preprint arXiv:1803.05795.

Panchenko, A., Marten, F., Ruppert, E., Faralli, S., Ustalov, D., Ponzetto, S. P., & Biemann, C. (2017b). Unsupervised, knowledge-free, and interpretable word sense disambiguation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 91-96). Copenhagen, Denmark: Association for Computational Linguistics.

Panchenko, A., Ruppert, E., Faralli, S., Ponzetto, S. P., & Biemann, C. (2017c). Unsupervised does not mean uninterpretable: The case for word sense induction and disambiguation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers (pp. 86-98). Valencia, Spain: Association for Computational Linguistics.

Panchenko, A., Ustalov, D., Faralli, S., Ponzetto, S. P., & Biemann, C. (2018b). Improving hypernymy extraction with distributional semantic classes. In Proceedings of LREC 2018. Miyazaki, Japan: European Language Resources Association.

Pelevina, M., Arefiev, N., Biemann, C., & Panchenko, A. (2016). Making sense of word embeddings. In Proceedings of the 1st Workshop on Representation Learning for NLP (pp. 174-183). Berlin, Germany: Association for Computational Linguistics.

Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 701-710). ACM.

Remus, S. & Biemann, C. (2018). Retrofitting word representations for unsupervised sense aware word similarities. In Proceedings of LREC 2018. Miyazaki, Japan: European Language Resources Association.

Rothe, S. & Schütze, H. (2015). AutoExtend: Extending word embeddings to embeddings for synsets and lexemes. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 1793-1803). Beijing, China: Association for Computational Linguistics.

Subercaze, J., Gravier, C., & Laforest, F. (2015). On metric embedding for boosting semantic similarity computations. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers) (pp. 8-14). Association for Computational Linguistics.

Ustalov, D., Chernoskutov, M., Biemann, C., & Panchenko, A. (2017a). Fighting with the sparsity of synonymy dictionaries for automatic synset induction. In International Conference on Analysis of Images, Social Networks and Texts (pp. 94-105). Springer.

Ustalov, D., Panchenko, A., & Biemann, C. (2017b). Watset: Automatic induction of synsets from a graph of synonyms. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vancouver, Canada: Association for Computational Linguistics.