SLIDE 1

Alexander Panchenko

Induction and embedding of linguistic structures from text

November 7, 2018

SLIDE 2

Overview

SLIDE 3

Inducing word sense representations:

word sense embeddings via retrofitting [Pelevina et al., 2016, Remus & Biemann, 2018];
inducing synsets [Ustalov et al., 2017b, Ustalov et al., 2017a, Ustalov et al., 2018b];
inducing semantic classes [Panchenko et al., 2018]

Making induced senses interpretable [Panchenko et al., 2017b, Panchenko et al., 2017c]

Overview


SLIDE 5

Inducing semantic frames [Ustalov et al., 2018a]

Inducing FrameNet-like structures; …using multi-way clustering.

Learning graph/network embeddings [ongoing joint work with Andrei Kutuzov and Chris Biemann]

How to represent induced networks/graphs so that they can be used in deep learning architectures, both effectively and efficiently?

Overview


SLIDE 8

Overview

SemEval 2019 Task 2

SLIDE 9

Inducing word sense representations

SLIDE 10

Inducing word sense representations

Word vs sense embeddings


SLIDE 12

Inducing word sense representations

Related work

SLIDE 13

AutoExtend [Rothe & Schütze, 2015]

* image is reproduced from the original paper

Inducing word sense representations

Related work: knowledge-based


SLIDE 15

Adagram [Bartunov et al., 2016]. Multiple vector representations θ for each word:

$$p(Y, Z, \beta \mid X, \alpha, \theta) = \prod_{w=1}^{V} \prod_{k=1}^{\infty} p(\beta_{wk} \mid \alpha) \prod_{i=1}^{N} \Big[ p(z_i \mid x_i, \beta) \prod_{j=1}^{C} p(y_{ij} \mid z_i, x_i, \theta) \Big],$$

where

$z_i$ -- a hidden variable: the sense index of word $x_i$ in context $C$;
$\alpha$ -- a meta-parameter controlling the number of senses;
$p(\beta_{wk} \mid \alpha)$ -- probability of the $k$-th sense of word $w$;
$p(z_i \mid x_i, \beta)$ -- probability of observing word $x_i$ in sense $z_i$;
$\prod_{j=1}^{C} p(y_{ij} \mid z_i, x_i, \theta)$ -- probability of the context $C$.

See also: [Neelakantan et al., 2014] and [Li and Jurafsky, 2015]

Inducing word sense representations

Related work: knowledge-free


SLIDE 17

Word sense induction (WSI) based on graph clustering:

[Lin, 1998]; [Pantel and Lin, 2002]; [Widdows and Dorow, 2002]; Chinese Whispers [Biemann, 2006]; [Hope and Keller, 2013]

Inducing word sense representations

Related work: word sense induction

SLIDE 18

Iterative formulation [Biemann, 2006]
Vector formulation [Biemann, 2006]

Inducing word sense representations

Related work: Chinese Whispers#2
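The iterative formulation of Chinese Whispers fits in a few lines of Python. The sketch below is a minimal illustration of the algorithm on a made-up toy graph, not the reference implementation of [Biemann, 2006]; function and variable names are ours.

```python
import random
from collections import defaultdict

def chinese_whispers(nodes, edges, iterations=20, seed=0):
    """Minimal Chinese Whispers sketch: every node starts in its own class and
    then repeatedly adopts the class with the highest total edge weight among
    its neighbours."""
    rng = random.Random(seed)
    labels = {node: node for node in nodes}          # each node is its own class
    neighbours = defaultdict(dict)
    for (u, v), weight in edges.items():
        neighbours[u][v] = weight
        neighbours[v][u] = weight
    for _ in range(iterations):
        order = list(nodes)
        rng.shuffle(order)                           # visit nodes in random order
        for node in order:
            if not neighbours[node]:
                continue
            votes = defaultdict(float)
            for nb, weight in neighbours[node].items():
                votes[labels[nb]] += weight          # each neighbour votes for its class
            labels[node] = max(votes, key=votes.get)
    return labels

# Toy word graph: a "vehicle" cluster and a "big cat" cluster plus "jaguar".
nodes = ["jaguar", "car", "vehicle", "engine", "cat", "tiger", "leopard"]
edges = {("car", "vehicle"): 1.0, ("car", "engine"): 0.8, ("vehicle", "engine"): 0.7,
         ("cat", "tiger"): 1.0, ("cat", "leopard"): 0.9, ("tiger", "leopard"): 0.9,
         ("jaguar", "car"): 0.5, ("jaguar", "cat"): 0.6}
print(chinese_whispers(nodes, edges))
```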


SLIDE 23

From word embeddings to sense embeddings

Pipeline: (1) learning word vectors from a text corpus; (2) calculating a word similarity graph from the word vectors; (3) word sense induction over the similarity graph, giving a sense inventory; (4) pooling of word vectors per induced sense, giving sense vectors.

Inducing word sense representations

Sense embeddings using retrofitting

SLIDE 24

Word sense induction using ego-network clustering

Inducing word sense representations

Sense embeddings using retrofitting
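A minimal sketch of the ego-network construction behind this step, assuming a pre-trained word2vec-format model loadable with gensim; the file name, k and the similarity threshold are placeholders, not the settings of [Pelevina et al., 2016]. The resulting graph over the target's neighbours can then be clustered, e.g. with the Chinese Whispers sketch above, to obtain sense clusters.

```python
from gensim.models import KeyedVectors

def build_ego_network(wv, target, k=50, threshold=0.6):
    """Nodes: the k nearest neighbours of the target word (the target itself is
    excluded). Edges: pairs of neighbours whose cosine similarity exceeds the
    threshold, weighted by that similarity."""
    neighbours = [word for word, _ in wv.most_similar(target, topn=k)]
    edges = {}
    for i, u in enumerate(neighbours):
        for v in neighbours[i + 1:]:
            similarity = float(wv.similarity(u, v))
            if similarity >= threshold:
                edges[(u, v)] = similarity
    return neighbours, edges

# Hypothetical pre-trained vectors; any word2vec-format file works here.
wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)
nodes, edges = build_ego_network(wv, "table")
# `nodes` and `edges` can now be clustered (e.g. with the Chinese Whispers sketch
# above); each resulting cluster corresponds to one induced sense of "table".
```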

SLIDE 25

Neighbours of word and sense vectors:

Vector | Nearest neighbours
table | tray, bottom, diagram, bucket, brackets, stack, basket, list, parenthesis, cup, saucer, pile, playfield, bracket, pot, drop-down, cue, plate
table#0 | leftmost#0, column#1, tableau#1, indent#1, bracket#3, pointer#0, footer#1, cursor#1, diagram#0, grid#0
table#1 | pile#1, stool#1, tray#0, basket#0, bowl#1, bucket#0, box#0, cage#0, saucer#3, mirror#1, pan#1, lid#0

Inducing word sense representations

Sense embeddings using retrofitting


SLIDE 27

Word and sense embeddings of the words ``iron'' and ``vitamin''.

LREC'18 [Remus & Biemann, 2018]

Inducing word sense representations

Sense embeddings using retrofitting

SLIDE 28

Word Sense Disambiguation

1. Context extraction: use context words around the target word.
2. Context filtering: based on a context word's relevance for disambiguation.
3. Sense choice in context: maximise similarity between the context vector and a sense vector.

Inducing word sense representations

Sense embeddings using retrofitting
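Once sense clusters and word vectors exist, the pooling step of the pipeline and the three WSD steps above reduce to a few numpy operations. This is only a sketch: the toy vectors and clusters are stand-ins, and real systems additionally weight cluster members and filter context words by relevance, which is omitted here.

```python
import numpy as np

def unit(vector):
    return vector / (np.linalg.norm(vector) + 1e-9)

def sense_vectors(clusters, word_vectors):
    """Pooling: a sense vector is the (here unweighted) mean of the vectors of
    the words in its induced sense cluster."""
    return {sense: unit(np.mean([word_vectors[w] for w in words], axis=0))
            for sense, words in clusters.items()}

def disambiguate(context_words, senses, word_vectors):
    """Sense choice in context: pick the sense whose vector is most similar to
    the mean vector of the (already extracted and filtered) context words."""
    context = [word_vectors[w] for w in context_words if w in word_vectors]
    context_vector = unit(np.mean(context, axis=0))
    return max(senses, key=lambda s: float(senses[s] @ context_vector))

# Toy 2-d word vectors and two induced senses of "table".
word_vectors = {"data": np.array([1.0, 0.1]), "column": np.array([0.9, 0.2]),
                "chair": np.array([0.1, 1.0]), "desk": np.array([0.2, 0.9]),
                "sorted": np.array([0.95, 0.15]), "kitchen": np.array([0.15, 0.95])}
clusters = {"table#0": ["data", "column"], "table#1": ["chair", "desk"]}
senses = sense_vectors(clusters, word_vectors)
print(disambiguate(["sorted", "column"], senses, word_vectors))   # table#0
print(disambiguate(["kitchen", "chair"], senses, word_vectors))   # table#1
```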


SLIDE 33

Word Sense Induction task [Pelevina et al., 2016]:

SemEval'13 dataset; performs comparably to the state of the art (as of 2016), including neural models.

Semantic Similarity task [Remus & Biemann, 2018]:

SimLex, WordSim353, MEN and other datasets; improves the results compared to the original word embeddings across different models (GloVe, word2vec, ...).

Inducing word sense representations

Evaluation: Key results


SLIDE 35

Word sense induction

SLIDE 36

Target word, e.g. ``bank''. Contexts where the word occurs, e.g.:

``river bank is a slope beside a body of water''
``bank is a financial institution that accepts deposits''
``Oh, the bank was robbed. They took about a million dollars.''
``bank of Elbe is a good and popular hangout spot complete with good food and fun''

You need to group the contexts by senses:

``river bank is a slope beside a body of water''
``bank of Elbe is a good and popular hangout spot complete with good food and fun''

``bank is a financial institution that accepts deposits''
``Oh, the bank was robbed. They took about a million dollars.''

Word sense induction

A lexical sample WSI task



SLIDE 41

Text contexts of a word
Representation of each context in a vector space
Clustering of the contexts in the vector space
Context clusters corresponding to senses

Representation:

Sparse vector model (TF-IDF, etc.)
Weighted (TF-IDF, χ², etc.) sum of word embeddings
Sentence embeddings (InferSent, Skip-Thoughts, doc2vec, etc.)

Clustering:

Affinity Propagation
Agglomerative Clustering
k-means

Word sense induction

Sense induction using clustering
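A small scikit-learn sketch of this recipe using the sparse TF-IDF representation and agglomerative clustering; the contexts are the ``bank'' examples from the earlier slide, and the number of clusters is fixed to two only for the illustration (real systems tune both the representation and the number of clusters).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import AgglomerativeClustering

contexts = [
    "river bank is a slope beside a body of water",
    "bank is a financial institution that accepts deposits",
    "Oh, the bank was robbed. They took about a million dollars.",
    "bank of Elbe is a good and popular hangout spot complete with good food and fun",
]

# 1. Represent each context as a (densified) TF-IDF vector.
X = TfidfVectorizer().fit_transform(contexts).toarray()

# 2. Cluster the context vectors; each cluster plays the role of one induced sense.
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)
for label, context in zip(labels, contexts):
    print(label, context)
```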

SLIDE 42

1. Get the neighbors of a target word, e.g. ``bank'': lender, river, citybank, slope, ...
2. Get words similar to ``bank'' and dissimilar to ``lender'': river, slope, land, ...
3. Compute distances to ``lender'' and ``river''.

Word sense induction

Sense induction using neighbors
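Steps 1-2 map directly onto standard embedding queries; a short gensim sketch, where the vector file is a placeholder and the returned neighbours depend entirely on the model.

```python
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)  # placeholder path

# 1. Nearest neighbours of the target word.
print(wv.most_similar("bank", topn=5))

# 2. Words similar to "bank" but dissimilar to its first neighbour "lender":
#    this surfaces the other sense (river, slope, ...).
print(wv.most_similar(positive=["bank"], negative=["lender"], topn=5))

# 3. Distances of a candidate word to the two sense anchors.
print(wv.distance("slope", "lender"), wv.distance("slope", "river"))
```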


SLIDE 45

1. For the i-th neighbor of the target word w among its k neighbours:
   1. Get a pair of opposite words for the i-th neighbor: (wj, wk).
   2. Add them as nodes: V = V ∪ {wj, wk}.
   3. Remember the pair as an anti-edge: A = A ∪ {(wj, wk)}.
2. Build an ego network G = (V, E) of the word w:
   1. E is computed based on word similarities;
   2. E is pruned based on the anti-edge constraints: E = E ∖ A.
3. Cluster the ego network of the word w.
4. Find cluster labels by finding the central nodes in a cluster.

Word sense induction

Graph-vector sense induction
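A sketch of the graph-construction and pruning part of this procedure. How the ``opposite'' pairs are obtained is abstracted away into the anti_pairs argument, and a toy similarity function stands in for an embedding model; this illustrates only the anti-edge constraint, not the authors' full implementation.

```python
from itertools import combinations

def build_pruned_ego_network(anti_pairs, similarity, threshold=0.5):
    """Ego network for a target word: nodes are the words from the anti-pairs,
    edges are weighted by similarity, and every edge connecting an anti-pair is
    pruned away (the anti-edge constraint)."""
    nodes, anti_edges = set(), set()
    for wj, wk in anti_pairs:
        nodes.update((wj, wk))
        anti_edges.add(frozenset((wj, wk)))
    edges = {}
    for u, v in combinations(sorted(nodes), 2):
        if frozenset((u, v)) in anti_edges:
            continue                                  # drop anti-edges
        score = similarity(u, v)
        if score >= threshold:
            edges[(u, v)] = score
    return nodes, edges

# Toy stand-in for the "java" example: similarity 1.0 inside a sense, 0.0 across senses.
sense_of = {"Python": "language", "C++": "language", "Ruby": "language",
            "Borneo": "island", "Arabica": "coffee", "Robusta": "coffee"}
similarity = lambda u, v: 1.0 if sense_of[u] == sense_of[v] else 0.0
anti_pairs = [("Python", "Borneo"), ("C++", "Borneo"),
              ("Arabica", "Python"), ("Robusta", "Python"), ("Ruby", "Arabica")]
nodes, edges = build_pruned_ego_network(anti_pairs, similarity)
print(sorted(edges))   # only within-sense edges survive; their clusters are the senses of "java"
```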


SLIDE 47

Get the neighbors of a target word, e.g. ``java'':

1. Python
2. Borneo
3. C++
4. Sumatra
5. Arabica
6. Robusta
7. Ruby
8. JavaScript
9. Bali
10. ...

Word sense induction

Graph-vector sense induction

SLIDE 48

Get the neighbors of a target word, e.g. ``java'':

1. Python ≠ Borneo
2. Borneo ≠ Scala
3. C++ ≠ Borneo
4. Sumatra ≠ highway
5. Arabica ≠ Python
6. Robusta ≠ Python
7. Ruby ≠ Arabica
8. Bali ≠ North

Word sense induction

Graph-vector sense induction

SLIDE 49

Nodes:

1. Python
2. Borneo
3. C++
4. Arabica
5. Robusta
6. Ruby

Word sense induction

Graph-vector sense induction

SLIDE 50

Word sense induction

Sense induction example


SLIDE 54

1. SemEval 2007
2. SemEval 2010
3. RUSSE 2018
4. SemEval 2019 Task 2, Subtask 1: clustering of verb occurrences. Assign occurrences of the target verbs to a number of clusters, in such a way that verbs belonging to the same cluster evoke the same frame type. Gold annotations for this subtask are based on FrameNet.

Examples:
``Trump leads the world, backward.''
``Disrespecting international laws leads to many complications.''
``Rosenzweig heads the climate impacts section at NASA's Goddard Institute.''

Word sense induction

Datasets


SLIDE 57

Semantic frame ``Abandonment'' from FrameNet

Word sense induction

Semantic roles

SLIDE 58

A semantic class contains words that share a semantic feature. Examples of concrete semantic classes:

people, plants, animals, materials, programming languages.

Examples of abstract semantic classes:

qualities, actions, processes.

Word sense induction

Semantic classes

SLIDE 59

Word Sense | Local Sense Cluster: Related Senses | Hypernyms
mango#0 | peach#1, grape#0, plum#0, apple#0, apricot#0, watermelon#1, banana#1, coconut#0, pear#0, fig#0, melon#0, mangosteen#0, … | fruit#0, food#0, …
apple#0 | mango#0, pineapple#0, banana#1, melon#0, grape#0, peach#1, watermelon#1, apricot#0, cranberry#0, pumpkin#0, mangosteen#0, … | fruit#0, crop#0, …
Java#1 | C#4, Python#3, Apache#3, Ruby#6, Flash#1, C++#0, SQL#0, ASP#2, Visual Basic#1, CSS#0, Delphi#2, MySQL#0, Excel#0, Pascal#0, … | programming language#3, language#0, …
Python#3 | PHP#0, Pascal#0, Java#1, SQL#0, Visual Basic#1, C++#0, JavaScript#0, Apache#3, Haskell#5, .NET#1, C#4, SQL Server#0, … | language#0, technology#0, …

Word sense induction

Sample of induced sense inventory

SLIDE 60

ID | Global Sense Cluster: Semantic Class | Hypernyms
1 | peach#1, banana#1, pineapple#0, berry#0, blackberry#0, grapefruit#0, strawberry#0, blueberry#0, mango#0, grape#0, melon#0, orange#0, pear#0, plum#0, raspberry#0, watermelon#0, apple#0, apricot#0, pumpkin#0, mangosteen#0, … | vegetable#0, fruit#0, crop#0, ingredient#0, food#0, …
2 | C#4, Basic#2, Haskell#5, Flash#1, Java#1, Pascal#0, Ruby#6, PHP#0, Ada#1, Oracle#3, Python#3, Apache#3, Visual Basic#1, ASP#2, Delphi#2, SQL Server#0, CSS#0, AJAX#0, JavaScript#0, .NET#1, … | programming language#3, technology#0, language#0, format#2, app#0

Word sense induction

Sample of induced semantic classes

SLIDE 61

[Pipeline figure: Text Corpus → Word Sense Induction from Text Corpus (Induced Word Senses) → Representing Senses with Ego Networks (Sense Ego-Networks) → Sense Graph Construction (Global Sense Graph) → Clustering of Word Senses (Global Sense Clusters) → Labeling Sense Clusters with Hypernyms (Noisy Hypernyms → Cleansed Hypernyms) → Semantic Classes.]

Word sense induction

Induction of semantic classes

SLIDE 62

Filtering noisy hypernyms with semantic classes, LREC'18 [Panchenko et al., 2018]:

[Figure: a sense cluster (apple#2, mango#0, pear#0) with hypernyms (fruit#1, food#0); the wrong hypernym city#2 is removed and the missing cluster member mangosteen#0 is added.]

Word sense induction

Induction of sense semantic classes

SLIDE 63

http://panchenko.me/data/joint/nodes20000-layers7

Word sense induction

Global sense clustering


SLIDE 65

Filtering of a noisy hypernymy database with semantic classes. LREC'18 [Panchenko et al., 2018]

Model | Precision | Recall | F-score
Original Hypernyms (Seitner et al., 2016) | 0.475 | 0.546 | 0.508
Semantic Classes (coarse-grained) | 0.541 | 0.679 | 0.602

Word sense induction

Induction of sense semantic classes

SLIDE 66

Making induced senses interpretable

SLIDE 67

Knowledge-based sense representations are interpretable

Making induced senses interpretable

SLIDE 68

Most knowledge-free sense representations are uninterpretable

Making induced senses interpretable


SLIDE 70

Hypernymy prediction in context. EMNLP'17 [Panchenko et al., 2017b]

Making induced senses interpretable

SLIDE 71

Induction of semantic frames

SLIDE 72

Induction of semantic frames

FrameNet: frame ''Kidnapping''

SLIDE 73

ACL'2018 [Ustalov et al., 2018a]. Example of a LU tricluster corresponding to the ``Kidnapping'' frame from FrameNet:

FrameNet Role | Slot | Lexical Units (LU)
Perpetrator | Subject | kidnapper, alien, militant
FEE | Verb | snatch, kidnap, abduct
Victim | Object | son, people, soldier, child

Induction of semantic frames

Frame induction as a triclustering

SLIDE 74

Induction of semantic frames

SVO triple elements

SLIDE 75

[Figure: a fragment of the SVO triple graph; nodes are subject|verb|object triples such as officer|chair|committee, mayor|lead|city, chairman|lead|company, director|head|department, president|chair|committee, general|command|department and minister|head|government.]

Induction of semantic frames

An SVO triple graph


SLIDE 78

Input: an embedding model $v \in V \to \vec{v} \in \mathbb{R}^d$, a set of SVO triples $T \subseteq V^3$, the number of nearest neighbors $k \in \mathbb{N}$, a graph clustering algorithm Cluster.
Output: a set of triframes $F$.

$S \leftarrow \{t \to \vec{t} \in \mathbb{R}^{3d} : t \in T\}$
$E \leftarrow \{(t, t') \in T^2 : t' \in \mathrm{NN}_S^k(\vec{t}\,),\ t \neq t'\}$
$F \leftarrow \emptyset$
for all $C \in \mathrm{Cluster}(T, E)$ do
  $f_s \leftarrow \{s \in V : (s, v, o) \in C\}$
  $f_v \leftarrow \{v \in V : (s, v, o) \in C\}$
  $f_o \leftarrow \{o \in V : (s, v, o) \in C\}$
  $F \leftarrow F \cup \{(f_s, f_v, f_o)\}$
return $F$

Induction of semantic frames

Triframes frame induction
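A compact numpy/scikit-learn sketch of the same procedure on toy data: triples are embedded by concatenating their word vectors, a k-nearest-neighbour graph is built over them, and one frame is read off each cluster. For brevity the clustering step here is plain connected components of the k-NN graph rather than the pluggable Cluster algorithm of the pseudocode above, and the vectors are placeholders.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import connected_components

def triframes(triples, vectors, k=1):
    """Embed each SVO triple as the concatenation of its three word vectors,
    connect every triple to its k nearest neighbours, cluster the resulting
    graph (here: connected components) and read one frame per cluster."""
    S = np.array([np.concatenate([vectors[s], vectors[v], vectors[o]]) for s, v, o in triples])
    index = NearestNeighbors(n_neighbors=min(k + 1, len(triples))).fit(S).kneighbors(S, return_distance=False)
    adjacency = lil_matrix((len(triples), len(triples)))
    for i, row in enumerate(index):
        for j in row:
            if j != i:
                adjacency[i, j] = 1
    _, labels = connected_components(adjacency, directed=False)
    frames = {}
    for label, (s, v, o) in zip(labels, triples):
        frame = frames.setdefault(label, {"subjects": set(), "verbs": set(), "objects": set()})
        frame["subjects"].add(s); frame["verbs"].add(v); frame["objects"].add(o)
    return list(frames.values())

# Toy 2-d word vectors and four SVO triples forming two obvious frames.
vectors = {"student": np.array([0.1, 1.0]), "scientist": np.array([0.2, 0.9]),
           "company": np.array([1.0, 0.1]), "firm": np.array([0.9, 0.2]),
           "conduct": np.array([0.0, 1.0]), "perform": np.array([0.1, 0.9]),
           "buy": np.array([1.0, 0.0]), "purchase": np.array([0.9, 0.1]),
           "experiment": np.array([0.0, 0.8]), "test": np.array([0.1, 0.7]),
           "share": np.array([0.8, 0.0]), "product": np.array([0.7, 0.1])}
triples = [("student", "conduct", "experiment"), ("scientist", "perform", "test"),
           ("company", "buy", "share"), ("firm", "purchase", "product")]
for frame in triframes(triples, vectors):
    print(frame)
```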

SLIDE 79

Frame # 848
Subjects: Company, firm, company
Verbs: buy, supply, discharge, purchase, expect
Objects: book, supply, house, land, share, company, grain, which, item, product, ticket, work, this, equipment, House, it, film, water, something, she, what, service, plant, time

Induction of semantic frames

Example of an extracted frame

SLIDE 80

Frame # 849
Subjects: student, scientist, we, pupil, member, company, man, nobody, you, they, US, group, it, people, Man, user, he
Verbs: do, test, perform, execute, conduct
Objects: experiment, test

Induction of semantic frames

Example of an extracted frame

SLIDE 81

Frame # 3207
Subjects: people, we, they, you
Verbs: feel, seek, look, search
Objects: housing, inspiration, gold, witness, partner, accommodation, Partner

Induction of semantic frames

Example of an extracted frame


SLIDE 83

Dataset | # instances | # unique | # clusters
FrameNet Triples | 99,744 | 94,170 | 383
Poly. Verb Classes | 246 | 110 | 62

Quality measures: nmPU (normalized modified purity), niPU (normalized inverse purity).

Induction of semantic frames

Evaluation settings


SLIDE 85

F1-scores for verbs, subjects, objects, and frames.

Induction of semantic frames

Results: comparison to the state of the art

SLIDE 86

Graph embeddings

SLIDE 87

Image source: https://www.tensorflow.org/tutorials/word2vec

Graph embeddings

Text: sparse symbolic representation


SLIDE 89

Graph embeddings

Graph: sparse symbolic representation

SLIDE 90

From a survey on graph embeddings [Hamilton et al., 2017]:

Graph embeddings

Embedding graph into a vector space

SLIDE 91

From a survey on graph embeddings [Hamilton et al., 2017]:

Graph embeddings

Learning with an ''autoencoder''

SLIDE 92

From a survey on graph embeddings [Hamilton et al., 2017]:

Graph embeddings

Some established approaches


SLIDE 94

Given a tree $(V, E)$ (with $h$ its height).

Leacock-Chodorow (LCH) similarity measure:
$$\mathrm{sim}(v_i, v_j) = -\log \frac{\mathrm{shortest\_path\_distance}(v_i, v_j)}{2h}$$

Jiang-Conrath (JCN) similarity measure:
$$\mathrm{sim}(v_i, v_j) = \frac{2 \ln P_{lcs}(v_i, v_j)}{\ln P(v_i) + \ln P(v_j)}$$

Graph embeddings

Graph embeddings using similarities
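Both measures are available off the shelf in NLTK's WordNet interface, which is presumably what the ``LCH in NLTK'' and ``JCN in NLTK'' baselines later in the talk refer to; JCN additionally requires an information-content file. A minimal usage sketch:

```python
# Requires: nltk.download("wordnet"), nltk.download("wordnet_ic")
from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

dog, cat = wn.synset("dog.n.01"), wn.synset("cat.n.01")

# Leacock-Chodorow: -log(shortest path length / (2 * taxonomy depth)).
print(dog.lch_similarity(cat))

# Jiang-Conrath additionally needs corpus-based information content (SemCor or Brown).
semcor_ic = wordnet_ic.ic("ic-semcor.dat")
brown_ic = wordnet_ic.ic("ic-brown.dat")
print(dog.jcn_similarity(cat, semcor_ic), dog.jcn_similarity(cat, brown_ic))
```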

SLIDE 95

path2vec model (arxiv.org/abs/1808.05611):

$$\mathcal{L} = \frac{1}{|T|} \sum_{(v_i, v_j) \in T} \left( (\vec{v}_i^{\,\top} \vec{v}_j - \mathrm{sim}(v_i, v_j))^2 + \alpha\, \vec{v}_i^{\,\top} \vec{v}_{in} + \alpha\, \vec{v}_j^{\,\top} \vec{v}_{jm} \right),$$

where
$\mathrm{sim}(v_i, v_j)$ -- the value of a ``gold'' similarity measure between a pair of nodes $(v_i, v_j)$;
$\vec{v}_i$ -- the embedding of node $v_i$;
$T$ -- a training batch;
$\vec{v}_{in}$ -- a random adjacent node of $v_i$ (and $\vec{v}_{jm}$ of $v_j$);
$\alpha$ -- a small regularization coefficient, e.g. 0.001.

Graph embeddings

Graph embeddings using similarities
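A literal numpy transcription of the batch loss as written on the slide, with toy arrays standing in for real node embeddings; an actual implementation would minimise this loss with SGD over the embedding matrix.

```python
import numpy as np

def path2vec_batch_loss(embeddings, batch, gold_sim, adjacent, alpha=0.001, seed=0):
    """Batch loss as on the slide: squared error between the dot product of two
    node embeddings and the gold graph similarity, plus the alpha-weighted terms
    involving one random adjacent node per endpoint."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for vi, vj in batch:
        v_in = embeddings[rng.choice(adjacent[vi])]   # random neighbour of vi in the graph
        v_jm = embeddings[rng.choice(adjacent[vj])]   # random neighbour of vj in the graph
        total += (embeddings[vi] @ embeddings[vj] - gold_sim[(vi, vj)]) ** 2 \
                 + alpha * (embeddings[vi] @ v_in) + alpha * (embeddings[vj] @ v_jm)
    return total / len(batch)

# Toy example: four nodes on a chain 0-1-2-3 with 3-dimensional embeddings.
embeddings = {n: np.random.default_rng(n).normal(size=3) for n in range(4)}
adjacent = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
gold_sim = {(0, 2): 0.5, (1, 3): 0.4}
print(path2vec_batch_loss(embeddings, [(0, 2), (1, 3)], gold_sim, adjacent))
```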

SLIDE 96

Computation of 82,115 pairwise similarities:

Model | Running time
LCH in NLTK | 30 sec.
JCN in NLTK | 6.7 sec.
FSE embeddings | 0.713 sec.
path2vec and other float vectors | 0.007 sec.

Graph embeddings

Speedup: graph vs embeddings

SLIDE 97

Spearman correlation scores with WordNet similarities on SimLex999 noun pairs (selection of synsets):

Model | JCN-SemCor | JCN-Brown | LCH
WordNet | 1.0 | 1.0 | 1.0
Node2vec | 0.655 | 0.671 | 0.724
Deepwalk | 0.775 | 0.774 | 0.868
FSE | 0.830 | 0.820 | 0.900
path2vec | 0.917 | 0.914 | 0.934

Graph embeddings

Results: goodness of fit

SLIDE 98

Spearman correlations with human SimLex999 noun similarities:

Model | Correlation
Raw WordNet JCN-SemCor | 0.487
Raw WordNet JCN-Brown | 0.495
Raw WordNet LCH | 0.513
node2vec [Grover & Leskovec, 2016] | 0.450
Deepwalk [Perozzi et al., 2014] | 0.533
FSE [Subercaze et al., 2015] | 0.556
path2vec JCN-SemCor | 0.549
path2vec JCN-Brown | 0.540
path2vec LCH | 0.540

Graph embeddings

Results: SimLex999 dataset


SLIDE 100

Conclusion

SLIDE 101

We can induce word senses, synsets, semantic classes, and semantic frames in a knowledge-free way using graph clustering and distributional models.

We can make the induced word senses interpretable in a knowledge-free way with hypernyms, images, and definitions.

We can link induced senses to lexical resources to improve the performance of WSD and to enrich lexical resources with emerging senses; see [Panchenko, 2016, Faralli et al., 2016, Panchenko et al., 2017a, Biemann et al., 2018].

We can represent language graphs using graph embeddings for use in deep neural models.

Conclusion

Take home messages


References

Bartunov, S., Kondrashkin, D., Osokin, A., & Vetrov, D. (2016). Breaking sticks and ambiguities with adaptive skip-gram. In Artificial Intelligence and Statistics (pp. 130-138).
Biemann, C. (2006). Chinese whispers: an efficient graph clustering algorithm and its application to natural language processing problems. In Proceedings of the first workshop on graph based methods for natural language processing (pp. 73-80). Association for Computational Linguistics.
Biemann, C., Faralli, S., Panchenko, A., & Ponzetto, S. P. (2018). A framework for enriching lexical semantic resources with distributional semantics. In Journal of Natural Language Engineering (pp. 56-64). Cambridge Press.
Faralli, S., Panchenko, A., Biemann, C., & Ponzetto, S. P. (2016). Linked disambiguated distributional semantic networks. In International Semantic Web Conference (pp. 56-64). Springer.
Grover, A. & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 855-864). ACM.
Hamilton, W. L., Ying, R., & Leskovec, J. (2017). Representation learning on graphs: Methods and applications. IEEE Data Engineering Bulletin, September 2017.
Panchenko, A. (2016). Best of both worlds: Making word sense embeddings interpretable. In LREC.
Panchenko, A., Faralli, S., Ponzetto, S. P., & Biemann, C. (2017a). Using linked disambiguated distributional networks for word sense disambiguation. In Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications (pp. 72-78). Valencia, Spain: Association for Computational Linguistics.
Panchenko, A., Marten, F., Ruppert, E., Faralli, S., Ustalov, D., Ponzetto, S. P., & Biemann, C. (2017b). Unsupervised, knowledge-free, and interpretable word sense disambiguation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 91-96). Copenhagen, Denmark: Association for Computational Linguistics.
Panchenko, A., Ruppert, E., Faralli, S., Ponzetto, S. P., & Biemann, C. (2017c). Unsupervised does not mean uninterpretable: The case for word sense induction and disambiguation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers (pp. 86-98). Valencia, Spain: Association for Computational Linguistics.
Panchenko, A., Ustalov, D., Faralli, S., Ponzetto, S. P., & Biemann, C. (2018). Improving hypernymy extraction with distributional semantic classes. In Proceedings of LREC 2018. Miyazaki, Japan: European Language Resources Association.
Pelevina, M., Arefiev, N., Biemann, C., & Panchenko, A. (2016). Making sense of word embeddings. In Proceedings of the 1st Workshop on Representation Learning for NLP (pp. 174-183). Berlin, Germany: Association for Computational Linguistics.
Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 701-710). ACM.
Remus, S. & Biemann, C. (2018). Retrofitting word representations for unsupervised sense aware word similarities. In Proceedings of LREC 2018. Miyazaki, Japan: European Language Resources Association.
Rothe, S. & Schütze, H. (2015). AutoExtend: Extending word embeddings to embeddings for synsets and lexemes. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 1793-1803). Beijing, China: Association for Computational Linguistics.
Subercaze, J., Gravier, C., & Laforest, F. (2015). On metric embedding for boosting semantic similarity computations. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers) (pp. 8-14). Association for Computational Linguistics.
Ustalov, D., Chernoskutov, M., Biemann, C., & Panchenko, A. (2017a). Fighting with the sparsity of synonymy dictionaries for automatic synset induction. In International Conference on Analysis of Images, Social Networks and Texts (pp. 94-105). Springer.
Ustalov, D., Panchenko, A., & Biemann, C. (2017b). Watset: Automatic induction of synsets from a graph of synonyms. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1579-1590). Vancouver, Canada: Association for Computational Linguistics.
Ustalov, D., Panchenko, A., Kutuzov, A., Biemann, C., & Ponzetto, S. P. (2018a). Unsupervised semantic frame induction using triclustering. arXiv preprint arXiv:1805.04715.
Ustalov, D., Teslenko, D., Panchenko, A., Chernoskutov, M., & Biemann, C. (2018b). Word sense disambiguation based on automatically induced synsets. In LREC 2018, 11th International Conference on Language Resources and Evaluation, 7-12 May 2018, Miyazaki (Japan). Paris: European Language Resources Association, ELRA-ELDA.