

SLIDE 1

Alexander Panchenko

From unsupervised induction of linguistic structures from text towards applications in deep learning

SLIDE 2

In close collaboration with …

SLIDE 3

Andrei Kutuzov, Eugen Ruppert, Fide Marten, Nikolay Arefyev, Steffen Remus, Martin Riedl, Hubert Naets, Maria Pelevina, Anastasiya Lopukhina, Konstantin Lopukhin

In collaboration with …

SLIDE 4

Motivation

SLIDE 5

Image source: https://commons.wikimedia.org/wiki/File:Major_levels_of_linguistic_structure.svg

Motivation

Levels of Linguistic Analysis


SLIDE 7

(Written) language is a symbolic system. Semantic level: typed weighted graphs of concepts.

Co-occurrence networks; lexical databases, e.g. WordNet; thesauri, e.g. NLM; ontologies, e.g. DBpedia; associative networks, e.g. the Edinburgh Associative Thesaurus; …

Motivation

Linguistic Structures and Graphs

SLIDE 8

Motivation

Semantic Graphs


SLIDE 10

"Anti-connectivism". End-to-end learning: symbolic representations aren't needed; a word embedding lookup at most.

Motivation

The brave new world of Deep Learning


SLIDE 12

The adjacency matrix A is dual with the corresponding graph G. The vector-matrix multiply A^T x is dual with breadth-first search (a sketch follows below).

Motivation

Graph Matrix Duality
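To make the duality concrete, here is a minimal numpy sketch (an illustration, not from the talk; the example graph is an assumption) of BFS as repeated multiplication of a frontier vector by the adjacency matrix:

```python
import numpy as np

# Adjacency matrix of a small directed graph: A[i, j] = 1 iff there is an edge i -> j.
A = np.array([[0, 1, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])

frontier = np.array([1, 0, 0, 0])  # start BFS from node 0
visited = frontier.copy()
while frontier.any():
    # frontier @ A (i.e. A^T x) activates the successors of the current
    # frontier: one matrix-vector product per BFS level.
    frontier = np.clip(frontier @ A, 0, 1) * (1 - visited)
    visited = np.clip(visited + frontier, 0, 1)
print(visited)  # -> [1 1 1 1]: every node reachable from node 0
```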


SLIDE 15

1. Learn interpretable symbolic structures from text in an unsupervised way, structures that are more complex than words.
2. Represent the learned structures in a vector space.
3. Use the vector representations instead of, or in addition to, word embeddings in deep learning applications: lookup of word senses, frames, etc.
4. More complex structures could improve performance, but also provide better interpretability of the deep learning models.

Motivation

Goal: Linguistic Structures in DL


SLIDE 19

Overview

SLIDE 20

Inducing word sense representations:

Word sense embeddings via retrofitting [Pelevina et al., 2016, Remus & Biemann, 2018]; inducing synsets [Ustalov et al., 2017b, Ustalov et al., 2017a, Ustalov et al., 2018b]; inducing semantic classes [Panchenko et al., 2018b].

Making induced senses interpretable [Panchenko et al., 2017b, Panchenko et al., 2017c].

Linking induced word senses to lexical resources [Panchenko, 2016, Faralli et al., 2016, Panchenko et al., 2017a, Biemann et al., 2018].

Overview


SLIDE 23

A shared task on word sense induction [Panchenko et al., 2018a, Arefyev et al., 2018].

Inducing semantic frames [Ustalov et al., 2018a]:

Inducing FrameNet-like structures; …using multi-way clustering.

Learning graph/network embeddings [ongoing joint work with Andrei Kutuzov and Chris Biemann]

How to represent induced networks/graphs so that they can be used in deep learning architectures, effectively and efficiently?

Overview


SLIDE 26

Inducing word sense representations

SLIDE 27

Inducing word sense representations

Word vs sense embeddings


SLIDE 29

Inducing word sense representations

Related work

SLIDE 30

AutoExtend [Rothe & Schütze, 2015]

* image is reproduced from the original paper

Inducing word sense representations

Related work: knowledge-based


SLIDE 32

AdaGram [Bartunov et al., 2016]. Multiple vector representations θ for each word:

p(Y, Z, β | X, α, θ) = ∏_{w=1}^{V} ∏_{k=1}^{∞} p(β_{wk} | α) ∏_{i=1}^{N} [ p(z_i | x_i, β) ∏_{j=1}^{C} p(y_{ij} | z_i, x_i, θ) ]

where z_i is a hidden variable: the sense index of word x_i in context C; α is a meta-parameter controlling the number of senses.

See also: [Neelakantan et al., 2014] and [Li and Jurafsky, 2015]

Inducing word sense representations

Related work: knowledge-free


SLIDE 34

Word sense induction (WSI) based on graph clustering:

[Lin, 1998]; [Pantel and Lin, 2002]; [Widdows and Dorow, 2002]; Chinese Whispers [Biemann, 2006]; [Hope and Keller, 2013]

Inducing word sense representations

Related work: word sense induction

SLIDE 35

* source of the image: http://ic.pics.livejournal.com/blagin_anton/33716210/2701748/2701748_800.jpg

Inducing word sense representations

Related work: Chinese Whispers#1

SLIDE 36

Iterative formulation [Biemann, 2006]. Vector formulation [Biemann, 2006].

Inducing word sense representations

Related work: Chinese Whispers#2
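As a hedged illustration of the iterative formulation, a compact Python sketch of Chinese Whispers (parameter names and defaults are assumptions, not from [Biemann, 2006]):

```python
import random
from collections import defaultdict

def chinese_whispers(graph, iterations=20, seed=0):
    """graph: node -> {neighbor: edge weight}. Returns node -> class label."""
    rng = random.Random(seed)
    labels = {node: node for node in graph}  # every node starts in its own class
    nodes = list(graph)
    for _ in range(iterations):
        rng.shuffle(nodes)                   # visit nodes in random order
        for node in nodes:
            weights = defaultdict(float)
            for neighbor, w in graph[node].items():
                weights[labels[neighbor]] += w   # sum edge weights per neighboring class
            if weights:
                labels[node] = max(weights, key=weights.get)  # adopt the strongest class
    return labels
```

Applied to an ego-network of distributionally similar words, the resulting classes are the induced senses of the ego word.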


SLIDE 41

RepL4NLP@ACL'16 [Pelevina et al., 2016], LREC'18 [Remus & Biemann, 2018]

Prior methods:

Induce an inventory by clustering of word instances. Use existing sense inventories.

Our method:

Input: word embeddings. Output: word sense embeddings. Word sense induction by clustering of word ego-networks.

Inducing word sense representations

Sense embeddings using retrofitting

SLIDE 42

From word embeddings to sense embeddings

Pipeline: (1) learning word vectors from a text corpus; (2) calculating a word similarity graph; (3) word sense induction (sense inventory); (4) pooling of word vectors into sense vectors.

Inducing word sense representations

Sense embeddings using retrofitting

SLIDE 43

Word sense induction using ego-network clustering

Inducing word sense representations

Sense embeddings using retrofitting
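A hedged sketch of steps 2-4 of the pipeline above (ego-network construction, clustering, pooling), assuming a gensim KeyedVectors model wv and reusing chinese_whispers() from the previous sketch:

```python
import numpy as np

def induce_senses(word, wv, n_neighbors=50, topn_edges=10):
    # Ego-network: nearest neighbors of `word` (the ego itself is excluded),
    # linked by their mutual similarities.
    neighbors = [w for w, _ in wv.most_similar(word, topn=n_neighbors)]
    graph = {w: {} for w in neighbors}
    for w in neighbors:
        for v, sim in wv.most_similar(w, topn=topn_edges):
            if v in graph and v != w:
                graph[w][v] = sim
    clusters = chinese_whispers(graph)
    # Pooling: one sense vector per cluster, here the mean of member vectors.
    senses = {}
    for w, label in clusters.items():
        senses.setdefault(label, []).append(wv[w])
    return {"%s#%d" % (word, i): np.mean(vectors, axis=0)
            for i, vectors in enumerate(senses.values())}
```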

SLIDE 44

Neighbours of word and sense vectors:

table: tray, bottom, diagram, bucket, brackets, stack, basket, list, parenthesis, cup, saucer, pile, playfield, bracket, pot, drop-down, cue, plate
table#0: leftmost#0, column#1, tableau#1, indent#1, bracket#3, pointer#0, footer#1, cursor#1, diagram#0, grid#0
table#1: pile#1, stool#1, tray#0, basket#0, bowl#1, bucket#0, box#0, cage#0, saucer#3, mirror#1, pan#1, lid#0

Inducing word sense representations

Sense embeddings using retrofitting


SLIDE 46

Word and sense embeddings of the words iron and vitamin.

LREC'18 [Remus & Biemann, 2018]

Inducing word sense representations

Sense embeddings using retrofitting

SLIDE 47

Word Sense Disambiguation

1. Context extraction: use context words around the target word.
2. Context filtering: based on a context word's relevance for disambiguation.
3. Sense choice in context: maximise the similarity between a context vector and a sense vector (a minimal sketch follows below).

Inducing word sense representations

Sense embeddings using retrofitting
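A minimal sketch of the three disambiguation steps, under the assumption that senses maps sense ids (e.g. "table#0") to the vectors induced above and wv holds word vectors; context filtering is reduced here to a vocabulary check:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def disambiguate(context_words, senses, wv):
    # Steps 1-2: extract and filter context words; step 3: pick the sense
    # whose vector is most similar to the averaged context vector.
    vectors = [wv[w] for w in context_words if w in wv]
    context = np.mean(vectors, axis=0)
    return max(senses, key=lambda s: cosine(context, senses[s]))
```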


SLIDE 52

Unsupervised WSD, SemEval'13, RepL4NLP [Pelevina et al., 2016]:

Model                       Jacc.  Tau    WNDCG  F.NMI  F.B-Cubed
AI-KU (add1000)             0.176  0.609  0.205  0.033  0.317
AI-KU                       0.176  0.619  0.393  0.066  0.382
AI-KU (remove5-add1000)     0.228  0.654  0.330  0.040  0.463
Unimelb (5p)                0.198  0.623  0.374  0.056  0.475
Unimelb (50k)               0.198  0.633  0.384  0.060  0.494
UoS (#WN senses)            0.171  0.600  0.298  0.046  0.186
UoS (top-3)                 0.220  0.637  0.370  0.044  0.451
La Sapienza (1)             0.131  0.544  0.332  -      -
La Sapienza (2)             0.131  0.535  0.394  -      -
AdaGram, α = 0.05, 100 dim  0.274  0.644  0.318  0.058  0.470
w2v                         0.197  0.615  0.291  0.011  0.615
w2v (nouns)                 0.179  0.626  0.304  0.011  0.623
JBT                         0.205  0.624  0.291  0.017  0.598
JBT (nouns)                 0.198  0.643  0.310  0.031  0.595
TWSI (nouns)                0.215  0.651  0.318  0.030  0.573

Inducing word sense representations

Sense embeddings using retrofitting

SLIDE 53

Semantic relatedness, LREC'2018 [Remus & Biemann, 2018]:

Dataset       AutoExtend  AdaGram  SGNS  GloVe  Sympat  LSA-bow  LSA-hal  Paragram-SL
SimLex999     0.45        0.29     0.44  0.37   0.54    0.30     0.27     0.68
MEN           0.72        0.67     0.77  0.73   0.53    0.67     0.71     0.77
SimVerb       0.43        0.27     0.36  0.23   0.37    0.15     0.19     0.53
WordSim353    0.58        0.61     0.70  0.61   0.47    0.67     0.59     0.72
SimLex999-N   0.44        0.33     0.45  0.39   0.48    0.32     0.34     0.68
MEN-N         0.72        0.68     0.77  0.76   0.57    0.71     0.73     0.78

Inducing word sense representations

Sense embeddings using retrofitting

SLIDE 54

Unsupervised WSD, SemEval'13, RepL4NLP [Pelevina et al., 2016]: comparable to SOTA, incl. sense embeddings.

Semantic relatedness, LREC'2018 [Remus & Biemann, 2018], each model without / with senses:

Dataset       AutoExtend  AdaGram  SGNS       GloVe      Sympat     LSA-bow    LSA-hal    Paragram-SL
SimLex999     0.45        0.29     0.44/0.46  0.37/0.41  0.54/0.55  0.30/0.39  0.27/0.38  0.68/0.64
MEN           0.72        0.67     0.77/0.78  0.73/0.77  0.53/0.68  0.67/0.70  0.71/0.74  0.77/0.80
SimVerb       0.43        0.27     0.36/0.39  0.23/0.30  0.37/0.45  0.15/0.22  0.19/0.28  0.53/0.53
WordSim353    0.58        0.61     0.70/0.69  0.61/0.65  0.47/0.62  0.67/0.66  0.59/0.63  0.72/0.73
SimLex999-N   0.44        0.33     0.45/0.50  0.39/0.47  0.48/0.55  0.32/0.46  0.34/0.44  0.68/0.66
MEN-N         0.72        0.68     0.77/0.79  0.76/0.80  0.57/0.74  0.71/0.73  0.73/0.76  0.78/0.81

Inducing word sense representations

Sense embeddings using retrofitting

SLIDE 55

ACL'17 [Ustalov et al., 2017b]. Examples of extracted synsets:

Size  Synset
2     {decimal point, dot}
3     {gullet, throat, food pipe}
4     {microwave meal, ready meal, TV dinner, frozen dinner}
5     {objective case, accusative case, oblique case, object case, accusative}
6     {radio theater, dramatized audiobook, audio theater, radio play, radio drama, audio play}

Inducing word sense representations

Synset induction

SLIDE 56

Outline of the 'Watset' method:

Background corpus + synonymy dictionary → learning word embeddings → graph construction (word similarities → ambiguous weighted graph) → local clustering: word sense induction (sense inventory) → disambiguation of neighbors (disambiguated weighted graph) → global clustering: synset induction → synsets. A local-global fuzzy graph clustering.

Inducing word sense representations

Synset induction



SLIDE 61

[Figure: F-scores of CW, MCL, MaxMax, ECO, CPM, and Watset evaluated against WordNet (English), BabelNet (English), RuWordNet (Russian), and YARN (Russian).]

Inducing word sense representations

Synset induction

SLIDE 62

Word sense, its local sense cluster (related senses), and hypernyms:

mango#0: peach#1, grape#0, plum#0, apple#0, apricot#0, watermelon#1, banana#1, coconut#0, pear#0, fig#0, melon#0, mangosteen#0, … Hypernyms: fruit#0, food#0, …

apple#0: mango#0, pineapple#0, banana#1, melon#0, grape#0, peach#1, watermelon#1, apricot#0, cranberry#0, pumpkin#0, mangosteen#0, … Hypernyms: fruit#0, crop#0, …

Java#1: C#4, Python#3, Apache#3, Ruby#6, Flash#1, C++#0, SQL#0, ASP#2, Visual Basic#1, CSS#0, Delphi#2, MySQL#0, Excel#0, Pascal#0, … Hypernyms: programming language#3, language#0, …

Python#3: PHP#0, Pascal#0, Java#1, SQL#0, Visual Basic#1, C++#0, JavaScript#0, Apache#3, Haskell#5, .NET#1, C#4, SQL Server#0, … Hypernyms: language#0, technology#0, …

Inducing word sense representations

Sample of induced sense inventory

SLIDE 63

Global sense clusters (semantic classes) and their hypernyms:

ID 1: peach#1, banana#1, pineapple#0, berry#0, blackberry#0, grapefruit#0, strawberry#0, blueberry#0, mango#0, grape#0, melon#0, orange#0, pear#0, plum#0, raspberry#0, watermelon#0, apple#0, apricot#0, pumpkin#0, mangosteen#0, … Hypernyms: vegetable#0, fruit#0, crop#0, ingredient#0, food#0, …

ID 2: C#4, Basic#2, Haskell#5, Flash#1, Java#1, Pascal#0, Ruby#6, PHP#0, Ada#1, Oracle#3, Python#3, Apache#3, Visual Basic#1, ASP#2, Delphi#2, SQL Server#0, CSS#0, AJAX#0, JavaScript#0, .NET#1, … Hypernyms: programming language#3, technology#0, language#0, format#2, app#0

Inducing word sense representations

Sample of induced semantic classes

SLIDE 64

Pipeline: word sense induction from a text corpus (induced word senses) → representing senses with ego networks (sense ego-networks) → sense graph construction (global sense graph) → clustering of word senses (global sense clusters) → labeling sense clusters with hypernyms (noisy hypernyms → cleansed hypernyms) → semantic classes.

Inducing word sense representations

Induction of semantic classes

SLIDE 65

Filtering noisy hypernyms with semantic classes, LREC'18 [Panchenko et al., 2018b]:

[Figure: a sense cluster {apple#2, mango#0, pear#0, mangosteen#0} with hypernyms {fruit#1, food#0}; wrong hypernyms such as city#2 are removed, missing ones are added.]

Inducing word sense representations

Induction of sense semantic classes

SLIDE 66

http://panchenko.me/data/joint/nodes20000-layers7

Inducing word sense representations

Global sense clustering


SLIDE 68

Filtering of a noisy hypernymy database with semantic classes. LREC'18 [Panchenko et al., 2018b]

Model                                       Precision  Recall  F-score
Original Hypernyms (Seitner et al., 2016)   0.475      0.546   0.508
Semantic Classes (coarse-grained)           0.541      0.679   0.602

Inducing word sense representations

Induction of sense semantic classes

SLIDE 69

Making induced senses interpretable

SLIDE 70

Knowledge-based sense representations are interpretable

Making induced senses interpretable

SLIDE 71

Most knowledge-free sense representations are uninterpretable

Making induced senses interpretable


SLIDE 73

Hypernymy prediction in context. EMNLP'17 [Panchenko et al., 2017b]

Making induced senses interpretable

SLIDE 74

11,702 sentences, 863 words with an average polysemy of 3.1. WSD model accuracy:

Inventory     Features       Hypers  HyperHypers
Word Senses   Random         0.257   0.610
Word Senses   MFS            0.292   0.682
Word Senses   Cluster Words  0.291   0.650
Word Senses   Context Words  0.308   0.686
Super Senses  Random         0.001   0.001
Super Senses  MFS            0.001   0.001
Super Senses  Cluster Words  0.174   0.365
Super Senses  Context Words  0.086   0.188

Making induced senses interpretable


SLIDE 76

Linking induced senses to resources

SLIDE 77

Pipeline: text corpus → induce a graph of semantically related words (graph of related words) → word sense induction (word sense inventory) → disambiguation of neighbours (graph of related senses) → labeling senses with hypernyms (labeled word senses) → construction of a proto-conceptualization (PCZ) → construction of sense feature representations → linking induced PCZ senses to senses of a lexical resource (LR: WordNet, BabelNet, …; partially linked senses) → typing of the unmapped induced senses → enriched lexical resource.

LREC'16 [Panchenko, 2016], ISWC'16 [Faralli et al., 2016], SENSE@EACL'17 [Panchenko et al., 2017a], NLE'18 [Biemann et al., 2018]

Linking induced senses to resources

SLIDE 78

Induced AdaGram senses of "python" linked to BabelNet senses (AdaGram BoW vs. BabelNet BoW):

python#2 → bn:01713224n. AdaGram BoW: perl, php, java, smalltalk, ruby, lua, tcl, scripting, javascript, bindings, binding, programming, coldfusion, actionscript, net, … BabelNet BoW: language, programming, pythonista, python programming, python3, python2, level, computer, pythonistas, python3000, …

python#1 → bn:01157670n. AdaGram BoW: monty, circus, spamalot, python, magoo, muppet, snoopy, featurette, disney, tunes, tune, classic, shorts, short, apocalypse, … BabelNet BoW: monty, comedy, monty python, british, monte, monte python, troupe, pythonesque, foot, artist, record, surreal, terry, …

python#3 → bn:00046456n. AdaGram BoW: spectacled, unicornis, snake, giant, caiman, leopard, squirrel, crocodile, horned, cat, mole, elephant, opossum, pheasant, … BabelNet BoW: molurus, indian, boa, tigris, tiger python, rock, tiger, indian python, reptile, python molurus, indian rock python, coluber, …

python#4 → bn:01157670n. AdaGram BoW: circus, fly, flying, dusk, lizard, moth, unicorn, puff, adder, vulture, tyrannosaurus, zephyr, badger, … BabelNet BoW: monty, comedy, monty python, british, monte, monte python, troupe, pythonesque, foot, artist, record, surreal, terry, …

python#1 → bn:00473212n. AdaGram BoW: monty, circus, spamalot, python, magoo, muppet, snoopy, featurette, disney, tunes, tune, classic, shorts, short, apocalypse, … BabelNet BoW: pictures, monty, python monty pictures, limited, company, python pictures limited, kingdom, picture, serve, director, …

python#1 → bn:03489893n. AdaGram BoW: monty, circus, spamalot, python, magoo, muppet, snoopy, featurette, disney, tunes, tune, classic, shorts, short, apocalypse, … BabelNet BoW: film, horror, movie, clabaugh, richard, monster, century, direct, snake, python movie, television, giant, natural, language, for-tv, …

Linking induced senses to resources

SLIDE 79

Representation of the sense "disk (medium)" by model:

WordNet: memory, device, floppy, disk, hard, disk, disk, computer, science, computing, diskette, fixed, disk, floppy, magnetic, disc, magnetic, disk, hard, disc, storage, device

WordNet + Linked: recorder, disk, floppy, console, diskette, handset, desktop, iPhone, iPod, HDTV, kit, RAM, Discs, Blu-ray, computer, GB, microchip, site, cartridge, printer, tv, VCR, Disc, player, LCD, software, component, camcorder, cellphone, card, monitor, display, burner, Web, stereo, internet, model, iTunes, turntable, chip, cable, camera, iphone, notebook, device, server, surface, wafer, page, drive, laptop, screen, pc, television, hardware, YouTube, dvr, DVD, product, folder, VCR, radio, phone, circuitry, partition, megabyte, peripheral, format, machine, tuner, website, merchandise, equipment, gb, discs, MP3, hard-drive, piece, video, storage device, memory device, microphone, hd, EP, content, soundtrack, webcam, system, blade, graphic, microprocessor, collection, document, programming, battery, keyboard, HD, handheld, CDs, reel, web, material, hard-disk, ep, chart, debut, configuration, recording, album, broadcast, download, fixed disk, planet, pda, microfilm, iPod, videotape, text, cylinder, cpu, canvas, label, sampler, workstation, electrode, magnetic disc, catheter, magnetic disk, Video, mobile, cd, song, modem, mouse, tube, set, ipad, signal, substrate, vinyl, music, clip, pad, audio, compilation, memory, message, reissue, ram, CD, subsystem, hdd, touchscreen, electronics, demo, shell, sensor, file, shelf, processor, cassette, extra, mainframe, motherboard, floppy disk, lp, tape, version, kilobyte, pacemaker, browser, Playstation, pager, module, cache, DVD, movie, Windows, cd-rom, e-book, valve, directory, harddrive, smartphone, audiotape, technology, hard disk, show, computing, computer science, Blu-Ray, blu-ray, HDD, HD-DVD, scanner, hard disc, gadget, booklet, copier, playback, TiVo, controller, filter, DVDs, gigabyte, paper, mp3, CPU, dvd-r, pipe, cd-r, playlist, slot, VHS, film, videocassette, interface, adapter, database, manual, book, channel, changer, storage

Linking induced senses to resources

SLIDE 80

Evaluation of linking accuracy:

Linking induced senses to resources

SLIDE 81

Evaluation of enriched representations based on WSD:

Linking induced senses to resources

SLIDE 82

Shared task on word sense induction

SLIDE 83

An ACL SIGSLAV-sponsored shared task on word sense induction (WSI) for the Russian language. More details: https://russe.nlpub.org/2018/wsi

Shared task on word sense induction

A shared task on WSI

SLIDE 84

Target word, e.g. "bank". Contexts where the word occurs, e.g.:

"river bank is a slope beside a body of water"
"bank is a financial institution that accepts deposits"
"Oh, the bank was robbed. They took about a million dollars."
"bank of Elbe is a good and popular hangout spot complete with good food and fun"

You need to group the contexts by senses:

"river bank is a slope beside a body of water"
"bank of Elbe is a good and popular hangout spot complete with good food and fun"

"bank is a financial institution that accepts deposits"
"Oh, the bank was robbed. They took about a million dollars."

Shared task on word sense induction

A lexical sample WSI task


SLIDE 87

Shared task on word sense induction

Dataset based on Wikipedia

SLIDE 88

Shared task on word sense induction

Dataset based on RNC

SLIDE 89

Shared task on word sense induction

Dataset based on dictionary glosses

SLIDE 90

Shared task on word sense induction

A sample from the wiki-wiki dataset


SLIDE 93

Shared task on word sense induction

A sample from the bts-rnc dataset

SLIDE 94

Shared task on word sense induction

A sample from the active-dict dataset

SLIDE 95

1. Get the neighbors of a target word, e.g. "bank": lender, river, citybank, slope, …
2. Get words similar to "bank" and dissimilar to "lender": river, slope, land, …
3. Compute distances to "lender" and "river" (a sketch follows below).

Shared task on word sense induction

jamsic: sense induction
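A hedged sketch of this procedure, assuming a gensim KeyedVectors model wv; the target word "bank" and the anchor words follow the slide's example:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Step 1: the nearest neighbor of the target word, e.g. "lender".
anchor1 = wv.most_similar("bank", topn=1)[0][0]
# Step 2: a word similar to "bank" but dissimilar to the first anchor, e.g. "river".
anchor2 = wv.most_similar(positive=["bank"], negative=[anchor1], topn=1)[0][0]

def sense(context_words):
    # Step 3: assign the context to the sense whose anchor word is closer.
    ctx = np.mean([wv[w] for w in context_words if w in wv], axis=0)
    return 1 if cosine(ctx, wv[anchor1]) >= cosine(ctx, wv[anchor2]) else 2
```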

SLIDE 96

Induction of semantic frames

SLIDE 97

Induction of semantic frames

FrameNet: frame "Kidnapping"

SLIDE 98

ACL'2018 [Ustalov et al., 2018a]. Example of an LU tricluster corresponding to the "Kidnapping" frame from FrameNet:

FrameNet role  SVO element  Lexical units (LU)
Perpetrator    Subject      kidnapper, alien, militant
FEE            Verb         snatch, kidnap, abduct
Victim         Object       son, people, soldier, child

Induction of semantic frames

Frame induction as a triclustering

SLIDE 99

Induction of semantic frames

SVO triple elements

SLIDE 100

[Figure: a fragment of the SVO triple graph; nodes are triples such as officer|chair|committee, director|lead|company, chairman|lead|committee, and president|head|government, with similar triples linked.]

Induction of semantic frames

An SVO triple graph


SLIDE 103

Input: an embedding model v ∈ V → v⃗ ∈ R^d, a set of SVO triples T ⊆ V³, the number of nearest neighbors k ∈ N, a graph clustering algorithm Cluster.
Output: a set of triframes F.

S ← {t → t⃗ ∈ R^{3d} : t ∈ T}   (embed each triple, e.g. by concatenation)
E ← {(t, t′) ∈ T² : t′ ∈ NN_k^S(t⃗), t ≠ t′}   (k-nearest-neighbor graph over triple embeddings)
F ← ∅
for all C ∈ Cluster(T, E) do
    f_s ← {s ∈ V : (s, v, o) ∈ C}
    f_v ← {v ∈ V : (s, v, o) ∈ C}
    f_o ← {o ∈ V : (s, v, o) ∈ C}
    F ← F ∪ {(f_s, f_v, f_o)}
return F

A runnable sketch of this procedure follows below.

Induction of semantic frames

Triframes frame induction
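A hedged Python sketch of the algorithm above, assuming word vectors wv (a dict word → np.ndarray) and reusing chinese_whispers() from the earlier sketch as the Cluster step; any graph clustering algorithm could be plugged in instead:

```python
import numpy as np

def triframes(triples, wv, k=10):
    # Embed each SVO triple as the concatenation of its three word vectors.
    emb = {t: np.concatenate([wv[t[0]], wv[t[1]], wv[t[2]]]) for t in triples}
    ts = list(emb)
    M = np.stack([emb[t] for t in ts])
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    sims = M @ M.T                                # cosine similarities between triples
    graph = {t: {} for t in ts}
    for i, t in enumerate(ts):
        for j in np.argsort(-sims[i])[1:k + 1]:   # k nearest neighbors, self excluded
            graph[t][ts[j]] = float(sims[i, j])
    clusters = chinese_whispers(graph)            # the Cluster step
    frames = {}
    for t, label in clusters.items():
        s, v, o = t
        f = frames.setdefault(label, (set(), set(), set()))
        f[0].add(s); f[1].add(v); f[2].add(o)
    return list(frames.values())                  # triframes (subjects, verbs, objects)
```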

SLIDE 104

Frame #848
Subjects: Company, firm, company
Verbs: buy, supply, discharge, purchase, expect
Objects: book, supply, house, land, share, company, grain, which, item, product, ticket, work, this, equipment, House, it, film, water, something, she, what, service, plant, time

Induction of semantic frames

Example of an extracted frame

SLIDE 105

Frame #849
Subjects: student, scientist, we, pupil, member, company, man, nobody, you, they, US, group, it, people, Man, user, he
Verbs: do, test, perform, execute, conduct
Objects: experiment, test

Induction of semantic frames

Example of an extracted frame

SLIDE 106

Frame #3207
Subjects: people, we, they, you
Verbs: feel, seek, look, search
Objects: housing, inspiration, gold, witness, partner, accommodation, Partner

Induction of semantic frames

Example of an extracted frame

SLIDE 107

Dataset             # instances  # unique  # clusters
FrameNet Triples    99,744       94,170    383
Poly. Verb Classes  246          110       62

Induction of semantic frames

Evaluation datasets

SLIDE 108

Dataset             # instances  # unique  # clusters
FrameNet Triples    99,744       94,170    383
Poly. Verb Classes  246          110       62

Quality measures: nmPU (normalized modified purity), niPU (normalized inverse purity).

Induction of semantic frames

Evaluation settings


SLIDE 110

F1-scores for verbs, subjects, objects, and frames.

Induction of semantic frames

Results: comparison to the state of the art

SLIDE 111

Graph embeddings

SLIDE 112

Image source: https://www.tensorflow.org/tutorials/word2vec

Graph embeddings

Text: sparse symbolic representation


SLIDE 114

Graph embeddings

Graph: sparse symbolic representation

SLIDE 115

From a survey on graph embeddings [Hamilton et al., 2017]:

Graph embeddings

Embedding a graph into a vector space

SLIDE 116

From a survey on graph embeddings [Hamilton et al., 2017]:

Graph embeddings

Learning with an ``autoencoder''

SLIDE 117

From a survey on graph embeddings [Hamilton et al., 2017]:

Graph embeddings

Some established approaches


SLIDE 119

A submitted joint work with Andrei Kutuzov and Chris Biemann. Given a tree (V, E):

Leacock-Chodorow (LCH) similarity measure:

sim(v_i, v_j) = -log( shortest_path_distance(v_i, v_j) / (2h) )

Jiang-Conrath (JCN) similarity measure:

sim(v_i, v_j) = 2 ln P_lcs(v_i, v_j) / ( ln P(v_i) + ln P(v_j) )

Graph embeddings

Graph embeddings using similarities
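As an illustration of the LCH formula above, a minimal Python sketch (an assumption, not the authors' code) over a tree represented as an undirected networkx graph of depth h:

```python
import math
import networkx as nx

def lch_similarity(tree: nx.Graph, vi, vj, h: int) -> float:
    d = nx.shortest_path_length(tree, vi, vj)
    return -math.log(max(d, 1) / (2 * h))  # sim = -log(shortest path / 2h)
```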

SLIDE 120

path2vec (arxiv.org/abs/1808.05611): approximating structural node similarities with node embeddings.

J = (1/|T|) Σ_{(v_i, v_j) ∈ T} (v⃗_i · v⃗_j − sim(v_i, v_j))²

where sim(v_i, v_j) is the value of a 'gold' similarity measure between a pair of nodes v_i and v_j; v⃗_i is the embedding of a node; T is a training batch.

Graph embeddings

Graph embeddings using similarities
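A minimal numpy sketch of one gradient step on the objective J above (an illustration under stated assumptions; the released path2vec implementation differs):

```python
import numpy as np

def path2vec_step(emb, batch, gold_sim, lr=0.1):
    """emb: node -> np.ndarray; batch: list of (vi, vj); gold_sim: dict (vi, vj) -> float."""
    for vi, vj in batch:
        err = emb[vi] @ emb[vj] - gold_sim[(vi, vj)]   # dot product vs. gold similarity
        grad_i, grad_j = err * emb[vj], err * emb[vi]  # gradient of the squared error (factor 2 folded into lr)
        emb[vi] -= lr * grad_i
        emb[vj] -= lr * grad_j
```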

SLIDE 121

Computation of 82,115 pairwise similarities:

Model                             Running time
LCH in NLTK                       30 sec.
JCN in NLTK                       6.7 sec.
FSE embeddings                    0.713 sec.
path2vec and other float vectors  0.007 sec.

Graph embeddings

Speedup: graph vs embeddings

SLIDE 122

Spearman correlation scores with WordNet similarities on SimLex999 noun pairs (selection of synsets):

Model     JCN-SemCor  JCN-Brown  LCH
WordNet   1.0         1.0        1.0
Node2vec  0.655       0.671      0.724
Deepwalk  0.775       0.774      0.868
FSE       0.830       0.820      0.900
path2vec  0.917       0.914      0.934

Graph embeddings

Results: goodness of fit

SLIDE 123

Spearman correlations with human SimLex999 noun similarities:

Model                               Correlation
Raw WordNet JCN-SemCor              0.487
Raw WordNet JCN-Brown               0.495
Raw WordNet LCH                     0.513
node2vec [Grover & Leskovec, 2016]  0.450
Deepwalk [Perozzi et al., 2014]     0.533
FSE [Subercaze et al., 2015]        0.556
path2vec JCN-SemCor                 0.526
path2vec JCN-Brown                  0.487
path2vec LCH                        0.522

Graph embeddings

Results: SimLex999 dataset

SLIDE 124

JCN (left) and LCH (right):

Graph embeddings

Results: SimLex999 dataset

SLIDE 125

WSD: each column lists all the possible synsets for the corresponding word.

Graph embeddings

Results: word sense disambiguation

SLIDE 126

F1 scores on Senseval-2 word sense disambiguation task:

Model                               F-measure
WordNet JCN-SemCor                  0.620
WordNet JCN-Brown                   0.561
WordNet LCH                         0.547
node2vec [Grover & Leskovec, 2016]  0.501
Deepwalk [Perozzi et al., 2014]     0.528
FSE [Subercaze et al., 2015]        0.536

path2vec (batch size: 20 / 50 / 100):
JCN-SemCor  0.543 / 0.543 / 0.535
JCN-Brown   0.538 / 0.515 / 0.542
LCH         0.540 / 0.535 / 0.536

Graph embeddings

Results: word sense disambiguation

SLIDE 127

L = (1/|T|) Σ_{(v_i, v_j) ∈ T} ( (v⃗_i · v⃗_j − sim(v_i, v_j))² + v⃗_i · v⃗_{in} + v⃗_j · v⃗_{jm} )

where sim(v_i, v_j) is the value of a 'gold' similarity measure between a pair of nodes v_i and v_j; v⃗_i is the embedding of a node; T is a training batch; v_{in} is a random adjacent node of v_i, and v_{jm} is a random adjacent node of v_j.

Graph embeddings

Improved Model and Results

SLIDE 128

Conclusion

SLIDE 129

Conclusion

Vectors + Graphs = ♡

SLIDE 130

We can induce word senses, synsets, semantic classes, and semantic frames in a knowledge-free way using graph clustering and distributional models.

We can make the induced word senses interpretable in a knowledge-free way with hypernyms, images, and definitions.

We can link induced senses to lexical resources to improve the performance of WSD and to enrich lexical resources with emerging senses.

We can represent language graphs using graph embeddings in deep neural models.

Conclusion

Take home messages


SLIDE 134


A special issue on informing neural architectures for NLP with linguistic and background knowledge. … with Ivan Vulić and Simone Paolo Ponzetto.

goo.gl/A76NGX

Conclusion

Natural Language Engineering journal

SLIDE 135

Conclusion

Thank you! Questions?

SLIDE 136


Arefyev, N., Ermolaev, P., & Panchenko, A. (2018). How much does a word weigh? Weighting word embeddings for word sense induction. arXiv preprint arXiv:1805.09209.
Bartunov, S., Kondrashkin, D., Osokin, A., & Vetrov, D. (2016). Breaking sticks and ambiguities with adaptive skip-gram. In Artificial Intelligence and Statistics (pp. 130--138).
Biemann, C. (2006). Chinese whispers: An efficient graph clustering algorithm and its application to natural language processing problems. In Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing (pp. 73--80). Association for Computational Linguistics.
Biemann, C., Faralli, S., Panchenko, A., & Ponzetto, S. P. (2018). A framework for enriching lexical semantic resources with distributional semantics. In Journal of Natural Language Engineering (pp. 56--64). Cambridge Press.

SLIDE 137


Faralli, S., Panchenko, A., Biemann, C., & Ponzetto, S. P. (2016). Linked disambiguated distributional semantic networks. In International Semantic Web Conference (pp. 56--64). Springer.
Grover, A. & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 855--864). ACM.
Hamilton, W. L., Ying, R., & Leskovec, J. (2017). Representation learning on graphs: Methods and applications. IEEE Data Engineering Bulletin, September 2017.
Panchenko, A. (2016). Best of both worlds: Making word sense embeddings interpretable. In LREC.

SLIDE 138


Panchenko, A., Faralli, S., Ponzetto, S. P., & Biemann, C. (2017a). Using linked disambiguated distributional networks for word sense disambiguation. In Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications (pp. 72--78). Valencia, Spain: Association for Computational Linguistics.
Panchenko, A., Lopukhina, A., Ustalov, D., Lopukhin, K., Arefyev, N., Leontyev, A., & Loukachevitch, N. (2018a). RUSSE'2018: A shared task on word sense induction for the Russian language. arXiv preprint arXiv:1803.05795.
Panchenko, A., Marten, F., Ruppert, E., Faralli, S., Ustalov, D., Ponzetto, S. P., & Biemann, C. (2017b). Unsupervised, knowledge-free, and interpretable word sense disambiguation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations (pp. 91--96). Copenhagen, Denmark: Association for Computational Linguistics.

SLIDE 139


Panchenko, A., Ruppert, E., Faralli, S., Ponzetto, S. P., & Biemann, C. (2017c). Unsupervised does not mean uninterpretable: The case for word sense induction and disambiguation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers (pp. 86--98). Valencia, Spain: Association for Computational Linguistics.
Panchenko, A., Ustalov, D., Faralli, S., Ponzetto, S. P., & Biemann, C. (2018b). Improving hypernymy extraction with distributional semantic classes. In Proceedings of LREC 2018. Miyazaki, Japan: European Language Resources Association.
Pelevina, M., Arefiev, N., Biemann, C., & Panchenko, A. (2016). Making sense of word embeddings. In Proceedings of the 1st Workshop on Representation Learning for NLP (pp. 174--183). Berlin, Germany: Association for Computational Linguistics.

SLIDE 140


Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 701--710). ACM.
Remus, S. & Biemann, C. (2018). Retrofitting word representations for unsupervised sense aware word similarities. In Proceedings of LREC 2018. Miyazaki, Japan: European Language Resources Association.
Rothe, S. & Schütze, H. (2015). AutoExtend: Extending word embeddings to embeddings for synsets and lexemes. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 1793--1803). Beijing, China: Association for Computational Linguistics.

SLIDE 141


Subercaze, J., Gravier, C., & Laforest, F. (2015). On metric embedding for boosting semantic similarity computations. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers) (pp. 8--14). Association for Computational Linguistics.
Ustalov, D., Chernoskutov, M., Biemann, C., & Panchenko, A. (2017a). Fighting with the sparsity of synonymy dictionaries for automatic synset induction. In International Conference on Analysis of Images, Social Networks and Texts (pp. 94--105). Springer.
Ustalov, D., Panchenko, A., & Biemann, C. (2017b). Watset: Automatic induction of synsets from a graph of synonyms. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1579--1590). Vancouver, Canada: Association for Computational Linguistics.

SLIDE 142


Ustalov, D., Panchenko, A., Kutuzov, A., Biemann, C., & Ponzetto, S. P. (2018a). Unsupervised semantic frame induction using triclustering. arXiv preprint arXiv:1805.04715.
Ustalov, D., Teslenko, D., Panchenko, A., Chernoskutov, M., & Biemann, C. (2018b). Word sense disambiguation based on automatically induced synsets. In LREC 2018, 11th International Conference on Language Resources and Evaluation, 7--12 May 2018, Miyazaki, Japan (pp. tba). Paris: European Language Resources Association, ELRA-ELDA. Accepted for publication.