Unsupervised Knowledge-Free Word Sense Disambiguation


SLIDE 1

Introduction Dense Sense Representations Sparse Sense Representations Future Work

Unsupervised Knowledge-Free Word Sense Disambiguation

  • Dr. Alexander Panchenko

University of Hamburg, Language Technology Group

23 February, 2017

SLIDE 2

Overview

◮ Introduction
◮ Dense Sense Representations
◮ Sparse Sense Representations
◮ Future Work

SLIDE 3

About me

◮ 2008, Engineering degree (MSc) in Computer Science, Moscow State Technical University
◮ 2009, Research intern, Xerox Research Centre Europe
◮ 2013, PhD in Natural Language Processing, University of Louvain
◮ 2013, Research engineer at a start-up related to social network analysis (Digsolab)
◮ 2015, Postdoc at Technical University of Darmstadt
◮ 2017, Postdoc at University of Hamburg

Topics: computational lexical semantics (semantic similarity/relatedness, semantic relations, sense induction, sense disambiguation), NLP for social network analysis, text categorization.
Papers, presentations, datasets: http://panchenko.me

SLIDE 4

Publications Related to the Talk

◮ Pelevina M., Arefiev N., Biemann C., Panchenko A. (2016). Making Sense of Word Embeddings. In Proceedings of the 1st Workshop on Representation Learning for NLP, ACL 2016, Berlin, Germany. Best Paper Award
◮ Panchenko A., Simon J., Riedl M., Biemann C. (2016). Noun Sense Induction and Disambiguation using Graph-Based Distributional Semantics. In Proceedings of KONVENS 2016, Bochum, Germany
◮ Panchenko A., Ruppert E., Faralli S., Ponzetto S. P., Biemann C. (2017). Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia, Spain

SLIDE 5

Motivation for Unsupervised Knowledge-Free WSD

◮ A word sense disambiguation (WSD) system:
  ◮ Input: a word and its context.
  ◮ Output: a sense of this word.
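In code, this input/output contract can be sketched as a function from a target word and its context to a sense identifier. The toy inventory and overlap-based scoring below are illustrative stand-ins, not the method presented in this talk:

```python
# Minimal sketch of the WSD interface described above: map a target word
# plus its context to one sense from an inventory. The inventory and the
# word-overlap scoring are toy assumptions for illustration only.
TOY_INVENTORY = {
    "table": {
        "table#0": {"column", "row", "cell", "grid", "data"},        # tabular sense
        "table#1": {"chair", "wood", "kitchen", "furniture", "sit"}  # furniture sense
    }
}

def disambiguate(word: str, context: list[str]) -> str:
    """Return the sense whose indicator words overlap most with the context."""
    senses = TOY_INVENTORY[word]
    return max(senses, key=lambda s: len(senses[s] & set(context)))

print(disambiguate("table", ["put", "the", "plate", "on", "the", "kitchen", "table"]))  # → table#1
```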

SLIDE 6

Motivation for Unsupervised Knowledge-Free WSD

◮ Existing approaches (Navigli, 2009):
  ◮ Knowledge-based approaches rely on hand-crafted resources, such as WordNet.
  ◮ Supervised approaches learn from hand-labeled training data, such as SemCor.

SLIDE 7

Motivation for Unsupervised Knowledge-Free WSD

◮ Problem 1: hand-crafted lexical resources and training data are expensive to build, often inconsistent, and domain-dependent.
◮ Problem 2: these methods assume a fixed sense inventory, yet:
  ◮ senses emerge and disappear over time;
  ◮ different applications require different granularities.

SLIDE 8

Motivation for Unsupervised Knowledge-Free WSD (cont.)

◮ An alternative route is the unsupervised knowledge-free approach:
  ◮ learn an interpretable sense inventory;
  ◮ learn a disambiguation model.

SLIDE 9

Dense Sense Representations for WSD

◮ Pelevina M., Arefiev N., Biemann C., Panchenko A. (2016). Making Sense of Word Embeddings. In Proceedings of the 1st Workshop on Representation Learning for NLP, ACL 2016, Berlin, Germany.
◮ An approach to learn word sense embeddings.

SLIDE 10

Overview of the contribution

Prior methods:

◮ Induce an inventory by clustering word instances (Li and Jurafsky, 2015)
◮ Use existing inventories (Rothe and Schütze, 2015)

SLIDE 11

Overview of the contribution

Our method:

◮ Input: word embeddings
◮ Output: word sense embeddings
◮ Word sense induction by clustering of word ego-networks
◮ Word sense disambiguation based on the induced sense representations

SLIDE 12

Learning Word Sense Embeddings
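The pipeline figure for this slide is not preserved in this export. As a hedged sketch of the final step: once word sense induction has grouped the neighbours of a word into sense clusters, a dense vector per sense can be produced by pooling the cluster members' word vectors. Plain (unweighted) averaging is assumed here; similarity-weighted pooling is an alternative.

```python
import numpy as np

# Sketch: a sense vector as the mean of the word vectors of the neighbours
# assigned to that sense cluster. Clusters and 2-d vectors are toy values.
def sense_vectors(clusters: dict[str, list[str]],
                  word_vectors: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    return {sense: np.mean([word_vectors[w] for w in members], axis=0)
            for sense, members in clusters.items()}

wv = {"tray": np.array([1.0, 0.0]), "basket": np.array([0.8, 0.2]),
      "column": np.array([0.0, 1.0]), "grid": np.array([0.2, 0.8])}
sv = sense_vectors({"table#0": ["column", "grid"], "table#1": ["tray", "basket"]}, wv)
print(sv["table#1"])  # → [0.9 0.1], the centroid of the "furniture-like" neighbours
```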

SLIDE 13

Word Sense Induction: Ego-Network Clustering

◮ Graph clustering using the Chinese Whispers algorithm (Biemann, 2006).
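A minimal sketch of Chinese Whispers on a small weighted ego-network (toy graph and weights; the real method runs on nearest-neighbour graphs built from word embeddings):

```python
import random

# Chinese Whispers (Biemann, 2006) on a weighted graph given as
# {node: {neighbour: weight}}. Each node starts in its own class; nodes then
# repeatedly adopt the class with the highest total edge weight among their
# neighbours. The iteration count and seed are illustrative choices.
def chinese_whispers(graph, iterations=20, seed=0):
    rng = random.Random(seed)
    labels = {node: node for node in graph}          # each node its own class
    nodes = list(graph)
    for _ in range(iterations):
        rng.shuffle(nodes)                           # visit nodes in random order
        for node in nodes:
            scores = {}
            for neighbour, weight in graph[node].items():
                scores[labels[neighbour]] = scores.get(labels[neighbour], 0.0) + weight
            if scores:
                labels[node] = max(scores, key=scores.get)  # adopt dominant class
    return labels

# Toy ego-network of "table": two internally dense groups of neighbours.
ego = {
    "tray":    {"basket": 0.9, "plate": 0.8},
    "basket":  {"tray": 0.9, "plate": 0.7},
    "plate":   {"tray": 0.8, "basket": 0.7},
    "column":  {"grid": 0.9, "bracket": 0.8},
    "grid":    {"column": 0.9, "bracket": 0.7},
    "bracket": {"column": 0.8, "grid": 0.7},
}
labels = chinese_whispers(ego)
# Nodes in the same dense group converge to the same class label,
# so each group becomes one induced sense.
```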

SLIDE 14

Neighbours of Word and Sense Vectors

Vector   Nearest Neighbours
table    tray, bottom, diagram, bucket, brackets, stack, basket, list, parenthesis, cup, trays, pile, playfield, bracket, pot, drop-down, cue, plate
table#0  leftmost#0, column#1, randomly#0, tableau#1, top-left#0, indent#1, bracket#3, pointer#0, footer#1, cursor#1, diagram#0, grid#0
table#1  pile#1, stool#1, tray#0, basket#0, bowl#1, bucket#0, box#0, cage#0, saucer#3, mirror#1, birdcage#0, hole#0, pan#1, lid#0

◮ Neighbours of the word "table" and its senses produced by our method.
◮ The neighbours of the initial word vector belong to both senses.
◮ The neighbours of the sense vectors are sense-specific.
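Neighbour lists like these can be produced by ranking all vocabulary vectors by cosine similarity against a query vector (a word vector such as "table", or one of its sense vectors). A small sketch with toy 2-d vectors:

```python
import numpy as np

# Rank vocabulary entries by cosine similarity to a query vector and return
# the top k. The vocabulary and vector values are toy data, not a real model.
def nearest_neighbours(query: np.ndarray, vocab: dict[str, np.ndarray], k: int = 3):
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(vocab.items(), key=lambda kv: cos(query, kv[1]), reverse=True)
    return [w for w, _ in ranked[:k]]

vocab = {"tray": np.array([0.9, 0.1]), "basket": np.array([0.8, 0.3]),
         "column": np.array([0.1, 0.9]), "grid": np.array([0.2, 0.8])}
print(nearest_neighbours(np.array([1.0, 0.0]), vocab, k=2))  # → ['tray', 'basket']
```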

SLIDE 15

Word Sense Disambiguation

1. Context Extraction
   ◮ use the context words around the target word
2. Context Filtering
   ◮ based on each context word's relevance for disambiguation
3. Sense Choice
   ◮ maximize the similarity between the context vector and the sense vectors
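The three steps above can be sketched as follows, assuming an averaged context vector and cosine similarity for the sense choice (toy 2-d vectors; the context-filtering step is omitted for brevity):

```python
import numpy as np

# Sense choice: average the context word vectors into a context vector,
# then pick the sense whose vector has the highest cosine similarity to it.
# Vector values below are illustrative toy data.
def choose_sense(context_words, word_vs, sense_vs):
    vecs = [word_vs[w] for w in context_words if w in word_vs]
    context = np.mean(vecs, axis=0)                  # context vector
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(sense_vs, key=lambda s: cos(context, sense_vs[s]))

word_vs = {"kitchen": np.array([0.9, 0.1]), "dinner": np.array([0.8, 0.2])}
sense_vs = {"table#0": np.array([0.1, 0.9]),   # tabular-layout sense
            "table#1": np.array([0.9, 0.1])}   # furniture sense
print(choose_sense(["kitchen", "dinner"], word_vs, sense_vs))  # → table#1
```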

SLIDE 16

Word Sense Disambiguation: Example

SLIDE 17

Evaluation on SemEval 2013 Task 13 Dataset: Comparison to the State-of-the-art

Model                       Jacc.  Tau    WNDCG  F.NMI  F.B-Cubed
AI-KU (add1000)             0.176  0.609  0.205  0.033  0.317
AI-KU                       0.176  0.619  0.393  0.066  0.382
AI-KU (remove5-add1000)     0.228  0.654  0.330  0.040  0.463
Unimelb (5p)                0.198  0.623  0.374  0.056  0.475
Unimelb (50k)               0.198  0.633  0.384  0.060  0.494
UoS (#WN senses)            0.171  0.600  0.298  0.046  0.186
UoS (top-3)                 0.220  0.637  0.370  0.044  0.451
La Sapienza (1)             0.131  0.544  0.332  –      –
La Sapienza (2)             0.131  0.535  0.394  –      –
AdaGram, α = 0.05, 100 dim  0.274  0.644  0.318  0.058  0.470
w2v                         0.197  0.615  0.291  0.011  0.615
w2v (nouns)                 0.179  0.626  0.304  0.011  0.623
JBT                         0.205  0.624  0.291  0.017  0.598
JBT (nouns)                 0.198  0.643  0.310  0.031  0.595
TWSI (nouns)                0.215  0.651  0.318  0.030  0.573

SLIDE 18

Conclusion

◮ A novel approach for learning word sense embeddings.
◮ Can use existing word embeddings as input.
◮ WSD performance comparable to state-of-the-art systems.
◮ Source code and pre-trained models: https://github.com/tudarmstadt-lt/SenseGram

SLIDE 19

Sparse Sense Representations for WSD

◮ Panchenko A., Simon J., Riedl M., Biemann C. (2016). Noun Sense Induction and Disambiguation using Graph-Based Distributional Semantics. In Proceedings of KONVENS 2016, Bochum, Germany
◮ Panchenko A., Ruppert E., Faralli S., Ponzetto S. P., Biemann C. (2017). Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia, Spain

SLIDE 20

Contributions

◮ A framework that relies on induced inventories as a pivot for learning contextual feature representations and disambiguation.
◮ The method can integrate several types of context features in an unsupervised way.
◮ The method is interpretable at several levels.

SLIDE 21

Outline of the Method

[Figure: Outline of our unsupervised interpretable method for word sense induction and disambiguation. Recoverable components: training corpus; feature extraction (dependencies, co-occurrences, language model) yielding word-feature counts from the corpus and from contexts; computing word and feature similarities; word sense induction producing a word sense inventory; meta-combination; disambiguation of contexts.]

SLIDE 22

Interpretable Unsupervised Knowledge-Free WSD

Interpretability levels of our model:
  1. word sense inventory;
  2. sense feature representation;
  3. results of disambiguation in context.
SLIDE 23

WSD based on an Induced Word Sense Inventory

SLIDE 24

Results on the TWSI Dataset

Table: WSD performance of different configurations of our method on the full and the sense-balanced TWSI datasets, based on the coarse inventory with 1.96 senses per word.

SLIDE 25

Impact of Word Sense Inventory Granularity on WSD performance: the TWSI Dataset

SLIDE 26

Results on the SemEval 2013 Task 13: Word Sense Induction and Disambiguation

Table: WSD performance of the best configuration of our method identified on the TWSI dataset as compared to participants of the SemEval 2013 Task 13 and two systems based on word sense embeddings (AdaGram and SenseGram).

SLIDE 27

Demonstrating Unsupervised Knowledge-Free WSD

SLIDE 28

Demonstrating Unsupervised Knowledge-Free WSD

SLIDE 29

Demonstrating Unsupervised Knowledge-Free WSD

SLIDE 30

WSD without Sense Inventory: Co-Sets

[Figure: Example co-set with a co-hyponym layer (apple#2, mango#0, pear#0) and a hypernym layer (fruit#1, food#0), linked by hypernymy and co-hyponymy relations.]

SLIDE 31

WSD without Sense Inventory: Co-Sets

ID 1
  Hypernym Layer, H(c) ⊂ S: vegetable#0, fruit#0, crop#0, ingredient#0, food#0
  Co-Hyponym Layer, c ⊂ S: peach#0, banana#0, pineapple#0, berry#0, blackberry#0, grapefruit#0, strawberry#0, blueberry#0, fruit#0, grape#0, melon#0, orange#0, pear#0, plum#0, raspberry#0, watermelon#0, apple#0, apricot#0, cherry#0
ID 2
  Hypernym Layer, H(c) ⊂ S: programming language#3, technology#0, language#0, format#2, app#0
  Co-Hyponym Layer, c ⊂ S: C#4, Basic#2, Haskell#5, Flash#1, Java#1, Pascal#0, Ruby#6, PHP#0, Ada#1, Oracle#3, Python#3, Apache#3, Visual Basic#1, ASP#2, Delphi#2, SQL Server#0, CSS#0, AJAX#0, the Java#0

SLIDE 32

Thank you!
