SLIDE 1

Measuring Semantic Distance using Distributional Profiles of Concepts

Saif Mohammad

Department of Computer Science, University of Toronto

Grateful acknowledgments: Graeme Hirst (advisor and co-author); Iryna Gurevych, Torsten Zesch, and Philip Resnik (co-authors); Rada Mihalcea, Renee Miller, Gerald Penn, Suzanne Stevenson, University of Toronto (especially the CL group), and NSERC.

SLIDE 2

Semantic Distance

SALSA DANCE CLOWN BRIDGE

A measure of how close or distant two units of language are in terms of their meaning

Measuring Semantic Distance using Distributional Profiles of Concepts. Saif Mohammad. 2

SLIDE 3

Why measure semantic distance?

  • Natural language processing is teeming with semantic-distance problems:

Machine translation

English: You know a person by the company they keep

German: Das Wesen eines Menschen erkennt man an der Gesellschaft, mit der er sich umgibt

bag of hypotheses

SLIDE 4

Why measure semantic distance?

  • Natural language processing is teeming with semantic-distance problems:

Word sense disambiguation

Hermione cast a bewitching spell

CHARM OR INCANTATION

bag of hypotheses

SLIDE 5

Why measure semantic distance?

  • Natural language processing is teeming with semantic-distance problems:

Speech recognition, real-word spelling correction

... interest ... money ... band ... loan ...

bank or bond

bag of hypotheses

SLIDE 6

Knowledge source–based semantic measures

  • Structure of a network or resource

The nodes represent senses or concepts

Examples: Resnik (1995), Jiang and Conrath (1997)

  • Drawbacks

Resource bottleneck

Not easily domain-adaptable

Accuracy on pairs other than noun–noun is poor

Relatedness estimation is poor

SLIDE 7

Corpus-based distributional measures

  • Words in similar contexts are close.

Distributional profile (DP) of a word: strength of association of the word with co-occurring words in text
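
As a rough illustration of a distributional profile, the sketch below counts the words that co-occur with a target word inside a fixed window and normalizes the counts. The function name `distributional_profile`, the window size, and the choice of conditional probability as the strength of association are assumptions made for illustration; the slides do not fix any of these details.

```python
from collections import Counter


def distributional_profile(tokens, target, window=2):
    """Return {co-occurring word: P(word | target)} for one target word.

    Assumption: association strength is the share of the target's
    co-occurrence counts that falls on each neighbouring word.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


# Toy corpus, invented for this example.
text = "the star emits light and heat the star is in space".split()
dp = distributional_profile(text, "star", window=2)
```

Other association scores (for example pointwise mutual information) could be substituted for the normalized counts without changing the overall shape of the profile.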

SLIDE 8

DP of a word

DP of fusion: energy 0.13, gravity 0.03, hydrogen 0.16, light 0.09, space 0.04, hot 0.09, heat 0.16, pressure 0.03

DP of star: hydrogen 0.07

SLIDE 9

DPs of words

DP of star: light 0.12, rich 0.11, heat 0.08, movie 0.16, famous 0.15, planet 0.07, hydrogen 0.07, space 0.21

DP of fusion: energy 0.13, gravity 0.03, hydrogen 0.16, light 0.09, space 0.04, hot 0.09, heat 0.16, pressure 0.03

SLIDE 10

Distance between two words

DP of star: light 0.12, rich 0.11, heat 0.08, movie 0.16, famous 0.15, planet 0.07, hydrogen 0.07, space 0.21

DP of fusion: energy 0.13, gravity 0.03, hydrogen 0.16, light 0.09, space 0.04, hot 0.09, heat 0.16, pressure 0.03
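
One way to turn the two profiles on this slide into a single number is cosine similarity over the shared co-occurring words. The sketch below is an illustration only: the assignment of values to star and fusion is read off the slide, and treating 1 minus cosine as the distance is an assumption rather than the measure used in the talk.

```python
import math


def cosine(dp1, dp2):
    """Cosine similarity between two sparse profiles stored as dicts."""
    shared = set(dp1) & set(dp2)
    dot = sum(dp1[w] * dp2[w] for w in shared)
    norm1 = math.sqrt(sum(v * v for v in dp1.values()))
    norm2 = math.sqrt(sum(v * v for v in dp2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


# Values as listed on the slide.
dp_star = {"light": 0.12, "rich": 0.11, "heat": 0.08, "movie": 0.16,
           "famous": 0.15, "planet": 0.07, "hydrogen": 0.07, "space": 0.21}
dp_fusion = {"energy": 0.13, "gravity": 0.03, "hydrogen": 0.16, "light": 0.09,
             "space": 0.04, "hot": 0.09, "heat": 0.16, "pressure": 0.03}

similarity = cosine(dp_star, dp_fusion)
distance = 1.0 - similarity  # assumed conversion from similarity to distance
```

Only light, heat, hydrogen, and space contribute to the dot product; movie, famous, and rich add to the norm of star without matching anything in fusion, which lowers the similarity.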

SLIDE 15

Distributional measures of word-distance

  • Words in similar contexts are close.

Distributional profile (DP) of a word: strength of association of the word with co-occurring words in text

Distributional measure: distance between DPs

Cosine, Lin, α-skew divergence

  • Drawback

Poor accuracy (albeit higher coverage)

  • Conflation of word senses
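
Of the measures named on this slide, α-skew divergence is the least self-explanatory. A minimal sketch follows, assuming the commonly cited definition (the KL divergence of r from the mixture αq + (1 − α)r) and assuming both profiles are first normalized to sum to 1. The helper `normalize`, the function name, and the default α = 0.99 are choices made for this example, not details taken from the slides.

```python
import math


def normalize(dp):
    """Scale a profile so that its values sum to 1."""
    total = sum(dp.values())
    return {w: v / total for w, v in dp.items()} if total else {}


def alpha_skew_divergence(q, r, alpha=0.99):
    """KL divergence of r from the mixture alpha*q + (1 - alpha)*r.

    Mixing a little of r into q keeps the result finite even when q
    gives zero probability to a word that r has seen.
    """
    total = 0.0
    for word, r_w in r.items():
        if r_w == 0.0:
            continue
        mix = alpha * q.get(word, 0.0) + (1.0 - alpha) * r_w
        total += r_w * math.log(r_w / mix)
    return total


# Toy distributions, invented for this example.
p = normalize({"a": 1, "b": 1})
q = normalize({"a": 9, "c": 1})
d_qp = alpha_skew_divergence(q, p)
d_pq = alpha_skew_divergence(p, q)
```

The measure is asymmetric, so the order of the two arguments matters.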

SLIDE 16

Problem with distributional word-distance measures

DP of star: light 0.12, rich 0.11, heat 0.08, movie 0.16, famous 0.15, planet 0.07, hydrogen 0.07, space 0.21

DP of fusion: energy 0.13, gravity 0.03, hydrogen 0.16, light 0.09, space 0.04, hot 0.09, heat 0.16, pressure 0.03

SLIDE 17

Problem with distributional word-distance measures

DP of star: light 0.12, heat 0.08, planet 0.07, hydrogen 0.07, space 0.21, movie 0.16, famous 0.15, rich 0.11

DP of fusion: energy 0.13, gravity 0.03, hydrogen 0.16, light 0.09, space 0.04, hot 0.09, heat 0.16, pressure 0.03

Word sense ambiguity reduces accuracy of distance measures

SLIDE 18

Shared limitations

  • Precomputing all distances is computationally expensive

WordNet-based measures:

117,000×117,000 sense–sense distance matrix

Distributional measures:

100,000×100,000 word–word distance matrix

  • Monolingual

SLIDE 19

A new hybrid approach

  • Combines a knowledge source with text

Thesaurus categories: concepts/coarse senses

Most published thesauri: around 1000 categories

  • Profiles concepts (rather than words)

Uses sets of words to represent each concept

Creates profiles using bootstrapping
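
One possible reading of "uses sets of words to represent each concept" is sketched below: whenever a word listed under a thesaurus category occurs in text, its neighbours are credited to that category. This covers only a naive first pass under simplifying assumptions (a toy thesaurus, a fixed window, no bootstrapping step), and the names `THESAURUS` and `concept_cooccurrence_counts` are hypothetical.

```python
from collections import Counter, defaultdict

# Toy thesaurus: category name -> set of words listed under it (assumed).
THESAURUS = {
    "CELESTIAL BODY": {"star", "sun"},
    "CELEBRITY": {"star", "celebrity", "hero"},
}


def concept_cooccurrence_counts(tokens, thesaurus, window=2):
    """Count the words that occur near any member word of each category."""
    counts = defaultdict(Counter)
    for i, tok in enumerate(tokens):
        for concept, members in thesaurus.items():
            if tok not in members:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[concept][tokens[j]] += 1
    return counts


# Toy text, invented for this example.
tokens = "the sun emits light a hero is famous the star is hot".split()
counts = concept_cooccurrence_counts(tokens, THESAURUS)
```

In this pass the ambiguous word star credits its neighbours to both categories. Unambiguous members such as sun and hero still anchor each profile, and a later bootstrapping pass could use these first counts to decide which category an ambiguous occurrence belongs to before recounting.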

SLIDE 20

Features

  • Can be used in real-time applications

Concept–concept distance matrix: only 1000×1000

  • Accurate for all part-of-speech (POS) pairs

Not just noun–noun

  • Capable of giving both similarity and relatedness values
  • Easily domain-adaptable
  • Cross-lingual
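
The real-time claim rests on simple arithmetic: a distance matrix over roughly 1000 concepts is several orders of magnitude smaller than the sense–sense and word–word matrices quoted under "Shared limitations". The short sketch below only counts matrix entries; the function name is made up for this example and no storage format is implied.

```python
def matrix_entries(n):
    """Number of cells in a full n x n distance matrix."""
    return n * n


sense_entries = matrix_entries(117_000)    # WordNet sense-sense matrix
word_entries = matrix_entries(100_000)     # word-word matrix
concept_entries = matrix_entries(1_000)    # thesaurus concept-concept matrix

# How many times larger the sense-sense matrix is than the concept matrix.
ratio = sense_entries // concept_entries
```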

SLIDE 21

Problem with distributional word-distance measures

DP of fusion: hydrogen 0.16, pressure 0.03

DP of star: light 0.12, heat 0.08, hydrogen 0.07, space 0.21, movie 0.16, famous 0.15, rich 0.11, planet 0.07

Word sense ambiguity reduces accuracy of distance measures

SLIDE 23

Solution: tease out the senses

DP of fusion: pressure 0.03, hydrogen

DP of star: light, heat, planet, hydrogen, space, movie, famous, rich

Profile the senses separately.

SLIDE 25

Distributional profiles of concepts

DPs of the concepts referred to by star:

DP of CELESTIAL BODY (celestial body, star, sun,...): hydrogen 0.06, space 0.36, light 0.27, planet 0.07, heat 0.11, hot 0.01

DP of CELEBRITY (celebrity, hero, star,...): fan 0.10, rich 0.14, movie 0.14, famous 0.24, hot 0.04, fashion 0.01

SLIDE 28

Distance: star and fusion

DP of FUSION (atomic reaction, fusion, thermonuclear reaction,...): hot 0.09, energy 0.13, hydrogen 0.16, heat 0.16, space 0.04, light 0.09

DP of CELEBRITY (celebrity, hero, star,...): fan 0.10, rich 0.14, movie 0.14, famous 0.24, hot 0.04, fashion 0.01

First, consider the CELEBRITY sense of star.

  • Distributionally NOT close

SLIDE 31

Distance: star and fusion

DP of FUSION (atomic reaction, fusion, thermonuclear reaction,...): hot 0.09, energy 0.13, hydrogen 0.16, heat 0.16, space 0.04, light 0.09

DP of CELESTIAL BODY (celestial body, star, sun...): planet 0.07, heat 0.11, light 0.27, space 0.36, hydrogen 0.07, hot 0.07

Then, consider the CELESTIAL BODY sense of star.

  • Distributionally close
  • Word sense ambiguity NOT a problem
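
The comparison walked through on the last few slides can be sketched as follows: score the FUSION profile against each concept that star can refer to and keep the closest one. The profile values are those shown on the slides; the use of cosine and the name `closest_sense` are assumptions made for illustration.

```python
import math


def cosine(dp1, dp2):
    """Cosine similarity between two sparse profiles stored as dicts."""
    dot = sum(dp1[w] * dp2[w] for w in set(dp1) & set(dp2))
    norm1 = math.sqrt(sum(v * v for v in dp1.values()))
    norm2 = math.sqrt(sum(v * v for v in dp2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


DP_FUSION = {"hot": 0.09, "energy": 0.13, "hydrogen": 0.16,
             "heat": 0.16, "space": 0.04, "light": 0.09}

# Concepts that the word "star" can refer to, with their profiles.
STAR_SENSES = {
    "CELEBRITY": {"fan": 0.10, "rich": 0.14, "movie": 0.14,
                  "famous": 0.24, "hot": 0.04, "fashion": 0.01},
    "CELESTIAL BODY": {"planet": 0.07, "heat": 0.11, "light": 0.27,
                       "space": 0.36, "hydrogen": 0.07, "hot": 0.07},
}


def closest_sense(target_dp, sense_dps):
    """Return the best-matching concept and the score of every concept."""
    scores = {c: cosine(target_dp, dp) for c, dp in sense_dps.items()}
    best = max(scores, key=scores.get)
    return best, scores


best, scores = closest_sense(DP_FUSION, STAR_SENSES)
```

Because each sense is scored on its own, the celebrity-related words no longer dilute the match between fusion and the astronomical sense of star.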

SLIDE 32

Ranking word pairs

(Monolingual)

SLIDE 33

Correcting spelling errors

(Monolingual)

SLIDE 34
  • But...

Application of distance algorithms in most languages is hindered by a lack of high-quality linguistic resources.

SLIDE 35

So: Make it cross-lingual

  • Determining distance in a resource-poor language

Combine its text with a thesaurus from a (possibly resource-rich) language

  • Largely alleviates the knowledge-source bottleneck

Use a bilingual lexicon

Without parallel corpora or sense-annotated data

  • Experiments: German as a “resource-poor” language

SLIDE 38

Cross-lingual links

German words (w_de): Bank, Stern

English translations (w_en), from a German–English lexicon: bank, bench, star

English concepts (c_en), from an English thesaurus: CELESTIAL BODY, CELEBRITY, RIVER BANK, FINANCIAL INSTITUTION, FURNITURE, JUDICIARY

SLIDE 39

Dealing with ambiguity

[Figure: the German words Bank and Stern, their English translations bank, bench, and star, and the English concepts CELESTIAL BODY, CELEBRITY, RIVER BANK, FINANCIAL INSTITUTION, FURNITURE, and JUDICIARY]

The concepts CELEBRITY, RIVER BANK, and JUDICIARY are semantically unrelated to Stern and Bank.

SLIDE 42

Losing the English words

German words (w_de): Bank, Stern

English concepts (c_en): CELESTIAL BODY, CELEBRITY, RIVER BANK, FINANCIAL INSTITUTION, FURNITURE, JUDICIARY

Cross-lingual candidate senses of the German words Stern and Bank
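
The two-step lookup behind these candidate senses (German word to English translations, then English translations to thesaurus concepts) can be sketched with plain dictionaries. The entries below mirror the words and concepts named on the slides, but which translation links to which concept is an assumption, and `LEXICON_DE_EN`, `THESAURUS_EN`, and `candidate_senses` are hypothetical names.

```python
# German word -> English translations (stand-in for a bilingual lexicon).
LEXICON_DE_EN = {
    "Stern": ["star"],
    "Bank": ["bank", "bench"],
}

# English word -> thesaurus concepts that list it (assumed links).
THESAURUS_EN = {
    "star": ["CELESTIAL BODY", "CELEBRITY"],
    "bank": ["FINANCIAL INSTITUTION", "RIVER BANK"],
    "bench": ["FURNITURE", "JUDICIARY"],
}


def candidate_senses(german_word, lexicon, thesaurus):
    """Collect every English concept reachable from a German word."""
    concepts = []
    for english_word in lexicon.get(german_word, []):
        for concept in thesaurus.get(english_word, []):
            if concept not in concepts:
                concepts.append(concept)
    return concepts


stern_senses = candidate_senses("Stern", LEXICON_DE_EN, THESAURUS_EN)
bank_senses = candidate_senses("Bank", LEXICON_DE_EN, THESAURUS_EN)
```

Once the candidates are collected, the English words are no longer needed. The list still contains concepts such as CELEBRITY and RIVER BANK that do not fit the German words.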

SLIDE 44

Cross-lingual DPCs

Cross-lingual DPs of the concepts referred to by star:

DP of CELESTIAL BODY (English concept: celestial body, star, sun,...), German co-occurring words: Wasserstoff 0.06, Raum 0.36, Licht 0.27, Planet 0.07, Hitze 0.11, heiß 0.01

DP of CELEBRITY (English concept: celebrity, hero, star,...), German co-occurring words: Fan 0.10, reich 0.14, Film 0.14, berühmt 0.24, heiß 0.04, Mode 0.01

SLIDE 45

Ranking word pairs

(Cross-lingual)

SLIDE 46

Solving word choice problems

(Cross-lingual)

SLIDE 51

Distance between a concept and its context

CONCEPT1

word ... word ... target word ... word ... word

CONCEPT2

Word sense dominance and word sense disambiguation:

  • Obviates the need for sense-annotated data

SLIDE 52

Word sense dominance

[Figure: accuracy of the DPC-based dominance measure plotted against sense distribution (alpha), with baseline, lower bound, and upper bound marked. Mean distance below upper bound = 0.02.]

SLIDE 53

Unsupervised Naïve Bayes word sense classifier

  • Estimated probabilities from the DPC
  • Took part in SemEval-07’s:

English Lexical Sample Task

  • Only one percentage point behind the best unsupervised system

Multilingual Chinese–English Lexical Sample Task

  • Placed clear first among unsupervised systems
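
A minimal sketch of how a Naïve Bayes sense classifier might draw its probabilities from concept profiles is given below. It treats the profile values shown earlier in the talk as P(word | concept), uses uniform priors, and adds a small smoothing constant; all three choices, along with the name `naive_bayes_sense`, are assumptions for illustration and not a description of the submitted SemEval system.

```python
import math

# Profile values reused as stand-ins for P(word | concept) (assumption).
DPCS = {
    "CELESTIAL BODY": {"hydrogen": 0.06, "space": 0.36, "light": 0.27,
                       "planet": 0.07, "heat": 0.11, "hot": 0.01},
    "CELEBRITY": {"fan": 0.10, "rich": 0.14, "movie": 0.14,
                  "famous": 0.24, "hot": 0.04, "fashion": 0.01},
}
PRIORS = {"CELESTIAL BODY": 0.5, "CELEBRITY": 0.5}  # uniform, assumed


def naive_bayes_sense(context, priors, dpcs, smoothing=1e-6):
    """Pick the concept with the highest log-probability for the context."""
    best, best_score = None, -math.inf
    for concept, prior in priors.items():
        score = math.log(prior)
        for word in context:
            score += math.log(dpcs[concept].get(word, 0.0) + smoothing)
        if score > best_score:
            best, best_score = concept, score
    return best


sense_a = naive_bayes_sense(["movie", "famous"], PRIORS, DPCS)
sense_b = naive_bayes_sense(["space", "light", "planet"], PRIORS, DPCS)
```

No sense-annotated training data appears anywhere in the sketch: every probability comes from the profiles, which is what makes the classifier unsupervised.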

SLIDE 54

Accomplishments (1)

  • Performed a qualitative and quantitative comparison of WordNet-based and distributional measures

  • Identified significant limitations of state-of-the-art approaches to measuring semantic distance

Word sense ambiguity

  • A hurdle for distributional measures

SLIDE 55

Accomplishments (2)

  • Proposed a new hybrid approach to semantic distance

Combines text with a thesaurus

Models concepts (rather than words)

Uses thesaurus categories as very coarse senses

SLIDE 56

Accomplishments (3)

  • Extensive evaluation

Monolingual

  • By combining English text with an English thesaurus
  • Ranked word pairs
  • Corrected real-word spelling errors
  • Determined word sense dominance
  • Did word sense disambiguation

SLIDE 57

Accomplishments (4)

  • Extensive evaluation (continued)

Cross-lingual

  • By combining German text with an English thesaurus
  • Ranked word pairs and solved word-choice problems in German

  • By combining Chinese text with an English thesaurus
  • Identified the English translations of Chinese words from their contexts

SLIDE 58

Future work

  • Adding cross-lingual semantic distance as a feature to a state-of-the-art MT system (with Philip Resnik)
  • Cross-lingual document clustering
  • Cross-lingual information retrieval
  • Cross-lingual summarization (with Bonnie Dorr)
  • Determining paraphrases, lexical entailment, and contradictions (with Bonnie Dorr)
  • Determining cognates using semantic distance between words in different languages (with Greg Kondrak)
  • Porting the approach to Wikipedia (with Torsten Zesch and Iryna Gurevych)

SLIDE 59

Conclusions (1)

  • Distributional profiles of concepts can be used to infer their semantic properties, and indeed estimate semantic distance.

  • Cross-lingual DPCs allow for a seamless transition from words in one language to concepts in another.

SLIDE 65

Conclusions (2)

  • Distributional measures of concept-distance are markedly superior to previous approaches.

Works well for all POS pairs

Gives both relatedness and similarity

Domain-adaptable

Can be used in real-time systems

Cross-lingual

  • Solve problems in one language using a knowledge source from another

  • Solve problems that involve multiple languages
