

SLIDE 1

Lexical Semantics

Ling571: Deep Processing Techniques for NLP. February 23, 2015

SLIDE 2

What is a plant?

There are more kinds of plants and animals in the rainforests than anywhere else on Earth. Over half of the millions of known species of plants and animals live in the rainforest. Many are found nowhere else. There are even plants and animals in the rainforest that we have not yet discovered.

The Paulus company was founded in 1938. Since those days the product range has been the subject of constant expansions and is brought up continuously to correspond with the state of the art. We’re engineering, manufacturing, and commissioning world-wide ready-to-run plants packed with our comprehensive know-how.

SLIDE 5

Lexical Semantics

— So far, word meanings discrete

— Constants, predicates, functions

— Focus on word meanings:

— Relations of meaning among words

— Similarities & differences of meaning in similar contexts

— Internal meaning structure of words

— Basic internal units combine for meaning

SLIDE 8

Terminology

— Lexeme:

— Form: orthographic/phonological + meaning
— Represented by lemma

— Lemma: citation form; e.g. the infinitive stands in for a verb’s inflected forms

— Sing: sing, sings, sang, sung, …

— Lexicon: finite list of lexemes

SLIDE 13

Sources of Confusion

— Homonymy:

— Words have same form but different meanings

— Generally same POS, but unrelated meaning
— E.g. bank (side of river) vs bank (financial institution)

— bank1 vs bank2

— Homophones: same phonology, different orthographic form

— E.g. two, to, too

— Homographs: same orthography, different phonology

— Why?

— Problem for applications: TTS, ASR transcription, IR

SLIDE 15

Sources of Confusion II

— Polysemy

— Multiple RELATED senses

— E.g. bank: money bank, organ bank, blood bank, …

— Big issue in lexicography

— # of senses, relations among senses, differentiation
— E.g. serve breakfast, serve Philadelphia, serve time

SLIDE 21

Relations between Senses

— Synonymy:

— (Near) identical meaning
— Substitutability

— Maintains propositional meaning

— Issues:

— Polysemy: a word may be synonymous with only some sense of another
— Shades of meaning, other associations:

— Price/fare; big/large; water/H2O

— Collocational constraints: e.g. babbling brook
— Register:

— social factors: e.g. politeness, formality

SLIDE 25

Relations between Senses

— Antonyms:

— Opposition

— Typically ends of a scale

— Fast/slow; big/little

— Can be hard to distinguish automatically from synonyms

— Hyponymy:

— Is-a relations:

— More general (hypernym) vs more specific (hyponym)

— E.g. dog/golden retriever; fruit/mango

— Organize as ontology/taxonomy

SLIDE 29

WordNet Taxonomy

— Most widely used English sense resource
— Manually constructed lexical database

— 3 Tree-structured hierarchies

— Nouns (117K), verbs (11K), adjectives+adverbs (27K)

— Entries: synonym set, gloss, example use

— Relations between entries:

— Synonymy: within a synset
— Hyponym/hypernym: is-a tree
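
As a concrete illustration (not from the slides), a minimal sketch of browsing these entries and relations through NLTK’s WordNet interface; the words queried are arbitrary examples:

```python
# Illustration: browsing WordNet via NLTK.
# Requires: pip install nltk; then nltk.download('wordnet')
from nltk.corpus import wordnet as wn

# Entries are synsets: synonym set + gloss + example uses
for synset in wn.synsets('bank')[:3]:
    print(synset.name(), '-', synset.definition())
    print('  lemmas:  ', [lemma.name() for lemma in synset.lemmas()])
    print('  examples:', synset.examples())

# Hyponymy/hypernymy: navigating the is-a tree
dog = wn.synset('dog.n.01')
print(dog.hypernyms())     # more general senses
print(dog.hyponyms()[:3])  # more specific senses
```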

SLIDE 30

WordNet

[figure not captured in transcript]

SLIDE 31

Noun WordNet Relations

[figure not captured in transcript]

SLIDE 32

WordNet Taxonomy

[figure not captured in transcript]

SLIDE 33

Word Sense Disambiguation

— WSD

— Tasks, evaluation, features
— Selectional Restriction-based Approaches
— Robust Approaches

— Dictionary-based Approaches
— Distributional Approaches
— Resource-based Approaches

— Summary

— Strengths and Limitations

SLIDE 36

Word Sense Disambiguation

— Application of lexical semantics
— Goal: Given a word in context, identify the appropriate sense

— E.g. plants and animals in the rainforest

— Crucial for real syntactic & semantic analysis

— Correct sense can determine

— Available syntactic structure
— Available thematic roles, correct meaning, …

SLIDE 40

Robust Disambiguation

— More to semantics than predicate-argument (P-A) structure

— Select sense where predicates underconstrain

— Learning approaches

— Supervised, Bootstrapped, Unsupervised

— Knowledge-based approaches

— Dictionaries, Taxonomies

— Widen notion of context for sense selection

— Words within window (2, 50, discourse)
— Narrow co-occurrence: collocations

SLIDE 41

Label the First Use of “Plant”

Biological example: There are more kinds of plants and animals in the rainforests than anywhere else on Earth. Over half of the millions of known species of plants and animals live in the rainforest. Many are found nowhere else. There are even plants and animals in the rainforest that we have not yet discovered.

Industrial example: The Paulus company was founded in 1938. Since those days the product range has been the subject of constant expansions and is brought up continuously to correspond with the state of the art. We’re engineering, manufacturing and commissioning world-wide ready-to-run plants packed with our comprehensive know-how. Our Product Range includes pneumatic conveying systems for carbon, carbide, sand, lime and many others. We use reagent injection in molten metal for the…

SLIDE 50

Disambiguation Features

— Key: What are the features?

— Part of speech

— Of word and neighbors

— Morphologically simplified form
— Words in neighborhood

— Question: How big a neighborhood?

— Is there a single optimal size? Why?

— (Possibly shallow) Syntactic analysis

— E.g. predicate-argument relations, modification, phrases

— Collocation vs co-occurrence features

— Collocation: words in a specific relation to the target: predicate-argument, 1 word +/-
— Co-occurrence: bag of words within a window
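
To make the collocation/co-occurrence distinction concrete, a minimal sketch (the window size and feature-naming scheme are assumptions of the sketch, not from the slides):

```python
# Illustration: positional (collocation) vs bag-of-words (co-occurrence)
# features for one occurrence of a target word.
from collections import Counter

def wsd_features(tokens, i, window=2):
    """Features describing the occurrence of tokens[i]."""
    feats = {}
    # Collocation features: word at a specific offset from the target
    for offset in (-2, -1, 1, 2):
        j = i + offset
        if 0 <= j < len(tokens):
            feats['w%+d=%s' % (offset, tokens[j].lower())] = 1
    # Co-occurrence features: unordered bag of words within the window
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    bag = Counter(t.lower() for j, t in enumerate(tokens)
                  if lo <= j < hi and j != i)
    for word, count in bag.items():
        feats['bow:' + word] = count
    return feats

tokens = 'The plant closed after the strike'.split()
print(wsd_features(tokens, 1))
```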

SLIDE 55

WSD Evaluation

— Ideally, end-to-end evaluation with WSD component

— Demonstrate real impact of technique in system
— Difficult, expensive, still application-specific

— Typically, intrinsic, sense-based

— Accuracy, precision, recall
— SENSEVAL/SEMEVAL: all-words, lexical sample

— Baseline:

— Most frequent sense, Lesk

— Topline:

— Human inter-rater agreement: 75-80% on fine-grained senses; ~90% on coarse-grained senses
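
A small sketch of the intrinsic scoring, assuming the SENSEVAL-style convention that precision is computed over the instances a system attempts and recall over all instances; the labels below are invented:

```python
# Illustration: intrinsic WSD scoring when a system may abstain.
def wsd_scores(gold, predicted):
    """gold: list of sense labels; predicted: labels or None (abstain)."""
    attempted = [(g, p) for g, p in zip(gold, predicted) if p is not None]
    correct = sum(g == p for g, p in attempted)
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

gold = ['bank1', 'bank2', 'bank1', 'bank1']
pred = ['bank1', 'bank1', None, 'bank1']
print(wsd_scores(gold, pred))  # -> (0.666..., 0.5)
```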

SLIDE 60

Dictionary-Based Approach

— (Simplified) Lesk algorithm

— “How to tell a pine cone from an ice cream cone”

— Compute ‘signature’ of word senses:

— Words in gloss and examples in dictionary

— Compute context of word to disambiguate

— Words in surrounding sentence(s)

— Compare overlap between signature and context
— Select sense with highest (non-stopword) overlap

SLIDE 63

Applying Lesk

— The bank can guarantee deposits will eventually cover future tuition costs because it invests in mortgage securities.

— Bank1: 2
— Bank2: 0
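
A minimal sketch of the simplified Lesk computation on this sentence. The sense signatures below are abridged, WordNet-style gloss+example words invented for the sketch (chosen so the counts come out 2 and 0 as on the slide); a real system would read them from the dictionary:

```python
# Illustration: Simplified Lesk with invented sense signatures.
STOPWORDS = {'a', 'an', 'the', 'of', 'in', 'on', 'at', 'to', 'it', 'is',
             'that', 'he', 'they', 'and', 'will', 'can', 'because'}

SIGNATURES = {
    'bank1': set(('a financial institution that accepts deposits and '
                  'channels the money into lending activities '
                  'that bank holds the mortgage on my home').split()),
    'bank2': set(('sloping land beside a body of water '
                  'they pulled the canoe up on the bank').split()),
}

def simplified_lesk(target, sentence, signatures, stopwords=STOPWORDS):
    # Context = non-stopword tokens around the target word
    context = set(sentence.lower().split()) - stopwords - {target}
    # Score each sense by non-stopword overlap with its signature
    scores = {sense: len(context & (sig - stopwords))
              for sense, sig in signatures.items()}
    return max(scores, key=scores.get), scores

sentence = ('The bank can guarantee deposits will eventually cover future '
            'tuition costs because it invests in mortgage securities')
print(simplified_lesk('bank', sentence, SIGNATURES))
# -> ('bank1', {'bank1': 2, 'bank2': 0}): overlaps on 'deposits', 'mortgage'
```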

SLIDE 67

Improving Lesk

— Overlap score:

— All words equally weighted (excluding stopwords)

— Not all words equally informative

— Overlap with unusual/specific words: better
— Overlap with common/non-specific words: less good

— Employ corpus weighting:

— IDF: inverse document frequency

— idf_i = log(N_doc / nd_i), where N_doc is the total number of documents and nd_i the number of documents containing word i
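
A short sketch of applying this corpus weighting to the overlap score, assuming a hypothetical document collection:

```python
# Illustration: IDF-weighted overlap for Lesk-style disambiguation.
import math

def idf_weights(documents):
    """idf_i = log(N_doc / nd_i) for every word in the collection."""
    n_doc = len(documents)
    df = {}
    for doc in documents:
        for word in set(doc.lower().split()):
            df[word] = df.get(word, 0) + 1
    return {word: math.log(n_doc / n) for word, n in df.items()}

def weighted_overlap(context, signature, idf):
    # Shared rare/specific words count for more than shared common words
    return sum(idf.get(word, 0.0) for word in context & signature)
```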

SLIDE 75

Word Similarity

— Synonymy:

— True propositional substitutability is rare, slippery

— Word similarity (semantic distance):

— Looser notion, more flexible
— Appropriate to applications: IR, summarization, MT, essay scoring

— Don’t need binary +/- synonym decision
— Want terms/documents that have high similarity
— Differs from relatedness

— Approaches:

— Thesaurus-based
— Distributional

SLIDE 79

Distributional Similarity

— Unsupervised approach:

— Clustering, WSD, automatic thesaurus enrichment

— Insight:

— “You shall know a word by the company it keeps!”

— (Firth, 1957)

— A bottle of tezguino is on the table.
— Everybody likes tezguino.
— Tezguino makes you drunk.
— We make tezguino from corn.

— Tezguino: corn-based, alcoholic beverage

(Some slides based on Eisenstein 2014)
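
A minimal sketch of the idea, assuming a toy corpus and a +/-2-word context window: words that keep similar company (here tezguino and wine) end up with similar count vectors:

```python
# Illustration: distributional vectors from local context windows,
# compared with cosine similarity.
import math
from collections import Counter, defaultdict

corpus = [
    'a bottle of tezguino is on the table'.split(),
    'everybody likes tezguino'.split(),
    'tezguino makes you drunk'.split(),
    'we make tezguino from corn'.split(),
    'a bottle of wine is on the table'.split(),
    'everybody likes wine'.split(),
]

# Count co-occurrences within a +/-2 word window
vectors = defaultdict(Counter)
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 2), min(len(sent), i + 3)):
            if j != i:
                vectors[w][sent[j]] += 1

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    norm = lambda x: math.sqrt(sum(c * c for c in x.values()))
    return dot / (norm(u) * norm(v))

# Words sharing contexts get high similarity
print(cosine(vectors['tezguino'], vectors['wine']))   # high (~0.71)
print(cosine(vectors['tezguino'], vectors['table']))  # low  (~0.20)
```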

SLIDE 81

Local Context Clustering

— “Brown” (aka IBM) clustering (1992)

— Generative model over adjacent words
— Each word w_i has class c_i
— log P(W) = Σ_i [ log P(w_i | c_i) + log P(c_i | c_{i-1}) ]

— (Familiar??)

— Greedy clustering

— Start with each word in its own cluster
— Merge clusters based on log prob of text under model

— Merge those which maximize P(W)
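
A naive sketch of this objective and one greedy merge step on a toy corpus. Real implementations update counts incrementally rather than rescoring from scratch, and the class-transition denominator below is approximated by the total class count; both are simplifications of this sketch:

```python
# Illustration: class-bigram objective behind Brown clustering,
# with a brute-force greedy merge step.
import math
from collections import Counter

def log_prob(tokens, cls):
    """log P(W) = sum_i [ log P(w_i|c_i) + log P(c_i|c_{i-1}) ], MLE counts."""
    word_n = Counter(tokens)
    class_n = Counter(cls[w] for w in tokens)
    bigram_n = Counter((cls[a], cls[b]) for a, b in zip(tokens, tokens[1:]))
    # Emission term: P(w|c) = count(w) / count(c)
    lp = sum(n * math.log(word_n[w] / class_n[cls[w]])
             for w, n in word_n.items())
    # Transition term, denominator approximated by total class count
    lp += sum(n_ab * math.log(n_ab / class_n[a])
              for (a, b), n_ab in bigram_n.items())
    return lp

def greedy_merge_step(tokens, cls):
    """Try every pair of clusters; keep the merge maximizing log P(W)."""
    classes = sorted(set(cls.values()))
    best, best_lp = cls, -math.inf
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            trial = {w: (a if c == b else c) for w, c in cls.items()}
            lp = log_prob(tokens, trial)
            if lp > best_lp:
                best, best_lp = trial, lp
    return best, best_lp

tokens = 'the dog barks and the cat meows and the dog runs'.split()
clusters = {w: w for w in set(tokens)}   # start: one cluster per word
clusters, lp = greedy_merge_step(tokens, clusters)
print(lp, clusters)
```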

SLIDE 82

Clustering Impact

— Improves downstream tasks

— Here: Named Entity Recognition vs an HMM baseline (Miller et al. ’04)