Low-Pass Semantics, Fernando Pereira with William Cohen*, Rahul Gupta, Ni Lao*, Slav Petrov, Michael Ringgaard, and Amar Subramanya (PowerPoint PPT presentation)



SLIDE 1

Low-Pass Semantics

Fernando Pereira

with William Cohen*, Rahul Gupta, Ni Lao*, Slav Petrov, Michael Ringgaard, and Amar Subramanya

*CMU

SLIDES 2–3

An “easy” query…

frequent query

SLIDES 4–7

negation
negation class
negation class synonyms

SLIDES 8–10

tylenol ⇒ painkiller

SLIDES 11–13

upset stomach ⇒ adverse effect (on stomach)

SLIDES 14–17

tylenol ⇒ no adverse effects (on stomach) ⇒ no upset stomach

SLIDES 18–29

From search to inference

[Diagram, built up incrementally across these slides: nodes tylenol, painkiller, upset stomach, adverse effects; edges is-a, a-kind-of, not, arg]

SLIDES 30–35

“Things, not strings”

From           | To                 | Requires
Term           | Concept            | Parsing, disambiguation, coreference
Term identity  | Entailment         | Concept relations
Co-occurrence  | Syntactic relation | Document structure, parsing
Term index     | Semantic index     | Concept disambiguation, inference

SLIDES 36–44

Hypotheses

  • Web understanding will arise from machine learning of relationships implicit in Web content and use
  • Seed meaning: expert annotation, curation
  • Most evidence is not explicitly annotated
  • Web text and queries “in the wild”
  • Similar contexts ⇒ similar meanings
SLIDES 45–49

Knowledge Bases?

  • Curated, disambiguated “text”
  • Disambiguation as “translation”
  • Bootstrapping vocabulary
  • Join keys
SLIDES 50–55

Layers of meaning

  • Grammatical structure (parsing)
  • Referring expression classification
  • Within-document coreference
  • Entity resolution to external KB
  • Entity co-occurrences ⇒ relations
SLIDES 56–61

Putting it all together

Erik Wesner will talk about his research tomorrow. Though Wesner's parents are from Poland, he was born and raised in Raleigh, NC.

SLIDES 62–66

Linguistic analysis matters

Relation extraction quality:

Setup                     | + examples | patterns | relations | known triples | (likely) correct triples | LAS
text                      | 580k       | 187      | 24        | 69k           | 67k                      | –
text + slow parse         | 580k       | 478      | 32        | 191k          | 198k                     | 86.49
text + slow parse + coref | 935k       | 1023     | 38        | 281k          | 478k                     | 86.49
text + fast parse + coref | 941k       | 1013     | 38        | 270k          | 454k                     | 82.72

SLIDES 67–72

Unifying idea: graph inference

  • Nodes: objects of interest
  • Edges: distributional, syntactic, semantic relationships
  • Edge weights: relationship strength
  • Curated seed subgraph
  • Outputs: node interpretations, new edges
  • Naturally (almost) parallel algorithms
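The setup above can be sketched as a toy label-propagation loop over a weighted graph: seed labels stay fixed, and unlabeled nodes adopt the label of their strongest labeled neighbor. Every node name, edge, and weight below is invented for illustration; this is a minimal sketch, not the talk's actual algorithm.

```python
# Toy sketch of graph inference: propagate seed labels over weighted edges.
# All nodes, edges, and weights here are hypothetical.
from collections import defaultdict

edges = {  # edge weight = relationship strength
    ("tylenol", "painkiller"): 0.9,
    ("painkiller", "aspirin"): 0.8,
    ("tylenol", "acetaminophen"): 0.95,
}
seed = {"painkiller": "drug-class"}  # curated seed subgraph

# Build an undirected adjacency list.
adj = defaultdict(list)
for (u, v), w in edges.items():
    adj[u].append((v, w))
    adj[v].append((u, w))

labels = dict(seed)
for _ in range(3):  # a few propagation sweeps
    for node in list(adj):
        if node in seed:
            continue  # seed labels stay fixed
        # Adopt the label of the highest-weight labeled neighbor.
        cand = [(w, labels[v]) for v, w in adj[node] if v in labels]
        if cand:
            labels[node] = max(cand)[1]

print(labels)
```

After a few sweeps every node reachable from the seed carries a label, which is the "node interpretations" output; new edges would come from a separate scoring step.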
SLIDE 73

Entities ⇒ relations

  • N. Lao, A. Subramanya, F. Pereira, W. Cohen, EMNLP 2012
SLIDES 74–78

Joint KB+text inference

[Diagram: dependency trees from a news corpus (“She wrote Jane Eyre”, “Charlotte was …”, with nsubj/dobj arcs), linked by coreference resolution and entity resolution to Freebase nodes Charlotte Brontë, Jane Eyre, Patrick Brontë (HasFather?), Profession: Writer]

Path ranking algorithm (PRA): Lao and Cohen, Machine Learning, 2010

score(s, t) = Σ_{π ∈ B} θ_π · P(s → t; π)

  • Path types π: edge label sequences
  • Random walk probabilities
  • Weights θ_π learned by logistic regression
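The PRA score can be computed directly from the formula: for each path type π (a sequence of edge labels), follow a uniform random walk from s and accumulate the probability mass that lands on t, then take the θ-weighted sum. The graph, the single path type, and the weight below are hypothetical, chosen only to instantiate the formula on a mention/conjunction example.

```python
# Sketch of PRA scoring: score(s, t) = sum over path types pi of
# theta_pi * P(s -> t; pi), with uniform random-walk transitions.
# Graph, path type, and weight are hypothetical.
from collections import defaultdict

# edges[label][node] -> list of successor nodes
edges = defaultdict(lambda: defaultdict(list))
for u, label, v in [
    ("miles_davis", "M", "mention1"),         # entity -> its mention
    ("mention1", "conj", "mention2"),          # coordinated mentions
    ("john_coltrane", "M", "mention2"),
    ("john_coltrane", "Profession", "musician"),
]:
    edges[label][u].append(v)
    edges[label + "^-1"][v].append(u)          # also add the inverse edge

def walk_prob(source, target, path):
    """P(source -> target; path) under uniform random walks."""
    dist = {source: 1.0}
    for label in path:
        nxt = defaultdict(float)
        for node, p in dist.items():
            succ = edges[label][node]
            for v in succ:
                nxt[v] += p / len(succ)  # uniform over successors
        dist = nxt
    return dist.get(target, 0.0)

# One path type over mention (M) and conjunction edges, with a made-up weight.
paths = {("M", "conj", "M^-1", "Profession"): 1.5}

def score(s, t):
    return sum(theta * walk_prob(s, t, pi) for pi, theta in paths.items())

print(score("miles_davis", "musician"))  # 1.5 * 1.0 = 1.5
```

The real system sums over many learned path types B, with θ fit by logistic regression; here a single deterministic path carries all the probability mass.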
SLIDES 79–87

Case study: extending Freebase

  • Freebase: 21M concepts, 70M edges
  • 60M Web pages mention Freebase concepts relevant to this study
  • Study relations: profession, nationality, parent
  • Simplified entity resolution: most likely concept for named mentions in coref cluster
  • Profession stats:
      • 2M people in Freebase
      • 0.3M have a recorded profession
      • Biased data (0.24M politicians, actors)
SLIDES 88–92

Selecting training data

Positive: r(s, t), downsampled for popular s, t
Negative: sample t′ such that ¬r(s, t′)

Paths s −π→ t, |π| ≤ 4

Task        | Training Set | Test Set
Profession  | 22,829       | 15,219
Nationality | 14,431       | 9,620
Parents     | 21,232       | 14,155
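The positive/negative selection scheme can be sketched as follows: positives are KB triples r(s, t), and for each one a negative t′ is drawn from the targets not related to s. The KB triples here are invented for illustration, and the downsampling of popular entities is only noted in a comment.

```python
# Sketch of training-data selection: positives r(s, t) from the KB,
# negatives t' sampled so that not r(s, t'). All triples are made up.
import random

random.seed(0)
kb = {  # r(s, t) pairs for one relation, e.g. profession
    ("miles_davis", "musician"),
    ("john_coltrane", "musician"),
    ("marie_curie", "physicist"),
}
targets = sorted({t for _, t in kb})

positives = sorted(kb)  # real code would downsample popular s, t here
negatives = []
for s, t in positives:
    # Sample a target t' with no KB edge from s.
    t_neg = random.choice([x for x in targets if (s, x) not in kb])
    negatives.append((s, t_neg))

print(len(positives), len(negatives))
```

One negative per positive is an assumption made for brevity; the actual ratio used in the study is not stated on these slides.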

SLIDES 93–97

A learned path for profession

  • M, conj, M⁻¹, Profession
  • M, conj⁻¹, M⁻¹, Profession

[Diagram: entity Miles Davis −M→ its mention, −conj→ a coordinated mention of John Coltrane, −M⁻¹→ entity John Coltrane, −Profession→ Musician]

SLIDES 98–101

Relation extraction results

MRR = (1 / |Q|) Σ_{q ∈ Q} 1 / (rank of q’s first correct answer)

Known triples:

Task        | KB    | Text  | KB+Text | KB+Text[b]
Profession  | 0.532 | 0.516 | 0.583   | 0.453
Nationality | 0.734 | 0.729 | 0.812   | 0.693
Parents     | 0.329 | 0.332 | 0.392   | 0.319

Human evaluation:

Task        | p@100 | p@1k | p@10k
Profession  | 0.97  | 0.92 | 0.84
Nationality | 0.98  | 0.97 | 0.90
Parents     | 0.86  | 0.81 | 0.79

Coverage:

Profession triples | People
1,000              | 970
10,000             | 8,726
100,000            | 79,885
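The MRR formula is straightforward to compute: average, over queries, the reciprocal rank of the first correct answer. The ranked answer lists below are invented purely to show the arithmetic.

```python
# Mean reciprocal rank: average of 1 / (rank of first correct answer).
# Queries and rankings below are hypothetical.
def mrr(results):
    """results: list of (ranked answers, set of correct answers)."""
    total = 0.0
    for ranked, correct in results:
        for i, ans in enumerate(ranked, start=1):
            if ans in correct:
                total += 1.0 / i
                break  # only the first correct answer counts
    return total / len(results)

results = [
    (["musician", "actor"], {"musician"}),          # rank 1 -> 1.0
    (["actor", "writer", "chemist"], {"chemist"}),  # rank 3 -> 1/3
]
print(mrr(results))  # (1 + 1/3) / 2 = 2/3
```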

SLIDE 102

Scaling up

  • Index text graph by entity mentions
  • Explore short paths in parallel
  • Aggregate path types with MapReduce
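The scaling recipe on this slide, explore short paths per entity and aggregate path types with MapReduce, can be mimicked in miniature: a "map" step emits (path type, source, target) records from each entity, and a "reduce" step counts records by path type. The graph and path lengths here are illustrative, not the system's actual sharding.

```python
# Miniature map/reduce over path explorations: map each source entity to
# (path_type, source, target) records, then reduce by counting path types.
# Graph and walk depth are made up for illustration.
from collections import Counter
from itertools import chain

graph = {
    "A": [("knows", "B")],
    "B": [("knows", "C"), ("works_at", "X")],
    "C": [("works_at", "X")],
}

def map_paths(source, max_len=2):
    """Emit (path_type, source, target) for all short paths from source."""
    frontier = [((), source)]
    for _ in range(max_len):
        nxt = []
        for path, node in frontier:
            for label, v in graph.get(node, []):
                nxt.append((path + (label,), v))
                yield (path + (label,), source, v)
        frontier = nxt

# "Map" over entities (parallel in the real system), then "reduce" by type.
records = chain.from_iterable(map_paths(s) for s in graph)
by_type = Counter(path for path, _, _ in records)
print(by_type)
```

In a real MapReduce job each `map_paths` call runs on a separate shard keyed by entity mention, and the Counter step becomes the reducer.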
SLIDES 103–107

Better inference and learning

  • Interactions between interpretation layers
      • entity knowledge helps parsing
  • Explicit sparsity constraints
      • most terms have just a few interpretations
  • Learn from mistakes
      • predict Wikipedia/Freebase edits
  • Better generalization through latent features
      • unreliable context evidence for rare entities