SLIDE 1
Low-Pass Semantics
Fernando Pereira
with William Cohen*, Rahul Gupta, Ni Lao*, Slav Petrov, Michael Ringgaard, and Amar Subramanya
*CMU
SLIDE 2–3
An “easy” query… (a frequent query)
SLIDE 4–9
negation, class, synonyms
SLIDE 10–17
tylenol ⇒ painkiller
upset stomach ⇒ adverse effect (on stomach)
tylenol ⇒ no adverse effects (on stomach)
tylenol ⇒ no adverse effects (on stomach) ⇒ no upset stomach
SLIDE 18–29
From search to inference
[Figure, built up one element per slide: concept nodes painkiller, upset stomach, tylenol, adverse effects; edge labels is-a, a-kind-of, not (×2), arg (×2)]
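The inference these slides walk through (tylenol is-a painkiller; upset stomach a-kind-of adverse effect; tylenol ⇒ no adverse effects, hence no upset stomach) can be sketched as traversal over a tiny typed edge list. A minimal, illustrative sketch only; edge labels follow the slides, the traversal logic is not the production system:

```python
# Toy concept graph for the query "painkiller that does not upset the stomach".
# Edge labels (is-a, a-kind-of, not) follow the slides.

edges = [
    ("tylenol", "is-a", "painkiller"),
    ("upset stomach", "a-kind-of", "adverse effects"),
    ("tylenol", "not", "adverse effects"),
]

def targets(source, relation):
    """All targets reachable from `source` over one `relation` edge."""
    return {t for s, r, t in edges if s == source and r == relation}

def avoids(entity, condition):
    """True if the graph asserts entity => not(condition), either
    directly or via a class the condition is a-kind-of."""
    negated = targets(entity, "not")
    return condition in negated or bool(targets(condition, "a-kind-of") & negated)

answers = sorted(e for e, r, t in edges
                 if r == "is-a" and t == "painkiller" and avoids(e, "upset stomach"))
print(answers)  # ['tylenol']
```

The point of the sketch: the answer comes from following typed relations between concepts, not from matching the query string against documents.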
SLIDE 30
SLIDE 30–35
“Things, not strings”

From            To                  Requires
Term            Concept             Parsing, disambiguation, coreference
Term identity   Entailment          Concept relations
Co-occurrence   Syntactic relation  Document structure, parsing
Term index      Semantic index      Concept disambiguation, inference
SLIDE 36–38
SLIDE 39–44
Hypotheses
- Web understanding will arise from machine learning of relationships implicit in Web content and use
- Seed meaning: expert annotation, curation
- Most evidence is not explicitly annotated
- Web text and queries “in the wild”
- Similar contexts ⇒ similar meanings
SLIDE 45–49
Knowledge Bases?
- Curated, disambiguated “text”
- Disambiguation as “translation”
- Bootstrapping vocabulary
- Join keys
SLIDE 50–55
Layers of meaning
- Grammatical structure (parsing)
- Referring expression classification
- Within-document coreference
- Entity resolution to external KB
- Entity co-occurrences ⇒ relations
SLIDE 56–61
Putting it all together
Erik Wesner will talk about his research tomorrow. Though Wesner's parents are from Poland, he was born and raised in Raleigh, NC.
SLIDE 62–66
Linguistic analysis matters
Relation extraction quality:

Input                      examples  patterns  relations  known triples  (likely) correct triples  LAS
text                       580k      187       24         69k            67k
text + slow parse          580k      478       32         191k           198k                      86.49
text + slow parse + coref  935k      1023      38         281k           478k                      86.49
text + fast parse + coref  941k      1013      38         270k           454k                      82.72
SLIDE 67–72
Unifying idea: graph inference
- Nodes: objects of interest
- Edges: distributional, syntactic, semantic relationships
- Edge weights: relationship strength
- Curated seed subgraph
- Outputs: node interpretations, new edges
- Naturally (almost) parallel algorithms
SLIDE 73 Entities ⇒ relations
- N. Lao, A. Subramanya, F. Pereira, W. Cohen, EMNLP 2012
SLIDE 74–78
Joint KB+text inference
[Figure: dependency trees from a news corpus (“She wrote Jane Eyre”, “Charlotte was …”; nsubj, dobj edges), with coreference resolution linking mentions within documents, and entity resolution linking mentions (Charlotte, Jane Eyre) to Freebase nodes: Charlotte Bronte, Jane Eyre (Write), Patrick Brontë (HasFather?), Writer (Profession)]

Path ranking algorithm (PRA): Lao and Cohen, Machine Learning, 2010

    score(s, t) = Σ_{π∈B} θπ P(s → t; π)

- π ∈ B: path (edge-type) sequences
- P(s → t; π): random walk probabilities
- Weights θπ learned by logistic regression
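The PRA score can be sketched on a toy typed graph. Node names and the single weight below are hypothetical; the real system learns θπ by logistic regression over many path types:

```python
# PRA-style scoring: score(s, t) = sum over path types pi of
# theta_pi * P(s -> t; pi), where P(s -> t; pi) is the probability that
# a uniform random walk following the edge types in pi ends at t.

from collections import defaultdict

# node -> relation -> neighbors (toy graph)
adj = {
    "miles_davis":   {"mention": ["doc1"]},
    "doc1":          {"mention_inv": ["miles_davis", "john_coltrane"]},
    "john_coltrane": {"profession": ["musician"]},
}

def walk_probs(start, path):
    """End-node distribution of a random walk following `path` from `start`."""
    dist = {start: 1.0}
    for rel in path:
        nxt = defaultdict(float)
        for node, p in dist.items():
            nbrs = adj.get(node, {}).get(rel, [])
            for n in nbrs:
                nxt[n] += p / len(nbrs)  # uniform step over typed neighbors
        dist = dict(nxt)
    return dist

def score(s, t, weighted_paths):
    return sum(theta * walk_probs(s, path).get(t, 0.0)
               for path, theta in weighted_paths)

# one hypothetical learned path type: mention, mention-inverse, profession
paths = [(("mention", "mention_inv", "profession"), 1.0)]
print(score("miles_davis", "musician", paths))  # 0.5
```

The walk halves its mass at doc1 (two coreferent mentions), so co-occurrence with John Coltrane contributes probability 0.5 toward Musician.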
SLIDE 79–87
Case study: extending Freebase
- Freebase: 21M concepts, 70M edges
- 60M Web pages mention Freebase concepts relevant to this study
- Study relations: profession, nationality, parent
- Simplified entity resolution: most likely concept for named mentions in coref cluster
- Profession stats:
  - 2M people in Freebase
  - 0.3M have a recorded profession
  - Biased data (0.24M politicians, actors)
SLIDE 88–92
Selecting training data
- Candidate paths π from s to t with |π| ≤ 4
- Positive: r(s,t), downsample for popular s,t
- Negative: sample t′ such that ¬r(s,t′)

Task         Training Set  Test Set
Profession   22,829        15,219
Nationality  14,431        9,620
Parents      21,232        14,155
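The positive/negative construction can be sketched as follows. Toy triples and a hypothetical target list; the downsampling of popular s,t is omitted for brevity:

```python
# Positives: known r(s, t) triples. Negatives: for each s, sample a t'
# for which r(s, t') is NOT a known triple.

import random

known = [("bronte", "writer"), ("davis", "musician"), ("coltrane", "musician")]
all_targets = ["writer", "musician", "politician", "actor"]

def make_examples(triples, n_neg=1, seed=0):
    rng = random.Random(seed)
    pos = [(s, t, 1) for s, t in triples]
    neg = []
    for s, _ in triples:
        # candidate negatives: targets not asserted for this source
        candidates = [t2 for t2 in all_targets if (s, t2) not in triples]
        for t2 in rng.sample(candidates, min(n_neg, len(candidates))):
            neg.append((s, t2, 0))
    return pos + neg

examples = make_examples(known)
```

Note the closed-world assumption: an unasserted triple is treated as a (likely) negative, which is noisy but workable at this scale.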
SLIDE 93–97
A learned path for profession
[Figure, built up across slides: Miles Davis linked through co-occurring mentions via edges M and M-1 to John Coltrane, whose Profession is Musician]
SLIDE 98–101
Relation extraction results

    MRR = (1/|Q|) Σ_{q∈Q} 1 / (rank of q’s first correct answer)

Known triples:

Task         KB     Text   KB+Text  KB+Text[b]
Profession   0.532  0.516  0.583    0.453
Nationality  0.734  0.729  0.812    0.693
Parents      0.329  0.332  0.392    0.319

Human evaluation:

Task         p@100  p@1k   p@10k
Profession   0.97   0.92   0.84
Nationality  0.98   0.97   0.90
Parents      0.86   0.81   0.79

Coverage:

Profession triples  People
1,000               970
10,000              8,726
100,000             79,885
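The MRR metric is straightforward to compute; a sketch with toy rankings (hypothetical data, not the evaluation sets above):

```python
# MRR as defined on the slide: mean over queries of
# 1 / rank of the first correct answer.

def mrr(ranked_answers, correct):
    """ranked_answers: query -> list of answers, best first;
    correct: query -> set of correct answers."""
    total = 0.0
    for q, ranking in ranked_answers.items():
        for rank, ans in enumerate(ranking, start=1):
            if ans in correct[q]:
                total += 1.0 / rank
                break  # only the FIRST correct answer counts
    return total / len(ranked_answers)

ranked = {"q1": ["a", "b", "c"], "q2": ["x", "y"]}
gold = {"q1": {"b"}, "q2": {"x"}}
print(mrr(ranked, gold))  # (1/2 + 1/1) / 2 = 0.75
```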
SLIDE 102 Scaling up
- Index text graph by entity mentions
- Explore short paths in parallel
- Aggregate path types with MapReduce
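The three steps above can be sketched with an in-memory stand-in for the MapReduce shuffle. Toy graph with hypothetical node names; a real run shards the mention-indexed graph by entity:

```python
# Enumerate short typed paths from each node ("map"), then count
# identical path types across the graph ("reduce").

from collections import Counter

adj = {  # node -> list of (relation, neighbor)
    "a": [("mention", "d"), ("is-a", "b")],
    "d": [("mention_inv", "c")],
    "c": [("profession", "p")],
}

def paths_from(node, max_len=3):
    """Yield (path_type, end_node) for every path of up to max_len edges."""
    frontier = [((), node)]
    for _ in range(max_len):
        nxt = []
        for path, n in frontier:
            for rel, m in adj.get(n, []):
                nxt.append((path + (rel,), m))
                yield path + (rel,), m
        frontier = nxt

# "map": one record per (source, path type); "reduce": count path types
counts = Counter(path for node in adj for path, _ in paths_from(node))
print(counts[("mention", "mention_inv", "profession")])  # 1
```

Grouping by path type is exactly the shuffle key in the MapReduce version; the per-type counts (or walk probabilities) become PRA features.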
SLIDE 103–107
Better inference and learning
- Interactions between interpretation layers
  - entity knowledge helps parsing
- Explicit sparsity constraints
  - most terms have just a few interpretations
- Learn from mistakes
  - predict Wikipedia/Freebase edits
- Better generalization through latent features
  - unreliable context evidence for rare entities