Learning Dependency-Based Compositional Semantics
Semantic Representations for Textual Inference Workshop, Mar. 10, 2012
Percy Liang (Google/Stanford)
Joint work with Michael Jordan and Dan Klein
Motivating Problem: Question Answering
What is the largest city in California?
What is the largest city in a state bordering California?
Semantic Interpretation
What is the largest city in a state bordering California?
Logical form (shown as "?" until revealed): argmax({c : city(c) ∧ ∃s. state(s) ∧ loc(c, s) ∧ border(s, CA)}, population)
Computation on the logical form ⇒ Phoenix
Supervision for Semantic Interpretation
Detailed Supervision (current)
- doesn’t scale up
- representation-dependent
What is the largest city in California? ⇒ (expert) argmax({c : city(c) ∧ loc(c, CA)}, population)
Natural Supervision (new)
- scales up
- representation-independent
What is the largest city in California? ⇒ (non-expert) Los Angeles
Outline
- Representation [Figure: example DCS tree]
- Learning [Figure: graphical model over x, θ, z, w, y]
- Experiments
Considerations
Computational: how to efficiently search an exponential space of logical forms (LF)?
What is the most populous city in California? ⇒ Los Angeles
Candidate logical forms:
λx.state(x)
λx.city(x)
λx.city(x) ∧ loc(x, CA)
λx.state(x) ∧ border(x, CA)
population(CA)
argmax(λx.city(x) ∧ loc(x, CA), λx.population(x))
· · ·
Statistical: how to parametrize the mapping from sentence to logical form?
What is the most populous city in California? ⇒ argmax(λx.city(x) ∧ loc(x, CA), λx.population(x))
Dependency-Based Compositional Semantics (DCS)
What is the most populous city in California?
[Figure: DCS tree with nodes city, loc, CA, population, argmax and edge labels 1/1, 2/1, c] ⇒ Los Angeles
Advantages of DCS: nice computational, statistical, linguistic properties
Where do the answers come from?
What is the most populous city in California?
[Figure: DCS tree with nodes city, loc, CA, population, argmax] + Database ⇒ Los Angeles
Database
city: San Francisco, Chicago, Boston, · · ·
state: Alabama, Alaska, Arizona, · · ·
loc: (Mount Shasta, California), (San Francisco, California), (Boston, Massachusetts), · · ·
border: (Washington, Oregon), (Washington, Idaho), (Oregon, Washington), · · ·
Basic DCS Trees
DCS tree: city --(1,1)--> loc --(2,1)--> CA
Constraints:
c ∈ city
c₁ = ℓ₁
ℓ ∈ loc
ℓ₂ = s₁
s ∈ CA
Database:
city: San Francisco, Chicago, Boston, · · ·
loc: (Mount Shasta, California), (San Francisco, California), (Boston, Massachusetts), · · ·
CA: California
A DCS tree encodes a constraint satisfaction problem (CSP)
Computation: dynamic programming ⇒ time = O(# nodes)
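The CSP reading of the tree city --(1,1)--> loc --(2,1)--> CA can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Node` class, the `denotation` function, and the toy `DATABASE` (built only from the rows visible on the slide) are assumptions made for the example. Each node's sub-problem is solved once, bottom-up, which is the dynamic-programming idea behind the O(# nodes) claim.

```python
# Toy database: each predicate is a set of tuples (unary predicates use 1-tuples).
# Only the rows shown on the slide are included.
DATABASE = {
    "city": {("San Francisco",), ("Chicago",), ("Boston",)},
    "loc": {
        ("Mount Shasta", "California"),
        ("San Francisco", "California"),
        ("Boston", "Massachusetts"),
    },
    "CA": {("California",)},
}


class Node:
    """A DCS tree node (hypothetical helper): a predicate plus child edges.

    Each edge is (j, j_child, child): component j of this node's tuple must
    equal component j_child of the child's tuple (1-indexed, as on the slide).
    """

    def __init__(self, predicate, edges=()):
        self.predicate = predicate
        self.edges = list(edges)


def denotation(node, database):
    """Return the tuples of node.predicate consistent with the whole subtree."""
    result = set(database[node.predicate])
    for j, j_child, child in node.edges:
        # Solve the child's sub-problem once, then keep only matching tuples.
        allowed = {t[j_child - 1] for t in denotation(child, database)}
        result = {t for t in result if t[j - 1] in allowed}
    return result


# city --(1,1)--> loc --(2,1)--> CA, i.e. c1 = l1 and l2 = s1.
tree = Node("city", [(1, 1, Node("loc", [(2, 1, Node("CA"))]))])
cities_in_ca = denotation(tree, DATABASE)
```

On this toy database the only city that satisfies every constraint is San Francisco; Mount Shasta is located in California but is not in the city table.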
Properties of DCS Trees
[Figure: DCS tree with nodes CA, border, state, loc, major, AZ, traverse, river, traverse, city]
Trees link two properties:
- Linguistics: syntactic locality
- Computation: efficient interpretation
Divergence between Syntactic and Semantic Scope
most populous city in California
Syntax: [Figure: dependency tree over most, populous, city, in, California]
Semantics: argmax(λx.city(x) ∧ loc(x, CA), λx.population(x))
Problem: syntactic scope is lower than semantic scope
If DCS trees look like syntax, how do we get correct semantics?
Solution: Mark-Execute
Mark at syntactic scope, execute at semantic scope
- Superlatives: most populous city in California [Figure: DCS tree with nodes city, loc, CA, population, argmax; mark label c; execute label x1]
- Negation: Alaska borders no states. [Figure: DCS tree with nodes border, AK, state, no; mark label q; execute label x1]
- Quantification (narrow): Some river traverses every city. [Figure: DCS tree with nodes traverse, river, some, city, every; mark labels q; execute label x12]
- Quantification (wide): Some river traverses every city. [Figure: same tree; execute label x21]
Analogy: Montague’s quantifying in, Carpenter’s scoping constructor
Outline
- Representation [Figure: example DCS tree]
- Learning [Figure: graphical model over x, θ, z, w, y]
- Experiments
Graphical Model
x: capital of California?
θ: parameters
z: [Figure: DCS tree with nodes capital, CA]
w: database
y: Sacramento
Semantic Parsing: p(z | x, θ) (probabilistic)
Interpretation: p(y | z, w) (deterministic)
Plan
[Figure: graphical model; x (capital of California?) and parameters θ give z (DCS tree with nodes capital, CA); z and database w give y (Sacramento)]
- What’s possible? z ∈ Z(x)
- What’s probable? p(z | x, θ)
- Learning θ from (x, y) data
Words to Predicates (Lexical Semantics)
What is the most populous city in CA ?
[Figure: candidate predicates drawn above the words: argmax, city, state, river, population, CA]
Lexical Triggers:
1. String match: CA ⇒ CA
2. Function words (20 words): most ⇒ argmax
3. Nouns/adjectives: city ⇒ city, state, river, population
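The three lexical triggers can be sketched as a lookup from words to candidate predicates. This is a hedged illustration only: the function name `candidate_predicates`, the tiny word lists, and the hard-coded noun/adjective set (standing in for a part-of-speech tagger) are assumptions, and only one of the 20 function words mentioned on the slide is included.

```python
# Rule 1: strings that match a database constant map to that constant.
DATABASE_CONSTANTS = {"CA"}

# Rule 2: function words map to fixed predicates (only "most" is shown here).
FUNCTION_WORDS = {"most": ["argmax"]}

# Rule 3: any noun or adjective may trigger any of these predicates.
NOUN_ADJ_PREDICATES = ["city", "state", "river", "population"]

# Stand-in for a part-of-speech tagger (assumption for this sketch).
NOUNS_AND_ADJECTIVES = {"populous", "city"}


def candidate_predicates(word):
    """Return the predicates a single word can trigger, possibly none."""
    if word in DATABASE_CONSTANTS:
        return [word]
    if word in FUNCTION_WORDS:
        return list(FUNCTION_WORDS[word])
    if word in NOUNS_AND_ADJECTIVES:
        return list(NOUN_ADJ_PREDICATES)
    return []


question = "What is the most populous city in CA ?".split()
triggers = {word: candidate_predicates(word) for word in question}
```

Words such as "What" or "in" trigger nothing here; the deliberately loose rule 3 leaves it to learning to decide which predicate a noun or adjective actually denotes.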
Predicates to DCS Trees (Compositional Semantics)
C_{i,j} = set of DCS trees for span [i, j]
most populous city in California: split span [i, j] at k and combine trees from C_{i,k} and C_{k,j}
[Figure: a subtree over argmax, population (mark label c) and a subtree city --(1,1)--> loc --(2,1)--> CA are combined into several candidate trees: joined by a 1/1 edge, joined by a 1/2 edge, joined through an extra predicate (loc or border), or joined with a different root]
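The recursion over spans can be sketched as a CKY-style chart. This is a simplified sketch under stated assumptions, not the construction used in the talk: trees are plain nested tuples `(predicate, children)`, the helper names `combine` and `build_chart` are hypothetical, only three edge labels are tried, and the insertion of extra predicates and mark/execute labels shown on the slide is left out.

```python
from itertools import product


def combine(left, right, relations=((1, 1), (1, 2), (2, 1))):
    """Yield trees formed by attaching one subtree under the other's root.

    A tree is (predicate, children); children is a tuple of (relation, subtree).
    """
    for rel in relations:
        yield (left[0], left[1] + ((rel, right),))   # right becomes a child of left
        yield (right[0], right[1] + ((rel, left),))  # left becomes a child of right


def build_chart(predicates_per_word):
    """chart[i, j] holds the candidate trees for the span [i, j)."""
    n = len(predicates_per_word)
    chart = {}
    for i in range(n):
        # Single-word spans: one leaf tree per triggered predicate.
        chart[i, i + 1] = [(p, ()) for p in predicates_per_word[i]]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            trees = []
            for k in range(i + 1, j):
                left_trees, right_trees = chart[i, k], chart[k, j]
                if not left_trees:
                    # Words that trigger no predicate are skipped.
                    trees.extend(right_trees)
                elif not right_trees:
                    trees.extend(left_trees)
                else:
                    for a, b in product(left_trees, right_trees):
                        trees.extend(combine(a, b))
            chart[i, j] = list(dict.fromkeys(trees))  # drop duplicates, keep order
    return chart


# "city in California", with one predicate per word for brevity.
chart = build_chart([["city"], ["loc"], ["CA"]])
target = ("city", (((1, 1), ("loc", (((2, 1), ("CA", ())),))),))
```

Even in this reduced form the candidate set grows quickly with span length, which is why the later slides score candidates and keep only a k-best list.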
Plan
- What’s possible? z ∈ Z(x)
- What’s probable? p(z | x, θ)
- Learning θ from (x, y) data
Log-linear Model
x: city in California
z: city --(1,1)--> loc --(2,1)--> CA
features(x, z) = ( [word "in" with predicate loc] : 1, [edge city --(1,1)--> loc] : 1, · · · ) ∈ R^d
score(x, z) = features(x, z) · θ
p(z | x, θ) = exp(score(x, z)) / Σ_{z′ ∈ Z(x)} exp(score(x, z′))
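The score and the normalized probability can be sketched directly from the two formulas. The feature names, the two candidate trees, and the weight value below are made up for illustration; `score` and `parse_distribution` are hypothetical helper names, not part of any released code.

```python
import math


def score(features, theta):
    """score(x, z) = features(x, z) · θ, with sparse dict-based vectors."""
    return sum(value * theta.get(name, 0.0) for name, value in features.items())


def parse_distribution(candidates, theta):
    """Return p(z | x, θ) for every candidate tree z in Z(x).

    candidates maps a tree name to its feature dict.
    """
    scores = {z: score(f, theta) for z, f in candidates.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {z: math.exp(s - top) for z, s in scores.items()}
    total = sum(weights.values())
    return {z: w / total for z, w in weights.items()}


# Two illustrative candidates for "city in California" (feature names are assumptions).
candidates = {
    "city-loc-CA": {"word=in,pred=loc": 1.0, "edge=city-1/1-loc": 1.0},
    "city-border-CA": {"word=in,pred=border": 1.0, "edge=city-1/1-border": 1.0},
}
theta = {"word=in,pred=loc": 1.0}  # a single nonzero weight, chosen by hand
probs = parse_distribution(candidates, theta)
```

With all weights at zero the two candidates are equally likely; the single positive weight on the "in"/loc feature shifts probability toward the loc tree.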
Plan
- What’s possible? z ∈ Z(x)
- What’s probable? p(z | x, θ)
- Learning θ from (x, y) data
Learning
Objective Function:
max_θ Σ_z p(y | z, w) p(z | x, θ)
where p(y | z, w) is interpretation and p(z | x, θ) is semantic parsing
EM-like Algorithm:
- Start with parameters θ = (0, 0, . . . , 0)
- Enumerate/score DCS trees under θ to get a k-best list: tree1 tree2 tree3 tree4 tree5
- Numerical optimization (L-BFGS) on the k-best list gives new parameters θ = (0.2, −1.3, . . . , 0.7)
- Enumerate/score again: tree3 tree8 tree6 tree2 tree4
- Optimize again: θ = (0.3, −1.4, . . . , 0.6)
- Enumerate/score again: tree3 tree8 tree2 tree4 tree9
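The optimization step can be sketched on a fixed candidate list. This is a toy sketch under several assumptions: plain gradient ascent is swapped in for L-BFGS, the k-best list is given rather than re-enumerated each round, p(y | z, w) is treated as a 0/1 flag per candidate, and the function names and data are hypothetical. The gradient of log Σ_z p(y | z, w) p(z | x, θ) is the expected feature vector under the correct trees minus the expected feature vector under all trees.

```python
import math


def parse_probs(candidates, theta):
    """p(z | x, θ) over a fixed candidate list, via a stable softmax."""
    scores = [sum(f * t for f, t in zip(phi, theta)) for phi, _ in candidates]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def correct_mass(candidates, theta):
    """Σ_z p(y | z, w) p(z | x, θ), where p(y | z, w) is 0 or 1."""
    probs = parse_probs(candidates, theta)
    return sum(p for p, (_, ok) in zip(probs, candidates) if ok)


def train(examples, dim, iterations=50, step=0.5):
    """Gradient ascent on the sum of log correct_mass over examples."""
    theta = [0.0] * dim
    for _ in range(iterations):
        grad = [0.0] * dim
        for candidates in examples:
            probs = parse_probs(candidates, theta)
            mass = sum(p for p, (_, ok) in zip(probs, candidates) if ok)
            if mass == 0.0:
                continue  # no correct tree on the list: skip this example
            for p, (phi, ok) in zip(probs, candidates):
                posterior = p / mass if ok else 0.0
                for d in range(dim):
                    grad[d] += (posterior - p) * phi[d]
        theta = [t + step * g for t, g in zip(theta, grad)]
    return theta


# One example with two candidate trees: (feature vector, yields the right answer?).
examples = [[([1.0, 0.0], True), ([0.0, 1.0], False)]]
theta = train(examples, dim=2)
```

After training, most of the probability mass sits on the candidate that produces the right answer, even though no logical form was ever given as a label.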
Outline
- Representation [Figure: example DCS tree]
- Learning [Figure: graphical model over x, θ, z, w, y]
- Experiments
US Geography Benchmark
Standard semantic parsing benchmark since 1990s
600 training examples, 280 test examples
Supervision in past work: question + program
What is the highest point in Florida? ⇒ answer(A,highest(A,(place(A),loc(A,B),const(B,stateid(florida)))))
How many states have a city called Rochester? ⇒ answer(A,count(B,(state(B),loc(C,B),const(C,cityid(rochester,_))),A))
What is the longest river that runs through a state that borders Tennessee? ⇒ answer(A,longest(A,(river(A),traverse(A,B),state(B),next_to(B,C),const(C,stateid(tennessee)))))
Of the states washed by the Mississippi river which has the lowest point? ⇒ answer(A,lowest(B,(state(A),traverse(C,A),const(C,riverid(mississippi)),loc(B,A),place(B))))
· · ·
Supervision in this work: question + answer
What is the highest point in Florida? ⇒ Walton County
How many states have a city called Rochester? ⇒ 2
What is the longest river that runs through a state that borders Tennessee? ⇒ Missouri
Of the states washed by the Mississippi river which has the lowest point? ⇒ Louisiana
· · ·
Input to Learning Algorithm
Training data (600 examples):
What is the highest point in Florida? ⇒ Walton County
How many states have a city called Rochester? ⇒ 2
What is the longest river that runs through a state that borders Tennessee? ⇒ Missouri
Of the states washed by the Mississippi river which has the lowest point? ⇒ Louisiana
· · ·
Lexicon (75 words):
city ⇒ city
state ⇒ state
mountain ⇒ mountain, peak
· · ·
Database:
city: San Francisco, Chicago, Boston, · · ·
state: Alabama, Alaska, Arizona, · · ·
loc: (Mount Shasta, California), (San Francisco, California), (Boston, Massachusetts), · · ·
border: (Washington, Oregon), (Washington, Idaho), (Oregon, Washington), · · ·
Experiment 1
On Geo, 250 training examples, 250 test examples
System | Description | Test accuracy
cgcr10 | FunQL [Clarke et al., 2010] | 73.2%
dcs | our system | 78.9%
dcs+ | our system | 87.2%
[Table columns "Lexicon (gen./spec.)" and "Logical forms": entries not recoverable]
Experiment 2
On Geo, 600 training examples, 280 test examples
System | Description | Test accuracy
zc05 | CCG [Zettlemoyer & Collins, 2005] | 79.3%
zc07 | relaxed CCG [Zettlemoyer & Collins, 2007] | 86.1%
kzgs10 | CCG w/unification [Kwiatkowski et al., 2010] | 88.9%
dcs | our system | 88.6%
dcs+ | our system | 91.1%
[Table columns "Lexicon" and "Logical forms": entries not recoverable]
Some Intuition on Learning
Loop: parameters θ ⇒ (1) search DCS trees (hard!) ⇒ k-best lists ⇒ (2) numerical optimization ⇒ parameters θ
If no DCS tree on k-best list is correct, skip example in (2)
[Plot: % examples trained on (y-axis, 20 to 100) against iteration (x-axis, 1 to 4)]
Effect: automatic curriculum learning, learning improves search
Current Limitations
- Only using forward information: execute program to get answer, but want to invert
- Non-identifiability of program: if all cities in database are in US, then can’t distinguish {c : city(c)} and {c : city(c) ∧ loc(c, US)}
- Unknown facts: How far is Los Angeles from Boston? Database has no distance information
- Unknown concepts: What states are landlocked? Need to induce database view for landlocked(x) = ¬border(x, ocean)
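The non-identifiability point can be checked on a toy database. The three cities and the `loc` rows below are assumptions chosen so that every city is in the US, matching the premise of the bullet; they are not the benchmark's data.

```python
# Toy database in which every city is located in the US (assumption).
city = {"San Francisco", "Chicago", "Boston"}
loc = {(c, "US") for c in city}

# {c : city(c)}
all_cities = {c for c in city}

# {c : city(c) ∧ loc(c, US)}
us_cities = {c for c in city if (c, "US") in loc}

# Both programs have the same denotation, so the answer alone cannot
# tell the learner which of the two was intended.
same_denotation = all_cities == us_cities
```

Because supervision comes only from answers, any two programs with equal denotations on the training database receive the same credit.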
Conclusion
Goal: learn to answer questions from question/answer pairs
Empirical result: DCS (no logical forms) ≈ existing systems (with logical forms)
Conceptual contribution: DCS trees
- Trees: connect dependency syntax with efficient evaluation
- Mark-Execute: unifying framework for handling scope
Thank you [Figure: drawn as a DCS tree, thank --(2,1)--> you]