Dynamic Programming for Linear-Time Incremental Parsing
Liang Huang
Information Sciences Institute University of Southern California
(Joint work with Kenji Sagae, USC/ICT)
JHU CLSP Seminar September 14, 2010
Remembering Fred Jelinek (1932-2010)
He was very supportive of this work, which is related to his work on structured language models, and I dedicate my work to his memory.
DP for Incremental Parsing
One morning in Africa, I shot an elephant in my pajamas; how he got into my pajamas I’ll never know.
CS 562 - Intro
Google translate: carefully slide
Google translate: Once the theft to the police
clear evidence that NLP is used in real life!
I feed cats nearby in the garden ...
incremental parsing (e.g. shift-reduce) (Nivre 04; Collins/Roark 04; ...): greedy search, fast (linear-time)
full DP (e.g. CKY) (Eisner 96; Collins 99; ...): principled search, slow (cubic-time)
this work: fast shift-reduce parsing with dynamic programming
human, natural languages: psycholinguistics
computer, programming languages: compiler theory (LR, LALR, ...)
[Figure: parsing time (secs, 0.2 to 1.4) vs. sentence length (0 to 70) for Charniak, Berkeley, MST, and this work. Inset: log-scale plot (10^0 to 10^10) vs. sentence length (0 to 70); DP: exponential, compared with non-DP beam search]
I feed cats nearby in the garden.

step  action    stack                 queue
0                                     I feed cats ...
1     shift     I                     feed cats nearby ...
2     shift     I feed                cats nearby in ...
3     l-reduce  feed(I)               cats nearby in ...
4     shift     feed(I) cats          nearby in the ...
5a    r-reduce  feed(I, cats)         nearby in the ...
5b    shift     feed(I) cats nearby   in the garden ...

(head(dependents) stands in for the trees drawn on the slide)
shift-reduce conflict: 5a r-reduce vs. 5b shift
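The walkthrough above can be replayed with a few lines of Python. This is only a minimal sketch: the names `Node`, `step`, and `run` are made up for illustration, and it assumes that l-reduce attaches s1 as a left dependent of s0 and r-reduce attaches s0 as a right dependent of s1, which is what the stack contents on the slides suggest.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A partial dependency tree: head word plus its dependents (hypothetical type)."""
    word: str
    left: list = field(default_factory=list)
    right: list = field(default_factory=list)

    def __str__(self):
        deps = [str(d) for d in self.left + self.right]
        return f"{self.word}({', '.join(deps)})" if deps else self.word


def step(stack, queue, action):
    """Apply one action and return a new (stack, queue) pair."""
    if action == "shift":
        return stack + [Node(queue[0])], queue[1:]
    s1, s0 = stack[-2], stack[-1]
    if action == "l-reduce":    # assumed: s1 becomes a left dependent of s0
        head = Node(s0.word, [s1] + s0.left, s0.right)
    elif action == "r-reduce":  # assumed: s0 becomes a right dependent of s1
        head = Node(s1.word, s1.left, s1.right + [s0])
    else:
        raise ValueError(action)
    return stack[:-2] + [head], queue


def run(words, actions):
    """Replay an action sequence; return (action, stack, first 3 queue words) rows."""
    stack, queue = [], list(words)
    trace = []
    for action in actions:
        stack, queue = step(stack, queue, action)
        trace.append((action, " ".join(str(n) for n in stack), " ".join(queue[:3])))
    return trace


words = "I feed cats nearby in the garden".split()
for row in run(words, ["shift", "shift", "l-reduce", "shift", "r-reduce"]):
    print(row)
```

Replacing the final `"r-reduce"` with `"shift"` gives the other branch of the shift-reduce conflict.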
... s2 s1 s0 | q0 q1 ... (← stack | queue →)
stack: ... feed (left child: I), cats (right child: nearby)
queue: in the garden ...
features: (s0.w, s0.rc, q0, ...) = (cats, nearby, in, ...)
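As a rough illustration of the feature tuple above, the sketch below reads s0.w, s0.rc, and q0 off a parser state. The `Tree` type and the `features` function are hypothetical, and it assumes `rc` means the rightmost child of a tree.

```python
from collections import namedtuple

# Hypothetical tree type: head word plus tuples of left and right dependents.
Tree = namedtuple("Tree", "word left right")


def features(stack, queue):
    """Return (s0.w, s0.rc, q0); missing items are reported as None."""
    s0 = stack[-1] if stack else None
    return (
        s0.word if s0 else None,                         # s0.w: root word of the top tree
        s0.right[-1].word if s0 and s0.right else None,  # s0.rc: assumed rightmost child
        queue[0] if queue else None,                     # q0: first word in the queue
    )


i_tree = Tree("I", (), ())
nearby = Tree("nearby", (), ())
stack = [Tree("feed", (i_tree,), ()), Tree("cats", (), (nearby,))]
queue = ["in", "the", "garden"]
print(features(stack, queue))  # ('cats', 'nearby', 'in')
```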
“graph-structured stack” (Tomita, 1988)
each DP state corresponds to exponentially many non-DP states
[Figure: log-scale plot (10^0 to 10^10) vs. sentence length (0 to 70); DP: exponential, compared with non-DP beam search]
... s2 s1 s0 | q0 q1 ... (← stack | queue →)
[Diagram: states reached by shift (sh) and reduce (re) actions, with stacks ending in "... feed" and "... cats"]
assume features only look at root of s0
two states are equivalent if they agree on root of s0
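The equivalence rule above amounts to grouping states by a signature. A minimal sketch follows. The `signature` and `pack` names are invented for illustration, and the queue position is included in the signature as an extra assumption, so that states which have consumed different numbers of words are never merged.

```python
from collections import defaultdict


def signature(state):
    """What the assumed features can see: root of s0, plus the queue position."""
    stack, j = state
    s0_root = stack[-1][0] if stack else None
    return (s0_root, j)


def pack(states):
    """Group states that share a signature (equivalent states)."""
    groups = defaultdict(list)
    for state in states:
        groups[signature(state)].append(state)
    return dict(groups)


# Trees are (root, dependents) tuples; j counts the words consumed so far.
i_tree, feed, cats = ("I", ()), ("feed", ()), ("cats", ())
feed_i = ("feed", (i_tree,))
states = [
    ((i_tree, feed, cats), 3),             # sh sh sh
    ((feed_i, cats), 3),                   # sh sh re sh
    ((("feed", (i_tree, cats)),), 3),      # sh sh re sh re
]
packed = pack(states)
for sig, members in packed.items():
    print(sig, len(members))
```

The first two states differ below the top of the stack but agree on the root of s0, so they fall into the same group.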
[Diagram: shift (sh) and reduce (re) transitions between states "... feed", "... cats", "... nearby", and "... in"]
(local) ambiguity-packing!
graph-structured stack
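A graph-structured stack can be sketched as stack-top nodes that keep a set of predecessor pointers instead of a single stack beneath them. The code below is a toy illustration, not the implementation behind these slides. The `GssNode` and `Gss` classes are hypothetical, and keying nodes by (root, start, end) is an added assumption that keeps a reduced state distinct from the states it was built from.

```python
class GssNode:
    """A stack top; `preds` holds the alternative stack tops beneath it."""

    def __init__(self, root, start, end):
        self.root, self.start, self.end = root, start, end
        self.preds = set()


class Gss:
    def __init__(self):
        self.nodes = {}                       # (root, start, end) -> shared node
        self.bottom = self.node(None, 0, 0)   # empty stack

    def node(self, root, start, end):
        key = (root, start, end)
        if key not in self.nodes:
            self.nodes[key] = GssNode(root, start, end)
        return self.nodes[key]

    def shift(self, state, word):
        new = self.node(word, state.end, state.end + 1)
        new.preds.add(state)   # same top reached another way: just add a pointer
        return new

    def reduce(self, state, head_is_s0):
        """Combine s0 (= state) with every possible s1 beneath it."""
        results = []
        for s1 in state.preds:
            if s1.root is None:               # nothing below s0 to combine with
                continue
            head = state.root if head_is_s0 else s1.root
            new = self.node(head, s1.start, state.end)
            new.preds |= s1.preds             # inherit what lay beneath s1
            results.append(new)
        return results


def count_stacks(node):
    """Number of distinct stacks packed under this stack top."""
    if not node.preds:
        return 1
    return sum(count_stacks(p) for p in node.preds)


gss = Gss()
a = gss.shift(gss.bottom, "I")
b = gss.shift(a, "feed")
(c,) = gss.reduce(b, head_is_s0=True)   # feed takes I as a dependent
d1 = gss.shift(b, "cats")               # stack: I feed cats
d2 = gss.shift(c, "cats")               # stack: feed(I) cats
print(d1 is d2, count_stacks(d1))       # True 2
```

Both shifts of "cats" land on the same node, which now stands for two different stacks. A later reduce on that node yields one result per predecessor.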
away from stack top than from trees closer to top
annotating the parent (otherwise DP would fail)
parent grand-parent s1 s2 s0
stack
I don’t know anything about this paper...
graph-structured stack!
In: M. Johnson, S. Khudanpur, M. Ostendorf, and R. Rosenfeld (eds.): Mathematical Foundations of Speech and Language Processing, 2004
pSLM(a | has, show)
p3gram(a | its, host)
see also (Chelba and Jelinek, 98; 00; Xu, Chelba, Jelinek, 02)
[Figure: plot with time (hours) axis, DP vs. non-DP]
[Figure: dependency accuracy (92.2 to 93.1) vs. average model score (2365 to 2395), DP vs. non-DP]
[Figure: number of trees explored (log scale, 10^0 to 10^10) vs. sentence length (0 to 70); DP: exponential, non-DP: fixed (beam-width)]
[Figure: DP forest oracle (98.15), DP k-best in forest, and non-DP k-best in beam]
[Figure: parsing time (secs, 0.2 to 1.4) vs. sentence length (0 to 70) for Charniak, Berkeley, MST, and this work; curves annotated O(n^2), O(n), O(n^2.4), O(n^2.5)]
parser                       accuracy  time  complexity  trees searched
McDonald et al 05 - MST      90.2      0.12  O(n^2)      exponential
Koo et al 08 baseline*       92.0                        exponential
Zhang & Clark 08 single      91.4      0.11  O(n)        constant
this work                    92.1      0.04  O(n)        exponential
Charniak 00                  92.5      0.49  O(n^2.5)    exponential
Petrov & Klein 07            92.4      0.21  O(n^2.4)    exponential
*at this ACL: Koo & Collins 10: 93.0 with O(n^4)
                            word   non-root   root
Duan et al. 2007            83.9   84.4       73.7
Zhang & Clark 08 (single)   84.3   84.7       76.7
this work                   85.2   85.5       78.3
incremental parsing (e.g. shift-reduce): greedy search, fast (linear-time)
full dynamic programming (e.g. CKY): principled search, slow (cubic-time)
linear-time shift-reduce parsing w/ dynamic programming
human, natural languages: psycholinguistics
computer, natural languages: NLP
computer, programming languages: compiler theory
still a long way to go...
Liang Huang or his student Sagae.”
“As you can see, I am completely confused!” And he was right.