Probabilistic Parsing: Issues & Improvement
LING 571 — Deep Processing Techniques for NLP October 14, 2019 Shane Steinert-Threlkeld
Announcements
HW2 grades posted (mean 87)
Reference code available in
page for more references]
goes that-which is-a-coyote “The/a coyote goes”
is-a-coyote that-which goes “The one who goes is a coyote”
(S
  (NP (NNP Mr.) (NNP Vinken))
  (VP (VBZ is)
      (NP (NP (NN chairman))
          (PP (IN of)
              (NP (NP (NNP Elsevier) (NNP N.V.))
                  (, ,)
                  (NP (DT the) (NNP Dutch) (VBG publishing) (NN group))))))
  (. .))
Rule                    Count   Probability
S  → NP VP .            1       1/1 = 1
NP → NNP NNP            2       2/5 = 0.4
NP → NP PP              1       1/5 = 0.2
NP → NP , NP            1       1/5 = 0.2
NP → DT NNP VBG NN      1       1/5 = 0.2
VP → VBZ NP             1       1/1 = 1
PP → IN NP              1       1/1 = 1

Totals per left-hand side: S → * 1, NP → * 5, VP → * 1, PP → * 1
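The relative-frequency estimate used above (rule count divided by the total count for the same left-hand side) can be sketched in a few lines. This is a minimal illustration, not a treebank reader: the counts are typed in by hand exactly as they appear on the slide, and the names `RULE_COUNTS` and `mle_rule_probs` are made up for this example.

```python
from collections import defaultdict
from fractions import Fraction

# Rule counts copied from the slide (one tree, internal rules only).
RULE_COUNTS = {
    ("S", ("NP", "VP", ".")): 1,
    ("NP", ("NNP", "NNP")): 2,
    ("NP", ("NP", "PP")): 1,
    ("NP", ("NP", ",", "NP")): 1,
    ("NP", ("DT", "NNP", "VBG", "NN")): 1,
    ("VP", ("VBZ", "NP")): 1,
    ("PP", ("IN", "NP")): 1,
}


def mle_rule_probs(rule_counts):
    """Estimate P(lhs -> rhs) as Count(lhs -> rhs) / Count(lhs -> *)."""
    lhs_totals = defaultdict(int)
    for (lhs, _rhs), count in rule_counts.items():
        lhs_totals[lhs] += count
    return {
        rule: Fraction(count, lhs_totals[rule[0]])
        for rule, count in rule_counts.items()
    }


probs = mle_rule_probs(RULE_COUNTS)
for (lhs, rhs), p in probs.items():
    print(f"{lhs} -> {' '.join(rhs)}: {p} = {float(p):.1f}")
```

Using `Fraction` keeps the 2/5 and 1/5 values exact, so the probabilities for each left-hand side sum to exactly 1.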
Semantic Role of NPs in Switchboard Corpus
           Pronominal   Non-Pronominal
Subject    91%          9%
Object     34%          66%
VP attachment (“into a bin” = location of sacks after dumping): OK!
(S (NP (NNS workers))
   (VP (VBD dumped)
       (NP (NNS sacks))
       (PP (P into) (NP (DT a) (NN bin)))))

NP attachment (“into a bin” = *the sacks which were located in PP): not OK
(S (NP (NNS workers))
   (VP (VBD dumped)
       (NP (NNS sacks)
           (PP (P into) (NP (DT a) (NN bin))))))
(S (NP (NNS workers))
   (VP (VBD dumped)
       (NP (NNS sacks)
           (PP (P into) (NP (DT a) (NN bin))))))
(“into a bin” = *the sacks which were located in PP) not OK

(S (NP (NNS workers))
   (VP (VBD dumped)
       (NP (NNS sacks)
           (PP (P in) (NP (DT a) (NN bin))))))
(“in a bin” = location of sacks before dumping) OK!
(NP (NP (NP (Noun dogs))
        (PP (Prep in) (NP (Noun houses))))
    (Conj and)
    (NP (Noun cats)))

(NP (NP (Noun dogs))
    (PP (Prep in)
        (NP (NP (Noun houses))
            (Conj and)
            (NP (Noun cats)))))
Rules used by the first parse      Rules used by the second parse
NP → NP Conj NP                    NP → NP PP
NP → NP PP                         Noun → “dogs”
Noun → “dogs”                      PP → Prep NP
PP → Prep NP                       Prep → “in”
Prep → “in”                        NP → NP Conj NP
NP → Noun                          NP → Noun
Noun → “houses”                    Noun → “houses”
Conj → “and”                       Conj → “and”
NP → Noun                          NP → Noun
Noun → “cats”                      Noun → “cats”
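The two rule lists contain the same rules in a different order, so a plain PCFG, which scores a parse as the product of its rule probabilities, cannot prefer one coordination structure over the other. The sketch below checks this directly. The rule lists are the ones given above; the probabilities in `RULE_PROB` are invented purely for illustration and do not come from any treebank.

```python
from collections import Counter
from fractions import Fraction
from math import prod

# Rules used by each parse of "dogs in houses and cats", as listed above.
PARSE_A = [  # [[dogs in houses] and cats]
    "NP -> NP Conj NP", "NP -> NP PP", "Noun -> dogs", "PP -> Prep NP",
    "Prep -> in", "NP -> Noun", "Noun -> houses", "Conj -> and",
    "NP -> Noun", "Noun -> cats",
]
PARSE_B = [  # [dogs in [houses and cats]]
    "NP -> NP PP", "Noun -> dogs", "PP -> Prep NP", "Prep -> in",
    "NP -> NP Conj NP", "NP -> Noun", "Noun -> houses", "Conj -> and",
    "NP -> Noun", "Noun -> cats",
]

# Hypothetical rule probabilities (assumption: any values would do here).
RULE_PROB = {
    "NP -> NP Conj NP": Fraction(1, 5),
    "NP -> NP PP": Fraction(1, 5),
    "NP -> Noun": Fraction(3, 5),
    "PP -> Prep NP": Fraction(1),
    "Prep -> in": Fraction(1),
    "Conj -> and": Fraction(1),
    "Noun -> dogs": Fraction(1, 3),
    "Noun -> houses": Fraction(1, 3),
    "Noun -> cats": Fraction(1, 3),
}


def derivation_prob(rules):
    """PCFG score of a parse: the product of the probabilities of its rules."""
    return prod(RULE_PROB[rule] for rule in rules)


same_rules = Counter(PARSE_A) == Counter(PARSE_B)
print(same_rules, derivation_prob(PARSE_A) == derivation_prob(PARSE_B))
```

Because multiplication is commutative, the equality holds for any choice of rule probabilities, which is exactly why structural information has to be added to the grammar to break the tie.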
(S (NP (Pron I))
   (VP (V prefer)
       (NP (Det a)
           (Nom (Nom (NN flight))
                (PP (IN on) (NP (NNP TWA)))))))
(S (NP^S (Pron^NP I))
   (VP^S (V^VP prefer)
         (NP^VP (Det^NP a)
                (Nom^NP (Nom^Nom (NN^Nom flight))
                        (PP^Nom (IN^PP on) (NP^PP (NNP^NP TWA)))))))
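Parent annotation as shown in the tree above can be sketched as a short recursive function. This is an illustrative sketch under a few assumptions: trees are nested tuples `(label, child, ...)` with words as plain strings, the `^` separator is a notational choice, and the helper names `parent_annotate` and `to_brackets` are hypothetical.

```python
def parent_annotate(tree, parent=None):
    """Append ^parent to every nonterminal label except the root."""
    if isinstance(tree, str):  # a word: leave it unchanged
        return tree
    label, *children = tree
    new_label = f"{label}^{parent}" if parent else label
    return (new_label, *(parent_annotate(child, label) for child in children))


def to_brackets(tree):
    """Render a tuple tree in bracketed notation."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "(" + label + " " + " ".join(to_brackets(c) for c in children) + ")"


TREE = ("S",
        ("NP", ("Pron", "I")),
        ("VP", ("V", "prefer"),
               ("NP", ("Det", "a"),
                      ("Nom", ("Nom", ("NN", "flight")),
                              ("PP", ("IN", "on"),
                                     ("NP", ("NNP", "TWA")))))))

annotated = parent_annotate(TREE)
print(to_brackets(annotated))
```

Each child is annotated with its parent's original label rather than the already-annotated one, so labels carry one level of context instead of growing with depth.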
VP → VBD NP PP
VP(dumped) → VBD(dumped) NP(sacks) PP(into)
VP(dumped, VBD) → VBD(dumped, VBD) NP(sacks, NNS) PP(into, IN)
Internal Rules                                  Lexical Rules
TOP → S(prefer, V)                              Pron(I, Pron) → I
S(prefer, V) → NP(I, Pron) VP(prefer, V)        V(prefer, V) → prefer
NP(I, Pron) → Pron(I, Pron)                     Det(a, Det) → a
VP(prefer, V) → V(prefer, V) NP(flight, NN)     NN(flight, NN) → flight
NP(flight, NN) → Det(a, Det) Nom(flight, NN)    IN(on, IN) → on
PP(on, IN) → IN(on, IN) NP(TWA, NNP)            NNP(TWA, NNP) → TWA

(TOP
  (S[prefer, V]
    (NP[I, Pron] (Pron[I, Pron] I))
    (VP[prefer, V]
      (V[prefer, V] prefer)
      (NP[flight, NN]
        (Det[a, Det] a)
        (Nom[flight, NN]
          (Nom[flight, NN] (NN[flight, NN] flight))
          (PP[on, IN]
            (IN[on, IN] on)
            (NP[TWA, NNP] (NNP[TWA, NNP] TWA))))))))
(S[dumped, VBD]
  (NP[workers, NNS] (NNS[workers, NNS] workers))
  (VP[dumped, VBD]
    (VBD[dumped, VBD] dumped)
    (NP[sacks, NNS] (NNS[sacks, NNS] sacks))
    (PP[into, P]
      (P[into, P] into)
      (NP[bin, NN] (DT[a, DT] a) (NN[bin, NN] bin)))))
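A lexicalized tree like the one above can be produced by percolating head words and head tags up from the leaves. The sketch below assumes a tiny, hand-written head table (`HEAD_CHILD`) that covers only this sentence; real head-finding rules are considerably more detailed, and both the table and the function name `lexicalize` are hypothetical.

```python
# Hypothetical, heavily simplified head table: for each parent label, the
# child labels to try (in order) when looking for the head child.
HEAD_CHILD = {
    "S": ["VP"],
    "VP": ["VBD"],
    "NP": ["NN", "NNS", "NP"],
    "PP": ["P"],
}


def lexicalize(tree):
    """Return (annotated_tree, head_word, head_tag) for a (label, child, ...) tree."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        # Preterminal: the word is its own head and the label is the head tag.
        word = children[0]
        return (f"{label}[{word}, {label}]", word), word, label
    results = [lexicalize(child) for child in children]
    child_labels = [child[0] for child in children]
    head_index = len(children) - 1  # fall back to the rightmost child
    for candidate in HEAD_CHILD.get(label, []):
        if candidate in child_labels:
            head_index = child_labels.index(candidate)
            break
    _, head_word, head_tag = results[head_index]
    annotated = (f"{label}[{head_word}, {head_tag}]", *(r[0] for r in results))
    return annotated, head_word, head_tag


TREE = ("S",
        ("NP", ("NNS", "workers")),
        ("VP", ("VBD", "dumped"),
               ("NP", ("NNS", "sacks")),
               ("PP", ("P", "into"), ("NP", ("DT", "a"), ("NN", "bin")))))

lexicalized, head_word, head_tag = lexicalize(TREE)
print(lexicalized[0])
```

The head of each constituent is inherited from exactly one child, which is why the verb's head word and tag reach the S node while the PP keeps its preposition.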
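\[ P_R(\mathrm{VP} \to \mathrm{VBD\ NP} \mid \mathrm{VP}, \mathit{dumped}) = \frac{\mathrm{Count}(\mathrm{VP}(\mathit{dumped}) \to \mathrm{VBD\ NP})}{\sum_{\beta} \mathrm{Count}(\mathrm{VP}(\mathit{dumped}) \to \beta)} = \frac{1}{9} = 0.11 \]

\[ P_R(\mathit{into} \mid \mathrm{PP}, \mathit{sacks}) = \frac{\mathrm{Count}(X(\mathit{sacks}) \to \dots\ \mathrm{PP}(\mathit{into})\ \dots)}{\sum_{\beta} \mathrm{Count}(X(\mathit{sacks}) \to \dots\ \mathrm{PP}\ \dots)} = 0 \]

\[ P_R(\mathrm{VP} \to \mathrm{VBD\ NP\ PP} \mid \mathrm{VP}, \mathit{dumped}) = \frac{\mathrm{Count}(\mathrm{VP}(\mathit{dumped}) \to \mathrm{VBD\ NP\ PP})}{\sum_{\beta} \mathrm{Count}(\mathrm{VP}(\mathit{dumped}) \to \beta)} = \frac{6}{9} = 0.67 \]

\[ P_R(\mathit{into} \mid \mathrm{PP}, \mathit{dumped}) = \frac{\mathrm{Count}(X(\mathit{dumped}) \to \dots\ \mathrm{PP}(\mathit{into})\ \dots)}{\sum_{\beta} \mathrm{Count}(X(\mathit{dumped}) \to \dots\ \mathrm{PP}\ \dots)} = \frac{2}{9} = 0.22 \]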
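To see how these estimates decide the attachment, the sketch below multiplies, for each analysis, only the two factors given above. This is a deliberate simplification: a full lexicalized model would also include every other rule and head probability in the tree, and the variable names are made up for this example.

```python
from fractions import Fraction

# Relative-frequency estimates taken from the slide (count / total).
# VP attachment: VP(dumped) -> VBD NP PP, with "into" heading a PP under "dumped".
p_rule_vp_attach = Fraction(6, 9)  # P_R(VP -> VBD NP PP | VP, dumped)
p_head_vp_attach = Fraction(2, 9)  # P_R(into | PP, dumped)

# NP attachment: VP(dumped) -> VBD NP, with "into" heading a PP under "sacks".
p_rule_np_attach = Fraction(1, 9)  # P_R(VP -> VBD NP | VP, dumped)
p_head_np_attach = Fraction(0)     # P_R(into | PP, sacks)

vp_score = p_rule_vp_attach * p_head_vp_attach
np_score = p_rule_np_attach * p_head_np_attach

print(f"VP attachment: {float(vp_score):.3f}")
print(f"NP attachment: {float(np_score):.3f}")
print("preferred:", "VP attachment" if vp_score > np_score else "NP attachment")
```

With these numbers the zero estimate for "into" under "sacks" rules out NP attachment entirely, which also hints at why smoothing matters for sparse lexicalized counts.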
Original:                        (NP DT JJ NN NNS)
Right-factored:                  (NP DT (NP:JJ+NN+NNS JJ (NP:NN+NNS NN NNS)))
Right-factored, Markov order-2:  (NP DT (NP:JJ+NN JJ (NP:NN+NNS NN NNS)))
Right-factored, Markov order-1:  (NP DT (NP:JJ JJ (NP:NN NN NNS)))
Right-factored, Markov order-0:  (NP DT (NP JJ (NP NN NNS)))
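The factorizations above can be generated mechanically. The following is a minimal sketch that follows the labeling scheme visible in the trees (`parent:child+child+...`, truncated to the first k remaining children for Markov order k); the function name `right_factor` and its `markov_order` parameter are assumptions made for this example.

```python
def right_factor(parent, children, markov_order=None):
    """Binarize parent -> children by right-factoring.

    markov_order=None keeps all remaining children in each new label;
    an integer k keeps only the first k of them (k=0 reuses the parent label).
    """
    children = list(children)
    if len(children) <= 2:
        return [(parent, tuple(children))]
    rules = []
    lhs = parent
    for i in range(len(children) - 2):
        remaining = children[i + 1:]
        kept = remaining if markov_order is None else remaining[:markov_order]
        new_label = f"{parent}:{'+'.join(kept)}" if kept else parent
        rules.append((lhs, (children[i], new_label)))
        lhs = new_label
    # The last two children fit into a single binary rule.
    rules.append((lhs, (children[-2], children[-1])))
    return rules


for order in (None, 2, 1, 0):
    print(order, right_factor("NP", ["DT", "JJ", "NN", "NNS"], order))
```

Lower Markov orders map many long right-hand sides onto the same few intermediate labels, so fewer distinct nonterminals and rules are created as the order decreases.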
PCFG                                               Time (s)  Words/s  |V|     |P|     LR    LP    F1
Right-factored                                     4848      6.7      10105   23220   69.2  73.8  71.5
Right-factored, Markov order-2                     1302      24.9     2492    11659   68.8  73.8  71.3
Right-factored, Markov order-1                     445       72.7     564     6354    68.0  73.0  70.5
Right-factored, Markov order-0                     206       157.1    99      3803    61.2  65.5  63.3
Parent-annotated, Right-factored, Markov order-2   7510      4.3      5876    22444   76.2  78.3  77.2
(from Mohri & Roark 2006)
Baseline        0.897
Oracle          0.968
Discriminative  0.917
etc