Natural Language Processing
Parsing III
Dan Klein – UC Berkeley
Unsupervised Tagging
AKA part-of-speech induction
Task: raw sentences in, tagged sentences out
Obvious thing to do: start with a (mostly) uniform HMM and run EM
EM for tagging alternates between computing posteriors over the hidden structure (tags) and reestimating parameters. The E-step computes the expected count of each kind of transition and emission we have under current params.
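The E-step can be sketched with Forward-Backward. A minimal toy example; the `e_step` helper, the 2-tag HMM, and all probabilities below are illustrative, not from the lecture:

```python
# Hedged sketch of the E-step (Forward-Backward) for unsupervised HMM tagging.
# The 2-tag HMM and every number below are made up for illustration.
import numpy as np

def e_step(pi, A, B, obs):
    """pi: initial tag probs (K,), A: tag transitions (K, K),
    B: emissions (K, V), obs: list of word indices.
    Returns expected transition and emission counts for one sentence."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))                 # forward probabilities
    beta = np.zeros((T, K))                  # backward probabilities
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    Z = alpha[-1].sum()                      # sentence likelihood
    gamma = alpha * beta / Z                 # per-position tag posteriors
    trans = np.zeros((K, K))                 # expected transition counts
    for t in range(T - 1):
        trans += np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1]) * A / Z
    emit = np.zeros_like(B)                  # expected emission counts
    for t in range(T):
        emit[:, obs[t]] += gamma[t]
    return trans, emit

# Toy 2-tag model on a 3-word sentence (word ids 0, 1, 2)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
trans, emit = e_step(pi, A, B, [0, 1, 2])
```

The M-step would then normalize these expected counts, summed over all sentences, into new transition and emission probabilities. As a sanity check, `trans` sums to T - 1 and `emit` sums to T.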
EM maximizes the statistical fit of the grammar to the training data.
[Diagram: Sentence + Parse Tree; Parameters define a distribution over Derivations]
[Diagram: EM algorithm over a latent-annotated tree with nodes X1 X2 ... X7 for the sentence "He was right ."]
Forward and Backward passes over the tree, just like Forward-Backward for HMMs.
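For grammars, the analogue of the forward pass is the inside pass. A minimal sketch over a toy CNF grammar; the grammar and its probabilities are invented for illustration:

```python
# Minimal inside (CKY-sum) pass for a toy CNF PCFG, the tree analogue of the
# HMM forward pass. Grammar and probabilities are illustrative only.
from collections import defaultdict

rules = {            # binary rules: (parent, (left, right)) -> probability
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 1.0,
}
lex = {              # lexical rules: (tag, word) -> probability
    ("NP", "He"): 0.5, ("NP", "right"): 0.5,
    ("V", "was"): 1.0,
}

def inside(words):
    n = len(words)
    chart = defaultdict(float)       # chart[(i, j, X)] = inside score
    for i, w in enumerate(words):
        for (tag, word), p in lex.items():
            if word == w:
                chart[(i, i + 1, tag)] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, (l, r)), p in rules.items():
                    chart[(i, j, parent)] += (
                        p * chart[(i, k, l)] * chart[(k, j, r)])
    return chart[(0, n, "S")]

score = inside(["He", "was", "right"])   # total probability of the sentence
```

Pairing this with an outside pass gives expected rule counts for EM, exactly as alpha and beta combine in Forward-Backward.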
Refinement of the DT tag: DT splits into DT-1, DT-2, DT-3, DT-4.
Hierarchical refinement
[Plot: parsing accuracy (F1, axis 74–90) vs. total number of grammar symbols (100–1700)]

Model                   F1
Flat Training           87.3
Hierarchical Training   88.4
Adaptive splitting: split everything, then roll back the splits that were least useful.
Model               F1
Previous            88.4
With 50% Merging    89.5
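One way to identify the "least useful" splits is by the likelihood lost when a split is undone. A hedged sketch of that idea on toy emission counts; the real criterion evaluates merges in context at each occurrence in the treebank, and `merge_loss` plus all counts here are illustrative:

```python
# Sketch of a merge criterion: estimate the log-likelihood lost by pooling
# two split subcategories back into one. Toy counts, illustration only.
import math

def merge_loss(counts_a, counts_b):
    """counts_x[word] = expected count of word under subsymbol x.
    Returns log-likelihood lost by merging the two emission distributions."""
    def loglik(counts, probs):
        return sum(c * math.log(probs[w]) for w, c in counts.items())
    def normalize(counts):
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()}
    vocab = set(counts_a) | set(counts_b)
    merged = {w: counts_a.get(w, 0) + counts_b.get(w, 0) for w in vocab}
    split_ll = (loglik(counts_a, normalize(counts_a))
                + loglik(counts_b, normalize(counts_b)))
    merged_ll = loglik(merged, normalize(merged))
    return split_ll - merged_ll   # >= 0; small value => safe to merge

# Two DT-like splits with very different emissions: merging costs a lot...
loss_useful = merge_loss({"the": 90, "a": 10}, {"the": 10, "a": 90})
# ...while two near-identical splits are nearly free to merge.
loss_useless = merge_loss({"the": 50, "a": 50}, {"the": 52, "a": 48})
```

Ranking candidate merges by this loss and undoing the cheapest half is one way to realize the 50% merging in the table above.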
Number of Phrasal Subcategories
[Chart: number of subcategories (0–40) per phrasal category, ordered from most to fewest splits: NP, VP, PP, ADVP, S, ADJP, SBAR, QP, WHNP, PRN, NX, SINV, PRT, WHPP, SQ, CONJP, FRAG, NAC, UCP, WHADVP, INTJ, SBARQ, RRC, WHADJP, X, ROOT, LST]
Number of Lexical Subcategories
[Chart: number of subcategories (0–70) per POS tag, ordered from most to fewest splits: NNP, JJ, NNS, NN, VBN, RB, VBG, VB, VBD, CD, IN, VBZ, VBP, DT, NNPS, CC, JJR, JJS, :, PRP, PRP$, MD, RBR, WP, POS, PDT, WRB, ., EX, WP$, WDT, '', FW, RBS, TO, $, UH, ,, ``, SYM, RP, LS, #]
Learned subcategories (examples):
NNP-14   Oct. Nov. Sept.
NNP-12   John Robert James
NNP-2    J. E. L.
NNP-1    Bush Noriega Peters
NNP-15   New San Wall
NNP-3    York Francisco Street
PRP-0    It He I
PRP-1    it he they
PRP-2    it them him
RBR-0    further lower higher
RBR-1    more less More
RBR-2    earlier Earlier later
CD-7     two Three
CD-4     1989 1990 1988
CD-11    million billion trillion
CD-0     1 50 100
CD-3     1 30 31
CD-9     78 58 34
                                          ≤ 40 words F1   all F1
ENG   Charniak & Johnson '05 (generative)     90.1          89.6
      Split / Merge                           90.6          90.1
GER   Dubey '05                               76.3
      Split / Merge                           80.8          80.1
CHN   Chiang et al. '02                       80.0          76.6
      Split / Merge                           86.3          83.4

Still higher numbers from reranking / self-training methods.
coarse:           … QP NP VP …
split in two:     … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:    … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:   …
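This refinement hierarchy supports coarse-to-fine pruning: parse once with the coarse grammar, then allow a refined symbol in a chart cell only if its coarse projection had non-negligible posterior there. A minimal sketch with hypothetical posteriors; the `project`/`allowed` helpers are illustrative:

```python
# Sketch of coarse-to-fine pruning. A refined symbol like "QP3" projects to
# its coarse ancestor "QP"; a fine chart cell is pruned when the coarse
# posterior for its projection falls below a threshold. Toy numbers only.
def project(fine_symbol):
    return fine_symbol.rstrip("0123456789")   # "QP3" -> "QP", "NP12" -> "NP"

def allowed(fine_symbol, span, coarse_posterior, threshold=1e-4):
    """Keep a fine symbol over a span iff the coarse pass found its
    projection plausible there."""
    return coarse_posterior.get((project(fine_symbol), span), 0.0) >= threshold

# Hypothetical coarse posteriors P(symbol spans (i, j) | sentence):
coarse_posterior = {("NP", (0, 2)): 0.9, ("QP", (0, 2)): 1e-6}
keep_np = allowed("NP3", (0, 2), coarse_posterior)   # NP variants survive
keep_qp = allowed("QP1", (0, 2), coarse_posterior)   # QP variants are pruned
```

In practice the threshold trades pruning speed against the risk of search errors.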
[Huang and Chiang 05, Pauls, Klein, Quirk 10]
[Figure: parse hypergraph over the sentence "the lawyer questioned the witness"]
Dependency Parsing
Lexicalized rule: Y[h] over span (i, k) combines with Z[h′] over (k, j) to form X[h] over (i, j); the head of the new span is h or h′.
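Taken literally, this combination gives a naive O(n^5) parser: each span tracks its head word, and combining two spans adds an arc between their heads. A sketch for projective dependency parsing; the sentence and arc scores are toy values:

```python
# Naive head-tracking span combination for projective dependency parsing:
# span (i, k) headed by h joins span (k, j) headed by h2, with one head
# taking the other as a dependent. O(n^5) time; toy scores, illustration only.
def parse(words, score):
    """score[(head, dep)] = arc score over word indices.
    Returns the best total score of a projective tree over words."""
    n = len(words)
    best = {(i, i + 1, i): 0.0 for i in range(n)}   # single-word spans
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):               # split point
                for h in range(i, k):               # head of left span
                    for h2 in range(k, j):          # head of right span
                        left = best.get((i, k, h))
                        right = best.get((k, j, h2))
                        if left is None or right is None:
                            continue
                        s = left + right
                        # either h takes h2 as dependent, or vice versa
                        for head, dep in ((h, h2), (h2, h)):
                            cand = s + score.get((head, dep), -1e9)
                            key = (i, j, head)
                            if best.get(key, -1e18) < cand:
                                best[key] = cand
    return max(best.get((0, n, h), -1e18) for h in range(n))

words = ["the", "lawyer", "questioned", "the", "witness"]
score = {(1, 0): 1.0, (2, 1): 2.0, (4, 3): 1.0, (2, 4): 2.0}
total = parse(words, score)
```

Eisner's algorithm brings this down to O(n^3) by splitting each span at its head rather than carrying the head index in every item.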
Tree-adjoining grammars: combine trees with adjunction; mildly context-sensitive; model long-distance dependencies naturally, as well as other weird stuff that CFGs don't capture well (e.g. cross-serial dependencies).
Categorial Grammar
A fully lexicalized grammar: each word carries a category encoding its argument sequence
Very closely related to the lambda calculus (more later)
Can have spurious ambiguities (why?)
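How categories encode argument sequences can be seen with forward and backward application. A toy sketch; the string-matching helper and lexicon are illustrative and only handle simple categories:

```python
# Toy sketch of categorial-grammar function application. Categories encode
# argument sequences, e.g. a transitive verb is (S\NP)/NP. String matching
# only; valid just for the simple categories in this illustrative lexicon.
def apply_cats(left, right):
    """Forward application X/Y Y => X; backward application Y X\\Y => X."""
    def strip(cat):
        return cat[1:-1] if cat.startswith("(") and cat.endswith(")") else cat
    if left.endswith("/" + right):                  # forward application
        return strip(left[: -len("/" + right)])
    if right.endswith("\\" + left):                 # backward application
        return strip(right[: -len("\\" + left)])
    return None

lexicon = {"He": "NP", "saw": "(S\\NP)/NP", "her": "NP"}

# "saw her": (S\NP)/NP + NP => S\NP   (verb consumes its object)
vp = apply_cats(lexicon["saw"], lexicon["her"])
# "He" + "saw her": NP + S\NP => S    (VP consumes its subject)
s = apply_cats(lexicon["He"], vp)
```

Adding composition and type-raising combinators is what lets several derivations yield the same result, the spurious ambiguity the slide asks about.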