Natural Language Processing (CSE 517): Dependency Syntax and Parsing
Noah A. Smith Swabha Swayamdipta
© 2018
University of Washington {nasmith,swabha}@cs.washington.edu May 11, 2018
Recap: Phrase Structure
(S^ROOT
  (NP^S (DT^NP The) (NN^NP luxury) (NN^NP auto) (NN^NP maker))
  (NP^S (JJ^NP last) (NN^NP year))
  (VP^S (VBD^VP sold)
    (NP^VP (CD^NP 1,214) (NN^NP cars))
    (PP^VP (IN^PP in)
      (NP^PP (DT^NP the) (NNP^NP U.S.)))))
(S_sold
  (NP_maker (DT_The The) (NN_luxury luxury) (NN_auto auto) (NN_maker maker))
  (NP_year (JJ_last last) (NN_year year))
  (VP_sold (VBD_sold sold)
    (NP_cars (CD_1,214 1,214) (NN_cars cars))
    (PP_in (IN_in in)
      (NP_U.S. (DT_the the) (NNP_U.S. U.S.)))))
[Figure: dependency tree with arc labels root, sbj, dobj, prep, pobj]
[Figure: dependency tree with arc labels ROOT, ATT, ATT, SBJ, PU, VC, TMP, PC, ATT]
◮ More powerful, less local rule sets, possibly collapsing some words into arc labels.
◮ Stanford dependencies are a popular example (de Marneffe et al., 2006).
◮ Only results in projective trees.
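As a rough illustration of reading dependencies off a phrase-structure tree, the sketch below takes the head-annotated tree for "The luxury auto maker last year sold 1,214 cars in the U.S." and emits one arc per word. The nested-tuple encoding and the names `leaf` and `extract_arcs` are assumptions made for this example, not part of the slides. Words are identified by their string, which only works because no word repeats in this sentence; a real converter would use token positions and head rules rather than pre-annotated heads.

```python
# A head-annotated tree as nested tuples: (label, head_word, children).
# Preterminals have an empty child list. This encoding is hypothetical.
def leaf(tag, word):
    return (tag, word, [])

tree = ("S", "sold", [
    ("NP", "maker", [leaf("DT", "The"), leaf("NN", "luxury"),
                     leaf("NN", "auto"), leaf("NN", "maker")]),
    ("NP", "year", [leaf("JJ", "last"), leaf("NN", "year")]),
    ("VP", "sold", [
        leaf("VBD", "sold"),
        ("NP", "cars", [leaf("CD", "1,214"), leaf("NN", "cars")]),
        ("PP", "in", [
            leaf("IN", "in"),
            ("NP", "U.S.", [leaf("DT", "the"), leaf("NNP", "U.S.")]),
        ]),
    ]),
])

def extract_arcs(node, arcs=None):
    """Collect (head_word, dependent_word) pairs from the head annotations."""
    if arcs is None:
        arcs = []
    _, head, children = node
    head_child_seen = False
    for child in children:
        child_head = child[1]
        if child_head == head and not head_child_seen:
            # The head child shares the parent's head word: no arc at this level.
            head_child_seen = True
        else:
            # Every non-head child attaches its own head word to the parent's head.
            arcs.append((head, child_head))
        extract_arcs(child, arcs)
    return arcs

# The head word of the whole sentence attaches to an artificial ROOT.
arcs = [("ROOT", tree[1])] + extract_arcs(tree)
for head, dep in arcs:
    print(f"{head} -> {dep}")
```

Because every arc comes from a parent and child constituent in a tree with no crossing brackets, the arcs produced this way cannot cross, which is why this kind of conversion only yields projective trees.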
◮ E.g., beam search, which we’ll discuss in the context of machine translation later.
◮ As yet, no principled solution to this problem, but see “dynamic oracles” (Goldberg
◮ Arborescences can’t have cycles, so some edge in C needs to be kicked out.
◮ We also need to find an incoming edge for C.
◮ Choosing the incoming edge for C determines which edge to kick out.
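To make the notion of a cycle C concrete, here is a minimal sketch of how one might pick the top-scoring incoming edge for each node and then check whether those choices form a cycle. The function names `best_in_edges` and `find_cycle` are hypothetical. The edge scores are those of the example graph in these slides, but the edge endpoints are an assumption made for illustration, since the figure itself is not reproduced in the text.

```python
ROOT = "ROOT"

# (source, destination, score). Scores follow the slides' example graph;
# the endpoints are ASSUMED for illustration only.
EDGES = [
    ("ROOT", "V2", 5),   # a
    ("ROOT", "V1", 1),   # b
    ("ROOT", "V3", 1),   # c
    ("V2", "V1", 11),    # d
    ("V1", "V3", 4),     # e
    ("V2", "V3", 5),     # f
    ("V1", "V2", 10),    # g
    ("V3", "V2", 9),     # h
    ("V3", "V1", 8),     # i
]

def best_in_edges(edges, root):
    """Map each non-root node to its top-scoring incoming edge."""
    best = {}
    for src, dst, score in edges:
        if dst == root:
            continue  # the root never gets an incoming edge
        if dst not in best or score > best[dst][2]:
            best[dst] = (src, dst, score)
    return best

def find_cycle(best, root):
    """Walk from each node to its chosen parent; return the nodes of a cycle, or None."""
    for start in best:
        path, seen = [], set()
        v = start
        while v in best and v not in seen:
            seen.add(v)
            path.append(v)
            v = best[v][0]  # step to the source of v's best incoming edge
        if v in seen:
            # We came back to a node already on this walk: that suffix is a cycle.
            return path[path.index(v):]
    return None

best = best_in_edges(EDGES, ROOT)
cycle = find_cycle(best, ROOT)
print(best)
print(cycle)
```

Under the assumed endpoints, the greedy choices for V1 and V2 point at each other (edges d and g), so one of the two has to be kicked out in favor of an edge entering the cycle from outside.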
◮ Contract the nodes in C into a new node vC
◮ Edges incoming to any node in C now get destination vC
◮ For each node v in C, and for each edge e incoming to v from outside of C:
  ◮ Set e.kicksOut to bestInEdge[v], and
  ◮ Set e.s to be e.s − e.kicksOut.s
◮ Edges outgoing from any node in C now get source vC
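The contraction step above can be sketched as follows. This is a minimal illustration, not the full algorithm: it covers only the contraction of one cycle, and the dictionary-based edge representation and the helper names `edge` and `contract` are assumptions. The scores are those of the slides' example graph; the edge endpoints are assumed for illustration.

```python
def edge(name, src, dst, s):
    """Hypothetical edge record: a name, two endpoints, and a score s."""
    return {"name": name, "src": src, "dst": dst, "s": s}

def contract(edges, cycle, best_in_edge, new_node):
    """Contract the nodes in `cycle` into `new_node` and return the new edge list."""
    in_c = set(cycle)
    contracted = []
    for e in edges:
        src_in, dst_in = e["src"] in in_c, e["dst"] in in_c
        if src_in and dst_in:
            continue  # edges inside C do not appear in the contracted graph
        new = dict(e)  # keeps "name", so the original edge can be looked up later
        if dst_in:
            # Incoming edge: remember which cycle edge it would kick out,
            # and subtract that edge's score.
            kicked = best_in_edge[e["dst"]]
            new["kicksOut"] = kicked["name"]
            new["s"] = e["s"] - kicked["s"]
            new["dst"] = new_node
        elif src_in:
            # Outgoing edge: only the source changes.
            new["src"] = new_node
        contracted.append(new)
    return contracted

# Scores from the slides' example; endpoints are ASSUMED for illustration.
edges = [
    edge("a", "ROOT", "V2", 5), edge("b", "ROOT", "V1", 1),
    edge("c", "ROOT", "V3", 1), edge("d", "V2", "V1", 11),
    edge("e", "V1", "V3", 4),   edge("f", "V2", "V3", 5),
    edge("g", "V1", "V2", 10),  edge("h", "V3", "V2", 9),
    edge("i", "V3", "V1", 8),
]
by_name = {e["name"]: e for e in edges}
best_in_edge = {"V1": by_name["d"], "V2": by_name["g"]}

contracted = contract(edges, ["V1", "V2"], best_in_edge, "V4")
scores = {e["name"]: e["s"] for e in contracted}
print(scores)
```

With these assumed endpoints, the adjusted scores come out as a: −5, b: −10, h: −1, i: −3, with c, e, and f unchanged, which matches the contracted graph over ROOT, V3, and V4 in the example.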
[Figure: example graph over nodes ROOT, V1, V2, V3 with edge scores a: 5, b: 1, c: 1, d: 11, e: 4, f: 5, g: 10, h: 9, i: 8]
[Figure: same graph, with adjusted scores for the edges entering the cycle: a: 5 − 10, b: 1 − 11, h: 9 − 10, i: 8 − 11; unchanged: c: 1, d: 11, e: 4, f: 5, g: 10]
[Figure: contracted graph over nodes ROOT, V3, V4 with edge scores a: −5, b: −10, c: 1, e: 4, f: 5, h: −1, i: −3]
[Figure: same contracted graph, with adjusted scores for the edges entering the new cycle: a: −5 − (−1), b: −10 − (−1), c: 1 − 5; unchanged: e: 4, f: 5, h: −1, i: −3]
[Figure: contracted graph over nodes ROOT, V5 with edge scores a: −4, b: −9, c: −4]
◮ As a matter of preprocessing, for each p, c, keep only the top-scoring labeled edge.
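A minimal sketch of this preprocessing step, assuming p and c denote the parent and child word of a candidate arc and that labeled scores are stored in a dictionary keyed by (p, c, label). The function name `keep_best_label` and the scores in the usage example are made up for illustration.

```python
def keep_best_label(labeled_scores):
    """Reduce {(p, c, label): score} to {(p, c): (label, score)} with the best label."""
    best = {}
    for (p, c, label), score in labeled_scores.items():
        if (p, c) not in best or score > best[(p, c)][1]:
            best[(p, c)] = (label, score)
    return best

# Hypothetical scores for two candidate arcs.
labeled_scores = {
    ("sold", "cars", "dobj"): 2.5,
    ("sold", "cars", "sbj"): 0.3,
    ("sold", "maker", "sbj"): 1.9,
    ("sold", "maker", "dobj"): -0.4,
}
unlabeled = keep_best_label(labeled_scores)
print(unlabeled)
```

After this step there is a single scored edge per (p, c) pair, so an unlabeled tree algorithm can be run unchanged and the labels read back from the stored winners.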
◮ Specialized algorithm that efficiently solves your problem, under your assumptions.
◮ General-purpose method that solves many problems, allowing you to test the effect
◮ Fast (linear-time) but greedy
◮ Model-optimal but slow
◮ Dirty secret: the best way to get (English) dependency trees is to run