Advanced Dependency Parsing, Joakim Nivre, Uppsala University


slide-1
SLIDE 1

Advanced Dependency Parsing

Joakim Nivre

Uppsala University Linguistics and Philology Based on tutorials with Ryan McDonald

Advanced Dependency Parsing 1(36)

slide-2
SLIDE 2

Introduction

Plan for the Lecture

  1. Graph-based vs. transition-based dependency parsing
  2. Advanced graph-based parsing techniques
     ◮ Higher-order models
     ◮ Non-projective parsing
  3. Advanced transition-based parsing techniques
     ◮ Beam search
     ◮ Dynamic oracles
     ◮ Non-projective parsing

slide-3
SLIDE 3

Graph-Based vs. Transition-Based Dependency Parsing

Graph-Based Parsing

◮ Basic idea:
  ◮ Define a space of candidate dependency trees for a sentence
  ◮ Learning: Induce a model for scoring an entire dependency tree for a sentence
  ◮ Parsing: Find the highest-scoring dependency tree, given the induced model
◮ Characteristics:
  ◮ Global learning of a model for optimal dependency trees
  ◮ Exhaustive search during parsing (exact)

slide-4
SLIDE 4

Graph-Based vs. Transition-Based Dependency Parsing

Graph-Based Parsing Trade-Off

◮ Learning and inference are global
  ◮ Decoding guaranteed to find highest scoring tree
  ◮ Training algorithms use global structure learning
◮ But this is only possible with local feature factorizations
  ◮ Must limit the context the statistical model can look at
  ◮ Results in bad ‘easy’ decisions
  ◮ For example, first-order models often predict two subjects
  ◮ No parameter exists to discourage this

Example (two nsubj arcs predicted):

  Word:   John   Smith  was    tall
  POS:    noun   noun   verb   adj
  Label:  nsubj  nsubj         acomp
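The two-subject problem can be made concrete with a small sketch. It assumes a first-order model in which a tree's score is the sum of independent arc scores; the score values and the `nn` label below are invented for illustration and do not come from the slides.

```python
# Hypothetical arc scores s(head, dependent, label); the numbers are made up.
ARC_SCORES = {
    ("was", "John", "nsubj"): 2.0,
    ("was", "Smith", "nsubj"): 2.5,
    ("Smith", "John", "nn"): 1.5,   # "nn" is an assumed label for the name compound
    ("was", "tall", "acomp"): 3.0,
}

def first_order_score(arcs):
    """Score of a tree under a first-order model: the sum of its arc scores."""
    return sum(ARC_SCORES[arc] for arc in arcs)

# Analysis with two subjects attached to the verb.
two_subjects = [
    ("was", "John", "nsubj"),
    ("was", "Smith", "nsubj"),
    ("was", "tall", "acomp"),
]
# Analysis with a single subject.
one_subject = [
    ("Smith", "John", "nn"),
    ("was", "Smith", "nsubj"),
    ("was", "tall", "acomp"),
]

score_two = first_order_score(two_subjects)
score_one = first_order_score(one_subject)
```

Because each arc is scored on its own, no term in the sum can see that `was` already has a subject, so the two-subject tree wins whenever its individual arcs score well.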

slide-5
SLIDE 5

Graph-Based vs. Transition-Based Dependency Parsing

Transition-Based Parsing

◮ Basic idea:
  ◮ Define a transition system (state machine) for mapping a sentence to its dependency graph
  ◮ Learning: Induce a model for predicting the next state transition, given the transition history
  ◮ Parsing: Construct the optimal transition sequence, given the induced model
◮ Characteristics:
  ◮ Local learning of a model for optimal transitions
  ◮ Greedy best-first search (heuristic)

slide-6
SLIDE 6

Graph-Based vs. Transition-Based Dependency Parsing

Transition-Based Parsing Trade-Off

◮ Advantages:
  ◮ Highly efficient parsing – linear time complexity
  ◮ Rich history-based feature representations – no rigid constraints from parsing algorithm
◮ Drawback:
  ◮ Sensitive to search errors and error propagation due to greedy inference and local learning

slide-7
SLIDE 7

Graph-Based vs. Transition-Based Dependency Parsing

Error Analysis [McDonald and Nivre 2007]

[Figure: dependency accuracy (0.70–0.84) by sentence length (bins of size 10: 10, 20, 30, 40, 50, 50+) for MSTParser and MaltParser]

◮ MaltParser is more accurate than MSTParser for short sentences (1–10 words) but its performance degrades more with increasing sentence length

slide-8
SLIDE 8

Graph-Based vs. Transition-Based Dependency Parsing

Error Analysis [McDonald and Nivre 2007]

[Figure, two panels: dependency precision and dependency recall (0.3–0.9) by dependency length (5–30) for MSTParser and MaltParser]

◮ MaltParser is more precise than MSTParser for short dependencies (1–3 words) but its performance degrades drastically with increasing dependency length (> 10 words)
◮ MSTParser has more or less constant precision for dependencies longer than 3 words
◮ Recall is very similar across systems

slide-9
SLIDE 9

Advanced Graph-Based Parsing Techniques

Plan for the Lecture

  1. Graph-based vs. transition-based dependency parsing
  2. Advanced graph-based parsing techniques
     ◮ Higher-order models
     ◮ Non-projective parsing
  3. Advanced transition-based parsing techniques
     ◮ Beam search
     ◮ Dynamic oracles
     ◮ Non-projective parsing
  4. Neural networks in dependency parsing

slide-10
SLIDE 10

Advanced Graph-Based Parsing Techniques

Higher-Order Models

◮ Two main dimensions of higher-order models
  ◮ Vertical: e.g., “remain” is the grandparent of “emeritus”
  ◮ Horizontal: e.g., “remain” is the first child of “will”

slide-11
SLIDE 11

Advanced Graph-Based Parsing Techniques

2nd-Order Horizontal Projective Parsing

◮ Score factors by pairs of horizontally adjacent arcs
◮ Often called sibling dependencies
◮ s(i, j, j′) = score of adjacent arcs x_i → x_j and x_i → x_j′

$$ s(T) = \sum_{(i,j):(i,j') \in A} s(i, j, j') = \ldots + s(i_0, i_1, i_2) + s(i_0, i_2, i_3) + \ldots + s(i_0, i_{j-1}, i_j) + s(i_0, i_{j+1}, i_{j+2}) + \ldots + s(i_0, i_{m-1}, i_m) + \ldots $$
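A minimal sketch of the sibling factorization, assuming words are integer positions, `heads` maps each dependent to its head, and `s` is any scoring function over (head, sibling, sibling) triples. Only pairs of adjacent dependents on the same side of the head are scored; any special term for the first child is left out here, and the constant score function is a placeholder.

```python
def sibling_score(head, deps, s):
    """Sum s(head, prev, cur) over adjacent dependents on each side of head."""
    left = sorted((d for d in deps if d < head), reverse=True)  # outward to the left
    right = sorted(d for d in deps if d > head)                 # outward to the right
    total = 0.0
    for side in (left, right):
        for prev, cur in zip(side, side[1:]):
            total += s(head, prev, cur)
    return total

def tree_score(heads, s):
    """Second-order (sibling) score of a tree given as a dependent -> head map."""
    deps_of = {}
    for dep, head in heads.items():
        deps_of.setdefault(head, []).append(dep)
    return sum(sibling_score(h, ds, s) for h, ds in deps_of.items())

# Toy tree: word 2 heads words 1, 3 and 4; word 4 heads word 5; 0 is ROOT.
toy_heads = {1: 2, 2: 0, 3: 2, 4: 2, 5: 4}
# Placeholder score of 1.0 per sibling pair, so the total counts scored pairs.
pair_count = tree_score(toy_heads, lambda h, a, b: 1.0)
```

In the toy tree only the arcs 2 → 3 and 2 → 4 are adjacent on the same side of their head, so exactly one triple is scored.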

slide-12
SLIDE 12

Advanced Graph-Based Parsing Techniques

Higher-Order Projective Parsing

◮ People have played this game since 2006
  ◮ McDonald and Pereira [2006] (2nd-order sibling)
  ◮ Carreras [2007] (2nd-order sibling and grandparent)
  ◮ Koo and Collins [2010] (3rd-order grand-sibling and tri-sibling)
  ◮ Ma and Zhao [2012] (4th-order grand-tri-sibling+)

[Figure, from Koo et al. 2010 presentation: factorizations over head h, modifier m, siblings s and s′, and grandparent g, arranged by horizontal and vertical context, with time complexities from O(n^3) to O(n^5)]

slide-13
SLIDE 13

Advanced Graph-Based Parsing Techniques

Parsing Algorithms

◮ Eisner’s algorithm can be generalized to higher orders
◮ But there is a price to pay:
  ◮ Specialized chart items and combination rules
  ◮ Time complexity increases for every added order
  ◮ Anything beyond 2nd-order is too slow in practice
◮ Remember basic trade-off:
  ◮ Global training and exact inference – local feature scope
  ◮ Increasing feature scope makes exact inference harder
◮ This has led to research on approximate graph-based parsing

slide-15
SLIDE 15

Advanced Graph-Based Parsing Techniques

Non-Projective Parsing

◮ First-order model – equivalent to MST problem
◮ Chu-Liu-Edmonds’ algorithm:
  ◮ Construct a graph with the highest-scoring head for each word
  ◮ If this is a tree, it must be the MST
  ◮ If not, contract a cycle and recurse on smaller graph

[Figure: arc-score graph over ROOT, John, saw, Mary (arc scores 9, 10, 20, 9, 30, 11, 3, 30) and the resulting MST over the same nodes (arc scores 10, 30, 30)]

◮ This does not generalize to higher orders – no exact algorithm
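The first step of Chu-Liu-Edmonds can be sketched on the John/saw/Mary example. The assignment of the figure's numbers to particular arcs is an assumption (the extracted text does not say which arc carries which score), and the Mary → saw score of 0 is added to complete the graph. Contract-and-recurse is not implemented here; an exhaustive search over head assignments stands in for it, which is only feasible for tiny sentences.

```python
from itertools import product

ROOT = "ROOT"
WORDS = ["John", "saw", "Mary"]

# SCORES[(head, dependent)]; which number belongs to which arc is assumed.
SCORES = {
    (ROOT, "John"): 9, (ROOT, "saw"): 10, (ROOT, "Mary"): 9,
    ("John", "saw"): 20, ("saw", "John"): 30,
    ("saw", "Mary"): 30, ("Mary", "John"): 11, ("John", "Mary"): 3,
    ("Mary", "saw"): 0,  # added to complete the graph
}

def greedy_heads(words, scores):
    """Step 1: pick the highest-scoring head for every word independently."""
    return {
        dep: max((h for h in [ROOT] + words if h != dep),
                 key=lambda h: scores[(h, dep)])
        for dep in words
    }

def find_cycle(heads):
    """Return the words on a cycle in the head map, or None if it is a tree."""
    for start in heads:
        path, node = [], start
        while node != ROOT and node not in path:
            path.append(node)
            node = heads[node]
        if node != ROOT:
            return path[path.index(node):]
    return None

def best_tree_brute_force(words, scores):
    """Stand-in for contract-and-recurse: try every acyclic head assignment."""
    best, best_score = None, float("-inf")
    for choice in product([ROOT] + words, repeat=len(words)):
        heads = dict(zip(words, choice))
        if any(h == d for d, h in heads.items()) or find_cycle(heads):
            continue
        total = sum(scores[(h, d)] for d, h in heads.items())
        if total > best_score:
            best, best_score = heads, total
    return best, best_score

heads = greedy_heads(WORDS, SCORES)
cycle = find_cycle(heads)
tree, tree_score = best_tree_brute_force(WORDS, SCORES)
```

Under these assumed scores the greedy step makes John and saw each other's head, which is the cycle the algorithm would contract, and the exhaustive search returns the tree with arc scores 10, 30, 30 shown on the right of the figure.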

slide-16
SLIDE 16

Advanced Transition-Based Parsing Techniques

Plan for the Lecture

  1. Graph-based vs. transition-based dependency parsing
  2. Advanced graph-based parsing techniques
     ◮ Higher-order models
     ◮ Non-projective parsing
  3. Advanced transition-based parsing techniques
     ◮ Beam search
     ◮ Dynamic oracles
     ◮ Non-projective parsing
  4. Neural networks in dependency parsing

slide-17
SLIDE 17

Advanced Transition-Based Parsing Techniques

Greedy Search

◮ Take the single best action at any point (given by oracle o):

  Parse(w1, ..., wn)
    c ← ([ ]S, [0, 1, ..., n]B, { })
    while Bc ≠ [ ]
      t ← o(c)
      c ← t(c)
    return G = ({0, 1, ..., n}, Ac)

◮ Maximally efficient – linear time complexity
◮ Sensitive to search errors and error propagation

slide-18
SLIDE 18

Advanced Transition-Based Parsing Techniques

Beam Search

◮ Maintain the k best hypotheses [Johansson and Nugues 2006]:

  Parse(w1, ..., wn)
    Beam ← {([ ]S, [0, 1, ..., n]B, { })}
    while ∃c ∈ Beam [Bc ≠ [ ]]
      foreach c ∈ Beam
        foreach t
          Add(t(c), NewBeam)
      Beam ← Top(k, NewBeam)
    return G = ({0, 1, ..., n}, A_Top(1, Beam))

◮ Note:
  ◮ Pruning the beam requires that we score transition sequences
  ◮ Global learning to maximize score of entire sequence
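The beam loop can be sketched independently of any particular transition system. This is a generic version under stated assumptions: hypotheses are (score, state) pairs, scores add up along a sequence, and finished hypotheses are carried forward unchanged. The toy `successors` function and its scores are invented to show a case where the best first step leads to a worse complete sequence.

```python
def beam_search(initial, is_final, successors, k):
    """Keep the k highest-scoring hypotheses until every one of them is final.

    successors(state) returns (step_score, next_state) pairs.
    """
    beam = [(0.0, initial)]
    while any(not is_final(state) for _, state in beam):
        new_beam = []
        for score, state in beam:
            if is_final(state):
                new_beam.append((score, state))  # carry finished hypotheses forward
                continue
            for step_score, nxt in successors(state):
                new_beam.append((score + step_score, nxt))
        beam = sorted(new_beam, key=lambda item: item[0], reverse=True)[:k]
    return beam[0]

# Toy search space: sequences of three actions drawn from "a" and "b".
def successors(state):
    if not state:
        return [(2.0, ("a",)), (1.5, ("b",))]   # "a" looks best locally
    step = 1.0 if state[0] == "b" else 0.0       # but only "b" pays off later
    return [(step, state + ("a",)), (step, state + ("b",))]

def is_final(state):
    return len(state) == 3

greedy = beam_search((), is_final, successors, k=1)
wider = beam_search((), is_final, successors, k=2)
```

With k = 1 the search commits to the locally best first action and cannot recover; with k = 2 the lower-scoring first action survives pruning and ends up in the best complete sequence.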

slide-19
SLIDE 19

Advanced Transition-Based Parsing Techniques

Beam Size

[Zhang and Clark 2008]


slide-20
SLIDE 20

Advanced Transition-Based Parsing Techniques

The Best of Two Worlds?

◮ Like graph-based dependency parsing:
  ◮ Global learning – minimize loss over entire sentence
  ◮ Non-greedy search – accuracy increases with beam size
◮ Like (old school) transition-based parsing:
  ◮ Highly efficient – complexity still linear for fixed beam size
  ◮ Rich features – no constraints from parsing algorithm

slide-21
SLIDE 21

Advanced Transition-Based Parsing Techniques

Precision by Dependency Length

[Figure: precision (0.4–0.9) by dependency length (2–14) for MST, Malt, and ZPar]

[Zhang and Nivre 2012]

slide-22
SLIDE 22

Advanced Transition-Based Parsing Techniques

Dynamic Oracles

◮ Beam search helps because it explores the search space
  ◮ At parsing time, the parser can recover from early bad decisions
  ◮ At training time, the parser can learn to avoid costly mistakes
◮ Can the parser benefit from exploration only at training time?
  ◮ Yes – but we need dynamic oracles for training
  ◮ Then we can improve greedy parsing for maximum speed

slide-24
SLIDE 24

Advanced Transition-Based Parsing Techniques

Online Learning with a Conventional Oracle

  Learn({T1, ..., TN})
    w ← 0.0
    for i in 1..K
      for j in 1..N
        c ← ([ ], [0, 1, ..., nj], { })
        while Bc ≠ [ ]
          t∗ ← argmax_t w · f(c, t)
          to ← o(c, Tj)
          if t∗ ≠ to
            w ← w + f(c, to) − f(c, t∗)
          c ← to(c)
    return w

◮ Oracle o(c, Tj) returns the optimal transition for c and Tj

slide-25
SLIDE 25

Advanced Transition-Based Parsing Techniques

Conventional Oracle for Arc-Eager Parsing

$$ o(c, T) = \begin{cases} \text{Left-Arc} & \text{if } \mathrm{top}(S_c) \leftarrow \mathrm{first}(B_c) \text{ in } T \\ \text{Right-Arc} & \text{if } \mathrm{top}(S_c) \rightarrow \mathrm{first}(B_c) \text{ in } T \\ \text{Reduce} & \text{if } \exists v < \mathrm{top}(S_c) : v \leftrightarrow \mathrm{first}(B_c) \text{ in } T \\ \text{Shift} & \text{otherwise} \end{cases} $$

◮ Correct:
  ◮ Derives T in a configuration sequence C_{o,T} = c_0, ..., c_m
◮ Problems:
  ◮ Deterministic: Ignores other derivations of T
  ◮ Incomplete: Valid only for configurations in C_{o,T}
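A sketch of this oracle driving an arc-eager parser, assuming words are represented by their positions (0 for ROOT) and the gold tree is a map from each word to its head. The function names and the encoding of the example sentence are illustrative choices, not taken from the slides.

```python
def static_oracle(stack, buffer, gold_head):
    """Conventional arc-eager oracle; gold_head maps each word to its gold head."""
    s = stack[-1] if stack else None
    b = buffer[0]
    if s is not None and gold_head.get(s) == b:
        return "LA"   # top(S) <- first(B) is a gold arc
    if s is not None and gold_head.get(b) == s:
        return "RA"   # top(S) -> first(B) is a gold arc
    if s is not None and any(
        gold_head.get(b) == v or gold_head.get(v) == b for v in range(s)
    ):
        return "RE"   # some node to the left of top(S) is linked to first(B)
    return "SH"

def apply_transition(t, stack, buffer, arcs):
    """Arc-eager transitions; arcs are (head, dependent) pairs."""
    if t == "SH":
        stack.append(buffer.pop(0))
    elif t == "RE":
        stack.pop()
    elif t == "LA":
        arcs.add((buffer[0], stack.pop()))
    elif t == "RA":
        arcs.add((stack[-1], buffer[0]))
        stack.append(buffer.pop(0))

def oracle_parse(n, gold_head):
    """Run the oracle from the initial configuration until the buffer is empty."""
    stack, buffer, arcs, sequence = [], list(range(n + 1)), set(), []
    while buffer:
        t = static_oracle(stack, buffer, gold_head)
        apply_transition(t, stack, buffer, arcs)
        sequence.append(t)
    return sequence, arcs

# 0=ROOT 1=He 2=sent 3=her 4=a 5=letter 6=.
GOLD = {1: 2, 2: 0, 3: 2, 4: 5, 5: 2, 6: 2}
sequence, arcs = oracle_parse(6, GOLD)
```

On this gold tree the oracle produces exactly one transition sequence and recovers every gold arc, which illustrates both the correctness property and the determinism listed above.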

slide-26
SLIDE 26

Advanced Transition-Based Parsing Techniques

Oracle Parse

Sentence:    ROOT  He    sent  her   a    letter  .
POS tags:          pron  verb  pron  det  noun    .
Arc labels in the gold tree: root, nsubj, iobj, det, dobj, p

Transition  Stack                 Buffer                               Arc added
(start)     [ ]                   [ROOT, He, sent, her, a, letter, .]
SH          [ROOT]                [He, sent, her, a, letter, .]
SH          [ROOT, He]            [sent, her, a, letter, .]
LA          [ROOT]                [sent, her, a, letter, .]            He ←sbj− sent
RA          [ROOT, sent]          [her, a, letter, .]                  ROOT −root→ sent
RA          [ROOT, sent, her]     [a, letter, .]                       sent −iobj→ her
SH          [ROOT, sent, her, a]  [letter, .]
LA          [ROOT, sent, her]     [letter, .]                          a ←det− letter
RE          [ROOT, sent]          [letter, .]
RA          [ROOT, sent, letter]  [.]                                  sent −dobj→ letter
RE          [ROOT, sent]          [.]
RA          [ROOT, sent, .]       [ ]                                  sent −p→ .

Complete sequence: SH-SH-LA-RA-RA-SH-LA-RE-RA-RE-RA

slide-38
SLIDE 38

Advanced Transition-Based Parsing Techniques

Non-Determinism

Oracle sequence:       SH-SH-LA-RA-RA-SH-LA-RE-RA-RE-RA
Alternative sequence:  SH-SH-LA-RA-RA-RE-SH-LA-RA-RE-RA

Both sequences derive the same six arcs. The alternative shares the prefix SH-SH-LA-RA-RA and then continues:

Transition  Stack                 Buffer            Arc added
(prefix)    [ROOT, sent, her]     [a, letter, .]    He ←sbj− sent, ROOT −root→ sent, sent −iobj→ her
RE          [ROOT, sent]          [a, letter, .]
SH          [ROOT, sent, a]       [letter, .]
LA          [ROOT, sent]          [letter, .]       a ←det− letter
RA          [ROOT, sent, letter]  [.]               sent −dobj→ letter
RE          [ROOT, sent]          [.]
RA          [ROOT, sent, .]       [ ]               sent −p→ .

[Figure: dependency tree for "ROOT She sent him a letter ." (POS tags: pron verb pron det noun .; arc labels: root, nsubj, iobj, det, dobj, p)]

slide-45
SLIDE 45

Advanced Transition-Based Parsing Techniques

Non-Optimality

Oracle sequence:  SH-SH-LA-RA-RA-SH-LA-RE-RA-RE-RA
Sequence 2:       SH-SH-LA-RA-SH-SH-LA-SH-SH  [3/6]
Sequence 3:       SH-SH-LA-RA-SH-SH-LA-LA-RA-RE-RA  [5/6]

Sequences 2 and 3 both follow the prefix SH-SH-LA-RA with SH instead of RA, so the arc sent −iobj→ her is never built. The bracketed figures give the number of gold arcs recovered out of six.

Sequence 2, continuing from the prefix SH-SH-LA-RA:

Transition  Stack                       Buffer               Arc added
(prefix)    [ROOT, sent]                [her, a, letter, .]  He ←sbj− sent, ROOT −root→ sent
SH          [ROOT, sent, her]           [a, letter, .]
SH          [ROOT, sent, her, a]        [letter, .]
LA          [ROOT, sent, her]           [letter, .]          a ←det− letter
SH          [ROOT, sent, her, letter]   [.]
SH          [ROOT, sent, her, letter, .]  [ ]

Sequence 3, continuing from the prefix SH-SH-LA-RA-SH-SH-LA:

Transition  Stack                 Buffer       Arc added
(prefix)    [ROOT, sent, her]     [letter, .]  He ←sbj− sent, ROOT −root→ sent, a ←det− letter
LA          [ROOT, sent]          [letter, .]  her ←?− letter
RA          [ROOT, sent, letter]  [.]          sent −dobj→ letter
RE          [ROOT, sent]          [.]
RA          [ROOT, sent, .]       [ ]          sent −p→ .

[Figure: dependency tree for "ROOT She sent him a letter ." (POS tags: pron verb pron det noun .; arc labels: root, nsubj, iobj, det, dobj, p)]

slide-56
SLIDE 56

Advanced Transition-Based Parsing Techniques

Dynamic Oracles

◮ Optimality:
  ◮ A transition is optimal if the best tree remains reachable
  ◮ Best tree = argmin_{T′} L(T, T′)
◮ Oracle:
  ◮ Boolean function o(c, t, T) = true if t is optimal for c and T
  ◮ Non-deterministic: More than one transition can be optimal
  ◮ Complete: Correct for all configurations
◮ New problem:
  ◮ How do we know which trees are reachable?
  ◮ Easy for some transition systems (called arc-decomposable)

slide-57
SLIDE 57

Advanced Transition-Based Parsing Techniques

Oracles for Arc-Decomposable Systems

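$$ o(c, t, T) = \begin{cases} \text{true} & \text{if } [R(c) - R(t(c))] \cap T = \emptyset \\ \text{false} & \text{otherwise} \end{cases} $$

where R(c) ≡ {a | a is an arc reachable in c}

Arc-Eager

$$ o(c, \mathrm{LA}, T) = \begin{cases} \text{false} & \text{if } \exists w \in B_c : s \leftrightarrow w \in T \ (\text{except } s \leftarrow b) \\ \text{true} & \text{otherwise} \end{cases} $$

$$ o(c, \mathrm{RA}, T) = \begin{cases} \text{false} & \text{if } \exists w \in S_c : w \leftrightarrow b \in T \ (\text{except } s \rightarrow b) \\ \text{true} & \text{otherwise} \end{cases} $$

$$ o(c, \mathrm{RE}, T) = \begin{cases} \text{false} & \text{if } \exists w \in B_c : s \rightarrow w \in T \\ \text{true} & \text{otherwise} \end{cases} $$

$$ o(c, \mathrm{SH}, T) = \begin{cases} \text{false} & \text{if } \exists w \in S_c : w \leftrightarrow b \in T \\ \text{true} & \text{otherwise} \end{cases} $$

Notation: s = node on top of the stack S; b = first node in the buffer B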
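The arc-eager conditions can be sketched as a function that returns the gold arcs a transition would make unreachable. Two things here go beyond what the slide states and should be read as assumptions: the RA case also counts a gold head of b that is still in the buffer as lost, and a separate `permissible` check enforces the usual arc-eager preconditions (LA needs a headless non-ROOT stack top, RE needs a stack top that already has a head). Words are positions, with 0 for ROOT.

```python
def lost_gold_arcs(t, stack, buffer, gold):
    """Gold (head, dependent) arcs that become unreachable if t is taken."""
    b = buffer[0]
    if t == "SH":
        cand = {(w, b) for w in stack} | {(b, w) for w in stack}
    elif t == "RA":
        # Assumption: heads of b remaining in the buffer are treated as lost too.
        cand = ({(w, b) for w in stack + buffer[1:]}
                | {(b, w) for w in stack}) - {(stack[-1], b)}
    elif t == "LA":
        s = stack[-1]
        cand = ({(s, w) for w in buffer} | {(w, s) for w in buffer}) - {(b, s)}
    elif t == "RE":
        cand = {(stack[-1], w) for w in buffer}
    else:
        raise ValueError(t)
    return cand & gold

def permissible(t, stack, has_head):
    """Arc-eager preconditions (assumed, not shown on the slide)."""
    if t == "SH":
        return True
    if not stack:
        return False
    s = stack[-1]
    if t == "LA":
        return s != 0 and s not in has_head
    if t == "RE":
        return s in has_head
    return True  # RA

def optimal_transitions(stack, buffer, has_head, gold):
    """All permissible transitions that lose no reachable gold arc."""
    return [
        t for t in ("SH", "LA", "RA", "RE")
        if permissible(t, stack, has_head)
        and not lost_gold_arcs(t, stack, buffer, gold)
    ]

# 0=ROOT 1=He 2=sent 3=her 4=a 5=letter 6=.
GOLD = {(2, 1), (0, 2), (2, 3), (5, 4), (2, 5), (2, 6)}

# After SH-SH-LA-RA: only RA keeps the arc sent -> her reachable.
after_ra = optimal_transitions([0, 2], [3, 4, 5, 6], {1, 2}, GOLD)
# After SH-SH-LA-RA-RA: both SH and RE are optimal.
after_ra_ra = optimal_transitions([0, 2, 3], [4, 5, 6], {1, 2, 3}, GOLD)
```

The two configurations correspond to the points where the Non-Optimality and Non-Determinism sequences branch off: in the first, any choice other than RA loses a gold arc; in the second, two different transitions are equally good.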

slide-59
SLIDE 59

Advanced Transition-Based Parsing Techniques

Online Learning with a Dynamic Oracle

  Learn({T1, ..., TN})
    w ← 0.0
    for i in 1..K
      for j in 1..N
        c ← ([ ]S, [w1, ..., wnj]B, { })
        while Bc ≠ [ ]
          t∗ ← argmax_t w · f(c, t)
          to ← argmax_{t ∈ {t | o(c, t, Tj)}} w · f(c, t)
          if t∗ ≠ to
            w ← w + f(c, to) − f(c, t∗)
          c ← choice(to(c), t∗(c))
    return w

◮ Ambiguity: use model score to break ties
◮ Exploration: follow model prediction even if not optimal

slide-60
SLIDE 60

Advanced Transition-Based Parsing Techniques

[Goldberg and Nivre 2012]


slide-61
SLIDE 61

Advanced Transition-Based Parsing Techniques

Non-Projective Parsing

◮ Standard transition systems only derive projective trees
◮ Approaches to non-projective transition-based parsing:
  ◮ Pseudo-projective parsing [Nivre and Nilsson 2005]
  ◮ Non-adjacent arc transitions [Covington 2001, Attardi 2006, Nivre 2007]
  ◮ Online reordering [Nivre 2009, Nivre et al. 2009]

slide-62
SLIDE 62

Advanced Transition-Based Parsing Techniques

Projectivity and Word Order

◮ Projectivity is a property of a dependency tree only in relation to a particular word order
◮ Words can always be reordered to make the tree projective
◮ Given a dependency tree T = (V, A, <), let the projective order <p be the order defined by an inorder traversal of T with respect to < [Veselá et al. 2004]

Example (position of each word in the projective order <p):

  Word:      ROOT  A    hearing  is    scheduled  on    the  issue  today  .
  POS:             det  noun     verb  verb       prep  det  noun   adv    .
  Position:        1    2        6     7          3     4    5      8      9

Arc labels in the tree: root, det, aux, nsubj, prep, pobj, det, tmod, p
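The projective order can be computed with a short inorder traversal. The head assignments below are an assumption about the tree in the figure (the extracted text lists the labels but not which word attaches where); they were chosen because they reproduce the positions 1 2 6 7 3 4 5 8 9 given on the slide.

```python
def projective_order(heads, root=0):
    """Inorder traversal: left dependents, then the node, then right dependents."""
    children = {}
    for dep, head in heads.items():
        children.setdefault(head, []).append(dep)

    order = []

    def visit(node):
        kids = sorted(children.get(node, []))
        for kid in kids:
            if kid < node:
                visit(kid)
        order.append(node)
        for kid in kids:
            if kid > node:
                visit(kid)

    visit(root)
    return order

# 0=ROOT 1=A 2=hearing 3=is 4=scheduled 5=on 6=the 7=issue 8=today 9=.
# Assumed heads: "on" attaches to "hearing", which makes the tree non-projective.
HEADS = {1: 2, 2: 4, 3: 4, 4: 0, 5: 2, 6: 7, 7: 5, 8: 4, 9: 4}

order = projective_order(HEADS)
position = {word: i for i, word in enumerate(order)}
positions_by_word = [position[w] for w in range(1, 10)]
```

Reading the words in `order` gives "A hearing on the issue is scheduled today .", the reordering under which the same tree has no crossing arcs.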

slide-76
SLIDE 76

Advanced Transition-Based Parsing Techniques

Projectivity and Word Order

◮ Projectivity is a property of a dependency tree only in relation

to a particular word order

◮ Words can always be reordered to make the tree projective ◮ Given a dependency tree T = (V , A, <), let the projective

  • rder <p be the order defined by an inorder traversal of T with

respect to < [Vesel´

a et al. 2004]

ROOT

A hearing is scheduled

  • n

the issue today .

ROOT

det noun verb verb prep det noun adv .

1 2 6 7 3 4 5 8 9

root det aux nsubj prep pobj det tmod p

Advanced Dependency Parsing 32(36)

slide-77
SLIDE 77

Advanced Transition-Based Parsing Techniques

Projectivity and Word Order

◮ Projectivity is a property of a dependency tree only in relation

to a particular word order

◮ Words can always be reordered to make the tree projective ◮ Given a dependency tree T = (V , A, <), let the projective

  • rder <p be the order defined by an inorder traversal of T with

respect to < [Vesel´

a et al. 2004]

ROOT

A hearing is scheduled

  • n

the issue today .

ROOT

det noun verb verb prep det noun adv .

1 2 6 7 3 4 5 8 9

root det aux nsubj prep pobj det tmod p

Advanced Dependency Parsing 32(36)

slide-78
SLIDE 78

Advanced Transition-Based Parsing Techniques

Projectivity and Word Order

◮ Projectivity is a property of a dependency tree only in relation

to a particular word order

◮ Words can always be reordered to make the tree projective ◮ Given a dependency tree T = (V , A, <), let the projective

  • rder <p be the order defined by an inorder traversal of T with

respect to < [Vesel´

a et al. 2004]

ROOT

A hearing is scheduled

  • n

the issue today .

ROOT

det noun verb verb prep det noun adv .

1 2 6 7 3 4 5 8 9

root det aux nsubj prep pobj det tmod p

Advanced Dependency Parsing 32(36)

slide-79
SLIDE 79

Advanced Transition-Based Parsing Techniques

Projectivity and Word Order

◮ Projectivity is a property of a dependency tree only in relation

to a particular word order

◮ Words can always be reordered to make the tree projective ◮ Given a dependency tree T = (V , A, <), let the projective

  • rder <p be the order defined by an inorder traversal of T with

respect to < [Vesel´

a et al. 2004]

ROOT

A hearing is scheduled

  • n

the issue today .

ROOT

det noun verb verb prep det noun adv .

1 2 6 7 3 4 5 8 9

root det aux nsubj prep pobj det tmod p

Advanced Dependency Parsing 32(36)


slide-81
SLIDE 81

Advanced Transition-Based Parsing Techniques

Transition System for Online Reordering

Configuration: (S, B, A) [S = Stack, B = Buffer, A = Arcs]
Initial: ([ ], [w0, w1, ..., wn], { }) (w0 = ROOT)
Terminal: ([w0], [ ], A)

Shift: (S, wi|B, A) ⇒ (S|wi, B, A)
Right-Arc(l): (S|wi|wj, B, A) ⇒ (S|wi, B, A ∪ {(wi, l, wj)})
Left-Arc(l): (S|wi|wj, B, A) ⇒ (S|wj, B, A ∪ {(wj, l, wi)}), provided i ≠ 0
Swap: (S|wi|wj, B, A) ⇒ (S|wj, wi|B, A), provided 0 < i < j

◮ Transition-based parsing with two interleaved processes:
  1. Sort words into projective order <p
  2. Build tree T by connecting adjacent subtrees
◮ T is projective with respect to <p but not (necessarily) <
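The four transitions can be written down directly as operations on a configuration. The following is a minimal sketch under stated assumptions: tokens are represented by their indices (0 = ROOT), the class `Config` and the function names are hypothetical, and the three-word tree in the demo (ROOT → 2, 2 → 1, 1 → 3) with its labels is invented for illustration. Left-Arc is guarded by the condition that the second-topmost stack item is not ROOT, and Swap by 0 < i < j.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Arc = Tuple[int, str, int]  # (head, label, dependent)


@dataclass
class Config:
    stack: List[int] = field(default_factory=list)
    buffer: List[int] = field(default_factory=list)
    arcs: Set[Arc] = field(default_factory=set)

    def is_terminal(self) -> bool:
        return self.stack == [0] and not self.buffer


def initial(n: int) -> Config:
    """Empty stack, buffer [0, 1, ..., n] with 0 = ROOT, no arcs."""
    return Config(buffer=list(range(n + 1)))


def shift(c: Config) -> None:
    if not c.buffer:
        raise ValueError("Shift needs a non-empty buffer")
    c.stack.append(c.buffer.pop(0))


def right_arc(c: Config, label: str) -> None:
    if len(c.stack) < 2:
        raise ValueError("Right-Arc needs two stack items")
    j = c.stack.pop()                 # dependent is removed
    c.arcs.add((c.stack[-1], label, j))


def left_arc(c: Config, label: str) -> None:
    if len(c.stack) < 2 or c.stack[-2] == 0:
        raise ValueError("Left-Arc needs two stack items and i != 0")
    j = c.stack.pop()
    i = c.stack.pop()                 # dependent is removed
    c.stack.append(j)
    c.arcs.add((j, label, i))


def swap(c: Config) -> None:
    if len(c.stack) < 2 or not 0 < c.stack[-2] < c.stack[-1]:
        raise ValueError("Swap needs 0 < i < j")
    j = c.stack.pop()
    i = c.stack.pop()
    c.stack.append(j)
    c.buffer.insert(0, i)             # wi goes back to the front of the buffer


# Demo on an invented non-projective tree: ROOT -> 2, 2 -> 1, 1 -> 3.
c = initial(3)
for _ in range(4):
    shift(c)                  # stack [0, 1, 2, 3]
swap(c)                       # stack [0, 1, 3], buffer [2]
right_arc(c, "dep")           # arc 1 -> 3
shift(c)                      # stack [0, 1, 2]
left_arc(c, "dep")            # arc 2 -> 1
right_arc(c, "root")          # arc 0 -> 2
print(c.is_terminal(), sorted(c.arcs))
```

In the demo, Swap moves word 2 back to the buffer so that words 1 and 3 become adjacent on the stack. That single reordering step is what lets an arc-standard style system attach a dependent across an intervening word.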

Advanced Dependency Parsing 33(36)

slide-82
SLIDE 82

Advanced Transition-Based Parsing Techniques

Example Transition Sequence

Sentence: ROOT A hearing is scheduled on the issue today .
POS tags: ROOT det noun verb verb prep det noun adv .

Configuration after each transition (stack [...]S, buffer [...]B):

 0. (initial)          [ ]S [ROOT, A, hearing, is, scheduled, on, the, issue, today, .]B
 1. Shift              [ROOT]S [A, hearing, is, scheduled, on, the, issue, today, .]B
 2. Shift              [ROOT, A]S [hearing, is, scheduled, on, the, issue, today, .]B
 3. Shift              [ROOT, A, hearing]S [is, scheduled, on, the, issue, today, .]B
 4. Left-Arc(det)      [ROOT, hearing]S [is, scheduled, on, the, issue, today, .]B
 5. Shift              [ROOT, hearing, is]S [scheduled, on, the, issue, today, .]B
 6. Shift              [ROOT, hearing, is, scheduled]S [on, the, issue, today, .]B
 7. Left-Arc(aux)      [ROOT, hearing, scheduled]S [on, the, issue, today, .]B
 8. Shift              [ROOT, hearing, scheduled, on]S [the, issue, today, .]B
 9. Shift              [ROOT, hearing, scheduled, on, the]S [issue, today, .]B
10. Shift              [ROOT, hearing, scheduled, on, the, issue]S [today, .]B
11. Left-Arc(det)      [ROOT, hearing, scheduled, on, issue]S [today, .]B
12. Right-Arc(pobj)    [ROOT, hearing, scheduled, on]S [today, .]B
13. Swap               [ROOT, hearing, on]S [scheduled, today, .]B
14. Right-Arc(prep)    [ROOT, hearing]S [scheduled, today, .]B
15. Shift              [ROOT, hearing, scheduled]S [today, .]B
16. Left-Arc(nsubj)    [ROOT, scheduled]S [today, .]B
17. Shift              [ROOT, scheduled, today]S [.]B
18. Right-Arc(tmod)    [ROOT, scheduled]S [.]B
19. Shift              [ROOT, scheduled, .]S [ ]B
20. Right-Arc(p)       [ROOT, scheduled]S [ ]B
21. Right-Arc(root)    [ROOT]S [ ]B

Advanced Dependency Parsing 34(36)
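The example sequence can be replayed mechanically to check that it ends in a terminal configuration with the intended tree. The sketch below is illustrative only: the function `step`, the tuple encoding of transitions, and the list `SEQUENCE` (inferred from consecutive configurations in the example) are assumptions, not material from the lecture.

```python
from typing import FrozenSet, List, Optional, Tuple

WORDS = ["ROOT", "A", "hearing", "is", "scheduled", "on", "the", "issue", "today", "."]
Arc = Tuple[int, str, int]                 # (head, label, dependent)
Move = Tuple[str, Optional[str]]           # (transition name, arc label or None)
State = Tuple[List[int], List[int], FrozenSet[Arc]]

# Assumption: transitions inferred from the successive configurations in the example.
SEQUENCE: List[Move] = [
    ("Shift", None), ("Shift", None), ("Shift", None), ("Left-Arc", "det"),
    ("Shift", None), ("Shift", None), ("Left-Arc", "aux"),
    ("Shift", None), ("Shift", None), ("Shift", None), ("Left-Arc", "det"),
    ("Right-Arc", "pobj"), ("Swap", None), ("Right-Arc", "prep"),
    ("Shift", None), ("Left-Arc", "nsubj"), ("Shift", None), ("Right-Arc", "tmod"),
    ("Shift", None), ("Right-Arc", "p"), ("Right-Arc", "root"),
]


def step(state: State, move: Move) -> State:
    """Apply one transition and return the new (stack, buffer, arcs)."""
    stack, buffer, arcs = state
    name, label = move
    if name == "Shift":
        return stack + buffer[:1], buffer[1:], arcs
    i, j = stack[-2], stack[-1]
    if name == "Right-Arc":
        return stack[:-1], buffer, arcs | {(i, label, j)}
    if name == "Left-Arc" and i != 0:
        return stack[:-2] + [j], buffer, arcs | {(j, label, i)}
    if name == "Swap" and 0 < i < j:
        return stack[:-2] + [j], [i] + buffer, arcs
    raise ValueError(f"transition not permitted: {move}")


trace: List[State] = [([], list(range(len(WORDS))), frozenset())]
for move in SEQUENCE:
    trace.append(step(trace[-1], move))

for k, (stack, buffer, _) in enumerate(trace):
    s = ", ".join(WORDS[t] for t in stack)
    b = ", ".join(WORDS[t] for t in buffer)
    print(f"{k:2d}. [{s}]S [{b}]B")

final_stack, final_buffer, final_arcs = trace[-1]
heads = {dep: head for head, _, dep in final_arcs}
print(heads)
```

The single Swap at step 13 is the only point where the word order on the stack departs from the surface order. Every other step is an ordinary arc-standard move.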

slide-104
SLIDE 104

Advanced Transition-Based Parsing Techniques

Analysis

◮ Correctness:
  ◮ Sound and complete for the class of non-projective trees
◮ Complexity for greedy or beam search parsing:
  ◮ Quadratic running time in the worst case
  ◮ Linear running time in the average case
◮ Works well with beam search

              Czech           German
              LAS    UAS      LAS    UAS
Projective    80.8   86.3     86.2   88.5
Reordering    83.9   89.1     88.7   90.9

[Bohnet and Nivre 2012]
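The quadratic worst case can be made concrete by counting transitions. The following is a sketch of the argument, assuming the transition system with Swap as defined earlier in the lecture, with n the number of words excluding ROOT and s the number of Swap transitions in a complete derivation (both symbols are introduced here for illustration). Every token, including ROOT, is shifted once, plus once more for each time a Swap returns a token to the buffer. Every word receives exactly one head. Each pair of words can be swapped at most once, because Swap requires 0 < i < j and reverses the pair.

```latex
% n = number of words excluding ROOT, s = number of Swap transitions
\begin{align*}
\#\text{Shift} &= (n + 1) + s \\
\#\text{Left-Arc} + \#\text{Right-Arc} &= n \\
\#\text{transitions} &= 2n + 1 + 2s \\
s \le \binom{n}{2} = \frac{n(n-1)}{2}
  \quad&\Longrightarrow\quad
  \#\text{transitions} \le n^2 + n + 1 = O(n^2)
\end{align*}
```

For the example sentence, n = 9 and s = 1, which gives 2 · 9 + 1 + 2 · 1 = 21 transitions. When swaps are rare the count stays close to 2n + 1, which is consistent with the linear average case reported on the slide.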

Advanced Dependency Parsing 35(36)

slide-105
SLIDE 105

Conclusion

Conclusion

◮ Graph-based and transition-based parsing have complementary strengths and weaknesses
◮ Many recent developments can be understood in this light:
  ◮ Graph-based: Increase feature scope (higher order models) while keeping learning and inference tractable
  ◮ Transition-based: Improve learning and inference (beam search, dynamic oracles) without sacrificing efficiency
◮ Convergence: global learning, rich features, heuristic search
◮ And then there is this thing called deep learning ...

Advanced Dependency Parsing 36(36)

slide-106
SLIDE 106

Conclusion

References and Further Reading

◮ Giuseppe Attardi. 2006. Experiments with a multilanguage non-projective dependency parser. In Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL), pages 166–170.
◮ Bernd Bohnet and Joakim Nivre. 2012. A transition-based system for joint part-of-speech tagging and labeled non-projective dependency parsing. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 1455–1465.
◮ Xavier Carreras. 2007. Experiments with a higher-order projective dependency parser. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 957–961.
◮ Michael A. Covington. 2001. A fundamental algorithm for dependency parsing. In Proceedings of the 39th Annual ACM Southeast Conference, pages 95–102.
◮ Yoav Goldberg and Joakim Nivre. 2012. A dynamic oracle for arc-eager dependency parsing. In Proceedings of COLING 2012, pages 959–976.

Advanced Dependency Parsing 36(36)

slide-107
SLIDE 107

Conclusion

◮ Richard Johansson and Pierre Nugues. 2006. Investigating multilingual dependency parsing. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL), pages 206–210.
◮ Terry Koo and Michael Collins. 2010. Efficient third-order dependency parsers. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 1–11. Association for Computational Linguistics.
◮ Xuezhe Ma and Hai Zhao. 2012. Fourth-order dependency parsing. In Proceedings of the Conference on Computational Linguistics (COLING), pages 785–796.
◮ Ryan McDonald and Joakim Nivre. 2007. Characterizing the errors of data-driven dependency parsing models. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and the Conference on Computational Natural Language Learning (EMNLP-CoNLL).
◮ Ryan McDonald and Fernando Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 81–88.
◮ Joakim Nivre and Jens Nilsson. 2005. Pseudo-projective dependency parsing. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pages 99–106.

Advanced Dependency Parsing 36(36)

slide-108
SLIDE 108

Conclusion

◮ Joakim Nivre, Marco Kuhlmann, and Johan Hall. 2009. An improved oracle for dependency parsing with online reordering. In Proceedings of the 11th International Conference on Parsing Technologies (IWPT'09), pages 73–76.
◮ Joakim Nivre. 2007. Incremental non-projective dependency parsing. In Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), pages 396–403.
◮ Joakim Nivre. 2009. Non-projective dependency parsing in expected linear time. In Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics (ACL), pages 351–359.
◮ Kateřina Veselá, Jiří Havelka, and Eva Hajičová. 2004. Condition of projectivity in the underlying dependency structures. In Proceedings of the 20th International Conference on Computational Linguistics (COLING), pages 289–295.
◮ Yue Zhang and Stephen Clark. 2008. A tale of two parsers: Investigating and combining graph-based and transition-based dependency parsing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 562–571.

Advanced Dependency Parsing 36(36)

slide-109
SLIDE 109

Conclusion

◮ Yue Zhang and Joakim Nivre. 2012. Analyzing the effect of global learning and beam-search on transition-based dependency parsing. In Proceedings of COLING 2012: Posters, pages 1391–1400.

Advanced Dependency Parsing 36(36)