slide-1
SLIDE 1

Dependency Parsing with Bounded Block Degree and Well-nestedness via Lagrangian Relaxation and Branch-and-Bound

Caio Corro, Joseph Le Roux, Mathieu Lacroix, Antoine Rozenknop and Roberto Wolfler-Calvo

August 7-12

Université Paris 13 – LIPN

This work is supported by a public grant overseen by the French National Research Agency (ANR) as part of the Investissements d’Avenir program (ANR-10-LABX-0083).

slide-2
SLIDE 2

Dependency trees

  • Each word of the sentence is associated with a vertex
  • Dependency tree: spanning tree rooted at 0

Example sentence (vertices 0 to 6, with the root * at position 0): * They solved the problem with statistics

Dependency parsing

  • Set of valid dependency trees for sentence x: $\mathcal{Y}_x$
  • Arc-factored model: $\mathrm{score}(y) = \sum_{a \in y} \mathrm{score}(a)$
  • Dependency parsing: $\hat{y}_x = \arg\max_{y \in \mathcal{Y}_x} \mathrm{score}(y)$
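As a minimal sketch of the arc-factored objective, the brute-force decoder below enumerates every head assignment, keeps those that form a tree rooted at 0, and returns the highest-scoring one. The function names and the toy score matrix are hypothetical and only serve as illustration. Enumeration is exponential in the sentence length and is not the decoding method of this work.

```python
from itertools import product

def is_tree(heads):
    """heads[m - 1] is the head of word m (1..n); vertex 0 is the root."""
    n = len(heads)
    for m in range(1, n + 1):
        seen, v = set(), m
        while v != 0:              # climb towards the root
            if v in seen:          # a cycle (or self-loop) was found
                return False
            seen.add(v)
            v = heads[v - 1]
    return True

def tree_score(heads, score):
    """Arc-factored score: sum of score[head][modifier] over all arcs."""
    return sum(score[h][m] for m, h in enumerate(heads, start=1))

def parse_brute_force(score):
    """Return the best head assignment for an (n+1) x (n+1) score matrix."""
    n = len(score) - 1
    candidates = (h for h in product(range(n + 1), repeat=n) if is_tree(h))
    return max(candidates, key=lambda h: tree_score(h, score))

# Toy scores for a two-word sentence (hypothetical values).
toy_score = [[0, 1, 1],
             [0, 0, 5],
             [0, 0, 0]]
best = parse_brute_force(toy_score)
```

Here `best` assigns the root as head of word 1 and word 1 as head of word 2, because the arc 1 → 2 carries the largest score.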

slide-4
SLIDE 4

Dependency trees

  • Each word of the sentence is associated with a vertex
  • Dependency tree: spanning tree rooted at 0

Example sentence (vertices 0 to 6, with the root * at position 0): * They solved the problem with statistics

Structural properties [Bodirsky et al. 2009; Kuhlmann 2010]

[Figure: classes of dependency trees, labelled Non-projective, Projective, k-Bounded Block Degree and Well-nested]

slide-8
SLIDE 8

Distribution of dependency tree characteristics

Share of dependency trees (%) by block degree (BD) and well-nestedness (WN):

          English (PTB/LTH)    German (SPMRL)       Dutch (UD)
          WN       not WN      WN       not WN      WN       not WN
BD 1      92.26    -           67.60    -           69.13    -
BD 2       7.58    0.12        27.12    0.79        28.50    0.08
BD 3       0.12    0.01         3.86    0.30         2.24    0.01
BD 4       0.00    0.00         0.19    <0.01        0.04    0.00
BD > 4     0.00    0.00         0.11    <0.01        0.00    0.00

          Spanish (UD)         Portuguese (UD)
          WN       not WN      WN       not WN
BD 1      93.95    -           81.56    -
BD 2       5.99    0.04        13.92    0.05
BD 3       0.02    0.00         3.76    0.02
BD 4       0.00    0.00         0.54    0.00
BD > 4     0.00    0.00         0.14    0.00

Legend (colors refer to the cell shading of the original slide):

  • Blue: Projective dependency trees
  • Blue + Purple: ≈ 99% of the dependency trees
  • Blue + Purple + Red: Non-projective dependency trees

slide-9
SLIDE 9

Motivations

Observation

  • Projective parsing: does not cover the datasets correctly
  • Non-projective parsing: can produce invalid structures

Problem

  • WN and k-BBD parsing: no tractable algorithm

Contribution

  • First efficient parsing algorithm, based on Lagrangian Relaxation

slide-10
SLIDE 10

Outline

  • 1. Introduction
  • 2. Dependency tree characterization
  • 3. Existing parsing algorithms
  • 4. Novel characterization based on arc-sets
  • 5. Efficient parsing with fine-grained constraints
  • 6. Experiments
  • 7. Conclusion

slide-16
SLIDE 16

Yield

Yield of a node v: set of all nodes reachable from v (including v itself)

[Figure: dependency tree over the words s0 ... s4]

  • Yield(0) = {0, 1, 2, 3, 4}
  • Yield(1) = {1}
  • Yield(2) = {1, 2, 3, 4}
  • Yield(3) = {3}
  • Yield(4) = {3, 4}
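The yield definition can be sketched in a few lines. The head-vector encoding and the function name are assumptions made for this illustration. The arcs 0 → 2, 2 → 1, 2 → 4 and 4 → 3 are inferred from the yields listed on the slide, since the drawing itself is not available.

```python
def yields(heads):
    """heads[v] is the head of vertex v; heads[0] is None for the root."""
    result = {v: {v} for v in range(len(heads))}
    for v in range(1, len(heads)):
        u = heads[v]
        while u is not None:       # add v to the yield of every ancestor
            result[u].add(v)
            u = heads[u]
    return result

# Arcs 0->2, 2->1, 2->4, 4->3 (inferred from the listed yields).
example_heads = [None, 2, 0, 4, 2]
example_yields = yields(example_heads)
```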

slide-18
SLIDE 18

Structural properties of dependencies

Projective dependency trees ⇒ Trees with contiguous yields only

[Figure: projective dependency tree over the words s0 ... s4]

Non-projective dependency trees ⇒ Unconstrained trees

[Figure: non-projective dependency tree over the words s0 ... s4]

slide-19
SLIDE 19

Example: Projective dependency trees

  • English: * They solved the problem with statistics
  • Dutch: * Dit effect is nu bezig te verdwijnen ("This effect is now disappearing")

slide-20
SLIDE 20

Example: Non-projective dependency trees

  • English, surrounding argument: * The man , they say , was tall .
  • Dutch, cross-serial dependencies: ... dat Jan Piet de kinderen zag helpen zwemmen ("... that Jan saw Piet help the children swim")

slide-27
SLIDE 27

Structural properties (1/2): k-BBD

k-Bounded Block Degree (k-BBD)

  • BD of a vertex: number of contiguous intervals described by its yield
  • BD of a tree: the maximal block degree of its vertices
  • k-BBD tree: tree with a BD less than or equal to k

Example: tree of block degree 2 over the words s0 ... s4

  • Yield(0) = [0 . . . 4], BD(0) = 1
  • Yield(1) = [1] ∪ [4], BD(1) = 2
  • Yield(2) = [2 . . . 3], BD(2) = 1
  • Yield(3) = [3], BD(3) = 1
  • Yield(4) = [4], BD(4) = 1
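The block degree definitions can be checked mechanically. The sketch below uses a hypothetical head-vector encoding and hypothetical function names. The arcs 0 → 1, 0 → 2, 1 → 4 and 2 → 3 are inferred from the yields of the example, since the drawing is not available.

```python
def yields(heads):
    """heads[v] is the head of vertex v; heads[0] is None for the root."""
    result = {v: {v} for v in range(len(heads))}
    for v in range(1, len(heads)):
        u = heads[v]
        while u is not None:       # add v to the yield of every ancestor
            result[u].add(v)
            u = heads[u]
    return result

def block_degree_of_set(vertices):
    """Number of maximal contiguous intervals in a set of positions."""
    s = sorted(vertices)
    return 1 + sum(1 for a, b in zip(s, s[1:]) if b != a + 1)

def block_degree(heads):
    """Block degree of a tree: maximum block degree over its vertices."""
    return max(block_degree_of_set(y) for y in yields(heads).values())

# Arcs 0->1, 0->2, 1->4, 2->3 (inferred from the yields of the example).
bd2_heads = [None, 0, 0, 2, 1]
```

A tree is k-BBD when `block_degree(heads) <= k`, so the example is 2-BBD but not 1-BBD.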

slide-28
SLIDE 28

Structural properties (2/2): WN

Well-nestedness (WN)

  • Interleaving sets I1, I2: there exist i, j ∈ I1 and k, l ∈ I2 such that i < k < j < l
  • Well-nested tree: does not contain two vertices whose yields are disjoint and interleave

[Figure: a well-nested tree and a tree that is not well-nested, both over the words s0 ... s4]
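A direct, unoptimized check of well-nestedness can be sketched as follows. The encoding, the function names and the two four-word head vectors are hypothetical and chosen for illustration. The interleaving test is applied in both orders because the pair of vertices is unordered.

```python
from itertools import combinations

def yields(heads):
    """heads[v] is the head of vertex v; heads[0] is None for the root."""
    result = {v: {v} for v in range(len(heads))}
    for v in range(1, len(heads)):
        u = heads[v]
        while u is not None:       # add v to the yield of every ancestor
            result[u].add(v)
            u = heads[u]
    return result

def interleave(a, b):
    """True if i < k < j < l for some i, j in a and k, l in b."""
    return any(i < k < j < l for i in a for j in a for k in b for l in b)

def is_well_nested(heads):
    ys = list(yields(heads).values())
    return not any(
        a.isdisjoint(b) and (interleave(a, b) or interleave(b, a))
        for a, b in combinations(ys, 2)
    )

# Yield(1) = {1, 3} and Yield(2) = {2, 4} interleave: not well-nested.
crossing_heads = [None, 0, 0, 1, 2]
# Yield(1) = {1, 3}, Yield(2) = {2}, Yield(4) = {4}: well-nested.
nested_heads = [None, 0, 0, 1, 0]
```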

slide-29
SLIDE 29

Parsing algorithms

Complexity (arc-factored)

  Non-projective        $O(n^2)$               [McDonald et al. 2005]
  Projective            $O(n^3)$               [Eisner 2000]
  WN + 2-BBD            $O(n^7)$               [Gómez-Rodríguez et al. 2009]
  WN + k-BBD, k ≥ 2     $O(n^{5+2(k-1)})$      [Gómez-Rodríguez et al. 2009]

Remark

  • Projective ⇔ 1-BBD and WN

Tractability

  • Non-projective and projective: tractable
  • WN + k-BBD: polynomial, but not tractable in practice
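As a quick worked check of the general bound, substituting small values of k into the exponent 5 + 2(k − 1) gives:

```latex
\begin{align*}
k = 2:&\quad 5 + 2(2 - 1) = 7 \;\Rightarrow\; O(n^{7}) \\
k = 3:&\quad 5 + 2(3 - 1) = 9 \;\Rightarrow\; O(n^{9})
\end{align*}
```

The case k = 2 agrees with the WN + 2-BBD row. For a hypothetical sentence of n = 20 words, n^3 = 8 × 10^3 whereas n^7 = 1.28 × 10^9, which illustrates why these algorithms are considered impractical.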

slide-32
SLIDE 32

Non-projective dependency parsing

Integer Linear Program for non-projective parsing

$z \in \mathbb{R}^A$: incidence vector such that $z_a = 1$ iff arc $a$ is in the tree.

\begin{align}
\max_{z} \quad & \sum_{a \in A} \mathrm{score}(a) \times z_a && \text{Arc-factored model} \tag{1} \\
\text{s.t.} \quad & \sum_{a \in \delta^{in}(v)} z_a = 1 \quad \forall v \in V^+ && \text{One head/word} \tag{2} \\
& \sum_{a \in \delta^{in}(W)} z_a \geq 1 \quad \forall W \subseteq V^+ && \text{Connectedness} \tag{3} \\
& z \in \{0, 1\}^A && \text{Integrality} \tag{4}
\end{align}

Efficient decoding

In practice: (directed) Maximum Spanning Tree (MST) algorithm [Schrijver 2003; McDonald et al. 2005]
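To make constraints (2) and (3) concrete, the sketch below tests them on a candidate arc set by explicit enumeration. The representation of arcs as (head, modifier) pairs and the function name are assumptions, and W is restricted to non-empty subsets. The enumeration is exponential and only meant for tiny examples; it is not the MST algorithm mentioned on the slide.

```python
from itertools import combinations

def satisfies_tree_constraints(arcs, n):
    """arcs: set of (head, modifier) pairs over vertices 0..n, root 0."""
    words = range(1, n + 1)
    # Constraint (2): exactly one incoming arc per word.
    if any(sum(1 for (_, m) in arcs if m == v) != 1 for v in words):
        return False
    # Constraint (3): every non-empty W has an arc entering it from outside.
    for size in range(1, n + 1):
        for W in combinations(words, size):
            if not any(h not in W and m in W for (h, m) in arcs):
                return False
    return True

tree_arcs = {(0, 2), (2, 1)}       # a valid tree over two words
cycle_arcs = {(1, 2), (2, 1)}      # one head per word, but a cycle
```

The second arc set satisfies (2) but violates (3) for W = {1, 2}, which is exactly the situation the connectedness constraints rule out.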

slide-33
SLIDE 33

Non-projective dependency parsing

Integer Linear Program for non-projective parsing: objective (1) and constraints (2)–(4), as on the previous slide.

Problem enhancement ⇒ Integrating fine-grained structural constraints?

slide-39
SLIDE 39

k-Bounded Block Degree Constraint

Definition

$\mathcal{W}_{\geq k+1}$: vertex subsets describing at least k + 1 non-adjacent intervals

Example with k = 2 and {1, 3, 5} ∈ $\mathcal{W}_{\geq 3}$

[Figure: two trees over the vertices 1 ... 5, one that is not 2-BBD and one that is 2-BBD; the highlighted arcs (→) are the arcs adjacent to the vertex subset {1, 3, 5}]

Constraint

For each vertex subset $W \in \mathcal{W}_{\geq k+1}$ ⇒ at least two adjacent arcs
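The arc-based constraint can be tested by brute force on small trees. In the sketch below, the function names and both five-word arc sets are hypothetical, since the arcs of the original drawings are not available. An arc counts as adjacent to W when exactly one of its endpoints lies in W.

```python
from itertools import combinations

def num_intervals(W):
    """Number of maximal contiguous intervals in a set of positions."""
    s = sorted(W)
    return 1 + sum(1 for a, b in zip(s, s[1:]) if b != a + 1)

def adjacent_arcs(arcs, W):
    """Arcs with exactly one endpoint in W (entering or leaving W)."""
    return [(h, m) for (h, m) in arcs if (h in W) != (m in W)]

def satisfies_kbbd(arcs, n, k):
    """Check the constraint for every subset with at least k + 1 intervals."""
    words = range(1, n + 1)
    for size in range(k + 1, n + 1):
        for W in combinations(words, size):
            if num_intervals(W) >= k + 1 and len(adjacent_arcs(arcs, set(W))) < 2:
                return False
    return True

# {1, 3, 5} is the yield of vertex 1: only the arc 0->1 is adjacent to it.
not_2bbd = {(0, 1), (1, 3), (1, 5), (0, 2), (0, 4)}
# Here the arcs 0->1 and 0->5 are both adjacent to {1, 3, 5}.
is_2bbd = {(0, 1), (1, 3), (0, 5), (0, 2), (0, 4)}
```

The intuition behind the check is that a vertex subset is the yield of some vertex exactly when a single arc of the tree is adjacent to it, so requiring two adjacent arcs forbids yields with more than k intervals.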

slide-41
SLIDE 41

Well-nestedness constraint

Definition

$\mathcal{I}$: family of pairs of disjoint interleaving vertex subsets

Example with ({1, 3}, {2, 4}) ∈ $\mathcal{I}$

  • Not well-nested: Yield(1) = {1, 3}, Yield(2) = {2, 4}, and 1 < 2 < 3 < 4
  • Well-nested: Yield(1) = {1, 3}, Yield(2) = {2}, Yield(4) = {4}

slide-42
SLIDE 42

Well-nestedness constraint

Definition

$\mathcal{I}$: family of pairs of disjoint interleaving vertex subsets

Example with ({1, 3}, {2, 4}) ∈ $\mathcal{I}$

[Figure: a tree over the vertices 1 ... 4 that is not well-nested and one that is well-nested]

Constraint

For each pair $(I_1, I_2) \in \mathcal{I}$ ⇒ at least two adjacent arcs for $I_1$ or $I_2$
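The well-nestedness constraint can be sketched in the same brute-force style: for every pair of disjoint interleaving subsets, the two sets together must have at least three adjacent arcs. The function names and both four-word arc sets are hypothetical; the arc sets are chosen to be consistent with the yields Yield(1) = {1, 3} and Yield(2) = {2, 4} of the example.

```python
from itertools import combinations

def interleave(a, b):
    """True if i < k < j < l for some i, j in a and k, l in b."""
    return any(i < k < j < l for i in a for j in a for k in b for l in b)

def n_adjacent(arcs, W):
    """Number of arcs with exactly one endpoint in W."""
    return sum(1 for (h, m) in arcs if (h in W) != (m in W))

def satisfies_wn(arcs, n):
    words = range(1, n + 1)
    # Interleaving needs two elements per set, so smaller subsets are skipped.
    subsets = [set(c) for r in range(2, n + 1) for c in combinations(words, r)]
    for I1, I2 in combinations(subsets, 2):
        if I1.isdisjoint(I2) and (interleave(I1, I2) or interleave(I2, I1)):
            if n_adjacent(arcs, I1) + n_adjacent(arcs, I2) < 3:
                return False
    return True

# Both {1, 3} and {2, 4} are yields: one adjacent arc each, 2 in total.
not_wn_arcs = {(0, 1), (1, 3), (0, 2), (2, 4)}
# {2, 4} is no longer a yield: the arcs 0->2 and 0->4 are both adjacent to it.
wn_arcs = {(0, 1), (1, 3), (0, 2), (0, 4)}
```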

slide-43
SLIDE 43

Full ILP: parsing with k-BBD and WN constraints

\begin{align}
\max_{z} \quad & \sum_{a \in A} \mathrm{score}(a) \times z_a && \text{Arc-factored} \tag{5} \\
\text{s.t.} \quad & z \in \mathcal{Z} && \text{Non-projective} \tag{6} \\
& \sum_{a \in \delta(W)} z_a \geq 2 \quad \forall W \in \mathcal{W}_{\geq k+1} && k\text{-BBD} \tag{7} \\
& \sum_{a \in \delta(I_1)} z_a + \sum_{a \in \delta(I_2)} z_a \geq 3 \quad \forall (I_1, I_2) \in \mathcal{I} && \text{WN} \tag{8}
\end{align}

Problem

  • MST: k-BBD and WN constraints cannot be integrated
  • Generic solver: exponential number of constraints
  • Polynomial algorithm: intractable in practice [Gómez-Rodríguez et al. 2009]

Solving the ILP ⇒ Lagrangian Relaxation applied to constraints (7)–(8)

slide-44
SLIDE 44

Lagrangian Relaxation

Lagrangian Dual Problem

$\min_{u \geq 0} \max_{z \in \mathcal{Z}} L(z, u)$

Efficient minimization of the dual

  • Min: Subgradient descent
  • Max: Maximum Spanning Tree
  • Many relaxed constraints: Non Delayed Relax-and-Cut

Efficient maximization of the primal

  • Branch-and-Bound
  • Problem reduction (exact pruning technique)
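The slides do not spell out L(z, u). A standard form, assuming one non-negative multiplier $u_W$ per k-BBD constraint (7) and one multiplier $u_{I_1,I_2}$ per WN constraint (8), would be the following. The multiplier names, the step size $\eta_t$ and the iterate $z^t$ are notation introduced here for illustration only.

```latex
\begin{align*}
L(z, u) ={}& \sum_{a \in A} \mathrm{score}(a) \, z_a
  + \sum_{W \in \mathcal{W}_{\geq k+1}} u_W \Big( \sum_{a \in \delta(W)} z_a - 2 \Big) \\
 &+ \sum_{(I_1, I_2) \in \mathcal{I}} u_{I_1, I_2}
    \Big( \sum_{a \in \delta(I_1)} z_a + \sum_{a \in \delta(I_2)} z_a - 3 \Big) \\[4pt]
u_W \leftarrow{}& \max\Big( 0,\; u_W - \eta_t \Big( \sum_{a \in \delta(W)} z^t_a - 2 \Big) \Big)
\end{align*}
```

Under these assumptions, L(z, u) is linear in z for a fixed u, so the inner maximization over $\mathcal{Z}$ reduces to a Maximum Spanning Tree problem with reweighted arc scores. The second line is one projected subgradient step for a k-BBD multiplier, where $z^t$ is the tree found at iteration t: a violated constraint makes the bracket negative and therefore increases $u_W$.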

slide-45
SLIDE 45

Distribution of dependency tree characteristics

Share of dependency trees (%) by block degree (BD) and well-nestedness (WN):

          English (PTB/LTH)    German (SPMRL)       Dutch (UD)
          WN       not WN      WN       not WN      WN       not WN
BD 1      92.26    -           67.60    -           69.13    -
BD 2       7.58    0.12        27.12    0.79        28.50    0.08
BD 3       0.12    0.01         3.86    0.30         2.24    0.01
BD 4       0.00    0.00         0.19    <0.01        0.04    0.00
BD > 4     0.00    0.00         0.11    <0.01        0.00    0.00

          Spanish (UD)         Portuguese (UD)
          WN       not WN      WN       not WN
BD 1      93.95    -           81.56    -
BD 2       5.99    0.04        13.92    0.05
BD 3       0.02    0.00         3.76    0.02
BD 4       0.00    0.00         0.54    0.00
BD > 4     0.00    0.00         0.14    0.00

Legend (colors refer to the cell shading of the original slide):

  • Blue: Projective dependency trees
  • Blue + Purple: ≈ 99% of the dependency trees

slide-46
SLIDE 46

UAS (Ratio of correct arcs)

                   English       German        Dutch         Spanish    Portuguese
Non-projective     89.5          87.7          77.4          83.4       83.2
Projective         89.8          86.9          76.6          83.5       83.1
This work          89.4          87.7          77.3          83.3       83.1
  (constraints)    3-BBD + WN    2-BBD + WN    3-BBD + WN    3-BBD      2-BBD + WN
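For reference, UAS can be computed as in the short sketch below, which assumes that gold and predicted trees are given as non-empty, equally long lists of head indices, one per word. The function name and the example head lists are hypothetical.

```python
def uas(gold_heads, predicted_heads):
    """Unlabeled attachment score: percentage of words with the correct head."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("gold and predicted trees must have the same length")
    correct = sum(g == p for g, p in zip(gold_heads, predicted_heads))
    return 100.0 * correct / len(gold_heads)

# Four words, one wrong head (hypothetical example).
example_uas = uas([2, 0, 4, 2], [2, 0, 2, 2])
```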

slide-47
SLIDE 47

Efficiency: Relative parsing time

[Bar chart: relative parsing time (y-axis, ticks from 10 to 60) for English, German, Dutch, Spanish and Portuguese; series: Arc-factored & Non-projective, This work, Turbo Parser 2nd order, Turbo Parser 3rd order]

slide-49
SLIDE 49

Conclusion: k-BBD and WN dependency parsing

Our contribution

  • Novel characterization based on arc sets only
  • The first efficient and flexible algorithm:
      • k-BBD with arbitrary k
      • Tunable for different languages/properties
      • WN optional
  • First experimental results with k-BBD and WN parsing

Surprising observation

  • Does not improve UAS under an arc-factored model

Perspectives

  • LTAG derivation parsing (2-BBD and WN)
  • Parsing lexicalized mildly context-sensitive languages

slide-51
SLIDE 51

References I

Bodirsky, Manuel, Marco Kuhlmann, and Mathias Möhl (2009). “Well-nested drawings as models of syntactic structure”. In: Tenth Conference on Formal Grammar and Ninth Meeting on Mathematics of Language, pp. 195–203.

Eisner, Jason (2000). “Bilexical grammars and their cubic-time parsing algorithms”. In: Advances in probabilistic and other parsing technologies. Springer, pp. 29–61.

Gómez-Rodríguez, Carlos, David Weir, and John Carroll (2009). “Parsing mildly non-projective dependency structures”. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, pp. 291–299.
slide-52
SLIDE 52

References II

Koo, Terry et al. (2010). “Dual decomposition for parsing with non-projective head automata”. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 1288–1298.

Kuhlmann, Marco (2010). Dependency Structures and Lexicalized Grammars: An Algebraic Approach. Vol. 6270. Springer.

Lemaréchal, Claude (2001). “Lagrangian relaxation”. In: Computational combinatorial optimization. Springer, pp. 112–156.

McDonald, Ryan et al. (2005). “Non-projective dependency parsing using spanning tree algorithms”. In: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 523–530.

Schrijver, A. (2003). Combinatorial Optimization - Polyhedra and Efficiency. Springer.
slide-53
SLIDE 53

Lagrangian Relaxation: Optimality Rate

[Plot: optimality rate (y-axis, 0.96 to 1) against the number of iterations (x-axis, 50 to 200); blue: English, red: German; solid lines: with certificate, dashed lines: without]