SLIDE 1

Transforming Dependency Structures to LTAG Derivation Trees

13th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+13)

Caio Corro, Joseph Le Roux
September 1, 2017

Laboratoire Informatique de Paris Nord (LIPN), Université Paris 13 (France), CNRS UMR 7030

SLIDE 2

Introduction

SLIDE 3

Lexicalized Tree Adjoining Grammar (LTAG)

Why LTAGs?

  • Constituency structure
  • Linguistically plausible
  • Built-in bi-lexical relations
  • Deep syntax

Weighted grammars

  • Disambiguation/Preference
  • Robustness:
      • Unknown words
      • Errors

[Figure: the elementary tree anchored by "walks" (nodes S, NP, VP, VBZ, NP), shown once with the NP tree anchored by "dog" and once with the NP tree anchored by "river"]

SLIDE 4

LTAG parsing

CKY-type algorithm

  • Deduction-rule based
  • Bottom-up

Complexity: O(n^6 · max(n, g) · g · t), with:

  • n: sentence length
  • t: maximum number of nodes in an elementary tree
  • g: maximum ambiguity

⇒ O(n^7) asymptotically w.r.t. the sentence length [Eisner et al., 2000]

SLIDE 5

LTAG parsing problem

  • Lexical ambiguity
  • Combination constraints
  • Non-trivial dependency structure

SLIDE 6

Supertagging approach (1)

  • Lexical ambiguity
  • Combination constraints
  • Non-trivial dependency structure

SLIDE 12

Supertagging approach (2)

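She walks the dog

[Figure: supertags for this sentence: the elementary tree anchored by "walks" (nodes S, NP, VP, VBZ, NP) and the NP tree anchored by "dog" (nodes NP, NN)]

She walks, despite her hatred for quadruped mammals, the dog

[Figure: the supertags for "walks" and "dog" over the longer sentence, followed by a question mark]
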
SLIDE 13

Supertagging approach (3)

Pipeline

  1. Supertagging
  2. Constrained LTAG parsing

Downsides

  • Long-distance relationships
  • 2nd step complexity: O(n^7 · t)

⇒ No lexical ambiguity (g = 1)

SLIDE 14

Phrase structure tree VS Dependency tree

"... One should always distinguish the type of representation [...] from the content of the representation ..." [Rambow, 2010]

Syntactic content

  • Syntactic dependency
  • Syntactic phrase/constituency structure

Representation types

  • Dependency tree
  • Hierarchy structure tree

⇒ Syntactic phrase-structure parsing as a dependency structure parsing task

SLIDE 15

LTAG derivation tree

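[Figure: elementary trees: initial tree anchored by "walks" (S, NP, VP, VBZ, NP); auxiliary tree anchored by "deliberately" (VP, VP*, ADVP, RB); initial trees anchored by "She" (NP, PRP) and "dog" (NP, NN); auxiliary tree anchored by "the" (NP, NP*, DET)]

Bottom-up construction of the syntactic phrase structure

[Figure: derivation tree over "She deliberately walks the dog": vertices v1 ... v5 carry the elementary trees τ1 ... τ5 and the arcs are labelled with the Gorn addresses 1.1, 1.2, 1.2.2 and 1]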

Representation alternative: the LTAG derivation tree is a dependency tree [Rambow et al., 1997]
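
To make the "derivation tree as dependency tree" reading concrete, here is a minimal Python sketch. It stores one labelled arc per word of the example sentence. The names `Arc`, `DERIVATION` and `path_to_root` are hypothetical, and the pairing of each Gorn address with a particular arc is an assumption read off the flattened figure, not something the slide states explicitly.

```python
from typing import Dict, List, NamedTuple, Optional


class Arc(NamedTuple):
    head: Optional[str]      # word whose elementary tree is operated on (None for the root)
    address: Optional[str]   # Gorn address of the operation site in the head's tree


# Assumed reading of the figure: the pairing of labels with arcs is inferred.
DERIVATION: Dict[str, Arc] = {
    "walks": Arc(None, None),
    "She": Arc("walks", "1.1"),
    "deliberately": Arc("walks", "1.2"),
    "dog": Arc("walks", "1.2.2"),
    "the": Arc("dog", "1"),
}


def path_to_root(word: str) -> List[str]:
    """Follow head links from a word up to the root of the derivation tree."""
    path = [word]
    while DERIVATION[path[-1]].head is not None:
        path.append(DERIVATION[path[-1]].head)
    return path


print(path_to_root("the"))  # ['the', 'dog', 'walks']
```

Every word has exactly one head and one word has none, which is what makes the structure a dependency tree over the sentence.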

SLIDE 16

Proposed approach (1)

  • Lexical ambiguity
  • Combination constraints
  • Non-trivial dependency structure

SLIDE 19

Proposed approach (2)

[Figure: the unlabelled dependency tree over "She deliberately walks the dog" (vertices v1 ... v5) ⇒ the same tree with elementary trees τ1 ... τ5 on the vertices and Gorn addresses 1.1, 1.2, 1.2.2, 1.1 on the arcs]

Alternative pipeline

  1. Bi-lexical dependency parsing: long-distance relationships
  2. LTAG parse labeler

Downsides

  • 1st step complexity: O(n^7) [Gómez-Rodríguez et al., 2009]
    ⇒ Efficient decoding in practice via Lagrangian relaxation [Corro et al., 2016]
  • 2nd step complexity?
    ⇒ This contribution!

SLIDE 20

Table of contents

  1. Introduction
  2. Characterization of LTAG derivation trees
  3. Outline of the algorithm
  4. Complexity
  5. Conclusion

SLIDE 21

Characterization of LTAG derivation trees

SLIDE 22

LTAG derivation trees

Structural properties [Bodirsky et al., 2005]

  • Arborescence (directed tree)
  • 2-bounded block degree
  • Well-nestedness

2-bounded block degree

  • Maximum 1 gap in the yield of a sub-arborescence

⇒ Due to wrapping adjunction

Well-nestedness

  • Sub-arborescences must not interleave (not used in this presentation)
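
The well-nestedness condition can be checked directly from the head of each word. The following is a brute-force sketch, assuming the usual reading that two disjoint sub-arborescences interleave when their positions alternate. The function names and the `heads` encoding (vertex to parent, `None` for the root) are hypothetical.

```python
from itertools import combinations
from typing import Dict, Optional, Set


def yields(heads: Dict[int, Optional[int]]) -> Dict[int, Set[int]]:
    """Map each vertex to the set of vertices reachable from it (itself included)."""
    result = {v: {v} for v in heads}
    for v in heads:
        u = heads[v]
        while u is not None:
            result[u].add(v)
            u = heads[u]
    return result


def interleave(a: Set[int], b: Set[int]) -> bool:
    """True if positions of a and b alternate: x1 < y1 < x2 < y2, in either order."""
    def one_way(x: Set[int], y: Set[int]) -> bool:
        return any(x1 < y1 < x2 < y2
                   for x1 in x for x2 in x for y1 in y for y2 in y)
    return one_way(a, b) or one_way(b, a)


def is_well_nested(heads: Dict[int, Optional[int]]) -> bool:
    """No two disjoint sub-arborescences may interleave."""
    ys = yields(heads)
    return not any(ys[u].isdisjoint(ys[v]) and interleave(ys[u], ys[v])
                   for u, v in combinations(heads, 2))


# Yields {1, 4} and {2, 3} are nested, not interleaved.
print(is_well_nested({0: None, 1: 0, 2: 0, 3: 2, 4: 1}))  # True
# Yields {1, 3} and {2, 4} alternate.
print(is_well_nested({0: None, 1: 0, 2: 0, 3: 1, 4: 2}))  # False
```

The quadruple loop is only meant for illustration on short sentences; it makes no attempt at efficiency.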

SLIDE 28

Yield

Yield of a vertex v: set of all nodes reachable from v

[Figure: arborescence over positions s0 ... s4, rooted at vertex 0]

  • Yield(0) = {0, 1, 2, 3, 4}
  • Yield(1) = {1}
  • Yield(2) = {1, 2, 3, 4}
  • Yield(3) = {3}
  • Yield(4) = {1, 3, 4}
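
The yields listed on this slide can be reproduced with a short recursive traversal. This is a sketch under one assumption: the arcs 0 → 2, 2 → 4, 4 → 1 and 4 → 3 are inferred from the listed yields, since the figure itself did not survive. The name `compute_yields` is hypothetical.

```python
from typing import Dict, List, Set


def compute_yields(children: Dict[int, List[int]], root: int) -> Dict[int, Set[int]]:
    """Yield of v = v itself plus the yields of all its children."""
    yields: Dict[int, Set[int]] = {}

    def visit(v: int) -> Set[int]:
        result = {v}
        for child in children.get(v, []):
            result |= visit(child)
        yields[v] = result
        return result

    visit(root)
    return yields


# Arcs assumed from the yields on the slide: 0 -> 2, 2 -> 4, 4 -> 1, 4 -> 3.
CHILDREN = {0: [2], 2: [4], 4: [1, 3]}
YIELDS = compute_yields(CHILDREN, 0)
for v in sorted(YIELDS):
    print(f"Yield({v}) = {sorted(YIELDS[v])}")
```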

SLIDE 35

2-bounded block degree

Block degree

  • Vertex: number of maximal contiguous intervals in its yield
  • Arborescence: the maximal block degree of its vertices

2-bounded block degree arborescence

  • Arborescence with a block degree less than or equal to 2

[Figure: arborescence over vertices v0 ... v4 at positions s0 ... s4]

  • Yield(0) = [0 ... 4], BD(0) = 1
  • Yield(1) = [1] ∪ [4], BD(1) = 2
  • Yield(2) = [2 ... 3], BD(2) = 1
  • Yield(3) = [3], BD(3) = 1
  • Yield(4) = [4], BD(4) = 1

Intuition

  • Auxiliary tree anchored at s1 adjoined via wrapping adjunction
  • Anchors s2 and s3 attached below the foot node
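
The block degrees listed above can be recomputed by counting the breaks in each sorted yield. This is a sketch: the head assignment below (v1 and v2 under v0, v3 under v2, v4 under v1) is inferred from the listed yields, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def yields_from_heads(heads: Dict[int, Optional[int]]) -> Dict[int, Set[int]]:
    """Map each vertex to the set of vertices reachable from it (itself included)."""
    result = {v: {v} for v in heads}
    for v in heads:
        u = heads[v]
        while u is not None:
            result[u].add(v)
            u = heads[u]
    return result


def block_degree(positions: Iterable[int]) -> int:
    """Number of maximal contiguous intervals covered by a set of positions."""
    ordered = sorted(positions)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b != a + 1)


# Heads assumed from the yields on the slide.
HEADS = {0: None, 1: 0, 2: 0, 3: 2, 4: 1}
YIELDS = yields_from_heads(HEADS)
DEGREES = {v: block_degree(YIELDS[v]) for v in HEADS}
print(DEGREES)                # {0: 1, 1: 2, 2: 1, 3: 1, 4: 1}
print(max(DEGREES.values()))  # 2, so the arborescence has 2-bounded block degree
```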

SLIDE 36

Parsing

Dynamic programming [Gómez-Rodríguez et al., 2009]

  • Complexity: O(n^7), intractable on long sentences

⇒ Asymptotically equivalent to LTAG parsing!

Combinatorial optimization [Corro et al., 2016]

  • Complexity: exponential
  • Practically: fast

⇒ "Simple" optimization problem as there is no constraint

  • n combination operations

Intuition

  1. Non-trivial dependency structure parsing tackled via combinatorial optimization
  2. Complexity of parse tree labeling?

SLIDE 37

Outline of the algorithm

SLIDE 38

Parse tree labeling

  • Lexical ambiguity
  • Combination constraints
  • Non-trivial dependency structure

SLIDE 40

Deduction system

[Figure: the unlabelled dependency tree over "She deliberately walks the dog" (vertices v1 ... v5) ⇒ the same tree with elementary trees τ1 ... τ5 on the vertices and Gorn addresses 1.1, 1.2, 1.2.2, 1 on the arcs]

Dynamic program

  • Deduction rules
  • Agenda

Bottom-up

  1. Dependency tree: a word is considered after its modifiers
  2. Elementary tree: a non-terminal is considered after its children

SLIDE 41

Key idea: extract information from the dependency structure

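[Figure: dependency tree over vertices v1 ... v6 for the sentence "Why, he asks, does she walk?"]

Information about v4

  • Parent: v3
  • Yield span: [1, 6]
  • Gap span: [2, 3]

Notation   Value   Meaning
(v4)⇐      v1      left end of the yield span
(v4)⇒      v6      right end of the yield span
(v4)←      v2      left end of the gap span
(v4)→      v3      right end of the gap span
(v4)↑      v3      parent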
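
The five values above can all be read off the dependency tree alone. The sketch below recomputes them for v4. The slide only states that the parent of v4 is v3; every other arc in `HEADS` is an assumption chosen to match the yield span [1, 6] and the gap span [2, 3]. The name `vertex_info` is hypothetical, and the gap computation assumes a block degree of at most 2.

```python
from typing import Dict, Optional, Tuple

Info = Tuple[int, int, Optional[int], Optional[int], Optional[int]]


def vertex_info(heads: Dict[int, Optional[int]], v: int) -> Info:
    """Return (yield left, yield right, gap left, gap right, parent) for vertex v."""
    members = {v}
    for u in heads:
        w = heads[u]
        while w is not None:          # walk up from u, looking for v
            if w == v:
                members.add(u)
                break
            w = heads[w]
    ordered = sorted(members)
    # Positions inside the span that do not belong to the yield form the gap.
    gap = [p for p in range(ordered[0], ordered[-1] + 1) if p not in members]
    return (ordered[0], ordered[-1],
            gap[0] if gap else None, gap[-1] if gap else None,
            heads[v])


# Only the arc v4 -> v3 comes from the slide; the other arcs are assumed.
HEADS = {1: 4, 2: 3, 3: None, 4: 3, 5: 6, 6: 4}
print(vertex_info(HEADS, 4))  # (1, 6, 2, 3, 3)
```

A vertex without a gap gets `None` for both gap ends, which plays the role of the "−" value used in the deduction rules.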

SLIDE 42

Key idea: no integer span

Main difference

Vertices are used to define spans instead of integers ⇒ combination rules constrained by arcs between vertices

Standard LTAG parser items (CKY): [h, τ, p, c, i, j, k, l] with:

  • h: anchor word index
  • τ: elementary tree
  • p: Gorn address
  • c: combination flag
  • i, l: yield span (integers)
  • j, k: gap span (integers)

Our parser items: [vh, τ, p, c, bl, br] with:

  • vh: vertex (anchor word)
  • τ: elementary tree
  • p: Gorn address
  • c: combination flag
  • bl: left boundary (vertex)
  • br: right boundary (vertex)
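
The two item shapes can be written down as plain records to make the difference visible. This is only an illustrative sketch: the class names, field types and the string encoding of elementary trees and Gorn addresses are assumptions, not the authors' data structures.

```python
from typing import NamedTuple, Optional


class CKYItem(NamedTuple):
    """Standard LTAG item: four integer indices describe yield and gap."""
    h: int                # anchor word index
    tau: str              # elementary tree
    p: str                # Gorn address
    c: bool               # combination flag
    i: int                # yield span, left end
    j: Optional[int]      # gap span, left end (None if no gap)
    k: Optional[int]      # gap span, right end (None if no gap)
    l: int                # yield span, right end


class VertexItem(NamedTuple):
    """Item of the labeler: two boundary vertices replace the four integers."""
    vh: int               # vertex (anchor word)
    tau: str              # elementary tree
    p: str                # Gorn address
    c: bool               # combination flag
    bl: Optional[int]     # left boundary vertex
    br: Optional[int]     # right boundary vertex


print(len(CKYItem._fields), len(VertexItem._fields))  # 8 6
```

Because the dependency tree is given, the gap of a vertex no longer has to be stored in the item; it can be looked up from the boundary vertices.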

SLIDE 44

Moving

Let's start with something simple... :-)

Move unary:

    [vh, τ, 1.2.1, ⊤, bl, br]
    -------------------------  (p · 2) ∉ τ
    [vh, τ, 1.2, ⊥, bl, br]

[Figure: elementary tree anchored by "walks" (nodes S, NP, VP, VBZ)]

SLIDE 46

Moving

Let's start with something simple... :-)

Move binary:

    [vh, τ, 1.1, ⊤, bl1, br1]    [vh, τ, 1.2, ⊤, bl2, br2]
    ------------------------------------------------------  (br1)⇒ + 1 = (bl2)⇐
    [vh, τ, 1, ⊥, bl1, br2]

⇒ Similar to LTAG parsing but with a constraint on boundary vertices

[Figure: elementary tree anchored by "walks" (nodes S, NP, VP, VBZ)]
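
The two move rules can be sketched as small functions over items. Several things here are assumptions made for illustration: Gorn addresses are encoded as tuples of integers, an elementary tree is reduced to the set of its addresses, and `left_pos` / `right_pos` are hypothetical lookup tables giving the sentence position of (v)⇐ and (v)⇒ for a boundary vertex v.

```python
from typing import Dict, NamedTuple, Optional, Set, Tuple

Address = Tuple[int, ...]


class Item(NamedTuple):
    vh: int        # anchor vertex
    tau: str       # elementary tree name
    p: Address     # Gorn address
    c: bool        # combination flag: True for ⊤, False for ⊥
    bl: int        # left boundary vertex
    br: int        # right boundary vertex


def move_unary(item: Item, nodes: Set[Address]) -> Optional[Item]:
    """From the only child p·1 to its parent p, allowed when p·2 is not in the tree."""
    if not item.c or len(item.p) < 2 or item.p[-1] != 1:
        return None
    parent = item.p[:-1]
    if parent + (2,) in nodes:
        return None
    return item._replace(p=parent, c=False)


def move_binary(left: Item, right: Item,
                left_pos: Dict[int, int], right_pos: Dict[int, int]) -> Optional[Item]:
    """Combine siblings p·1 and p·2 when their spans are adjacent in the sentence."""
    same_tree = (left.vh, left.tau) == (right.vh, right.tau)
    siblings = (len(left.p) >= 2 and left.p[:-1] == right.p[:-1]
                and left.p[-1] == 1 and right.p[-1] == 2)
    if not (left.c and right.c and same_tree and siblings):
        return None
    if right_pos[left.br] + 1 != left_pos[right.bl]:   # (br1)⇒ + 1 = (bl2)⇐
        return None
    return Item(left.vh, left.tau, left.p[:-1], False, left.bl, right.br)


# "She deliberately walks the dog": positions 1..5, "walks" is vertex 3.
LEFT_POS = {1: 1, 2: 2, 4: 4, 5: 4}    # hand-built table: left end of each modifier's yield
RIGHT_POS = {1: 1, 2: 2, 4: 4, 5: 5}   # hand-built table: right end of each modifier's yield
subject = Item(3, "t3", (1, 1), True, 1, 1)
verb_phrase = Item(3, "t3", (1, 2), True, 2, 5)
print(move_binary(subject, verb_phrase, LEFT_POS, RIGHT_POS))
```

The adjacency test is the only place where positions are consulted; everything else is a comparison between vertices and addresses.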

SLIDE 48

Substitution

And now let's see something nice! O o

Substitute:

    [vm, τ′, 1, ⊤, bl, br]
    ----------------------  (vm)← = −, fSS(τ, p, τ′)
    [vh, τ, p, ⊤, vm, vm]

⇒ The boundaries of the antecedent are fixed by the dependency tree

[Figure: the NP tree anchored by "She" (NP, PRP) and the tree anchored by "walks" (S, NP, VP, VBZ)]

SLIDE 49

Substitution

And now let's see something nice! O o

Substitute:

    [vm, τ′]
    ---------------------  (vm)← = −, fSS(τ, p, τ′)
    [vh, τ, p, ⊤, vm, vm]

⇒ vh fixed by the dependency tree
⇒ Number of applications linearly bounded

[Figure: the NP tree anchored by "She" (NP, PRP) and the tree anchored by "walks" (S, NP, VP, VBZ)]
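
A possible reading of the substitution rule in code, given as a sketch. The predicate `f_ss` stands in for fSS and is assumed here to be a boolean compatibility test; `heads` and `gap_left` are hypothetical tables holding (v)↑ and (v)← (with `None` for "−"). The tuple encoding of Gorn addresses is also an assumption.

```python
from typing import Callable, Dict, NamedTuple, Optional, Tuple

Address = Tuple[int, ...]


class Item(NamedTuple):
    vh: int
    tau: str
    p: Address
    c: bool
    bl: int
    br: int


def substitute(mod: Item, tau: str, p: Address,
               heads: Dict[int, Optional[int]],
               gap_left: Dict[int, Optional[int]],
               f_ss: Callable[[str, Address, str], bool]) -> Optional[Item]:
    """Substitute the completed tree of vm at address p of its head's tree tau."""
    vm = mod.vh
    if not mod.c or mod.p != (1,):      # the modifier's tree must be complete at its root
        return None
    if gap_left[vm] is not None:        # (vm)← = −: a substituted tree has no gap
        return None
    if heads[vm] is None or not f_ss(tau, p, mod.tau):
        return None
    # vh is not searched for: it is read from the dependency tree.
    return Item(heads[vm], tau, p, True, vm, vm)


# "She" (vertex 1) fills the subject NP at address 1.1 of "walks" (vertex 3).
HEADS = {1: 3, 2: 3, 3: None, 4: 5, 5: 3}
GAP_LEFT = {v: None for v in HEADS}
she = Item(1, "t1", (1,), True, 1, 1)
print(substitute(she, "t3", (1, 1), HEADS, GAP_LEFT, lambda tau, p, tau_m: True))
```

Since the head of vm is given, each modifier triggers the rule for one head only, which is the source of the linear bound mentioned on the slide.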

SLIDE 51

Wrapping adjunction

But for a more complicated operation? :/

Wrapping adjoin:

    [vm, τ′, 1, ⊤, bl1, br1]    [vh, τ, p, ⊥, bl2, br2]
    ---------------------------------------------------  fSA(τ, p, τ′)
    [vh, τ, p, ⊤, vm, vm]

⇒ Boundaries of the left antecedent are fixed (similarly to substitution)

[Figure: elementary trees anchored by "like" and "does" (node labels VB, NP, NP, SQ*, WHNP, SQ, VBZ, SQOA, SBARQ)]

SLIDE 52

Wrapping adjunction

But for a more complicated operation? :/

Wrapping adjoin:

    [vm, τ′]    [vh, τ, p, ⊥, bl, br]
    ---------------------------------  fSA(τ, p, τ′)
    [vh, τ, p, ⊤, vm, vm]

⇒ Gap filled with boundaries of the right antecedent?

[Figure: elementary trees anchored by "like" and "does" (node labels VB, NP, NP, SQ*, WHNP, SQ, VBZ, SQOA, SBARQ)]

SLIDE 53

Wrapping adjunction

But for a more complicated operation? :/

Wrapping adjoin:

    [vm, τ′]    [vh, τ, p, ⊥, bl, br]
    ---------------------------------  (vm)← = (bl)⇐, (vm)→ = (br)⇒, fSA(τ, p, τ′)
    [vh, τ, p, ⊤, vm, vm]

⇒ vh fixed by the dependency tree
⇒ Number of applications linearly bounded, again

[Figure: elementary trees anchored by "like" and "does" (node labels VB, NP, NP, SQ*, WHNP, SQ, VBZ, SQOA, SBARQ)]
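
The side conditions of wrapping adjunction can be sketched as follows. All tables are hypothetical inputs: `heads` holds (v)↑, `gap_left` / `gap_right` hold the positions of (v)← and (v)→, `left_pos` / `right_pos` hold the positions of (v)⇐ and (v)⇒, and `f_sa` stands in for fSA, assumed to be a boolean test. The toy values reuse the five-vertex arborescence from the block-degree slides, where v1 has yield [1] ∪ [4].

```python
from typing import Callable, Dict, NamedTuple, Optional, Tuple

Address = Tuple[int, ...]


class Item(NamedTuple):
    vh: int
    tau: str
    p: Address
    c: bool
    bl: int
    br: int


def wrapping_adjoin(vm: int, tau_m: str, site: Item,
                    heads: Dict[int, Optional[int]],
                    gap_left: Dict[int, Optional[int]],
                    gap_right: Dict[int, Optional[int]],
                    left_pos: Dict[int, int], right_pos: Dict[int, int],
                    f_sa: Callable[[str, Address, str], bool]) -> Optional[Item]:
    """Adjoin the tree of vm around the material already built at site."""
    if site.c or heads[vm] != site.vh:              # site not yet combined, vh = (vm)↑
        return None
    if gap_left[vm] is None or gap_right[vm] is None:
        return None                                  # no gap: not a wrapping adjunction
    if gap_left[vm] != left_pos[site.bl]:            # (vm)← = (bl)⇐
        return None
    if gap_right[vm] != right_pos[site.br]:          # (vm)→ = (br)⇒
        return None
    if not f_sa(site.tau, site.p, tau_m):
        return None
    return Item(site.vh, site.tau, site.p, True, vm, vm)


# Heads and yields assumed from the block-degree example: v1 wraps around v2 and v3.
HEADS = {0: None, 1: 0, 2: 0, 3: 2, 4: 1}
LEFT_POS = {1: 1, 2: 2, 3: 3, 4: 4}
RIGHT_POS = {1: 4, 2: 3, 3: 3, 4: 4}
GAP_LEFT = {0: None, 1: 2, 2: None, 3: None, 4: None}
GAP_RIGHT = {0: None, 1: 3, 2: None, 3: None, 4: None}
site = Item(0, "t0", (1, 2), False, 2, 2)
print(wrapping_adjoin(1, "t1", site, HEADS, GAP_LEFT, GAP_RIGHT,
                      LEFT_POS, RIGHT_POS, lambda tau, p, tau_m: True))
```

The gap of vm must be filled exactly by the span between the site's two boundary vertices, so nothing has to be guessed about where the gap starts or ends.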

SLIDE 54

Left/Right adjunction

Wait, we don't know the gap boundaries for left/right adjunctions! :'(

Left adjoin, antecedents:

    [vm, τ′, 1, ⊤, bl1, br1]    [vh, τ, p, ⊥, bl2, br2]

⇒ Right limit of the gap br1 unknown in the dependency tree

[Figure: the auxiliary tree anchored by "deliberately" (VP, VP*, ADVP, RB) and the tree anchored by "walks" (S, NP, VP, VBZ, NP)]

SLIDE 55

Left/Right adjunction

Wait, we don't know the gap boundaries for left/right adjunctions! :'(

Left adjoin, antecedents:

    [vm, τ′, 1, ⊤, bl1, −]    [vh, τ, p, ⊥, bl2, br2]

⇒ Workaround: a − boundary prevents anything on the right side of the gap

[Figure: the auxiliary tree anchored by "deliberately" (VP, VP*, ADVP, RB) and the tree anchored by "walks" (S, NP, VP, VBZ, NP)]

SLIDE 56

Left/Right adjunction

Wait, we don't know the gap boundaries for left/right adjunctions! :'(

Left adjoin, antecedents:

    [vm, τ′, ←]    [vh, τ, p, ⊥, bl, br]

⇒ Left antecedent fixed by the dependency tree

[Figure: the auxiliary tree anchored by "deliberately" (VP, VP*, ADVP, RB) and the tree anchored by "walks" (S, NP, VP, VBZ, NP)]

SLIDE 57

Left/Right adjunction

Wait, we don't know the gap boundaries for left/right adjunctions! :'(

Left adjoin:

    [vm, τ′, ←]    [vh, τ, p, ⊥, bl, br]
    ------------------------------------  (vm)⇒ = (bl)⇐ − 1, fSA(τ, p, τ′)
    [vh, τ, p, ⊤, vm, br]

⇒ Is the number of applications linearly bounded? (yes, proof in the paper)

[Figure: the auxiliary tree anchored by "deliberately" (VP, VP*, ADVP, RB) and the tree anchored by "walks" (S, NP, VP, VBZ, NP)]
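
The left-adjunction rule can be sketched in the same style. As before, `f_sa`, `heads`, `left_pos` and `right_pos` are hypothetical inputs. One detail is an explicit assumption: when the anchor itself serves as a boundary vertex, the sketch takes its own sentence position as the span end, which the slides do not spell out.

```python
from typing import Callable, Dict, NamedTuple, Optional, Tuple

Address = Tuple[int, ...]


class Item(NamedTuple):
    vh: int
    tau: str
    p: Address
    c: bool
    bl: int
    br: int


def left_adjoin(vm: int, tau_m: str, site: Item,
                heads: Dict[int, Optional[int]],
                left_pos: Dict[int, int], right_pos: Dict[int, int],
                f_sa: Callable[[str, Address, str], bool]) -> Optional[Item]:
    """Adjoin the tree of vm immediately to the left of the span built at site."""
    if site.c or heads[vm] != site.vh:               # vh is read from the dependency tree
        return None
    if right_pos[vm] != left_pos[site.bl] - 1:       # (vm)⇒ = (bl)⇐ − 1
        return None
    if not f_sa(site.tau, site.p, tau_m):
        return None
    # The new left boundary is vm; the right boundary is unchanged.
    return Item(site.vh, site.tau, site.p, True, vm, site.br)


# "deliberately" (vertex 2) adjoins at the VP node 1.2 of "walks" (vertex 3).
HEADS = {1: 3, 2: 3, 3: None, 4: 5, 5: 3}
LEFT_POS = {2: 2, 3: 3, 5: 4}    # assumption: anchor 3 used as a boundary maps to position 3
RIGHT_POS = {1: 1, 2: 2, 5: 5}
site = Item(3, "t3", (1, 2), False, 3, 5)
print(left_adjoin(2, "t2", site, HEADS, LEFT_POS, RIGHT_POS, lambda tau, p, tau_m: True))
```

Right adjunction would mirror this with the roles of the left and right boundaries swapped.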

SLIDE 58

Complexity

SLIDE 61

Complexity

Move binary:

    [vh, τ, 1.1, ⊤, bl1, br1]    [vh, τ, 1.2, ⊤, bl2, br2]
    ------------------------------------------------------  (br1)⇒ + 1 = (bl2)⇐
    [vh, τ, 1, ⊥, bl1, br2]

Proof intuition

3 boundaries ⇒ O(n^3)?
⇒ Bounded by the elementary tree size if no multiple adjunction

Complexity: O(min(t, n)^2 · n · t · g), with:

  • n: sentence length
  • t: maximum number of nodes in an elementary tree
  • g: maximum ambiguity

⇒ Asymptotically linear w.r.t. the sentence length
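
A short worked bound shows why the stated complexity is linear in n. It assumes that t and g are grammar constants independent of the sentence, which is the usual convention but is an assumption added here.

```latex
\[
  \min(t, n)^2 \cdot n \cdot t \cdot g
  \;\le\; t^2 \cdot n \cdot t \cdot g
  \;=\; t^3 g \, n
  \;=\; O(n)
  \qquad \text{for fixed } t \text{ and } g .
\]
\[
  \text{By comparison, for CKY-style LTAG parsing with } n \ge g: \quad
  n^6 \cdot \max(n, g) \cdot g \cdot t \;=\; n^7 g \, t \;=\; O(n^7) .
\]
```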

SLIDE 62

Conclusion

SLIDE 63

Conclusion

Contributions

  • New perspective on efficient LTAG parsing
  • Linear time LTAG parse labeler

Future work

  • Experimentation!
  • Multiple adjunctions?
  • Extension to other lexicalized formalisms: Lexicalized Linear Context-Free Rewriting Systems, ...

SLIDE 64

Questions?
