Global Neural CCG Parsing with Optimality Guarantees Kenton Lee Mike - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Global Neural CCG Parsing with Optimality Guarantees

Kenton Lee Mike Lewis† Luke Zettlemoyer University of Washington

† Now at Facebook AI Research

1

UWNLP

slide-2
SLIDE 2

This Talk

Fruit    flies    like         bananas
NP/NP    NP       (S\NP)/NP    NP
      NP                 S\NP
                  S

2

Challenge: Global models (e.g. Recursive NNs) break dynamic programs

slide-3
SLIDE 3

This Talk

Fruit    flies    like         bananas
NP/NP    NP       (S\NP)/NP    NP
      NP                 S\NP
                  S

Challenge: Global models (e.g. Recursive NNs) break dynamic programs
Our approach: Combine local and global models in A* parser
Result: Global model with exact inference

3

slide-4
SLIDE 4

Parsing with Hypergraphs

Fruit flies like bananas

Input Output

Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

4

Klein and Manning, 2001

slide-5
SLIDE 5

Parsing with Hypergraphs

Fruit flies like bananas

Input Output

Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

∅ Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

Klein and Manning, 2001

5

slide-6
SLIDE 6

∅ Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

Parsing with Hypergraphs

Fruit flies like bananas

Input Output

Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

Nodes represent partial parses

Klein and Manning, 2001

6

slide-7
SLIDE 7

Parsing with Hypergraphs

Fruit flies like bananas

Input Output

Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

Hyperedges represent rule productions

∅ Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

7

Klein and Manning, 2001

slide-8
SLIDE 8

∅ Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

Parsing with Hypergraphs

Fruit flies like bananas

Input Output

Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

Path represents a parse derivation

$y = \{e_1, \ldots, e_m\}$

8

Klein and Manning, 2001

slide-9
SLIDE 9

Parsing with Hypergraphs

Fruit flies like bananas

Input Output

∅ Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

9

slide-10
SLIDE 10

Parsing with Hypergraphs

Fruit flies like bananas

Input Output

∅ Fruit NP flies NP\NP like (S\NP)/NP bananas NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S ∅ Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

10

slide-11
SLIDE 11

∅ Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S ∅ Fruit NP flies NP\NP like (S\NP)/NP bananas NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S

Parsing with Hypergraphs

Fruit flies like bananas

Input Output

Each hyperedge $e$ is weighted with a score $g(e)$

11

slide-12
SLIDE 12

Parsing with Hypergraphs

Fruit flies like bananas

Input Output

∅ Fruit NP flies NP\NP like (S\NP)/NP bananas NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S ∅ Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP like bananas (S\NP)/NP NP

>

S\NP Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

Score of parse derivation:

$g(y) = \sum_{e \in y} g(e)$

12
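The derivation score on this slide can be sketched in a few lines of Python. This is only an illustration: the hyperedge names and the scores are made up, only lexical hyperedges are scored, and the candidate set is enumerated by hand, which is not how a real parser would work.

```python
from typing import Dict, List


def derivation_score(y: List[str], g: Dict[str, float]) -> float:
    """g(y): sum of the scores g(e) of the hyperedges e in derivation y."""
    return sum(g[e] for e in y)


def best_derivation(candidates: List[List[str]], g: Dict[str, float]) -> List[str]:
    """Highest-scoring derivation in an explicitly enumerated candidate set."""
    return max(candidates, key=lambda y: derivation_score(y, g))


# Hypothetical hyperedge names and scores, for illustration only.
g = {
    "Fruit:NP/NP": -0.1, "Fruit:NP": -2.5,
    "flies:NP": -0.2, "flies:NP\\NP": -1.8,
    "like:(S\\NP)/NP": -0.3, "bananas:NP": -0.1,
}
y1 = ["Fruit:NP/NP", "flies:NP", "like:(S\\NP)/NP", "bananas:NP"]
y2 = ["Fruit:NP", "flies:NP\\NP", "like:(S\\NP)/NP", "bananas:NP"]
best = best_derivation([y1, y2], g)
```

Enumerating candidates like this only works for toy inputs, since the number of derivations grows exponentially with sentence length.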

slide-13
SLIDE 13

Parsing with Hypergraphs

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP > NP Fruit flies NP NP\NP < NP like bananas (S\NP)/NP NP > S\NP like bananas (S\S)/NP NP > S\S Fruit flies NP S\NP < S Fruit flies like bananas NP/NP NP (S\NP)/NP NP > > NP S\NP < S Fruit flies like bananas NP NP\NP (S\NP)/NP NP < > NP S\NP < S Fruit flies like bananas NP S\NP (S\S)/NP NP < > S S\S < S

13

slide-14
SLIDE 14

Parsing with Hypergraphs

Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP > NP Fruit flies NP NP\NP < NP like bananas (S\NP)/NP NP > S\NP like bananas (S\S)/NP NP > S\S Fruit flies NP S\NP < S Fruit flies like bananas NP/NP NP (S\NP)/NP NP > > NP S\NP < S Fruit flies like bananas NP NP\NP (S\NP)/NP NP < > NP S\NP < S Fruit flies like bananas NP S\NP (S\S)/NP NP < > S S\S < S

14

slide-15
SLIDE 15

Parsing with Hypergraphs

❖ Predicted parse: $y^* = \arg\max_{y \in Y} g(y)$
❖ Exponential number of nodes

Intractable inference

[Figure: hypergraph of partial parses for "Fruit flies like bananas", as on slide 13]

15

slide-16
SLIDE 16

Managing Intractable Search Spaces

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP > NP Fruit flies NP NP\NP < NP like bananas (S\NP)/NP NP > S\NP like bananas (S\S)/NP NP > S\S Fruit flies NP S\NP < S Fruit flies like bananas NP/NP NP (S\NP)/NP NP > > NP S\NP < S Fruit flies like bananas NP NP\NP (S\NP)/NP NP < > NP S\NP < S Fruit flies like bananas NP S\NP (S\S)/NP NP < > S S\S < S

Approximate inference with global expressivity, e.g.

❖ Greedy / beam search:
  ❖ Nivre, 2008
  ❖ Chen and Manning, 2014
  ❖ Andor et al., 2016
❖ Reranking:
  ❖ Charniak and Johnson, 2005
  ❖ Huang, 2008
  ❖ Socher et al., 2013

16

slide-17
SLIDE 17

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies

?

NP like bananas

?

S\NP like bananas

?

S\S Fruit flies

?

S Fruit flies like bananas

?

S

Scores condition on local structures

Locally Factored Parsing

❖ Make locality assumptions:
  ❖ e.g. features are local to CFG productions
❖ Polynomial number of nodes
❖ Dynamic programs enable tractable inference

17

slide-18
SLIDE 18

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies

?

NP like bananas

?

S\NP like bananas

?

S\S Fruit flies

?

S Fruit flies like bananas

?

S

Scores condition on local structures

Locally Factored Parsing

18

Dynamic programs with locally factored models, e.g.

❖ CKY:
  ❖ Collins, 1997
  ❖ Durrett and Klein, 2015
❖ Minimum spanning tree:
  ❖ McDonald et al., 2005
  ❖ Kiperwasser and Goldberg, 2016

slide-19
SLIDE 19

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies

?

NP like bananas

?

S\NP like bananas

?

S\S Fruit flies

?

S Fruit flies like bananas

?

S

Scores condition on local structures

Locally Factored Parsing

19

Dynamic programs with locally factored models, e.g.

❖ CKY:
  ❖ Collins, 1997
  ❖ Durrett and Klein, 2015
❖ Minimum spanning tree:
  ❖ McDonald et al., 2005
  ❖ Kiperwasser and Goldberg, 2016

Recursive neural networks break dynamic programs!

slide-20
SLIDE 20

Local vs. Global Models

Global model: $y^* = \arg\max_{y \in Y} g_{global}(y)$
❖ Expressive
❖ Intractable

Local model: $y^* = \arg\max_{y \in Y} g_{local}(y)$
❖ Efficient
❖ Inexpressive

[Figures: the full hypergraph from slide 13 (global model) and the locally factored hypergraph from slide 17 (local model)]

20

slide-21
SLIDE 21

This Work

Combined model: $y^* = \arg\max_{y \in Y} \left( g_{local}(y) + g_{global}(y) \right)$

❖ Efficient
❖ Expressive

[Figure: hypergraph of partial parses for "Fruit flies like bananas", as on slide 13]

21
slide-22
SLIDE 22

Outline

❖ Background: A* parsing
❖ Combined global and local parsing model
❖ Learning to search accurately and efficiently
❖ Experiments on CCGBank

22

slide-23
SLIDE 23

A* Parsing

$y^* = \arg\max_{y \in Y} g(y)$

❖ Search in the space of partial parses
❖ First explored full parse guaranteed to be optimal

23

Klein and Manning, 2003

slide-24
SLIDE 24

A* Parsing

Fruit flies like bananas (S\NP)/NP NP

>

S\NP

?

Partial parse

24

slide-25
SLIDE 25

A* Parsing

Partial parse

Fruit flies like bananas (S\NP)/NP NP

>

S\NP

?

25

slide-26
SLIDE 26

A* Parsing

Exploration priority

?

Partial parse

Fruit flies like bananas (S\NP)/NP NP

>

S\NP

?

$f(y)$

26

slide-27
SLIDE 27

A* Parsing

$f(y) = g(y) + h(y)$, where $y$ is a partial parse such as

Fruit flies like bananas (S\NP)/NP NP

>

S\NP

?

❖ $f(y)$: exploration priority
❖ $g(y)$: inside score
❖ $h(y)$: admissible A* heuristic

27
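The agenda-driven search behind $f(y) = g(y) + h(y)$ can be sketched as follows. This is a minimal toy, not the authors' parser: the rule table, the supertag scores, and the function name `a_star_parse` are all assumptions made for illustration, and the heuristic is the simple "best tag score for every word outside the span" bound.

```python
import heapq
from itertools import count

# Hypothetical toy grammar: (left category, right category) -> parent category.
RULES = {
    ("NP/NP", "NP"): "NP",
    ("(S\\NP)/NP", "NP"): "S\\NP",
    ("NP", "S\\NP"): "S",
}


def a_star_parse(tag_scores, rules=RULES, goal="S"):
    """tag_scores[i] maps each candidate category of word i to a score <= 0."""
    n = len(tag_scores)
    best = [max(s.values()) for s in tag_scores]  # best tag score per word

    def h(i, j):
        # Upper bound on what can still be gained outside the span [i, j).
        return sum(best[:i]) + sum(best[j:])

    tie = count()  # tie-breaker so the heap never compares items directly
    agenda = []
    for i, scores in enumerate(tag_scores):
        for cat, g in scores.items():
            f = g + h(i, i + 1)
            heapq.heappush(agenda, (-f, next(tie), g, (i, i + 1, cat)))

    chart = {}  # explored items -> inside score g
    explored = 0
    while agenda:
        _, _, g, item = heapq.heappop(agenda)
        if item in chart:
            continue  # already explored with an equal or better score
        chart[item] = g
        explored += 1
        i, j, cat = item
        if item == (0, n, goal):
            return g, explored  # first explored full parse
        # Combine the new item with explored neighbours on either side.
        for (a, b, other), g2 in list(chart.items()):
            if b == i and (other, cat) in rules:
                new = (a, j, rules[(other, cat)])
            elif a == j and (cat, other) in rules:
                new = (i, b, rules[(cat, other)])
            else:
                continue
            f = g + g2 + h(new[0], new[1])
            heapq.heappush(agenda, (-f, next(tie), g + g2, new))
    return None  # no parse with the goal category


# Made-up supertag scores for "Fruit flies like bananas".
tag_scores = [
    {"NP/NP": -0.1, "NP": -2.5},
    {"NP": -0.2, "S\\NP": -2.0},
    {"(S\\NP)/NP": -0.3},
    {"NP": -0.1},
]
result = a_star_parse(tag_scores)
```

Python's `heapq` is a min-heap, so priorities are negated to pop the item with the largest $f$ first.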

slide-28
SLIDE 28

A* Parsing

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP like bananas (S\S)/NP NP

>

S\S Fruit flies NP S\NP

<

S Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S Fruit flies like bananas NP S\NP (S\S)/NP NP

< >

S S\S

<

S

explored agenda unexplored

28

slide-29
SLIDE 29

A* Parsing

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP like bananas (S\S)/NP NP

>

S\S Fruit flies NP S\NP

<

S Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S Fruit flies like bananas NP S\NP (S\S)/NP NP

< >

S S\S

<

S

Agenda position   f(y)   y
1                 4.5    bananas NP
2                 3.1    like (S\NP)/NP
3                 1.9    Fruit NP
4                 -0.5   Fruit NP/NP

explored agenda unexplored

29

slide-30
SLIDE 30

A* Parsing

Agenda position   f(y)   y
1                 4.5    bananas NP
2                 3.1    like (S\NP)/NP
3                 1.9    Fruit NP
4                 -0.5   Fruit NP/NP

bananas NP ∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP like bananas (S\S)/NP NP

>

S\S Fruit flies NP S\NP

<

S Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S Fruit flies like bananas NP S\NP (S\S)/NP NP

< >

S S\S

<

S

explored agenda unexplored

30

slide-31
SLIDE 31

A* Parsing

Agenda position   f(y)   y
2                 3.1    like (S\NP)/NP
3                 1.9    Fruit NP
4                 -0.5   Fruit NP/NP

[Figure: hypergraph of partial parses for "Fruit flies like bananas", as on slide 28, with "bananas NP" moved off the agenda]

explored agenda unexplored

31

slide-32
SLIDE 32

A* Parsing

Agenda position   f(y)   y
1                 3.1    like (S\NP)/NP
2                 1.9    Fruit NP
3                 -0.5   Fruit NP/NP
4                 -1.3   flies NP

[Figure: hypergraph of partial parses for "Fruit flies like bananas", as on slide 28]

explored agenda unexplored

32

slide-33
SLIDE 33

A* Parsing

Agenda position   f(y)   y
1                 3.1    like (S\NP)/NP
2                 1.9    Fruit NP
3                 -0.5   Fruit NP/NP
4                 -1.3   flies NP

[Figure: hypergraph of partial parses for "Fruit flies like bananas", as on slide 28]

explored agenda unexplored

33

slide-34
SLIDE 34

A* Parsing

Agenda position   f(y)   y
2                 1.9    Fruit NP
3                 -0.5   Fruit NP/NP
4                 -1.3   flies NP

[Figure: hypergraph of partial parses for "Fruit flies like bananas", as on slide 28]

explored agenda unexplored

34

slide-35
SLIDE 35

A* Parsing

Agenda position   f(y)   y
1                 2.1    like bananas (S\NP)/NP NP > S\NP
2                 1.9    Fruit NP
3                 -0.5   Fruit NP/NP
4                 -1.3   flies NP

[Figure: hypergraph of partial parses for "Fruit flies like bananas", as on slide 28]

explored agenda unexplored

35

slide-36
SLIDE 36

A* Parsing

Agenda position   f(y)   y
1                 2.1    like bananas (S\NP)/NP NP > S\NP
2                 1.9    Fruit NP
3                 -0.5   Fruit NP/NP
4                 -1.3   flies NP

[Figure: hypergraph of partial parses for "Fruit flies like bananas", as on slide 28]

explored agenda unexplored

36

slide-37
SLIDE 37

A* Parsing

Agenda position   f(y)   y
1                 1.9    Fruit NP
2                 -1.5   like (S\S)/NP
3                 …      …
4                 …      …

[Figure: hypergraph of partial parses for "Fruit flies like bananas", as on slide 28]

explored agenda unexplored

37

slide-38
SLIDE 38

Locally Factored Model

Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

38

Supertag-factored A* CCG Parser (Lewis et al, 2016):

slide-39
SLIDE 39

Locally Factored Model

Supertag-factored A* CCG Parser (Lewis et al, 2016):

Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S

$g_{local}(y) = g(\text{Fruit}, NP/NP) + g(\text{flies}, NP) + g(\text{like}, (S \backslash NP)/NP) + g(\text{bananas}, NP)$

39

slide-40
SLIDE 40

Locally Factored Model

Fruit flies like bananas (S\NP)/NP NP

>

S\NP

?

40

Supertag-factored A* CCG Parser (Lewis et al, 2016):

slide-41
SLIDE 41

Locally Factored Model

Fruit flies like bananas (S\NP)/NP NP

>

S\NP

?

$g_{local}(y) = g(\text{like}, (S \backslash NP)/NP) + g(\text{bananas}, NP)$, where $y$ is the partial parse above

41

Supertag-factored A* CCG Parser (Lewis et al, 2016):

slide-42
SLIDE 42

Locally Factored Model

Fruit flies like bananas (S\NP)/NP NP

>

S\NP

?

$g_{local}(y) = g(\text{like}, (S \backslash NP)/NP) + g(\text{bananas}, NP)$

$h_{local}(y) = \max_{tag} g(\text{Fruit}, tag) + \max_{tag} g(\text{flies}, tag)$

where $y$ is the partial parse above

42

Supertag-factored A* CCG Parser (Lewis et al, 2016):
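Why the max-over-tags heuristic never underestimates can be checked numerically. The sketch below uses made-up supertag scores and hypothetical helper names (`g_local`, `h_local`); it ignores grammar constraints and simply enumerates every way of tagging the uncovered words, which is a superset of the valid completions, so the bound still applies.

```python
from itertools import product


def g_local(assignment, scores):
    """Sum of supertag scores for the words tagged so far ({word index: tag})."""
    return sum(scores[i][tag] for i, tag in assignment.items())


def h_local(assignment, scores):
    """Best achievable tag score for every word not yet covered."""
    return sum(max(s.values()) for i, s in enumerate(scores) if i not in assignment)


# Made-up supertag scores for "Fruit flies like bananas".
scores = [
    {"NP/NP": -0.1, "NP": -2.5},
    {"NP": -0.2, "S\\NP": -2.0},
    {"(S\\NP)/NP": -0.3},
    {"NP": -0.1},
]

# Partial parse covering "like bananas".
partial = {2: "(S\\NP)/NP", 3: "NP"}
f = g_local(partial, scores) + h_local(partial, scores)

# Score of every full tagging that extends the partial parse.
untagged = [i for i in range(len(scores)) if i not in partial]
completions = []
for tags in product(*(scores[i].keys() for i in untagged)):
    full = dict(partial)
    full.update(zip(untagged, tags))
    completions.append(g_local(full, scores))
```

Every value in `completions` is at most `f`, which is the property A* needs from an admissible heuristic when scores are maximized.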

slide-43
SLIDE 43

Outline

❖ Background: A* parsing
❖ Combined global and local parsing model
❖ Learning to search accurately and efficiently
❖ Experiments on CCGBank

43

slide-44
SLIDE 44

Global A* Parsing

$y^* = \arg\max_{y \in Y} g(y)$

❖ First explored full parse guaranteed to be optimal
❖ Global search graph is exponential in sentence length
❖ Open question: Can we still learn to search efficiently?

44

slide-45
SLIDE 45

Modeling Global Structure

Fruit

NP/NP

flies

NP

like

(S\NP)/NP

bananas

NP NP S\NP S

$g_{global}(y)$ : $h_{global}(y)$ :

Fruit flies like bananas (S\NP)/NP NP

>

S\NP

?

45

slide-46
SLIDE 46

Non-positive global model

Modeling Global Structure

$g(y) = g_{global}(y)$

$h(y) = 0$

46

slide-47
SLIDE 47

Non-positive global model

$g(y) = g_{local}(y) + g_{global}(y)$

$h(y) = h_{local}(y) + 0$

Modeling Global Structure

47

slide-48
SLIDE 48

$g(y) = g_{local}(y) + g_{global}(y)$

$h(y) = h_{local}(y) + 0$

❖ $g_{local}$, $h_{local}$: any locally factored model with an admissible A* heuristic
❖ $g_{global}$, $0$: non-positive global model

Modeling Global Structure

48
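A short derivation shows why a zero heuristic for the global part is enough. It rests on one assumption that the slides do not spell out: the global score is built from non-positive terms, so extending a partial parse $y$ to a full parse $y'$ can only lower it, i.e. $g_{global}(y') \le g_{global}(y)$. The symbol $y'$ is introduced here for illustration.

```latex
\begin{align*}
g(y') &= g_{local}(y') + g_{global}(y') \\
      &\le \bigl( g_{local}(y) + h_{local}(y) \bigr) + g_{global}(y')
          && \text{($h_{local}$ is admissible)} \\
      &\le g_{local}(y) + h_{local}(y) + g_{global}(y)
          && \text{(global terms are non-positive)} \\
      &= g(y) + h(y)
\end{align*}
```

So $f(y) = g(y) + h(y)$ upper-bounds the score of every completion of $y$, which is the condition under which the first explored full parse is optimal.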

slide-49
SLIDE 49

Division of Labor

$g(y) = g_{local}(y) + g_{global}(y)$

$g_{local}(y)$:
❖ Limited expressivity
❖ Provides guidance with an A* heuristic

$g_{global}(y)$:
❖ Global expressivity
❖ Discriminative only when necessary

49

slide-50
SLIDE 50

Global Model:

Layers, from input to output: word embeddings, bidirectional LSTM, Tree-LSTM, parse scores $g_{global}(y)$

Fruit    flies    like         bananas
NP/NP    NP       (S\NP)/NP    NP
      NP                 S\NP
                  S

50

slide-51
SLIDE 51

Non-positive Global Model

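Log-probability of a logistic regression layer

Fruit    flies    like         bananas
NP/NP    NP       (S\NP)/NP    NP
      NP                 S\NP
                  S

$g_{global}(y) = \log \sigma(w \cdot r(y))$, where $r(y)$ stands for the representation of the parse $y$ that the slide draws as a picture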

51
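The non-positivity of a log-sigmoid score is easy to check in code. The sketch below is an assumption-laden illustration: `log_sigmoid` and `g_global_node` are hypothetical names, and the weight vector and representation are plain Python lists rather than Tree-LSTM outputs.

```python
import math
from typing import Sequence


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x)); the result is never positive."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def g_global_node(w: Sequence[float], rep: Sequence[float]) -> float:
    """Log-probability from a logistic regression layer over a representation."""
    return log_sigmoid(sum(wi * ri for wi, ri in zip(w, rep)))


# Made-up weights and representation; their dot product is exactly 0.
value = g_global_node([1.0, -2.0], [0.5, 0.25])
```

Because the sigmoid lies in (0, 1), its logarithm is at most 0 for any input, which is what lets the combined model keep a zero heuristic for the global term.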

slide-52
SLIDE 52

Division of Labor

$g(y) = g_{local}(y) + g_{global}(y)$

$g_{local}(y)$:
❖ Limited expressivity
❖ Provides guidance with an A* heuristic

$g_{global}(y)$:
❖ Global expressivity
❖ Discriminative only when necessary

52

slide-53
SLIDE 53

Outline

❖ Background: A* parsing
❖ Combined global and local parsing model
❖ Learning to search accurately and efficiently
❖ Experiments on CCGBank

53

slide-54
SLIDE 54

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP like bananas (S\S)/NP NP

>

S\S Fruit flies NP S\NP

<

S Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S Fruit flies like bananas NP S\NP (S\S)/NP NP

< >

S S\S

<

S

Learning with A*

explored agenda unexplored

54

slide-55
SLIDE 55

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP like bananas (S\S)/NP NP

>

S\S Fruit flies NP S\NP

<

S Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S Fruit flies like bananas NP S\NP (S\S)/NP NP

< >

S S\S

<

S

Agenda position   f(y)   y                Is correct?
1                 4.5    bananas NP
2                 3.1    like (S\NP)/NP
3                 1.9    Fruit NP
4                 -0.5   Fruit NP/NP

Learning with A*

explored agenda unexplored

55

slide-56
SLIDE 56

Agenda position   f(y)   y             Is correct?
1                 1.9    Fruit NP
2                 -0.5   Fruit NP/NP
3                 …      …             …
4                 …      …             …

Learning with A*

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP like bananas (S\S)/NP NP

>

S\S Fruit flies NP S\NP

<

S Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S Fruit flies like bananas NP S\NP (S\S)/NP NP

< >

S S\S

<

S

explored agenda unexplored

56

slide-57
SLIDE 57

explored agenda unexplored

Agenda position   f(y)   y             Is correct?
1                 1.9    Fruit NP
2                 -0.5   Fruit NP/NP
3                 …      …             …
4                 …      …             …

Learning with A*

∅ Fruit NP flies S\NP like (S\S)/NP flies NP\NP Fruit NP/NP flies NP like (S\NP)/NP bananas NP Fruit flies NP/NP NP

>

NP Fruit flies NP NP\NP

<

NP like bananas (S\NP)/NP NP

>

S\NP like bananas (S\S)/NP NP

>

S\S Fruit flies NP S\NP

<

S Fruit flies like bananas NP/NP NP (S\NP)/NP NP

> >

NP S\NP

<

S Fruit flies like bananas NP NP\NP (S\NP)/NP NP

< >

NP S\NP

<

S Fruit flies like bananas NP S\NP (S\S)/NP NP

< >

S S\S

<

S

Agenda violation: incorrect partial parse explored

57

slide-58
SLIDE 58

Violation-based Loss

58

A : [ … … ]

slide-59
SLIDE 59

Violation-based Loss

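59

A : [ … … ]

$L(A) = \sum_{t=1}^{T} \left( \max_{y \in A_t} f(y) - \max_{y \in \mathrm{gold}(A_t)} f(y) \right)$

❖ $\max_{y \in A_t} f(y)$: top of agenda
❖ $\max_{y \in \mathrm{gold}(A_t)} f(y)$: best gold partial parse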
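The violation-based loss can be computed directly from a record of the agenda at each step. In this sketch each agenda is a list of `(f, is_gold)` pairs; that encoding, the function name `violation_loss`, and the second agenda are assumptions for illustration, and every agenda is assumed to contain at least one gold item.

```python
from typing import List, Tuple

Agenda = List[Tuple[float, bool]]  # (f(y), whether y is a gold partial parse)


def violation_loss(agendas: List[Agenda]) -> float:
    """Sum over steps of (top-of-agenda score) - (best gold score on the agenda)."""
    total = 0.0
    for agenda in agendas:
        top = max(f for f, _ in agenda)
        best_gold = max(f for f, is_gold in agenda if is_gold)
        total += top - best_gold  # zero whenever a gold item is on top
    return total


# Step 1 mirrors the slides: an incorrect item (1.9) sits above the gold one (-0.5).
# Step 2 is a made-up agenda with the gold item on top, so it adds no loss.
agendas = [
    [(1.9, False), (-0.5, True)],
    [(2.0, True), (1.0, False)],
]
loss = violation_loss(agendas)
```

Each term is non-negative and reaches zero only when a gold partial parse is at the top of the agenda, so driving the loss down pushes the search toward exploring gold items first.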

slide-60
SLIDE 60

Correct partial parse can still be predicted via backtracking

Jointly Optimizing Accuracy and Efficiency

Agenda position   f(y)   y             Is correct?
1                 1.9    Fruit NP
2                 -0.5   Fruit NP/NP
3                 …      …             …
4                 …      …             …

60

slide-61
SLIDE 61

Correct partial parse can still be predicted via backtracking

Jointly Optimizing Accuracy and Efficiency

Agenda position   f(y)   y             Is correct?
1                 1.9    Fruit NP
2                 -0.5   Fruit NP/NP
3                 …      …             …
4                 …      …             …

Explicitly optimize for search efficiency!

61

slide-62
SLIDE 62

Outline

❖ Background: A* parsing
❖ Combined global and local parsing model
❖ Learning to search accurately and efficiently
❖ Experiments on CCGBank

62

slide-63
SLIDE 63

Experimental Setup

❖ $g_{local}(y)$: supertag-factored model from Lewis et al. (2016)
❖ Evaluate on CCGBank (Hockenmaier & Steedman, 2007)
❖ Comparisons: Clark & Curran (2007), Xu et al. (2015), Lewis et al. (2016), Vaswani et al. (2016)

63

Is global?

✓ ✓

Is exact?

slide-64
SLIDE 64

Experimental Setup

❖ $g_{local}(y)$: supertag-factored model from Lewis et al. (2016)
❖ Evaluate on CCGBank (Hockenmaier & Steedman, 2007)
❖ Comparisons: Clark & Curran (2007), Xu et al. (2015), Lewis et al. (2016), Vaswani et al. (2016), Global A*

64

Is global?

✓ ✓ ✓

Is exact?

✓ ✓

slide-65
SLIDE 65

CCG Parsing Results

System                  Test F1 (%)
Clark & Curran (2007)   85.2
Xu et al. (2015)        87.0
Lewis et al. (2016)     88.1
Vaswani et al. (2016)   88.3
Global A*               88.7

65

Is global?

✓ ✓ ✓

Is exact?

✓ ✓

slide-66
SLIDE 66

CCG Parsing Results

System                  Test F1 (%)
Clark & Curran (2007)   85.2
Xu et al. (2015)        87.0
Lewis et al. (2016)     88.1
Vaswani et al. (2016)   88.3
Global A*               88.7

66

Is global?

✓ ✓ ✓

Is exact?

✓ ✓

❖ Optimal parse found for 99.9% of sentences
❖ Explores only 190 partial parses on average

slide-67
SLIDE 67

Decoder Comparisons

[Figure: development F1 (%) and speed (sentences / second) for 10-best Reranking, 100-best Reranking, 4-best Beam Search, and Global A*. Speed labels on the plot: 27.1, 4.0, 0.4, 3.2. F1 labels on the plot: 88.4, 88.3, 88.2, 87.9.]

67

slide-68
SLIDE 68

Context Ablation

System                      Development F1 (%)   Number of explorations (lower is better)
Global A*                   88.4                 309.6
Global A* without context   88.1                 610.5

flies like

(S\NP)/NP

bananas

NP S\NP

68

slide-69
SLIDE 69

Garden Paths

Input sentence:

The favorite U.S. small business is one whose research and development can be milked for future Japanese use.

Incorrect partial parse (syntactically plausible in isolation):

U.S. small business is one

N/N (N/N)\(N/N) N (S\NP)/NP N

< > > <

S

Heavily penalized by the global model

69

slide-70
SLIDE 70

Conclusion

❖ Combining local and global models enables exact inference with global features
❖ Efficient decoding by learning to search
❖ State of the art for CCG parsing
❖ Applicable to other structured prediction tasks

70