A CYK+ Variant for SCFG Decoding Without a Dot Chart Rico Sennrich - - PowerPoint PPT Presentation

a cyk variant for scfg decoding without a dot chart
SMART_READER_LITE
LIVE PREVIEW

A CYK+ Variant for SCFG Decoding Without a Dot Chart Rico Sennrich - - PowerPoint PPT Presentation

A CYK+ Variant for SCFG Decoding Without a Dot Chart Rico Sennrich Institute for Language, Cognition and Computation University of Edinburgh October 25 2014 R. Sennrich Recursive CYK+ 1 / 15 Outline CYK+ and the role of the dot chart


slide-1
SLIDE 1

A CYK+ Variant for SCFG Decoding Without a Dot Chart

Rico Sennrich

Institute for Language, Cognition and Computation University of Edinburgh

October 25 2014

  • R. Sennrich

Recursive CYK+ 1 / 15

slide-2
SLIDE 2

Outline

CYK+ and the role of the dot chart Recursive variant Evaluation

  • R. Sennrich

Recursive CYK+ 2 / 15

slide-3
SLIDE 3

Problem

CYK+ parsing

CYK+ and Earley-style variants are popular parsers for decoding with SCFGs (Moses, cdec, SAMT, Jane, ...). alternative: binarization and decoding with plain CYK.

problem

CYK+ parsing [with syntactic models] takes a lot of memory.

n = 20 n = 40 n = 80

0.32 GB 2.63 GB 51.64 GB most of the memory is consumed by the dot chart.

  • R. Sennrich

Recursive CYK+ 3 / 15

slide-4
SLIDE 4

Problem

CYK+ parsing

CYK+ and Earley-style variants are popular parsers for decoding with SCFGs (Moses, cdec, SAMT, Jane, ...). alternative: binarization and decoding with plain CYK.

problem

CYK+ parsing [with syntactic models] takes a lot of memory.

n = 20 n = 40 n = 80

0.32 GB 2.63 GB 51.64 GB most of the memory is consumed by the dot chart.

solution

in this talk, we present a variant of CYK+ without a dot chart.

  • ur variant requires less memory and is faster, with same result.
  • R. Sennrich

Recursive CYK+ 3 / 15

slide-5
SLIDE 5

CYK+

The CYK+ algorithm

bottom-up chart parser generalization of CYK to n-ary rules two data structures:

main chart: non-terminal symbols dot chart: rule prefix applications (dotted items)

difference to Earley: dotted item represents all rules with same prefix dot chart allows dynamic binarization: rules that match span (i,j) are found by combining dotted item in (i,k) and (non-)terminal symbol in span (k,j).

  • R. Sennrich

Recursive CYK+ 4 / 15

slide-6
SLIDE 6

CYK+

it is a trap 1 2 3 4 5 6 7 8 9 10 it is a trap 1 2 3 4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-7
SLIDE 7

CYK+

it is a trap 1 2 3 4 5 6 7 8 9 10 it is a trap 1

NP

2 3 4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-8
SLIDE 8

CYK+

it is a trap 1 NP• 2 3 4 5 6 7 8 9 10 it is a trap 1

NP

2 3 4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-9
SLIDE 9

CYK+

it is a trap 1 NP• 2 3 4 5 6 7 8 9 10 it is a trap 1

NP

2

V

3 4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-10
SLIDE 10

CYK+

it is a trap 1 NP• 2 3 4 5 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-11
SLIDE 11

CYK+

it is a trap 1 NP• 2 3

ART•

4 5 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-12
SLIDE 12

CYK+

it is a trap 1 NP• 2 3

ART•

4 5 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4

NN

5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-13
SLIDE 13

CYK+

it is a trap 1 NP• 2 3

ART•

4 5 NP V• 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4

NN

5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-14
SLIDE 14

CYK+

it is a trap 1 NP• 2 3

ART•

4 5 NP V• 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4

NN

5 6 7

NP

8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-15
SLIDE 15

CYK+

it is a trap 1 NP• 2 3

ART•

4 5 NP V• 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4

NN

5 6 7

NP

8 9 10 S S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

CYK+ steps

1

search for terminal rule of size 1.

2

combine dotted item and (non-)terminal of two subspans.

3

create new dotted item from (non-)terminal in cell.

  • R. Sennrich

Recursive CYK+ 5 / 15

slide-16
SLIDE 16

CYK+: complexity considerations

monolingual 1-best parser

main chart: O(n2) dot chart: O(n2) parsing steps: O(n3)

SCFG decoding

Non-locality of LM scores restricts recombination of dotted items [Hopkins and Langmead, 2010] main chart: O(n2) (with beam search) dot chart: O(nscope(G)) parsing steps: O(nscope(G)) rule scope: number of choice points in rule

  • n

the fast jet ski

  • f

mr smith the JJ NPB

  • f

NNP the JJ NPB

  • f

NNP the JJ NPB

  • f

NNP the JJ NPB

  • f

NNP choice point choice point

  • R. Sennrich

Recursive CYK+ 6 / 15

slide-17
SLIDE 17

The dot chart in SCFG decoding

purpose of dot chart

allows recombination of different dotted items

→ does not apply to SCFG decoding

allows re-use of same dotted item for different spans it is a trap 1 NP• 2 3

ART•

4 5 NP V• 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4

NN

5 6 7

NP

8 9 10 S dot chart main chart

  • R. Sennrich

Recursive CYK+ 7 / 15

slide-18
SLIDE 18

The dot chart in SCFG decoding

purpose of dot chart

allows recombination of different dotted items

→ does not apply to SCFG decoding

allows re-use of same dotted item for different spans it is a trap 1 NP• 2 3

ART•

4 5 NP V• 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4

NN

5 6 7

NP

8 9 10 S dot chart main chart

  • R. Sennrich

Recursive CYK+ 7 / 15

slide-19
SLIDE 19

The dot chart in SCFG decoding

purpose of dot chart

allows recombination of different dotted items

→ does not apply to SCFG decoding

allows re-use of same dotted item for different spans it is a trap 1 NP• 2 3

ART•

4 5 NP V• 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4

NN

5 6 7

NP

8 9 10 S dot chart main chart

  • R. Sennrich

Recursive CYK+ 7 / 15

slide-20
SLIDE 20

The dot chart in SCFG decoding

purpose of dot chart

allows recombination of different dotted items

→ does not apply to SCFG decoding

allows re-use of same dotted item for different spans it is a trap 1 NP• 2 3

ART•

4 5 NP V• 6 7 8 9 10 it is a trap 1

NP

2

V

3

ART

4

NN

5 6 7

NP

8 9 10 S dot chart main chart

  • R. Sennrich

Recursive CYK+ 7 / 15

slide-21
SLIDE 21

A recursive variant

Core idea

we do not initially know if rule prefix application can be extended.

→ dotted items are re-visited throughout time.

we can change chart traversal order to guarantee that when span (i,k) is visited, all spans (k,j) have been visited before. this eliminates need to store dotted items; instead, they are extended recursively, then discarded.

it is a trap 1 2 3 4 5 6 7 8 9 10 it is a trap 1 2 3 4 5 6 7 8 9 10 traditional proposed

  • R. Sennrich

Recursive CYK+ 8 / 15

slide-22
SLIDE 22

Recursive CYK+

it is a trap 1 2 3 4 5 6 7 8 9 10 it is a trap 1 2 3 4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

recursive CYK+ steps

1

search for terminal rule of size 1.

2

initial call to rule consume function.

3

recursive call to rule consume function.

  • R. Sennrich

Recursive CYK+ 9 / 15

slide-23
SLIDE 23

Recursive CYK+

it is a trap 1 2 3 4 5 6 7 8 9 10 it is a trap 1

NN

2 3 4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

recursive CYK+ steps

1

search for terminal rule of size 1.

2

initial call to rule consume function.

3

recursive call to rule consume function.

  • R. Sennrich

Recursive CYK+ 9 / 15

slide-24
SLIDE 24

Recursive CYK+

it is a trap 1 2 3 4 5 6 7 8 9 10 it is a trap 1

NN

2

ART

3 4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

recursive CYK+ steps

1

search for terminal rule of size 1.

2

initial call to rule consume function.

3

recursive call to rule consume function.

  • R. Sennrich

Recursive CYK+ 9 / 15

slide-25
SLIDE 25

Recursive CYK+

it is a trap 1 2

ART•

3 4 5 6 7 8 9 10 it is a trap 1

NN

2

ART

3

NP

4 5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

recursive CYK+ steps

1

search for terminal rule of size 1.

2

initial call to rule consume function.

3

recursive call to rule consume function.

  • R. Sennrich

Recursive CYK+ 9 / 15

slide-26
SLIDE 26

Recursive CYK+

it is a trap 1 2 3 4 5 6 7 8 9 10 it is a trap 1

NN

2

ART

3

NP

4

V

5 6 7 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

recursive CYK+ steps

1

search for terminal rule of size 1.

2

initial call to rule consume function.

3

recursive call to rule consume function.

  • R. Sennrich

Recursive CYK+ 9 / 15

slide-27
SLIDE 27

Recursive CYK+

it is a trap 1 2 3 4 5 6 7 8 9 10 it is a trap 1

NN

2

ART

3

NP

4

V

5 6 7 NP 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

recursive CYK+ steps

1

search for terminal rule of size 1.

2

initial call to rule consume function.

3

recursive call to rule consume function.

  • R. Sennrich

Recursive CYK+ 9 / 15

slide-28
SLIDE 28

Recursive CYK+

it is a trap 1 2 3 4 5 6 7 NP• 8 NP V • 9 10 it is a trap 1

NN

2

ART

3

NP

4

V

5 6 7 NP 8 9 10 S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

recursive CYK+ steps

1

search for terminal rule of size 1.

2

initial call to rule consume function.

3

recursive call to rule consume function.

  • R. Sennrich

Recursive CYK+ 9 / 15

slide-29
SLIDE 29

Recursive CYK+

it is a trap 1 2 3 4 5 6 7 NP• 8 NP V • 9 10 it is a trap 1

NN

2

ART

3

NP

4

V

5 6 7 NP 8 9 10 S S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

recursive CYK+ steps

1

search for terminal rule of size 1.

2

initial call to rule consume function.

3

recursive call to rule consume function.

  • R. Sennrich

Recursive CYK+ 9 / 15

slide-30
SLIDE 30

Recursive CYK+

it is a trap 1 2 3 4 5 6 7 8 9 10 it is a trap 1

NN

2

ART

3

NP

4

V

5 6 7 NP 8 9 10 S S

NP V NP NP

ART NN NP

it V

is ART

a NN

trap dot chart main chart grammar

recursive CYK+ steps

1

search for terminal rule of size 1.

2

initial call to rule consume function.

3

recursive call to rule consume function.

  • R. Sennrich

Recursive CYK+ 9 / 15

slide-31
SLIDE 31

Recursive CYK+

Implementation notes

dot chart exists implicitly in stack of recursive function: O(|R|) each rule prefix application is constructed exactly once. rule applications may be found asynchronously; we keep (pruned) list for each span, and perform cube pruning synchronously.

→ no difference in translation output to original CYK+ algorithm.

  • R. Sennrich

Recursive CYK+ 10 / 15

slide-32
SLIDE 32

Evaluation

Task

English→German string-to-tree SMT system [Williams et al., 2014] grammar pruned to scope 3 [Hopkins and Langmead, 2010] all algorithms implemented in Moses focus on memory and speed (same translation) we ignore memory cost and loading times of model

Baselines

CYK+ Scope-3 parser [Williams and Koehn, 2012]; inspired by [Hopkins and Langmead, 2010] no dot chart, but more complex algorithm that constructs lattice for each rule and span representing all rule applications.

  • R. Sennrich

Recursive CYK+ 11 / 15

slide-33
SLIDE 33

Evaluation: memory

algorithm

n = 20 n = 40 n = 80

Scope-3 0.02 0.04 0.34 CYK+ 0.32 2.63 51.64 + recursive 0.02 0.04 0.15 + compression 0.02 0.04 0.15

Table : Peak memory consumption (in GB) of string-to-tree SMT decoder

  • R. Sennrich

Recursive CYK+ 12 / 15

slide-34
SLIDE 34

Evaluation: speed

20 40 60 80 100 200 300 400 sentence length decoding time (seconds) Scope-3 parser CYK+ + recursive + compression

algorithm length 80 average parse total parse total Scope-3 74.5 81.1 1.9 2.6 CYK+ 358.0 365.4 8.4 9.1 + recursive 33.7 40.1 1.5 2.2 + compression 15.0 21.2 1.0 1.7

Table : Parse time and total decoding time per sentence (in seconds).

  • R. Sennrich

Recursive CYK+ 13 / 15

slide-35
SLIDE 35

Discussion

Is Recursive CYK+ ever a bad Idea?

complexity characteristics are different in monolingual case there might be smarter ways to organize/prune dot chart

→ memory consumption will still be worse → pruning non-trivial because dotted item represents many rule

little effect for grammars with scope < 3

→ true for default hiero extraction heuristics

  • R. Sennrich

Recursive CYK+ 14 / 15

slide-36
SLIDE 36

Conclusion

Summary

dot chart is common, but of limited use in SCFG decoding reordering of chart traversal eliminates need for dot chart no speed-memory trade-off: recursive variant consumes less memory and is faster than CYK+ in the poster: matrix compression for more efficiency gains algorithm narrows efficiency gap between phrase-based and syntax-based (string-to-tree) systems new default in Moses

  • R. Sennrich

Recursive CYK+ 15 / 15

slide-37
SLIDE 37

Thank you!

  • R. Sennrich

Recursive CYK+ 16 / 15

slide-38
SLIDE 38

Bibliography I

Heafield, K., Koehn, P ., and Lavie, A. (2013). Grouping Language Model Boundary Words to Speed K-Best Extraction from Hypergraphs. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 958–968, Atlanta, Georgia, USA. Hopkins, M. and Langmead, G. (2010). SCFG Decoding Without Binarization. In EMNLP, pages 646–655. Williams, P . and Koehn, P . (2012). GHKM Rule Extraction and Scope-3 Parsing in Moses. In Proceedings of the Seventh Workshop on Statistical Machine Translation, pages 388–394, Montréal, Canada. Association for Computational Linguistics. Williams, P ., Sennrich, R., Nadejde, M., Huck, M., Hasler, E., and Koehn, P . (2014). Edinburgh’s Syntax-Based Systems at WMT 2014. In Proceedings of the Ninth Workshop on Statistical Machine Translation, pages 207–214, Baltimore, Maryland, USA. Association for Computational Linguistics.

  • R. Sennrich

Recursive CYK+ 17 / 15