SLIDE 1

Introduction Transformation Process Discontinuous Translation Model Experiments Conclusion

Discontinuous Statistical Machine Translation with Target-Side Dependency Syntax

Nina Seemann, Andreas Maletti

University of Stuttgart – Institute for Natural Language Processing – Pfaffenwaldring 5b, 70569 Stuttgart

September 17, 2015

Nina Seemann Discontinuous SMT with Target-Side Dependency Syntax WMT 2015 · 1

SLIDE 2


Outline

◮ Introduction
◮ Transformation Process
◮ Discontinuous Translation Model
◮ Experiments
◮ Conclusion

SLIDE 3


Syntax-based Machine Translation

[Figure: levels of translation (phrase, syntax, semantics) between English and the foreign language]

◮ Source language side is a string
◮ Target language side requires syntactic annotations

SLIDE 4


Discontinuous Target Languages

We want to translate from English to Russian and Polish:

◮ morphologically rich
◮ free word order languages
◮ grammatically agreeing parts spread out over whole sentence
◮ syntax difficult to express in terms of constituency structure
◮ not parseable by constituency parser
◮ but by dependency parsers

SLIDE 6


Dependency Parsing

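[Figure: dependency parse of a Polish example]

Tokens:    konwencja  haska  w  sprawie  obligacji  (  głosowanie  )
POS tags:  S          A      P  S        S          I  S           I

Edge labels shown in the figure: ROOT, ADJUNCT, ADJUNCT, COMP, MWE, PAR, MWE, PAR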

SLIDE 7


Non-projective Dependency Parse

◮ h → d is projective iff h dominates all nodes in the linear span between h and d.
◮ Dependency parse is projective iff all its edges are projective.
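The two definitions above can be checked mechanically. The following is a minimal sketch, assuming the parse is given as a list `heads` where `heads[i]` is the index of token i's head and `None` marks the single root; this encoding and the function names are illustrative choices, not taken from the slides.

```python
def dominates(heads, ancestor, node):
    """True if `ancestor` is `node` itself or lies on node's path to the root."""
    while node is not None:
        if node == ancestor:
            return True
        node = heads[node]
    return False


def edge_is_projective(heads, dep):
    """Edge head -> dep is projective iff head dominates every token strictly between them."""
    head = heads[dep]
    if head is None:  # the root has no incoming edge
        return True
    lo, hi = sorted((head, dep))
    return all(dominates(heads, head, k) for k in range(lo + 1, hi))


def is_projective(heads):
    """A parse is projective iff all of its edges are projective."""
    return all(edge_is_projective(heads, dep) for dep in range(len(heads)))


# Toy parses (hypothetical, four tokens each).
projective_parse = [None, 0, 1, 0]
crossing_parse = [None, 3, 0, 0]  # edge 3 -> 1 spans token 2, which 3 does not dominate
print(is_projective(projective_parse), is_projective(crossing_parse))
```

In the second toy parse only the edge into token 1 violates the condition; the other edges remain projective.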

[Figure: the example parse, which is non-projective]

Tokens:    konwencja  haska  w  sprawie  obligacji  (  głosowanie  )
POS tags:  S          A      P  S        S          I  S           I

Edge labels shown in the figure: ROOT, ADJUNCT, ADJUNCT, COMP, MWE, PAR, MWE, PAR

SLIDE 8


Lifting [Kahane et al., 1998]

SLIDE 12


Lifting [Nivre and Nilsson, 2005]

Refines the lifting process: performs the same operation but documents the lifting in the labels ⇒ path

[Figure: the example parse after lifting, with the lifting documented in the edge labels]

Tokens:    konwencja  haska  w  sprawie  obligacji  (  głosowanie  )
POS tags:  S          A      P  S        S          I  S           I

Edge labels shown in the figure: ROOT, ADJUNCT, ADJUNCT, COMP, MWE↓, PAR, MWE↑, PAR
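A rough sketch of lifting with label documentation, under stated assumptions: the parse is a single-rooted tree given as `heads` (with `None` for the root) plus one edge label per token, the shortest non-projective edge is lifted first, the lifted dependent's label receives ↑ and the label of the head it was lifted away from receives ↓. The function names and the toy labels are hypothetical, and this is not presented as the exact procedure used in the talk.

```python
def dominates(heads, ancestor, node):
    """True if `ancestor` is `node` itself or lies on node's path to the root."""
    while node is not None:
        if node == ancestor:
            return True
        node = heads[node]
    return False


def edge_is_projective(heads, dep):
    head = heads[dep]
    if head is None:
        return True
    lo, hi = sorted((head, dep))
    return all(dominates(heads, head, k) for k in range(lo + 1, hi))


def lift_to_projective(heads, labels):
    """Reattach non-projective dependents to their head's head until the parse is projective."""
    heads, labels = list(heads), list(labels)
    while True:
        bad = [d for d in range(len(heads)) if not edge_is_projective(heads, d)]
        if not bad:
            return heads, labels
        # Lift the shortest non-projective edge first.
        dep = min(bad, key=lambda d: abs(heads[d] - d))
        old_head = heads[dep]
        # In a single-rooted tree, edges leaving the root are always projective,
        # so old_head is never the root and heads[old_head] is a valid index.
        heads[dep] = heads[old_head]
        if "↑" not in labels[dep]:
            labels[dep] += "↑"       # this dependent was lifted
        if "↓" not in labels[old_head]:
            labels[old_head] += "↓"  # the original head lies below this edge


# Hypothetical four-token parse with one non-projective edge (3 -> 1).
new_heads, new_labels = lift_to_projective(
    [None, 3, 0, 0], ["ROOT", "MWE", "ADJUNCT", "COMP"]
)
print(new_heads, new_labels)
```

Each lift moves the dependent one step closer to the root, so the loop terminates; the arrows keep enough information to see which edges were involved.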

SLIDE 13


Conversion from dependency to constituency tree


[Figure: constituency tree obtained from the lifted dependency parse; nodes in extraction order]

ROOT ADJUNCT S konwencja A haska COMP ADJUNCT P w S sprawie MWE↓ S obligacji PAR I ( MWE↑ S głosowanie PAR I )

Preserves discontinuities!
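The tree drawing did not survive extraction, so the exact conversion is not recoverable here. One plausible reading, offered only as a sketch: every token becomes a node named after its edge label, containing a POS preterminal for the word itself and the subtrees of its dependents in linear order. The function name, the bracket notation, and the assumed attachment of "haska" to "konwencja" are illustrative assumptions.

```python
def to_constituency(tokens, tags, heads, labels):
    """Render a projective dependency tree as a bracketed constituency-like tree."""
    children = {i: [] for i in range(len(tokens))}
    root = None
    for dep, head in enumerate(heads):
        if head is None:
            root = dep
        else:
            children[head].append(dep)

    def build(i):
        # The head's own preterminal and its dependents' subtrees, in sentence order.
        parts = sorted(children[i] + [i])
        rendered = [
            f"({tags[i]} {tokens[i]})" if j == i else build(j) for j in parts
        ]
        return f"({labels[i]} " + " ".join(rendered) + ")"

    return build(root)


# Two-token toy input modelled on the start of the slide's example.
print(to_constituency(["konwencja", "haska"], ["S", "A"], [None, 0], ["ROOT", "ADJUNCT"]))
# (ROOT (S konwencja) (ADJUNCT (A haska)))
```

Sorting by token position is only safe for projective input, which is why the lifting step comes first in this reading.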

SLIDE 20


String-to-Tree Multi Bottom-up Tree Transducer

[Rule target sides are drawn as tree fragments on the slide; they appear here as flattened node sequences.]

• lexical continuous rule:
    motivated by → P motywowane
• lexical discontinuous rule:
    this is not something that → ADJUNCT nie jest to coś , I , , S co
• structural continuous rule:
    technologies X → ADJUNCT technologii MWE
• structural discontinuous rules:
    there are X that X → IMP sa MWE , ADJUNCT
    it needs to X → ADJUNCT musi MWE , PUNCT
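The four rule types can be told apart by two properties: whether the source side contains a gap X, and whether the target side consists of more than one tree fragment. A minimal sketch follows; the `Rule` class is hypothetical, and the fragment boundary in the last example is a guess made for illustration, since the slide's tree drawings are not recoverable.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    source: Tuple[str, ...]            # source tokens; "X" marks a nonterminal gap
    target_fragments: Tuple[str, ...]  # one flattened string per target tree fragment

    @property
    def is_lexical(self) -> bool:
        return "X" not in self.source

    @property
    def is_discontinuous(self) -> bool:
        return len(self.target_fragments) > 1

    def kind(self) -> str:
        lexicality = "lexical" if self.is_lexical else "structural"
        continuity = "discontinuous" if self.is_discontinuous else "continuous"
        return f"{lexicality} {continuity}"


rules = [
    Rule(("motivated", "by"), ("P motywowane",)),
    Rule(("technologies", "X"), ("ADJUNCT technologii MWE",)),
    # Assumed split into two fragments; the true boundary is unknown.
    Rule(("it", "needs", "to", "X"), ("ADJUNCT musi MWE", ", PUNCT")),
]
for rule in rules:
    print(" ".join(rule.source), "->", rule.kind())
```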

SLIDE 21


Translation Model

Standard log-linear model with the following 8 features:

◮ . . .
◮ gap penalty 100^(1−c) (c is the number of target tree fragments)
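To make the gap penalty concrete, here is a small sketch that reads the feature as 100 raised to the power 1 − c and plugs it into a generic log-linear score (a weighted sum of log feature values). The feature name and the weight are made up for the example; the remaining seven features are elided on the slide and are not reconstructed here.

```python
import math


def gap_penalty(num_fragments: int) -> float:
    """100^(1 - c): equals 1 for a single fragment and shrinks by a factor of 100 per extra fragment."""
    return 100.0 ** (1 - num_fragments)


def log_linear_score(features: dict, weights: dict) -> float:
    """Generic log-linear model score: sum of weight * log(feature value)."""
    return sum(weights[name] * math.log(value) for name, value in features.items())


for c in (1, 2, 3):
    print(c, gap_penalty(c))

# Hypothetical weight, for illustration only.
print(log_linear_score({"gap_penalty": gap_penalty(2)}, {"gap_penalty": 0.5}))
```

With a positive weight, each additional target fragment lowers the score, so continuous rules are preferred unless other features favour the discontinuous one.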

We use the MBOT-Moses decoder [Braune et al. 2013]:

◮ standard Moses syntax-based decoder
◮ extended to handle target side discontinuities

SLIDE 23


Setup

                  English to Polish          English to Russian
training data     7th EuroParl corpus        WMT 2014
language model    5-gram SRILM
tuning data       cut from EuroParl (≈ 3k)   WMT 2014
test data         cut from EuroParl (≈ 3k)   WMT 2014

SLIDE 24


Training Pipeline

Target side:

◮ TreeTagger [Schmid 1994]
◮ MaltParser [Nivre et al. 2006, Sharoff & Nivre 2011]
◮ Path-Lifting
◮ Conversion into constituency tree

Parallel Data:

◮ tokenized and lowercased
◮ length-ratio filtered up to length 80
◮ word alignments by GIZA++ [Och & Ney 2003] with grow-diag-final-and

Tuning: Minimum error rate training [Och 2003]
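The parallel-data cleaning steps can be sketched as below. This is an illustration only: whitespace splitting stands in for a real tokenizer, and `max_ratio` is a hypothetical threshold, since the slide gives the length limit of 80 but no ratio value.

```python
from typing import Optional, Tuple


def clean_pair(
    src: str, tgt: str, max_len: int = 80, max_ratio: float = 9.0
) -> Optional[Tuple[str, str]]:
    """Lowercase and tokenize a sentence pair; return None if it should be filtered out."""
    src_tok = src.lower().split()  # placeholder for a proper tokenizer
    tgt_tok = tgt.lower().split()
    if not src_tok or not tgt_tok:
        return None
    if len(src_tok) > max_len or len(tgt_tok) > max_len:
        return None
    ratio = max(len(src_tok), len(tgt_tok)) / min(len(src_tok), len(tgt_tok))
    if ratio > max_ratio:  # max_ratio is an assumed value
        return None
    return " ".join(src_tok), " ".join(tgt_tok)


print(clean_pair("This is a Test", "To jest test"))
print(clean_pair("a " * 81, "b"))
```

Pairs that survive this step would then go to word alignment; that step is not sketched here.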

SLIDE 25


Experimental Results

Translation task      System         BLEU
English-to-Polish     Baseline       21.29
                      MBOT           23.43
                      GHKM           23.31
                      Phrase-based   24.35
                      Hiero          24.56
English-to-Russian    Baseline       24.66
                      MBOT           26.13
                      GHKM           25.97
                      Phrase-based   27.90
                      Hiero          27.72
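For quick reference, the differences between systems can be computed directly from the reported scores. The sketch below only does arithmetic on the numbers in the table; the helper name `gain_over` is illustrative.

```python
bleu = {
    "English-to-Polish": {
        "Baseline": 21.29, "MBOT": 23.43, "GHKM": 23.31,
        "Phrase-based": 24.35, "Hiero": 24.56,
    },
    "English-to-Russian": {
        "Baseline": 24.66, "MBOT": 26.13, "GHKM": 25.97,
        "Phrase-based": 27.90, "Hiero": 27.72,
    },
}


def gain_over(scores: dict, system: str, reference: str) -> float:
    """BLEU difference system minus reference, rounded to two decimals."""
    return round(scores[system] - scores[reference], 2)


for task, scores in bleu.items():
    print(
        task,
        "MBOT vs Baseline:", gain_over(scores, "MBOT", "Baseline"),
        "| MBOT vs Hiero:", gain_over(scores, "MBOT", "Hiero"),
    )
```

MBOT is 2.14 BLEU above the baseline for Polish and 1.47 for Russian, while remaining 1.13 and 1.59 below Hiero.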

SLIDE 26


Losses across the systems

SLIDE 27


Analysis of rules used during decoding

[Charts for English-to-Polish and English-to-Russian, each with two panels: all rules, structural rules]

SLIDE 29


Recap

◮ Translation into free word order languages
◮ Discontinuous constituents
◮ Dependency parsers producing non-projective parses:
  1. Projectivize with a lifting technique that documents the process
  2. Transform projective dependency trees into constituent-like trees
◮ String-to-tree local multi bottom-up tree transducers
◮ Discontinuous translation model

SLIDE 30


Conclusion

◮ MBOT avoids the large quality drop between a (hierarchical) phrase-based system and a continuous string-to-tree one
◮ Discontinuous tree fragments yield significant improvements
◮ Overall performance similar to (hierarchical) phrase-based systems
◮ But outscoring (hierarchical) phrase-based systems remains a challenge
◮ Can syntactic information actually help the translation quality in those translation tasks?

SLIDE 31


Thank you! Questions?!?

SLIDE 32


Related Work

Xie et al., 2011:

◮ dependency-to-string model with head-dependent rules
◮ custom-made decoder

Li et al., 2014:

◮ transform dependency trees into (a kind of) constituency trees
◮ use the conventional syntax-based models of Moses

Sennrich et al., 2015:

◮ transform (non-projective) dependency trees into constituency trees
◮ using the syntactic functions provided by the parser
◮ string-to-tree GHKM model of Moses

SLIDE 33


References

Eisner: Learning Non-Isomorphic Tree Mappings for Machine Translation. ACL 2003.
Kahane et al.: Pseudo-Projectivity: A Polynomially Parsable Non-Projective Dependency Grammar. ACL 1998.
Li et al.: Transformation and Decomposition for Efficiently Implementing and Improving Dependency-to-String Model In Moses. SSST 2014.
Nivre and Nilsson: Pseudo-projective Dependency Parsing. ACL 2005.
Nivre et al.: MaltParser: A Data-Driven Parser-Generator for Dependency Parsing. LREC 2006.
Schmid: Probabilistic Part-of-Speech Tagging Using Decision Trees. New Methods in Language Processing 1994.
Sennrich et al.: A tree does not make a well-formed sentence: Improving syntactic string-to-tree statistical machine translation with more linguistic knowledge. Computer Speech & Language 32, 2015.
Sharoff and Nivre: The proper place of men and machines in language technology: Processing Russian without any linguistic knowledge. Dialogue 2011.
Sun et al.: A Non-Contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation. ACL 2009.
Wróblewska and Przepiórkowski: Induction of Dependency Structures Based on Weighted Projection. ICCCI 2012.
Xie et al.: A Novel Dependency-to-string Model for Statistical Machine Translation. EMNLP 2011.
