

SLIDE 1

Compositions of Tree-to-Tree Statistical Machine Translation Models

Andreas Maletti

Institute for Natural Language Processing University of Stuttgart, Germany

Montréal — July 28, 2016

Compositions of Tree-to-Tree Statistical Machine Translation Models Andreas Maletti · 1

SLIDE 2

Statistical machine translation

Required resource

Parallel corpus (containing sentences in both languages)

Additional resources

- Language model data (text in the target language)
- Word alignments
- Parse trees
- Various (morphological analyzers, parsers)

SLIDE 3

Statistical machine translation

Assume translation direction English → Catalan, but no parallel corpus for this language pair

Pivot approach

but parallel corpora exist for English → Spanish and Spanish → Catalan, so we translate via the pivot language Spanish (a similar motivation applies to non-deterministic pre- or post-processing steps)

SLIDE 4

Statistical machine translation

Consequences

- 2 translation models (operating in sequence)
- Inefficiencies in sequential operation (due to sequential pruning)
- Theoretical guarantees missing for many operations (e.g., tuning)

Remedies

- Partial evaluation [MayKniVog10]
- Composition

SLIDE 5

Statistical machine translation

Vauquois triangle (figure: levels phrase → syntax → semantics, between source and target language). Translation model:

SLIDE 6

Statistical machine translation

Vauquois triangle (figure: levels phrase → syntax → semantics, between source and target language). Translation model: tree-to-tree

SLIDE 7

Overview

1. Background
2. Theoretical Model
3. Unweighted Compositions
4. Weighted Compositions

SLIDE 8

Extended tree transducers

Definition

Transducer M = (Q, Σ, (q1, q2), R, wt) with
- finite set Q of states
- finite set Σ of symbols
- source initial state q1 and target initial state q2
- finite set R of rules (see below)
- weight assignment wt: R → A (into a semiring)

Example rule (trees linearized from the slide figure):
(q, S(NP(PRP(I)) VP(q1 q2))) — (q′, SENT(NP(CI(J’)) VN(q′1 q′2)))

SLIDE 9

Extended tree transducers

Restrictions

Transducer M = (Q, Σ, (q1, q2), R, wt) is
- STSG if Q = Σ and state = root label
- SCFG if STSG and all rules are shallow
- ε-free if no left-hand side is in Q
- strict if no right-hand side is in Q
- simple if ε-free and strict

Example rules (trees linearized from the slide figure):
STSG rule: (S, S(NP(PRP(I)) VP(VBP NP))) — (SENT, SENT(NP(CI(J’)) VN(V NP)))
shallow (SCFG) rule: (S, S(I VBP NP)) — (SENT, SENT(J’ V NP))

SLIDE 10

Extended tree transducers

Rules (trees linearized from the slide figure):
(q, S(NP(PRP(I)) VP(q1 q2))) — (q′, SENT(NP(CI(J’)) VN(q′1 q′2)))
(q2, NP(NNS(flowers))) — (q′2, NP(D(les) NC(fleurs)))

Use in a derivation step (the second rule replaces the linked state pair q2 — q′2):
S(NP(PRP(I)) VP(q1 q2)) — SENT(NP(CI(J’)) VN(q′1 q′2))
⇒ S(NP(PRP(I)) VP(q1 NP(NNS(flowers)))) — SENT(NP(CI(J’)) VN(q′1 NP(D(les) NC(fleurs))))

SLIDE 11

Extended tree transducers

Initial sentential form:

q1 — q2

Final sentential form: No more linked states

Weight

  • of a derivation: product of the weights of the used rules
  • of a tree pair: sum of the weights of all derivations for the pair
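These two weight definitions can be written down directly; a minimal sketch in Python (hypothetical rule names and weights, not from the slides), using the real semiring:

```python
from functools import reduce
from operator import mul

def derivation_weight(derivation, wt):
    """Weight of a derivation: product of the weights of the used rules."""
    return reduce(mul, (wt[rule] for rule in derivation), 1.0)

def pair_weight(derivations, wt):
    """Weight of a tree pair: sum over all derivations for the pair."""
    return sum(derivation_weight(d, wt) for d in derivations)

# hypothetical rule weights over the real semiring
wt = {"r1": 0.5, "r2": 0.4, "r3": 0.1}

# two derivations producing the same tree pair
print(pair_weight([["r1", "r2"], ["r1", "r3"]], wt))  # 0.5*0.4 + 0.5*0.1 ≈ 0.25
```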

SLIDE 12

Extended tree transducers

Composition

Given two weighted relations τ : TΣ × T∆ → A and τ′ : T∆ × TΓ → A, their composition is

(τ ; τ′)(t, s) = Σ_{u ∈ T∆} τ(t, u) · τ′(u, s)
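For finitely supported relations the sum is finite, so the definition can be sketched directly in Python (hypothetical encoding: trees as strings, relations as dicts — not from the slides):

```python
from collections import defaultdict

def compose(tau, tau_prime):
    """Compose weighted relations given as dicts {(t, u): weight}.

    Implements (tau ; tau')(t, s) = sum over u of tau(t, u) * tau'(u, s)
    for finitely supported relations over the real semiring.
    """
    result = defaultdict(float)
    for (t, u), w in tau.items():
        for (u2, s), w2 in tau_prime.items():
            if u == u2:  # the middle tree must coincide
                result[(t, s)] += w * w2
    return dict(result)

# hypothetical toy relations sharing the middle trees u1 and u2
tau = {("t", "u1"): 0.5, ("t", "u2"): 0.5}
tau_prime = {("u1", "s"): 0.2, ("u2", "s"): 0.6}
print(compose(tau, tau_prime))  # weight 0.5*0.2 + 0.5*0.6 for ('t', 's')
```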

SLIDE 13

Overview

1. Background
2. Theoretical Model
3. Unweighted Compositions
4. Weighted Compositions

SLIDE 14

Known results

SCFG ⊆ STSG ⊆ TT [Eis03] and folklore

Model                 Composition closure   Reference
top-down transducer   1                     [Eng75]
simple transducer     2                     [ArnDau82]
other transducer      ∞                     [EngFulMal16]

(top-down tree transducer = all left-hand sides shallow)

SLIDE 15

First result

Theorem

SCFG are closed under composition

Matching original rules (chained via the shared middle side):
NP(a ADJ NN) — NP(une NC A)   and   NP(une NC A) — NP(ein ADJA NN)
Newly constructed rule:
NP(a ADJ NN) — NP(ein ADJA NN)
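The rule-matching construction behind the theorem can be sketched as follows (hypothetical Python; rules are simplified to flat string pairs, and the links between nonterminals are omitted):

```python
def compose_rules(rules_ab, rules_bc):
    """Compose SCFG rules: chain two rules whenever the target side of
    the first equals the source side of the second, keeping the outer
    sides as the newly constructed rule (simplified sketch)."""
    return [(src, tgt)
            for src, mid in rules_ab
            for mid2, tgt in rules_bc
            if mid == mid2]

en_fr = [("NP -> a ADJ NN", "NP -> une NC A")]
fr_de = [("NP -> une NC A", "NP -> ein ADJA NN")]
print(compose_rules(en_fr, fr_de))  # [('NP -> a ADJ NN', 'NP -> ein ADJA NN')]
```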

SLIDE 16

Second result

Difficulty for STSGs: the middle sides need not match:
… — NP(DT NP)   and   NP(DT NP(ADJ NP’)) — …
Extending the rules does not help:
… — NP(DT NP(ADJ NP’ CC ADJ NP))   and   NP(DT NP(ADJ NP’)) — …

SLIDE 17

Second result

Theorem

Compositions of n ≥ 2 simple STSGs are as expressive as compositions of n simple transducers.
Proof idea: encode the finite states into the intermediate tree(s)

Example (states encoded into the intermediate tree; linearized from the slide figure). The transducer rule
(q, S(NP(PRP(I)) VP(q1 q2))) — (q′, SENT(NP(CI(J’)) VN(q′1 q′2)))
becomes the state-free rule
S(NP(PRP(I)) VP(VBP NP)) — (SENT, q, q′)(NP(CI(J’)) VN((V, q1, q′1) (NP, q2, q′2)))

SLIDE 18

Second result

Corollary

The composition closure for simple STSGs is obtained at level 2.
Proof idea: by the theorem, s-STSG² = s-TT² and s-STSG³ = s-TT³; since s-TT² = s-TT³ [ArnDau82], also s-STSG² = s-STSG³

SLIDE 19

Third result

Theorem

The composition hierarchy for the remaining STSGs is infinite.
Proof idea: inspect the corresponding proof for transducers and observe that the counterexamples can be generated by STSGs

SLIDE 20

Overview

1. Background
2. Theoretical Model
3. Unweighted Compositions
4. Weighted Compositions

SLIDE 21

Known results and main lemma

s-wTT ; wREL ⊆ s-wTT and wREL ; s-wTT ⊆ s-wTT follow from [FulMalVog11] and [Kui99]

Lemma

s-wTT ⊆ su-TTinj ; wREL
Proof idea: annotate the identifier of each applied rule on the output and apply the weights via the relabeling
(su-TTinj = injective relations computed by simple unambiguous transducers)

SLIDE 22

First weighted result

Theorem

The composition closure of weighted simple STSGs is obtained at level 2.
Proof idea (using the Lemma, the absorption of wREL, and the unweighted closure at level 2):
s-wTT ; s-wTT²
⊆ su-TTinj ; wREL ; s-wTT²
⊆ su-TTinj ; s-wTT²
⊆ su-TTinj² ; wREL ; s-wTT
⊆ su-TTinj² ; s-wTT
⊆ su-TTinj³ ; wREL
⊆ s-TTinjective² ; wREL
⊆ su-TTinj² ; wREL
⊆ s-wTT²

SLIDE 23

Second weighted result

Theorem

The composition hierarchy for the remaining weighted STSGs is infinite.
Proof idea: utilize the linking technique of [Mal15] to lift the corresponding unweighted result

SLIDE 24

Summary

Model                     Composition closure
(weighted) SCFGs          1
(weighted) simple STSGs   2
(weighted) other STSGs    ∞

SLIDE 25

Literature

[MayKniVog10] May, Knight, Vogler: Efficient inference through cascades of weighted tree transducers. ACL 2010
[Eis03] Eisner: Learning non-isomorphic tree mappings for machine translation. ACL 2003
[Eng75] Engelfriet: Bottom-up and top-down tree transformations: a comparison. Math. Syst. Theory 1975
[ArnDau82] Arnold, Dauchet: Morphismes et bimorphismes d’arbres. Theor. Comput. Sci. 1982
[EngFulMal16] Engelfriet, Fülöp, M.: Composition closure of linear extended top-down tree transducers. Theor. Comput. Syst. 2016
[FulMalVog11] Fülöp, M., Vogler: Weighted extended tree transducers. Fundam. Informaticae 2011
[Kui99] Kuich: Full abstract families of tree series I. In: Jewels are Forever, Springer 1999
[Mal15] M.: The power of weighted regularity-preserving multi bottom-up tree transducers. Int. J. Found. Comput. Sci. 2015
