

SLIDE 1

Compositions of Tree-to-Tree Statistical Machine Translation Models

Andreas Maletti

Institute for Natural Language Processing University of Stuttgart, Germany

Montréal — July 28, 2016

Compositions of Tree-to-Tree Statistical Machine Translation Models Andreas Maletti · 1

SLIDE 2

Statistical machine translation

Required resource

Parallel corpus (containing sentences in both languages)

Additional resources

- Language model data (text in the target language)
- Word alignments
- Parse trees
- Various (morphological analyzers, parsers)

SLIDE 3

Statistical machine translation

Assume translation direction English → Catalan, but no parallel corpus for this language pair

Pivot approach

but parallel corpora exist for English → Spanish and Spanish → Catalan, so we translate via the pivot language Spanish (a similar motivation applies to non-deterministic pre- or post-processing steps)

SLIDE 4

Statistical machine translation

Consequences

- 2 translation models (operating in sequence)
- Inefficiencies in sequential operation (due to sequential pruning)
- Theoretical guarantees missing for many operations (e.g., tuning)

Remedies

- Partial evaluation [MayKniVog10]
- Composition

SLIDE 5

Statistical machine translation

Vauquois triangle (figure: levels phrase → syntax → semantics, between source and target language). Translation model:

SLIDE 6

Statistical machine translation

Vauquois triangle (figure: levels phrase → syntax → semantics, between source and target language). Translation model: tree-to-tree

SLIDE 7

Overview

1. Background
2. Theoretical Model
3. Unweighted Compositions
4. Weighted Compositions

SLIDE 8

Extended tree transducers

Definition

Transducer M = (Q, Σ, (q1, q2), R, wt) with
- finite set Q of states
- finite set Σ of symbols
- source initial state q1 and target initial state q2
- finite set R of rules (see below)
- weight assignment wt: R → A (into a semiring)

Example rule (trees linearized from the slide figure):
(q, S(NP(PRP(I)) VP(q1 q2))) — (q′, SENT(NP(CI(J’)) VN(q′1 q′2)))

SLIDE 9

Extended tree transducers

Restrictions

Transducer M = (Q, Σ, (q1, q2), R, wt) is
- STSG if Q = Σ and state = root label
- SCFG if STSG and all rules are shallow
- ε-free if no left-hand side is in Q
- strict if no right-hand side is in Q
- simple if ε-free and strict

Example rules (trees linearized from the slide figure):
STSG rule: (S, S(NP(PRP(I)) VP(VBP NP))) — (SENT, SENT(NP(CI(J’)) VN(V NP)))
shallow (SCFG) rule: (S, S(I VBP NP)) — (SENT, SENT(J’ V NP))

SLIDE 10

Extended tree transducers

Rules (trees linearized from the slide figure):
(q, S(NP(PRP(I)) VP(q1 q2))) — (q′, SENT(NP(CI(J’)) VN(q′1 q′2)))
(q2, NP(NNS(flowers))) — (q′2, NP(D(les) NC(fleurs)))

Use in a derivation step (the second rule replaces the linked state pair q2 — q′2):
S(NP(PRP(I)) VP(q1 q2)) — SENT(NP(CI(J’)) VN(q′1 q′2))
⇒ S(NP(PRP(I)) VP(q1 NP(NNS(flowers)))) — SENT(NP(CI(J’)) VN(q′1 NP(D(les) NC(fleurs))))

SLIDE 11

Extended tree transducers

Initial sentential form:

q1 — q2

Final sentential form: No more linked states

Weight

  • of a derivation: product of the weights of the used rules
  • of a tree pair: sum of the weights of all derivations for the pair
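These two weight definitions can be written down directly; a minimal sketch in Python (hypothetical rule names and weights, not from the slides), using the real semiring:

```python
from functools import reduce
from operator import mul

def derivation_weight(derivation, wt):
    """Weight of a derivation: product of the weights of the used rules."""
    return reduce(mul, (wt[rule] for rule in derivation), 1.0)

def pair_weight(derivations, wt):
    """Weight of a tree pair: sum over all derivations for the pair."""
    return sum(derivation_weight(d, wt) for d in derivations)

# hypothetical rule weights over the real semiring
wt = {"r1": 0.5, "r2": 0.4, "r3": 0.1}

# two derivations producing the same tree pair
print(pair_weight([["r1", "r2"], ["r1", "r3"]], wt))  # 0.5*0.4 + 0.5*0.1 ≈ 0.25
```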

SLIDE 12

Extended tree transducers

Composition

Given two weighted relations τ : TΣ × T∆ → A and τ′ : T∆ × TΓ → A, their composition is

(τ ; τ′)(t, s) = Σ_{u ∈ T∆} τ(t, u) · τ′(u, s)
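For finitely supported relations the sum is finite, so the definition can be sketched directly in Python (hypothetical encoding: trees as strings, relations as dicts — not from the slides):

```python
from collections import defaultdict

def compose(tau, tau_prime):
    """Compose weighted relations given as dicts {(t, u): weight}.

    Implements (tau ; tau')(t, s) = sum over u of tau(t, u) * tau'(u, s)
    for finitely supported relations over the real semiring.
    """
    result = defaultdict(float)
    for (t, u), w in tau.items():
        for (u2, s), w2 in tau_prime.items():
            if u == u2:  # the middle tree must coincide
                result[(t, s)] += w * w2
    return dict(result)

# hypothetical toy relations sharing the middle trees u1 and u2
tau = {("t", "u1"): 0.5, ("t", "u2"): 0.5}
tau_prime = {("u1", "s"): 0.2, ("u2", "s"): 0.6}
print(compose(tau, tau_prime))  # weight 0.5*0.2 + 0.5*0.6 for ('t', 's')
```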

SLIDE 13

Overview

1. Background
2. Theoretical Model
3. Unweighted Compositions
4. Weighted Compositions

SLIDE 14

Known results

SCFG ⊆ STSG ⊆ TT [Eis03] and folklore

Model                 Composition closure   Reference
top-down transducer   1                     [Eng75]
simple transducer     2                     [ArnDau82]
other transducer      ∞                     [EngFulMal16]

(top-down tree transducer = all left-hand sides shallow)

SLIDE 15

First result

Theorem

SCFG are closed under composition

Matching original rules (chained via the shared middle side):
NP(a ADJ NN) — NP(une NC A)   and   NP(une NC A) — NP(ein ADJA NN)
Newly constructed rule:
NP(a ADJ NN) — NP(ein ADJA NN)
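The rule-matching construction behind the theorem can be sketched as follows (hypothetical Python; rules are simplified to flat string pairs, and the links between nonterminals are omitted):

```python
def compose_rules(rules_ab, rules_bc):
    """Compose SCFG rules: chain two rules whenever the target side of
    the first equals the source side of the second, keeping the outer
    sides as the newly constructed rule (simplified sketch)."""
    return [(src, tgt)
            for src, mid in rules_ab
            for mid2, tgt in rules_bc
            if mid == mid2]

en_fr = [("NP -> a ADJ NN", "NP -> une NC A")]
fr_de = [("NP -> une NC A", "NP -> ein ADJA NN")]
print(compose_rules(en_fr, fr_de))  # [('NP -> a ADJ NN', 'NP -> ein ADJA NN')]
```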

SLIDE 16

Second result

Difficulty for STSGs: the middle sides need not match:
… — NP(DT NP)   and   NP(DT NP(ADJ NP’)) — …
Extending the rules does not help:
… — NP(DT NP(ADJ NP’ CC ADJ NP))   and   NP(DT NP(ADJ NP’)) — …

SLIDE 17

Second result

Theorem

Compositions of n ≥ 2 simple STSGs are as expressive as compositions of n simple transducers.
Proof idea: encode the finite states into the intermediate tree(s)

Example (states encoded into the intermediate tree; linearized from the slide figure). The transducer rule
(q, S(NP(PRP(I)) VP(q1 q2))) — (q′, SENT(NP(CI(J’)) VN(q′1 q′2)))
becomes the state-free rule
S(NP(PRP(I)) VP(VBP NP)) — (SENT, q, q′)(NP(CI(J’)) VN((V, q1, q′1) (NP, q2, q′2)))

SLIDE 18

Second result

Corollary

The composition closure for simple STSGs is obtained at level 2.
Proof idea: by the theorem, s-STSG² = s-TT² and s-STSG³ = s-TT³; since s-TT² = s-TT³ [ArnDau82], also s-STSG² = s-STSG³

SLIDE 19

Third result

Theorem

The composition hierarchy for the remaining STSGs is infinite.
Proof idea: inspect the corresponding proof for transducers and observe that the counterexamples can be generated by STSGs

SLIDE 20

Overview

1. Background
2. Theoretical Model
3. Unweighted Compositions
4. Weighted Compositions

SLIDE 21

Known results and main lemma

s-wTT ; wREL ⊆ s-wTT and wREL ; s-wTT ⊆ s-wTT follow from [FulMalVog11] and [Kui99]

Lemma

s-wTT ⊆ su-TTinj ; wREL
Proof idea: annotate the identifier of each applied rule on the output and apply the weights via the relabeling
(su-TTinj = injective relations computed by simple unambiguous transducers)

SLIDE 22

First weighted result

Theorem

The composition closure of weighted simple STSGs is obtained at level 2.
Proof idea (using the Lemma, the absorption of wREL, and the unweighted closure at level 2):
s-wTT ; s-wTT²
⊆ su-TTinj ; wREL ; s-wTT²
⊆ su-TTinj ; s-wTT²
⊆ su-TTinj² ; wREL ; s-wTT
⊆ su-TTinj² ; s-wTT
⊆ su-TTinj³ ; wREL
⊆ s-TTinjective² ; wREL
⊆ su-TTinj² ; wREL
⊆ s-wTT²

SLIDE 23

Second weighted result

Theorem

The composition hierarchy for the remaining weighted STSGs is infinite.
Proof idea: utilize the linking technique of [Mal15] to lift the corresponding unweighted result

SLIDE 24

Summary

Model                     Composition closure
(weighted) SCFGs          1
(weighted) simple STSGs   2
(weighted) other STSGs    ∞

SLIDE 25

Literature

[MayKniVog10] May, Knight, Vogler: Efficient inference through cascades of weighted tree transducers. ACL 2010
[Eis03] Eisner: Learning non-isomorphic tree mappings for machine translation. ACL 2003
[Eng75] Engelfriet: Bottom-up and top-down tree transformations: a comparison. Math. Syst. Theory 1975
[ArnDau82] Arnold, Dauchet: Morphismes et bimorphismes d’arbres. Theor. Comput. Sci. 1982
[EngFulMal16] Engelfriet, Fülöp, M.: Composition closure of linear extended top-down tree transducers. Theor. Comput. Syst. 2016
[FulMalVog11] Fülöp, M., Vogler: Weighted extended tree transducers. Fundam. Informaticae 2011
[Kui99] Kuich: Full abstract families of tree series I. In: Jewels are Forever, Springer 1999
[Mal15] M.: The power of weighted regularity-preserving multi bottom-up tree transducers. Int. J. Found. Comput. Sci. 2015
