As Easy As Vanda, Two, Three: Components for Machine Translation



slide-1
SLIDE 1

As Easy As Vanda, Two, Three: Components for Machine Translation Based on Formal Grammars

Matthias Büchse
Theorietag, Prague, 2012-10-04

1 / 18

slide-2
SLIDE 2

Outline

Basic Principles of Machine Translation State of the Art Vanda: Versatile Components

2 / 18

slide-4
SLIDE 4

Goal

h: SL → TL

. . .
ich säge ihre ente ↦ i saw her duck
ich sah, wie sie sich duckte ↦ i saw her ducking
ich esse spaghetti mit der gabel ↦ i eat spaghetti with a fork
ich esse spaghetti mit fleischklößen ↦ i eat spaghetti with meatballs
. . .

4 / 18

slide-6
SLIDE 6

Modelling and Algorithmisation

modelling: select H ⊆ {h | h: SL → TL}, e.g., via synchronous grammar

S → ich säge X, I saw X
X → ihre Ente, her duck
S ⇒ · · · ⇒ Ich säge ihre Ente, I saw her duck

algorithmise: e.g., decoder: given h ∈ H, s ∈ SL, compute h(s)

program Decoder; ... begin ... end.
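
A minimal sketch of what "given h ∈ H, s ∈ SL, compute h(s)" can look like for the two-rule grammar on this slide. The function name `translate`, the rule encoding, and the restriction to at most one nonterminal per rule are assumptions made for illustration; this is not the decoder discussed later in the talk, and it returns the first derivation it finds rather than a best-scoring one.

```python
# Toy synchronous grammar from the slide: (left-hand side, source side, target side).
RULES = [
    ("S", ["ich", "säge", "X"], ["I", "saw", "X"]),
    ("X", ["ihre", "Ente"], ["her", "duck"]),
]
NONTERMINALS = {"S", "X"}


def translate(symbol, words):
    """Return a target word list derivable from `symbol` whose source side
    equals `words`, or None. Assumes at most one nonterminal per rule."""
    for lhs, src, tgt in RULES:
        if lhs != symbol:
            continue
        positions = [i for i, x in enumerate(src) if x in NONTERMINALS]
        if not positions:
            # Purely lexical rule: the source side must match exactly.
            if src == words:
                return list(tgt)
            continue
        i = positions[0]
        prefix, suffix = src[:i], src[i + 1:]
        middle_end = len(words) - len(suffix)
        if words[:i] == prefix and words[middle_end:] == suffix and middle_end > i:
            # Translate the span covered by the nonterminal, then plug the
            # result into the target side at the linked position.
            sub = translate(src[i], words[i:middle_end])
            if sub is not None:
                j = tgt.index(src[i])
                return tgt[:j] + sub + tgt[j + 1:]
    return None


print(" ".join(translate("S", "ich säge ihre Ente".split())))  # prints: I saw her duck
```

The source and target sides are rewritten in lockstep: whatever replaces X on the source side determines what replaces the linked X on the target side.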

5 / 18

slide-7
SLIDE 7

Training

sentence-aligned bilingual data: parallel corpus

001 Resumption of the session
002 I declare resumed the session of the European Parliament adjourned on Friday 17 December 1999 , [. . . ] .

001 Wiederaufnahme der Sitzungsperiode
002 Ich erkläre die am Freitag , dem 17. Dezember unterbrochene Sitzungsperiode des Europäischen Parlaments für wiederaufgenommen , [. . . ] .

EuroParl corpus, 11 languages, 1.5M sentences each

6 / 18

slide-9
SLIDE 9

Training

sentence-aligned bilingual data: parallel corpus
↓ apply heuristic, statistical methods
↓ grammar rules, weights

6 / 18

slide-12
SLIDE 12

Evaluation

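[Diagram: sentences 1, 2, 3, . . . in SL → decoder → sentences 1, 2, 3, . . . in TL; these are compared with a second set of TL sentences 1, 2, 3, . . . by an evaluation measure, e.g. BLEU, which yields a score]

if score > oldscore then publish else perish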
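
A minimal sketch of the evaluation step, assuming one reference translation per sentence, whitespace tokenisation, and no smoothing. The function names `ngrams` and `bleu` are illustrative; this follows the usual BLEU definition (clipped n-gram precisions up to length 4, geometric mean, brevity penalty) but is not a replacement for an established scoring tool.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidates, references, max_n=4):
    """Corpus-level BLEU for parallel lists of candidate and reference sentences."""
    matches = [0] * max_n
    totals = [0] * max_n
    cand_len = ref_len = 0
    for cand, ref in zip(candidates, references):
        c, r = cand.split(), ref.split()
        cand_len += len(c)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            c_counts, r_counts = ngrams(c, n), ngrams(r, n)
            # Clipped counts: an n-gram is credited at most as often as it
            # occurs in the reference.
            matches[n - 1] += sum(min(k, r_counts[g]) for g, k in c_counts.items())
            totals[n - 1] += sum(c_counts.values())
    if cand_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return brevity * math.exp(log_precision)


score = bleu(["i saw her duck"], ["i saw her duck"])
oldscore = bleu(["i saw her ducking"], ["i saw her duck"])
print("publish" if score > oldscore else "perish")  # prints: publish
```

Without smoothing, a single missing 4-gram match drives the score to zero, which is why BLEU is normally computed over a whole test set rather than over one sentence.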

7 / 18

slide-14
SLIDE 14

Synchronous Context-Free Grammar

π1: S → S X, S X
π2: S → X, X
π3: X → yu X₁ you X₂, have X₂ with X₁
π4: X → X₁ de X₂, the X₂ that X₁
π5: X → X zhiyi, one of X
π6: X → Aozhou, Australia
π7: X → Beihan, North Korea
π8: X → shi, is
π9: X → bangjiao, diplomatic relations
π10: X → shaoshu guojia, few countries

9 / 18

slide-37
SLIDE 37

Derivation

S, S
⇒(π1) S X, S X
⇒(π1) S X₁ X₂, S X₁ X₂
⇒(π2) X₀ X₁ X₂, X₀ X₁ X₂
⇒(π6) Aozhou X₁ X₂, Australia X₁ X₂
⇒(π8) Aozhou shi X, Australia is X
⇒(π5) Aozhou shi X zhiyi, Australia is one of X
⇒(π4) Aozhou shi X₁ de X₂ zhiyi, Australia is one of the X₂ that X₁
⇒(π3) Aozhou shi yu X₁ you X₀ de X₂ zhiyi, Australia is one of the X₂ that have X₀ with X₁
⇒(π7) Aozhou shi yu Beihan you X₀ de X₂ zhiyi, Australia is one of the X₂ that have X₀ with North Korea
⇒(π9) Aozhou shi yu Beihan you bangjiao de X₂ zhiyi, Australia is one of the X₂ that have diplomatic relations with . . .
⇒(π10) Aozhou shi yu Beihan you bangjiao de shaoshu guojia zhiyi, Australia is one of the few countries that have . . .
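
The derivation above can be replayed mechanically. The sketch below is an illustration under assumptions: the encoding of rules as pairs of templates, the nested-tuple derivation tree, and the names `RULES`, `DERIVATION`, and `project` are not from the talk. Integers in a template refer to the rule's child derivations (0-based), strings are terminals.

```python
SOURCE, TARGET = 0, 1

# Rules pi1..pi10 of the synchronous context-free grammar, as
# (source template, target template).
RULES = {
    "pi1": ([0, 1], [0, 1]),
    "pi2": ([0], [0]),
    "pi3": (["yu", 0, "you", 1], ["have", 1, "with", 0]),
    "pi4": ([0, "de", 1], ["the", 1, "that", 0]),
    "pi5": ([0, "zhiyi"], ["one of", 0]),
    "pi6": (["Aozhou"], ["Australia"]),
    "pi7": (["Beihan"], ["North Korea"]),
    "pi8": (["shi"], ["is"]),
    "pi9": (["bangjiao"], ["diplomatic relations"]),
    "pi10": (["shaoshu guojia"], ["few countries"]),
}

# The derivation pi1 pi1 pi2 pi6 pi8 pi5 pi4 pi3 pi7 pi9 pi10 written as a tree:
# each node is (rule name, child derivation, ...).
DERIVATION = (
    "pi1",
    ("pi1", ("pi2", ("pi6",)), ("pi8",)),
    ("pi5", ("pi4", ("pi3", ("pi7",), ("pi9",)), ("pi10",))),
)


def project(derivation, side):
    """Read off the source or target string of a derivation tree."""
    rule, *children = derivation
    words = []
    for item in RULES[rule][side]:
        if isinstance(item, int):
            words.extend(project(children[item], side))
        else:
            words.append(item)
    return words


print(" ".join(project(DERIVATION, SOURCE)))
print(" ".join(project(DERIVATION, TARGET)))
```

Reordering happens only inside the target templates of π3 and π4, where the child indices appear in swapped order.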

10 / 18

slide-44
SLIDE 44

Model

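\[ H = \{\, h_{G,\Phi,\theta} \mid G \in \mathcal{G},\ \Phi\colon D(G) \to \mathbb{R}^m,\ \theta \in \mathbb{R}^m \,\} \]

where

\[ h_{G,\Phi,\theta}\colon s \mapsto \pi_{TL}\Bigl( \operatorname*{argmax}_{d \,:\, \pi_{SL}(d) = s} G(d) + \Phi(d) \cdot \theta \Bigr) \]

[Diagram: π_SL: D(G) → SL, π_TL: D(G) → TL, Φ: D(G) → R^m, (·θ): R^m → R]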
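
A brute-force reading of the model, assuming the derivations of G are given as a small explicit list. The record fields, the function names `score` and `decode`, and all numbers are hypothetical; a real decoder never enumerates D(G) like this.

```python
def score(d, theta):
    """G(d) + Phi(d) . theta for one derivation record."""
    return d["G"] + sum(f * t for f, t in zip(d["Phi"], theta))


def decode(s, derivations, theta):
    """pi_TL of the best derivation d with pi_SL(d) = s, or None if there is none."""
    candidates = [d for d in derivations if d["source"] == s]
    if not candidates:
        return None
    return max(candidates, key=lambda d: score(d, theta))["target"]


# Hypothetical derivations: source projection, target projection,
# grammar weight G(d) and feature vector Phi(d) in R^2.
DERIVATIONS = [
    {"source": "ich esse spaghetti mit der gabel",
     "target": "i eat spaghetti with a fork", "G": -1.0, "Phi": [2.0, 0.0]},
    {"source": "ich esse spaghetti mit der gabel",
     "target": "i eat spaghetti with the fork", "G": -0.5, "Phi": [1.0, 1.0]},
]

print(decode("ich esse spaghetti mit der gabel", DERIVATIONS, [0.5, -1.0]))
# prints: i eat spaghetti with a fork
```

With θ = (0, 0) the features are ignored and the second derivation wins on grammar weight alone, which shows how θ shifts the choice among derivations of the same source sentence.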

11 / 18

slide-50
SLIDE 50

SCFG Implementations

name    language  BLEU   speed        reference
Hiero   Python    31.22  27.2 s/sent  Chiang et al. 2005
Joshua  Java      31.55  2.34 s/sent  Li et al. 2009
cdec    C++       31.47  0.37 s/sent  Dyer et al. 2010
Moses   C++       ?      ?            Koehn et al. 2007

◮ BLEU score and speed trump adherence to specification
◮ lacking conceptualization, vague presentation
◮ concepts and code not reusable
◮ obstructs progress
◮ SCFGs, XTOPs, STSSG, STAG, MBOT, . . .

12 / 18

slide-56
SLIDE 56

Conventional Model and Decoder

POLISHED!

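\[
\begin{aligned}
h_{G,\Phi,\theta}\colon s \mapsto{} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d \,:\, \pi_{SL}(d) = s} G(d) + \Phi(d) \cdot \theta \Bigr) \\
={} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d} \; (= s)(\pi_{SL}(d)) + G(d) + \Phi(d) \cdot \theta \Bigr) \\
={} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d} \; \mathrm{DF}(G, s)(d) + \theta_1 \cdot F_1(d) + \dots + \theta_n \cdot F_n(d) \Bigr) \\
={} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d} \; \bigl(\mathrm{DF}(G, s) + \theta_1 \cdot F_1 + \dots + \theta_n \cdot F_n\bigr)(d) \Bigr)
\end{aligned}
\]

derivation forest for s (weighted tree automaton):

\[ \mathrm{DF}(G, s)\colon D(G) \to \mathbb{R},\quad d \mapsto (= s)(\pi_{SL}(d)) + G(d) \]

◮ CYK+ or Earley algorithm: O(|G| · |s|^3)
◮ Hadamard product of weighted tree automata: O(|A1| · |A2|)
◮ Knuth algorithm: O(|E| · log |V|)

neither program code nor concepts can be reused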
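
A sketch of the last step, the Knuth algorithm, which generalises Dijkstra's shortest-path algorithm to hypergraphs. It assumes that the derivation forest is given as a list of weighted hyperedges `(head, tails, weight)`, that weights are added along a derivation and maximised, and that every weight is at most 0 (for example log-probabilities), so that extending a derivation never improves its score. The function name `knuth_best` and the example hypergraph are hypothetical.

```python
import heapq


def knuth_best(edges):
    """Best derivation weight and best incoming hyperedge for every reachable node.

    edges: list of (head, tails, weight) with weight <= 0.
    """
    remaining = [len(set(tails)) for _, tails, _ in edges]
    uses = {}  # node -> indices of hyperedges that have it as a tail
    for i, (_, tails, _) in enumerate(edges):
        for t in set(tails):
            uses.setdefault(t, []).append(i)

    best, back, heap = {}, {}, []
    for i, (head, tails, w) in enumerate(edges):
        if not tails:
            heapq.heappush(heap, (-w, head, i))

    while heap:
        neg, node, i = heapq.heappop(heap)
        if node in best:
            continue  # already finalised with a weight at least as good
        best[node], back[node] = -neg, i
        for j in uses.get(node, []):
            remaining[j] -= 1
            if remaining[j] == 0:
                # All tails are final, so this hyperedge yields a candidate.
                head, tails, w = edges[j]
                if head not in best:
                    total = w + sum(best[t] for t in tails)
                    heapq.heappush(heap, (-total, head, j))
    return best, back


EXAMPLE_EDGES = [
    ("A", [], -1.0),
    ("A", [], -0.5),
    ("B", [], -2.0),
    ("S", ["A", "B"], -0.25),
    ("S", ["A", "A"], -3.0),
    ("B", ["A"], -0.5),
]

best, back = knuth_best(EXAMPLE_EDGES)
print(best["S"], back["S"])  # prints: -1.75 3
```

Each hyperedge is relaxed once and each heap operation costs logarithmic time. Following `back` from the goal node downwards recovers the best derivation.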

13 / 18

slide-59
SLIDE 59

IRTGs over (A1, A2)

(Koller and Kuhlmann 2011)

[Diagram: D(G), T_Σ, T_Δ1, T_Δ2, A1, A2, connected by the tree homomorphisms h1, h2 and the evaluations (.)_A1, (.)_A2; labels: state behavior, generational behavior, semantic domains]

SCFG rule: X → yu X₁ you X₂, have X₂ with X₁
RTG rule: X → σ(X, X)
tree hom. h1: σ ↦ concat4(yu, x1, you, x2)
tree hom. h2: σ ↦ concat4(have, x2, with, x1)
A1, A2: string algebra (constants, concatenation)
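
The rule decomposition above can be made concrete as follows. This is a sketch under assumptions: the term encoding (string constants, `("var", i)`, and `(operation, arguments...)`), the function names, and the two leaf symbols `beihan` and `bangjiao` (standing in for rules π7 and π9) are introduced here for illustration only.

```python
# Images of the RTG symbols under the two tree homomorphisms.
H1 = {
    "sigma": ("concat4", "yu", ("var", 1), "you", ("var", 2)),
    "beihan": "Beihan",
    "bangjiao": "bangjiao",
}
H2 = {
    "sigma": ("concat4", "have", ("var", 2), "with", ("var", 1)),
    "beihan": "North Korea",
    "bangjiao": "diplomatic relations",
}


def substitute(term, images):
    """Replace each variable x_i in a term by the i-th child image."""
    if isinstance(term, str):
        return term
    if term[0] == "var":
        return images[term[1] - 1]
    return (term[0],) + tuple(substitute(arg, images) for arg in term[1:])


def apply_hom(h, tree):
    """Apply a tree homomorphism bottom-up to a tree (symbol, child, ...)."""
    symbol, *children = tree
    return substitute(h[symbol], [apply_hom(h, child) for child in children])


def evaluate(term):
    """Evaluate a term in the string algebra: constants and concatenation."""
    if isinstance(term, str):
        return term
    _, *args = term
    return " ".join(evaluate(arg) for arg in args)


tree = ("sigma", ("beihan",), ("bangjiao",))
print(evaluate(apply_hom(H1, tree)))  # prints: yu Beihan you bangjiao
print(evaluate(apply_hom(H2, tree)))  # prints: have diplomatic relations with North Korea
```

The derivation tree itself carries no words and no word order; both come from the homomorphisms and the algebras, which is what allows the algebras to be exchanged without touching the grammar.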

15 / 18

slide-65
SLIDE 65

Model and Decoder for IRTGs over (A1, A2)

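\[
\begin{aligned}
h_{G,\Phi,\theta}\colon s \mapsto{} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d \,:\, \pi_{SL}(d) = s} G(d) + \Phi(d) \cdot \theta \Bigr) \\
={} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d} \; \mathrm{terms}(s)(h_1(d)) + G(d) + \Phi(d) \cdot \theta \Bigr) \\
={} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d} \; (\mathrm{terms}(s) \triangleright G)(d) + \Phi(d) \cdot \theta \Bigr) \\
={} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d} \; (\mathrm{terms}(s) \triangleright G)(d) + \theta_1 \cdot F_1(h_2(d)) + \dots + \theta_n \cdot F_n(h_2(d)) \Bigr) \\
={} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d} \; (\mathrm{terms}(s) \triangleright G)(d) + (\theta_1 \cdot F_1 + \dots + \theta_n \cdot F_n)(h_2(d)) \Bigr) \\
={} & \pi_{TL}\Bigl( \operatorname*{argmax}_{d} \; \bigl(\mathrm{terms}(s) \triangleright G \triangleleft (\theta_1 \cdot F_1 + \dots + \theta_n \cdot F_n)\bigr)(d) \Bigr)
\end{aligned}
\]

recognizable weighted tree language:

\[
\mathrm{terms}(s)(t) =
\begin{cases}
0 & \text{if } t_{A_1} = s \\
-\infty & \text{otherwise}
\end{cases}
\]

input product: \( (L \triangleright G)(d) = L(h_1(d)) + G(d) \)

output product: \( (G \triangleleft L)(d) = G(d) + L(h_2(d)) \)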
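
The chain of equalities can be read as a composition of functions on derivations. The sketch below assumes a small, explicitly enumerated set of derivations and identifies each tree h1(d), h2(d) with the string it evaluates to; the names `terms`, `input_product`, `output_product`, `decode` and all example data are hypothetical, and no tree automata are constructed.

```python
import math


def terms(s):
    """Weight 0 for values equal to s, minus infinity for everything else."""
    return lambda t: 0.0 if t == s else -math.inf


def input_product(L, G, h1):
    """(L |> G)(d) = L(h1(d)) + G(d)."""
    return lambda d: L(h1(d)) + G(d)


def output_product(G, L, h2):
    """(G <| L)(d) = G(d) + L(h2(d))."""
    return lambda d: G(d) + L(h2(d))


def decode(s, derivations, G, h1, h2, features, theta):
    """Target value of the best derivation under terms(s) |> G <| (sum of theta_i * F_i)."""
    combined = lambda t: sum(th * F(t) for th, F in zip(theta, features))
    score = output_product(input_product(terms(s), G, h1), combined, h2)
    best = max(derivations, key=score)
    return h2(best) if score(best) > -math.inf else None


# Hypothetical derivations with grammar weights and both interpretations.
WEIGHT = {"d1": -1.0, "d2": -0.5, "d3": -0.1}
SRC = {"d1": "ich säge ihre ente", "d2": "ich säge ihre ente",
       "d3": "ich sah, wie sie sich duckte"}
TGT = {"d1": "i saw her duck", "d2": "i saw their duck", "d3": "i saw her ducking"}
FEATURES = [lambda t: 1.0 if "her" in t.split() else 0.0]

print(decode("ich säge ihre ente", list(WEIGHT), WEIGHT.get, SRC.get, TGT.get,
             FEATURES, [1.0]))  # prints: i saw her duck
```

The constraint on the source sentence and the target-side features enter the score in the same way, as weighted languages attached to one of the two interpretations.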

16 / 18

slide-66
SLIDE 66

Implementation

◮ IRTG over (strings, trees)
◮ components in Vanda Toolbox (Haskell)
  ◮ input product (Earley’s algorithm)
  ◮ binarization
  ◮ Knuth’s algorithm
◮ accessible in Vanda Studio

17 / 18

slide-67
SLIDE 67

References

Chiang, David et al. (2005). “The Hiero machine translation system: extensions, evaluation, and analysis”. In: Proc. Human Language Technology and Empirical Methods in Natural Language Processing. ACL, pp. 779–786.

Dyer, Chris et al. (2010). “cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models”. In: Proc. ACL 2010 System Demonstrations. ACL, pp. 7–12.

Koehn, Philipp et al. (2007). “Moses: open source toolkit for statistical machine translation”. In: Proc. ACL on Interactive Poster and Demonstration Sessions. ACL, pp. 177–180.

Koller, Alexander and Marco Kuhlmann (2011). “A Generalized View on Parsing and Translation”. In: Proc. IWPT 2011, pp. 2–13.

Li, Zhifei et al. (2009). “Joshua: an open source toolkit for parsing-based machine translation”. In: Proc. Workshop on Statistical Machine Translation. ACL, pp. 135–139.

18 / 18