Machine Translation between Languages with Significant Word Reordering and Rich Target-side Morphology
20th Week of Doctoral Students, June 3rd, 2011
ÚFAL, Charles University in Prague
Bushra Jawaid
- RNDr. Ondřej Bojar (PhD. Advisor)
English Sentence: I understand English and Urdu.
Urdu Translation: میں انگریزی اور اُردو سمجھتی ہوں
Transliteration: meñ angrezī aor Urdū samjhte hūñ
Gloss: I English and Urdu understand (Auxiliary)
number.
                               Root          Infinitive        Oblique
Intransitive/(di)Transitive    bən بن        bənnɑ بننا        bənne بننے
Direct Causative               bənɑ بنا      bənɑnɑ بنانا      bənɑne بنانے
Indirect Causative             bənwɑ بنوا    bənwɑnɑ بنوانا    bənwɑne بنوانے
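The regularity in the verb forms above can be sketched in a few lines of Python. This is only an illustration of the nine romanized forms listed here, assuming that each row has one stem and that the infinitive and oblique are formed by adding "nɑ" and "ne"; the names `STEMS`, `SUFFIXES`, and `paradigm` are hypothetical and the sketch makes no claim about Urdu verbs in general.

```python
# Suffixes read off the listed forms (an assumption limited to this example):
# the root is the bare stem, the infinitive adds "nɑ", the oblique adds "ne".
SUFFIXES = {"root": "", "infinitive": "nɑ", "oblique": "ne"}

# One romanized stem per row of the listed forms.
STEMS = {
    "intransitive/(di)transitive": "bən",
    "direct causative": "bənɑ",
    "indirect causative": "bənwɑ",
}


def paradigm(stems=STEMS, suffixes=SUFFIXES):
    """Build every stem + suffix combination as a nested dictionary."""
    return {
        row: {column: stem + suffix for column, suffix in suffixes.items()}
        for row, stem in stems.items()
    }


table = paradigm()
for row, forms in table.items():
    print(row, forms["root"], forms["infinitive"], forms["oblique"])
```

Three stems and three endings already give nine surface forms for a single verb, which illustrates why target-side morphology enlarges the vocabulary the translation system has to produce.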
[Diagram: English -> Czech translation over three factors: Form -> Form (+LM), Lemma -> Lemma (+LM), Morphology -> Morphology (+LM)]
[Diagram: Plain text input -> 1st step (Reordering) -> Middle language (middle layer) -> 2nd step (Morphology) -> Plain text output]
[Diagram: Plain text input -> Moses (distance-based, lexicalized reordering) -> Strings -> Moses (monotone) -> Plain text output]
1. Reordering options:
improved reordering.
movements.
[Diagram: Plain text input -> Moses-chart or Joshua -> Strings (1-best output) -> Moses (monotone) -> Plain text output]
[Diagram: Plain text input -> Transformation System, Moses-chart, or Joshua -> Strings (1-best output) -> Moses (monotone) -> Plain text output]
[Diagram: Plain text input -> Transformation System or Moses -> Strings (1-best output) or Lattices (N-best output) -> Moses (monotone) -> Plain text output]
2. Middle Layer options:
Passing lattices of possible hypotheses from the 1st step to the 2nd step instead of passing a single string hypothesis. Multiple reorderings are considered, and the 2nd step is free to choose the one that is easiest to inflect.
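A minimal sketch of this idea in Python, assuming a toy lattice stored as an adjacency list and a made-up cost function standing in for the 2nd step's model score. The names `paths`, `pick_easiest`, and `toy_cost` are hypothetical; a real system would search the lattice inside the decoder rather than enumerate every path.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# A toy lattice: node -> list of (next node, phrase on that edge).
Lattice = Dict[int, List[Tuple[int, str]]]


def paths(lattice: Lattice, node: int, end: int) -> Iterator[List[str]]:
    """Enumerate every phrase sequence from `node` to `end` (fine for tiny lattices only)."""
    if node == end:
        yield []
        return
    for next_node, phrase in lattice.get(node, []):
        for rest in paths(lattice, next_node, end):
            yield [phrase] + rest


def pick_easiest(lattice: Lattice, start: int, end: int,
                 cost: Callable[[List[str]], float]) -> List[str]:
    """Return the path with the lowest cost under the given scoring function."""
    return min(paths(lattice, start, end), key=cost)


# Two reorderings of the example sentence packed into one lattice.
lattice: Lattice = {
    0: [(1, "I")],
    1: [(2, "understand"), (3, "English and Urdu")],
    2: [(4, "English and Urdu")],
    3: [(4, "understand")],
}


def toy_cost(tokens: List[str]) -> float:
    # Assumption for illustration only: the verb-final order is cheaper.
    return 0.0 if tokens[-1] == "understand" else 1.0


best = pick_easiest(lattice, 0, 4, toy_cost)
print(" ".join(best))
```

With a single string, the 2nd step would be committed to whichever order the 1st step ranked highest; with the lattice, both orders stay available until the 2nd step scores them.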
[Diagram: Plain text input -> Transformation System -> Strings or Lattices (1-best or N-best output) -> Moses (monotone), Joshua, or Moses-chart -> Plain text output]
3. 2nd Layer options:
[Diagram: Plain text input -> Transformation System -> Strings or Lattices (1-best or N-best output) -> Classifier, Moses (monotone), Joshua, or Moses-chart -> Plain text output]
– Collecting tools such as a tagger and a morphological analyzer for Urdu.
– Trying to combine the taggers to improve precision.
– Need to merge the different tagsets.
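One simple way to combine taggers is to map each tagger's output into a shared tagset and keep a tag only when a majority agrees. The Python sketch below illustrates that idea under stated assumptions: the tagger names, the tag labels, and the mappings in `TAGSET_MAPS` are all invented for the example and do not describe any actual Urdu tagger.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical mappings from each tagger's own tagset to a shared coarse tagset.
TAGSET_MAPS: Dict[str, Dict[str, str]] = {
    "tagger_a": {"NN": "NOUN", "VB": "VERB", "PN": "PRON", "CC": "CONJ"},
    "tagger_b": {"N": "NOUN", "V": "VERB", "PRO": "PRON", "CONJ": "CONJ"},
    "tagger_c": {"noun": "NOUN", "verb": "VERB", "pron": "PRON", "conj": "CONJ"},
}


def merge_tags(outputs: Dict[str, List[str]]) -> List[Optional[str]]:
    """Vote per token over the mapped tags; return None where no strict majority exists."""
    mapped = [
        [TAGSET_MAPS[name].get(tag, "UNK") for tag in tags]
        for name, tags in outputs.items()
    ]
    merged: List[Optional[str]] = []
    for votes in zip(*mapped):
        tag, count = Counter(votes).most_common(1)[0]
        # Abstaining on disagreement trades coverage for precision.
        merged.append(tag if count > len(votes) / 2 else None)
    return merged


# Invented outputs for a four-token sentence.
outputs = {
    "tagger_a": ["PN", "NN", "VB", "CC"],
    "tagger_b": ["PRO", "N", "N", "N"],
    "tagger_c": ["pron", "verb", "verb", "verb"],
}
result = merge_tags(outputs)
print(result)
```

The mapping step is where the tagset merge happens: votes are only comparable once every tagger speaks the same tagset, so a coarse shared inventory is the easiest starting point.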