
Machine Translation: An Overview

Marcello Federico FBK, Trento - Italy 2014

  • M. Federico

MT 2014 1

Outline

  • Introduction
  • Motivation
  • Approaches
  • Brief history
  • Evaluation
  • State-of-the-art
  • Examples

References:

  • P. Koehn, Statistical Machine Translation, Cambridge University Press, 2009.
  • A. Lopez, Statistical Machine Translation, ACM Computing Surveys, vol. 40, number 3, 2008.
  • D. Jurafsky and J. H. Martin, Speech and Language Processing, Prentice Hall, 2009.
  • C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.


Machine Translation

Wikipedia: Machine translation, often referred to by the acronym MT, is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another.

Personal definition: MT generally investigates the automatic translation of "standard" language that can be systematically observed in ordinary communication, e.g. conversations, news, speeches, business letters, user manuals, etc. MT is generally not concerned with literary genres, nor with creative and sophisticated uses of language. For several reasons, this kind of language is simply out of the scope of MT.

Note: for a very interesting introduction to issues related to the translation of literary works, see Umberto Eco, "Experiences in Translation", U. Toronto Press, 2001.


Introduction to MT

Why is Machine Translation so Important? (Source: Common Sense Advisory, 2010)

  • Information society and production of multilingual content

– 7 billion people, 193 countries, over 150 official languages

  • Globalization and demand for translation services

– 1,000 global companies operating in at least 160 countries

  • Size of worldwide translation market

– 12.5 billion $ per year ≈ 34 million $ per day

  • Size of translation industry

– 3,000 translation companies, 250,000 translators

  • MT can improve productivity of human translators

– integration of MT with human translation (post-editing)

  • MT can supply cheap gist translation

– competitive quality-cost-speed trade-off


Introduction to MT

Do we need more research in MT?

Chinglish examples, some of which result from MT errors.


Introduction to MT

Why is Machine Translation so Difficult?

High quality human translation implies:

  • deep and rich understanding of source language and text
  • sophisticated and creative command of target language

Nowadays, feasible goals for machine translation are tasks where:

  • even approximate translations are helpful (gist translation)
  • professional translators can take advantage of it (computer assisted translation)
  • linguistic domain is very focused and limited (apps for travelers)

In general, the difficulty of translating depends on how similar the target and source languages are in their vocabulary, grammar, and conceptual structure.


Applications of MT

Gist translation for social media.


Speech translation app.


Integration of MT into computer assisted translation.


Differences and Similarities of Languages

  • Universal communicative role of language

– names for people, words for talking about women, men, children
– every language seems to have nouns and verbs

  • Differences/similarities across large classes of languages:

– Morphology: one vs. many morphemes per word, agglutination vs. fusion
– Syntax: Subj-Verb-Obj structure (E) vs. SOV (J) vs. VSO (Irish)
– Semantics: mapping of semantic roles and meaning of words, e.g. direction/manner of motion indicated by verb/satellite: the bottle floated out (E) → la botella salió flotando (S)

  • Lexical divergence between languages:

– Semantic: there is no corresponding word with the same meaning, e.g. wall (E) → Wand/Mauer (G, inside/outside)
– Syntactic: a word is better translated into another part of speech, e.g. she likes to sing (E, verb) → sie singt gerne (G, adverb)

  • Cultural differences: the philosophical question of whether translation is possible at all

Lexical Divergence

English brother → Japanese otooto (younger), oniisan (older)
English is → Japanese iru (animate subject), aru (inanimate subject)
English know → French connaître (be acquainted with), savoir (know a proposition)
English they → French ils (masculine), elles (feminine)
German Berg → English hill, mountain

  • some languages make distinctions that other languages don't
  • it is difficult to translate from less specific into more specific information
  • do language differences enforce different conceptual structures?
  • do people who speak different languages think differently?

Note: watch the talk by Lera Boroditsky (Stanford U.), "How Language Shapes Thought", fora.tv.


Approaches to MT

Rough classification according to employed linguistic representations:

  • Direct model: translate and re-order single words or n-grams

– basically, no linguistic representation is used

  • Transfer model: use explicit knowledge about language differences

– analyze lexical and syntactic structure of source sentence
– transfer structures from source to target language
– generate corresponding sentence in the target language

  • Interlingua model: extract the meaning and express it in the target language

– analyze lexical, syntactic and semantic structure of source sentence
– interpret the meaning into a canonical interlingua
– generate the target sentence from the interlingua

Notice: the knowledge required by the interlingua approach grows linearly with the number of languages, rather than quadratically.
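The scaling remark can be made concrete with a simplified count. The sketch below assumes one transfer component per ordered language pair, and one analyzer plus one generator per language for the interlingua; both function names are illustrative and the accounting ignores any shared analysis or generation components.

```python
def transfer_modules(n_languages: int) -> int:
    """One transfer component per ordered (source, target) language pair."""
    return n_languages * (n_languages - 1)


def interlingua_modules(n_languages: int) -> int:
    """One analyzer into, and one generator out of, the interlingua per language."""
    return 2 * n_languages


for n in (2, 5, 10, 20):
    print(n, transfer_modules(n), interlingua_modules(n))
```

Under these assumptions, ten languages need 90 transfer components but only 20 interlingua components.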


Vauquois’s Triangle

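[Figure: Vauquois's triangle. The base links the source string to the target string (direct translation). The left side is analysis, rising from the source string through syntax and semantics; the right side is generation, descending through semantics and syntax to the target string. Transfer connects the two sides at intermediate levels, and the interlingua sits at the apex.]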


Approaches to MT

How are knowledge and linguistic information acquired by the system?

  • Hand-crafted: knowledge for analysis, transfer, generation, meaning representation, or direct translation is manually developed

– most commercial MT systems fall into this category
– requires lots of human labor and expertise
– includes: rule-based MT

  • Machine-learned: representations are implemented by mathematical models learnable from data, e.g. parallel corpora of human translations

– much less human effort is needed
– requires huge amounts of data: the more, the better!
– includes: statistical MT and example-based MT


Transfer-Based MT

Context-free grammar:

NP → DT NPB
NPB → JJ NN
NPB → NN
...
DT → the
JJ → north
NN → wind
...

Synchronous context-free grammar:

NP → DT1 NPB2 / DT1 NPB2
NPB → JJ1 NN2 / NN2 JJ1
NPB → NN / NN
...
DT → the / il
JJ → north / settentrionale
NN → wind / vento
...

[Figure: paired parse trees. English: (NP (DT the) (NPB (JJ north) (NN wind))). Italian: (NP (DT il) (NPB (NN vento) (JJ settentrionale))).]


Note: this is a toy example. Working approaches use a very large set of probabilistic and lexicalized rules.
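The toy grammar can be exercised with a small sketch. The representation below (a derivation node as a triple of source right-hand side, target right-hand side, and children) is an assumption made for illustration, not the format of any particular toolkit; integers play the role of the co-indexed non-terminals.

```python
def realize(node):
    """Expand a synchronous derivation into (source words, target words).

    A node is (source_rhs, target_rhs, children). In each right-hand side a
    string is a terminal word and an int i refers to children[i], so the two
    sides may order the same children differently.
    """
    source_rhs, target_rhs, children = node
    expanded = [realize(child) for child in children]

    def side(rhs, which):
        words = []
        for item in rhs:
            if isinstance(item, int):
                words.extend(expanded[item][which])
            else:
                words.append(item)
        return words

    return side(source_rhs, 0), side(target_rhs, 1)


# Lexical rules: DT → the / il, JJ → north / settentrionale, NN → wind / vento
DT = (["the"], ["il"], [])
JJ = (["north"], ["settentrionale"], [])
NN = (["wind"], ["vento"], [])
# NPB → JJ1 NN2 / NN2 JJ1 (the adjective moves after the noun in Italian)
NPB = ([0, 1], [1, 0], [JJ, NN])
# NP → DT1 NPB2 / DT1 NPB2 (same order on both sides)
NP = ([0, 1], [0, 1], [DT, NPB])

english, italian = realize(NP)
print(" ".join(english))
print(" ".join(italian))
```

The only reordering comes from the NPB rule; every other rule keeps its children in the same order on both sides.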


Interlingua-Based MT

  • Applied to linguistic domains with a limited number of relations and concepts

– tourist information, hotel booking, flight reservation, ...

  • Semantics of a sentence can be expressed with predicate-argument structure

– I need a twin bed room reservation for tomorrow
– book-room(date=tomorrow,type=single)

  • Interlingua language has to be designed carefully (by hand)

– for some applications, a formalism similar to SQL

  • Processing steps in IBMT:

– extract content from source sentence
– map content into SQL-like IL format
– generate translation from IL format

Example dialogue (S: speech (English), I: interlingua, T: translation (English)):

  • S: I'm arriving on june sixth
  • I: give-information+temporal+arrival (who=I, time=(june, md6))
  • T: my arrival time is sixth of june
  • S: no that's not necessary
  • I: negate
  • T: no
  • S: and i was wondering what you have in the way of rooms available during that time
  • I: request-information+availability+room (room-type=question)
  • T: what kind of rooms are available?


Example-Based MT

  • Assumption: people translate by analogy

– Decompose a sentence into phrases
– Translate phrases by analogy to previous translations
– Properly compose translation fragments into one long sentence

  • Given a parallel corpus of translation examples

Italian                                    German
sono possibili deboli nevicate             leichte Schneefälle sind möglich
sono possibili alcuni rovesci              ein paar Regenschauer sind möglich
le deboli precipitazioni cesseranno        die leichte Niederschläge klingen ab
si verificheranno deboli precipitazioni    leichte Niederschläge werden einsetzen

  • Learn translation patterns

Italian                    German
sono possibili X       →   X sind möglich
deboli precipitazioni  →   leichte Niederschläge


  • Translate a (possibly new) source sentence

Italian                                  German
sono possibili deboli precipitazioni  →  leichte Niederschläge sind möglich
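The translation step can be sketched with the two learned patterns. This is a minimal illustration, assuming a single gap marked by the letter X and exact string matching; the names PHRASES, TEMPLATES, fill_gap and translate are hypothetical, and a real example-based system would match by similarity rather than by exact lookup.

```python
# Learned patterns from the slide: one phrase pair and one template with a gap X.
PHRASES = {"deboli precipitazioni": "leichte Niederschläge"}
TEMPLATES = [("sono possibili X", "X sind möglich")]


def fill_gap(source_template, sentence):
    """Return the text covered by the gap X, or None if the template does not fit."""
    prefix, suffix = source_template.split("X")
    if len(sentence) <= len(prefix) + len(suffix):
        return None
    if sentence.startswith(prefix) and sentence.endswith(suffix):
        return sentence[len(prefix):len(sentence) - len(suffix)]
    return None


def translate(sentence):
    """Translate by matching a template and translating the gap by phrase lookup."""
    for source_template, target_template in TEMPLATES:
        gap = fill_gap(source_template, sentence)
        if gap in PHRASES:
            return target_template.replace("X", PHRASES[gap])
    return None  # no known pattern covers this sentence


print(translate("sono possibili deboli precipitazioni"))
```

The input sentence never occurs in the example corpus as a whole; it is covered by composing the template with the phrase pair.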

Statistical Machine Translation


  • parallel texts and word alignments

[Figure: word-alignment grids for two Italian-English sentence pairs; the first pairs "dalla serata di domani soffierà un freddo vento orientale" with "since tomorrow evening an eastern chilly wind will blow".]

  • word translation probabilities

translations of freddo    counts    probs
cold                      43        0.43
cool                      28        0.28
chill                     15        0.15
chilly                    10        0.10
...                       ...       ...

translations of vento     counts    probs
wind                      59        0.59
breeze                    26        0.26
...                       ...       ...

  • word concatenation probabilities

bigrams with eastern      counts    probs
eastern wind              12        0.12
eastern chilly            10        0.10
eastern breeze            7         0.07
eastern cool              5         0.05
eastern ...               ...       ...
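The probabilities in these tables are consistent with plain relative frequencies. The sketch below assumes a total of 100 aligned occurrences of freddo (the slide elides the remaining translations), and the pairing of each English word with its count is read off the flattened table, so treat both as assumptions.

```python
def relative_frequencies(counts, total):
    """Estimate p(target | source) as count(source, target) / count(source)."""
    return {word: count / total for word, count in counts.items()}


# Counts of English words aligned to "freddo" (pairing assumed from the slide).
FREDDO_COUNTS = {"cold": 43, "cool": 28, "chill": 15, "chilly": 10}
TOTAL_FREDDO = 100  # assumed: includes translations the slide does not list

probs = relative_frequencies(FREDDO_COUNTS, TOTAL_FREDDO)
for word, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{word:8s} {p:.2f}")
```

The same estimator applied to bigram counts gives the word concatenation probabilities.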


  • given word translation and word concatenation probabilities
  • generate possible translations of the source sentence and score them

source sentence: un freddo vento da est

candidate translation       score
a cold eastern wind         0.12
an eastern chilly wind      0.10
a eastern cool wind         0.09
a cool eastern breeze       0.08
an eastern chilly breeze    0.05
...                         ...

  • return the best scoring translation
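A heavily simplified scorer illustrates the generate-score-select loop. It multiplies the word translation probabilities for freddo and vento with any listed bigram probability, and gives every other word and bigram a neutral factor of 1.0. That neutral factor, the function name score, and the reading of the probability tables are assumptions; the sketch does not attempt to reproduce the scores shown on the slide.

```python
# Probabilities as read from the slide's tables (pairing of words and values assumed).
T_FREDDO = {"cold": 0.43, "cool": 0.28, "chill": 0.15, "chilly": 0.10}
T_VENTO = {"wind": 0.59, "breeze": 0.26}
BIGRAM = {
    ("eastern", "wind"): 0.12,
    ("eastern", "chilly"): 0.10,
    ("eastern", "breeze"): 0.07,
    ("eastern", "cool"): 0.05,
}


def score(candidate):
    """Product of known translation and bigram probabilities (unknown factors = 1.0)."""
    words = candidate.split()
    total = 1.0
    for word in words:
        total *= T_FREDDO.get(word, 1.0) * T_VENTO.get(word, 1.0)
    for pair in zip(words, words[1:]):
        total *= BIGRAM.get(pair, 1.0)
    return total


CANDIDATES = [
    "a cold eastern wind",
    "an eastern chilly wind",
    "a eastern cool wind",
    "a cool eastern breeze",
    "an eastern chilly breeze",
]

best = max(CANDIDATES, key=score)
print(best, round(score(best), 4))
```

Even with this crude model, the translation and concatenation probabilities interact: a likely word choice can lose to a less likely one that forms a better bigram.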

Phrase-based SMT

PBSMT is a generalization of word-based SMT:

  • given automatic alignment of words in parallel texts

[Figure: word-alignment grid for "dalla serata di domani soffierà un freddo vento orientale" / "since tomorrow evening an eastern chilly wind will blow".]

  • blocks of aligned words, or phrases, are detected
  • many phrase pairs are collected and stored with their probabilities

dalla serata di domani - since tomorrow evening
un freddo vento orientale - an eastern chilly wind
soffierà - will blow
serata di domani - tomorrow evening
dalla - since
di domani - tomorrow
...


Then, we use a search algorithm to find the optimal translation with phrase pairs.

How PBSMT works:

  • translation: segment input, translate and re-arrange phrases
  • steps: select a source segment, translate and attach to target
  • scores: linear combination of feature functions
  • features: phrase pairs, target n-grams, relative phrase movement, ...
  • decoder: efficient algorithm to compute (sub-)optimal solutions
  • features and combination weights are machine learnable from parallel data
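The scoring rule, a linear combination of feature functions, can be sketched as a weighted sum. All feature names, feature values and weights below are hypothetical numbers chosen for illustration; in a real system the features come from the phrase table, the language model and a reordering model, and the weights are tuned on held-out data.

```python
# Hypothetical weights for three feature functions.
WEIGHTS = {"phrase_logprob": 1.0, "lm_logprob": 0.5, "distortion": 0.2}


def model_score(features, weights=WEIGHTS):
    """Linear model: sum over features of weight * feature value."""
    return sum(weights[name] * value for name, value in features.items())


# Two hypothetical translations of the same input with made-up feature values.
# "distortion" is the negated amount of phrase movement (0.0 means monotone order).
HYPOTHESES = {
    "since tomorrow evening an eastern chilly wind will blow": {
        "phrase_logprob": -1.2, "lm_logprob": -4.0, "distortion": -1.0,
    },
    "since tomorrow evening will blow an eastern chilly wind": {
        "phrase_logprob": -1.2, "lm_logprob": -6.5, "distortion": 0.0,
    },
}

best = max(HYPOTHESES, key=lambda text: model_score(HYPOTHESES[text]))
print(best)
```

Here the reordered hypothesis wins because its language model gain outweighs the distortion penalty; changing the weights changes that trade-off.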

Hierarchical Phrase-Based SMT

  • First tree-to-tree approach to perform better than phrase-based SMT on large scale evaluations involving very distant languages
  • Discontinuous phrases, i.e. phrases with gaps
  • Long-range reordering rules
  • Formalized as synchronous context-free grammars (transfer approach?)
  • Not based on syntactic rules: just two non-terminal symbols!
  • The model is fully machine learnable!
  • Example. Chinese-English: original, transliteration, glosses, and translation

HPBM: Motivations

  • Example. Typical Phrase-Based Chinese-English Translation:

Let us model some syntactic differences with simple phrase-rules:

  • Chinese VPs follow PPs / English VPs precede PPs

yu X1 you X2 / have X2 with X1

  • Chinese NPs follow RCs / English NPs precede RCs

X1 de X2 / the X2 that X1

  • translation of zhiyi construct in English word order

X1 zhiyi / one of X1

Our rules use one non-terminal X and indices to mark multiple occurrences. These rules can be inferred automatically from word-aligned parallel data.


HPBM: Example Rules

(1) S → X1 / X1
(2) S → S1 X2 / S1 X2
(3) X → yu X1 you X2 / have X2 with X1
(4) X → X1 de X2 / the X2 that X1
(5) X → X1 zhiyi / one of X1
(6) X → Aozhou / Australia
(7) X → Beihan / N. Korea
(8) X → shi / is
(9) X → bangjiao / dipl.rels.
(10) X → shaoshu guojia / few countries
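A bottom-up derivation with these rules can be sketched by substituting already-translated pieces into the gaps. The source sentence used here is assembled from the lexical rules and is an assumption, as is the romanization shi for the copula; apply_rule and GLUE are illustrative names, with GLUE standing in for the concatenating rule (2).

```python
def apply_rule(rule, *parts):
    """Fill gaps X1, X2, ... on both sides of a rule with (source, target) pairs."""
    source, target = rule
    for index, (part_source, part_target) in enumerate(parts, start=1):
        source = source.replace(f"X{index}", part_source)
        target = target.replace(f"X{index}", part_target)
    return source, target


# Rules (3)-(5) and the glue rule (2), written as (source side, target side).
HAVE_WITH = ("yu X1 you X2", "have X2 with X1")
THAT = ("X1 de X2", "the X2 that X1")
ONE_OF = ("X1 zhiyi", "one of X1")
GLUE = ("X1 X2", "X1 X2")

# Lexical rules (6)-(10); "shi" is an assumed romanization of the copula.
AUSTRALIA = ("Aozhou", "Australia")
N_KOREA = ("Beihan", "N. Korea")
IS = ("shi", "is")
DIPL_RELS = ("bangjiao", "dipl.rels.")
FEW_COUNTRIES = ("shaoshu guojia", "few countries")

clause = apply_rule(HAVE_WITH, N_KOREA, DIPL_RELS)
noun_phrase = apply_rule(THAT, clause, FEW_COUNTRIES)
predicate = apply_rule(ONE_OF, noun_phrase)
sentence = apply_rule(GLUE, apply_rule(GLUE, AUSTRALIA, IS), predicate)

print(sentence[0])
print(sentence[1])
```

Rules (3) and (4) each swap their two gaps, which is how the derivation produces the long-range reordering between Chinese and English word order.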


A Brief History of Machine Translation

before 1900: various suggestions about "mechanical" translation
1933: French patent by Georges Artsrouni: storage device on paper tape to find translations of words
1933: Russian patent by Petr Petrovich Troyanskii: lexical-syntactic transfer (base forms + syntactic functions)
1949: memorandum by Warren Weaver (and Andrew D. Booth): cryptography methods, statistical methods, Shannon's theory
1951: first research position on MT at MIT
1954: rule-based MT project by Georgetown U. + IBM: public demo Russian to English (vocabulary: 250 words, grammar: 6 rules)
1955: U. Leningrad: interlingua as artificial language

Note: a rich source of historical information about MT is John Hutchins' website, http://www.hutchinsweb.me.uk.


1933: translating machine by Artsrouni


1954: computer IBM 701 used for the first MT demo.


1956-1966: large scale funding in US: high expectation & disillusion
1957: Peter Toma starts building Systran
1958: U. Washington, IBM: word-for-word approach, Russian-English system for US Air Force (up to 1970)
1960: RAND corp.: rough translation with statistical approach
1961: U. Georgetown (+ P. Toma): Russian to English demo, rule based (more levels of analysis)
around 1960: MIT and U. Texas work on syntactic transfer approach
1967: ALPAC report: US funding drastically reduced for 10 years
1970-1981: U. Montreal, TAUM project: rule-based, logic-programming; success with weather forecasts, failure with aviation manuals
1960-1971: U. Texas and U. Grenoble work on interlingua approach, logic
1975: interlingua loses interest


1980 - : Rule based transfer and new interlingua approaches

– based on linguistic theories, logic programming, AI

1990 - : Rule based MT dominance is broken

– Statistical alignment models for French-English (IBM)
– Example-based translation (Sato and Nagao, Japan)

1990 - : Speech translation projects: limited domains

– ATR, Kyoto: automatic telephony research
– CSTAR consortium (US, Europe, Asia)
– Verbmobil project (Germany)

2000 - : Unrestricted language translation

– Automatic evaluation metrics for MT (IBM)
– TIDES/GALE (US): written/spoken news, Chi/Ara to Eng
– TC-STAR (EU): news Chi to Eng, speeches Spa-Eng

2005 - : Open source for MT

– Toolkits: Moses, Hiero, SRILM, Irstlm, ...
– Resources: Europarl, UN, French-English 10^9 corpus


Experimental Development Cycle

Experimental research in HLT generally follows this development cycle:

[Figure: development cycle diagram with the stages Insight, Model, Implement, Evaluate, Benchmark, Results, Analyse, Decide, Improve, Discard, Exploit.]

Evaluation bottleneck: MT developers need to monitor the effect of changes to their systems in order to separate good ideas from bad ones!


Evaluating MT Performance

How do we evaluate the output of an MT system?

  • Human MT evaluation

– criteria: adequacy, fluency, ranking, post-edit effort
– pros: very accurate, high quality
– cons: expensive, slow, difficult, subjective, difficult to reproduce

  • Automatic MT evaluation

– criteria: similarity with respect to one or more human translations
– pros: cheap, quick, correlates with human judgments, good sensitivity
– cons: correlation with human scores is not high, not very informative

MT systems can be tuned to optimize automatic metrics!


Automatic Evaluation of MT

Compare MT output against one or more human translations (references):

  • Word alignment methods

– WER: based on edit distance
– TER: extends edit distance with shifts of blocks

  • N-gram matching methods

– BLEU: matching n-grams
– NIST: extends BLEU by weighting n-grams by informativeness
– GTM: F-score of matching n-grams, rewarding longer matches
– METEOR: weighted F-score of 1-gram matches with synonymy matching
– ...


WER: Word Error Rate

Compute the alignment between hypothesis (H) and reference (R) that minimizes the editing operations (word insertions, deletions, and substitutions) needed to fix H.

H: it is a guide to action which ensures that the military always obeys the commands of the party
R: it is a guide to action that ensures that the military will forever heed party commands

  • To fix H we need to apply 4 substitutions + 1 insertion + 3 deletions = 8 operations

We compute the ratio between the number of operations and the length of R (16 words):

WER = 8 / 16 = 0.50
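The figure can be checked with a word-level edit distance. This is a minimal sketch using the standard dynamic-programming recurrence with unit costs; the function names are illustrative, and whitespace tokenization is assumed.

```python
def edit_distance(hyp_words, ref_words):
    """Minimum number of word insertions, deletions and substitutions."""
    previous = list(range(len(ref_words) + 1))
    for i, hyp_word in enumerate(hyp_words, start=1):
        current = [i]
        for j, ref_word in enumerate(ref_words, start=1):
            current.append(min(
                previous[j] + 1,                           # delete hyp_word
                current[j - 1] + 1,                        # insert ref_word
                previous[j - 1] + (hyp_word != ref_word),  # match or substitute
            ))
        previous = current
    return previous[-1]


def wer(hypothesis, reference):
    """Word error rate: edit distance normalized by the reference length."""
    ref_words = reference.split()
    return edit_distance(hypothesis.split(), ref_words) / len(ref_words)


H = ("it is a guide to action which ensures that the military "
     "always obeys the commands of the party")
R = ("it is a guide to action that ensures that the military "
     "will forever heed party commands")

print(edit_distance(H.split(), R.split()), wer(H, R))
```

Because the denominator is the reference length, WER can exceed 1 when the hypothesis is much longer than the reference.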


TER: Translation Error Rate

  • H: this week the Saudis denied information published in the New York Times
  • R: Saudi Arabia denied this week information published in the American New York Times
  • "this week" is shifted
  • "Saudi Arabia" in the REF appears as "the Saudis" in the HYP
  • "American" appears only in the REF

In this case, the number of edits is 4 (1 shift, 2 substitutions, and 1 insertion), normalized by the 13 words of the reference:

TER% = 4 / 13 × 100 = 30.8%

Computing TER exactly is intractable, so in practice we rely on approximate calculations.


BLEU:

H: it is a guide to action which ensures that the military always obeys the commands of the party

R1: it is a guide to action that ensures that the military will forever heed party commands

R2: it is the guiding principle which guarantees the military forces always being under the command of the party

1-gram matches: 17 out of 18
2-gram matches: 10 out of 17
3-gram matches: 7 out of 16
...

The BLEU score combines statistics of matches at different levels of granularity.
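The match statistics can be recomputed with clipped n-gram counts, where each hypothesis n-gram is credited at most as many times as it occurs in any single reference. That clipping convention is the usual one for BLEU but is an assumption here, since the slide does not state how its counts were obtained; with it, the sketch finds 17 of 18 unigrams, 10 of 17 bigrams and 7 of 16 trigrams.

```python
from collections import Counter


def ngrams(words, n):
    """Multiset of the n-grams in a list of words."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def clipped_matches(hypothesis, references, n):
    """Return (matched, total) n-gram counts, clipping by the best reference count."""
    hyp_counts = ngrams(hypothesis.split(), n)
    max_ref = Counter()
    for reference in references:
        for gram, count in ngrams(reference.split(), n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return matched, sum(hyp_counts.values())


H = ("it is a guide to action which ensures that the military "
     "always obeys the commands of the party")
R1 = ("it is a guide to action that ensures that the military "
      "will forever heed party commands")
R2 = ("it is the guiding principle which guarantees the military forces "
      "always being under the command of the party")

for n in (1, 2, 3):
    print(n, clipped_matches(H, [R1, R2], n))
```

BLEU itself then combines these precisions, typically as a geometric mean with a penalty for short hypotheses; that final step is left out of the sketch.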


The State of the Art

  • SMT is now a very competitive technology

– in many evaluations SMT outperformed rule-based MT
– several commercial SMT systems: Google, Microsoft, IBM, LW, ...

  • Interest in SMT was revived by seminal work at IBM in the early 1990s

– indeed the core ideas go back to Warren Weaver in 1949

  • Best performing SMT systems use one of:

– brute force direct translation exploiting huge amounts of data
– combination of phrase-based and tree-based models
– integration of morphology and syntax

  • Automatic evaluation has boosted research in SMT:

– model training directly optimizes the evaluation metric

  • Evaluation campaigns are organized every year:

– NIST: news texts, Chi/Ara to Eng (2002-)
– IWSLT: traveling/lectures speech, Asian-EU languages (2004-)
– WMT: news texts, many EU languages (2005-)


Example 1: Arabic to English

H: Dubai 2 - 7 ( AFP ) - The Secretary-General of the United Nations Kofi Annan said he would donate the international Zayed Prize for the Environment , which he received on Monday night in Dubai worth 500000 dollars , to setup a foundation for agriculture and educating girls in Africa .

Machine: Dubai 2-7 (AFP) - United Nations Secretary-General Kofi Annan said that the award will Zayed International Environment, which received Monday evening in Dubai worth 500,000 dollars to establish an institution for agriculture and education of girls in the African continent.


Machine, with post-edits (added words are set off by "..."): Dubai 2-7 (AFP) - United Nations Secretary-General Kofi Annan said that the award will Zayed International Environment, which ... he ... received ... on ... Monday evening in Dubai worth 500,000 dollars ... , will be donated ... to establish an institution for agriculture and education of girls in the African continent.

Looks useful for post-editing!


Example 2: Arabic to English

H: New York ( The United Nations ) 2 - 8 ( AFP ) - United Nations Secretary General Kofi Annan expressed his concern today , Tuesday , about the wave of targeted liquidations being carried out by Israel in Gaza and the West Bank , and he also condemned the rocket attacks targeting the Hebrew State , according to his spokesman .

Machine: New York (United Nations) 2-8 (AFP) - United Nations Secretary General Kofi Annan expressed concern today, Tuesday, the wave of qualifiers quality by Israel in Gaza and the West Bank, also condemned the missile attacks against the Jewish state, his spokesman said.


Human New York ( The United Nations ) 2 - 8 ( AFP ) - United Nations Secretary General Kofi Annan expressed his concern today , Tuesday , about the wave of targeted liquidations being carried out by Israel in Gaza and the West Bank , and he also condemned the rocket attacks targeting the Hebrew State , according to his spokesman . Machine New York (United Nations) 2-8 (AFP) - United Nations Secretary General Kofi Annan expressed concern today, Tuesday, ... about ... the wave of qualifiers quality targeted liquidations by Israel in Gaza and the West Bank, ... and he ... also condemned the missile attacks against the Jewish state, his spokesman said. Looks also useful for post-editing!


Example 3: Chinese to English

H: Today was the Catholic Church’s annual "Life Day". Pope Benedict XVI delivered a speech in St. Peter’s Basilica, in which he criticized that the hedonism of wealthy society impairs the Christian value system of respect for life, and he strongly condemned abortion and euthanasia.

Machine: Today is the "life" of the Catholic Church once a year, when 16 of the pope delivered a speech in St. Peter’s cathedral, criticized the joy of an affluent society, undermine the values of the Christian faith to respect life, and strongly condemned euthanasia and abortion.


Example 3: Chinese to English

Human (?): Today was the Catholic Church’s annual "Life Day". Pope Benedict XVI delivered a speech in St. Peter’s Basilica, in which he criticized that the hedonism of ...our... wealthy society ...which... impairs the Christian value system of respect for life, and he strongly condemned abortion and euthanasia.

Machine: Today is the "life ..day..." of the Catholic Church once a year, when 16 of the pope delivered a speech in St. Peter’s cathedral, ...he... criticized the joy of an affluent society, ... that... undermines the values of the Christian faith to respect life, and strongly condemned euthanasia and abortion.

Difficult to make out the meaning of this!


Example 4: Chinese to English

H: The Pope told thousands of believers making the pilgrimage to St . Peter’s Basilica , " Life is often glorified during times of happiness , but no longer respected during times of sickness and trouble or when it is impaired . "

Machine: The pope told thousands who came to St. Peter’s church followers, "when the joys of life were often, but sick or disabled, will no longer be respected."


Example 4: Chinese to English

H: The Pope told thousands of believers making the pilgrimage to St . Peter’s Basilica , " Life is often glorified during times of happiness , but no longer respected during times of sickness and trouble or when it is impaired . "

Machine: The pope told thousands ... of followers... who came to St. Peter’s church followers, "when the-re is joys of life were..was.. often ..glorified.., but ...when... sick or disabled, will..it is.. no longer be respected."

Slightly better but still hard to grasp!


Translation of Text Genres

There is a great deal of content to be translated in the world, and not all of it is written in high-quality, creative, or sophisticated language. Text or speech genres can be characterized by:

  • Purpose: e.g., informative, persuasive, instructive
  • Type: e.g. narrative, argumentative, descriptive, expository
  • Register: e.g. formal, casual, intimate, ...
  • Style: e.g. dialogic, descriptive, grammatical choices, sentence length, ...

Different genres present different translation difficulties, e.g.:

  • Novels, speeches, critical reviews: style, rhetorical figures, idioms, ...
  • News stories, technical documentation: names, terminology, ...

Remark: So far, MT handles best those genres that use simple linguistic structures and words in their literal, if technical, meaning.


Translation of Text Genres

Example from The New York Times, Critic’s Notebook, 2011: A string of tedious shows can turn the intrepid theater goer into a couch potato.

  • Genre: critical review
  • Purpose: persuasive
  • Type: argumentative
  • Register: formal
  • Style: use of humor, idioms, rhetorical figures (hyperbole)

[Exercise 2. Try to translate this sentence with an online translation system and comment on the results.]



MT output by Google Translate: Una serie di spettacoli noiosi può trasformare il frequentatore di teatro intrepido in un teledipendente.

The MT output correctly reflects the structure and meaning, but part of the writing style and its effect is lost. Rendering the original effect likely requires the linguistic and world knowledge of a native speaker.


Translation of Text Genres

Example from IBM’s online technical documentation, 2013: Similarly, each message displayed in the interface includes a link to the help for that message.

  • Genre: technical documentation
  • Purpose: informative
  • Type: expository
  • Register: formal
  • Style: literal, though technical, use of words; neutral text; low ambiguity (polysemy, coreference, ...)

[Exercise 4. Try to translate this sentence with an online translation system and comment on the results.]


MT output by Google Translate: Allo stesso modo, ogni messaggio visualizzato nell’interfaccia include un collegamento alla Guida per quel messaggio.

The MT output correctly reflects the grammatical structure, meaning, and style.
