
Machine Translation Overview

Marcello Federico FBK-irst Trento, Italy 2013

  • M. Federico, FBK-irst

SMT - Part 1 2013 1

Outline

  • Introduction
  • Motivation
  • Approaches
  • Brief history
  • Evaluation
  • State-of-the-art
  • Examples

References:

  • P. Koehn, Statistical Machine Translation, Cambridge University Press, 2009.
  • A. Lopez, Statistical Machine Translation, ACM Computing Surveys, vol. 40, number 3, 2008.
  • D. Jurafsky and J. H. Martin, Speech and Language Processing, Prentice Hall, 2009.
  • C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.


Machine Translation

Wikipedia: Machine translation, often referred to by the acronym MT, is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another.

Personal definition: MT generally investigates the automatic translation of "standard" language that can be systematically observed in ordinary communication, e.g. conversations, news, speeches, business letters, and user manuals. MT is generally not concerned with literary genres, nor with creative and sophisticated uses of language. For several reasons, this kind of language is simply outside the scope of MT.

¹ For a very interesting introduction to issues related to the translation of literary works, see Umberto Eco, "Experiences in Translation", University of Toronto Press, 2001.

Introduction to MT

Why is Machine Translation so Important?¹

  • Information society and production of multilingual content
    – 7 billion people, 6,000 languages, 250,000 translators
  • Globalization and demand for translation services
    – 1,000 global companies operating in ≥ 160 countries
  • Size of the worldwide translation market
    – 12.5 billion $ per year ≈ 34 million $ per day
  • Size of the translation industry
    – 3,150 translation companies (3.1 billion $)
    – 200,000 freelance translators (9.4 billion $)
  • MT can improve the productivity of human translators
    – integration of MT with human translation (post-editing)
  • MT can supply cheap gist translation
    – competitive quality-cost-speed trade-off

¹ Source: Common Sense Advisory, 2010

Introduction to MT

Do we need more research in MT?

[Figure: Chinglish examples, some of which result from MT errors.]

Introduction to MT

Why is Machine Translation so Difficult?

High-quality human translation implies:

  • deep and rich understanding of the source language and text
  • sophisticated and creative command of the target language

Nowadays, feasible goals for machine translation are tasks in which:

  • an approximate translation is still useful (gist translation)
  • human translators can post-edit MT output (computer-assisted translation)
  • the linguistic domain is very focused and limited (smartphone apps)

In general, the difficulty of translating depends on how similar the source and target languages are in their vocabulary, grammar, and conceptual structure.

Applications of MT

  • Gist translation for social media.
  • Speech translation app.
  • Integration of MT into computer-assisted translation.

Differences and Similarities of Languages

  • Universal communicative role of language
    – names for people, words for talking about women, men, children
    – every language seems to have nouns and verbs
  • Differences/similarities across large classes of languages:
    – Morphology: one vs. many morphemes per word, agglutination vs. fusion
    – Syntax: Subj-Verb-Obj structure (English) vs. SOV (Japanese) vs. VSO (Irish)
    – Semantics: mapping of semantic roles and meaning of words, e.g. direction/manner of motion indicated by verb/satellite in
      the bottle floated out (E) → la botella salió flotando (S)
  • Lexical divergence between languages:
    – Semantic: there is no corresponding word with the same meaning
      wall (E) → Wand/Mauer (G, inside/outside)
    – Syntactic: a word is better translated into another part of speech
      she likes to sing (E, verb) → sie singt gerne (G, adverb)
  • Cultural differences: a philosophical argument, is translation possible at all?

Lexical Divergence

    English brother   →  Japanese otooto (younger), oniisan (older)
    English is        →  Japanese iru (animate subject), aru (inanimate subject)
    English know      →  French connaître (be acquainted with), savoir (know a proposition)
    English they      →  French ils (masculine), elles (feminine)
    German Berg       →  English hill, mountain

  • some languages make distinctions that other languages don't
  • it is difficult to translate from less specific into more specific information
  • language differences enforce different conceptual structures
  • debate: do people who speak different languages think differently?²

² Watch the talk by Lera Boroditsky (Stanford University), "How Language Shapes Thought", fora.tv.

Approaches to MT

Rough classification according to the linguistic representations employed:

  • Direct model: translate and re-order single words or n-grams
    – basically, no linguistic representation is used
  • Transfer model: use explicit knowledge about language differences
    – analyze the lexical and syntactic structure of the source sentence
    – transfer structures from source to target language
    – generate the corresponding sentence in the target language
  • Interlingua model: extract the meaning and express it in the target language
    – analyze the lexical, syntactic and semantic structure of the source sentence
    – interpret the meaning into a canonical interlingua
    – generate the target sentence from the interlingua

Notice: the knowledge required by the interlingua approach grows linearly with the number of languages, rather than quadratically.
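The scaling remark can be made concrete with a small sketch. It assumes (this is not spelled out on the slide) that a transfer system needs one module per ordered language pair, while an interlingua system needs one analyzer and one generator per language. The function names are illustrative only.

```python
def transfer_modules(n_languages: int) -> int:
    """Assumed cost model: one transfer module per ordered (source, target) pair."""
    return n_languages * (n_languages - 1)


def interlingua_modules(n_languages: int) -> int:
    """Assumed cost model: one analyzer into and one generator out of the interlingua."""
    return 2 * n_languages


# Print the two counts side by side for a few language-set sizes.
for n in (2, 5, 10, 20):
    print(n, transfer_modules(n), interlingua_modules(n))
```

Under these assumptions the two approaches cost about the same for three languages, and the gap widens quickly after that.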


Vauquois’s Triangle

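[Figure: Vauquois's triangle. The analysis side rises from the source string through syntax and semantics to the interlingua at the apex; the generation side descends through semantics and syntax to the target string. Direct translation connects the two strings at the base, and transfer connects the intermediate levels.]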

Approaches to MT

How are knowledge and linguistic information acquired by the system?

  • Hand-crafted: knowledge for analysis, transfer, generation, meaning representation, or direct translation is manually developed
    – most commercial MT systems fall into this category
    – requires lots of human labor and expertise
    – includes: rule-based MT
  • Machine-learned: representations are implemented by mathematical models learnable from data, e.g. parallel corpora of human translations
    – much less human effort is needed
    – requires huge amounts of data: the more, the better!
    – includes: statistical MT and example-based MT

Transfer-Based MT

Context-free grammar

    NP  → DT NPB
    NPB → JJ NN
    NPB → NN
    ...
    DT  → the
    JJ  → north
    NN  → wind
    ...

Synchronous context-free grammar

    NP  → DT1 NPB2 / DT1 NPB2
    NPB → JJ1 NN2 / NN2 JJ1
    NPB → NN / NN
    ...
    DT  → the / il
    JJ  → north / settentrionale
    NN  → wind / vento
    ...

[Figure: paired parse trees for "the north wind" and "il vento settentrionale".]


¹ This is a toy example. Working approaches use a very large set of probabilistic and lexicalized rules.
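The toy synchronous grammar can be exercised with a few lines of code. The sketch below assumes the source sentence has already been parsed; the tree encoding, the `RULES` and `LEXICON` tables, and the `transfer` function are illustrative names, not part of any toolkit.

```python
# Each synchronous rule maps a source right-hand side to the order in which
# its children appear on the target side (indices into the source children).
RULES = {
    ("NP", ("DT", "NPB")): (0, 1),   # NP  -> DT1 NPB2 / DT1 NPB2
    ("NPB", ("JJ", "NN")): (1, 0),   # NPB -> JJ1 NN2  / NN2 JJ1
    ("NPB", ("NN",)): (0,),          # NPB -> NN / NN
}
LEXICON = {
    ("DT", "the"): "il",
    ("JJ", "north"): "settentrionale",
    ("NN", "wind"): "vento",
}


def transfer(tree):
    """Return the target words for a source parse tree (label, children_or_word)."""
    label, children = tree
    if isinstance(children, str):            # pre-terminal: translate the word
        return [LEXICON[(label, children)]]
    order = RULES[(label, tuple(child[0] for child in children))]
    words = []
    for index in order:                      # emit children in target order
        words.extend(transfer(children[index]))
    return words


source_tree = ("NP", [("DT", "the"), ("NPB", [("JJ", "north"), ("NN", "wind")])])
translation = " ".join(transfer(source_tree))
print(translation)
```

The reordering of adjective and noun comes entirely from the second rule; the lexical rules only substitute words.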


Interlingua-Based MT

  • Applied to linguistic domains with a limited number of relations and concepts
    – tourist information, hotel booking, flight reservation, ...
  • Semantics of a sentence can be expressed with predicate-argument structure
    – I need a twin bed room reservation for tomorrow
    – book-room(date=tomorrow,type=twin)
  • The interlingua (IL) has to be designed carefully (by hand)
    – for some applications, a formalism similar to SQL
  • Processing steps in interlingua-based MT (IBMT):
    – extract content from the source sentence
    – map content into the SQL-like IL format
    – generate the translation from the IL format

Interlingua-Based MT

  • S³: I'm arriving on june sixth
  • I: give-information+temporal+arrival (who=I, time=(june, md6))
  • T: my arrival time is sixth of june
  • S: no that's not necessary
  • I: negate
  • T: no
  • S: and i was wondering what you have in the way of rooms available during that time
  • I: request-information+availability+room (room-type=question)
  • T: what kind of rooms are available?

³ S: speech (English), I: interlingua, T: translation (English)
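The generation step of these examples can be sketched as template filling. This is only an illustration under strong assumptions: the analysis step that produces the interlingua frame is taken as given, and the `TEMPLATES` table and `generate` function are hypothetical.

```python
# Hypothetical generation templates, one per interlingua speech act.
TEMPLATES = {
    "give-information+temporal+arrival": "my arrival time is {time}",
    "negate": "no",
    "request-information+availability+room": "what kind of rooms are available?",
}


def generate(speech_act: str, **arguments: str) -> str:
    """Render a target sentence from an interlingua speech act and its arguments."""
    return TEMPLATES[speech_act].format(**arguments)


print(generate("give-information+temporal+arrival", time="sixth of june"))
print(generate("negate"))
```

Because the interlingua discards surface wording, the long third utterance and a short question with the same intent would produce the same output.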


Example-Based MT

  • Assumption: people translate by analogy
    – Decompose a sentence into phrases
    – Translate phrases by analogy to previous translations
    – Properly compose translation fragments into one long sentence
  • Given a parallel corpus of translation examples

    Italian                                    German
    sono possibili deboli nevicate             leichte Schneefälle sind möglich
    sono possibili alcuni rovesci              ein paar Regenschauer sind möglich
    le deboli precipitazioni cesseranno        die leichten Niederschläge klingen ab
    si verificheranno deboli precipitazioni    leichte Niederschläge werden einsetzen

  • Learn translation patterns

    Italian                    German
    sono possibili X        →  X sind möglich
    deboli precipitazioni   →  leichte Niederschläge


Example-Based MT

  • Given learned translation patterns

    Italian                    German
    sono possibili X        →  X sind möglich
    deboli precipitazioni   →  leichte Niederschläge

  • Translate a (possibly new) source sentence

    Italian                                  German
    sono possibili deboli precipitazioni  →  leichte Niederschläge sind möglich
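The composition step can be sketched in a few lines. The code assumes the two learned patterns are available as string pairs and that the gap X only appears at the end of a source pattern; `PATTERNS` and `translate` are illustrative names, and real example-based systems use far richer matching.

```python
# Learned patterns from the slide: (source pattern, target pattern).
PATTERNS = [
    ("sono possibili X", "X sind möglich"),
    ("deboli precipitazioni", "leichte Niederschläge"),
]


def translate(sentence):
    """Translate by exact pattern match, or by filling a trailing gap X recursively."""
    for source, target in PATTERNS:
        if "X" not in source.split() and sentence == source:
            return target
    for source, target in PATTERNS:
        if source.endswith(" X"):
            prefix = source[:-1]                      # keeps the trailing space
            if sentence.startswith(prefix):
                filler = translate(sentence[len(prefix):])
                if filler is not None:
                    return " ".join(filler if token == "X" else token
                                    for token in target.split())
    return None                                       # no pattern applies


print(translate("sono possibili deboli precipitazioni"))
```

The input sentence never occurs in the example corpus; it is covered only by combining the two patterns.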

Statistical Machine Translation

  • parallel texts and word alignments

    [Figure: word-aligned Italian-English sentence pairs, including "dalla serata di domani soffierà un freddo vento orientale" / "since tomorrow evening an eastern chilly wind will blow".]

  • word translation probabilities

    translations of freddo   counts   probs
    cold                     43       0.43
    cool                     28       0.28
    chill                    15       0.15
    chilly                   10       0.10
    ...                      ...      ...

    translations of vento    counts   probs
    wind                     59       0.59
    breeze                   26       0.26
    ...                      ...      ...

  • word concatenation probabilities

    bigrams with eastern     counts   probs
    eastern wind             12       0.12
    eastern chilly           10       0.10
    eastern breeze           7        0.07
    eastern cool             5        0.05
    eastern ...              ...      ...


Statistical Machine Translation

  • given the word translation and word concatenation probabilities
  • generate possible translations of the source sentence and score them

    source: un freddo vento da est

    score   candidate translation
    0.10    an eastern chilly wind
    0.09    a eastern cool wind
    ...     ...
    0.05    an eastern chilly breeze
    0.12    a cold eastern wind
    0.08    a cool eastern breeze

  • return the best scoring translation
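The generate-and-score idea can be sketched with the probabilities from the tables. This is a simplified illustration, not the model behind the slide's scores: it multiplies the two word translation probabilities by a single bigram probability, ignores articles, and uses an assumed floor value (`UNSEEN`) for bigrams the tables do not list.

```python
from itertools import product

# Probabilities taken from the word translation and bigram tables.
TRANSLATION = {
    "freddo": {"cold": 0.43, "cool": 0.28, "chill": 0.15, "chilly": 0.10},
    "vento": {"wind": 0.59, "breeze": 0.26},
}
BIGRAM = {
    ("eastern", "wind"): 0.12,
    ("eastern", "chilly"): 0.10,
    ("eastern", "breeze"): 0.07,
    ("eastern", "cool"): 0.05,
}
UNSEEN = 0.01  # assumption: floor probability for bigrams not in the table


def score(adjective, noun, eastern_first):
    """Translation probabilities times the bigram starting with 'eastern'."""
    translation_model = TRANSLATION["freddo"][adjective] * TRANSLATION["vento"][noun]
    follower = adjective if eastern_first else noun
    return translation_model * BIGRAM.get(("eastern", follower), UNSEEN)


def candidates():
    """Enumerate word choices and the two word orders considered here."""
    options = product(TRANSLATION["freddo"], TRANSLATION["vento"], (True, False))
    for adjective, noun, eastern_first in options:
        words = (["eastern", adjective, noun] if eastern_first
                 else [adjective, "eastern", noun])
        yield " ".join(words), score(adjective, noun, eastern_first)


best_translation, best_score = max(candidates(), key=lambda pair: pair[1])
print(best_translation, round(best_score, 4))
```

Even this crude scorer prefers the word order and lexical choices with the highest table probabilities; its numeric scores are not comparable to those on the slide.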

Phrase-based SMT

PBSMT is a generalization of word-based SMT:

  • given an automatic alignment of words in parallel texts:

    [Figure: word alignment between "dalla serata di domani soffierà un freddo vento orientale" and "since tomorrow evening an eastern chilly wind will blow".]

  • aligned word blocks, or phrases, are detected
  • many phrase pairs are collected and stored with their probabilities

    dalla serata di domani       -  since tomorrow evening
    un freddo vento orientale    -  an eastern chilly wind
    soffierà                     -  will blow
    serata di domani             -  tomorrow evening
    dalla                        -  since
    di domani                    -  tomorrow
    ...                          -  ...


Phrase-based SMT

Then, we use a search algorithm to find the optimal translation with phrase pairs.

How PBSMT works:

  • translation: segment input, translate and re-arrange phrases
  • steps: select a source segment, translate and attach to target
  • scores: linear combination of feature functions
  • features: phrase pairs, target n-grams, relative phrase movement, ...
  • decoder: efficient algorithm to compute (sub-)optimal solutions
  • features and combination weights are machine learnable from parallel data
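A heavily simplified decoder can illustrate the segment-and-translate search. The sketch below is monotone (no phrase re-ordering), uses the phrase log-probability as its only feature, and relies on a phrase table whose probabilities are invented for illustration; `PHRASES` and `decode_monotone` are hypothetical names.

```python
import math

# Hypothetical phrase table: the probabilities are invented, not from the slides.
PHRASES = {
    ("dalla",): [("since", 0.6)],
    ("serata",): [("evening", 0.8)],
    ("di", "domani"): [("tomorrow", 0.5)],
    ("serata", "di", "domani"): [("tomorrow evening", 0.7)],
    ("dalla", "serata", "di", "domani"): [("since tomorrow evening", 0.4)],
}


def decode_monotone(words, max_phrase_len=4):
    """Dynamic programming over source prefixes; returns the best translation."""
    # best[i] holds (log-score, target phrases) for the best cover of words[:i].
    best = [(0.0, [])] + [(-math.inf, None)] * len(words)
    for start in range(len(words)):
        if best[start][1] is None:           # prefix not reachable
            continue
        last = min(len(words), start + max_phrase_len)
        for end in range(start + 1, last + 1):
            for target, prob in PHRASES.get(tuple(words[start:end]), []):
                candidate = best[start][0] + math.log(prob)
                if candidate > best[end][0]:
                    best[end] = (candidate, best[start][1] + [target])
    phrases = best[-1][1]
    return None if phrases is None else " ".join(phrases)


print(decode_monotone("dalla serata di domani".split()))
```

With these made-up numbers, the segmentation "dalla" + "serata di domani" (0.6 × 0.7) beats both the single long phrase (0.4) and the three-phrase split (0.6 × 0.8 × 0.5). A full decoder would add the language model and reordering features to the score.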

Hierarchical Phrase-Based SMT

  • First tree-to-tree approach to perform better than phrase-based SMT on large-scale evaluations involving very distant languages
  • Discontinuous phrases, i.e. phrases with gaps
  • Long-range reordering rules
  • Formalized as synchronous context-free grammars (transfer approach?)
  • Not based on syntactic rules: just two non-terminal symbols!
  • The model is fully machine learnable!

[Figure: Chinese-English example showing original, transliteration, glosses, and translation.]

HPBM: Motivations

[Figure: example of a typical phrase-based Chinese-English translation.]

Let us model some syntactic differences with simple phrase rules:

  • Chinese VPs follow PPs / English VPs precede PPs
    yu X1 you X2 / have X2 with X1
  • Chinese NPs follow RCs / English NPs precede RCs
    X1 de X2 / the X2 that X1
  • translation of the zhiyi construct into English word order
    X1 zhiyi / one of X1

Our rules use one non-terminal X and indices to mark multiple occurrences. These rules can be inferred automatically from word-aligned parallel data.

HPBM: Example Rules

    S → X1 / X1                               (1)
    S → S1 X2 / S1 X2                         (2)
    X → yu X1 you X2 / have X2 with X1        (3)
    X → X1 de X2 / the X2 that X1             (4)
    X → X1 zhiyi / one of X1                  (5)
    X → Aozhou / Australia                    (6)
    X → Beihan / N. Korea                     (7)
    X → shi / is                              (8)
    X → bangjiao / dipl.rels.                 (9)
    X → shaoshu guojia / few countries        (10)
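To see how these rules build a source sentence and its translation at the same time, here is a small sketch that expands a derivation tree on both sides. The derivation itself is supplied by hand (a real decoder would search for it), the copula in rule (8) is spelled shi, and `RULES`, `expand`, and `derive` are illustrative names.

```python
# Rule number -> (left-hand side, source side, target side).
RULES = {
    1: ("S", "X1", "X1"),
    2: ("S", "S1 X2", "S1 X2"),
    3: ("X", "yu X1 you X2", "have X2 with X1"),
    4: ("X", "X1 de X2", "the X2 that X1"),
    5: ("X", "X1 zhiyi", "one of X1"),
    6: ("X", "Aozhou", "Australia"),
    7: ("X", "Beihan", "N. Korea"),
    8: ("X", "shi", "is"),
    9: ("X", "bangjiao", "dipl.rels."),
    10: ("X", "shaoshu guojia", "few countries"),
}


def expand(template, parts):
    """Replace indexed non-terminals (X1, S1, X2, ...) with the given strings."""
    output = []
    for token in template.split():
        if len(token) == 2 and token[0] in "SX" and token[1].isdigit():
            output.append(parts[int(token[1]) - 1])
        else:
            output.append(token)
    return " ".join(output)


def derive(node):
    """node = (rule number, child derivations); returns (source, target)."""
    rule_id, children = node
    _, source_side, target_side = RULES[rule_id]
    results = [derive(child) for child in children]
    return (expand(source_side, [source for source, _ in results]),
            expand(target_side, [target for _, target in results]))


derivation = (2, [
    (2, [(1, [(6, [])]), (8, [])]),
    (5, [(4, [(3, [(7, []), (9, [])]), (10, [])])]),
])
source, target = derive(derivation)
print(source)
print(target)
```

Rules (3), (4), and (5) each swap the order of their gaps, which is how the long-range reordering between the two languages is produced without any syntactic labels.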


A Brief History of Machine Translation

    before 1900   various suggestions about "mechanical" translation
    1933          French patent by Georges Artsrouni: storage device on paper tape to find translations of words
                  Russian patent by Petr Petrovich Troyanskii: lexical-syntactic transfer (base forms + syntactic functions)
    1949          memorandum by Warren Weaver (and Andrew D. Booth): cryptography methods, statistical methods, Shannon's theory
    1951          first research position on MT at MIT
    1954          rule-based MT project by Georgetown U. + IBM: public demo of Russian to English (vocabulary: 250 words, grammar: 6 rules)
    1955          U. Leningrad: interlingua as artificial language³

³ A rich source of historical information about MT is John Hutchins' website, http://www.hutchinsweb.me.uk.

A Brief History of Machine Translation

    1956-1966     large-scale funding in the US: high expectations and disillusion
    1957          Peter Toma starts building Systran
    1958          U. Washington, IBM: word-for-word approach, Russian-English system for the US Air Force (up to 1970)
    1960          RAND Corp.: rough translation with statistical approach
    1961          Georgetown U. (+ P. Toma): Russian to English demo, rule-based (more levels of analysis)
    around 1960   MIT and U. Texas work on syntactic transfer approach
    1966          ALPAC report: US funding drastically reduced for 10 years
    1970-1981     U. Montreal, TAUM project: rule-based, logic programming; success with weather forecasts, failure with aviation manuals
    1960-1971     U. Texas and U. Grenoble work on interlingua approach, logic
    1975          interlingua loses interest

A Brief History of Machine Translation

    1980 -   Rule-based transfer and new interlingua approaches
             – based on linguistic theories, logic programming, AI
    1990 -   Rule-based MT dominance is broken
             – statistical alignment models for French-English (IBM)
             – example-based translation (Sato and Nagao, Japan)
    1990 -   Speech translation projects: limited domains
             – ATR, Kyoto: automatic telephony research
             – C-STAR consortium (US, Europe, Asia)
             – Verbmobil project (Germany)
    2000 -   Unrestricted language translation
             – automatic evaluation metrics for MT (IBM)
             – TIDES/GALE (US): written/spoken news, Chi/Ara to Eng
             – TC-STAR (EU): news Chi to Eng, speeches Spa-Eng
    2005 -   Open source for MT
             – toolkits: Moses, Hiero, SRILM, IRSTLM, ...
             – resources: Europarl, UN, French-English 10^9 corpus

Experimental Development Cycle

Experimental research in HLT generally follows this development cycle:

[Figure: development cycle diagram with the labels Insight, Model, Implement, Evaluate, Benchmark, Results, Analyse, Decide, Discard, Improve, and Exploit.]

Evaluation bottleneck: MT developers need to monitor the effect of changes to their systems in order to weed out bad ideas from good ones!

Evaluating MT Performance

How do we evaluate the output of an MT system?

  • Human MT evaluation
    – criteria: adequacy, fluency, ranking, post-edit effort
    – pros: very accurate, high quality
    – cons: expensive, slow, subjective, difficult to reproduce
  • Automatic MT evaluation
    – criteria: similarity with respect to one or more human translations
    – pros: cheap, quick, correlates with human judgments, good sensitivity
    – cons: correlation with human scores is not high, not very informative

MT systems can be tuned to optimize automatic metrics!

Automatic Evaluation of MT

Compare MT output against one or more human translations (references):

  • Word alignment methods
    – WER: ratio of the smallest edit distance to the reference length
    – TER: extends edit distance to account for word order errors
  • N-gram matching methods
    – BLEU: precision of matching n-grams with a length penalty factor
    – NIST: variation of BLEU that weights matching n-grams by their informativeness
    – GTM: F-score of matching n-grams that rewards longer matches
    – METEOR: weighted F-score of 1-gram matches with synonymy matching
    – ...

WER: example

Edit distance between Hypothesis and Reference (words involved in an edit are marked with *):

    H: it is a guide to action *which ensures that the military *always *obeys *the commands *of *the *party
    R: it is a guide to action *that ensures that the military *will *forever *heed party commands

  • Sums up to 4 substitutions + 1 deletion + 3 insertions = 8

Hence, $\mathrm{WER} = \frac{8}{16} = 0.50$, where 16 is the number of words in the reference.
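The WER figure can be checked with a standard word-level Levenshtein distance. The sketch below assumes whitespace tokenization and normalization by the reference length; `edit_distance` and `wer` are illustrative names.

```python
def edit_distance(hyp, ref):
    """Minimum number of substitutions, insertions and deletions between token lists."""
    previous = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        current = [i]
        for j, ref_word in enumerate(ref, start=1):
            current.append(min(
                previous[j] + 1,                           # drop hyp_word
                current[j - 1] + 1,                        # add ref_word
                previous[j - 1] + (hyp_word != ref_word),  # match or substitute
            ))
        previous = current
    return previous[-1]


def wer(hyp, ref):
    """Word error rate: edit distance divided by the reference length."""
    return edit_distance(hyp, ref) / len(ref)


hypothesis = ("it is a guide to action which ensures that the military "
              "always obeys the commands of the party").split()
reference = ("it is a guide to action that ensures that the military "
             "will forever heed party commands").split()
print(edit_distance(hypothesis, reference), wer(hypothesis, reference))
```

The dynamic program only returns the total number of edits; several different mixes of substitutions, insertions, and deletions can reach the same minimum.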


TER: example

  • H: this week the Saudis denied information published in the New York Times
  • R: Saudi Arabia denied this week information published in the American New York Times
  • "this week" is shifted
  • "Saudi Arabia" in the REF appears as "the Saudis" in the HYP
  • "American" appears only in the REF

In this case, the number of edits is 4 (1 shift, 2 substitutions, and 1 deletion):

$\mathrm{TER} = \frac{4}{13} \times 100 = 30.8\%$

where 13 is the number of words in the reference. Computing TER exactly is intractable, so in practice we rely on approximate calculations.

BLEU: example

    H:  it is a guide to action which ensures that the military always obeys the commands of the party
    R1: it is a guide to action that ensures that the military will forever heed party commands
    R2: it is the guiding principle which guarantees the military forces always being under the command of the party
    R3: it is the practical guide for the army always to heed the directions of the party

    2-gram modified precision: 10/17
    3-gram modified precision: 7/16

BLEU: modified n-gram precision

n-gram precision = % of n-grams in the HYP that also occur in the REF

    – matches of shorter n-grams (n=1,2) try to capture adequacy
    – matches of longer n-grams (n=3,4,...) try to capture fluency

Why "modified"? Consider the example:

    H:  the the the the the the the
    R1: the cat is on the mat
    R2: the cat lays on the mat

$p_1^{\text{standard}} = \frac{7}{7}$

Intuition: a word or n-gram of the REF can be matched only once.

Solution: clip the total count of each candidate word (n-gram) by its maximum count in any single reference.

$p_1^{\text{modified}} = \frac{2}{7}$
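The clipping rule is easy to check in code. The sketch below assumes whitespace tokenization and returns the clipped match count together with the number of hypothesis n-grams; `ngrams` and `modified_precision` are illustrative names.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(hyp, refs, n):
    """Return (clipped matches, total hypothesis n-grams)."""
    hyp_counts = Counter(ngrams(hyp, n))
    max_ref_counts = Counter()
    for ref in refs:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    # Each hypothesis n-gram is credited at most as often as it occurs
    # in the single reference where it is most frequent.
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return clipped, sum(hyp_counts.values())


hyp = "the the the the the the the".split()
refs = ["the cat is on the mat".split(), "the cat lays on the mat".split()]
print(modified_precision(hyp, refs, 1))
```

Applied to the three-reference example on the previous slide, the same function gives the 2-gram and 3-gram figures quoted there.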


BLEU: computation

  • 1. Compute the overall test-set modified n-gram precision $p_n$, n=1,2,3,4
  • 2. Take the geometric mean of the different precisions:

    $P = \sqrt[4]{p_1 \cdot p_2 \cdot p_3 \cdot p_4}$

    Notice: P tends to reward short candidates.
  • 3. To compensate for short outputs, the brevity penalty is computed:

    $BP = \exp(\Delta)$ if $\Delta < 0$, otherwise $BP = 1$

    $\Delta$ is the relative difference between the overall length of the Hs and Rs⁴
  • 4. Finally, compute: $\mathrm{BLEU} = BP \cdot P$

Hence, for outputs systematically shorter than the references, BLEU < P. BLEU is computed at the test-set level, not at the sentence level.

⁴ For each H consider the corresponding R that is closest in length.
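The four steps can be put together in a short sketch. It takes the test-set precisions and total lengths as inputs and assumes the standard BLEU form of the relative length difference, Δ = 1 − r/c, with c the total hypothesis length and r the total reference length; `brevity_penalty` and `bleu` are illustrative names.

```python
import math


def brevity_penalty(hyp_len, ref_len):
    """exp(delta) for hypotheses shorter than the references, otherwise 1."""
    delta = 1.0 - ref_len / hyp_len      # assumed definition of the relative difference
    return math.exp(delta) if delta < 0 else 1.0


def bleu(precisions, hyp_len, ref_len):
    """Geometric mean of the modified n-gram precisions times the brevity penalty."""
    if min(precisions) == 0:
        return 0.0                       # the geometric mean is zero
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty(hyp_len, ref_len) * math.exp(log_mean)


# Same precisions, with and without a length shortfall.
print(bleu([0.5, 0.5, 0.5, 0.5], hyp_len=10, ref_len=10))
print(bleu([0.5, 0.5, 0.5, 0.5], hyp_len=9, ref_len=10))
```

The second call shows the penalty at work: identical precisions yield a lower score once the output is shorter than the references.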


The State of the Art

  • SMT is now a very competitive technology

– in many evaluations SMT outperformed rule-based MT – several commercial SMT systems: Google, Microsoft, IBM, LW, ...

  • Interest in SMT revamped around seminal work at IBM in early 90’

– indeed the core ideas go back to Warren Weaver in 1949

  • Best performing SMT systems use either:

– brute force direct translation exploiting huge amounts of data – combination of phrase-based and tree-based models – integration of morphology and syntax

  • Automatic evaluation has boosted research in SMT:

– model training directly optimizes the evaluation metric

  • Evaluation campaigns are organized every year:

– NIST: news texts - Chi/Ara to Eng (2002-) – IWSLT: traveling/lectures speech - Asian-EU languages (2004-) – WMT: news texts - many EU languages (2005-)

Example 1: Arabic to English

H: Dubai 2 - 7 ( AFP ) - The Secretary-General of the United Nations Kofi Annan said he would donate the international Zayed Prize for the Environment , which he received on Monday night in Dubai worth 500000 dollars , to setup a foundation for agriculture and educating girls in Africa .

Machine: Dubai 2-7 (AFP) - United Nations Secretary-General Kofi Annan said that the award will Zayed International Environment, which received Monday evening in Dubai worth 500,000 dollars to establish an institution for agriculture and education of girls in the African continent.

Example 1: Arabic to English

H: Dubai 2 - 7 ( AFP ) - The Secretary-General of the United Nations Kofi Annan said he would donate the international Zayed Prize for the Environment , which he received on Monday night in Dubai worth 500000 dollars , to setup a foundation for agriculture and educating girls in Africa .

Machine: Dubai 2-7 (AFP) - United Nations Secretary-General Kofi Annan said that the award will Zayed International Environment, which ... he ... received ... on ... Monday evening in Dubai worth 500,000 dollars ... , will be donated ... to establish an institution for agriculture and education of girls in the African continent.

Example 2: Arabic to English

H: New York ( The United Nations ) 2 - 8 ( AFP ) - United Nations Secretary General Kofi Annan expressed his concern today , Tuesday , about the wave of targeted liquidations being carried out by Israel in Gaza and the West Bank , and he also condemned the rocket attacks targeting the Hebrew State , according to his spokesman .

Machine: New York (United Nations) 2-8 (AFP) - United Nations Secretary General Kofi Annan expressed concern today, Tuesday, the wave of qualifiers quality by Israel in Gaza and the West Bank, also condemned the missile attacks against the Jewish state, his spokesman said.

Example 2: Arabic to English

H: New York ( The United Nations ) 2 - 8 ( AFP ) - United Nations Secretary General Kofi Annan expressed his concern today , Tuesday , about the wave of targeted liquidations being carried out by Israel in Gaza and the West Bank , and he also condemned the rocket attacks targeting the Hebrew State , according to his spokesman .

Machine: New York (United Nations) 2-8 (AFP) - United Nations Secretary General Kofi Annan expressed concern today, Tuesday, ... about ... the wave of qualifiers quality targeted liquidations by Israel in Gaza and the West Bank, ... and he ... also condemned the missile attacks against the Jewish state, his spokesman said.

Example 3: Chinese to English

H: Today was the Catholic Church's annual " Life Day " . Pope Benedict XVI delivered a speech in St . Peter's Basilica , in which he criticized that the hedonism of wealthy society impairs the Christian value system of respect for life , and he strongly condemned abortion and euthanasia .

Machine: Today is the "life" of the Catholic Church once a year, when 16 of the pope delivered a speech in St. Peter's cathedral, criticized the joy of an affluent society, undermine the values of the Christian faith to respect life, and strongly condemned euthanasia and abortion.

Example 3: Chinese to English

Human (?): Today was the Catholic Church's annual " Life Day " . Pope Benedict XVI delivered a speech in St . Peter's Basilica , in which he criticized that the hedonism of ...our... wealthy society ...which... impairs the Christian value system of respect for life , and he strongly condemned abortion and euthanasia .

Machine: Today is the "life ..day..." of the Catholic Church once a year, when 16 of the pope delivered a speech in St. Peter's cathedral, ...he... criticized the joy of an affluent society, ... that... undermines the values of the Christian faith to respect life, and strongly condemned euthanasia and abortion.

Example 4: Chinese to English

H: The Pope told thousands of believers making the pilgrimage to St . Peter's Basilica , " Life is often glorified during times of happiness , but no longer respected during times of sickness and trouble or when it is impaired . "

Machine: The pope told thousands who came to St. Peter's church followers, "when the joys of life were often, but sick or disabled, will no longer be respected."

Example 4: Chinese to English

H: The Pope told thousands of believers making the pilgrimage to St . Peter's Basilica , " Life is often glorified during times of happiness , but no longer respected during times of sickness and trouble or when it is impaired . "

Machine: The pope told thousands ... of followers... who came to St. Peter's church followers, "when the-re is joys of life were..was.. often ..glorified.., but ...when... sick or disabled, will..it is.. no longer be respected."
