

SLIDE 1

Machine Translation

12: (Non-neural) Statistical Machine Translation Rico Sennrich

University of Edinburgh

SLIDE 2

Today’s Lecture

So far, main focus of lecture was on:

neural machine translation research since ≈2013

Today, we look at (non-neural) Statistical Machine Translation, and research since ≈ 1990.

SLIDE 3

1. Statistical Machine Translation Basics
2. Phrase-based SMT
3. Hierarchical SMT
4. Syntax-based SMT

SLIDE 4

Refresher: A probabilistic model of translation

Suppose that we have:

• a source sentence S of length m: x1, ..., xm
• a target sentence T of length n: y1, ..., yn

We can express translation as a probabilistic model:

T* = argmax_T P(T|S) = argmax_T P(S|T)·P(T)    (Bayes' theorem)

We can model translation via two models:

• language model to estimate P(T)
• translation model to estimate P(S|T)

Without continuous space representations, how to estimate P(S|T)?

→ break it up into smaller units
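To make the decomposition concrete, here is a minimal sketch (not from the original slides) that ranks candidate translations by P(S|T)·P(T); both probability tables are invented toy values:

```python
# toy noisy-channel ranking: score candidate translations T of a source S
# by P(S|T) * P(T); both tables are invented for illustration
translation_model = {("la maison", "the house"): 0.8,
                     ("la maison", "the home"): 0.1}
language_model = {"the house": 0.02, "the home": 0.005}

def score(source, target):
    # P(S|T) * P(T), as in the Bayes decomposition above
    return (translation_model.get((source, target), 0.0)
            * language_model.get(target, 0.0))

candidates = ["the house", "the home"]
print(max(candidates, key=lambda t: score("la maison", t)))
# "the house": 0.8 * 0.02 = 0.016 beats 0.1 * 0.005 = 0.0005
```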

SLIDE 5

Word Alignment

chicken-and-egg problem

let's break up P(S|T) into small units (words):
• we can estimate an alignment given a translation model (expectation step)
• we can estimate a translation model given an alignment, using relative frequencies (maximization step)
• what can we do if we have neither?

solution: Expectation Maximization (EM) algorithm
• initialize the model
• iterate between estimating the alignment and the translation model
• the simplest model is based on lexical translation; more complex models consider position and fertility
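A minimal sketch of this EM loop for IBM Model 1 (lexical translation only; the NULL word, position, and fertility are omitted, and the toy corpus mirrors the example on the following slides):

```python
from collections import defaultdict

def ibm_model1(corpus, iterations=10):
    """EM for IBM Model 1: estimate t(e|f) from (French, English) sentence
    pairs. Teaching sketch: lexical translation only, no NULL word,
    no position or fertility modelling."""
    t = defaultdict(lambda: 1.0)  # initial step: all alignments equally likely
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(e, f)
        total = defaultdict(float)
        for fr, en in corpus:
            for e in en:
                # expectation step: distribute each English word's mass
                # over the French words it could align to
                z = sum(t[(e, f)] for f in fr)
                for f in fr:
                    count[(e, f)] += t[(e, f)] / z
                    total[f] += t[(e, f)] / z
        # maximization step: re-estimate t(e|f) by relative frequency
        t = defaultdict(float,
                        {(e, f): c / total[f] for (e, f), c in count.items()})
    return t

corpus = [("la maison".split(), "the house".split()),
          ("la maison bleu".split(), "the blue house".split()),
          ("la fleur".split(), "the flower".split())]
t = ibm_model1(corpus)
print(round(t[("the", "la")], 2))  # high: EM links "the" to "la"
```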

SLIDE 6

Word Alignment: IBM Models [Brown et al., 1993]

... la maison ... la maison bleu ... la fleur ...
... the house ... the blue house ... the flower ...

  • Initial step: all alignments equally likely
  • Model learns that, e.g., la is often aligned with the
SLIDE 7

Word Alignment: IBM Models [Brown et al., 1993]

... la maison ... la maison bleu ... la fleur ...
... the house ... the blue house ... the flower ...

  • After one iteration
  • Alignments, e.g., between la and the are more likely
SLIDE 8

Word Alignment: IBM Models [Brown et al., 1993]

... la maison ... la maison bleu ... la fleur ... ... the house ... the blue house ... the flower ...

  • After another iteration
  • It becomes apparent that alignments, e.g., between fleur and flower, are more likely (pigeonhole principle)

SLIDE 9

Word Alignment: IBM Models [Brown et al., 1993]

... la maison ... la maison bleu ... la fleur ... ... the house ... the blue house ... the flower ...

  • Convergence
  • Inherent hidden structure revealed by EM
SLIDE 10

Word Alignment: IBM Models [Brown et al., 1993]

  • Probabilities:

p(the|la) = 0.7        p(house|la) = 0.05
p(the|maison) = 0.1    p(house|maison) = 0.8

  • Alignments (all four ways of linking {the, house} to {la, maison}):

alignment                       p(e,a|f)    p(a|e,f)
the↔la, house↔maison            0.56        0.824
the↔la, house↔la                0.035       0.052
the↔maison, house↔maison        0.08        0.118
the↔maison, house↔la            0.005       0.007

  • Counts:

c(the|la) = 0.824 + 0.052        c(house|la) = 0.052 + 0.007
c(the|maison) = 0.118 + 0.007    c(house|maison) = 0.824 + 0.118
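These numbers can be checked by enumerating the four alignments directly; a small sketch (the lexical probability table is taken from this slide):

```python
from itertools import product

# toy check of the E-step numbers above: enumerate all alignments of
# "the house" to "la maison" under the given lexical probabilities
t = {("the", "la"): 0.7, ("house", "la"): 0.05,
     ("the", "maison"): 0.1, ("house", "maison"): 0.8}

fr = ["la", "maison"]
alignments = list(product(fr, repeat=2))  # each English word picks one French word
joint = [t[("the", a)] * t[("house", b)] for a, b in alignments]
z = sum(joint)                            # 0.68
posterior = [p / z for p in joint]        # 0.052, 0.824, 0.007, 0.118

# expected count c(the|la): posterior mass of alignments linking the-la
c_the_la = sum(p for (a, _), p in zip(alignments, posterior) if a == "la")
print(round(c_the_la, 3))  # 0.875 ≈ 0.824 + 0.052 (slide values are rounded)
```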

SLIDE 11

Linear Models

T* = argmax_T P(S|T)·P(T)    (Bayes' theorem)

T* ≈ argmax_T Σ_{m=1}^{M} λ_m·h_m(S, T)    [Och, 2003]

• linear combination of arbitrary features
• Minimum Error Rate Training (MERT) to optimize feature weights
• big trend in SMT research: engineering new/better features
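A sketch of what such a linear scoring function looks like; the feature names and weights below are illustrative stand-ins, not trained values (in practice, weights come from MERT):

```python
import math

def linear_score(features, weights):
    # weighted sum of feature functions h_m(S, T), as in the formula above
    return sum(weights[name] * value for name, value in features.items())

# hypothetical feature values (typically log-probabilities) for one hypothesis
features = {"tm_log_prob": math.log(0.02),
            "lm_log_prob": math.log(0.001),
            "phrase_penalty": -3.0,   # three phrases used
            "word_penalty": -6.0}     # six output words
weights = {"tm_log_prob": 1.0, "lm_log_prob": 0.5,
           "phrase_penalty": 0.2, "word_penalty": -0.1}
print(linear_score(features, weights))
```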

SLIDE 12

Word-based SMT

core idea

combine word-based translation model and n-gram language model to compute score of translation

consequences

+ models are easy to compute

− word translations are assumed to be independent of each other: only the LM takes context into account

− poor at modelling long-distance phenomena: n-gram context is limited

SLIDE 13

1. Statistical Machine Translation Basics
2. Phrase-based SMT
3. Hierarchical SMT
4. Syntax-based SMT

SLIDE 14

Phrase-based SMT

core idea

Basic translation unit in translation model is not word, but word sequence (phrase)

consequences

+ much better memorization of frequent phrase translations

− large (and noisy) phrase table
− large search space; requires sophisticated pruning
− still poor at modelling long-distance phenomena

[figure: phrase alignment of "leider ist Herr Steiger nach Köln gefahren" with "unfortunately, Mr Steiger has gone to Cologne"]

SLIDE 15

Phrase Extraction

• extraction rules based on word-aligned sentence pair
• phrase pair must be compatible with alignment...
• ...but unaligned words are ok
• phrases are contiguous sequences

[alignment matrix: "I shall be passing on to you some comments" ↔ "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen"]

extracted: shall be = werde

SLIDE 16

Phrase Extraction

• extraction rules based on word-aligned sentence pair
• phrase pair must be compatible with alignment...
• ...but unaligned words are ok
• phrases are contiguous sequences

[alignment matrix: "I shall be passing on to you some comments" ↔ "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen"]

extracted: some comments = die entsprechenden Anmerkungen

SLIDE 17

Phrase Extraction

• extraction rules based on word-aligned sentence pair
• phrase pair must be compatible with alignment...
• ...but unaligned words are ok
• phrases are contiguous sequences

[alignment matrix: "I shall be passing on to you some comments" ↔ "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen"]

extracted: werde Ihnen die entsprechenden Anmerkungen aushändigen = shall be passing on to you some comments
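The compatibility criterion used on the last three slides can be written down compactly; a minimal sketch of the consistency check, with alignment indices invented for the "shall be = werde" example (source index 1 = werde, target indices 1-2 = shall be):

```python
def consistent(alignment, f_start, f_end, e_start, e_end):
    """Check whether a candidate phrase pair is compatible with the word
    alignment: no link may cross the phrase boundary, and at least one
    link must lie inside it (unaligned words are ok)."""
    inside = False
    for f, e in alignment:
        f_in = f_start <= f <= f_end
        e_in = e_start <= e <= e_end
        if f_in != e_in:         # a link leaves the box: not extractable
            return False
        inside |= f_in and e_in  # require at least one link inside
    return inside

# toy alignment: source 0 <-> target 0, source 1 (werde) <-> targets 1, 2
alignment = [(0, 0), (1, 1), (1, 2)]
print(consistent(alignment, 1, 1, 1, 2))  # True:  werde = shall be
print(consistent(alignment, 1, 1, 1, 1))  # False: link (1, 2) crosses boundary
```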

SLIDE 18

Common Features in Phrase-based SMT

• phrase translation probabilities (in both directions)
• word translation probabilities (in both directions)
• language model
• reordering model
• constant penalty for each phrase used
• sparse features with learned cost for some (classes of) phrase pairs
• multiple models of each type possible

SLIDE 19

Decoding

er geht ja nicht nach hause

[table of translation options per source span, e.g. er → he / it / it is / he will be; geht → goes / go / is; ja → yes / of course; nicht → not / do not / does not / is not; nach → after / to / according to; nach hause → home / at home / return home; hause → house / home / chamber]

  • The machine translation decoder does not know the right answer
    – picking the right translation options
    – arranging them in the right order

→ search problem, solved by heuristic beam search

SLIDE 20

Decoding

er geht ja nicht nach hause

[search graph: empty hypothesis expanded with one translation option, e.g. "are"]

pick any translation option, create new hypothesis

SLIDE 21

Decoding

er geht ja nicht nach hause

[search graph: initial hypotheses "are", "it", "he", ...]

create hypotheses for all other translation options

SLIDE 22

Decoding

er geht ja nicht nach hause

[search graph: partial hypotheses extended further, e.g. "he" → "he goes" → "he goes does not"]

also create hypotheses from the partial hypotheses already created

SLIDE 23

Decoding

er geht ja nicht nach hause

[search graph: complete hypotheses covering all source words; best path highlighted]

backtrack from the highest-scoring complete hypothesis

SLIDE 24

Decoding

• large search space (exponential number of hypotheses)
• reduction of search space:
  – recombination of identical hypotheses
  – pruning of hypotheses
• efficient decoding is a lot more complex in SMT than in neural MT
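A heavily simplified sketch of stack decoding with histogram pruning, using the running example; it decodes monotonically (no reordering, recombination, or LM scoring), and all options and scores are toy values:

```python
import heapq

def beam_decode(src, options, beam_size=10):
    """Monotone stack decoding sketch: hypotheses are grouped into stacks by
    the number of covered source words; histogram pruning keeps each stack
    small. Reordering, hypothesis recombination and LM scoring are omitted."""
    n = len(src)
    stacks = [[] for _ in range(n + 1)]
    stacks[0] = [(0.0, 0, "")]  # (score, source words covered, output so far)
    for covered in range(n):
        for score, cov, out in stacks[covered]:
            for (i, j), phrases in options.items():
                if i != cov:  # monotone: extend at the leftmost uncovered word
                    continue
                for phrase, logp in phrases:
                    stacks[j].append((score + logp, j,
                                      (out + " " + phrase).strip()))
        # pruning: keep only the beam_size best hypotheses in the next stack
        stacks[covered + 1] = heapq.nlargest(beam_size, stacks[covered + 1],
                                             key=lambda h: h[0])
    # backtracking is implicit here: the output string is carried along
    return max(stacks[n], key=lambda h: h[0])

src = "er geht ja nicht nach hause".split()
options = {(0, 1): [("he", -0.5), ("it", -1.0)],
           (1, 2): [("goes", -0.7)],
           (2, 4): [("does not", -0.9)],
           (4, 6): [("home", -0.4)]}
print(beam_decode(src, options))  # (-2.5, 6, 'he goes does not home')
```

Note that this monotone sketch can never produce "he does not go home"; a real phrase-based decoder allows reordering at a distortion cost, which is precisely what makes the search space so large.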

SLIDE 25

1. Statistical Machine Translation Basics
2. Phrase-based SMT
3. Hierarchical SMT
4. Syntax-based SMT

SLIDE 26

Hierarchical SMT

core idea

use context-free grammar (CFG) rules as basic translation units
→ allows gaps

consequences

+ better modeling of some reordering patterns

[derivation figure: "leider ist Herr Steiger nach Köln gefahren" ↔ "unfortunately, Mr Steiger has gone to Cologne"]

− overgeneralisation is still possible

[derivation figure: ungrammatical output "Herr Steiger does not unfortunately ... has gone to Cologne" for "... nicht leider ..."]

SLIDE 27

Hierarchical Phrase Extraction

[alignment matrix: "I shall be passing on to you some comments" ↔ "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen"]

extracted: werde X aushändigen = shall be passing on X
(obtained by subtracting a subphrase)
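A sketch of the subphrase-subtraction step, using string replacement for brevity (a real extractor works on alignment span indices, and the inner pair must itself be an extractable phrase pair):

```python
def subtract(outer_src, outer_tgt, inner_src, inner_tgt):
    """Replace a smaller extracted phrase pair nested inside a larger one
    with a linked non-terminal X on both sides. All arguments are token
    lists; string-level replace is used here only for brevity."""
    def replace(seq, sub):
        return " ".join(seq).replace(" ".join(sub), "X", 1).split()
    return replace(outer_src, inner_src), replace(outer_tgt, inner_tgt)

outer_src = "werde Ihnen die entsprechenden Anmerkungen aushändigen".split()
outer_tgt = "shall be passing on to you some comments".split()
inner_src = "Ihnen die entsprechenden Anmerkungen".split()
inner_tgt = "to you some comments".split()
print(subtract(outer_src, outer_tgt, inner_src, inner_tgt))
# (['werde', 'X', 'aushändigen'], ['shall', 'be', 'passing', 'on', 'X'])
```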

SLIDE 28

Decoding

Decoding via (S)CFG derivation

s1 | s1

  • Derivation starts with a pair of linked s symbols.
SLIDE 29

Decoding

Decoding via (S)CFG derivation

s1 | s1
⇒ s2 x3 | s2 x3

  • s → s1 x2 | s1 x2 (glue rule)

SLIDE 30

Decoding

Decoding via (S)CFG derivation

s1 | s1
⇒ s2 x3 | s2 x3
⇒ s2 x4 und x5 | s2 x4 and x5

  • x → x1 und x2 | x1 and x2
SLIDE 31

Decoding

Decoding via (S)CFG derivation

s1 | s1
⇒ s2 x3 | s2 x3
⇒ s2 x4 und x5 | s2 x4 and x5
⇒ s2 unzutreffend und x5 | s2 unfounded and x5

  • x → unzutreffend | unfounded
SLIDE 32

Decoding

Decoding via (S)CFG derivation

s1 | s1
⇒ s2 x3 | s2 x3
⇒ s2 x4 und x5 | s2 x4 and x5
⇒ s2 unzutreffend und x5 | s2 unfounded and x5
⇒ s2 unzutreffend und irreführend | s2 unfounded and misleading

  • x → irreführend | misleading

SLIDE 33

Decoding

Decoding via (S)CFG derivation

s1 | s1
⇒ s2 x3 | s2 x3
⇒ s2 x4 und x5 | s2 x4 and x5
⇒ s2 unzutreffend und x5 | s2 unfounded and x5
⇒ s2 unzutreffend und irreführend | s2 unfounded and misleading
⇒ x6 unzutreffend und irreführend | x6 unfounded and misleading

  • s → x1 | x1 (glue rule)

SLIDE 34

Decoding

Decoding via (S)CFG derivation

s1 | s1
⇒ s2 x3 | s2 x3
⇒ s2 x4 und x5 | s2 x4 and x5
⇒ s2 unzutreffend und x5 | s2 unfounded and x5
⇒ s2 unzutreffend und irreführend | s2 unfounded and misleading
⇒ x6 unzutreffend und irreführend | x6 unfounded and misleading
⇒ deshalb x7 die x8 unzutreffend und irreführend | therefore the x8 x7 unfounded and misleading

  • x → deshalb x1 die x2 | therefore the x2 x1 (non-terminal reordering)

SLIDE 35

Decoding

Decoding via (S)CFG derivation

s1 | s1
⇒ s2 x3 | s2 x3
⇒ s2 x4 und x5 | s2 x4 and x5
⇒ s2 unzutreffend und x5 | s2 unfounded and x5
⇒ s2 unzutreffend und irreführend | s2 unfounded and misleading
⇒ x6 unzutreffend und irreführend | x6 unfounded and misleading
⇒ deshalb x7 die x8 unzutreffend und irreführend | therefore the x8 x7 unfounded and misleading
⇒ deshalb sei die x8 unzutreffend und irreführend | therefore the x8 was unfounded and misleading

  • x → sei | was
SLIDE 36

Decoding

Decoding via (S)CFG derivation

s1 | s1
⇒ s2 x3 | s2 x3
⇒ s2 x4 und x5 | s2 x4 and x5
⇒ s2 unzutreffend und x5 | s2 unfounded and x5
⇒ s2 unzutreffend und irreführend | s2 unfounded and misleading
⇒ x6 unzutreffend und irreführend | x6 unfounded and misleading
⇒ deshalb x7 die x8 unzutreffend und irreführend | therefore the x8 x7 unfounded and misleading
⇒ deshalb sei die x8 unzutreffend und irreführend | therefore the x8 was unfounded and misleading
⇒ deshalb sei die Werbung unzutreffend und irreführend | therefore the advertisement was unfounded and misleading

  • x → Werbung | advertisement
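The derivation can be replayed mechanically; a small sketch (replaying only the last three rule applications, starting from a single linked non-terminal) in which an SCFG rule rewrites one pair of linked non-terminals simultaneously on both sides:

```python
def apply_rule(pair, link, rule):
    """Rewrite the linked non-terminal `link` simultaneously on the source
    and target side. Non-terminals are (symbol, index) tuples, words are
    plain strings."""
    def expand(side, body):
        return [tok for t in side for tok in (body if t == link else [t])]
    (src, tgt), (rule_src, rule_tgt) = pair, rule
    return expand(src, rule_src), expand(tgt, rule_tgt)

pair = ([("X", 1)], [("X", 1)])
# x -> deshalb x1 die x2 | therefore the x2 x1  (non-terminal reordering)
pair = apply_rule(pair, ("X", 1),
                  (["deshalb", ("X", 2), "die", ("X", 3)],
                   ["therefore", "the", ("X", 3), ("X", 2)]))
pair = apply_rule(pair, ("X", 2), (["sei"], ["was"]))                # x -> sei | was
pair = apply_rule(pair, ("X", 3), (["Werbung"], ["advertisement"]))  # x -> Werbung | advertisement
print(" ".join(pair[0]), "|", " ".join(pair[1]))
# deshalb sei die Werbung | therefore the advertisement was
```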
SLIDE 37

1. Statistical Machine Translation Basics
2. Phrase-based SMT
3. Hierarchical SMT
4. Syntax-based SMT

SLIDE 38

Syntax-based SMT

core idea

• use syntax on source, target, or both
• rule extraction constrained by syntax
• potentially use syntactic structures for scoring (syntax-based LMs)

consequences

(depend on exact flavor of syntax used; here: string-to-tree SMT)

+ less overgeneralisation

− sparsity in grammar requires relaxation of extraction constraints
− label matching constraints increase search space during decoding

[figure: word-aligned parse trees of German "leider ist Herr Steiger nach Köln gefahren" (tags NN, VAFIN, ADV, APPR, NE, VVPP; constituents NP, PP, VP, S) and English "unfortunately, Mr Steiger has gone to Cologne" (tags NNP, ADV, VBZ, VBN, TO; constituents NP, PP, VP, S)]

SLIDE 39

Syntax-based Phrase Extraction

[word-aligned tree pair: English "I shall be passing on to you some comments" (tags PRP MD VB VBG RP TO PRP NNS; constituents NP, PP, VP, S) and German "Ich werde Ihnen die entsprechenden Anmerkungen aushändigen" (tags PPER VAFIN PPER ART ADJ NN VVFIN; constituents NP, VP, S)]

extracted: pro → Ihnen = pp → to you

SLIDE 40

Decoding

Input: jemand mußte Josef K. verleumdet haben
(gloss: someone must Josef K. slandered have)

Grammar

r1: np → Josef K. | Josef K. (0.90)
r2: vbn → verleumdet | slandered (0.40)
r3: vbn → verleumdet | defamed (0.20)
r4: vp → mußte x1 x2 haben | must have vbn2 np1 (0.10)
r5: s → jemand x1 | someone vp1 (0.60)
r6: s → jemand mußte x1 x2 haben | someone must have vbn2 np1 (0.80)
r7: s → jemand mußte x1 x2 haben | np1 must have been vbn1 by someone (0.05)

Derivation 1

[derivation tree using r1, r2, r4, r5: jemand mußte Josef K. verleumdet haben | someone must have slandered Josef K.]

SLIDE 41

Why Syntax-based SMT?

• many variants (syntax on source/target/both...)
• syntactic constraints for rule extraction and application prevent some over-generalizations

syntactic structure can be exploited by feature functions:

unification constraints [Williams, 2009]

"eine" → [cat: ART; infl: [case: nom, declension: mixed]; agr: [gender: f, num: sg]]

"Welt" → [cat: NN; infl: [case: nom]; agr: [gender: f, num: sg]]

syntax-based neural language model [Sennrich, 2015]

P_SYNTAX(T, D) ≈ ∏_{i=1}^{n} P_l(i) · P_w(i)

P_l(i) = P(l_i | w_a(i), l_a(i))
P_w(i) = P(w_i | l_i, w_a(i), l_a(i))

[dependency tree example: "Laura hat einen kleinen Garten" / "Laura has a small garden", with relations root, subj, obja, det, attr]
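A sketch of how this factorization scores a dependency tree: for each word, first predict its relation label from its head, then the word itself. The probability tables below are invented toy values standing in for the trained models:

```python
import math

# toy stand-ins for the trained models P(l_i | w_a(i), l_a(i)) and
# P(w_i | l_i, w_a(i), l_a(i)); keys and values are invented
p_label = {("has", "root", "subj"): 0.4, ("garden", "obja", "det"): 0.5}
p_word = {("subj", "has", "root", "Laura"): 0.05,
          ("det", "garden", "obja", "a"): 0.4}

def syntax_lm_logprob(tree, floor=1e-6):
    """tree: list of (word, label, head_word, head_label) tuples; returns
    the log of the product over P_l(i) * P_w(i), flooring unseen events."""
    logp = 0.0
    for w, l, hw, hl in tree:
        logp += math.log(p_label.get((hw, hl, l), floor))    # P_l(i)
        logp += math.log(p_word.get((l, hw, hl, w), floor))  # P_w(i)
    return logp

# fragment of the example tree: "Laura" is the subject of "has",
# "a" is the determiner of the object "garden"
tree = [("Laura", "subj", "has", "root"), ("a", "det", "garden", "obja")]
print(syntax_lm_logprob(tree))
```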

SLIDE 42

Edinburgh’s* WMT Results over the Years

BLEU (newstest2013 EN→DE):

year    phrase-based SMT    syntax-based SMT    neural MT
2013        20.3                19.4                –
2014        20.9                20.2                –
2015        20.8                22.0               18.9*
2016        21.5                22.1               24.7
2017         –                   –                 26.0

*NMT 2015 from U. Montréal: https://sites.google.com/site/acl16nmt/

SLIDE 43

What Phrase-based SMT (Still) Does Better than NMT

• better performance in low-data conditions [Koehn and Knowles, 2017]
• clear stopping criterion at decoding time: when all source words have been covered by a phrase pair
• good ecosystem of methods for specialized requirements (e.g. inclusion of terminology)
• ability to inspect translation decisions and models:
  – alignment between source and output
  – add/remove phrase table entries

SLIDE 44

Software

Moses SMT Toolkit

• developed in Edinburgh
• many features and extensive documentation:
  http://www.statmt.org/moses
• documentation of baseline phrase-based systems:
  http://www.statmt.org/moses/?n=moses.baseline
  http://lotus.kuee.kyoto-u.ac.jp/WAT/WAT2017/baseline/baselineSystemPhrase_kj.html
• config files for SOTA (in 2014/5) syntax-based systems:
  https://github.com/rsennrich/wmt2014-scripts

SLIDE 45

Further Reading

text books

• Philipp Koehn (2009). Statistical Machine Translation.
• Philip Williams, Rico Sennrich, Matt Post and Philipp Koehn (2016). Syntax-based Statistical Machine Translation.

online resources

• syntax-based tutorial by Philip Williams and Philipp Koehn (slide credit to them for some slides shown here):
  http://homepages.inf.ed.ac.uk/s0898777/syntax-tutorial.pdf
• slides on word- and phrase-based SMT by Philipp Koehn:
  http://www.statmt.org/book/slides/04-word-based-models.pdf
  http://www.statmt.org/book/slides/05-phrase-based-models.pdf
  http://www.statmt.org/book/slides/06-decoding.pdf

SLIDE 46

Bibliography I

Brown, P. F., Della Pietra, V. J., Della Pietra, S. A., and Mercer, R. L. (1993). The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19(2):263–311.

Koehn, P. and Knowles, R. (2017). Six Challenges for Neural Machine Translation. In Proceedings of the First Workshop on Neural Machine Translation, pages 28–39, Vancouver. Association for Computational Linguistics.

Och, F. J. (2003). Minimum Error Rate Training in Statistical Machine Translation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL '03), pages 160–167, Sapporo, Japan. Association for Computational Linguistics.

Sennrich, R. (2015). Modelling and Optimizing on Syntactic N-Grams for Statistical Machine Translation. Transactions of the Association for Computational Linguistics, 3:169–182.

Williams, P. (2009). Towards Statistical Machine Translation with Unification Grammars. Master's thesis, University of Edinburgh, Edinburgh, UK.
