SLIDE 1 Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model
Presenter : 1Kosuke Takahashi
1,2Katsuhito Sudoh, 1Satoshi Nakamura
1: Nara Institute of Science and Technology (NAIST) 2: PRESTO, Japan Science and Technology Agency
SLIDE 2
Existing metrics based on surface-level features
- BLEU [Papineni+, 2002], NIST [Doddington+, 2002], METEOR [Banerjee+, 2005]
- Calculate evaluation scores from word-matching rates
Problem: relying on lexical features → cannot appropriately evaluate semantic and syntactic differences
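To make the word-matching idea concrete, here is a minimal, self-contained sketch of a clipped unigram match rate. It is a deliberate simplification (BLEU, NIST, and METEOR add n-grams, brevity penalties, stemming, and synonym matching), but it shows why a correct paraphrase scores poorly under surface-level matching:

```python
from collections import Counter

def word_match_rate(hypothesis: str, reference: str) -> float:
    """Clipped unigram precision: the fraction of hypothesis words
    that also appear in the reference. A crude stand-in for
    surface-level metrics such as BLEU."""
    hyp_counts = Counter(hypothesis.lower().split())
    ref_counts = Counter(reference.lower().split())
    matched = sum(min(count, ref_counts[word]) for word, count in hyp_counts.items())
    return matched / max(sum(hyp_counts.values()), 1)

print(word_match_rate("the cat sat on the mat", "the cat sat on the mat"))      # 1.0
# A paraphrase with the same meaning gets a low score:
print(word_match_rate("a kitten rested on the rug", "the cat sat on the mat"))  # ≈0.33
```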
SLIDE 3
Existing metrics based on embedding representations
- RUSE [Shimanaka+, 2018], BERT regressor [Shimanaka+, 2019]
- Fully parameterized metrics
  - Use sentence vectors
  - Fine-tuned to predict human evaluation scores
- BERT regressor achieved the SOTA result on the WMT17 metrics task in 2019
These metrics provide better evaluation performance than surface-level ones.
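As an illustration of such a fully parameterized metric, a BERT regressor could be sketched as below, assuming PyTorch and Hugging Face transformers; the checkpoint name, [CLS] pooling, and MSE loss are illustrative assumptions, not necessarily the exact setup of [Shimanaka+, 2019]:

```python
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

class BERTRegressor(nn.Module):
    """Sentence-pair encoder with a regression head, fine-tuned to
    predict human evaluation (DA) scores. Illustrative sketch."""
    def __init__(self, model_name="bert-base-uncased"):  # assumed checkpoint
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.head = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, **enc):
        cls_vec = self.encoder(**enc).last_hidden_state[:, 0]  # [CLS] vector (assumed pooling)
        return self.head(cls_vec).squeeze(-1)                  # one score per pair

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BERTRegressor()
# Encode (hypothesis, reference) as a single sentence pair:
batch = tokenizer(["the cat sat on the mat"], ["a cat was sitting on the mat"],
                  return_tensors="pt", padding=True, truncation=True)
pred = model(**batch)
loss = nn.MSELoss()(pred, torch.tensor([0.6]))  # regress toward a human DA score
```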
SLIDE 4
Proposed multi-reference evaluation

[Figure] Conventional multi-reference: the system translation (hypothesis) is scored against reference 1, reference 2, ..., reference n
○ better evaluation × costly to prepare multiple reference sentences

[Figure] Proposed idea: the system translation (hypothesis) is scored against the source sentence (reference 1) and the reference sentence (reference 2)
○ better evaluation ○ only a little cost, since each hypothesis already comes with these 2 references
SLIDE 5 Architectures of the baseline and proposed models
[Figure: three model architectures, shown as panels "Baseline: BERT regressor (hyp+ref)", "hyp+src/hyp+ref", and "hyp+src+ref"]
- Baseline (BERT regressor, hyp+ref): a sentence-pair encoder reads hypothesis + reference → sentence-pair vector v_hyp+ref → MLP → evaluation score
- hyp+src/hyp+ref: sentence-pair encoders read hypothesis + source and hypothesis + reference → concatenation of v_hyp+src and v_hyp+ref → MLP → evaluation score
- hyp+src+ref: a sentence-pair encoder reads hypothesis + source + reference → sentence-pair vector v_hyp+src+ref → MLP → evaluation score
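A minimal sketch of the hyp+src/hyp+ref variant, assuming PyTorch and Hugging Face transformers with an mBERT checkpoint; the shared encoder, [CLS] pooling, and MLP sizes are assumptions for illustration, not the paper's exact configuration:

```python
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

class HypSrcRefRegressor(nn.Module):
    """hyp+src/hyp+ref: encode (hyp, src) and (hyp, ref) as two sentence
    pairs, concatenate the two pair vectors, and regress to a score."""
    def __init__(self, model_name="bert-base-multilingual-cased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)  # cross-lingual encoder
        hidden = self.encoder.config.hidden_size
        self.mlp = nn.Sequential(nn.Linear(2 * hidden, hidden),
                                 nn.Tanh(),
                                 nn.Linear(hidden, 1))

    def pair_vector(self, enc):
        return self.encoder(**enc).last_hidden_state[:, 0]  # [CLS] pair vector

    def forward(self, hyp_src_enc, hyp_ref_enc):
        v = torch.cat([self.pair_vector(hyp_src_enc),   # v_hyp+src
                       self.pair_vector(hyp_ref_enc)],  # v_hyp+ref
                      dim=-1)
        return self.mlp(v).squeeze(-1)

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
hyp, src, ref = ["the cat sat"], ["die Katze saß"], ["the cat was sitting"]
model = HypSrcRefRegressor()
score = model(tok(hyp, src, return_tensors="pt", padding=True),
              tok(hyp, ref, return_tensors="pt", padding=True))
```

The hyp+src+ref variant would instead feed one concatenated hypothesis + source + reference sequence through a single encoder.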
SLIDE 6 Experimental settings
- Language models : mBERT, XLM15
- Input styles : hyp+src/hyp+ref, hyp+src+ref, hyp+ref, hyp+src
- Baselines : SentBLEU, BERT regressor (monolingual BERT with hyp+ref)
- Data : WMT17 metrics shared task
- Language pairs : {De, Ru, Tr, Zh}-En
SLIDE 7 Results : comparison with baselines
metric or language model            input style        average score (r)
SentBLEU                            hyp, ref           48.4
BERT regressor (monolingual BERT)   hyp+ref            74.0
mBERT                               hyp+src/hyp+ref    72.6
mBERT                               hyp+src+ref        68.9
XLM15                               hyp+src/hyp+ref    77.1  (+3.1 over the BERT regressor)
XLM15                               hyp+src+ref        74.7

- Proposed XLM15 with hyp+src/hyp+ref surpassed the baseline scores
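Here "average score (r)" is Pearson's correlation between metric outputs and human DA scores, averaged over the language pairs; a small sketch with scipy (the toy numbers are made up, not WMT17 data):

```python
import numpy as np
from scipy.stats import pearsonr

def average_pearson(scores_by_language_pair):
    """Average Pearson's r between metric scores and human DA scores
    over language pairs, as in the tables on these slides (sketch)."""
    rs = [pearsonr(metric_scores, da_scores)[0]
          for metric_scores, da_scores in scores_by_language_pair.values()]
    return float(np.mean(rs))

toy = {"de-en": ([0.2, 0.5, 0.9], [0.1, 0.4, 0.8]),
       "zh-en": ([0.3, 0.6, 0.7], [0.2, 0.7, 0.6])}
print(average_pearson(toy))
```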
SLIDE 8 Results : evaluation performance for each input style

language model    input style        average score (r)
mBERT             hyp+ref            67.9
mBERT             hyp+src            55.9
mBERT             hyp+src/hyp+ref    72.6  (+4.7 over hyp+ref)
mBERT             hyp+src+ref        68.9
XLM15             hyp+ref            74.1
XLM15             hyp+src            72.8
XLM15             hyp+src/hyp+ref    77.1  (+3.0 over hyp+ref)
XLM15             hyp+src+ref        74.7
- Using both src and ref improves evaluation performance
- hyp+src/hyp+ref was the best input style
SLIDE 9
Analysis : scatter plots of evaluation and DA scores

[Figure: scatter plot of XLM15 hyp+src/hyp+ref evaluation scores against DA scores]
Pearson's correlation score: All: 0.768, DA ≧ 0.0: 0.580, DA < 0.0: 0.529
Low-quality translations are hard to evaluate.
Note: DA (Direct Assessment) is a human evaluation score
SLIDE 10
Analysis : the drop rate of Pearson's correlation score from the high-DA to the low-DA range

Note: the reduction rate indicates how much evaluation performance degrades from high-quality to low-quality translations

language model                      input style        reduction rate (%)
BERT regressor (monolingual BERT)   hyp+ref            16.10
mBERT                               hyp+ref            22.05
mBERT                               hyp+src            6.88
mBERT                               hyp+src/hyp+ref    7.77  (−14.28 vs. hyp+ref)
mBERT                               hyp+src+ref        17.51
XLM15                               hyp+ref            14.20
XLM15                               hyp+src            8.46
XLM15                               hyp+src/hyp+ref    8.68  (−5.52 vs. hyp+ref)
XLM15                               hyp+src+ref        11.12
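The slides do not spell out the reduction-rate formula; a natural reading is the relative drop in Pearson's r from the high-DA subset to the low-DA subset, which roughly matches slide 9's numbers ((0.580 − 0.529) / 0.580 ≈ 8.8% vs. the reported 8.68%, plausibly a rounding difference). A sketch under that assumption:

```python
from scipy.stats import pearsonr

def reduction_rate(metric_scores, da_scores, threshold=0.0):
    """Assumed formula: relative drop (%) in Pearson's r from
    high-quality (DA >= threshold) to low-quality (DA < threshold)
    translations."""
    high = [(m, d) for m, d in zip(metric_scores, da_scores) if d >= threshold]
    low = [(m, d) for m, d in zip(metric_scores, da_scores) if d < threshold]
    r_high = pearsonr(*zip(*high))[0]
    r_low = pearsonr(*zip(*low))[0]
    return (r_high - r_low) / r_high * 100
```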
SLIDE 11
Summary
- Proposed an MT evaluation metric that utilizes source sentences as pseudo references
- hyp+src/hyp+ref makes good use of source sentences and is confirmed to improve evaluation performance
- XLM15 with hyp+src/hyp+ref showed higher correlation with human judgments than the baselines
- Source information contributes to stabilizing the evaluation of low-quality translations
Future Work
- Experiment with multiple language models and datasets
- Focus on better evaluation of low-quality translations