Five Shades of Noise: Analyzing Machine Translation Errors in User-Generated Text (PowerPoint presentation)



SLIDE 1

Five Shades of Noise: Analyzing Machine Translation Errors in User-Generated Text

Marlies van der Wees, Arianna Bisazza, Christof Monz

SLIDE 2

News sentence: 印度金融中心孟买亦受到波及。 (mumbai, india's financial center, was also affected.)
SMT output: india's financial center mumbai also affected.

😁

SMT

Statistical Machine Translation

SLIDE 3

SMS sentence: 你路上慢点 (be careful on your way / take your time)
SMT output: you are on the road to slow points

😪

SMT

Statistical Machine Translation

SLIDE 4

SMT for user-generated text is often bad

✤ Reference versus SMT output:

✦ Reference: and if i go out, i will stop by your place
  SMT output: and if i went.

✦ Reference: i could not bring it to you
  SMT output: into its enemies.

✦ Reference: i've never seen a pig there
  SMT output: i am seen pig there.

✦ Reference: you're too delighted to be homesick
  SMT output: anytime you

SLIDE 5

Towards improving SMT quality for UG

✤ To target specific error types, we need to know why mistakes are made:

✦ in UG versus formal text
  • contrast UG with newswire

✦ in different types of UG
  • five shades of noise: weblogs, comments, speech (CTS), SMS, and chat messages

✦ in different language pairs
  • Arabic-English & Chinese-English
SLIDE 6

Analyzing SMT errors in UG text

✤ What translation choices were made by the SMT system?

✤ What translation choices could have been made by the SMT system?

✤ Why did the SMT system make the choices that it made?

SLIDE 7

Word Alignment Driven Evaluation: approach*

✤ For each word alignment link in the test set (e.g. 你 — your) that is translated wrongly, determine:

✦ source phrase not in phrase table: SEEN error
✦ target phrase not in phrase table: SENSE error
✦ source and target phrases both in table, but other translation preferred: SCORE error

[Figure: a phrase-table entry built up step by step, listing candidate target phrases "on the road" (probability 0.4), "on the way" (0.3), and "on your way" (0.2) for a single source phrase.]

* Approach adopted from Irvine et al., Measuring Machine Translation Errors in New Domains, 2013
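The SEEN/SENSE/SCORE taxonomy above can be sketched as a small classifier. This is a hedged illustration only, not the authors' implementation: the phrase table is a toy dict, the lookup is per alignment link rather than over full phrase spans, and the probabilities mirror the illustrative figure rather than any real model.

```python
# Toy sketch of the SEEN/SENSE/SCORE taxonomy for a mistranslated
# word-alignment link (source phrase s, reference target phrase t).
# phrase_table maps source phrases to {target phrase: probability}.

def classify_error(s, t, phrase_table, system_choice):
    """Classify why the aligned pair (s, t) was translated wrongly."""
    if s not in phrase_table:
        return "SEEN"    # source phrase never observed in training data
    if t not in phrase_table[s]:
        return "SENSE"   # source known, but this target sense is missing
    if system_choice != t:
        return "SCORE"   # both sides known, but another option outscored it
    return "CORRECT"

# Hypothetical phrase-table entry (illustrative numbers only):
phrase_table = {"路上": {"on the road": 0.4, "on the way": 0.3, "on your way": 0.2}}

print(classify_error("路上", "on your way", phrase_table, "on the road"))  # SCORE
print(classify_error("慢点", "take your time", phrase_table, None))        # SEEN
```

One design point the sketch preserves: SEEN and SENSE errors are coverage problems (nothing to retrieve), while SCORE errors are ranking problems (the right option exists but loses), which is why the two call for different remedies.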

SLIDE 8

Word Alignment Driven Evaluation: results

[Bar chart: word-level error statistics for Arabic-English benchmarks. X-axis: News 1 and News 2 (news) versus Weblogs, Comments, CTS, Chat, SMS (UG); y-axis: relative frequency (10 to 60); legend: Correct, Seen, Sense, Score.]

SLIDE 9

Word Alignment Driven Evaluation: findings

✤ SMT errors for UG text differ

✦ from SMT errors for news
  • many SEEN and SENSE errors for UG

✦ between different types of UG
  • SMS and chat messages are most affected

✦ between different language pairs
  • differences in Chinese-English are more subtle than in Arabic-English

SLIDE 10

Analyzing SMT errors in UG: what we learned

✤ Common errors in UG are due to:

✦ misspellings or Arabic dialectal forms
✦ overly formal lexical choices
✦ idioms translated word by word
✦ dropped pronouns in Chinese

✤ UG suffers from low model coverage; possible remedies:

✦ generate new translation candidates
✦ normalize existing translation candidates
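The normalization idea above can be sketched as a toy fallback: map a noisy out-of-vocabulary source token to its closest in-vocabulary form and reuse that form's translations. This is a hedged illustration of the general technique, not the paper's method; the edit-distance heuristic, the cutoff, and the made-up phrase-table entries are all assumptions for the example.

```python
# Toy "normalize, then translate" fallback for OOV source tokens:
# if a token is unknown, look for the most similar known source form
# (difflib similarity) and reuse its phrase-table entries.
from difflib import get_close_matches

def translations_for(token, phrase_table):
    """Return translation options for token, normalizing OOVs if possible."""
    if token in phrase_table:
        return phrase_table[token]
    # Fall back to the closest known source form, if any is similar enough.
    match = get_close_matches(token, phrase_table.keys(), n=1, cutoff=0.8)
    return phrase_table[match[0]] if match else {}

# Hypothetical entries purely for illustration:
phrase_table = {"going": {"aller": 0.6}, "tomorrow": {"demain": 0.7}}

print(translations_for("goin", phrase_table))  # normalized to "going"
```

A real system would restrict such matching (e.g. to plausible misspellings or dialectal variants) to avoid conflating genuinely different words.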

SLIDE 11

More Error Analysis?

✤ Visit the poster for:

✦ model coverage analysis
✦ Arabic-English versus Chinese-English results
✦ qualitative examples

✤ Read the paper for:

✦ phrase-length analysis
✦ detailed explanations and discussion

Poster: Five Shades of Noise: Analyzing Machine Translation Errors in User-Generated Text
Marlies van der Wees, Arianna Bisazza, Christof Monz — Informatics Institute, University of Amsterdam
ACL 2015 Workshop on Noisy User-generated Text (WNUT), Beijing, China. Contact: m.e.vanderwees@uva.nl
This research was funded in part by the Netherlands Organization for Scientific Research (NWO) under project number 639.022.213.

Motivation: statistical machine translation (SMT) of user-generated (UG) text
  • input SMS message: 你路上慢点 (= be careful on your way / take your time)
  • SMT output translation: you are on the road to slow points

Understanding SMT errors in UG text
  • why does SMT make the errors that it makes on UG? low model coverage? poor scoring of translation options?
  • what errors are observed for various types of UG?

Five Shades of Noise
  • two language pairs: Arabic-English & Chinese-English
  • five UG sets: weblogs, comments, speech, SMS, chat
  • two news sets: different sources, to contrast with UG
  • lower translation quality for UG than for news

Qualitative Analysis: Word Alignment Driven Evaluation*
  • categories: Correct; SEEN error (unknown source); SENSE error (unknown target); SCORE error (suboptimal scoring)
  • Input: qAlt E$An AlEyAl mtzEl$
    Ref: so the kids do not feel upset
    Output: said because of the sons
    → out-of-vocabulary (OOV) words due to dialect or misspellings; lexical choices that are too formal, not reflecting colloquial language
  • Input: 上网 了 , 你 路上 慢 点
    Ref: i 'm online . take your time
    Output: on the internet , and you are on the road to slow points
    → missing pronoun not inferred by the SMT system; idiom translated in small chunks, losing its meaning as a phrase
* Irvine et al., Measuring Machine Translation Errors in New Domains, 2013

Quantitative Analysis: SMT Model Coverage
  • approach: for each phrase pair in the test set (e.g. … / take your time), determine:
    - source phrase covered in the SMT models
    - target phrase covered in the SMT models
    - phrase pair covered in the SMT models
    all computed for various phrase lengths
  • findings:
    - coverage of source phrases and phrase pairs is lower for UG than for news
    - coverage of target phrases is more balanced among test sets
    - coverage dramatically decreases for longer phrases
    - SMS and chat suffer most from low coverage

Conclusions
  • SMT errors for UG text differ from SMT errors for news, between different types of UG, and between different language pairs
  • promising solutions include: improving scoring for news, increasing phrase-pair coverage for UG, and increasing source-phrase coverage for SMS & chat
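The coverage statistics described above can be sketched as a short computation: for each phrase length n, the fraction of test-set n-grams that appear in the model. This is a hedged, one-sided illustration with a toy phrase set; the real analysis uses full SMT phrase tables, and phrase-pair coverage (omitted here) additionally requires word-aligned phrase pairs.

```python
# Toy per-length coverage statistic: what fraction of the n-grams in a
# test set has the model seen? (Source-side shown; target-side works
# the same way with target sentences and target phrases.)

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def side_coverage(sentences, known_phrases, n):
    """Fraction of length-n phrases in `sentences` found in `known_phrases`."""
    grams = [g for sent in sentences for g in ngrams(sent, n)]
    return sum(g in known_phrases for g in grams) / len(grams) if grams else 0.0

# Hypothetical phrase inventory and tiny "test set" (illustrative only;
# "gr8" stands in for a noisy UG token the model never saw):
known = {("take",), ("your",), ("time",), ("take", "your")}
test = [["take", "your", "time"], ["gr8", "time"]]

for n in (1, 2):
    print(n, round(side_coverage(test, known, n), 2))  # 1 0.8  /  2 0.33
```

Even this toy run shows the two trends reported on the poster: noisy tokens depress coverage, and coverage drops further as the phrase length grows.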
SLIDE 12

Thank you!

✤ Marlies van der Wees ✤ m.e.vanderwees@uva.nl
