Five Shades of Noise: Analyzing Machine Translation Errors in User-Generated Text
Marlies van der Wees, Arianna Bisazza, Christof Monz
✤ News sentence: 印度金融中心孟买亦受到波及。
✦ Reference: mumbai, india's financial center, was also affected.
✦ SMT output: india's financial center mumbai also affected.
✤ SMS sentence: 你路上慢点
✦ Reference: be careful on your way / take your time
✦ SMT output: you are on the road to slow points
✤ Reference versus SMT output:
✦ Reference: and if i go out, i will stop by your place
  SMT output: and if i went.
✦ Reference: i could not bring it to you
  SMT output: into its enemies.
✦ Reference: i've never seen a pig there
  SMT output: i am seen pig there.
✦ Reference: you're too delighted to be homesick
  SMT output: anytime you
✤ To target specific error types, we need to know why mistakes are made:
✦ in UG versus formal text
✦ in different types of UG: conversational telephone speech (CTS), SMS, and chat messages
✦ in different language pairs
✤ What translation choices were made by the SMT system?
✤ What translation choices could have been made by the SMT system?
✤ Why did the SMT system make the choices that it made?
✤ Given phrase-table entries (source phrase, target phrase, probability), three error types:
✦ SEEN error: source phrase not in phrase table
✦ SENSE error: target phrase not in phrase table
✦ SCORE error: source and target phrases both in table, but other translation preferred
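A minimal sketch of this taxonomy in Python. The function name and the toy phrase table are my own illustration, not from the talk (the table reuses the slide's "dot → point" example):

```python
def classify_error(source, correct_target, phrase_table):
    """Classify a wrongly translated source phrase as a SEEN, SENSE, or SCORE error."""
    candidates = phrase_table.get(source)
    if candidates is None:
        return "SEEN"    # source phrase not in the phrase table at all
    if correct_target not in candidates:
        return "SENSE"   # source phrase known, but the correct target phrase is missing
    return "SCORE"       # correct target available, yet another translation was preferred

# Toy phrase table: source phrase -> {target phrase: probability}
phrase_table = {
    "dot": {"point": 0.4, "dot": 0.3, "spot": 0.2},
}

print(classify_error("blob", "blob", phrase_table))   # SEEN
print(classify_error("dot", "period", phrase_table))  # SENSE
print(classify_error("dot", "spot", phrase_table))    # SCORE
```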
✤ For each word alignment link in the test set (e.g. 你 — your) that is translated wrongly, determine whether it is a SEEN, SENSE, or SCORE error.*

* Approach adapted from Irvine et al., Measuring Machine Translation Errors in New Domains, 2013
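The per-link bookkeeping could be sketched roughly as follows. This is my own illustration under assumed data formats (triples of aligned words), not the authors' implementation; the real method follows Irvine et al.'s word-alignment-driven evaluation:

```python
from collections import Counter

def wade_stats(aligned_links, phrase_table):
    """aligned_links: (source word, reference word, SMT output word) triples,
    one per word alignment link in the test set.
    Returns relative frequencies of Correct / Seen / Sense / Score links."""
    counts = Counter()
    for src, ref, hyp in aligned_links:
        if hyp == ref:
            counts["Correct"] += 1
        elif src not in phrase_table:
            counts["Seen"] += 1      # model never saw the source word
        elif ref not in phrase_table[src]:
            counts["Sense"] += 1     # correct translation absent from the model
        else:
            counts["Score"] += 1     # correct translation present but outscored
    total = sum(counts.values())
    return {cat: counts[cat] / total for cat in ("Correct", "Seen", "Sense", "Score")}

# Toy example (invented data)
links = [("你", "your", "you are"), ("路上", "way", "road"), ("慢点", "slowly", "slowly")]
table = {"路上": {"road": 0.4, "way": 0.3}, "慢点": {"slowly": 0.5}}
print(wade_stats(links, table))
```

Aggregating these per-link labels over a benchmark gives the relative-frequency bars shown for each test set.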
[Figure: Word-level error statistics for Arabic-English benchmarks. Bars show the relative frequency (10-60%) of Correct, Seen, Sense, and Score words per benchmark: News 1 and News 2 (news) versus Weblogs, Comments, CTS, Chat, and SMS (UG).]
✤ SMT errors for UG text differ:
✦ from SMT errors for news
✦ between different types of UG
✦ between different language pairs: differences in Chinese-English are more subtle than in Arabic-English
✤ Common errors in UG are due to:
✦ misspellings or Arabic dialectal forms
✦ formal lexical choices
✦ idioms translated word by word
✦ dropped pronouns in Chinese
✤ UG suffers from low model coverage; to address this:
✦ generate new translation candidates
✦ normalize existing translation candidates
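To illustrate the normalization idea, a toy preprocessing step that maps noisy forms onto forms an existing model already covers. The mapping and examples below are invented (and use English SMS shorthand for readability); a real system would learn such mappings from data:

```python
# Invented toy mapping from noisy UG tokens to canonical forms.
NORMALIZE = {"u": "you", "r": "are", "ur": "your", "2": "to", "gr8": "great"}

def normalize(tokens):
    """Replace known noisy variants so phrase-table lookups can succeed."""
    return [NORMALIZE.get(tok.lower(), tok) for tok in tokens]

print(normalize("u r gr8".split()))  # ['you', 'are', 'great']
```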
✤ Visit the poster for:
✦ Model coverage analysis
✦ Arabic-English versus Chinese-English results
✦ Qualitative examples
✤ Read the paper for:
✦ Phrase-length analysis
✦ Detailed explanation and discussions
ACL 2015 Workshop on Noisy User-generated Text (WNUT), Beijing, China
Informatics Institute, University of Amsterdam
This research was funded in part by the Netherlands Organization for Scientific Research (NWO) under project number 639.022.213.
✤ Marlies van der Wees ✤ m.e.vanderwees@uva.nl