
Machine Translation Proposal



  1. Machine Translation Proposal

  2. Pilot Objective:
     ● Cost: PEMT (post-edited machine translation) 27% savings over human translation
     ● Efficiency: PEMT 25% faster than human translation
     ● Quality: an acceptable score under 30 according to the harmonized TAUS Dynamic Quality Framework (DQF) and Multidimensional Quality Metrics (MQM)

  3. Error scoring by type and severity

     Category      Error Type                          Minor  Major  Critical
     Accuracy      omission                            1      2      3
     Accuracy      mistranslation                      1      2      3
     Accuracy      untranslated                        1      2      3
     Terminology   inconsistent with termbase          1      2      3
     Terminology   inconsistent use of terminology     1      2      3
     Fluency       grammar                             1      2      3
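The severity weights in this table can be turned into a single error score by summing the weight of every annotated error. The sketch below assumes that simple sum; the slides do not say whether the pilot normalized the score in any other way, and the sample annotations are hypothetical.

```python
from collections import Counter

# Severity weights as listed in the error-type table.
SEVERITY_WEIGHTS = {"minor": 1, "major": 2, "critical": 3}


def error_score(errors):
    """Sum the severity weights over (error_type, severity) annotations."""
    return sum(SEVERITY_WEIGHTS[severity] for _error_type, severity in errors)


def score_by_type(errors):
    """Break the total down by error type to see where penalties come from."""
    totals = Counter()
    for error_type, severity in errors:
        totals[error_type] += SEVERITY_WEIGHTS[severity]
    return dict(totals)


# Hypothetical annotations for a post-edited sample (not pilot data).
sample_errors = [
    ("mistranslation", "major"),
    ("untranslated", "critical"),
    ("grammar", "minor"),
    ("mistranslation", "minor"),
]

print(error_score(sample_errors))    # 7
print(score_by_type(sample_errors))
```

A per-type breakdown like `score_by_type` makes it easier to see whether Accuracy, Terminology, or Fluency errors drive the total.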

  4. Pilot Project Processes & Problems

  5. Step 1: File Preparation
     ❖ PDF to DOCX
     ❖ Delete unnecessary content
     ❖ Delete extra spaces
     ❖ Make the file clean so that alignment is easier

  6. Step 2: Testing and Training
     ❖ Round A: TMX only
     ❖ Round B: adding related PDFs
     ❖ BLEU score increased
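For readers unfamiliar with the BLEU score used to compare training rounds, here is a minimal sketch of corpus-level BLEU with a single reference per segment. It assumes whitespace tokenization and no smoothing; the slides do not say which tool or settings produced the pilot's scores, so its numbers will not match theirs. The example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), one reference per hypothesis, unsmoothed."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty for output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


reference = ["the fund publishes a report every year"]
hypothesis = ["the fund publishes a report each year"]
print(round(corpus_bleu(hypothesis, reference), 2))
```

Because BLEU only measures n-gram overlap with a reference, it can shift when the training data changes for reasons unrelated to perceived quality, which is why the pilot pairs it with human evaluation.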

  7. Step 3: First Round of Human Evaluation
     ❖ 2 post-editors
     ❖ A sample of 1000 words extracted from one of the systems
     ❖ Analyzed and gave the quality score first
     ❖ Average PE time: 53 minutes

  8. Step 4: Tuning
     ❖ Training the system with separate tuning data failed
     ❖ Workaround: put the tuning files into the training data, which worked

  9. Step 5: Adding a dictionary
     ❖ 512-page IMF glossary
     ❖ Converted from PDF to DOCX
     ❖ Cleaned up formats and terms
     ❖ Added it into the dictionary data
     ❖ Trained two systems

  10. Problem: Time-consuming glossary cleanup

  11. Problem: Failed to add the glossary to the dictionary data

  12. Problem: Lower BLEU score after adding the glossary

  13. Step 6: Final Round of Human Evaluation
     ❖ 2 post-editors
     ❖ A sample of 1000 words extracted from the system with the highest BLEU score (44.03)
     ❖ Analyzed and gave the quality score first
     ❖ Average PE time: 0.68 hours (40.8 minutes)

  14. Problem: Mistranslated and untranslated MT output due to incomplete manual cleanup

  15. Pilot Project Results

  16. 85% Efficiency

  17. 71% Cost
     HT:
     ❖ Translation: $0.12/word
     ❖ Editing: $0.05/word
     PEMT:
     ❖ Post-editing: $0.05/word
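The 71% figure follows from the per-word rates on this slide. The short sketch below reproduces the arithmetic, assuming the only PEMT cost is post-editing (engine training, file preparation, and glossary cleanup time are not priced on the slide, so they are left out here).

```python
# Per-word rates taken from the cost slide (USD).
HT_TRANSLATION = 0.12
HT_EDITING = 0.05
PEMT_POST_EDITING = 0.05


def savings_ratio(baseline_cost, new_cost):
    """Fraction of the baseline cost saved by switching to the new workflow."""
    return (baseline_cost - new_cost) / baseline_cost


ht_cost = HT_TRANSLATION + HT_EDITING  # human translation plus editing
savings = savings_ratio(ht_cost, PEMT_POST_EDITING)
print(f"{savings:.1%}")  # 70.6%
```

Rounded to the nearest percent, this gives the 71% savings reported, well above the 27% objective.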

  18. 31.5 Quality

  19. Quality: Comparison of two rounds of human evaluation

     Metric                    First round   Final round
     QA error score            49            31.5
     PE time for 1000 words    53 mins       40.8 mins
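The change between the two evaluation rounds can be expressed as relative reductions. The sketch below uses only the figures from the slides and reads the quality objective as "error score strictly under 30"; that reading of the threshold is an assumption.

```python
QUALITY_THRESHOLD = 30  # objective: an acceptable score under 30

first_round = {"error_score": 49.0, "pe_minutes": 53.0}
final_round = {"error_score": 31.5, "pe_minutes": 0.68 * 60}  # 0.68 hours


def relative_reduction(before, after):
    """Fraction by which a value dropped between the two rounds."""
    return (before - after) / before


error_reduction = relative_reduction(
    first_round["error_score"], final_round["error_score"]
)
time_reduction = relative_reduction(
    first_round["pe_minutes"], final_round["pe_minutes"]
)
meets_quality_target = final_round["error_score"] < QUALITY_THRESHOLD

print(f"Error score reduction: {error_reduction:.1%}")  # 35.7%
print(f"PE time reduction: {time_reduction:.1%}")       # 23.0%
print(f"Meets quality target: {meets_quality_target}")  # False
```

Under this reading, the error score fell by roughly a third but the final score of 31.5 still sits just above the target of 30.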

  20. Lessons Learned

  21. PDF formatting cleanup

  22. When in doubt, check the system
     ➔ The system IS objective
     ➔ TMX IS better than PDF

  23. Content relevance is key

  24. ➔ "Terminology" and "Fluency" performance improved
     ➔ Further data needs to be collected to assess accuracy

  25. Thank you
