Machine Translation Proposal: Pilot Objective (PowerPoint PPT Presentation)



SLIDE 1

Machine Translation Proposal

SLIDE 2

Pilot Objective:

  • Cost: PEMT 27% savings over human translation
  • Efficiency: PEMT 25% faster than human translation
  • Quality: an acceptable error score under 30 according to the harmonized TAUS Dynamic Quality Framework (DQF) and Multidimensional Quality Metrics (MQM)

SLIDE 3

Error typology (penalty points by severity):

  Error Type                                    Minor   Major   Critical
  Accuracy: omission                              1       2       3
  Accuracy: mistranslation                        1       2       3
  Accuracy: untranslated                          1       2       3
  Terminology: inconsistent with termbase         1       2       3
  Terminology: inconsistent use of terminology    1       2       3
  Fluency: grammar                                1       2       3
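The typology above rolls up into a single QA error score by summing penalty points. A minimal sketch of that scoring, normalized to 1,000 words to match the evaluation samples; the function and its normalization rule are assumptions, since the slides show only the penalty weights:

```python
# Penalty points per severity, as in the error typology above.
PENALTY = {"minor": 1, "major": 2, "critical": 3}

def qa_error_score(errors, word_count, per_words=1000):
    """Weighted error score normalized to a fixed word count.

    `errors` is a list of (category, severity) tuples, e.g.
    [("accuracy/mistranslation", "major"), ("fluency/grammar", "minor")].
    """
    total = sum(PENALTY[severity] for _, severity in errors)
    return total * per_words / word_count

# Example: 3 minor + 2 major errors in a 1,000-word sample -> 7.0
sample = ([("fluency/grammar", "minor")] * 3
          + [("accuracy/mistranslation", "major")] * 2)
print(qa_error_score(sample, 1000))  # 7.0
```

Under this scheme a sample scoring under 30 would meet the pilot's quality objective.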
SLIDE 4

Pilot Project Processes & Problems

SLIDE 5

Step 1: File Preparation

  • Convert PDF to DOCX
  • Delete the unnecessary content
  • Delete the extra spaces
  • Make the file clean so alignment is easier

SLIDE 6

Step 2: Testing and Training

  • Round A: TMX only
  • Round B: added related PDFs; BLEU score increased
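BLEU measures n-gram overlap between MT output and a reference translation. The slides do not say which toolkit produced the scores, so here is a minimal single-reference sketch; real toolkits such as sacreBLEU add smoothing, standardized tokenization, and corpus-level aggregation:

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of 1..max_n-gram precisions
    times a brevity penalty. No smoothing, so any zero precision -> 0."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngram_counts(cand, n)
        overlap = sum((cand_ngrams & ngram_counts(ref, n)).values())  # clipped
        total = sum(cand_ngrams.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
```

A perfect match scores 1.0 (often reported as 100, or e.g. 44.03 on the 0-100 scale used later in this deck).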

SLIDE 7

Step 3: First Round of Human Evaluation

❖ 2 post-editors
❖ A sample of 1,000 words extracted from one of the systems
❖ Analyzed and given a quality score first
❖ Average PE time: 53 minutes
SLIDE 8

Step 4: Tuning

At first the system failed to train; after putting the files back into the training data, it worked. :)

SLIDE 9

Step 5: Adding a Dictionary

❖ 512-page IMF glossary
❖ Converted from PDF to DOCX
❖ Cleaned up formats and terms
❖ Added it into the dictionary data
❖ Trained two systems
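Part of the glossary cleanup after the PDF-to-DOCX conversion can be scripted. A hypothetical sketch: the function name, the example pairs, and the specific normalization rules are illustrative assumptions, since the slides do not describe the actual cleanup:

```python
import re

def clean_glossary(raw_pairs):
    """Normalize (source, target) term pairs extracted from a converted PDF:
    collapse whitespace runs, strip stray punctuation left by the conversion,
    and drop empty or duplicate entries."""
    seen, cleaned = set(), []
    for source, target in raw_pairs:
        source = re.sub(r"\s+", " ", source).strip(" .,;")
        target = re.sub(r"\s+", " ", target).strip(" .,;")
        if source and target and source.lower() not in seen:
            seen.add(source.lower())
            cleaned.append((source, target))
    return cleaned

pairs = [("balance  of payments ", "target term"),
         ("balance of payments", "target term"),   # duplicate
         ("", "orphan entry")]                     # empty source
print(clean_glossary(pairs))  # [('balance of payments', 'target term')]
```

Even with scripting, term-level review stays manual, which is why the glossary cleanup is listed below as a time-consuming problem.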
SLIDE 10

Problem

Time consuming glossary clean-up

SLIDE 11

Problem

Failed to add the glossary into dictionary data

SLIDE 12

Problem

Lower BLEU score after adding the glossary

SLIDE 13

Step 6: Final Round of Human Evaluation

❖ 2 post-editors
❖ A sample of 1,000 words extracted from the system with the highest BLEU score (44.03)
❖ Analyzed and given a quality score first
❖ Average PE time: 0.68 hours (40.8 minutes)
SLIDE 14

Problem

Mistranslated and untranslated MT output due to incomplete manual cleanup

SLIDE 15

Pilot Project Results

SLIDE 16

85%

Efficiency

SLIDE 17

71%

Cost

HT:
❖ Translation: $0.12/word
❖ Editing: $0.05/word

PEMT:
❖ Post-editing: $0.05/word
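The 71% figure follows from the per-word rates above, assuming HT cost is translation plus editing:

```python
ht_rate = 0.12 + 0.05   # human translation + editing, $ per word
pemt_rate = 0.05        # post-editing only, $ per word

savings = (ht_rate - pemt_rate) / ht_rate
print(f"Cost savings: {savings:.0%}")  # Cost savings: 71%
```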
SLIDE 18

31.5

Quality

SLIDE 19

Quality

Comparison of two rounds of human evaluation:

                              Round 1    Round 2
  QA error score                49         31.5
  PE time for 1,000 words      53 min     40.8 min

SLIDE 20
SLIDE 21

Lessons Learned

SLIDE 22

PDF formatting cleanup

SLIDE 23

When in doubt, check the system

➔ The system IS objective
➔ TMX IS better than PDF

SLIDE 24

Content relevance is key

SLIDE 25

➔ "Terminology" and "Fluency" performance improved
➔ Further data needs to be collected for assessment accuracy

SLIDE 26

Thank you