Combining Crowd and AI to scale professional-quality translation



slide-1
SLIDE 1

Combining Crowd and AI to scale professional-quality translation

João Graça CTO

slide-2
SLIDE 2

The Internet, 1997: 80% English

slide-3
SLIDE 3

The Internet, 2017

  • English: 30%
  • Chinese: 20%
  • Spanish: 8%
  • Japanese: 6%
  • Portuguese: 5%
  • German: 4%
  • Arabic: 3%

slide-4
SLIDE 4

“All translation firms together are able to translate far less than 1% of relevant content produced every day”

CSA – MT Is Unavoidable to Keep Up with Content Volumes

slide-5
SLIDE 5

Is Machine Translation already here?

* Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions. Marcin Junczys-Dowmunt, Tomasz Dwojak, Hieu Hoang

Everyone agrees that NMT is here to stay and is much better than SMT

slide-6
SLIDE 6

Unbabel Experiments with Customer Service Tickets

[Bar chart: Bleu for Moses (generic), Neural MT (generic), Moses (adapted) and Neural MT (adapted); values shown: 49,9 43,5 35,7 29,6]

slide-7
SLIDE 7

Is NMT really enough?

* A Neural Network for Machine Translation, at Production Scale. Google Blog

slide-8
SLIDE 8

Quality per Job

[Chart: axes MQM and Job; ticks 0% to 100% and 6 to 30; legend: Good, Not sure, Bad]

slide-9
SLIDE 9

Will AI solve translation?

[Chart: MACHINE ONLY; axes: TIME / JOBS and QUALITY; reference level: MQM 95]

slide-10
SLIDE 10

Will AI solve translation?

[Chart: HUMAN EFFORT; axes: TIME / JOBS; reference level: MQM 95]

slide-11
SLIDE 11

Quality per Job

[Chart: axes MQM and Job; ticks 0% to 100% and 6 to 30; legend: Good, Not sure, Bad]

slide-12
SLIDE 12

Unbabel Pipeline

slide-13
SLIDE 13

Job

Unbabel Pipeline

slide-14
SLIDE 14

Job Machine Translation

Unbabel Pipeline

slide-15
SLIDE 15

Job Machine Translation

Unbabel Pipeline

slide-16
SLIDE 16

Job Machine Translation

Unbabel Pipeline

Q.E.

Quality Estimation

slide-17
SLIDE 17

Unbabel Pipeline

[Diagram: Job, Machine Translation, Quality Estimation (Q.E.); High Q.E. leads to Customer]

slide-18
SLIDE 18

Unbabel Pipeline

[Diagram: Job, Machine Translation, Quality Estimation (Q.E.); High Q.E. leads to Customer; Low Q.E. leads to Community Translators and Re-Eval]

slide-19
SLIDE 19

Job Machine Translation Customer

Q.E.

Quality Estimation Community Translators Customer

Q.E.

Quality Estimation

Data Generation Engine

slide-20
SLIDE 20

Before

  • Initial text
  • Submitted text
  • NO DATA POINTS

After

  • Initial text
  • Submitted text
  • All changes in between: mouse clicks, key presses, timestamps

Data Generation Engine

slide-21
SLIDE 21

Raw data

At 18:03:30: In nugget 3 mouseClick Cursor at 16 Selected: 0
At 18:03:31: In nugget 3 Pressed Backspace Cursor at 16 Selected: 0
At 18:03:31: In nugget 3 Pressed Backspace Cursor at 15 Selected: 0
At 18:03:31: In nugget 3 Pressed Backspace
At 18:03:35: In nugget 3 Pressed Shift Cursor at 25 Selected: 0
At 18:03:35: In nugget 3 Pressed s Cursor at 25 Selected: 0
At 18:03:35: In nugget 3 Pressed i Cursor at 26 Selected: 0
At 18:03:35: In nugget 3 Pressed e

Processed information

Initial text: “Espero que esto es útil”

  • Deleted word “es”
  • Inserted word “sea”

Submitted text: “Espero que esto sea útil”

Keystroke Analysis
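
As a rough illustration of the first step from raw data to processed information, the sketch below parses log lines of the shape shown on this slide into structured events. The regular expression, the function name `parse_events` and the field names are assumptions inferred from the visible log text, not Unbabel's actual format or code.

```python
import re
from collections import Counter

# Pattern inferred from the log lines on the slide (an assumption). The
# trailing "Cursor at N Selected: M" part is optional because some of the
# visible entries do not carry it.
EVENT = re.compile(
    r"At (?P<time>\d{2}:\d{2}:\d{2}): In nugget (?P<nugget>\d+) "
    r"(?P<action>mouseClick|Pressed \S+)"
    r"(?: Cursor at (?P<cursor>\d+) Selected: (?P<selected>\d+))?"
)


def parse_events(raw):
    """Turn a flat log string into a list of event dictionaries."""
    events = []
    for match in EVENT.finditer(raw):
        event = match.groupdict()
        # Convert numeric fields; missing optional fields stay None.
        for key in ("nugget", "cursor", "selected"):
            if event[key] is not None:
                event[key] = int(event[key])
        events.append(event)
    return events


raw_log = (
    "At 18:03:30: In nugget 3 mouseClick Cursor at 16 Selected: 0 "
    "At 18:03:31: In nugget 3 Pressed Backspace Cursor at 16 Selected: 0 "
    "At 18:03:31: In nugget 3 Pressed Backspace Cursor at 15 Selected: 0 "
    "At 18:03:31: In nugget 3 Pressed Backspace"
)
events = parse_events(raw_log)
action_counts = Counter(event["action"] for event in events)
print(action_counts)
```

Structured events like these could then be aggregated per nugget or per editor; how the deck's system actually derives word-level edits from them is not stated on the slide.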

slide-22
SLIDE 22

[Diagram: Job, Machine Translation, Quality Estimation (Q.E.); High Q.E. leads to Customer; Low Q.E. leads to Crowd and Re-Eval; stages: MACHINE, QE, COMMUNITY]

Unbabel Pipeline
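
The routing shown in the pipeline diagram can be sketched as a small loop: machine-translate, estimate quality, deliver if the estimate is high, otherwise send to the community and re-evaluate. This is a minimal sketch under stated assumptions: the callables `machine_translate`, `estimate_quality` and `post_edit`, the `threshold` of 0.9 and the `max_rounds` limit are all hypothetical, and the slides do not say what happens when quality never reaches the threshold.

```python
def run_pipeline(source, machine_translate, estimate_quality, post_edit,
                 threshold=0.9, max_rounds=3):
    """Route a job through MT, QE and community post-editing.

    Returns the final text and the list of stages the job went through.
    The threshold and round limit are illustrative values only.
    """
    text = machine_translate(source)
    trail = ["MT"]
    for _ in range(max_rounds):
        if estimate_quality(source, text) >= threshold:
            trail.append("customer")  # High Q.E.: deliver
            return text, trail
        text = post_edit(source, text)  # Low Q.E.: community edits, then re-eval
        trail.append("community")
    trail.append("round-limit")  # assumption: stop after max_rounds edits
    return text, trail


# Stub components standing in for real MT, QE and human editing.
def demo_mt(source):
    return "draft"


def demo_qe(source, text):
    return 0.95 if text == "edited" else 0.2


def demo_edit(source, text):
    return "edited"


result_text, result_trail = run_pipeline("hello", demo_mt, demo_qe, demo_edit)
print(result_text, result_trail)
```

Passing the three stages in as callables keeps the routing logic separate from any particular MT engine or QE model.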

slide-23
SLIDE 23

Translation Memory Job Result MT Router Customer MT Customer APE

Machine Translation Pipeline

slide-24
SLIDE 24

Phrase-based MT Neural MT

Machine Translation Models

slide-25
SLIDE 25

Customer Support Tickets

[Bar chart, Bleu (axis 30 to 60); bar labels as extracted, partly lost: Google NMT, NMT, 1 K, 5 K, 1 K, 2 k, 2 5 K; values shown: 51,7 48,8 47,2 46,9 43,2 34,1 43,1]

Customer Adaptation

slide-26
SLIDE 26

Word-Level QE: Which words are translated correctly/incorrectly?

Sentence-Level QE: How good is the entire translation?

Quality Estimation

slide-27
SLIDE 27

Word-level QE example

Hey lá , eu sou pesaroso sobre aquele !

OK OK OK OK BAD BAD BAD BAD BAD

Quality Estimation

slide-28
SLIDE 28

Unbabel Ticket Bad translation Good translation

source MT final

QE Training
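
One common way to turn (source, MT, final) triples into word-level training labels is to align the MT output with the final post-edited text and mark every MT word that survives as OK and every other word as BAD. The sketch below does this with Python's `difflib`, which is a stand-in for whatever alignment Unbabel actually uses; the function name `word_tags` is hypothetical, and the example sentence is borrowed from the keystroke slide earlier in the deck.

```python
import difflib


def word_tags(mt, final):
    """Label each MT word OK if it is kept in the final text, else BAD."""
    mt_words, final_words = mt.split(), final.split()
    tags = ["BAD"] * len(mt_words)
    matcher = difflib.SequenceMatcher(a=mt_words, b=final_words, autojunk=False)
    # Matching blocks are runs of words present, in order, in both texts.
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            tags[i] = "OK"
    return list(zip(mt_words, tags))


labels = word_tags("Espero que esto es útil", "Espero que esto sea útil")
print(labels)
```

Here only "es" is tagged BAD, because it is the one MT word the editor replaced.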

slide-29
SLIDE 29

QE Word Level

slide-30
SLIDE 30

F1_MULT by system (chart axis 15 to 60), no values shown yet: Ugent System, LinearQE, NeuralQE, StackedQE, APE-QE, FullStackedQE

QE Word Level

slide-31
SLIDE 31

F1_MULT by system (chart axis 15 to 60):

  • Ugent System: 41,1

Not yet shown: LinearQE, NeuralQE, StackedQE, APE-QE, FullStackedQE

QE Word Level

slide-32
SLIDE 32

F1_MULT by system (chart axis 15 to 60):

  • Ugent System: 41,1
  • LinearQE: 45,2

Not yet shown: NeuralQE, StackedQE, APE-QE, FullStackedQE

QE Word Level

slide-33
SLIDE 33

F1_MULT by system (chart axis 15 to 60):

  • Ugent System: 41,1
  • LinearQE: 45,2
  • NeuralQE: 47,3

Not yet shown: StackedQE, APE-QE, FullStackedQE

QE Word Level

slide-34
SLIDE 34

F1_MULT by system (chart axis 15 to 60):

  • Ugent System: 41,1
  • LinearQE: 45,2
  • NeuralQE: 47,3
  • StackedQE: 50,3

Not yet shown: APE-QE, FullStackedQE

WMT 16 WL QE Winner

QE Word Level

slide-35
SLIDE 35

F1_MULT by system (chart axis 15 to 60):

  • Ugent System: 41,1
  • LinearQE: 45,2
  • NeuralQE: 47,3
  • StackedQE: 50,3
  • APE-QE: 55,7

Not yet shown: FullStackedQE

WMT 16 WL QE Winner

QE Word Level

slide-36
SLIDE 36

F1_MULT by system (chart axis 15 to 60):

  • Ugent System: 41,1
  • LinearQE: 45,2
  • NeuralQE: 47,3
  • StackedQE: 50,3
  • APE-QE: 55,7
  • FullStackedQE: 57,5

Andre F. T. Martins, Marcin Junczys-Dowmunt, Fabio Kepler, Ramon Astudillo, Chris Hokamp, Roman Grundkiewicz. “Pushing the Limits of Translation Quality Estimation.” TACL 2017 (To Appear)

WMT 16 WL QE Winner

QE Word Level
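
For readers unfamiliar with the metric on these charts: in the WMT16 word-level QE task, F1_MULT is the product of the F1 score on the OK class and the F1 score on the BAD class, so a system cannot score well by predicting only the majority label. The sketch below computes it from two tag sequences; the function names and the predicted tags are illustrative assumptions, the gold tags are the ones from the word-level QE example slide, and the charts appear to report the value multiplied by 100.

```python
def f1_for_label(gold, pred, label):
    """F1 score for one class, given parallel lists of gold and predicted tags."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_mult(gold, pred):
    """Product of the OK-class and BAD-class F1 scores."""
    return f1_for_label(gold, pred, "OK") * f1_for_label(gold, pred, "BAD")


# Gold tags from the word-level example slide; predictions are made up.
gold_tags = ["OK", "OK", "OK", "OK", "BAD", "BAD", "BAD", "BAD", "BAD"]
pred_tags = ["OK", "OK", "OK", "BAD", "BAD", "BAD", "BAD", "BAD", "OK"]
score = f1_mult(gold_tags, pred_tags)
print(round(score, 3))
```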

slide-37
SLIDE 37

QE Sentence Level

slide-38
SLIDE 38

Pearson Correlation by system (chart axis 17,5 to 70), no values shown yet: YANDEX, StackedQE, APE-QE, FullStackedQE

QE Sentence Level

slide-39
SLIDE 39

Pearson Correlation by system (chart axis 17,5 to 70):

  • YANDEX: 52,5

Not yet shown: StackedQE, APE-QE, FullStackedQE

QE Sentence Level

slide-40
SLIDE 40

Pearson Correlation by system (chart axis 17,5 to 70):

  • YANDEX: 52,5
  • StackedQE: 54,9

Not yet shown: APE-QE, FullStackedQE

QE Sentence Level

slide-41
SLIDE 41

Pearson Correlation by system (chart axis 17,5 to 70):

  • YANDEX: 52,5
  • StackedQE: 54,9
  • APE-QE: 61,3

Not yet shown: FullStackedQE

QE Sentence Level

slide-42
SLIDE 42

Pearson Correlation by system (chart axis 17,5 to 70):

  • YANDEX: 52,5
  • StackedQE: 54,9
  • APE-QE: 61,3
  • FullStackedQE: 65,6

QE Sentence Level

Andre F. T. Martins, Marcin Junczys-Dowmunt, Fabio Kepler, Ramon Astudillo, Chris Hokamp, Roman Grundkiewicz. “Pushing the Limits of Translation Quality Estimation.” TACL 2017 (To Appear)
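
Sentence-level QE systems are compared here by the Pearson correlation between predicted and reference quality scores. A minimal sketch of that computation follows; the function name and the sample scores are hypothetical, and the chart values appear to be the coefficient multiplied by 100.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Made-up sentence scores: model predictions versus reference scores.
predicted = [0.10, 0.40, 0.35, 0.80]
reference = [0.00, 0.50, 0.30, 0.90]
print(round(pearson(predicted, reference), 3))
```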

slide-43
SLIDE 43

QE in the Pipeline

slide-44
SLIDE 44

[Diagram: Job, Machine Translation, Quality Estimation (Q.E.); High Q.E. leads to Customer]

QE in the Pipeline

slide-45
SLIDE 45

[Diagram: Job, Machine Translation, Quality Estimation (Q.E.); High Q.E. leads to Customer]

Document-Level QE: How good is the entire document?

QE in the Pipeline

slide-46
SLIDE 46

[Diagram: Job, Machine Translation, Quality Estimation (Q.E.); High Q.E. leads to Customer; Low Q.E. leads to Community Translators and Re-Eval]

Document-Level QE: How good is the entire document?

QE in the Pipeline

slide-47
SLIDE 47

[Diagram: Job, Machine Translation, Quality Estimation (Q.E.); High Q.E. leads to Customer; Low Q.E. leads to Community Translators and Re-Eval]

Document-Level QE: How good is the entire document?

Human QE: Can we evaluate post-edit output?

Interesting numbers coming soon

QE in the Pipeline

slide-48
SLIDE 48

Goals: Quality, Speed, Cost

Professional Translation

slide-49
SLIDE 49

Goals: Quality, Speed, Cost

Pillars

Professional Translation

slide-50
SLIDE 50

Goals: Quality, Speed, Cost

Pillars

Professional Translation

slide-51
SLIDE 51

Goals: Quality, Speed, Cost

Pillars

  • Editors Pool

Professional Translation

slide-52
SLIDE 52

Goals: Quality, Speed, Cost

Pillars

  • Editors Pool
  • Initial Text (MT)

Professional Translation

slide-53
SLIDE 53

Goals: Quality, Speed, Cost

Pillars

  • Editors Pool
  • Initial Text (MT)
  • Editor Assignment

Professional Translation

slide-54
SLIDE 54

Goals: Quality, Speed, Cost

Pillars

  • Editors Pool
  • Initial Text (MT)
  • Editor Assignment
  • Interfaces

Professional Translation

slide-55
SLIDE 55

Goals: Quality, Speed, Cost

Pillars

  • Editors Pool
  • Initial Text (MT)
  • Editor Assignment
  • Interfaces
  • Quality Evaluation

Professional Translation

slide-56
SLIDE 56

Unbabel Community

slide-57
SLIDE 57

50 000 Users

Unbabel Community

slide-58
SLIDE 58

Quality Estimation

Distributed Pipeline

slide-59
SLIDE 59

Expert

Specialisation layers will grow with time

Testing Phase

Editors are tested when they sign up

Training Content

Editors get ratings for the tasks

Paid Content

The best editors have access to paid content

Editors Pool

slide-60
SLIDE 60

Expert

Specialisation layers will grow with time

Testing Phase

Editors are tested when they sign up

Training Content

Editors get ratings for the tasks

Paid Content

The best editors have access to paid content

Evaluators

Editors Pool

slide-61
SLIDE 61

Expert

Specialisation layers will grow with time

Testing Phase

Editors are tested when they sign up

Training Content

Editors get ratings for the tasks

Paid Content

The best editors have access to paid content

Evaluators Annotators

Editors Pool

slide-62
SLIDE 62

How Editors are Evaluated

slide-63
SLIDE 63

Editors Profiling

slide-64
SLIDE 64

Editors Profiling

slide-65
SLIDE 65

Editor Assignment

slide-66
SLIDE 66

Tasks/time

6 m 2 m 10 m 12 m 18 m 45 m

Editor Assignment

slide-67
SLIDE 67

Editors Tasks/time

6 m 2 m 10 m 12 m 18 m 45 m

Editor Assignment

slide-68
SLIDE 68

Pull

Editors Tasks/time

6 m 2 m 10 m 12 m 18 m 45 m

Editor Assignment

slide-69
SLIDE 69

Pull

30 m 6 H 2 D 6 D 20 m 40 m

SLA Editors Tasks/time

6 m 2 m 10 m 12 m 18 m 45 m

Editor Assignment

slide-70
SLIDE 70

Pull

30 m 6 H 2 D 6 D 20 m 40 m

SLA

1100 1000 1000 1000 1100 1100

Priority Editors Tasks/time

6 m 2 m 10 m 12 m 18 m 45 m

Editor Assignment

slide-71
SLIDE 71

Pull

30 m 6 H 2 D 6 D 20 m 40 m

SLA

1100 1000 1000 1000 1100 1100

Priority Editors Queue

G G G G R R

Tasks/time

6 m 2 m 10 m 12 m 18 m 45 m

Editor Assignment

slide-72
SLIDE 72

Pull

30 m 6 H 2 D 6 D 20 m 40 m

SLA

1100 1000 1000 1000 1100 1100

Priority Editors Rating

4.2 3.8 4.3 4.8

Queue

G G G G R R

Tasks/time

6 m 2 m 10 m 12 m 18 m 45 m

Editor Assignment

slide-73
SLIDE 73

Pull

30 m 6 H 2 D 6 D 20 m 40 m

SLA

1100 1000 1000 1000 1100 1100

Priority Editors Rating

4.2 3.8 4.3 4.8

Native Queue

G G G G R R

Tasks/time

6 m 2 m 10 m 12 m 18 m 45 m

Editor Assignment

slide-74
SLIDE 74

Pull

30 m 6 H 2 D 6 D 20 m 40 m

SLA

1100 1000 1000 1000 1100 1100

Priority Editors Rating

4.2 3.8 4.3 4.8

Native Queue

G G G G R R

Tasks/time

6 m 2 m 10 m 12 m 18 m 45 m

Topics

Editor Assignment

slide-75
SLIDE 75

Tasks (attributes: Tasks/time, SLA, Priority, Queue, Topics)

Tasks/time   SLA    Priority   Queue
6 m          30 m   1100       G
2 m          6 H    1000       G
10 m         2 D    1000       G
12 m         6 D    1000       G
18 m         20 m   1100       R
45 m         40 m   1100       R

Pull

Editors (attributes: Rating, Native, Topics)

Rating: 4.2, 3.8, 4.3, 4.8

Editor Assignment
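
A pull-based assignment like the one pictured can be sketched with a priority queue: tasks are ordered by priority and SLA, and an editor pulls the best task that matches their topics. Everything below is a hypothetical illustration: the ordering rule (higher priority first, then tighter SLA), the topic-matching rule, the field names and the sample topics are assumptions, not Unbabel's actual assignment logic, which also appears to use rating and native language.

```python
import heapq


def build_queue(tasks):
    """Build a heap ordered by priority (high first), then SLA (tight first)."""
    heap = [(-t["priority"], t["sla_minutes"], t["id"], t) for t in tasks]
    heapq.heapify(heap)
    return heap


def pull(heap, editor_topics):
    """Give the editor the best-ranked task whose topic they cover, if any."""
    skipped, chosen = [], None
    while heap:
        entry = heapq.heappop(heap)
        if entry[3]["topic"] in editor_topics:
            chosen = entry[3]
            break
        skipped.append(entry)
    # Put back the tasks this editor could not take.
    for entry in skipped:
        heapq.heappush(heap, entry)
    return chosen


# Priorities and SLAs echo values on the slide; ids and topics are made up.
tasks = [
    {"id": "A", "priority": 1100, "sla_minutes": 30, "topic": "billing"},
    {"id": "B", "priority": 1000, "sla_minutes": 360, "topic": "travel"},
    {"id": "C", "priority": 1100, "sla_minutes": 20, "topic": "travel"},
]
queue = build_queue(tasks)
first = pull(queue, {"travel"})
second = pull(queue, {"billing", "travel"})
third = pull(queue, {"billing"})
print(first, second, third)
```

Task ids are unique, so tuple comparison in the heap never falls through to comparing the task dictionaries themselves.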

slide-76
SLIDE 76

Editor Assignment

slide-77
SLIDE 77

Regular distribution

3.8

Old rating

Editor Assignment

slide-78
SLIDE 78

Regular distribution

3.8

Old rating

Smart distribution

4.6

Improved rating

Editor Assignment

slide-79
SLIDE 79

Post-Editing Interfaces

slide-80
SLIDE 80

Time Spent on Job

TIME

DELIVERY

MT

slide-81
SLIDE 81

Time Spent on Job

WAITING

Translator 1 TIME

DELIVERY

MT

slide-82
SLIDE 82

Time Spent on Job

WAITING

Translator 1 TIME

DELIVERY

MT Translator 2

WAITING

slide-83
SLIDE 83

Time Spent on Job: Mobile

WAITING

Translator 1 TIME

DELIVERY

MT Translator 2

WAITING

20% less

slide-84
SLIDE 84

Time Spent on Job

slide-85
SLIDE 85

Post-Editing Interfaces

slide-86
SLIDE 86

Smartcheck

slide-87
SLIDE 87

Spelling Tone Formality Consistency

External NLP Services

Spell Check Syntax Parser Word Aligner

Client Rule

Smartcheck

slide-88
SLIDE 88

Spelling Tone Formality Consistency

External NLP Services

Spell Check Syntax Parser Word Aligner

Client Rule

Smartcheck

slide-89
SLIDE 89

Spelling Tone Formality Consistency

External NLP Services

Spell Check Syntax Parser Word Aligner

Client Rule

Smartcheck

Annotation Tool

Eval

Annotated

slide-90
SLIDE 90

Spelling Tone Formality Consistency

External NLP Services

Spell Check Syntax Parser Word Aligner

Client Rule

Smartcheck

Annotation Tool

Eval

Annotated

Learn

slide-91
SLIDE 91

MACHINE + HUMAN

CORE API CHAT API CYRANO API

Customer Service Conversational

MESSAGING API

Language OS

Language Engine

Bots

slide-92
SLIDE 92

Tickets can come in many languages.

In Customer Service

slide-93
SLIDE 93

Unbabel for Zendesk

  • Unbabel’s Domain Adapted Machine Translation
  • Distributed Human Translation
  • English-speaking agent
  • Zendesk app connected with Unbabel API

Unbabel can offer the same Customer Satisfaction as native agents

slide-94
SLIDE 94

SPEED: 20 minutes

QUALITY: 94

Customer Replies: Speed & Quality

slide-95
SLIDE 95

Chat

slide-96
SLIDE 96

  • Editors train the MT engine
  • Community of 50K+ translators
  • Skips Humans

Chat Translation Flow

slide-97
SLIDE 97

Chat Messages: Speed & Quality

SPEED: 2 minutes

QUALITY: 90

slide-98
SLIDE 98

What is the future?

[Chart: HUMAN EFFORT; axes: TIME / JOBS; reference level: MQM 95; annotation: 0%?]

slide-99
SLIDE 99

Thank You

joao@unbabel.com