Don’t Until the Final Verb Wait: Reinforcement Learning for Simultaneous Machine Translation

SLIDE 1

Don’t Until the Final Verb Wait: Reinforcement Learning for Simultaneous Machine Translation

Alvin Grissom II, He He, Jordan Boyd-Graber, John Morgan, and Hal Daumé III

http://www.umiacs.umd.edu/~alvin/simtrans

University of Colorado, Boulder
University of Maryland, College Park

SLIDE 2

Real-time Interpretation

[Images: Nuremberg Trials; videoconference]

SLIDE 3

Outline

  • SOV-SVO Simultaneous MT
  • Why this is difficult
  • Verb Prediction
  • Reinforcement Learning
  • Translation System
  • Experiments and Results
  • Related Work
  • Future Work

SLIDE 4

Simultaneous Translation

  • Begin translating before a sentence ends.
  • First introduced on a large scale at the Nuremberg Trials.
  • Requires judgments about when to translate fragments.
  • Skill learned from experience.

SLIDE 5

Simultaneous Machine Translation

  • Most prior approaches have been rule-based.
  • We would like to use machine learning.
  • Difficult for humans, who must learn from experience.

SLIDE 6

SOV-SVO Simultaneous Translation

  • Many languages (e.g., German, Japanese) are verb-final (SOV); others (e.g., English) are SVO.
  • Translator must wait for the verb, or predict it.

SLIDE 7

Verb Prediction

  • Predict the final verb to produce a grammatical sentence.
  • We use language models to predict the main verb and next word in the sentence.

SLIDE 8

Verb Prediction

  • Apple ist zum wertvollsten Konzern aller Zeiten avanciert
  • Nein, mit dem Virus ist es noch lange nicht getan
  • Eine vielbefahrene Brücke in New Jersey wurde grundlos gesperrt
  • Mit Drohen und Interpretieren ist es nicht getan
  • Frankfurter Flughafen für Passagiere weitgehend gesperrt

SLIDE 10

Verb Prediction

  • Build a language model for each verb.
  • For any input text, x, we make a verb prediction:
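
One way to read this step is as an argmax over per-verb language models. The sketch below is only an illustration under stated assumptions: it uses add-one-smoothed unigram models and a verb prior, and the names `train_verb_models` and `predict_verb` as well as the three toy training sentences are hypothetical. The deck does not say which kind of language model or scoring rule the authors used.

```python
import math
from collections import Counter, defaultdict


def train_verb_models(examples):
    """Build one unigram count table per final verb.

    examples: iterable of (prefix_tokens, final_verb) pairs (toy data here).
    """
    counts = defaultdict(Counter)  # verb -> counts of tokens seen before it
    verb_freq = Counter()          # verb -> number of training sentences
    vocab = set()
    for tokens, verb in examples:
        counts[verb].update(tokens)
        verb_freq[verb] += 1
        vocab.update(tokens)
    return counts, verb_freq, vocab


def predict_verb(tokens, model):
    """Return the verb whose model gives the prefix the highest log score."""
    counts, verb_freq, vocab = model
    total_verbs = sum(verb_freq.values())
    best_verb, best_score = None, -math.inf
    for verb, freq in verb_freq.items():
        # Add-one smoothing; the extra +1 reserves mass for unseen tokens.
        denom = sum(counts[verb].values()) + len(vocab) + 1
        score = math.log(freq / total_verbs)  # log prior of the verb
        for tok in tokens:
            score += math.log((counts[verb][tok] + 1) / denom)
        if score > best_score:
            best_verb, best_score = verb, score
    return best_verb, best_score


# Toy training data (assumption, not from the de-news corpus).
examples = [
    ("mit dem zug bin ich nach ulm".split(), "gefahren"),
    ("mit dem bus bin ich nach berlin".split(), "gefahren"),
    ("die brücke wurde grundlos".split(), "gesperrt"),
]
model = train_verb_models(examples)
verb, score = predict_verb("mit dem zug bin ich".split(), model)
print(verb, round(score, 3))
```

The returned score can double as a confidence value, which is the kind of signal a downstream policy could use when deciding whether to trust a prediction.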

SLIDE 11

Verb Prediction

  • Most predictions will be incorrect.
  • Leads to terrible translations.

SLIDE 12

Learn When to Trust Predictions

  • Learn under which circumstances to trust predictions.
  • Translate when confident and wait for more information otherwise.
  • Learn a policy, π, to do this.

SLIDE 13

Reinforcement Learning for Simultaneous Machine Translation

  • State:
      • Observations (words), predictions (next word and verb), and prediction scores.
  • Actions:
      • WAIT for more words.
          • Input: word. Output: none.
      • Translate with predicted VERB.
          • Input: word, verb prediction. Output: translated segment with verb.
      • We can also predict the next word.
      • COMMIT to partial translation.
          • Input: word. Output: translated segment.
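
The state and actions listed above can be sketched as plain data types. This is an illustrative sketch, not the authors' implementation: the names `State`, `Action`, `apply_action`, and `toy_translate` are assumptions, the prediction scores are made-up numbers, and the lookup table stands in for a real translation system. The German input and the English outputs reuse this deck's own example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Action(Enum):
    WAIT = "wait"            # consume another source word, emit nothing
    VERB = "verb"            # translate using the predicted final verb
    NEXT_WORD = "next_word"  # translate using the predicted next word
    COMMIT = "commit"        # translate only what has been observed


@dataclass
class State:
    words: List[str]     # source words observed so far
    predicted_verb: str  # current final-verb prediction
    verb_score: float    # score of that prediction (hypothetical value)
    predicted_next: str  # current next-word prediction
    next_score: float    # score of that prediction (hypothetical value)


def apply_action(
    action: Action, state: State, translate: Callable[[List[str]], str]
) -> Optional[str]:
    """Return the emitted translation, or None when the action is WAIT."""
    if action is Action.WAIT:
        return None
    if action is Action.VERB:
        return translate(state.words + [state.predicted_verb])
    if action is Action.NEXT_WORD:
        return translate(state.words + [state.predicted_next])
    return translate(state.words)


# Lookup table standing in for a translation system (assumption).
TOY_TABLE = {
    ("Mit", "dem", "Zug", "bin", "ich", "nach"): "With the train I'm after",
    ("Mit", "dem", "Zug", "bin", "ich", "nach", "gefahren"): "I traveled by train to",
}


def toy_translate(words: List[str]) -> str:
    return TOY_TABLE[tuple(words)]


state = State("Mit dem Zug bin ich nach".split(), "gefahren", 0.7, "Berlin", 0.4)
for action in (Action.WAIT, Action.COMMIT, Action.VERB):
    print(action.name, "->", apply_action(action, state, toy_translate))
```

Keeping the prediction scores inside the state is what lets a learned policy condition its choice of action on how much the predictions can be trusted.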

SLIDE 14

Input: Mit dem Zug bin ich nach
Verb Prediction: gefahren

Action    Output
WAIT      Ø
COMMIT    With the train I’m after
VERB      I traveled by train to

SLIDE 15

Reward

  • We want to capture translation quality and translation latency.
  • Optimal translations are both accurate and prompt.
  • Incrementalize (sentence-level) BLEU.
  • Score partial translations; sum their scores.

SLIDE 16

Reward from Many Translations

  • Calculate BLEU score each time an action is executed.

SLIDE 17

Reward from Many Translations

  • Sum the weighted scores of the partial translations over the course of the sentence.
  • Higher scores earlier mean a higher final score.
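
The idea of scoring every partial translation and summing can be sketched as follows. This is a simplified illustration, not the authors' reward: a clipped unigram overlap stands in for sentence-level BLEU, all time steps get equal weight, and the names `overlap_score` and `cumulative_reward` are hypothetical. The example outputs are taken from the partial translations shown in the policy comparison figure.

```python
from collections import Counter


def overlap_score(hypothesis: str, reference: str) -> float:
    """Clipped unigram matches divided by reference length.

    A crude stand-in for sentence-level BLEU (assumption).
    """
    hyp = Counter(hypothesis.split())
    ref = Counter(reference.split())
    matched = sum(min(count, ref[tok]) for tok, count in hyp.items())
    return matched / max(1, sum(ref.values()))


def cumulative_reward(partial_outputs, reference):
    """Average the score of the output available after each source word."""
    scores = [overlap_score(out, reference) for out in partial_outputs]
    return sum(scores) / len(scores)


reference = "He went to the store"

# Output available after each of the five source words of
# "Er ist zum Laden gegangen" under two hypothetical policies.
eager = ["He", "He went", "He went to the", "He went to the store", "He went to the store"]
batch = ["", "", "", "", "He went to the store"]

eager_reward = cumulative_reward(eager, reference)
batch_reward = cumulative_reward(batch, reference)
print(eager_reward, batch_reward)
```

Both policies end with the same final translation, but the one that produces correct words earlier accumulates more reward, which is the latency pressure described above.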

SLIDE 18

Policy Comparison

[Figure: comparison of four policies (Psychic, Monotone, Batch, Policy Prediction) on the source sentence “Er ist zum Laden gegangen”. Partial outputs shown over time include “He”, “He went”, “He to the”, “He to the store”, “He to the store went”, “He went to the”, and “He went to the store”, each marked as a good or bad translation. The score axis is labeled β.]

SLIDE 44

Translation System

[Diagram: Deutsch → Translation System → English. New source words are appended to the German input prefix; new translations are appended to the English output prefix.]

SLIDE 49

Translation System

[Figure, built up over several slides: incremental translation in three numbered steps. Translations shown: “He” (1), “It was designed” (2), “It was renovated yesterday” (3). Accumulated output: “He”, then “He was designed”, then “He was designed yesterday”.]

SLIDE 61

Action Sequence Learning

  • 1. Observation: Mit dem Zug. State: Verb: gewesen, Next: und. Action: Wait.
  • 2. Observation: Mit dem Zug bin ich. State: Verb: geliefert, Next: nur. Action: Wait.
  • 3. Observation: Mit dem Zug bin ich nach. State: Verb: gefahren, Next: Berlin. Action: Predict.
  • 4. Observation (prediction): Mit dem Zug bin ich nach … gefahren ... Action: Commit. Output: I traveled by train. Then: Wait.
  • 5. Observation (prediction): Mit dem Zug bin ich nach Ulm … gefahren ... Action: Commit. Output: I traveled by train to Ulm
  • 6. Observation: Mit dem Zug bin ich nach Ulm gefahren. Action: Commit, then STOP. Output: I traveled by train to Ulm.

[Figure labels: “Fixed output”; the fragments “I traveled by train” and “with the train to” appear next to the first Commit.]

Rewards shown: +15 pts, +10 pts, +5 pts

SLIDE 62

Learning When and How

  • Imitation learning
  • Special case of reinforcement learning.
  • Given predictions and prediction-informed translations, discover optimal policies in hindsight.

SLIDE 65

Learning When and How

  • Find optimal policy, $\pi^*$.
  • Set initial policy to optimal policy: $\pi_0 \equiv \pi^*$
  • Until convergence:
      • Generate examples of state-action pairs: $(\pi_t(s), s)$
      • Generate a classifier (apprentice policy) mapping states to actions: $h_t : f(s) \mapsto A$
      • Loss of classifier is negative reward.
      • Interpolate learned classifier with previous iteration’s policy: $\pi_{t+1} = \lambda \pi_t + (1 - \lambda) h_t$
  • Searn (Daumé III et al., 2006)
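
The interpolation step can be read as a stochastic mixture: at each state, follow the previous policy with probability λ and the newly trained classifier otherwise. The sketch below shows only that one step under this reading; the function name `interpolate`, the two constant toy policies, and the value λ = 0.9 are assumptions for illustration, and the full training loop is omitted.

```python
import random


def interpolate(old_policy, classifier, lam, rng):
    """Return a policy that follows old_policy with probability lam
    and the classifier with probability 1 - lam."""
    def mixed_policy(state):
        if rng.random() < lam:
            return old_policy(state)
        return classifier(state)
    return mixed_policy


# Toy stand-ins (assumptions): the optimal policy always commits,
# the learned classifier always waits.
def optimal_policy(state):
    return "COMMIT"


def learned_classifier(state):
    return "WAIT"


rng = random.Random(0)  # seeded so the run is repeatable
policy = interpolate(optimal_policy, learned_classifier, lam=0.9, rng=rng)
actions = [policy(None) for _ in range(1000)]
commit_share = actions.count("COMMIT") / len(actions)
print(commit_share)  # close to lam
```

Repeating the step with a fresh classifier each iteration gradually moves the policy away from the optimal policy, which needs the reference, toward one that depends only on learned classifiers.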

SLIDE 66

Experiments

  • “de-news” corpus
  • Daily news transcriptions (1996-2000).
  • German-English parallel data.
  • Only verb-final sentences.
  • Some compromises with translation system (to ensure verbs appeared).

SLIDE 70

Results

[Plot, built up over several slides: curves for the Batch, Monotone, Optimal, and Searn policies. Axis labels: “% of Sentence” and “Smoothed Average”.]

SLIDE 71

Results: Example

“presented”

SLIDE 72

Related Work

  • Oda et al. (2014) learned segmentations for simultaneous interpretation with greedy search and dynamic programming.
  • Previous rule-based approaches using parsing (Mima et al., 1998; Ryu et al., 2006), rule-based decisions (Wahlster, 2000), phrase-table probabilities (Fujita et al., 2013), pauses in speech (Sakamoto et al., 2013), and word alignments (Ryu et al., 2012).
  • Verb prediction for simultaneous machine translation
  • Matsubara et al. (2000) used pattern matching to predict English verbs for Japanese-English simultaneous machine translation.

SLIDE 73

Future Work

  • Improved verb predictions with more robust models.
  • Improved translation system.
  • Incorporate richer feature space.
  • Predict other components aside from verbs.
  • Use on other languages, e.g., Japanese.
  • Verb/semantic-centric scoring metrics (e.g., MEANT).