Machine Translation
Philipp Koehn 28 April 2020
Philipp Koehn Artificial Intelligence: Machine Translation 28 April 2020
Israeli officials are responsible for airport security.
Israel is in charge of the security at this airport.
The security work for this airport is the responsibility of the Israel government.
Israeli side was in charge of the security of this airport.
Israel is responsible for the airport’s security.
Israel is responsible for safety work at this airport.
Israel presides over the security of the airport.
Israel took charge of the airport security.
The safety of this airport is taken charge of by Israel.
This airport’s security is the responsibility of the Israeli security officials.
[Figure, built up over several slides: analysis and generation connected by lexical transfer, syntactic transfer, and semantic transfer]
parallel corpora, monolingual corpora, dictionaries
[Figure: word-by-word glosses; surviving labels: this claim they at least the she]
– the meaning "the" of "das" not possible (not a noun phrase)
– the meaning "she" of "sie" not possible (subject-verb agreement)
– it refers to movie
– movie translates to Film
– Film has masculine gender
– ergo: it must be translated into masculine pronoun er
Sicherheit → security     14,516
Sicherheit → safety       10,015
Sicherheit → certainty       334
Sicherheitspolitik → security policy         1580
Sicherheitspolitik → safety policy             13
Sicherheitspolitik → certainty policy           0
Lebensmittelsicherheit → food security         51
Lebensmittelsicherheit → food safety         1084
Lebensmittelsicherheit → food certainty         0
Rechtssicherheit → legal security             156
Rechtssicherheit → legal safety                 5
Rechtssicherheit → legal certainty            723
Haus — house, building, home, household, shell.
– some more frequent than others
– for instance: house and building most common
– special cases: Haus of a snail is its shell
Look at a parallel corpus (German text along with English translation)

Translation of Haus    Count
house                  8,000
building               1,600
home                     200
household                150
shell                     50
Maximum likelihood estimation

$$
p_f(e) =
\begin{cases}
0.8   & \text{if } e = \text{house}, \\
0.16  & \text{if } e = \text{building}, \\
0.02  & \text{if } e = \text{home}, \\
0.015 & \text{if } e = \text{household}, \\
0.005 & \text{if } e = \text{shell}.
\end{cases}
$$
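The estimate above is relative frequency: each count divided by the total. A minimal sketch using the counts for Haus from the parallel corpus; the function name `mle` is made up for this illustration.

```python
from collections import Counter

def mle(counts):
    """Maximum likelihood estimate: divide each count by the total count."""
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()}

# Counts of English translations of "Haus" observed in the parallel corpus
haus_counts = Counter({"house": 8000, "building": 1600, "home": 200,
                       "household": 150, "shell": 50})
p_haus = mle(haus_counts)

for e, p in p_haus.items():
    print(f"p_f({e}) = {p:g}")
```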
the words in the other
1 2 3 4 1 2 3 4
position j with a function a ∶ i → j
a ∶ {1 → 1,2 → 2,3 → 3,4 → 4}
Words may be reordered during translation
1 2 3 4 1 2 3 4
a ∶ {1 → 3,2 → 4,3 → 2,4 → 1}
A source word may translate into multiple target words
1 2 3 4 1 2 3 4 5
a ∶ {1 → 1,2 → 2,3 → 3,4 → 4,5 → 4}
Words may be dropped when translated (German article das is dropped)
1 2 3 1 2 3 4
a ∶ {1 → 2,2 → 3,3 → 4}
– The English word "just" does not have an equivalent in German
– We still need to map it to something: special NULL token
NULL
1 2 3 4 1 2 3 4 5
a ∶ {1 → 1,2 → 2,3 → 3,4 → 0,5 → 4}
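An alignment function of this kind can be stored as a plain mapping from English positions to German positions, with position 0 reserved for NULL. The sketch below is only an illustration: the two sentences are assumed (the slide figure with the actual words did not survive), and `aligned_pairs` is a hypothetical helper name.

```python
# Position 0 on the German side is the special NULL token.
german = ["NULL", "das", "Haus", "ist", "klein"]    # assumed example sentence
english = ["the", "house", "is", "just", "small"]   # assumed example sentence

# Alignment function a: English position -> German position (1-based, 0 = NULL)
a = {1: 1, 2: 2, 3: 3, 4: 0, 5: 4}

def aligned_pairs(english, german, a):
    """Return (English word, German word) pairs according to the alignment a."""
    return [(english[j - 1], german[a[j]]) for j in sorted(a)]

pairs = aligned_pairs(english, german, a)
for e, g in pairs:
    print(f"{e} -> {g}")
```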
– IBM Model 1 only uses lexical translation
– for a foreign sentence $f = (f_1, \ldots, f_{l_f})$ of length $l_f$
– to an English sentence $e = (e_1, \ldots, e_{l_e})$ of length $l_e$
– with an alignment of each English word $e_j$ to a foreign word $f_i$ according to the alignment function $a: j \to i$

$$
p(e, a \mid f) = \frac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)})
$$

– parameter $\epsilon$ is a normalization constant
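das Haus ist klein

das
e          t(e|f)
the        0.7
that       0.15
which      0.075
who        0.05
this       0.025

Haus
e          t(e|f)
house      0.8
building   0.16
home       0.02
household  0.015
shell      0.005

ist
e          t(e|f)
is         0.8
’s         0.16
exists     0.02
has        0.015
are        0.005

klein
e          t(e|f)
small      0.4
little     0.4
short      0.1
minor      0.06
petty      0.04

With $l_f = l_e = 4$, the normalizer $(l_f + 1)^{l_e}$ is $5^4$:

$$
\begin{aligned}
p(e, a \mid f) &= \frac{\epsilon}{5^4} \times t(\text{the} \mid \text{das}) \times t(\text{house} \mid \text{Haus}) \times t(\text{is} \mid \text{ist}) \times t(\text{small} \mid \text{klein}) \\
&= \frac{\epsilon}{5^4} \times 0.7 \times 0.8 \times 0.8 \times 0.4 \\
&\approx 0.00029\,\epsilon
\end{aligned}
$$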
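The IBM Model 1 formula can be evaluated directly. The sketch below follows the general definition with normalizer (l_f + 1)^l_e, which is 5^4 for this four-word sentence pair, and sets ε to 1 so the result is the coefficient of ε. The function name `model1_prob` is made up, and alignments to NULL are not handled.

```python
import math

def model1_prob(english, foreign, a, t, epsilon=1.0):
    """p(e, a | f) = epsilon / (l_f + 1)^l_e * prod_j t(e_j | f_a(j)).

    a maps 1-based English positions to 1-based foreign positions
    (no NULL alignments in this sketch).
    """
    l_e, l_f = len(english), len(foreign)
    lexical = math.prod(t[(e, foreign[a[j] - 1])]
                        for j, e in enumerate(english, start=1))
    return epsilon / (l_f + 1) ** l_e * lexical

# Lexical translation probabilities t(e|f) taken from the tables above
t = {("the", "das"): 0.7, ("house", "Haus"): 0.8,
     ("is", "ist"): 0.8, ("small", "klein"): 0.4}
a = {1: 1, 2: 2, 3: 3, 4: 4}

p = model1_prob("the house is small".split(), "das Haus ist klein".split(), a, t)
print(f"p(e, a | f) = {p:.6f} * epsilon")
```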
parallel corpus
– if we had the alignments, → we could estimate the parameters of our generative model
– if we had the parameters, → we could estimate the alignments
– if we had complete data, we could estimate the model
– if we had the model, we could fill in the gaps in the data
... la maison ... la maison bleu ... la fleur ...
... the house ... the blue house ... the flower ...
... la maison ... la maison bleu ... la fleur ... ... the house ... the blue house ... the flower ...
likely (pigeon hole principle)
... la maison ... la maison bleu ... la fleur ...
... the house ... the blue house ... the flower ...

p(la|the) = 0.453
p(le|the) = 0.334
p(maison|house) = 0.876
p(bleu|blue) = 0.563
...
– parts of the model are hidden (here: alignments)
– using the model, assign probabilities to possible values

– take assigned values as fact
– collect counts (weighted by probabilities)
– estimate model from counts
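These steps can be sketched as EM training of lexical translation probabilities on the toy corpus from the earlier slides. This is a minimal illustration under stated assumptions: it estimates p(f|e) to match the p(la|the) notation, starts from a uniform model, uses no NULL token, and the name `train_model1` is made up. It is not expected to reproduce the exact numbers shown on the slide.

```python
from collections import defaultdict

def train_model1(corpus, iterations=50):
    """EM for lexical translation probabilities t[(f, e)] = p(f|e)."""
    f_vocab = {f for fs, _ in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))   # uniform start
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of (f, e) links
        total = defaultdict(float)   # expected count of links per e
        for fs, es in corpus:
            for f in fs:
                # hidden alignment of f: assign probabilities to possible values
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta   # counts weighted by probabilities
                    total[e] += delta
        # estimate the model from the collected counts
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

corpus = [("la maison".split(), "the house".split()),
          ("la maison bleu".split(), "the blue house".split()),
          ("la fleur".split(), "the flower".split())]
t = train_model1(corpus)

for f, e in [("la", "the"), ("maison", "house"), ("bleu", "blue"), ("fleur", "flower")]:
    print(f"p({f}|{e}) = {t[(f, e)]:.3f}")
```

Each iteration uses the current model to weight the possible alignments, then treats those weighted counts as if they were observed.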
Translation      Probability φ(ē|f̄)
[...]            0.5
naturally        0.3
[...]            0.15
, of course ,    0.05
English           φ(ē|f̄)    English            φ(ē|f̄)
the proposal      0.6227    the suggestions    0.0114
’s proposal       0.1068    the proposed       0.0114
a proposal        0.0341    the motion         0.0091
the idea          0.0250    the idea of        0.0091
this proposal     0.0227    the proposal ,     0.0068
proposal          0.0205    its proposal       0.0068
[...]             0.0159    it                 0.0068
the proposals     0.0159    ...                ...

– lexical variation (proposal vs suggestions)
– morphological variation (proposal vs proposals)
– included function words (the, a, ...)
– noise (it)
p(e∣f)
$$
e_{\text{best}} = \operatorname{argmax}_e \; p(e \mid f)
$$

– the most probable translation is bad → fix the model
– search does not find the most probable translation → fix the search
(although these are often correlated)
er geht ja nicht nach hause
er geht ja nicht nach hause
er → he
er geht ja nicht nach hause
er ja nicht → he does not
– it is allowed to pick words out of sequence (reordering)
– phrases may have multiple words: many-to-many translation
er geht ja nicht nach hause
er geht ja nicht → he does not go
er geht ja nicht nach hause
er geht ja nicht nach hause → he does not go home
er geht ja nicht nach hause
[Figure: translation options listed under the input phrases, including: he, it, goes, go, yes, of course, not, do not, does not, is not, after, to, according to, in, house, home, chamber, at home, return home, it is, he will be, it goes, he goes, is after all, following, not after, not to, are not, ...]
– in Europarl phrase table: 2727 matching phrase pairs for this sentence
– by pruning to the top 20 per phrase, 202 translation options remain
– picking the right translation options
– arranging them in the right order
→ Search problem solved by heuristic beam search
er geht ja nicht nach hause
consult phrase translation table for all input phrases
er geht ja nicht nach hause
initial hypothesis: no input words covered, no output produced
er geht ja nicht nach hause
are
pick any translation option, create new hypothesis
er geht ja nicht nach hause
are it he
create hypotheses for all other translation options
er geht ja nicht nach hause
are it he goes does not yes go to home home
also create hypotheses from created partial hypothesis
er geht ja nicht nach hause
are it he goes does not yes go to home home
backtrack from highest scoring complete hypothesis
– same number of foreign words translated
– same English words in the output
– different scores
[Figure: two hypothesis paths that both end in "it is", next to a single "it is" hypothesis]
[Figure: hypothesis stacks labeled "no word translated", "one word translated", "two words translated", "three words translated"; hypotheses shown: are, it, he, goes, does not, yes]
– translation option is applied to hypothesis
– new hypothesis is dropped into a stack further down
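The stack organization can be sketched as a small beam-search decoder. Everything numeric here is invented for illustration: the toy phrase table, the bigram scores, the distortion penalty, and the penalty for unseen bigrams are assumptions, and the score (phrase log-probability plus bigram score minus a distortion penalty) is one simple choice, not necessarily the model used in the lecture. Hypotheses store their full output, so no separate backtracking step is needed.

```python
import math
from collections import namedtuple

# score: log score so far; coverage: bitmask of translated source words;
# end: position after the last translated phrase; output: target words so far
Hyp = namedtuple("Hyp", "score coverage end output")

def decode(src, phrase_table, bigram_logp, beam=10, distortion=0.1, unseen=-10.0):
    n = len(src)
    # consult the phrase translation table for all input phrases
    options = []
    for i in range(n):
        for j in range(i + 1, n + 1):
            for tgt, p in phrase_table.get(tuple(src[i:j]), []):
                options.append((i, j, tuple(tgt.split()), math.log(p)))
    # stacks[k] holds hypotheses that have translated k source words
    stacks = [{} for _ in range(n + 1)]
    stacks[0][(0, 0, "<s>")] = Hyp(0.0, 0, 0, ("<s>",))   # initial hypothesis
    for k in range(n):
        # pruning: expand only the best `beam` hypotheses of this stack
        for h in sorted(stacks[k].values(), key=lambda h: -h.score)[:beam]:
            for i, j, tgt, logp in options:
                mask = ((1 << j) - 1) ^ ((1 << i) - 1)    # bits i..j-1
                if h.coverage & mask:
                    continue                              # words already translated
                score = h.score + logp - distortion * abs(i - h.end)
                prev = h.output[-1]
                for w in tgt:
                    score += bigram_logp.get((prev, w), unseen)
                    prev = w
                new = Hyp(score, h.coverage | mask, j, h.output + tgt)
                # recombination: same coverage, end position and last word
                key = (new.coverage, new.end, prev)
                stack = stacks[k + j - i]                 # a stack further down
                if key not in stack or stack[key].score < new.score:
                    stack[key] = new
    if not stacks[n]:
        return None
    best = max(stacks[n].values(), key=lambda h: h.score)
    return " ".join(best.output[1:])

# Toy phrase table and bigram model (all numbers made up)
phrase_table = {
    ("er",): [("he", 0.6), ("it", 0.4)],
    ("geht",): [("goes", 0.6), ("go", 0.4)],
    ("ja",): [("yes", 1.0)],
    ("nicht",): [("not", 1.0)],
    ("ja", "nicht"): [("does not", 0.7), ("is not", 0.3)],
    ("nach",): [("to", 0.6), ("after", 0.4)],
    ("hause",): [("house", 0.6), ("home", 0.4)],
    ("nach", "hause"): [("home", 1.0)],
}
bigram_logp = {("<s>", "he"): math.log(0.5), ("he", "does"): math.log(0.5),
               ("does", "not"): math.log(0.8), ("not", "go"): math.log(0.5),
               ("go", "home"): math.log(0.5), ("he", "goes"): math.log(0.3)}

result = decode("er geht ja nicht nach hause".split(), phrase_table, bigram_logp)
print(result)
```

Keying each stack by coverage, end position, and last output word means two hypotheses that are indistinguishable for future scoring are merged and only the better-scoring one is kept.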
[Figure: phrase structure grammar tree for an English sentence (as produced by Collins’ parser). Part-of-speech tags: PRP MD VB VBG RP TO PRP DT NNS; constituent labels: NP-A, PP, VP-A, VP-A, VP-A, S]
NP → DET JJ NN
NP → DET NN JJ
NP → DET₁ NN₂ JJ₃ ∣ DET₁ JJ₃ NN₂
N → maison ∣ house
NP → la maison bleue ∣ the blue house
NP → la maison JJ₁ ∣ the JJ₁ house
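A synchronous rule rewrites both sides at once, and the index on a nonterminal ties the two sides together. A minimal sketch: the data layout, the helper name `expand`, and the rule JJ → bleue ∣ blue are assumptions made for this illustration.

```python
# Each rule has a source side and a target side. Words are strings;
# nonterminals are (label, index) tuples shared between the two sides.
RULES = {
    "NP": [(["la", "maison", ("JJ", 1)], ["the", ("JJ", 1), "house"])],
    "JJ": [(["bleue"], ["blue"])],   # assumed lexical rule
}

def expand(symbol, rules):
    """Expand symbol with its first rule; return (source words, target words)."""
    src_rhs, tgt_rhs = rules[symbol][0]
    # expand every indexed nonterminal once, so both sides use the same sub-derivation
    subs = {item: expand(item[0], rules)
            for item in src_rhs if isinstance(item, tuple)}
    src = [w for item in src_rhs
           for w in (subs[item][0] if isinstance(item, tuple) else [item])]
    tgt = [w for item in tgt_rhs
           for w in (subs[item][1] if isinstance(item, tuple) else [item])]
    return src, tgt

source, target = expand("NP", RULES)
print(" ".join(source), "|", " ".join(target))
```

The adjective follows the noun on the source side and precedes it on the target side, so the reordering is carried by the rule itself.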
German input sentence with tree
Sie/PPER will/VAFIN eine/ART Tasse/NN Kaffee/NN trinken/VVINF (constituent labels: NP, VP, S)

[Figure sequence: chart entries marked ➊ to ➏ are added above the German tree, one slide at a time]
– Purely lexical rule: filling a span with a translation (a constituent in the chart); entries shown: PRO she, NN coffee, VB drink
– Complex rule: matching underlying constituent spans, and covering words; entries shown: NN cup, IN, DET a, NN coffee, with nodes NP PP NN NP
– Complex rule with reordering; entries shown: VBZ wants, TO to, with nodes VB VP VP NP
– Last slide: nodes S, PRO, VP added on top
$$
p(W) = p(w_1, w_2, \ldots, w_n)
$$

$$
p(W) = \prod_i p(w_i \mid w_1, \ldots, w_{i-1})
$$

$$
p(w_i \mid w_1, \ldots, w_{i-1}) \simeq p(w_i \mid w_{i-4}, w_{i-3}, w_{i-2}, w_{i-1})
$$

→ we back off to $p(w_i \mid w_{i-3}, w_{i-2}, w_{i-1})$, $p(w_i \mid w_{i-2}, w_{i-1})$, etc., all the way to $p(w_i)$
– exact details of backing off get complicated: "interpolated Kneser-Ney"
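The back-off idea can be sketched with relative-frequency estimates and a fixed penalty per back-off step. This is deliberately not interpolated Kneser-Ney: it is the much simpler fixed-penalty scheme sometimes called "stupid backoff", its scores are not normalized probabilities, and it uses trigrams instead of 5-grams to keep the toy example small. The names `train` and `score` and the factor 0.4 are choices made for this illustration.

```python
from collections import Counter

def train(sentences, order=3):
    """Count all n-grams up to `order`, plus how often each history occurs."""
    ngrams, contexts = Counter(), Counter()
    for sentence in sentences:
        toks = ["<s>"] * (order - 1) + sentence.split() + ["</s>"]
        for i in range(order - 1, len(toks)):
            for n in range(1, order + 1):
                gram = tuple(toks[i - n + 1 : i + 1])
                ngrams[gram] += 1
                contexts[gram[:-1]] += 1
    return ngrams, contexts

def score(word, history, ngrams, contexts, alpha=0.4):
    """Relative frequency of word given history; back off to shorter histories."""
    ctx, penalty = tuple(history), 1.0
    while True:
        if ngrams[ctx + (word,)] > 0:
            return penalty * ngrams[ctx + (word,)] / contexts[ctx]
        if not ctx:
            return 0.0                       # word never seen at all
        ctx, penalty = ctx[1:], penalty * alpha   # drop the oldest history word

ngrams, contexts = train(["the house is small", "the house is big"])
print(score("is", ("the", "house"), ngrams, contexts))    # seen trigram
print(score("big", ("the", "is"), ngrams, contexts))      # backs off to the bigram
```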
– dog = (0,0,0,0,1,0,0,0,0,....)
– cat = (0,0,0,0,0,0,0,1,0,....)
– eat = (0,1,0,0,0,0,0,0,0,....)
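Such vectors can be built from a word's index in the vocabulary. In the sketch below the nine-word vocabulary is made up; only the positions of eat, dog, and cat are chosen to match the vectors above.

```python
def one_hot(word, vocab):
    """Vector of zeros with a single 1 at the word's vocabulary index."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

# Hypothetical vocabulary; placeholder words fill the other positions
vocab = ["w0", "eat", "w2", "w3", "dog", "w5", "w6", "cat", "w8"]

for word in ["dog", "cat", "eat"]:
    print(word, one_hot(word, vocab))
```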
– input→embedding: none
– embedding→hidden: tanh
– hidden→output: softmax

– loop through the entire corpus
– update between predicted probabilities and 1-hot vector for output word
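The three layers and their activation functions can be sketched as a forward pass. This is only an illustration under stated assumptions: the layer sizes and random weights are toy values, bias terms are omitted, cross-entropy is assumed as the measure of the gap between the predicted probabilities and the 1-hot target, and the gradient update itself is not shown.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    top = max(z)                                  # subtract max for numerical stability
    exps = [math.exp(x - top) for x in z]
    total = sum(exps)
    return [x / total for x in exps]

def forward(context_ids, E, W_h, W_o):
    # input -> embedding: table lookup, no activation function
    emb = [x for i in context_ids for x in E[i]]
    # embedding -> hidden: tanh
    hidden = [math.tanh(x) for x in matvec(W_h, emb)]
    # hidden -> output: softmax over the vocabulary
    return softmax(matvec(W_o, hidden))

def loss(probs, target_id):
    """Cross-entropy against the 1-hot vector of the correct output word."""
    return -math.log(probs[target_id])

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
V, D, H, N = 6, 3, 4, 2      # vocabulary, embedding, hidden sizes; context words
E, W_h, W_o = rand_matrix(V, D), rand_matrix(H, N * D), rand_matrix(V, H)

probs = forward([1, 4], E, W_h, W_o)   # two context word ids
print(probs, loss(probs, 2))
```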
Word Embedding
– adjectives base form vs. comparative, e.g., good, better
– nouns singular vs. plural, e.g., year, years
– verbs present tense vs. past tense, e.g., see, saw

– clothing is to shirt as dish is to bowl
– evaluated on human judgment data of semantic similarities
copy values
[Figure layers, top to bottom: given word, embedding, hidden state, predicted word]
Predict the first word
Same as before, just drawn top-down
Predict the second word
Re-use hidden state from first word prediction
Predict the third word
... and so on
[Figure layers: given word, embedding, hidden state, predicted word]
Given word:     <s> the house is big . </s> das Haus ist groß .
Predicted word: the house is big . </s> das Haus ist groß . </s>
⇒ Solution: attention mechanism
[Figure layers: input word embeddings, left-to-right recurrent NN, right-to-left recurrent NN]
Encoder States Attention Hidden State Output Words
[Figure layers: encoder states, attention, input context, hidden state, output words]

$$
\alpha_{ij} = \frac{\exp(a(s_{i-1}, h_j))}{\sum_k \exp(a(s_{i-1}, h_k))}
$$
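The attention weights are a softmax over the scores a(s_{i-1}, h_j). The sketch below makes two assumptions that the slide text does not state: the scoring function a is a plain dot product, and the input context is the attention-weighted sum of the encoder states. The names `attention` and `dot` are made up.

```python
import math

def dot(s, h):
    """Assumed scoring function a(s, h): a plain dot product."""
    return sum(x * y for x, y in zip(s, h))

def attention(prev_state, encoder_states, score=dot):
    """Return attention weights alpha_ij and the resulting context vector."""
    scores = [score(prev_state, h) for h in encoder_states]
    top = max(scores)                              # for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]             # softmax over input positions
    dim = len(encoder_states[0])
    # assumed: context = sum_j alpha_ij * h_j
    context = [sum(a * h[d] for a, h in zip(alphas, encoder_states))
               for d in range(dim)]
    return alphas, context

encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy values for h_1..h_3
alphas, context = attention([1.0, 0.0], encoder_states)  # toy value for s_{i-1}
print(alphas, context)
```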
[Figure layers: input word embeddings, left-to-right recurrent NN, right-to-left recurrent NN, attention, input context, hidden state, output words]