Machine Translation
Philipp Koehn 1 December 2015
Philipp Koehn Artificial Intelligence: Machine Translation 1 December 2015
Israeli officials are responsible for airport security.
Israel is in charge of the security at this airport.
The security work for this airport is the responsibility of the Israel government.
Israeli side was in charge of the security of this airport.
Israel is responsible for the airport’s security.
Israel is responsible for safety work at this airport.
Israel presides over the security of the airport.
Israel took charge of the airport security.
The safety of this airport is taken charge of by Israel.
This airport’s security is the responsibility of the Israeli security officials.
Lexical Transfer
Lexical Transfer Syntactic Transfer
Analysis Generation
Lexical Transfer Syntactic Transfer Semantic Transfer
Analysis Generation
parallel corpora, monolingual corpora, dictionaries
this claim they at least the she
– the meaning "the" of das is not possible (not a noun phrase)
– the meaning "she" of sie is not possible (subject-verb agreement)
– it refers to movie
– movie translates to Film
– Film has masculine gender
– ergo: it must be translated into the masculine pronoun er
Sicherheit → security     14,516
Sicherheit → safety       10,015
Sicherheit → certainty       334
Sicherheitspolitik → security policy       1,580
Sicherheitspolitik → safety policy            13
Sicherheitspolitik → certainty policy          0
Lebensmittelsicherheit → food security        51
Lebensmittelsicherheit → food safety       1,084
Lebensmittelsicherheit → food certainty        0
Rechtssicherheit → legal security            156
Rechtssicherheit → legal safety                5
Rechtssicherheit → legal certainty           723
Haus — house, building, home, household, shell.
– some are more frequent than others
– for instance: house and building are most common
– special cases: the Haus of a snail is its shell
Look at a parallel corpus (German text along with English translation)

Translation of Haus    Count
house                  8,000
building               1,600
home                     200
household                150
shell                     50
Maximum likelihood estimation

\[
p_f(e) =
\begin{cases}
0.8   & \text{if } e = \text{house}, \\
0.16  & \text{if } e = \text{building}, \\
0.02  & \text{if } e = \text{home}, \\
0.015 & \text{if } e = \text{household}, \\
0.005 & \text{if } e = \text{shell}.
\end{cases}
\]
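The maximum likelihood estimate here is simply relative frequency: each translation count for Haus divided by the total count (10,000 in the corpus counts given earlier). A minimal sketch in Python; the function name `mle_translation_probs` is chosen for illustration only.

```python
from collections import Counter

def mle_translation_probs(counts):
    """Relative-frequency (maximum likelihood) estimate from translation counts."""
    total = sum(counts.values())
    return {english: count / total for english, count in counts.items()}

# Counts for translations of "Haus" as listed in the parallel-corpus table.
haus_counts = Counter({
    "house": 8000,
    "building": 1600,
    "home": 200,
    "household": 150,
    "shell": 50,
})

p_haus = mle_translation_probs(haus_counts)
for english, prob in p_haus.items():
    print(f"p_Haus({english}) = {prob}")
```

The resulting probabilities sum to one by construction, since every count is divided by the same total.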
the words in the other
1 2 3 4 1 2 3 4
position j with a function a ∶ i → j
a ∶ {1 → 1,2 → 2,3 → 3,4 → 4}
Words may be reordered during translation
1 2 3 4 1 2 3 4
a ∶ {1 → 3,2 → 4,3 → 2,4 → 1}
A source word may translate into multiple target words
1 2 3 4 1 2 3 4 5
a ∶ {1 → 1,2 → 2,3 → 3,4 → 4,5 → 4}
Words may be dropped when translated (German article das is dropped)
1 2 3 1 2 3 4
a ∶ {1 → 2,2 → 3,3 → 4}
– The English word just does not have an equivalent in German
– We still need to map it to something: special NULL token
NULL
1 2 3 4 1 2 3 4 5
a ∶ {1 → 1,2 → 2,3 → 3,4 → 0,5 → 4}
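An alignment function of this kind can be held in a plain dictionary from output positions to input positions, with position 0 reserved for NULL. The sketch below is illustrative only: the sentence pair "das Haus ist klein" / "the house is just small" is an assumption made for the example (the slide text only states that English just has no German equivalent), and `aligned_pairs` is a hypothetical helper name.

```python
NULL = "NULL"

def aligned_pairs(alignment, source_words, target_words):
    """List (target word, source word) pairs for a 1-based alignment.

    Position 0 on the source side stands for the special NULL token.
    """
    source_with_null = [NULL] + source_words
    return [
        (target_words[j - 1], source_with_null[i])
        for j, i in sorted(alignment.items())
    ]

# Assumed example sentences; only the alignment itself is taken from the slide.
german = "das Haus ist klein".split()
english = "the house is just small".split()
a = {1: 1, 2: 2, 3: 3, 4: 0, 5: 4}

pairs = aligned_pairs(a, german, english)
for english_word, german_word in pairs:
    print(f"{english_word} -> {german_word}")
```

The same representation covers reordering, one-to-many translation, and dropped words; only the dictionary values change.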
– IBM Model 1 only uses lexical translation
– for a foreign sentence f = (f_1, ..., f_{l_f}) of length l_f
– to an English sentence e = (e_1, ..., e_{l_e}) of length l_e
– with an alignment of each English word e_j to a foreign word f_i according to the alignment function a : j → i

\[
p(e, a \mid f) = \frac{\epsilon}{(l_f + 1)^{l_e}} \prod_{j=1}^{l_e} t(e_j \mid f_{a(j)})
\]

– parameter ε is a normalization constant
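das Haus ist klein

das               Haus                  ist               klein
e       t(e|f)    e          t(e|f)     e       t(e|f)    e       t(e|f)
the     0.7       house      0.8        is      0.8       small   0.4
that    0.15      building   0.16       's      0.16      little  0.4
which   0.075     home       0.02       exists  0.02      short   0.1
who     0.05      household  0.015      has     0.015     minor   0.06
this    0.025     shell      0.005      are     0.005     petty   0.04

With l_f = 4 and l_e = 4, the normalization term is (l_f + 1)^{l_e} = 5^4:

\[
\begin{aligned}
p(e, a \mid f) &= \frac{\epsilon}{5^4} \times t(\text{the} \mid \text{das}) \times t(\text{house} \mid \text{Haus}) \times t(\text{is} \mid \text{ist}) \times t(\text{small} \mid \text{klein}) \\
&= \frac{\epsilon}{5^4} \times 0.7 \times 0.8 \times 0.8 \times 0.4 \\
&\approx 0.00029\,\epsilon
\end{aligned}
\]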
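The IBM Model 1 formula p(e,a|f) = ε / (l_f + 1)^{l_e} × ∏ t(e_j | f_a(j)) can be evaluated directly. The following is a sketch under stated assumptions: ε is set to 1 purely for illustration, the function name `model1_prob` is hypothetical, and the normalization uses l_f = 4 and l_e = 4 for "das Haus ist klein" and "the house is small", that is (4 + 1)^4 = 625.

```python
def model1_prob(e_words, f_words, alignment, t, epsilon=1.0):
    """p(e, a | f) = epsilon / (l_f + 1)^l_e * prod_j t(e_j | f_a(j)).

    alignment maps 1-based English positions to 1-based foreign positions;
    foreign position 0 is the NULL token. epsilon=1.0 is an assumption.
    """
    f_with_null = ["NULL"] + f_words
    l_e, l_f = len(e_words), len(f_words)
    prob = epsilon / (l_f + 1) ** l_e
    for j, e_word in enumerate(e_words, start=1):
        prob *= t[(e_word, f_with_null[alignment[j]])]
    return prob

# Lexical translation probabilities t(e|f) from the tables for this sentence.
t = {
    ("the", "das"): 0.7,
    ("house", "Haus"): 0.8,
    ("is", "ist"): 0.8,
    ("small", "klein"): 0.4,
}

p = model1_prob(
    "the house is small".split(),
    "das Haus ist klein".split(),
    {1: 1, 2: 2, 3: 3, 4: 4},
    t,
)
print(p)
```

In practice the product is usually accumulated in log space to avoid underflow on long sentences; the direct product is kept here to mirror the formula.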
parallel corpus
– if we had the alignments → we could estimate the parameters of our generative model
– if we had the parameters → we could estimate the alignments
– if we had complete data, we could estimate the model
– if we had the model, we could fill in the gaps in the data
... la maison ... la maison bleu ... la fleur ...
... the house ... the blue house ... the flower ...
... la maison ... la maison bleu ... la fleur ...
... the house ... the blue house ... the flower ...
likely (pigeon hole principle)
... la maison ... la maison bleu ... la fleur ...
... the house ... the blue house ... the flower ...

p(la|the) = 0.453
p(le|the) = 0.334
p(maison|house) = 0.876
p(bleu|blue) = 0.563
...
– parts of the model are hidden (here: alignments)
– using the model, assign probabilities to possible values
– take assigned values as fact
– collect counts (weighted by probabilities)
– estimate model from counts
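These steps can be sketched for IBM Model 1 on the three toy sentence pairs (la maison, la maison bleu, la fleur). This is a minimal illustration, not the lecture's implementation: it estimates t(f|e), starts from a uniform table, omits the NULL token, and runs a fixed number of iterations instead of testing for convergence. All names are chosen for the example.

```python
from collections import defaultdict

def train_ibm_model1(corpus, iterations=20):
    """EM for lexical translation probabilities t[(f, e)] = t(f | e).

    corpus is a list of (foreign_words, english_words) pairs.
    Simplifying assumptions: no NULL token, uniform initialization.
    """
    f_vocab = {f for f_words, _ in corpus for f in f_words}
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of (f, e) links
        total = defaultdict(float)   # expected count of links from e
        for f_words, e_words in corpus:
            for f in f_words:
                # E-step: distribute one unit of count for f over the
                # English words it could align to, weighted by the model.
                norm = sum(t[(f, e)] for e in e_words)
                for e in e_words:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: re-estimate the model from the weighted counts.
        t = defaultdict(float, {fe: c / total[fe[1]] for fe, c in count.items()})
    return t

corpus = [
    ("la maison".split(), "the house".split()),
    ("la maison bleu".split(), "the blue house".split()),
    ("la fleur".split(), "the flower".split()),
]

t_fe = train_ibm_model1(corpus)
for pair in [("la", "the"), ("maison", "house"), ("fleur", "flower")]:
    print(pair, round(t_fe[pair], 3))
```

Because la co-occurs with the in every sentence pair, the expected counts gradually shift la towards the, which in turn frees maison to align with house and fleur with flower.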
Translation    Probability φ(ē|f̄)
0.5 naturally 0.3
0.15 , of course , 0.05
English          φ(ē|f̄)    English            φ(ē|f̄)
the proposal     0.6227    the suggestions    0.0114
’s proposal      0.1068    the proposed       0.0114
a proposal       0.0341    the motion         0.0091
the idea         0.0250    the idea of        0.0091
this proposal    0.0227    the proposal ,     0.0068
proposal         0.0205    its proposal       0.0068
                 0.0159    it                 0.0068
the proposals    0.0159    ...                ...

– lexical variation (proposal vs suggestions)
– morphological variation (proposal vs proposals)
– included function words (the, a, ...)
– noise (it)
p(e∣f)
\[
e_{\text{best}} = \operatorname{argmax}_e \; p(e \mid f)
\]

– the most probable translation is bad → fix the model
– search does not find the most probable translation → fix the search
(although these are often correlated)
er geht ja nicht nach hause
er geht ja nicht nach hause er he
er geht ja nicht nach hause er ja nicht he does not
– it is allowed to pick words out of sequence (reordering)
– phrases may have multiple words: many-to-many translation
er geht ja nicht nach hause er geht ja nicht he does not go
er geht ja nicht nach hause er geht ja nicht nach hause he does not go home
\[
e_{\text{best}} = \operatorname{argmax}_e \prod_{i=1}^{I} \phi(\bar{f}_i \mid \bar{e}_i) \; d(\text{start}_i - \text{end}_{i-1} - 1) \; p_{\text{LM}}(e)
\]

Phrase translation: picking phrase f̄_i to be translated as a phrase ē_i
→ look up score φ(f̄_i|ē_i) from the phrase translation table
Reordering: previous phrase ended in end_{i−1}, current phrase starts at start_i
→ compute d(start_i − end_{i−1} − 1)
Language model: for an n-gram model, need to keep track of the last n − 1 words
→ compute score p_LM(w_i | w_{i−(n−1)}, ..., w_{i−1}) for added words w_i
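The three components can be combined in log space for one given segmentation into phrase pairs. This is a hedged sketch: the phrase probabilities are invented for illustration, the distortion function d(x) = α^|x| with α = 0.5 is an assumed form (the slides do not define d), and the language model is a toy stand-in that assigns every word probability 0.1.

```python
import math

def score_derivation(phrase_pairs, phi, lm_logprob, alpha=0.5):
    """Log score of one derivation: translation + reordering + language model.

    phrase_pairs: (start, end, foreign_phrase, english_phrase) tuples in
    English output order, with 1-based inclusive foreign positions.
    Assumption: distortion d(x) = alpha ** abs(x).
    """
    log_score = 0.0
    prev_end = 0
    english_words = []
    for start, end, f_phrase, e_phrase in phrase_pairs:
        log_score += math.log(phi[(f_phrase, e_phrase)])          # phrase translation
        log_score += abs(start - prev_end - 1) * math.log(alpha)  # reordering
        english_words.extend(e_phrase.split())
        prev_end = end
    log_score += lm_logprob(english_words)                        # language model
    return log_score

# Hypothetical phrase table entries phi(f|e) for "er geht ja nicht nach hause".
phi = {
    ("er", "he"): 0.5,
    ("ja nicht", "does not"): 0.2,
    ("geht", "go"): 0.3,
    ("nach hause", "home"): 0.4,
}

def toy_lm_logprob(words):
    """Stand-in language model: every word has probability 0.1."""
    return len(words) * math.log(0.1)

derivation = [
    (1, 1, "er", "he"),
    (3, 4, "ja nicht", "does not"),
    (2, 2, "geht", "go"),
    (5, 6, "nach hause", "home"),
]
log_score = score_derivation(derivation, phi, toy_lm_logprob)
print(log_score)
```

For this derivation the reordering distances are 0, 1, −3 and 2, so the assumed distortion model contributes a factor of α^6 in total.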
he
er geht ja nicht nach hause
it , it , he is are goes go yes is , of course not do not does not is not after to according to in house home chamber at home not is not does not do not home under house return home do not it is he will be it goes he goes is are is after all does to following not after not to , not is not are not is not a
– in Europarl phrase table: 2727 matching phrase pairs for this sentence
– by pruning to the top 20 per phrase, 202 translation options remain
he
er geht ja nicht nach hause
it , it , he is are goes go yes is , of course not do not does not is not after to according to in house home chamber at home not is not does not do not home under house return home do not it is he will be it goes he goes is are is after all does to following not after not to not is not are not is not a
– picking the right translation options
– arranging them in the right order
→ Search problem solved by heuristic beam search
er geht ja nicht nach hause
consult phrase translation table for all input phrases
er geht ja nicht nach hause
initial hypothesis: no input words covered, no output produced
er geht ja nicht nach hause
are
pick any translation option, create new hypothesis
er geht ja nicht nach hause
are it he
create hypotheses for all other translation options
er geht ja nicht nach hause
are it he goes does not yes go to home home
also create hypotheses from created partial hypothesis
er geht ja nicht nach hause
are it he goes does not yes go to home home
backtrack from highest scoring complete hypothesis
– same number of foreign words translated
– same English words in the output
– different scores
it is it is
it is
– same number of foreign words translated
– same last two English words in output (assuming trigram language model)
– same last foreign word translated
– different scores
it he does not does not
it he does not
(we still have an NP-complete problem on our hands)
– put comparable hypotheses into stacks (hypotheses that have translated the same number of input words)
– limit the number of hypotheses in each stack
are it he goes does not yes
no word translated    one word translated    two words translated    three words translated
– translation option is applied to hypothesis
– new hypothesis is dropped into a stack further down
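Stack decoding with recombination and pruning can be sketched as follows. This is a simplified illustration, not the decoder from the lecture: it has no language model and no future cost estimate, the translation options and their probabilities are invented, spans are 0-based and half-open, and the distortion penalty α^|distance| is an assumed form. Hypotheses are recombined when they agree on coverage, last translated position, and last two English words.

```python
import math
from collections import namedtuple

Hyp = namedtuple("Hyp", "score coverage last_end output")

def stack_decode(n_words, options, stack_size=10, alpha=0.5):
    """Beam search over stacks indexed by number of covered input words.

    options: (start, end, english, prob) with 0-based half-open spans.
    Score = log translation probability + assumed distortion penalty.
    """
    stacks = [{} for _ in range(n_words + 1)]
    stacks[0][(frozenset(), 0, ())] = Hyp(0.0, frozenset(), 0, ())
    for k in range(n_words):
        # Pruning: expand only the best stack_size hypotheses of this stack.
        survivors = sorted(stacks[k].values(), key=lambda h: h.score,
                           reverse=True)[:stack_size]
        for hyp in survivors:
            for start, end, english, prob in options:
                span = frozenset(range(start, end))
                if span & hyp.coverage:
                    continue  # option overlaps words already translated
                score = (hyp.score + math.log(prob)
                         + abs(start - hyp.last_end) * math.log(alpha))
                coverage = hyp.coverage | span
                output = hyp.output + tuple(english.split())
                # Recombination: keep only the better of two hypotheses
                # that are indistinguishable for future expansion.
                key = (coverage, end, output[-2:])
                stack = stacks[len(coverage)]  # always a stack further down
                if key not in stack or stack[key].score < score:
                    stack[key] = Hyp(score, coverage, end, output)
    complete = list(stacks[n_words].values())
    return max(complete, key=lambda h: h.score) if complete else None

# Hypothetical translation options for "er geht ja nicht nach hause".
options = [
    (0, 1, "he", 0.5), (0, 1, "it", 0.3),
    (1, 2, "goes", 0.4), (1, 2, "go", 0.3),
    (2, 4, "does not", 0.4), (2, 3, "yes", 0.1), (3, 4, "not", 0.5),
    (4, 6, "home", 0.5), (4, 5, "to", 0.4), (5, 6, "house", 0.3),
]
best = stack_decode(6, options)
print(" ".join(best.output), best.score)
```

Without a language model this sketch prefers the monotone order, so it returns "he goes does not home" rather than "he does not go home"; adding the language model score is what makes the reordered output competitive.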
[Figure: phrase structure grammar tree for an English sentence (as produced by Collins’ parser); POS tags: PRP MD VB VBG RP TO PRP DT NNS; constituents: NP-A, PP, VP-A, VP-A, VP-A, S]
NP → DET JJ NN
NP → DET NN JJ
NP → DET1 NN2 JJ3 ∣ DET1 JJ3 NN2
NP → DET1 NN2 JJ3 ∣ DET1 JJ3 NN2
N → maison ∣ house
NP → la maison bleue ∣ the blue house
NP → la maison JJ1 ∣ the JJ1 house
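A synchronous rule such as NP → la maison JJ1 ∣ the JJ1 house pairs a source side and a target side that share indexed non-terminals. The sketch below shows one way to apply such a rule; the representation (two token lists) and the lexical filler JJ → bleue ∣ blue are assumptions made for illustration.

```python
def apply_sync_rule(rule, fillers):
    """Expand a synchronous rule on both sides at once.

    rule: (source_side, target_side), each a list of tokens; tokens that
    appear as keys in fillers are linked non-terminals.
    fillers: non-terminal -> (source_string, target_string).
    """
    source_side, target_side = rule

    def realize(side, lang_index):
        return " ".join(
            fillers[token][lang_index] if token in fillers else token
            for token in side
        )

    return realize(source_side, 0), realize(target_side, 1)

# NP -> la maison JJ1 | the JJ1 house, as on the slide.
np_rule = (["la", "maison", "JJ1"], ["the", "JJ1", "house"])
# Assumed lexical rule JJ -> bleue | blue filling the linked non-terminal.
fillers = {"JJ1": ("bleue", "blue")}

french, english = apply_sync_rule(np_rule, fillers)
print(french, "|", english)
```

The shared index on JJ1 is what carries the reordering: the same filler lands after the noun on the French side and before it on the English side.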
– synchronous grammar has to parse entire input sentence
– output tree is generated at the same time
– process is broken up into a number of rule applications

\[
\text{SCORE}(\text{TREE}, E, F) = \prod_i \text{RULE}_i
\]
English: I shall be passing on to you some comments
POS tags: PRP MD VB VBG RP TO PRP DT NNS; constituents: NP, PP, VP, VP, VP, S
German: Ich werde Ihnen die entsprechenden Anmerkungen aushändigen
Extract: set of smallest rules required to explain the sentence pair
Extracted rule: PRP → Ich ∣ I
Extracted rule: PRP → Ihnen ∣ you
Extracted rule: DT → die ∣ some
Extracted rule: NNS → Anmerkungen ∣ comments
Extracted rule: PP → X ∣ to PRP
Extracted rule: NP → X1 X2 ∣ DT1 NNS2
Extracted rule: VP → X1 X2 aushändigen ∣ passing on PP1 NP2
Extracted rule: VP → werde X ∣ shall be VP (ignoring internal structure)
Extracted rule: S → X1 X2 ∣ PRP1 VP2
DONE — note: one rule per alignable constituent
Attach to neighboring words or higher nodes → additional rules
Inspired by monolingual syntactic chart parsing: during decoding of the source sentence, a chart with translations for the O(n^2) spans has to be filled.

Sie/PPER will/VAFIN eine/ART Tasse/NN Kaffee/NN trinken/VVINF (constituents: NP, VP, S)
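The quadratic number of chart cells follows from enumerating every contiguous span of the input: a sentence of n words has n(n + 1)/2 spans. A minimal sketch that lists them bottom-up, shortest first, which is the order in which a chart would typically be filled; the function name is illustrative.

```python
def spans_bottom_up(n):
    """Yield all contiguous spans (start, end), 0-based half-open, by length."""
    for length in range(1, n + 1):
        for start in range(0, n - length + 1):
            yield (start, start + length)

words = "Sie will eine Tasse Kaffee trinken".split()
spans = list(spans_bottom_up(len(words)))

for start, end in spans:
    print((start, end), " ".join(words[start:end]))
```

For this six-word sentence the chart has 21 cells, from the single-word spans up to the span covering the whole sentence.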
[Figure: chart over the German input sentence; English entries so far: VB drink; marker shown: ➏]
German input sentence with tree
[Figure: chart over the German input sentence; English entries so far: PRO she, VB drink; markers shown: ➏ ➊]
Purely lexical rule: filling a span with a translation (a constituent in the chart)
[Figure: chart over the German input sentence; English entries so far: PRO she, VB drink, NN coffee; markers shown: ➏ ➊ ➋]
Purely lexical rule: filling a span with a translation (a constituent in the chart)
[Figure: chart over the German input sentence; English entries so far: PRO she, VB drink, NN coffee; markers shown: ➏ ➊ ➋ ➌]
Purely lexical rule: filling a span with a translation (a constituent in the chart)
[Figure: chart over the German input sentence; English entries so far: PRO she, VB drink, NN cup, IN, DET a, NN coffee; constituents: NP, PP, NN, NP; markers shown: ➏ ➊ ➋ ➌ ➍]
Complex rule: matching underlying constituent spans, and covering words
[Figure: chart over the German input sentence; English entries so far: PRO she, VB drink, NN cup, IN, DET a, VBZ wants, TO to, NN coffee; constituents: NP, PP, NN, NP, VB, VP, VP, NP; markers shown: ➏ ➊ ➋ ➌ ➍ ➎]
Complex rule with reordering
[Figure: completed chart over the German input sentence; English entries: PRO she, VB drink, NN cup, IN, DET a, VBZ wants, TO to, NN coffee; constituents: NP, PP, NN, NP, VB, VP, VP, NP, and the top rule S → PRO VP; markers shown: ➏ ➊ ➋ ➌ ➍ ➎]
(esp. neural network models)
(pronouns, discourse relationships, inference)