CRF Word Alignment & Noisy Channel Translation
January 31, 2013
Tuesday, February 19, 13
Last Time ...
$p(\text{Translation}) = \sum_{\text{Alignment}} p(\text{Translation}, \text{Alignment})$
$p(\text{Translation}) = \sum_{\text{Alignment}} p(\text{Alignment}) \times p(\text{Translation} \mid \text{Alignment})$
$a \in [0, n]^m, \quad i = 1, \ldots, m$
[Figure: IBM Model 4 alignment; our model's alignment]
$a \in [0, n]^m, \quad i = 1, \ldots, m$
$p(a \mid e, f, m), \quad a \in [0, n]^m$
A B C X Y Z
$p(A, B, C, X, Y, Z) = p(A) \times p(B \mid A) \times p(C \mid B) \times p(X \mid A) \times p(Y \mid B) \times p(Z \mid C)$
A B C X Y Z
$p(A, B, C, X, Y, Z) = \frac{1}{Z} \times \Psi_1(A, B) \times \Psi_2(B, C) \times \Psi_3(C, D) \times \Psi_4(X) \times \Psi_5(Y) \times \Psi_6(Z)$
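X Y
$\mathcal{X} = \{a, b, c\}, \quad X \in \mathcal{X}, \quad Y \in \mathcal{X}$
$Z = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{X}} \Psi_1(x, y)\, \Psi_2(x)\, \Psi_3(y)$
$Z = \sum_{x \in \mathcal{X}} \Psi_2(x) \sum_{y \in \mathcal{X}} \Psi_1(x, y)\, \Psi_3(y)$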
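The two expressions for Z above can be checked numerically. The sketch below is only an illustration: the potential values in psi1, psi2 and psi3 are made up, and the function names are hypothetical. It computes Z once as a double sum over all (x, y) pairs and once with Ψ2(x) pulled out of the inner sum.

```python
import itertools

LABELS = ["a", "b", "c"]  # the set from the slide: {a, b, c}

# Hypothetical potentials, chosen only so the arithmetic is easy to follow.
def psi1(x, y):
    return 2.0 if x == y else 1.0

def psi2(x):
    return {"a": 1.0, "b": 2.0, "c": 3.0}[x]

def psi3(y):
    return {"a": 0.5, "b": 1.0, "c": 1.5}[y]

def z_brute_force():
    """Z as a double sum over every (x, y) pair."""
    return sum(
        psi1(x, y) * psi2(x) * psi3(y)
        for x, y in itertools.product(LABELS, LABELS)
    )

def z_factored():
    """Z with psi2(x) moved outside the inner sum over y."""
    return sum(
        psi2(x) * sum(psi1(x, y) * psi3(y) for y in LABELS)
        for x in LABELS
    )

print(z_brute_force(), z_factored())
```

With only two variables both forms do the same number of multiplications of Ψ1 and Ψ3. The benefit of pushing sums inward shows up on longer chains, where it avoids enumerating every joint assignment.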
A B C X Y Z
$p(A, B, C, X, Y, Z) = \frac{1}{Z} \times \Psi_1(A, B) \times \Psi_2(B, C) \times \Psi_3(C, D) \times \Psi_4(X) \times \Psi_5(Y) \times \Psi_6(Z)$
$\Psi_{1,2,3}(x, y) = \exp \sum_k w_k f_k(x, y)$
to arbitrary features (functions) of the variables
computing Z (often over and over again!)
$p(y \mid x) = \frac{1}{Z_w(x)} \exp \sum_{F \in G} \sum_k w_k f_k(F, x)$
$F \in G$: all factors in the graph of y
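A minimal sketch of this conditional distribution on a tiny chain, assuming two kinds of factors (one per label/word pair and one per adjacent label pair). The label set, the weights and the helper names are hypothetical and exist only for illustration. The normalizer is computed by explicit enumeration, which is feasible only because the example is tiny.

```python
import itertools
import math

LABELS = ["N", "V"]

# Hypothetical weights w_k, one per indicator feature f_k.
WEIGHTS = {
    ("emit", "N", "dog"): 1.0,
    ("emit", "V", "barks"): 1.5,
    ("trans", "N", "V"): 0.5,
}

def score(y, x):
    """Sum of w_k * f_k over all factors of the chain (unnormalized log score)."""
    total = 0.0
    for i, (label, word) in enumerate(zip(y, x)):
        total += WEIGHTS.get(("emit", label, word), 0.0)
        if i > 0:
            total += WEIGHTS.get(("trans", y[i - 1], label), 0.0)
    return total

def conditional(x):
    """Return p(y | x) for every labeling y by explicit enumeration."""
    labelings = list(itertools.product(LABELS, repeat=len(x)))
    unnorm = {y: math.exp(score(y, x)) for y in labelings}
    z = sum(unnorm.values())  # normalizer: sums over all y for this fixed x
    return {y: u / z for y, u in unnorm.items()}

dist = conditional(["dog", "barks"])
best = max(dist, key=dist.get)
print(best, round(dist[best], 3))
```

The normalizer is recomputed for every input x, which is why training a globally normalized model repeatedly pays the cost of computing Z.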
p(a | e, f)
$\hat{w}_{\text{MLE}} = \arg\max_w \prod_{(x_i, y_i) \in \mathcal{D}} p(y_i \mid x_i; w)$
Cohn (2006)
models (still make a one-to-many assumption)
$p(a \mid e, f) = \frac{1}{Z_w(e, f)} \exp \sum_{i=1}^{|e|} \sum_k w_k f_k(a_i, a_{i-1}, i, e, f)$
$O(n^2 m) \approx O(n^3)$
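Because each term depends only on a_i and a_{i-1}, the normalizer can be computed with a forward recursion over positions. The sketch below assumes m = |e| ≥ 1 target positions and alignment values in 0..n. The function local_score is a hypothetical stand-in for Σ_k w_k f_k(a_i, a_{i-1}, i, e, f), and toy_score is an invented scoring rule, not the model's actual features. The recursion does (n+1)² work per position, which matches the O(n²m) cost on the slide.

```python
import itertools
import math

def partition_forward(m, n, local_score):
    """Z = sum over a in [0, n]^m of exp(sum_i local_score(a_i, a_{i-1}, i)).

    Assumes m >= 1. `prev` is None at the first position.
    """
    states = range(n + 1)
    # alpha[cur] = total unnormalized mass of prefixes ending in state cur
    alpha = [math.exp(local_score(cur, None, 0)) for cur in states]
    for i in range(1, m):
        alpha = [
            sum(alpha[prev] * math.exp(local_score(cur, prev, i)) for prev in states)
            for cur in states
        ]
    return sum(alpha)

def partition_brute_force(m, n, local_score):
    """Same quantity by enumerating all (n + 1)^m alignments (check only)."""
    total = 0.0
    for a in itertools.product(range(n + 1), repeat=m):
        s = local_score(a[0], None, 0)
        s += sum(local_score(a[i], a[i - 1], i) for i in range(1, m))
        total += math.exp(s)
    return total

def toy_score(cur, prev, i):
    # Hypothetical scoring: prefer near-diagonal links and small forward jumps.
    diag = -0.5 * abs(cur - (i + 1))
    jump = 0.0 if prev is None else -0.3 * abs(cur - prev - 1)
    return diag + jump

print(partition_forward(3, 3, toy_score), partition_brute_force(3, 3, toy_score))
```

For realistic sentence lengths the products of exponentials would under- or overflow, so an implementation would normally run the same recursion in log space.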
Identical word
Matching prefix
Matching suffix
Orthographic similarity
In dictionary
...
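The features listed above can be sketched as functions of a candidate word pair. Everything specific here is an assumption made for illustration: the function name alignment_features, the three-character affix length, the lowercasing, the use of difflib's similarity ratio as the orthographic similarity measure, and the dictionary being a set of (e, f) pairs.

```python
from difflib import SequenceMatcher

def alignment_features(e_word, f_word, dictionary, affix_len=3):
    """Return a feature map for one candidate (e_word, f_word) link."""
    e, f = e_word.lower(), f_word.lower()
    return {
        "identical_word": float(e == f),
        "matching_prefix": float(e[:affix_len] == f[:affix_len]),
        "matching_suffix": float(e[-affix_len:] == f[-affix_len:]),
        # Assumed similarity measure: difflib's ratio, a value in [0, 1].
        "orthographic_similarity": SequenceMatcher(None, e, f).ratio(),
        "in_dictionary": float((e, f) in dictionary),
    }

feats = alignment_features(
    "president", "presidente", dictionary={("president", "presidente")}
)
for name, value in feats.items():
    print(name, value)
```

Each value would be multiplied by its weight w_k inside the exponent of the model. Features like these can look at whole words and external resources, which a locally normalized generative model cannot easily do.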
p(e) p(e | f, m) p(e, a | f, m) p(a | e, f)
Warren Weaver to Norbert Wiener, March, 1947
Claude Shannon. “A Mathematical Theory of Communication” 1948.
Message M → Encoder → Sent transmission Y → “Noisy” channel → Received transmission X → Decoder → Recovered message M′
Sent transmission Y → “Noisy” channel → Received transmission X → Decoder → Recovered message Y′
I can help.
source
p(e)
p(f | e)
$e^* = \arg\max_e p(e \mid f) = \arg\max_e p(f \mid e) \times p(e)$
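The decision rule follows from Bayes' rule: p(e | f) = p(f | e) p(e) / p(f), and p(f) does not depend on e, so it drops out of the arg max. A minimal sketch over a fixed candidate list is shown below. The probability tables, the sentences and the function name decode are invented for illustration. A real decoder searches a far larger space than three candidates.

```python
import math

# Hypothetical toy models; real systems estimate these from data.
LANGUAGE_MODEL = {  # p(e)
    "the house": 0.6,
    "house the": 0.1,
    "the home": 0.3,
}
CHANNEL_MODEL = {  # p(f | e), keyed by (f, e)
    ("das Haus", "the house"): 0.5,
    ("das Haus", "house the"): 0.5,
    ("das Haus", "the home"): 0.2,
}

def decode(f, candidates):
    """Return the candidate e maximizing p(f | e) * p(e), scored in log space."""
    def log_score(e):
        p_f_given_e = CHANNEL_MODEL.get((f, e), 0.0)
        p_e = LANGUAGE_MODEL.get(e, 0.0)
        if p_f_given_e == 0.0 or p_e == 0.0:
            return float("-inf")
        return math.log(p_f_given_e) + math.log(p_e)
    return max(candidates, key=log_score)

best = decode("das Haus", list(LANGUAGE_MODEL))
print(best)
```

In this toy example the channel model alone cannot separate "the house" from "house the". The language model p(e) breaks the tie in favor of the fluent word order.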