1/63
CS7015 (Deep Learning) : Lecture 16
Encoder Decoder Models, Attention Mechanism
Mitesh M. Khapra
Department of Computer Science and Engineering
Indian Institute of Technology Madras
Mitesh M. Khapra CS7015 (Deep Learning) : Lecture 16
3/63 [Figure: RNN language model unrolled over the sequence <GO> I am at home today ⟨stop⟩. Initial state s_0; input x_t, state s_t and output y_t at each step; parameters U, V, W shared across steps. Each output models $P(y_t = j \mid y_1^{t-1})$.]
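The unrolled model above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the lecture's implementation: the recurrence s_t = tanh(U x_t + W s_{t-1}) and output y_t = softmax(V s_t) are assumed (biases omitted), the hidden size is hypothetical, and the weights are random, so the generated words are meaningless. It only shows how each step yields a distribution $P(y_t = j \mid y_1^{t-1})$ and how the chosen word is fed back as the next input.

```python
import math
import random

VOCAB = ["<GO>", "I", "am", "at", "home", "today", "<stop>"]
HIDDEN = 4  # hypothetical state size, chosen only for illustration


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def rnn_step(U, W, V, x, s_prev):
    # Assumed recurrence: s_t = tanh(U x_t + W s_{t-1}), y_t = softmax(V s_t)
    pre = [a + b for a, b in zip(matvec(U, x), matvec(W, s_prev))]
    s = [math.tanh(p) for p in pre]
    y = softmax(matvec(V, s))  # y[j] plays the role of P(y_t = j | y_1^{t-1})
    return s, y


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


U = rand_matrix(HIDDEN, len(VOCAB))   # input to state
W = rand_matrix(HIDDEN, HIDDEN)       # state to state
V = rand_matrix(len(VOCAB), HIDDEN)   # state to output


def greedy_decode(max_len=10):
    """Start from <GO>, pick the most probable word, feed it back in."""
    s = [0.0] * HIDDEN  # s_0
    token = VOCAB.index("<GO>")
    words = []
    for _ in range(max_len):
        s, y = rnn_step(U, W, V, one_hot(token, len(VOCAB)), s)
        token = max(range(len(VOCAB)), key=lambda j: y[j])
        if VOCAB[token] == "<stop>":
            break
        words.append(VOCAB[token])
    return words
```

Calling `greedy_decode()` returns at most `max_len` words; with trained parameters the same loop would produce a sentence such as the one on the slide.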
6/63 [Figure: RNN unrolled over <GO>, am, at, home, today, <stop>, with state s_t.]
9/63 [Figure: the same RNN generating a caption, <Go> A man throwing ... park ⟨stop⟩, with parameters U, V, W, initial state s_0, and $P(y_t = j \mid y_1^{t-1})$ at each step.]
10/63 [Figure: a CNN encodes the image I and sets the RNN's initial state, $s_0 = fc_7(I)$. The RNN generates <GO> A man throwing ... park ⟨stop⟩, and each output now models $P(y_t = j \mid y_1^{t-1}, I)$.]
14/63 [Figure: the same captioning model with the CNN labelled Encoder (producing h_0) and the RNN labelled Decoder; each output models $P(y_t = j \mid y_1^{t-1}, I)$.]
18/63 [Figure: Encoder (CNN, producing h_0) and Decoder (RNN with parameters U, V, W) generating <GO> A man throwing . . . park ⟨stop⟩; each output models $P(y_t = j \mid y_1^{t-1}, fc_7)$.]
Loss at step t: $\mathscr{L}_t(\theta) = -\log P(y_t = j \mid y_1^{t-1}, fc_7)$
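The per-step loss above is the negative log-probability that the decoder assigns to the correct word at step t. The sketch below computes it for a toy sequence; summing the step losses over t = 1, ..., T to get a sequence loss is an assumption here, and the probabilities and targets are made-up numbers, not model outputs.

```python
import math


def step_loss(probs, target):
    """L_t = -log P(y_t = target | ...), given the decoder's distribution at step t."""
    return -math.log(probs[target])


def sequence_loss(step_probs, targets):
    """Assumed total loss: the sum of the step losses over all time steps."""
    return sum(step_loss(p, t) for p, t in zip(step_probs, targets))


# Hypothetical decoder outputs over a 3-word vocabulary for T = 3 steps.
step_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.25, 0.25, 0.5],
]
targets = [0, 1, 2]  # index of the correct word at each step

total = sequence_loss(step_probs, targets)
```

The loss is zero only when every correct word receives probability 1, and it grows without bound as the probability of a correct word approaches 0.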
19/63 [Figure: encoder-decoder on text. Encoder input x_1 ... x_4: It is raining (states h_t). Decoder (states s_t): <Go> The ground is wet <STOP>.]
21/63 [Figure: encoder-decoder for translation. Encoder input x_1 ... x_4: I am going home (states h_t). Decoder (states s_t): <Go> Mein ghar ja raha hoon.]
23/63 [Figure: character-level encoder-decoder. Encoder input x_1 ... x_5: I N D I A (states h_t). Decoder (states s_t): <Go> followed by one target-script character per step (š ' @ i y a in the extracted text).]
25/63 [Figure: image question answering. The question "What is the bird's color" is encoded into $\tilde{h}_t$ and the image is encoded by a CNN into $\hat{h}_I$; output: White.]
26/63 [Figure: encoder-decoder with context vector c. Encoder input: India beats Srilanka (states h_t). Decoder (states s_t): <Go> India won the world cup <STOP>.]
27/63 [Figure: three CNN encoders, one per input frame; decoder output: A man walking ... a rope.]
28/63 [Figure: three CNN encoders, one per input frame; output label: Suryanamaskar.]
29/63 [Figure: encoder-decoder with context vector c. Encoder input x_1 ... x_3: How are you (states h_t). Decoder (states s_t): <Go> I am fine <STOP>.]
32/63 [Figure: encoder-decoder for translation. Encoder input x_1 ... x_5: Main ghar ja raha hoon (states h_i), summarised in a single context vector c. Decoder (states s_i): <Go> I am going home <STOP>.]
35/63 [Figure: the translation model with attention. Encoder input x_1 ... x_5: Main ghar ja raha hoon (states h_i). Decoder: <Go> I am going home <STOP>. Decoder steps t = 2, ..., 5 each receive their own context c_t, formed by combining the encoder states with weights α_{1,t}, ..., α_{5,t}.]
36/63 [Figure: the same attention model at a single decoder step. Encoder input x_1 ... x_5: Main ghar ja raha hoon (states h_i). The context c_t is formed from the encoder states with weights α_{1,2}, ..., α_{5,2}. Decoder: <Go> I am going home <STOP>.]
43/63 [Figure: attention-based translation model. Encoder input x_1 ... x_5: Main ghar ja raha hoon (states h_i). Decoder: <Go> I am going home <STOP>, with contexts c_2, ..., c_5 and weights α_{j,t}.]
Attention score for encoder state h_j at decoder step t: $V_{attn}^{T} \tanh(U_{attn} h_j + W_{attn} s_t)$
Context vector: $c_t = \sum_{j=1}^{T} \alpha_{j,t} h_j$
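The attention computation on this slide can be sketched as follows. The score is taken to be $V_{attn}^{T} \tanh(U_{attn} h_j + W_{attn} s_t)$, and the weights α_{j,t} are assumed to be a softmax over those scores across encoder positions j (the softmax step is an assumption; the slide text only shows the score and the weighted sum). All dimensions and values are hypothetical.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def attention(H, s_t, U_attn, W_attn, v_attn):
    """Return (alphas, c_t) for encoder states H and decoder state s_t."""
    scores = []
    for h_j in H:
        pre = [a + b for a, b in zip(matvec(U_attn, h_j), matvec(W_attn, s_t))]
        # score_j = v_attn^T tanh(U_attn h_j + W_attn s_t)
        scores.append(sum(v * math.tanh(p) for v, p in zip(v_attn, pre)))
    alphas = softmax(scores)  # assumed normalisation over encoder positions j
    dim = len(H[0])
    # c_t = sum_j alpha_{j,t} h_j
    c_t = [sum(a * h_j[k] for a, h_j in zip(alphas, H)) for k in range(dim)]
    return alphas, c_t


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


ENC_DIM, DEC_DIM, ATT_DIM = 3, 2, 4   # hypothetical sizes
H = rand_matrix(5, ENC_DIM)           # five encoder states, as in "Main ghar ja raha hoon"
s_t = [rng.uniform(-1.0, 1.0) for _ in range(DEC_DIM)]
U_attn = rand_matrix(ATT_DIM, ENC_DIM)
W_attn = rand_matrix(ATT_DIM, DEC_DIM)
v_attn = [rng.uniform(-1.0, 1.0) for _ in range(ATT_DIM)]

alphas, c_t = attention(H, s_t, U_attn, W_attn, v_attn)
```

Because the weights are positive and sum to 1, each coordinate of c_t lies between the smallest and largest value of that coordinate among the encoder states.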
50/63 [Figure: attention-based encoder-decoder for translation. Encoder input x_1 ... x_4: I am going home (states h_i). Decoder: <Go> main ghar ja raha hoon <STOP>. The context c_t is formed from the encoder states with weights α_1, ..., α_4.]
51/63 [Figure: CNN encoder producing h_0.]
52/63 [Figure: CNN layer stack with output sizes (height × width × channels).]
Layer     Output size
Input     224 × 224
Conv      224 × 224 × 64
maxpool   112 × 112 × 64
Conv      112 × 112 × 128
maxpool   56 × 56 × 128
Conv      56 × 56 × 256
maxpool   28 × 28 × 256
Conv      28 × 28 × 512
maxpool   14 × 14 × 512
Conv      14 × 14 × 512
maxpool   7 × 7 × 512
fc        4096
fc        4096
softmax   1000
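The sizes on this slide follow a simple pattern: each convolution block keeps the spatial size and sets the channel count, and each max-pool halves height and width. The sketch below reproduces that arithmetic. It assumes size-preserving convolutions and 2 × 2 pooling with stride 2, neither of which is stated on the slide; the function name is hypothetical.

```python
def cnn_shapes(size=224, channels=(64, 128, 256, 512, 512)):
    """List (layer, height, width, channels) for each conv block and max-pool."""
    shapes = []
    for c in channels:
        shapes.append(("conv", size, size, c))     # assumed: padding keeps height and width
        size //= 2                                 # assumed: 2x2 max-pool, stride 2
        shapes.append(("maxpool", size, size, c))
    return shapes


shapes = cnn_shapes()
_, h, w, c = shapes[-1]
flattened = h * w * c  # number of values fed to the first fully connected layer
```

Here `shapes[-1]` is the final 7 × 7 × 512 pooling output, and `shapes[7]` is the 14 × 14 × 512 pooling output that precedes the last convolution block.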
53/63 [Figure: weighted sum (+) over the 14 × 14 × 512 convolutional feature map.]
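One way to read this slide is that the 14 × 14 × 512 feature map is treated as 14 × 14 = 196 location vectors of 512 values each, which are then combined by a weighted sum in the same way encoder states are combined in text attention. The sketch below shows only that reshaping and summing; the uniform weights are a placeholder assumption (a trained model would compute them from the decoder state), and the feature values are synthetic.

```python
def flatten_locations(feature_map):
    """Turn an H x W x C nested list into a list of H*W vectors of length C."""
    return [vec for row in feature_map for vec in row]


def weighted_sum(alphas, vectors):
    """Combine vectors with weights alphas: sum_j alpha_j * vector_j."""
    dim = len(vectors[0])
    return [sum(a * v[k] for a, v in zip(alphas, vectors)) for k in range(dim)]


HEIGHT, WIDTH, CHANNELS = 14, 14, 512

# Synthetic feature map: location (r, c) holds the constant value r * WIDTH + c.
feature_map = [
    [[float(r * WIDTH + c)] * CHANNELS for c in range(WIDTH)]
    for r in range(HEIGHT)
]

locations = flatten_locations(feature_map)          # 196 vectors of length 512
alphas = [1.0 / len(locations)] * len(locations)    # placeholder: uniform attention
context = weighted_sum(alphas, locations)           # one 512-dimensional context vector
```

With uniform weights the context is just the average over locations; non-uniform weights would let the decoder emphasise particular image regions at each step.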
58/63 [Figure: decoder unrolled over <Go> I am going home <STOP>; the words Hugh, Jackman and course also appear in the figure.]
59/63 [Figure: output label: Politics.]
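60/63 [Figure: hierarchical encoder; output label: Politics.]
Word-level encoder: $h^1_{ij} = RNN(h^1_{i,j-1}, w_{ij})$
Sentence representation: $s_i = h^1_{iT_i}$
Sentence-level encoder: $h^2_i = RNN(h^2_{i-1}, s_i)$
Document representation: $s = h^2_K$
Parameters: $W^1_{enc}, U^1_{enc}, W^2_{enc}, U^2_{enc}, V, b$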
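The two-level encoder on this slide can be sketched with a toy RNN cell. Everything specific here is an assumption made for illustration: the cell is h = tanh(W h_prev + U x) without biases, the assignment of W to the recurrent term and U to the input term is a guess, the initial states are zero, and the sizes and random weights are hypothetical. The final classification layer (parameters V, b) is not covered.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rnn_cell(W, U, h_prev, x):
    # Assumed cell: h = tanh(W h_prev + U x)
    return [math.tanh(a + b) for a, b in zip(matvec(W, h_prev), matvec(U, x))]


def run_rnn(W, U, inputs):
    """Run the cell over a sequence and return the last hidden state."""
    h = [0.0] * len(W)  # assumed zero initial state
    for x in inputs:
        h = rnn_cell(W, U, h, x)
    return h


def encode_document(doc, W1, U1, W2, U2):
    # Level 1: s_i = h^1_{i,T_i}, the last word-level state of sentence i.
    sentence_vecs = [run_rnn(W1, U1, sentence) for sentence in doc]
    # Level 2: s = h^2_K, the last sentence-level state over K sentences.
    return run_rnn(W2, U2, sentence_vecs)


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


EMB, H1, H2 = 3, 4, 2  # hypothetical embedding and hidden sizes
W1, U1 = rand_matrix(H1, H1), rand_matrix(H1, EMB)
W2, U2 = rand_matrix(H2, H2), rand_matrix(H2, H1)

# A document with K = 2 sentences of T_1 = 3 and T_2 = 2 word vectors w_ij.
doc = [rand_matrix(3, EMB), rand_matrix(2, EMB)]
s = encode_document(doc, W1, U1, W2, U2)
```

Sentences of different lengths are handled naturally because each sentence is reduced to a fixed-size vector before the sentence-level RNN sees it.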
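62/63 Word-level attention weights: $\alpha_{ij} = \frac{\exp(u_{ij}^{T} u_w)}{\sum_{t} \exp(u_{it}^{T} u_w)}$, used in a sum over words j.
Sentence-level attention weights: $\alpha_i = \frac{\exp(u_i^{T} u_s)}{\sum_{i'} \exp(u_{i'}^{T} u_s)}$, used in a sum over sentences i.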