Lecture 14
Advanced Neural Networks
Michael Picheny, Bhuvana Ramabhadran, Stanley F. Chen, Markus Nussbaum-Thom
Watson Group IBM T.J. Watson Research Center Yorktown Heights, New York, USA {picheny,bhuvana,stanchen,nussbaum}@us.ibm.com
$\ldots \, p(\cdot \mid x_1^{T_n}, \theta) \cdot L(\omega, \omega_n)$, for $n = 1, \ldots, N$, with parameters $\theta$
$x_1^T := x_1, \ldots, x_T$: feature sequence
$a_1^T := a_1, \ldots, a_T$: HMM state sequence
$\{ y_{n,c,j} \}$ for $j = i-k, \ldots, i+k$
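The index range above, j = i-k, ..., i+k over the values y_{n,c,j}, describes a window of width 2k+1 centered at position i, which is the shape of a pooling window. The operator applied over the window did not survive in the text, so the following is only a minimal sketch that assumes a maximum. The function name `window_max_pool`, the clipping of the window at the sequence edges, and the reading of `y` as one channel c of one example n are all assumptions made for illustration.

```python
def window_max_pool(y, k):
    """Return, for each position i, the max of y[j] for j = i-k, ..., i+k.

    y: list of numbers, one channel of one example (assumed layout).
    k: half-width of the window; the full window has 2k+1 entries.
    The window is clipped at the boundaries (an assumption, not from the source).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    pooled = []
    for i in range(len(y)):
        lo = max(0, i - k)            # clip the left edge of the window
        hi = min(len(y), i + k + 1)   # clip the right edge (exclusive bound)
        pooled.append(max(y[lo:hi]))
    return pooled


if __name__ == "__main__":
    print(window_max_pool([1, 3, 2, 5, 4], k=1))
```

With k = 0 the window contains only position i, so the input is returned unchanged.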
# Fmaps  Classic [16, 17, 18]  VB(X)           VC(X)           VD(X)           WD(X)
64       -                     conv(3,64)      conv(3,64)      conv(3,64)      conv(3,64)
         -                     conv(64,64)     conv(64,64)     conv(64,64)     conv(64,64)
         -                     pool 1x3        pool 1x2        pool 1x2        pool 1x2
128      -                     conv(64,128)    conv(64,128)    conv(64,128)    conv(64,128)
         -                     conv(128,128)   conv(128,128)   conv(128,128)   conv(128,128)
         -                     pool 2x2        pool 2x2        pool 1x2        pool 1x2
256      -                     -               conv(128,256)   conv(128,256)   conv(128,256)
         -                     -               conv(256,256)   conv(256,256)   conv(256,256)
         -                     -               -               -               conv(256,256)
         -                     -               pool 1x2        pool 2x2        pool 2x2
512      conv9x9(3,512)        -               -               conv(256,512)   conv(256,512)
         pool 1x3              -               -               conv(512,512)   conv(512,512)
         conv3x4(512,512)      -               -               -               conv(512,512)
         -                     -               -               pool 2x2        pool 2x2
All columns: FC 2048, FC 2048, (FC 2048), FC output size, Softmax
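To make the conv(in, out) notation in the table concrete, the following sketch counts the trainable parameters of a single convolutional layer. It assumes that conv(in, out) maps `in` input feature maps to `out` output feature maps with a 3x3 kernel unless a kernel size is written out (as in conv9x9 and conv3x4), and that each output map has one bias. Neither assumption is stated in the table itself, and the function name `conv_param_count` is hypothetical.

```python
def conv_param_count(in_maps, out_maps, kernel=(3, 3), bias=True):
    """Number of trainable parameters in one 2-D convolutional layer.

    kernel defaults to 3x3, which is an assumption about the table's
    conv(in, out) shorthand, not something the table states.
    """
    kernel_h, kernel_w = kernel
    weights = kernel_h * kernel_w * in_maps * out_maps
    biases = out_maps if bias else 0
    return weights + biases


# First block shared by the VB(X) to WD(X) columns: conv(3,64), conv(64,64).
first_block = [(3, 64), (64, 64)]
first_block_total = sum(conv_param_count(i, o) for i, o in first_block)

# First layer of the Classic column, with its explicit 9x9 kernel.
classic_first = conv_param_count(3, 512, kernel=(9, 9))

if __name__ == "__main__":
    print(first_block_total, classic_first)
```

Pooling layers contribute no parameters, so under these assumptions the count for a column is the sum over its conv rows plus the fully connected layers.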
[Figure: multilingual CNN. Shared conv and pool layers feed language-specific stacks of FC layers, each topped by a Softmax, for KUR, TOK, CEB, KAZ, TEL, and LIT. Input contexts shown: +/-5; +/-10 with stride 2; +/-20 with stride 4.]
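The three context settings above can be read as sets of frame offsets around a center frame. The sketch below assumes that "stride s" means taking every s-th frame inside the +/- context range, and that indices falling outside the utterance are clamped to the first or last frame. Both are assumptions for illustration, as are the names `context_offsets` and `gather_context`.

```python
def context_offsets(context, stride=1):
    """Frame offsets -context, ..., +context, taken every `stride` frames."""
    if context < 0 or stride < 1:
        raise ValueError("context must be >= 0 and stride must be >= 1")
    return list(range(-context, context + 1, stride))


def gather_context(frames, center, context, stride=1):
    """Collect the frames around `center`, clamping indices at the edges.

    Clamping is an assumed boundary policy; the source does not say how
    utterance boundaries are handled.
    """
    last = len(frames) - 1
    return [frames[min(max(center + offset, 0), last)]
            for offset in context_offsets(context, stride)]


if __name__ == "__main__":
    for context, stride in [(5, 1), (10, 2), (20, 4)]:
        offsets = context_offsets(context, stride)
        print(context, stride, len(offsets), offsets)
```

Under this reading all three settings select 11 frames, so the input size stays fixed while the time span covered grows from 11 to 21 to 41 frames.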
$\frac{z_t}{1-p}$ for $t = 1, \ldots, T$.
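One plausible reading of the expression above is the rescaling step of inverted dropout: each activation z_t is dropped with probability p, and the surviving activations are divided by 1-p so that the expected value is unchanged. That reading is an assumption, since the surrounding slide text is missing. The sketch below follows it; the function name `inverted_dropout`, the independent mask per time step, and the optional `rng` argument are hypothetical choices.

```python
import random


def inverted_dropout(z, p, rng=None):
    """Apply dropout with rate p to z_1, ..., z_T and rescale survivors.

    Each z_t is kept with probability 1-p and replaced by z_t / (1-p),
    otherwise it is set to 0.0. This is a sketch of one common dropout
    variant, not necessarily the exact scheme on the slide.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if rng is None:
        rng = random.Random()
    scaled = []
    for z_t in z:
        keep = rng.random() >= p   # random() is in [0, 1), so P(keep) = 1 - p
        scaled.append(z_t / (1.0 - p) if keep else 0.0)
    return scaled


if __name__ == "__main__":
    print(inverted_dropout([1.0, 2.0, 3.0, 4.0], p=0.5, rng=random.Random(0)))
```

At test time no units are dropped and no rescaling is needed under this variant, which is the usual reason for dividing by 1-p during training.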
$\sigma(W_\alpha \cdot \epsilon_{m,\tau})$, with $\tau = 1, \ldots, T$
$\sigma(W_\epsilon \cdot \epsilon_{m,\tau})$, with $\tau = 1, \ldots, T$