AMMI – Introduction to Deep Learning 7.2. Networks for image classification
François Fleuret https://fleuret.org/ammi-2018/ Thu Sep 6 15:26:25 CAT 2018
ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE
Image classification, standard convnets
(features): Sequential (
  (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU (inplace)
  (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (3): ReLU (inplace)
  (4): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
  (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (6): ReLU (inplace)
  (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (8): ReLU (inplace)
  (9): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
  (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (11): ReLU (inplace)
  (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (13): ReLU (inplace)
  (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (15): ReLU (inplace)
  (16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (17): ReLU (inplace)
  (18): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
  (19): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (20): ReLU (inplace)
  (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (22): ReLU (inplace)
  (23): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (24): ReLU (inplace)
  (25): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (26): ReLU (inplace)
  (27): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
  /.../
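To make the listing concrete, here is a minimal sketch in plain Python that traces the spatial size and counts the convolution parameters (weights and biases) of the layers shown, up to the elided part. The 224 × 224 input size and the names `LISTED_LAYERS`, `conv_output_size`, and `trace` are assumptions made for this illustration; they do not appear in the slides.

```python
# Layers of the listing: (in_channels, out_channels) stands for a 3x3
# convolution with stride 1 and padding 1, "M" for a 2x2 max-pooling
# with stride 2. ReLU layers change neither sizes nor parameter counts.
LISTED_LAYERS = [
    (3, 64), (64, 64), "M",
    (64, 128), (128, 128), "M",
    (128, 256), (256, 256), (256, 256), (256, 256), "M",
    (256, 512), (512, 512), (512, 512), (512, 512), "M",
]


def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace(layers, input_size=224):
    """Return the final spatial size and the number of conv parameters."""
    size, n_params = input_size, 0
    for layer in layers:
        if layer == "M":
            size = conv_output_size(size, kernel=2, stride=2, padding=0)
        else:
            c_in, c_out = layer
            size = conv_output_size(size, kernel=3, stride=1, padding=1)
            n_params += c_in * c_out * 3 * 3 + c_out  # weights + biases
    return size, n_params


final_size, n_params = trace(LISTED_LAYERS)
print(final_size, n_params)
```

Under these assumptions, the 3 × 3 convolutions with padding 1 keep the spatial size unchanged, so only the four max-pooling layers shrink the map, each by a factor of two.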
12.26 Weimaraner
10.95 Chesapeake Bay retriever
10.87 Labrador retriever
10.10 Staffordshire bullterrier, Staffordshire bull terrier
9.55 flat-coated retriever
9.40 Italian greyhound
9.31 American Staffordshire terrier, Staffordshire terrier, American pit bull terrier, pit bull terrier
9.12 Great Dane
8.94 German short-haired pointer
8.53 Doberman, Doberman pinscher
8.35 Rottweiler
8.25 kelpie
8.24 barrow, garden cart, lawn cart, wheelbarrow
8.12 bucket, pail
8.07 soccer ball
[Figure: an activation tensor x(l) of size H × W × C is reshaped into a vector of dimension HWC, then mapped to x(l+1) by weights w(l+1) and to x(l+2) by weights w(l+2). The later panels show the same layers x(l), w(l+1), x(l+1), w(l+2), x(l+2) without the reshape.]
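One reading of the diagram above is that a reshape of an H × W × C activation followed by fully connected weights w(l+1) can be computed as a convolution whose kernel covers the whole H × W map. The sketch below checks this equivalence in plain Python on small random integer tensors. The helper names (`fully_connected`, `conv2d_valid`, `random_tensor`) and the sizes are hypothetical choices for this illustration.

```python
import random


def fully_connected(x_flat, weight):
    """weight has one row of length C*H*W per output unit."""
    return [sum(w * v for w, v in zip(row, x_flat)) for row in weight]


def conv2d_valid(x, kernels):
    """x is [C][H][W], kernels is [D][C][h][w]; stride 1, no padding."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    h, w = len(kernels[0][0]), len(kernels[0][0][0])
    return [[[sum(k[c][i][j] * x[c][r + i][s + j]
                  for c in range(C) for i in range(h) for j in range(w))
              for s in range(W - w + 1)]
             for r in range(H - h + 1)]
            for k in kernels]


def random_tensor(rng, C, H, W):
    return [[[rng.randint(-3, 3) for _ in range(W)]
             for _ in range(H)] for _ in range(C)]


rng = random.Random(0)
C, H, W, D = 2, 3, 3, 4
x = random_tensor(rng, C, H, W)
weight = [[rng.randint(-3, 3) for _ in range(C * H * W)] for _ in range(D)]

# Reshape: flatten x in (channel, row, column) order, then apply the layer.
x_flat = [v for channel in x for row in channel for v in row]
fc_out = fully_connected(x_flat, weight)

# The same weights viewed as D kernels of size C x H x W.
kernels = [[[row[c * H * W + i * W: c * H * W + (i + 1) * W]
             for i in range(H)] for c in range(C)] for row in weight]
conv_out = conv2d_valid(x, kernels)  # shape D x 1 x 1
assert [m[0][0] for m in conv_out] == fc_out

# On a larger input the convolution yields one such output per position.
big_out = conv2d_valid(random_tensor(rng, C, 4, 5), kernels)
print(len(big_out), len(big_out[0]), len(big_out[0][0]))  # D, 4-3+1, 5-3+1
```

Integer values are used so that the two results can be compared exactly; with floating-point weights the comparison would need a tolerance.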
[Figure: network diagram with the labels Input image, Conv layers, Max-pooling, FC layers, 1000d. The later panels show two such diagrams side by side.]
(a) Inception module, naïve version: Previous layer → 1x1 convolutions, 3x3 convolutions, 5x5 convolutions, 3x3 max pooling (in parallel) → Filter concatenation
(b) Inception module with dimension reductions: as in (a), with 1x1 convolutions added before the 3x3 and 5x5 convolutions and after the 3x3 max pooling
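A rough sketch of why the dimension reductions in (b) matter: the code below counts convolution weights (biases ignored) for both module variants. All channel counts are assumptions chosen for illustration and are not taken from the slides, and `conv_weights` is a hypothetical helper.

```python
def conv_weights(c_in, c_out, k):
    """Number of weights of a k x k convolution, biases ignored."""
    return c_in * c_out * k * k


# Assumed channel counts, for illustration only.
c_in = 192                        # channels of the previous layer
n1, n3, n5 = 64, 128, 32          # outputs of the 1x1, 3x3, 5x5 branches
r3, r5, pool_proj = 96, 16, 32    # widths of the added 1x1 convolutions

# (a) naive version: every branch reads all c_in channels directly,
# and the pooling branch passes all c_in channels through.
naive = (conv_weights(c_in, n1, 1)
         + conv_weights(c_in, n3, 3)
         + conv_weights(c_in, n5, 5))
naive_out_channels = n1 + n3 + n5 + c_in

# (b) with dimension reductions: 1x1 convolutions shrink the input of
# the 3x3 and 5x5 branches and the output of the pooling branch.
reduced = (conv_weights(c_in, n1, 1)
           + conv_weights(c_in, r3, 1) + conv_weights(r3, n3, 3)
           + conv_weights(c_in, r5, 1) + conv_weights(r5, n5, 5)
           + conv_weights(c_in, pool_proj, 1))
reduced_out_channels = n1 + n3 + n5 + pool_proj

print(naive, naive_out_channels)
print(reduced, reduced_out_channels)
```

With these assumed widths, the reduced module uses fewer than half the weights of the naïve one and also emits fewer channels, so the next module starts from a smaller input.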
[Figure: GoogLeNet architecture graph. Stem: input, Conv 7x7+2(S), MaxPool 3x3+2(S), LocalRespNorm, Conv 1x1+1(V), Conv 3x3+1(S), LocalRespNorm, MaxPool 3x3+2(S). Then nine inception modules, each ending in a DepthConcat over Conv 1x1+1(S), Conv 3x3+1(S), Conv 5x5+1(S), and MaxPool 3x3+1(S) branches, with MaxPool 3x3+2(S) between groups of modules. Main output: AveragePool 7x7+1(V), FC, SoftmaxActivation softmax2. Two auxiliary outputs, each: AveragePool 5x5+3(V), Conv 1x1+1(S), FC, FC, SoftmaxActivation (softmax0 and softmax1).]
... → Conv 3×3, 64 → 64 → BN → ReLU → Conv 3×3, 64 → 64 → BN → + → ReLU → ...
(64-channel input and output; the block input is added at +)
... → Conv 3×3, 256 → 256 → BN → ReLU → Conv 3×3, 256 → 256 → BN → + → ReLU → ...
(256-channel input and output; the block input is added at +)
... → Conv 1×1, 256 → 64 → BN → ReLU → Conv 3×3, 64 → 64 → BN → ReLU → Conv 1×1, 64 → 256 → BN → + → ReLU → ...
(256-channel input and output; the block input is added at +)
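The two 256-channel blocks above can be compared by counting their convolution weights. This is a minimal sketch that ignores biases and batch-normalization parameters; `conv_weights` is a hypothetical helper introduced for the illustration.

```python
def conv_weights(c_in, c_out, k):
    """Number of weights of a k x k convolution, biases ignored."""
    return c_in * c_out * k * k


# Block with two 3x3 convolutions, 256 -> 256 each.
basic = conv_weights(256, 256, 3) + conv_weights(256, 256, 3)

# Bottleneck block: 1x1 (256 -> 64), 3x3 (64 -> 64), 1x1 (64 -> 256).
bottleneck = (conv_weights(256, 64, 1)
              + conv_weights(64, 64, 3)
              + conv_weights(64, 256, 1))

print(basic, bottleneck, round(basic / bottleneck, 1))
```

Both blocks map 256 channels to 256 channels, but the bottleneck does its 3 × 3 convolution on only 64 channels, which is where the saving comes from.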
Layer name (output size), with the building block for the 18-, 34-, 50-, 101-, and 152-layer networks:

conv1 (112×112): 7×7, 64, stride 2
conv2_x (56×56): 3×3 max pool, stride 2, then
  18-layer, 34-layer: [3×3, 64; 3×3, 64]
  50-layer: [1×1, 64; 3×3, 64; 1×1, 256] ×3
  101-layer: [1×1, 64; 3×3, 64; 1×1, 256] ×3
  152-layer: [1×1, 64; 3×3, 64; 1×1, 256] ×3
conv3_x (28×28):
  18-layer, 34-layer: [3×3, 128; 3×3, 128]
  50-layer: [1×1, 128; 3×3, 128; 1×1, 512] ×4
  101-layer: [1×1, 128; 3×3, 128; 1×1, 512] ×4
  152-layer: [1×1, 128; 3×3, 128; 1×1, 512] ×8
conv4_x (14×14):
  18-layer, 34-layer: [3×3, 256; 3×3, 256]
  50-layer: [1×1, 256; 3×3, 256; 1×1, 1024] ×6
  101-layer: [1×1, 256; 3×3, 256; 1×1, 1024] ×23
  152-layer: [1×1, 256; 3×3, 256; 1×1, 1024] ×36
conv5_x (7×7):
  18-layer, 34-layer: [3×3, 512; 3×3, 512]
  50-layer: [1×1, 512; 3×3, 512; 1×1, 2048] ×3
  101-layer: [1×1, 512; 3×3, 512; 1×1, 2048] ×3
  152-layer: [1×1, 512; 3×3, 512; 1×1, 2048] ×3
(1×1): average pool, 1000-d fc, softmax
FLOPs: 1.8×10^9 (18-layer), 3.6×10^9 (34-layer), 3.8×10^9 (50-layer), 7.6×10^9 (101-layer), 11.3×10^9 (152-layer)

Table 1. Architectures for ImageNet. Building blocks are shown in brackets (see also Fig. 5), with the numbers of blocks stacked. Downsampling is performed by conv3_1, conv4_1, and conv5_1 with a stride of 2.
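As a quick check on Table 1, the depth in each column name can be recovered from the block counts. The sketch below assumes that the depth counts conv1, every convolution inside the bottleneck blocks, and the final fc layer; `BLOCKS` and `weighted_layers` are names made up for this illustration. Only the 50-, 101-, and 152-layer columns are used.

```python
# Number of bottleneck blocks in conv2_x .. conv5_x, read from Table 1.
BLOCKS = {
    50: (3, 4, 6, 3),
    101: (3, 4, 23, 3),
    152: (3, 8, 36, 3),
}
CONVS_PER_BOTTLENECK = 3  # 1x1, 3x3, 1x1


def weighted_layers(blocks):
    """conv1 + all convolutions in the blocks + the final fc layer."""
    return 1 + CONVS_PER_BOTTLENECK * sum(blocks) + 1


depths = {name: weighted_layers(blocks) for name, blocks in BLOCKS.items()}
print(depths)
```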
method                        top-5 err. (test)
VGG [41] (ILSVRC'14)          7.32
GoogLeNet [44] (ILSVRC'14)    6.66
VGG [41] (v5)                 6.8
PReLU-net [13]                4.94
BN-inception [16]             4.82
ResNet (ILSVRC'15)            3.57

Table 5. Error rates (%) of ensembles. The top-5 error is on the test set of ImageNet and reported by the test server.
... → [parallel branches, each: Conv 1×1, 256 → 4 → BN → ReLU → Conv 3×3, 4 → 4 → BN → ReLU → Conv 1×1, 4 → 256 → BN] → + → ReLU → ...
(256-channel input and output; the branch outputs and the block input are summed at +)
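The aggregated block above can be compared with a single 256 → 64 → 64 → 256 bottleneck by counting convolution weights. The number of parallel branches is not recoverable from the slide text, so the value 32 below is an assumption made for this illustration, as is the helper `conv_weights`. Biases and batch-normalization parameters are ignored.

```python
def conv_weights(c_in, c_out, k):
    """Number of weights of a k x k convolution, biases ignored."""
    return c_in * c_out * k * k


N_BRANCHES = 32  # assumed; the slide text does not state the branch count

# One branch: 1x1 (256 -> 4), 3x3 (4 -> 4), 1x1 (4 -> 256).
per_branch = (conv_weights(256, 4, 1)
              + conv_weights(4, 4, 3)
              + conv_weights(4, 256, 1))
aggregated = N_BRANCHES * per_branch

# Single bottleneck for comparison: 1x1 (256 -> 64), 3x3 (64 -> 64),
# 1x1 (64 -> 256).
bottleneck = (conv_weights(256, 64, 1)
              + conv_weights(64, 64, 3)
              + conv_weights(64, 256, 1))

print(per_branch, aggregated, bottleneck)
```

Under this assumption the two designs have nearly the same number of weights, so the comparison is between arrangements of a fixed budget rather than between model sizes.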
LeNet5 (LeCun et al., 1989)
LSTM (Hochreiter and Schmidhuber, 1997)
Highway Net (Srivastava et al., 2015): No recurrence
Deep hierarchical CNN (Ciresan et al., 2012): Bigger + GPU
AlexNet (Krizhevsky et al., 2012): Bigger + ReLU + dropout
Overfeat (Sermanet et al., 2013): Fully convolutional
VGG (Simonyan and Zisserman, 2014): Bigger + small filters
Net in Net (Lin et al., 2013): MLPConv
GoogLeNet (Szegedy et al., 2015): Inception modules
ResNet (He et al., 2015): No gating
BN-Inception (Ioffe and Szegedy, 2015): Batch Normalization
Inception-ResNet (Szegedy et al., 2016)
ResNeXt (Xie et al., 2016): Aggregated channels
DenseNet (Huang et al., 2016): Dense pass-through
Wide ResNet (Zagoruyko and Komodakis, 2016): Wider