CS480/680 Machine Learning Lecture 20: Convolutional Neural Network
Zahra Sheikhbahaee March 29, 2020
University of Waterloo CS480/680 Winter 2020 Zahra Sheikhbahaee 1
Outline
◮ Convolution
◮ Zero Padding
◮ Stride
◮ Weight Sharing
◮ Pooling
Starry Night by Vincent van Gogh, 1889. The Scream (Der Schrei) by Edvard Munch, 1893.
An edge is where the pixel intensity changes in a noticeable way. A good way to capture such changes is the Sobel operator:

Gx =
  1  0  −1
  2  0  −2
  1  0  −1

and Gy = Gxᵀ. The gradient magnitude is G = √(Gx² + Gy²).
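The Sobel filters above can be applied with an ordinary 2-D convolution. Below is a minimal NumPy sketch (the helper `convolve2d_valid` and the test image are illustrative, not from the slides) that detects the single vertical edge in a half-dark, half-bright image:

```python
import numpy as np

# Sobel kernels as on the slide: Gy is the transpose of Gx.
Gx = np.array([[1, 0, -1],
               [2, 0, -2],
               [1, 0, -1]])
Gy = Gx.T

def convolve2d_valid(img, kernel):
    """Plain 'valid' 2-D correlation (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark left half, bright right half -> one vertical edge.
img = np.zeros((8, 8))
img[:, 4:] = 1.0

gx = convolve2d_valid(img, Gx)
gy = convolve2d_valid(img, Gy)
magnitude = np.sqrt(gx**2 + gy**2)   # G = sqrt(Gx^2 + Gy^2)
```

Here `gy` is zero everywhere (all rows are identical), while `magnitude` peaks along the columns where the intensity jumps.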
◮ The input is a 32 × 32 grayscale image which passes through the first convolutional layer.
◮ Layer C1 has 6 feature maps (filters) with a 5 × 5 filter size, a stride of one and no padding. The image dimensions change from 32 × 32 × 1 to 28 × 28 × 6. The layer has 156 trainable parameters.
◮ Layer S2 is a subsampling layer with 6 feature maps. It has a filter size of 2 × 2 and a stride of s = 2. The image dimensions are reduced to 14 × 14 × 6. Layer S2 has only 12 trainable parameters.
◮ Layer C3 is a convolutional layer with 16 feature maps. The filter size is 5 × 5 with a stride of 1, and it has 1516 trainable parameters.
◮ The S4 layer is an average pooling layer with filter size 2 × 2 and a stride of 2. This layer has 16 feature maps with 32 parameters, and its output is reduced to 5 × 5 × 16.
◮ The fifth layer C5 is a fully connected convolutional layer with 120 feature maps, each of size 1 × 1. Each of the 120 units in C5 is connected to all the 5 × 5 × 16 nodes in the S4 layer.
◮ The sixth layer is a fully connected layer F6 with 84 units; the output layer uses Euclidean radial basis function (RBF) units instead of a softmax.
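The shape and parameter arithmetic above can be checked with the standard output-size formula, ⌊(n + 2p − f)/s⌋ + 1. A small sketch (the helper name `conv_out` is mine) walking through LeNet-5:

```python
def conv_out(n, f, s=1, p=0):
    """Spatial output size for an n x n input, f x f filter, stride s, padding p."""
    return (n + 2 * p - f) // s + 1

# Spatial sizes through LeNet-5:
c1 = conv_out(32, 5)        # 28  (32x32x1 -> 28x28x6)
s2 = conv_out(c1, 2, s=2)   # 14  (-> 14x14x6)
c3 = conv_out(s2, 5)        # 10  (-> 10x10x16)
s4 = conv_out(c3, 2, s=2)   # 5   (-> 5x5x16)

# Parameter counts via maps * (f*f*in_channels + 1):
c1_params = 6 * (5 * 5 * 1 + 1)      # 156, matching the slide
s2_params = 6 * 2                    # one coefficient + one bias per map = 12
c5_params = 120 * (5 * 5 * 16 + 1)   # 48120: C5 fully connected to S4
```

Note that C3's 1516 parameters do not follow from this simple formula (full connectivity would give 16 × (5·5·6 + 1) = 2416): LeNet-5 connects each C3 map to only a subset of the S2 maps.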
◮ It contains 5 convolutional layers and 3 fully connected layers.
◮ The first convolutional layer filters the 227 × 227 × 3 input image with 96 kernels of size 11 × 11 × 3 with a stride of s = 4 pixels.
◮ The second convolutional layer takes as input the (response-normalized and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5 × 5 × 48.
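The same output-size formula applies to AlexNet's first layer; a quick check (helper name `conv_out` is mine) of the resulting 55 × 55 × 96 volume and its parameter count:

```python
def conv_out(n, f, s=1, p=0):
    """Spatial output size: floor((n + 2p - f)/s) + 1."""
    return (n + 2 * p - f) // s + 1

# First AlexNet conv: 227x227x3 input, 96 kernels of 11x11x3, stride 4.
spatial = conv_out(227, 11, s=4)     # 55 -> output volume 55x55x96
params = 96 * (11 * 11 * 3 + 1)      # weights + one bias per kernel = 34944
```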
◮ Training deep neural networks with gradient-based optimizers can cause vanishing and exploding gradients during backpropagation.
◮ The degradation problem: as the network depth increases, accuracy saturates and then degrades rapidly. Adding more layers to a suitably deep model leads to higher training error.
◮ A residual network is a solution to these problems. Such networks are easier to optimize, and can gain accuracy from considerably increased depth.
◮ The shortcut connections simply perform identity mapping, and their outputs are added to the outputs of the stacked layers.
x_l and x_{l+1}: input and output of the l-th unit
F: a residual function; h(x_l) = x_l: an identity mapping

y_l = h(x_l) + F(x_l, W_l),    x_{l+1} = f(y_l)

If f is also an identity mapping, x_{l+1} = y_l, so we have

x_{l+1} = x_l + F(x_l, W_l)

and, unrolling over units,

x_L = x_l + Σ_{i=l}^{L−1} F(x_i, W_i)

∂ε/∂x_l = (∂ε/∂x_L)(∂x_L/∂x_l) = (∂ε/∂x_L) (1 + ∂/∂x_l Σ_{i=l}^{L−1} F(x_i, W_i))
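The identity x_{l+1} = x_l + F(x_l, W_l) is easy to see numerically. A minimal NumPy sketch (the one-layer ReLU residual branch is a toy stand-in for a real F):

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, W):
    """x_{l+1} = x_l + F(x_l, W_l), with a toy one-layer ReLU branch as F."""
    return x + np.maximum(0, x @ W)

x = rng.standard_normal(4)

# With W = 0 the residual branch vanishes and the block is an identity map,
# so a freshly stacked block cannot hurt the representation.
out_identity = residual_block(x, np.zeros((4, 4)))

# Stacking blocks gives x_L = x_l + sum_i F(x_i, W_i): the input always has a
# direct additive path to the output, which is the "1 +" term in the gradient.
out_stacked = residual_block(residual_block(x, np.zeros((4, 4))),
                             rng.standard_normal((4, 4)))
```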
◮ 152-layer model for the ImageNet competition with 3.57% top-5 error (better than human performance).
◮ Every residual block has two 3 × 3 convolutional layers.
◮ Periodically, double the number of filters and downsample spatially using stride 2, which halves each spatial dimension.
◮ There is an additional conv layer at the beginning.
◮ No fully connected (FC) layers at the end, just a global average pooling layer (only an FC-1000 layer to the output classes).
◮ GoogLeNet uses 12× fewer parameters than the winning architecture of Krizhevsky et al. 2012. The codename Inception is inspired by the famous "we need to go deeper" internet meme.
◮ Naively stacking large convolution operations is computationally expensive. In the naive inception module, with 3 different sizes of filters (1 × 1, 3 × 3, 5 × 5), the module performs convolution on an input as well as max pooling.
◮ In the naive structure, even a modest number of 5 × 5 convolutions can be prohibitively expensive on top of a convolutional layer with a large number of filters.
◮ The solution: apply dimension reductions and projections wherever the computational requirements would otherwise increase too much.
◮ 1 × 1 convolutions are used to compute reductions before the expensive 3 × 3 and 5 × 5 convolutions.
◮ 1 × 1 convolution is used with ReLU in order to introduce more non-linearity and increase the representational power of the network.
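A 1 × 1 convolution is just a per-pixel linear map across channels, so it can be sketched as a matrix product. The sizes below (256 input channels reduced to 64 before a 5 × 5 conv with 192 output maps) are illustrative, not taken from the GoogLeNet paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Feature map of shape H x W x C_in.
x = rng.standard_normal((28, 28, 256))

# 1x1 convolution + ReLU: an independent linear map at every spatial position.
W = rng.standard_normal((256, 64))       # reduce 256 channels to 64
reduced = np.maximum(0, x @ W)

# Multiply-accumulate cost of a following 5x5 conv (H * W * f * f * C_in * C_out)
# drops in proportion to the channel reduction:
cost_direct  = 28 * 28 * 5 * 5 * 256 * 192   # 5x5 conv straight on 256 channels
cost_reduced = 28 * 28 * 5 * 5 * 64 * 192    # same conv after the 1x1 reduction
```

With a 4× channel reduction, the downstream 5 × 5 convolution is 4× cheaper, which is exactly why the inception module places 1 × 1 reductions before its large filters.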
◮ Auxiliary classifiers attached to intermediate layers increase the gradient signal that gets propagated back, and provide additional regularization.