NPFL114, Lecture 5
Convolutional Neural Networks II
Milan Straka
March 30, 2020
Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics
Designing and training a neural network is not a one-shot action, but an iterative procedure. When choosing hyperparameters, it is important to verify that the model neither underfits nor overfits.
Whether the model underfits can be checked by increasing model capacity or training longer (and observing whether results improve); whether it overfits can be checked by observing the train/dev difference and by trying stronger regularization.
Specifically, this implies that:
We need to set the number of training epochs so that the training loss/performance no longer improves at the end of training.
Generally, we want to use the largest batch size that does not slow us down too much (GPUs can often process larger batches without slowing down training). However, with an increasing batch size we need to increase the learning rate, which is possible only to some extent. Also, a small batch size sometimes works as regularization (especially for the vanilla SGD algorithm).
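A minimal sketch of such an iteration in TensorFlow/Keras (the dataset, the tiny model, and the concrete values are illustrative only, not from the slides): early stopping on the development set bounds the number of epochs, and the learning rate is scaled together with the batch size.

```python
import tensorflow as tf

# A small dataset just to make the sketch runnable end to end.
(train_images, train_labels), (dev_images, dev_labels) = tf.keras.datasets.mnist.load_data()
train_images, dev_images = train_images / 255.0, dev_images / 255.0

base_batch_size, base_learning_rate = 64, 0.01
batch_size = 256  # a larger batch size often does not slow the GPU down
# Increase the learning rate together with the batch size (works only up to a point).
learning_rate = base_learning_rate * batch_size / base_batch_size

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=[28, 28]),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Train long enough for the training metrics to stop improving; early stopping on the
# dev set bounds the number of epochs, and the train/dev gap indicates overfitting.
model.fit(train_images, train_labels, batch_size=batch_size, epochs=100,
          validation_data=(dev_images, dev_labels),
          callbacks=[tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                                      restore_best_weights=True)])
```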
Convolutions can provide
local interactions in spatial/temporal dimensions,
shift invariance,
many fewer parameters than a fully connected layer.
Usually repeated 3 × 3 convolutions are enough; there is no need for larger filter sizes.
When pooling is performed, double the number of channels.
Final fully connected layers are not needed; global average pooling is usually enough.
Batch normalization is a great regularization method for CNNs, allowing removal of dropout.
Small weight decay (i.e., L2 regularization), usually 1e-4, is still useful for regularizing convolutional kernels.
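These guidelines could look roughly as follows in Keras (the channel and layer counts are illustrative, not a prescribed architecture): repeated 3 × 3 convolutions with batch normalization, channel doubling after every pooling, global average pooling instead of final fully connected layers, and a small L2 kernel regularizer.

```python
import tensorflow as tf

l2 = tf.keras.regularizers.l2(1e-4)  # small weight decay for the convolutional kernels

def conv_bn_relu(x, channels):
    # A 3x3 convolution followed by batch normalization and ReLU.
    x = tf.keras.layers.Conv2D(channels, 3, padding="same", use_bias=False,
                               kernel_regularizer=l2)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    return tf.keras.layers.ReLU()(x)

inputs = tf.keras.Input(shape=[32, 32, 3])
x, channels = inputs, 32
for _ in range(3):
    x = conv_bn_relu(x, channels)
    x = conv_bn_relu(x, channels)
    x = tf.keras.layers.MaxPooling2D()(x)
    channels *= 2  # double the channels whenever pooling halves the resolution
x = tf.keras.layers.GlobalAveragePooling2D()(x)  # instead of large fully connected layers
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)
```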
Figure 1 of paper "Deep Residual Learning for Image Recognition", https://arxiv.org/abs/1512.03385.
Figure 2 of paper "Deep Residual Learning for Image Recognition", https://arxiv.org/abs/1512.03385.
Figure 5 of paper "Deep Residual Learning for Image Recognition", https://arxiv.org/abs/1512.03385.
Table 1 of paper "Deep Residual Learning for Image Recognition", https://arxiv.org/abs/1512.03385.
Figure 3 of paper "Deep Residual Learning for Image Recognition", https://arxiv.org/abs/1512.03385.
The residual connections cannot be applied directly when the number of channels increases. The authors considered several alternatives and chose the one where, in case of a channel increase, a 1 × 1 convolution is used on the shortcut to project it to the required number of channels.
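A sketch of a basic residual block with such a projection shortcut (illustrative Keras code, not the authors' implementation): when the number of channels or the resolution changes, a 1 × 1 convolution projects the shortcut to the required shape.

```python
import tensorflow as tf

def residual_block(x, channels, stride=1):
    # Two 3x3 convolutions with batch normalization; the shortcut is an identity
    # unless the number of channels (or the resolution) changes.
    y = tf.keras.layers.Conv2D(channels, 3, strides=stride, padding="same", use_bias=False)(x)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(channels, 3, padding="same", use_bias=False)(y)
    y = tf.keras.layers.BatchNormalization()(y)

    shortcut = x
    if stride != 1 or x.shape[-1] != channels:
        # Projection shortcut: a 1x1 convolution producing the required number of channels.
        shortcut = tf.keras.layers.Conv2D(channels, 1, strides=stride, use_bias=False)(x)
        shortcut = tf.keras.layers.BatchNormalization()(shortcut)
    return tf.keras.layers.ReLU()(tf.keras.layers.Add()([y, shortcut]))

inputs = tf.keras.Input(shape=[56, 56, 64])
outputs = residual_block(inputs, channels=128, stride=2)  # channels increase, projection used
```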
Figure 4 of paper "Deep Residual Learning for Image Recognition", https://arxiv.org/abs/1512.03385.
Figure 1 of paper "Visualizing the Loss Landscape of Neural Nets", https://arxiv.org/abs/1712.09913.
Training details:
batch normalization after each convolution and before activation
SGD with batch size 256 and momentum of 0.9
learning rate starts at 0.1 and is divided by 10 when the error plateaus
no dropout, weight decay 0.0001
during testing, the 10-crop evaluation strategy is used, averaging scores across multiple scales, where the images are resized so that their shorter side is in {224, 256, 384, 480, 640}
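The optimization setup could be expressed in Keras for example as follows (the patience value is an illustrative assumption and the 10-crop multi-scale evaluation is not shown):

```python
import tensorflow as tf

# SGD with momentum 0.9; the learning rate starts at 0.1.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9)

# Weight decay 0.0001, applied as L2 regularization of the convolutional kernels.
regularizer = tf.keras.regularizers.l2(1e-4)

# Divide the learning rate by 10 whenever the monitored error stops improving
# (the patience value is an illustrative choice, not from the paper).
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.1, patience=5)

# These would then be used as model.compile(optimizer=optimizer, ...) and
# model.fit(..., callbacks=[reduce_lr]), with kernel_regularizer=regularizer
# passed to every Conv2D layer.
```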
Table 4 of paper "Deep Residual Learning for Image Recognition", https://arxiv.org/abs/1512.03385.
Table 5 of paper "Deep Residual Learning for Image Recognition", https://arxiv.org/abs/1512.03385.
The authors of ResNet published an ablation study several months after the original paper.
Figure 2 of paper "Identity Mappings in Deep Residual Networks", https://arxiv.org/abs/1603.05027
Table 1 of paper "Identity Mappings in Deep Residual Networks", https://arxiv.org/abs/1603.05027
Figure 4 of paper "Identity Mappings in Deep Residual Networks", https://arxiv.org/abs/1603.05027
Table 2 of paper "Identity Mappings in Deep Residual Networks", https://arxiv.org/abs/1603.05027
The pre-activation architecture was also evaluated on ImageNet, in a single-crop regime.
Table 5 of paper "Identity Mappings in Deep Residual Networks", https://arxiv.org/abs/1603.05027
Figure 1 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
Table 1 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
The authors do not consider bottleneck blocks. Instead, they experiment with different block types, e.g., B(1, 3, 1) or B(3, 3), where the numbers denote the kernel sizes of the convolutions in a block.
Table 2 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
Table 1 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
The authors evaluate various widening factors k.
Table 4 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
Table 1 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
The authors measure the effect of dropout inside the residual block (but not on the residual connection itself).
Table 6 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
Figure 3 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
Dataset Results
CIFAR
Table 5 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
ImageNet
Table 8 of paper "Wide Residual Networks", https://arxiv.org/abs/1605.07146
Figure 2 of paper "Densely Connected Convolutional Networks", https://arxiv.org/abs/1608.06993
Figure 1 of paper "Densely Connected Convolutional Networks", https://arxiv.org/abs/1608.06993
The initial convolution generates 64 channels, each 1 × 1 convolution in a dense block generates 4k = 128 channels, each 3 × 3 convolution in a dense block generates k = 32 channels, and the transition layer halves the number of channels of its input.
Table 1 of paper "Densely Connected Convolutional Networks", https://arxiv.org/abs/1608.06993
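A sketch of one dense block with bottleneck layers and of a transition layer, following the channel counts above (illustrative Keras code with growth rate 32, not the reference implementation):

```python
import tensorflow as tf

GROWTH_RATE = 32  # each dense layer adds this many channels

def dense_layer(x):
    # Bottleneck: a 1x1 convolution producing 4 * GROWTH_RATE = 128 channels,
    # followed by a 3x3 convolution producing GROWTH_RATE = 32 channels.
    y = tf.keras.layers.BatchNormalization()(x)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(4 * GROWTH_RATE, 1, use_bias=False)(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(GROWTH_RATE, 3, padding="same", use_bias=False)(y)
    # The newly computed channels are concatenated with all preceding ones.
    return tf.keras.layers.Concatenate()([x, y])

def transition_layer(x):
    # A 1x1 convolution halving the number of channels, followed by average pooling.
    y = tf.keras.layers.BatchNormalization()(x)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(x.shape[-1] // 2, 1, use_bias=False)(y)
    return tf.keras.layers.AveragePooling2D()(y)

inputs = tf.keras.Input(shape=[56, 56, 64])  # e.g., after the initial convolution and pooling
x = inputs
for _ in range(6):  # a dense block with 6 layers
    x = dense_layer(x)
x = transition_layer(x)
```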
Table 2 of paper "Densely Connected Convolutional Networks", https://arxiv.org/abs/1608.06993
Figure 3 of paper "Densely Connected Convolutional Networks", https://arxiv.org/abs/1608.06993
Figure 1 of paper "Deep Pyramidal Residual Networks", https://arxiv.org/abs/1610.02915
Figure 2 of paper "Deep Pyramidal Residual Networks", https://arxiv.org/abs/1610.02915
In the architectures up until now, the number of filters doubled whenever the spatial resolution was halved. Such exponential growth would suggest a gradual widening rule $D_k = \lfloor D_{k-1} \cdot \alpha^{1/N} \rfloor$. However, the authors employ a linear widening rule $D_k = \lfloor D_{k-1} + \alpha/N \rfloor$, where $D_k$ is the number of filters in the $k$-th out of $N$ convolutional blocks and $\alpha$ is the number of filters to add in total.
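A small helper computing the channel counts under the linear widening rule (the concrete values 16, 48 and 12 are illustrative; $D_0$ denotes the number of channels of the initial convolution):

```python
import math

def pyramidnet_channels(d_0, alpha, n):
    """Linear widening rule D_k = floor(D_{k-1} + alpha / N) for k = 1..N."""
    channels, d = [], d_0
    for _ in range(n):
        d = math.floor(d + alpha / n)
        channels.append(d)
    return channels

# With 16 initial channels, alpha = 48 channels added in total and N = 12 blocks,
# every block adds 4 channels: [20, 24, 28, ..., 64].
print(pyramidnet_channels(16, 48, 12))
```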
Because the number of channels increases in every block, no residual connection can be a real identity. The authors propose to zero-pad the missing channels, where the zero-padded channels correspond to newly computed features.
Figure 5 of paper "Deep Pyramidal Residual Networks", https://arxiv.org/abs/1610.02915
Table 4 of paper "Deep Pyramidal Residual Networks", https://arxiv.org/abs/1610.02915
Table 1 of paper "Deep Pyramidal Residual Networks", https://arxiv.org/abs/1610.02915
Table 5 of paper "Deep Pyramidal Residual Networks", https://arxiv.org/abs/1610.02915
Figure 1 of paper "Aggregated Residual Transformations for Deep Neural Networks", https://arxiv.org/abs/1611.05431
Table 1 of paper "Aggregated Residual Transformations for Deep Neural Networks", https://arxiv.org/abs/1611.05431
Figure 5 of paper "Aggregated Residual Transformations for Deep Neural Networks", https://arxiv.org/abs/1611.05431
Table 3 of paper "Aggregated Residual Transformations for Deep Neural Networks", https://arxiv.org/abs/1611.05431
Table 4 of paper "Aggregated Residual Transformations for Deep Neural Networks", https://arxiv.org/abs/1611.05431
Table 5 of paper "Aggregated Residual Transformations for Deep Neural Networks", https://arxiv.org/abs/1611.05431
Figure 2 of paper "Deep Networks with Stochastic Depth", https://arxiv.org/abs/1603.09382
We drop a whole block (but not the residual connection) with probability $1 - p_l$. During inference, we multiply the block output by $p_l$ to compensate. All $p_l$ can be set to a constant, but it is more effective to use a simple linear decay $p_l = 1 - \frac{l}{L}(1 - p_L)$, where $p_L$ is the probability of the last layer, motivated by the intuition that the initial blocks extract low-level features utilized by the later layers and should therefore be present.
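A sketch of a layer implementing stochastic depth (illustrative Keras code): during training the whole residual branch is dropped with probability $1 - p_l$, and during inference its output is multiplied by $p_l$.

```python
import tensorflow as tf

class StochasticDepth(tf.keras.layers.Layer):
    """Adds a residual branch that survives with probability p_l."""
    def __init__(self, p_l, **kwargs):
        super().__init__(**kwargs)
        self.p_l = p_l

    def call(self, inputs, training=False):
        shortcut, residual = inputs
        if training:
            # Drop the whole residual branch of this block with probability 1 - p_l.
            keep = tf.cast(tf.random.uniform([]) < self.p_l, residual.dtype)
            return shortcut + keep * residual
        # During inference, compensate by multiplying the branch output by p_l.
        return shortcut + self.p_l * residual

# Linear decay of the survival probabilities: p_l = 1 - l/L * (1 - p_L).
L, p_L = 54, 0.5
p = [1 - l / L * (1 - p_L) for l in range(1, L + 1)]
# Usage inside a model: x = StochasticDepth(p[l - 1])([shortcut, residual_branch])
```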
Figure 8 of paper "Deep Networks with Stochastic Depth", https://arxiv.org/abs/1603.09382
According to the ablation experiments, linear decay with $p_L = 0.5$ was selected.
Figure 3 of paper "Deep Networks with Stochastic Depth", https://arxiv.org/abs/1603.09382
Figure 1 of paper "Improved Regularization of Convolutional Neural Networks with Cutout", https://arxiv.org/abs/1708.04552
Drop a 16 × 16 square in the input image, with a randomly chosen center. The dropped pixels are replaced by their mean value from the dataset.
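Cutout can be implemented as a simple NumPy augmentation (illustrative sketch; the dataset mean used below is a made-up value):

```python
import numpy as np

def cutout(image, dataset_mean, size=16):
    """Replace a size x size square with a randomly chosen center by the dataset mean."""
    image = image.copy()
    height, width = image.shape[:2]
    # The center is chosen uniformly; the square is clipped at the image borders.
    cy, cx = np.random.randint(height), np.random.randint(width)
    y0, y1 = max(0, cy - size // 2), min(height, cy + size // 2)
    x0, x1 = max(0, cx - size // 2), min(width, cx + size // 2)
    image[y0:y1, x0:x1] = dataset_mean
    return image

# Example: a CIFAR-like image with an illustrative per-channel training-set mean.
image = np.random.rand(32, 32, 3).astype(np.float32)
augmented = cutout(image, dataset_mean=np.array([0.49, 0.48, 0.45], dtype=np.float32))
```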