SLIDE 1 Deep Learning: Theory and Practice
02-04-2020
Deep Learning - Practical Considerations
deeplearning.cce2020@gmail.com
SLIDE 2
Deep Networks Intuition
Neural networks with multiple hidden layers - Deep networks [Hinton, 2006]
SLIDE 3
Neural networks with multiple hidden layers - Deep networks
Deep Networks Intuition
SLIDE 4
Neural networks with multiple hidden layers - Deep networks
Deep networks perform hierarchical data abstraction, which enables the non-linear separation of complex data samples.
Deep Networks Intuition
SLIDE 5
Need for Depth
SLIDE 6
Need for Depth
SLIDE 7 Deep Networks
- Are these networks trainable?
- Advances in computation and processing:
  - Graphics processing units (GPUs) performing multiple parallel multiply-accumulate operations.
- Large amounts of labeled (supervised) training data.
SLIDE 8 Deep Networks
- Will deep networks generalize?
- DNNs are quite data-hungry, and performance improves as the training data grows.
- The generalization problem is tackled by providing training data from all possible conditions.
- Many artificial data augmentation methods have been successfully deployed (see the sketch below).
- Providing state-of-the-art performance in several real-world applications.
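A minimal data-augmentation sketch, assuming torchvision is available; the specific transforms and their parameters are illustrative choices, not taken from the slides.

```python
import torchvision.transforms as T

# Illustrative augmentation pipeline: each training image is randomly
# perturbed, artificially enlarging the training distribution.
augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),     # mirror images at random
    T.RandomRotation(degrees=10),      # small random rotations
    T.ColorJitter(brightness=0.2),     # vary lighting conditions
    T.ToTensor(),
])
# augmented = augment(pil_image)  # applied to each training image (PIL image assumed)
```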
SLIDE 9 Representation Learning in Deep Networks
- The input data representation is one of the most important components of any machine learning system.
[Figure: the same two-class data plotted in Cartesian coordinates and in polar coordinates]
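A minimal sketch of the Cartesian-vs-polar example above: two concentric rings are not linearly separable in (x, y), but a single radius threshold separates them in polar coordinates. The sizes and radii are illustrative assumptions.

```python
import numpy as np

theta = np.random.uniform(0, 2 * np.pi, 200)
r = np.concatenate([np.full(100, 1.0), np.full(100, 3.0)])  # two classes: inner/outer ring
x, y = r * np.cos(theta), r * np.sin(theta)   # Cartesian: classes are entangled
radius = np.sqrt(x ** 2 + y ** 2)             # polar: one feature suffices
labels = radius > 2.0                         # a linear threshold now separates the classes
```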
SLIDE 10 Representation Learning in Deep Networks
- The input data representation is one of the most important components of any machine learning system.
- Extract factors that enable classification while suppressing factors that are susceptible to noise.
- Finding the right representation for real-world applications is substantially challenging.
- Deep learning solution: build complex representations from simpler representations.
- The dependencies between these hierarchical representations are refined by the training target.
SLIDE 11
Underfit
SLIDE 12
Overfit
SLIDE 13
Avoiding Overfitting in Practice
SLIDE 14 Weight Decay Regularization
[Figure: model fits with regularization strength 0, 40, and 4000]
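A minimal sketch of L2 weight decay in PyTorch; the model size, batch size, and decay strength are illustrative assumptions.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
criterion = nn.MSELoss()
weight_decay = 1e-2  # regularization strength (lambda)

# Option 1: add the L2 penalty to the loss explicitly.
x, y = torch.randn(32, 10), torch.randn(32, 1)
l2_penalty = sum((p ** 2).sum() for p in model.parameters())
loss = criterion(model(x), y) + weight_decay * l2_penalty

# Option 2 (equivalent in effect): let the optimizer apply the decay each step.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=weight_decay)
```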
SLIDE 15 Early Stopping
The most popular regularization strategy in practice.
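A minimal early-stopping sketch: training halts once validation loss stops improving. The helpers train_one_epoch and evaluate, the loaders, and the patience value are hypothetical placeholders, not part of the slides.

```python
import torch

best_val, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(100):
    train_one_epoch(model, train_loader)      # hypothetical training helper
    val_loss = evaluate(model, val_loader)    # hypothetical validation helper
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best.pt")  # keep the best weights so far
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # no improvement for `patience` epochs: stop training
```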
SLIDE 16
Batch Normalization
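A minimal batch-normalization sketch: each feature is normalized using the mini-batch mean and variance, then rescaled by learnable parameters. The batch and feature sizes are illustrative.

```python
import torch

x = torch.randn(32, 64)                        # mini-batch of 32, 64 features
mean = x.mean(dim=0)                           # per-feature batch mean
var = x.var(dim=0, unbiased=False)             # per-feature batch variance
x_hat = (x - mean) / torch.sqrt(var + 1e-5)    # zero mean, unit variance
gamma, beta = torch.ones(64), torch.zeros(64)  # learnable scale and shift
y = gamma * x_hat + beta

# Equivalent built-in layer:
bn = torch.nn.BatchNorm1d(64)
```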
SLIDE 17
Effect of Batch Normalization
SLIDE 18
Dropout Strategy in Neural Network Training
SLIDE 19
Dropout in Neural Networks
SLIDE 20
Dropout in Training and Test
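A minimal sketch of (inverted) dropout, showing the different behavior at training and test time; the drop probability and tensor sizes are illustrative choices.

```python
import torch

def dropout(x, p, training):
    if not training or p == 0.0:
        return x                        # test time: use all units unchanged
    mask = (torch.rand_like(x) > p).float()
    return x * mask / (1.0 - p)         # scale up so the expected activation is preserved

h = torch.randn(4, 8)
h_train = dropout(h, 0.5, training=True)   # random units zeroed, survivors scaled by 1/(1-p)
h_test = dropout(h, 0.5, training=False)   # identity at test time
```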
SLIDE 21
Dropout Application
SLIDE 22
Effect of Dropout
SLIDE 23
Convolutional Neural Networks
SLIDE 24
Other Architectures - Convolution Operation
Weight sharing
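A minimal 2-D convolution sketch illustrating weight sharing: the same 3x3 kernel slides over every spatial position of the input. The tensor sizes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

image = torch.randn(1, 1, 8, 8)      # (batch, channels, height, width)
kernel = torch.randn(1, 1, 3, 3)     # one 3x3 filter, shared across all positions
feature_map = F.conv2d(image, kernel, padding=1)
print(feature_map.shape)             # torch.Size([1, 1, 8, 8])
```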
SLIDE 25
Max Pooling Operation
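A minimal max-pooling sketch: each 2x2 window is reduced to its maximum value, halving the spatial resolution; the sizes are illustrative.

```python
import torch
import torch.nn.functional as F

feature_map = torch.randn(1, 1, 8, 8)
pooled = F.max_pool2d(feature_map, kernel_size=2, stride=2)
print(pooled.shape)   # torch.Size([1, 1, 4, 4])
```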
SLIDE 26
Convolutional Neural Networks
Multiple levels of filtering and subsampling operations. Feature maps are generated at every layer.
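A minimal sketch of this filter-then-subsample hierarchy; the channel counts, the 28x28 input assumption, and the 10-class output are illustrative, not taken from the slides.

```python
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                 # first level: filtering + subsampling
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                 # second level: feature maps shrink again
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),       # classifier over final feature maps (28x28 input assumed)
)
```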
SLIDE 27
SLIDE 28
Back Propagation in CNNs
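A minimal sketch of backpropagation through a convolution using autograd: each shared weight receives one gradient, accumulated over all the positions where it was applied. The sizes and the squared-error objective are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

image = torch.randn(1, 1, 8, 8)
kernel = torch.randn(1, 1, 3, 3, requires_grad=True)
out = F.conv2d(image, kernel, padding=1)
loss = (out ** 2).mean()
loss.backward()             # gradients flow back through the convolution
print(kernel.grad.shape)    # torch.Size([1, 1, 3, 3]): one gradient per shared weight
```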