Fei-Fei Li & Andrej Karpathy Lecture 8 - 2 Feb 2015 Fei-Fei Li & Andrej Karpathy Lecture 10 - 9 Feb 2015 1
Administrative
- In-class midterm this Wednesday! (More on this in a bit)
- Assignment #3: out Wed
- Sample Midterm will be up in a few hours
covered papers, but takeaways presented in class are fair game.
What it does include:
during lectures)
Suppose we stack three CONV layers with receptive field size 3x3.
Q: What region of the input does each neuron in the 3rd CONV layer see?
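One way to work out the answer is to grow the receptive field one layer at a time. The sketch below assumes stride 1 in every layer (the slide does not state the stride), and `receptive_field` is a hypothetical helper name, not something from the lecture.

```python
def receptive_field(kernel_sizes, strides=None):
    """Side length of the input region seen by one neuron in the last layer."""
    if strides is None:
        strides = [1] * len(kernel_sizes)  # assumption: stride 1 everywhere
    field, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        # Each layer widens the field by (k - 1) steps of the current input spacing.
        field += (k - 1) * jump
        jump *= s
    return field

for depth in (1, 2, 3):
    print(depth, receptive_field([3] * depth))
```

Under the stride-1 assumption, each extra 3x3 layer adds 2 to the side length, so three layers see a 7x7 region of the input.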
Fewer parameters and more nonlinearities = GOOD.
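The parameter half of this claim can be checked with simple arithmetic. The sketch below assumes the comparison is between three stacked 3x3 CONV layers and a single 7x7 CONV layer covering the same input region, with C input and C output channels per layer and biases ignored. The value of `C` and the name `conv_weights` are illustrative only.

```python
def conv_weights(kernel, channels_in, channels_out):
    """Number of weights in one CONV layer, biases ignored."""
    return kernel * kernel * channels_in * channels_out

C = 64  # illustrative channel count, not from the slides

stacked_3x3 = 3 * conv_weights(3, C, C)  # 3 layers * 9 * C^2 = 27 * C^2
single_7x7 = conv_weights(7, C, C)       # 49 * C^2

print(stacked_3x3, single_7x7)
```

The stack also applies a nonlinearity after each of its three layers, while the single large layer applies only one.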
[Network in Network, Lin et al. 2013]
1x1 CONV view of output
3x3 CONV view of input
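A 1x1 CONV can be read as the same small fully connected layer applied across channels at every spatial position. The NumPy sketch below illustrates that reading; the shapes and the function name `conv1x1` are assumptions chosen for illustration.

```python
import numpy as np

def conv1x1(x, w, b):
    """x: (C_in, H, W), w: (C_out, C_in), b: (C_out,) -> (C_out, H, W)."""
    # Mix channels independently at every pixel; no spatial neighbors are used.
    return np.einsum("oc,chw->ohw", w, x) + b[:, None, None]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 5))   # 8 input channels on a 5x5 grid
w = rng.standard_normal((4, 8))      # 4 output channels
b = rng.standard_normal(4)

y = conv1x1(x, w, b)
print(y.shape)
```

Each output pixel `y[:, i, j]` equals `w @ x[:, i, j] + b`, which is why stacking 1x1 CONV layers on top of a 3x3 CONV behaves like a small per-location network.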
[Very Deep Convolutional Networks for Large-Scale Image Recognition, Simonyan et al., 2014]
=> Evidence that using 3x3 instead of 1x1 works better
In ordinary 2x2 maxpool, the pooling regions are non-overlapping 2x2 squares.
Fractional pooling samples the pooling regions during the forward pass: a mix of 1x1, 2x1, 1x2, 2x2.
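The region sampling can be sketched as follows. This is a simplified illustration, not the exact procedure of any particular paper: along each axis it shuffles a sequence of steps of size 1 and 2 that sums to the input size, which yields disjoint regions of size 1x1, 2x1, 1x2, and 2x2. The function names are hypothetical.

```python
import numpy as np

def sample_boundaries(n_in, n_out, rng):
    """Random region edges along one axis using steps of size 1 or 2."""
    assert n_out <= n_in <= 2 * n_out
    # (n_in - n_out) steps of 2 and (2*n_out - n_in) steps of 1 sum to n_in.
    steps = np.array([2] * (n_in - n_out) + [1] * (2 * n_out - n_in))
    rng.shuffle(steps)
    return np.concatenate(([0], np.cumsum(steps)))

def fractional_max_pool(x, n_out, rng):
    """x: (H, W) -> (n_out, n_out), max over randomly sized disjoint regions."""
    rows = sample_boundaries(x.shape[0], n_out, rng)
    cols = sample_boundaries(x.shape[1], n_out, rng)
    out = np.empty((n_out, n_out), dtype=x.dtype)
    for i in range(n_out):
        for j in range(n_out):
            out[i, j] = x[rows[i]:rows[i + 1], cols[j]:cols[j + 1]].max()
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((9, 9))
y = fractional_max_pool(x, 6, rng)   # 9x9 -> 6x6, a reduction factor of 1.5
print(y.shape)
```

Because the regions are resampled on every forward pass, the output size can shrink by a non-integer factor and the layer injects noise during training.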
What the computer sees
(maybe even contrast jittering, etc.)
(As seen in [Krizhevsky et al. 2012])
1. Introduce a form of randomness in forward pass
2. Marginalize over the noise distribution during prediction
DropConnect, Dropout, Fractional Pooling, Data Augmentation, Model Ensembles
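Dropout is perhaps the simplest instance of this two-step pattern. The sketch below assumes plain (non-inverted) dropout with a keep probability `p_keep`: a random mask is applied in training, and the expected value of that noisy output is used at prediction time. The function name and the sample count are illustrative.

```python
import numpy as np

def dropout_forward(x, p_keep, rng, train):
    if train:
        # Step 1: randomness in the forward pass (each unit kept with prob. p_keep).
        mask = rng.random(x.shape) < p_keep
        return x * mask
    # Step 2: marginalize over the mask analytically; E[x * mask] = p_keep * x.
    return x * p_keep

rng = np.random.default_rng(0)
x = np.ones(5)

# Monte Carlo average of many noisy training-mode passes.
samples = np.stack([dropout_forward(x, 0.5, rng, True) for _ in range(20000)])
print(samples.mean(axis=0))

# Deterministic prediction-mode pass.
print(dropout_forward(x, 0.5, rng, False))
```

The averaged noisy passes approach the deterministic prediction-mode output. For a single linear layer this match is exact in expectation; for a deep network the scaling is only an approximation of the true marginal.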
“central processing unit”
“graphics processing unit”: plugs into a PCI Express slot
CEO of NVIDIA: Jen-Hsun Huang (Stanford Master’s degree in EE from 1992 by the way)
http://www.nvidia.com/content/cuda/spotlights/dan-ciresan-idsia.html
All comparisons are against a 12-core Intel E5-2679v2 CPU @ 2.4GHz running Caffe with Intel MKL 11.1.3.
NVIDIA Titan Blacks ~$1K
See e.g. [Fast Convolutional Nets With fbfft: A GPU Performance Evaluation]
Unfortunately, FFT Conv is slower with smaller filter sizes :( (backwards!)
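The idea behind FFT convolution is the convolution theorem: convolution in the spatial domain becomes an elementwise product in the frequency domain. The NumPy sketch below is a single-channel illustration only, not the implementation of any GPU library, and it computes a true (kernel-flipped) convolution rather than the cross-correlation that CONV layers typically use. Function names are hypothetical.

```python
import numpy as np

def conv2d_direct(x, k):
    """Full 2D convolution by shifting and accumulating; cost grows with kernel size."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + H, j:j + W] += k[i, j] * x
    return out

def conv2d_fft(x, k):
    """Same result via zero-padded FFTs and an elementwise product."""
    shape = (x.shape[0] + k.shape[0] - 1, x.shape[1] + k.shape[1] - 1)
    spectrum = np.fft.rfft2(x, s=shape) * np.fft.rfft2(k, s=shape)
    return np.fft.irfft2(spectrum, s=shape)

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
k = rng.standard_normal((3, 3))

print(np.allclose(conv2d_direct(x, k), conv2d_fft(x, k)))
```

In this sketch the kernel is padded to the full output size before the transform, so the FFT work is set by the padded size rather than by how small the kernel is, whereas the direct loop gets cheaper as the kernel shrinks. That is one way to see why small filters such as 3x3 benefit least.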
to be aware of
Moving parts lol
[Large Scale Distributed Deep Networks, Jeff Dean et al., 2012]
Data parallelism
Model parallelism
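Data parallelism can be illustrated on a single machine: split a batch into shards, compute a gradient per shard as each worker would, and average. The sketch below assumes synchronous averaging over equal-sized shards and a toy least-squares model; large-scale systems may instead update parameters asynchronously. All names and sizes are illustrative.

```python
import numpy as np

def grad_mse(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y)**2) with respect to w."""
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 10))
y = rng.standard_normal(64)
w = rng.standard_normal(10)

# Each "worker" holds a full copy of w but sees only its own shard of the batch.
shards = zip(np.split(X, 4), np.split(y, 4))
worker_grads = [grad_mse(w, Xs, ys) for Xs, ys in shards]

avg_grad = np.mean(worker_grads, axis=0)   # what a parameter server would apply
full_grad = grad_mse(w, X, y)              # single-machine reference

print(np.allclose(avg_grad, full_grad))
```

Model parallelism differs in what is split: the network's parameters are partitioned across workers, rather than the data.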
[One weird trick for parallelizing convolutional neural networks, Krizhevsky 2014]
also see: [Deep learning with COTS HPC systems, Coates et al. 2013]
[Deep Image: Scaling up Image Recognition, Ren Wu et al. 2015] (Baidu)
When Computer Vision papers start to look like Systems papers...
Brute-force approach:
resolutions
ImageNet classification Hit@5 error: 5.33% (Recall, human error is ~5.1%, and optimistic human error is ~3%)
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
[Kaiming He et al., 2015] (MSR)
+ Careful initialization of the weights
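The initialization proposed in the cited paper draws weights with standard deviation sqrt(2 / fan_in) for ReLU layers, so that activation magnitudes stay roughly constant with depth. The sketch below checks this on a toy fully connected stack; the layer width, depth, and the name `he_normal` are illustrative assumptions.

```python
import numpy as np

def he_normal(fan_in, fan_out, rng):
    """Gaussian weights with std sqrt(2 / fan_in), intended for ReLU layers."""
    return rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)

rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 512))   # batch of unit-variance inputs

for _ in range(10):                    # ten ReLU layers, no biases
    x = np.maximum(0, x @ he_normal(512, 512, rng))

# Mean squared activation after ten layers; it should remain on the order of 1.
print(np.mean(x ** 2))
```

With a smaller scale such as sqrt(1 / fan_in), the same experiment would roughly halve the mean squared activation at every ReLU layer, which is the kind of shrinkage this initialization is designed to avoid.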
(in-class)