Fei-Fei Li & Andrej Karpathy Lecture 4 - 7 Jan 2015
Administrative
- how is the assignment going?
- btw, the notes get updated all the time
based on your feedback
- no lecture on Monday
Assume we are given a set of discrete labels: {dog, cat, truck, plane, ...}
hue bins: each pixel adds +1 to the bin for its hue
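The hue-bin counting can be sketched as follows: every pixel casts a +1 vote into the bin its hue falls in. The bin count (12) and the helper name `hue_histogram` are assumptions for illustration, not taken from the slides.

```python
import colorsys
import numpy as np

def hue_histogram(image_rgb, num_bins=12):
    """Count pixels per hue bin. image_rgb: (H, W, 3) floats in [0, 1]."""
    hist = np.zeros(num_bins, dtype=np.int64)
    for r, g, b in image_rgb.reshape(-1, 3):
        hue, _, _ = colorsys.rgb_to_hsv(r, g, b)   # hue lies in [0, 1]
        bin_index = min(int(hue * num_bins), num_bins - 1)
        hist[bin_index] += 1                       # the "+1" vote for this pixel
    return hist

# Tiny 1x3 image: one pure red, one pure green, one pure blue pixel.
image = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]])
hist = hue_histogram(image)
print(hist)
```

The resulting vector has one entry per hue bin and sums to the number of pixels, so images of different sizes would need normalizing before comparison.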
8x8 pixel region, quantize the edge
(images from vlfeat.org)
1. Resize patch to a fixed size (e.g. 32x32 pixels)
2. Extract HOG on the patch (get 144 numbers)
Repeat for each detected feature; this gives a matrix of size [number_of_features x 144].
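A rough sketch of where the 144 numbers can come from, assuming 8x8 pixel cells and 9 orientation bins (the bin count is an assumption chosen so that 16 cells x 9 bins = 144). `simple_hog` is a hypothetical, simplified descriptor without block normalization, not a full HOG implementation.

```python
import numpy as np

def simple_hog(patch, cell=8, bins=9):
    """Per-cell histogram of unsigned edge orientations, weighted by
    gradient magnitude. Simplified: no block normalization."""
    gy, gx = np.gradient(patch.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % np.pi       # unsigned angle in [0, pi]
    bin_index = np.minimum((orientation / np.pi * bins).astype(int), bins - 1)
    height, width = patch.shape
    descriptor = []
    for i in range(0, height, cell):
        for j in range(0, width, cell):
            hist = np.bincount(
                bin_index[i:i + cell, j:j + cell].ravel(),
                weights=magnitude[i:i + cell, j:j + cell].ravel(),
                minlength=bins,
            )
            descriptor.append(hist)
    return np.concatenate(descriptor)

rng = np.random.default_rng(0)
patches = rng.random((5, 32, 32))   # stand-ins for 5 patches resized to 32x32
features = np.stack([simple_hog(p) for p in patches])
print(features.shape)  # (5, 144)
```

Stacking one descriptor per detected feature gives the [number_of_features x 144] matrix.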
Learn k-means centroids over the 144-d feature vectors to form a “vocabulary of visual words” (e.g. 1000 centroids). Each image is then represented by a 1000-d vector: its histogram of visual words.
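A minimal sketch of the visual-word pipeline, assuming plain Lloyd-style k-means and random stand-in descriptors. The helper names, the iteration count, and the small vocabulary size used in the demo (20 rather than 1000) are assumptions chosen to keep the example fast.

```python
import numpy as np

def nearest_centroid(points, centroids):
    """Index of the closest centroid (squared Euclidean) for each point."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(points, k, num_iters=10, seed=0):
    """Basic k-means: alternate assignment and centroid averaging."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(num_iters):
        assignments = nearest_centroid(points, centroids)
        for c in range(k):
            members = points[assignments == c]
            if len(members) > 0:      # keep the old centroid if a cluster empties
                centroids[c] = members.mean(axis=0)
    return centroids

def bag_of_words(descriptors, centroids):
    """Histogram of visual words: one count per centroid."""
    words = nearest_centroid(descriptors, centroids)
    return np.bincount(words, minlength=len(centroids))

rng = np.random.default_rng(1)
all_descriptors = rng.random((500, 144))      # descriptors pooled over a dataset
vocabulary = kmeans(all_descriptors, k=20)    # the slides suggest e.g. 1000
image_descriptors = rng.random((37, 144))     # descriptors from one image
histogram = bag_of_words(image_descriptors, vocabulary)
print(histogram.shape, histogram.sum())
```

Every image maps to a vector of the same length regardless of how many features were detected in it, which is what makes the representation usable by a fixed-size classifier.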
(slide from Yann LeCun)
(slide from Yann LeCun)
CNNs: end-to-end models
the full data loss:
Suppose there are 3 examples with 3 classes (class 0, 1, 2 in sequence), then this becomes:
Question: CIFAR-10 has 50,000 training images, 5,000 per class and 10 labels. How many
the full data loss?
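A minimal sketch of a full data loss for the three-example case, assuming it is the multiclass SVM (hinge) loss with margin 1, averaged over the examples. The choice of loss, the function name `svm_data_loss`, and the random `W` and `X` are assumptions for illustration.

```python
import numpy as np

def svm_data_loss(W, X, y, margin=1.0):
    """Average multiclass hinge loss. W: (C, D), X: (N, D), y: (N,) int labels."""
    num_examples = len(y)
    scores = X @ W.T                                   # (N, C) class scores
    correct = scores[np.arange(num_examples), y][:, None]
    margins = np.maximum(0.0, scores - correct + margin)
    margins[np.arange(num_examples), y] = 0.0          # skip the j == y_i terms
    return margins.sum() / num_examples

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))      # 3 classes, 4 input dimensions
X = rng.standard_normal((3, 4))      # 3 examples
y = np.array([0, 1, 2])              # classes 0, 1, 2 in sequence

# The same loss written out term by term for the three examples.
s = X @ W.T
expanded = (
    max(0.0, s[0, 1] - s[0, 0] + 1) + max(0.0, s[0, 2] - s[0, 0] + 1)
    + max(0.0, s[1, 0] - s[1, 1] + 1) + max(0.0, s[1, 2] - s[1, 1] + 1)
    + max(0.0, s[2, 0] - s[2, 2] + 1) + max(0.0, s[2, 1] - s[2, 2] + 1)
) / 3

loss = svm_data_loss(W, X, y)
print(loss, expanded)
```

In the expanded form, each classifier row appears both as the correct-class score for its own example and as a competing score for the other examples.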
what’s up with 0.0001?
In multiple dimensions, the gradient is the vector of partial derivatives.
“centered difference formula” in practice:
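A small sketch of a numerical gradient using the centered difference [f(x + h) - f(x - h)] / 2h, applied one dimension at a time. The function name `numerical_gradient`, the step `h = 1e-5`, and the quadratic test function are assumptions for illustration.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-5):
    """Centered-difference estimate of the gradient of f at x (float array)."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        old_value = x[idx]
        x[idx] = old_value + h
        f_plus = f(x)                 # f(x + h) along this dimension
        x[idx] = old_value - h
        f_minus = f(x)                # f(x - h) along this dimension
        x[idx] = old_value            # restore the original entry
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

def sum_of_squares(x):
    return np.sum(x ** 2)             # analytic gradient is 2 * x

x = np.array([1.0, -2.0, 3.0])
grad = numerical_gradient(sum_of_squares, x)
print(grad)
```

This needs two function evaluations per dimension, so it is mainly useful for checking an analytic gradient rather than for training.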
performing a parameter update
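A parameter update can be sketched as a step in the negative gradient direction. The step size, the toy loss, and the function name `gradient_descent_step` are assumptions for illustration.

```python
import numpy as np

def gradient_descent_step(W, grad, step_size):
    """Move the parameters a small step against the gradient."""
    return W - step_size * grad

def toy_loss_gradient(W):
    return 2 * W                      # gradient of the toy loss sum(W ** 2)

W = np.array([1.0, -2.0])
for _ in range(100):
    W = gradient_descent_step(W, toy_loss_gradient(W), step_size=0.1)
print(W)
```

With this toy loss each step shrinks `W` by a constant factor, so the iterates approach the minimum at zero; a step size that is too large would instead make them diverge.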
Common mini-batch sizes are ~100 examples; e.g., the Krizhevsky ILSVRC ConvNet used 256 examples.
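A minimal sketch of mini-batch gradient descent: each update uses the gradient computed on a random subset of the training data. The least-squares objective, the synthetic data, and the learning rate are assumptions for illustration; only the batch size of 256 comes from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
num_examples, num_dims, batch_size = 1000, 5, 256
X = rng.standard_normal((num_examples, num_dims))
w_true = rng.standard_normal(num_dims)
y = X @ w_true                         # noise-free targets for the demo

w = np.zeros(num_dims)
learning_rate = 0.1
for step in range(500):
    # Sample a mini-batch instead of using all examples.
    batch = rng.choice(num_examples, size=batch_size, replace=False)
    X_batch, y_batch = X[batch], y[batch]
    # Gradient of the mean squared error 0.5 * mean((X w - y) ** 2) on the batch.
    grad = X_batch.T @ (X_batch @ w - y_batch) / batch_size
    w -= learning_rate * grad

print(np.abs(w - w_true).max())
```

The batch gradient is only an estimate of the full gradient, but it is far cheaper to compute, which allows many more updates for the same amount of work.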
(also sometimes called on-line Gradient Descent)
(or call it batch gradient descent)
- always pull the weights down
- pull some weights up and some down
gradient momentum update
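A sketch of one common form of the momentum update, in which a velocity accumulates past gradients and the parameters move along the velocity. The coefficient `mu = 0.9`, the learning rate, the toy quadratic loss, and the function name `momentum_step` are assumptions for illustration.

```python
import numpy as np

def momentum_step(W, velocity, grad, learning_rate=0.01, mu=0.9):
    """Blend the previous velocity with the new gradient, then move along it."""
    velocity = mu * velocity - learning_rate * grad
    return W + velocity, velocity

curvature = np.array([1.0, 10.0])      # toy loss: 0.5 * sum(curvature * W ** 2)

def toy_loss_gradient(W):
    return curvature * W

W = np.array([1.0, 1.0])
velocity = np.zeros_like(W)
for _ in range(500):
    W, velocity = momentum_step(W, velocity, toy_loss_gradient(W))
print(W)
```

Setting `mu = 0` recovers the plain gradient descent update; larger values carry more of the previous step direction into the next one.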