Fei-Fei Li & Andrej Karpathy, Lecture 2 - 7 Jan 2015
Lecture 3: Linear Classification
Fei-Fei Li & Andrej Karpathy
Last time: Image Classification

(example image, labeled “cat”)

Assume a given set of discrete labels {dog, cat, truck, plane, ...}.
k-Nearest Neighbor

(figure: training set and test images)
Linear Classification

- 1. Define a score function that maps an image to class scores:

f(x; W, b) = W x + b

(assume the CIFAR-10 example, so 32 x 32 x 3 images, i.e. 3072 numbers, and 10 classes)

- x: data (image), [3072 x 1]
- W: “weights”, [10 x 3072]
- b: “bias vector”, [10 x 1]
- f(x; W, b): class scores, [10 x 1]
- W and b together are the “parameters”
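As an illustration (not from the slides), here is a minimal numpy sketch of this score function with the CIFAR-10 shapes; the random values for x and W are placeholders, only the shapes matter:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3072)                 # data (image): 32*32*3 pixels flattened
W = rng.standard_normal((10, 3072)) * 0.001   # weights, one row per class
b = np.zeros(10)                              # bias vector

def score(x, W, b):
    """Linear score function f(x; W, b) = W x + b."""
    return W @ x + b

s = score(x, W, b)
print(s.shape)  # (10,) -- one score per class
```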
Interpreting a Linear Classifier

Question: what can a linear classifier do?

Example: training classifiers on CIFAR-10.
Bias trick
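The bias trick can be sketched in numpy (hypothetical values, only the shapes matter): append a constant 1 to x and the column b to W, and the scores come from a single matrix multiply.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3072)
W = rng.standard_normal((10, 3072))
b = rng.standard_normal(10)

# Bias trick: fold b into W as an extra column, and append 1 to x,
# so that W x + b == W_ext x_ext.
x_ext = np.append(x, 1.0)            # [3073]
W_ext = np.hstack([W, b[:, None]])   # [10 x 3073]

assert np.allclose(W @ x + b, W_ext @ x_ext)
```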
So far:

We defined a (linear) score function: f(x; W, b) = W x + b
- 2. Define a loss function (or cost function, or objective)
- A loss function takes the scores and the true label and returns a single loss value.

Example: Question: if you were to assign a single number to how “unhappy” you are with these scores, what would you do?
- 2. Define a loss function (or cost function, or objective)

One (of many ways) to do it: Multiclass SVM Loss
(one possible generalization of the binary Support Vector Machine to multiple classes)

L_i = Σ_{j ≠ y_i} max(0, s_j - s_{y_i} + 1)

- L_i: loss due to example i
- Σ_{j ≠ y_i}: sum over all incorrect labels
- s_{y_i} - s_j: the difference between the correct class score and an incorrect class score
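A minimal numpy sketch of this per-example loss (my own helper, assuming a margin of 1 as on the slide):

```python
import numpy as np

def svm_loss_i(scores, y, delta=1.0):
    """Multiclass SVM (hinge) loss for one example.

    scores: class scores s_j for one example
    y: index of the correct class
    delta: margin (1 on the slide)
    """
    scores = np.asarray(scores, dtype=float)
    margins = np.maximum(0.0, scores - scores[y] + delta)
    margins[y] = 0.0  # sum over incorrect labels only
    return margins.sum()

print(svm_loss_i([10.0, -2.0, 3.0], y=0))  # 0.0: correct class wins every margin
```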
Example:

(figure: a worked example computing the loss from given class scores; loss = ?)
There is a bug with the objective…
L2 Regularization

L(W) = (1/N) Σ_i L_i + λ Σ_k Σ_l W_{k,l}^2

- λ: regularization strength (a hyperparameter)
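A sketch of the full objective (mean data loss plus the L2 penalty) in numpy; the function name and the vectorized layout are my own, not from the slides:

```python
import numpy as np

def full_loss(W, X, y, lam=0.5, delta=1.0):
    """Mean multiclass SVM loss over a batch plus L2 regularization lam * sum(W^2).

    W: [C x D] weights, X: [N x D] rows of examples, y: [N] correct labels.
    lam is the regularization strength (a hyperparameter).
    """
    scores = X @ W.T                                  # [N x C]
    correct = scores[np.arange(len(y)), y][:, None]   # correct-class score per row
    margins = np.maximum(0.0, scores - correct + delta)
    margins[np.arange(len(y)), y] = 0.0               # incorrect labels only
    data_loss = margins.sum() / len(y)
    reg_loss = lam * np.sum(W * W)                    # L2 penalty
    return data_loss + reg_loss
```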
L2 regularization: motivation
Do we have to cross-validate both the margin Δ (the constant 1 in the SVM loss) and the regularization strength λ?
So far…
- 1. Score function
- 2. Loss function
Softmax Classifier

(an extension of Logistic Regression to multiple classes)

The score function is the same: f(x; W, b) = W x + b, but the scores are now passed through the softmax function:

P(y_i | x_i; W) = e^{s_{y_i}} / Σ_j e^{s_j}

and the loss is the negative log probability of the correct class:

L_i = -log( e^{s_{y_i}} / Σ_j e^{s_j} )

i.e. we’re minimizing the negative log likelihood.
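A minimal numpy sketch of this loss for one example (my own helper); shifting the scores by their max is the standard numerical-stability trick and does not change the probabilities:

```python
import numpy as np

def softmax_loss_i(scores, y):
    """Cross-entropy loss -log P(y) under the softmax function."""
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.max()          # stability shift; probabilities unchanged
    probs = np.exp(shifted) / np.exp(shifted).sum()
    return -np.log(probs[y])

print(round(softmax_loss_i([1.0, 1.0, 1.0], y=0), 4))  # -log(1/3) = 1.0986
```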
Softmax vs. SVM

- Interpreting the probabilities from the Softmax

Suppose the weights W were only half as large (e.g. because we used a higher regularization strength): the scores all shrink, and the softmax probabilities become more diffuse.

What happens in the limit, as the regularization strength goes to infinity?
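This effect can be sketched numerically (hypothetical scores of my own choosing): scaling the scores down spreads the softmax distribution out, and scaling all the way to zero drives it to uniform.

```python
import numpy as np

def softmax(s):
    """Softmax probabilities, with a stability shift by the max score."""
    s = np.asarray(s, dtype=float)
    e = np.exp(s - s.max())
    return e / e.sum()

scores = np.array([10.0, -2.0, 3.0])
print(np.round(softmax(scores), 4))        # heavily peaked on the first class
print(np.round(softmax(scores / 2), 4))    # half-as-large weights: more diffuse
print(np.round(softmax(scores * 0.0), 4))  # infinite regularization limit: uniform
```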
Softmax vs. SVM

scores: [10, -2, 3], [10, 9, 9], [10, -100, -100]
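Evaluating both losses on these three score vectors (a sketch, assuming class 0 is the correct one and an SVM margin of 1) highlights the difference: the SVM is already fully satisfied in every case, while the softmax always wants the correct score to dominate by more.

```python
import numpy as np

def svm_loss(scores, y, delta=1.0):
    """Multiclass SVM loss for one example (margin delta)."""
    s = np.asarray(scores, dtype=float)
    m = np.maximum(0.0, s - s[y] + delta)
    m[y] = 0.0
    return m.sum()

def softmax_loss(scores, y):
    """Softmax cross-entropy loss for one example."""
    s = np.asarray(scores, dtype=float)
    s = s - s.max()
    return -np.log(np.exp(s[y]) / np.exp(s).sum())

for scores in ([10, -2, 3], [10, 9, 9], [10, -100, -100]):
    print(scores, svm_loss(scores, 0), round(softmax_loss(scores, 0), 4))
# The SVM loss is exactly 0 for all three, but the softmax loss is never
# exactly 0 and keeps shrinking as the correct score dominates more.
```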
Interactive Web Demo time....
http://vision.stanford.edu/teaching/cs231n/linear-classify-demo/
Summary
- We introduced a parametric approach to image classification
- We defined a score function (linear map)
- We defined a loss function (SVM / Softmax)
One problem remains: how do we find W, b?