Lecture 18: Recognition IV
Thursday, Nov 15
- Prof. Kristen Grauman
Outline
- Discriminative classifiers
  – SVMs
- Learning categories from weakly supervised images
  – Constellation model
- Shape matching
  – Shape context, visual CAPTCHA application
Feature representation: each dimension is the output of a possible rectangle feature applied to the image subwindow.
Optimal threshold: the one that results in minimal misclassifications. Notice that any threshold giving the same error rate would be equally good here.
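To make the threshold selection concrete, here is a minimal sketch (the feature values and labels are invented for illustration): scan candidate thresholds and keep the one with the fewest misclassifications.

```python
def best_threshold(values, labels):
    """Scan candidate thresholds drawn from the data; return the one that
    minimizes misclassifications. Classifies +1 when value >= threshold."""
    candidates = sorted(set(values))
    best_t, best_err = None, len(values) + 1
    for t in candidates:
        err = sum(1 for v, y in zip(values, labels)
                  if (1 if v >= t else -1) != y)
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err

# toy data: rectangle-feature responses for face (+1) and non-face (-1) windows
vals = [0.2, 0.4, 0.9, 1.1, 1.5, 1.7]
labels = [-1, -1, -1, 1, 1, 1]
print(best_threshold(vals, labels))  # (1.1, 0)
```

Any threshold between 0.9 and 1.1 would give the same zero error here, matching the observation above that equal-error thresholds are equally good.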
Computing the margin width: the distance between the hyperplanes wᵀx + b = +1 and wᵀx + b = −1 is
M = 2 / √(wᵀw) = 2 / ||w||
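As a quick numeric check of the margin formula M = 2/||w|| (the weight vector here is made up):

```python
import numpy as np

w = np.array([3.0, 4.0])        # hypothetical weight vector, ||w|| = 5
margin = 2.0 / np.sqrt(w @ w)   # M = 2 / ||w||
print(margin)  # 0.4
```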
A linearly separable dataset (classes denoted +1 and −1), to be classified by a linear function f(x, w, b) = sign(wᵀx + b). How would you classify this data?
The hyperplane wᵀx + b = 0 separates the regions wᵀx + b < 0 and wᵀx + b > 0.
Slides from Andrew Moore’s tutorial: http://www.autonlab.org/tutorials/svm.html
Any of these would be fine... but which is best?
With a poorly placed boundary, a training point is misclassified to the +1 class.
Linear SVM: the support vectors are the datapoints that the margin pushes up against.
1. Maximizing the margin is good according to both intuition and theory.
2. It implies that only the support vectors matter; other training examples are ignorable.
3. Empirically it works very well.
The decision boundary wᵀx + b = 0 lies midway between the margin hyperplanes wᵀx + b = +1 (the “predict class = +1” zone) and wᵀx + b = −1 (the “predict class = −1” zone); x⁺ and x⁻ denote support vectors on either side.
Goal:
1) Correctly classify all training data: yᵢ(wᵀxᵢ + b) ≥ 1 for all i.
2) Maximize the margin 2/||w||, which is the same as minimizing ½ wᵀw.
This yields a quadratic optimization problem: minimize ½ wᵀw subject to yᵢ(wᵀxᵢ + b) ≥ 1 for all i.
Solution has the form (omitting derivation):
w = Σ αᵢyᵢxᵢ,    b = yₖ − wᵀxₖ for any xₖ such that αₖ ≠ 0
Each non-zero αᵢ indicates that the corresponding xᵢ is a support vector.
Then the classifying function will have the form:
f(x) = Σ αᵢyᵢxᵢᵀx + b
Notice that it relies on an inner product between the test point x and the support vectors xᵢ; solving the optimization problem also involves computing the inner products xᵢᵀxⱼ between all pairs of training points.
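As a sanity check on the dual form, consider a hand-solvable 1-D example (two support vectors at ±1; the values of the αᵢ here are hand-derived, not computed by a solver): with α₁ = α₂ = ½, the dual-form classifier f(x) = Σ αᵢyᵢxᵢᵀx + b reproduces the primal classifier wᵀx + b.

```python
import numpy as np

# toy 1-D training set: support vectors at -1 (label -1) and +1 (label +1)
X = np.array([[-1.0], [1.0]])
y = np.array([-1.0, 1.0])
alpha = np.array([0.5, 0.5])      # dual variables (hand-solved for this toy case)

w = (alpha * y) @ X               # w = sum_i alpha_i y_i x_i
k = 0                             # any index k with alpha_k != 0
b = y[k] - w @ X[k]               # b = y_k - w^T x_k

def f(x):
    """Dual-form decision function: sum_i alpha_i y_i x_i^T x + b."""
    return (alpha * y) @ (X @ x) + b

for x in [np.array([2.0]), np.array([-0.5])]:
    assert np.isclose(f(x), w @ x + b)  # dual and primal forms agree
print(w, b)  # [1.] 0.0
```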
Datasets that are linearly separable with some noise work well as-is. But what are we going to do if the dataset is just too hard? How about mapping the data to a higher-dimensional space, e.g. x → (x, x²)?
General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable: Φ: x → φ(x).
The linear classifier relies on dot products xᵢᵀxⱼ between vectors. If every datapoint is mapped into high-dimensional space via a transformation Φ: x → φ(x), the dot product becomes K(xᵢ, xⱼ) = φ(xᵢ)ᵀφ(xⱼ).
A kernel function is a function that is equivalent to an inner product in some expanded feature space.
Example: 2-dimensional vectors x = [x₁ x₂]; let K(xᵢ, xⱼ) = (1 + xᵢᵀxⱼ)².
Need to show that K(xᵢ, xⱼ) = φ(xᵢ)ᵀφ(xⱼ):
K(xᵢ, xⱼ) = (1 + xᵢᵀxⱼ)²
= 1 + xᵢ₁²xⱼ₁² + 2 xᵢ₁xⱼ₁xᵢ₂xⱼ₂ + xᵢ₂²xⱼ₂² + 2xᵢ₁xⱼ₁ + 2xᵢ₂xⱼ₂
= [1  xᵢ₁²  √2 xᵢ₁xᵢ₂  xᵢ₂²  √2 xᵢ₁  √2 xᵢ₂]ᵀ [1  xⱼ₁²  √2 xⱼ₁xⱼ₂  xⱼ₂²  √2 xⱼ₁  √2 xⱼ₂]
= φ(xᵢ)ᵀφ(xⱼ), where φ(x) = [1  x₁²  √2 x₁x₂  x₂²  √2 x₁  √2 x₂]
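The algebra above can be verified numerically: the kernel value equals the inner product of the explicit feature maps for any pair of 2-D vectors (random test vectors here are arbitrary).

```python
import numpy as np

def K(a, b):
    """Polynomial kernel (1 + a^T b)^2."""
    return (1.0 + a @ b) ** 2

def phi(x):
    """Explicit 6-D feature map matching the kernel above."""
    x1, x2 = x
    r2 = np.sqrt(2.0)
    return np.array([1.0, x1**2, r2 * x1 * x2, x2**2, r2 * x1, r2 * x2])

rng = np.random.default_rng(0)
xi, xj = rng.normal(size=2), rng.normal(size=2)
assert np.isclose(K(xi, xj), phi(xi) @ phi(xj))  # kernel = inner product
print(K(xi, xj))
```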
Examples of commonly used kernel functions:
– Linear: K(xᵢ, xⱼ) = xᵢᵀxⱼ
– Polynomial of power p: K(xᵢ, xⱼ) = (1 + xᵢᵀxⱼ)ᵖ
– Gaussian (radial-basis function network): K(xᵢ, xⱼ) = exp(−||xᵢ − xⱼ||² / (2σ²))
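The three kernels above in code (the test vectors are arbitrary):

```python
import numpy as np

def linear(xi, xj):
    return xi @ xj

def poly(xi, xj, p=2):
    return (1.0 + xi @ xj) ** p

def gaussian(xi, xj, sigma=1.0):
    return np.exp(-np.linalg.norm(xi - xj) ** 2 / (2.0 * sigma ** 2))

xi, xj = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(linear(xi, xj), poly(xi, xj), gaussian(xi, xj))  # 0.0, 1.0, exp(-1)
```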
Example application: learning gender from face images with SVMs.
Moghaddam and Yang, Learning Gender with Support Faces, TPAMI 2002.
Moghaddam and Yang, Face & Gesture 2000.
[Figures: processed faces after face alignment processing; classifier error comparison]
Next: learning categories from weakly supervised images
– Constellation model
Slide from Li Fei-Fei http://www.vision.caltech.edu/feifeili/Resume.htm
Weber, Welling, Perona., 2000.
Slide by Bill Freeman, MIT
Slide by Fei-Fei Li, 2003.
Figure from Rob Fergus
The constellation model represents image patch descriptors (appearance) with uncertainty, and the mutual positions of the parts (shape) with uncertainty.
Weber, Welling, Perona, ECCV 2000.
From: Rob Fergus http://www.robots.ox.ac.uk/%7Efergus/
Weber, Welling, Perona. Unsupervised Learning of Models for Recognition, 2000.
For faces and for cars: at this stage, candidate parts appear in both the background and the foreground of the training images.
Images from Rob Fergus
Model variables: X = locations, S = scales, A = appearances.
These are identified in a new image using the maximum-likelihood parameters.
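A heavily simplified sketch of how a candidate part might be scored under learned Gaussian models of location and appearance (single part, diagonal covariances, made-up maximum-likelihood parameters; the full constellation model scores joint configurations of parts against a background hypothesis):

```python
import numpy as np

def gauss_logpdf(x, mu, var):
    """Log-density of an independent (diagonal-covariance) Gaussian."""
    return -0.5 * np.sum((x - mu) ** 2 / var + np.log(2 * np.pi * var))

# hypothetical ML parameters for one part: mean/variance of location, appearance
mu_loc, var_loc = np.array([10.0, 20.0]), np.array([4.0, 4.0])
mu_app, var_app = np.zeros(3), np.ones(3)

def part_score(loc, app):
    """Log-likelihood of a candidate detection under this part's model."""
    return gauss_logpdf(loc, mu_loc, var_loc) + gauss_logpdf(app, mu_app, var_app)

good = part_score(np.array([10.5, 19.5]), np.zeros(3))   # near the model mean
bad = part_score(np.array([40.0, 5.0]), np.ones(3) * 3)  # far from the mean
assert good > bad
print(good, bad)
```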
Recognition results (appearance shown as the 10 patches closest to the mean for each part):
– Face model
– Motorbike model
– Spotted cat model
Next: shape matching
– Shape context, visual CAPTCHA application
Slides adapted from Belongie, Malik, & Puzicha, Matching Shapes, ICCV 2001. www.eecs.berkeley.edu/Research/Projects/CS/vision/shape/belongie-iccv01
Computer Vision Group
University of California
Berkeley
Slides by Greg Mori, CVPR 2003
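At the core of this approach is the shape context descriptor: for each point on a shape, a log-polar histogram of the relative positions of all other points. A minimal sketch (binning and normalization choices here are simplified relative to the original formulation):

```python
import numpy as np

def shape_context(points, i, n_r=5, n_theta=12):
    """Log-polar histogram of the positions of all other points relative
    to points[i]; returns an (n_r, n_theta) normalized histogram."""
    rel = np.delete(points, i, axis=0) - points[i]
    r = np.log(np.linalg.norm(rel, axis=1) + 1e-9)       # log radius
    theta = np.arctan2(rel[:, 1], rel[:, 0])             # angle in [-pi, pi]
    r_edges = np.linspace(r.min(), r.max() + 1e-9, n_r + 1)
    t_edges = np.linspace(-np.pi, np.pi, n_theta + 1)
    hist, _, _ = np.histogram2d(r, theta, bins=[r_edges, t_edges])
    return hist / hist.sum()

pts = np.random.default_rng(1).normal(size=(30, 2))      # toy point set
h = shape_context(pts, 0)
print(h.shape)  # (5, 12)
```

Descriptors like these are compared across shapes (e.g. with a chi-squared distance) to find point correspondences for matching.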
Gimpy results (% of 24 tests):
– 1 or more correct words: 92%
– 2 or more correct words: 75%
– 3 correct words: 33%
EZ-Gimpy: 92%