Pictorial structures
Laurens van der Maaten
Introduction
Object detection aims to find a particular object in an image.
Most popular object detectors are based on a discriminative model: gather annotated image patches (positive and …
gradient orientations in each of the blocks using a histogram (almost like SIFT)
* Dalal & Triggs, 2005
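The histogram-of-gradient-orientations idea can be sketched as follows. This is a minimal illustration, not the Dalal & Triggs descriptor itself: the block size, the number of bins, the use of unsigned orientations, and the function name `block_orientation_histograms` are all assumptions, and block normalization is left out.

```python
import numpy as np

def block_orientation_histograms(image, block=8, bins=9):
    """Histogram of gradient orientations for each non-overlapping block.

    `block` and `bins` are illustrative defaults, not values from the slides.
    """
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)                # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % np.pi   # unsigned orientation in [0, pi)
    bin_index = np.minimum((orientation / np.pi * bins).astype(int), bins - 1)

    rows, cols = image.shape[0] // block, image.shape[1] // block
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * block, (r + 1) * block),
                  slice(c * block, (c + 1) * block))
            # Each pixel votes for its orientation bin, weighted by gradient magnitude.
            hist[r, c] = np.bincount(bin_index[sl].ravel(),
                                     weights=magnitude[sl].ravel(),
                                     minlength=bins)
    return hist
```

Concatenating the per-block histograms gives one possible feature vector φ(I; x) for a window.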
Training: $w^* = \arg\min_w \ldots \max(\ldots)$
Detection: $s(I; x) = w^{*\top} \phi(I; x)$
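The detection rule s(I; x) = w*ᵀφ(I; x) amounts to sliding a learned template over a feature map. The sketch below assumes the features are stored as an array of shape (H, W, D) and the template as (h, w, D). The function name `score_map` and this layout are illustrative choices, not taken from the slides.

```python
import numpy as np

def score_map(features, template):
    """Evaluate s(I; x) = w^T phi(I; x) at every window position x.

    features: array of shape (H, W, D), e.g. per-block orientation histograms.
    template: array of shape (h, w, D), the learned weights w reshaped.
    """
    H, W, _ = features.shape
    h, w, _ = template.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            window = features[y:y + h, x:x + w, :]
            scores[y, x] = np.sum(template * window)   # inner product w^T phi
    return scores
```

Thresholding `scores` (followed by non-maximum suppression, not shown) would yield detections.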
* Fischler & Elschlager, 1973
The score combines a global object model, object part models, and a deformation model:
$$
s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|}) =
\underbrace{w_0^\top \phi(I; x_0, y_0)}_{\text{global object model}}
+ \underbrace{\sum_{i \in V} w_i^\top \phi(I; x_i, y_i)}_{\text{object part models}}
+ \underbrace{\sum_{(i,j) \in E} d_{ij}\, \phi_d(x_i - x_j, y_i - y_j)}_{\text{deformation model}}
$$
* Felzenszwalb et al., 2010
and deformations of non-rigid objects
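For a fixed configuration of part locations, the score function above can be evaluated directly. The sketch assumes that the filter responses wᵢᵀφ(I; xᵢ, yᵢ) have already been computed as 2D arrays, and that φ_d consists of linear and quadratic displacement terms. The slides do not spell out φ_d, so that choice, like the names `phi_d` and `configuration_score`, is an assumption.

```python
import numpy as np

def phi_d(dx, dy):
    # Assumed deformation features (linear and quadratic displacement terms).
    return np.array([dx, dy, dx * dx, dy * dy], dtype=float)

def configuration_score(responses, positions, edges, d):
    """Score of one configuration of root and part locations.

    responses: list of 2D arrays; responses[i][y, x] = w_i^T phi(I; x, y),
               with index 0 for the global (root) model.
    positions: list of (x, y) locations, one per entry in `responses`.
    edges:     list of (i, j) index pairs, the edge set E.
    d:         dict mapping each edge (i, j) to its deformation weight vector.
    """
    appearance = sum(resp[y, x] for resp, (x, y) in zip(responses, positions))
    deformation = 0.0
    for (i, j) in edges:
        (xi, yi), (xj, yj) = positions[i], positions[j]
        deformation += float(np.dot(d[(i, j)], phi_d(xi - xj, yi - yj)))
    return appearance + deformation
```

With negative weights on the quadratic terms, moving a part away from its neighbor lowers the score, which is how the model tolerates but penalizes deformation.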
Detection maximizes the score over all locations:
$$
\max_{x_0, y_0, \ldots, x_{|V|}, y_{|V|}} s(I; x_0, y_0, \ldots, x_{|V|}, y_{|V|})
$$
For each part, this requires a minimization of the form (shown in one dimension):
$$
\min_{x_j} \left( f(x_j) + (x_i - x_j)^2 \right)
$$
Here $f(x_j)$ is the negative part model score, $(x_i - x_j)^2$ is the deformation penalty, and the minimum is the final score with deformations. Each location $x_j$ contributes a parabola rooted at $(x_j, f(x_j))$.
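The minimization over x_j can be written down literally as a table of all pairwise costs. The sketch below is a brute-force baseline that takes O(n²) time for n locations; the function name `distance_transform_bruteforce` is illustrative, and the locations are assumed to be the integer grid 0, …, n−1.

```python
import numpy as np

def distance_transform_bruteforce(f):
    """For every x_i, compute min over x_j of f(x_j) + (x_i - x_j)^2.

    Returns the minimum values and the minimizing x_j for each x_i.
    """
    f = np.asarray(f, dtype=float)
    x = np.arange(len(f))
    # cost[i, j] = f(x_j) + (x_i - x_j)^2
    cost = f[None, :] + (x[:, None] - x[None, :]) ** 2
    return cost.min(axis=1), cost.argmin(axis=1)
```

The quadratic cost of this version is what motivates the lower-envelope construction of Felzenszwalb & Huttenlocher.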
[Figure: parabolas rooted at (0, f(0)), (1, f(1)), (2, f(2)), ..., (n-1, f(n-1)); the distance transform is their lower envelope.]
The parabolas rooted at $x_i$ and $x_j$ intersect at exactly one point:
$$
s = \frac{\left(f(x_i) + x_i^2\right) - \left(f(x_j) + x_j^2\right)}{2 x_i - 2 x_j}
$$
If $x_i < x_j$, the parabola rooted at $x_i$ is below the other left of the intersection, and above it right of the intersection.
* Felzenszwalb & Huttenlocher, 2004
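[Figure: adding the parabola rooted at q to the lower envelope. v[k] and v[k-1] are the roots of the last two parabolas in the envelope, z[k] is their intersection, and s is the intersection of the new parabola with the one rooted at v[k].]
New intersection right of last intersection: maintain last parabola in the envelope.
New intersection left of last intersection: remove last parabola from lower envelope, and go back one step.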
* Felzenszwalb & Huttenlocher, 2004
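The two cases above lead to a linear-time procedure. The sketch below follows the structure described by Felzenszwalb & Huttenlocher, using the same names v, z, q, s and k as the figure; it assumes all values of f are finite and that locations are the integers 0, …, n−1. The function name `distance_transform_1d` is illustrative.

```python
import math

def distance_transform_1d(f):
    """Compute d[p] = min_q f[q] + (p - q)^2 for all p in linear time.

    Assumes every f[q] is finite.
    """
    n = len(f)
    v = [0] * n            # roots of the parabolas in the lower envelope
    z = [0.0] * (n + 1)    # z[k], z[k+1]: range where parabola v[k] is lowest
    k = 0
    z[0], z[1] = -math.inf, math.inf

    # Pass 1: build the lower envelope, adding one parabola at a time.
    for q in range(1, n):
        while True:
            # Intersection of the parabola rooted at q with the last one kept.
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
            if s <= z[k]:
                k -= 1     # new intersection left of last: remove last parabola
            else:
                break      # new intersection right of last: keep last parabola
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf

    # Pass 2: read the envelope off at every grid location.
    d = [0.0] * n
    k = 0
    for p in range(n):
        while z[k + 1] < p:
            k += 1
        d[p] = (p - v[k]) ** 2 + f[v[k]]
    return d
```

Each parabola is added once and removed at most once, so both passes take O(n) time. A 2D transform can be obtained by applying the 1D transform along rows and then along columns.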
[Figure: matching process. The root filter of the model is applied to the feature map, and the part filters to the feature map at twice the resolution. The responses of the part filters are transformed and added to the response of the root filter, giving the combined score of root locations. Color encodes filter response values from low to high.]
[Figure: two model structures, a minimum spanning tree and a star-shaped tree.]
* Felzenszwalb et al., 2010
[Figure: precision-recall curves (recall on the horizontal axis, precision on the vertical axis); class: car, year 2006. Legend:]
1 Root (0.48)
2 Root (0.58)
1 Root+Parts (0.55)
2 Root+Parts (0.62)
2 Root+Parts+BB (0.64)
* Felzenszwalb et al., 2010
[Figure panels: person, car, horse, sofa.]
* Zhang & van der Maaten, 2013