Features for Computer Vision

Features for Computer Vision. Alex Berg, Computer Science Department, Columbia University. Why Vision? Light! It is how we see other people, navigate our environment,


  1. Example Feature Pipeline
 Extract affine regions (Harris-Affine region-of-interest operator)
 -> Normalize regions
 -> Eliminate rotational ambiguity (orientation)
 -> Edge orientation histograms (SIFT, Lowe '04)
 -> Lowe's descriptor -> Features!


  2. Matching for Alignment
 Use descriptors to compare features and enforce geometric constraints.
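
A minimal sketch of what "use descriptors to compare features" can look like in practice: brute-force nearest-neighbor matching with a Lowe-style ratio test. The function name, the 0.8 threshold, and the numpy representation are illustrative assumptions, not code from the talk; the geometric-constraint step (e.g. RANSAC over an affine model) would then prune the surviving matches.

    import numpy as np

    def match_descriptors(desc_a, desc_b, ratio=0.8):
        """Nearest-neighbor matching with a ratio test.

        desc_a: (n, d) descriptors from image A; desc_b: (m, d) from image B.
        Keeps only matches whose best neighbor is clearly better than the
        second best, so the correspondence is unambiguous.
        """
        matches = []
        for i, d in enumerate(desc_a):
            dists = np.linalg.norm(desc_b - d, axis=1)  # distance to every descriptor in B
            j1, j2 = np.argsort(dists)[:2]              # best and second-best neighbors
            if dists[j1] < ratio * dists[j2]:           # Lowe-style ratio test
                matches.append((i, int(j1)))
        return matches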


  3. Match a Few Points


  4. Dense Alignment


  5. SIFT 2


  6. Example Feature Pipeline
 Edge orientation histograms. Rotational ambiguity "needs to be handled here"; "remaining variation here" (annotations pointing into the pipeline diagram).


  7. Matching Affine Covariant Regions
 Note that they still don't look exactly the same, even on easy images! Lowe's orientation histogram helps, but Grauman & Darrell and Lazebnik et al. have a neat alternative.


  8. Embedding


  9.-12. Grauman's Pyramid Match Kernel
 "Match" score for sets X, Y of features.
 Idea from statistics: Mallows 1972 included the method of quantizing feature space, which was rediscovered by Rubner et al. 1998 as the Earth Mover's Distance (EMD).
 Indyk and Thaper 2003 showed how to embed points in a multiscale pyramid so that the L2 norm on the embedding approximates EMD.
 Grauman replaced L2 with histogram intersection. The histogram intersection / min kernel is positive definite, so we can use it for a kernelized SVM.
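
To make this concrete, here is a small Python sketch of the min (histogram intersection) kernel and a toy 1-D pyramid match in the spirit of Grauman & Darrell: matches are counted level by level, and matches that only appear at coarser resolutions are discounted. The bin counts, level weights, and the [0, 1] feature range are illustrative assumptions, not the exact published scheme.

    import numpy as np

    def min_kernel(a, b):
        """Histogram intersection (min) kernel between two histograms."""
        return np.minimum(a, b).sum()

    def pyramid_match(x, y, levels=4, finest_bins=16):
        """Toy pyramid match between two sets of scalar features in [0, 1]."""
        score, prev = 0.0, 0.0
        for level in range(levels):
            nbins = max(1, finest_bins >> level)      # bins double in width each level
            hx, _ = np.histogram(x, bins=nbins, range=(0.0, 1.0))
            hy, _ = np.histogram(y, bins=nbins, range=(0.0, 1.0))
            inter = np.minimum(hx, hy).sum()          # matches found at this level
            score += (inter - prev) / 2.0 ** level    # only new matches, discounted
            prev = inter
        return score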


  13. Spatial Pyramid Match (Lazebnik)
 Only use the pyramid for the spatial coordinates of features.


  14. Spatial Pyramid Match (Lazebnik)
 Applied to a large region or the whole image; no interest point operator.


  15. Rotation / scale invariance is not always needed. Airplanes on the runway are level.


  16.-17. Spatial Pyramid Kernel (Lazebnik)
 Distribution of edge features: x, y, orientation, energy.
 E(x, y, o) = edge energy at (x, y) in orientation o.
 Histograms are just sums of different slices of E (just a linear projection if E is represented discretely).
 The same holds for GIST, Shape Contexts, Geometric Blur, HOG, etc. The only impediment to an understanding of all of these features as simple projections of something like E() above is the min kernel…
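
The "histograms are linear projections" point is easy to check numerically. A small numpy illustration (array shapes are my own choices): with E stored discretely, a cell's orientation histogram is a sum over a slice of E, which is exactly a fixed linear map applied to the flattened E.

    import numpy as np

    # E[x, y, o]: edge energy at pixel (x, y) in orientation channel o.
    E = np.random.rand(64, 64, 8)                    # illustrative 64x64 image, 8 orientations

    # Orientation histogram of one 16x16 spatial cell: a sum over a slice of E.
    cell_hist = E[0:16, 0:16, :].sum(axis=(0, 1))

    # The same histogram as an explicit linear projection of the flattened E.
    P = np.zeros((8, E.size))
    for o in range(8):
        mask = np.zeros_like(E)
        mask[0:16, 0:16, o] = 1.0                    # indicator of (cell, orientation o)
        P[o] = mask.ravel()
    assert np.allclose(P @ E.ravel(), cell_hist)     # identical either way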


  18. Unified Feature Pipeline
 Image -> Edges / filter responses -> Contrast normalization -> Projection -> Comparison (L2, inner product, or min kernel).


  19. Max-Margin Additive Classifiers for Detection
 Subhransu Maji (UC Berkeley), Alex Berg (Columbia University).
 Will be a talk at ICCV 2009 in Kyoto.


  20.-23. Detection
 Find pedestrians.


  24. Detection
 Find pedestrians: 10^4 to 10^6 or more windows per image.


  25. Detection
 Find pedestrians: 10^4 to 10^6 or more windows per image.
 Boosting + decision trees: Viola & Jones (faces).
 Linear classifier: Dalal & Triggs (pedestrians).
 Neural networks: Rowley et al. (faces).


  26.-28. Classification
 What is this? Choose from many categories. ~10^5 example images (training).


  29. Classification
 What is this? Choose from many categories. ~10^5 example images (training).
 Nearest neighbor: Berg (Caltech 101).
 Kernelized SVM: Grauman et al. (Caltech 101).
 Combination of SVMs: Varma et al. (Caltech 101).
 (Skipping model-based methods.)


  30. Classification
 What is this? Choose from many categories. ~10^5 example images (training). Slow?
 Nearest neighbor: Berg (Caltech 101), 3 sec / comparison.
 Kernelized SVM: Grauman et al. (Caltech 101), 0.001 sec / comparison.
 Combination of SVMs: Varma et al. (Caltech 101).
 (Skipping model-based methods.)
 Caltech 101: Fei-Fei Li, Pietro Perona 2004.


  31. Detection / Classification
 Detection: linear classifier. Classification: kernelized SVM classifier.


  32. Detection / Classification
 Detection, linear classifier: h(x) = \sum_{i=1}^{#dims} w_i x_i + b. Decision function is sign(h).
 Classification, kernelized SVM classifier: h(x) = \sum_{j=1}^{#sv} \alpha_j K(x, x_j) + b. Decision function is sign(h).


  33.-34. Detection / Classification
 Linear classifier: h(x) = \sum_{i=1}^{#dims} w_i x_i + b, cost O(#dims). Each x_i is one coordinate of the test feature vector.
 Kernelized SVM classifier: h(x) = \sum_{j=1}^{#sv} \alpha_j K(x, x_j) + b, cost O(#dims x #sv). K is the kernel function (a comparison) and each x_j is a support vector (a training example).
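
A side-by-side sketch of the two decision functions, to make the costs above concrete (names are illustrative; the min kernel is one possible choice of K):

    import numpy as np

    def h_linear(x, w, b):
        """Linear classifier: one dot product, O(#dims) per window."""
        return np.dot(w, x) + b

    def h_kernelized(x, support_vectors, alphas, b, K):
        """Kernelized SVM: one kernel call per support vector, O(#dims x #sv)."""
        return sum(a * K(x, sv) for a, sv in zip(alphas, support_vectors)) + b

    # Example kernel: the histogram intersection (min) kernel from earlier slides.
    K_min = lambda a, b: np.minimum(a, b).sum()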


  35.-37. An SVM with an Additive Kernel Can Be Evaluated Efficiently
 Maji, Berg, Malik CVPR 2008.
 If you have an additive kernel,
   K(a, b) = \sum_{i=1}^{#dims} K_i(a_i, b_i),
 then the SVM decision function is additive:
   h(x) = \sum_{j=1}^{#sv} \alpha_j K(x, x_j) + b
        = \sum_{j=1}^{#sv} \alpha_j \sum_{i=1}^{#dims} K_i(x_i, x_{j,i}) + b
        = \sum_{i=1}^{#dims} h_i(x_i) + b.
 Evaluate these 1-D functions h_i efficiently using a lookup table or a spline (exact or approximate).
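
A minimal sketch of that table-based evaluation, under my reading of the slide: tabulate each 1-D function h_i on a grid once, then classify with #dims lookups. The grid size and nearest-grid lookup (rather than spline interpolation) are illustrative choices.

    import numpy as np

    def build_tables(support_vectors, alphas, Ki, grid):
        """Tabulate h_i(v) = sum_j alpha_j * Ki(v, x_{j,i}) for every dimension i.

        support_vectors: (n_sv, n_dims) array. Ki must broadcast over its
        second argument, e.g. np.minimum for the min kernel.
        """
        n_sv, n_dims = support_vectors.shape
        tables = np.zeros((n_dims, len(grid)))
        for i in range(n_dims):
            for g, v in enumerate(grid):
                tables[i, g] = np.sum(alphas * Ki(v, support_vectors[:, i]))
        return tables

    def h_additive(x, tables, grid, b):
        """h(x) = b + sum_i h_i(x_i) via table lookup: O(#dims), independent of #sv."""
        idx = np.searchsorted(grid, x).clip(0, len(grid) - 1)
        return tables[np.arange(len(x)), idx].sum() + b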



  38. Intersection or Min Kernel
 Maji, Berg, Malik CVPR 2008.
 The intersection or min kernel: K_min(a, b) = \sum_{i=1}^{#dims} min(a_i, b_i).
 Grauman et al. use this on multiscale histograms to approximate the linear assignment problem (and do recognition).
 Lazebnik et al. refine this approach to only use multiple scales for position, and not for the features.
 Much follow-on work.


  39.-41. Intersection or Min Kernel
 Maji, Berg, Malik CVPR 2008.
 K_min(a, b) = \sum_{i=1}^{#dims} min(a_i, b_i)
 h(x) = \sum_{j=1}^{#sv} \alpha_j K_min(x, x_j) + b
      = \sum_{j=1}^{#sv} \alpha_j \sum_{i=1}^{#dims} min(x_i, x_{j,i}) + b
      = \sum_{i=1}^{#dims} h_i(x_i) + b,   where   h_i(x_i) = \sum_{j=1}^{#sv} \alpha_j min(x_i, x_{j,i}).
 The support vectors are constants, and min(x_i, constant) is piecewise linear, so h_i(x_i) is piecewise linear.
 O(#dims x #sv) becomes O(#dims x log(#sv)) exact, or O(#dims) approximate.
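
Because each h_i is piecewise linear with breakpoints at the support-vector values, it can be evaluated exactly with one binary search plus two precomputed prefix sums, which is where O(#dims x log(#sv)) comes from. A sketch for a single dimension (function names are mine): split the sum at s, so values below s contribute alpha_j * v_j and values above contribute alpha_j * s.

    import numpy as np

    def precompute_dim(values, alphas):
        """Sort one dimension's support values and build weighted prefix sums."""
        order = np.argsort(values)
        v, a = values[order], alphas[order]
        cs_av = np.concatenate(([0.0], np.cumsum(a * v)))  # prefix sums of alpha_j * v_j
        cs_a = np.concatenate(([0.0], np.cumsum(a)))       # prefix sums of alpha_j
        return v, cs_av, cs_a

    def h_i(s, v, cs_av, cs_a):
        """Exact h_i(s) = sum_j alpha_j * min(s, v_j) in O(log #sv) per query."""
        k = np.searchsorted(v, s, side='right')            # how many v_j <= s
        return cs_av[k] + s * (cs_a[-1] - cs_a[k])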

  42. Time to Perform Classification
 Maji, Berg, Malik CVPR 2008. Times in seconds to classify 10,000 test vectors.


  43. Multiscale HOG Features (Very Similar to Spatial Pyramids)
 Based on histograms of responses to eight oriented edge detectors. Non-overlapping windows of integration and fixed-size windows for contrast normalization allow efficient computation.
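
A compact numpy sketch of this kind of feature: gradient magnitude accumulated into eight orientation bins over non-overlapping cells. The gradient operator, the cell size, and the unsigned-orientation binning are illustrative choices, not the exact pipeline from the talk.

    import numpy as np

    def cell_orientation_histograms(image, n_orient=8, cell=8):
        """Per-cell histograms of oriented edge energy (simplified HOG-like feature)."""
        gy, gx = np.gradient(image.astype(float))
        mag = np.hypot(gx, gy)                             # edge energy
        ang = np.mod(np.arctan2(gy, gx), np.pi)            # unsigned orientation in [0, pi)
        bins = np.minimum((ang / np.pi * n_orient).astype(int), n_orient - 1)
        hists = np.zeros((image.shape[0] // cell, image.shape[1] // cell, n_orient))
        for cy in range(hists.shape[0]):                   # non-overlapping windows
            for cx in range(hists.shape[1]):
                sl = np.s_[cy * cell:(cy + 1) * cell, cx * cell:(cx + 1) * cell]
                for o in range(n_orient):
                    hists[cy, cx, o] = mag[sl][bins[sl] == o].sum()
        return hists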


  44. Example h_i(x_i) and Approximations


  45. Min Kernel "Better" than Linear


  46. Min Kernel "Better" than Linear
 Caltech 101 with "simple features", 15 training examples per category:
 Linear SVM: 40% correct. Min kernel (IK) SVM: 52% correct.
 [Chart: accuracy of min kernel vs. linear on text classification.]


  47. Now We Can Use the Min Kernel for Detection in Seconds Instead of Hours


  48. Direct Training
 It is possible to directly train classifiers with the same structure as the approximation, without using support vectors at all. The formulation is very similar to a linear classifier, with different regularization. It can be trained efficiently using stochastic (sub)gradient descent.
 Linear:
   minimize  w'w + c \sum_j \xi_j
   subject to  y_j (w' x_j + b) >= 1 - \xi_j,  \xi_j >= 0.
 Piecewise linear:
   minimize  \hat{w}' H \hat{w} + c \sum_j \xi_j
   subject to  y_j (\hat{w}' \hat{x}_j + b) >= 1 - \xi_j,  \xi_j >= 0.
 H is the tridiagonal matrix
   [  1  -1               ]
   [ -1   2  -1           ]
   [      .    .    .     ]
   [          -1   2  -1  ]
   [              -1   1  ]
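
One way to read the piecewise-linear formulation above: encode each coordinate with linear-interpolation ("hat") weights on a small grid, so \hat{w}' \hat{x} is piecewise linear in every original coordinate, and H penalizes the curvature of each learned 1-D function. The encoding below is my illustrative reconstruction, not necessarily the exact one used in the paper.

    import numpy as np

    def hat_encode(x, n_knots=10, lo=0.0, hi=1.0):
        """Encode each coordinate with linear-interpolation weights on a uniform
        grid, so w_hat . hat_encode(x) is piecewise linear in every x_i."""
        knots = np.linspace(lo, hi, n_knots)
        out = np.zeros((len(x), n_knots))
        for i, v in enumerate(x):
            v = min(max(v, lo), hi)                          # clamp onto the grid
            k = min(int((v - lo) / (hi - lo) * (n_knots - 1)), n_knots - 2)
            t = (v - knots[k]) / (knots[k + 1] - knots[k])
            out[i, k], out[i, k + 1] = 1.0 - t, t            # the two weights sum to 1
        return out.ravel()                                   # concatenated per-dimension codes

    def second_difference_H(n_knots):
        """The tridiagonal regularizer shown on the slide (for one dimension)."""
        H = 2.0 * np.eye(n_knots) - np.eye(n_knots, k=1) - np.eye(n_knots, k=-1)
        H[0, 0] = H[-1, -1] = 1.0
        return H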


  49. Slightly Different Formulation
 Linear:
   min_w  (\lambda / 2) w'w + (1/m) \sum_i \ell(w; (x_i, y_i)).
 Piecewise linear:
   min_{\hat{w}}  (\lambda / 2) \hat{w}' H \hat{w} + (1/m) \sum_i \ell(\hat{w}; (\hat{x}_i, y_i)).

  50. Shalev-Shwartz, Singer, Srebro ICML 2007
 O(d / (\lambda \epsilon)) for accuracy \epsilon.

  51. Shalev-Shwartz, Singer, Srebro ICML 2007
 O(d / (\lambda \epsilon)) for accuracy \epsilon.
 Replace \|w\| = \sqrt{w'w} with \sqrt{w' H w}; the stochastic update becomes w_{t+1} = (1 - \eta_t \lambda H) w_t + ...
 Maji, Berg ICCV 2009.
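
Following the Pegasos recipe of Shalev-Shwartz et al., a single stochastic subgradient step for the piecewise-linear objective above might look like the sketch below; the (1 - eta * lambda * H) shrinkage mirrors the update fragment on this slide. This is my reading, not the exact algorithm of Maji and Berg.

    import numpy as np

    def pegasos_step(w, H, x_hat, y, lam, t):
        """One step for min_w (lam/2) w' H w + (1/m) sum_i hinge(w; x_i, y_i)."""
        eta = 1.0 / (lam * t)                  # standard Pegasos step size
        violated = y * np.dot(w, x_hat) < 1.0  # is the hinge subgradient active?
        w = w - eta * lam * (H @ w)            # (I - eta*lam*H) w shrinkage
        if violated:
            w = w + eta * y * x_hat            # move toward the violated example
        return w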

