Instance-level recognition: Local invariant features
Cordelia Schmid INRIA, Grenoble
Overview
– Introduction to local features
– Harris interest points + SSD, ZNCC, SIFT
– Scale & affine invariant interest point detectors
Local features
local descriptor
– Several / many local descriptors per image
– Robust to occlusion/clutter + no object segmentation required
– Photometric: distinctive
– Invariant: to image transformations + illumination changes
Local features
– Interest points: patch descriptors, e.g. SIFT
– Contours/lines: mid-points, angles
– Region segments: color/texture histogram
Interest points / invariant regions
– Harris detector
– Scale/affine invariant detector
(presented in this lecture)
Contours / lines
Contour extraction
– Zero crossing of Laplacian
– Local maxima of gradients
– Global probability of boundary (gPb) detector [Malik et al., UC Berkeley]
– Structured forests for fast edge detection (SED) [Dollar and Zitnick] – student presentation
Region segments / superpixels
– Simple linear iterative clustering (SLIC)
– Normalized cut [Shi & Malik], Mean Shift [Comaniciu & Meer], ...
Application: matching
Find corresponding locations in the image
Illustration – Matching
Interest points extracted with Harris detector (~500 points)
Illustration – Matching
Interest points matched based on cross-correlation (188 pairs)
Illustration – Matching
Global constraint: robust estimation of the fundamental matrix
99 inliers, 89 outliers
Application: Panorama stitching
Images courtesy of A. Zisserman.
Application: Instance-level recognition
Search for particular objects and scenes in large databases
Difficulties
Finding the object despite possibly large changes in scale, viewpoint, lighting and partial occlusion requires invariant description
(figure panels: Scale, Viewpoint, Lighting, Occlusion)
Difficulties
Very large image collections: need for efficient indexing
– Flickr has 2 billion photographs, more than 1 million added daily
– Facebook has 15 billion images (~27 million added daily)
– Large personal collections
– Video collections, e.g. YouTube
Applications
Search photos on the web for particular places
Find these landmarks ... in these images and 1M more
Applications
Take a picture of a product or advertisement, find relevant information on the web
Applications
Search in 200h of video
(figure: query video)
Overview
– Harris interest points + SSD, ZNCC, SIFT
Harris detector [Harris & Stephens'88]
Based on the idea of auto-correlation
Important difference in all directions => interest point
Harris detector
Auto-correlation function for a point $(x, y)$ and a shift $(\Delta x, \Delta y)$
$$A(x, y) = \sum_{(x_k, y_k) \in W(x, y)} \big( I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y) \big)^2$$
where $W(x, y)$ is a window centred on $(x, y)$.
$A(x, y)$ is
– small in all directions → uniform region
– large in one direction → contour
– large in all directions → interest point
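The three cases can be checked numerically on synthetic patterns. The following is a minimal sketch, not part of the original slides: the function names, the 5x5 window, the four test shifts and the synthetic images are all illustrative assumptions.

```python
import numpy as np

def autocorrelation(image, x, y, dx, dy, half_window=2):
    """A(x, y) for one shift (dx, dy): sum of squared differences over W(x, y)."""
    total = 0.0
    for yk in range(y - half_window, y + half_window + 1):
        for xk in range(x - half_window, x + half_window + 1):
            diff = float(image[yk, xk]) - float(image[yk + dy, xk + dx])
            total += diff * diff
    return total

def make_patterns(size=21):
    """Synthetic test images (assumed for illustration): flat, step edge, corner."""
    c = size // 2
    flat = np.zeros((size, size))
    edge = np.zeros((size, size))
    edge[:, c:] = 1.0            # vertical step edge
    corner = np.zeros((size, size))
    corner[c:, c:] = 1.0         # one bright quadrant
    return flat, edge, corner

# Shifts in four directions: horizontal, vertical and the two diagonals.
SHIFTS = [(1, 0), (0, 1), (1, 1), (-1, 1)]

def shift_responses(image):
    """Evaluate A at the image centre for every shift in SHIFTS."""
    c = image.shape[0] // 2
    return [autocorrelation(image, c, c, dx, dy) for dx, dy in SHIFTS]
```

On these patterns the flat image gives zero for every shift, the edge gives zero only for the shift along the edge, and the corner gives a non-zero value for every shift.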
Harris detector
Discrete shifts are avoided based on the auto-correlation matrix
with first order approximation
$$I(x_k + \Delta x,\, y_k + \Delta y) \approx I(x_k, y_k) + \big( I_x(x_k, y_k) \;\; I_y(x_k, y_k) \big) \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$
$$A(x, y) = \sum_{(x_k, y_k) \in W} \big( I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y) \big)^2 \approx \sum_{(x_k, y_k) \in W} \left( \big( I_x(x_k, y_k) \;\; I_y(x_k, y_k) \big) \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \right)^2$$
Harris detector
$$A(x, y) \approx \big( \Delta x \;\; \Delta y \big) \begin{pmatrix} \sum_{W} (I_x(x_k, y_k))^2 & \sum_{W} I_x(x_k, y_k)\, I_y(x_k, y_k) \\ \sum_{W} I_x(x_k, y_k)\, I_y(x_k, y_k) & \sum_{W} (I_y(x_k, y_k))^2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$
Auto-correlation matrix
the sum can be smoothed with a Gaussian
$$\big( \Delta x \;\; \Delta y \big) \; G \otimes \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$
Harris detector
Auto-correlation matrix
$$A(x, y) = G \otimes \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$
– captures the structure of the local neighborhood
– measure based on eigenvalues of this matrix
two strong eigenvalues => interest point
one strong eigenvalue => contour
Interpreting the eigenvalues
Classification of image points using eigenvalues of autocorrelation matrix:
– "Corner": $\lambda_1$ and $\lambda_2$ are large, $\lambda_1 \sim \lambda_2$
– "Edge": $\lambda_1 \gg \lambda_2$ or $\lambda_2 \gg \lambda_1$
– "Flat" region: $\lambda_1$ and $\lambda_2$ are small
(figure axes: $\lambda_1$, $\lambda_2$)
Corner response function
$$R = \det(A) - \alpha\, \mathrm{trace}(A)^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$$
α: constant (0.04 to 0.06)
– "Corner": R > 0
– "Edge": R < 0
– "Flat" region: |R| small
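As a small worked example of the response function, the sketch below evaluates R for three eigenvalue pairs. The helper names, the value α = 0.05 (inside the quoted 0.04 to 0.06 range) and the `flat_threshold` cut-off are assumptions made for illustration only.

```python
def corner_response(lambda1, lambda2, alpha=0.05):
    """R = det(A) - alpha * trace(A)^2, written in terms of the eigenvalues."""
    det = lambda1 * lambda2
    trace = lambda1 + lambda2
    return det - alpha * trace ** 2

def classify(lambda1, lambda2, alpha=0.05, flat_threshold=0.1):
    """Label a point from the sign and size of R (flat_threshold is hypothetical)."""
    r = corner_response(lambda1, lambda2, alpha)
    if abs(r) < flat_threshold:
        return "flat"
    return "corner" if r > 0 else "edge"
```

For example, `classify(10, 10)` gives "corner" (R = 80), `classify(10, 0.1)` gives "edge" (R is about -4.1) and `classify(0.1, 0.1)` gives "flat" (R = 0.008).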
Harris detector
Cornerness function
$$R = \det(A) - k\, (\mathrm{trace}(A))^2 = \lambda_1 \lambda_2 - k (\lambda_1 + \lambda_2)^2$$
Reduces the effect of a strong contour
– Threshold (absolute, relative, number of corners)
– Local maxima
$$f > \mathrm{thresh} \;\wedge\; \forall (x', y') \in 8\text{-neighbourhood}: \; f(x, y) \ge f(x', y')$$
Harris Detector: Steps
– Compute corner response R
– Find points with large corner response: R > threshold
– Take only the points of local maxima of R
Harris detector: Summary of steps
1. Compute Gaussian derivatives at each pixel
2. Compute second moment matrix A in a Gaussian window around each pixel
3. Compute corner response function R
4. Threshold R
5. Find local maxima of response function (non-maximum suppression)
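The five steps can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the lecture's implementation: Gaussian derivatives are approximated by central differences on a Gaussian-smoothed image, and the function names and the parameters `sigma_d`, `sigma_i`, `alpha` and `rel_threshold` are hypothetical choices.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel, truncated at about 3 sigma and normalized to sum 1."""
    radius = int(3 * sigma + 0.5)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth(image, sigma):
    """Separable Gaussian smoothing with edge replication at the borders."""
    k = gaussian_kernel(sigma)
    padded = np.pad(image, len(k) // 2, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def harris_response(image, sigma_d=1.0, sigma_i=2.0, alpha=0.05):
    # Step 1: derivatives of the smoothed image (approximate Gaussian derivatives).
    iy, ix = np.gradient(smooth(image.astype(float), sigma_d))
    # Step 2: entries of the second moment matrix A in a Gaussian window.
    a_xx = smooth(ix * ix, sigma_i)
    a_xy = smooth(ix * iy, sigma_i)
    a_yy = smooth(iy * iy, sigma_i)
    # Step 3: corner response R = det(A) - alpha * trace(A)^2.
    return a_xx * a_yy - a_xy ** 2 - alpha * (a_xx + a_yy) ** 2

def harris_points(image, rel_threshold=0.1, **kwargs):
    r = harris_response(image, **kwargs)
    thresh = rel_threshold * r.max()   # Step 4: relative threshold on R.
    points = []
    h, w = r.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Step 5: keep local maxima over the 8-neighbourhood.
            if r[y, x] > thresh and r[y, x] >= r[y - 1:y + 2, x - 1:x + 2].max():
                points.append((x, y))
    return points
```

On a synthetic image containing one bright square, the returned points are expected to lie close to the four corners of the square, while pixels along the straight edges are rejected because R is negative there.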
Harris - invariance to transformations
Geometric transformations
– translation
– rotation
– similitude (rotation + scale change)
– affine (valid for local planar objects)
Photometric transformations
– Affine intensity changes (I → a I + b)
Harris Detector: Invariance Properties
Ellipse rotates but its shape (i.e. eigenvalues) remains the same
Corner response R is invariant to image rotation
Harris Detector: Invariance Properties
Scaling
Corner → All points will be classified as edges
Not invariant to scaling
Harris Detector: Invariance Properties
– Only derivatives are used => invariance to intensity shift I → I + b
– Intensity scale: I → a I
(figure: R plotted against x (image coordinate) with a threshold, before and after intensity scaling)
Partially invariant to affine intensity change, dependent on type of threshold
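A short derivation, added here as a sketch, shows why the invariance is only partial. It assumes the corner response $R = \det(A) - \alpha\,\mathrm{trace}(A)^2$ defined earlier and ignores saturation of the intensities.

```latex
\begin{align*}
I' = aI + b \;&\Rightarrow\; I'_x = a\,I_x, \qquad I'_y = a\,I_y \\
A' &= G \otimes \begin{pmatrix} {I'_x}^2 & I'_x I'_y \\ I'_x I'_y & {I'_y}^2 \end{pmatrix} = a^2 A \\
R' &= \det(A') - \alpha\,\mathrm{trace}(A')^2 = a^4 \det(A) - \alpha\, a^4\, \mathrm{trace}(A)^2 = a^4 R
\end{align*}
```

The offset b disappears and the positions of the local maxima of R do not move, but every value of R is multiplied by $a^4$. An absolute threshold therefore selects a different set of points when $a \neq 1$, whereas a relative threshold (a fraction of the maximum of R) or a fixed number of corners selects the same points.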
Comparison of patches - SSD
Comparison of the intensities in the neighborhood of two interest points
(figure: point $(x_1, y_1)$ in image 1, point $(x_2, y_2)$ in image 2)
SSD: sum of squared differences
$$\mathrm{SSD} = \frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \big( I_1(x_1 + i,\, y_1 + j) - I_2(x_2 + i,\, y_2 + j) \big)^2$$
Small difference values → similar patches
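The SSD measure translates directly into a few lines of NumPy. The sketch below assumes images stored as 2D arrays indexed `[y, x]` and points far enough from the border; the function name and the default `n=2` (a 5x5 window) are illustrative choices.

```python
import numpy as np

def ssd(image1, p1, image2, p2, n=2):
    """Normalized sum of squared differences between two (2n+1)x(2n+1) patches."""
    (x1, y1), (x2, y2) = p1, p2
    w1 = image1[y1 - n:y1 + n + 1, x1 - n:x1 + n + 1].astype(float)
    w2 = image2[y2 - n:y2 + n + 1, x2 - n:x2 + n + 1].astype(float)
    return float(np.sum((w1 - w2) ** 2) / (2 * n + 1) ** 2)
```

Identical patches give 0, and adding a constant offset c to one image gives c squared, which illustrates that SSD is not robust to illumination changes.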
Cross-correlation ZNCC
ZNCC: zero normalized cross correlation
$$\mathrm{ZNCC} = \frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( \frac{I_1(x_1 + i,\, y_1 + j) - m_1}{\sigma_1} \right) \cdot \left( \frac{I_2(x_2 + i,\, y_2 + j) - m_2}{\sigma_2} \right)$$
where $m_1, m_2$ are the mean intensities and $\sigma_1, \sigma_2$ the standard deviations of the two patches.
ZNCC values between -1 and 1, 1 when identical patches
in practice threshold around 0.5
Robust to illumination change I → aI + b
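A minimal NumPy sketch of ZNCC follows. It assumes 2D arrays indexed `[y, x]`, points away from the border, and patches that are not constant (a constant patch has zero standard deviation); the function name and default window size are illustrative.

```python
import numpy as np

def zncc(image1, p1, image2, p2, n=2):
    """Zero normalized cross correlation of two (2n+1)x(2n+1) patches."""
    (x1, y1), (x2, y2) = p1, p2
    w1 = image1[y1 - n:y1 + n + 1, x1 - n:x1 + n + 1].astype(float)
    w2 = image2[y2 - n:y2 + n + 1, x2 - n:x2 + n + 1].astype(float)
    # Subtract the patch mean and divide by the patch standard deviation.
    z1 = (w1 - w1.mean()) / w1.std()
    z2 = (w2 - w2.mean()) / w2.std()
    return float(np.mean(z1 * z2))
```

Comparing a patch with an affinely changed copy `a * image + b` gives a value of 1 for a > 0, and comparing it with a contrast-inverted copy gives -1.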
Local descriptors
– Pixel values
– Greyvalue derivatives
– Differential invariants
– SIFT descriptor
Local descriptors
Greyvalue derivatives
– Convolution with Gaussian derivatives
$$\mathbf{v}(x, y) = \begin{pmatrix} I(x, y) * G(\sigma) \\ I(x, y) * G_x(\sigma) \\ I(x, y) * G_y(\sigma) \\ I(x, y) * G_{xx}(\sigma) \\ I(x, y) * G_{xy}(\sigma) \\ I(x, y) * G_{yy}(\sigma) \\ \vdots \end{pmatrix}$$
$$I(x, y) * G(\sigma) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x', y', \sigma)\, I(x - x',\, y - y')\, dx'\, dy'$$
$$G(x, y, \sigma) = \frac{1}{2 \pi \sigma^2} \exp\left( -\frac{x^2 + y^2}{2 \sigma^2} \right)$$
Local descriptors
Notation for greyvalue derivatives [Koenderink'87]
$$\mathbf{v}(x, y) = \begin{pmatrix} I(x, y) * G(\sigma) \\ I(x, y) * G_x(\sigma) \\ I(x, y) * G_y(\sigma) \\ I(x, y) * G_{xx}(\sigma) \\ I(x, y) * G_{xy}(\sigma) \\ I(x, y) * G_{yy}(\sigma) \\ \vdots \end{pmatrix} = \begin{pmatrix} L(x, y) \\ L_x(x, y) \\ L_y(x, y) \\ L_{xx}(x, y) \\ L_{xy}(x, y) \\ L_{yy}(x, y) \\ \vdots \end{pmatrix}$$
Invariance?
Local descriptors – rotation invariance
Invariance to image rotation: differential invariants [Koen87]
$$\begin{pmatrix} L \\ L_x L_x + L_y L_y \\ L_{xx} L_x L_x + 2 L_{xy} L_x L_y + L_{yy} L_y L_y \\ L_{xx} + L_{yy} \\ L_{xx} L_{xx} + 2 L_{xy} L_{xy} + L_{yy} L_{yy} \\ \vdots \end{pmatrix}$$
gradient magnitude: $L_x L_x + L_y L_y$
Laplacian: $L_{xx} + L_{yy}$
Laplacian of Gaussian (LOG)
$$\mathrm{LOG} = G_{xx}(\sigma) + G_{yy}(\sigma)$$
SIFT descriptor [Lowe'99]
– 8 orientations of the gradient
– 4x4 spatial grid
– Dimension 128
– soft-assignment to spatial bins
– normalization of the descriptor to norm one
– comparison with Euclidean distance
(figure: image patch → gradient → 3D histogram over x, y and orientation)
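The layout of the descriptor (4x4 spatial cells times 8 orientation bins, 128 values, unit norm) can be illustrated with a simplified sketch. This is not Lowe's implementation: it uses hard assignment instead of the soft-assignment listed above, applies no Gaussian weighting, and the function name and its parameters are hypothetical.

```python
import numpy as np

def sift_like_descriptor(patch, grid=4, n_orient=8):
    """Simplified SIFT-style descriptor: gradient orientation histograms on a grid."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % (2 * np.pi)
    # Hard assignment of every pixel to one orientation bin and one spatial cell.
    o_bin = np.minimum((orientation / (2 * np.pi) * n_orient).astype(int), n_orient - 1)
    h, w = patch.shape
    y_bin = np.minimum(np.arange(h) * grid // h, grid - 1)
    x_bin = np.minimum(np.arange(w) * grid // w, grid - 1)
    hist = np.zeros((grid, grid, n_orient))
    for y in range(h):
        for x in range(w):
            hist[y_bin[y], x_bin[x], o_bin[y, x]] += magnitude[y, x]
    desc = hist.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc   # normalize to norm one
```

Two such descriptors would then be compared with the Euclidean distance, for example `np.linalg.norm(d1 - d2)`.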
Local descriptors - rotation invariance
Estimation of the dominant orientation
– extract gradient orientation
– histogram over gradient orientation
– peak in this histogram
(figure: orientation histogram over $[0, 2\pi]$)
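The three steps above can be sketched as follows. The function name, the 36-bin histogram and the weighting of each vote by gradient magnitude are assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def dominant_orientation(patch, n_bins=36):
    """Return the centre of the peak bin of the gradient orientation histogram."""
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % (2 * np.pi)     # extract gradient orientation
    hist, edges = np.histogram(orientation, bins=n_bins,
                               range=(0, 2 * np.pi), weights=magnitude)
    peak = int(np.argmax(hist))                         # peak in this histogram
    return 0.5 * (edges[peak] + edges[peak + 1])
```

For a patch whose intensity increases equally in x and y, the gradient points along the diagonal and the estimated orientation is π/4, up to the bin width.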
Local descriptors – illumination change
in case of an affine transformation
$$I_1(\mathbf{x}) = a\, I_2(\mathbf{x}) + b$$
Invariance to scale changes
– In case of a convolution with Gaussian derivatives defined by
$$I(x, y) * G(\sigma) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x', y', \sigma)\, I(x - x',\, y - y')\, dx'\, dy'$$
$$G(x, y, \sigma) = \frac{1}{2 \pi \sigma^2} \exp\left( -\frac{x^2 + y^2}{2 \sigma^2} \right)$$
Overview
– Harris interest points + SSD, ZNCC, SIFT
Scale invariance - motivation
Interest points have to be repeatable for scale changes
Harris detector + scale changes
Repeatability rate
$$R(\varepsilon) = \frac{\left| \{ (a_i, b_i) \mid \mathrm{dist}(H(a_i), b_i) < \varepsilon \} \right|}{\max(|a_i|, |b_i|)}$$
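The repeatability rate can be computed along the following lines. This is a simplified sketch: it assumes H is a 3x3 homography mapping image 1 to image 2, pairs each projected point with its nearest detection in image 2 (so two points may share a partner), and uses a hypothetical default `eps=1.5` pixels.

```python
import numpy as np

def repeatability(points_a, points_b, homography, eps=1.5):
    """Fraction of points repeated within eps, relative to the larger point set."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Project the points of image 1 into image 2 with homogeneous coordinates.
    proj = (homography @ np.hstack([a, np.ones((len(a), 1))]).T).T
    proj = proj[:, :2] / proj[:, 2:3]
    # Distance from every projected point to every detection in image 2.
    dist = np.linalg.norm(proj[:, None, :] - b[None, :, :], axis=2)
    repeated = int(np.sum(dist.min(axis=1) < eps))
    return repeated / max(len(a), len(b))
```

With a pure scale change of factor 2 as homography, the points (1, 1), (2, 3), (5, 5) in the first image and (2, 2), (4, 6), (20, 20), (30, 30) in the second give two repeated points out of max(3, 4) = 4, i.e. a rate of 0.5.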
Scale adaptation
Scale change between two images
$$I_1 \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = I_2 \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = I_2 \begin{pmatrix} s x_1 \\ s y_1 \end{pmatrix}$$
Scale adapted derivative calculation
$$I_1 \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} * G_{i_1 \ldots i_n}(\sigma) = s^n\, I_2 \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} * G_{i_1 \ldots i_n}(s \sigma)$$
Harris detector – adaptation to scale
Repeatability rate $R(\varepsilon)$ over the point pairs $\{ (a_i, b_i) \mid \mathrm{dist}(H(a_i), b_i) < \varepsilon \}$
Scale selection
For a point compute a value (gradient, Laplacian etc.) at several scales
Normalization of the values with the scale factor, e.g. Laplacian
$$\left| s^2 (L_{xx} + L_{yy}) \right|$$
Exp. results show that the Laplacian gives best results
(figure: normalized Laplacian response plotted against scale)
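A worked example, added as a sketch, shows why the normalization by $s^2$ matters. It assumes an idealized image consisting of a single Gaussian blob of width $\sigma_0$ centred at the origin, $I(x, y) = G(x, y, \sigma_0)$, which is not a case taken from the slides.

```latex
\begin{align*}
L(x, y, s) &= I * G(s) = G\!\left(x, y, \sqrt{\sigma_0^2 + s^2}\right) \\
(L_{xx} + L_{yy})(0, 0, s) &= -\frac{1}{\pi (\sigma_0^2 + s^2)^2} \\
\left| s^2 (L_{xx} + L_{yy}) \right| &= \frac{s^2}{\pi (\sigma_0^2 + s^2)^2} \\
\frac{d}{ds} \frac{s^2}{\pi (\sigma_0^2 + s^2)^2} &= \frac{2 s\, (\sigma_0^2 - s^2)}{\pi (\sigma_0^2 + s^2)^3} = 0 \;\Rightarrow\; s = \sigma_0
\end{align*}
```

The normalized response peaks exactly at $s = \sigma_0$, so the selected scale follows the size of the structure. Without the factor $s^2$ the response $1 / (\pi (\sigma_0^2 + s^2)^2)$ decreases monotonically with $s$ and has no peak to select.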
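Scale selection
Scale invariance of the characteristic scale
(figure: normalized Laplacian response plotted against scale for the same point in two images)
$$s \cdot s_1 = s_2$$
where $s_1$ and $s_2$ are the characteristic scales found in the two images and $s$ is the scale change between the images.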
Scale-invariant detectors
Harris-Laplace (Mikolajczyk & Schmid'01)
(figure panels: Harris-Laplace, Laplacian)
Harris-Laplace
– multi-scale Harris points
– selection of points at maximum of Laplacian
– invariant points + associated regions [Mikolajczyk & Schmid'01]
Matching results
213 / 190 detected interest points
Matching results
58 points are initially matched
Matching results
32 points are matched after verification – all correct
LOG detector
Convolve image with scale-normalized Laplacian at several scales
$$\mathrm{LOG} = s^2 \big( G_{xx}(\sigma) + G_{yy}(\sigma) \big)$$
Detection of maxima and minima
Efficient implementation
Difference of Gaussian (DOG) approximates the Laplacian
$$\mathrm{DOG} = G(k \sigma) - G(\sigma)$$
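The approximation can be checked numerically. Since the derivative of G with respect to σ equals σ times the Laplacian of G, a first-order expansion gives G(kσ) - G(σ) ≈ (k - 1) σ² (G_xx + G_yy). The sketch below compares both sides on a grid; the function names, the grid size and the values of σ and k are illustrative assumptions.

```python
import numpy as np

def gaussian(x, y, sigma):
    """2D Gaussian G(x, y, sigma)."""
    return np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)

def laplacian_of_gaussian(x, y, sigma):
    """G_xx + G_yy in closed form."""
    return (x ** 2 + y ** 2 - 2 * sigma ** 2) / sigma ** 4 * gaussian(x, y, sigma)

def dog(x, y, sigma, k):
    """Difference of Gaussian G(k sigma) - G(sigma)."""
    return gaussian(x, y, k * sigma) - gaussian(x, y, sigma)

def relative_error(sigma=2.0, k=1.01, radius=10):
    """Largest deviation between DOG and (k - 1) sigma^2 LOG, relative to the DOG peak."""
    t = np.arange(-radius, radius + 1, dtype=float)
    x, y = np.meshgrid(t, t)
    exact = dog(x, y, sigma, k)
    approx = (k - 1) * sigma ** 2 * laplacian_of_gaussian(x, y, sigma)
    return float(np.max(np.abs(exact - approx)) / np.max(np.abs(exact)))
```

The relative error shrinks as k approaches 1, since the approximation is a first-order expansion in (k - 1).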
DOG detector
David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60 (2) – student presentation
Affine invariant regions - Motivation
Scale invariance is not sufficient for large baseline changes
(figure: detected scale invariant region and projected regions, related by A)
projected regions, viewpoint changes can locally be approximated by an affine transformation A
Affine invariant regions - Motivation
Affine invariant regions - Example
Harris/Hessian/Laplacian-Affine
– ... points
– ... second moment matrix [Lindeberg'94]
– ... invariant interest points [Mikolajczyk & Schmid'02, Schaffalitzky & Zisserman'02]
Affine invariant regions
Based on the second moment matrix (Lindeberg'94)
$$M = \mu(\mathbf{x}, \sigma_I, \sigma_D) = \sigma_D^2\, G(\sigma_I) \otimes \begin{pmatrix} L_x^2(\mathbf{x}, \sigma_D) & L_x L_y(\mathbf{x}, \sigma_D) \\ L_x L_y(\mathbf{x}, \sigma_D) & L_y^2(\mathbf{x}, \sigma_D) \end{pmatrix}$$
$$\mathbf{x}' = M^{\frac{1}{2}}\, \mathbf{x}$$
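The effect of the normalization can be illustrated numerically. Under a linear change of coordinates x' = B x, image gradients transform with the inverse transpose of B, so the second moment matrix becomes B^{-T} M B^{-1}; choosing B = M^{1/2} turns it into the identity, i.e. an isotropic neighborhood. The sketch below assumes a symmetric positive definite 2x2 matrix M; the function names are hypothetical.

```python
import numpy as np

def matrix_sqrt(m):
    """Square root of a symmetric positive definite matrix via eigen-decomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(m)
    return eigenvectors @ np.diag(np.sqrt(eigenvalues)) @ eigenvectors.T

def normalized_second_moment(m):
    """Second moment matrix after the change of coordinates x' = M^(1/2) x."""
    b_inv = np.linalg.inv(matrix_sqrt(m))
    return b_inv.T @ m @ b_inv
```

For example, with `m = np.array([[4.0, 1.0], [1.0, 2.0]])` the result of `normalized_second_moment(m)` is the 2x2 identity matrix up to rounding.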
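Affine invariant regions
$$\mathbf{x}_R = A\, \mathbf{x}_L$$
$$\mathbf{x}'_L = M_L^{\frac{1}{2}}\, \mathbf{x}_L \qquad \mathbf{x}'_R = M_R^{\frac{1}{2}}\, \mathbf{x}_R$$
$$\mathbf{x}'_R = R\, \mathbf{x}'_L$$
Isotropic neighborhoods related by image rotation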
Affine invariant regions - Estimation
initial points
Affine invariant regions - Estimation
iteration #1
Affine invariant regions - Estimation
iteration #2
Harris-Affine versus Harris-Laplace
(figure panels: Harris-Laplace, Harris-Affine)
Harris/Hessian-Affine
(figure panels: Harris-Affine, Hessian-Affine)
Harris-Affine
Hessian-Affine
Matches
22 correct matches
Matches
33 correct matches
Maximally stable extremal regions (MSER) [Matas'02]
Extremal regions: connected components in a thresholded image (all pixels above/below a threshold)
Maximally stable: minimal change of the component (area) for a change of the threshold, i.e. region remains stable for a change of threshold
Maximally stable extremal regions (MSER)
Examples of thresholded images
(figure panels: high threshold, low threshold)
MSER