[PPT] - Instance-level recognition Cordelia Schmid INRIA, Grenoble PowerPoint Presentation

SLIDE 1

Instance-level recognition

Cordelia Schmid INRIA, Grenoble

SLIDE 2

Instance-level recognition

Search for particular objects and scenes in large databases

…

SLIDE 3

Finding the object despite possibly large changes in scale, viewpoint, lighting and partial occlusion → requires invariant description

[Figure panels: Viewpoint, Scale, Lighting, Occlusion]

Difficulties

SLIDE 4

Difficulties

Very large image collections → need for efficient indexing

– Flickr has 2 billion photographs, more than 1 million added daily
– Facebook has 15 billion images (~27 million added daily)
– Large personal collections
– Video collections, e.g., YouTube

SLIDE 5

Search photos on the web for particular places

Find these landmarks ...in these images and 1M more

Applications

SLIDE 6

Applications

Take a picture of a product or advertisement → find relevant information on the web

SLIDE 7

Applications

Finding stolen/missing objects in a large collection

SLIDE 8

Applications

Copy detection for images and videos

Query video
Search in 200h of video

SLIDE 9


K. Grauman, B. Leibe
Sony Aibo – Robotics

– Recognize docking station
– Communicate with visual cards
– Place recognition
– Loop closure in SLAM

Slide credit: David Lowe

Applications

SLIDE 10

Instance-level recognition

1) Local invariant features
2) Matching and recognition with local features
3) Efficient visual search
4) Very large scale indexing

SLIDE 11

Local invariant features

Introduction to local features
Harris interest points + SSD, ZNCC, SIFT
Scale & affine invariant interest point detectors

SLIDE 12

Local features

local descriptor

Many local descriptors per image
Robust to occlusion/clutter + no object segmentation required
Photometric: distinctive
Invariant: to image transformations + illumination changes

SLIDE 13

Local features

Interest Points Contours/lines Region segments

SLIDE 14

Local features

Interest points: patch descriptors, e.g. SIFT
Contours/lines: mid-points, angles
Region segments: color/texture histogram

SLIDE 15

Interest points / invariant regions

Harris detector Scale/affine inv. detector

SLIDE 16

Contours / lines

Contour extraction

– Zero crossing of Laplacian
– Local maxima of gradients

Chain contour points (hysteresis), Canny detector
Recent contour detectors

– Global probability of boundary (gPb) detector [Malik et al., UC Berkeley, CVPR’08]
– Structured forests for fast edge detection (SED) [Dollar and Zitnick, ICCV’13]

SLIDE 17

Regions segments / superpixels

Simple linear iterative clustering (SLIC)

Normalized cut [Shi & Malik], Mean Shift [Comaniciu & Meer], …

SLIDE 18

Matching of local descriptors

Find corresponding locations in the image

SLIDE 19

Illustration – Matching

Interest points extracted with Harris detector (~ 500 points)

SLIDE 20

Matching

Interest points matched based on cross-correlation (188 pairs)

Illustration – Matching

SLIDE 21

Global constraints

Global constraint: robust estimation of the fundamental matrix
99 inliers, 89 outliers

Illustration – Matching

SLIDE 22

Application: Panorama stitching

Images courtesy of A. Zisserman.

SLIDE 23

Overview

Introduction to local features
Harris interest points + SSD, ZNCC, SIFT
Scale & affine invariant interest point detectors

SLIDE 24

Harris detector [Harris & Stephens’88]

Based on the idea of auto-correlation
Important difference in all directions => interest point

SLIDE 25

Harris detector

Auto-correlation function for a point $(x, y)$ and a shift $(\Delta x, \Delta y)$, computed over a window $W$ around the point:

$$A(x, y) = \sum_{(x_k, y_k) \in W(x, y)} \bigl(I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y)\bigr)^2$$

SLIDE 26

Harris detector

Auto-correlation function for a point $(x, y)$ and a shift $(\Delta x, \Delta y)$, computed over a window $W$ around the point:

$$A(x, y) = \sum_{(x_k, y_k) \in W(x, y)} \bigl(I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y)\bigr)^2$$

$A(x, y)$ is
– small in all directions → uniform region
– large in one direction → contour
– large in all directions → interest point
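The three cases can be checked numerically. The following is a minimal sketch, assuming NumPy, a square window of half-width `half`, and three synthetic images (flat, step edge, single bright quadrant); the function name and the test images are illustrative and not part of the original slides.

```python
import numpy as np

def autocorrelation(image, x, y, dx, dy, half=2):
    """Sum of squared differences between the window centred at (x, y)
    and the same window shifted by (dx, dy)."""
    win = image[y - half:y + half + 1, x - half:x + half + 1]
    shifted = image[y + dy - half:y + dy + half + 1,
                    x + dx - half:x + dx + half + 1]
    return float(np.sum((win - shifted) ** 2))

# Synthetic 21x21 test images (illustrative assumptions)
flat = np.zeros((21, 21))
edge = np.zeros((21, 21))
edge[:, 10:] = 1.0            # vertical step edge
corner = np.zeros((21, 21))
corner[10:, 10:] = 1.0        # one bright quadrant

shifts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
for name, img in [("flat", flat), ("edge", edge), ("corner", corner)]:
    values = [autocorrelation(img, 10, 10, dx, dy) for dx, dy in shifts]
    print(name, values)
```

For the flat image every shift gives zero, for the edge only shifts across the edge give a non-zero value, and for the corner every shift does.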

SLIDE 27

Harris detector

Discrete shifts are avoided based on the auto-correlation matrix

with first-order approximation

$$I(x_k + \Delta x,\, y_k + \Delta y) \approx I(x_k, y_k) + \begin{pmatrix} I_x(x_k, y_k) & I_y(x_k, y_k) \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$

$$A(x, y) = \sum_{(x_k, y_k) \in W(x, y)} \bigl(I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y)\bigr)^2 = \sum_{(x_k, y_k) \in W} \left( \begin{pmatrix} I_x(x_k, y_k) & I_y(x_k, y_k) \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \right)^2$$

SLIDE 28

Harris detector

$$A(x, y) = \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} \begin{pmatrix} \sum_{(x_k, y_k) \in W} (I_x(x_k, y_k))^2 & \sum_{(x_k, y_k) \in W} I_x(x_k, y_k)\, I_y(x_k, y_k) \\ \sum_{(x_k, y_k) \in W} I_x(x_k, y_k)\, I_y(x_k, y_k) & \sum_{(x_k, y_k) \in W} (I_y(x_k, y_k))^2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$

Auto-correlation matrix

the sum can be smoothed with a Gaussian

$$A(x, y) = \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} \, G \otimes \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$

SLIDE 29

Harris detector

Auto-correlation matrix

– captures the structure of the local neighborhood
– measure based on eigenvalues of this matrix

2 strong eigenvalues => interest point
1 strong eigenvalue => contour
0 eigenvalue => uniform region

$$A(x, y) = G \otimes \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$
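As a rough illustration of the eigenvalue criterion, the sketch below builds the matrix for small synthetic patches and counts strong eigenvalues. It assumes NumPy, a uniform window instead of the Gaussian weighting $G$, central-difference derivatives, and a hypothetical cut-off `strong` for what counts as a "strong" eigenvalue; none of these choices come from the slides.

```python
import numpy as np

def second_moment_matrix(patch):
    """2x2 matrix of summed gradient products over the patch
    (uniform window; the slides use a Gaussian window instead)."""
    iy, ix = np.gradient(patch.astype(float))  # derivatives along rows (y), columns (x)
    return np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                     [np.sum(ix * iy), np.sum(iy * iy)]])

def classify(patch, strong=1.0):
    """Count eigenvalues above the (assumed) cut-off and name the structure."""
    eigenvalues = np.linalg.eigvalsh(second_moment_matrix(patch))
    n_strong = int(np.sum(eigenvalues > strong))
    return ["uniform region", "contour", "interest point"][n_strong]

flat_patch = np.zeros((9, 9))
edge_patch = np.zeros((9, 9))
edge_patch[:, 5:] = 1.0
corner_patch = np.zeros((9, 9))
corner_patch[5:, 5:] = 1.0

for name, p in [("flat", flat_patch), ("edge", edge_patch), ("corner", corner_patch)]:
    print(name, "->", classify(p))
```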

SLIDE 30

Interpreting the eigenvalues

[Figure: plane of the eigenvalues λ1, λ2]
“Corner”: λ1 and λ2 are large, λ1 ~ λ2
“Edge”: λ1 >> λ2
“Edge”: λ2 >> λ1
“Flat” region: λ1 and λ2 are small

Classification of image points using eigenvalues of autocorrelation matrix

SLIDE 31

Corner response function

$$R = \det(A) - \alpha \, \mathrm{trace}(A)^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$$

“Corner”: R > 0
“Edge”: R < 0
“Flat” region: |R| small

α: constant (0.04 to 0.06)
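A small worked example of the response function, written directly in terms of the two eigenvalues. The eigenvalue pairs and the default `alpha=0.05` (inside the 0.04 to 0.06 range given above) are illustrative assumptions.

```python
def corner_response(lambda1, lambda2, alpha=0.05):
    """R = det(A) - alpha * trace(A)^2, expressed with the eigenvalues of A."""
    det = lambda1 * lambda2
    trace = lambda1 + lambda2
    return det - alpha * trace ** 2

print("corner:", corner_response(10.0, 10.0))  # both eigenvalues large
print("edge:  ", corner_response(10.0, 0.0))   # one eigenvalue dominates
print("flat:  ", corner_response(0.1, 0.1))    # both eigenvalues small
```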

SLIDE 32

Harris detector

Cornerness function

$$R = \det(A) - k \, (\mathrm{trace}(A))^2 = \lambda_1 \lambda_2 - k (\lambda_1 + \lambda_2)^2$$

Reduces the effect of a strong contour

Interest point detection

– Threshold (absolute, relative, number of corners)
– Local maxima

$$f > \mathrm{thresh} \;\wedge\; \forall (x', y') \in 8\text{-neighbourhood of } (x, y):\ f(x, y) \ge f(x', y')$$

SLIDE 33

Harris Detector: Steps

SLIDE 34

Harris Detector: Steps

Compute corner response R

SLIDE 35

Harris Detector: Steps

Find points with large corner response: R>threshold

SLIDE 36

Harris Detector: Steps

Take only the points of local maxima of R

SLIDE 37

Harris Detector: Steps

SLIDE 38

Harris detector: Summary of steps

1. Compute Gaussian derivatives at each pixel
2. Compute second moment matrix A in a Gaussian window around each pixel
3. Compute corner response function R
4. Threshold R
5. Find local maxima of response function (non-maximum suppression)
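The five steps can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the reference implementation: Gaussian derivatives are approximated by smoothing followed by central differences, the scales `sigma_d` and `sigma_i`, the constant `alpha` and the relative threshold `rel_thresh` are illustrative defaults, and the border is zero-padded.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel, truncated at 3 sigma and normalised to sum to 1."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

def smooth(image, sigma):
    """Separable Gaussian smoothing (zero padding at the image border)."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(np.convolve, 1, image, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")

def harris(image, sigma_d=1.0, sigma_i=2.0, alpha=0.05, rel_thresh=0.1):
    image = np.asarray(image, dtype=float)
    # 1. Derivatives of the smoothed image (approximates Gaussian derivatives)
    iy, ix = np.gradient(smooth(image, sigma_d))
    # 2. Second moment matrix entries, accumulated in a Gaussian window
    a_xx = smooth(ix * ix, sigma_i)
    a_xy = smooth(ix * iy, sigma_i)
    a_yy = smooth(iy * iy, sigma_i)
    # 3. Corner response R = det(A) - alpha * trace(A)^2
    r = a_xx * a_yy - a_xy**2 - alpha * (a_xx + a_yy) ** 2
    # 4. Threshold, here relative to the strongest response
    strong = r > rel_thresh * r.max()
    # 5. Non-maximum suppression in the 8-neighbourhood
    h, w = r.shape
    padded = np.pad(r, 1, mode="constant", constant_values=-np.inf)
    is_max = np.ones((h, w), dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) != (0, 0):
                is_max &= r >= padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    ys, xs = np.nonzero(strong & is_max)
    return list(zip(xs.tolist(), ys.tolist())), r

# Usage: a bright square on a dark background should give points near its corners
img = np.zeros((40, 40))
img[12:28, 12:28] = 1.0
points, response = harris(img)
print(points)
```

The relative threshold is one of the three options listed on the detection slide (absolute, relative, number of corners); an absolute threshold would replace the single line in step 4.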

SLIDE 39

Harris - invariance to transformations

Geometric transformations

– translation
– rotation
– similarity (rotation + scale change)
– affine (valid for locally planar objects)

Photometric transformations

– Affine intensity changes (I → a I + b)

SLIDE 40

Harris Detector: Invariance Properties

Rotation

Ellipse rotates but its shape (i.e. eigenvalues) remains the same
Corner response R is invariant to image rotation

SLIDE 41

Harris Detector: Invariance Properties

Scaling

[Figure labels: “All points will be classified as edges” / “Corner”]

Not invariant to scaling

SLIDE 42

Harris Detector: Invariance Properties

Affine intensity change

– Only derivatives are used => invariance to intensity shift I → I + b
– Intensity scale: I → a I

[Figure: R against x (image coordinate) with the threshold, shown twice]

Partially invariant to affine intensity change, dependent on type of threshold
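A short derivation, sketched here under the assumption that $R = \det(A) - \alpha\,\mathrm{trace}(A)^2$ is computed from first derivatives only, makes the partial invariance explicit. For $I \to aI + b$ the offset $b$ vanishes in the derivatives and the factor $a$ carries through:

$$I_x \to a I_x, \quad I_y \to a I_y \;\Rightarrow\; A \to a^2 A \;\Rightarrow\; \det(A) \to a^4 \det(A), \quad \mathrm{trace}(A)^2 \to a^4 \, \mathrm{trace}(A)^2 \;\Rightarrow\; R \to a^4 R$$

The positions of the local maxima of $R$ therefore do not move, but a fixed absolute threshold selects a different set of points, whereas a threshold defined relative to the maximum of $R$ selects the same set.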

SLIDE 43

Comparison of patches - SSD

SSD: sum of squared differences
Comparison of the intensities in the neighborhood of two interest points, $(x_1, y_1)$ in image 1 and $(x_2, y_2)$ in image 2

$$\mathrm{SSD} = \frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \bigl(I_1(x_1 + i, y_1 + j) - I_2(x_2 + i, y_2 + j)\bigr)^2$$

Small difference values → similar patches
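The patch comparison is short enough to write out. This is a minimal sketch assuming NumPy arrays indexed as `image[y, x]` and patches that lie fully inside both images; the function name and the toy images are illustrative.

```python
import numpy as np

def ssd(image1, p1, image2, p2, n=2):
    """Mean squared difference of the (2n+1)x(2n+1) patches around p1 and p2.
    Points are (x, y); no bounds checking is done in this sketch."""
    (x1, y1), (x2, y2) = p1, p2
    patch1 = image1[y1 - n:y1 + n + 1, x1 - n:x1 + n + 1].astype(float)
    patch2 = image2[y2 - n:y2 + n + 1, x2 - n:x2 + n + 1].astype(float)
    return float(np.mean((patch1 - patch2) ** 2))

image_a = np.arange(100).reshape(10, 10)
image_b = np.roll(image_a, 2, axis=1)   # same content shifted 2 pixels to the right

print(ssd(image_a, (5, 5), image_b, (7, 5)))  # corresponding points
print(ssd(image_a, (5, 5), image_b, (5, 5)))  # non-corresponding points
print(ssd(image_a, (5, 5), image_a + 3, (5, 5)))  # same patch, brightness offset
```

The last call shows the weakness addressed next: a pure intensity offset already produces a non-zero SSD.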

SLIDE 44

Comparison of patches

SSD: invariance to photometric transformations?

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \bigl(I_1(x_1 + i, y_1 + j) - I_2(x_2 + i, y_2 + j)\bigr)^2$$

Intensity changes (I → I + b)
=> Normalizing with the mean of each patch

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \bigl((I_1(x_1 + i, y_1 + j) - m_1) - (I_2(x_2 + i, y_2 + j) - m_2)\bigr)^2$$

Intensity changes (I → aI + b)
=> Normalizing with the mean and standard deviation of each patch

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left(\frac{I_1(x_1 + i, y_1 + j) - m_1}{\sigma_1} - \frac{I_2(x_2 + i, y_2 + j) - m_2}{\sigma_2}\right)^2$$

SLIDE 45

Cross-correlation ZNCC

ZNCC: zero normalized cross correlation

$$\mathrm{ZNCC} = \frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left(\frac{I_1(x_1 + i, y_1 + j) - m_1}{\sigma_1}\right) \cdot \left(\frac{I_2(x_2 + i, y_2 + j) - m_2}{\sigma_2}\right)$$

zero normalized SSD

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left(\frac{I_1(x_1 + i, y_1 + j) - m_1}{\sigma_1} - \frac{I_2(x_2 + i, y_2 + j) - m_2}{\sigma_2}\right)^2$$

ZNCC values between -1 and 1, 1 when identical patches
in practice threshold around 0.5
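A minimal sketch of ZNCC on two already extracted patches, assuming NumPy and patches with non-zero standard deviation (a constant patch would need separate handling). The function name and the random test patch are illustrative.

```python
import numpy as np

def zncc(patch1, patch2):
    """Zero-normalised cross-correlation of two equally sized patches.
    Assumes both patches have non-zero standard deviation."""
    p1 = np.asarray(patch1, dtype=float)
    p2 = np.asarray(patch2, dtype=float)
    z1 = (p1 - p1.mean()) / p1.std()
    z2 = (p2 - p2.mean()) / p2.std()
    return float(np.mean(z1 * z2))

rng = np.random.default_rng(0)
patch = rng.random((5, 5))

print(zncc(patch, patch))              # identical patches
print(zncc(patch, 2.5 * patch + 10))   # affine intensity change a*I + b with a > 0
print(zncc(patch, -patch))             # inverted contrast
```

The second call illustrates the invariance to I → aI + b that plain SSD lacks.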

SLIDE 46

Local descriptors

Pixel values
Greyvalue derivatives, differential invariants [Koenderink’87]
SIFT descriptor [Lowe’99]
SURF descriptor [Bay et al.’08]
DAISY descriptor [Tola et al.’08, Winder et al.’09]
LIOP descriptor [Wang et al.’11]
Recent patch descriptors based on CNN features [Brox et al.’15, Paulin et al.’15,…]

SLIDE 47

Local descriptors

Greyvalue derivatives

– Convolution with Gaussian derivatives

$$\mathbf{v}(x, y) = \begin{pmatrix} I(x, y) * G(\sigma) \\ I(x, y) * G_x(\sigma) \\ I(x, y) * G_y(\sigma) \\ I(x, y) * G_{xx}(\sigma) \\ I(x, y) * G_{xy}(\sigma) \\ I(x, y) * G_{yy}(\sigma) \end{pmatrix}$$

$$I(x, y) * G(\sigma) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x', y', \sigma) \, I(x - x', y - y') \, dx' \, dy'$$

$$G(x, y, \sigma) = \frac{1}{2 \pi \sigma^2} \exp\left(-\frac{x^2 + y^2}{2 \sigma^2}\right)$$

SLIDE 48

Local descriptors

Notation for greyvalue derivatives [Koenderink’87]

$$\mathbf{v}(x, y) = \begin{pmatrix} I(x, y) * G(\sigma) \\ I(x, y) * G_x(\sigma) \\ I(x, y) * G_y(\sigma) \\ I(x, y) * G_{xx}(\sigma) \\ I(x, y) * G_{xy}(\sigma) \\ I(x, y) * G_{yy}(\sigma) \end{pmatrix} = \begin{pmatrix} L(x, y) \\ L_x(x, y) \\ L_y(x, y) \\ L_{xx}(x, y) \\ L_{xy}(x, y) \\ L_{yy}(x, y) \end{pmatrix}$$

Invariance?

SLIDE 49

Local descriptors – rotation invariance

Invariance to image rotation: differential invariants [Koen87]

$$\begin{pmatrix} L_x L_x + L_y L_y \\ L_{xx} L_x L_x + 2 L_{xy} L_x L_y + L_{yy} L_y L_y \\ L_{xx} + L_{yy} \\ L_{xx} L_{xx} + 2 L_{xy} L_{xy} + L_{yy} L_{yy} \end{pmatrix}$$

First component: gradient magnitude
Third component: Laplacian

SLIDE 50

Laplacian of Gaussian (LOG)

$$\mathrm{LOG} = G_{xx}(\sigma) + G_{yy}(\sigma)$$

SLIDE 51

SIFT descriptor [Lowe’99]

Approach

– 8 orientations of the gradient
– 4x4 spatial grid
– Dimension 128
– soft-assignment to spatial bins
– normalization of the descriptor to norm one
– comparison with Euclidean distance

[Figure: image patch → gradient → 3D histogram over x, y and gradient orientation]
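To make the 4x4x8 layout concrete, here is a deliberately simplified SIFT-like sketch in NumPy. It is not Lowe's implementation: it assumes hard assignment to bins (no soft-assignment), no Gaussian weighting of the patch, no clipping of large values, and a patch that is already scale- and rotation-normalised. All names are illustrative.

```python
import numpy as np

def sift_like_descriptor(patch, grid=4, n_orient=8):
    """Gradient-orientation histogram over a grid x grid spatial layout,
    flattened to grid*grid*n_orient values and normalised to unit norm."""
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % (2 * np.pi)
    # Hard assignment of each pixel to one orientation bin and one spatial cell
    o_bin = np.minimum((orientation / (2 * np.pi) * n_orient).astype(int),
                       n_orient - 1)
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    y_bin = ys * grid // h
    x_bin = xs * grid // w
    hist = np.zeros((grid, grid, n_orient))
    np.add.at(hist, (y_bin, x_bin, o_bin), magnitude)
    desc = hist.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

rng = np.random.default_rng(1)
d1 = sift_like_descriptor(rng.random((16, 16)))
d2 = sift_like_descriptor(rng.random((16, 16)))
print(d1.shape, np.linalg.norm(d1))
print("Euclidean distance:", np.linalg.norm(d1 - d2))
```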

SLIDE 52

Local descriptors - rotation invariance

Estimation of the dominant orientation

– extract gradient orientation
– histogram over gradient orientation
– peak in this histogram

Rotate patch in dominant direction
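The orientation estimate can be sketched as a magnitude-weighted histogram followed by a peak search. The sketch assumes NumPy; the number of bins (`n_bins=36`) and the weighting by gradient magnitude are assumptions, and the returned angle is only accurate to one bin width.

```python
import numpy as np

def dominant_orientation(patch, n_bins=36):
    """Centre of the strongest bin of the gradient-orientation histogram,
    in radians in [0, 2*pi)."""
    gy, gx = np.gradient(np.asarray(patch, dtype=float))
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % (2 * np.pi)
    hist, edges = np.histogram(orientation, bins=n_bins,
                               range=(0, 2 * np.pi), weights=magnitude)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])

# Intensity increasing along the image diagonal: gradient direction is pi/4
ys, xs = np.mgrid[0:16, 0:16]
diagonal_ramp = (xs + ys).astype(float)
print(dominant_orientation(diagonal_ramp))
```

Rotating the patch by the negative of this angle before computing the descriptor is what makes the descriptor rotation invariant.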

SLIDE 53

Local descriptors – illumination change

Robustness to illumination changes in case of an affine transformation

$$I_1(\mathbf{x}) = a I_2(\mathbf{x}) + b$$

Normalization of the image patch with mean and variance

SLIDE 54

Invariance to scale changes

Scale change between two images
Scale factor s can be eliminated
Support region for calculation!!

– In case of a convolution with Gaussian derivatives defined by σ

$$I(x, y) * G(\sigma) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x', y', \sigma) \, I(x - x', y - y') \, dx' \, dy'$$

$$G(x, y, \sigma) = \frac{1}{2 \pi \sigma^2} \exp\left(-\frac{x^2 + y^2}{2 \sigma^2}\right)$$



SLIDE 55

Overview

Introduction to local features
Harris interest points + SSD, ZNCC, SIFT
Scale invariant interest point detectors

SLIDE 56

Scale invariance - motivation

Description regions have to be adapted to scale changes
Interest points have to be repeatable for scale changes

SLIDE 57

Harris detector + scale changes

Repeatability rate

$$R(\varepsilon) = \frac{\bigl|\{(a_i, b_i) \mid \mathrm{dist}(H(a_i), b_i) < \varepsilon\}\bigr|}{\max(|a_i|, |b_i|)}$$
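The repeatability rate can be sketched as follows, assuming NumPy, points given as (x, y) rows, and a 3x3 homography `homography` mapping image 1 to image 2. As a simplification, a point $a_i$ counts as repeated when its projection has any point of the second image within `eps`; one-to-one pairing is not enforced. The example points are made up for illustration.

```python
import numpy as np

def repeatability(points_a, points_b, homography, eps=1.5):
    """Fraction of points repeated under the homography, relative to the
    larger of the two point sets (nearest-neighbour simplification)."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Project points of image 1 into image 2 using homogeneous coordinates
    a_h = np.hstack([a, np.ones((len(a), 1))]) @ homography.T
    projected = a_h[:, :2] / a_h[:, 2:3]
    # Pairwise distances between projected points and points of image 2
    dists = np.linalg.norm(projected[:, None, :] - b[None, :, :], axis=2)
    repeated = int(np.sum(dists.min(axis=1) < eps))
    return repeated / max(len(a), len(b))

# Illustrative data: image 2 is image 1 translated by (10, 5)
H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, 5.0],
              [0.0, 0.0, 1.0]])
pts_a = [(0, 0), (10, 10), (20, 5), (30, 30)]
pts_b = [(10, 5), (20, 15.5), (50, 50)]
print(repeatability(pts_a, pts_b, H))
```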

SLIDE 58

Scale adaptation

Scale change between two images

$$I_1(x_1, y_1) = I_2(x_2, y_2) = I_2(s x_1, s y_1)$$

Scale adapted derivative calculation

SLIDE 59

Scale adaptation

Scale change between two images

$$I_1(x_1, y_1) = I_2(x_2, y_2) = I_2(s x_1, s y_1)$$

Scale adapted derivative calculation

$$I_1(x_1, y_1) \otimes G_{i_1 \ldots i_n}(\sigma) = s^n \, I_2(x_2, y_2) \otimes G_{i_1 \ldots i_n}(s \sigma)$$

SLIDE 60

Harris detector – adaptation to scale

[Plot: repeatability rate $R(\varepsilon)$, based on the pairs $\{(a_i, b_i) \mid \mathrm{dist}(H(a_i), b_i) < \varepsilon\}$]

SLIDE 61

Scale selection

For a point compute a value (gradient, Laplacian etc.) at several scales
Normalization of the values with the scale factor, e.g. Laplacian

$$|s^2 (L_{xx} + L_{yy})|$$

Select scale at the maximum → characteristic scale
Exp. results show that the Laplacian gives best results

[Plot: $|s^2 (L_{xx} + L_{yy})|$ against scale]
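Scale selection with the normalized Laplacian can be tried on a synthetic blob. The sketch below assumes NumPy, evaluates the response only at the image centre (as a dot product with a sampled Laplacian-of-Gaussian kernel), and uses an isotropic Gaussian blob as test image; the function names, the blob and the list of candidate scales are illustrative.

```python
import numpy as np

def normalized_laplacian_at_center(image, s):
    """|s^2 (L_xx + L_yy)| at the image centre for scale s."""
    h, w = image.shape
    y, x = np.mgrid[0:h, 0:w]
    r2 = (x - w // 2) ** 2 + (y - h // 2) ** 2
    gaussian = np.exp(-r2 / (2 * s**2)) / (2 * np.pi * s**2)
    log_kernel = (r2 - 2 * s**2) / s**4 * gaussian   # G_xx + G_yy
    return abs(s**2 * np.sum(log_kernel * image))

def characteristic_scale(image, scales):
    """Scale at which the normalized Laplacian response is maximal."""
    responses = [normalized_laplacian_at_center(image, s) for s in scales]
    return scales[int(np.argmax(responses))]

def make_blob(size, sigma0):
    """Gaussian blob of standard deviation sigma0 centred in a size x size image."""
    y, x = np.mgrid[0:size, 0:size]
    r2 = (x - size // 2) ** 2 + (y - size // 2) ** 2
    return np.exp(-r2 / (2 * sigma0**2))

scales = [2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 12.0]
print(characteristic_scale(make_blob(129, 4.0), scales))
print(characteristic_scale(make_blob(129, 8.0), scales))
```

Doubling the size of the blob doubles the selected scale, which is the behaviour a scale-invariant detector relies on.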

SLIDE 62

Scale selection

Scale invariance of the characteristic scale

[Plot: norm. Lap. against scale]

SLIDE 63

Scale selection

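Scale invariance of the characteristic scale

[Plots: norm. Lap. against scale, one per image]

Relation between characteristic scales

$$s \cdot s_1 = s_2$$

where $s_1$ and $s_2$ are the characteristic scales found in the two images and $s$ is the scale factor between them.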

SLIDE 64

Scale-invariant detectors

Harris-Laplace (Mikolajczyk & Schmid’01)
Laplacian detector (Lindeberg’98)
Difference of Gaussian (SIFT detector, Lowe’99)

Harris-Laplace Laplacian

SLIDE 65

Harris-Laplace

invariant points + associated regions [Mikolajczyk & Schmid’01]
multi-scale Harris points
selection of points at maximum of Laplacian

SLIDE 66

Matching results

213 / 190 detected interest points

SLIDE 67

Matching results

58 points are initially matched

SLIDE 68

Matching results

32 points are matched after verification – all correct

SLIDE 69

LOG detector

Convolve image with scale-normalized Laplacian at several scales

$$\mathrm{LOG} = s^2 (G_{xx}(\sigma) + G_{yy}(\sigma))$$

Detection of maxima and minima of Laplacian in scale space

SLIDE 70

Efficient implementation

Difference of Gaussian (DOG) approximates the Laplacian

$$\mathrm{DOG} = G(k \sigma) - G(\sigma)$$

Error due to the approximation
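Why the difference approximates the Laplacian can be sketched in two steps; the argument below follows the standard heat-equation reasoning and is added here as a hedged aside, not taken from the slide. The Gaussian satisfies

$$\frac{\partial G}{\partial \sigma} = \sigma \, (G_{xx} + G_{yy})$$

and replacing the derivative by a finite difference between the scales $\sigma$ and $k\sigma$ gives

$$\frac{\partial G}{\partial \sigma} \approx \frac{G(k \sigma) - G(\sigma)}{k \sigma - \sigma} \;\Rightarrow\; G(k \sigma) - G(\sigma) \approx (k - 1) \, \sigma^2 \, (G_{xx}(\sigma) + G_{yy}(\sigma))$$

So DOG equals the scale-normalized Laplacian up to the constant factor $(k - 1)$, which does not change the location of extrema; the approximation error shrinks as $k$ approaches 1.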

SLIDE 71

DOG detector

Fast computation, scale space processed one octave at a time

David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60 (2).

SLIDE 72

Maximally stable extremal regions (MSER) [Matas’02]

Extremal regions: connected components in a thresholded image (all pixels above/below a threshold)

Maximally stable: minimal change of the component (area) for a change of the threshold, i.e. region remains stable for a change of threshold

Excellent results in a recent comparison
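The stability criterion can be illustrated with a heavily simplified sketch. It assumes NumPy, follows only the dark region containing a given seed pixel (a real MSER detector tracks all components, dark and bright, in a component tree), uses 4-connectivity, and measures stability as the relative area change between neighbouring thresholds. All names, the parameter `delta` and the toy image are illustrative.

```python
from collections import deque
import numpy as np

def component_area(mask, seed):
    """Area of the 4-connected component of True pixels containing seed (row, col)."""
    if not mask[seed]:
        return 0
    seen = np.zeros(mask.shape, dtype=bool)
    seen[seed] = True
    queue = deque([seed])
    area = 0
    while queue:
        r, c = queue.popleft()
        area += 1
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                    and mask[nr, nc] and not seen[nr, nc]):
                seen[nr, nc] = True
                queue.append((nr, nc))
    return area

def most_stable_threshold(image, seed, thresholds, delta=1):
    """Threshold at which the seed's region area changes least (relative change)."""
    areas = [component_area(image <= t, seed) for t in thresholds]
    best, best_variation = None, np.inf
    for i in range(delta, len(thresholds) - delta):
        if areas[i] == 0:
            continue
        variation = (areas[i + delta] - areas[i - delta]) / areas[i]
        if variation < best_variation:
            best, best_variation = thresholds[i], variation
    return best, areas

# Toy image: dark 10x10 square (50), one-pixel grey rim (120), bright background (200)
toy = np.full((20, 20), 200)
toy[4:16, 4:16] = 120
toy[5:15, 5:15] = 50
thresholds = list(range(40, 240, 20))
best, areas = most_stable_threshold(toy, (10, 10), thresholds)
print(best, areas)
```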

SLIDE 73

Maximally stable extremal regions (MSER)

Examples of thresholded images

[Figure panels: high threshold, low threshold]

SLIDE 74

MSER