Instance-level recognition, Cordelia Schmid, INRIA, Grenoble



slide-1
SLIDE 1

Instance-level recognition

Cordelia Schmid INRIA, Grenoble

slide-2
SLIDE 2

Instance-level recognition

Search for particular objects and scenes in large databases

slide-3
SLIDE 3

Finding the object despite possibly large changes in scale, viewpoint, lighting and partial occlusion → requires invariant description

[Figure panels: Viewpoint, Scale, Lighting, Occlusion]

Difficulties

slide-4
SLIDE 4

Difficulties

  • Very large image collections → need for efficient indexing

– Flickr has 2 billion photographs, more than 1 million added daily
– Facebook has 15 billion images (~27 million added daily)
– Large personal collections
– Video collections, e.g. YouTube

slide-5
SLIDE 5

Search photos on the web for particular places

Find these landmarks ...in these images and 1M more

Applications

slide-6
SLIDE 6

Applications

  • Take a picture of a product or advertisement

→ find relevant information on the web

slide-7
SLIDE 7

Applications

  • Finding stolen/missing objects in a large collection
slide-8
SLIDE 8

Applications

  • Copy detection for images and videos

Search in 200h of video Query video

slide-9
SLIDE 9

  • Sony Aibo – Robotics

– Recognize docking station
– Communicate with visual cards
– Place recognition
– Loop closure in SLAM

Slide credits: K. Grauman, B. Leibe; David Lowe

Applications

slide-10
SLIDE 10

Instance-level recognition

1) Local invariant features
2) Matching and recognition with local features
3) Efficient visual search
4) Very large scale indexing

slide-11
SLIDE 11

Local invariant features

  • Introduction to local features
  • Harris interest points + SSD, ZNCC, SIFT
  • Scale & affine invariant interest point detectors
slide-12
SLIDE 12

Local features

[Figure: local descriptor]

Many local descriptors per image
Robust to occlusion/clutter + no object segmentation required
Photometric: distinctive
Invariant: to image transformations + illumination changes

slide-13
SLIDE 13

Local features

Interest Points Contours/lines Region segments

slide-14
SLIDE 14

Local features

Interest points: patch descriptors, e.g. SIFT
Contours/lines: mid-points, angles
Region segments: color/texture histogram

slide-15
SLIDE 15

Interest points / invariant regions

Harris detector Scale/affine inv. detector

slide-16
SLIDE 16

Contours / lines

  • Contour extraction

– Zero crossing of Laplacian
– Local maxima of gradients

  • Chain contour points (hysteresis), Canny detector
  • Recent contour detectors

– global probability of boundary (gPb) detector [Malik et al., UC Berkeley, CVPR’08]
– Structured forests for fast edge detection (SED) [Dollar and Zitnick, ICCV’13]

slide-17
SLIDE 17

Regions segments / superpixels

Simple linear iterative clustering (SLIC)

Normalized cut [Shi & Malik], Mean Shift [Comaniciu & Meer], ….

slide-18
SLIDE 18

Matching of local descriptors

Find corresponding locations in the image

slide-19
SLIDE 19

Illustration – Matching

Interest points extracted with Harris detector (~ 500 points)

slide-20
SLIDE 20

Matching

Interest points matched based on cross-correlation (188 pairs)

Illustration – Matching

slide-21
SLIDE 21

Global constraints

Global constraint: robust estimation of the fundamental matrix
99 inliers, 89 outliers

Illustration – Matching

slide-22
SLIDE 22

Application: Panorama stitching

Images courtesy of A. Zisserman.

slide-23
SLIDE 23

Overview

  • Introduction to local features
  • Harris interest points + SSD, ZNCC, SIFT
  • Scale & affine invariant interest point detectors
slide-24
SLIDE 24

Harris detector [Harris & Stephens’88]

Based on the idea of auto-correlation
Important difference in all directions => interest point

slide-25
SLIDE 25

Harris detector

Auto-correlation function for a point $(x, y)$ and a shift $(\Delta x, \Delta y)$

$$A(x, y) = \sum_{(x_k, y_k) \in W(x, y)} \left( I(x_k, y_k) - I(x_k + \Delta x, y_k + \Delta y) \right)^2$$

slide-26
SLIDE 26

Harris detector

Auto-correlation function for a point $(x, y)$ and a shift $(\Delta x, \Delta y)$

$$A(x, y) = \sum_{(x_k, y_k) \in W(x, y)} \left( I(x_k, y_k) - I(x_k + \Delta x, y_k + \Delta y) \right)^2$$

$A(x, y)$ is
– small in all directions → uniform region
– large in one direction → contour
– large in all directions → interest point

slide-27
SLIDE 27

Harris detector

 

Discrete shifts are avoided based on the auto-correlation matrix

with first order approximation

$$I(x_k + \Delta x, y_k + \Delta y) = I(x_k, y_k) + \begin{pmatrix} I_x(x_k, y_k) & I_y(x_k, y_k) \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$

$$A(x, y) = \sum_{(x_k, y_k) \in W} \left( I(x_k, y_k) - I(x_k + \Delta x, y_k + \Delta y) \right)^2 = \sum_{(x_k, y_k) \in W} \left( \begin{pmatrix} I_x(x_k, y_k) & I_y(x_k, y_k) \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \right)^2$$

slide-28
SLIDE 28

Harris detector

 

                      

   

   

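$$A(x, y) = \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} \begin{pmatrix} \sum_{(x_k, y_k) \in W} (I_x(x_k, y_k))^2 & \sum_{(x_k, y_k) \in W} I_x(x_k, y_k) I_y(x_k, y_k) \\ \sum_{(x_k, y_k) \in W} I_x(x_k, y_k) I_y(x_k, y_k) & \sum_{(x_k, y_k) \in W} (I_y(x_k, y_k))^2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$

Auto-correlation matrix: the sum can be smoothed with a Gaussian

$$= \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} \; G \otimes \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$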

slide-29
SLIDE 29

Harris detector

  • Auto-correlation matrix

$$A(x, y) = G \otimes \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$

– captures the structure of the local neighborhood
– measure based on eigenvalues of this matrix

  • 2 strong eigenvalues => interest point
  • 1 strong eigenvalue => contour
  • 0 eigenvalue => uniform region

slide-30
SLIDE 30

Interpreting the eigenvalues

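[Figure: regions of the $(\lambda_1, \lambda_2)$ plane]

– “Corner”: $\lambda_1$ and $\lambda_2$ are large, $\lambda_1 \sim \lambda_2$
– “Edge”: $\lambda_1 \gg \lambda_2$ or $\lambda_2 \gg \lambda_1$
– “Flat” region: $\lambda_1$ and $\lambda_2$ are small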

Classification of image points using eigenvalues of autocorrelation matrix

slide-31
SLIDE 31

Corner response function

$$R = \det(A) - \alpha \, \mathrm{trace}(A)^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$$

α: constant (0.04 to 0.06)

– “Corner”: R > 0
– “Edge”: R < 0
– “Flat” region: |R| small
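As a quick numeric check of the sign rule for R, here is a minimal sketch that evaluates the corner response on three hand-picked 2x2 matrices. The function name `corner_response`, the value α = 0.05 and the example eigenvalues are illustrative assumptions, not values from the slides.

```python
import numpy as np

def corner_response(A, alpha=0.05):
    """Harris response R = det(A) - alpha * trace(A)^2 for a 2x2 matrix A."""
    A = np.asarray(A, dtype=float)
    return np.linalg.det(A) - alpha * np.trace(A) ** 2

# Hypothetical auto-correlation matrices, already diagonal (eigenvalues on the diagonal)
corner = np.diag([10.0, 10.0])  # two strong eigenvalues
edge = np.diag([10.0, 0.1])     # one strong eigenvalue
flat = np.diag([0.1, 0.1])      # no strong eigenvalue

for name, A in [("corner", corner), ("edge", edge), ("flat", flat)]:
    print(name, corner_response(A))
```

The corner case gives a clearly positive R, the edge case a negative R, and the flat case a value close to zero.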

slide-32
SLIDE 32

Harris detector

  • Cornerness function

$$R = \det(A) - k \, (\mathrm{trace}(A))^2 = \lambda_1 \lambda_2 - k (\lambda_1 + \lambda_2)^2$$

Reduces the effect of a strong contour

  • Interest point detection

– Threshold (absolute, relative, number of corners)
– Local maxima

$$f > \mathrm{thresh} \;\wedge\; \forall (x', y') \in 8\text{-neighbourhood}: \; f(x, y) \ge f(x', y')$$

slide-33
SLIDE 33

Harris Detector: Steps

slide-34
SLIDE 34

Harris Detector: Steps

Compute corner response R

slide-35
SLIDE 35

Harris Detector: Steps

Find points with large corner response: R>threshold

slide-36
SLIDE 36

Harris Detector: Steps

Take only the points of local maxima of R

slide-37
SLIDE 37

Harris Detector: Steps

slide-38
SLIDE 38

Harris detector: Summary of steps

1. Compute Gaussian derivatives at each pixel
2. Compute second moment matrix A in a Gaussian window around each pixel
3. Compute corner response function R
4. Threshold R
5. Find local maxima of response function (non-maximum suppression)
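The five steps can be sketched in NumPy as follows. This is a minimal illustration, not the implementation behind the slides. The helper names (`gaussian_kernel`, `smooth`, `harris`), the two scales `sigma_d` and `sigma_i`, the relative threshold and α = 0.05 are all assumptions. Gaussian derivatives are approximated by Gaussian smoothing followed by finite differences.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at about 3 sigma and normalized to sum 1."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    return g / g.sum()

def smooth(img, sigma):
    """Separable Gaussian smoothing with edge padding (output has the input shape)."""
    g = gaussian_kernel(sigma)
    r = len(g) // 2
    padded = np.pad(img, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, g, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, g, mode="valid"), 0, rows)

def harris(img, sigma_d=1.0, sigma_i=2.0, alpha=0.05, rel_thresh=0.01):
    img = np.asarray(img, dtype=float)
    # 1. Gaussian derivatives (smoothing followed by finite differences)
    Iy, Ix = np.gradient(smooth(img, sigma_d))
    # 2. Second moment matrix entries, accumulated in a Gaussian window
    Sxx = smooth(Ix * Ix, sigma_i)
    Sxy = smooth(Ix * Iy, sigma_i)
    Syy = smooth(Iy * Iy, sigma_i)
    # 3. Corner response R = det(A) - alpha * trace(A)^2
    R = Sxx * Syy - Sxy ** 2 - alpha * (Sxx + Syy) ** 2
    # 4. Threshold R (relative to the strongest response)
    strong = R > rel_thresh * R.max()
    # 5. Non-maximum suppression over the 8-neighbourhood
    H, W = R.shape
    P = np.pad(R, 1, mode="constant", constant_values=-np.inf)
    is_max = np.ones((H, W), dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                is_max &= R >= P[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
    return R, np.argwhere(strong & is_max)  # rows of (y, x)

# Synthetic test image: a bright square on a dark background
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
R, corners = harris(img)
print(corners)
```

On this synthetic square, detections are expected close to the four corners, while the straight edges get a negative response and are rejected by the threshold.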

slide-39
SLIDE 39

Harris - invariance to transformations

  • Geometric transformations

– translation
– rotation
– similarity (rotation + scale change)
– affine (valid for local planar objects)

  • Photometric transformations

– Affine intensity changes (I → a I + b)

slide-40
SLIDE 40

Harris Detector: Invariance Properties

  • Rotation

Ellipse rotates but its shape (i.e. eigenvalues) remains the same
Corner response R is invariant to image rotation

slide-41
SLIDE 41

Harris Detector: Invariance Properties

  • Scaling

[Figure: a corner at a coarse scale; at a finer scale all points will be classified as edges]

Not invariant to scaling

slide-42
SLIDE 42

Harris Detector: Invariance Properties

  • Affine intensity change

– Only derivatives are used => invariance to intensity shift I → I + b
– Intensity scale: I → a I

[Figure: R plotted against x (image coordinate) before and after intensity scaling, with a fixed threshold]

Partially invariant to affine intensity change, dependent on type of threshold

slide-43
SLIDE 43

Comparison of patches - SSD

Comparison of the intensities in the neighborhood of two interest points, $(x_1, y_1)$ in image 1 and $(x_2, y_2)$ in image 2

SSD: sum of squared differences

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( I_1(x_1 + i, y_1 + j) - I_2(x_2 + i, y_2 + j) \right)^2$$

Small difference values → similar patches

slide-44
SLIDE 44

Comparison of patches

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( I_1(x_1 + i, y_1 + j) - I_2(x_2 + i, y_2 + j) \right)^2$$

SSD: invariance to photometric transformations?

Intensity changes (I → I + b) => normalizing with the mean of each patch

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( (I_1(x_1 + i, y_1 + j) - m_1) - (I_2(x_2 + i, y_2 + j) - m_2) \right)^2$$

Intensity changes (I → aI + b) => normalizing with the mean and standard deviation of each patch

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( \frac{I_1(x_1 + i, y_1 + j) - m_1}{\sigma_1} - \frac{I_2(x_2 + i, y_2 + j) - m_2}{\sigma_2} \right)^2$$

slide-45
SLIDE 45

Cross-correlation ZNCC

ZNCC: zero normalized cross correlation

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( \frac{I_1(x_1 + i, y_1 + j) - m_1}{\sigma_1} \right) \cdot \left( \frac{I_2(x_2 + i, y_2 + j) - m_2}{\sigma_2} \right)$$

zero normalized SSD

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( \frac{I_1(x_1 + i, y_1 + j) - m_1}{\sigma_1} - \frac{I_2(x_2 + i, y_2 + j) - m_2}{\sigma_2} \right)^2$$

ZNCC values between -1 and 1, 1 when identical patches
in practice threshold around 0.5
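The difference between SSD and ZNCC under an affine intensity change can be checked numerically. The sketch below assumes two patches of identical size that are already extracted around the interest points. The function names `ssd` and `zncc`, the small `eps` guard against constant patches, and the random test patch are illustrative choices.

```python
import numpy as np

def ssd(p1, p2):
    """Mean of squared differences over the (2N+1) x (2N+1) patch."""
    p1, p2 = np.asarray(p1, dtype=float), np.asarray(p2, dtype=float)
    return np.mean((p1 - p2) ** 2)

def zncc(p1, p2, eps=1e-12):
    """Zero-normalized cross correlation; each patch is normalized by its mean and std."""
    p1, p2 = np.asarray(p1, dtype=float), np.asarray(p2, dtype=float)
    z1 = (p1 - p1.mean()) / (p1.std() + eps)
    z2 = (p2 - p2.mean()) / (p2.std() + eps)
    return np.mean(z1 * z2)

rng = np.random.default_rng(0)
patch = rng.random((11, 11))    # N = 5
brighter = 2.0 * patch + 0.3    # affine intensity change I -> a I + b

print("SSD :", ssd(patch, brighter))    # clearly non-zero: SSD is not invariant
print("ZNCC:", zncc(patch, brighter))   # close to 1: ZNCC is invariant
```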

slide-46
SLIDE 46

Local descriptors

  • Pixel values
  • Greyvalue derivatives, differential invariants [Koenderink’87]
  • SIFT descriptor [Lowe’99]
  • SURF descriptor [Bay et al.’08]
  • DAISY descriptor [Tola et al.’08, Winder et al.’09]
  • LIOP descriptor [Wang et al.’11]
  • Recent patch descriptors based on CNN features [Brox et al.’15, Paulin et al.’15,…]

slide-47
SLIDE 47

Local descriptors

  • Greyvalue derivatives

– Convolution with Gaussian derivatives

$$G(x, y, \sigma) = \frac{1}{2 \pi \sigma^2} \exp\left( -\frac{x^2 + y^2}{2 \sigma^2} \right)$$

$$I(x, y) * G(\sigma) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x', y', \sigma) \, I(x - x', y - y') \, dx' \, dy'$$

$$\mathbf{v}(x, y) = \begin{pmatrix} I(x, y) * G(\sigma) \\ I(x, y) * G_x(\sigma) \\ I(x, y) * G_y(\sigma) \\ I(x, y) * G_{xx}(\sigma) \\ I(x, y) * G_{xy}(\sigma) \\ I(x, y) * G_{yy}(\sigma) \end{pmatrix}$$

slide-48
SLIDE 48

Local descriptors

Notation for greyvalue derivatives [Koenderink’87]

$$\mathbf{v}(x, y) = \begin{pmatrix} I(x, y) * G(\sigma) \\ I(x, y) * G_x(\sigma) \\ I(x, y) * G_y(\sigma) \\ I(x, y) * G_{xx}(\sigma) \\ I(x, y) * G_{xy}(\sigma) \\ I(x, y) * G_{yy}(\sigma) \end{pmatrix} = \begin{pmatrix} L(x, y) \\ L_x(x, y) \\ L_y(x, y) \\ L_{xx}(x, y) \\ L_{xy}(x, y) \\ L_{yy}(x, y) \end{pmatrix}$$

Invariance?

slide-49
SLIDE 49

Local descriptors – rotation invariance

Invariance to image rotation: differential invariants [Koenderink’87]

$$\begin{pmatrix} L_x L_x + L_y L_y \\ L_{xx} L_x L_x + 2 L_{xy} L_x L_y + L_{yy} L_y L_y \\ L_{xx} + L_{yy} \\ L_{xx} L_{xx} + 2 L_{xy} L_{xy} + L_{yy} L_{yy} \end{pmatrix}$$

First component: gradient magnitude (squared); third component: Laplacian

slide-50
SLIDE 50

Laplacian of Gaussian (LOG)

$$\mathrm{LOG} = G_{xx}(\sigma) + G_{yy}(\sigma)$$

slide-51
SLIDE 51

SIFT descriptor [Lowe’99]

  • Approach

– 8 orientations of the gradient
– 4x4 spatial grid
– Dimension 128
– soft-assignment to spatial bins
– normalization of the descriptor to norm one
– comparison with Euclidean distance

[Figure: image patch → gradient → 3D histogram over (x, y, orientation)]
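The layout of the descriptor (4x4 spatial cells times 8 orientation bins = 128 dimensions) can be illustrated with a deliberately simplified sketch. It is not Lowe's SIFT: it uses hard assignment instead of the soft assignment listed above, and it applies no Gaussian weighting and no clipping. The function name `sift_like_descriptor` and the 16x16 random patch are assumptions for the example.

```python
import numpy as np

def sift_like_descriptor(patch, grid=4, n_orient=8):
    """Gradient-orientation histogram on a grid x grid layout, normalized to norm one."""
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)                          # gradient magnitude
    ori = np.arctan2(gy, gx) % (2 * np.pi)          # gradient orientation in [0, 2*pi)
    o_bin = np.floor(ori / (2 * np.pi) * n_orient).astype(int) % n_orient
    h, w = patch.shape
    y_bin = np.minimum(np.arange(h) * grid // h, grid - 1)   # spatial cell per row
    x_bin = np.minimum(np.arange(w) * grid // w, grid - 1)   # spatial cell per column
    hist = np.zeros((grid, grid, n_orient))
    # Hard assignment: each pixel votes with its magnitude into one (y, x, orientation) bin
    np.add.at(hist, (y_bin[:, None], x_bin[None, :], o_bin), mag)
    desc = hist.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

rng = np.random.default_rng(0)
patch = rng.random((16, 16))
d1 = sift_like_descriptor(patch)
d2 = sift_like_descriptor(2.0 * patch + 0.5)   # affine intensity change
print(d1.shape, np.linalg.norm(d1), np.linalg.norm(d1 - d2))
```

Because gradients remove the intensity offset and the final normalization removes the gain, the Euclidean distance between `d1` and `d2` is essentially zero.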

slide-52
SLIDE 52

Local descriptors - rotation invariance

  • Estimation of the dominant orientation

– extract gradient orientation
– histogram over gradient orientation
– peak in this histogram

  • Rotate patch in dominant direction
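The orientation estimate can be sketched as a magnitude-weighted histogram of gradient orientations. The number of bins (36) and the function name `dominant_orientation` are assumptions, not values from the slide. The sketch returns the centre of the peak bin and leaves out the rotation of the patch itself.

```python
import numpy as np

def dominant_orientation(patch, n_bins=36):
    """Return the centre (in radians) of the strongest gradient-orientation bin."""
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx) % (2 * np.pi)
    # Histogram over gradient orientation, each pixel weighted by its gradient magnitude
    hist, edges = np.histogram(ori, bins=n_bins, range=(0, 2 * np.pi), weights=mag)
    k = int(np.argmax(hist))                  # peak of the histogram
    return 0.5 * (edges[k] + edges[k + 1])

# Intensity ramp increasing along the diagonal: gradient direction is 45 degrees
yy, xx = np.mgrid[0:16, 0:16]
ramp = (xx + yy).astype(float)
theta = dominant_orientation(ramp)
print(np.degrees(theta))
```

Rotating the patch by `-theta` before computing the descriptor would then make the description independent of the image rotation.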

slide-53
SLIDE 53

Local descriptors – illumination change

  • Robustness to illumination changes in case of an affine transformation

$$I_1(\mathbf{x}) = a I_2(\mathbf{x}) + b$$

  • Normalization of the image patch with mean and variance
slide-54
SLIDE 54

Invariance to scale changes

  • Scale change between two images
  • Scale factor s can be eliminated
  • Support region for calculation!!

– In case of a convolution with Gaussian derivatives: defined by σ

$$I(x, y) * G(\sigma) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x', y', \sigma) \, I(x - x', y - y') \, dx' \, dy'$$

$$G(x, y, \sigma) = \frac{1}{2 \pi \sigma^2} \exp\left( -\frac{x^2 + y^2}{2 \sigma^2} \right)$$

slide-55
SLIDE 55

Overview

  • Introduction to local features
  • Harris interest points + SSD, ZNCC, SIFT
  • Scale invariant interest point detectors
slide-56
SLIDE 56

Scale invariance - motivation

  • Description regions have to be adapted to scale changes
  • Interest points have to be repeatable for scale changes
slide-57
SLIDE 57

Harris detector + scale changes

Repeatability rate

$$R(\varepsilon) = \frac{\left| \{ (a_i, b_i) \mid \mathrm{dist}(H(a_i), b_i) < \varepsilon \} \right|}{\max(|a_i|, |b_i|)}$$
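A repeatability rate of this kind can be computed as sketched below, assuming the two images are related by a known homography H and points are given as (x, y) rows. The function name `repeatability`, the tolerance `eps = 1.5` pixels, the nearest-neighbour pairing and the toy point sets are illustrative assumptions.

```python
import numpy as np

def repeatability(pts_a, pts_b, H, eps=1.5):
    """Fraction of points repeated within eps pixels after mapping pts_a with H."""
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    # Project the points of image a into image b (homogeneous coordinates)
    homog = np.hstack([pts_a, np.ones((len(pts_a), 1))])
    proj = homog @ np.asarray(H, dtype=float).T
    proj = proj[:, :2] / proj[:, 2:3]
    # Distance from every projected point to every point of image b
    dist = np.linalg.norm(proj[:, None, :] - pts_b[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    close = dist[np.arange(len(pts_a)), nearest] < eps
    n_pairs = len(set(nearest[close].tolist()))   # each b counted at most once
    return n_pairs / max(len(pts_a), len(pts_b))

H = np.diag([2.0, 2.0, 1.0])                      # pure scale change by a factor 2
pts_a = [(10, 10), (20, 5), (30, 40)]
pts_b = [(20, 20.5), (40, 10), (100, 100), (5, 5)]
rate = repeatability(pts_a, pts_b, H)
print(rate)
```

Two of the three projected points have a partner within 1.5 pixels, and the larger set has four points, so the rate is 2/4.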

slide-58
SLIDE 58

Scale adaptation

                         

Scale change between two images

$$I_1(x_1, y_1) = I_2(x_2, y_2) = I_2(s x_1, s y_1)$$

Scale adapted derivative calculation

slide-59
SLIDE 59

Scale adaptation

                         

Scale change between two images

$$I_1(x_1, y_1) = I_2(x_2, y_2) = I_2(s x_1, s y_1)$$

Scale adapted derivative calculation

$$I_1(x_1, y_1) \otimes G_{i_1 \ldots i_n}(\sigma) = s^n \, I_2(x_2, y_2) \otimes G_{i_1 \ldots i_n}(s \sigma)$$

slide-60
SLIDE 60

Harris detector – adaptation to scale

$$R(\varepsilon) = \{ (a_i, b_i) \mid \mathrm{dist}(H(a_i), b_i) < \varepsilon \}$$

slide-61
SLIDE 61

Scale selection

  • For a point compute a value (gradient, Laplacian etc.) at several scales
  • Normalization of the values with the scale factor, e.g. Laplacian $|s^2 (L_{xx} + L_{yy})|$
  • Select scale at the maximum → characteristic scale
  • Exp. results show that the Laplacian gives best results

[Figure: $|s^2 (L_{xx} + L_{yy})|$ plotted against scale, with its maximum at the characteristic scale]
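Scale selection with the scale-normalized Laplacian can be tried on a synthetic blob. The sketch below evaluates $|s^2 (L_{xx} + L_{yy})|$ at a single point by correlating the image with a sampled Laplacian-of-Gaussian kernel, and keeps the scale with the largest response. The function names, the geometric scale sampling (factor 1.2) and the disk test images are assumptions for illustration. For an ideal disk of radius r the maximum is expected near s = r/√2.

```python
import numpy as np

def log_kernel(sigma):
    """Sampled Laplacian of Gaussian (G_xx + G_yy), truncated at 4 sigma."""
    r = int(np.ceil(4 * sigma))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    r2 = x ** 2 + y ** 2
    g = np.exp(-r2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    k = (r2 - 2 * sigma ** 2) / sigma ** 4 * g
    return k - k.mean()          # zero sum, so uniform regions give exactly 0

def characteristic_scale(img, y, x, scales):
    """Scale maximizing |s^2 (L_xx + L_yy)| at pixel (y, x), plus all responses."""
    img = np.asarray(img, dtype=float)
    responses = []
    for s in scales:
        k = log_kernel(s)
        r = k.shape[0] // 2
        padded = np.pad(img, r, mode="edge")
        window = padded[y:y + 2 * r + 1, x:x + 2 * r + 1]   # centred on (y, x)
        responses.append(abs(s ** 2 * np.sum(k * window)))  # scale normalization
    responses = np.array(responses)
    return scales[int(np.argmax(responses))], responses

def disk_image(size, radius):
    c = size // 2
    yy, xx = np.mgrid[0:size, 0:size]
    return ((xx - c) ** 2 + (yy - c) ** 2 <= radius ** 2).astype(float)

scales = 1.5 * 1.2 ** np.arange(14)
s1, _ = characteristic_scale(disk_image(161, 8), 80, 80, scales)
s2, _ = characteristic_scale(disk_image(161, 16), 80, 80, scales)
print(s1, s2, s2 / s1)
```

Doubling the blob radius roughly doubles the selected scale, which is the property that makes the characteristic scale usable for scale-invariant detection.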

slide-62
SLIDE 62

Scale selection

  • Scale invariance of the characteristic scale
[Figure: normalized Laplacian (norm. Lap.) plotted against scale]

slide-63
SLIDE 63

Scale selection

  • Scale invariance of the characteristic scale

  

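[Figure: normalized Laplacian (norm. Lap.) plotted against scale for corresponding points in the two images]

  • Relation between characteristic scales

$$s \cdot s_1 = s_2$$

($s_1$, $s_2$: characteristic scales in the two images; $s$: scale change between the images)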

slide-64
SLIDE 64

Scale-invariant detectors

  • Harris-Laplace (Mikolajczyk & Schmid’01)
  • Laplacian detector (Lindeberg’98)
  • Difference of Gaussian (SIFT detector, Lowe’99)

Harris-Laplace Laplacian

slide-65
SLIDE 65

Harris-Laplace

invariant points + associated regions [Mikolajczyk & Schmid’01]
– multi-scale Harris points
– selection of points at maximum of Laplacian

slide-66
SLIDE 66

Matching results

213 / 190 detected interest points

slide-67
SLIDE 67

Matching results

58 points are initially matched

slide-68
SLIDE 68

Matching results

32 points are matched after verification – all correct

slide-69
SLIDE 69

LOG detector

Detection of maxima and minima of Laplacian in scale space

Convolve image with scale-normalized Laplacian at several scales

$$\mathrm{LOG} = s^2 \left( G_{xx}(\sigma) + G_{yy}(\sigma) \right)$$

slide-70
SLIDE 70

Efficient implementation

  • Difference of Gaussian (DOG) approximates the Laplacian

$$\mathrm{DOG} = G(k \sigma) - G(\sigma)$$

  • Error due to the approximation
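A short derivation shows why the difference of Gaussians behaves like a scale-normalized Laplacian. It is a first-order sketch, assuming k is close to 1, and follows the standard argument based on the identity ∂G/∂σ = σ∇²G, which can be verified directly from the definition of the Gaussian G(x, y, σ).

$$\frac{\partial G}{\partial \sigma} = \sigma \nabla^2 G, \qquad \nabla^2 G = G_{xx} + G_{yy}$$

$$G(k \sigma) - G(\sigma) \approx (k \sigma - \sigma) \frac{\partial G}{\partial \sigma} = (k - 1) \, \sigma^2 \, \nabla^2 G$$

The DOG therefore already contains the σ² scale normalization, up to the constant factor (k − 1), which does not change the location of the extrema.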
slide-71
SLIDE 71

DOG detector

  • Fast computation, scale space processed one octave at a time

David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60 (2).

slide-72
SLIDE 72

Maximally stable extremal regions (MSER) [Matas’02]

  • Extremal regions: connected components in a thresholded image (all pixels above/below a threshold)
  • Maximally stable: minimal change of the component (area) for a change of the threshold, i.e. region remains stable for a change of threshold

  • Excellent results in a recent comparison
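The stability criterion can be illustrated for a single seed pixel. This is a heavily simplified sketch, not the MSER algorithm of Matas et al., which tracks all components efficiently. It only handles dark regions (pixels at or below the threshold). The function names, the 4-connectivity, the threshold step and the stability measure q = (area(t+Δ) − area(t−Δ)) / area(t) as written here are assumptions for the example.

```python
import numpy as np

def component_area(mask, seed):
    """Area of the 4-connected component of a boolean mask containing seed = (y, x)."""
    if not mask[seed]:
        return 0
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    seen[seed] = True
    stack, area = [seed], 0
    while stack:
        y, x = stack.pop()
        area += 1
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                seen[ny, nx] = True
                stack.append((ny, nx))
    return area

def most_stable_threshold(img, seed, thresholds, delta=1):
    """Threshold at which the extremal region around seed changes least in area."""
    areas = np.array([component_area(img <= t, seed) for t in thresholds], dtype=float)
    best, best_q = None, np.inf
    for i in range(delta, len(thresholds) - delta):
        if areas[i] == 0:
            continue
        q = (areas[i + delta] - areas[i - delta]) / areas[i]   # relative area change
        if q < best_q:
            best, best_q = thresholds[i], q
    return best, areas

# Dark 10x10 square (value 50) on a bright background (value 200)
img = np.full((30, 30), 200)
img[10:20, 10:20] = 50
thresholds = np.arange(0, 256, 10)
t, areas = most_stable_threshold(img, (15, 15), thresholds)
print(t, areas[list(thresholds).index(t)])
```

For every threshold between the square and the background intensity the component is the same 100-pixel square, so its area does not change and the region is maximally stable there.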
slide-73
SLIDE 73

Maximally stable extremal regions (MSER)

Examples of thresholded images

[Figure panels: high threshold, low threshold]

slide-74
SLIDE 74

MSER