Instance-level recognition: Local invariant features, Cordelia Schmid



slide-1
SLIDE 1

Instance-level recognition: Local invariant features

Cordelia Schmid INRIA, Grenoble

slide-2
SLIDE 2

Overview

  • Introduction to local features
  • Harris interest points + SSD, ZNCC, SIFT
  • Scale & affine invariant interest point detectors
  • Evaluation and comparison of different detectors
  • Region descriptors and their performance
slide-3
SLIDE 3

Local features

local descriptor

  • Several / many local descriptors per image
  • Robust to occlusion/clutter + no object segmentation required
  • Photometric: distinctive
  • Invariant: to image transformations + illumination changes

slide-4
SLIDE 4

Local features: interest points

slide-5
SLIDE 5

Local features: Contours/segments

slide-6
SLIDE 6

Local features: segmentation

slide-7
SLIDE 7

Application: Matching

Find corresponding locations in the image

slide-8
SLIDE 8

Illustration – Matching

Interest points extracted with Harris detector (~ 500 points)

slide-9
SLIDE 9

Illustration – Matching

Interest points matched based on cross-correlation (188 pairs)

slide-10
SLIDE 10

Illustration – Matching

Global constraint: robust estimation of the fundamental matrix

99 inliers, 89 outliers

slide-11
SLIDE 11

Application: Panorama stitching

Images courtesy of A. Zisserman.

slide-12
SLIDE 12

Application: Instance-level recognition

Search for particular objects and scenes in large databases

slide-13
SLIDE 13

Difficulties

Finding the object despite possibly large changes in scale, viewpoint, lighting and partial occlusion

  • requires invariant description

Examples: viewpoint, scale, lighting, occlusion

slide-14
SLIDE 14

Difficulties

  • Very large image collections → need for efficient indexing

– Flickr has 2 billion photographs, more than 1 million added daily
– Facebook has 15 billion images (~27 million added daily)
– Large personal collections
– Video collections, e.g., YouTube

slide-15
SLIDE 15

Instance-level recognition: Approach

  • Image content is transformed into local features invariant to geometric and photometric transformations
  • Matching local invariant descriptors

slide-16
SLIDE 16

Applications

Search photos on the web for particular places

Find these landmarks ... in these images and 1M more

slide-17
SLIDE 17

Applications

  • Take a picture of a product or advertisement → find relevant information on the web [Pixee – Milpix]

slide-18
SLIDE 18

Applications

  • Finding stolen/missing objects in a large collection

slide-19
SLIDE 19

Applications

  • Copy detection for images and videos

Query video → search in 200h of video

slide-20
SLIDE 20

Applications

  • Sony Aibo – Robotics

– Recognize docking station
– Communicate with visual cards
– Place recognition
– Loop closure in SLAM

slide-21
SLIDE 21

Local features - history

  • Line segments [Lowe’87, Ayache’90]
  • Interest points & cross correlation [Z. Zhang et al. 95]
  • Rotation invariance with differential invariants [Schmid&Mohr’96]
  • Scale & affine invariant detectors [Lindeberg’98, Lowe’99, Tuytelaars&VanGool’00, Mikolajczyk&Schmid’02, Matas et al.’02]
  • Dense detectors and descriptors [Leung&Malik’99, Fei-Fei&Perona’05, Lazebnik et al.’06]
  • Contour and region (segmentation) descriptors [Shotton et al.’05, Opelt et al.’06, Ferrari et al.’06, Leordeanu et al.’07]

slide-22
SLIDE 22

Local features

1) Extraction of local features

– Contours/segments
– Interest points & regions
– Regions by segmentation
– Dense features, points on a regular grid

2) Description of local features

– Dependent on the feature type
– Contours/segments → angles, length ratios
– Interest points → greylevels, gradient histograms
– Regions (segmentation) → texture + color distributions

slide-23
SLIDE 23

Line matching

  • Contour extraction

– Zero crossing of Laplacian
– Local maxima of gradients

  • Chain contour points (hysteresis)
  • Extraction of line segments
  • Description of segments

– Midpoint, length, orientation, angle between pairs etc.

slide-24
SLIDE 24

Experimental results – line segments

images 600 x 600

slide-25
SLIDE 25

Experimental results – line segments

248 / 212 line segments extracted

slide-26
SLIDE 26

Experimental results – line segments

89 matched line segments - 100% correct

slide-27
SLIDE 27

Experimental results – line segments

3D reconstruction

slide-28
SLIDE 28

Problems of line segments

  • Often only partial extraction

– Line segments broken into parts
– Missing parts

  • Information not very discriminative

– 1D information
– Similar for many segments

  • Potential solutions

– Pairs and triplets of segments
– Interest points

slide-29
SLIDE 29

Overview

  • Introduction to local features
  • Harris interest points + SSD, ZNCC, SIFT
  • Scale & affine invariant interest point detectors
  • Evaluation and comparison of different detectors
  • Region descriptors and their performance
slide-30
SLIDE 30

Harris detector [Harris & Stephens’88]

Based on the idea of auto-correlation

Important difference in all directions => interest point

slide-31
SLIDE 31

Harris detector

Auto-correlation function for a point $(x, y)$ and a shift $(\Delta x, \Delta y)$, summed over a window $W$ around $(x, y)$:

$$A(x, y) = \sum_{(x_k, y_k) \in W(x, y)} \bigl( I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y) \bigr)^2$$

slide-32
SLIDE 32

Harris detector

Auto-correlation function for a point $(x, y)$ and a shift $(\Delta x, \Delta y)$:

$$A(x, y) = \sum_{(x_k, y_k) \in W(x, y)} \bigl( I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y) \bigr)^2$$

$A(x, y)$ is

  • small in all directions → uniform region
  • large in one direction → contour
  • large in all directions → interest point
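The three cases can be checked directly by evaluating the auto-correlation function for explicit shifts. The following is a minimal sketch, not the lecture's implementation: the helper names `autocorrelation` and `shift_profile`, the 5x5 window (`half=2`), and the synthetic image are all assumptions made for illustration.

```python
import numpy as np

def autocorrelation(image, x, y, dx, dy, half=2):
    """Sum of squared differences between the window W centred on (x, y)
    and the same window shifted by (dx, dy)."""
    win = image[y - half:y + half + 1, x - half:x + half + 1]
    shifted = image[y + dy - half:y + dy + half + 1,
                    x + dx - half:x + dx + half + 1]
    return float(np.sum((win - shifted) ** 2))

def shift_profile(image, x, y, half=2):
    """A(x, y) for the eight unit shifts (dx, dy) around (x, y)."""
    shifts = [(-1, -1), (0, -1), (1, -1), (-1, 0),
              (1, 0), (-1, 1), (0, 1), (1, 1)]
    return {s: autocorrelation(image, x, y, s[0], s[1], half) for s in shifts}

# Synthetic 20x20 image (an assumption): bright square in the lower-right quadrant.
img = np.zeros((20, 20))
img[10:, 10:] = 1.0

flat = shift_profile(img, 4, 4)       # inside the dark uniform area
edge = shift_profile(img, 15, 10)     # on the top border of the square
corner = shift_profile(img, 10, 10)   # at the corner of the square

print("flat  :", min(flat.values()), max(flat.values()))
print("edge  :", min(edge.values()), max(edge.values()))
print("corner:", min(corner.values()), max(corner.values()))
```

In the uniform area every shift gives zero; on the border the shift along the contour gives zero while shifts across it do not; at the corner every shift gives a non-zero value.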

slide-33
SLIDE 33

Harris detector

slide-34
SLIDE 34

Harris detector

Discrete shifts are avoided based on the auto-correlation matrix

with first order approximation

$$I(x_k + \Delta x,\, y_k + \Delta y) = I(x_k, y_k) + \bigl( I_x(x_k, y_k) \;\; I_y(x_k, y_k) \bigr) \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$

$$A(x, y) = \sum_{(x_k, y_k) \in W(x, y)} \bigl( I(x_k, y_k) - I(x_k + \Delta x,\, y_k + \Delta y) \bigr)^2 = \sum_{(x_k, y_k) \in W(x, y)} \left( \bigl( I_x(x_k, y_k) \;\; I_y(x_k, y_k) \bigr) \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \right)^2$$

slide-35
SLIDE 35

Harris detector

$$A(x, y) = \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} \begin{pmatrix} \sum_{(x_k, y_k) \in W} (I_x(x_k, y_k))^2 & \sum_{(x_k, y_k) \in W} I_x(x_k, y_k)\, I_y(x_k, y_k) \\ \sum_{(x_k, y_k) \in W} I_x(x_k, y_k)\, I_y(x_k, y_k) & \sum_{(x_k, y_k) \in W} (I_y(x_k, y_k))^2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$

Auto-correlation matrix: the sum can be smoothed with a Gaussian

$$A(x, y) = \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} \, G \otimes \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$

slide-36
SLIDE 36

Harris detector

  • Auto-correlation matrix

$$A(x, y) = G \otimes \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$

– captures the structure of the local neighborhood
– measure based on eigenvalues of this matrix

  • 2 strong eigenvalues => interest point
  • 1 strong eigenvalue => contour
  • 0 strong eigenvalues => uniform region
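The eigenvalue test can be tried on small synthetic patches. This sketch makes two simplifying assumptions that are not in the slides: a uniform (box) window replaces the Gaussian weighting G, and an eigenvalue counts as "strong" when it exceeds a hypothetical threshold `strong=1.0`.

```python
import numpy as np

def second_moment_eigenvalues(patch):
    """Eigenvalues (ascending) of the 2x2 auto-correlation matrix of a patch.
    Assumption: uniform window over the whole patch instead of Gaussian G."""
    iy, ix = np.gradient(patch.astype(float))  # derivatives along rows (y), columns (x)
    a = np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                  [np.sum(ix * iy), np.sum(iy * iy)]])
    return np.linalg.eigvalsh(a)

def classify(patch, strong=1.0):
    """Count the strong eigenvalues and map the count to a label."""
    n_strong = int(np.sum(second_moment_eigenvalues(patch) > strong))
    return ["uniform region", "contour", "interest point"][n_strong]

flat_patch = np.zeros((9, 9))
edge_patch = np.zeros((9, 9))
edge_patch[:, 5:] = 1.0        # vertical step edge
corner_patch = np.zeros((9, 9))
corner_patch[5:, 5:] = 1.0     # L-shaped corner

for name, p in [("flat", flat_patch), ("edge", edge_patch), ("corner", corner_patch)]:
    print(name, second_moment_eigenvalues(p), classify(p))
```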

slide-37
SLIDE 37

Interpreting the eigenvalues

Classification of image points using eigenvalues of autocorrelation matrix (diagram axes: λ1, λ2):

  • “Corner”: λ1 and λ2 are large, λ1 ~ λ2
  • “Edge”: λ1 >> λ2 or λ2 >> λ1
  • “Flat” region: λ1 and λ2 are small

slide-38
SLIDE 38

Corner response function

$$R = \det(A) - \alpha \, \operatorname{trace}(A)^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$$

α: constant (0.04 to 0.06)

  • “Corner”: R > 0
  • “Edge”: R < 0
  • “Flat” region: |R| small
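A quick numeric check of the response function for the three cases. The eigenvalue pairs below are made-up illustrative values, and `alpha=0.05` is simply a value inside the range quoted above.

```python
def corner_response(lam1, lam2, alpha=0.05):
    """R = det(A) - alpha * trace(A)^2, written in terms of the eigenvalues."""
    det = lam1 * lam2
    trace = lam1 + lam2
    return det - alpha * trace ** 2

r_corner = corner_response(10.0, 10.0)   # both eigenvalues large
r_edge = corner_response(10.0, 0.1)      # one eigenvalue dominates
r_flat = corner_response(0.1, 0.1)       # both eigenvalues small

print(r_corner, r_edge, r_flat)
```

The corner case gives a clearly positive R, the edge case a negative R, and the flat case a value close to zero.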

slide-39
SLIDE 39

Harris detector

  • Cornerness function

$$f = \det(A) - k \, (\operatorname{trace}(A))^2 = \lambda_1 \lambda_2 - k (\lambda_1 + \lambda_2)^2$$

Reduces the effect of a strong contour

  • Interest point detection

– Threshold (absolute, relative, number of corners)
– Local maxima

$$f > \text{thresh} \;\wedge\; \forall (x', y') \in 8\text{-neighbourhood}: \; f(x, y) \ge f(x', y')$$

slide-40
SLIDE 40

Harris Detector: Steps

slide-41
SLIDE 41

Harris Detector: Steps

Compute corner response R

slide-42
SLIDE 42

Harris Detector: Steps

Find points with large corner response: R>threshold

slide-43
SLIDE 43

Harris Detector: Steps

Take only the points of local maxima of R

slide-44
SLIDE 44

Harris Detector: Steps

slide-45
SLIDE 45

Harris detector: Summary of steps

1. Compute Gaussian derivatives at each pixel
2. Compute second moment matrix A in a Gaussian window around each pixel
3. Compute corner response function R
4. Threshold R
5. Find local maxima of response function (non-maximum suppression)
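The five steps can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not a reference implementation: the function names, the scales `sigma_d` and `sigma_i`, `alpha=0.05`, and the relative threshold `rel_thresh` are hypothetical choices, and the Gaussian derivatives are approximated by Gaussian smoothing followed by central differences.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian truncated at about 3 sigma."""
    radius = int(3 * sigma + 0.5)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth(image, sigma):
    """Separable Gaussian smoothing with edge replication."""
    k = gaussian_kernel(sigma)
    padded = np.pad(image, len(k) // 2, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def harris(image, sigma_d=1.0, sigma_i=2.0, alpha=0.05, rel_thresh=0.01):
    image = image.astype(float)
    # 1. Derivatives: smooth, then central differences (approximation).
    iy, ix = np.gradient(smooth(image, sigma_d))
    # 2. Second moment matrix entries in a Gaussian window.
    axx = smooth(ix * ix, sigma_i)
    axy = smooth(ix * iy, sigma_i)
    ayy = smooth(iy * iy, sigma_i)
    # 3. Corner response R = det(A) - alpha * trace(A)^2.
    r = axx * ayy - axy ** 2 - alpha * (axx + ayy) ** 2
    # 4. Threshold, here relative to the strongest response.
    strong = r > rel_thresh * r.max()
    # 5. Non-maximum suppression in the 8-neighbourhood.
    h, w = r.shape
    padded = np.pad(r, 1, mode="constant", constant_values=-np.inf)
    is_max = np.ones((h, w), dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                is_max &= r >= padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    ys, xs = np.nonzero(strong & is_max)
    return list(zip(xs.tolist(), ys.tolist())), r

# Synthetic test image (an assumption): a bright square on a dark background.
img = np.zeros((40, 40))
img[12:28, 12:28] = 1.0
points, response = harris(img)
print(points)
```

On this image the detections are expected to cluster at the four corners of the square, slightly inside the corner because of the Gaussian window.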

slide-46
SLIDE 46

Harris - invariance to transformations

  • Geometric transformations

– translation
– rotation
– similarity (rotation + scale change)
– affine (valid for local planar objects)

  • Photometric transformations

– Affine intensity changes (I → a I + b)

slide-47
SLIDE 47

Harris Detector: Invariance Properties

  • Rotation

Ellipse rotates but its shape (i.e. eigenvalues) remains the same

Corner response R is invariant to image rotation

slide-48
SLIDE 48

Harris Detector: Invariance Properties

  • Affine intensity change

Only derivatives are used => invariance to intensity shift I → I + b

Intensity scale: I → a I

[Figure: corner response R plotted against x (image coordinate) with a fixed threshold, before and after intensity scaling]

Partially invariant to affine intensity change, dependent on type of threshold
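The dependence on the threshold type can be made explicit with a short derivation. It is a sketch that assumes the auto-correlation matrix $A$ and response $R = \det(A) - \alpha\,\operatorname{trace}(A)^2$ defined on the earlier slides, and writes primed symbols for quantities computed on the transformed image.

```latex
\begin{aligned}
I' &= a I + b \;\Rightarrow\; I'_x = a I_x, \quad I'_y = a I_y \\
A' &= G \otimes \begin{pmatrix} I'^{\,2}_x & I'_x I'_y \\ I'_x I'_y & I'^{\,2}_y \end{pmatrix} = a^2 A \\
R' &= \det(A') - \alpha \, \operatorname{trace}(A')^2
    = a^4 \det(A) - \alpha \, a^4 \operatorname{trace}(A)^2 = a^4 R
\end{aligned}
```

The offset b disappears and the sign of R and the positions of its local maxima are unchanged, but its values scale with a to the fourth power. An absolute threshold therefore selects a different set of points after the change, whereas a threshold relative to the maximum response selects the same set.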

slide-49
SLIDE 49

Harris Detector: Invariance Properties

  • Scaling

[Figure: a corner, and the same corner magnified; at the magnified scale all points will be classified as edges]

Not invariant to scaling

slide-50
SLIDE 50

Comparison of patches - SSD

Comparison of the intensities in the neighborhood of two interest points: $(x_1, y_1)$ in image 1 and $(x_2, y_2)$ in image 2

SSD: sum of squared differences

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \bigl( I_1(x_1 + i,\, y_1 + j) - I_2(x_2 + i,\, y_2 + j) \bigr)^2$$

Small difference values → similar patches

slide-51
SLIDE 51

Comparison of patches

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \bigl( I_1(x_1 + i,\, y_1 + j) - I_2(x_2 + i,\, y_2 + j) \bigr)^2$$

SSD: Invariance to photometric transformations?

Intensity changes (I → I + b) => Normalizing with the mean of each patch

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \bigl( (I_1(x_1 + i,\, y_1 + j) - m_1) - (I_2(x_2 + i,\, y_2 + j) - m_2) \bigr)^2$$

Intensity changes (I → aI + b) => Normalizing with the mean and standard deviation of each patch

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( \frac{I_1(x_1 + i,\, y_1 + j) - m_1}{\sigma_1} - \frac{I_2(x_2 + i,\, y_2 + j) - m_2}{\sigma_2} \right)^2$$
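The effect of the two normalisations can be verified on a random patch. This is a minimal sketch: the function names are hypothetical, a patch is assumed to be a (2N+1) x (2N+1) array with non-zero standard deviation, and the gain a is assumed positive.

```python
import numpy as np

def ssd(p1, p2):
    """Mean of squared differences over two equally sized patches."""
    return float(np.mean((p1 - p2) ** 2))

def zero_mean_ssd(p1, p2):
    """SSD after subtracting each patch's mean (handles I -> I + b)."""
    return ssd(p1 - p1.mean(), p2 - p2.mean())

def zero_normalized_ssd(p1, p2):
    """SSD after normalising by mean and standard deviation (handles I -> a I + b)."""
    n1 = (p1 - p1.mean()) / p1.std()
    n2 = (p2 - p2.mean()) / p2.std()
    return ssd(n1, n2)

rng = np.random.default_rng(0)
patch = rng.random((7, 7))       # N = 3
shifted = patch + 0.3            # I -> I + b
affine = 2.0 * patch + 0.3       # I -> a I + b

print("SSD, shifted copy           :", ssd(patch, shifted))
print("zero-mean SSD, shifted copy :", zero_mean_ssd(patch, shifted))
print("zero-mean SSD, affine copy  :", zero_mean_ssd(patch, affine))
print("normalised SSD, affine copy :", zero_normalized_ssd(patch, affine))
```

Plain SSD reports a difference for the shifted copy, mean normalisation removes it, and only the mean and standard deviation normalisation also removes the gain.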

slide-52
SLIDE 52

Cross-correlation ZNCC

Zero normalized SSD

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( \frac{I_1(x_1 + i,\, y_1 + j) - m_1}{\sigma_1} - \frac{I_2(x_2 + i,\, y_2 + j) - m_2}{\sigma_2} \right)^2$$

ZNCC: zero normalized cross correlation

$$\frac{1}{(2N+1)^2} \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( \frac{I_1(x_1 + i,\, y_1 + j) - m_1}{\sigma_1} \right) \cdot \left( \frac{I_2(x_2 + i,\, y_2 + j) - m_2}{\sigma_2} \right)$$

ZNCC values between -1 and 1, 1 when identical patches

in practice threshold around 0.5
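A small sketch of ZNCC-based matching with the 0.5 threshold mentioned above. The helper `best_match` and the synthetic candidate patches are assumptions for illustration; real matching would compare patches around detected interest points in two images.

```python
import numpy as np

def zncc(p1, p2):
    """Zero normalized cross correlation of two equally sized patches."""
    n1 = (p1 - p1.mean()) / p1.std()
    n2 = (p2 - p2.mean()) / p2.std()
    return float(np.mean(n1 * n2))

def best_match(patch, candidates, threshold=0.5):
    """Index and score of the best candidate, or (None, score) if below threshold."""
    scores = [zncc(patch, c) for c in candidates]
    best = int(np.argmax(scores))
    if scores[best] >= threshold:
        return best, scores[best]
    return None, scores[best]

rng = np.random.default_rng(1)
patch = rng.random((7, 7))
unrelated = rng.random((7, 7))
# Same content under an affine intensity change plus a little noise.
relit = 3.0 * patch + 1.0 + 0.01 * rng.standard_normal((7, 7))

index, score = best_match(patch, [unrelated, relit])
print(index, score)
print(zncc(patch, patch), zncc(patch, -patch))
```

With the normalisation above, the zero normalized SSD and ZNCC carry the same information: the former equals 2 − 2 · ZNCC.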

slide-53
SLIDE 53

Local descriptors

  • Greyvalue derivatives
  • Differential invariants [Koenderink’87]
  • SIFT descriptor [Lowe’99]
slide-54
SLIDE 54

Greyvalue derivatives: Image gradient

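  • The gradient of an image: $\nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right)$
  • The gradient points in the direction of most rapid increase in intensity
  • The gradient direction is given by $\theta = \tan^{-1}\left( \frac{\partial f}{\partial y} \Big/ \frac{\partial f}{\partial x} \right)$

– how does this relate to the direction of the edge?

  • The edge strength is given by the gradient magnitude $\|\nabla f\| = \sqrt{\left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2}$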
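Gradient magnitude and direction can be computed with finite differences. This is a minimal sketch using `numpy.gradient` (central differences) on a synthetic ramp; the function name `image_gradient` is an assumption. On the question above: the gradient is perpendicular to the edge, so the edge runs along the direction θ + 90°.

```python
import numpy as np

def image_gradient(image):
    """Return (fx, fy, magnitude, direction) using central differences."""
    fy, fx = np.gradient(image.astype(float))  # axis 0 is y (rows), axis 1 is x (columns)
    magnitude = np.hypot(fx, fy)
    direction = np.arctan2(fy, fx)             # radians, measured from the x axis
    return fx, fy, magnitude, direction

# Intensity ramp increasing equally in x and y (synthetic example).
ys, xs = np.mgrid[0:8, 0:8]
ramp = (xs + ys).astype(float)

fx, fy, mag, theta = image_gradient(ramp)
print(mag[4, 4], np.degrees(theta[4, 4]))
```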
slide-55
SLIDE 55

Differentiation and convolution

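  • Recall, for 2D function, f(x,y):

$$\frac{\partial f}{\partial x} = \lim_{\varepsilon \to 0} \left( \frac{f(x + \varepsilon, y)}{\varepsilon} - \frac{f(x, y)}{\varepsilon} \right)$$

  • We could approximate this as

$$\frac{\partial f}{\partial x} \approx \frac{f(x_{n+1}, y) - f(x_n, y)}{\Delta x}$$

  • Convolution with the filter $\begin{pmatrix} -1 & 1 \end{pmatrix}$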
slide-56
SLIDE 56

Finite difference filters

  • Other approximations of derivative filters exist:
slide-57
SLIDE 57

Effects of noise

  • Consider a single row or column of the image

– Plotting intensity as a function of position gives a signal

  • Where is the edge?
slide-58
SLIDE 58

Solution: smooth first

  • To find edges, look for peaks in $\frac{d}{dx}(f * g)$

[Figure panels: $f$, $g$, $f * g$, $\frac{d}{dx}(f * g)$]

slide-59
SLIDE 59

Derivative theorem of convolution

  • Differentiation is convolution, and convolution is associative:

$$\frac{d}{dx}(f * g) = f * \frac{d}{dx} g$$

  • This saves us one operation:

[Figure panels: $f$, $\frac{d}{dx} g$, $f * \frac{d}{dx} g$]
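The identity can be checked numerically in 1D. In this sketch the signal, the noise level, the Gaussian width, and the two-tap difference filter `d` are all assumptions; under convolution, `d = [1, -1]` computes s[n] − s[n−1].

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy step edge: the signal jumps from 0 to 1 at sample 50.
f = np.concatenate([np.zeros(50), np.ones(50)]) + 0.05 * rng.standard_normal(100)

# Gaussian smoothing kernel g (sigma = 3) and a finite-difference filter d.
t = np.arange(-10, 11)
g = np.exp(-t ** 2 / (2 * 3.0 ** 2))
g = g / g.sum()
d = np.array([1.0, -1.0])

# Two operations on the signal: smooth, then differentiate.
two_steps = np.convolve(np.convolve(f, g), d)
# One operation on the signal: convolve with the pre-differentiated kernel.
one_step = np.convolve(f, np.convolve(g, d))

# The peak marks the edge; subtract the kernel half-width added by "full" convolution.
edge_index = int(np.argmax(one_step)) - len(g) // 2
print(np.allclose(two_steps, one_step), edge_index)
```

The derivative of the kernel can be precomputed once, so each image needs a single convolution.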

slide-60
SLIDE 60

Local descriptors

  • Greyvalue derivatives

– Convolution with Gaussian derivatives

$$\mathbf{v}(x, y) = \begin{pmatrix} I(x, y) * G(\sigma) \\ I(x, y) * G_x(\sigma) \\ I(x, y) * G_y(\sigma) \\ I(x, y) * G_{xx}(\sigma) \\ I(x, y) * G_{xy}(\sigma) \\ I(x, y) * G_{yy}(\sigma) \\ \vdots \end{pmatrix}$$

$$I(x, y) * G(\sigma) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x', y', \sigma) \, I(x - x', y - y') \, dx' \, dy'$$

$$G(x, y, \sigma) = \frac{1}{2 \pi \sigma^2} \exp\left( -\frac{x^2 + y^2}{2 \sigma^2} \right)$$

slide-61
SLIDE 61

Local descriptors

Notation for greyvalue derivatives [Koenderink’87]

$$\mathbf{v}(x, y) = \begin{pmatrix} I(x, y) * G(\sigma) \\ I(x, y) * G_x(\sigma) \\ I(x, y) * G_y(\sigma) \\ I(x, y) * G_{xx}(\sigma) \\ I(x, y) * G_{xy}(\sigma) \\ I(x, y) * G_{yy}(\sigma) \\ \vdots \end{pmatrix} = \begin{pmatrix} L(x, y) \\ L_x(x, y) \\ L_y(x, y) \\ L_{xx}(x, y) \\ L_{xy}(x, y) \\ L_{yy}(x, y) \\ \vdots \end{pmatrix}$$

Invariance?

slide-62
SLIDE 62

Local descriptors – rotation invariance

Invariance to image rotation: differential invariants [Koen87]

$$\begin{pmatrix} L \\ L_x L_x + L_y L_y \\ L_{xx} L_x L_x + 2 L_{xy} L_x L_y + L_{yy} L_y L_y \\ L_{xx} + L_{yy} \\ L_{xx} L_{xx} + 2 L_{xy} L_{xy} + L_{yy} L_{yy} \\ \vdots \end{pmatrix}$$

The second component, $L_x L_x + L_y L_y$, is the (squared) gradient magnitude; the fourth, $L_{xx} + L_{yy}$, is the Laplacian.
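Rotation invariance of these combinations can be checked for a 90° rotation, where no interpolation is needed. This sketch evaluates the Gaussian derivatives only at the patch centre; the function names, the 21x21 random patch, and σ = 3 are assumptions for illustration.

```python
import numpy as np

def local_jet(patch, sigma):
    """L, Lx, Ly, Lxx, Lxy, Lyy at the centre of a square, odd-sized patch,
    obtained by convolution with sampled Gaussian derivative kernels."""
    r = patch.shape[0] // 2
    t = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-t ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    g1 = -t / sigma ** 2 * g                      # first derivative of g
    g2 = (t ** 2 / sigma ** 2 - 1) / sigma ** 2 * g  # second derivative of g
    flipped = patch[::-1, ::-1]                   # convolution samples I(c - t)

    def resp(ky, kx):
        return float(ky @ flipped @ kx)

    return {"L": resp(g, g), "Lx": resp(g, g1), "Ly": resp(g1, g),
            "Lxx": resp(g, g2), "Lxy": resp(g1, g1), "Lyy": resp(g2, g)}

def differential_invariants(j):
    """The first five rotation-invariant combinations of the local jet."""
    return np.array([
        j["L"],
        j["Lx"] ** 2 + j["Ly"] ** 2,
        j["Lxx"] * j["Lx"] ** 2 + 2 * j["Lxy"] * j["Lx"] * j["Ly"] + j["Lyy"] * j["Ly"] ** 2,
        j["Lxx"] + j["Lyy"],
        j["Lxx"] ** 2 + 2 * j["Lxy"] ** 2 + j["Lyy"] ** 2,
    ])

rng = np.random.default_rng(0)
patch = rng.random((21, 21))
inv_original = differential_invariants(local_jet(patch, 3.0))
inv_rotated = differential_invariants(local_jet(np.rot90(patch), 3.0))
print(inv_original)
print(inv_rotated)
```

The individual derivatives change under the rotation, while the five combinations agree up to floating-point rounding.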

slide-63
SLIDE 63

Laplacian of Gaussian (LOG)

$$\mathrm{LOG} = G_{xx}(\sigma) + G_{yy}(\sigma)$$

slide-64
SLIDE 64

SIFT descriptor [Lowe’99]

  • Approach

– 8 orientations of the gradient
– 4x4 spatial grid
– Dimension 128
– soft-assignment to spatial bins
– normalization of the descriptor to norm one
– comparison with Euclidean distance

image patch → gradient → 3D histogram (x, y, orientation)
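The layout of the descriptor (4x4 spatial cells times 8 orientation bins = 128 dimensions) can be sketched as follows. This is a simplified stand-in, not Lowe's implementation: it assumes hard assignment to bins instead of the soft-assignment listed above, and it omits Gaussian weighting and any rotation or scale normalisation. The function name is hypothetical.

```python
import numpy as np

def sift_like_descriptor(patch, grid=4, n_orient=8):
    """Gradient-orientation histogram over a grid x grid spatial layout,
    normalised to unit Euclidean norm (simplified: hard binning)."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx) % (2 * np.pi)
    obin = np.floor(ori / (2 * np.pi) * n_orient).astype(int) % n_orient

    h, w = patch.shape
    ybin = np.minimum(np.arange(h) * grid // h, grid - 1)
    xbin = np.minimum(np.arange(w) * grid // w, grid - 1)
    yy, xx = np.meshgrid(ybin, xbin, indexing="ij")

    hist = np.zeros((grid, grid, n_orient))
    np.add.at(hist, (yy, xx, obin), mag)   # accumulate gradient magnitude per bin

    desc = hist.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

rng = np.random.default_rng(0)
patch = rng.random((16, 16))
d1 = sift_like_descriptor(patch)
d2 = sift_like_descriptor(2.0 * patch + 5.0)   # affine intensity change
d3 = sift_like_descriptor(rng.random((16, 16)))

print(d1.shape, np.linalg.norm(d1))
print("distance to re-lit copy :", np.linalg.norm(d1 - d2))
print("distance to other patch :", np.linalg.norm(d1 - d3))
```

Because the histogram is built from gradients and then normalised, the descriptor of the re-lit copy is essentially identical to the original, while an unrelated patch lies at a clearly larger Euclidean distance.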

slide-65
SLIDE 65

Local descriptors - rotation invariance

  • Estimation of the dominant orientation

– extract gradient orientation
– histogram over gradient orientation
– peak in this histogram

  • Rotate patch in dominant direction
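The orientation estimate can be sketched with a magnitude-weighted histogram. The number of bins (36, i.e. 10° each) and the function name are assumptions, and the final step of rotating the patch, which requires interpolation, is left out.

```python
import numpy as np

def dominant_orientation(patch, n_bins=36):
    """Centre (in radians) of the strongest bin of the gradient-orientation
    histogram, with each pixel weighted by its gradient magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx) % (2 * np.pi)
    hist, edges = np.histogram(ori, bins=n_bins, range=(0, 2 * np.pi), weights=mag)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])

# Synthetic ramp whose gradient points at 65 degrees from the x axis.
theta_true = np.deg2rad(65.0)
ys, xs = np.mgrid[0:15, 0:15]
ramp = np.cos(theta_true) * xs + np.sin(theta_true) * ys

theta_est = dominant_orientation(ramp)
print(np.degrees(theta_est))
```

The estimate is quantised to the bin width; a finer estimate would need interpolation around the histogram peak.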
slide-66
SLIDE 66

Local descriptors – illumination change

  • Robustness to illumination changes in case of an affine transformation

$$I_1(\mathbf{x}) = a I_2(\mathbf{x}) + b$$
slide-67
SLIDE 67

Local descriptors – illumination change

  • Robustness to illumination changes in case of an affine transformation

$$I_1(\mathbf{x}) = a I_2(\mathbf{x}) + b$$

  • Normalization of derivatives with gradient magnitude

$$\frac{L_{xx} + L_{yy}}{\sqrt{L_x L_x + L_y L_y}}$$

slide-68
SLIDE 68

Local descriptors – illumination change

  • Robustness to illumination changes in case of an affine transformation

$$I_1(\mathbf{x}) = a I_2(\mathbf{x}) + b$$

  • Normalization of derivatives with gradient magnitude

$$\frac{L_{xx} + L_{yy}}{\sqrt{L_x L_x + L_y L_y}}$$

  • Normalization of the image patch with mean and variance
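Why dividing by the gradient magnitude helps can be seen in a few lines. The superscripts (1) and (2) below are notation introduced here for the derivatives computed on $I_1$ and $I_2$, and the gain $a$ is assumed positive.

```latex
\begin{aligned}
I_1 &= a I_2 + b
  \;\Rightarrow\; L^{(1)}_x = a L^{(2)}_x, \quad L^{(1)}_y = a L^{(2)}_y, \quad
  L^{(1)}_{xx} = a L^{(2)}_{xx}, \quad L^{(1)}_{yy} = a L^{(2)}_{yy} \\[4pt]
\frac{L^{(1)}_{xx} + L^{(1)}_{yy}}{\sqrt{L^{(1)}_x L^{(1)}_x + L^{(1)}_y L^{(1)}_y}}
  &= \frac{a \left( L^{(2)}_{xx} + L^{(2)}_{yy} \right)}{a \sqrt{L^{(2)}_x L^{(2)}_x + L^{(2)}_y L^{(2)}_y}}
  = \frac{L^{(2)}_{xx} + L^{(2)}_{yy}}{\sqrt{L^{(2)}_x L^{(2)}_x + L^{(2)}_y L^{(2)}_y}}
\end{aligned}
```

The offset b vanishes because the Gaussian derivative kernels integrate to zero, and the gain a cancels in the ratio. Normalizing the patch with its mean and standard deviation, $(I - m)/\sigma$, achieves the same invariance directly on the grey values.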

slide-69
SLIDE 69

Invariance to scale changes

  • Scale change between two images
  • Scale factor s can be eliminated
  • Support region for calculation!!

– In case of a convolution with Gaussian derivatives defined by $\sigma$

$$I(x, y) * G(\sigma) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x', y', \sigma) \, I(x - x', y - y') \, dx' \, dy'$$

$$G(x, y, \sigma) = \frac{1}{2 \pi \sigma^2} \exp\left( -\frac{x^2 + y^2}{2 \sigma^2} \right)$$