Interest point detection | Nicolas ROUGON | ARTEMIS Department



SLIDE 1

Institut Mines-Télécom

Interest point detection

Nicolas ROUGON

ARTEMIS Department

Nicolas.Rougon@telecom-sudparis.eu

High Tech Imaging

IMA 4509 | Visual Content Analysis

SLIDE 2

Problem statement

IMA 4509 - Nicolas ROUGON

■ We hereafter review methods for extracting local geometric features of interest in gray-level images, usable in a variety of image matching problems

  • Image registration
    ► image stitching | augmented reality
  • Image retrieval & object recognition/categorization
    ► image & video indexing
  • 3D scene/object reconstruction
    ► vision-based 3D photogrammetry
  • Tracking & navigation
    ► simultaneous localization and mapping (SLAM)

SLIDE 3

Problem statement

■ Motivation

Matching techniques using local features of interest are significantly more robust than approaches assessing similarity between image (sub)domains to large variations of scene geometry, including

  • strong viewpoint change
  • partial occlusion
  • object deformation

SLIDE 4

Problem statement

■ Requirements

Relevant features of interest should be distinctive, and satisfy properties ensuring stable and efficient detection / matching

  • Structural properties
    − generic
    − sparse ► compactness ► computational efficiency
    − numerous ► robustness
    − uniformly distributed ► robustness to occlusions | clutter | cropping

SLIDE 5

Problem statement

■ Requirements

  • Invariance properties ► repeatability
    − contrast transforms: monotonic luminance transforms
      ◄ sensor photometric calibration | scene lighting
    − spatial transforms: isometries | scalings | affine transforms
      ◄ sensor geometric calibration | viewpoint

SLIDE 6

Problem statement

■ Requirements

  • Robustness properties ► repeatability ► accuracy
    − sampling & quantization ◄ digital image acquisition | coding scheme
    − noise ◄ sensor model

SLIDE 7

Problem statement

■ Candidate features

SLIDE 8

Problem statement

■ Candidate features

Edges are not eligible as features of interest

  • Generic, sparse, uniformly distributed
  • Reasonably invariant to contrast changes
  • Not distinctive
    ► matching ambiguity along edge tangent
  • Not invariant to spatial transforms

[Figure: level lines Lsource = c and Ltarget = c with edge tangent t and normal n; candidate correspondences along the tangent are ambiguous]

SLIDE 9

Problem statement

■ Candidate features

Corners provide relevant features of interest

  • Distinctive
  • Generic, sparse, numerous, uniformly distributed
  • Invariant to contrast changes
  • Invariant to spatial transforms except scalings

SLIDE 10

Problem statement

■ Interest point matching

  • Detection
    − Extract a set of distinctive & repeatable interest points
    − Define an invariant interest patch around each keypoint
  • Description
    − Normalize & transform patches into invariant local coordinates
    − Compute a patch local descriptor
  • Matching
    − Match local descriptors based on some similarity metrics

[Figure: keypoints x1, x2, x3 in views #1 and #2, with d-dimensional descriptors fi (view #1) and fj (view #2) compared through a distance d(fi, fj)]
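The matching step above can be sketched as a nearest-neighbor search between descriptor sets. This is a minimal illustration only: the Euclidean distance, the function name `match_descriptors` and the toy descriptors are assumptions, since the slide leaves the similarity metric open.

```python
import math

def match_descriptors(desc1, desc2):
    """Match each descriptor of view #1 to its nearest neighbor in view #2.

    Assumption: similarity is measured by the Euclidean distance d(f_i, f_j).
    Returns a list of (i, j, distance) triplets.
    """
    matches = []
    for i, f1 in enumerate(desc1):
        # distance from descriptor i of view #1 to every descriptor of view #2
        dists = [math.dist(f1, f2) for f2 in desc2]
        j = min(range(len(desc2)), key=dists.__getitem__)
        matches.append((i, j, dists[j]))
    return matches

# Toy 3-dimensional descriptors (hypothetical values)
desc_view1 = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
desc_view2 = [(0.9, 0.1, 0.0), (0.1, 0.9, 0.1), (0.0, 0.0, 1.0)]
matches = match_descriptors(desc_view1, desc_view2)
```

In practice the raw nearest neighbor is usually filtered further (distance threshold, mutual consistency), which this sketch leaves out.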

SLIDE 11

Example applications

■ Image stitching

[Figure: View #1 | View #2]

SLIDE 12

Example applications

■ Image stitching

[Figure: Corners #1 | Corners #2]

► Corners capture the geometry of textured shapes

SLIDE 13

Example applications

■ Image stitching

[Figure: Corner matching]

SLIDE 14

Example applications

■ Image stitching

[Figure: View stitching]

SLIDE 15

Example applications

■ Video Tracking

Strong viewpoint changes | Partial occlusions

SLIDE 16

Example applications

■ Video Tracking

Strong viewpoint changes | Partial occlusions | Object deformations

SLIDE 17

Example applications

■ Multi-view 3D scene reconstruction

  • Feature extraction ► corner points
  • Feature matching ► motion vectors
  • Camera + sparse depth estimation ► 3D point cloud
  • Surface reconstruction + texturing ► 3D textured mesh

SLIDE 18

Problem statement

■ Requirements

Expected performances of relevant interest point detectors

  • Good detection ◄ generic framework for performance assessment

SLIDE 19

Performance assessment

■ Confusion matrix

                            Ground Truth (X)
                            True                               False
  Detection (Y)  Positive   True Positives                     False Positives (Type I error)
                 Negative   False Negatives (Type II error)    True Negatives

  • True Positives (TP): correct detections
  • True Negatives (TN): correct rejections
  • False Positives (FP): wrong detections (false alarm | Type I error)
  • False Negatives (FN): wrong rejections (miss | Type II error)

Success: Y = X | Failure: Y ≠ X

SLIDE 20

Problem statement

■ Requirements

Expected performances of relevant interest point detectors

  • Good detection ◄ dedicated error metrics
    − few failures
      − few false positives
      − few false negatives

SLIDE 21

Performance metrics

■ Recall | Sensitivity | True Positive Rate

  • Probability of relevant samples to be detected ► P[Y=1|X=1]

    recall = TP / (TP + FN)

  • recall ↗ 1 when FN ↘ 0
    ► assessment of false negatives
    ► false positives not addressed

SLIDE 22

Performance metrics

■ Precision | Positive Predictive Value

  • Probability of detections to be relevant ► P[X=1|Y=1]

    precision = TP / (TP + FP)

  • precision ↗ 1 when FP ↘ 0
    ► assessment of false positives
    ► false negatives not addressed

SLIDE 23

Performance metrics

■ Specificity | True Negative Rate

  • Probability of irrelevant samples to be rejected ► P[Y=0|X=0]

    specificity = TN / (TN + FP)

  • specificity ↗ 1 when FP ↘ 0
    ► assessment of false positives

SLIDE 24

Performance metrics

■ Negative Predictive Value

  • Probability of rejections to be irrelevant ► P[X=0|Y=0]

    NPV = TN / (TN + FN)

  • NPV ↗ 1 when FN ↘ 0
    ► assessment of false negatives

SLIDE 25

Performance metrics

■ Accuracy

  • Probability of correct decision (detection/rejection) ► P[Y=X]

    ACC = (TP + TN) / (TP + TN + FP + FN)

  • ACC ↗ 1 when (FP, FN) ↘ 0
    ► joint assessment of false positives & false negatives, from detections & rejections

SLIDE 26

Performance metrics

■ F-score

  • Harmonic mean of precision and recall

    F = 2 · precision · recall / (precision + recall) = 2 TP / (2 TP + FP + FN)

  • F ↗ 1 when (FP, FN) ↘ 0
    ► joint assessment of false positives & false negatives, focusing on detections
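The six metrics above follow directly from the four confusion-matrix counts. A minimal sketch is given below; the function name `detection_metrics` and the example counts are illustrative assumptions, and no guard against empty denominators is included.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute the slide's performance metrics from confusion-matrix counts."""
    return {
        "recall": tp / (tp + fn),            # P[Y=1|X=1]
        "precision": tp / (tp + fp),         # P[X=1|Y=1]
        "specificity": tn / (tn + fp),       # P[Y=0|X=0]
        "npv": tn / (tn + fn),               # P[X=0|Y=0]
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "f_score": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical counts: 90 true keypoints, 100 detections, 1000 samples overall
m = detection_metrics(tp=80, fp=20, tn=890, fn=10)
```

With many more irrelevant than relevant samples, as is typical for keypoints, accuracy stays high even for a mediocre detector, which is why precision, recall and the F-score are usually the metrics of interest.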

SLIDE 27

Problem statement

■ Requirements

Expected performances of relevant interest point detectors

  • Good detection
    − few false positives = high precision
    − few false negatives = high recall
  • Robustness against noise ► repeatability
  • Good localization ► accuracy
  • Computational efficiency

SLIDE 28

Differential corner detection

SLIDE 29

Corner detection

■ Basic idea

At corner points, comparing an image patch to its neighbors shows dissimilarity in all directions

  • Low-texture region ► similar in all directions
  • Edge ► similar along edge direction
  • Corner ► dissimilar in all directions

SLIDE 30

Corner detection

■ Basic idea

At corner points, comparing an image patch to its neighbors shows dissimilarity in all directions

  • This requires defining a similarity metric between image patches, and searching for its local maxima over a space of admissible shifts u
  • A natural choice is (weighted) autocorrelation
    − Kρ : kernel with extension ρ ► patch support
      unit | Gaussian
    − Opting for a unit kernel and 8-connected shifts yields the (early) Moravec detector ► not isotropic

[Figure: patch of extension ρ (kernel Kρ) centered at x]

SLIDE 31

Corner detection

■ Quadratic approximation of the autocorrelation metric

  • 1st-order Taylor expansion:
  • The matrix J is known as the structure tensor

► Quadratic approximation
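The equations of this slide were lost in extraction. The block below is a standard reconstruction rather than the author's exact notation: it assumes a sum-of-squared-differences form for the weighted autocorrelation, writes L for the image, Kρ for the patch kernel, u for the shift, and uses the hypothetical name E_ρ for the metric.

```latex
\begin{aligned}
E_\rho(x; u) &= \sum_{y} K_\rho(y - x)\,\bigl[L(y + u) - L(y)\bigr]^2
  && \text{(weighted autocorrelation)} \\
L(y + u) &\approx L(y) + \nabla L(y)^{\top} u
  && \text{(1st-order Taylor expansion)} \\
E_\rho(x; u) &\approx u^{\top} J_\rho(x)\, u
  && \text{(quadratic approximation)} \\
J_\rho(x) &= \sum_{y} K_\rho(y - x)\, \nabla L(y)\, \nabla L(y)^{\top}
  && \text{(structure tensor)}
\end{aligned}
```

Under this form the behavior of the metric over all shifts u is governed entirely by the 2 x 2 matrix J_ρ(x), which is why the following slides reason on its eigenvalues.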

SLIDE 32

Corner detection

■ Structure tensor

  • A robust estimate of the structure tensor is obtained using regularized image derivatives
    − Gaussian | Canny-Deriche
    − σ = local scale
    − ρ = integration scale
  • Expanded form

SLIDE 33

Corner detection

■ Structure tensor

  • Pointwise patch (ρ = 0)
    ► same information in the pointwise tensor and in the image gradient
  • The structure tensor describes average gradient properties over patch support

The information in the tensor (symmetric, positive semi-definite) is described by its eigenvectors (dmax, dmin) and eigenvalues (λmax, λmin)

    − dmax (dmin) : dominant (anti-dominant) orientation
    − λmax, λmin : directional contrast values

SLIDE 34

Corner detection

■ Structure tensor

  • Edge
    − 1 dominant direction – 1 large directional gradient
    − λmax large – λmin small

SLIDE 35

Corner detection

■ Structure tensor

  • Low-textured region
    − no dominant direction – no gradient information
    − λmax small – λmin small

SLIDE 36

Corner detection

■ Structure tensor

  • High-textured region | Corner
    − no dominant direction – large directional gradients
    − λmax large – λmin large
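The three cases above can be checked numerically with the closed-form eigen-decomposition of a symmetric 2 x 2 matrix. The sketch below is an illustration under assumptions: the tensor is taken as the plain average of g gᵀ over a list of gradient samples (unit kernel), and the function name and toy gradients are hypothetical.

```python
import math

def structure_tensor_eigen(gradients):
    """Eigenvalues and dominant orientation of J = mean of g g^T over a patch.

    `gradients` is a list of (gx, gy) samples; a unit kernel is assumed.
    Returns (lambda_max, lambda_min, theta), theta being the angle of d_max.
    """
    n = len(gradients)
    a = sum(gx * gx for gx, gy in gradients) / n   # J_xx
    b = sum(gx * gy for gx, gy in gradients) / n   # J_xy
    c = sum(gy * gy for gx, gy in gradients) / n   # J_yy
    mean = (a + c) / 2
    dev = math.hypot((a - c) / 2, b)
    theta = 0.5 * math.atan2(2 * b, a - c)         # radians
    return mean + dev, mean - dev, theta

flat = structure_tensor_eigen([(0.0, 0.0)] * 4)    # low-textured region
edge = structure_tensor_eigen([(1.0, 0.0)] * 4)    # single gradient direction
corner = structure_tensor_eigen([(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)])
```

The flat patch yields two zero eigenvalues, the edge patch one large and one zero eigenvalue, and the corner patch two equal non-zero eigenvalues.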

SLIDE 37

Corner detection

■ Structure tensor

  • At corner points, the smallest eigenvalue λmin of the structure tensor is large enough
  • This property is exploited by 2 widely-used corner detectors
    − Kanade-Lucas-Tomasi (KLT)
    − Harris-Förstner

SLIDE 38

The KLT detector

SLIDE 39

Kanade-Lucas-Tomasi (KLT) detector

■ Algorithm

Given a threshold tλmin on λmin and the size D of a square neighborhood

  • Compute image gradient
  • Initialize a point list L. For each x ∈ Ω
    − compute structure tensor using a unit (D x D) kernel
    − compute smallest eigenvalue λmin
    − if λmin > tλmin, insert x into L
  • Sort L in decreasing order of λmin ► Lsort
  • Scan Lsort from top to bottom
    − for each current point x ∈ Lsort, discard all points after x in Lsort located in the (D x D) neighborhood of x
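The steps above can be sketched with NumPy as follows. This is an illustrative sketch, not a reference implementation: derivatives are plain central differences (`numpy.gradient`) without regularization, D is assumed odd, image borders are skipped, and the function name `klt_corners` is hypothetical.

```python
import numpy as np

def klt_corners(image, t_lmin, D):
    """KLT-style detection: threshold lambda_min, then greedy D x D suppression."""
    L = np.asarray(image, dtype=float)
    gy, gx = np.gradient(L)                  # derivatives along rows, then columns
    h = D // 2                               # half-size of the (assumed odd) window
    H, W = L.shape
    candidates = []
    for y in range(h, H - h):
        for x in range(h, W - h):
            wx = gx[y - h:y + h + 1, x - h:x + h + 1]
            wy = gy[y - h:y + h + 1, x - h:x + h + 1]
            # structure tensor with a unit kernel over the window
            J = np.array([[np.sum(wx * wx), np.sum(wx * wy)],
                          [np.sum(wx * wy), np.sum(wy * wy)]])
            lam_min = np.linalg.eigvalsh(J)[0]   # eigenvalues come in ascending order
            if lam_min > t_lmin:
                candidates.append((lam_min, y, x))
    candidates.sort(reverse=True)            # decreasing lambda_min
    kept = []
    for lam, y, x in candidates:
        # keep a point only if no stronger kept point lies in its D x D neighborhood
        if all(max(abs(y - ky), abs(x - kx)) > h for _, ky, kx in kept):
            kept.append((lam, y, x))
    return [(y, x) for _, y, x in kept]

# Synthetic test image: bright square on a dark background
img = np.zeros((20, 20))
img[6:14, 6:14] = 1.0
corners = klt_corners(img, t_lmin=1.0, D=5)
```

On this synthetic square the sketch returns one point per corner, each shifted one pixel toward the inside of the square, a small instance of the delocalization induced by window averaging.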

SLIDE 40

Kanade-Lucas-Tomasi (KLT) detector

■ Hyperparameters

  • The threshold tλmin controls the sensitivity of the detector
    − tλmin can be estimated from the histogram of λmin, which usually has an obvious valley near 0
    − however, this valley does not always exist
  • The kernel / neighborhood size D is estimated empirically
    − in most cases: D ∈ [2,10]
    − large values of D induce delocalization artifacts and neighboring corner fusion

SLIDE 41

The Harris-Förstner detector

SLIDE 42

Harris-Förstner detector

■ Principles

  • The Harris-Förstner detector makes use of both eigenvalues of the structure tensor via their ratio
  • To avoid computing (λmax, λmin) explicitly, similarity invariants of the structure tensor (trace and determinant) are used

SLIDE 43

Harris-Förstner detector

■ Principles

  • The latter are combined into a corner index r
    − setting a threshold on r induces a threshold on α
  • This yields the following corner metric R
    − hyperparameter α ∈ [0, 0.25]
    − large at corner points
    − small in low-texture regions
    − negative at edge points
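The formulas for the corner index and the corner metric did not survive extraction. The block below gives the usual Harris form, which is consistent with the range α ∈ [0, 0.25] and with the sign behavior listed above. It is a reconstruction under that assumption, not a transcription of the slide.

```latex
\begin{aligned}
\operatorname{tr} J &= \lambda_{\max} + \lambda_{\min}, \qquad
\det J = \lambda_{\max}\,\lambda_{\min} \\
r &= \frac{\det J}{(\operatorname{tr} J)^{2}} \in \left[0, \tfrac{1}{4}\right], \qquad
r > \alpha \iff \det J - \alpha\,(\operatorname{tr} J)^{2} > 0 \\
R &= \det J - \alpha\,(\operatorname{tr} J)^{2}
\end{aligned}
```

With this form, an edge (λmin ≈ 0) gives R ≈ −α λmax² < 0, a low-texture region gives R ≈ 0, and a corner with two comparable eigenvalues gives R > 0 as long as α < 1/4.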

SLIDE 44

Harris-Förstner detector

■ Algorithm

Given a value of α and a threshold tR on R

  • Compute image gradient
  • For each x ∈ Ω
    − compute structure tensor using a Gaussian kernel Gρ (standard choice: ρ = 2σ)
    − compute corner metric R(x)
  • Threshold corner map R above tR and retain only local maxima*
    * via Non-Maximal Suppression in the 8-connected neighborhood
  • Filter out weak** corners in the ρ-neighborhood of strong** corners, in a way similar to KLT
    ** w.r.t. the corner metric R
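A compact NumPy sketch of these steps follows. Several choices are assumptions rather than the slide's specification: the corner metric is taken as R = det J − α (tr J)², Gaussian filtering is done with a truncated separable kernel and edge replication, and the final filtering of weak corners near strong ones is omitted. All function names are hypothetical.

```python
import numpy as np

def gaussian_smooth(img, s):
    """Separable Gaussian filter of standard deviation s, with edge replication."""
    r = int(3 * s + 0.5)                         # kernel truncated at about 3 s
    t = np.arange(-r, r + 1)
    k = np.exp(-t**2 / (2.0 * s**2))
    k /= k.sum()
    padded = np.pad(img, r, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def harris_response(image, sigma=1.0, alpha=0.05):
    """Corner map R, assuming R = det(J) - alpha * trace(J)^2 and rho = 2 sigma."""
    L = gaussian_smooth(np.asarray(image, dtype=float), sigma)   # local scale
    gy, gx = np.gradient(L)
    rho = 2.0 * sigma                                            # integration scale
    jxx = gaussian_smooth(gx * gx, rho)
    jxy = gaussian_smooth(gx * gy, rho)
    jyy = gaussian_smooth(gy * gy, rho)
    return jxx * jyy - jxy**2 - alpha * (jxx + jyy)**2

def harris_corners(image, t_R, sigma=1.0, alpha=0.05):
    """Threshold R above t_R and keep local maxima in the 8-connected neighborhood."""
    R = harris_response(image, sigma, alpha)
    H, W = R.shape
    return [(y, x) for y in range(1, H - 1) for x in range(1, W - 1)
            if R[y, x] > t_R and R[y, x] >= R[y - 1:y + 2, x - 1:x + 2].max()]

# Synthetic test image: bright square on a dark background
img = np.zeros((48, 48))
img[12:36, 12:36] = 1.0
R = harris_response(img)
points = harris_corners(img, t_R=1e-6)
```

On this image the map is positive around the four corners of the square, negative along its sides and zero far from it, which matches the qualitative behavior expected from the corner metric.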

SLIDE 45

Harris-Förstner detector

■ Hyperparameters

  • The parameter α controls the sensitivity of the detector and is tuned empirically
    − sensitivity ↘ when α ↗
    − in most cases: α ∈ [0.04, 0.06]
  • The threshold tR is tuned empirically
    − usually, tR is set close to 0

[Figure: detections for α = 0.05, 0.10, 0.20, 0.22, 0.24]

SLIDE 46

Harris-Förstner detector

[Figure: Original]

SLIDE 47

Harris-Förstner detector

[Figure: Corner metric R]

SLIDE 48

Harris-Förstner detector

[Figure: Thresholded corner metric R > tR]

SLIDE 49

Harris-Förstner detector

[Figure: Corner metric local maxima]

SLIDE 50

Harris-Förstner detector

[Figure: Harris points]

SLIDE 51

KLT vs. Harris-Förstner detector

■ Properties

  • Isometry-invariance
    Inherited from eigenvalue properties
  • Insensitive to affine intensity transforms φ(L) = aL + b
    Local maxima of λmin / R are preserved

[Figure: profile of the corner metric R(x) before and after the intensity transform]

SLIDE 52

KLT vs. Harris-Förstner detector

■ Limitations

  • Not invariant to scaling
    Computing J at fixed scale ρ in both views
    ► multiple points detected as edges
    ► single point detected as a corner

[Figure: fixed-size neighborhood ρ in view #1 and view #2]

SLIDE 53

KLT vs. Harris-Förstner detector

■ Performance comparison

[Figure: KLT | Harris]

SLIDE 54

KLT vs. Harris-Förstner detector

■ Performance comparison

[Figure: KLT | Harris]

SLIDE 55

KLT vs. Harris-Förstner detector

■ KLT detector

  • Output is usually closer to human perception of corners
  • Often used for motion tracking
    ► widespread KLT Tracker
  • Mostly used in the US

■ Harris-Förstner detector

  • Good repeatability under varying rotation and lighting
  • Often used for 3D scene reconstruction and image retrieval
  • Mostly used in Europe

SLIDE 56

The Hessian detector

SLIDE 57

The Hessian detector

■ Principles

  • Like the Harris / KLT detectors, the Hessian detector searches for points with large image derivatives in 2 orthogonal directions
  • Instead of the image gradient encoded in the structure tensor, the Hessian detector deals with the 2nd-order image differential, known as the Hessian tensor
  • A robust estimate of the Hessian tensor is obtained using Gaussian image derivatives

SLIDE 58

The Hessian detector

■ Principles

  • Extremal directions / values of 2nd-order directional image derivatives are found via eigen-decomposition of the Hessian tensor
  • To avoid computing eigenvalues explicitly, similarity invariants of the Hessian tensor are used
    − its trace is the Laplacian-of-Gaussian (LoG) filter
    − the determinant of the Hessian (DoH) provides a 2nd-order corner metric
      − large at corner points | in high-texture regions
      − small at edge points | in low-texture regions

SLIDE 59

The Hessian detector

■ Algorithm

Given a threshold tdet on the DoH

  • Compute image Hessian over Ω
  • Compute corner metric over Ω ► DoH corner map
  • Threshold corner map above tdet and retain only local maxima*
    * via Non-Maximal Suppression in the 8-connected pixel neighborhood
  • Filter out weak** corners in the ρ-neighborhood of strong** corners, in a way similar to KLT
    ** w.r.t. the DoH corner metric
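The first two steps can be sketched as below. The sketch assumes its input has already been smoothed by a Gaussian at the chosen scale, so that repeated central differences (`numpy.gradient`) stand in for Gaussian image derivatives. The function name `doh_map` and the two synthetic images are illustrative assumptions.

```python
import numpy as np

def doh_map(L_sigma):
    """Determinant of the Hessian of an already Gaussian-smoothed image."""
    Ly, Lx = np.gradient(L_sigma)        # 1st-order derivatives (rows, columns)
    Lyy, Lyx = np.gradient(Ly)           # derivatives of Ly along rows, columns
    Lxy, Lxx = np.gradient(Lx)           # derivatives of Lx along rows, columns
    return Lxx * Lyy - Lxy * Lyx

yy, xx = np.mgrid[0:41, 0:41]
blob = np.exp(-((xx - 20.0) ** 2 + (yy - 20.0) ** 2) / (2 * 4.0 ** 2))  # smooth blob
edge = np.tanh((xx - 20.0) / 3.0)                                       # smooth vertical edge

doh_blob = doh_map(blob)
doh_edge = doh_map(edge)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(doh_blob), doh_blob.shape))
```

The DoH map of the blob peaks at the blob center, while it vanishes everywhere on the straight edge, where one of the two principal curvatures is zero. Thresholding and non-maximal suppression would then follow the remaining steps of the algorithm.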

SLIDE 60

The Hessian detector

■ Performances

  • Maximal DoH responses occur on corners & high-texture regions
  • Compared to the Harris-Förstner detector, the Hessian detector
    − provides many additional responses on high-texture regions ► less specific to corners ► denser object cover
    − delivers corners that are less precisely located ► lower localization accuracy
    Both properties result from the use of higher-order derivatives and pointwise estimation (no patch-based integration)

SLIDE 61

The Hessian detector

■ Properties

  • Isometry-invariance
  • Insensitive to affine intensity transforms
    Local maxima of the DoH are preserved
  • Not invariant to scaling
    Pointwise estimation

SLIDE 62

The Harris-Laplace & SIFT detectors

SLIDE 63

Scale-invariant interest point detection

■ Key ideas

  • When zooming occurs, computing J at fixed scale ρ in both views leads to matching ambiguities
    ► Harris/KLT detector repeatability degrades with scale changes

[Figure: fixed-size neighborhood ρ in view #1 and view #2]

SLIDE 64

Scale-invariant interest point detection

■ Key ideas

  • This issue is fixed by computing J at view-specific scales ρ1(x), ρ2(x’) so that corresponding keypoint neighborhoods look the same

[Figure: neighborhood ρ1(x) in view #1 and ρ2(x’) in view #2]

Issue: How to determine keypoint characteristic scale independently in each view?

► scale-selection mechanism

SLIDE 65

Scale-invariant interest point detection

■ Automatic scale-selection

  • Build a multiscale image representation (L(x,σ))σ
    − Gaussian scale-space ► Gaussian image derivatives
    − exponential scale sampling σn = ξ^n σ0
      σ0 = finest scale | ξ = scale factor (default: 1.4)
      ► uniform information variation between scale levels

SLIDE 66

Scale-invariant interest point detection

■ Automatic scale-selection

  • Design a local image signature function f(x,σ) operating on pixel neighborhood at scale σ
    ► f involves scale-normalized coordinates
    − scale-invariance: f(x,σ) and f(x’,σ) are similar* for corresponding keypoints
      * up to a scaling factor

[Figure: f(x,σ) in view #1 and f(x’,σ) in view #2 plotted against σ, related by a scaling along the σ axis]

SLIDE 67

Scale-invariant interest point detection

■ Automatic scale-selection

  • Keypoint neighborhood characteristic size can be estimated by searching for a local extremum over scale of f(x,σ)
    − scale-invariance: keypoint characteristic scale is invariant to image scaling
    ► The (local) scaling factor between views is given by the ratio of the characteristic scales σ1(x) and σ2(x’)

[Figure: f(x,σ) in view #1 and f(x’,σ) in view #2, with extrema at σ1(x) and σ2(x’)]

SLIDE 68

Scale-invariant interest point detection

■ Automatic scale-selection

  • Example: image signature function at corresponding keypoints

[Figure, built up over slides 68 to 73: f(x,σ) in view #1 and f(x’,σ) in view #2 plotted against σ, with extrema at the characteristic scales σ1(x) and σ2(x’)]

SLIDE 74

Scale-invariant interest point detection

■ Automatic scale-selection

Admissibility conditions for image signature functions f

  • (f(x,σ))σ has a scale-space structure
  • (f(x,σ))σ is isometry-invariant and scale-invariant
  • f(x,σ) has a single stable sharp peak over scale
    [Figure: three profiles of f(x,σ) over σ: good | bad | bad]
  • f responds to luminance contrast (= image structure)
    ► scale-normalized Gaussian derivatives

SLIDE 75

Scale-invariant interest point detection

■ Scale-selection kernels

  • Scale-normalized Laplacian-of-Gaussian (LoG)
    − Laplacian-of-Gaussian w.r.t. scale-normalized coordinates
    − scale extrema of the scale-normalized LoG occur for image blobs centered at x with radius r = σ ► neighborhood characteristic scale

[Figure: blob of radius r and LoG kernel of scale σ]
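The scale-selection mechanism can be illustrated on a case with a closed form. The sketch below assumes an ideal Gaussian blob of standard deviation t0 as image structure (an assumption made for tractability, not the slide's blob model): observed at scale σ the blob has variance t0² + σ², so the magnitude of the scale-normalized LoG at its center is σ² / (π (t0² + σ²)²), which peaks at σ = t0. Scales are sampled exponentially with ξ = 1.4; function names are hypothetical.

```python
import math

def normalized_log_at_center(sigma, t0):
    """|sigma^2 * Laplacian| at the center of a unit-mass Gaussian blob of std t0,
    observed at scale sigma (closed form for this idealized blob)."""
    return sigma**2 / (math.pi * (t0**2 + sigma**2) ** 2)

def characteristic_scale(t0, sigma0=1.0, xi=1.4, levels=12):
    """Scale maximizing the signature over the sampling sigma_n = sigma0 * xi^n."""
    scales = [sigma0 * xi**n for n in range(levels)]
    return max(scales, key=lambda s: normalized_log_at_center(s, t0))

# The same blob seen in two views related by a zoom factor of 1.4^2 = 1.96
s1 = characteristic_scale(t0=1.4**3)
s2 = characteristic_scale(t0=1.4**5)
zoom = s2 / s1
```

Each view selects its own characteristic scale independently, and the ratio of the two selected scales recovers the zoom factor between the views.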

SLIDE 76

Laplacian-of-Gaussian (LoG) detector

■ Scale-normalized Laplacian-of-Gaussian provides a detector for blob-like features

  • Scale-space extrema of the scale-normalized LoG are searched
    − jointly detects scale-invariant blob-like regions & estimates their characteristic scale
    − performed by comparing values in pixel 26-connected neighborhood in scale-space (x,σ)
  • Blob centers provide repeatable interest points

[Figure: pixel xij and its 26 neighbors across scale levels σn-1, σn, σn+1 in (x, y, σ)]
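The 26-neighborhood test can be sketched as follows for a response stack indexed as (scale, y, x). The stack layout, the strictness of the comparison, the magnitude threshold and the function name are assumptions made for illustration.

```python
import numpy as np

def scale_space_extrema(stack, threshold=0.0):
    """Strict extrema of a (scale, y, x) stack w.r.t. their 26 neighbors.

    Border voxels, which lack a full neighborhood, are skipped.
    """
    S, H, W = stack.shape
    found = []
    for n in range(1, S - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                v = stack[n, y, x]
                if abs(v) <= threshold:
                    continue                     # response too weak
                cube = stack[n - 1:n + 2, y - 1:y + 2, x - 1:x + 2].ravel()
                others = np.delete(cube, 13)     # index 13 is the center of the 3x3x3 cube
                if v > others.max() or v < others.min():
                    found.append((n, y, x))
    return found

# Toy stack with one maximum and one minimum (hypothetical values)
stack = np.zeros((5, 9, 9))
stack[2, 2, 2] = 1.0
stack[2, 6, 6] = -2.0
extrema = scale_space_extrema(stack)
```

Each returned triplet gives both the position of a blob center and the index of its characteristic scale level.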

SLIDE 77

Laplacian-of-Gaussian (LoG) detector

■ Performances

[Figures, slides 77 to 79: Original | LoG blobs]

SLIDE 80

Laplacian-of-Gaussian (LoG) detector

■ To probe further

  • T. Lindeberg | Feature detection with automatic scale selection
    International Journal of Computer Vision, 30(2):79-116, November 1998

SLIDE 81

Scale-invariant interest point detection

■ Scale-selection kernels

  • Difference-of-Gaussians (DoG)
    − accurately approximates the scale-normalized LoG for constant ξ
    − scale extrema occur for image blobs centered at x with radius r = 1.6σ

SLIDE 82

Difference-of-Gaussians (DoG) detector

■ Difference-of-Gaussians provides a detector for blob-like features

  • Interest regions are searched as scale-space extrema of the DoG
    − detection by inspecting pixel 26-connected neighborhood in scale-space (x,σ)
    − accurate localization by fitting a 3D quadric in scale-space
  • Blob centers provide repeatable interest points
    − to capture some of the surrounding structure, region characteristic scale is chosen larger than 1.6σ (usually: r = 3σ)
  • DoG interest regions are the basis of the popular Scale-Invariant Feature Transform (SIFT) image descriptor

SLIDE 83

Difference-of-Gaussians (DoG) detector

■ Difference-of-Gaussians

Efficient implementation combining Gaussian scale-space + pyramid

  • For each octave, a Gaussian scale-space is built ► DoG scale-space
    − octave sampling into N intervals ► constant scale factor ξ = 2^(1/N), so that σN = 2σ0
  • To speed up computations, a Gaussian pyramid is built from images at scale 2^n σ0 ► DoG pyramid

[Figure: octaves starting at σ0, 2σ0, 4σ0; level counts N+1 and N+3]
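The scale sampling of one octave can be sketched as below. The number of levels per octave is left as a parameter defaulting to N+3, since the extracted slide mentions both N+1 and N+3 without their context. The value σ0 = 1.6 in the example and the function name are illustrative assumptions.

```python
def octave_scales(sigma0, N, octave=0, levels=None):
    """Scales sigma0 * 2^octave * xi^n of one octave, with xi = 2^(1/N)."""
    if levels is None:
        levels = N + 3                    # assumed default, see lead-in
    xi = 2.0 ** (1.0 / N)                 # constant scale factor within the octave
    base = sigma0 * 2.0 ** octave         # first scale of this octave
    return [base * xi**n for n in range(levels)]

first = octave_scales(sigma0=1.6, N=3)             # octave starting at sigma0
second = octave_scales(sigma0=1.6, N=3, octave=1)  # octave starting at 2 sigma0
```

Level N of an octave has scale 2σ0, which is also the first scale of the next octave. This is what allows the next octave to be computed on a downsampled image instead of at full resolution.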

SLIDE 84

Scale-Invariant Feature Transform (SIFT)

■ To probe further

  • D. Lowe | Distinctive image features from scale-invariant keypoints
    International Journal of Computer Vision, 60(2):91-110, November 2004

SLIDE 85

Harris-Laplace detector

■ Principles

Combine Harris-Förstner detector specificity for corner-like structures with LoG-based scale selection mechanism

  • The scale-normalized structure tensor induces a scale-invariant Harris corner metric
  • Scale-spaces are built for the Harris metric and the LoG (default: s = 0.7)
    − at each scale, local maxima of the Harris metric higher than some threshold tR are detected ► candidate points
    − points at which the LoG simultaneously attains an extremum over scale higher than some threshold tLoG are retained

SLIDE 86

Harris-Laplace detector

■ Performances

  • Harris-Laplace interest points are highly discriminative ► increased repeatability
  • The Harris-Laplace detector returns a much smaller number of points than the LoG / DoG detectors ► reduced robustness to
    − partial occlusions (issue for object recognition)
    − scene variability (issue for object categorization)
  • A modified Harris-Laplace detector using a weaker criterion has been proposed
    − selects scale extrema of the LoG at locations for which the Harris metric also attains a maximum at any scale

SLIDE 87

Harris-Laplace detector

■ To probe further

  • K. Mikolajczyk, C. Schmid | Scale & affine invariant interest point detectors
    International Journal of Computer Vision, 60(1):63-86, October 2004
  • K. Mikolajczyk, C. Schmid | Indexing based on scale-invariant interest points
    8th IEEE International Conference on Computer Vision (ICCV’2001), Vancouver, Canada, Vol. 1, 525-531, July 2001

SLIDE 88

Harris-Laplace vs. SIFT detector

■ Harris-Laplace

Local maxima of

  • Harris metric in space
  • LoG in scale

■ SIFT

Local maxima of

  • DoG in space
  • DoG in scale

[Figure: (x, y, σ) scale-space stacks: Harris in space with LoG in scale | DoG in space and scale]

SLIDE 89

The Hessian-Laplace detector

SLIDE 90

Hessian-Laplace detector

■ Principles

Combine Hessian detector specificity for corner-like structures with LoG-based scale selection mechanism

  • The scale-normalized Hessian tensor induces a scale-invariant DoH metric
  • Scale-spaces are built for the DoH metric and the LoG
    − at each scale, local maxima of the DoH metric higher than some threshold tdet are detected ► candidate points
    − points at which the LoG simultaneously attains an extremum over scale higher than some threshold tLoG are retained

SLIDE 91

Temporary conclusion

■ Point of Interest detectors

                 fixed scale              scale-invariant
  • 1st-order    KLT | Harris-Förstner    SIFT | Harris-Laplace
  • 2nd-order    DoH                      Hessian-Laplace

■ Topics to be further addressed

  • Description
    Define / extract feature vector descriptors around interest points
  • Matching
    Estimate correspondence between descriptors in each view

► To be continued in the HTI / Multimedia indexing 3rd-year course