SIFT 16-385 Computer Vision (Kris Kitani) Carnegie Mellon - - PowerPoint PPT Presentation



slide-1
SLIDE 1

SIFT

16-385 Computer Vision (Kris Kitani)

Carnegie Mellon University

slide-2
SLIDE 2

SIFT

(Scale-Invariant Feature Transform)

SIFT describes both a detector and a descriptor

  • 1. Multi-scale extrema detection
  • 2. Keypoint localization
  • 3. Orientation assignment
  • 4. Keypoint descriptor
slide-3
SLIDE 3
  • 1. Multi-scale extrema detection

[Figure: Gaussian images and their Difference of Gaussian (DoG), shown for the first octave and the second octave]
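
A minimal sketch of how one octave of Gaussian and DoG images could be built, using only the Python standard library. The function names, the starting scale `sigma0 = 1.6`, the scale step `k`, and the number of levels are illustrative assumptions, not values taken from the slide. Each level is blurred directly from the input for simplicity, and the next octave is taken to start from a half-resolution image.

```python
import math

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at 3 sigma and normalised to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(img[0])
    return [
        [sum(k * row[min(max(x + i - radius, 0), width - 1)]
             for i, k in enumerate(kernel))
         for x in range(width)]
        for row in img
    ]

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))

def dog_octave(img, sigma0=1.6, k=2 ** 0.5, levels=4):
    """Return the Gaussian images of one octave and their adjacent differences.

    sigma0, k and levels are assumed values chosen for illustration.
    """
    gaussians = [gaussian_blur(img, sigma0 * k ** i) for i in range(levels)]
    dogs = [
        [[b - a for a, b in zip(row_a, row_b)] for row_a, row_b in zip(g0, g1)]
        for g0, g1 in zip(gaussians, gaussians[1:])
    ]
    return gaussians, dogs

def downsample(img):
    """Keep every second pixel in each direction to start the next octave."""
    return [row[::2] for row in img[::2]]
```

On a constant image every DoG level is zero up to rounding error, which is a quick sanity check that the kernels are normalised.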

slide-4
SLIDE 4

[Figure: Gaussian and Laplacian]

slide-5
SLIDE 5

Scale-space extrema

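Selected if larger (or smaller) than all 26 neighbors: 8 in the same DoG image, plus 9 in the scale above and 9 in the scale below

[Figure: Difference of Gaussian (DoG) stack; axis: scale of Gaussian variance]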
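
The 26-neighbor test can be sketched as follows, assuming the DoG images of one octave are stored as a list `dogs[s][y][x]` of equally sized 2-D lists (a hypothetical layout chosen for this example). The slide mentions only maxima; the sketch also accepts minima, as in Lowe's original formulation.

```python
def is_scale_space_extremum(dogs, s, y, x):
    """True if dogs[s][y][x] is strictly larger, or strictly smaller,
    than all 26 neighbours in the 3 x 3 x 3 block around it."""
    center = dogs[s][y][x]
    neighbours = [
        dogs[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return (all(center > v for v in neighbours)
            or all(center < v for v in neighbours))

def find_extrema(dogs):
    """Scan every interior sample (the first and last scale and the
    image border are skipped because they lack a full neighbourhood)."""
    found = []
    for s in range(1, len(dogs) - 1):
        for y in range(1, len(dogs[s]) - 1):
            for x in range(1, len(dogs[s][0]) - 1):
                if is_scale_space_extremum(dogs, s, y, x):
                    found.append((s, y, x))
    return found
```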

slide-6
SLIDE 6
  • 2. Keypoint localization

x = {x, y, σ}

2nd-order Taylor series approximation of the DoG scale-space

Take the derivative and solve for extrema

Additional tests to retain only strong features
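
The equations for this step did not survive in the slide text. As a sketch, the standard form from Lowe's SIFT paper is given below, where D is the DoG function evaluated at the detected sample point and x = (x, y, σ)ᵀ is treated as the offset from that point (an assumption about the notation used on the slide).

```latex
% 2nd-order Taylor expansion of the DoG function around the sample point
D(\mathbf{x}) \approx D
  + \frac{\partial D}{\partial \mathbf{x}}^{\top} \mathbf{x}
  + \frac{1}{2}\, \mathbf{x}^{\top}
    \frac{\partial^{2} D}{\partial \mathbf{x}^{2}}\, \mathbf{x}

% Setting the derivative with respect to x to zero gives the refined offset
\hat{\mathbf{x}} = -\left( \frac{\partial^{2} D}{\partial \mathbf{x}^{2}} \right)^{-1}
  \frac{\partial D}{\partial \mathbf{x}}
```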

slide-7
SLIDE 7
  • 3. Orientation assignment

For a keypoint, L is the Gaussian-smoothed image with the closest scale.

[Equations: x-derivative and y-derivative of L]

Detection process returns

{x, y, σ, θ}

location (x, y), scale (σ), orientation (θ)
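
A minimal sketch of the orientation step, assuming L is a 2-D list indexed as `L[y][x]`. The x- and y-derivatives are taken as pixel differences, and the gradient magnitude and angle follow from them. The 36-bin, magnitude-weighted histogram and the window radius are assumptions borrowed from Lowe's paper rather than from the slide, and the Gaussian weighting of the window is omitted for brevity.

```python
import math

def gradient_magnitude_orientation(L, x, y):
    """Gradient of the smoothed image L at pixel (x, y)."""
    dx = L[y][x + 1] - L[y][x - 1]   # x-derivative (pixel difference)
    dy = L[y + 1][x] - L[y - 1][x]   # y-derivative (pixel difference)
    return math.hypot(dx, dy), math.atan2(dy, dx)

def dominant_orientation(L, cx, cy, radius=3, bins=36):
    """Return the centre angle (radians) of the strongest histogram bin
    in a (2 * radius + 1)-pixel square window around (cx, cy)."""
    hist = [0.0] * bins
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            m, theta = gradient_magnitude_orientation(L, x, y)
            b = int((theta % (2 * math.pi)) / (2 * math.pi) * bins) % bins
            hist[b] += m             # each pixel votes with its magnitude
    peak = max(range(bins), key=hist.__getitem__)
    return (peak + 0.5) * 2 * math.pi / bins
```

The caller is responsible for keeping the window at least one pixel away from the image border, since the differences read one pixel beyond it.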

slide-8
SLIDE 8
  • 4. Keypoint descriptor

Image Gradients

(4 x 4 pixels per cell, 4 x 4 cells)

SIFT descriptor

(16 cells x 8 directions = 128 dims)

Gaussian weighting

(sigma = half width)
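
The layout above (4 x 4 cells of 4 x 4 pixels, 8 directions per cell, Gaussian weighting with sigma equal to half the patch width) can be sketched as below. This is a simplified illustration, not the full SIFT descriptor: it assumes the 16 x 16 gradient patch (`dx`, `dy`) has already been sampled and rotated to the keypoint orientation, it omits the interpolation between neighboring bins, and the final unit-length normalisation is an assumption taken from Lowe's paper.

```python
import math

def sift_like_descriptor(dx, dy, cell=4, grid=4, bins=8):
    """Build a grid * grid * bins vector from a 16 x 16 gradient patch.

    dx, dy: 2-D lists of x- and y-derivatives, indexed as [y][x].
    """
    width = cell * grid                   # 16 pixels across the patch
    sigma = width / 2.0                   # Gaussian weighting, sigma = half width
    center = (width - 1) / 2.0
    desc = [0.0] * (grid * grid * bins)   # 16 cells x 8 directions = 128 dims
    for y in range(width):
        for x in range(width):
            mag = math.hypot(dx[y][x], dy[y][x])
            theta = math.atan2(dy[y][x], dx[y][x]) % (2 * math.pi)
            weight = math.exp(-((x - center) ** 2 + (y - center) ** 2)
                              / (2 * sigma ** 2))
            b = int(theta / (2 * math.pi) * bins) % bins
            cell_index = (y // cell) * grid + (x // cell)
            desc[cell_index * bins + b] += weight * mag
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc
```

With a patch whose gradient points along +x everywhere, only the first direction bin of each cell receives votes, and cells near the patch center get larger values because of the Gaussian weighting.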

slide-9
SLIDE 9
  • Raw pixels
  • Sampled
  • Locally orderless
  • Global histogram