SLIDE 1

CS201: Computer Vision Lect 09: SIFT Descriptors

John Magee 26 September 2014

Slides Courtesy of Diane H. Theriault

SLIDE 2

Questions of the Day:

  • How can we find matching points in images?
  • How can we use matching points to recognize objects?
SLIDE 3

SIFT

  • Find repeatable, scale-invariant points in images (Tuesday)
  • Compute something about them (Today)
  • Use the thing you computed to perform matching (Today)
  • A lot of engineering decisions
  • “Distinctive Image Features from Scale-Invariant Keypoints” by David Lowe

  • Patented!
SLIDE 4

How to find the same cat?

  • Imagine that we had a library of cats
  • How could we find another picture of the same cat in the library?
  • Look for the markings?
SLIDE 5

Scale Space

  • Image convolved with Gaussians of different widths
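One octave of such a scale space can be sketched in a few lines of numpy. This is an illustrative sketch, not Lowe's implementation; the function names and the parameter choices (base sigma 1.6, factor √2, 5 levels) are ours.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    # Odd-length 1D Gaussian, normalized to sum to 1.
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(image, sigma):
    # Separable Gaussian blur: filter rows, then columns.
    k = gaussian_kernel_1d(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def scale_space(image, sigma0=1.6, k=2**0.5, levels=5):
    # One octave: the same image blurred at sigma0, k*sigma0, k^2*sigma0, ...
    return [blur(image, sigma0 * k**i) for i in range(levels)]

rng = np.random.default_rng(0)
img = rng.random((64, 64)) - 0.5   # zero-mean, so zero padding at the borders is benign
pyramid = scale_space(img)
```

Wider Gaussians wipe out more fine detail, which is why the blurred copies become progressively smoother.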
SLIDE 6

Keypoints with Image Filtering

  • Perform image filtering by convolving an image with a “filter” / “mask” / “kernel” to obtain a “result” / “response”
  • The value of the result will be positive in regions of the image that “look like” the filter
  • What would a “dot” filter look like?
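A minimal sketch of the idea: a hand-made center-surround “dot” mask (bright middle, dark ring, summing to zero) gives its largest response exactly where the image contains a dot. The mask and the naive filtering loop here are illustrative, not from any SIFT implementation.

```python
import numpy as np

def dot_filter():
    # Tiny center-surround mask: +1 in the middle, -1/8 around it (sums to 0).
    f = -np.ones((3, 3)) / 8.0
    f[1, 1] = 1.0
    return f

def filter2d(image, mask):
    # Naive same-size correlation with zero padding.
    h, w = image.shape
    mh, mw = mask.shape
    pad = np.pad(image, ((mh // 2,), (mw // 2,)))
    out = np.zeros_like(image, dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(pad[i:i + mh, j:j + mw] * mask)
    return out

img = np.zeros((9, 9))
img[4, 4] = 1.0                        # a single bright dot
resp = filter2d(img, dot_filter())
peak = np.unravel_index(np.argmax(resp), resp.shape)
```

The response peaks at (4, 4), the location of the dot, because that is where the image best matches the filter.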

SLIDE 7

Laplacian of a Gaussian

  • Sum of spatial second derivatives
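The Laplacian of a Gaussian has a closed form: with r² = x² + y², LoG(x, y) = ((r² − 2σ²)/σ⁴) · G(x, y). A sketch that builds the kernel directly from this formula (the function name is ours):

```python
import numpy as np

def log_kernel(sigma, radius):
    # Closed-form Laplacian of a 2D Gaussian, sampled on a grid.
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    r2 = x**2 + y**2
    g = np.exp(-r2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return (r2 - 2 * sigma**2) / sigma**4 * g

k = log_kernel(sigma=1.4, radius=6)
```

The kernel is negative in the middle with a positive surround (a “dot” detector for dark dots; negate it for bright dots), it is rotationally symmetric, and its entries sum to approximately zero, since the Laplacian of any smooth bump integrates to zero.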
SLIDE 8

Difference of Gaussians

  • Approximation of the Laplacian of a Gaussian
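The approximation can be checked numerically: subtracting two Gaussians with nearby widths (here σ and 1.3σ, an illustrative choice) yields a kernel whose shape correlates very strongly with the closed-form LoG.

```python
import numpy as np

def gauss2d(sigma, radius):
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

sigma, k, radius = 2.0, 1.3, 12
dog = gauss2d(k * sigma, radius) - gauss2d(sigma, radius)

# Closed-form Laplacian of a Gaussian at the same sigma, for comparison.
y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
r2 = x**2 + y**2
log = (r2 - 2 * sigma**2) / sigma**4 * np.exp(-r2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)

# Normalized correlation between the two kernels (1.0 = identical shape).
corr = np.sum(dog * log) / np.sqrt(np.sum(dog**2) * np.sum(log**2))
```

Both kernels are center-surround with a negative middle, which is why the difference of Gaussians is a cheap stand-in for the LoG filter.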

SLIDE 9

Scale-space Extrema

  • “Extremum” = local minimum or maximum
  • Check 8 neighbors at a particular scale
  • Check neighbors at the scales above and below
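The test above compares a sample against its 26 neighbors: 8 at its own scale plus 9 each at the scales above and below. A sketch, assuming `dog` is a (scales, rows, cols) stack of filter responses:

```python
import numpy as np

def is_extremum(dog, s, i, j):
    # 3x3x3 neighbourhood around (s, i, j) in the response stack.
    cube = dog[s - 1:s + 2, i - 1:i + 2, j - 1:j + 2].ravel()
    neighbours = np.delete(cube, 13)   # index 13 is the centre itself
    center = dog[s, i, j]
    return bool(center > neighbours.max() or center < neighbours.min())

dog = np.zeros((3, 7, 7))
dog[1, 3, 3] = 1.0                     # planted maximum at the middle scale
```

`is_extremum(dog, 1, 3, 3)` is true for the planted peak and false everywhere nearby.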

SLIDE 10

Scale-space Extrema

  • Find locations and scales where the response to the LoG filter is a local extremum

SLIDE 11

Removing Low Contrast Points

  • Threshold on the magnitude of the response to the LoG filter

  • Threshold empirically determined
SLIDE 12

Removing Points Along Edges

  • In 1D: the first derivative shows how the function is changing (velocity)
  • In 1D: the second derivative shows how the change is changing (acceleration)
  • In 2D: the first derivatives form a gradient vector, which has a magnitude and direction
  • In 2D: the second derivatives form a matrix, which gives information about the rate and orientation of the change in the gradient
SLIDE 13

Removing Points Along Edges

  • Hessian is a matrix of 2nd derivatives
  • Eigenvectors tell you the orientation of the curvature
  • Eigenvalues tell you the magnitude
  • Ratio of eigenvalues tells you the extent to which one orientation is dominant

(Figures: gradient of a Gaussian; Hessian of a Gaussian)
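The eigenvalue ratio can be tested without computing eigenvalues at all, using the trace and determinant of the 2×2 Hessian, as in Lowe's paper (which rejects keypoints whose curvature ratio exceeds r = 10). The function name and the example matrices are ours:

```python
import numpy as np

def passes_edge_test(hessian, r_max=10.0):
    # Ratio of principal curvatures via trace and determinant; reject
    # when one eigenvalue dominates (the keypoint sits on an edge).
    tr = hessian[0, 0] + hessian[1, 1]
    det = np.linalg.det(hessian)
    if det <= 0:                       # curvatures of opposite signs: reject
        return False
    return bool(tr**2 / det < (r_max + 1)**2 / r_max)

corner = np.array([[4.0, 0.0], [0.0, 3.0]])   # similar curvatures: keep
edge = np.array([[50.0, 0.0], [0.0, 1.0]])    # one dominant direction: reject
```

A corner-like point (curvature in all directions) passes; an edge-like point (curvature in only one direction) is discarded.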

SLIDE 14

Attributes of a Keypoint

  • Position (x, y) – location in the image
  • Scale – the scale where this point is a LoG extremum
  • Orientation?
SLIDE 15

Gradient Orientation Histogram

  • Make a histogram over gradient orientation
  • Weighted by gradient magnitude
  • Weighted by distance to the keypoint
  • Contributions to bins made with linear interpolation

SLIDE 16

Gradient Orientation Histogram


SLIDE 17

Gradient Orientation Histogram

  • Plain histogram of gradient orientation

SLIDE 18

Gradient Orientation Histogram

  • Weighted by gradient magnitude
  • (Could also weight by distance to the center of the window)

SLIDE 19

Gradient Orientation Histogram

  • Interpolated to avoid edge effects of bin quantization

SLIDE 20

Assigning Orientation to Keypoint

  • Support: all points in a window surrounding the keypoint, taken from the image at the assigned scale
  • 36 bins over 360 degrees
  • Contributions weighted by distance to the center of the keypoint, using a Gaussian with sigma 1.5 × the assigned scale
  • Dominant orientation = the peak of the histogram
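The weighted, interpolated histogram can be sketched as follows. This is a simplified version (a square patch rather than a circular window, and variable names of our choosing): each gradient is weighted by its magnitude and by a Gaussian on its distance to the keypoint, and its vote is split between the two nearest bins by linear interpolation.

```python
import numpy as np

def orientation_histogram(magnitude, orientation, sigma, n_bins=36):
    # magnitude/orientation: same-shaped patches centred on the keypoint;
    # orientation in radians, sigma = 1.5 x the keypoint's assigned scale.
    h, w = magnitude.shape
    y, x = np.mgrid[:h, :w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    weight = magnitude * np.exp(-((y - cy)**2 + (x - cx)**2) / (2 * sigma**2))

    pos = (orientation % (2 * np.pi)) / (2 * np.pi) * n_bins   # fractional bin
    lo = np.floor(pos).astype(int) % n_bins
    frac = pos - np.floor(pos)
    hist = np.zeros(n_bins)
    np.add.at(hist, lo, weight * (1 - frac))            # share each vote between
    np.add.at(hist, (lo + 1) % n_bins, weight * frac)   # the two nearest bins
    return hist

mag = np.ones((9, 9))
ori = np.full((9, 9), np.pi / 2)       # every gradient points "up"
hist = orientation_histogram(mag, ori, sigma=1.5 * 1.6)
dominant = int(np.argmax(hist))
```

With every gradient pointing the same way, all the weight lands in the single bin covering 90 degrees, and that bin is the dominant orientation.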
SLIDE 21

Computing SIFT Descriptor

  • Divide the 16 × 16 region surrounding the keypoint into 4 × 4 windows
  • For each window, compute a histogram with 8 bins
  • 128 total elements
  • Interpolation to improve stability (over orientation and over distance to the boundary of the window)
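A stripped-down sketch of the descriptor assembly: a 4 × 4 grid of 4 × 4-pixel windows over a 16 × 16 patch, each contributing an 8-bin orientation histogram, concatenated into 128 values. This simplified version omits the interpolation the slide mentions (Lowe splits each vote across bins and neighbouring windows); the function name is ours.

```python
import numpy as np

def sift_descriptor(magnitude, orientation):
    # magnitude/orientation: 16x16 patches of gradient magnitude and angle.
    assert magnitude.shape == (16, 16)
    desc = []
    for ci in range(4):
        for cj in range(4):
            m = magnitude[4 * ci:4 * ci + 4, 4 * cj:4 * cj + 4]
            o = orientation[4 * ci:4 * ci + 4, 4 * cj:4 * cj + 4]
            bins = ((o % (2 * np.pi)) / (2 * np.pi) * 8).astype(int) % 8
            hist = np.zeros(8)
            np.add.at(hist, bins.ravel(), m.ravel())   # magnitude-weighted votes
            desc.append(hist)
    return np.concatenate(desc)      # 16 windows x 8 bins = 128 values

rng = np.random.default_rng(2)
mag = rng.random((16, 16))
ori = rng.random((16, 16)) * 2 * np.pi
d = sift_descriptor(mag, ori)
```

Every pixel's magnitude lands in exactly one bin here, so the descriptor's total equals the patch's total gradient magnitude.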

SLIDE 23

Normalizing the descriptor

  • To get (some) invariance to brightness and contrast
    – Clamp the weight due to gradient magnitude (in case some edges are very strong due to weird lighting)
    – Normalize the entire vector to unit length (so the absolute value of the gradient magnitude isn’t as important as the distribution of the gradient magnitude)
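A sketch of the normalization. Following Lowe's paper, the descriptor is first normalized to unit length, each element is then clamped at 0.2, and the vector is renormalized; the threshold 0.2 is the paper's value, the function name is ours.

```python
import numpy as np

def normalize_descriptor(desc, clamp=0.2):
    v = desc / np.linalg.norm(desc)   # unit length: contrast invariance
    v = np.minimum(v, clamp)          # clamp overly strong gradients
    return v / np.linalg.norm(v)      # renormalize to unit length

d = np.array([10.0, 1.0, 1.0, 1.0])   # one overwhelming component
n = normalize_descriptor(d)
```

After clamping, the dominant component no longer swamps the rest: the distribution of gradient magnitudes matters more than any single strong edge.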

SLIDE 24

Using the keypoints

  • Assemble a database:

    – Pick some “training” images of different objects
    – Find keypoints and compute descriptors
    – Store the descriptors and the associated source image, position, scale, and orientation

SLIDE 25

Using the keypoints

  • New image:
    – Find keypoints and compute descriptors
    – Search the database for matching descriptors
    – (Throw out descriptors that are not distinctive)
    – Look for clusters of matching descriptors
  • (e.g., in your new image you found 10 keypoints and associated descriptors, and in the database there is an image where 6 of the descriptors match, but only 1 or 2 in other database images)
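The “throw out descriptors that are not distinctive” step is usually done with Lowe's ratio test: keep a match only if the nearest database descriptor is much closer than the second-nearest. A sketch with toy 2-D “descriptors” (the 0.8 threshold follows the paper; names are ours):

```python
import numpy as np

def match_descriptors(new_descs, database, ratio=0.8):
    matches = []
    for i, d in enumerate(new_descs):
        dist = np.linalg.norm(database - d, axis=1)
        order = np.argsort(dist)
        best, second = order[0], order[1]
        # Distinctiveness: nearest must clearly beat the second-nearest.
        if dist[best] < ratio * dist[second]:
            matches.append((i, int(best)))
    return matches

db = np.array([[0.0, 0.0], [10.0, 10.0], [10.0, 11.0]])
new = np.array([[0.1, 0.0],      # clearly closest to db[0]: kept
                [10.0, 10.5]])   # ambiguous between db[1] and db[2]: dropped
found = match_descriptors(new, db)
```

The ambiguous descriptor is discarded even though it has a nearby database entry, because a second entry is just as close.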
SLIDE 26

Using the keypoints

  – http://chrisjmccormick.wordpress.com/2013/01/24/opencv-sift-tutorial/

SLIDE 27

Voting for Pose

  • Matching keypoints from a database image and the new image imply some relationship in pose (position, scale, and orientation)
    – Example: this keypoint was found 20 pixels down and 50 pixels to the right of the matching descriptor from the database image
    – Example: this keypoint was computed at 2× the scale of the matching descriptor from the database image
    – Look for clusters of matches with similar offsets
    – (“Generalized Hough Transform”)
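A bare-bones version of the voting idea, over translation only (real SIFT also votes over scale and orientation): quantize each match's (dx, dy) offset into a coarse bin and keep the most popular bin. The bin size and names are illustrative assumptions.

```python
from collections import Counter

def vote_translation(offsets, bin_size=8):
    # Quantize each (dx, dy) offset and count votes per bin.
    votes = Counter((dx // bin_size, dy // bin_size) for dx, dy in offsets)
    best_bin, count = votes.most_common(1)[0]
    return best_bin, count

offsets = [(50, 20), (52, 22), (49, 18), (120, -5)]   # three agree, one outlier
best, count = vote_translation(offsets)
```

The three consistent matches fall into the same bin and outvote the outlier, which is how a cluster of agreeing offsets identifies the object's pose.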

SLIDE 28

Discussion Questions

  • What types of invariance do we want to have when we think about doing object recognition?
  • What does it mean to be invariant to different image attributes (brightness, contrast, position, scale, orientation)?
  • What does it mean for an image feature to be stable?
  • Why might it make sense to use a weighted histogram? What kinds of weights?
  • What is a problem with the quantization associated with creating a histogram, and what can we do about it?