Justin Johnson February 11, 2020 EECS 442 WI 2020: Lecture 10 -
Lecture 10: Scales and Descriptors
Administrative
HW2 out, due 1 week from tomorrow: Wednesday 2/19/2020, 11:59pm
HW3 out tomorrow, due 2 weeks from Friday: Friday 2/27, 11:59pm
Are these pictures of the same object? If so, how are they related?
Finding and Matching
1: find corners+features
2: match based on local image data
Slide Credit: S. Lazebnik, original figure: M. Brown, D. Lowe
[Figure: two example patches, the corner of the glasses and the edge next to the panel]
I'm making the lightness equal to gradient magnitude
Signal: $f$
Derivative of Gaussian: $\frac{d}{dx} g$
Response: $f * \frac{d}{dx} g$
Slide Credit: S. Seitz
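As a rough illustration of filtering with the derivative of a Gaussian, here is a minimal NumPy sketch. The noisy step signal, `sigma=3.0`, and the helper name `gaussian_derivative_kernel` are assumptions made for this example and are not from the slides.

```python
import numpy as np

def gaussian_derivative_kernel(sigma):
    """Samples of d/dx of a 1D Gaussian g with standard deviation sigma."""
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return -x / sigma**2 * g

# Hypothetical signal: a step edge at index 100 with a little noise.
rng = np.random.default_rng(0)
f = np.zeros(200)
f[100:] = 1.0
f += 0.05 * rng.standard_normal(f.size)

kernel = gaussian_derivative_kernel(sigma=3.0)
radius = kernel.size // 2
# Replicate the end values so the signal borders do not look like edges.
padded = np.pad(f, radius, mode="edge")
response = np.convolve(padded, kernel, mode="valid")  # f * (d/dx g)
edge_location = int(np.argmax(np.abs(response)))
```

The largest absolute response should land at or next to the step, because smoothing and differentiation are done by a single filter.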
"edge": no change along the edge direction
"corner": significant change in all directions
"flat" region: no change in all directions
Can localize the location, or any shift → big intensity change.
Diagram credit: S. Lazebnik
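$$\mathbf{M} = \begin{bmatrix} \sum_{x,y \in W} I_x^2 & \sum_{x,y \in W} I_x I_y \\ \sum_{x,y \in W} I_x I_y & \sum_{x,y \in W} I_y^2 \end{bmatrix} = \mathbf{R}^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \mathbf{R}$$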
By doing a Taylor expansion of the image, the second moment matrix tells us how quickly the image changes and in which directions. It can be computed at each pixel. Its eigenvectors give the directions of change and its eigenvalues give the amounts.
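A minimal sketch of computing the second moment matrix at every pixel, assuming a uniform square window and central-difference gradients (both are choices made here for illustration). The helper names `box_sum` and `second_moment_matrix` and the synthetic test image are hypothetical.

```python
import numpy as np

def box_sum(a, radius):
    """Sum of a over a (2*radius+1)^2 window around each pixel (zero-padded)."""
    k = 2 * radius + 1
    p = np.pad(a, radius)
    out = np.zeros(a.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def second_moment_matrix(image, radius=2):
    """Return an (H, W, 2, 2) array holding M at every pixel."""
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(image.astype(float))
    Sxx = box_sum(Ix * Ix, radius)
    Sxy = box_sum(Ix * Iy, radius)
    Syy = box_sum(Iy * Iy, radius)
    return np.stack([np.stack([Sxx, Sxy], -1), np.stack([Sxy, Syy], -1)], -2)

# Hypothetical test image: a bright quadrant whose corner sits at (10, 10).
img = np.zeros((32, 32))
img[10:, 10:] = 1.0
M = second_moment_matrix(img)
lam = np.linalg.eigvalsh(M)  # ascending eigenvalues, shape (H, W, 2)

flat_eigs = lam[3, 3]      # far from any structure
edge_eigs = lam[20, 10]    # on the vertical edge
corner_eigs = lam[10, 10]  # at the corner
```

In this toy image the flat pixel has two zero eigenvalues, the edge pixel has one zero and one large eigenvalue, and the corner pixel has two large eigenvalues.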
$$R = \det(\mathbf{M}) - \alpha \, \operatorname{trace}(\mathbf{M})^2 = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$$
"Corner": R > 0
"Edge": R < 0
"Flat" region: |R| small
α: constant (0.04 to 0.06)
Slide credit: S. Lazebnik; Note: this refers to visualization ellipses, not original M ellipse. Other slides on the internet may vary
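To make the sign pattern concrete, here is a small sketch that evaluates the response for three hand-picked 2x2 matrices. The matrices and `alpha=0.05` (inside the 0.04 to 0.06 range quoted above) are illustrative assumptions, and `harris_response` is a hypothetical helper name.

```python
import numpy as np

def harris_response(M, alpha=0.05):
    """R = det(M) - alpha * trace(M)^2 for one matrix or a stack of 2x2 matrices."""
    M = np.asarray(M, dtype=float)
    det = M[..., 0, 0] * M[..., 1, 1] - M[..., 0, 1] * M[..., 1, 0]
    trace = M[..., 0, 0] + M[..., 1, 1]
    return det - alpha * trace**2

# Hypothetical second moment matrices for three kinds of image regions.
corner = [[1.5, 0.25], [0.25, 1.5]]  # two large eigenvalues
edge = [[2.5, 0.0], [0.0, 0.0]]      # one large, one zero eigenvalue
flat = [[0.0, 0.0], [0.0, 0.0]]      # both eigenvalues zero

R_corner = harris_response(corner)
R_edge = harris_response(edge)
R_flat = harris_response(flat)
```

Using the determinant and trace avoids an explicit eigendecomposition, since det(M) is the product of the eigenvalues and trace(M) is their sum.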
Second moment matrix with weighting w:
$$\mathbf{M} = \begin{bmatrix} \sum_{x,y \in W} w(x,y) I_x^2 & \sum_{x,y \in W} w(x,y) I_x I_y \\ \sum_{x,y \in W} w(x,y) I_x I_y & \sum_{x,y \in W} w(x,y) I_y^2 \end{bmatrix}$$
C. Harris and M. Stephens. "A Combined Corner and Edge Detector." Proceedings
Slide credit: S. Lazebnik
(Non-Maxima Suppression, NMS)
Threshold R, then keep only local maxima: pixels with a higher R value than their neighbors.
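One possible way to implement thresholding plus non-maxima suppression is sketched below. The 8-neighbor definition of "neighbors", the function name `local_maxima`, and the toy response map are assumptions for illustration.

```python
import numpy as np

def local_maxima(R, threshold):
    """Mask of pixels with R > threshold that are strictly greater than all 8 neighbors."""
    H, W = R.shape
    # Pad with -inf so border pixels are only compared with neighbors that exist.
    p = np.pad(R, 1, mode="constant", constant_values=-np.inf)
    is_max = R > threshold
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbor = p[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
            is_max &= R > neighbor
    return is_max

# Hypothetical response map.
R = np.array([[0.0, 0.1, 0.0, 0.0],
              [0.1, 0.9, 0.2, 0.0],
              [0.0, 0.2, 0.1, 0.6],
              [0.0, 0.0, 0.3, 0.0]])
corners = np.argwhere(local_maxima(R, threshold=0.5))  # (row, col) pairs
```

Here only the 0.9 and 0.6 entries survive: every other value either falls below the threshold or has a larger neighbor.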
If our detectors are repeatable, they should be:
Invariant: image is transformed and corners remain the same: D(T(I)) = D(I)
Equivariant: image is transformed and corners transform with it: D(T(I)) = T(D(I))
Where I is an image, T is a transformation, and D is our detector
Slide credit: S. Lazebnik
Partially invariant to affine intensity changes (I → aI + b)
[Figure: R plotted against x (image coordinate) with a threshold line, before and after the intensity change]
M only depends on derivatives, so b is irrelevant. But a scales the derivatives, and there's a threshold.
Slide credit: S. Lazebnik
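The argument can be written as a short derivation. It assumes the affine intensity change is written as $I' = aI + b$ and reuses the response $R = \det(M) - \alpha \operatorname{trace}(M)^2$ from the earlier slides:

```latex
\begin{aligned}
I' &= aI + b \\
\nabla I' &= a \nabla I \\
M' &= a^2 M \\
R' &= \det(a^2 M) - \alpha \operatorname{trace}(a^2 M)^2 = a^4 R
\end{aligned}
```

So the offset b drops out entirely, while the gain a rescales R by a factor of $a^4$. A fixed threshold on R can therefore keep or drop different corners when the contrast changes.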
All done with convolution, and convolution is shift-invariant (it commutes with translation). Equivariant with translation.
Slide credit: S. Lazebnik
Rotation only changes the orientation of the corner; the eigenvalues remain the same. Equivariant with rotation.
Corner
One pixel can become many pixels and vice-versa. Not equivariant with scaling
Left to right: each image is half-sized (scaled by 1/2 at each step). Upsampled with big pixels below.
Note: I'm also slightly blurring to prevent aliasing (https://en.wikipedia.org/wiki/Aliasing)
Left to right: each image is half-sized. If I apply a KxK filter, how much of the original image does it see in each image?
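A minimal sketch of such a pyramid, assuming a small [1, 2, 1]/4 blur before each 2x subsampling (the slides do not say which blur was used). The helper names and the constant test image are hypothetical.

```python
import numpy as np

def blur_121(image):
    """Separable [1, 2, 1] / 4 blur with edge replication (a light anti-aliasing filter)."""
    p = np.pad(image, 1, mode="edge")
    rows = (p[:-2, :] + 2 * p[1:-1, :] + p[2:, :]) / 4
    return (rows[:, :-2] + 2 * rows[:, 1:-1] + rows[:, 2:]) / 4

def pyramid(image, num_levels):
    """Level 0 is the input; each later level is blurred, then halved in size."""
    levels = [np.asarray(image, dtype=float)]
    for _ in range(num_levels - 1):
        levels.append(blur_121(levels[-1])[::2, ::2])
    return levels

image = np.full((64, 64), 7.0)  # hypothetical constant test image
levels = pyramid(image, num_levels=4)

# Approximate side length, in original pixels, covered by a KxK filter at each level.
K = 5
footprint = [K * 2**i for i in range(len(levels))]
```

Ignoring the blur, a KxK filter applied at level i covers roughly a (K * 2^i) x (K * 2^i) region of the original image, which is why one fixed-size filter can respond to structures of different sizes.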
See: Multi-Image Matching using Multi-Scale Oriented Patches, Brown et al. CVPR 2005
[Figure: Harris detection applied to each of the four pyramid images]
Given a 50x16 person detector, how do I detect: (a) 250x80 (b) 150x48 (c) 100x32 (d) 25x8 people?
Sample people from image
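One way to answer the question is to resize the image so that each person becomes 50x16 and then run the fixed-size detector. The arithmetic can be sketched as follows; the dictionary names are hypothetical.

```python
detector_h, detector_w = 50, 16
people = {"a": (250, 80), "b": (150, 48), "c": (100, 32), "d": (25, 8)}

# Factor by which to resize the image so each person matches the detector size.
resize = {}
for name, (h, w) in people.items():
    # Height and width give the same factor because every size has a 50:16 aspect ratio.
    assert abs(detector_h / h - detector_w / w) < 1e-12
    resize[name] = detector_h / h
```

Cases (a) to (c) need the image shrunk (factors 1/5, 1/3, and 1/2), while case (d) needs it enlarged by 2.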
Detecting all the people: the red box is a fixed size.
Another detector (has some nice properties)
Find maxima and minima of blob filter response in scale and space
Slide credit: N. Snavely
[Figure: detected minima and maxima]
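$$\frac{\partial^2}{\partial x^2} G_\sigma(x) = \frac{x^2 - \sigma^2}{\sigma^5 \sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma^2}\right) = \left(\frac{x^2}{\sigma^4} - \frac{1}{\sigma^2}\right) G_\sigma(x)$$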
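Gaussian: $g$
1st Deriv: $\frac{\partial}{\partial y} g$, $\frac{\partial}{\partial x} g$
2nd Deriv: $\frac{\partial^2}{\partial y^2} g$, $\frac{\partial^2}{\partial x^2} g$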
Slight detail: for technical reasons, you need to scale the Laplacian.
$$\nabla^2_{norm} \, g = \sigma^2 \left( \frac{\partial^2}{\partial x^2} g + \frac{\partial^2}{\partial y^2} g \right)$$
$f$: Edge
$\frac{\partial^2}{\partial x^2} g$: Laplacian of Gaussian
$f * \frac{\partial^2}{\partial x^2} g$: Edge = Zero-crossing
Figure credit: S. Seitz
Figure credit: S. Lazebnik
Edge: zero-crossing
Blob = two edges in opposite directions
When the blob is just the right size, the Laplacian gives a large absolute value (a maximum)
Given a binary circle and a Laplacian filter of scale σ, we can compute the response as a function of the scale.
[Figure: circle of radius 8 with responses R: 0.02, R: 2.9, and R: 1.8 at different scales]
Characteristic scale of a blob is the scale that produces the maximum response
Slide credit: S. Lazebnik. For more, see: T. Lindeberg (1998). "Feature detection with automatic scale selection." International Journal of Computer Vision 30 (2): pp 77--116.
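The characteristic scale can be found numerically. The sketch below samples a scale-normalized Laplacian of Gaussian, measures its response at the center of a binary circle of radius 8, and picks the scale with the largest absolute response. The grid size, the list of scales, and the helper name `normalized_log` are assumptions, and the response values will not match the numbers in the figure because the normalization there is not specified.

```python
import numpy as np

def normalized_log(sigma, half_size):
    """Scale-normalized Laplacian of Gaussian, sigma^2 * (g_xx + g_yy), on a square grid."""
    ax = np.arange(-half_size, half_size + 1, dtype=float)
    x, y = np.meshgrid(ax, ax)
    r2 = x**2 + y**2
    g = np.exp(-r2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return sigma**2 * (r2 / sigma**4 - 2 / sigma**2) * g

radius = 8
half_size = 16
ax = np.arange(-half_size, half_size + 1)
x, y = np.meshgrid(ax, ax)
circle = (x**2 + y**2 <= radius**2).astype(float)  # bright disk on dark background

sigmas = np.arange(2.0, 12.01, 0.25)
# Filter response at the circle center: sum of kernel values under the disk.
responses = np.array([np.sum(normalized_log(s, half_size) * circle) for s in sigmas])
# A bright blob gives a negative Laplacian response, so compare absolute values.
characteristic_scale = float(sigmas[np.argmax(np.abs(responses))])
```

For an ideal continuous disk of radius r the maximum lies at σ = r / √2, about 5.66 for r = 8, and the sampled version should land close to that.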
Apply the scale-normalized Laplacian at several scales
Slide credit: S. Lazebnik
Laplacian response in image+scale space
Slide credit: S. Lazebnik
Blue points are image-space neighbors
Red points are neighbors in scale space
Green point is a local maximum in both image and scale space if it is larger than its image-space neighbors (blue) and larger than its scale-space neighbors (red).
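This test can be sketched on a stack of responses indexed as (scale, row, column). The sketch assumes the full 26-neighborhood (8 in the same scale plus 9 in each adjacent scale) and lets the first and last scales be compared only against the neighbors that exist. The function name and toy stack are hypothetical.

```python
import numpy as np

def scale_space_maxima(stack, threshold=0.0):
    """Indices (scale, row, col) of values above threshold and above all 26 neighbors."""
    S, H, W = stack.shape
    # Pad with -inf so boundary entries are only compared with existing neighbors.
    p = np.pad(stack, 1, mode="constant", constant_values=-np.inf)
    is_max = stack > threshold
    for ds in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if ds == 0 and dy == 0 and dx == 0:
                    continue
                neighbor = p[1 + ds:1 + ds + S, 1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
                is_max &= stack > neighbor
    return np.argwhere(is_max)

# Hypothetical response stack with three scales.
stack = np.zeros((3, 5, 5))
stack[1, 2, 2] = 1.0  # peak in the middle scale
stack[0, 4, 4] = 0.5  # smaller peak in the first scale
maxima = scale_space_maxima(stack)
```

For minima of the response, the same function can be applied to the negated stack.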
Approximating the Laplacian with a difference of Gaussians:
$$L = \sigma^2 \left( G_{xx}(x, y, \sigma) + G_{yy}(x, y, \sigma) \right)$$
$$DoG = G(x, y, k\sigma) - G(x, y, \sigma)$$
(Difference of Gaussians)
Slide credit: S. Lazebnik
Gaussian is separable, so cheaper to compute!
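A quick numerical check of how closely a difference of Gaussians follows the scale-normalized Laplacian, up to a constant factor. The values `sigma = 3.0`, `k = 1.1`, the grid size, and the helper names are assumptions for this sketch.

```python
import numpy as np

def gaussian_2d(sigma, half_size):
    """Sampled 2D Gaussian G(x, y, sigma) on a square grid."""
    ax = np.arange(-half_size, half_size + 1, dtype=float)
    x, y = np.meshgrid(ax, ax)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

def normalized_log(sigma, half_size):
    """sigma^2 * (G_xx + G_yy), which simplifies to (r^2 / sigma^2 - 2) * G."""
    ax = np.arange(-half_size, half_size + 1, dtype=float)
    x, y = np.meshgrid(ax, ax)
    return ((x**2 + y**2) / sigma**2 - 2) * gaussian_2d(sigma, half_size)

sigma, k, half_size = 3.0, 1.1, 20
dog = gaussian_2d(k * sigma, half_size) - gaussian_2d(sigma, half_size)
log = normalized_log(sigma, half_size)

# Normalized correlation: 1.0 would mean the kernels differ only by a positive scale factor.
correlation = float(np.sum(dog * log) / (np.linalg.norm(dog) * np.linalg.norm(log)))
```

The correlation should be close to 1, so the two kernels have nearly the same shape. The DoG only needs Gaussian blurs, which are separable.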
Original Image → Gaussian, sigma=2 → Downsample by 2x
Original Image → Downsample by 2x → Gaussian, sigma=1
Save computation by downsampling before blurring
David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60 (2), pp. 91-110, 2004.
Slide credit: S. Lazebnik
Vast majority of effort is in the first and second scales
Finding and Matching
1: find corners+features
2: match based on local image data
Slide Credit: S. Lazebnik, original figure: M. Brown, D. Lowe
[Figure: 100x100 crops at the glasses from four versions of the image: Image; Image − 40; 1/2 size, rot. 45°; Lightened +40]
Once we've found a corner/blob, we can't just use the image nearby. What about:
Given characteristic scale (maximum Laplacian response), we can just rescale image
Slide credit: S. Lazebnik
Given a window, we can compute its dominant direction:
Compute the gradient direction at each pixel
Build a histogram of gradient directions in the window (from 0 to 2π)
Find the dominant direction from the histogram
Rotate so the dominant direction is up
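The histogram step can be sketched as below. The 36 bins and the weighting of each vote by gradient magnitude follow common practice rather than anything stated on the slide, so treat both as assumptions. `dominant_orientation` and the ramp patches are hypothetical.

```python
import numpy as np

def dominant_orientation(patch, num_bins=36):
    """Center of the fullest gradient-direction bin, in radians within [0, 2*pi)."""
    # Rows are y and columns are x, so angles are measured in image coordinates.
    Iy, Ix = np.gradient(patch.astype(float))
    magnitude = np.hypot(Ix, Iy)
    angle = np.arctan2(Iy, Ix) % (2 * np.pi)
    hist, edges = np.histogram(angle, bins=num_bins, range=(0, 2 * np.pi),
                               weights=magnitude)
    k = int(np.argmax(hist))
    return 0.5 * (edges[k] + edges[k + 1])

# Hypothetical patches: intensity increasing along x, and along y.
ramp_x = np.tile(np.arange(16.0), (16, 1))
ramp_y = ramp_x.T
theta_x = dominant_orientation(ramp_x)
theta_y = dominant_orientation(ramp_y)
```

The patch would then be rotated by the angle that brings this direction to a canonical one (the rotation itself is omitted here).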
Picture credit: S. Lazebnik. Paper: David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60 (2), pp. 91-110, 2004.
Rotate and set to common scale
Now if we had two images of the same scene but different scale / rotation, we would find the same keypoints and set them to a common scale / rotation
Pixels of the rotated, rescaled patches should be the same!
We would like to be able to match these two points, but the local patches look very different! Idea: instead of comparing the pixels, use a feature vector to describe the appearance of each patch.
Figure from David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60 (2), pp. 91-110, 2004.
gradient directions
Nice properties of SIFT:
small shifts / rotations
with 128-dim vector
Read the paper for all the gory details…
Changes of illumination
Slide credit: N. Snavely
Think of feature as some non-linear filter that maps pixels to 128D feature
Photo credit: N. Snavely
Distance between vectors tells us how different the patches look
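To make the "pixels to 128D feature" mapping concrete, here is a heavily simplified SIFT-like descriptor: a 16x16 patch is split into 4x4 cells, each cell gets an 8-bin histogram of gradient directions weighted by gradient magnitude, and the 4 * 4 * 8 = 128 values are normalized to unit length. It leaves out details of the real method (Gaussian weighting, interpolation between bins, clipping), and `sift_like_descriptor` is a hypothetical name.

```python
import numpy as np

def sift_like_descriptor(patch, cells=4, bins=8):
    """Simplified 128-D descriptor: per-cell histograms of gradient direction."""
    patch = patch.astype(float)
    Iy, Ix = np.gradient(patch)
    mag = np.hypot(Ix, Iy)
    ang = np.arctan2(Iy, Ix) % (2 * np.pi)
    bin_idx = np.minimum((ang / (2 * np.pi) * bins).astype(int), bins - 1)
    size = patch.shape[0] // cells
    desc = np.zeros((cells, cells, bins))
    for i in range(cells):
        for j in range(cells):
            rows = slice(i * size, (i + 1) * size)
            cols = slice(j * size, (j + 1) * size)
            desc[i, j] = np.bincount(bin_idx[rows, cols].ravel(),
                                     weights=mag[rows, cols].ravel(),
                                     minlength=bins)
    desc = desc.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

# Hypothetical random patch, plus a brightened, higher-contrast copy of it.
rng = np.random.default_rng(0)
patch = rng.uniform(0, 100, size=(16, 16))
d1 = sift_like_descriptor(patch)
d2 = sift_like_descriptor(2 * patch + 40)
distance = float(np.linalg.norm(d1 - d2))
```

Because the descriptor uses only gradients and is normalized, the affine intensity change leaves it essentially unchanged, so the distance between `d1` and `d2` is close to zero.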
Example credit: J. Hays
but distances can't be thresholded.
Figure from David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60 (2), pp. 91-110, 2004.
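One commonly used alternative to thresholding raw distances, described in the Lowe paper cited above, is to compare the distance to the nearest descriptor with the distance to the second nearest and keep the match only if the ratio is small. The sketch below assumes Euclidean distance and a ratio of 0.8. The function name and toy descriptors are hypothetical.

```python
import numpy as np

def match_ratio_test(desc_a, desc_b, max_ratio=0.8):
    """Return (index_a, index_b) pairs whose nearest neighbor passes the ratio test."""
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)  # needs at least two descriptors
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    rows = np.arange(len(desc_a))
    nearest, second = order[:, 0], order[:, 1]
    keep = d[rows, nearest] < max_ratio * d[rows, second]
    return [(int(i), int(nearest[i])) for i in np.flatnonzero(keep)]

# Hypothetical 2-D descriptors (real ones would be 128-D).
desc_b = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
desc_a = [[0.1, 0.0],  # clearly closest to desc_b[0]
          [5.0, 0.0]]  # equally far from desc_b[0] and desc_b[1]: ambiguous
matches = match_ratio_test(desc_a, desc_b)
```

The ambiguous descriptor is rejected even though its nearest distance is moderate, which a single global distance threshold could not express.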
Finding and Matching
1: find corners+features
2: match based on local image data
Slide Credit: S. Lazebnik, original figure: M. Brown, D. Lowe
Find a transform that brings our matched features together!
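The slide does not say which family of transforms is meant. As one illustrative possibility, the sketch below fits an affine transform to matched point pairs by least squares. The affine model, the function name `fit_affine`, and the sample points are all assumptions.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares A (2x2) and t (2,) such that dst is approximately src @ A.T + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    X = np.hstack([src, np.ones((len(src), 1))])      # (N, 3): x, y, 1
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)  # (3, 2)
    A = params[:2].T
    t = params[2]
    return A, t

# Hypothetical matched keypoints related by a known transform.
A_true = np.array([[0.5, -0.5], [0.5, 0.5]])
t_true = np.array([3.0, -1.0])
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [7.0, 3.0]])
dst = src @ A_true.T + t_true

A_est, t_est = fit_affine(src, dst)
```

With exact matches the fit recovers the transform. With real, noisy matches some pairs will be wrong, so a plain least-squares fit like this one would need to be made robust to outliers.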