Theoretical Bounds on Image Search Instructor - Simon Lucey 16-423 - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Theoretical Bounds on Image Search

16-423 - Designing Computer Vision Apps

Instructor - Simon Lucey

slide-2
SLIDE 2

Today

  • Exhaustive Search & Sampling
  • Motivation for Descriptors
  • Overcoming the Curse of Dimensionality in Search

2

slide-4
SLIDE 4

3 Biggest Problems in Computer Vision?

  • REGISTRATION
  • REGISTRATION
  • REGISTRATION

(Prof. Takeo Kanade - Robotics Institute - CMU.)

slide-6
SLIDE 6

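What is Registration?

Our goal is to find the warp parameter vector $\mathbf{p}$!

  • $\mathcal{W}(\mathbf{x}; \mathbf{p})$ = warping function such that $\mathbf{x}' = \mathcal{W}(\mathbf{x}; \mathbf{p})$
  • $\mathbf{p}$ = parameter vector describing warp
  • $\mathbf{x}$ = coordinate in template $[x, y]^T$
  • $\mathbf{x}'$ = corresponding coordinate in source $[x', y']^T$

[Figure: “Source” and “Template” images; template point $\mathbf{x}$ maps to $\mathbf{x}'$ in the source]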

slide-7
SLIDE 7

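Warp Functions

  • translation: $\mathcal{W}(\mathbf{x}; \mathbf{p}) = \mathbf{x} + \mathbf{p}$, with $\mathbf{p} = [p_1 \; p_2]^T$
  • affine: $\mathcal{W}(\mathbf{x}; \mathbf{p}) = \mathbf{M} \begin{bmatrix} \mathbf{x} \\ 1 \end{bmatrix}$, with $\mathbf{M} = \begin{bmatrix} 1 - p_1 & p_2 & p_3 \\ p_4 & 1 - p_5 & p_6 \end{bmatrix}$ and $\mathbf{p} = [p_1 \ldots p_6]^T$

[Figure labels: translation, aspect, rotation]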
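A minimal NumPy sketch of the two warp families on this slide. It assumes the affine matrix is the 2×3 layout [[1 − p1, p2, p3], [p4, 1 − p5, p6]] as read off the flattened slide text; the function names are illustrative only.

```python
import numpy as np

def warp_translation(x, p):
    """W(x; p) = x + p, with p = [p1, p2]."""
    return np.asarray(x, dtype=float) + np.asarray(p, dtype=float)

def warp_affine(x, p):
    """W(x; p) = M [x, y, 1]^T with the assumed 2x3 matrix M."""
    p1, p2, p3, p4, p5, p6 = p
    M = np.array([[1.0 - p1, p2, p3],
                  [p4, 1.0 - p5, p6]])
    x = np.atleast_2d(np.asarray(x, dtype=float))     # (N, 2) template coordinates
    xh = np.hstack([x, np.ones((x.shape[0], 1))])     # homogeneous coordinates (N, 3)
    return xh @ M.T                                   # (N, 2) source coordinates

template_coords = np.array([[0.0, 0.0], [10.0, 5.0]])
shifted = warp_translation(template_coords, [2.0, -1.0])
identity = warp_affine(template_coords, np.zeros(6))  # p = 0 leaves points unchanged
```

With p = 0 both warps reduce to the identity, which is why the template is written T(0) later in the deck.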
slide-8
SLIDE 8

Warp Functions

6

W(x; p)

slide-11
SLIDE 11

Naive Approach

“Template image”

“Images at various warps (p)”

“Vectors of pixel values at each warp position”

[255,134,45,.......,34,12,124,67]
[123,244,12,.......,134,122,24,02]
[67,13,245,.......,112,51,92,181]
[65,09,67,.......,78,66,76,215]
...........

slide-13
SLIDE 13
Naive Approach

  • If you were a person coming straight from machine learning you might suggest,

“Vectors of pixel values at each warp position”

[255,134,45,.......,34,12,124,67]
[123,244,12,.......,134,122,24,02]
[67,13,245,.......,112,51,92,181]
[65,09,67,.......,78,66,76,215]
...........

“matching function” D( ): object if ≥ Th, background if < Th

We refer to this as Exhaustive Search!!!

slide-15
SLIDE 15

Sampling?

  • How do we sample every warp parameter value?

[Figure: grid of “Possible Source Warps” over $\mathbf{p} = \{p_1, p_2\}$, with grid spacings $\Delta p_1$ and $\Delta p_2$]

slide-16
SLIDE 16

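Measures of Similarity

  • Sampling strategy dependent on similarity measure,
  • Sum of squared differences (SSD)

$$D(\mathbf{p}) = \sum_{i=1}^{M} \| I(\mathcal{W}(\mathbf{x}_i; \mathbf{p})) - T(\mathbf{x}_i) \|^2$$

[Figure: “Model” T and “Source Image” I; template point $\mathbf{x}_1$ maps to $\mathcal{W}(\mathbf{x}_1; \mathbf{p})$]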
slide-17
SLIDE 17

Measures of Similarity

  • Sampling strategy dependent on similarity measure,
  • Sum of squared differences (SSD)

“Vector Form”

$$D(\mathbf{p}) = \| I(\mathbf{p}) - T(\mathbf{0}) \|^2$$

[Figure: “Model” T and “Source Image” I]
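As a rough illustration of exhaustive search with the SSD measure, the sketch below tries every integer translation p = (p1, p2) of a template inside a source image and keeps the one with the smallest D(p). The synthetic image and the function names are assumptions made for the example.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized patches."""
    d = a.astype(float) - b.astype(float)
    return float(np.sum(d * d))

def exhaustive_translation_search(source, template):
    """Evaluate D(p) at every integer translation and return the minimiser."""
    H, W = source.shape
    h, w = template.shape
    best_p, best_d = None, np.inf
    for p2 in range(H - h + 1):            # vertical shift
        for p1 in range(W - w + 1):        # horizontal shift
            d = ssd(source[p2:p2 + h, p1:p1 + w], template)
            if d < best_d:
                best_p, best_d = (p1, p2), d
    return best_p, best_d

rng = np.random.default_rng(0)
source = rng.integers(0, 256, size=(40, 50))
template = source[12:20, 30:41].copy()     # template cut out at p = (30, 12)
p_hat, d_hat = exhaustive_translation_search(source, template)
```

The double loop makes the cost explicit: one SSD evaluation per sampled warp, which is what becomes unaffordable once p has more than a couple of dimensions.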

slide-18
SLIDE 18

Measures of Similarity

  • Sampling strategy dependent on similarity measure,
  • Linear Correlation,

$$D(\mathbf{p}) = -I(\mathbf{p})^T T(\mathbf{0})$$

“Can be done efficiently using 2D convolutions....”

[Figure: “Template” T and “Source Image” I]
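A hedged sketch of the "efficient" route for linear correlation: the scores for all circular translations can be obtained at once from the cross-correlation theorem using NumPy's FFT. Zero-padding the template to the source size and treating shifts as circular are simplifying assumptions of this example.

```python
import numpy as np

def correlation_scores(source, template):
    """D(p) = -I(p)^T T(0) for every circular translation p, via the FFT."""
    padded = np.zeros(source.shape, dtype=float)
    padded[:template.shape[0], :template.shape[1]] = template
    # Cross-correlation theorem: corr = IFFT( FFT(source) * conj(FFT(template)) )
    corr = np.fft.ifft2(np.fft.fft2(source) * np.conj(np.fft.fft2(padded))).real
    return -corr          # scores[p2, p1]; lower means a stronger correlation

rng = np.random.default_rng(0)
source = rng.random((16, 16))
template = rng.random((4, 5))
scores = correlation_scores(source, template)
direct = -np.sum(source[3:7, 7:12] * template)   # same score by brute force at p = (7, 3)
```

Raw correlation is biased towards bright regions, so its minimiser need not coincide with the SSD minimiser; the example only checks that the FFT route reproduces the direct sum.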
slide-20
SLIDE 20

Sampling

  • Shannon sampling theorem: if you sample a band-limited signal densely enough (above the Nyquist rate) you can perfectly reconstruct the original data.
  • We will show the desired sampling rate is dependent on the “centre frequency” of the salient edges.

Nyquist Rate:- Signal must be sampled at more than twice its highest frequency!!!
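A small numerical check of what happens below the Nyquist rate, using frequencies picked purely for illustration: a 9 Hz cosine sampled at 10 Hz yields exactly the same samples as a 1 Hz cosine.

```python
import numpy as np

fs = 10.0                                  # sampling rate in Hz (Nyquist limit is 5 Hz)
n = np.arange(50)
t = n / fs

high = np.cos(2 * np.pi * 9.0 * t)         # 9 Hz signal, well above fs / 2
low = np.cos(2 * np.pi * 1.0 * t)          # 1 Hz signal, safely below fs / 2

indistinguishable = np.allclose(high, low)

# The dominant frequency seen in the sampled 9 Hz signal is the alias |9 - 10| = 1 Hz.
freqs = np.fft.rfftfreq(len(n), d=1.0 / fs)
apparent = freqs[np.argmax(np.abs(np.fft.rfft(high)))]
```

Once sampled, nothing distinguishes the two signals, which is the same effect as the wagon wheel appearing to roll the wrong way.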

slide-21
SLIDE 21

Under Sampling = “Aliasing”

  • Common phenomenon
  • Wagon wheels rolling the wrong way in movies.

14

slide-23
SLIDE 23

Image = Summation of Oriented Edges

  • Well known that natural images can be represented as a linear summation of oriented edges,

[Figure: an image over (x, y) written as a sum of oriented edge images]

“Useful as edges capture ONLY relative local change in intensity....”

slide-26
SLIDE 26

Sensitivity to Shift

[Figure: pixel intensity against pixel coordinates for the “model” T(0) and the warped “image” I(p); D(p) against warp (p), with the intervals ∆ph and ∆pl marked]

slide-27
SLIDE 27

Sensitivity to Shift

  • Clear that sampling is linked to center frequency of edge.

[Figure: grid of “Possible Source Warps” over $\mathbf{p} = \{p_1, p_2\}$ with spacing ∆pl]

slide-28
SLIDE 28

Sensitivity to Shift

  • Clear that sampling is linked to center frequency of edge.

[Figure: grid of “Possible Source Warps” over $\mathbf{p} = \{p_1, p_2\}$ with spacing ∆ph]

slide-29
SLIDE 29

Sensitivity to Shift

  • Edges can be thought to have a center frequency/wavelength,
  • Clear that wavelength of the most “salient” edge is proportional to the sample spacing:-

λ ∝ ∆p
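One way to see why, as a sketch under the assumption that the salient edge behaves like a single sinusoid of wavelength λ (an idealisation, not something stated on the slide): take T(x) = cos(2πx/λ), translate it by p, and integrate the SSD over one period.

```latex
\begin{aligned}
D(p) &= \int_0^{\lambda} \left[ \cos\!\left(\tfrac{2\pi (x + p)}{\lambda}\right) - \cos\!\left(\tfrac{2\pi x}{\lambda}\right) \right]^2 dx \\
     &= \tfrac{\lambda}{2} + \tfrac{\lambda}{2} - 2 \cdot \tfrac{\lambda}{2} \cos\!\left(\tfrac{2\pi p}{\lambda}\right)
      = \lambda \left[ 1 - \cos\!\left(\tfrac{2\pi p}{\lambda}\right) \right]
\end{aligned}
```

D(p) rises monotonically only for |p| < λ/2, so a sampled warp has to land within half a wavelength of the true one to sit in the correct basin. The admissible step ∆p therefore scales with λ.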

slide-32
SLIDE 32

Beyond Translation

  • Is sensitivity to shift only for translation?

“Motion Field for Translation”

20

slide-33
SLIDE 33

Beyond Translation

“Motion Field for Scale”

21

  • Is sensitivity to shift only for translation?
slide-34
SLIDE 34

Beyond Translation

“Motion Field for Rotation”

22

  • Is sensitivity to shift only for translation?
slide-35
SLIDE 35

Beyond Translation

23

slide-36
SLIDE 36

Today

  • Exhaustive Search & Sampling
  • Motivation for Descriptors
  • Overcoming the Curse of Dimensionality in Search

24

slide-37
SLIDE 37

Primary Visual Cortex

25

            

slide-38
SLIDE 38

Spatial Sensitivity

26

slide-39
SLIDE 39

Spatial Sensitivity

27

Kingdom, Field, Olmos, 2007

slide-41
SLIDE 41

Hierarchical Learning

28

View-tuned cells Complex Simple

Bob Crimi

V1

V2/V4

IT

Ventral Visual Stream

slide-43
SLIDE 43

Handling Geometric Distortion

  • As pointed out in seminal work by Berg and Malik (CVPR’01) the effectiveness of SSD will degrade with significant viewpoint change.
  • Two options to match local image patches:-

[Figure: “1D Patch” and “Distorted 1D Patch”, “match”. Source: A. C. Berg]

slide-47
SLIDE 47

Handling Geometric Distortion

  • As pointed out in seminal work by Berg and Malik (CVPR’01) the effectiveness of SSD will degrade with significant viewpoint change.
  • Two options to match local image patches:-
  • 1. simultaneously estimate the distortion and position of matching patch

[Figure: “1D Patch” and “Distorted 1D Patch”, “align” then “match”]

slide-51
SLIDE 51

Handling Geometric Distortion

  • As pointed out in seminal work by Berg and Malik (CVPR’01) the effectiveness of SSD will degrade with significant viewpoint and/or illumination change.
  • Two options to match patches:-
  • 1. simultaneously estimate the distortion and position of matching patch.
  • 2. “blur” the template window and perform matching coarse-to-fine.

[Figure: “1D Patch” and “Distorted 1D Patch”, “blur” then “match”]

Option 2 is attractive, low computational cost!

slide-55
SLIDE 55

x

P Px

Not Always Zero Always Zero

slide-60
SLIDE 60

x

φ{x} = Px

P ∈ G

slide-61
SLIDE 61

P ∈ G

“Similarity” “Rotation” “Translation” “Circular Shift”

slide-63
SLIDE 63

“Similarity” “Rotation” “Translation” “Circular Shift”

$$\phi\{\mathbf{x}\} = \frac{1}{|G|} \sum_{\mathbf{P} \in G} \mathbf{P}\mathbf{x}$$

slide-66
SLIDE 66

“Similarity” “Rotation” “Translation” “Circular Shift”

$$\phi\{\mathbf{x}\} = \sum_{\mathbf{P} \in G} \mathbf{P}\mathbf{x}$$

“throws away information”
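A toy check of the “throws away information” remark, assuming G is the group of circular shifts acting on a short 1-D vector and using the normalised average (1/|G|) Σ Px. The helper names are hypothetical.

```python
import numpy as np

def circular_shift_group(n):
    """All n cyclic-shift permutation matrices P acting on length-n vectors."""
    eye = np.eye(n)
    return [np.roll(eye, k, axis=0) for k in range(n)]

def group_average(x, group):
    """phi{x} = (1 / |G|) * sum over P in G of P x."""
    return sum(P @ x for P in group) / len(group)

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
G = circular_shift_group(len(x))

phi_x = group_average(x, G)
phi_shifted = group_average(np.roll(x, 2), G)   # descriptor of a shifted copy of x
```

Both descriptors are identical, so φ is invariant to shifts, but every entry equals the mean of x: everything except the average intensity has been discarded.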

slide-67
SLIDE 67

What About Blurring?

37

“Blur Kernel” “Edge Filter” “Edge Blur Filter”

slide-69
SLIDE 69

What About Blurring?

38

“Blur Kernel” “Edge Filter” “Edge Blur Filter”

“Power Normalized”

slide-71
SLIDE 71

What About Blurring

  • Clearly, blurring a high-frequency edge filter simply lowers the centre frequency (not what we want).

$\lambda_b > \lambda_e$ (“Blurred Edge Wavelength” > “High Frequency Edge Wavelength”)

slide-72
SLIDE 72

Sparseness and Positiveness

  • Blurring only works if the signals being matched are sparse and positive.
  • Unfortunately natural images are neither.
  • Combination of oriented filter banks and rectification can remedy this problem with little loss in performance.

[Figure: image over (x, y) passed through oriented filters, followed by “Rectification”]

slide-73
SLIDE 73

Sparseness and Positiveness

  • Blurring only works if the signals being matched are sparse and positive.
  • Unfortunately natural images are neither.
  • Combination of oriented filter banks and rectification can remedy this problem with little loss in performance.

“Rectification”: $r \rightarrow r \cdot r$

“Non-Linearly sets Centre Frequency to Zero”
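A quick numerical illustration of that last claim, assuming the filter response r looks like a Gaussian-windowed cosine (a Gabor-like signal chosen for the example). Squaring it moves the spectral peak from the carrier frequency to zero, because cos² = (1 + cos 2θ)/2.

```python
import numpy as np

N = 256
x = np.arange(N) - N // 2
f0 = 32 / N                 # carrier frequency in cycles per sample (FFT bin 32)
sigma = 16.0

# Assumed edge-filter response: oscillates around zero, so it is neither sparse nor positive.
r = np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * f0 * x)

def peak_bin(signal):
    """Index of the strongest component in the one-sided spectrum."""
    return int(np.argmax(np.abs(np.fft.rfft(signal))))

peak_before = peak_bin(r)       # energy centred on the carrier
peak_after = peak_bin(r * r)    # rectified response: energy centred on DC
```

After rectification the response is non-negative and dominated by its low-frequency envelope, which is the condition the slide gives for blurring to be useful.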

slide-74
SLIDE 74

Sensitivity to Shift

[Figure: edge energy against pixel coordinates, and D(p) against warp (p); “No Blurring”]

slide-76
SLIDE 76

Sensitivity to Shift

[Figure: rectified edge against pixel coordinates, and D(p) against warp (p); “Gaussian Blur”]

slide-78
SLIDE 78

Sensitivity to Shift

[Figure: rectified edge against pixel coordinates, and D(p) against warp (p); “Histogram Blur”]

slide-80
SLIDE 80

Sparseness and Positiveness

  • Comes at additional computational cost, as new representation is F times larger (where F is the number of filters employed).

$I(\mathbf{p}) \rightarrow \phi\{I(\mathbf{p})\}$, where $\phi\{\}$ = image descriptor function

slide-81
SLIDE 81

Reminder - SIFT Descriptor

46

  • 1. Compute image gradients
  • 2. Pool into local histograms
  • 3. Concatenate histograms
  • 4. Normalize histograms
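The four steps can be sketched as below. This is a deliberately simplified stand-in, not the full SIFT descriptor: it omits Gaussian weighting, interpolation between bins and the clipping step, and the 4×4 cells with 8 orientation bins (128 values) are set through parameter names invented for this example.

```python
import numpy as np

def sift_like_descriptor(patch, cells=4, bins=8):
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)                        # 1. image gradients
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)        # orientation in [0, 2*pi)
    bin_idx = np.minimum((ang / (2 * np.pi) * bins).astype(int), bins - 1)

    h, w = patch.shape
    ch, cw = h // cells, w // cells
    hists = []
    for i in range(cells):                             # 2. pool into local histograms
        for j in range(cells):
            sl = (slice(i * ch, (i + 1) * ch), slice(j * cw, (j + 1) * cw))
            hists.append(np.bincount(bin_idx[sl].ravel(),
                                     weights=mag[sl].ravel(),
                                     minlength=bins))
    desc = np.concatenate(hists)                       # 3. concatenate histograms
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc           # 4. normalize

rng = np.random.default_rng(0)
desc = sift_like_descriptor(rng.random((16, 16)))
```

Each histogram entry is a sum of gradient magnitudes, so the descriptor is non-negative, and the pooling over cells plays the role of the blur discussed earlier.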
slide-82
SLIDE 82

Relationship to Deep Learning

47

slide-84
SLIDE 84

Sensitivity of VGG to Geometric Variation

48

[Figure: network layers and a plot of SNR (dB) per layer: conv1, conv2, conv3, conv4, conv5, fc-6, fc-7]

image patch 3@ (224x224) → conv1 64@ (54x54) → conv2 256@ (27x27) → conv3 384@ (13x13) → conv4 384@ (13x13) → conv5 256@ (13x13) → fc-6 (4096) → fc-7 (4096)

slide-87
SLIDE 87

Today

  • Exhaustive Search & Sampling
  • Motivation for Descriptors
  • Overcoming the Curse of Dimensionality in Search

49

slide-89
SLIDE 89

Exhaustive Search

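$$\arg\min_{\mathbf{p}} \| I(\mathbf{p}) - T(\mathbf{0}) \|_2^2$$

where: $\| \Delta\mathbf{p} \|_2 = \epsilon$ and $d = \dim(\mathbf{p})$

[Figure: grid of “Possible Source Warps” over $\mathbf{p} = \{p_1, p_2\}$ with spacing $\Delta\mathbf{p}$; legend: $\mathbf{p} = \mathbf{0}$, $\mathbf{p} \neq \mathbf{0}$]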

slide-90
SLIDE 90
Exhaustive Search

  • One can see that as accuracy linearly increases the number of required training samples increases exponentially as a function of d.
  • $O(1/\epsilon^d)$

where: $d = \dim(\mathbf{p})$

[Plot: number of samples against $1/\epsilon$, with curves $O(1/\epsilon^d)$, $O(C^d \log 1/\epsilon)$ and $O(C_1^d + C_2 \log 1/\epsilon)$]
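To put numbers on the O(1/ε^d) growth, here is a minimal count of grid samples, assuming each warp parameter is sampled over a unit-length range at spacing ε (so 1/ε samples per axis). The function name and the chosen resolutions are illustrative.

```python
def grid_samples(inv_eps: int, d: int) -> int:
    """Grid points needed at 1/eps samples per axis in d dimensions: (1/eps)^d."""
    return inv_eps ** d

for d in (2, 6):                       # translation has d = 2, affine has d = 6
    for inv_eps in (10, 100):
        print(f"d={d}, 1/eps={inv_eps}: {grid_samples(inv_eps, d):,} samples")
```

Going from translation to affine at 1/ε = 100 takes the count from ten thousand to a trillion warped images.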

slide-91
SLIDE 91

Can we do better?

  • $O(1/\epsilon^d)$
  • $O(C^d \log 1/\epsilon)$
  • $O(C_1^d + C_2 \log 1/\epsilon)$

[Plot: number of samples against $1/\epsilon$]

Tian & Narasimhan [ICCV 2013]

slide-93
SLIDE 93

Data Driven Descent

Tian & Narasimhan 2012

slide-95
SLIDE 95

Data Driven Descent

$\{ T(\Delta\mathbf{p}_k) \}_{k=1}^{K}$

$T(\mathbf{0})$

Tian & Narasimhan 2012

slide-96
SLIDE 96

Data Driven Descent

Tian & Narasimhan 2012

slide-97
SLIDE 97

Data Driven Descent

T (∆pi) I(p) T (0)

Tian & Narasimhan 2012

slide-98
SLIDE 98

Data Driven Descent

Tian & Narasimhan 2012

slide-99
SLIDE 99

Data Driven Descent

“inverse composition”

$\mathbf{p} \rightarrow \mathbf{p} \circ \Delta\mathbf{p}_i^{-1}$

Tian & Narasimhan 2012
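A toy 1-D translation version of the idea, offered only as a sketch: at each step the observed signal I(p) is matched to the nearest of K warped templates T(∆p_k), and the matched ∆p_k is removed from p. The Gaussian bump, K = 5, and the schedule that halves the search radius each iteration are assumptions of this example, not details taken from Tian & Narasimhan.

```python
import numpy as np

xs = np.linspace(-10.0, 10.0, 201)            # pixel coordinates of a 1-D "image"

def render(shift):
    """Toy signal under a translation warp: a Gaussian bump centred at `shift`."""
    return np.exp(-0.5 * (xs - shift) ** 2)

def data_driven_descent(true_p, radius=4.0, iters=8):
    base = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])   # K = 5 training offsets per level
    residual = true_p      # simulation only: the warp still separating I(p) from T(0)
    estimate = 0.0
    for _ in range(iters):
        shifts = base * radius
        train = np.stack([render(dp) for dp in shifts])      # {T(dp_k)}
        observed = render(residual)                          # I(p)
        k = int(np.argmin(np.sum((train - observed) ** 2, axis=1)))  # nearest sample
        estimate += shifts[k]
        residual -= shifts[k]      # p <- p o dp_k^{-1}; for translation, a subtraction
        radius /= 2.0              # assumed schedule: halve the search radius
    return estimate

p_hat = data_driven_descent(2.3)
```

Here 8 iterations × 5 samples recover the shift to within 0.01, whereas a uniform grid over [−4, 4] at comparable spacing would need several hundred samples. That gap is the kind of saving the log(1/ε) terms describe.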

slide-100
SLIDE 100

Data Driven Descent

Tian & Narasimhan 2012

slide-103
SLIDE 103

Tian & Narasimhan 2012

slide-105
SLIDE 105

Can we do better?

  • $O(1/\epsilon^d)$
  • $O(C^d \log 1/\epsilon)$
  • $O(C_1^d + C_2 \log 1/\epsilon)$

[Plot: number of samples against $1/\epsilon$]

Tian & Narasimhan [ICCV 2013]
