Filters and other potions. P. Perona, Caltech. MIT, 21 November 2013.



slide-1
SLIDE 1

Filters and other potions

  • P. Perona - Caltech

MIT - 21 November 2013

slide-2
SLIDE 2
slide-3
SLIDE 3

?

what where

slide-4
SLIDE 4

Architectures

slide-5
SLIDE 5

Architecture 1

Image(s) The vision black box

Ripe bananas

Marble torso

train building

  • Feature extraction: texture, stereo disparity, color, contrast, motion flow, edgels, …
  • Grouping: image regions
  • Perceptual organization: boundaries, junctions, foreground, background
  • 2.5D sketch: surface shape, scene depth, spatial relationships, 3D motion
  • Recognition, surface properties

Image processing; Regions and surfaces; Objects, verbs, categories…

motor, cognition

[Marr ’82]

slide-6
SLIDE 6

features?

Le Corbusier, Villa Savoye http://flickr.com/photos/ikura/1398271367/
slide-7
SLIDE 7

edges

http://www.iit.edu/~stawraf/perspx.jpg Le Corbusier, Villa Savoye
slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11

[Fukushima ’80]

Architecture 2

slide-12
SLIDE 12
slide-13
SLIDE 13

[DeValois ’85]

slide-14
SLIDE 14

Column

slide-15
SLIDE 15

Hypercolumn

slide-16
SLIDE 16
slide-17
SLIDE 17
slide-18
SLIDE 18

Dense sampling

slide-19
SLIDE 19

translation, rotation invariance

[LeCun et al. 1998]

slide-20
SLIDE 20

scale invariance

[Lowe 2004]

slide-21
SLIDE 21

[Hinton et al. ’12]

translation, rotation, scale invariance

slide-22
SLIDE 22

  • 96 filters
  • 6 orientations
  • 2 center-surround
  • 14 scale samples over 2.2 binary octaves

slide-23
SLIDE 23

Detection Performance

Caltech pedestrians: 1M frames, 250K hand-annotated

slide-24
SLIDE 24

Detection Performance

slide-25
SLIDE 25

Detection Performance

Dollar et al. ’10; Dollar et al. ’08; Viola & Jones ’01; Dalal-Triggs ’05; * Walk et al. ’10

slide-26
SLIDE 26

filter technology

slide-27
SLIDE 27

Sampling scale, orientation, elongation, …: lots of CPU cycles

slide-28
SLIDE 28

how do we make computations efficient?

slide-29
SLIDE 29

Separability

[Adelson & Bergen, ’85]

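General 2D kernel, cost = M × N multiplications per pixel:

$$R(i,j) = \sum_{h=1}^{M} \sum_{k=1}^{N} k(h,k)\, I(i-h,\, j-k)$$

Separable kernel, cost = M + N multiplications per pixel:

$$R(i,j) = \sum_{h=1}^{M} \sum_{k=1}^{N} k(h)\, k'(k)\, I(i-h,\, j-k)$$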
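The cost difference can be checked on a toy case. The sketch below is only an illustration, not code from the talk: the function names, the 6×7 test image and the Sobel-style kernel are all assumptions. It filters the same image once with the full M × N kernel and once with two 1D passes, and the two results should agree on the valid region.

```python
def conv2d_full(image, kernel):
    """Direct 2D convolution (valid region): M*N multiplies per output pixel."""
    H, W = len(image), len(image[0])
    M, N = len(kernel), len(kernel[0])
    return [[sum(kernel[h][k] * image[i + M - 1 - h][j + N - 1 - k]
                 for h in range(M) for k in range(N))
             for j in range(W - N + 1)]
            for i in range(H - M + 1)]


def conv2d_separable(image, col, row):
    """Same result when kernel[h][k] == col[h] * row[k]: about M + N multiplies per pixel."""
    H, W = len(image), len(image[0])
    M, N = len(col), len(row)
    # Horizontal pass with the 1D row kernel (N multiplies per pixel).
    tmp = [[sum(row[k] * image[i][j + N - 1 - k] for k in range(N))
            for j in range(W - N + 1)]
           for i in range(H)]
    # Vertical pass with the 1D column kernel (M multiplies per pixel).
    return [[sum(col[h] * tmp[i + M - 1 - h][j] for h in range(M))
             for j in range(W - N + 1)]
            for i in range(H - M + 1)]


# Toy data (assumed): a small integer image and a Sobel-style separable kernel.
image = [[(3 * i + 5 * j) % 11 for j in range(7)] for i in range(6)]
col, row = [1, 2, 1], [1, 0, -1]
kernel = [[c * r for r in row] for c in col]

full = conv2d_full(image, kernel)
sep = conv2d_separable(image, col, row)
```

For a 3 × 3 kernel the saving is small (9 versus 6 multiplies), but it grows quickly with kernel size, which is what matters for large, elongated filters.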

slide-30
SLIDE 30

Separability and decomposition

[Adelson & Bergen, ’85]

slide-31
SLIDE 31

Steerability

[Freeman & Adelson, ’91]

slide-32
SLIDE 32

General decomposition

$$k(x, \theta) = \sum_{i=1}^{D} b_i(\theta)\, f_i(x)$$

$$k(x, y) = \sum_{i=1}^{D} f_i(x)\, g_i(y)$$

$$k(x, y; \theta) = \sum_{i=1}^{D} b_i(\theta)\, f_i(x)\, g_i(y)$$
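A small instance of the last form, assuming the classic first-derivative-of-Gaussian example with D = 2: the interpolation functions are b = (cos θ, sin θ), and each basis filter is an outer product of 1D profiles. The function names, the radius and σ below are illustrative choices, not values from the talk. The sketch compares the steered filter with the directional derivative computed directly.

```python
import math


def gauss(t, sigma):
    return math.exp(-t * t / (2.0 * sigma * sigma))


def dgauss(t, sigma):
    """Derivative of the 1D Gaussian profile above."""
    return -t / (sigma * sigma) * gauss(t, sigma)


def basis(radius, sigma):
    """Two basis filters, each an outer product f_i(x) * g_i(y)."""
    ts = range(-radius, radius + 1)
    gx = [[dgauss(x, sigma) * gauss(y, sigma) for x in ts] for y in ts]
    gy = [[gauss(x, sigma) * dgauss(y, sigma) for x in ts] for y in ts]
    return gx, gy


def steer(gx, gy, theta):
    """k(x, y; theta) = cos(theta) * gx + sin(theta) * gy."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(gx, gy)]


def direct(radius, sigma, theta):
    """Reference: derivative of the 2D Gaussian along direction theta."""
    c, s = math.cos(theta), math.sin(theta)
    ts = range(-radius, radius + 1)
    return [[-(x * c + y * s) / (sigma * sigma) * gauss(x, sigma) * gauss(y, sigma)
             for x in ts] for y in ts]


gx, gy = basis(radius=4, sigma=1.5)
theta = 0.7
max_err = max(abs(a - b)
              for ra, rb in zip(steer(gx, gy, theta), direct(4, 1.5, theta))
              for a, b in zip(ra, rb))
```

Because filtering is linear, the same weights steer the filter responses: the image is filtered once with each basis filter, and the response at any θ is a weighted sum of those two outputs.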

slide-33
SLIDE 33

Design?

slide-34
SLIDE 34

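Sample k(x; θ) into a matrix A (rows indexed by x, columns by θ) and factor it with the singular value decomposition:

$$A = U S V^{T}, \qquad k(x; \theta) = \sum_{i=1}^{D} f_i(x)\, \sigma_{i,i}\, b_i(\theta)$$

The columns of U sample f_i(x), the diagonal of S holds σ_{i,i}, and the rows of V^T sample b_i(θ).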

slide-35
SLIDE 35

Approximation

$$K(x, y; \theta) = \sum_{i=1}^{D} b_i(\theta)\, f_i(x, y)$$

$$K(x, y; \theta) \approx \sum_{i=1}^{R} b_i(\theta)\, f_i(x, y), \qquad R \ll D$$
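One way to obtain such a truncated expansion, in the spirit of the A = USV^T slide, is to sample the kernel family into a matrix and keep the top R singular triplets. The sketch below is an assumption-laden illustration: the kernel family (second derivative of a Gaussian along θ), the grid sizes and the function names are all chosen for the example, and a simple power iteration with deflation stands in for a library SVD to keep the code dependency-free.

```python
import math


def kernel(x, y, theta, sigma=1.5):
    """Assumed example family: second derivative of a Gaussian along direction theta."""
    u = x * math.cos(theta) + y * math.sin(theta)
    g = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
    return (u * u / sigma ** 4 - 1.0 / sigma ** 2) * g


def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def frob(A):
    return math.sqrt(sum(a * a for row in A for a in row))


def truncated_svd(A, R, iters=100):
    """Top-R singular triplets (s, u, v) by power iteration with deflation."""
    A = [row[:] for row in A]
    triplets = []
    for _ in range(R):
        At = [list(c) for c in zip(*A)]
        v = [1.0 / (1 + j) for j in range(len(At))]  # deterministic start vector
        for _step in range(iters):
            v = matvec(At, matvec(A, v))
            n = math.sqrt(sum(t * t for t in v))
            if n == 0.0:
                break
            v = [t / n for t in v]
        u = matvec(A, v)
        s = math.sqrt(sum(t * t for t in u))
        if s == 0.0:
            break
        u = [t / s for t in u]
        triplets.append((s, u, v))
        # Deflate: remove the component just found before looking for the next one.
        A = [[A[i][j] - s * u[i] * v[j] for j in range(len(v))] for i in range(len(u))]
    return triplets


def residual(A, triplets):
    """Frobenius norm of A minus the sum of s_i * u_i * v_i^T."""
    return frob([[A[i][j] - sum(s * u[i] * v[j] for s, u, v in triplets)
                  for j in range(len(A[0]))] for i in range(len(A))])


# Rows: pixels (x, y) of a 9x9 support. Columns: 16 orientations in [0, pi).
pixels = [(x, y) for y in range(-4, 5) for x in range(-4, 5)]
thetas = [math.pi * t / 16 for t in range(16)]
A = [[kernel(x, y, th) for th in thetas] for (x, y) in pixels]
errors = [residual(A, truncated_svd(A, R)) / frob(A) for R in (1, 2, 3)]
```

In each triplet, u plays the role of f_i(x, y) and s·v the role of b_i(θ). This particular family is exactly steerable with three basis filters, so the relative error should fall to rounding level at R = 3; for families that are not exactly steerable, the error decays with R instead of vanishing.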

slide-36
SLIDE 36
slide-37
SLIDE 37
slide-38
SLIDE 38

[Perona ’95]

slide-39
SLIDE 39

[Perona ’95]

slide-40
SLIDE 40

[Perona ’95]

slide-41
SLIDE 41

Tensor Factorization

$$k(x, y; \theta) = \sum_{i=1}^{D} b_i(\theta)\, f_i(x)\, g_i(y)$$

  • Not a convex problem
  • Gradient descent

[Shy, Perona ’96]
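A minimal sketch of what gradient descent on this factorization could look like, assuming a plain squared-error objective on a sampled kernel tensor T[θ][x][y]. The toy tensor, the initial values, the step size and the function names are illustrative assumptions, not details of the cited method. Since the problem is not convex, the result depends on the starting point.

```python
def predict(b, f, g, t, x, y):
    """Sum over i of b_i[t] * f_i[x] * g_i[y]."""
    return sum(bi[t] * fi[x] * gi[y] for bi, fi, gi in zip(b, f, g))


def loss(T, b, f, g):
    return sum((predict(b, f, g, t, x, y) - T[t][x][y]) ** 2
               for t in range(len(T))
               for x in range(len(T[0]))
               for y in range(len(T[0][0])))


def gd_step(T, b, f, g, lr):
    """One gradient-descent step on the squared error; updates b, f, g in place."""
    D = len(b)
    gb = [[0.0] * len(b[0]) for _ in range(D)]
    gf = [[0.0] * len(f[0]) for _ in range(D)]
    gg = [[0.0] * len(g[0]) for _ in range(D)]
    for t in range(len(T)):
        for x in range(len(T[0])):
            for y in range(len(T[0][0])):
                e = 2.0 * (predict(b, f, g, t, x, y) - T[t][x][y])
                for i in range(D):
                    gb[i][t] += e * f[i][x] * g[i][y]
                    gf[i][x] += e * b[i][t] * g[i][y]
                    gg[i][y] += e * b[i][t] * f[i][x]
    for params, grads in ((b, gb), (f, gf), (g, gg)):
        for p, d in zip(params, grads):
            for j in range(len(p)):
                p[j] -= lr * d[j]


# Toy target (assumed): an exactly rank-1 tensor built from known factors.
b_true, f_true, g_true = [1.0, 0.8, 0.6], [0.2, 0.6, 1.0, 0.6], [1.0, 0.5, 0.5, 1.0]
T = [[[bt * fx * gy for gy in g_true] for fx in f_true] for bt in b_true]

# One term (D = 1), started from a constant guess.
b, f, g = [[0.5] * 3], [[0.5] * 4], [[0.5] * 4]
loss_before = loss(T, b, f, g)
for _ in range(300):
    gd_step(T, b, f, g, lr=0.01)
loss_after = loss(T, b, f, g)
```

With D > 1 the terms need distinct starting values, since identical initial terms receive identical gradients and never separate.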

slide-42
SLIDE 42

Including scale by resampling

slide-43
SLIDE 43
slide-44
SLIDE 44

[Manduchi et al. ’98] [cf. Simoncelli et al.]

slide-45
SLIDE 45
slide-46
SLIDE 46

Exploiting Image Statistics

slide-47
SLIDE 47

original

upsampled

sampling the gradient

slide-48
SLIDE 48

[Dollar et al. 2013]

slide-49
SLIDE 49

Gradient histograms

[Dollar et al. 2013]

slide-50
SLIDE 50

Power law feature scaling

slide-51
SLIDE 51

Power law feature scaling
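The slides give only the title here, so the sketch below rests on an assumption: that the power law takes the form f(I_s) ≈ f(I_s') · (s/s')^(−λ), where f is a feature summed over an image resampled to scale s. Under that assumption, λ is the negative slope of a straight-line fit in log-log coordinates, and a feature measured at one scale can be extrapolated to a nearby scale without recomputing it. The data are synthetic and the value λ = 0.1 is arbitrary; neither comes from the talk.

```python
import math


def fit_lambda(scales, features):
    """Least-squares slope of log2(feature) against log2(scale), negated."""
    xs = [math.log2(s) for s in scales]
    ys = [math.log2(v) for v in features]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope


def extrapolate(f_ref, s_ref, s, lam):
    """Predict the feature at scale s from its value at a reference scale s_ref."""
    return f_ref * (s / s_ref) ** (-lam)


# Synthetic measurements (assumed): 8 scales per octave over one octave,
# generated from an exact power law with lambda = 0.1.
scales = [2.0 ** (-k / 8.0) for k in range(9)]
features = [10.0 * s ** (-0.1) for s in scales]

lam = fit_lambda(scales, features)
predicted = extrapolate(features[0], scales[0], 0.5, lam)
```

On real images the fit would carry noise, so in practice λ would be estimated from an ensemble of images rather than a single one.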

slide-52
SLIDE 52

Individual images

[Dollar et al. 2013]

slide-53
SLIDE 53

Fast computations

slide-54
SLIDE 54

Fast computations

[Dollar et al. 2013]

slide-55
SLIDE 55

Performance

[Dollar et al. 2013]

slide-56
SLIDE 56

Conclusions

  • Filtering front-end
  • Need fine sampling of scale, orientation, …
  • Scalable, separable and steerable approximations
  • Exploiting image statistics to extrapolate
  • Fast and accurate detection