Filters and other potions - P. Perona, Caltech - MIT, 21 November 2013



SLIDE 1

Filters and other potions

P. Perona - Caltech

MIT - 21 November 2013

SLIDE 2

SLIDE 3

what where

SLIDE 4

Architectures

SLIDE 5

Architecture 1

Image(s) The vision black box

Ripe bananas

Marble torso

train building

Feature extraction: texture, stereo disparity, color, contrast, motion flow, edgels, …

Surface shape, scene depth, spatial relationships, 3D motion

Grouping: image regions

Perceptual organization:

2.5D sketch: boundaries, junctions, foreground, background

Recognition, surface properties

Image processing; regions and surfaces; objects, verbs, categories…

Motor, cognition

[Marr ’82]

SLIDE 6

features?

Le Corbusier, Villa Savoye http://flickr.com/photos/ikura/1398271367/

SLIDE 7

edges

http://www.iit.edu/~stawraf/perspx.jpg Le Corbusier, Villa Savoye

SLIDE 8

SLIDE 9

SLIDE 10

SLIDE 11

[Fukushima ’80]

Architecture 2

SLIDE 12

SLIDE 13

[DeValois ’85]

SLIDE 14

Column

SLIDE 15

Hypercolumn

SLIDE 16

SLIDE 17

SLIDE 18

Dense sampling

SLIDE 19

translation, rotation invariance

[LeCun et al. 1998]

SLIDE 20

scale invariance

[Lowe 2004]

SLIDE 21

[Hinton et al. ’12]

translation, rotation, scale invariance

SLIDE 22

96 filters; 6 orientations; 2 center-surround; 14 scale samples over 2.2 binary octaves

SLIDE 23

Detection Performance

Caltech pedestrians: 1M frames, 250K hand-annotated

SLIDE 24

Detection Performance

SLIDE 25

Detection Performance

Dollar et al. ’10; Dollar et al. ’08; Viola & Jones ’01; Dalal-Triggs ’05; * Walk et al. ’10

SLIDE 26

filter technology

SLIDE 27

Scale, orientation, elongation, …: lots of CPU cycles

SLIDE 28

how do we make computations efficient?

SLIDE 29

Separability

[Adelson & Bergen, ’85]

Full 2D kernel: cost = M × N

$$R(i, j) = \sum_{h=1}^{M} \sum_{k=1}^{N} k(h, k)\, I(i - h, j - k)$$

Separable kernel: cost = M + N

$$R(i, j) = \sum_{h=1}^{M} \sum_{k=1}^{N} k(h)\, k'(k)\, I(i - h, j - k)$$
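The cost difference on this slide can be checked numerically. The following is a minimal NumPy sketch, not code from the talk: the function names `conv2d_full` and `conv2d_separable` are hypothetical, and only the "valid" region of the convolution is computed. It assumes the 2D kernel is exactly the outer product of a column kernel and a row kernel.

```python
import numpy as np

def conv2d_full(image, kernel):
    """Direct 2D convolution ('valid' region): M*N multiply-adds per output pixel."""
    M, N = kernel.shape
    rows, cols = image.shape[0] - M + 1, image.shape[1] - N + 1
    out = np.zeros((rows, cols))
    for h in range(M):
        for k in range(N):
            r0, c0 = M - 1 - h, N - 1 - k  # image offset matching I(i - h, j - k)
            out += kernel[h, k] * image[r0:r0 + rows, c0:c0 + cols]
    return out

def conv2d_separable(image, col_kernel, row_kernel):
    """Same result when kernel = outer(col_kernel, row_kernel): about M + N multiply-adds per pixel."""
    M, N = len(col_kernel), len(row_kernel)
    rows, cols = image.shape[0] - M + 1, image.shape[1] - N + 1
    tmp = np.zeros((image.shape[0], cols))
    for k in range(N):  # first pass: 1D convolution along each row
        c0 = N - 1 - k
        tmp += row_kernel[k] * image[:, c0:c0 + cols]
    out = np.zeros((rows, cols))
    for h in range(M):  # second pass: 1D convolution along each column
        r0 = M - 1 - h
        out += col_kernel[h] * tmp[r0:r0 + rows, :]
    return out
```

Calling both functions with `kernel = np.outer(col_kernel, row_kernel)` should give the same array up to rounding, while the separable version touches each pixel roughly M + N times instead of M × N.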

SLIDE 30

Separability and decomposition

[Adelson & Bergen, ’85]

SLIDE 31

Steerability

[Freeman & Adelson, ’91]
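A standard illustration of steerability is the first derivative of an isotropic Gaussian: the kernel at any orientation θ is cos θ times the x-derivative plus sin θ times the y-derivative. The sketch below checks this identity numerically. The kernel size, `sigma`, and all function names are assumptions chosen for illustration, not values from the slides.

```python
import numpy as np

def gaussian_derivative_basis(size=9, sigma=1.5):
    """Two basis kernels: x- and y-derivatives of an isotropic Gaussian."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    g = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return -x / sigma ** 2 * g, -y / sigma ** 2 * g

def steer(gx, gy, theta):
    """Synthesize the kernel at orientation theta from the two basis kernels."""
    return np.cos(theta) * gx + np.sin(theta) * gy

def directional_derivative_kernel(theta, size=9, sigma=1.5):
    """Reference: derivative of the same Gaussian taken directly along direction theta."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    u = x * np.cos(theta) + y * np.sin(theta)  # coordinate along the steering direction
    g = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return -u / sigma ** 2 * g
```

Because filtering is linear, the same weights cos θ and sin θ applied to the two filtered images give the response at orientation θ, so only two convolutions are needed for all orientations of this particular kernel.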

SLIDE 32

General decomposition

$$k(x, \theta) = \sum_{i=1} b_i(\theta)\, f_i(x)$$

$$k(x, y) = \sum_{i=1} f_i(x)\, g_i(y)$$

$$k(x, y; \theta) = \sum_{i=1} b_i(\theta)\, f_i(x)\, g_i(y)$$

SLIDE 33

Design?

SLIDE 34

$k(x; \theta)$

$b_i(\theta)$

$\sigma_{i,i}$

$f_i(x)$

$$A = U S V^{T}$$
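One plausible reading of this slide is that the kernel family k(x; θ) is sampled into a matrix A, with one column per orientation, and that the singular value decomposition of A supplies the basis functions f_i(x), the weights σ_i, and the interpolation functions b_i(θ). The sketch below follows that reading. The elongated Gaussian derivative used as the test kernel, its parameters, and the function names are assumptions for illustration only.

```python
import numpy as np

def oriented_kernel(theta, size=15, sigma_u=1.0, sigma_v=3.0):
    """Hypothetical test kernel: derivative of an elongated Gaussian, rotated by theta."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    u = x * np.cos(theta) + y * np.sin(theta)    # across the kernel
    v = -x * np.sin(theta) + y * np.cos(theta)   # along the kernel
    return -u / sigma_u ** 2 * np.exp(-u ** 2 / (2 * sigma_u ** 2) - v ** 2 / (2 * sigma_v ** 2))

def decompose(thetas, size=15):
    """Stack one flattened kernel per orientation into A, then factor A = U S V^T."""
    A = np.stack([oriented_kernel(t, size).ravel() for t in thetas], axis=1)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return A, U, s, Vt
```

With this layout, `U[:, i].reshape(size, size)` plays the role of f_i(x), `s[i]` the role of σ_i, and `Vt[i]` the role of b_i sampled at the chosen orientations.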

SLIDE 35

Approximation

$$K(x, y; \theta) = \sum_{i=1}^{D} b_i(\theta)\, f_i(x, y)$$

$$K(x, y; \theta) \approx \sum_{i=1}^{R} b_i(\theta)\, f_i(x, y)$$

$$R \ll D$$
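How small R can be relative to D can be read off the singular values: for a truncated SVD, the Frobenius-norm error of keeping R terms equals the root sum of squares of the discarded singular values. The sketch below is an illustration under that standard result, not the talk's code. The function names are hypothetical, and `steerable_family` builds an assumed test family (first derivatives of an isotropic Gaussian at D orientations) whose rank is exactly 2.

```python
import numpy as np

def truncate(A, R):
    """Rank-R approximation of A and its relative Frobenius error."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    approx = (U[:, :R] * s[:R]) @ Vt[:R]
    rel_err = np.sqrt(np.sum(s[R:] ** 2) / np.sum(s ** 2))
    return approx, rel_err

def smallest_rank(A, tol):
    """Smallest R whose relative truncation error is at most tol (A must be nonzero)."""
    s = np.linalg.svd(A, compute_uv=False)
    tail = np.sqrt(np.cumsum((s ** 2)[::-1])[::-1])  # tail[R] = error norm when keeping R terms
    rel = np.append(tail, 0.0) / tail[0]             # entries for R = 0 .. D
    return int(np.argmax(rel <= tol))

def steerable_family(D, size=9, sigma=1.5):
    """Assumed test family: Gaussian first-derivative kernels at D orientations, one per column."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    g = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    thetas = np.linspace(0.0, np.pi, D, endpoint=False)
    cols = [(-(x * np.cos(t) + y * np.sin(t)) / sigma ** 2 * g).ravel() for t in thetas]
    return np.stack(cols, axis=1)
```

For the exactly steerable family, `smallest_rank(steerable_family(36), 1e-6)` reports that 2 components reproduce all 36 orientations. Kernels that are not exactly steerable would need more terms for the same tolerance.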

SLIDE 36

SLIDE 37

SLIDE 38

[Perona ’95]

SLIDE 39

[Perona ’95]

SLIDE 40

[Perona ’95]

SLIDE 41

Tensor Factorization

$$k(x, y; \theta) = \sum_{i=1} b_i(\theta)\, f_i(x)\, g_i(y)$$

Not a convex problem
Gradient descent

[Shy, Perona ’96]
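Since the three-factor model has no closed-form solution like the SVD, the slide points to gradient descent. The following is a minimal sketch of that idea on a sampled kernel tensor T[θ, x, y], minimizing the squared reconstruction error. It is not the method of the cited paper: the initialization, the step-halving rule, and every name and default value are assumptions made for illustration.

```python
import numpy as np

def reconstruct(b, f, g):
    """Model tensor: sum over i of b_i(theta) * f_i(x) * g_i(y)."""
    return np.einsum('it,ix,iy->txy', b, f, g)

def fit_factors(T, rank, steps=300, lr=0.02, seed=0):
    """Plain gradient descent on 0.5 * ||model - T||^2, halving the step when the loss rises."""
    rng = np.random.default_rng(seed)
    n_theta, n_x, n_y = T.shape
    b = 0.5 * rng.standard_normal((rank, n_theta))
    f = 0.5 * rng.standard_normal((rank, n_x))
    g = 0.5 * rng.standard_normal((rank, n_y))
    losses = [0.5 * np.sum((reconstruct(b, f, g) - T) ** 2)]
    for _ in range(steps):
        err = reconstruct(b, f, g) - T
        grad_b = np.einsum('txy,ix,iy->it', err, f, g)
        grad_f = np.einsum('txy,it,iy->ix', err, b, g)
        grad_g = np.einsum('txy,it,ix->iy', err, b, f)
        cand = (b - lr * grad_b, f - lr * grad_f, g - lr * grad_g)
        cand_loss = 0.5 * np.sum((reconstruct(*cand) - T) ** 2)
        if cand_loss < losses[-1]:
            b, f, g = cand          # accept the step
            losses.append(cand_loss)
        else:
            lr *= 0.5               # step too large: shrink it and try again
    return b, f, g, losses
```

Because the problem is not convex, the result depends on the starting point. Running from several seeds and keeping the lowest final loss is a common workaround.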

SLIDE 42

Including scale by resampling

SLIDE 43

SLIDE 44

[Manduchi et al. ’98] [cf. Simoncelli et al.]

SLIDE 45

SLIDE 46

Exploiting Image Statistics

SLIDE 47

original

upsampled

sampling the gradient

SLIDE 48

[Dollar et al. 2013]

SLIDE 49

Gradient histograms

[Dollar et al. 2013]

SLIDE 50

Power law feature scaling

SLIDE 51

Power law feature scaling
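The slide text gives only the title, so the following is a sketch under an assumption: that "power law feature scaling" refers to the relation reported in the cited Dollar et al. work, where an aggregate feature value measured at two scales s1 and s2 satisfies f(s1) / f(s2) ≈ (s1 / s2)^(-λ). Under that assumption the exponent λ can be estimated with a straight-line fit in log-log space and then used to extrapolate features to nearby scales. Function names and any data passed in are hypothetical.

```python
import numpy as np

def fit_power_law(scales, values):
    """Least-squares fit of values ~ a * scales**(-lam) in log-log space; returns (lam, a)."""
    slope, intercept = np.polyfit(np.log(scales), np.log(values), 1)
    return -slope, np.exp(intercept)

def extrapolate(value_at_s1, s1, s2, lam):
    """Predict the feature value at scale s2 from its value at scale s1."""
    return value_at_s1 * (s2 / s1) ** (-lam)
```

The design point is that features computed at one scale can be reused to approximate those at neighboring scales, instead of recomputing them from a resampled image at every scale.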

SLIDE 52

Individual images

[Dollar et al. 2013]

SLIDE 53

Fast computations

SLIDE 54

Fast computations

[Dollar et al. 2013]

SLIDE 55

Performance

[Dollar et al. 2013]

SLIDE 56

Conclusions

Filtering front-end
Need fine sampling of scale, orientation, …
Scalable, separable and steerable approximations
Exploiting image statistics to extrapolate
Fast and accurate detection