Image Pyramids: COMPSCI 527 Computer Vision (PowerPoint presentation)



SLIDE 1

Image Pyramids

COMPSCI 527 — Computer Vision

COMPSCI 527 — Computer Vision Image Pyramids 1 / 12
SLIDE 2

Outline

  1. Pyramids and Scale
  2. (Spatial Frequency) Aliasing
  3. Downsampling and Upsampling
  4. Bilinear Interpolation
  5. Gaussian (and Laplacian) Pyramid

SLIDE 3 Pyramids and Scale

Pyramids and Scale

Figure annotation: ↑ smallest denticle we look for

  • Scale:
  • Start with smallest template
  • Look for larger and larger occurrences
  • Larger template ≈ smaller image!
SLIDE 4 Pyramids and Scale

Scale Budgets

  • $n \times n$ image, $k \times k$ template, scaling $s > 1$
  • Processing a large image with progressively larger templates:
    $n^2 (k^2 + k^2 s^2 + k^2 s^4 + \dots) = n^2 k^2 (1 + s^2 + s^4 + \dots)$
  • Series diverges
  • Processing progressively smaller images with a small template:
    $k^2 (n^2 + n^2/s^2 + n^2/s^4 + \dots) = k^2 n^2 (1 + 1/s^2 + 1/s^4 + \dots)$
  • Series converges to $k^2 n^2 s^2 / (s^2 - 1)$
  • For $s = 2$, the series converges to $\frac{4}{3} k^2 n^2$
  • About 33% additional cost relative to processing the original image alone
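
The two budgets above can be checked numerically. The following is a minimal sketch, not part of the slides: the function names and the sample values n = 512, k = 8 are illustrative assumptions, and cost is counted simply as (image pixels) times (template pixels) per level.

```python
def growing_template_cost(n, k, s, levels):
    """Cost of matching ever-larger templates against the full n x n image."""
    return sum(n**2 * (k * s**i) ** 2 for i in range(levels))


def shrinking_image_cost(n, k, s, levels):
    """Cost of matching one k x k template against ever-smaller images."""
    return sum(k**2 * (n / s**i) ** 2 for i in range(levels))


def pyramid_cost_limit(n, k, s):
    """Closed form of the convergent series: k^2 n^2 s^2 / (s^2 - 1)."""
    return k**2 * n**2 * s**2 / (s**2 - 1)


# Illustrative values (assumptions, not taken from the slides).
n, k, s = 512, 8, 2
base = k**2 * n**2  # cost of processing the original image alone

print(shrinking_image_cost(n, k, s, 10) / base)   # approaches 4/3 from below
print(pyramid_cost_limit(n, k, s) / base)         # the limit of the series for s = 2
print(growing_template_cost(n, k, s, 10) / base)  # keeps growing as levels are added
```

Adding levels to the shrinking-image sum barely changes the total, while every extra level of the growing-template sum costs s² times more than the previous one.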

SLIDE 5 Pyramids and Scale

Finer Scales

  • Scaling down by s = 2 every time may be overly aggressive
  • Let φ = 1/s be the downsampling factor
  • For 0 < φ < 1, the image shrinks. For φ > 1, the image grows larger

  • How to downsample (0 < φ < 1)?
  • Two issues: aliasing and non-integer s
SLIDE 6 (Spatial Frequency) Aliasing

Aliasing

  • Even when s is an integer, pure sampling is a bad idea: (spatial frequency) aliasing
  • Colors are sampled at locations on the pixel grid
  • Nothing to do with the scene

Figure panels: Original | Sampled by s = 30, then magnified by 30
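
A one-dimensional analogue makes the aliasing effect concrete. This is a sketch under assumptions that are not in the slides: the signal is a single row of pixels, the frequencies 0.45 and 0.1 cycles per pixel are chosen for illustration, and `sample_every` is a hypothetical helper name.

```python
import math


def sample_every(signal, s):
    """Pure sampling: keep one out of every s values, with no smoothing."""
    return signal[::s]


# A fine pattern: 0.45 cycles per pixel, close to the finest a pixel grid can hold.
fine = [math.cos(2 * math.pi * 0.45 * x) for x in range(200)]

# Keep every other pixel (s = 2).
sampled = sample_every(fine, 2)

# A genuinely coarse pattern: 0.1 cycles per pixel on the sampled grid.
coarse = [math.cos(2 * math.pi * 0.1 * m) for m in range(len(sampled))]

# Largest difference between the sampled fine pattern and the coarse pattern.
max_diff = max(abs(a - b) for a, b in zip(sampled, coarse))
print(max_diff)
```

The difference is at the level of floating-point rounding: after pure sampling, the fine pattern is indistinguishable from a coarse pattern that was never in the scene.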

SLIDE 7 Downsampling and Upsampling

Downsampling = Smoothing + Sampling

  • Smooth with a Gaussian blur kernel first, then sample

Figure panels: Original | Smoothed with σ = 48, then sampled by s = 30, then magnified by 30

  • We lose detail (blur), but that’s the whole point
  • True scale: every pixel in the low-resolution image is a weighted average of pixel values in the original image
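
The smooth-then-sample recipe can be sketched on a single row of pixels. The code below is an illustration, not the course's implementation: the helper names, the kernel cutoff at 3σ, and the choice to repeat border values past the ends are all assumptions. The values s = 30 and σ = 48 are the ones quoted on the slide.

```python
import math


def gaussian_kernel(sigma, truncate=3.0):
    """Normalized 1-D Gaussian weights out to truncate * sigma (an assumed cutoff)."""
    radius = int(math.ceil(truncate * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Gaussian blur; samples past either end repeat the border value (an assumption)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(x + j - radius, 0), last)] for j, w in enumerate(kernel))
        for x in range(len(signal))
    ]


def downsample(signal, s, sigma):
    """Smooth first, then keep one out of every s samples (integer s)."""
    return smooth(signal, sigma)[::s]


# Fine stripes that alternate between 0 and 1 at every pixel.
stripes = [float(x % 2) for x in range(600)]

naive = stripes[::30]                          # pure sampling
proper = downsample(stripes, s=30, sigma=48)   # values quoted on the slide

print(naive[8:12])
print([round(v, 3) for v in proper[8:12]])
```

Pure sampling only ever lands on the dark stripes, so it reports a uniformly dark row. The smoothed version reports values near the true local average away from the borders, because each output value is a weighted average of many input pixels.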

SLIDE 8 Downsampling and Upsampling

Key Questions

  • How much to smooth before resampling?
  • That is, where does σ = 48 come from for φ = 1/30?
  • Lots of theory for the optimal multiplier
  • Depends on various factors (spectral properties of image and noise)
  • We use what works most of the time, empirically
  • Answer: σ ≈ 1.6 s = 1.6/φ
  • How to “take one out of every s pixels” when s = 1/φ is not an integer?
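
As a quick arithmetic check of the empirical rule, the sketch below evaluates σ = 1.6/φ for the slide's case φ = 1/30 and for halving. The function name `smoothing_sigma` and its `multiplier` parameter are hypothetical, introduced only for this illustration.

```python
def smoothing_sigma(phi, multiplier=1.6):
    """Gaussian sigma to apply before shrinking by phi, using sigma = multiplier / phi."""
    if not 0 < phi < 1:
        raise ValueError("smoothing is only needed when 0 < phi < 1")
    return multiplier / phi


print(smoothing_sigma(1 / 30))  # the slide's case, s = 30
print(smoothing_sigma(1 / 2))   # halving the image, s = 2
```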

SLIDE 9 Bilinear Interpolation

Bilinear Interpolation

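  • What does it mean to “take one out of every s pixels” when s = 1/φ is not an integer?

$$\xi = \lfloor x \rfloor, \qquad \eta = \lfloor y \rfloor$$

$$\Delta x = x - \xi, \qquad \Delta y = y - \eta$$

$$I(x, y) = I(\xi, \eta)\,(1 - \Delta x)(1 - \Delta y) + I(\xi + 1, \eta)\,\Delta x\,(1 - \Delta y) + I(\xi, \eta + 1)\,(1 - \Delta x)\,\Delta y + I(\xi + 1, \eta + 1)\,\Delta x\,\Delta y$$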
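
The interpolation formula translates almost term by term into code. This is a minimal sketch with assumptions that the slide does not state: the image is a list of rows indexed as `image[y][x]`, the query point satisfies 0 ≤ x ≤ width − 1 and 0 ≤ y ≤ height − 1, and neighbours that would fall outside the image are clamped to the last row or column.

```python
import math


def bilinear(image, x, y):
    """Interpolate image (a list of rows, indexed image[y][x]) at real-valued (x, y)."""
    height, width = len(image), len(image[0])
    xi, eta = math.floor(x), math.floor(y)   # integer corner: xi = floor(x), eta = floor(y)
    dx, dy = x - xi, y - eta                 # fractional offsets in [0, 1)
    # Clamp the right and bottom neighbours so queries on the last row or
    # column stay inside the image (an assumption, not part of the formula).
    x1 = min(xi + 1, width - 1)
    y1 = min(eta + 1, height - 1)
    return (image[eta][xi] * (1 - dx) * (1 - dy)
            + image[eta][x1] * dx * (1 - dy)
            + image[y1][xi] * (1 - dx) * dy
            + image[y1][x1] * dx * dy)


tiny = [[0.0, 10.0],
        [20.0, 30.0]]
print(bilinear(tiny, 0.5, 0.5))   # centre of the four pixels
print(bilinear(tiny, 0.25, 0.0))  # a quarter of the way along the top row
```

The four weights are non-negative and sum to 1, so the result is always a weighted average of the four surrounding pixel values.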

SLIDE 10 Bilinear Interpolation

Abstracting Pyramid Operations

J = resize(I, φ):

  • If 0 < φ < 1, image shrinks:
    Filter with σ = 1.6/φ, then sample every s = 1/φ > 1 pixels
  • If φ ≥ 1, image grows:
    No filter. Just sample every s = 1/φ ≤ 1 pixels
  • Pyramid operators: Pick a single value of φ ∈ (0, 1), then define
    down(X) = resize(X, φ)
    up(X) = resize(X, 1/φ)
  • up is not the inverse of down:
    Cannot restore lost information
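
The resize abstraction can be sketched on a single row of pixels, where linear interpolation plays the role of bilinear interpolation. Everything beyond the names resize, down, and up is an assumption made for this illustration: the helper names, the 3σ kernel cutoff, the border handling, and the convention that the output has round(length × φ) samples.

```python
import math


def smooth(signal, sigma, truncate=3.0):
    """Gaussian blur with border values repeated past the ends (an assumption)."""
    radius = int(math.ceil(truncate * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(x + i - radius, 0), last)] for i, w in enumerate(weights)) / total
        for x in range(len(signal))
    ]


def sample_at(signal, x):
    """Linear interpolation, the 1-D case of bilinear interpolation."""
    last = len(signal) - 1
    xi = min(int(math.floor(x)), last)
    dx = x - xi
    return signal[xi] * (1 - dx) + signal[min(xi + 1, last)] * dx


def resize(signal, phi):
    """Resample by factor phi: filter only when shrinking, then sample every 1/phi pixels."""
    if phi <= 0:
        raise ValueError("phi must be positive")
    if phi < 1:
        signal = smooth(signal, 1.6 / phi)
    step = 1 / phi
    count = max(1, round(len(signal) * phi))  # output length: an assumed convention
    return [sample_at(signal, i * step) for i in range(count)]


PHI = 0.5  # a single value of phi in (0, 1)


def down(signal):
    return resize(signal, PHI)


def up(signal):
    return resize(signal, 1 / PHI)


stripes = [float(x % 2) for x in range(64)]
round_trip = up(down(stripes))
print(len(down(stripes)), len(round_trip))
print(max(abs(a - b) for a, b in zip(stripes, round_trip)))
```

The round trip returns a row of the original length, but the fine stripes are gone: down blurred them away and up has no way to bring them back.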

SLIDE 11 Gaussian (and Laplacian) Pyramid

A Gaussian Pyramid (φ = 1/2)

  • A lowpass pyramid: Each level contains a subset of the lower spatial frequencies that are in the next-higher resolution level (blurring attenuates high frequencies)
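
A Gaussian pyramid with φ = 1/2 can be sketched by applying down repeatedly. This is an illustration under stated assumptions, not the course's code: images are lists of rows, the blur is separable with σ = 1.6 · 2 = 3.2 following the rule quoted earlier, border pixels are repeated past the edges, and the `min_size` stopping rule is hypothetical.

```python
import math


def gaussian_kernel(sigma, truncate=3.0):
    """Normalized 1-D Gaussian weights out to truncate * sigma (an assumed cutoff)."""
    radius = int(math.ceil(truncate * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Blur one row or column, repeating border values past the ends."""
    radius = len(kernel) // 2
    last = len(values) - 1
    return [
        sum(w * values[min(max(x + j - radius, 0), last)] for j, w in enumerate(kernel))
        for x in range(len(values))
    ]


def blur(image, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(col, kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to rows


def down(image, s=2):
    """Halve the image: blur with sigma = 1.6 * s, then keep every s-th pixel."""
    blurred = blur(image, 1.6 * s)
    return [row[::s] for row in blurred[::s]]


def gaussian_pyramid(image, min_size=4):
    """Apply down repeatedly until the next level would drop below min_size."""
    levels = [image]
    while min(len(levels[-1]), len(levels[-1][0])) >= 2 * min_size:
        levels.append(down(levels[-1]))
    return levels


# A 32 x 32 checkerboard: the highest spatial frequency the grid can hold.
board = [[float((x + y) % 2) for x in range(32)] for y in range(32)]
levels = gaussian_pyramid(board)
print([len(level) for level in levels])
print(round(levels[1][8][8], 3))
```

The checkerboard detail is absent from the second level, whose interior values sit near the average gray of the board, which is the lowpass behavior described above.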

SLIDE 12 Gaussian (and Laplacian) Pyramid

A Laplacian Pyramid (φ = 1/2)

  • A bandpass pyramid, because each level contains a (more or less) separate band of spatial frequencies
  • The Laplacian pyramid is invertible
  • Optional topic, see notes
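
The slides defer the details to the notes, so the sketch below uses the standard construction rather than anything confirmed by this deck: each level stores the difference between a Gaussian level and the upsampled next-coarser level, and the coarsest Gaussian level is kept as is. To stay short it works on a single row of pixels, and its down and up are simplified stand-ins (pair averaging and sample repetition) for the resize-based operators. It also assumes the row length is divisible by 2 raised to the number of levels.

```python
def down(signal):
    """Stand-in for down with phi = 1/2: average neighbouring pairs."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def up(signal):
    """Stand-in for up with phi = 1/2: repeat every sample twice."""
    return [v for v in signal for _ in range(2)]


def laplacian_pyramid(signal, levels):
    """Detail bands from fine to coarse, followed by the coarsest lowpass level."""
    pyramid = []
    current = signal
    for _ in range(levels):
        smaller = down(current)
        # What up(down(...)) could not restore is stored as this level's band.
        pyramid.append([a - b for a, b in zip(current, up(smaller))])
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert the pyramid: upsample the coarse level and add back each band."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = [d + u for d, u in zip(detail, up(current))]
    return current


row = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
pyramid = laplacian_pyramid(row, levels=2)
print([len(level) for level in pyramid])
print(reconstruct(pyramid))
```

Inversion works here even though up is not the inverse of down, because each band stores exactly the information that the down and up round trip loses.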