image pyramids

Image Pyramids: COMPSCI 527 Computer Vision


  1. Image Pyramids (COMPSCI 527, Computer Vision)

  2. Outline
     1 Pyramids and Scale
     2 (Spatial Frequency) Aliasing
     3 Downsampling and Upsampling
     4 Bilinear Interpolation
     5 Gaussian (and Laplacian) Pyramid

  3. Pyramids and Scale
     [Figure: image with an arrow marking the smallest denticle we look for]
     • Scale:
         • Start with the smallest template
         • Look for larger and larger occurrences
     • Larger template ≈ smaller image!

  4. Pyramids and Scale: Scale Budgets
     • n × n image, k × k template, scaling s > 1
     • Processing a large image with progressively larger templates:
       $n^2 (k^2 + k^2 s^2 + k^2 s^4 + \ldots) = n^2 k^2 (1 + s^2 + s^4 + \ldots)$
     • Series diverges
     • Processing progressively smaller images with a small template:
       $k^2 (n^2 + n^2 / s^2 + n^2 / s^4 + \ldots) = k^2 n^2 (1 + 1 / s^2 + 1 / s^4 + \ldots)$
     • Series converges to $k^2 n^2 \, \frac{s^2}{s^2 - 1}$
     • For s = 2, the series converges to $\frac{4}{3} k^2 n^2$
     • About 33% additional cost relative to processing the original image alone
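The two budgets on this slide can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names and the values n = 512, k = 8 are illustrative assumptions, and costs are counted as (image pixels) × (template pixels) per level, as in the formulas above.

```python
def cost_growing_templates(n, k, s, levels):
    """Fixed n x n image, template side k * s**i at level i (assumed cost model)."""
    return sum(n**2 * (k * s**i) ** 2 for i in range(levels))


def cost_shrinking_images(n, k, s, levels):
    """Fixed k x k template, image side n / s**i at level i (assumed cost model)."""
    return sum(k**2 * (n / s**i) ** 2 for i in range(levels))


def pyramid_limit(n, k, s):
    """Closed-form limit of the shrinking-image series: k^2 n^2 s^2 / (s^2 - 1)."""
    return k**2 * n**2 * s**2 / (s**2 - 1)


if __name__ == "__main__":
    n, k, s = 512, 8, 2  # hypothetical sizes, chosen only for illustration
    base = n**2 * k**2   # cost of processing the original image alone
    for levels in (1, 2, 5, 10):
        grow = cost_growing_templates(n, k, s, levels) / base
        shrink = cost_shrinking_images(n, k, s, levels) / base
        print(levels, grow, shrink)
    print("limit / base =", pyramid_limit(n, k, s) / base)
```

The first ratio keeps growing with the number of levels, while the second settles near 4/3 for s = 2, which is the 33% overhead quoted on the slide.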

  5. Pyramids and Scale: Finer Scales
     • Scaling down by s = 2 every time may be overly aggressive
     • Let φ = 1 / s be the downsampling factor
     • For 0 < φ < 1, the image shrinks. For φ > 1, the image grows larger
     • How to downsample (0 < φ < 1)?
     • Two issues: aliasing and non-integer s

  6. (Spatial Frequency) Aliasing
     • Even when s is an integer, pure sampling is a bad idea: (spatial frequency) aliasing
     • Colors are sampled at locations on the pixel grid, which have nothing to do with the scene
     [Figure panels: Original; Sampled by s = 30, then magnified by 30]

  7. Downsampling and Upsampling: Downsampling = Smoothing + Sampling
     • Smooth with a Gaussian blur kernel first, then sample
     [Figure panels: Original; Smoothed with σ = 48, then sampled by s = 30, then magnified by 30]
     • We lose detail (blur), but that's the whole point
     • True scale: every pixel in the low-resolution image is a weighted average of pixel values in the original image
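The difference between pure sampling and smoothing followed by sampling can be sketched on a one-dimensional signal. This example is an illustration and not from the slides: the stripe signal, the function names, the 3σ kernel truncation, and the edge padding are all assumptions. Only s = 30 and σ = 48 come from the slide.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Sampled Gaussian, truncated at 3 sigma (an assumed cutoff), normalized to sum 1."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    return k / k.sum()


def downsample_naive(signal, s):
    """Pure sampling: keep one out of every s samples."""
    return signal[::s]


def downsample_smooth(signal, s, sigma):
    """Gaussian smoothing first, then sampling."""
    k = gaussian_kernel(sigma)
    padded = np.pad(signal, len(k) // 2, mode="edge")  # assumed border handling
    return np.convolve(padded, k, mode="valid")[::s]


# Fine stripes alternating 0, 1, 0, 1, ... (hypothetical test signal)
stripes = np.tile([0.0, 1.0], 1500)
naive = downsample_naive(stripes, 30)
smooth = downsample_smooth(stripes, 30, sigma=48.0)

if __name__ == "__main__":
    print("naive :", naive[:5])       # only even positions are hit
    print("smooth:", smooth[5:10])    # interior samples, away from the borders
```

Pure sampling lands only on the dark stripes, so the result says more about the sampling grid than about the signal. After smoothing, each interior sample is a weighted average of its neighborhood and comes out close to the mean gray level.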

  8. Downsampling and Upsampling: Key Questions
     • How much to smooth before resampling?
     • That is, where does σ = 48 come from for φ = 1 / 30?
     • Lots of theory for the optimal multiplier
     • Depends on various factors (spectral properties of image and noise)
     • We use what works most of the time, empirically
     • Answer: σ ≈ 1.6 s = 1.6 / φ
     • How to “take one out of every s pixels” when s = 1 / φ is not an integer?

  9. Bilinear Interpolation
     • What does it mean to “take one out of every s pixels” when s = 1 / φ is not an integer?
     $\xi = \lfloor x \rfloor, \quad \eta = \lfloor y \rfloor$
     $\Delta x = x - \xi, \quad \Delta y = y - \eta$
     $I(\mathbf{x}) = I(\xi, \eta)\,(1 - \Delta x)(1 - \Delta y)$
     $\qquad + \; I(\xi + 1, \eta)\,\Delta x\,(1 - \Delta y)$
     $\qquad + \; I(\xi, \eta + 1)\,(1 - \Delta x)\,\Delta y$
     $\qquad + \; I(\xi + 1, \eta + 1)\,\Delta x\,\Delta y$
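The interpolation formula translates almost term by term into code. This is a minimal sketch under two assumptions that the slide does not state: the image is stored as nested lists indexed as `I[xi][eta]` to match the slide's $I(\xi, \eta)$, and neighbors that would fall outside the image are clamped to the last valid index.

```python
import math


def bilinear(I, x, y):
    """Interpolate image I at real-valued coordinates (x, y).

    I is indexed as I[xi][eta] (assumed layout). Out-of-range neighbors are
    clamped to the border (assumed policy, not specified on the slide).
    """
    xi, eta = math.floor(x), math.floor(y)
    dx, dy = x - xi, y - eta
    xi1 = min(xi + 1, len(I) - 1)
    eta1 = min(eta + 1, len(I[0]) - 1)
    return (
        I[xi][eta] * (1 - dx) * (1 - dy)
        + I[xi1][eta] * dx * (1 - dy)
        + I[xi][eta1] * (1 - dx) * dy
        + I[xi1][eta1] * dx * dy
    )


if __name__ == "__main__":
    image = [[0.0, 10.0], [20.0, 30.0]]  # tiny hypothetical image
    print(bilinear(image, 0.5, 0.5))     # center of the four pixels: their mean, 15.0
```

At integer coordinates both offsets are zero and the function returns the stored pixel value, so the interpolated image agrees with the original on the pixel grid.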

  10. Bilinear Interpolation: Abstracting Pyramid Operations
     J = resize(I, φ):
     • If 0 < φ < 1, image shrinks: Filter with σ = 1.6 / φ, then sample every s = 1 / φ > 1 pixels
     • If φ ≥ 1, image grows: No filter. Just sample every s = 1 / φ ≤ 1 pixels
     • Pyramid operators: Pick a single value of φ ∈ (0, 1), then define
       down(X) = resize(X, φ)
       up(X) = resize(X, 1 / φ)
     • up is not the inverse of down: Cannot restore lost information
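One way the resize abstraction might look in NumPy is sketched below. Several details are assumptions rather than the course's implementation: the Gaussian is truncated at 3σ, borders are handled by edge replication and index clamping, samples are taken at positions 0, s, 2s, ... up to the last pixel, and the helper names `gaussian_blur`, `sample_positions`, and `bilinear_grid` are hypothetical.

```python
import numpy as np


def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding (kernel truncated at 3 sigma)."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")


def sample_positions(n, s):
    """Positions 0, s, 2s, ... that fit inside [0, n - 1]."""
    count = int(np.floor((n - 1) / s + 1e-9)) + 1
    return np.arange(count) * s


def bilinear_grid(img, rows, cols):
    """Bilinear interpolation of img on the grid rows x cols (clamped at borders)."""
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, img.shape[0] - 1)
    c1 = np.minimum(c0 + 1, img.shape[1] - 1)
    dr = (rows - r0)[:, None]
    dc = (cols - c0)[None, :]
    top = img[np.ix_(r0, c0)] * (1 - dc) + img[np.ix_(r0, c1)] * dc
    bottom = img[np.ix_(r1, c0)] * (1 - dc) + img[np.ix_(r1, c1)] * dc
    return top * (1 - dr) + bottom * dr


def resize(img, phi):
    """Shrink (phi < 1: blur with sigma = 1.6 / phi, then sample) or grow (no filter)."""
    if phi < 1:
        img = gaussian_blur(img, 1.6 / phi)
    s = 1.0 / phi
    return bilinear_grid(img,
                         sample_positions(img.shape[0], s),
                         sample_positions(img.shape[1], s))


PHI = 0.5  # a single value of phi in (0, 1), chosen here for illustration


def down(x):
    return resize(x, PHI)


def up(x):
    return resize(x, 1 / PHI)


if __name__ == "__main__":
    x = np.random.default_rng(0).random((64, 64))
    y = up(down(x))
    print(down(x).shape, y.shape)
    print("max |up(down(x)) - x| =", np.abs(y - x[:y.shape[0], :y.shape[1]]).max())
```

Under this sampling convention the round trip does not even return the original size, and the pixel values differ because the detail removed by the blur is gone, which is the point of the last bullet.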

  11. Gaussian (and Laplacian) Pyramid: A Gaussian Pyramid (φ = 1 / 2)
     • A lowpass pyramid: Each level contains a subset of the lower spatial frequencies that are in the next-higher resolution level (blurring attenuates high frequencies)
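A Gaussian pyramid for φ = 1/2 can be sketched by applying down repeatedly. This is an illustrative sketch, not the course code: because s = 2 is an integer, sampling is done by plain slicing instead of bilinear interpolation, σ = 1.6 s = 3.2 follows the empirical rule from the earlier slide, and the 3σ kernel truncation and edge padding are assumptions.

```python
import numpy as np


def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding (kernel truncated at 3 sigma)."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")


def down(img, s=2):
    """Blur with sigma = 1.6 * s, then keep one out of every s pixels (integer s only)."""
    return gaussian_blur(img, 1.6 * s)[::s, ::s]


def gaussian_pyramid(img, levels):
    """Return [img, down(img), down(down(img)), ...] with the given number of levels."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(down(pyramid[-1]))
    return pyramid


if __name__ == "__main__":
    image = np.random.default_rng(0).random((64, 64))  # hypothetical test image
    for level in gaussian_pyramid(image, 4):
        print(level.shape, "std =", level.std())
```

On a noise image the standard deviation drops sharply from the first level to the second, reflecting that each blur removes high spatial frequencies before sampling.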

  12. Gaussian (and Laplacian) Pyramid: A Laplacian Pyramid (φ = 1 / 2)
     • A bandpass pyramid, because each level contains a (more or less) separate band of spatial frequencies
     • The Laplacian pyramid is invertible
     • Optional topic, see notes
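The slide leaves the construction to the notes, so the following sketch uses the standard definition rather than anything stated here: each band is a Gaussian level minus the upsampled next-coarser level, and the coarsest Gaussian level is stored last. As a further assumption, up is implemented by pixel replication cropped to the target shape (not the bilinear resize(X, 1 / φ) of the earlier slide), so that shapes match exactly. Invertibility does not depend on that choice, because reconstruction adds back exactly what was subtracted.

```python
import numpy as np


def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding (kernel truncated at 3 sigma)."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")


def down(img):
    """phi = 1/2: blur with sigma = 1.6 * 2, then keep every other pixel."""
    return gaussian_blur(img, 3.2)[::2, ::2]


def up(img, shape):
    """Assumed upsampling: replicate each pixel 2 x 2, then crop to the target shape."""
    big = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return big[:shape[0], :shape[1]]


def laplacian_pyramid(img, levels):
    """Bands G_i - up(G_{i+1}) for each level, followed by the coarsest Gaussian level."""
    gauss = [img]
    for _ in range(levels - 1):
        gauss.append(down(gauss[-1]))
    bands = [g - up(g_next, g.shape) for g, g_next in zip(gauss[:-1], gauss[1:])]
    return bands + [gauss[-1]]


def reconstruct(pyramid):
    """Invert the pyramid: start from the coarsest level and add the bands back."""
    img = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        img = band + up(img, band.shape)
    return img


if __name__ == "__main__":
    image = np.random.default_rng(0).random((64, 64))  # hypothetical test image
    pyr = laplacian_pyramid(image, 4)
    print([p.shape for p in pyr])
    print("max reconstruction error:", np.abs(reconstruct(pyr) - image).max())
```

The reconstruction error is at the level of floating-point rounding, in contrast with the Gaussian pyramid, where up cannot undo down.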
