Lecture 10: Stereo

Grad student extension ideas for problem set 2
• Implement textons approach for texture recognition [Leung & Malik]
  – Possible data sources: Vistex, Curet databases
• Build a shape-based object detector using the generalized Hough transform
• Clustering approach to video shot boundary detection
• Build a deformable contour tracker

Lecture 10: Stereo
Tuesday, Oct 2

Exam
• Next Tuesday, Oct 9, in class
• Bring one handwritten 8.5 x 11”, one-sided sheet with any notes
• Closed book/laptop/calculator

Review all material covered so far
• Image formation
  – Perspective, orthographic projection properties, equations, effects
  – Pinhole cameras
  – Thin lens
  – Field of view, depth of field
• Color
  – BRDF
  – Spectral power distribution
  – Color mixing
  – Color matching
  – Color spaces
  – Human perception
• Binary image analysis
  – Histograms and thresholding
  – Connected components
  – Morphological operators
  – Region properties and invariance
  – Distance transform, Chamfer distance
• Filters
  – Application/effects of
  – Convolution properties
  – Noise models
  – Mean, median, Gaussian, derivative filters
  – Separability
• Edges, pyramids, sampling
  – Image gradients
  – Effects of noise
  – Derivative of Gaussian, Laplacian filters
  – Canny edge detection
  – Corner detection
  – Sampling and aliasing
  – Pyramids: construction and applications
• Texture
  – Analysis vs. synthesis
  – Representations
• Grouping
  – Gestalt principles
  – Clustering: agglomerative, k-means, mean shift, graph-based
  – Graphs and affinity matrices
• Fitting
  – Hough transform
  – Generalized Hough transform
  – Least squares
  – Incremental line fitting, k-means
  – Robust fitting: RANSAC, M-estimators
  – Deformable contours, energy functions
• Stereo vision

Outline
• Brief review of deformable contours
• Fundamentals of stereo vision
• Epipolar geometry

Last time: deformable contours
a.k.a. active contours, snakes
[Figure: contour at its initial, intermediate, and final positions]

Snake energy function
The total energy of the current snake is defined as

$$E_{total} = E_{in} + E_{ex}$$

• Internal energy encourages smoothness or any particular shape
• External energy encourages curve onto image structures (e.g. image edges)
• Internal energy incorporates prior knowledge about object boundary, which allows a boundary to be extracted even if some image data is missing
• We will want to iteratively minimize this energy for a good fit between the deformable contour and the target shape in the image

Discrete energy terms
• If the curve is represented by n points

$$\nu_i = (x_i, y_i), \quad i = 0 \ldots n-1$$

$$\frac{d\nu}{ds} \approx \nu_{i+1} - \nu_i$$

$$\frac{d^2\nu}{ds^2} \approx (\nu_{i+1} - \nu_i) - (\nu_i - \nu_{i-1}) = \nu_{i+1} - 2\nu_i + \nu_{i-1}$$

$$E_{in} = \sum_{i=0}^{n-1} \alpha\,|\nu_{i+1} - \nu_i|^2 + \beta\,|\nu_{i+1} - 2\nu_i + \nu_{i-1}|^2$$

• First term (elasticity, tension): want to favor close points
• Second term (stiffness, curvature): want to favor smoothly shaped curve (not corners)

Discrete energy terms
• An external energy term for a (discrete) snake based on image edge

$$E_{ex} = -\sum_{i=0}^{n-1} \left( |G_x(x_i, y_i)|^2 + |G_y(x_i, y_i)|^2 \right)$$

Energy minimization
• Many algorithms proposed to fit deformable contours
  – Greedy search
  – Gradient descent
  – Dynamic programming (for 2d snakes)

Problems with snakes
• Depends on number and spacing of control points
• Snake may oversmooth the boundary
• Not trivial to prevent curve self intersecting
• Cannot follow topological changes of objects

Problems with snakes
• May be sensitive to initialization, get stuck in local minimum
• Accuracy (and computation time) depends on the convergence criteria used in the energy minimization technique
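As a rough illustration of the discrete energy terms from the snake slides, the sketch below evaluates E_total = E_in + E_ex for one contour. It is only a sketch under stated assumptions: the function name `snake_energy`, the list-of-lists gradient images indexed as `[y][x]`, integer pixel coordinates, and the closed-contour wraparound at the ends are all choices made here, not details given in the lecture.

```python
def snake_energy(points, grad_x, grad_y, alpha=1.0, beta=1.0):
    """Evaluate E_total = E_in + E_ex for a discrete snake.

    points : list of (x, y) integer pixel coordinates, nu_0 .. nu_{n-1}
    grad_x, grad_y : 2D lists indexed [y][x] holding the image gradient
                     components G_x and G_y (assumed precomputed)
    alpha, beta : weights of the elasticity and stiffness terms

    Assumption: the contour is closed, so indices wrap around.
    """
    n = len(points)
    e_in = 0.0
    e_ex = 0.0
    for i in range(n):
        x_prev, y_prev = points[(i - 1) % n]
        x, y = points[i]
        x_next, y_next = points[(i + 1) % n]

        # Elasticity: |nu_{i+1} - nu_i|^2 favors closely spaced points.
        dx, dy = x_next - x, y_next - y
        elastic = dx * dx + dy * dy

        # Stiffness: |nu_{i+1} - 2 nu_i + nu_{i-1}|^2 penalizes corners.
        cx = x_next - 2 * x + x_prev
        cy = y_next - 2 * y + y_prev
        stiff = cx * cx + cy * cy

        e_in += alpha * elastic + beta * stiff

        # External term: negative squared gradient magnitude, so strong
        # edges lower the energy.
        gx, gy = grad_x[y][x], grad_y[y][x]
        e_ex -= gx * gx + gy * gy
    return e_in + e_ex


# Tiny example: a unit square on a 2x2 image.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
flat = [[0.0, 0.0], [0.0, 0.0]]   # no image gradient anywhere
edges = [[1.0, 1.0], [1.0, 1.0]]  # unit horizontal gradient everywhere

energy_no_edges = snake_energy(square, flat, flat)
energy_on_edges = snake_energy(square, edges, flat)
```

A greedy minimizer would call a function like this repeatedly, moving one control point at a time to the neighboring pixel that lowers the total energy.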

Problems with snakes
• External energy: snake does not really “see” object boundaries in the image unless it gets very close to them.
• Image gradients ∇I are large only directly on the boundary

Depth unavailable in single views
[Figure: scene points P1 and P2 project through the optical center to the same image point, P1’ = P2’]

What cues can indicate 3d shape?
• Shading [Figure from Prados & Faugeras 2006]
• Focus/Defocus [Figure from H. Jin and P. Favaro, 2002]
• Texture [From A.M. Loh. The recovery of 3-D structure using visual texture patterns. PhD thesis]

• Motion
  http://www.brainconnection.com/teasers/?main=illusion/motion-shape

Estimating scene shape
• Shape from X: Shading, Texture, Focus, Motion…
• Stereo:
  – shape from motion between two views
  – infer 3d shape of scene from two (multiple) images from different viewpoints
Figures from L. Zhang

Accommodation and focus
The lens modifies the image focus by adjusting its focal length.

Fixation, convergence
[Figure]

Human stereopsis: disparity
Disparity occurs when eyes verge on one object; others appear at different visual angles.
From Palmer, “Vision Science”, MIT Press

Human stereopsis: disparity
Disparity: d = r − l = D − F.
[Figure label: d = 0]
Adapted from M. Pollefeys

Random dot stereograms
• Julesz 1960: Do we identify local brightness patterns before fusion (monocular process) or after (binocular)?
• To test: pair of synthetic images obtained by randomly spraying black dots on white objects
Forsyth & Ponce

Random dot stereograms
• When viewed monocularly, they appear random; when viewed stereoscopically, see 3d structure.
• Conclusion: human binocular fusion not directly associated with the physical retinas; must involve the central nervous system
• Imaginary “cyclopean retina” that combines the left and right image stimuli as a single unit
From Palmer, “Vision Science”, MIT Press
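One common way to build such a pair can be sketched as follows: fill the left image with random dots, copy it to the right image, shift a central square horizontally by a few pixels, and fill the uncovered strip with fresh random dots. The function name, image size, square size, and shift below are illustrative assumptions, not values from the lecture; the direction of the shift decides whether the square appears in front of or behind the background when fused.

```python
import random


def random_dot_stereogram(size=64, square=24, shift=4, seed=0):
    """Return (left, right) binary images as lists of rows.

    The right image equals the left one except that a central square is
    displaced `shift` pixels to the left, and the strip it uncovers is
    refilled with new random dots. All parameters are illustrative.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]

    start = (size - square) // 2
    stop = start + square
    if start - shift < 0:
        raise ValueError("shift too large for this image and square size")

    for y in range(start, stop):
        # Copy the square from the left image into its shifted position.
        for x in range(start, stop):
            right[y][x - shift] = left[y][x]
        # Refill the strip uncovered by the shift with fresh random dots,
        # so neither image alone reveals the square.
        for x in range(stop - shift, stop):
            right[y][x] = rng.randint(0, 1)
    return left, right


left_img, right_img = random_dot_stereogram()
```

Viewed one at a time, both images are plain random texture; the square exists only in the horizontal offset between them.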

Generating a random dot stereogram
http://www.wellesley.edu/CS/LiDPC/OnParallaxis/Braunl.paper20.html

Autostereograms
Exploit disparity as depth cue using single image (Single image random dot stereogram, Single image stereogram)
Images from magiceye.com

Stereo photography and stereo viewers
Take two pictures of the same subject from two slightly different viewpoints and display so that each eye sees only one of the images.
Invented by Sir Charles Wheatstone, 1838
Image courtesy of fisher-price.com
http://www.johnsonshawmuseum.org

Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923
http://www.johnsonshawmuseum.org
http://www.well.com/~jimg/stereo/stereo_list.html

Stereo in machine vision systems
• Left: The Stanford cart sports a single camera moving in discrete increments along a straight line and providing multiple snapshots of outdoor scenes
• Right: The INRIA mobile robot uses three cameras to map its environment
Forsyth & Ponce

Stereo

Multi-view geometry
• Main issues
  – Geometry: what information is available, how do the camera views relate?
  – Correspondences: what feature in view 1 corresponds to feature in view 2?
  – Triangulation, reconstruction: inference in presence of noise
Slide credit: T. Darrell

Camera parameters
• Extrinsic: Camera frame ↔ Reference frame (world frame)
• Intrinsic: Image coordinates relative to camera ↔ Pixel coordinates
• Extrinsic params: rotation matrix and translation vector
• Intrinsic params: focal length, pixel sizes (mm), image center point, radial distortion parameters

Geometry for a simple stereo system
• First, assuming parallel optical axes, known camera parameters
• Parameters in this case:
  – Camera centers (O_l, O_r)
  – Focal length (f)
  – Baseline (T)

Geometry for a simple stereo system
• Similar triangles (p_l, P, p_r) and (O_l, P, O_r):

$$\frac{T + x_l - x_r}{Z - f} = \frac{T}{Z}$$

$$Z = f\,\frac{T}{d}, \quad \text{where } d = x_r - x_l$$

Stereo constraints
• Given p in left image, where can corresponding point p’ be?
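The depth relation for the parallel-axis case can be checked numerically. The sketch below uses the slide's convention d = x_r − x_l; the function name and the sample values (f = 8 mm, T = 120 mm, image coordinates in mm) are made up for illustration. Focal length and image coordinates must share one unit for the similar-triangles relation to hold as written.

```python
import math


def depth_from_disparity(x_left, x_right, focal_length, baseline):
    """Depth Z = f * T / d with d = x_right - x_left (slide convention).

    focal_length and the image coordinates share one unit (here mm);
    Z comes out in the unit of the baseline. Zero disparity is treated
    as a point at infinity.
    """
    d = x_right - x_left
    if d == 0:
        return math.inf
    return focal_length * baseline / d


# Illustrative values, not taken from the lecture.
f = 8.0       # focal length, mm
T = 120.0     # baseline, mm
x_l = 1.0     # image x-coordinate in the left view, mm
x_r = 1.5     # image x-coordinate in the right view, mm

Z = depth_from_disparity(x_l, x_r, f, T)

# Both sides of the similar-triangles relation (T + x_l - x_r)/(Z - f) = T/Z.
lhs = (T + x_l - x_r) / (Z - f)
rhs = T / Z
```

Because Z varies as 1/d, a fixed error in measured disparity produces a depth error that grows with distance, which is why a wider baseline helps for far scenes.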

Stereo constraints
• Given p in left image, where can corresponding point p’ be?

Epipolar geometry
Now the optical axes are not necessarily parallel.
• Baseline: line joining the camera centers
• Epipole: point of intersection of baseline with the image plane
• Epipolar plane: plane containing baseline
• Epipolar line: intersection of epipolar plane with the image plane
• All epipolar lines intersect at the epipole
• An epipolar plane intersects the left and right image planes in epipolar lines
[Figure labels: Epipolar Plane, Baseline, Epipoles, Epipolar Lines]
Adapted from M. Pollefeys, UNC

Epipolar constraint
• Potential matches for p have to lie on the corresponding epipolar line l’.
• Potential matches for p’ have to lie on the corresponding epipolar line l.
Slide credit: M. Pollefeys

Epipolar constraint example
[Figure]
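To make the epipolar constraint concrete, the sketch below computes the epipolar line l’ for a point p in the simple parallel-axis setup from the earlier slides. It relies on the essential matrix E = [t]×R, which these slides do not introduce, so treat it as an outside assumption; the helper names, the normalized (calibrated) image coordinates, and the baseline value are also illustrative. With R equal to the identity and translation along the x-axis, l’ reduces to the horizontal line y’ = y, so matches lie on the same scanline.

```python
def skew(t):
    """Cross-product matrix [t]x such that [t]x v = t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def epipolar_line(essential, p):
    """Line l' = E p in the other image, as coefficients (a, b, c) of
    a*x' + b*y' + c = 0, for p in homogeneous normalized coordinates."""
    return mat_vec(essential, p)


def residual(line, p_prime):
    """Value of l' . p'; zero when p' lies on the epipolar line."""
    return sum(a * b for a, b in zip(line, p_prime))


# Parallel optical axes: R = identity, translation along x by the
# baseline (hypothetical value), so E = [t]x.
baseline = 0.12
E = skew((baseline, 0.0, 0.0))

p = [0.3, 0.2, 1.0]            # point in the left image
line = epipolar_line(E, p)     # epipolar line in the right image

on_line = residual(line, [0.7, 0.2, 1.0])    # same y as p
off_line = residual(line, [0.7, 0.5, 1.0])   # different y
```

Here `on_line` is zero for any x’ as long as y’ equals y, which is the one-dimensional search that the epipolar constraint buys.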
