SLIDE 1

Stereo Vision : 1 ViiHM Mini-Workshop 2015

A Whirlwind Tour

of where we are in

Computational Binocular Stereo Vision

Toby Breckon School of Engineering and Computing Sciences Durham University

Slides: www.durham.ac.uk/toby.breckon/teaching/tutorials/vihm_wks_2015_breckon.pdf

Slide material acknowledgements (some portions): R. Szeliski (Microsoft/Washington), B. Fisher (Edinburgh), O. Hamilton (Cranfield/Durham), J. Xiao, N. Snavely, J. Hays, S. Prince

a beginner's tutorial for the uninitiated

SLIDE 2

Setting the Scene ...

SLIDE 3

the core problem: stereo vision

SLIDE 4

the core problem: stereo vision

  • Binocular Stereo Vision

(i.e. only 2 cameras)

3D scene information implicitly encoded in image differences

Representation: RGB intensity images (noisy)

SLIDE 5

Left

SLIDE 6

Right

SLIDE 7

Stereo Vision – the key principle

image features (e.g. point / line / pixel) will project differently in the left and right images depending on their distance from the camera (or eyes in human vision).

This difference in image position is known as disparity, d = |PL - PR|.

SLIDE 8

Stereo Vision – principle

  • Matching every feature between the left and right images results in a 2D ‘disparity map’ or ‘depth map’ (computed as disparity, d, at every feature position)
  • Real-world 3D information (distances to scene objects) can be recovered from this depth map

SLIDE 9

Concept : depth recovery

Depth of scene object indicated by greyscale value

http://vision.middlebury.edu/stereo/

SLIDE 10

But why is this computationally challenging ?

SLIDE 11

Left

SLIDE 12

Right

SLIDE 13

In reality, images are noisy due to {encoding, sampling, illumination, camera alignment, camera variations, temperature} – thus features appear differently in each image, and simple image matching (most) often fails

SLIDE 14

this is what makes stereo vision challenging

SLIDE 15

Today, almost all computational stereo research addresses the matching problem

[to some degree, at some level]

SLIDE 16

Disparity Vs. Depth

  • Computer Vision people often refer to disparity estimation
    – disparity is a 2D measure of feature displacement between the images (measured in pixels)
  • Biological Vision people often refer to depth perception
    – depth is an axis of positional measurement of distance within the scene (measured in metres / mm / cm)

[figure: relative scene depth, Z, and scene depth ordering by disparity, d]

SLIDE 17

… essentially the same thing

Depth of a scene object, Z, observed to have disparity difference, d, between two stereo images separated by a baseline distance, B, with camera lenses of focal length, f: Z = fB/d.

.... if you have one you can calculate the other
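Plugging numbers in (an illustrative sketch – the focal length, baseline and disparity values below are made up, not from any particular rig):

```python
# Depth from disparity for a rectified stereo rig: Z = f * B / d
# (and conversely d = f * B / Z). All values below are hypothetical.
f = 700.0    # focal length, in pixels
B = 0.12     # baseline, in metres
d = 42.0     # measured disparity, in pixels

Z = f * B / d
print(Z)         # 2.0 (metres)

# ... and back again: given the depth, recover the disparity
d_back = f * B / Z
print(d_back)    # 42.0
```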

SLIDE 18

Stereo : Standard Formulation

  • left / right views at known (calibrated) distance apart (baseline, B)

[figure: camera 1 (left eye) and camera 2 (right eye), separated by baseline B]

SLIDE 19

[figure: point P at depth Z projected through lenses of focal length f onto the left and right image planes (as PL and PR), cameras separated by baseline B]

Point P (in the world) is projected into the left image plane (as PL) and the right image plane (as PR)

P = (X,Y,Z) (in the world) PL=(xL,yL) (in left image) PR=(xR,yR) (in right image)

Stereo Vision – disparity to depth

SLIDE 20

[figure: re-projection of PL into the right image plane, giving disparity d for point P at depth Z]

The re-projection of PL from the left image plane into the right image plane allows us to recover disparity as a pixel distance within the image.

disparity, d =|PL-PR|

P = (X,Y,Z) (in the world) PL=(xL,yL) (in left image) PR=(xR,yR) (in right image)

Stereo Vision – disparity to depth

SLIDE 21

Stereo Vision – disparity to depth

[figure: perspective projection of scene point (X, Y, Z) at depth Z to image position (x, y) via focal length f]

  • Images captured under Perspective Transform

(X,Y,Z) in scene (depth Z)

imaged at position (x,y) on the image plane

determined by the focal length of the camera f (lens to image plane distance)

image inverted during capture (fixed inside camera)

  • Thus in stereo to recover 3D position of P = (X, Y, Z):

depth of a feature, Z, with disparity, d, over a stereo baseline, B:
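Written out, the standard pinhole recovery for these quantities (using the symbols defined above) is:

```latex
Z = \frac{fB}{d}, \qquad X = \frac{x\,Z}{f}, \qquad Y = \frac{y\,Z}{f}
```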

SLIDE 22

Computational Stereo – An Outline

[How do we solve the matching problem ?]

SLIDE 23

Stereo Vision - Overview

Image Capture → Feature Extraction → Feature Matching → Triangulation

  • Image capture – stereo camera setup: two cameras, relative positions known (calibration)
  • Feature extraction – what can we see in each image?
  • Feature matching – can we match features between images?
  • Triangulation – depth recovery from matched features

[figure: 2 stereo cameras viewing calibration target [Lukins '05]]

SLIDE 24

Sparse Image Features

  • State of the Art : feature points

– high dimensional local feature descriptions

(e.g. 128D+)

– considerable research effort

Initial work – [Harris, 1988]; then intensive – [period: 2004 → 2010+]

– robust matching performance

beyond the stereo case

  • considerably beyond (!)
  • strongly invariant (via RANSAC)

– Feature points in a nutshell:

  • pixels described by local gradient histograms
  • normalized for maximal invariance
  • discard pixel regions that are not locally unique

[ SIFT – Lowe, 2004 / SURF – Bay et al., 2006]
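As a toy illustration of the “locally unique” idea, here is a minimal Harris-style corner response in NumPy (a sketch of the classic detector only, not Lowe's or Bay's descriptor pipelines):

```python
import numpy as np

def box_sum(a, r):
    """Sum of array a over a (2r+1)x(2r+1) window at each pixel (zero-padded)."""
    p = np.pad(a, r)
    out = np.zeros_like(a, dtype=float)
    h, w = a.shape
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += p[r + dy : r + dy + h, r + dx : r + dx + w]
    return out

def harris_response(img, k=0.04, r=2):
    """Harris corner response R = det(M) - k * trace(M)^2, where M is the
    structure tensor of local image gradients."""
    Iy, Ix = np.gradient(img.astype(float))
    Sxx = box_sum(Ix * Ix, r)
    Syy = box_sum(Iy * Iy, r)
    Sxy = box_sum(Ix * Iy, r)
    return (Sxx * Syy - Sxy ** 2) - k * (Sxx + Syy) ** 2

# Synthetic image: a bright square gives corners, edges and flat regions.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
# R is large only where the patch is locally unique (the square's corners);
# edges score negatively, flat regions near zero.
```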

SLIDE 25

Harris Feature Points – example - [Fisher / Breckon et al., 2014]

Sparse Image Features

SLIDE 26

Sparse Image Features

  • Under-pins ….
    – 3D reconstruction from tourist photos: http://www.cs.cornell.edu/projects/p2f/
    – object instance detection – [SURF, SIFT et al.]
    – deformable object matching – http://www.cvc.uab.es/~jcrubio/
    – … + object recognition and a whole lot more.

Real-time image mosaicking [Breckon et al., 2010]

SLIDE 27

Readily gives us feature-based stereo (i.e. sparse depth)

e.g. match local unique “corner” feature points (obtain disparity/depth at these points), then interpolate a complete 3D depth solution / object positions etc.

SLIDE 28

Example: sparse stereo for HCI

[Features = red/green blobs]

[source: anon]

SLIDE 29

Example: sparse stereo for stereo odometry

https://www.youtube.com/watch?v=lTQGTbrNssQ [Features = feature points]

SLIDE 30

Reality … nobody really uses sparse stereo any more

[apart from bespoke applications like those just illustrated]

SLIDE 31

.. the world went dense.

SLIDE 32

Dense Stereo Vision

  • Concept: compute depth for each and every scene pixel
SLIDE 33

Key challenge: any pixel in the left could now potentially match to any pixel in the right – this is a lot of matches to evaluate! → a large search space of matches is computationally expensive (and prone to mis-matching errors)

SLIDE 34

Stereo Correspondence Problem

Q: For a given feature in the left, what is the correct correspondence?

  • Different pairings result in different 3D results

inconsistent correspondence = inconsistent 3D (!)

Key problem in all stereo vision approaches

SLIDE 35

In computational stereo vision this is addressed via three aspects:

1. Camera calibration – leading to epipolar geometry
2. Match aggregation – matching regions, not pixels
3. Match optimization – compute many possible matches, then select the best subset that are maximally inter-consistent

SLIDE 36

Epipolar Geometry – reduces matching space

  • Feature pl in the left image lies on a ray r in space

– r projects to an epipolar line e in the right image – along which the matching feature pr must lie

  • If the images are “rectified”, then the epipolar line is the image row – i.e. camera images both perfectly axis aligned

SLIDE 37

Epipolar Geometry – reduces matching space

  • Constrains L → R Correspondence

– reduces 2D search to 1D
– images linked by the fundamental matrix, F
– for matched points, plᵀ F pr = 0
– F generally derived from a prior calibration routine (with a pre-known target)
– points are homogeneous; F is 3x3

  • Match for point pl on ray r (left) must lie on epipolar line e (right).

[figure: left and right image planes with epipolar line e]
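For a rectified rig translated purely along the x-axis, F reduces to the skew-symmetric form [t]×, and the constraint simply says that matched points share an image row. A minimal NumPy sanity check (the F below is this idealized rectified-case matrix, not one recovered from a real calibration):

```python
import numpy as np

# Idealised fundamental matrix for a pure horizontal-baseline (rectified)
# stereo rig: F = [t]_x with translation direction t = (1, 0, 0).
F = np.array([[0.0, 0.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])

pl = np.array([120.0, 80.0, 1.0])        # homogeneous point in the left image
pr_good = np.array([100.0, 80.0, 1.0])   # same row: a valid match candidate
pr_bad = np.array([100.0, 95.0, 1.0])    # different row: violates the constraint

line = F @ pl          # epipolar line (a, b, c) in the right image
print(pr_good @ line)  # 0.0   -> lies on the epipolar line
print(pr_bad @ line)   # -15.0 -> off the line, rejected as a match
```

Note the epipolar line comes out as (0, -1, 80), which is exactly the image row y = 80, as the rectified-case geometry promises.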

SLIDE 38

Example: rectified images

“rectified” images = the epipolar line is the image row

  • rectification is performed via calibration – thus stereo is reduced to a 1D “scan-line matching” problem

[figure: original vs. rectified image pair]

SLIDE 39

… which aids Dense Stereo Matching

  • We rectify (transform) the images so that these lines correspond to the image rows (or horizontal scan-lines).

[figure: before and after rectification]

SLIDE 40

Example: early dense stereo

…. suffered from vertical disparity streaking due to purely horizontal scan-line (mis-) matching

[ S. Birchfield and C. Tomasi, Depth Discontinuities by Pixel-to-Pixel Stereo, International Journal of Computer Vision, 35(3): 269-293, December 1999 ]

SLIDE 41

Epipolar Geometry is enabled by Stereo Calibration

  • Estimate fundamental matrix F
    – use an object of known (fixed) geometry
    – extract correspondences (pl, pr) between object features over N stereo pairs
    – use an optimization routine to recover F such that the squared error e = Σ (plᵀ F pr)² is minimized
  • Example: use a chess-board
    – easy to detect, known dimensions
    – easy to assign correspondences
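One classical instance of the “optimization routine” step is the linear eight-point algorithm: stack one linear constraint per correspondence and take the least-squares null vector. A sketch on synthetic rectified correspondences (the point data is made up; a real calibration would use detected chess-board corners):

```python
import numpy as np

def estimate_F(pl, pr):
    """Linear eight-point estimate of the fundamental matrix from N >= 8
    correspondences (rows of pl, pr are (x, y) pixel positions).
    Minimises the squared error sum (pr^T F pl)^2 subject to ||F|| = 1."""
    A = np.array([[xr * xl, xr * yl, xr, yr * xl, yr * yl, yr, xl, yl, 1.0]
                  for (xl, yl), (xr, yr) in zip(pl, pr)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)   # least-squares null vector of the constraints
    U, s, Vt = np.linalg.svd(F)
    s[2] = 0.0                 # enforce rank 2: a valid F is singular
    return U @ np.diag(s) @ Vt

# Synthetic rectified correspondences: matches differ only by a random
# horizontal disparity, so each pair shares its row (y) coordinate.
rng = np.random.default_rng(2)
pl = rng.uniform(0.0, 100.0, (12, 2))
pr = pl.copy()
pr[:, 0] -= rng.uniform(1.0, 10.0, 12)   # horizontal disparity only

F = estimate_F(pl, pr)
residuals = [np.r_[q, 1.0] @ F @ np.r_[p, 1.0] for p, q in zip(pl, pr)]
# all residuals ~0: every input match satisfies the epipolar constraint
```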

SLIDE 42

Today, stereo camera calibration is considered a solved problem

[almost everyone uses Z. Zhang, “A flexible new technique for camera calibration”, IEEE Trans. Pattern Anal. and Mach. Intell., 22(11):1330-1334, 2000]
SLIDE 43

From Sparse to Dense 3D ...

  • Knowledge of the fundamental matrix, F, allows us to recover depth at each and every pixel between a given image pair.

Disparity map D(x,y)

SLIDE 44

Dense Stereo Matching

  • We can then scan-line match each and every pixel ….

SLIDE 45

Using, for example, simple pixel matching

  • Find matches based on a local pixel neighbourhood “match score”
    – e.g. ZNCC (Zero-mean Normalized Cross-Correlation) [one example of many approaches]

$$\mathrm{ZNCC}(x_1, x_2) = \frac{\sum_i \left( I(x_1 + i) - \bar{I}(x_1) \right) \left( I(x_2 + i) - \bar{I}(x_2) \right)}{\sqrt{\sum_i \left( I(x_1 + i) - \bar{I}(x_1) \right)^2} \, \sqrt{\sum_i \left( I(x_2 + i) - \bar{I}(x_2) \right)^2}}$$

where i ranges over the neighbourhood window and Ī(x) is the window mean around x.
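The same ZNCC score in NumPy (a direct sketch of the formula above; the window radius r is a free parameter):

```python
import numpy as np

def zncc(left, right, x1, x2, y, r=3):
    """Zero-mean Normalized Cross-Correlation between the (2r+1)^2 patch
    centred at (x1, y) in the left image and (x2, y) in the right."""
    a = left[y - r : y + r + 1, x1 - r : x1 + r + 1].astype(float)
    b = right[y - r : y + r + 1, x2 - r : x2 + r + 1].astype(float)
    a = a - a.mean()   # subtract the window means ...
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

# Zero-mean normalization makes the score invariant to gain and offset,
# so a brightness/contrast-shifted copy of a patch still scores 1.0.
rng = np.random.default_rng(0)
left = rng.random((32, 32))
right = 0.5 * left + 0.2       # simulated exposure difference between cameras
print(zncc(left, right, 16, 16, 16))   # ~1.0  (correct correspondence)
```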

SLIDE 46

Correspondence search via minimal matching cost

  • Slide a neighbourhood window along the right scanline and compare its contents with the reference window in the left image
  • Matching cost: SSD, ZNCC, normalized correlation ...

[figure: matching cost vs. disparity along the left/right scanlines]
SLIDE 47

To overcome “streaking” – aggregate matching costs over 2D “support region”

  • Use window, vertical/horizontal cross, or adaptive (similarity/distance) support regions
  • Matching cost (or energy) at (x, y) for disparity d is aggregated over the support region – e.g. match pixel blocks: “semi-global” or “global” block matching (Hirschmuller, 2005)

[figure: 2D window or block; 2D cross]

SLIDE 48

Match aggregation: effects of window size

  • Smaller aggregation window
    + more detail
    − more noise
  • Larger aggregation window
    + smoother disparity maps (hence smoother 3D depth, Z)
    − less detail

[figure: disparity maps with increasing match aggregation region]

SLIDE 49

Correspondence search via minimal matching cost

For every pixel in the left image, find the corresponding pixel in the right image

– matching based on neighbourhood aggregation regions (blocks)
– an optimization problem – find the best set of global matches

SLIDE 50

Match Optimization

  • Disparity space: a “cost space” formulation relating each left pixel to each possible right pixel for each scanline
  • Disparity space image: a 3D volume of the matching cost for each disparity offset at each pixel, m(x, y, d) = C(x, y, d)

[figure: disparity volume slice – matching cost between left and right scan-lines (d = ±10)]
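A minimal disparity-space construction with winner-takes-all selection, on a synthetic constant-disparity pair (per-pixel absolute difference rather than an aggregated block cost, to keep the sketch short):

```python
import numpy as np

def wta_disparity(left, right, max_d):
    """Disparity-space image + winner-takes-all: build the cost volume
    C(y, x, d) = |left(y, x) - right(y, x - d)| and minimise over d."""
    h, w = left.shape
    C = np.full((h, w, max_d + 1), np.inf)
    for d in range(max_d + 1):
        C[:, d:, d] = np.abs(left[:, d:] - right[:, : w - d])
    return C.argmin(axis=2)   # minimal matching cost wins at every pixel

# Synthetic rectified pair: the right view is the left shifted by 3 pixels,
# i.e. a fronto-parallel scene at constant true disparity d = 3.
rng = np.random.default_rng(1)
left = rng.random((16, 32))
right = np.roll(left, -3, axis=1)
disp = wta_disparity(left, right, max_d=8)
# away from the left border (where d = 3 is out of range) the recovered
# disparity is exactly 3 everywhere
```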

SLIDE 51

Match Optimization …..

… over the disparity space image to arrive at a globally consistent solution i.e. minimal matching cost (= minimal cost path within disparity space)

[figure: disparity-space path diagram – left/right scanlines s and t, correspondence cost C_corr and left/right occlusion costs C_occl along the minimal-cost path from p to q]

SLIDE 52

Match Optimization - example

[figure: winner-takes-all (best match window block) result vs. ground truth (true disparity) vs. scene]

SLIDE 53

Match Optimization - example

[figure: dynamic programming result vs. ground truth (true disparity) vs. scene]

SLIDE 54

Match Optimization - example

[figure: graph cuts result vs. ground truth (true disparity) vs. scene]

  • Y. Boykov, O. Veksler, and R. Zabih, “Fast Approximate Energy Minimization via Graph Cuts”, PAMI 2001
  • Graph cuts still considered optimal; features in many current approaches

SLIDE 55

Match Optimization today

Dynamic Programming, Graph Cuts, Belief Propagation, ….

SLIDE 56

Today's recipe for dense stereo

  • 1. construct your disparity space image (see earlier)
  • 2. pick your favourite computational optimizer

(any textbook on computational optimization)

  • 3. Compute dense stereo

[and hope to beat graph cuts!]

SLIDE 57

Common Stereo Pipeline

Compute Aggregated Matching Cost → Construct Disparity Space Image → Perform Match Optimization

Numerous variations possible at each stage ...

SLIDE 58

And so an “obsession” with accuracy grew ...

SLIDE 59

Middlebury Stereo Comparison

  • First de facto standard test set
    – statistical comparison
    – indoor lab conditions
    – on-line “league table” (accuracy only, not speed)
  • Result: “industrial-scale” production of variations around this common pipeline → improved performance “in the lab” [2002 - present]

http://vision.middlebury.edu/stereo/

SLIDE 60

KITTI Stereo Comparison

  • Contemporary application-driven test set
    – speed and accuracy
    – outdoor road conditions
    – pushes the real-time agenda of stereo for autonomous systems

[2012 - present]

http://www.cvlibs.net/datasets/kitti/

SLIDE 61

… just variations (on parts of) a common pipeline

Compute Aggregated Matching Cost → Construct Disparity Space Image → Perform Match Optimization

… numerous variations at each stage.

SLIDE 62

Today, … in research terms dense stereo is deemed: “a road well travelled”

SLIDE 63

Speed Vs. Accuracy

  • Semi-Global Block Matching
    – approximate global disparity optimization via a series of 1D cost paths
    – arguably the first real insight into usable real-time dense stereo [Hirschmuller, 2008]
  • Basis for many contemporary applications; still frequently outperforms rivals in terms of some/all of {speed, depth accuracy, range accuracy}
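The 1D cost-path idea can be sketched in a few lines: aggregate costs along a scanline with a small penalty P1 for ±1 disparity changes and a larger penalty P2 for bigger jumps, sum paths from both directions, then take winner-takes-all. A toy single-scanline sketch, not Hirschmuller's full 8/16-path implementation; the P1, P2 values here are arbitrary:

```python
import numpy as np

def aggregate_dir(C, P1=0.1, P2=1.0):
    """One 1D cost path (left-to-right) along a scanline.
    C has shape (width, ndisp): raw matching cost per pixel and disparity."""
    w, nd = C.shape
    L = np.zeros((w, nd))
    L[0] = C[0]
    for x in range(1, w):
        prev = L[x - 1]
        up = np.concatenate(([np.inf], prev[:-1]))  # came from disparity d-1
        dn = np.concatenate((prev[1:], [np.inf]))   # came from disparity d+1
        best = prev.min()
        L[x] = C[x] + np.minimum.reduce(
            [prev, up + P1, dn + P1, np.full(nd, best + P2)]) - best
    return L

def sgm_scanline(C, P1=0.1, P2=1.0):
    """Sum forward and backward paths (subtracting C once so the data term
    is not double-counted), then winner-takes-all over disparities."""
    S = aggregate_dir(C, P1, P2) + aggregate_dir(C[::-1], P1, P2)[::-1] - C
    return S.argmin(axis=1)

# Scanline whose true disparity is 2 everywhere; the matching cost at
# pixel 4 is corrupted and votes for disparity 0 instead.
C = np.ones((8, 5))
C[:, 2] = 0.0
C[4] = [0.0, 1.0, 1.0, 1.0, 1.0]
print(C.argmin(axis=1))   # raw WTA:       [2 2 2 2 0 2 2 2] -- a spike
print(sgm_scanline(C))    # path-smoothed: [2 2 2 2 2 2 2 2]
```

The smoothness penalties make it cheaper to stay at disparity 2 through the noisy pixel than to jump to 0 and back, which is exactly how the 1D paths suppress scan-line streaking.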

SLIDE 64

Today, where we are now ….

SLIDE 65

Readily available open-source: OpenCV 2.3+

In (every) computer vision lab …

SLIDE 66

In the real world ….

For example, last week in Durham we did this ….

[Hamilton / Breckon, 2015 – Durham]

SLIDE 67

Off-line Dense Stereo Capture

Concept: multi-scale based stereo matching

SLIDE 68

Off-line Dense Stereo Capture

SLIDE 69

Today, a key research driver is autonomy …. {of future road vehicles, robots, boats, air-vehicles … }

SLIDE 70

Research / Example: automotive stereo for future “driverless cars”

[Hamilton / Breckon, 2013]

SLIDE 71

Research / Example: automotive stereo for future “driverless cars”

[Hamilton / Breckon, 2013]

SLIDE 72

Speed Vs. Accuracy

Comparing different match costs, aggregation and optimizations - [Mroz / Breckon, 2012] http://breckon.eu/toby/demos/autostereo/

SLIDE 73

Today, future challenges and research directions ...

SLIDE 74

Multi-modal / Cross-spectral Stereo

From: [Pingerra / Breckon, 2012]

SLIDE 75

Range Accuracy

From: [Hamilton / Breckon, 2013]

SLIDE 76

Stereo Odometry and Mapping

https://www.youtube.com/watch?v=EPTJz7w_AqU

From: [Hamilton / Breckon, 2013] [Geiger et al. 2010]

SLIDE 77

All condition, all weather

From: [Webster / Breckon, 2015]

SLIDE 78

Future stereo vision will most likely advance along these axes of performance

SLIDE 79

Computer Vision – general overview

  • Computer Vision: Algorithms and Applications
    – Richard Szeliski, 2010 (Springer)
  • PDF download: http://szeliski.org/Book/
  • Supporting stereo vision: see Chapter 11
SLIDE 80

Key Publications – a selection

  • D. Scharstein and R. Szeliski, “A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms”, Int'l J. Computer Vision, vol. 47, nos. 1/2/3, pp. 7-42, Apr.-June 2002.
  • V. Kolmogorov and R. Zabih, “Computing visual correspondence with occlusions using graph cuts”, Proc. Eighth IEEE Int. Conf. on Computer Vision (ICCV), vol. 2, 2001.
  • H. Hirschmuller, “Stereo processing by semiglobal matching and mutual information”, IEEE Trans. Pattern Analysis and Machine Intelligence, 30(2), pp. 328-341, 2008.
  • A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? The KITTI vision benchmark suite”, Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2012.
  • A. Geiger, M. Roser, and R. Urtasun, “Efficient large-scale stereo matching”, Computer Vision – ACCV 2010, pp. 25-38, Springer. (ELAS)

SLIDE 81

Final shameless plug ….

  • Dictionary of Computer Vision and Image Processing
    – R.B. Fisher, T.P. Breckon, K. Dawson-Howe, A. Fitzgibbon, C. Robertson, E. Trucco, C.K.I. Williams; Wiley, 2014.

… maybe it will be useful

SLIDE 82

That's all folks ...

Toby Breckon, toby.breckon@durham.ac.uk Slides: www.durham.ac.uk/toby.breckon/teaching/tutorials/vihm_wks_2015_breckon.pdf