Stereo Computer Vision Fall 2018 Columbia University Homework - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Stereo

Computer Vision Fall 2018 Columbia University

slide-2
SLIDE 2

Homework

  • Homework 2 grades are back
  • Median 37/40, std 7.2
  • Homework 3 due now
  • Homework 4 out today
slide-3
SLIDE 3

My Office Hours

  • Now Mondays 5pm-6pm
slide-4
SLIDE 4

Course Evaluations

  • 60% response rate so far
  • Please respond by tomorrow
  • We read all feedback!
slide-5
SLIDE 5

Image Stitching

slide-6
SLIDE 6

Image alignment

Why don’t these images line up exactly?

slide-7
SLIDE 7

Transformation Models

  • Translation only
  • Rigid body (translate+rotate)
  • Similarity (translate+rotate+scale)
  • Affine
  • Homography (projective)
slide-8
SLIDE 8

Camera Projection

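$$
\begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{pmatrix}
=
\underbrace{\begin{pmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{\text{Camera Intrinsics}}
\underbrace{\begin{pmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{pmatrix}}_{\text{Camera Extrinsics}}
\underbrace{\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}}_{\text{World Coordinates}}
$$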
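As a minimal sketch of the projection above, the following applies the extrinsics, then the intrinsics diag(f, f, 1), then divides by the homogeneous coordinate. The function name `project` and the example values are illustrative assumptions, not part of the lecture.

```python
def project(point_world, f, R, t):
    """Project a 3D world point to image coordinates with a pinhole camera."""
    X, Y, Z = point_world
    # Extrinsics: rotate and translate the point into the camera frame.
    cam = [R[i][0] * X + R[i][1] * Y + R[i][2] * Z + t[i] for i in range(3)]
    # Intrinsics diag(f, f, 1): scale the first two coordinates by f.
    x_h, y_h, z_h = f * cam[0], f * cam[1], cam[2]
    # Divide by the homogeneous coordinate to get image coordinates.
    return x_h / z_h, y_h / z_h


# Hypothetical example: camera at the world origin with no rotation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x, y = project((1.0, 2.0, 4.0), f=2.0, R=identity, t=[0.0, 0.0, 0.0])
print(x, y)
```

Doubling the depth of the point halves both image coordinates, which is the perspective effect that stereo later exploits.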

slide-9
SLIDE 9

Camera Matrix

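$$
\begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{pmatrix}
=
\begin{pmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\ C_{21} & C_{22} & C_{23} & C_{24} \\ C_{31} & C_{32} & C_{33} & C_{34} \end{pmatrix}
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}
$$

Mapping points from the world to image coordinates is matrix multiplication in homogeneous coordinates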

slide-10
SLIDE 10

Projection of 3D Plane

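All points on the plane have Z = 0

$$
\begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{pmatrix}
=
\begin{pmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\ C_{21} & C_{22} & C_{23} & C_{24} \\ C_{31} & C_{32} & C_{33} & 1 \end{pmatrix}
\begin{pmatrix} X \\ Y \\ 0 \\ 1 \end{pmatrix}
$$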

Slide credit: Peter Corke

slide-11
SLIDE 11

Projection of 3D Plane

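All points on the plane have Z = 0, so the third column of the camera matrix is multiplied by zero and can be dropped

$$
\begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{pmatrix}
=
\begin{pmatrix} C_{11} & C_{12} & 0 & C_{14} \\ C_{21} & C_{22} & 0 & C_{24} \\ C_{31} & C_{32} & 0 & 1 \end{pmatrix}
\begin{pmatrix} X \\ Y \\ 0 \\ 1 \end{pmatrix}
=
\begin{pmatrix} C_{11} & C_{12} & C_{14} \\ C_{21} & C_{22} & C_{24} \\ C_{31} & C_{32} & 1 \end{pmatrix}
\begin{pmatrix} X \\ Y \\ 1 \end{pmatrix}
$$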

Slide credit: Peter Corke

slide-12
SLIDE 12

Two-views of Plane

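$$
\begin{pmatrix} \tilde{x}_1 \\ \tilde{y}_1 \\ \tilde{z}_1 \end{pmatrix} = H_1 \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix},
\qquad
\begin{pmatrix} \tilde{x}_2 \\ \tilde{y}_2 \\ \tilde{z}_2 \end{pmatrix} = H_2 \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix}
$$

$$
\begin{pmatrix} \tilde{x}_2 \\ \tilde{y}_2 \\ \tilde{z}_2 \end{pmatrix} = H_2 H_1^{-1} \begin{pmatrix} \tilde{x}_1 \\ \tilde{y}_1 \\ \tilde{z}_1 \end{pmatrix}
$$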

Slide credit: Deva Ramanan

slide-13
SLIDE 13

Image Alignment Algorithm

Given images A and B

  • 1. Compute image features for A and B
  • 2. Match features between A and B
  • 3. Compute homography between A and B using least squares on set of matches

What could go wrong?

Slide credit: Noah Snavely

slide-14
SLIDE 14

Outliers

[Figure labels: outliers, inliers]

Slide credit: Noah Snavely

slide-15
SLIDE 15

Robustness

  • Let’s consider a simpler example… linear regression
  • How can we fix this?

Problem: Fit a line to these datapoints

Least squares fit

Slide credit: Noah Snavely

slide-16
SLIDE 16

We need a better cost function…

  • Suggestions?

Slide credit: Noah Snavely

slide-17
SLIDE 17

Counting inliers

Slide credit: Noah Snavely

slide-18
SLIDE 18

Counting inliers

Inliers: 3

Slide credit: Noah Snavely

slide-19
SLIDE 19

Counting inliers

Inliers: 20

Slide credit: Noah Snavely

slide-20
SLIDE 20

Idea

  • Given a hypothesized line
  • Count the number of points that “agree” with the line
    – “Agree” = within a small distance of the line
    – I.e., the inliers to that line
  • For all possible lines, select the one with the largest number of inliers

Slide credit: Noah Snavely

slide-21
SLIDE 21

How do we find the best line?

  • Unlike least-squares, no simple closed-form solution
  • Hypothesize-and-test
    – Try out many lines, keep the best one
    – Which lines?

Slide credit: Noah Snavely

slide-22
SLIDE 22

RANSAC

Algorithm:

  • 1. Sample (randomly) the number of points s required to fit the model
  • 2. Solve for model parameters using samples
  • 3. Score by the fraction of inliers within a preset threshold of the model

Repeat 1-3 until the best model is found with high confidence

(RANdom SAmple Consensus): Fischler & Bolles, 1981.

Slide credit: James Hays
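The three steps above can be sketched for line fitting (s = 2) as follows. This is a minimal illustration under assumptions not in the slides: a fixed iteration count stands in for "until the best model is found with high confidence", the score is the raw inlier count (which ranks models the same way as the inlier fraction), and the names `fit_line`, `ransac_line`, `threshold`, and the synthetic data are made up for the example.

```python
import random


def fit_line(p, q):
    """Line through two points as (a, b, c) with a*x + b*y + c = 0, a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    return a / norm, b / norm, -(a * x1 + b * y1) / norm


def ransac_line(points, threshold=0.5, iterations=200, seed=0):
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)      # 1. sample the minimal set (s = 2)
        if p == q:
            continue                      # degenerate sample, no unique line
        a, b, c = fit_line(p, q)          # 2. solve for the model parameters
        # 3. score: points within `threshold` of the line are inliers
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers


# Synthetic data (assumed): 20 points on y = 2x + 1 plus 5 gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points += [(3.0, 40.0), (7.0, -20.0), (12.0, 60.0), (15.0, -5.0), (1.0, 30.0)]
line, inliers = ransac_line(points)
print(len(inliers), -line[0] / line[1])   # inlier count and recovered slope
```

A common follow-up, not shown here, is to refit the model by least squares on the final inlier set.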

slide-23
SLIDE 23

RANSAC

Algorithm:

  • 1. Sample (randomly) the number of points required to fit the model (s=2)
  • 2. Solve for model parameters using samples
  • 3. Score by the fraction of inliers within a preset threshold of the model

Repeat 1-3 until the best model is found with high confidence

Illustration by Savarese

Line fitting example

slide-24
SLIDE 24

RANSAC

Algorithm:

  • 1. Sample (randomly) the number of points required to fit the model (s=2)
  • 2. Solve for model parameters using samples
  • 3. Score by the fraction of inliers within a preset threshold of the model

Repeat 1-3 until the best model is found with high confidence

Line fitting example

Slide credit: James Hays

slide-25
SLIDE 25

RANSAC

Inliers: N = 6

Algorithm:

  • 1. Sample (randomly) the number of points required to fit the model (s=2)
  • 2. Solve for model parameters using samples
  • 3. Score by the fraction of inliers within a preset threshold of the model

Repeat 1-3 until the best model is found with high confidence

Line fitting example

Slide credit: James Hays

slide-26
SLIDE 26

RANSAC

Inliers: N = 14

Algorithm:

  • 1. Sample (randomly) the number of points required to fit the model (s=2)
  • 2. Solve for model parameters using samples
  • 3. Score by the fraction of inliers within a preset threshold of the model

Repeat 1-3 until the best model is found with high confidence

Slide credit: James Hays

slide-27
SLIDE 27

RANSAC for alignment

Slide credit: Deva Ramanan

slide-28
SLIDE 28

RANSAC for alignment

Slide credit: Deva Ramanan

slide-29
SLIDE 29

RANSAC for alignment

Slide credit: Deva Ramanan

slide-30
SLIDE 30

Implementing image warping

  • Given a coordinate xform (x’,y’) = T(x,y) and a source image f(x,y), how do we compute an xformed image g(x’,y’) = f(x,y), i.e. g(x’,y’) = f(T⁻¹(x’,y’))?

[Figure: source image f(x,y) mapped by T(x,y) to the transformed image g(x’,y’)]

slide-31
SLIDE 31

Forward Warping

  • Send each pixel f(x,y) to its corresponding location (x’,y’) = T(x,y) in g(x’,y’)
  • What if pixel lands “between” two pixels?

[Figure: f(x,y) mapped forward by T(x,y) into g(x’,y’)]

slide-32
SLIDE 32

Inverse Warping

  • Get each pixel g(x’,y’) from its corresponding location (x,y) = T⁻¹(x’,y’) in f(x,y)
  • Requires taking the inverse of the transform
  • What if pixel comes from “between” two pixels?

[Figure: g(x’,y’) mapped back by T⁻¹(x’,y’) into f(x,y)]

slide-33
SLIDE 33

Inverse Warping

  • Get each pixel g(x’,y’) from its corresponding location (x,y) = T⁻¹(x’,y’) in f(x,y)
  • What if pixel comes from “between” two pixels?
  • Answer: resample color value from interpolated (prefiltered) source image

[Figure: g(x’,y’) mapped back by T⁻¹(x’,y’) into f(x,y)]
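A minimal sketch of inverse warping, assuming bilinear interpolation as the resampling scheme (the slide only says "interpolated", and no prefiltering is done here). Images are plain lists of rows; `bilinear`, `inverse_warp`, and the half-pixel shift are illustrative choices.

```python
def bilinear(img, x, y):
    """Sample img at real-valued (x, y); returns 0.0 outside the image."""
    h, w = len(img), len(img[0])
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return 0.0
    x0, y0 = int(x), int(y)                       # top-left neighbour
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0                       # fractional offsets
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom


def inverse_warp(src, t_inv, out_h, out_w):
    """For every output pixel (x', y'), look up the source at T^-1(x', y')."""
    out = []
    for yp in range(out_h):
        row = []
        for xp in range(out_w):
            x, y = t_inv(xp, yp)
            row.append(bilinear(src, x, y))
        out.append(row)
    return out


src = [[0.0, 10.0, 20.0],
       [30.0, 40.0, 50.0]]
# Assumed transform: T shifts by +0.5 px in x, so T^-1(x', y') = (x' - 0.5, y').
warped = inverse_warp(src, lambda xp, yp: (xp - 0.5, yp), 2, 3)
print(warped)
```

Because the loop runs over output pixels, every pixel of g receives exactly one value, which avoids the holes that forward warping can leave.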

slide-34
SLIDE 34

Blending

  • We’ve aligned the images – now what?

Slide credit: Noah Snavely

slide-35
SLIDE 35

Blending

  • Want to seamlessly blend them together

Slide credit: Noah Snavely

slide-36
SLIDE 36

Image Blending

Slide credit: Noah Snavely

slide-37
SLIDE 37

Feathering

[Figure: weighted left image + weighted right image = blended result; weight axis labeled 1]

Slide credit: Noah Snavely
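One way to read the feathering figure is as a per-pixel weighted average whose weight ramps linearly across the overlap. The sketch below blends a single row from two already-aligned images; the linear ramp, the function name `feather_blend`, and the overlap bounds are assumptions made for illustration.

```python
def feather_blend(left, right, overlap_start, overlap_end):
    """Blend two aligned rows of equal length with a linear weight ramp."""
    blended = []
    for x, (l, r) in enumerate(zip(left, right)):
        if x <= overlap_start:
            w = 1.0                               # only the left image
        elif x >= overlap_end:
            w = 0.0                               # only the right image
        else:
            w = (overlap_end - x) / (overlap_end - overlap_start)
        blended.append(w * l + (1.0 - w) * r)
    return blended


left = [100.0] * 6
right = [200.0] * 6
row = feather_blend(left, right, overlap_start=1, overlap_end=5)
print(row)
```

The distance between `overlap_start` and `overlap_end` plays the role of the window size: a wider ramp hides the seam but can produce ghosting, while a narrow one keeps detail but shows the seam.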

slide-38
SLIDE 38

Effect of window size

[Figure: blending weights for the left and right images; weight axis labeled 1]

Slide credit: Noah Snavely

slide-39
SLIDE 39

Effect of window size

[Figure: blending weights; weight axis labeled 1]

Slide credit: Noah Snavely

slide-40
SLIDE 40

Blending

Slide credit: Davis ‘98

slide-41
SLIDE 41

Blending

Slide credit: Olga Russakovsky

slide-42
SLIDE 42

Blending

Slide credit: Davis ‘98

slide-43
SLIDE 43

Stereo

slide-44
SLIDE 44

Stereo vision

~6cm ~50cm

Slide credit: Antonio Torralba

slide-45
SLIDE 45

Why not put our second eye here?

slide-46
SLIDE 46

Stereoscopes: A 19th Century Pastime

slide-47
SLIDE 47

Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923

slide-48
SLIDE 48

Teesta suspension bridge-Darjeeling, India

slide-49
SLIDE 49

"Mark Twain at Pool Table", no date, UCR Museum of Photography

slide-50
SLIDE 50

3D Movies

slide-51
SLIDE 51

Depth without objects

Random dot stereograms (Bela Julesz)

Julesz, 1971


slide-52
SLIDE 52

Stereo

  • Given two images from different viewpoints
    – How can we compute the depth of each point in the image?
    – Based on how much each pixel moves between the two images

slide-53
SLIDE 53

Geometry for a simple stereo system

[Figure labels: f, Z1, X1, xL]

Slide credit: Antonio Torralba

slide-54
SLIDE 54

Geometry for a simple stereo system

[Figure labels: f, Z1, X1, xL; unknown depth Z?]

Slide credit: Antonio Torralba

slide-55
SLIDE 55

Geometry for a simple stereo system

[Figure labels: f, T, Z1, X1, Z2, X2, xL, xR; unknown depth Z?]

Slide credit: Antonio Torralba

slide-56
SLIDE 56

Geometry for a simple stereo system

[Figure labels: f, T, Z1, X1, Z2, X2, xL, xR; unknown depth Z?]

Similar triangles

Slide credit: Antonio Torralba

slide-57
SLIDE 57

Geometry for a simple stereo system

[Figure labels: f, T, Z1, X1, Z2, X2, xL, xR; unknown depth Z?]

Similar triangles

Slide credit: Antonio Torralba

slide-58
SLIDE 58

Geometry for a simple stereo system

[Figure labels: f, T, Z1, X1, Z2, X2, xL, xR; unknown depth Z?]

Similar triangles:

$$\frac{T + x_L - x_R}{Z - f} = \cdots$$

Slide credit: Antonio Torralba

slide-59
SLIDE 59

Geometry for a simple stereo system

[Figure labels: f, T, Z1, X1, Z2, X2, xL, xR; unknown depth Z?]

Similar triangles:

$$\frac{T + x_L - x_R}{Z - f} = \frac{T}{Z}$$

Slide credit: Antonio Torralba

slide-60
SLIDE 60

Geometry for a simple stereo system

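[Figure labels: f, T, Z1, X1, Z2, X2, xL, xR; unknown depth Z?]

Similar triangles:

$$\frac{T + x_L - x_R}{Z - f} = \frac{T}{Z}$$

Solving for Z:

$$Z = f \, \frac{T}{x_R - x_L}$$

where $x_R - x_L$ is the disparity.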

Slide credit: Antonio Torralba
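The depth relation Z = f T / (xR - xL) translates directly into code. This small sketch follows the slide's sign convention for disparity; the function name, parameter names, and example numbers are assumptions for illustration.

```python
def depth_from_disparity(x_left, x_right, focal_length, baseline):
    """Depth Z = f * T / (x_R - x_L) for a simple (rectified) stereo pair."""
    disparity = x_right - x_left
    if disparity == 0:
        return float("inf")   # zero disparity: the point is infinitely far away
    return focal_length * baseline / disparity


# Hypothetical values: f in pixels, baseline T in metres, x in pixels.
z_near = depth_from_disparity(100.0, 107.0, focal_length=700.0, baseline=0.5)
z_far = depth_from_disparity(100.0, 103.5, focal_length=700.0, baseline=0.5)
print(z_near, z_far)
```

Halving the disparity doubles the estimated depth, so a fixed one-pixel matching error matters much more for distant points than for near ones.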

slide-61
SLIDE 61

Epipolar geometry

[Figure: corresponding pixels (x1, y1) and (x2, y1) lie on the same epipolar line]

x2 − x1 = the disparity of pixel (x1, y1)

Two images captured by a purely horizontal translating camera (rectified stereo pair)

Slide credit: Noah Snavely

slide-62
SLIDE 62

Your basic stereo algorithm

For each epipolar line

For each pixel in the left image

  • compare with every pixel on same epipolar line in right image
  • pick pixel with minimum match cost

Improvement: match windows

Slide credit: Noah Snavely
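A minimal sketch of the algorithm above with the window improvement, restricted to one scanline of a rectified pair. It assumes the match cost is SSD over a 1-D window and that the match for left pixel x sits at x + d in the right row with d ≥ 0; `ssd`, `scanline_disparity`, `max_disp`, and the toy rows are illustrative.

```python
def ssd(left_row, right_row, x, d, half):
    """Sum of squared differences between windows centred at x and x + d."""
    total = 0.0
    for k in range(-half, half + 1):
        diff = left_row[x + k] - right_row[x + k + d]
        total += diff * diff
    return total


def scanline_disparity(left_row, right_row, max_disp, half=1):
    n = len(left_row)
    disparities = [0] * n                  # border pixels keep disparity 0
    for x in range(half, n - half):        # for each pixel in the left row
        best_cost, best_d = float("inf"), 0
        for d in range(max_disp + 1):      # compare along the same scanline
            if x + d + half >= n:
                break                      # window would leave the image
            cost = ssd(left_row, right_row, x, d, half)
            if cost < best_cost:           # keep the minimum match cost
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities


# Toy rows (assumed): the right row is the left row shifted by 2 pixels.
left_row = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0]
right_row = [0, 0, 0, 0, 5, 9, 5, 0, 0, 0]
disp = scanline_disparity(left_row, right_row, max_disp=3)
print(disp[2:5])
```

In the flat regions of the toy rows every disparity has the same cost, so the result there is arbitrary; this is the low-contrast failure case that motivates larger windows and smoothness terms.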

slide-63
SLIDE 63

Stereo matching based on SSD

[Figure: SSD match cost plotted against disparity d; the minimum at d_min is the best matching disparity]

Slide credit: Noah Snavely

slide-64
SLIDE 64

Window size

Effect of window size (W = 3 vs. W = 20)

– Smaller window: (+) / (–)
– Larger window: (+) / (–)

Better results with adaptive window

  • T. Kanade and M. Okutomi, A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment, Proc. International Conference on Robotics and Automation, 1991.
  • D. Scharstein and R. Szeliski, Stereo matching with nonlinear diffusion, International Journal of Computer Vision, 28(2):155-174, July 1998.

Slide credit: Noah Snavely

slide-65
SLIDE 65

Stereo results

– Data from University of Tsukuba
– Similar results on other images without ground truth

[Figure labels: Scene, Ground truth]

Slide credit: Noah Snavely

slide-66
SLIDE 66

Results with window search

[Figure labels: Window-based matching (best window size), Ground truth]

Slide credit: Noah Snavely

slide-67
SLIDE 67

Better methods exist...

State of the art method

Boykov et al., Fast Approximate Energy Minimization via Graph Cuts, International Conference on Computer Vision, September 1999.

Ground truth

For the latest and greatest: http://www.middlebury.edu/stereo/

slide-68
SLIDE 68

Stereo as energy minimization

  • What defines a good stereo correspondence?
  • 1. Match quality
    – Want each pixel to find a good match in the other image
  • 2. Smoothness
    – If two pixels are adjacent, they should (usually) move about the same amount

Slide credit: Noah Snavely

slide-69
SLIDE 69

Stereo as energy minimization

  • Find disparity map d that minimizes an energy function
  • Simple pixel / window matching

$C(x, y, d(x, y))$ = SSD distance between windows I(x, y) and J(x + d(x,y), y)

Slide credit: Noah Snavely

slide-70
SLIDE 70

Stereo as energy minimization

[Figure: images I(x, y) and J(x, y), with the slice at y = 141 of C(x, y, d), the disparity space image (DSI), plotted over x and d]

Slide credit: Noah Snavely

slide-71
SLIDE 71

Stereo as energy minimization

[Figure: DSI slice at y = 141 with axes x and d]

Simple pixel / window matching: choose the minimum of each column in the DSI independently:

$$d(x, y) = \arg\min_{d'} C(x, y, d')$$

Slide credit: Noah Snavely

slide-72
SLIDE 72

Greedy selection of best match

Slide credit: Noah Snavely

slide-73
SLIDE 73

Stereo as energy minimization

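  • Better objective function

$$E(d) = \underbrace{E_d(d)}_{\text{match cost}} + \lambda \underbrace{E_s(d)}_{\text{smoothness cost}}$$

Match cost: want each pixel to find a good match in the other image

Smoothness cost: adjacent pixels should (usually) move about the same amount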

Slide credit: Noah Snavely

slide-74
SLIDE 74

Stereo as energy minimization

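match cost: $E_d(d) = \sum_{(x,y)} C(x, y, d(x, y))$

smoothness cost: $E_s(d) = \sum_{(p,q) \in \mathcal{E}} V(d_p, d_q)$

$\mathcal{E}$: set of neighboring pixels (4-connected neighborhood or 8-connected neighborhood)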

Slide credit: Noah Snavely

slide-75
SLIDE 75

Smoothness cost

How do we choose V?

L1 distance: $V(d_p, d_q) = |d_p - d_q|$

“Potts model”: $V(d_p, d_q) = 0$ if $d_p = d_q$, and $1$ otherwise

Slide credit: Noah Snavely

slide-76
SLIDE 76

Dynamic programming

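  • Can minimize this independently per scanline using dynamic programming (DP)
  • Basic idea: incrementally build a table of costs D one column at a time

$D(x, y, i)$: minimum cost of solution such that d(x,y) = i

Base case: $D(0, y, i) = C(0, y, i), \quad i = 0, \dots, L$ (L = max disparity)

Recurrence: $D(x, y, i) = C(x, y, i) + \min_{j \in \{0, \dots, L\}} \left[ D(x - 1, y, j) + \lambda \, |i - j| \right]$, where $\lambda$ weights the smoothness cost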

Slide credit: Noah Snavely
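The per-scanline dynamic program can be sketched as below. It assumes the smoothness term between neighbouring columns is an L1 penalty weighted by `smooth_weight`, i.e. D(x, i) = C(x, i) + min over j of [D(x − 1, j) + smooth_weight · |i − j|]; the function name and the toy cost table are made up for the example.

```python
def dp_scanline(cost, smooth_weight):
    """cost[x][i] is the match cost of disparity i at column x of one scanline.

    Returns the disparity per column that minimises match cost plus
    smooth_weight * |d(x) - d(x - 1)| summed along the scanline.
    """
    n, levels = len(cost), len(cost[0])
    table = [list(cost[0])]                 # base case: D(0, i) = C(0, i)
    back = [[0] * levels]                   # back-pointers for recovery
    for x in range(1, n):
        row, ptr = [], []
        for i in range(levels):
            best_j = min(
                range(levels),
                key=lambda j: table[x - 1][j] + smooth_weight * abs(i - j),
            )
            step = table[x - 1][best_j] + smooth_weight * abs(i - best_j)
            row.append(cost[x][i] + step)   # recurrence
            ptr.append(best_j)
        table.append(row)
        back.append(ptr)
    # Backtrack from the cheapest entry of the last column.
    d = [0] * n
    d[-1] = min(range(levels), key=lambda i: table[-1][i])
    for x in range(n - 1, 0, -1):
        d[x - 1] = back[x][d[x]]
    return d


# Toy DSI slice (assumed): column 2 is noisy and slightly prefers disparity 0.
cost = [[5, 0, 5], [5, 0, 5], [1, 2, 5], [5, 0, 5]]
print(dp_scanline(cost, smooth_weight=0))   # same as greedy per-column minimum
print(dp_scanline(cost, smooth_weight=2))   # smoothness overrides the noisy column
```

With a zero weight the result matches independent per-column selection; a positive weight trades a slightly worse match in one column for a smoother path.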

slide-77
SLIDE 77

Dynamic programming

  • Finds “smooth”, low-cost path through DSI from left to right

[Figure: DSI slice at y = 141 with axes x and d]

Slide credit: Noah Snavely

slide-78
SLIDE 78

Dynamic Programming

slide-79
SLIDE 79

Stereo as a minimization problem

  • The 2D problem has many local minima
    – Gradient descent doesn’t work well
  • And a large search space
    – n x m image w/ k disparities has k^(nm) possible solutions
    – Finding the global minimum is NP-hard in general
  • Good approximations exist… we’ll see this soon

Slide credit: Noah Snavely

slide-80
SLIDE 80

Stereo correspondence constraints

If we see a point in camera 1, are there any constraints on where we will find it on camera 2?

[Figure labels: O, O’, p, p’ ?, Camera 1, Camera 2]

Slide credit: Antonio Torralba

slide-81
SLIDE 81

Epipolar constraint

[Figure labels: O, O’, p, p’ ?]

Slide credit: Antonio Torralba

slide-82
SLIDE 82

Some terminology

[Figure labels: O, O’, p, p’ ?]

Slide credit: Antonio Torralba

slide-83
SLIDE 83

Some terminology

[Figure labels: O, O’, p, p’ ?, baseline]

Baseline: the line connecting the two camera centers

Epipole: point of intersection of baseline with the image plane

Slide credit: Antonio Torralba

slide-84
SLIDE 84

Some terminology

[Figure labels: O, O’, p, p’ ?, baseline, epipole, epipole]

Baseline: the line connecting the two camera centers

Epipole: point of intersection of baseline with the image plane

Slide credit: Antonio Torralba

slide-85
SLIDE 85

Some terminology

[Figure labels: O, O’, p, p’ ?, epipolar plane]

Baseline: the line connecting the two camera centers

Epipolar plane: the plane that contains the two camera centers and a 3D point in the world

Epipole: point of intersection of baseline with the image plane

Slide credit: Antonio Torralba

slide-86
SLIDE 86

Some terminology

[Figure labels: O, O’, p, p’ ?, epipolar line, epipolar line]

Baseline: the line connecting the two camera centers

Epipolar plane: the plane that contains the two camera centers and a 3D point in the world

Epipolar line: intersection of the epipolar plane with each image plane

Epipole: point of intersection of baseline with the image plane

Slide credit: Antonio Torralba

slide-87
SLIDE 87

Epipolar constraint

[Figure labels: O, O’, p, p’ ?, epipolar line]

We can search for matches along epipolar lines

All epipolar lines intersect at the epipoles

Slide credit: Antonio Torralba

slide-88
SLIDE 88

The essential matrix

[Figure labels: O, O’, p, p’]

$$p^T E \, p' = 0$$

E: essential matrix

p, p’: image points in homogeneous coordinates

If we observe a point in one image, its position in the other image is constrained to lie on the line defined by the equation above.

Slide credit: Antonio Torralba
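To make the constraint concrete, the sketch below evaluates pᵀ E p’ for the special case of a purely horizontal translation with no rotation. It assumes the common construction E = [t]ₓ R with R = I (sign and ordering conventions vary between textbooks); `skew`, `epipolar_residual`, and the sample points are illustrative.

```python
def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) applied to v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def epipolar_residual(p, E, p_prime):
    """Evaluate p^T E p' for homogeneous image points p and p'."""
    Ep = [sum(E[i][j] * p_prime[j] for j in range(3)) for i in range(3)]
    return sum(p[i] * Ep[i] for i in range(3))


# Assumed setup: translation along x only, identity rotation, so E = [t]_x.
E = skew((1.0, 0.0, 0.0))
p = (3.0, 5.0, 1.0)
same_row = epipolar_residual(p, E, (7.0, 5.0, 1.0))    # match on the same row
other_row = epipolar_residual(p, E, (7.0, 6.0, 1.0))   # match on another row
print(same_row, other_row)
```

In this configuration the residual reduces to the difference in y between the two points, so the constraint says that matches lie on the same image row, which is the rectified case used in the earlier stereo slides.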

slide-89
SLIDE 89

Real-time stereo

  • Used for robot navigation (and other tasks)

– Several real-time stereo techniques have been developed (most based on simple discrete search)

Nomad robot searches for meteorites in Antarctica

http://www.frc.ri.cmu.edu/projects/meteorobot/index.html

slide-90
SLIDE 90

Stereo reconstruction pipeline

  • Steps
    – Calibrate cameras
    – Rectify images
    – Compute disparity
    – Estimate depth

What will cause errors?

  • Camera calibration errors
  • Poor image resolution
  • Occlusions
  • Violations of brightness constancy (specular reflections)
  • Large motions
  • Low-contrast image regions

slide-91
SLIDE 91

Active stereo with structured light

  • Project “structured” light patterns onto the object
    – simplifies the correspondence problem
    – basis for active depth sensors, such as Kinect and iPhone X (using IR)

[Figure labels: camera 1, camera 2, projector]

Li Zhang’s one-shot stereo

slide-92
SLIDE 92

Active stereo with structured light

https://ios.gadgethacks.com/news/watch-iphone-xs-30k-ir-dots-scan-your-face-0180944/

slide-93
SLIDE 93

Laser scanning

  • Optical triangulation

– Project a single stripe of laser light
– Scan it across the surface of the object
– This is a very precise version of structured light scanning

Digital Michelangelo Project

http://graphics.stanford.edu/projects/mich/

slide-94
SLIDE 94

Laser scanned models

The Digital Michelangelo Project, Levoy et al.

slide-95
SLIDE 95

Laser scanned models

The Digital Michelangelo Project, Levoy et al.

slide-96
SLIDE 96

Laser scanned models

The Digital Michelangelo Project, Levoy et al.

slide-97
SLIDE 97

Laser scanned models

The Digital Michelangelo Project, Levoy et al.