

SLIDE 1

CS4501: Introduction to Computer Vision

Structure from Motion and Optical Flow

Various slides from previous courses by: D.A. Forsyth (Berkeley / UIUC), I. Kokkinos (Ecole Centrale / UCL), S. Lazebnik (UNC / UIUC), S. Seitz (MSR / Facebook), J. Hays (Brown / Georgia Tech), A. Berg (Stony Brook / UNC), D. Samaras (Stony Brook), J. M. Frahm (UNC), V. Ordonez (UVA), Steve Seitz (UW).

SLIDE 2
  • More on Epipolar Geometry
  • Essential Matrix
  • Fundamental Matrix

Last Class

SLIDE 3
  • More on Epipolar Geometry
  • Essential Matrix
  • Fundamental Matrix
  • Optical Flow

Today’s Class

SLIDE 4

Estimating depth with stereo

  • Stereo: shape from “motion” between two views
  • We’ll need to consider:
  • Info on camera pose (“calibration”)
  • Image point correspondences

(figure: scene point, optical center, image plane)

SLIDE 5

Potential matches for x have to lie on the corresponding line l’. Potential matches for x’ have to lie on the corresponding line l.

Key idea: Epipolar constraint


SLIDE 6
  • Baseline – line connecting the two camera centers
  • Epipolar Plane – plane containing the baseline (1D family)
  • Epipoles = intersections of the baseline with the image planes = projections of the other camera center

Epipolar geometry: notation


SLIDE 7
  • Baseline – line connecting the two camera centers
  • Epipolar Plane – plane containing the baseline (1D family)
  • Epipoles = intersections of the baseline with the image planes = projections of the other camera center
  • Epipolar Lines – intersections of the epipolar plane with the image planes (always come in corresponding pairs)

Epipolar geometry: notation
SLIDE 8

Epipolar Geometry: Another example

Credit: William Hoff, Colorado School of Mines

SLIDE 9

Example: Converging cameras

SLIDE 10

Geometry for a simple stereo system

  • First, assuming parallel optical axes, known camera parameters (i.e., calibrated cameras):

SLIDE 11

Simplest Case: Parallel images

  • Image planes of cameras are parallel to each other and to the baseline
  • Camera centers are at same height
  • Focal lengths are the same
  • Then epipolar lines fall along the horizontal scan lines of the images

SLIDE 12
  • Assume parallel optical axes, known camera parameters (i.e., calibrated cameras). What is the expression for Z?
  • Similar triangles (p_l, P, p_r) and (O_l, P, O_r):

Geometry for a simple stereo system

(T + x_r − x_l) / (Z − f) = T / Z

⇒ Z = f · T / (x_l − x_r), where the disparity is x_l − x_r

SLIDE 13

Depth from disparity

image I(x,y) → image I′(x′,y′): disparity map D(x,y), where (x′,y′) = (x + D(x,y), y).
So if we could find the corresponding points in two images, we could estimate relative depth…
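As a concrete sketch (not from the slides), the depth-from-disparity relation Z = B·f/d can be applied to a whole disparity map in NumPy; the baseline and focal length below are made-up toy values:

```python
import numpy as np

def depth_from_disparity(disparity, baseline, focal_length):
    """Apply Z = B * f / d to a disparity map (d in pixels; Z comes out
    in the units of the baseline). Zero disparity maps to infinite depth."""
    d = np.asarray(disparity, dtype=float)
    Z = np.full_like(d, np.inf)
    valid = d > 0
    Z[valid] = baseline * focal_length / d[valid]
    return Z

# Toy values: baseline 0.1 m, focal length 700 px
Z = depth_from_disparity(np.array([[70.0, 35.0]]), 0.1, 700.0)
# 0.1 * 700 / 70 = 1.0 m, 0.1 * 700 / 35 = 2.0 m
```

Note how halving the disparity doubles the depth: disparity is inversely proportional to Z.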

SLIDE 14

(figure: matching cost vs. disparity along the left/right scanlines)

Correspondence search

  • Slide a window along the right scanline and compare contents of that window with the reference window in the left image
  • Matching cost: SSD or normalized correlation
SLIDE 15


Correspondence search

SSD

SLIDE 16


Correspondence search

  • Normalized correlation
SLIDE 17

Basic stereo matching algorithm

  • If necessary, rectify the two stereo images to transform epipolar lines into scanlines

  • For each pixel x in the first image
  • Find corresponding epipolar scanline in the right image
  • Examine all pixels on the scanline and pick the best match x’
  • Compute disparity x–x’ and set depth(x) = B*f/(x–x’)
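The matching step above can be sketched as brute-force block matching in NumPy; the window size and disparity range below are illustrative choices, not values from the slides:

```python
import numpy as np

def disparity_ssd(left, right, patch=5, max_disp=16):
    """Brute-force block matching along scanlines of a rectified pair.

    For each pixel of the left image, slide a window along the same row
    of the right image (disparity d means the match sits d pixels to the
    left) and keep the d with the smallest SSD cost.
    """
    h, w = left.shape
    r = patch // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for yy in range(r, h - r):
        for xx in range(r, w - r):
            ref = left[yy - r:yy + r + 1, xx - r:xx + r + 1].astype(float)
            best_cost, best_d = np.inf, 0
            for d in range(0, min(max_disp, xx - r) + 1):
                cand = right[yy - r:yy + r + 1, xx - d - r:xx - d + r + 1]
                cost = np.sum((ref - cand) ** 2)  # matching cost: SSD
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[yy, xx] = best_d
    return disp
```

On a synthetic pair where the right image is the left shifted by 3 pixels, interior pixels recover disparity 3 exactly; real images need the failure cases on the next slides (texturelessness, occlusion) handled.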
SLIDE 18

Failures of correspondence search

Textureless surfaces; occlusions, repetition; non-Lambertian surfaces, specularities

SLIDE 19

Active stereo with structured light

  • Project “structured” light patterns onto the object
  • Simplifies the correspondence problem
  • Allows us to use only one camera

(figure: camera and projector setup)

  • L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002

SLIDE 20

Kinect and iPhone X Solution

  • Add Texture!
SLIDE 21

Kinect: Structured infrared light

http://bbzippo.wordpress.com/2010/11/28/kinect-in-infrared/

SLIDE 22

iPhone X

SLIDE 23

Basic stereo matching algorithm

  • If necessary, rectify the two stereo images to transform epipolar lines into scanlines

  • For each pixel x in the first image
  • Find corresponding epipolar scanline in the right image
  • Examine all pixels on the scanline and pick the best match x’
  • Compute disparity x–x’ and set depth(x) = B*f/(x–x’)
SLIDE 24

Effect of window size

  • Smaller window
    + More detail
    − More noise
  • Larger window
    + Smoother disparity maps
    − Less detail

W = 3 vs. W = 20

SLIDE 25

Results with window search

Window-based matching vs. ground truth data

SLIDE 26

Better methods exist...

Graph cuts vs. ground truth

For the latest and greatest: http://www.middlebury.edu/stereo/

  • Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001

SLIDE 27

When cameras are not aligned: Stereo image rectification

  • Reproject image planes onto a common plane parallel to the line between optical centers
  • Pixel motion is horizontal after this transformation
  • Two homographies (3x3 transforms), one for each input image

reprojection

  • C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. IEEE Conf. Computer Vision and Pattern Recognition, 1999.
SLIDE 28

Rectification example

SLIDE 29

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 30

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 31

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 32

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 33

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 34

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 35

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 36

Brief Digression: Cross Product as Matrix Multiplication

  • Dot product as matrix multiplication: easy

    [a_x a_y a_z] [b_x b_y b_z]^T = a_x·b_x + a_y·b_y + a_z·b_z

  • Cross product as matrix multiplication:

    a × b = [a]_× b, where [a]_× = [ 0 −a_z a_y ; a_z 0 −a_x ; −a_y a_x 0 ]
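A minimal NumPy sketch of this skew-symmetric ("hat") matrix, checked against np.cross:

```python
import numpy as np

def skew(a):
    """[a]_x: the skew-symmetric matrix with skew(a) @ b == np.cross(a, b)."""
    ax, ay, az = a
    return np.array([[0.0, -az,  ay],
                     [ az, 0.0, -ax],
                     [-ay,  ax, 0.0]])

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
assert np.allclose(skew(a) @ b, np.cross(a, b))
```

Turning the cross product into a matrix is what lets the coplanarity condition on the next slides be written as a single matrix equation.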
SLIDE 37

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 38

The Essential Matrix

SLIDE 39

The Essential Matrix

Credit: William Hoff, Colorado School of Mines

Essential Matrix (Longuet-Higgins, 1981)

SLIDE 40


Epipolar constraint: Calibrated case

  • Intrinsic and extrinsic parameters of the cameras are known, world coordinate system is set to that of the first camera
  • Then the projection matrices are given by K[I | 0] and K′[R | t]
  • We can multiply the projection matrices (and the image points) by the inverse of the calibration matrices to get normalized image coordinates:

x_norm = K^−1 x_pixel = [I | 0] X,   x′_norm = K′^−1 x′_pixel = [R | t] X

SLIDE 41

X

x x’ = Rx+t

Epipolar constraint: Calibrated case

The vectors Rx, t, and x′ are coplanar

[I | 0] (x, 1)^T = x,   [R | t] (x, 1)^T = Rx + t = x′

SLIDE 42

Epipolar constraint: Calibrated case

x′ ⋅ [t × (Rx)] = 0   →   x′^T [t]_× R x = 0

x′ = Rx + t

Recall: a × b = [a]_× b, where [a]_× = [ 0 −a_z a_y ; a_z 0 −a_x ; −a_y a_x 0 ]

The vectors Rx, t, and x′ are coplanar

SLIDE 43

Epipolar constraint: Calibrated case

x′ ⋅ [t × (Rx)] = 0   →   x′^T [t]_× R x = 0   →   x′^T E x = 0, with E = [t]_× R

Essential Matrix (Longuet-Higgins, 1981). The vectors Rx, t, and x′ are coplanar.
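As an illustrative check (the pose and point below are made up, not from the slides), one can build E = [t]_× R and verify the epipolar constraint on a point projected into both cameras:

```python
import numpy as np

def skew(t):
    tx, ty, tz = t
    return np.array([[0.0, -tz,  ty],
                     [ tz, 0.0, -tx],
                     [-ty,  tx, 0.0]])

def essential(R, t):
    """E = [t]_x R, so x2^T E x1 = 0 for normalized correspondences."""
    return skew(t) @ R

# Synthetic pose: small rotation about the z axis plus a translation
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.2, 0.1])
X = np.array([0.5, -0.3, 4.0])   # 3D point in the first camera's frame
x1 = X / X[2]                     # normalized image point, camera 1
X2 = R @ X + t
x2 = X2 / X2[2]                   # normalized image point, camera 2
E = essential(R, t)
```

The product x2^T E x1 vanishes because (RX + t)^T [t]_× (RX) has both terms annihilated by the skew matrix: v^T [t]_× v = 0 and t^T [t]_× w = 0.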

SLIDE 44

Epipolar constraint: Calibrated case

  • E x is the epipolar line associated with x (l′ = E x)
  • Recall: a line is given by ax + by + c = 0, or l^T x = 0, where l = (a, b, c)^T and x = (x, y, 1)^T

x′^T E x = 0

SLIDE 45

Epipolar constraint: Calibrated case

  • E x is the epipolar line associated with x (l′ = E x)
  • E^T x′ is the epipolar line associated with x′ (l = E^T x′)
  • E e = 0 and E^T e′ = 0
  • E is singular (rank two)
  • E has five degrees of freedom

x′^T E x = 0

SLIDE 46

Epipolar constraint: Uncalibrated case

  • The calibration matrices K and K′ of the two cameras are unknown
  • We can write the epipolar constraint in terms of unknown normalized coordinates:

x̂′^T E x̂ = 0, where x̂ = K^−1 x and x̂′ = K′^−1 x′

SLIDE 47

Epipolar constraint: Uncalibrated case

Fundamental Matrix (Faugeras and Luong, 1992)

x̂′^T E x̂ = 0, with x̂ = K^−1 x, x̂′ = K′^−1 x′

⇒ x′^T F x = 0, with F = K′^−T E K^−1
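A small sketch of this change of coordinates (the calibration matrices and points below are made-up values): with F = K′^−T E K^−1, the pixel-coordinate constraint equals the normalized one, since (K′x̂′)^T K′^−T E K^−1 (Kx̂) = x̂′^T E x̂:

```python
import numpy as np

def fundamental_from_essential(E, K1, K2):
    """F = K2^{-T} E K1^{-1}: the epipolar constraint in pixel coordinates,
    so x2_pix^T F x1_pix = x2^T E x1 when x_pix = K x."""
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

# Made-up calibration matrices and normalized points for the check
K1 = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[650.0, 0.0, 300.0], [0.0, 650.0, 220.0], [0.0, 0.0, 1.0]])
E = np.random.default_rng(0).normal(size=(3, 3))
x1 = np.array([0.1, -0.2, 1.0])   # normalized coordinates, camera 1
x2 = np.array([0.3, 0.05, 1.0])   # normalized coordinates, camera 2
F = fundamental_from_essential(E, K1, K2)
```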

SLIDE 48

Epipolar constraint: Uncalibrated case

  • F x is the epipolar line associated with x (l′ = F x)
  • F^T x′ is the epipolar line associated with x′ (l = F^T x′)
  • F e = 0 and F^T e′ = 0
  • F is singular (rank two)
  • F has seven degrees of freedom

x′^T F x = 0, with F = K′^−T E K^−1

SLIDE 49

Estimating the Fundamental Matrix

  • 8-point algorithm
    • Least squares solution using SVD on equations from 8 pairs of correspondences
    • Enforce det(F) = 0 constraint using SVD on F
  • 7-point algorithm
    • Use least squares to solve for the null space (two vectors) using SVD and 7 pairs of correspondences
    • Solve for the linear combination of null space vectors that satisfies det(F) = 0
  • Minimize reprojection error
    • Non-linear least squares

Note: estimation of F (or E) is degenerate for a planar scene.

SLIDE 50

8-point algorithm

  • 1. Solve a system of homogeneous linear equations
    a. Write down the system of equations

x′^T F x = 0, expanded:

x′x·f11 + x′y·f12 + x′·f13 + y′x·f21 + y′y·f22 + y′·f23 + x·f31 + y·f32 + f33 = 0

A f = 0, where row i of A is

[ x′ᵢxᵢ   x′ᵢyᵢ   x′ᵢ   y′ᵢxᵢ   y′ᵢyᵢ   y′ᵢ   xᵢ   yᵢ   1 ]

and f = (f11, f12, f13, f21, f22, f23, f31, f32, f33)^T

SLIDE 51

8-point algorithm

  • 1. Solve a system of homogeneous linear equations
    a. Write down the system of equations
    b. Solve f from Af = 0 using SVD

Matlab:

[U, S, V] = svd(A); f = V(:, end); F = reshape(f, [3 3])';
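The same steps can be written in NumPy. This is a sketch of the plain, unnormalized 8-point algorithm (in practice, Hartley normalization of the points is recommended before building A):

```python
import numpy as np

def eight_point(x1, x2):
    """Plain 8-point algorithm: estimate F with x2_h^T F x1_h = 0.

    x1, x2: (N, 2) arrays of corresponding points, N >= 8.
    Solves A f = 0 via SVD, then enforces the rank-2 constraint on F.
    """
    N = x1.shape[0]
    A = np.zeros((N, 9))
    for i in range(N):
        x, y = x1[i]
        xp, yp = x2[i]
        A[i] = [xp * x, xp * y, xp, yp * x, yp * y, yp, x, y, 1.0]
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)   # f = right singular vector of smallest sigma
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                 # enforce det(F) = 0 (rank two)
    return U @ np.diag(S) @ Vt
```

On noiseless synthetic correspondences this recovers F up to scale; the trailing SVD step is the det(F) = 0 enforcement from the previous slide.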

SLIDE 52

Optical Flow

Slides by Juan Carlos Niebles and Ranjay Krishna

SLIDE 53

From images to videos

  • A video is a sequence of frames captured over time
  • Now our image data is a function of space (x, y) and time (t)
SLIDE 54

Why is motion useful?

SLIDE 55

Why is motion useful?

SLIDE 56

Optical flow

  • Definition: optical flow is the apparent motion of brightness patterns in the image
  • Note: apparent motion can be caused by lighting changes without any actual motion
    • Think of a uniform rotating sphere under fixed lighting vs. a stationary sphere under moving illumination

GOAL: Recover image motion at each pixel from optical flow

Source: Silvio Savarese

SLIDE 57

Picture courtesy of Selim Temizer - Learning and Intelligent Systems (LIS) Group, MIT

Optical flow

Vector field function of the spatio-temporal image brightness variations

SLIDE 58

Estimating optical flow

  • Given two subsequent frames, estimate the apparent motion field u(x,y), v(x,y) between them
  • Key assumptions:
    • Brightness constancy: projection of the same point looks the same in every frame
    • Small motion: points do not move very far
    • Spatial coherence: points move like their neighbors

I(x,y,t–1) → I(x,y,t)

Source: Silvio Savarese

SLIDE 59

Key Assumptions: small motions

SLIDE 60

Key Assumptions: spatial coherence

* Slide from Michael Black, CS143 2003

SLIDE 61

Key Assumptions: brightness constancy

* Slide from Michael Black, CS143 2003

SLIDE 62

  • Brightness Constancy Equation:

I(x, y, t−1) = I(x + u(x,y), y + v(x,y), t)

  • Linearizing the right side using a Taylor expansion:

I(x+u, y+v, t) ≈ I(x, y, t−1) + I_x·u(x,y) + I_y·v(x,y) + I_t

  • Hence I(x+u, y+v, t) − I(x, y, t−1) = I_x·u + I_y·v + I_t, and the constraint is

∇I · [u v]^T + I_t = 0

(I_x is the image derivative along x, I_y along y, I_t the temporal derivative)

Source: Silvio Savarese

The brightness constancy constraint
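A minimal sketch of the derivative computation behind this constraint, using np.gradient for I_x, I_y and a frame difference for I_t (one of several valid finite-difference filter choices):

```python
import numpy as np

def gradients(I0, I1):
    """Derivatives for the brightness constancy constraint
    Ix*u + Iy*v + It = 0, with I0 = frame at t-1 and I1 = frame at t."""
    Iy, Ix = np.gradient(I0)   # np.gradient returns (d/d_row, d/d_col)
    It = I1 - I0               # simple temporal difference
    return Ix, Iy, It
```

On a linear ramp image translated by (u, v), the constraint holds exactly, since the Taylor expansion is exact for a linear function.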

SLIDE 63

Filters used to find the derivatives

I_x, I_y, I_t

slide-64
SLIDE 64
  • How many equations and unknowns per pixel?
    • One equation (this is a scalar equation!), two unknowns (u, v)

∇I · [u v]^T + I_t = 0

Can we use this equation to recover image motion (u, v) at each pixel?

The component of the flow perpendicular to the gradient (i.e., parallel to the edge) cannot be measured: if (u, v) satisfies the equation, so does (u+u′, v+v′) whenever ∇I · [u′ v′]^T = 0

Source: Silvio Savarese

The brightness constancy constraint

SLIDE 65

The aperture problem

Actual motion

Source: Silvio Savarese

SLIDE 66

The aperture problem

Perceived motion

Source: Silvio Savarese

SLIDE 67

The barber pole illusion

http://en.wikipedia.org/wiki/Barberpole_illusion

Source: Silvio Savarese

SLIDE 68

The barber pole illusion

http://en.wikipedia.org/wiki/Barberpole_illusion

Source: Silvio Savarese

SLIDE 69

What we will learn today

  • Optical flow
  • Lucas-Kanade method

Reading: [Szeliski] Chapters: 8.4, 8.5

[Fleet & Weiss, 2005] http://www.cs.toronto.edu/pub/jepson/teaching/vision/2503/opticalFlow.pdf

SLIDE 70

Solving the ambiguity…

  • How to get more equations for a pixel?
  • Spatial coherence constraint:
  • Assume the pixel’s neighbors have the same (u,v)
  • If we use a 5x5 window, that gives us 25 equations per pixel
  • B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674–679, 1981.

Source: Silvio Savarese

SLIDE 71
  • Overconstrained linear system:

Lucas-Kanade flow

Source: Silvio Savarese

SLIDE 72
  • Overconstrained linear system

The summations are over all pixels in the K x K window

Least squares solution for d given by

Source: Silvio Savarese

Lucas-Kanade flow
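A sketch of the per-window least-squares solve (the window gradients are assumed precomputed): one brightness-constancy equation per pixel is stacked into A d = b with d = (u, v), and the eigenvalues of A^T A, which govern solvability, are returned alongside:

```python
import numpy as np

def lucas_kanade_window(Ix, Iy, It):
    """Least-squares Lucas-Kanade solve for one K x K window.

    Each pixel contributes one equation  Ix_i * u + Iy_i * v = -It_i,
    giving an overconstrained system A d = b with d = (u, v).
    Also returns the eigenvalues of A^T A (ascending order).
    """
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)   # (K*K, 2)
    b = -It.ravel()
    d, *_ = np.linalg.lstsq(A, b, rcond=None)        # d = (A^T A)^-1 A^T b
    return d, np.linalg.eigvalsh(A.T @ A)
```

With synthetic gradients consistent with a true flow, the solve recovers it exactly; on real data the eigenvalues decide whether the answer is trustworthy.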

SLIDE 73

Conditions for solvability

  • Optimal (u, v) satisfies the Lucas-Kanade equation

Does this remind you of anything?

When is this solvable?

  • A^T A should be invertible
  • A^T A should not be too small due to noise
    – eigenvalues λ1 and λ2 of A^T A should not be too small
  • A^T A should be well-conditioned
    – λ1/λ2 should not be too large (λ1 = larger eigenvalue)

Source: Silvio Savarese

SLIDE 74
  • Eigenvectors and eigenvalues of A^T A relate to edge direction and magnitude
    • The eigenvector associated with the larger eigenvalue points in the direction of fastest intensity change
    • The other eigenvector is orthogonal to it

M = A^T A is the second moment matrix! (Harris corner detector…)

Source: Silvio Savarese

SLIDE 75

Interpreting the eigenvalues

Classification of image points using eigenvalues of the second moment matrix:
  • “Corner”: λ1 and λ2 are large, λ1 ~ λ2
  • “Flat” region: λ1 and λ2 are small
  • “Edge”: λ1 >> λ2 or λ2 >> λ1

Source: Silvio Savarese

SLIDE 76

Edge

– gradients very large or very small
– large λ1, small λ2

Source: Silvio Savarese

SLIDE 77

Low-texture region

– gradients have small magnitude

– small λ1, small λ2

Source: Silvio Savarese

SLIDE 78

High-texture region

– gradients are different, large magnitudes

– large λ1, large λ2

Source: Silvio Savarese

SLIDE 79

Errors in Lucas-Kanade

What are the potential causes of errors in this procedure?

  • Suppose A^T A is easily invertible
  • Suppose there is not much noise in the image
  • When our assumptions are violated:
    – Brightness constancy is not satisfied
    – The motion is not small
    – A point does not move like its neighbors
      • window size is too large
      • what is the ideal window size?

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

SLIDE 80

Improving accuracy

  • Recall our small motion assumption. This is not exact
    – To do better, we need to add higher order terms back in:
  • This is a polynomial root-finding problem
    – Can solve using Newton’s method (out of scope for this class)
    – Lucas-Kanade method does one iteration of Newton’s method
  • Better results are obtained via more iterations

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

SLIDE 81

Iterative Refinement

  • Iterative Lucas-Kanade Algorithm
    1. Estimate velocity at each pixel by solving Lucas-Kanade equations
    2. Warp I(t−1) towards I(t) using the estimated flow field
       • use image warping techniques
    3. Repeat until convergence

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
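A toy version of the iterative scheme, simplified to a single global translation (u, v) for the whole image rather than per-pixel flow (a deliberate simplification of the slides' per-pixel algorithm); the warp uses bilinear interpolation:

```python
import numpy as np

def bilinear_warp(I, u, v):
    """Sample I at (x + u, y + v) with bilinear interpolation (border clamped)."""
    h, w = I.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    x = np.clip(cols + u, 0, w - 1.001)
    y = np.clip(rows + v, 0, h - 1.001)
    x0, y0 = x.astype(int), y.astype(int)
    dx, dy = x - x0, y - y0
    return (I[y0, x0] * (1 - dx) * (1 - dy) + I[y0, x0 + 1] * dx * (1 - dy)
            + I[y0 + 1, x0] * (1 - dx) * dy + I[y0 + 1, x0 + 1] * dx * dy)

def iterative_lk_translation(I0, I1, iters=15):
    """Iterative Lucas-Kanade for one global translation (u, v).

    Each pass warps I0 by the current estimate, solves the linearized
    brightness-constancy system for an update (du, dv), and accumulates it.
    """
    u = v = 0.0
    for _ in range(iters):
        W = bilinear_warp(I0, -u, -v)        # W(x, y) = I0(x - u, y - v)
        Wy, Wx = np.gradient(W)
        It = I1 - W
        m = (slice(3, -3), slice(3, -3))     # skip warp border artifacts
        A = np.stack([Wx[m].ravel(), Wy[m].ravel()], axis=1)
        b = -It[m].ravel()
        du, dv = np.linalg.lstsq(A, b, rcond=None)[0]
        u, v = u + du, v + dv
    return u, v
```

Re-solving from the warped image is what makes each pass a Newton-style iteration: the linearization is redone around the current estimate, so sub-pixel motions are recovered progressively.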

SLIDE 82

When do the optical flow assumptions fail?

In other words, in what situations does the displacement of pixel patches not represent physical movement of points in space?

  • 1. Well, TV is based on illusory motion

– the set is stationary yet things seem to move

  • 2. A uniform rotating sphere

– nothing seems to move, yet it is rotating

  • 3. Changing directions or intensities of lighting can make things seem to move

– for example, if the specular highlight on a rotating sphere moves.

  • 4. Muscle movement can make some spots on a cheetah move in the opposite direction of its motion.

– And infinitely more break downs of optical flow.

SLIDE 83

Questions?
