

SLIDE 1

CS4501: Introduction to Computer Vision

Structure from Motion and Optical Flow

Various slides from previous courses by: D.A. Forsyth (Berkeley / UIUC), I. Kokkinos (Ecole Centrale / UCL), S. Lazebnik (UNC / UIUC), S. Seitz (MSR / Facebook), J. Hays (Brown / Georgia Tech), A. Berg (Stony Brook / UNC), D. Samaras (Stony Brook), J. M. Frahm (UNC), V. Ordonez (UVA), Steve Seitz (UW).

SLIDE 2
  • More on Epipolar Geometry
  • Essential Matrix
  • Fundamental Matrix

Last Class

SLIDE 3
  • More on Epipolar Geometry
  • Essential Matrix
  • Fundamental Matrix
  • Optical Flow

Today’s Class

SLIDE 4

Estimating depth with stereo

  • Stereo: shape from “motion” between two views
  • We’ll need to consider:
  • Info on camera pose (“calibration”)
  • Image point correspondences

(figure: scene point, optical center, image plane)

SLIDE 5

Potential matches for x have to lie on the corresponding line l’. Potential matches for x’ have to lie on the corresponding line l.

Key idea: Epipolar constraint


SLIDE 6
  • Baseline – line connecting the two camera centers
  • Epipolar Plane – plane containing the baseline (1D family)
  • Epipoles = intersections of the baseline with the image planes = projections of the other camera center

Epipolar geometry: notation


SLIDE 7
  • Baseline – line connecting the two camera centers
  • Epipolar Plane – plane containing the baseline (1D family)
  • Epipoles = intersections of the baseline with the image planes = projections of the other camera center
  • Epipolar Lines – intersections of the epipolar plane with the image planes (always come in corresponding pairs)

Epipolar geometry: notation
SLIDE 8

Epipolar Geometry: Another example

Credit: William Hoff, Colorado School of Mines

SLIDE 9

Example: Converging cameras

SLIDE 10

Geometry for a simple stereo system

  • First, assuming parallel optical axes, known camera parameters (i.e., calibrated cameras):

SLIDE 11

Simplest Case: Parallel images

  • Image planes of cameras are parallel to each other and to the baseline
  • Camera centers are at same height
  • Focal lengths are the same
  • Then epipolar lines fall along the horizontal scan lines of the images

SLIDE 12
  • Assume parallel optical axes, known camera parameters (i.e., calibrated cameras). What is the expression for Z?
  • Similar triangles (p_l, P, p_r) and (O_l, P, O_r):

Geometry for a simple stereo system

(T + x_r − x_l) / (Z − f) = T / Z

⇒ Z = f · T / (x_l − x_r), where the disparity is x_l − x_r

SLIDE 13

Depth from disparity

image I(x,y) → image I′(x′,y′): disparity map D(x,y), where (x′,y′) = (x + D(x,y), y).
So if we could find the corresponding points in two images, we could estimate relative depth…
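As a concrete sketch (not from the slides), the depth-from-disparity relation Z = B·f/d can be applied to a whole disparity map in NumPy; the baseline and focal length below are made-up toy values:

```python
import numpy as np

def depth_from_disparity(disparity, baseline, focal_length):
    """Apply Z = B * f / d to a disparity map (d in pixels; Z comes out
    in the units of the baseline). Zero disparity maps to infinite depth."""
    d = np.asarray(disparity, dtype=float)
    Z = np.full_like(d, np.inf)
    valid = d > 0
    Z[valid] = baseline * focal_length / d[valid]
    return Z

# Toy values: baseline 0.1 m, focal length 700 px
Z = depth_from_disparity(np.array([[70.0, 35.0]]), 0.1, 700.0)
# 0.1 * 700 / 70 = 1.0 m, 0.1 * 700 / 35 = 2.0 m
```

Note how halving the disparity doubles the depth: disparity is inversely proportional to Z.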

SLIDE 14

(figure: matching cost vs. disparity along the left/right scanlines)

Correspondence search

  • Slide a window along the right scanline and compare contents of that window with the reference window in the left image
  • Matching cost: SSD or normalized correlation
SLIDE 15


Correspondence search

SSD

SLIDE 16


Correspondence search

  • Normalized correlation
SLIDE 17

Basic stereo matching algorithm

  • If necessary, rectify the two stereo images to transform epipolar lines into scanlines

  • For each pixel x in the first image
  • Find corresponding epipolar scanline in the right image
  • Examine all pixels on the scanline and pick the best match x’
  • Compute disparity x–x’ and set depth(x) = B*f/(x–x’)
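The matching step above can be sketched as brute-force block matching in NumPy; the window size and disparity range below are illustrative choices, not values from the slides:

```python
import numpy as np

def disparity_ssd(left, right, patch=5, max_disp=16):
    """Brute-force block matching along scanlines of a rectified pair.

    For each pixel of the left image, slide a window along the same row
    of the right image (disparity d means the match sits d pixels to the
    left) and keep the d with the smallest SSD cost.
    """
    h, w = left.shape
    r = patch // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for yy in range(r, h - r):
        for xx in range(r, w - r):
            ref = left[yy - r:yy + r + 1, xx - r:xx + r + 1].astype(float)
            best_cost, best_d = np.inf, 0
            for d in range(0, min(max_disp, xx - r) + 1):
                cand = right[yy - r:yy + r + 1, xx - d - r:xx - d + r + 1]
                cost = np.sum((ref - cand) ** 2)  # matching cost: SSD
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[yy, xx] = best_d
    return disp
```

On a synthetic pair where the right image is the left shifted by 3 pixels, interior pixels recover disparity 3 exactly; real images need the failure cases on the next slides (texturelessness, occlusion) handled.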
SLIDE 18

Failures of correspondence search

Textureless surfaces; occlusions, repetition; non-Lambertian surfaces, specularities

SLIDE 19

Active stereo with structured light

  • Project “structured” light patterns onto the object
  • Simplifies the correspondence problem
  • Allows us to use only one camera

(figure: camera and projector setup)

  • L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002

SLIDE 20

Kinect and iPhone X Solution

  • Add Texture!
SLIDE 21

Kinect: Structured infrared light

http://bbzippo.wordpress.com/2010/11/28/kinect-in-infrared/

SLIDE 22

iPhone X

SLIDE 23

Basic stereo matching algorithm

  • If necessary, rectify the two stereo images to transform epipolar lines into scanlines

  • For each pixel x in the first image
  • Find corresponding epipolar scanline in the right image
  • Examine all pixels on the scanline and pick the best match x’
  • Compute disparity x–x’ and set depth(x) = B*f/(x–x’)
SLIDE 24

Effect of window size

  • Smaller window
    + More detail
    − More noise
  • Larger window
    + Smoother disparity maps
    − Less detail

W = 3 vs. W = 20

SLIDE 25

Results with window search

Window-based matching vs. ground truth data

SLIDE 26

Better methods exist...

Graph cuts vs. ground truth

For the latest and greatest: http://www.middlebury.edu/stereo/

  • Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001

SLIDE 27

When cameras are not aligned: Stereo image rectification

  • Reproject image planes onto a common plane parallel to the line between optical centers
  • Pixel motion is horizontal after this transformation
  • Two homographies (3x3 transforms), one for each input image

reprojection

  • C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. IEEE Conf. Computer Vision and Pattern Recognition, 1999.
SLIDE 28

Rectification example

SLIDE 29

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 30

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 31

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 32

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 33

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 34

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 35

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 36

Brief Digression: Cross Product as Matrix Multiplication

  • Dot product as matrix multiplication: easy

    [a_x a_y a_z] [b_x b_y b_z]^T = a_x·b_x + a_y·b_y + a_z·b_z

  • Cross product as matrix multiplication:

    a × b = [a]_× b, where [a]_× = [ 0 −a_z a_y ; a_z 0 −a_x ; −a_y a_x 0 ]
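A minimal NumPy sketch of this skew-symmetric ("hat") matrix, checked against np.cross:

```python
import numpy as np

def skew(a):
    """[a]_x: the skew-symmetric matrix with skew(a) @ b == np.cross(a, b)."""
    ax, ay, az = a
    return np.array([[0.0, -az,  ay],
                     [ az, 0.0, -ax],
                     [-ay,  ax, 0.0]])

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
assert np.allclose(skew(a) @ b, np.cross(a, b))
```

Turning the cross product into a matrix is what lets the coplanarity condition on the next slides be written as a single matrix equation.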
SLIDE 37

Credit: William Hoff, Colorado School of Mines

Back to the General Problem

SLIDE 38

The Essential Matrix

SLIDE 39

The Essential Matrix

Credit: William Hoff, Colorado School of Mines

Essential Matrix (Longuet-Higgins, 1981)

SLIDE 40


Epipolar constraint: Calibrated case

  • Intrinsic and extrinsic parameters of the cameras are known, world coordinate system is set to that of the first camera
  • Then the projection matrices are given by K[I | 0] and K′[R | t]
  • We can multiply the projection matrices (and the image points) by the inverse of the calibration matrices to get normalized image coordinates:

x_norm = K^−1 x_pixel = [I | 0] X,   x′_norm = K′^−1 x′_pixel = [R | t] X

SLIDE 41

X

x x’ = Rx+t

Epipolar constraint: Calibrated case

The vectors Rx, t, and x′ are coplanar

[I | 0] (x, 1)^T = x,   [R | t] (x, 1)^T = Rx + t = x′

SLIDE 42

Epipolar constraint: Calibrated case

x′ ⋅ [t × (Rx)] = 0   →   x′^T [t]_× R x = 0

x′ = Rx + t

Recall: a × b = [a]_× b, where [a]_× = [ 0 −a_z a_y ; a_z 0 −a_x ; −a_y a_x 0 ]

The vectors Rx, t, and x′ are coplanar

SLIDE 43

Epipolar constraint: Calibrated case

x′ ⋅ [t × (Rx)] = 0   →   x′^T [t]_× R x = 0   →   x′^T E x = 0, with E = [t]_× R

Essential Matrix (Longuet-Higgins, 1981). The vectors Rx, t, and x′ are coplanar.
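As an illustrative check (the pose and point below are made up, not from the slides), one can build E = [t]_× R and verify the epipolar constraint on a point projected into both cameras:

```python
import numpy as np

def skew(t):
    tx, ty, tz = t
    return np.array([[0.0, -tz,  ty],
                     [ tz, 0.0, -tx],
                     [-ty,  tx, 0.0]])

def essential(R, t):
    """E = [t]_x R, so x2^T E x1 = 0 for normalized correspondences."""
    return skew(t) @ R

# Synthetic pose: small rotation about the z axis plus a translation
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.2, 0.1])
X = np.array([0.5, -0.3, 4.0])   # 3D point in the first camera's frame
x1 = X / X[2]                     # normalized image point, camera 1
X2 = R @ X + t
x2 = X2 / X2[2]                   # normalized image point, camera 2
E = essential(R, t)
```

The product x2^T E x1 vanishes because (RX + t)^T [t]_× (RX) has both terms annihilated by the skew matrix: v^T [t]_× v = 0 and t^T [t]_× w = 0.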

SLIDE 44

Epipolar constraint: Calibrated case

  • E x is the epipolar line associated with x (l′ = E x)
  • Recall: a line is given by ax + by + c = 0, or l^T x = 0, where l = (a, b, c)^T and x = (x, y, 1)^T

x′^T E x = 0

SLIDE 45

Epipolar constraint: Calibrated case

  • E x is the epipolar line associated with x (l′ = E x)
  • E^T x′ is the epipolar line associated with x′ (l = E^T x′)
  • E e = 0 and E^T e′ = 0
  • E is singular (rank two)
  • E has five degrees of freedom

x′^T E x = 0

SLIDE 46

Epipolar constraint: Uncalibrated case

  • The calibration matrices K and K′ of the two cameras are unknown
  • We can write the epipolar constraint in terms of unknown normalized coordinates:

x̂′^T E x̂ = 0, where x̂ = K^−1 x and x̂′ = K′^−1 x′

SLIDE 47

Epipolar constraint: Uncalibrated case

Fundamental Matrix (Faugeras and Luong, 1992)

x̂′^T E x̂ = 0, with x̂ = K^−1 x, x̂′ = K′^−1 x′

⇒ x′^T F x = 0, with F = K′^−T E K^−1
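A small sketch of this change of coordinates (the calibration matrices and points below are made-up values): with F = K′^−T E K^−1, the pixel-coordinate constraint equals the normalized one, since (K′x̂′)^T K′^−T E K^−1 (Kx̂) = x̂′^T E x̂:

```python
import numpy as np

def fundamental_from_essential(E, K1, K2):
    """F = K2^{-T} E K1^{-1}: the epipolar constraint in pixel coordinates,
    so x2_pix^T F x1_pix = x2^T E x1 when x_pix = K x."""
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

# Made-up calibration matrices and normalized points for the check
K1 = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[650.0, 0.0, 300.0], [0.0, 650.0, 220.0], [0.0, 0.0, 1.0]])
E = np.random.default_rng(0).normal(size=(3, 3))
x1 = np.array([0.1, -0.2, 1.0])   # normalized coordinates, camera 1
x2 = np.array([0.3, 0.05, 1.0])   # normalized coordinates, camera 2
F = fundamental_from_essential(E, K1, K2)
```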

SLIDE 48

Epipolar constraint: Uncalibrated case

  • F x is the epipolar line associated with x (l′ = F x)
  • F^T x′ is the epipolar line associated with x′ (l = F^T x′)
  • F e = 0 and F^T e′ = 0
  • F is singular (rank two)
  • F has seven degrees of freedom

x′^T F x = 0, with F = K′^−T E K^−1

SLIDE 49

Estimating the Fundamental Matrix

  • 8-point algorithm
    • Least squares solution using SVD on equations from 8 pairs of correspondences
    • Enforce det(F) = 0 constraint using SVD on F
  • 7-point algorithm
    • Use least squares to solve for the null space (two vectors) using SVD and 7 pairs of correspondences
    • Solve for the linear combination of null space vectors that satisfies det(F) = 0
  • Minimize reprojection error
    • Non-linear least squares

Note: estimation of F (or E) is degenerate for a planar scene.

SLIDE 50

8-point algorithm

  • 1. Solve a system of homogeneous linear equations
    a. Write down the system of equations

x′^T F x = 0, expanded:

x′x·f11 + x′y·f12 + x′·f13 + y′x·f21 + y′y·f22 + y′·f23 + x·f31 + y·f32 + f33 = 0

A f = 0, where row i of A is

[ x′ᵢxᵢ   x′ᵢyᵢ   x′ᵢ   y′ᵢxᵢ   y′ᵢyᵢ   y′ᵢ   xᵢ   yᵢ   1 ]

and f = (f11, f12, f13, f21, f22, f23, f31, f32, f33)^T

SLIDE 51

8-point algorithm

  • 1. Solve a system of homogeneous linear equations
    a. Write down the system of equations
    b. Solve f from Af = 0 using SVD

Matlab:

[U, S, V] = svd(A); f = V(:, end); F = reshape(f, [3 3])';
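The same steps can be written in NumPy. This is a sketch of the plain, unnormalized 8-point algorithm (in practice, Hartley normalization of the points is recommended before building A):

```python
import numpy as np

def eight_point(x1, x2):
    """Plain 8-point algorithm: estimate F with x2_h^T F x1_h = 0.

    x1, x2: (N, 2) arrays of corresponding points, N >= 8.
    Solves A f = 0 via SVD, then enforces the rank-2 constraint on F.
    """
    N = x1.shape[0]
    A = np.zeros((N, 9))
    for i in range(N):
        x, y = x1[i]
        xp, yp = x2[i]
        A[i] = [xp * x, xp * y, xp, yp * x, yp * y, yp, x, y, 1.0]
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)   # f = right singular vector of smallest sigma
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                 # enforce det(F) = 0 (rank two)
    return U @ np.diag(S) @ Vt
```

On noiseless synthetic correspondences this recovers F up to scale; the trailing SVD step is the det(F) = 0 enforcement from the previous slide.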

SLIDE 52

Optical Flow

Slides by Juan Carlos Niebles and Ranjay Krishna

SLIDE 53

From images to videos

  • A video is a sequence of frames captured over time
  • Now our image data is a function of space (x, y) and time (t)
SLIDE 54

Why is motion useful?

SLIDE 55

Why is motion useful?

SLIDE 56

Optical flow

  • Definition: optical flow is the apparent motion of brightness patterns in the image
  • Note: apparent motion can be caused by lighting changes without any actual motion
    • Think of a uniform rotating sphere under fixed lighting vs. a stationary sphere under moving illumination

GOAL: Recover image motion at each pixel from optical flow

Source: Silvio Savarese

SLIDE 57

Picture courtesy of Selim Temizer - Learning and Intelligent Systems (LIS) Group, MIT

Optical flow

Vector field function of the spatio-temporal image brightness variations

SLIDE 58

Estimating optical flow

  • Given two subsequent frames, estimate the apparent motion field u(x,y), v(x,y) between them
  • Key assumptions:
    • Brightness constancy: projection of the same point looks the same in every frame
    • Small motion: points do not move very far
    • Spatial coherence: points move like their neighbors

I(x,y,t–1) → I(x,y,t)

Source: Silvio Savarese

SLIDE 59

Key Assumptions: small motions

SLIDE 60

Key Assumptions: spatial coherence

* Slide from Michael Black, CS143 2003

SLIDE 61

Key Assumptions: brightness constancy

* Slide from Michael Black, CS143 2003

SLIDE 62

  • Brightness Constancy Equation:

I(x, y, t−1) = I(x + u(x,y), y + v(x,y), t)

  • Linearizing the right side using a Taylor expansion:

I(x+u, y+v, t) ≈ I(x, y, t−1) + I_x·u(x,y) + I_y·v(x,y) + I_t

  • Hence I(x+u, y+v, t) − I(x, y, t−1) = I_x·u + I_y·v + I_t, and the constraint is

∇I · [u v]^T + I_t = 0

(I_x is the image derivative along x, I_y along y, I_t the temporal derivative)

Source: Silvio Savarese

The brightness constancy constraint
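A minimal sketch of the derivative computation behind this constraint, using np.gradient for I_x, I_y and a frame difference for I_t (one of several valid finite-difference filter choices):

```python
import numpy as np

def gradients(I0, I1):
    """Derivatives for the brightness constancy constraint
    Ix*u + Iy*v + It = 0, with I0 = frame at t-1 and I1 = frame at t."""
    Iy, Ix = np.gradient(I0)   # np.gradient returns (d/d_row, d/d_col)
    It = I1 - I0               # simple temporal difference
    return Ix, Iy, It
```

On a linear ramp image translated by (u, v), the constraint holds exactly, since the Taylor expansion is exact for a linear function.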

SLIDE 63

Filters used to find the derivatives

I_x, I_y, I_t

slide-64
SLIDE 64
  • How many equations and unknowns per pixel?
    • One equation (this is a scalar equation!), two unknowns (u, v)

∇I · [u v]^T + I_t = 0

Can we use this equation to recover image motion (u, v) at each pixel?

The component of the flow perpendicular to the gradient (i.e., parallel to the edge) cannot be measured: if (u, v) satisfies the equation, so does (u+u′, v+v′) whenever ∇I · [u′ v′]^T = 0

Source: Silvio Savarese

The brightness constancy constraint

SLIDE 65

The aperture problem

Actual motion

Source: Silvio Savarese

SLIDE 66

The aperture problem

Perceived motion

Source: Silvio Savarese

SLIDE 67

The barber pole illusion

http://en.wikipedia.org/wiki/Barberpole_illusion

Source: Silvio Savarese

SLIDE 68

The barber pole illusion

http://en.wikipedia.org/wiki/Barberpole_illusion

Source: Silvio Savarese

SLIDE 69

What we will learn today

  • Optical flow
  • Lucas-Kanade method

Reading: [Szeliski] Chapters: 8.4, 8.5

[Fleet & Weiss, 2005] http://www.cs.toronto.edu/pub/jepson/teaching/vision/2503/opticalFlow.pdf

SLIDE 70

Solving the ambiguity…

  • How to get more equations for a pixel?
  • Spatial coherence constraint:
  • Assume the pixel’s neighbors have the same (u,v)
  • If we use a 5x5 window, that gives us 25 equations per pixel
  • B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674–679, 1981.

Source: Silvio Savarese

SLIDE 71
  • Overconstrained linear system:

Lucas-Kanade flow

Source: Silvio Savarese

SLIDE 72
  • Overconstrained linear system

The summations are over all pixels in the K x K window

Least squares solution for d given by

Source: Silvio Savarese

Lucas-Kanade flow
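A sketch of the per-window least-squares solve (the window gradients are assumed precomputed): one brightness-constancy equation per pixel is stacked into A d = b with d = (u, v), and the eigenvalues of A^T A, which govern solvability, are returned alongside:

```python
import numpy as np

def lucas_kanade_window(Ix, Iy, It):
    """Least-squares Lucas-Kanade solve for one K x K window.

    Each pixel contributes one equation  Ix_i * u + Iy_i * v = -It_i,
    giving an overconstrained system A d = b with d = (u, v).
    Also returns the eigenvalues of A^T A (ascending order).
    """
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)   # (K*K, 2)
    b = -It.ravel()
    d, *_ = np.linalg.lstsq(A, b, rcond=None)        # d = (A^T A)^-1 A^T b
    return d, np.linalg.eigvalsh(A.T @ A)
```

With synthetic gradients consistent with a true flow, the solve recovers it exactly; on real data the eigenvalues decide whether the answer is trustworthy.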

SLIDE 73

Conditions for solvability

  • Optimal (u, v) satisfies the Lucas-Kanade equation

Does this remind you of anything?

When is this solvable?

  • A^T A should be invertible
  • A^T A should not be too small due to noise
    – eigenvalues λ1 and λ2 of A^T A should not be too small
  • A^T A should be well-conditioned
    – λ1/λ2 should not be too large (λ1 = larger eigenvalue)

Source: Silvio Savarese

SLIDE 74
  • Eigenvectors and eigenvalues of A^T A relate to edge direction and magnitude
    • The eigenvector associated with the larger eigenvalue points in the direction of fastest intensity change
    • The other eigenvector is orthogonal to it

M = A^T A is the second moment matrix! (Harris corner detector…)

Source: Silvio Savarese

SLIDE 75

Interpreting the eigenvalues

Classification of image points using eigenvalues of the second moment matrix:
  • “Corner”: λ1 and λ2 are large, λ1 ~ λ2
  • “Flat” region: λ1 and λ2 are small
  • “Edge”: λ1 >> λ2 or λ2 >> λ1

Source: Silvio Savarese

SLIDE 76

Edge

– gradients very large or very small
– large λ1, small λ2

Source: Silvio Savarese

SLIDE 77

Low-texture region

– gradients have small magnitude

– small λ1, small λ2

Source: Silvio Savarese

SLIDE 78

High-texture region

– gradients are different, large magnitudes

– large λ1, large λ2

Source: Silvio Savarese

SLIDE 79

Errors in Lucas-Kanade

What are the potential causes of errors in this procedure?

  • Suppose A^T A is easily invertible
  • Suppose there is not much noise in the image
  • When our assumptions are violated:
    – Brightness constancy is not satisfied
    – The motion is not small
    – A point does not move like its neighbors
      • window size is too large
      • what is the ideal window size?

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

SLIDE 80

Improving accuracy

  • Recall our small motion assumption. This is not exact
    – To do better, we need to add higher order terms back in:
  • This is a polynomial root-finding problem
    – Can solve using Newton’s method (out of scope for this class)
    – Lucas-Kanade method does one iteration of Newton’s method
  • Better results are obtained via more iterations

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

SLIDE 81

Iterative Refinement

  • Iterative Lucas-Kanade Algorithm
    1. Estimate velocity at each pixel by solving Lucas-Kanade equations
    2. Warp I(t−1) towards I(t) using the estimated flow field
       • use image warping techniques
    3. Repeat until convergence

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
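A toy version of the iterative scheme, simplified to a single global translation (u, v) for the whole image rather than per-pixel flow (a deliberate simplification of the slides' per-pixel algorithm); the warp uses bilinear interpolation:

```python
import numpy as np

def bilinear_warp(I, u, v):
    """Sample I at (x + u, y + v) with bilinear interpolation (border clamped)."""
    h, w = I.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    x = np.clip(cols + u, 0, w - 1.001)
    y = np.clip(rows + v, 0, h - 1.001)
    x0, y0 = x.astype(int), y.astype(int)
    dx, dy = x - x0, y - y0
    return (I[y0, x0] * (1 - dx) * (1 - dy) + I[y0, x0 + 1] * dx * (1 - dy)
            + I[y0 + 1, x0] * (1 - dx) * dy + I[y0 + 1, x0 + 1] * dx * dy)

def iterative_lk_translation(I0, I1, iters=15):
    """Iterative Lucas-Kanade for one global translation (u, v).

    Each pass warps I0 by the current estimate, solves the linearized
    brightness-constancy system for an update (du, dv), and accumulates it.
    """
    u = v = 0.0
    for _ in range(iters):
        W = bilinear_warp(I0, -u, -v)        # W(x, y) = I0(x - u, y - v)
        Wy, Wx = np.gradient(W)
        It = I1 - W
        m = (slice(3, -3), slice(3, -3))     # skip warp border artifacts
        A = np.stack([Wx[m].ravel(), Wy[m].ravel()], axis=1)
        b = -It[m].ravel()
        du, dv = np.linalg.lstsq(A, b, rcond=None)[0]
        u, v = u + du, v + dv
    return u, v
```

Re-solving from the warped image is what makes each pass a Newton-style iteration: the linearization is redone around the current estimate, so sub-pixel motions are recovered progressively.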

SLIDE 82

When do the optical flow assumptions fail?

In other words, in what situations does the displacement of pixel patches not represent physical movement of points in space?

  • 1. Well, TV is based on illusory motion

– the set is stationary yet things seem to move

  • 2. A uniform rotating sphere

– nothing seems to move, yet it is rotating

  • 3. Changing directions or intensities of lighting can make things seem to move

– for example, if the specular highlight on a rotating sphere moves.

  • 4. Muscle movement can make some spots on a cheetah move in the opposite direction of its motion.

– And infinitely more break downs of optical flow.

SLIDE 83

Questions?
