SLIDE 1

Optical flow

Cordelia Schmid

SLIDE 2

Motion field

  • The motion field is the projection of the 3D scene motion into the image

SLIDE 3

Optical flow

  • Definition: optical flow is the apparent motion of brightness patterns in the image

  • Ideally, optical flow would be the same as the motion field

  • Have to be careful: apparent motion can be caused by lighting changes without any actual motion

– Think of a uniform rotating sphere under fixed lighting vs. a stationary sphere under moving illumination
SLIDE 4

Estimating optical flow

  • Given two subsequent frames, estimate the apparent motion field u(x,y) and v(x,y) between them

  • Key assumptions

– Brightness constancy: projection of the same point looks the same in every frame

– Small motion: points do not move very far

– Spatial coherence: points move like their neighbors

I(x,y,t–1) I(x,y,t)

SLIDE 5

The brightness constancy constraint

I(x,y,t–1)                I(x,y,t)

Brightness Constancy Equation:

I(x, y, t–1) = I(x + u(x,y), y + v(x,y), t)

Linearizing the right side using Taylor expansion:

I(x + u, y + v, t) ≈ I(x, y, t–1) + I_x·u(x,y) + I_y·v(x,y) + I_t

Hence,

I_x·u + I_y·v + I_t ≈ 0
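The linearized constraint can be checked numerically. The sketch below is not from the slides: the helper name and the use of `np.gradient` finite differences are illustrative choices. It shifts a smooth pattern by one pixel and verifies that I_x·u + I_y·v + I_t stays near zero away from the image border:

```python
import numpy as np

def brightness_constancy_residual(I0, I1, u, v):
    """Per-pixel residual of I_x*u + I_y*v + I_t (central differences)."""
    Ix = np.gradient(I0, axis=1)   # spatial x-gradient of the first frame
    Iy = np.gradient(I0, axis=0)   # spatial y-gradient
    It = I1 - I0                   # temporal derivative between the two frames
    return Ix * u + Iy * v + It

# Synthetic example: a smooth pattern shifted by (u, v) = (1, 0) pixels.
x = np.arange(64, dtype=float)
I0 = np.sin(0.2 * x)[None, :] * np.ones((64, 1))
I1 = np.roll(I0, 1, axis=1)        # shift right by one pixel (wraps at border)
r = brightness_constancy_residual(I0, I1, u=1.0, v=0.0)
# The residual is small in the interior; the Taylor linearization is approximate.
print(float(np.abs(r[10:-10, 10:-10]).max()))
```

The residual is not exactly zero even for the true flow, because the first-order Taylor expansion drops higher-order terms.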

SLIDE 6

The brightness constancy constraint

  • How many equations and unknowns per pixel?

– One equation, two unknowns:

I_x·u + I_y·v + I_t = 0

  • What does this constraint mean?

– The component of the flow perpendicular to the gradient (i.e., parallel to the edge) is unknown

If (u, v) satisfies the equation, so does (u + u’, v + v’) whenever ∇I·(u’, v’) = 0

[Figure: edge with flows (u,v), (u’,v’), (u+u’,v+v’) and the gradient direction]
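Only the flow component along the gradient (the normal flow) is pinned down by this single equation. A minimal sketch; the helper name `normal_flow` and the small `eps` regularizer are my own:

```python
import numpy as np

def normal_flow(Ix, Iy, It, eps=1e-12):
    """Component of (u, v) along the gradient: the only part that the
    brightness constancy equation Ix*u + Iy*v + It = 0 determines."""
    g2 = Ix**2 + Iy**2 + eps       # squared gradient magnitude
    return -It * Ix / g2, -It * Iy / g2

# With gradient (Ix, Iy) = (0.5, 0) and It = -0.25, the constraint only
# fixes u = 0.5; any v (motion along the edge) is equally admissible.
un, vn = normal_flow(0.5, 0.0, -0.25)
print(un, vn)   # close to (0.5, 0.0)
```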

SLIDE 7

The aperture problem

Perceived motion

SLIDE 8

The aperture problem

Actual motion

SLIDE 9

Solving the aperture problem

  • How to get more equations for a pixel?

  • Spatial coherence constraint: pretend the pixel’s neighbors have the same (u,v)

– E.g., if we use a 5x5 window, that gives us 25 equations per pixel:

[ I_x(x_1)  I_y(x_1) ]             [ I_t(x_1) ]
[ I_x(x_2)  I_y(x_2) ]  [ u ]      [ I_t(x_2) ]
[    ...       ...   ]  [ v ]  = − [    ...   ]
[ I_x(x_n)  I_y(x_n) ]             [ I_t(x_n) ]

  • B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In International Joint Conference on Artificial Intelligence, 1981.

SLIDE 10

Lucas-Kanade flow

  • Linear least squares problem A d = b:

[ I_x(x_1)  I_y(x_1) ]             [ I_t(x_1) ]
[ I_x(x_2)  I_y(x_2) ]  [ u ]      [ I_t(x_2) ]
[    ...       ...   ]  [ v ]  = − [    ...   ]
[ I_x(x_n)  I_y(x_n) ]             [ I_t(x_n) ]

       A (n×2)   d (2×1)     b (n×1)

Solution given by the normal equations (AᵀA) d = Aᵀb:

[ Σ I_x I_x   Σ I_x I_y ]  [ u ]      [ Σ I_x I_t ]
[ Σ I_x I_y   Σ I_y I_y ]  [ v ]  = − [ Σ I_y I_t ]

The summations are over all pixels in the window
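A hedged sketch of the window-based solve; the helper name and the synthetic test pattern are mine, and `np.linalg.lstsq` solves the same least squares problem as the normal equations above:

```python
import numpy as np

def lucas_kanade_window(I0, I1, x, y, half=7):
    """Solve A d = b in least squares for one pixel: A stacks [Ix, Iy] over
    a (2*half+1)^2 window and b stacks -It, as on the slide."""
    Ix = np.gradient(I0, axis=1)
    Iy = np.gradient(I0, axis=0)
    It = I1 - I0
    win = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    d, *_ = np.linalg.lstsq(A, b, rcond=None)   # equivalent to (A^T A) d = A^T b
    return d                                     # estimated (u, v)

# Smooth synthetic pair translated by half a pixel in x (small motion holds).
yy, xx = np.mgrid[0:32, 0:32].astype(float)
I0 = np.sin(0.3 * xx) + np.cos(0.2 * yy + 0.1 * xx)
I1 = np.sin(0.3 * (xx - 0.5)) + np.cos(0.2 * yy + 0.1 * (xx - 0.5))
u, v = lucas_kanade_window(I0, I1, x=16, y=16)
print(u, v)   # close to (0.5, 0)
```

A larger window gives more gradient diversity, which is exactly the conditioning question the next slides discuss.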

SLIDE 11

Lucas-Kanade flow

  • Recall the Harris corner detector: M = AᵀA is the second moment matrix

  • When is the system solvable?

  • By looking at the eigenvalues of the second moment matrix

  • The eigenvectors and eigenvalues of M relate to edge direction and magnitude

  • The eigenvector associated with the larger eigenvalue points in the direction of fastest intensity change, and the other eigenvector is orthogonal to it

[ Σ I_x I_x   Σ I_x I_y ]  [ u ]      [ Σ I_x I_t ]
[ Σ I_x I_y   Σ I_y I_y ]  [ v ]  = − [ Σ I_y I_t ]

SLIDE 12

Uniform region

– gradients have small magnitude

– small λ1, small λ2

– system is ill-conditioned

SLIDE 13

Edge

– gradients have one dominant direction

– large λ1, small λ2

– system is ill-conditioned

SLIDE 14

High-texture or corner region

– gradients have different directions, large magnitudes

– large λ1, large λ2

– system is well-conditioned

SLIDE 15

Optical Flow Results

SLIDE 16

Multi-resolution registration

SLIDE 17

Coarse to fine optical flow estimation
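The slides give no implementation detail here, so the following is only a structural sketch of coarse-to-fine estimation: build an image pyramid, solve at the coarsest level, then upscale the flow (doubling its values) and refine at each finer level. `solve_level` is a placeholder for any per-level solver (e.g. one Lucas-Kanade or Horn-Schunck pass on warped images):

```python
import numpy as np

def downsample(I):
    """Halve resolution with a 2x2 box filter (a simple pyramid stand-in)."""
    return 0.25 * (I[0::2, 0::2] + I[1::2, 0::2] + I[0::2, 1::2] + I[1::2, 1::2])

def coarse_to_fine(I0, I1, solve_level, levels=3):
    """Coarse-to-fine flow: solve at the top of the pyramid, then upscale
    the flow (values doubled) and refine it at each finer level."""
    pyr = [(I0, I1)]
    for _ in range(levels - 1):
        pyr.append((downsample(pyr[-1][0]), downsample(pyr[-1][1])))
    u = v = np.zeros_like(pyr[-1][0])
    for J0, J1 in reversed(pyr):             # coarsest level first
        if u.shape != J0.shape:              # upscale flow to the finer grid
            u = 2.0 * np.kron(u, np.ones((2, 2)))
            v = 2.0 * np.kron(v, np.ones((2, 2)))
        du, dv = solve_level(J0, J1, u, v)   # per-level refinement step
        u, v = u + du, v + dv
    return u, v

# Structural demo with a do-nothing solver (real use would plug in LK/HS).
I0 = np.random.default_rng(1).random((16, 16))
I1 = I0.copy()
zero_solver = lambda J0, J1, u, v: (np.zeros_like(u), np.zeros_like(v))
u, v = coarse_to_fine(I0, I1, zero_solver)
```

The point of the scheme is that a large displacement at full resolution becomes a small displacement at the coarse level, where the linearization of the previous slides is valid.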

SLIDE 18

Optical Flow Results

SLIDE 19

Horn & Schunck algorithm

Additional smoothness constraint:

  • nearby points have similar optical flow

  • additional constraint:

e_s = ∫∫ ( (u_x² + u_y²) + (v_x² + v_y²) ) dx dy

B.K.P. Horn and B.G. Schunck, "Determining optical flow." Artificial Intelligence,1981

SLIDE 20

Horn & Schunck algorithm

Additional smoothness constraint:

e_s = ∫∫ ( (u_x² + u_y²) + (v_x² + v_y²) ) dx dy

besides the optical flow constraint equation term:

e_c = ∫∫ ( I_x·u + I_y·v + I_t )² dx dy

Minimize e_c + λ·e_s, where λ is a regularization parameter

B.K.P. Horn and B.G. Schunck, "Determining optical flow." Artificial Intelligence,1981

SLIDE 21

Horn & Schunck algorithm

Coupled PDEs solved using iterative methods and finite differences
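As a hedged sketch of such an iterative scheme: the classical Horn-Schunck fixed-point update, with a 4-neighbour average standing in for the Laplacian smoothness term. Parameter values and the synthetic pair are illustrative:

```python
import numpy as np

def horn_schunck(I0, I1, alpha=0.5, n_iter=200):
    """Horn-Schunck flow via the fixed-point iteration
    u <- u_avg - Ix*(Ix*u_avg + Iy*v_avg + It) / (alpha^2 + Ix^2 + Iy^2)."""
    Ix = np.gradient(I0, axis=1)
    Iy = np.gradient(I0, axis=0)
    It = I1 - I0
    u = np.zeros_like(I0)
    v = np.zeros_like(I0)
    denom = alpha**2 + Ix**2 + Iy**2
    for _ in range(n_iter):
        # 4-neighbour averages discretize the smoothness (Laplacian) term
        ua = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                     + np.roll(u, 1, 1) + np.roll(u, -1, 1))
        va = 0.25 * (np.roll(v, 1, 0) + np.roll(v, -1, 0)
                     + np.roll(v, 1, 1) + np.roll(v, -1, 1))
        t = (Ix * ua + Iy * va + It) / denom
        u = ua - Ix * t
        v = va - Iy * t
    return u, v

# Small-displacement example: a pattern shifted by half a pixel in x.
yy, xx = np.mgrid[0:32, 0:32].astype(float)
I0 = np.sin(0.4 * xx) + np.cos(0.3 * yy)
I1 = np.sin(0.4 * (xx - 0.5)) + np.cos(0.3 * yy)
u, v = horn_schunck(I0, I1)
print(float(u.mean()))   # converges toward ~0.5
```

Where the gradient vanishes, the data term is silent and the smoothness term fills in the flow from the neighbourhood, which is how Horn-Schunck produces a dense field.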

SLIDE 22

Horn & Schunck

  • Works well for small displacements

– For example, the Middlebury sequences

SLIDE 23

Large displacement estimation in optical flow

Large displacement is still an open problem in optical flow estimation

[Figure: MPI Sintel dataset]

SLIDE 24

Large displacement optical flow

Classical optical flow [Horn and Schunck 1981]

energy: color/gradient constancy term + smoothness constraint

minimization using a coarse-to-fine scheme

Large displacement approaches:

LDOF [Brox and Malik 2011]

– a matching term, penalizing the difference between flow and HOG matches

MDP-Flow2 [Xu et al. 2012]

– expensive fusion of matches (SIFT + PatchMatch) and estimated flow at each level

DeepFlow [Weinzaepfel et al. 2013]

– deep matching + flow refinement with variational approach

SLIDE 25

Deep Matching: main idea

Each subpatch is allowed to move:

  • independently

  • in a limited range depending on its size

The approach is fast to compute using convolution and max-pooling

The idea is applied recursively

[Figure: first image, second image]
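A toy sketch of one level of this idea; negative SSD stands in for the patch similarity used in Deep Matching, and 3x3 max-pooling plus subsampling is what lets each subpatch move independently within a small range. Helper names are mine:

```python
import numpy as np

def patch_response(target, patch):
    """Dense response map of one reference subpatch over the target image.
    Negative SSD is an illustrative stand-in for correlation."""
    ph, pw = patch.shape
    H, W = target.shape
    out = np.empty((H - ph + 1, W - pw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = -np.sum((target[i:i+ph, j:j+pw] - patch) ** 2)
    return out

def maxpool3_stride2(R):
    """3x3 max-pooling then subsampling by two: each coarser cell tolerates
    a small independent displacement of the subpatch."""
    Rp = np.pad(R, 1, mode="edge")
    pooled = np.maximum.reduce([np.roll(np.roll(Rp, di, 0), dj, 1)
                                for di in (-1, 0, 1) for dj in (-1, 0, 1)])
    return pooled[1:-1, 1:-1][::2, ::2]

rng = np.random.default_rng(0)
target = rng.random((20, 20))
patch = target[8:12, 8:12]                # a 4x4 subpatch cut from the target
R = patch_response(target, patch)
i, j = np.unravel_index(R.argmax(), R.shape)
P = maxpool3_stride2(R)
print((int(i), int(j)), P.shape)          # best match recovered at (8, 8)
```

Recursing (aggregating pooled child responses into parent-patch responses) yields the multi-scale pyramid of the following slides.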

SLIDE 26

Deep Matching (1)

[Figure: reference image, target image]

convolution with non-overlapping patches of 4x4 pixels

SLIDE 27

Deep Matching (2)

response maps for each 4x4 patch

max-pooling (3x3 filter), sub-sampling (half), aggregation

response maps of 8x8 patches

SLIDE 28

Deep Matching (2)

response maps for each 4x4 patch

max-pooling, sub-sampling, aggregation (repeated at each level)

response maps of 8x8 patches

response maps of 16x16 patches

response maps of 32x32 patches

… Pipeline similar in spirit to deep convolutional nets [Lecun et al. 1998]
SLIDE 29

Deep Matching (3)

Bottom-up: multi-scale response pyramid

Extract scale-space local maxima

Top-down: backtrack quasi-dense correspondences

SLIDE 30

Deep Matching (3)

[Figure: first image, second image, local maximum]

SLIDE 31

Deep Matching: example results

Repetitive textures

[Figure: first image, second image]

SLIDE 32

Deep Matching: example results

Non-rigid deformation

[Figure: first image, second image]

SLIDE 33

DeepFlow

Classical optical flow [Horn and Schunck 1981] energy

Integration of Deep Matching into the energy: matches guide the flow, similar to [Brox and Malik 2011]

Minimization using:

  • coarse-to-fine strategy

  • fixed point iterations

  • Successive Over Relaxation (SOR)

SLIDE 34

Experimental results: datasets

MPI-Sintel [Butler et al. 2012]

► sequences from a realistic animated movie
► large displacements (>20px for 17.5% of pixels)
► atmospheric effects and motion blur

SLIDE 35

Experimental results: datasets

KITTI [Geiger et al. 2013]

► sequences captured from a driving platform
► large displacements (>20px for 16% of pixels)
► real-world: lightings, surfaces, materials

SLIDE 36

Experimental results: sample results

[Figure: ground-truth, LDOF [Brox & Malik 2011], MDP-Flow2 [Xu et al. 2012], DeepFlow]

SLIDE 37

Experimental results: sample results

[Figure: ground-truth, LDOF [Brox & Malik 2011], MDP-Flow2 [Xu et al. 2012], DeepFlow]

SLIDE 38

Experimental results: improvements due to Deep Matching

Comparison on MPI-Sintel training set

AEE: average endpoint error
s40+: evaluated only on large displacements

[Figure: HOG matching vs. Deep Matching]

SLIDE 39

EpicFlow: Sparse-to-dense interpolation based on Deep Matching

  • accurate quasi-dense matches with DeepMatching [Revaud et al., CVPR’15]

SLIDE 40

Approach: Sparse-to-dense interpolation based on Deep Matching

[Figure: interpolation vs. ground-truth; the interpolation does not respect motion boundaries]

SLIDE 41

Approach: Sparse-to-dense interpolation with motion boundaries

image edges often coincide with motion boundaries (recall 95%)

state-of-the-art SED detector [structured forests for edge detection, Dollar’13]

[Figure: image, ground-truth flow, SED edges, ground-truth motion boundaries]

SLIDE 42

Approach: Sparse-dense interpolation with motion boundaries

EpicFlow:

  • Matching [Deep Matching]

  • Sparse-to-dense interpolation preserving motion boundaries

– geodesic distance based on edges [SED]

  • Refinement: one-level energy minimization with variational approach
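The interpolation step can be sketched with plain Euclidean nearest-neighbour assignment; EpicFlow itself uses the edge-aware geodesic distance described on the following slides, and the helper name and toy matches below are mine:

```python
import numpy as np

def sparse_to_dense_nn(matches, shape):
    """Give each pixel the flow of its nearest match (Euclidean distance here;
    EpicFlow replaces it with an edge-aware geodesic distance)."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    pts = np.array([(y, x) for y, x, _, _ in matches], dtype=float)
    flows = np.array([(u, v) for _, _, u, v in matches], dtype=float)
    # Squared distance from every pixel to every match, then nearest index.
    d2 = (yy[..., None] - pts[:, 0])**2 + (xx[..., None] - pts[:, 1])**2
    idx = d2.argmin(axis=-1)
    return flows[idx, 0], flows[idx, 1]   # dense u, v fields

# Two matches with different motions split the field into two regions.
matches = [(2, 2, 1.0, 0.0), (7, 7, 0.0, 1.0)]   # (y, x, u, v)
u, v = sparse_to_dense_nn(matches, (10, 10))
print(u[0, 0], v[9, 9])   # 1.0 1.0
```

With Euclidean distance the region boundary is a straight bisector; swapping in the geodesic distance makes it snap to image edges, which is the whole point of the method.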
SLIDE 43

Distance: edge-aware geodesic distance

geodesic distance: shortest distance from p to q given a cost map C

Cost map C: image edges, here computed with SED
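A minimal sketch of the geodesic distance over a cost map, via 4-connected Dijkstra; the convention of paying the destination pixel's cost per step is an illustrative choice:

```python
import heapq
import numpy as np

def geodesic_distance(cost, seed):
    """Shortest-path (geodesic) distance from `seed` over a per-pixel cost
    map, using 4-connected Dijkstra: stepping onto a pixel pays its cost."""
    H, W = cost.shape
    dist = np.full((H, W), np.inf)
    dist[seed] = 0.0
    heap = [(0.0, seed)]
    while heap:
        d, (y, x) = heapq.heappop(heap)
        if d > dist[y, x]:
            continue                      # stale heap entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < H and 0 <= nx < W:
                nd = d + cost[ny, nx]
                if nd < dist[ny, nx]:
                    dist[ny, nx] = nd
                    heapq.heappush(heap, (nd, (ny, nx)))
    return dist

# A vertical high-cost "image edge" splits the grid: points across the edge
# are geodesically far even though they are close in Euclidean terms.
cost = np.full((9, 9), 0.1)
cost[:, 4] = 10.0                  # strong edge down the middle
D = geodesic_distance(cost, (4, 1))
print(D[4, 7] > D[4, 3])           # True: crossing the edge is expensive
```

This is why the interpolation respects motion boundaries: matches on the far side of an edge stop being "close".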

SLIDE 44

Sparse-to-dense Interpolation: edge-aware geodesic distance

[Figure: image edges; geodesic distance; matches and the 100 closest matches]

SLIDE 45

Comparing Interpolation/EpicFlow/DeepFlow

SLIDE 46

Comparison to the state of the art

SLIDE 47

Comparison to the state of the art (AEE)

TF+OFM: Kennedy’15. Optical flow with geometric occlusion estimation and fusion of multiple frames.
NLTGV-SC: Ranftl’14. Non-local total generalized variation for optical flow estimation.

Method      Error on MPI-Sintel   Error on KITTI   Timings
EpicFlow    6.28                  3.8              16.4s
TF+OFM      6.73                  5.0              ~500s
DeepFlow    7.21                  5.8              19s
NLTGV-SC    8.75                  3.8              16s (GPU)

SLIDE 48

Failure cases

Missing matches (spear and horns of the dragon)

Missing contours (arm)

SLIDE 49

CNN to estimate optical flow: FlowNet

[A. Dosovitskiy et al. ICCV’15]

SLIDE 50

Architecture FlowNetSimple

SLIDE 51

Architecture FlowNetCorrelation

SLIDE 52

Synthetic dataset for training: Flying Chairs

A dataset of approx. 23k image pairs

SLIDE 53

Experimental results

S: simple, C: correlation, v: variational refinement, ft: fine-tuning

SLIDE 54

Experimental results