

slide-1
SLIDE 1
slide-2
SLIDE 2

In the name of Allah,
the compassionate, the merciful

slide-3
SLIDE 3

Digital Video Processing

  • S. Kasaei

Room: CE 307
Department of Computer Engineering
Sharif University of Technology
E-Mail: skasaei@sharif.edu
Webpage: http://sharif.edu/~skasaei

  • Lab. Website: http://ipl.ce.sharif.edu
slide-4
SLIDE 4

Acknowledgment

Most of the slides used in this course have been provided by Prof. Yao Wang (Polytechnic University, Brooklyn), based on the book: Video Processing & Communications, written by Yao Wang, Jörn Ostermann, & Ya-Qin Zhang, Prentice Hall, 1st edition, 2001, ISBN: 0130175471. [SUT Code: TK 5105 .2 .W36 2001]

slide-5
SLIDE 5

Chapter 6

2-D Motion Estimation

Part I: Fundamentals & Basic Techniques

slide-6
SLIDE 6

Outline

  • 2-D motion vs. optical flow
  • Optical flow equation & ambiguity in motion estimation
  • General methodologies in motion estimation
      • Motion representation
      • Motion estimation criterion
      • Optimization methods
      • Gradient descent methods
  • Pixel-based motion estimation
  • Block-based motion estimation
      • EBMA algorithm

slide-7
SLIDE 7

2-D Motion Estimation

  • Motion estimation (ME) is an important part of many video processing tasks.
  • The main applications of ME are video compression, sampling rate conversion, filtering, …
  • For computer vision, motion vectors (MVs) are used to deduce 3-D structure & motion parameters (a sparse but accurate set of MVs is required).
  • For video coding, MVs are used to produce a motion-compensated predicted frame, to reduce the bitrate required for coding the MVs & prediction errors (a dense & accurate set of MVs is required).

slide-8
SLIDE 8

2-D Motion Estimation

  • An ME problem is converted to an optimization problem that involves three key components:
      • Parameterization of the motion field.
      • Formulation of the optimization criterion.
      • Searching for the optimal parameters.

[Diagram labels: input frames, motion field, optimization criteria, optimal motion parameters.]

slide-9
SLIDE 9

2-D Motion vs. Optical Flow

  • 2-D motion: projection of 3-D motion. Depends on 3-D object motion & the projection operator (physical aspects).
  • Optical flow: “perceived” 2-D motion based on changes in the image pattern; also depends on illumination & object surface texture.

[Figure: (a) A sphere is rotating under constant ambient illumination, but the observed image does not change. (b) A point light source is rotating around a stationary sphere, causing the highlight point on the sphere to rotate.]

slide-10
SLIDE 10

Correspondence & Optical Flow

  • 2-D displacement & velocity fields are projections of the respective 3-D fields onto the image plane.
  • Correspondence field & optical flow field are the displacement & velocity functions “perceived” from the time-varying image intensity pattern.
  • Correspondence field & optical flow field are also called the “apparent 2-D displacement” field & the “apparent 2-D velocity” field.

slide-11
SLIDE 11

Correspondence & Optical Flow

  • Since we can only observe the correspondence & optical flow fields, we assume that they are the same as the 2-D motion field.
  • When the illumination condition is unknown, the best one can do is to estimate the optical flow.
  • Constant intensity assumption (CIA): the images of the same object point at different times have the same luminance value.
  • Constant intensity assumption (CIA) → optical flow (OF) equation.

slide-12
SLIDE 12

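Optical Flow Equation

Under the “constant intensity assumption”, for a point moving from (x, y, t) to (x+d_x, y+d_y, t+d_t):

$$\psi(x + d_x,\, y + d_y,\, t + d_t) = \psi(x, y, t)$$

But, using Taylor's expansion:

$$\psi(x + d_x,\, y + d_y,\, t + d_t) = \psi(x, y, t) + \frac{\partial\psi}{\partial x} d_x + \frac{\partial\psi}{\partial y} d_y + \frac{\partial\psi}{\partial t} d_t$$

Comparing the above two, we have the optical flow equation:

$$\frac{\partial\psi}{\partial x} d_x + \frac{\partial\psi}{\partial y} d_y + \frac{\partial\psi}{\partial t} d_t = 0 \quad\text{or}\quad \frac{\partial\psi}{\partial x} v_x + \frac{\partial\psi}{\partial y} v_y + \frac{\partial\psi}{\partial t} = 0 \quad\text{or}\quad \nabla\psi^T \mathbf{v} + \frac{\partial\psi}{\partial t} = 0$$

[The velocity vector (flow vector), v, is the unknown parameter: one equation with two unknowns. ∇ψ is the spatial gradient vector.]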

slide-13
SLIDE 13

Ambiguities in Motion Estimation

  • The optical flow equation only constrains the flow vector in the gradient direction ($v_n$).
  • The flow vector in the tangent direction ($v_t$) is under-determined (aperture problem).
  • Also, in regions with constant brightness ($\nabla\psi = 0$), the flow is indeterminate: motion estimation is unreliable in regions with flat texture, but more reliable near edges.

If the flow vector is decomposed into its normal & tangent components:

$$\mathbf{v} = v_n \mathbf{e}_n + v_t \mathbf{e}_t$$

then the optical flow equation contains no $v_t$:

$$v_n \|\nabla\psi\| + \frac{\partial\psi}{\partial t} = 0$$

where $\|\nabla\psi\|$ is the gradient vector magnitude.

[Figure: the aperture problem.]
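As a small illustration of the constraint above, the following Python sketch computes only the normal component of the flow from a spatial gradient and a temporal derivative. The function name `normal_flow` and the `eps` threshold for "flat" regions are illustrative assumptions, not part of the slides.

```python
import math

def normal_flow(grad_x, grad_y, dpsi_dt, eps=1e-9):
    """Return (v_n, (vx, vy)): the flow component along the gradient direction.

    Solves v_n * ||grad psi|| + dpsi/dt = 0. The tangent component v_t cannot
    be recovered from the optical flow equation (aperture problem), so it is
    simply left at zero here. Returns None where the gradient vanishes,
    because the flow is then indeterminate.
    """
    magnitude = math.hypot(grad_x, grad_y)
    if magnitude < eps:          # hypothetical flatness threshold
        return None
    v_n = -dpsi_dt / magnitude
    # Unit vector e_n along the gradient, scaled by v_n.
    return v_n, (v_n * grad_x / magnitude, v_n * grad_y / magnitude)

result = normal_flow(3.0, 4.0, -10.0)
flat = normal_flow(0.0, 0.0, 5.0)
```

Any vector obtained by adding a multiple of the tangent direction to the returned flow satisfies the optical flow equation equally well, which is why an extra constraint is needed.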

slide-14
SLIDE 14

Ambiguities in Motion Estimation

  • To solve for the undetermined component ($v_t$) of the OFE, one must impose additional constraints.
  • The most common constraint is that the flow vectors should vary smoothly in space (to estimate the motion vector).

slide-15
SLIDE 15

General Considerations for ME

  • Two categories of approaches:
      • Feature-based: more often used in object tracking & 3-D reconstruction from 2-D (least-squares fitting of features, good for global motions).
      • Intensity-based: based on the CIA (no simple model). More often used for motion-compensated prediction (required in video coding) & frame interpolation → our focus.
  • Three important questions:
      • How to represent (parameterize) the motion field?
      • What criteria to use to estimate the motion parameters?
      • How to search for the optimal motion parameters?

slide-16
SLIDE 16

Motion Representation

  • Global: the entire motion field is represented by a few global parameters (global motion representation; camera motion).
  • Pixel-based: one MV at each pixel, with some smoothness constraint between adjacent MVs (very time consuming).
  • Region-based: the entire frame is divided into regions; each region, corresponding to an object (or sub-object) with consistent motion, is represented by a few parameters (requires a region segmentation map: which pels have similar motions?).
  • Block-based: the entire frame is divided into non-overlapping blocks; the motion in each block is characterized by a few parameters (good compromise between accuracy & complexity; discontinuous across blocks; no multiple objects, scale, or rotation).

slide-17
SLIDE 17

Motion Representation

  • Other representation: mesh-based representation.
      • The underlying image frame is partitioned into non-overlapping polygonal elements.
      • MVs at the corners of the polygonal elements determine the entire motion field.
      • MVs at the interior points of an element are interpolated from the nodal MVs.
      • The induced motion field is continuous everywhere.
      • Adaptive methods allow discontinuities when necessary (on object boundaries).

slide-18
SLIDE 18

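Notations

  • Anchor frame: $\psi_1(\mathbf{x})$
  • Target frame: $\psi_2(\mathbf{x})$
  • Motion parameters: $\mathbf{a}$
  • Motion vector at a pixel in the anchor frame: $\mathbf{d}(\mathbf{x})$
  • Motion field: $\mathbf{d}(\mathbf{x}; \mathbf{a}),\ \mathbf{x} \in \Lambda$
  • Mapping function: $\mathbf{w}(\mathbf{x}; \mathbf{a}) = \mathbf{x} + \mathbf{d}(\mathbf{x}; \mathbf{a}),\ \mathbf{x} \in \Lambda$

[In video coding, the two frames are referred to as the reference frame & the current frame.]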

slide-19
SLIDE 19

Regularization Theory

  • Ill-posed problems.
  • Regularization methods.
  • Stochastic regularization methods.
  • Relaxation labeling.
      • Discrete relaxation labeling.
      • Stochastic relaxation labeling.

slide-20
SLIDE 20

Well-Posed Problems

  • A mathematical problem is well-posed when its solution
      1. exists,
      2. is unique, and
      3. is robust to noise.
  • Physical simulation problems are well-posed, but “inverse” problems are usually ill-posed.

slide-21
SLIDE 21

Regularization Methods

  • The basic idea behind regularization is
      1. to restrict the space of acceptable solutions,
      2. by choosing the function that minimizes an appropriate functional (a cost function, J).
  • For solving an ill-posed problem, regularization theory provides the mathematical framework for choosing the norm and the stabilizing functional that together characterize the global constraints for the problem.
  • That is, find x that minimizes ||F(x)-y|| subject to ||G(x)|| < C, where G is the stabilizing functional and ||·|| is the norm.

slide-22
SLIDE 22

Stochastic Regularization Methods

  • Instead of regularization theory, a Bayesian formulation can be used to transform ill-posed inverse problems into the functional optimization framework.
      • Given a set of data, look for their most likely model.
      • Likelihood: evaluates how well the model describes the data (stabilizing functional).
      • A priori: evaluates the model (norm).
  • Other modeling approaches:
      • Minimum description length, which also describes the model constraints.
  • Other stochastic (or probabilistic) optimization methods:
      • Simulated annealing (SA),
      • Genetic algorithms/evolutionary strategy,
      • Expectation-maximization (EM).

slide-23
SLIDE 23

Consistent Labeling

  • How to infer smooth features, detect discontinuities, and identify outliers?
  • Since each location can assume only one of these roles, we need a consistent labeling framework.
  • A labeling problem is characterized by:
      1. a set of objects (pixels),
      2. a set of possible labels (edges with orientations, discontinuities, gray-levels, regions, line matches) for each object,
      3. a neighbor relation over objects, and
      4. a compatibility relation over labels at pairs of neighboring objects.
  • The goal is to assign a label to each object such that the labeling is consistent with respect to the compatibility relation (4).

slide-24
SLIDE 24

Relaxation Labeling

  • A natural extension of the regularization operation to the class of problems whose solutions involve symbols rather than functions.
  • The structure of relaxation labeling is motivated by two basic concerns:
      1. decomposition of the complex computation into a network of simple “myopic” or local computations, and
      2. the requisite use of context in resolving ambiguities.

slide-25
SLIDE 25

Relaxation Labeling

  • Three main types of relaxation labeling methods:
      • Discrete relaxation labeling.
      • Continuous relaxation labeling.
      • Stochastic relaxation labeling.

slide-26
SLIDE 26

Discrete Relaxation Labeling

  • Assigns labels to graph nodes.
  • Is governed by the label discarding rule:
      • Discard a label at a node if there exists a neighbor such that every label currently assigned to the neighbor is incompatible with the label.
  • Iterate the discarding process.
  • Apply it in parallel at each node until one or more limiting label sets are obtained.
  • Main issues of the iteration process are:
      • initialization,
      • updating, and
      • stopping condition.
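The label discarding rule can be sketched in a few lines of Python. This is a minimal illustration under assumed data structures: `labels` maps each node to its candidate label set, `neighbors` maps each node to its adjacent nodes, and `compatible` is a hypothetical user-supplied predicate. None of these names come from the slides.

```python
def discrete_relaxation(labels, neighbors, compatible):
    """Iterate the label discarding rule until no label set changes.

    labels:     dict node -> iterable of candidate labels (initialization)
    neighbors:  dict node -> list of neighboring nodes
    compatible: function (node, label, nbr, nbr_label) -> bool
    """
    current = {node: set(ls) for node, ls in labels.items()}
    changed = True
    while changed:                      # stopping condition: a fixed point
        changed = False
        updated = {}
        # "Parallel" update: every decision uses the label sets from the
        # previous iteration, not the partially updated ones.
        for node, candidates in current.items():
            kept = set()
            for label in candidates:
                # Keep the label only if every neighbor still holds at
                # least one label compatible with it.
                if all(any(compatible(node, label, nbr, nbr_label)
                           for nbr_label in current[nbr])
                       for nbr in neighbors[node]):
                    kept.add(label)
            if kept != candidates:
                changed = True
            updated[node] = kept
        current = updated
    return current

# Toy chain A - B - C where neighboring labels must differ by exactly 1.
labels = {"A": {0}, "B": {0, 1, 2}, "C": {2}}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
result = discrete_relaxation(
    labels, neighbors, lambda n, a, m, b: abs(a - b) == 1)
```

In the toy chain, labels 0 and 2 are discarded at node B because node A (holding only 0) supports neither of them, leaving the single consistent labeling 0, 1, 2.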

slide-27
SLIDE 27

Stochastic Relaxation Labeling

  • Labeling weights and constraint preference weights are replaced by probability distributions.
  • Is based on a stochastic model of the physical phenomenon, called Markov random fields (MRF).
  • MRF is often combined with the Bayesian estimation technique known as maximum a posteriori (MAP), forming MRF-MAP.
  • It involves solving an energy minimization problem.
      • Typically, one uses a global-minimum-seeking algorithm such as simulated annealing (SA), evolutionary algorithms (EA), or expectation-maximization (EM) to minimize the often non-convex energy functions.

slide-28
SLIDE 28

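Motion Estimation Criteria

  • To minimize the displaced frame difference (DFD):

$$E_{\mathrm{DFD}}(\mathbf{a}) = \sum_{\mathbf{x}\in\Lambda} \left| \psi_2(\mathbf{x} + \mathbf{d}(\mathbf{x};\mathbf{a})) - \psi_1(\mathbf{x}) \right|^p \to \min$$

      • p = 1: MAD; p = 2: MSE.

  • To satisfy the optical flow equation:

$$E_{\mathrm{OF}}(\mathbf{a}) = \sum_{\mathbf{x}\in\Lambda} \left| \nabla\psi_1(\mathbf{x})^T \mathbf{d}(\mathbf{x};\mathbf{a}) + \psi_2(\mathbf{x}) - \psi_1(\mathbf{x}) \right|^p \to \min$$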
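To make the DFD criterion concrete, here is a minimal Python sketch that evaluates it for a single translational MV applied to the whole frame. The frames are plain nested lists indexed as `frame[y][x]`, and pixels whose displaced position falls outside the target frame are skipped; both choices are assumptions made for this illustration.

```python
def dfd_error(anchor, target, d, p=1):
    """Sum over anchor pixels of |target(x + d) - anchor(x)|^p.

    d = (dx, dy) is an integer displacement. p=1 gives an (unnormalized)
    absolute-difference error, p=2 a squared error.
    """
    dx, dy = d
    height, width = len(anchor), len(anchor[0])
    total = 0
    for y in range(height):
        for x in range(width):
            xt, yt = x + dx, y + dy
            # Only count pixels whose match lies inside the target frame.
            if 0 <= xt < width and 0 <= yt < height:
                total += abs(target[yt][xt] - anchor[y][x]) ** p
    return total

anchor = [[0, 0, 0, 0],
          [0, 9, 5, 0],
          [0, 3, 7, 0],
          [0, 0, 0, 0]]
# Target frame: the anchor pattern shifted one pixel to the right.
target = [[0, 0, 0, 0],
          [0, 0, 9, 5],
          [0, 0, 3, 7],
          [0, 0, 0, 0]]

error_true_mv = dfd_error(anchor, target, (1, 0))   # correct displacement
error_zero_mv = dfd_error(anchor, target, (0, 0))   # no motion compensation
```

The error vanishes for the true displacement and is positive for the zero MV, which is the behavior the minimization relies on.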

slide-29
SLIDE 29

Motion Estimation Criteria

  • To impose an additional smoothness constraint using a regularization technique (important in pixel- & block-based representations):

$$E_s(\mathbf{a}) = \sum_{\mathbf{x}\in\Lambda} \sum_{\mathbf{y}\in N_{\mathbf{x}}} \left\| \mathbf{d}(\mathbf{x};\mathbf{a}) - \mathbf{d}(\mathbf{y};\mathbf{a}) \right\|^2 \quad \text{(smoothness constraint)}$$

$$w_{\mathrm{DFD}}\, E_{\mathrm{DFD}}(\mathbf{a}) + w_s\, E_s(\mathbf{a}) \to \min$$

      • Lower penalty weights at object boundaries.

  • Bayesian (MAP) criterion: to maximize the a posteriori probability:

$$P(D = \mathbf{d} \mid \psi_2, \psi_1) \to \max$$

slide-30
SLIDE 30

Relation Among Different Criteria

  • OF criterion is good only if the motion is small.
  • OF criterion can often yield a closed-form solution, as the objective function is quadratic in the MVs.
  • When the motion is not small, one can iterate the solution based on the OF criterion to satisfy the DFD criterion.
  • Bayesian criterion can be reduced to the DFD criterion plus a motion smoothness constraint.
  • More in the textbook.

[DFD: displaced frame difference]

slide-31
SLIDE 31

Optimization Methods

  • Exhaustive search:
      • Typically used for the DFD criterion with p=1 (MAD).
      • Guarantees reaching the global optimum.
      • Required computation may be unacceptable when the number of parameters to search simultaneously is large!
      • Fast search algorithms reach a sub-optimal solution in a shorter time.

slide-32
SLIDE 32

Optimization Methods

  • Gradient-based search:
      • Typically used for the DFD or OF criterion with p=2 (MSE).
          • The gradient can often be calculated analytically.
          • When used with the OF criterion, a closed-form solution may be obtained.
      • Reaches the local optimal point closest to the initial solution.
  • Multi-resolution search:
      • Searches from coarse-to-fine resolution.
      • Is faster than exhaustive search.
      • Helps avoid being trapped in a local minimum.

slide-33
SLIDE 33

Gradient Descent Method

  • Iteratively updates the current estimate in the direction opposite to the gradient direction.

[Figure panels: not a good initial solution; a good initial solution with an appropriate stepsize; too big a stepsize.]

slide-34
SLIDE 34

Gradient Descent Method

  • The solution depends on the initial condition.
      • Reaches the local minimum closest to the initial condition.
      • You can start with several different initial solutions.
  • Choice of stepsize:
      • Fixed stepsize: the stepsize must be small to avoid oscillation (requires many iterations).
      • Steepest gradient descent: a 1st-order gradient descent method that uses a variable stepsize (adjusts the stepsize optimally).
          • Converges in a few iterations, but with more computations.
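The effect of a fixed stepsize can be seen on a one-dimensional toy problem. The sketch below is only an illustration: the objective f(x) = (x - 3)^2, the starting point, and the stepsize values are arbitrary choices, not taken from the slides.

```python
def gradient_descent(grad, x0, stepsize, iterations):
    """Fixed-stepsize gradient descent on a scalar parameter."""
    x = x0
    for _ in range(iterations):
        # Move against the gradient direction.
        x = x - stepsize * grad(x)
    return x

def grad_f(x):
    """Gradient of the toy objective f(x) = (x - 3)^2, minimized at x = 3."""
    return 2.0 * (x - 3.0)

small_step = gradient_descent(grad_f, 0.0, 0.1, 100)   # converges toward 3
big_step = gradient_descent(grad_f, 0.0, 1.1, 50)      # oscillates and diverges
```

With stepsize 0.1 the distance to the minimum shrinks by a factor 0.8 per iteration; with stepsize 1.1 it is multiplied by -1.2, so the estimate jumps from side to side with growing amplitude.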

slide-35
SLIDE 35

Newton’s Method

  • Newton’s method uses the first- & second-order derivatives (the gradient & the Hessian matrix).
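For reference, the standard form of the Newton update can be written as below. This is the textbook formulation, given here with a stepsize α as an assumption about what the slide displayed; E denotes the objective function and a the parameter vector.

```latex
\mathbf{a}^{(l+1)} = \mathbf{a}^{(l)} - \alpha \left[ \mathbf{H}\!\left(\mathbf{a}^{(l)}\right) \right]^{-1} \mathbf{g}\!\left(\mathbf{a}^{(l)}\right),
\qquad
\mathbf{g}(\mathbf{a}) = \frac{\partial E}{\partial \mathbf{a}},
\qquad
\mathbf{H}(\mathbf{a}) = \frac{\partial^2 E}{\partial \mathbf{a}^2}
```

Compared with first-order descent, the gradient is rescaled by the inverse Hessian, so the step adapts to the local curvature of E.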

slide-36
SLIDE 36

Newton’s Method

  • Converges faster than the 1st-order method (i.e., requires fewer iterations to reach convergence).
  • Requires more calculations in each iteration.
  • More prone to noise (gradient calculation is more subject to noise with 2nd order than with 1st order).
  • Uses a constant stepsize (α) smaller than 1.
      • May not converge if α ≥ 1.
      • Should choose the stepsize appropriately to reach a good compromise between guaranteeing convergence & convergence rate.

slide-37
SLIDE 37

Newton-Raphson Method

  • Newton-Raphson method:
      • Approximates the 2nd-order gradient by a product of 1st-order gradients.
      • Applicable when the objective function is a sum of squared errors.
      • Only needs to calculate 1st-order gradients, yet converges at a rate similar to Newton’s method.

slide-38
SLIDE 38

Newton-Raphson Method

[Equation residue: a condition (“If: …”) followed by the Hessian approximation built from 1st-order gradients.]
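A sketch of the approximation, assuming the objective is a plain sum of squared error terms e_k(a) (the symbols e_k and a are chosen here for illustration; this form is commonly known as the Gauss-Newton approximation):

```latex
\text{If } E(\mathbf{a}) = \sum_k e_k(\mathbf{a})^2, \text{ then}
\qquad
\mathbf{g}(\mathbf{a}) = 2 \sum_k e_k \frac{\partial e_k}{\partial \mathbf{a}},
\qquad
\mathbf{H}(\mathbf{a}) = 2 \sum_k \frac{\partial e_k}{\partial \mathbf{a}} \left( \frac{\partial e_k}{\partial \mathbf{a}} \right)^{T} + 2 \sum_k e_k \frac{\partial^2 e_k}{\partial \mathbf{a}^2}
\approx 2 \sum_k \frac{\partial e_k}{\partial \mathbf{a}} \left( \frac{\partial e_k}{\partial \mathbf{a}} \right)^{T}
```

The second-derivative term is dropped, which is reasonable when the errors e_k are small near the solution, so only 1st-order gradients are needed.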

slide-39
SLIDE 39

Pixel-Based Motion Estimation

  • Horn-Schunck method:
      • OF + smoothness criterion.
  • Multipoint neighborhood method:
      • Assumes that every pixel in a small block surrounding a pixel has the same MV.
  • Pel-recursive method:
      • The MV for the current pel is updated from those of its previous pels, so that the MV does not need to be coded.
      • Developed for early generations of video coders.

slide-40
SLIDE 40

Multipoint Neighborhood Method

  • Estimates the MV at each pixel independently, by minimizing the DFD error over a neighborhood surrounding this pixel.
  • Every pixel in the neighborhood is assumed to have the same MV.
  • Minimizing (cost) function:

$$E_{\mathrm{DFD}}(\mathbf{d}_n) = \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x}) \left| \psi_2(\mathbf{x} + \mathbf{d}_n) - \psi_1(\mathbf{x}) \right|^2 \to \min$$

slide-41
SLIDE 41

Multipoint Neighborhood Method

  • Optimization method:
      • Exhaustive search (feasible as one only needs to search one MV at a time).
          • Needs to select the appropriate search range & the search stepsize.
      • Gradient-based method.

slide-42
SLIDE 42

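Example: Gradient Descent Method

$$E_{\mathrm{DFD}}(\mathbf{d}_n) = \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x}) \left| \psi_2(\mathbf{x} + \mathbf{d}_n) - \psi_1(\mathbf{x}) \right|^2 \to \min$$

With the error $e(\mathbf{x}, \mathbf{d}_n) = \psi_2(\mathbf{x} + \mathbf{d}_n) - \psi_1(\mathbf{x})$, the gradient is:

$$\mathbf{g}(\mathbf{d}_n) = \frac{\partial E_{\mathrm{DFD}}}{\partial \mathbf{d}_n} = 2 \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x})\, e(\mathbf{x}, \mathbf{d}_n) \left. \frac{\partial \psi_2}{\partial \mathbf{x}} \right|_{\mathbf{x}+\mathbf{d}_n}$$

and the Hessian is:

$$\mathbf{H}(\mathbf{d}_n) = \frac{\partial^2 E_{\mathrm{DFD}}}{\partial \mathbf{d}_n^2} = 2 \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x}) \left[ \frac{\partial \psi_2}{\partial \mathbf{x}} \left( \frac{\partial \psi_2}{\partial \mathbf{x}} \right)^{T} \right]_{\mathbf{x}+\mathbf{d}_n} + 2 \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x})\, e(\mathbf{x}, \mathbf{d}_n) \left. \frac{\partial^2 \psi_2}{\partial \mathbf{x}^2} \right|_{\mathbf{x}+\mathbf{d}_n} \approx 2 \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x}) \left[ \frac{\partial \psi_2}{\partial \mathbf{x}} \left( \frac{\partial \psi_2}{\partial \mathbf{x}} \right)^{T} \right]_{\mathbf{x}+\mathbf{d}_n}$$

First-order gradient descent:

$$\mathbf{d}_n^{(l+1)} = \mathbf{d}_n^{(l)} - \alpha\, \mathbf{g}\!\left(\mathbf{d}_n^{(l)}\right)$$

Newton-Raphson method:

$$\mathbf{d}_n^{(l+1)} = \mathbf{d}_n^{(l)} - \alpha \left[ \mathbf{H}\!\left(\mathbf{d}_n^{(l)}\right) \right]^{-1} \mathbf{g}\!\left(\mathbf{d}_n^{(l)}\right)$$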

slide-43
SLIDE 43

Simplification using OF Criterion

$$E_{\mathrm{OF}}(\mathbf{d}_n) = \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x}) \left( \nabla\psi_1(\mathbf{x})^T \mathbf{d}_n + \psi_2(\mathbf{x}) - \psi_1(\mathbf{x}) \right)^2 \to \min$$

$$\frac{\partial E_{\mathrm{OF}}}{\partial \mathbf{d}_n} = 2 \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x}) \left( \nabla\psi_1(\mathbf{x})^T \mathbf{d}_n + \psi_2(\mathbf{x}) - \psi_1(\mathbf{x}) \right) \nabla\psi_1(\mathbf{x}) = \mathbf{0}$$

$$\mathbf{d}_{n,\mathrm{opt}} = \left( \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x})\, \nabla\psi_1(\mathbf{x}) \nabla\psi_1(\mathbf{x})^T \right)^{-1} \left( \sum_{\mathbf{x}\in B(\mathbf{x}_n)} w(\mathbf{x}) \left( \psi_1(\mathbf{x}) - \psi_2(\mathbf{x}) \right) \nabla\psi_1(\mathbf{x}) \right)$$

This solution is good only if the actual MV is small. When this is not the case, one should iterate the above solution, with the following update:

$$\psi_2^{(l+1)}(\mathbf{x}) = \psi_2\!\left(\mathbf{x} + \mathbf{d}_n^{(l)}\right), \qquad \mathbf{d}_n^{(l+1)} = \mathbf{d}_n^{(l)} + \Delta_n^{(l+1)}$$

where $\Delta_n^{(l+1)}$ denotes the MV found at that iteration.
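The closed-form solution amounts to solving a 2x2 linear system per neighborhood. The Python sketch below illustrates one pass of it under several assumptions that are not in the slides: uniform weights w(x) = 1, central differences for the spatial gradient, frames stored as nested lists indexed `frame[y][x]`, and a hypothetical determinant threshold to detect neighborhoods where the flow is indeterminate.

```python
def of_motion_vector(anchor, target, block):
    """Closed-form MV (dx, dy) under the OF criterion for one neighborhood.

    block is a list of (x, y) pixel coordinates that have all four direct
    neighbors inside the frame. Returns None when the 2x2 matrix is
    (numerically) singular, e.g. flat regions or straight edges.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for x, y in block:
        # Central-difference estimate of the anchor frame gradient.
        gx = (anchor[y][x + 1] - anchor[y][x - 1]) / 2.0
        gy = (anchor[y + 1][x] - anchor[y - 1][x]) / 2.0
        diff = anchor[y][x] - target[y][x]        # psi_1 - psi_2
        a11 += gx * gx
        a12 += gx * gy
        a22 += gy * gy
        b1 += diff * gx
        b2 += diff * gy
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:                          # hypothetical threshold
        return None
    # Solve [[a11, a12], [a12, a22]] [dx, dy]^T = [b1, b2]^T by Cramer's rule.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Synthetic test pattern psi_1(x, y) = x*y, moved by d = (0.5, 0),
# so that psi_2(x, y) = psi_1(x - 0.5, y).
anchor = [[float(x * y) for x in range(5)] for y in range(5)]
target = [[(x - 0.5) * y for x in range(5)] for y in range(5)]
block = [(x, y) for y in (1, 2, 3) for x in (1, 2, 3)]
estimated = of_motion_vector(anchor, target, block)
```

For larger motions this single pass would be repeated, warping the target frame with the current estimate and accumulating the increments, as in the update rule above.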

slide-44
SLIDE 44

Block-Based Motion Estimation (A Brief Overview)

  • Assumes that all pixels in a block undergo a coherent motion & searches for the motion parameters of each block independently.
  • Block matching algorithm (BMA): assumes a translational motion, 1 MV per block (2 parameters):
      • Exhaustive BMA (EBMA).
      • Fast algorithms.
  • Deformable block matching algorithm (DBMA): allows more complex motion (affine, bilinear); to be discussed later.

slide-45
SLIDE 45

Block-Based Motion Estimation (A Brief Overview)


slide-46
SLIDE 46

Block Matching Algorithm

  • Overview:
      • Assumes that all pixels in a block undergo a translation, denoted by a single MV.
      • Estimates the MV for each block independently, by minimizing the DFD error over this block.
      • Results in non-smooth MVs, but better handles object boundaries, newly appearing objects, & the occlusion problem.
  • Minimizing function:

$$E_{\mathrm{DFD}}(\mathbf{d}_m) = \sum_{\mathbf{x}\in B_m} \left| \psi_2(\mathbf{x} + \mathbf{d}_m) - \psi_1(\mathbf{x}) \right|^p \to \min$$

slide-47
SLIDE 47

Block Matching Algorithm

  • Optimization method:
      • Exhaustive search (feasible as one only needs to search one MV at a time), using the MAD criterion (p=1).
      • Fast search algorithms.
      • Integer- vs. fractional-pel accuracy search.

slide-48
SLIDE 48

Exhaustive Block Matching Algorithm (EBMA)

slide-49
SLIDE 49

Complexity of Integer-Pel EBMA

  • Assumption:
      • Image size: MxM.
      • Block size: NxN.
      • Search range: (-R, R) in each dimension.
      • Search stepsize: 1 pixel (assuming integer MV).
  • Operation counts (1 operation = 1 “-”, 1 “+”, 1 “*”):
      • Each candidate position: N^2.
      • Each block going through all candidates: (2R+1)^2 N^2.
      • Entire frame: (M/N)^2 (2R+1)^2 N^2 = M^2 (2R+1)^2.
          • Independent of block size!

slide-50
SLIDE 50

Complexity of Integer-Pel EBMA

  • Example: M=512, N=16, R=16, 30 fps.
      • Total operation count = 2.85x10^8/frame ≈ 8.56x10^9/second.
  • Regular structure suitable for VLSI implementation.
  • Challenging for software-only implementation.
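The operation count is easy to reproduce. The short Python sketch below evaluates the formula (M/N)^2 (2R+1)^2 N^2 for the example parameters; the function name is an illustrative choice, and it assumes M is a multiple of N.

```python
def ebma_operations_per_frame(m, n, r):
    """Operation count of integer-pel EBMA for an MxM frame.

    m: frame size, n: block size (m assumed divisible by n),
    r: search range, i.e. candidates span -r..r in each dimension.
    """
    blocks = (m // n) ** 2              # number of NxN blocks in the frame
    candidates = (2 * r + 1) ** 2       # search positions per block
    per_candidate = n * n               # one operation per pixel in the block
    return blocks * candidates * per_candidate

per_frame = ebma_operations_per_frame(512, 16, 16)
per_second = per_frame * 30             # at 30 frames per second
```

Changing the block size (for example N=8 instead of N=16) leaves the total unchanged, since the N^2 factors cancel.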

slide-51
SLIDE 51

Sample Matlab Script for Integer-Pel EBMA

%f1: anchor frame; f2: target frame; fp: predicted image;
%mvx, mvy: store the MV image
%width x height: image size; N: block size; R: search range
f1=double(f1); f2=double(f2); %avoid integer saturation in the subtraction
fp=zeros(height,width);
for i=1:N:height-N+1,
  for j=1:N:width-N+1 %for every block in the anchor frame
    MAD_min=256*N*N; dx=0; dy=0;
    for k=-R:1:R,
      for l=-R:1:R %for every search candidate
        %skip candidates whose block falls outside the image boundary
        if i+k<1 || i+k+N-1>height || j+l<1 || j+l+N-1>width, continue; end;
        MAD=sum(sum(abs(f1(i:i+N-1,j:j+N-1)-f2(i+k:i+k+N-1,j+l:j+l+N-1))));
        %calculate MAD for this candidate
        if MAD<MAD_min
          MAD_min=MAD; dy=k; dx=l;
        end;
      end;
    end;
    fp(i:i+N-1,j:j+N-1)=f2(i+dy:i+dy+N-1,j+dx:j+dx+N-1);
    %put the best matching block in the predicted image
    iblk=floor((i-1)/N)+1; jblk=floor((j-1)/N)+1; %block index
    mvx(iblk,jblk)=dx; mvy(iblk,jblk)=dy; %record the estimated MV
  end;
end;

Note: This script is meant to illustrate the main operations involved. It simply skips candidate blocks that extend outside the image boundary; a more complete program could instead exclude only the out-of-frame pixels from the MAD.

slide-52
SLIDE 52

Fractional Accuracy EBMA

  • The real MV may not always be a multiple of the pixel spacing. To allow sub-pixel MVs, the search stepsize must be less than 1 pixel.
  • Half-pel EBMA: stepsize = 1/2 pixel in both dimensions.
  • Difficulty:
      • The target frame only has integer pels.
  • Solution:
      • Interpolate the target frame by a factor of two before searching.
      • Bilinear interpolation is typically used.

slide-53
SLIDE 53

Fractional Accuracy EBMA

  • Complexity:
      • 4 times that of integer-pel, plus additional operations for interpolation.
  • Fast algorithms:
      • Search in integer precision first, then refine in a small search region with half-pel accuracy.

slide-54
SLIDE 54

Half-Pel Accuracy EBMA

slide-55
SLIDE 55

Bilinear Interpolation

[Figure: input pixels (x, y), (x+1, y), (x, y+1), (x+1, y+1) and the interpolated output pixels (2x, 2y), (2x+1, 2y), (2x, 2y+1), (2x+1, 2y+1).]

O[2x, 2y] = I[x, y]
O[2x+1, 2y] = (I[x, y] + I[x+1, y])/2
O[2x, 2y+1] = (I[x, y] + I[x, y+1])/2
O[2x+1, 2y+1] = (I[x, y] + I[x+1, y] + I[x, y+1] + I[x+1, y+1])/4
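The four interpolation rules translate directly into code. The Python sketch below is a minimal illustration that stores images as nested lists indexed `img[y][x]` and, as an assumption made here to avoid reading outside the input, produces an output of size (2H-1) x (2W-1) rather than 2H x 2W.

```python
def upsample2_bilinear(img):
    """Interpolate an image by a factor of two with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for y in range(h):
        for x in range(w):
            # Integer-pel positions are copied unchanged.
            out[2 * y][2 * x] = float(img[y][x])
            if x + 1 < w:       # horizontal half-pel position
                out[2 * y][2 * x + 1] = (img[y][x] + img[y][x + 1]) / 2
            if y + 1 < h:       # vertical half-pel position
                out[2 * y + 1][2 * x] = (img[y][x] + img[y + 1][x]) / 2
            if x + 1 < w and y + 1 < h:   # diagonal half-pel position
                out[2 * y + 1][2 * x + 1] = (
                    img[y][x] + img[y][x + 1]
                    + img[y + 1][x] + img[y + 1][x + 1]) / 4
    return out

upsampled = upsample2_bilinear([[0, 2],
                                [4, 6]])
```

A half-pel search can then run an integer-pel search on the interpolated target frame, with displacements measured in units of half a pixel.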

slide-56
SLIDE 56

[Figure panels: anchor frame, target frame, motion field, predicted anchor frame (29.86 dB).]

Example: Half-pel EBMA.

slide-57
SLIDE 57

Pros & Cons with EBMA

  • Blocking artifacts (discontinuity across block boundaries) in the predicted image:
      • Because the block-wise translation model is not accurate.
      • Fix: deformable BMA (next lecture).
  • Motion field somewhat chaotic:
      • Because MVs are estimated independently from block to block.
      • Fix 1: mesh-based motion estimation (next lecture).
      • Fix 2: imposing a smoothness constraint explicitly.

slide-58
SLIDE 58

Pros & Cons with EBMA

  • Wrong MV in flat regions:
      • Because motion is indeterminate when the spatial gradient is near zero.
  • Nonetheless, widely used for motion-compensated prediction in video coding:
      • Because of its simplicity & optimality in minimizing the prediction error.

slide-59
SLIDE 59

Fast Algorithms for BMA

  • Key ideas to reduce the computation in EBMA:
      • Reduce the number of search candidates:
          • Only search for those that are likely to produce small errors.
          • Predict possible remaining candidates, based on previous search results.
      • Simplify the error measure (DFD) to reduce the computation involved for each candidate.
  • Classical fast algorithms:
      • Three-step.
      • 2-D log.
      • Conjugate direction.

slide-60
SLIDE 60

Fast Algorithms for BMA

  • Many new fast algorithms have been developed since then.
      • Some are suitable for software implementation, others for VLSI implementation (memory access, etc.).

slide-61
SLIDE 61

2-D Log Search

  • Each step tests 5 search points arranged in a diamond.
  • Initial stepsize is half of the search range.
  • Search stepsize is reduced if the best matching point is:
      • the center point, or
      • on the border of the max search range.
  • Final step is reached when:
      • stepsize is reduced to 1 pel, &
      • 9 search points are examined at this last step.
  • No. of steps cannot be determined in advance.

[Figure: 2-D log search example with the final match marked. Best matching MVs in steps 1-5 are: (0,2), (0,4), (2,4), (2,6), & (2,6).]

slide-62
SLIDE 62

Three-Step Search Algorithm

  • Each step tests 8 search points, except the first step, which tests 9.
  • Initial stepsize is half of the search range.
  • Search stepsize is halved after each step.
  • Final step is reached when:
      • stepsize is reduced to 1 pel, &
      • 8 search points are examined at this last step.
  • Total no. of search points is 8L+1, where L is the no. of steps.
  • Well suited to VLSI implementation.

[Figure: three-step search example with decreasing steps and the final match marked. Best matching MVs in steps 1–3 are: (3,3), (3,5), & (2,6).]
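The search pattern can be sketched compactly in Python. In this illustration the matching error is supplied as a callable `cost(dx, dy)` (a hypothetical stand-in for the block DFD), the initial stepsize is taken as the search range halved and rounded up, and candidates outside the search range are skipped; all three are assumptions made for the sketch.

```python
def three_step_search(cost, search_range):
    """Coarse-to-fine block search; returns (best_mv, number_of_evaluations).

    cost(dx, dy) returns the matching error of candidate MV (dx, dy).
    """
    step = (search_range + 1) // 2      # about half of the search range
    best = (0, 0)
    best_cost = cost(0, 0)              # the center is tested once, up front
    evaluations = 1
    while step >= 1:
        cx, cy = best                   # center of this step's 3x3 pattern
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue            # center already evaluated
                dx, dy = cx + ox, cy + oy
                if abs(dx) > search_range or abs(dy) > search_range:
                    continue            # stay inside the search range
                c = cost(dx, dy)
                evaluations += 1
                if c < best_cost:
                    best, best_cost = (dx, dy), c
        step //= 2                      # halve the stepsize after each step
    return best, evaluations

# Toy error surface with a single minimum at MV (-6, 7).
mv, tested = three_step_search(lambda dx, dy: (dx + 6) ** 2 + (dy - 7) ** 2, 7)
```

With a search range of 7 the stepsizes are 4, 2, 1, so 1 + 8*3 = 25 candidates are tested instead of the 225 that an exhaustive search over the same range would need. On error surfaces with several local minima the result may differ from the exhaustive one.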

slide-63
SLIDE 63

Three-Step Search Algorithm

[Figure: three-step search with decreasing steps; the final match is marked.]

slide-64
SLIDE 64

VcDemo Example

  • VcDemo: Image & Video Compression Learning Tool.
  • Developed at Delft University of Technology: http://www-ict.its.tudelft.nl/~inald/vcdemo/
  • Use the ME tool to show the motion estimation results with different parameter choices.

slide-65
SLIDE 65

Summary

  • Optical flow equation:
      • Derived from constant intensity & small motion assumptions.
      • Ambiguity in motion estimation.
  • How to represent motion:
      • Pixel-based, block-based, region-based, mesh-based, global, etc.
  • Estimation criterion:
      • DFD (constant intensity).
      • OF (constant intensity + small motion).
      • Bayesian (MAP, DFD + motion smoothness).

slide-66
SLIDE 66

Summary

  • Search method:
      • Exhaustive search, gradient-descent, multi-resolution (next lecture).
  • Basic motion estimation techniques:
      • Pixel-based:
          • Most accurate representation, but also most costly to estimate.
      • Block-based:
          • EBMA, integer-pel vs. half-pel accuracy, fast algorithms.
          • Good trade-off between accuracy & speed.
          • EBMA (and its fast but suboptimal variants) is widely used in video coding for motion-compensated temporal prediction.

slide-67
SLIDE 67

Homework 4

  • Reading assignment:
      • Chap. 6: Sec. 6.1-6.4 (Sec. 6.4.5, 6.4.6 not required), & Apx. A & B.
  • Written assignment:
      • Prob. 6.4, 6.5, 6.6

slide-68
SLIDE 68

Homework 4

  • Computer assignment:
      • Prob. 6.12, 6.13
      • Optional: Prob. 6.14
  • Note: you can download sample video frames from the course webpage. When applying your motion estimation algorithm, you should choose two frames that have sufficient motion in between, so that it is easy to observe the effect of motion estimation inaccuracy. If necessary, choose two frames that are several frames apart. For example, foreman: frame 100 & frame 103.

slide-69
SLIDE 69

The End