In the name of Allah the compassionate, the merciful
Digital Video Systems
S. Kasaei
Room: CE 307, Department of Computer Engineering, Sharif University of Technology
E-Mail: skasaei@sharif.edu
Webpage: http://sharif.edu/~skasaei
Lab. Website: http://mehr.sharif.edu/~ipl
Acknowledgment
Most of the slides used in this course have been provided by Prof. Yao Wang (Polytechnic University, Brooklyn), based on the book: Video Processing & Communications, by Yao Wang, Jörn Ostermann, & Ya-Qin Zhang, Prentice Hall, 1st edition, 2001, ISBN: 0130175471. [SUT Code: TK 5105 .2 .W36 2001].
Chapter 6 2-D Motion Estimation Part II: Advanced Techniques
Outline
- Problems with EBMA
- Deformable block matching algorithm (DBMA):
  - Node-based motion model
- Mesh-based motion estimation:
  - Mesh-based motion representation
  - Mesh-based motion estimation
- Global motion estimation:
  - Direct method
  - Indirect method
- Region-based motion estimation
- Multi-resolution motion estimation:
  - Hierarchical block matching algorithm (HBMA)
- Summary
Problems with EBMA
- Blocking artifact (discontinuity across block boundaries) in the predicted image:
  - Because the block-wise translation model is not accurate.
  - Real motion in a block may be more complicated than a pure translation (rotation, zooming, …).
  - Fix: deformable BMA:
    - Uses a more sophisticated model: affine, bilinear, or perspective mapping (to describe block motion).
Problems with EBMA
- There may be multiple objects with different motions in a block.
  - Fix:
    - Region-based motion estimation.
    - Mesh-based motion estimation (using adaptive meshes).
- Intensity changes may be due to illumination effects:
  - Should compensate for the illumination effect before applying the “constant intensity assumption”.
Problems with EBMA
- Motion field is somewhat chaotic:
  - Because MVs are estimated independently from block to block.
  - Fix:
    - Imposing a smoothness constraint explicitly.
    - Multi-resolution approach.
    - Mesh-based motion estimation.
- Wrong MV in flat regions:
  - Because motion is indeterminate when the spatial gradient is near zero.
  - Ideally, should use non-regular partitions.
  - Fix: region-based motion estimation.
Problems with EBMA
- Requires tremendous computation!
  - Fix:
    - Fast algorithms.
    - Multi-resolution.
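The computational burden of exhaustive search can be estimated with simple arithmetic. The sketch below assumes integer-pel accuracy, one absolute-difference operation per pixel per candidate MV, and a search range of ±R pixels in each direction; the function name and the 352x288 example are illustrative assumptions, not taken from the slides.

```python
def ebma_operations(width, height, search_range):
    """Approximate operation count for exhaustive integer-pel EBMA on one frame.

    Every pixel is compared once per candidate MV, and there are
    (2R + 1)^2 candidates, so the block size cancels out of the total.
    """
    candidates = (2 * search_range + 1) ** 2
    return width * height * candidates


# Hypothetical example: a 352x288 frame searched with R = 16.
print(ebma_operations(352, 288, 16))  # 110398464
```

At this rate, a few tens of frames per second already costs billions of operations per second, which is why fast and multi-resolution searches are listed as fixes.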
Deformable Block Matching Algorithm (DBMA)
Overview of DBMA
- Three steps:
  - Partition the anchor frame into regular blocks.
  - Model the motion in each block by a more complex motion model:
    - A 2-D motion caused by a flat surface patch undergoing a rigid 3-D motion can be approximated well by a projective mapping.
    - Projective mapping can be approximated by affine mapping + bilinear mapping.
    - Various possible mappings can be described by a node-based motion model.
Overview of DBMA
- Estimate the motion parameters block by block independently (step three):
  - The discontinuity problem across block boundaries still remains.
  - Still cannot solve the problem of multiple motions within a block or changes due to illumination effects!
Problems with DBMA
- There might be motion discontinuity across block boundaries (because nodal MVs are estimated independently from block to block):
  - Fix: mesh-based motion estimation.
Problems with DBMA
- Cannot do well on blocks with multiple moving objects or changes due to illumination effects.
  - Three-mode method:
    - First, apply EBMA to all blocks.
    - Blocks with small EBMA errors have translational motion.
    - Blocks with large EBMA errors may have non-translational motion:
      - Apply DBMA to these blocks.
      - Blocks still having large errors are non-motion-compensable.
  - [Ref] O. Lee and Y. Wang, Motion compensated prediction using nodal-based deformable block matching. J. Visual Communications and Image Representation (March 1995), 6:26-34.
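The three-mode decision can be sketched as a small classifier. This is only an illustration of the logic listed above: the single `threshold` separating "small" from "large" errors and the `run_dbma` callback are hypothetical, and the cited paper may define the modes differently.

```python
def choose_mode(ebma_error, run_dbma, threshold):
    """Pick a mode for one block from its prediction errors.

    ebma_error: prediction error of the block under EBMA.
    run_dbma:   hypothetical callable that runs DBMA on the block and
                returns its prediction error (called only when needed).
    threshold:  hypothetical error level separating "small" from "large".
    """
    if ebma_error <= threshold:
        return "translational"       # EBMA alone predicts the block well
    if run_dbma() <= threshold:
        return "deformable"          # DBMA captures the non-translational motion
    return "non-compensable"         # neither model predicts the block well


print(choose_mode(3.0, lambda: 1.0, 5.0))    # translational
print(choose_mode(9.0, lambda: 2.0, 5.0))    # deformable
print(choose_mode(9.0, lambda: 8.0, 5.0))    # non-compensable
```

Passing DBMA as a callback reflects that the costlier estimator only runs on blocks where EBMA fails.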
Affine & Bilinear Model
- Affine (6 parameters):
  - Good for mapping triangles to triangles.
  \[ \begin{bmatrix} d_x(x,y) \\ d_y(x,y) \end{bmatrix} = \begin{bmatrix} a_0 + a_1 x + a_2 y \\ b_0 + b_1 x + b_2 y \end{bmatrix} \]
- Bilinear (8 parameters):
  - Good for mapping blocks to quadrangles.
  \[ \begin{bmatrix} d_x(x,y) \\ d_y(x,y) \end{bmatrix} = \begin{bmatrix} a_0 + a_1 x + a_2 y + a_3 xy \\ b_0 + b_1 x + b_2 y + b_3 xy \end{bmatrix} \]
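As a minimal sketch, the two models can be evaluated directly from their coefficients. The function names and the example coefficient values are illustrative assumptions.

```python
def affine_mv(x, y, a, b):
    """Affine displacement at (x, y); a and b each hold 3 coefficients."""
    dx = a[0] + a[1] * x + a[2] * y
    dy = b[0] + b[1] * x + b[2] * y
    return dx, dy


def bilinear_mv(x, y, a, b):
    """Bilinear displacement at (x, y); a and b each hold 4 coefficients."""
    dx = a[0] + a[1] * x + a[2] * y + a[3] * x * y
    dy = b[0] + b[1] * x + b[2] * y + b[3] * x * y
    return dx, dy


# Pure translation: only the 0-th order coefficients are non-zero.
print(affine_mv(5, 7, (2, 0, 0), (-1, 0, 0)))        # (2, -1)
# Uniform 10% zoom about the origin: displacement grows with the coordinates.
print(affine_mv(10, 20, (0, 0.1, 0), (0, 0, 0.1)))   # (1.0, 2.0)
```

With `a[3] = b[3] = 0`, `bilinear_mv` reduces to `affine_mv`, so the affine model is a special case of the bilinear one.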
Difficulties in Estimating Affine & Bilinear Motion Parameters
- The coefficients need floating-point precision.
- The coefficients have different influences on the estimated motion:
  - The 0-th order coefficients (a0, b0) represent the translation component.
  - The other coefficients’ influence depends on the pixel coordinates.
Node-Based Motion Model
- Control nodes (can move freely); in this example: block corners.
- Motion at other points is interpolated from the nodal MVs d_{m,k}: the displacement at any point in the element is a weighted sum of the nodal MVs, where each weight is the “interpolation kernel” associated with node k in element m.
- Control node MVs can be described with integer- or half-pel accuracy; all have the same importance.
- Translation (1-node), affine (3-nodes), & bilinear (4-nodes) are special cases of this model.
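The node-based model is usually written in the following form. This is a sketch of the standard expression; the symbols \mathcal{B}_m (the set of pixels in element m) and K (the number of nodes per element) are notational assumptions that do not appear in the slide text.

```latex
\mathbf{d}_m(\mathbf{x}) = \sum_{k=1}^{K} \phi_{m,k}(\mathbf{x})\,\mathbf{d}_{m,k},
\qquad \mathbf{x} \in \mathcal{B}_m
```

Here \mathbf{d}_m(\mathbf{x}) is the displacement at point \mathbf{x} in element m, \mathbf{d}_{m,k} is the MV of node k, and \phi_{m,k} is the interpolation kernel associated with that node. With K = 1 and \phi_{m,1} = 1 the model reduces to pure translation.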
Interpolation Kernels
- The interpolation kernels must be chosen to guarantee continuity across element boundaries.
- Shape functions of the standard triangular element:
  - Affine function.
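The conditions usually imposed on interpolation kernels, and a common choice of shape functions for a standard triangle, can be written as below. The slide's own equations are not preserved in this text, so this is a reconstruction under assumptions: the node placement at (1,0), (0,1), (0,0) and the symbol \mathcal{B} for the element are illustrative.

```latex
% Conditions on the kernels of an element \mathcal{B} with nodes \mathbf{x}_l:
0 \le \phi_k(\mathbf{x}) \le 1, \qquad
\sum_{k} \phi_k(\mathbf{x}) = 1 \quad \forall\, \mathbf{x} \in \mathcal{B}, \qquad
\phi_k(\mathbf{x}_l) = \delta_{k,l}

% Standard triangle with nodes at (1,0), (0,1), (0,0):
\phi_1(x,y) = x, \qquad \phi_2(x,y) = y, \qquad \phi_3(x,y) = 1 - x - y
```

The last condition makes the interpolated motion at a node equal to that node's MV, and each triangular shape function is affine in (x, y), which matches the "affine function" noted above.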
Estimation of Nodal Motions
- Shape functions of the standard quadrilateral element:
  - Bilinear function.
- Objective DFD function:
  - Difficult to calculate!
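For reference, a common form of the bilinear shape functions and of the DFD objective is sketched below. The slide's equations are not preserved here, so the node ordering on the square [-1,1] x [-1,1], the frame symbols \psi_1 (anchor) and \psi_2 (target), and the exponent p are assumptions.

```latex
% Standard square with nodes at (1,1), (-1,1), (-1,-1), (1,-1):
\phi_1 = \tfrac{1}{4}(1+x)(1+y), \quad
\phi_2 = \tfrac{1}{4}(1-x)(1+y), \quad
\phi_3 = \tfrac{1}{4}(1-x)(1-y), \quad
\phi_4 = \tfrac{1}{4}(1+x)(1-y)

% DFD error of one element \mathcal{B} as a function of its K nodal MVs:
E(\mathbf{d}_1,\dots,\mathbf{d}_K) = \sum_{\mathbf{x}\in\mathcal{B}}
  \Bigl|\, \psi_2\Bigl(\mathbf{x} + \sum_{k=1}^{K}\phi_k(\mathbf{x})\,\mathbf{d}_k\Bigr)
  - \psi_1(\mathbf{x}) \Bigr|^{p}
```

Because every nodal MV moves every pixel of the element through its kernel, the error cannot be minimized one MV at a time without approximation.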
Estimation of Nodal Motions
- Search method:
  - Exhaustive search:
    - Search K nodal MVs simultaneously in integer- or half-pel accuracy (may not be feasible in practice).
  - Gradient descent approach:
    - See textbook for the Newton-Raphson update algorithm.
    - Solution depends on the initial solution. A good initial solution is the translation MV found using EBMA.
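The sketch below illustrates the role of the initial solution with a simpler search than either method listed above: a greedy, one-node-at-a-time integer-pel search over a small window, started from a translational MV such as the one EBMA would provide. It is not the Newton-Raphson algorithm from the textbook. All function names, the nearest-neighbour sampling, and the corner ordering are assumptions; the block size `n` must be at least 2.

```python
def bilinear_kernels(u, v):
    """Weights of the corner nodes (TL, TR, BL, BR) at (u, v) in [0, 1]^2."""
    return ((1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v)


def sample(img, x, y):
    """Nearest-neighbour lookup with border clamping (a simplification)."""
    h, w = len(img), len(img[0])
    xi = min(max(int(round(x)), 0), w - 1)
    yi = min(max(int(round(y)), 0), h - 1)
    return img[yi][xi]


def dfd_error(anchor, target, x0, y0, n, nodal_mvs):
    """Sum of absolute DFD over the n x n block with top-left pixel (x0, y0)."""
    err = 0.0
    for j in range(n):
        for i in range(n):
            w = bilinear_kernels(i / (n - 1), j / (n - 1))
            dx = sum(wk * mv[0] for wk, mv in zip(w, nodal_mvs))
            dy = sum(wk * mv[1] for wk, mv in zip(w, nodal_mvs))
            predicted = sample(target, x0 + i + dx, y0 + j + dy)
            err += abs(predicted - anchor[y0 + j][x0 + i])
    return err


def estimate_nodal_mvs(anchor, target, x0, y0, n, init_mv, radius=1, sweeps=3):
    """Greedy refinement of the four nodal MVs, starting from init_mv."""
    mvs = [tuple(init_mv)] * 4
    best = dfd_error(anchor, target, x0, y0, n, mvs)
    for _ in range(sweeps):
        improved = False
        for k in range(4):
            cx, cy = mvs[k]                      # window centre for node k
            for ddy in range(-radius, radius + 1):
                for ddx in range(-radius, radius + 1):
                    trial = list(mvs)
                    trial[k] = (cx + ddx, cy + ddy)
                    e = dfd_error(anchor, target, x0, y0, n, trial)
                    if e < best:                 # keep only strict improvements
                        best, mvs, improved = e, trial, True
        if not improved:
            break
    return mvs, best
```

Like gradient descent, this search only finds a local minimum near `init_mv`, which is why a good starting MV matters.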
Mesh-Based Motion Estimation (An Overview)
[Figure: a frame partitioned into non-overlapping polygonal elements. (a) Using a triangular mesh. (b) Using a quadrilateral mesh.]
Mesh-Based vs. Block-Based Motion Estimation
[Figure: (a) block-based backward ME (blocking artifacts). (b) mesh-based backward ME (continuous tracking; better to have separate meshes for different objects). (c) mesh-based forward ME.]
Mesh-Based Motion Model
- The motion in each element is interpolated from nodal MVs.
- Mesh-based vs. node-based model:
  - Mesh-based: each node has a single MV, which influences the motion of all four adjacent elements.
  - Node-based: each node can have four different MVs, depending on which element it is considered to be in.
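The difference shows up in how nodes are indexed. The sketch below assumes a regular quadrilateral mesh with nodes numbered row by row; the function name and numbering scheme are illustrative, not from the slides.

```python
def element_nodes(row, col, cols):
    """Global node indices (TL, TR, BL, BR) of the element at (row, col)
    in a regular quadrilateral mesh with `cols` elements per row."""
    stride = cols + 1                     # number of nodes in one mesh row
    tl = row * stride + col
    return (tl, tl + 1, tl + stride, tl + stride + 1)


# Mesh-based model: one MV per global node, shared by every adjacent element.
cols = 2
left = element_nodes(0, 0, cols)          # (0, 1, 3, 4)
right = element_nodes(0, 1, cols)         # (1, 2, 4, 5)
print(sorted(set(left) & set(right)))     # [1, 4]
```

Nodes 1 and 4 lie on the common edge, so both elements read the same two MVs there and the motion field stays continuous across it. In the node-based model each block would store its own four MVs instead.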
Mesh Generation & Motion Estimation
- Two problems:
  - Motion estimation: given a mesh in the anchor frame, determine the nodal positions in the target frame.
  - Mesh generation: set up the mesh in the anchor frame so that it conforms with object boundaries.
- Backward ME: can use either a regular mesh or an object-adaptive mesh at each new frame.
  - Motion estimation is easier with a regular mesh, but an adaptive mesh can yield more accurate results.
- Forward ME:
  - Only needs to establish a mesh for the initial frame. Meshes in the following frames depend on the nodal MVs between successive frames.
  - To accommodate appearing/disappearing objects, the mesh geometry needs to be updated.
- We only discuss the motion estimation problem here.
Estimation of Nodal Motion
- Unlike DBMA, all nodal MVs should be estimated simultaneously.
- Unless the anchor frame uses a regular mesh, the interpolation kernels are complicated.
- To simplify, use a mapping to a master element.
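One standard way to express such a mapping is the isoparametric form used in finite-element methods, sketched below. The slide's own equations are not preserved, so the symbols are assumptions: \mathbf{u} is a point in the master element, \tilde{\phi}_k are the master element's shape functions, and \mathbf{x}_{m,k} are the node positions of element m.

```latex
\mathbf{x}(\mathbf{u}) = \sum_{k=1}^{K} \tilde{\phi}_k(\mathbf{u})\,\mathbf{x}_{m,k},
\qquad
\mathbf{d}_m\bigl(\mathbf{x}(\mathbf{u})\bigr) = \sum_{k=1}^{K} \tilde{\phi}_k(\mathbf{u})\,\mathbf{d}_{m,k}
```

The same simple kernels \tilde{\phi}_k then serve every element regardless of its shape, and only the node positions and nodal MVs change from element to element.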
Estimation of Nodal Motion (cntd)
- Simplification:
  - Update one node at a time, minimizing DFD over all adjacent elements:
    - Gradient descent method [Wang and Lee 1994].
    - Exhaustive search [Wang and Ostermann 1998].
- Update order is important:
  - First, update those nodes where motion can be estimated accurately (near edges).
- Motion of the node being updated should be constrained not to cause excessively deformed elements.
Example: Half-pel EBMA
[Figure panels: anchor frame, target frame, motion field, predicted anchor frame (29.86 dB).]
EBMA vs. Mesh-based Motion Estimation
[Figure panels: EBMA (29.86 dB), mesh-based method (29.72 dB).]
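The dB figures quoted for the predicted frames are presumably PSNR values; the slides do not say so explicitly. Under that assumption, and assuming 8-bit images with a peak value of 255, the measure can be computed as in this sketch (the function name is illustrative).

```python
import math


def psnr(original, predicted, peak=255.0):
    """PSNR in dB between two equally sized images given as lists of rows."""
    sq_err = 0.0
    count = 0
    for row_o, row_p in zip(original, predicted):
        for o, p in zip(row_o, row_p):
            sq_err += (o - p) ** 2
            count += 1
    mse = sq_err / count
    if mse == 0:
        return math.inf                   # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher value means a smaller mean squared prediction error, so the two results above differ by only 0.14 dB, even though the mesh-based prediction avoids blocking artifacts.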
Estimation of Nodal Motion (cntd)
- In order to handle newly appearing or disappearing objects in a scene, one should allow for the deletion of nodes corresponding to disappeared objects, and the creation of new nodes in newly appearing objects.
Global Motion Estimation
- Global motion arises from camera motion, or when the imaged scene consists of a single object undergoing a rigid 3-D motion:
  - Camera moving over a stationary scene:
    - Most projected camera motions can be captured by an affine mapping!
  - The scene moves in its entirety (a rare event)!
- The motion at any pixel can be decomposed into a global motion (caused by camera movement) & a local motion (caused by the movement of the underlying object).
- Typically, the scene can be decomposed into several major regions, each moving differently (region-based motion estimation).
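Since an affine mapping captures most camera motions, one way to estimate global motion is to fit affine parameters to MVs that were already estimated at a set of pixels (for example, block MVs from EBMA). The sketch below is a plain unweighted least-squares fit via the normal equations. It is an assumption-laden illustration rather than the method of the slides: it has no weighting or outlier rejection for pixels with local motion, and all function names are hypothetical.

```python
def solve3(A, r):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [list(A[i]) + [r[i]] for i in range(3)]
    for c in range(3):
        p = max(range(c, 3), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, 3):
            f = M[i][c] / M[c][c]
            for j in range(c, 4):
                M[i][j] -= f * M[c][j]
    x = [0.0] * 3
    for i in (2, 1, 0):                   # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (M[i][3] - s) / M[i][i]
    return x


def fit_affine(points, mvs):
    """Least-squares fit of d_x = a0 + a1*x + a2*y and d_y = b0 + b1*x + b2*y.

    points: list of (x, y) positions; mvs: list of (dx, dy) at those positions.
    Needs at least three non-collinear points. Returns (a, b).
    """
    A = [[0.0] * 3 for _ in range(3)]
    rx, ry = [0.0] * 3, [0.0] * 3
    for (x, y), (dx, dy) in zip(points, mvs):
        basis = (1.0, x, y)
        for i in range(3):
            for j in range(3):
                A[i][j] += basis[i] * basis[j]   # accumulate normal equations
            rx[i] += basis[i] * dx
            ry[i] += basis[i] * dy
    return solve3(A, rx), solve3(A, ry)
```

In practice, pixels belonging to independently moving objects would need to be down-weighted or excluded, since their local motion biases an unweighted fit.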