The Hong Kong Polytechnic University, Department of Electronic and Information Engineering. Prof. W.C. Siu, December 2007. ICICS 2007: 6th International Conference on Information, Communications and Signal Processing, 10-13 December 2007, Singapore.


On Modelling the Hybrid Video Coding for Analysis and Future Development (Invited Paper)

Wan-Chi Siu and Ko-Cheung Hui Centre for Signal Processing Department of Electronic and Information Engineering The Hong Kong Polytechnic University

Invited Speaker: Prof. Wan-Chi Siu



Outline

1. Introduction
2. Former Models of the Autocorrelation of Block-Based Motion Prediction Errors
3. The Proposed Model
4. Experimental Results
5. Conclusion and Further Development


1. Introduction

Most of the work on the design and optimization of hybrid video codecs is carried out experimentally. It is always desirable to have a proper theoretical treatment of the motion-compensated video coding system, which is extremely useful for analysing the codecs available nowadays and for designing new and efficient codecs for future applications. Such a treatment requires many assumptions and simplifications; a simple Markov model with the assumption of wide-sense stationary signals may not work well.


Recall that hybrid video coding [1-2] is the most popular approach for video coding, and it has been adopted by most recent video coding standards. It makes use of efficient motion estimation algorithms [4-7] to form block-based motion-compensated frame difference (MCFD) signals, which are subsequently coded by the discrete cosine transform (DCT) or the integer cosine transform (ICT) [8].


Hybrid Video Coding

[Block diagram: the source video enters motion estimation against a reference held in the frame memory; the motion-compensated predicted frame is subtracted from the input, and the residual is transformed by the 2D-DCT, quantized, and VLC-coded into the compressed bit stream (011010…111) through a buffer whose fullness drives a regulator; a dequantizer and inverse 2D-DCT reconstruct the frame that refills the frame memory.]


[The same block diagram repeated, highlighting the error frame: the difference between the source frame and the motion-compensated predictive frame is the signal that is transformed and coded.]


A proper theoretical treatment of motion-compensated video coding is valuable for the design and analysis of state-of-the-art video codecs, even though most research work has been carried out experimentally. The CP model: in 1987, the first comprehensive rate-distortion analysis of motion-compensated prediction (MCP) was presented [9]. After this initial analysis, a number of researchers investigated the subject in depth and developed many different techniques for efficiency improvement [10-21].


2. Former Models of the Autocorrelation of Block-Based Motion Prediction Errors

Let us first define the covariance estimates of an N×N square matrix S as follows: S = [s_v,0, s_v,1, …, s_v,N−1] = [s_h,0, s_h,1, …, s_h,N−1]^T, where the rows s_h,n and columns s_v,n of S are realizations of the vectors S_h and S_v, respectively, and usually N = 8, representing the block size. An estimate of the covariance matrix in the horizontal direction is defined as C_h = E(S_h S_h^T), and its elements in vector form are estimated as

Ĉ_h = (1/N) · Σ_{n=0}^{N−1} s_h,n s_h,n^T
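The covariance estimate above can be sketched numerically. A minimal pure-Python version follows; the 8×8 block of prediction errors here is synthetic, for illustration only:

```python
import random

def horizontal_covariance(S):
    """Estimate C_h = E[S_h S_h^T] of an N x N block S by averaging the
    outer products of its rows (each row is one realization of S_h)."""
    N = len(S)
    C = [[0.0] * N for _ in range(N)]
    for row in S:
        for i in range(N):
            for j in range(N):
                C[i][j] += row[i] * row[j] / N
    return C

# Synthetic 8x8 block of zero-mean prediction errors
random.seed(0)
block = [[random.gauss(0, 1) for _ in range(8)] for _ in range(8)]
C_h = horizontal_covariance(block)
```

The vertical covariance C_v follows in the same way after transposing the block.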


Chen and Pang [16,19] proposed a theoretical model (the CP model) to represent the autocorrelation function of the residual errors of a motion-compensated frame. The residual errors were regarded as random variables in both the horizontal and vertical directions. (i) The probability density function (pdf) was assumed to be uniformly distributed over an interval, and (ii) an impulse at the origin was included. This impulse represents the finite probability that a motion vector has zero absolute error. A compound covariance sequence of the prediction errors, C(I), was defined as

C(I) = A·δ(I) + (1 − A)·ρ^|I| = C1(I) + C2(I)   (1)

where I is the pixel separation in the x-dimension, and δ(I) is the Kronecker delta function with δ(0) = 1 and δ(I) = 0 for I ≠ 0; A = 0.5 and ρ = 0.95 were used for the motion-compensated frame difference.


This model assumes that the prediction errors of a block are the sum of two uncorrelated zero-mean WSS processes, C(I) = C1(I) + C2(I), or, in matrix form,

C_I = A·I_N + (1 − A)·Toeplitz(1, ρ, ρ², …, ρ^{N−1})

where Toeplitz(·) is the symmetric N×N matrix with entry ρ^|i−j| at position (i,j). The first component, C1(I), in eqn. (1) represents the autocorrelation of a first-order autoregressive process, AR(1), with ρ = 0.95. The second component represents white noise with a flat power spectrum [16]. However, this model deviates significantly from experimental results.
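A sketch of eqn (1) and its matrix form follows; the function reproduces the compound covariance with the paper's values A = 0.5 and ρ = 0.95 as defaults:

```python
def cp_covariance(I, A=0.5, rho=0.95):
    """CP-model compound covariance of eqn (1): an impulse (white noise)
    plus an AR(1) term."""
    delta = 1.0 if I == 0 else 0.0
    return A * delta + (1 - A) * rho ** abs(I)

def cp_matrix(N=8, A=0.5, rho=0.95):
    """N x N compound covariance matrix: A*I_N + (1-A)*Toeplitz(rho^|i-j|)."""
    return [[cp_covariance(i - j, A, rho) for j in range(N)] for i in range(N)]

M = cp_matrix()
```

The diagonal is 1 (impulse plus AR(1) peak) and off-diagonals decay as (1 − A)·ρ^|i−j|.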


In [17], Niehsen and Brünig confirmed that the statistical means and standard deviations of the errors may change significantly from block to block. Hence they proposed another compound covariance model (the NB model) empirically, which takes overlapped block motion estimation into account. The compound covariance of the prediction error, Ce(I), was defined with two correlation functions (a first-order and a second-order term):

Ce(I) = c·ρ0^|I| + (1 − c)·ρ1^(I²)   (2)

where c, ρ0 and ρ1 are model parameters. The parameters c = 0.17, ρ0 = 0.91 and ρ1 = 0.38 were chosen to fit their empirical covariance in the l1-norm sense. According to their experimental results, the model closely fitted the characteristics of practical signals. Its major disadvantage, however, is that it lacks a theoretical basis, and thus its use for other analytical purposes is limited.
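The NB model of eqn (2) can be sketched the same way; the CP model is repeated so the two former models can be compared side by side (parameter defaults are the fitted values quoted above):

```python
def nb_covariance(I, c=0.17, rho0=0.91, rho1=0.38):
    """NB-model compound covariance of eqn (2): a first-order term rho0^|I|
    plus a second-order term rho1^(I^2)."""
    return c * rho0 ** abs(I) + (1 - c) * rho1 ** (I * I)

def cp_covariance(I, A=0.5, rho=0.95):
    """CP-model compound covariance of eqn (1), for comparison."""
    delta = 1.0 if I == 0 else 0.0
    return A * delta + (1 - A) * rho ** abs(I)

# Both models over a few pixel separations
table = [(I, cp_covariance(I), nb_covariance(I)) for I in range(5)]
```

Both models normalize to 1 at I = 0; the second-order term of the NB model makes its curve fall off much faster near the origin.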


3. The Proposed Model

For the sake of simplicity, our model is also based on the first-order autoregressive model, AR(1) [22], with image correlation coefficient ρ. Consider a block of pixels ft(i,j) in a frame at time t. Block-based motion compensation uses a matched block ft−1(i+u, j+v) in a reference frame at time t−1 for prediction. The motion prediction error is then given by

e(i,j) = ft(i,j) − ft−1(i+u, j+v)   (3)

where (u,v) is the motion vector of the block.
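Eqn (3) in code, as a minimal sketch (the frames and the motion vector are synthetic; the reference is built as the current frame shifted by (1, 2), so that motion vector predicts the block exactly):

```python
def block_prediction_error(cur, ref, bx, by, u, v, B=8):
    """Motion prediction error of eqn (3): e(i,j) = f_t(i,j) - f_{t-1}(i+u, j+v)
    for a B x B block whose top-left corner is (bx, by)."""
    return [[cur[bx + i][by + j] - ref[bx + i + u][by + j + v]
             for j in range(B)]
            for i in range(B)]

# Synthetic frames: ref[x][y] = cur[x-1][y-2], so MV (1, 2) predicts perfectly
W = 16
cur = [[(3 * i + 5 * j) % 17 for j in range(W)] for i in range(W)]
ref = [[0] * W for _ in range(W)]
for i in range(W - 1):
    for j in range(W - 2):
        ref[i + 1][j + 2] = cur[i][j]
```

With the correct motion vector the error block is all zeros; with the zero vector it is not.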


A. Autocorrelation function:

The autocorrelation function with respect to (I,J) of the prediction error with correlation coefficient ρ is given by

Ce(I,J) = E{ [ft(i,j) − ft−1(i+u, j+v)] · [ft(i+I, j+J) − ft−1(i+u+I, j+v+J)] }
        = E[ft(i,j)·ft(i+I, j+J)] − E[ft(i,j)·ft−1(i+u+I, j+v+J)]
        − E[ft−1(i+u, j+v)·ft(i+I, j+J)] + E[ft−1(i+u, j+v)·ft−1(i+u+I, j+v+J)]

where I and J are the pixel separations in the x- and y-dimensions, respectively, and (u,v) is the motion vector.


Let us assume that the statistical properties of the current frame (frame t) and the reference frame (frame t−1) are the same. We can then write the autocorrelation terms as

Cf(I,J) = E[ft(i,j)·ft(i+I, j+J)] = E[ft−1(i+u, j+v)·ft−1(i+u+I, j+v+J)]

and the cross-correlation terms between the frame at time t and the reference frame at time t−1 as

Cf,t(I+u, J+v) = E[ft(i,j)·ft−1(i+u+I, j+v+J)]   (4a)
Cf,t(I−u, J−v) = E[ft−1(i+u, j+v)·ft(i+I, j+J)]   (4b)


We then have

Ce(I,J) = 2·Cf(I,J) − Cf,t(I+u, J+v) − Cf,t(I−u, J−v)   (4c)

The formulation using cross-correlation functions is usually difficult to implement and is not convenient to simplify. Let us assume that the matched block ft−1(i+u, j+v) approximates the current block ft(i,j) with reasonable deformation; that is, some physical model can approximate the deformation. For example, the affine transform can be used, as in global motion estimation. Thus

ft(i+mx, j+ny) ≈ ft−1(i+u, j+v)   (5)

where (mx, ny) is the deformation vector.


The deformation vector (mx, ny) represents the deformation of each pixel in the current block. The values of mx and ny depend on a number of factors, such as motion activity, light variation, inaccuracy of motion compensation, quantization error and noise. Furthermore, in this modelling, the magnitudes of mx and ny are not related directly to u and v; they depend only upon the matching, or correlation, between the matched block ft−1(i+u, j+v) and the current block ft(i,j). Substituting the L.H.S. of eqn. (5) into the R.H.S. of eqn. (4b) converts the cross-correlation into an autocorrelation:

Cf,t(I−u, J−v) = E[ft(i+mx, j+ny)·ft(i+I, j+J)] = Cf(I−mx, J−ny)   (6a)

Similarly, for eqn. (4a),

Cf,t(I+u, J+v) = Cf(I+mx, J+ny)   (6b)


We regard the deformation vector (mx, ny) as a pair of independent random variables. Substituting eqns. (6a) and (6b) into eqn. (4c) and taking the expected value of the autocorrelation function Ce(I,J) with respect to (mx, ny),

E[Ce(I,J)] = 2·Cf(I,J) − E[Cf(I−mx, J−ny)] − E[Cf(I+mx, J+ny)]

Due to the symmetry property of autocorrelation functions,

E[Ce(I,J)] = 2·Cf(I,J) − 2·E[Cf(I−mx, J−ny)]   (7)


In order to find the expected value of the autocorrelation function with respect to the deformation vector, let us assume a separable 2-D AR(1) model,

Cf(I,J) = σf²·ρ^|I|·ρ^|J|   (8)

where σf² is the variance of the pixels in the AR(1) model, and

E[Cf(I−mx, J−ny)] = σf²·E[ρ^(|I−mx| + |J−ny|)]   (9)


For simplicity, a separable autocorrelation model with independent mx and ny is considered. Eqn. (9) can then be expressed as

E[Cf(I−mx, J−ny)] = σf²·E[ρ^|I−mx|]·E[ρ^|J−ny|],  for I ≠ 0, J = 0   (10)

The term E[ρ^|I−mx|] can be computed as

E[ρ^|I−mx|] = ∫ ρ^|I−mx|·p(mx) dmx   (11)

where p(mx) is the probability density function (pdf) of mx.

B. Deformation Model:

Different from previous studies, we assume that pixels in a deformed block tend to deform along a definite direction rather than randomly. In other words, under this assumption the mean of the deformation vectors (mx, ny) is not regarded as zero. Fig. 1a illustrates some possible directions of deformed pixels. Fig. 1b illustrates our assumption on the deformation, using a simplified version of the affine transform as an example. The deformation model can be written as

[x′, y′, 1]^T = M·[x, y, 1]^T,  M = [[a, b, c], [d, e, f], [0, 0, 1]]   (12)

where (x,y) is the coordinate of pixel α in the current block, (x′,y′) is the coordinate of the corresponding pixel α in the deformed block, and M is the 3×3 affine transform matrix, with parameters a, b, c, d, e and f representing rotation, translation and zooming in this example.
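Eqns (12) and (13) can be sketched directly: apply a small affine map to the pixel grid, take the per-pixel deformation vectors, and average them. The rotation angle and translation values below are hypothetical, chosen only to illustrate a directional deformation:

```python
import math

def affine_deform(points, a, b, c, d, e, f):
    """Apply the affine map of eqn (12): (x', y') = (a x + b y + c, d x + e y + f),
    and return the per-pixel deformation vectors (m_x, n_y) = (x' - x, y' - y)."""
    return [(a * x + b * y + c - x, d * x + e * y + f - y) for (x, y) in points]

# Pixels of an 8x8 block under a slight rotation plus translation (hypothetical)
theta = math.radians(2.0)
pts = [(x, y) for x in range(8) for y in range(8)]
mv = affine_deform(pts, math.cos(theta), -math.sin(theta), 0.5,
                   math.sin(theta), math.cos(theta), 0.25)

# Mean deformation vector of eqn (13)
N = len(mv)
mu_x = sum(m for m, _ in mv) / N
mu_y = sum(n for _, n in mv) / N
```

For a pure translation every pixel gets the same deformation vector, so the mean vector recovers the translation exactly.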


Figure 1b. Difference between the autocorrelation function Cf of a frame at time t and the cross-correlation function Cf,t between the frame at time t and the reference frame at time t−1, in the one-dimensional case (plotted over the domain of a motion-compensated block).


Figure 1a: Possible directions of deformed pixels. The matched block of the current block, and the deformed block of the matched block at frame t, with per-pixel deformation vectors (mx, ny)_0, …, (mx, ny)_63.


The deformation vector of pixel α can be defined as (mx, ny)_α = (x′, y′) − (x, y), and the mean deformation vector of a block is (μx, μy), where

(μx, μy) = ( (1/N)·Σ_α (mx)_α , (1/N)·Σ_α (ny)_α )   (13)

where N is the number of pixels in the block, equal to 64 in this example.

Figure 1c: Illustration of the assumption that pixels in a deformed block tend to deform along a definite direction.


Let σmvx and σmvy be the standard deviations of the x- and y-components of the deformation vectors (mx, ny)_α in the block. Fig. 1 demonstrates that part of a current block is not predicted accurately enough, because not all pixels in the block are translated in the same direction and their moving distances are not identical. However, they still exhibit a motion tendency. Thus, a finite mean vector (μx, μy) can be defined for the deformation vectors (mx, ny) in a block. With this consideration, eqn. (11) becomes

E[ρ^|I−mx|] = ∫ ρ^|I−mx|·p(mx, μx) dmx   (14)

where p(mx, μx) is the probability density function (pdf) of mx with mean deformation μx, and μx is the x-component of the mean deformation vector in each block.


The assumed block deformation makes the cross-correlation term, E[Cf(I−mx, J−ny)], and hence the error autocorrelation function, depend on the image correlation coefficient ρ and on the direction of the mean deformation vector. However, an error autocorrelation function must be an even function with respect to I. Hence, to remedy this directional dependence, we consider only the absolute value of μx.


Let us further assume that μx is randomly distributed. We also exploit the separable property and use the Gaussian distribution for the pdf. E[ρ^|I−mx|] can then be written as

E[ρ^|I−mx|] = ∫∫ (1/(2π·σmv·σμ)) · e^(−(mx−μx)²/(2σmv²)) · e^(−μx²/(2σμ²)) · ρ^|I−mx| dmx dμx   (15)

where σμ is the standard deviation of the mean deformation vector and σmv is the standard deviation of the deformation vectors in a single block.

C. Normalized Formulation and Realization:

In this section we produce an equation suitable for realization on a computer. Let us assume that the block deformation in the x- and y-dimensions is modeled with the same standard deviations σμ and σmv. Substituting eqns. (8) and (10) into eqn. (7), we have the error autocorrelation function along the x-direction:

E[Ce(I,J)] = 2σf²·( ρ^|I|·ρ^|J| − E[ρ^|I−mx|]·E[ρ^|J−ny|] ),  for I ≠ 0, J = 0   (16)


For comparison, let us normalize eqn. (16) by the error variance. This gives the variance-normalized autocorrelation function

E[C̄e(I,J)] = ( ρ^|I| − E[ρ^|I−mx|]·E[ρ^|ny|] ) / ( 1 − E[ρ^|mx|]·E[ρ^|ny|] ),  for I ≠ 0, J = 0   (17)

Eqn. (17) can be realized by numerical calculation. However, this form is not convenient for analytical purposes. For simplification, we can use the expected value of the mean deformation vector in eqn. (14), instead of the pdf p(μx), to approximate the autocorrelation model. The expected absolute value of the mean deformation vector becomes

μ̄x = E[|μx|] = ∫ |μx| · (1/(√(2π)·σμ)) · e^(−μx²/(2σμ²)) dμx = √(2/π)·σμ

After some further arrangement, E[ρ^|I−mx|] can be simplified in terms of two error functions erf(z):

E[ρ^|I−mx|] ≈ R̃(I, μ̄x, σmv, ρ)   (18)

where

R̃(I, μ̄x, σmv, ρ) = (1/2)·e^((σmv·ln ρ)²/2) · { ρ^(I−μ̄x)·[1 + erf( (I−μ̄x)/(√2·σmv) + (σmv·ln ρ)/√2 )] + ρ^(−(I−μ̄x))·[1 − erf( (I−μ̄x)/(√2·σmv) − (σmv·ln ρ)/√2 )] }
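The reconstructed closed form of eqn (18) can be validated against direct numerical integration of E[ρ^|I−mx|] with mx ~ N(μ̄x, σmv²). Both routines below are a sketch; the erf expression follows the standard Gaussian identity for E[e^(−λ|Y|)] with Y ~ N(d, σ²):

```python
import math

def R_tilde(I, mu_bar, sigma_mv, rho):
    """Closed form for E[rho^|I - m_x|], m_x ~ N(mu_bar, sigma_mv^2),
    expressed with two error functions as in eqn (18)."""
    d = I - mu_bar
    lam = sigma_mv * math.log(rho)      # sigma_mv * ln(rho), negative for rho < 1
    pre = 0.5 * math.exp(0.5 * lam * lam)
    t = d / (math.sqrt(2.0) * sigma_mv)
    s = lam / math.sqrt(2.0)
    return pre * (rho ** d * (1.0 + math.erf(t + s))
                  + rho ** (-d) * (1.0 - math.erf(t - s)))

def numeric(I, mu_bar, sigma_mv, rho, half=10.0, steps=20000):
    """Brute-force midpoint-rule check of the same expectation."""
    h = 2.0 * half / steps
    total = 0.0
    for q in range(steps):
        m = mu_bar - half + (q + 0.5) * h
        w = math.exp(-(m - mu_bar) ** 2 / (2.0 * sigma_mv ** 2))
        total += w * rho ** abs(I - m) * h
    return total / (math.sqrt(2.0 * math.pi) * sigma_mv)
```

With the Salesman-style parameters the two routines agree to high precision, which supports the reconstructed erf form.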


Substituting eqn. (18) into eqn. (17), we have the 1-D approximation

E[C̄e(I, μ̄x, σmv, ρ)] ≈ C̃e(I, μ̄x, σmv, ρ) = ( ρ^|I| − R̃(I, μ̄x, σmv, ρ)·R̃(0, μ̄y, σmv, ρ) ) / ( 1 − R̃(0, μ̄x, σmv, ρ)·R̃(0, μ̄y, σmv, ρ) )   (19)


Finally, we express a separable 2-D variance-normalized autocorrelation function as

E[C̄e(I,J)] ≈ C̃e(I, μ̄x, σmvx, ρx) · C̃e(J, μ̄y, σmvy, ρy)   (20)

i.e. the product of the two 1-D variance-normalized functions of eqn. (19) along the x- and y-directions. This is a practical equation for realization. It is interesting to note that the compound covariance model of eqn. (1), proposed in ref. [16], is only a special case of this formulation: it can be obtained by setting the mean deformation vector μ̄ to zero and the standard deviation σmv (the same for both directions) to 0.5, and making use of some properties of the erf(z) function.
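Putting eqns (18)-(20) together gives a directly computable model. The sketch below evaluates the separable 2-D variance-normalized autocorrelation with the Salesman parameters measured in Section 4; sharing σmv and ρ between the two R̃(0,·) factors inside each 1-D term is an assumption of this sketch:

```python
import math

def R_tilde(I, mu_bar, sigma_mv, rho):
    """Closed form for E[rho^|I - m_x|], m_x ~ N(mu_bar, sigma_mv^2) (eqn (18))."""
    d = I - mu_bar
    lam = sigma_mv * math.log(rho)
    pre = 0.5 * math.exp(0.5 * lam * lam)
    t = d / (math.sqrt(2.0) * sigma_mv)
    s = lam / math.sqrt(2.0)
    return pre * (rho ** d * (1.0 + math.erf(t + s))
                  + rho ** (-d) * (1.0 - math.erf(t - s)))

def c_tilde(I, mu_x, mu_y, sigma_mv, rho):
    """1-D variance-normalized autocorrelation of eqn (19)."""
    r0x = R_tilde(0.0, mu_x, sigma_mv, rho)
    r0y = R_tilde(0.0, mu_y, sigma_mv, rho)
    num = rho ** abs(I) - R_tilde(abs(I), mu_x, sigma_mv, rho) * r0y
    return num / (1.0 - r0x * r0y)

def c2d(I, J):
    """Separable 2-D model of eqn (20), Salesman parameters from Section 4."""
    return (c_tilde(I, 0.469, 0.789, 1.327, 0.970)
            * c_tilde(J, 0.789, 0.469, 1.140, 0.960))
```

By construction the normalized surface equals 1 at the origin and stays inside the unit range elsewhere.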


4. Experimental Results

The objective of this section is to verify the covariance models above by comparing their fit to empirical results. A large set of experiments has been done. In some cases, we made use of the experimental results from [16, 17, 19] to compare the fit of our model with that of the CP model and the NB model; in other cases, we implemented additional experiments, in which the required parameters were obtained experimentally. In the latter case, we usually performed motion estimation between frame 2 and frame 1 (or frame 1 and frame 0), with a search range of 16 pixels and integer-pixel accuracy, to obtain the motion vector of each block.
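The integer-pel full search described above can be sketched as follows (frames, sizes and the ±4 range in the demo are synthetic stand-ins for the ±16 search used in the experiments):

```python
import random

def sad(cur, ref, bx, by, u, v, B=16):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by the candidate vector (u, v)."""
    return sum(abs(cur[bx + i][by + j] - ref[bx + i + u][by + j + v])
               for i in range(B) for j in range(B))

def full_search(cur, ref, bx, by, B=16, R=16):
    """Exhaustive integer-pel search over a +/-R window; returns the motion
    vector with the smallest SAD and its cost."""
    H, W = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), float("inf")
    for u in range(-R, R + 1):
        for v in range(-R, R + 1):
            if 0 <= bx + u and bx + u + B <= H and 0 <= by + v and by + v + B <= W:
                cost = sad(cur, ref, bx, by, u, v, B)
                if cost < best_cost:
                    best_mv, best_cost = (u, v), cost
    return best_mv, best_cost

# Synthetic frames: ref holds the current content displaced by (2, 3)
random.seed(1)
H = W = 48
cur = [[random.randrange(256) for _ in range(W)] for _ in range(H)]
ref = [[0] * W for _ in range(H)]
for i in range(H):
    for j in range(W):
        if i >= 2 and j >= 3:
            ref[i][j] = cur[i - 2][j - 3]

mv, cost = full_search(cur, ref, 16, 16, B=16, R=4)
```

The search recovers the true displacement with zero residual SAD, since the reference content matches exactly at that offset.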

The deformation vectors of each pixel were measured by dividing a block into 2×2-pixel sub-blocks and carrying out motion estimation with a search range of 2 pixels and half-pixel accuracy. Hence a mean deformation vector (μx, μy) and the standard deviations of the deformation vectors (σmvx, σmvy) in a block can be estimated. Furthermore, the averaged pixel correlation coefficients of the target MBs were measured along the vertical and horizontal directions within a 33×33-pixel window centered on the target MBs.

Figure 2. Position of the target MB in the Salesman sequence.


A. Basic Experimental Results

For example, we measured the autocorrelation functions of the MCFD of the "Salesman" sequence. The typical position of a sample 16×16 macroblock (MB) of "Salesman" is indicated in Fig. 2. Fig. 3 shows a 3-D plot of the autocorrelation function of the MCFD signal of the MB. From our measurements, the parameters for "Salesman" are:

ρx = 0.970, ρy = 0.960 (correlation coefficients),
σmvx = 1.327, σmvy = 1.140 (standard deviations of (mx, ny)), and
μx = 0.469, μy = 0.789 (mean deformation vector).

Figure 3. Autocorrelation function of the MCFD of the Salesman sequence. The original CP model, with ρx = 0.970 and ρy = 0.960, was used.


Our improved model shows a very different autocorrelation function between the I- and J-directions. It provides an extremely good prediction in the I-direction for |I| ≤ 4, and the best prediction in the J-direction when compared with the other two models on real signals.

In the I-direction at |I| ≤ 4, the autocorrelation function of the Salesman MCFD signal decreases slowly to about 0.29. The improved model decreases gradually to about 0.36, whereas the CP model and the NB model decrease too rapidly and tend to 0.52 and 0.13 respectively. In the J-direction, only our improved model correctly predicts that the signal tends to zero.

Figure 4: Modeled autocorrelation functions of a block of Salesman: (a) improved model and (b) original CP model, compared against the measured data.


Figure 5(a): 1-D normalized autocorrelation of the Salesman MCFD in the I-direction (curves: real data, NB model, CP model, improved model).


Figure 5(b): 1-D normalized autocorrelation of the Salesman MCFD in the J-direction (curves: real data, NB model, CP model, improved model).



B. Further Results including the H.264 Codec

We have used the affine transform to describe the process; however, our model is general and the theory behind it is fundamental. We have set forth a new analytical tool to model the deformation of pixels for block-based motion-compensated video coding. This tool allows good insight into the underlying principles of video codecs, which in turn facilitates the design of new coding algorithms.


For example, H.264 supports video coding with variable block sizes. Besides the typical 16×16 block size, a block can be partitioned into 16×8, 8×16, and 8×8 sub-blocks. If the sub-block size is 8×8, H.264 allows further partitioning into 8×4, 4×8, or 4×4 sub-blocks. Hence an H.264 encoder involves mode decisions among the different sub-block sizes, motion estimation for the different sub-blocks, and bitrate control over the possible modes. These require an in-depth study of the trade-off between encoding workload, the resulting bitrate, and the quality of the encoded video. Our proposed model can be used as an analytical tool to study the effect of different sub-block sizes in an encoder, because the use of various sub-blocks reflects some kind of deformation within the target macroblock.
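As an illustration of such a mode decision, the sketch below chooses between one 16×16 partition and four 8×8 partitions by a Lagrangian-style cost. The λ value, the per-vector bit count and the SAD numbers are all hypothetical, not taken from any real encoder:

```python
def mode_decision(sad16, sad8_list, lam=10, bits_mv=8):
    """Hypothetical H.264-style choice between one 16x16 partition (one motion
    vector) and four 8x8 partitions (four motion vectors): pick the option
    with the smaller distortion + lam * rate. A real encoder performs full
    rate-distortion optimization; this only sketches the trade-off."""
    cost16 = sad16 + lam * bits_mv              # one MV to code
    cost8 = sum(sad8_list) + lam * bits_mv * 4  # four MVs to code
    return ("16x16", cost16) if cost16 <= cost8 else ("8x8", cost8)

# A deforming macroblock: the 8x8 sub-blocks individually match much better
mode, cost = mode_decision(sad16=900, sad8_list=[60, 80, 70, 90])

# A rigidly translating macroblock: one motion vector suffices
mode2, cost2 = mode_decision(sad16=120, sad8_list=[25, 30, 28, 27])
```

Strong within-block deformation drives the decision toward smaller partitions, which is exactly the effect the proposed model quantifies through (mx, ny).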


Fig. 7 shows some further results of testing the frame difference errors with variable block (VB) sizes in H.264, together with real data for the 16×16 block size. The real data were extracted from the sequence Stefan (352×240), frame 2 referencing frame 1, macroblock at the top-left corner, coordinates x = 112 and y = 16. For the CP model, the parameters were A = 0.5 and ρ = 0.95. For the NB model, the settings were c = 0.17, ρ0 = 0.95 and ρ1 = 0.38, while for our improved model we set ρx = 0.8, ρy = 0.9, σmvx = 0.6, σmvy = 2.9, μx = 0.5 and μy = 0.9. It is seen that our model is able to follow closely the variable-block-size approach of the H.264 codec.

Figure 7. Analysis with variable block sizes in H.264 (curves: CP model, NB model, improved model, real data VB, real data 16×16).


5. Conclusion and Further Development

In the literature there has not been much work on the modelling and theory of video coding. The present investigation enables us to enhance our understanding of the mechanism of the hybrid video coding system and eventually facilitates the design of an efficient video encoder. We have shown that the first-order Markov model can be used to derive an approximate separable autocorrelation model for the block-based motion compensation difference signal. In the derivation, we have assumed that the net deformation of pixels is in general directional, rather than having a uniform error distribution in a block. Results of the experimental realization show that our model describes the characteristics of MCFD signals more accurately.


Much further work can be done. There are many possible directions, some of which belong to our recent investigations.
(1) We studied the variable block sizes in H.264 as a global effect for the MCFD. Detailed modelling of the block-size effect is desirable to achieve new algorithms and better analysis: should we use a single ρ, or multiple ρ's, to represent variable block sizes?
(2) It is also worthwhile to study the theoretical effect of the integer cosine transform, say after 20 inter-frames.
(3) The deformation vector (mx, ny) mentioned in this paper is only an overall effect of the affine or perspective model for motion description. Besides (mx, ny), should one or more parameters be included in the model?
(4) Can the techniques be useful in MCTF (motion-compensated temporal filtering) using the wavelet transform? What would happen if motion compensation were performed after transformation? This is not restricted to wavelets, but could be useful whenever a transform is used as the first step of image coding.
(5) We are also trying to use this modelling technique to describe some of our image/video interpolation techniques, and the results appear attractive. This is a fruitful direction for further research.


References:

- C.F. Chen and K.K. Pang, "The Optimal Transform of Motion-Compensated Frame Difference Images in a Hybrid Coder", IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing, Vol. 40, No. 6, pp. 393-397, June 1993.
- W. Niehsen and M. Brünig, "Covariance Analysis of Motion-Compensated Frame Differences", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 9, No. 4, pp. 536-539, June 1999.
- C.F. Chen and K.K. Pang, "Hybrid Coders with Motion Compensation", Multidimensional Systems and Signal Processing, Vol. 3, No. 3, pp. 241-266, 1992.
- Hoi-Kok Cheung, Wan-Chi Siu, Dagan Feng and Tom Cai, "New Block-based Motion Estimation for Sequences with Brightness Variation and its Application to Static Sprite Generation for Video Compression", accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology.
- Ko-Cheung Hui and Wan-Chi Siu, "Extended Analysis of Motion-Compensated Frame Difference for Block-Based Motion Prediction Error", IEEE Transactions on Image Processing, Vol. 16, No. 5, pp. 1232-1245, May 2007.
- H.K. Cheung and W.C. Siu, "Robust Global Motion Estimation and Novel Updating Strategy for Sprite Generation", IET Image Processing, Vol. 1, Issue 1, pp. 13-20, March 2007.
- Ko-Cheung Hui, Wan-Chi Siu and Yui-Lam Chan, "New Adaptive Partial Distortion Search using Clustered Pixel Matching Error Characteristic", IEEE Transactions on Image Processing, Vol. 14, No. 5, pp. 597-607, May 2005.
- Yui-Lam Chan and Wan-Chi Siu, "An Efficient Search Strategy for Block Motion Estimation using Image Features", IEEE Transactions on Image Processing, Vol. 10, No. 8, pp. 1223-1238, August 2001.


The End
