In the name of Allah
the compassionate, the merciful
Digital Video Systems
- S. Kasaei
Room: CE 307
Department of Computer Engineering
Sharif University of Technology
E-Mail: skasaei@sharif.edu
Webpage: http://sharif.edu/~skasaei
- Lab. Website: http://mehr.sharif.edu/~ipl
Acknowledgment
Most of the slides used in this course have been provided by:
- Prof. Yao Wang (Polytechnic University, Brooklyn), based on the book Video Processing & Communications, by Yao Wang, Jörn Ostermann, & Ya-Qin Zhang, Prentice Hall, 1st edition, 2001, ISBN: 0130175471. [SUT Code: TK 5105 .2 .W36 2001].
Chapters 9 & 11
Video Coding using Motion Compensation
Kasaei 6
Outline
- Block-Based Hybrid Video Coding
- Overview of Block-Based Hybrid Video Coding
- Overlapped Block Motion Compensation
- Coding Mode Selection & Rate Control
- Loop Filtering
- Scalable Video Coding
- Motivation for Scalable Coding
- Basic Modes of Scalability
- Scalability in MPEG-2
- Fine Granularity Scalability in MPEG-4
Characteristics of Typical Videos
[Figure: two adjacent frames, Frame t-1 & Frame t.]
- Adjacent frames are similar & changes are due to object or camera motion.
Key Ideas in Video Compression
- Predict a new frame from a previous frame & code only the prediction error.
- The prediction error can be coded using the DCT method.
- Prediction errors have smaller energy than the original pixel values & can be coded with fewer bits.
- Regions that cannot be predicted well are coded directly using DCT.
- Work on each macroblock (MB) (16x16 pixels) independently to reduce the complexity.
- Motion compensation is done at the MB level.
- DCT coding of the error is done at the block level (8x8 pixels).
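The claim that prediction errors carry less energy than the original pixel values can be checked on a toy example. The sketch below is illustrative only: the two 4x4 blocks are made-up luminance values (not taken from the course material), and the prediction is simply the co-located block of the previous frame.

```python
def subtract(cur, ref):
    """Pixel-wise prediction error: current block minus its prediction."""
    return [[c - r for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(cur, ref)]


def energy(block):
    """Sum of squared sample values."""
    return sum(v * v for row in block for v in row)


# Hypothetical 4x4 luminance blocks: frame t equals frame t-1 plus small changes.
prev_block = [[100, 102, 104, 106],
              [101, 103, 105, 107],
              [102, 104, 106, 108],
              [103, 105, 107, 109]]
curr_block = [[101, 102, 105, 106],
              [101, 104, 105, 108],
              [102, 104, 107, 108],
              [104, 105, 107, 110]]

residual = subtract(curr_block, prev_block)
original_energy = energy(curr_block)
residual_energy = energy(residual)
print("original energy:", original_energy)
print("residual energy:", residual_energy)
```

Because the residual samples are all 0 or 1 here, their energy is tiny compared with that of the raw pixel values, which is why coding the error needs fewer bits.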
Different Coding Modes
Temporal Prediction
- No Motion Compensation:
$\hat{f}(t, m, n) = f(t-1, m, n)$
- Works well in stationary regions.
- Uni-directional Motion Compensation:
$\hat{f}(t, m, n) = f(t-1, m - d_x, n - d_y)$
- Does not work well for regions uncovered by object motion.
- Bi-directional Motion Compensation:
$\hat{f}(t, m, n) = w_b f(t-1, m - d_{b,x}, n - d_{b,y}) + w_f f(t+1, m - d_{f,x}, n - d_{f,y})$
- Can better handle covered/uncovered regions.
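The three temporal predictors can be sketched as small functions. This is a minimal illustration, assuming frames are stored as nested lists indexed as frame[m][n] and that the MV components (dx, dy) displace m and n respectively; the function names, the toy frames, and the equal weights of 0.5 are assumptions made for the example.

```python
def predict_no_mc(prev, m, n):
    """No motion compensation: copy the co-located pixel of frame t-1."""
    return prev[m][n]


def predict_uni(prev, m, n, dx, dy):
    """Uni-directional MC: fetch the pixel of frame t-1 displaced by the MV."""
    return prev[m - dx][n - dy]


def predict_bi(prev, nxt, m, n, mv_b, mv_f, w_b=0.5, w_f=0.5):
    """Bi-directional MC: weighted sum of a pixel in frame t-1 and one in frame t+1."""
    return (w_b * prev[m - mv_b[0]][n - mv_b[1]]
            + w_f * nxt[m - mv_f[0]][n - mv_f[1]])


# Toy 3x3 frames (hypothetical values). Note that Python wraps negative list
# indices, so a real coder would clip or pad at the frame boundaries.
prev_frame = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
next_frame = [[12, 22, 32], [42, 52, 62], [72, 82, 92]]

p0 = predict_no_mc(prev_frame, 1, 1)
p1 = predict_uni(prev_frame, 1, 1, dx=1, dy=0)
p2 = predict_bi(prev_frame, next_frame, 1, 1, mv_b=(1, 0), mv_f=(-1, 0))
print(p0, p1, p2)
```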
Temporal Prediction
- Although bi-directional prediction can improve prediction accuracy & consequently coding efficiency, it incurs encoding delay & is typically not used in real-time applications.
- H.261 uses only uni-directional prediction; H.263 adds a restricted bi-directional prediction (PB-mode).
- MPEG employs both uni- & bi-directional prediction.
Encoder Block Diagram of a Block-Based Hybrid Video Coder
[Hybrid: a combination of motion-compensated temporal prediction + transform coding.]
Decoder Block Diagram
Block-Based Hybrid Video Coding
- The encoder must emulate the decoder operation to deduce the same reconstructed frame as the decoder.
- Frame types:
- I-frame: a frame coded entirely in the intra-mode.
- P-frame: a frame coded in the inter-mode, using a previous frame for prediction.
- B-frame: a frame coded in the inter-mode, using a weighted sum of a previous & a following frame for prediction.
- The mode information, MVs, & other side information (picture format, block location, …) are coded using variable-length coding (VLC).
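Why the encoder must emulate the decoder can be seen in a toy closed-loop predictor. In the sketch below each "frame" is reduced to a single number, the prediction is the previous reconstructed value, and the error is quantized with a uniform step; all names and values are hypothetical. Because the encoder predicts from its own reconstruction rather than from the original, the decoder arrives at exactly the same reconstructed values and the quantization error does not accumulate.

```python
def encode_closed_loop(frames, step):
    """Predict each value from the previous RECONSTRUCTED value (as the decoder will)."""
    recon_prev = 0
    symbols, recons = [], []
    for value in frames:
        q = round((value - recon_prev) / step)   # quantized prediction error
        recon_prev = recon_prev + q * step       # emulate the decoder
        symbols.append(q)
        recons.append(recon_prev)
    return symbols, recons


def decode(symbols, step):
    """Decoder: add each dequantized error to the previous reconstruction."""
    recon_prev = 0
    out = []
    for q in symbols:
        recon_prev = recon_prev + q * step
        out.append(recon_prev)
    return out


frames = [11, 13, 19, 17, 25]   # toy scalar "frames"
step = 4
symbols, encoder_recons = encode_closed_loop(frames, step)
decoder_recons = decode(symbols, step)
max_error = max(abs(f - r) for f, r in zip(frames, decoder_recons))
print(symbols, decoder_recons, max_error)
```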
Block-Based Video Partitions
MB Structure in 4:2:0 Color Format
- Four 8x8 Y blocks.
- One 8x8 Cb block.
- One 8x8 Cr block.
Block Matching Algorithm for Motion Estimation
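[Figure: a block in Frame t (predicted frame) is matched against candidate blocks inside a search region of Frame t-1 (reference frame); the displacement of the best match is the MV.]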
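An exhaustive-search block matching algorithm can be sketched as follows. This is a minimal illustration rather than the course's reference implementation: it assumes integer-pel MVs, the mean absolute difference (MAD) as the error measure, frames stored as nested lists indexed [m][n], and the sign convention that the matching block sits at (m0 - dx, n0 - dy) in the reference frame. The function and variable names are hypothetical.

```python
def mad(cur, ref, m0, n0, dx, dy, size):
    """Mean absolute difference between the current block and a displaced reference block."""
    total = 0
    for i in range(size):
        for j in range(size):
            total += abs(cur[m0 + i][n0 + j] - ref[m0 - dx + i][n0 - dy + j])
    return total / (size * size)


def block_match(cur, ref, m0, n0, size, search):
    """Try every integer MV within +/- search and keep the one with the smallest MAD."""
    rows, cols = len(ref), len(ref[0])
    best_mv, best_err = None, float("inf")
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            m_ref, n_ref = m0 - dx, n0 - dy
            # Skip candidates that fall partly outside the reference frame.
            if m_ref < 0 or n_ref < 0 or m_ref + size > rows or n_ref + size > cols:
                continue
            err = mad(cur, ref, m0, n0, dx, dy, size)
            if err < best_err:
                best_mv, best_err = (dx, dy), err
    return best_mv, best_err


# Toy 8x8 reference frame and a current frame that is a shifted copy of it.
ref = [[10 * m + n for n in range(8)] for m in range(8)]
cur = [[ref[m + 1][n - 2] if m + 1 < 8 and n - 2 >= 0 else 0 for n in range(8)]
       for m in range(8)]

mv, err = block_match(cur, ref, m0=2, n0=3, size=3, search=3)
print(mv, err)
```

The cost grows with the square of the search range, which is one reason practical coders rely on faster search strategies.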
Macroblock Coding in I-Mode
- DCT transform each 8x8 block.
- Quantize the DCT coefficients (with properly chosen quantization matrices).
- Zig-zag order & run-length code the quantized DCT coefficients.
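The I-mode steps can be sketched end to end for one 8x8 block. Several simplifications are assumed here: a single uniform quantization step stands in for a quantization matrix, the run-length symbols are plain (run, level) pairs followed by an end-of-block marker, and no entropy coding is applied. The function names are hypothetical.

```python
import math

N = 8


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (naive O(N^4) version)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def zigzag_order(n=N):
    """(row, col) positions in zig-zag order: anti-diagonals, alternating direction."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def run_length(coeffs):
    """(run of zeros, nonzero level) pairs, terminated by an end-of-block marker."""
    pairs, run = [], 0
    for level in coeffs:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")
    return pairs


def encode_intra_block(block, step):
    """DCT, uniform quantization, zig-zag scan, run-length coding."""
    coeffs = dct2(block)
    quantized = [[round(v / step) for v in row] for row in coeffs]
    scanned = [quantized[r][c] for r, c in zigzag_order()]
    return run_length(scanned)


flat_block = [[128] * N for _ in range(N)]      # a flat block has only a DC term
symbols = encode_intra_block(flat_block, step=16)
print(symbols)
```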
Macroblock Coding in P-Mode
- Estimate one MV for each macroblock (16x16).
- Depending on the motion compensation error, determine the coding mode (intra, inter-with-no-MC, inter-with-MC, etc.).
- The original values (for intra-mode) or motion compensation errors (for inter-mode) in each of the DCT blocks (8x8) are DCT transformed, quantized, zig-zag/alternate scanned, & run-length coded.
Macroblock Coding in B-Mode
- Same as for the P-mode, except a macroblock can be predicted from a previous frame, a following one, or both.
[Figure: B-mode prediction with the two MVs, v_b & v_f.]
Overlapped Block Motion Compensation (OBMC)
- Conventional block motion compensation:
- One best matching block is found from a reference frame.
- The current block is replaced by the best matching block.
- OBMC:
- Each pixel in the current block is predicted by a weighted average of several corresponding pixels in the reference frame.
- The corresponding pixels are determined by the MVs of the current as well as adjacent MBs.
- The weight for each corresponding pixel depends on the expected accuracy of the associated MV.
OBMC using 4-Neighboring MBs
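- $h_{m,k}(\mathbf{x})$: weight assigned to the estimated value based on the MV $\mathbf{d}_{m,k}$.
- The weight should be inversely proportional to the distance between $\mathbf{x}$ & the center of the MB associated with $\mathbf{d}_{m,k}$ (in the figure, $[h_{m,1}$ & $h_{m,4}] > [h_{m,2}$ & $h_{m,3}]$).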
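The OBMC prediction of a single pixel can be sketched as a weighted average of the pixels that the current MV and the neighboring MVs point to. The weights below are made up for illustration (they are not the H.263 coefficients) and only need to sum to one; the reference frame is indexed as ref[m][n] and the MV components displace m and n with the same sign convention as the uni-directional predictor.

```python
def obmc_predict_pixel(ref, m, n, mvs, weights):
    """Weighted average of the reference pixels selected by several candidate MVs."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OBMC weights must sum to 1")
    return sum(w * ref[m - dx][n - dy] for w, (dx, dy) in zip(weights, mvs))


# Toy 6x6 reference frame (hypothetical values).
ref = [[10 * m + n for n in range(6)] for m in range(6)]

# MV of the current MB first, then the MVs of two neighboring MBs.
mvs = [(0, 0), (1, 0), (0, 1)]
weights = [0.5, 0.25, 0.25]     # the current MB's MV is trusted most

prediction = obmc_predict_pixel(ref, 3, 3, mvs, weights)
print(prediction)
```

In a full coder the weights vary with the pixel position inside the block, so pixels near a block edge lean more on the MV of the adjacent MB.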
Optimal Weighting Design
- Convert to an optimization problem:
- Minimize:
- Subject to:
- Optimal weighting functions: determined by the autocorrelation & cross-correlation terms of the optimization problem.
How to Determine MVs with OBMC
- Option 1: using conventional BMA, minimize the prediction error (MAD) within each MB independently.
- Option 2: minimize the prediction error assuming OBMC: iteratively solve for the MV of the current MB while keeping fixed the MVs of the neighboring MBs found in the previous iterations.
How to Determine MVs with OBMC
- Option 3: use a weighted error criterion over a larger block, with the weights given by a window function.
Weighting Coefficients used in H.263
Window Function Corresponding to H.263 Weights for OBMC
Coding Parameter Selection
- Coding modes:
- Intra vs. inter, quantization parameter (QP) for each MB, ME method (each choice leads to a different rate).
- Rate-distortion optimized selection, given a target rate:
- Minimize the distortion, subject to the target rate constraint.
- Simplified version: the optimal mode is such that each MB works at the same R-D slope.
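A common way to turn the constrained problem into a per-MB decision is the Lagrangian cost J = D + λR, where λ plays the role of the common R-D slope. The sketch below assumes that formulation; the candidate modes and their (distortion, rate) figures are invented for illustration, and select_mode is a hypothetical helper name.

```python
def select_mode(candidates, lam):
    """Pick the candidate with the smallest Lagrangian cost J = D + lam * R."""
    return min(candidates, key=lambda c: c["distortion"] + lam * c["rate"])


# Hypothetical (distortion, rate in bits) figures for one macroblock.
candidates = [
    {"mode": "intra", "distortion": 20.0, "rate": 300},
    {"mode": "inter", "distortion": 35.0, "rate": 120},
    {"mode": "skip",  "distortion": 90.0, "rate": 1},
]

low_lambda = select_mode(candidates, lam=0.01)["mode"]    # rate is cheap
mid_lambda = select_mode(candidates, lam=0.1)["mode"]
high_lambda = select_mode(candidates, lam=1.0)["mode"]    # rate is expensive
print(low_lambda, mid_lambda, high_lambda)
```

Raising λ pushes every MB toward cheaper modes, so sweeping λ is one way to meet a target rate while all MBs operate at the same slope.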
Rate Control
- Rate control:
- How to code a video so that the resulting bit stream satisfies a target bit rate?
- For pleasant visual perception, video should have a constant quality.
- But coding at constant quality necessarily yields a variable bit rate.
- So, at least the bit rate should be constant when averaged over a short period.
- Rate control is also necessary when the video is to be sent over a constant bit rate (CBR) channel.
- The fluctuation within the period can be smoothed by a buffer at the encoder output.
Rate Control
Rate control is accomplished in three steps:
- Step 1) Determine the target average bit rate at the frame, GOB, & MB level, based on the current buffer fullness.
- Step 2) Satisfy the frame level target rate by varying the frame rate (skip frames when necessary).
- Step 3) Satisfy the GOB/MB level target rate by varying the coding mode & QP at each MB (= rate-distortion optimized mode selection).
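The three steps can be illustrated with a toy buffer simulation at the frame level. Everything specific in this sketch is an assumption made for the example: the rate model bits = complexity / QP, the proportional rule for the frame target, the 80% skip threshold, and the QP range 1 to 31. It is not the rate control algorithm of any particular standard.

```python
import math


def simulate_rate_control(complexities, channel_bits, buffer_size,
                          qp_min=1, qp_max=31):
    """Toy frame-level rate control driven by encoder-buffer fullness."""
    fullness = buffer_size / 2.0
    log = []
    for complexity in complexities:
        # Step 1: target bits for this frame, lowered when the buffer is fuller than half.
        target = max(1.0, channel_bits + 0.5 * (buffer_size / 2.0 - fullness))
        if fullness > 0.8 * buffer_size:
            # Step 2: skip the frame when the buffer is close to overflowing.
            qp, bits = None, 0.0
        else:
            # Step 3: smallest QP whose predicted rate meets the target
            # (assumed model: bits = complexity / QP).
            qp = min(qp_max, max(qp_min, math.ceil(complexity / target)))
            bits = complexity / qp
        # The channel drains a constant number of bits per frame interval.
        fullness = max(0.0, fullness + bits - channel_bits)
        log.append((qp, bits, fullness))
    return log


# Hypothetical per-frame complexities: two easy frames, two hard ones, one easy.
log = simulate_rate_control([40000, 40000, 160000, 160000, 40000],
                            channel_bits=4000, buffer_size=16000)
qps = [entry[0] for entry in log]
peak_fullness = max(entry[2] for entry in log)
print(qps, peak_fullness)
```

During the two hard frames the QP saturates at its maximum and the buffer fills up; the following easy frame is then coded more coarsely than the first ones so the buffer can drain.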
Loop Filtering
- Errors in previously reconstructed frames accumulate over time with motion compensated temporal prediction.
- Error propagation leads to:
- Reduction of prediction accuracy.
- Increase of the bit rate needed for coding new frames.
Loop Filtering
- Loop filtering:
- Filters the reference frame before using it for prediction.
- Can be embedded in the motion compensation loop (e.g., half-pel motion compensation, OBMC).
- Loop filtering can significantly improve coding efficiency.
- For the theoretically optimal design of loop filters, see the text.
Scalable Coding (Ch. 11)
- Scalability refers to the capability of recovering physically meaningful image (or video) information by decoding only part of the compressed bit stream.
- It is used when users access the same video through different communication links (bandwidth scalability).
- A scalable stream can also offer adaptivity to varying channel error characteristics & computing power at the receiving terminal.
- Types of scalability include: quality, spatial, temporal, & frequency scalability.
Scalable Coding
- Motivation:
- Real networks are heterogeneous in rate.
- Streaming video at home over a modem (56 kbps) vs. over a corporate LAN (10-100 Mbps).
- Scalable video coding:
- Ideal goal (embedded stream): creating a bit stream that can be accessed at any rate.
- Practical video coder:
- Layered coder: the base layer provides basic quality; successive layers refine the quality incrementally.
- Coarse granularity (typically known as a layered coder).
- Fine granularity (FGS).
Combined Spatial/Quality Scalability
Illustration of Scalable Coding
[Figure: the same image decoded at 6.5 kbps, 21.6 kbps, 133.9 kbps, & 436.3 kbps; one direction shows quality (SNR) scalability, the other spatial scalability.]
Quality Scalability by Multistage Quantization
[Figure: multistage quantization, from coarse to fine.]
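Quality scalability by multistage quantization can be sketched with two layers: the base layer quantizes the coefficients coarsely, and the enhancement layer quantizes the remaining error with a finer step. The coefficient values and step sizes below are hypothetical, and entropy coding is left out.

```python
def encode_two_layers(coeffs, base_step, enh_step):
    """Coarse quantization for the base layer, finer quantization of what is left."""
    base = [round(c / base_step) for c in coeffs]
    base_rec = [q * base_step for q in base]
    enh = [round((c - r) / enh_step) for c, r in zip(coeffs, base_rec)]
    return base, enh


def decode_layers(base, enh, base_step, enh_step, use_enhancement):
    """Reconstruct from the base layer alone or from both layers."""
    rec = [q * base_step for q in base]
    if use_enhancement:
        rec = [r + e * enh_step for r, e in zip(rec, enh)]
    return rec


coeffs = [100, -37, 12, 3]                      # toy transform coefficients
base, enh = encode_two_layers(coeffs, base_step=16, enh_step=4)
base_only = decode_layers(base, enh, 16, 4, use_enhancement=False)
both_layers = decode_layers(base, enh, 16, 4, use_enhancement=True)
print(base_only, both_layers)
```

A receiver that only gets the base layer still decodes a usable (coarser) signal; adding the enhancement layer reduces the reconstruction error.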
Spatial/Temporal Scalability through Down/Up-Sampling
[Figure: spatial/temporal down/up-sampling, from coarse to fine.]
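The down/up-sampling idea can be shown on a one-dimensional signal. In this sketch the base layer is the signal down-sampled by two (pair averaging), the decoder up-samples it by sample repetition, and the enhancement layer carries the difference to the original. The filters and the sample values are assumptions chosen for simplicity, and the layers are left unquantized so the full reconstruction is exact.

```python
def downsample(signal):
    """Halve the resolution by averaging neighboring pairs (even length assumed)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the resolution by repeating each sample."""
    return [v for v in signal for _ in range(2)]


signal = [10, 12, 20, 22, 30, 30, 8, 4]          # toy row of pixels

base_layer = downsample(signal)                  # low-resolution version
prediction = upsample(base_layer)                # what a base-only decoder can show
enhancement = [s - p for s, p in zip(signal, prediction)]
full_resolution = [p + e for p, e in zip(prediction, enhancement)]
print(base_layer, enhancement)
```

Temporal scalability follows the same pattern with frames in place of samples: the base layer keeps a subset of the frames and the enhancement layer supplies the rest.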
Scalability in MPEG-2
- MPEG-2 is the earliest standard that offers scalability tools.
- Four types of scalability:
- Data partitioning (frequency scalability).
- SNR scalability (quality scalability).
- Temporal scalability (frame-rate scalability).
- Spatial scalability (resolution scalability).
Fine Granularity Scalability in MPEG-4
- MPEG-4 achieves fine granularity quality scalability through bit-plane coding.
- The DCT coefficients are represented losslessly in binary form.
- The bit planes are coded successively, from the most significant bit to the least.
- The bit plane within each block is coded using run-length coding.
- Temporal scalability is accomplished by combining I, B, & P-frames.
- Spatial scalability is achieved by spatial down/up-sampling.
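Bit-plane decomposition can be sketched in a few lines. The example assumes non-negative coefficient magnitudes (signs would be sent separately) and skips the run-length coding of each plane; the magnitudes and function names are hypothetical. Decoding only the first few planes yields a coarser approximation, which is what makes the stream finely scalable.

```python
def to_bit_planes(magnitudes, num_planes):
    """Split magnitudes into bit planes, most significant plane first."""
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(num_planes - 1, -1, -1)]


def from_bit_planes(planes, num_planes):
    """Rebuild magnitudes from the leading planes that were received (at least one)."""
    values = [0] * len(planes[0])
    for i, plane in enumerate(planes):
        shift = num_planes - 1 - i
        for j, bit in enumerate(plane):
            values[j] |= bit << shift
    return values


magnitudes = [10, 0, 6, 3, 1]                   # toy coefficient magnitudes
planes = to_bit_planes(magnitudes, num_planes=4)
coarse = from_bit_planes(planes[:2], num_planes=4)   # only the 2 most significant planes
exact = from_bit_planes(planes, num_planes=4)        # all planes
print(planes, coarse, exact)
```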
Bit-Plane Coding
Scalable Video Coding using Wavelet Transform
- Wavelet-based image coding:
- Full frame image transform (as opposed to block-based transform).
- Bit plane coding of the transform coefficients can lead to embedded bit streams.
- Examples: embedded zero-tree wavelet (EZW) & SPIHT.
Scalable Video Coding using Wavelet Transforms
- Wavelet-based video coding:
- Temporal filtering with & without motion compensation.
- Can achieve temporal, spatial, & quality scalability simultaneously.
- MPEG-4 uses DCT-based coding for natural videos, but uses WT-based coding for still images & graphics.
- Still an active research area!
Homework 9
Reading assignment:
- Sec. 9.3, 11.1
Written assignment:
- Prob. 9.13, 11.3, 11.4
Computer assignment:
- Prob. 9.11, 9.12
Optional: 9.15