TSBK01 Image Coding and Data Compression, Lecture 10, Jörgen Ahlberg




slide-1
SLIDE 1
  • TSBK01 Image Coding and Data Compression

Lecture 10 Jörgen Ahlberg

slide-2
SLIDE 2
  • I. Colour coding

  • II. Moving images: From 2D to 3D?

  • III. Hybrid coding

  • IV. Video coding standards

slide-3
SLIDE 3
  • The base colours of colour television are

– Red: 700 nm
– Green: 546 nm
– Blue: 435 nm

  • Three base colours are enough to synthesize any visible colour!

slide-4
SLIDE 4

(Figure: colour space with axes R, G and B.)

  • In this plane, the luminance is Y = R + G + B = 1.

slide-5
SLIDE 5
  • Y = 0.30R + 0.59G + 0.11B

Cr = R - Y = 0.70R - 0.59G - 0.11B

Cb = B - Y = -0.30R - 0.59G + 0.89B

Y: luminance; Cr, Cb: chrominance.

(Figure: a matrix converts R, G, B into Y, R-Y and B-Y.)
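The conversion above can be sketched in a few lines of Python. This is only an illustration using the coefficients from the slide; the function names and the assumption that R, G and B are normalised to the range 0..1 are not from the lecture.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert RGB (assumed normalised to 0..1) to luminance and colour differences."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    cr = r - y  # equals 0.70R - 0.59G - 0.11B
    cb = b - y  # equals -0.30R - 0.59G + 0.89B
    return y, cr, cb


def ycrcb_to_rgb(y, cr, cb):
    """Invert the conversion: recover R and B directly, then solve Y for G."""
    r = cr + y
    b = cb + y
    g = (y - 0.30 * r - 0.11 * b) / 0.59
    return r, g, b
```

For any grey value (R = G = B) both colour differences are zero, which is why the chrominance components carry only the colour information.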

slide-6
SLIDE 6

  • Change basis to YUV (almost the same as YCrCb).

– For more info on colour spaces, see the colour FAQ at www.poynton.com/Poynton-color.html

  • The human visual system perceives the luminance in higher resolution than the chrominance! Subsample the colour components.

(Figure: Y, U and V sample layouts for 4:2:0 and 4:2:2.)
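A minimal sketch of chroma subsampling, assuming a chroma plane stored as a list of rows with even width and height. Averaging neighbouring samples is one possible choice made here for illustration; simply dropping samples is another. The function names are hypothetical.

```python
def subsample_422(plane):
    """4:2:2: halve the horizontal chroma resolution by averaging horizontal pairs."""
    return [
        [(row[x] + row[x + 1]) / 2 for x in range(0, len(row), 2)]
        for row in plane
    ]


def subsample_420(plane):
    """4:2:0: halve the chroma resolution in both directions by averaging 2x2 blocks."""
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]
```

With U and V subsampled this way, a frame of W×H pixels needs 2·W·H samples in 4:2:2 and 1.5·W·H samples in 4:2:0, compared with 3·W·H without subsampling.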

slide-7
SLIDE 7
  • Principle I - Extend known methods to 3D

Coding method     Performance (bpp)   Complexity   Decoding complexity
PCM               6 – 8               Low          Low
VQ                0.5 – 2             Very high    Low
Predictive        2 – 5               Low          Low
Transform         0.5 – 1.5           High         High
Subband/Wavelet   0.1 – 1.0           High         High
Fractal           0.1 – 0.5           Very high    Low

slide-8
SLIDE 8
  • Predictive coding

– 3D predictors
– Motion compensated predictors

  • Transform coding

– 3D transforms

  • Subband coding

– 3D subband filters

  • BUT! The properties of the image signal are different in the temporal and the spatial domain!

slide-9
SLIDE 9
  • Principle II: Hybrid methods

– Hybrid predictive/transform coding is very popular.

slide-10
SLIDE 10
  • Combine predictive coding and transform coding.

– Use predictive coding to predict the next frame in the sequence.
– Use transform coding to code the prediction error.
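The two steps above can be sketched as a small coding loop. This is a simplified illustration, not the coder of any standard: it assumes square frames that are treated as a single transform block, uses the previous reconstructed frame as the prediction (no motion compensation), and uses a DCT with a uniform quantizer of hypothetical step size `step`.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis, one basis vector per row."""
    return [
        [math.sqrt((1 if k == 0 else 2) / n) * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
         for i in range(n)]
        for k in range(n)
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def encode_sequence(frames, step=4.0):
    """Return the quantized coefficients and the decoder-side reconstructions."""
    n = len(frames[0])
    c = dct_matrix(n)
    ct = transpose(c)
    recon = [[0.0] * n for _ in range(n)]  # prediction for the first frame is all zeros
    coded, decoded = [], []
    for frame in frames:
        # Predictive coding: the prediction is the previous reconstructed frame.
        residual = [[frame[y][x] - recon[y][x] for x in range(n)] for y in range(n)]
        # Transform coding of the prediction error: 2D DCT, then uniform quantization.
        coeffs = matmul(matmul(c, residual), ct)
        levels = [[round(v / step) for v in row] for row in coeffs]
        # The encoder mirrors the decoder so that both use the same prediction.
        dequantized = [[level * step for level in row] for row in levels]
        rec_residual = matmul(matmul(ct, dequantized), c)
        recon = [[recon[y][x] + rec_residual[y][x] for x in range(n)] for y in range(n)]
        coded.append(levels)
        decoded.append(recon)
    return coded, decoded
```

For a static scene the prediction error of the second frame is close to zero, so almost all quantized coefficients vanish and very few bits are needed.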

slide-11
SLIDE 11
(Block diagram of a transform coder: T, Q, VLC.)

T: Transform; Q: Quantizer; VLC: Variable Length Coder

slide-12
SLIDE 12
(Block diagram of a predictive coder with blocks Q, Q-1, VLC and P.)

Q: Quantizer; Q-1: Inverse quantizer (reconstructor); P: Predictor

slide-13
SLIDE 13
(Block diagram of a hybrid coder with blocks T, T-1, Q, Q-1, VLC and P.)

slide-14
SLIDE 14
  • Intra-coded I-frame

  • Predictively coded P-frames

  • Better prediction if it can compensate for motion!

slide-15
SLIDE 15
slide-16
SLIDE 16
(Block diagram of a motion-compensated hybrid coder with blocks TQ, TQ-1, P, ME and two VLC blocks.)

ME: Motion estimation; TQ: Transform + quantization

slide-17
SLIDE 17
  • Typically one motion vector per macroblock (4 transform blocks)

  • Motion estimation is a time-consuming process

– Hierarchical motion estimation
– Maximum length of motion vectors
– Clever search strategies

  • Motion vector accuracy:

– Integer, half or quarter pixel
– Bilinear interpolation
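To show why motion estimation is expensive, here is a sketch of exhaustive (full-search) block matching with integer-pixel accuracy. It is the brute-force baseline that the faster strategies above try to avoid. The sum of absolute differences (SAD) as matching criterion, the function names and the parameters `size` and `max_disp` are assumptions made for this example.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def full_search(cur, ref, bx, by, size=4, max_disp=2):
    """Test every displacement within +/- max_disp pixels and keep the best one.

    Returns ((dx, dy), cost), where (dx, dy) points from the block position in the
    current frame to the best matching block in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            # Skip candidates that would fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + size > height or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The cost grows with the square of the maximum vector length, since (2·max_disp + 1)² candidates are tested per block. Limiting the vector length or searching hierarchically reduces this.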

slide-18
SLIDE 18
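(Figure: video applications and coding standards placed along a bitrate axis.)

– Bitrate axis: 8, 16, 64, 384 kbit/s; 1.5, 5, 20 Mbit/s
– Ranges: very low bitrate, low bitrate, medium bitrate, high bitrate
– Applications: mobile videophone, videophone over PSTN, ISDN videophone, digital TV, HDTV, Video CD
– Standards: MPEG-4, MPEG-1, MPEG-2, H.261, H.263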

slide-19
SLIDE 19
  • H.26x

– Standards for real-time communication like video telephony and video conferencing.
– Standardized by ITU.

  • MPEG

– Standards for stored video data like movies on CDs, DVDs, etc.
– Standardized by ISO.

slide-20
SLIDE 20
  • H.261: Standard for ISDN picture phones in 1990.

  • Motion compensation:

– One motion vector per macroblock.
– One macroblock = four 8×8 luminance blocks + two chrominance blocks (one U and one V).
– Motion vectors max 15 pixels long in each direction.

  • Format:

– CIF (352×288) or QCIF (176×144)
– 7.5 – 30 frames/s.

  • Bitrate: Multiple of 64 kbit/s (= ISDN) including audio.

  • Quality: Acceptable for small motion at 128 kbit/s.
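To put these bitrates in perspective, the uncompressed bitrate of the formats above can be computed directly. The sketch assumes 8 bits per sample (not stated on the slide) and the chrominance layout implied by the macroblock structure, that is, U and V at half resolution in both directions. The function names are hypothetical.

```python
def raw_bitrate(width, height, fps, bits_per_sample=8):
    """Uncompressed bitrate in bit/s for one luminance and two subsampled chrominance planes."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)  # U and V, each at half resolution
    return (luma + chroma) * bits_per_sample * fps


def macroblocks(width, height):
    """Number of 16x16 macroblocks (four 8x8 luminance blocks each) per frame."""
    return (width // 16) * (height // 16)


cif = raw_bitrate(352, 288, 30)  # 36495360 bit/s, about 36.5 Mbit/s
```

CIF at 30 frames/s is thus about 36.5 Mbit/s uncompressed, so 128 kbit/s corresponds to a compression ratio of roughly 285:1 even if the whole channel were spent on video. A CIF frame contains 396 macroblocks and therefore up to 396 motion vectors.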

slide-21
SLIDE 21
  • H.263: Standard for picture telephones over analog subscriber lines in 1995.

  • Format:

– CIF, QCIF or Sub-QCIF.
– Usually less than 10 frames/s.

  • Bitrate: Typically 20 – 30 kbit/s.

  • Quality: With new options as good as H.261 (at half the bitrate).

slide-22
SLIDE 22
  • Moving Picture Experts Group: a committee under ISO and IEC.

  • Original plan:

– MPEG-1 for 1.5 Mbit/s (Video CD)
– MPEG-2 for 10 Mbit/s (Digital TV)
– MPEG-3 for 40 Mbit/s (HDTV)

  • What happened:

– MPEG-1 for 1.5 Mbit/s (Video CD)
– MPEG-2 for 2 – 60 Mbit/s (TV and HDTV)
– MPEG-4, -7 and -21 for other things.

slide-23
SLIDE 23
  • MPEG-1: ISO/IEC standard in 1991.

  • Target bitrate around 1.5 Mbit/s (Video CD).

  • Properties:

– Bi-directionally predictively coded frames ("B-frames", see next slide).
– More flexible than H.261.
– Almost JPEG for intra frames.

  • Format:

– CIF
– No interlace.
– 24 – 30 frames/s.

slide-24
SLIDE 24
(Figure: frame sequence I B P B B P B B P B B I B, forming a group of frames (GOF).)

– I: intra-coded I-frame
– P: predictively coded P-frames
– B: bi-directionally predictively coded B-frames
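Because a B-frame is predicted from both an earlier and a later I- or P-frame, the later frame has to be coded before the B-frames that depend on it. The sketch below reorders a sequence from display order to such a coding order. The example sequence and the function name are made up for illustration and are not taken from the figure.

```python
def coding_order(display):
    """Return frame indices in an order where each B-frame follows both of its anchors.

    `display` is a string of frame types in display order, e.g. "IBBPBBP".
    """
    order, pending_b = [], []
    for index, frame_type in enumerate(display):
        if frame_type == "B":
            # A B-frame has to wait until the next I- or P-frame has been coded.
            pending_b.append(index)
        else:
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    # Trailing B-frames would wait for the first frame of the next group;
    # this sketch simply places them last.
    order.extend(pending_b)
    return order
```

For the display order "IBBPBBP" the result is [0, 3, 1, 2, 6, 4, 5]: each P-frame is coded before the two B-frames that precede it in display order.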

slide-25
SLIDE 25
  • Intra-coded frames (I-frames)

– 8×8 DCT
– Arbitrary weighting matrix for coefficients
– Predictive coding of DC coefficients
– Uniform quantization
– Zig-zag, run-level, entropy coding
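The zig-zag scan and run-level step can be sketched as follows. The code assumes a square block of already quantized coefficients and handles the DC coefficient separately, since it is coded predictively. The entropy coding of the resulting pairs is omitted, and the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Positions (row, col) of an n x n block in zig-zag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk the anti-diagonals; the direction alternates between diagonals.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_level(block):
    """Return the DC coefficient and the (run, level) pairs of the AC coefficients.

    `run` counts the zeros preceding each non-zero `level` in scan order.
    Trailing zeros are not coded; an end-of-block symbol would follow instead.
    """
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for level in scan[1:]:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return scan[0], pairs
```

After quantization most high-frequency coefficients are zero, so the zig-zag scan tends to collect them into long runs at the end of the scan, which the run-level representation codes compactly.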

slide-26
SLIDE 26
  • Motion compensated prediction from I- or P-frame.

  • Half-pixel accuracy of motion vectors, bilinear interpolation.

  • Predictive coding of motion vectors.

  • Prediction error coded as I-frame.
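Half-pixel accuracy means that the reference frame is sampled between pixel positions. A minimal sketch of bilinear interpolation at half-pixel positions is given below. It assumes non-negative coordinates given in half-pixel units and returns a floating-point value, whereas a real coder uses integer arithmetic with a defined rounding rule. The function name is hypothetical.

```python
def sample_half_pel(ref, x2, y2):
    """Sample `ref` at position (x2 / 2, y2 / 2) using bilinear interpolation.

    Even coordinates hit a pixel exactly; odd coordinates lie halfway between
    two neighbouring pixels in that direction.
    """
    x0, y0 = x2 // 2, y2 // 2
    x1 = x0 + (x2 % 2)  # equals x0 for integer positions
    y1 = y0 + (y2 % 2)
    return (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]) / 4
```

At an integer position all four terms are the same pixel, at a horizontal or vertical half position the result is the mean of two pixels, and at a diagonal half position it is the mean of four.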

slide-27
SLIDE 27
  • Motion compensated prediction from two consecutive I- or P-frames.

– Forward prediction only (1 vector/macroblock).
– Backward prediction only (1 vector/macroblock).
– Average of fwd and bwd (2 vectors/macroblock).

  • Otherwise as P-frames.
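The three prediction modes can be compared per block as sketched below. The inputs `fwd` and `bwd` are assumed to be the motion-compensated predictions from the previous and the next I- or P-frame. Choosing the mode with the smallest sum of absolute differences is an assumption of this example, since the mode decision is up to the encoder.

```python
def block_sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))


def best_b_mode(cur, fwd, bwd):
    """Return (mode, cost) for the prediction mode with the smallest error."""
    avg = [[(f + b) / 2 for f, b in zip(row_f, row_b)] for row_f, row_b in zip(fwd, bwd)]
    costs = {
        "forward": block_sad(cur, fwd),   # 1 vector per macroblock
        "backward": block_sad(cur, bwd),  # 1 vector per macroblock
        "average": block_sad(cur, avg),   # 2 vectors per macroblock
    }
    mode = min(costs, key=costs.get)
    return mode, costs[mode]
```

The average mode tends to win when the block content lies between the two anchor frames, for example during a fade, while the backward mode can help for areas that are uncovered only in the later frame.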

slide-28
SLIDE 28
  • MPEG-2: ISO/IEC standard in 1994.

  • Properties:

– Handles interlace (optimized for TV)
– Even more flexible than MPEG-1

  • Format:

– 352×288
– 704×576 (25 frames/s) or 720×480 (30 frames/s)
– 1440×1152 or 1920×1080 (HDTV)

  • Bitrate:

– 2 – 60 Mbit/s
– ~4 Mbit/s: Image quality similar to PAL / NTSC / SECAM.
– 18 – 20 Mbit/s: HDTV.

slide-29
SLIDE 29
  • Profiles:

– Simple profile without B-frames.
– Scalable profiles.

  • Experience shows that:

– At 1.5 – 2 Mbit/s MPEG-2 is not better than MPEG-1.
– With manual interaction during coding, good quality can be achieved at 3 – 4 Mbit/s.
– Problems with implementing the full standard have caused compatibility problems.
– Buffering and rate control are hard problems.

slide-30
SLIDE 30
  • ISO/IEC standard in 1998, version 2 in 1999.

  • Instead of frames as coding units, MPEG-4 uses audio-visual objects.

  • Focus is not primarily on compression, but on content-based functionality.

  • Contains definitions of:

– Media object types (video, audio, text, graphics, ...)
– Parameters for describing the objects
– Bitstream syntax for the (compressed) parameters
– Scene description, file format, streaming, synchronization, ...

  • Allows mixing of media objects.

slide-31
SLIDE 31
  • Part 1, Systems, contains

– The bitstream syntax and the binary "language" for scene description
– Computer graphics object descriptions
– Multiplexing, transport, ...

  • Part 2, Visual, contains

– Video coding
– Still image coding
– Texture coding, ...

  • Part 3, Audio, contains a toolbox of audio coders for different applications

  • ...

slide-32
SLIDE 32
slide-33
SLIDE 33
  • Instead of frames: Video Object Planes

Coded with Shape Adaptive DCT

slide-34
SLIDE 34
(Block diagram with blocks TQ, TQ-1, three VLC blocks and a multiplexer (Mux).)

slide-35
SLIDE 35
  • Mix traditional video with 2D/3D graphics

– Compose virtual environments
– Easy to add text, graphs, images, etc.

  • High compression

  • Receive objects from separate sources

– Use predefined or locally defined objects

  • Scalability

– Progressive decoding
– Better terminal gives better quality.

slide-36
SLIDE 36
  • 2D/3D graphics

Lines, polygons

Still images

Image/video mapping on polygon meshes

  • VRML scenes and objects
  • Animated people
  • More on animation and virtual characters in Lecture 12!
  • Synthetic audio
  • More on natural and synthetic audio in Lecture 11!
slide-37
SLIDE 37
slide-38
SLIDE 38
  • Downloaded virtual environment

  • Different environments for different users

  • Simple change between environments

  • Synthetic environments are cheaper than real ones

slide-39
SLIDE 39
  • Wavelet-based still image compression

– Scalable quality and resolution
– Progressive decoding
– Can be mapped on 2D or 3D meshes

  • Compression of 2D and 3D meshes

– Mesh geometry and animation
– Transmit vertex coordinates and let the receiving terminal calculate the polygons
– A moving or still image can be mapped on the mesh (texture mapping).

slide-40
SLIDE 40
  • Face and Body Animation

  • Text-to-speech (TTS) interface

  • View-dependent scalable texture

– Information about the user's view position in a 3D scene is transmitted on a back-channel
– Only the necessary texture information is transmitted to the user

slide-41
SLIDE 41
(Figure, three panels: original texture; the texture is mapped on a surface; what the user sees.)

slide-42
SLIDE 42
  • Microsoft, RealVideo, QuickTime, ...

  • All are variations of the hybrid coder used in MPEG coders, with some extra features.

slide-43
SLIDE 43
  • ITU and ISO in cooperation:

H.264 = MPEG-4 part 10

Finished in 2003.

slide-44
SLIDE 44
  • 4×4 integer transform (approximating DCT).

  • Prediction of blocks of sizes up to 16×16.

  • Motion vectors for blocks of sizes 4×4 up to 16×16.

  • Up to 5 reference images for prediction.

  • Non-uniform quantization.

  • Arithmetic coding of run-level pairs.
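The 4×4 integer transform can be illustrated with the core transform matrix of H.264, here called `CF`. The sketch shows only the forward core transform. In the standard the scaling that makes it approximate a DCT is folded into the quantization step, which is left out here, and the helper names are hypothetical.

```python
# Core forward transform matrix of H.264: integer-valued, with orthogonal rows.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def forward_4x4(block):
    """Apply the core transform to the rows and columns: Y = CF * X * CF^T."""
    return matmul(matmul(CF, block), transpose(CF))
```

Since all entries are small integers, the transform needs only additions and shifts and gives bit-exact results on every decoder, unlike a floating-point DCT. The rows are orthogonal but not of equal length (CF·CFᵀ is diagonal with entries 4, 10, 4, 10), which is why the normalisation has to be handled elsewhere.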

slide-45
SLIDE 45
  • MPEG-1

– Audio layer I, II and III (mp3).

  • MPEG-2

– Four channels, same codec as in MPEG-1.
– AAC (Advanced Audio Coding) added later.

  • MPEG-4

– AAC
– Two speech coders
– Structured audio
– And more...

  • More on audio coding in Lecture 11.

slide-46
SLIDE 46
  • Colour coding

– Change basis from RGB to YUV
– Colour components are compressed harder than the luminance

  • Moving image coding

– Hybrid coding: Motion compensated predictive coding and transform coding of the prediction error
– I-, P-, and B-frames
– Object-based coding (MPEG-4) mixing synthetic and natural audio & video

slide-47
SLIDE 47
  • Standards

– MPEG-1: Video CD
– MPEG-2: Digital TV
– MPEG-4: Multimedia
– H.261: ISDN videophone
– H.263: PSTN videophone
– H.264 / MPEG-4 part 10: Universal video

slide-48
SLIDE 48

That was the last slide!