Image and Video Coding: From Block Transforms to JPEG



SLIDE 1

Image and Video Coding: From Block Transforms to JPEG

SLIDE 2

JPEG Overview

The JPEG Standard

The Joint Photographic Experts Group (JPEG) standard is named after the group that created it
Joint committee of ITU-T and ISO/IEC JTC 1
  ITU-T Study Group 16, Working Party 3, Question 6 (Visual Coding Experts Group: VCEG)
  ISO/IEC Joint Technical Committee 1, Subcommittee 29, Working Group 1 (JTC 1/SC 29/WG 1)

Digital Compression and Coding of Continuous-Tone Still Images (JPEG Standard)
  Officially ITU-T Recommendation T.81 and International Standard ISO/IEC 10918-1
  Commonly referred to as JPEG
  Specifies compression for gray-level and color images
  Work commenced in 1986
  Standard published in 1992
  Still the most widely used standard for image compression

Heiko Schwarz (Freie Universität Berlin) — Image and Video Coding: From Block Transforms to JPEG 2 / 43

SLIDE 3

JPEG Overview / Color Components

JPEG: Source Image Formats

Single-Component Images
  2D array of integer sample values
  Characterized by
    image width W and image height H
    sample bit depth B, with sample values in the range $[0, 2^B - 1]$

Multi-Component Images / Color Images
  Multiple 2D arrays of integer sample values
  Characterized by
    maximum width W and height H of the arrays
    vertical and horizontal downsampling factors

Most common: Y’CbCr format
  Y’: full-resolution luma component
  Cb, Cr: downsampled chroma components

[Figure: a single-component image next to the Y’, Cb, and Cr arrays of the most common format, Y’CbCr 4:2:0]

SLIDE 4

JPEG Overview / Partitioning

JPEG: Color Components and Partitioning

Partitioning of Color Components
  Each color component is partitioned into 8×8 blocks of samples
  If the array size is not a multiple of 8 in each dimension: insert the missing samples (they are removed after decoding)
  Most common method: constant border extension
    $s[x, y] = s[W-1, y]$ for $x \ge W$
    $s[x, y] = s[x, H-1]$ for $y \ge H$
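
Constant border extension can be sketched as follows. This is a minimal illustration that assumes a component is stored as a list of rows; the function name `pad_to_block_multiple` and its `block` parameter are hypothetical and not taken from the standard.

```python
def pad_to_block_multiple(samples, block=8):
    """Extend a 2D sample array (list of rows) to a multiple of `block` in both dimensions.

    Missing samples repeat the last column and then the last row
    (constant border extension).
    """
    height, width = len(samples), len(samples[0])
    padded_w = -(-width // block) * block      # round width up to a multiple of block
    padded_h = -(-height // block) * block     # round height up to a multiple of block
    # s[x, y] = s[W-1, y] for x >= W
    rows = [list(row) + [row[-1]] * (padded_w - width) for row in samples]
    # s[x, y] = s[x, H-1] for y >= H
    rows += [list(rows[-1]) for _ in range(padded_h - height)]
    return rows


image = [[1, 2, 3],
         [4, 5, 6]]
padded = pad_to_block_multiple(image, block=4)
```

A decoder would crop the reconstructed array back to W×H, so the inserted samples never reach the output.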

SLIDE 5

JPEG Overview / Source Coding Algorithm

JPEG: Basic Source Coding Algorithm

[Figure: block diagram of the codec. Encoder: original 8×8 block of the sample array → 2D block transform → quantization (lossy part) → entropy coding → bitstream. Decoder: bitstream (after transmission or storage) → entropy decoding → dequantization → 2D inverse transform → reconstructed 8×8 block of the sample array.]

Basic Encoding Algorithm (for 8×8 blocks of samples)

1 Transform: Energy compaction (reduce statistical dependencies between samples)
2 Quantization: Approximate the signal in a suitable way (enables more efficient coding)
3 Entropy Coding: Represent the quantization indexes with as few bits as possible

Decoder: Inverse operations of the encoder (note: quantization is not invertible)

SLIDE 6

Block Transform / Linear Transform

Linear Transform of Sample Vectors

Consider an arbitrary vector $s = (s_0, s_1, s_2, \dots, s_{N-1})^T$ of N samples
Linear transform: matrix-vector multiplication (each coefficient is a linear combination of the samples)

Forward Transform (in encoder)
  Map sample vector s to vector u of transform coefficients
  \[ u = A \cdot s \]
  with A: N×N transform matrix, u: vector of N transform coefficients

Inverse Transform (in decoder)
  Map reconstructed coefficients u′ to sample vector s′
  \[ s' = A^{-1} \cdot u' \]
  with $A^{-1}$: inverse of matrix A, u′: vector of N reconstructed coefficients

Perfect reconstruction (s′ = s) in the absence of quantization (u′ = u)

SLIDE 7

Block Transform / Linear Transform

Interpretation of Linear Transforms

Inverse Transform
  Reconstructed vector of samples is obtained by $s' = A^{-1} \cdot u'$

  \[ s' = \begin{bmatrix} b_0 & b_1 & b_2 & \cdots & b_{N-1} \end{bmatrix} \cdot \begin{bmatrix} u'_0 \\ u'_1 \\ \vdots \\ u'_{N-1} \end{bmatrix} = u'_0 \cdot b_0 + u'_1 \cdot b_1 + u'_2 \cdot b_2 + \cdots + u'_{N-1} \cdot b_{N-1} \]

  Reconstructed vector s′ is represented as a linear combination of the basis vectors $b_k$ (columns of $A^{-1}$)
  Transform coefficients $u'_k$ represent the weighting factors for the corresponding basis vectors $b_k$

Forward Transform
  Vector of transform coefficients is obtained by $u = A \cdot s$
  Decomposition of sample vector s into a linear combination of the basis vectors $b_k$
  Transform coefficients $u_k$ represent the corresponding weighting factors

SLIDE 8

Block Transform / Linear Transform

Example for Possible Basis Vectors

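Linearly independent basis vectors (required for invertibility of A and $A^{-1}$)

\[ b_0 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \quad b_1 = \begin{bmatrix} 3 \\ 1 \\ -1 \\ -3 \end{bmatrix}, \quad b_2 = \begin{bmatrix} 1 \\ -1 \\ -1 \\ 1 \end{bmatrix}, \quad b_3 = \begin{bmatrix} 1 \\ -1 \\ 1 \\ -1 \end{bmatrix} \]

Inverse transform matrix $A^{-1}$

\[ A^{-1} = \begin{bmatrix} b_0 & b_1 & b_2 & b_3 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -3 & 1 & -1 \end{bmatrix} \]

Forward transform matrix A

\[ A = \left( A^{-1} \right)^{-1} = \begin{bmatrix} 0.250 & 0.250 & 0.250 & 0.250 \\ 0.125 & 0.125 & -0.125 & -0.125 \\ 0.250 & -0.250 & -0.250 & 0.250 \\ 0.125 & -0.375 & 0.375 & -0.125 \end{bmatrix} \]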

SLIDE 9

Block Transform / Orthogonal Transform

Orthogonal Transform

Mean Squared Error (MSE) of Samples

\[
\frac{1}{N} \sum_k \left( s'_k - s_k \right)^2
  = \frac{1}{N} \left( s' - s \right)^T \left( s' - s \right)
  = \frac{1}{N} \left( A^{-1} u' - A^{-1} u \right)^T \left( A^{-1} u' - A^{-1} u \right)
  = \frac{1}{N} \left( u' - u \right)^T \cdot \left( A^{-1} \right)^T A^{-1} \cdot \left( u' - u \right)
\]

In general: complicated relationship to the quantization errors $u'_k - u_k$ of the transform coefficients
MSE cannot be minimized by independent quantization of the transform coefficients

Orthogonal Transforms
  Inverse matrix is equal to the transpose: $A^{-1} = A^T$
  MSE in sample domain is equal to MSE in transform domain

  \[ \frac{1}{N} \sum_k \left( s'_k - s_k \right)^2 = \frac{1}{N} \sum_k \left( u'_k - u_k \right)^2 \]

  Transform coefficients can be quantized independently of each other
  Main reason for using orthogonal transforms in lossy coding

SLIDE 10

Block Transform / Orthogonal Transform

Orthonormal Basis

Property of Orthogonal Transforms
  Consider the product of forward and inverse matrix: $A \cdot A^T = A \cdot A^{-1} = I$

  \[ \begin{bmatrix} b_0^T \\ b_1^T \\ b_2^T \\ \vdots \\ b_{N-1}^T \end{bmatrix} \cdot \begin{bmatrix} b_0 & b_1 & b_2 & \cdots & b_{N-1} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \]

  Basis vectors $b_k$ are orthogonal to each other
  Basis vectors $b_k$ have a length equal to 1
  Basis vectors of orthogonal matrices form an orthonormal basis

Geometric Interpretation
  Rotation (and possible reflection) of the coordinate system

SLIDE 11

Block Transform / Orthogonal Transform

Example of an Orthogonal Transform for N = 2

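Vector of two samples $s = (s_0, s_1)^T$

Inverse transform matrix

\[ A^{-1} = \begin{bmatrix} b_0 & b_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \]

Representation of signal vector

\[ s = u_0 \cdot b_0 + u_1 \cdot b_1 \]
\[ \begin{bmatrix} 4 \\ 2 \end{bmatrix} = u_0 \cdot \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + u_1 \cdot \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]
\[ \begin{bmatrix} 4 \\ 2 \end{bmatrix} = 3 \cdot \begin{bmatrix} 1 \\ 1 \end{bmatrix} + 1 \cdot \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

[Figure: signal vector s in the $(s_0, s_1)$ plane, decomposed into $u_0 \cdot b_0$ and $u_1 \cdot b_1$ along the basis vectors $b_0$ and $b_1$]

Forward transform: project signal vector onto basis vectors

\[ u_0 = b_0^T \cdot s = 3\sqrt{2} \quad \text{and} \quad u_1 = b_1^T \cdot s = \sqrt{2} \]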
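
The N = 2 example also lends itself to a quick numerical check of the MSE property of orthogonal transforms. In the sketch below, rounding the coefficients to integers stands in for quantization; that choice and all function names are assumptions made for illustration.

```python
import math

R = 1 / math.sqrt(2)
B0 = [R, R]      # basis vector b0
B1 = [R, -R]     # basis vector b1


def forward(s):
    """Project s onto the basis vectors: u_k = b_k^T * s."""
    return [B0[0] * s[0] + B0[1] * s[1],
            B1[0] * s[0] + B1[1] * s[1]]


def inverse(u):
    """Recombine the basis vectors: s = u0 * b0 + u1 * b1."""
    return [u[0] * B0[0] + u[1] * B1[0],
            u[0] * B0[1] + u[1] * B1[1]]


def mse(x, y):
    """Mean squared error between two equally long vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


s = [4, 2]
u = forward(s)                       # approximately (3*sqrt(2), sqrt(2))
u_rec = [round(c) for c in u]        # crude stand-in for quantization (assumption)
s_rec = inverse(u_rec)

mse_samples = mse(s, s_rec)          # error measured in the sample domain
mse_coeffs = mse(u, u_rec)           # error measured in the transform domain
```

Both error values agree up to floating-point precision, which is why the quantization error can be controlled coefficient by coefficient.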

SLIDE 12

Block Transform / Decorrelation

Goal of Transform

[Figure: joint histogram $N(s_n, s_{n+1})$ of two adjacent samples, measured over 15 test images (each 768×512)]

Goal of Transform: Compaction of Signal Energy
  Reduce statistical dependencies between samples inside a block
  Note: linear transforms can only remove linear dependencies

SLIDE 13

Block Transform / Decorrelation

Statistical Measures for Characterizing Linear Dependencies

Consider a collection of one-dimensional signals (interpreted as realizations of a random source)
Consider two samples with a fixed spatial relationship, s[k] and s[ℓ]
Associated random variables are denoted as $S_k$ and $S_\ell$

Covariance and Correlation Coefficient
  Variance and mean: $\sigma_k^2 = E\{ (S_k - \mu_k)^2 \}$ and $\mu_k = E\{ S_k \}$
  Covariance: $\sigma_{k,\ell}^2 = E\{ (S_k - \mu_k)(S_\ell - \mu_\ell) \}$ (with $\sigma_{k,k}^2 = \sigma_k^2$)
  Correlation coefficient: $\varrho_{k,\ell} = \sigma_{k,\ell}^2 / \sqrt{\sigma_k^2 \cdot \sigma_\ell^2}$ (with $-1 \le \varrho_{k,\ell} \le 1$)

Interpretation
  No linear dependencies: $\varrho_{k,\ell} = 0$ and $\sigma_{k,\ell}^2 = 0$ (decorrelated samples)
  Strong linear dependencies: $|\varrho_{k,\ell}| \approx 1$

SLIDE 14

Block Transform / Decorrelation

Linear Dependencies inside Vectors of Samples

Consider a collection of sample vectors $s = [s_0, s_1, s_2, \dots, s_{N-1}]^T$ of size N
The associated random vector is denoted by $S = [S_0, S_1, S_2, \dots, S_{N-1}]^T$

Covariance Matrix
  Matrix of covariances between samples inside vectors

  \[ C_{SS} = E\{ (S - \mu)(S - \mu)^T \} = \begin{bmatrix} \sigma_{0,0}^2 & \sigma_{0,1}^2 & \sigma_{0,2}^2 & \cdots & \sigma_{0,N-1}^2 \\ \sigma_{1,0}^2 & \sigma_{1,1}^2 & \sigma_{1,2}^2 & \cdots & \sigma_{1,N-1}^2 \\ \sigma_{2,0}^2 & \sigma_{2,1}^2 & \sigma_{2,2}^2 & \cdots & \sigma_{2,N-1}^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{N-1,0}^2 & \sigma_{N-1,1}^2 & \sigma_{N-1,2}^2 & \cdots & \sigma_{N-1,N-1}^2 \end{bmatrix} \quad \text{with} \quad \mu = \begin{bmatrix} \mu_0 \\ \mu_1 \\ \mu_2 \\ \vdots \\ \mu_{N-1} \end{bmatrix} \]

  Covariance matrix characterizes linear dependencies between samples of a vector

SLIDE 15

Block Transform / Decorrelation

Determination of Covariance Matrix for Training Set

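Determination of Covariance Matrix
  Use a sufficiently large set of N example vectors $\{s_i\}$
  Each covariance (matrix element) can be estimated according to

  \[ \sigma_{k,\ell}^2 = \frac{1}{N} \sum_{\forall i} \left( s_i[k] - \mu_k \right) \left( s_i[\ell] - \mu_\ell \right) \quad \text{with} \quad \mu_k = \frac{1}{N} \sum_{\forall i} s_i[k] \quad \text{and} \quad \mu_\ell = \frac{1}{N} \sum_{\forall i} s_i[\ell] \]

Examples for Covariance Matrices (vectors of 4 horizontally adjacent samples)

\[ \sigma_S^2 \cdot \begin{bmatrix} 1.00 & 0.97 & 0.93 & 0.88 \\ 0.97 & 1.00 & 0.97 & 0.93 \\ 0.93 & 0.97 & 1.00 & 0.97 \\ 0.88 & 0.93 & 0.97 & 1.00 \end{bmatrix} \qquad \sigma_S^2 \cdot \begin{bmatrix} 1.00 & 0.00 & 0.00 & 0.00 \\ 0.00 & 1.00 & 0.00 & 0.00 \\ 0.00 & 0.00 & 1.00 & 0.00 \\ 0.00 & 0.00 & 0.00 & 1.00 \end{bmatrix} \]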

SLIDE 16

Block Transform / Decorrelation

Linear Dependencies between Transform Coefficients

Covariance Matrix for Vectors of Transform Coefficients
  Same expression as for signal vectors

  \[ C_{UU} = E\{ (U - \mu_U)(U - \mu_U)^T \} = E\{ A \cdot (S - \mu_S)(S - \mu_S)^T \cdot A^T \} = A \cdot C_{SS} \cdot A^T \]

  Goal: choose the orthogonal matrix A in a way that the transform coefficients get decorrelated

Measure for Energy Compaction and Decorrelation
  Linear algebra: the trace of a matrix is similarity-invariant: $\mathrm{tr}(X) = \mathrm{tr}(A \, X \, A^{-1})$
  Transform Gain (in dB)

  \[ G_T = 10 \cdot \log_{10} \left( \frac{ \frac{1}{N} \sum_k \sigma_k^2 }{ \sqrt[N]{ \prod_k \sigma_k^2 } } \right) \]

  Numerator: arithmetic mean of the transform coefficient variances
  Denominator: geometric mean of the transform coefficient variances
  Maximized if the transform coefficients are completely decorrelated ($C_{UU}$ is a diagonal matrix)

SLIDE 17

Block Transform / Decorrelation

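Example: Effect of Decorrelating Transform

\[ C_{SS} = \begin{bmatrix} 1 & 0.9 \\ 0.9 & 1 \end{bmatrix}, \qquad A = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \ \text{(rotation by } \varphi = -45^\circ \text{)}, \qquad C_{UU} = \begin{bmatrix} 1.9 & 0 \\ 0 & 0.1 \end{bmatrix} \]

Transform gain: $G_T = 3.6$ dB

[Figure: distribution in the $(s_0, s_1)$ plane before the transform and in the $(u_0, u_1)$ plane after the transform]

Signal energy is concentrated in the first transform coefficient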
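
The numbers of this example can be reproduced with a few lines. The sketch below computes $A \cdot C_{SS} \cdot A^T$ and the transform gain from the resulting variances; the helper names are illustrative.

```python
import math


def transform_covariance(a, c):
    """Return A * C * A^T for square matrices given as lists of rows."""
    n = len(a)
    ac = [[sum(a[i][k] * c[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(ac[i][k] * a[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transform_gain_db(variances):
    """10 * log10(arithmetic mean / geometric mean) of the coefficient variances."""
    n = len(variances)
    arithmetic = sum(variances) / n
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return 10 * math.log10(arithmetic / geometric)


r = 1 / math.sqrt(2)
A = [[r, r], [-r, r]]                  # rotation by -45 degrees
C_SS = [[1.0, 0.9], [0.9, 1.0]]
C_UU = transform_covariance(A, C_SS)   # close to diag(1.9, 0.1)
gain_db = transform_gain_db([C_UU[0][0], C_UU[1][1]])   # about 3.6 dB
```

The trace stays at 2 (the arithmetic mean is unchanged), so the gain comes entirely from the smaller geometric mean of the variances.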

SLIDE 18

Block Transform / The Karhunen Loève Transform

The Karhunen Loève Transform (KLT)

Optimal Transform for Lossy Source Coding
  Need to consider interactions with quantization and entropy coding
  Difficult to determine

The Karhunen Loève Transform (KLT)
  Orthogonal transform A that produces completely uncorrelated transform coefficients
  Design criterion

  \[ C_{UU} = A \cdot C_{SS} \cdot A^T = \begin{bmatrix} \sigma_0^2 & 0 & \cdots & 0 \\ 0 & \sigma_1^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{N-1}^2 \end{bmatrix} \]

  KLT exists for all random sources (symmetric matrices are always orthogonally diagonalizable)

SLIDE 19

Block Transform / The Karhunen Loève Transform

Basis Vectors of the Karhunen Loève Transform

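Required property for the orthogonal transform matrix A

\[ A \cdot C_{SS} \cdot A^T = C_{UU} \qquad \text{(with } C_{UU} \text{ being a diagonal matrix)} \]
\[ \left( A^T \cdot A \right) \cdot C_{SS} \cdot A^T = A^T \cdot C_{UU} \qquad \text{(orthogonal transform: } A^T = A^{-1} \text{)} \]
\[ C_{SS} \cdot A^T = A^T \cdot C_{UU} \qquad \text{(rows of } A \text{: basis vectors } b_k \text{)} \]
\[ C_{SS} \cdot \begin{bmatrix} b_0 & b_1 & \cdots & b_{N-1} \end{bmatrix} = \begin{bmatrix} b_0 & b_1 & \cdots & b_{N-1} \end{bmatrix} \cdot \begin{bmatrix} \sigma_0^2 & 0 & \cdots & 0 \\ 0 & \sigma_1^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{N-1}^2 \end{bmatrix} \]

Consider individual columns of the matrix equation

\[ \forall k: \quad C_{SS} \cdot b_k = \sigma_k^2 \cdot b_k \qquad \text{(eigenvector equation)} \]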

SLIDE 20

Block Transform / The Karhunen Loève Transform

Determination of KLT Transform Matrix

Condition for KLT Basis Vectors
  For each basis vector $b_k$, we have an eigenvector equation

  \[ C_{SS} \cdot b_k = \sigma_k^2 \cdot b_k \qquad \Longleftrightarrow \qquad C_{SS} \cdot v = \xi \cdot v \]

  Note: eigenvectors v are not unique (can be scaled by any non-zero factor)
  But basis vectors $b_k$ must have an ℓ2-norm equal to 1: $\| b_k \|_2 = 1$

KLT Matrix
  Basis vectors are the unit-norm eigenvectors of $C_{SS}$: $b_k = v_k / \| v_k \|_2$
  Transform coefficient variances $\sigma_k^2 = \xi_k$ are given by the associated eigenvalues of $C_{SS}$

  \[ A_{KLT} = \begin{bmatrix} b_0^T \\ b_1^T \\ \vdots \\ b_{N-1}^T \end{bmatrix} \]

  Typically, basis vectors are sorted in decreasing order of their eigenvalues

SLIDE 21

Block Transform / The Karhunen Loève Transform

Simple Model for Correlated Sources

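Auto-Regressive Sources of Order One: AR(1) Sources
  Auto-covariances depend only on the difference of the sample locations

  \[ \sigma_{k,\ell}^2 = E\{ (S_k - \mu)(S_\ell - \mu) \} = \sigma_S^2 \cdot \varrho^{|k - \ell|} \]

  with $\varrho$ being the first-order correlation coefficient (between two adjacent samples)
  N-th order auto-covariance matrix $C_{SS} = E\{ (S - \mu)(S - \mu)^T \}$

  \[ C_{SS} = \sigma_S^2 \cdot \begin{bmatrix} 1 & \varrho & \varrho^2 & \varrho^3 & \cdots & \varrho^{N-1} \\ \varrho & 1 & \varrho & \varrho^2 & \cdots & \varrho^{N-2} \\ \varrho^2 & \varrho & 1 & \varrho & \cdots & \varrho^{N-3} \\ \varrho^3 & \varrho^2 & \varrho & 1 & \cdots & \varrho^{N-4} \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \varrho^{N-1} & \varrho^{N-2} & \varrho^{N-3} & \varrho^{N-4} & \cdots & 1 \end{bmatrix} \]

  KLT transform matrix only depends on the correlation coefficient $\varrho$ and the transform size N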

SLIDE 22

Block Transform / The Karhunen Loève Transform

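KLT of size N = 4 for AR(1) Source with Correlation Coefficient ϱ = 0.5

\[ C_{SS} = \sigma_S^2 \cdot \begin{bmatrix} 1.0000 & 0.5000 & 0.2500 & 0.1250 \\ 0.5000 & 1.0000 & 0.5000 & 0.2500 \\ 0.2500 & 0.5000 & 1.0000 & 0.5000 \\ 0.1250 & 0.2500 & 0.5000 & 1.0000 \end{bmatrix} \]

\[ A_{KLT} = \begin{bmatrix} 0.4352 & 0.5573 & 0.5573 & 0.4352 \\ 0.6325 & 0.3162 & -0.3162 & -0.6325 \\ 0.5573 & -0.4352 & -0.4352 & 0.5573 \\ 0.3162 & -0.6325 & 0.6325 & -0.3162 \end{bmatrix} \]

\[ C_{UU} = \sigma_S^2 \cdot \begin{bmatrix} 2.0856 & 0 & 0 & 0 \\ 0 & 1.0000 & 0 & 0 \\ 0 & 0 & 0.5394 & 0 \\ 0 & 0 & 0 & 0.3750 \end{bmatrix} \]

$G_T = 0.94$ dB (factor 1.24)

[Figure: plots of the basis vectors $b_0$, $b_1$, $b_2$, $b_3$]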
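
These values can be cross-checked numerically. The sketch below builds the AR(1) covariance matrix for ϱ = 0.5 (with unit variance, an assumption that does not affect the gain), applies the printed KLT matrix, and evaluates the transform gain. Because the matrix entries are rounded to four decimals, the off-diagonal elements come out small but not exactly zero.

```python
import math


def ar1_covariance(rho, n):
    """Covariance matrix of a unit-variance AR(1) source: element (k, l) is rho**|k - l|."""
    return [[rho ** abs(k - l) for l in range(n)] for k in range(n)]


def transform_covariance(a, c):
    """Return A * C * A^T for square matrices given as lists of rows."""
    n = len(a)
    ac = [[sum(a[i][k] * c[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(ac[i][k] * a[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# KLT matrix as printed on the slide (rounded to four decimals).
A_KLT = [[0.4352, 0.5573, 0.5573, 0.4352],
         [0.6325, 0.3162, -0.3162, -0.6325],
         [0.5573, -0.4352, -0.4352, 0.5573],
         [0.3162, -0.6325, 0.6325, -0.3162]]

C_SS = ar1_covariance(0.5, 4)
C_UU = transform_covariance(A_KLT, C_SS)
variances = [C_UU[k][k] for k in range(4)]

# Transform gain: arithmetic mean over geometric mean of the variances, in dB.
gain_db = 10 * math.log10((sum(variances) / 4) / math.prod(variances) ** 0.25)
```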

SLIDE 23

Block Transform / The Karhunen Loève Transform

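KLT of size N = 4 for AR(1) Source with Correlation Coefficient ϱ = 0.95

\[ C_{SS} = \sigma_S^2 \cdot \begin{bmatrix} 1.0000 & 0.9500 & 0.9025 & 0.8574 \\ 0.9500 & 1.0000 & 0.9500 & 0.9025 \\ 0.9025 & 0.9500 & 1.0000 & 0.9500 \\ 0.8574 & 0.9025 & 0.9500 & 1.0000 \end{bmatrix} \]

\[ A_{KLT} = \begin{bmatrix} 0.4937 & 0.5062 & 0.5062 & 0.4937 \\ 0.6516 & 0.2747 & -0.2747 & -0.6516 \\ 0.5062 & -0.4937 & -0.4937 & 0.5062 \\ 0.2747 & -0.6516 & 0.6516 & -0.2747 \end{bmatrix} \]

\[ C_{UU} = \sigma_S^2 \cdot \begin{bmatrix} 3.7568 & 0 & 0 & 0 \\ 0 & 0.1627 & 0 & 0 \\ 0 & 0 & 0.0506 & 0 \\ 0 & 0 & 0 & 0.0300 \end{bmatrix} \]

$G_T = 7.58$ dB (factor 5.73)

[Figure: plots of the basis vectors $b_0$, $b_1$, $b_2$, $b_3$]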
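
The first KLT basis vector can also be computed rather than looked up. The sketch below uses power iteration, a simple method that finds only the eigenvector of the largest eigenvalue; it is one possible technique chosen here for brevity, not necessarily how the values on the slide were obtained, and a complete KLT would need all N eigenvectors.

```python
import math


def ar1_covariance(rho, n):
    """Covariance matrix of a unit-variance AR(1) source: element (k, l) is rho**|k - l|."""
    return [[rho ** abs(k - l) for l in range(n)] for k in range(n)]


def dominant_eigenpair(c, iterations=100):
    """Power iteration: return (largest eigenvalue, unit-norm eigenvector) of c."""
    n = len(c)
    v = [1.0 / math.sqrt(n)] * n                 # start vector with unit norm
    for _ in range(iterations):
        w = [sum(c[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]                # renormalize to unit l2-norm
    cv = [sum(c[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))   # Rayleigh quotient v^T C v
    return eigenvalue, v


variance_0, b0 = dominant_eigenpair(ar1_covariance(0.95, 4))
```

For ϱ = 0.95 the result matches the first row of the printed matrix and the first diagonal element of the coefficient covariance matrix up to rounding.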

SLIDE 24

Block Transform / The Discrete Cosine Transform

Intermediate Results

Linear Transform of Sample Vectors
  Forward transform: $u = A \cdot s$
  Inverse transform: $s' = A^{-1} \cdot u'$

Orthogonal Transform
  Transform matrix has the property $A^{-1} = A^T$
  Allows independent quantization of transform coefficients

Karhunen Loève Transform
  Orthogonal transform that produces uncorrelated transform coefficients
  Maximum energy compaction for a given source
  Transform matrix is signal dependent

Question: Can we derive a suitable signal-independent transform?

SLIDE 25

Block Transform / The Discrete Cosine Transform

Convergence of KLT for AR(1) Sources

[Figure: KLT basis vectors $b_0$, $b_1$, $b_2$, $b_3$ for ϱ = 0.1, ϱ = 0.5, ϱ = 0.9, and ϱ = 0.95]

KLT transform matrix converges for ϱ → 1

SLIDE 26

Block Transform / The Discrete Cosine Transform

The Discrete Cosine Transform of Type II (DCT-II)

Transform Matrix of the Discrete Cosine Transform of Type II (DCT-II)
  The DCT is an orthogonal transform
  The N×N transform matrix $A_{DCT} = \{ a_{kn} \}$ has the elements

  \[ a_{kn} = \frac{\alpha_k}{\sqrt{N}} \cdot \cos\left( \frac{\pi}{N} \, k \left( n + \frac{1}{2} \right) \right) \quad \text{with} \quad \alpha_k = \begin{cases} 1 & : k = 0 \\ \sqrt{2} & : k \ne 0 \end{cases} \]

  The basis vectors $b_k = \{ a_{kn} \}$ represent sampled cosine functions of different frequencies

Relation to KLT
  Unit-norm eigenvectors of $C_{SS}$ approach the DCT-II basis vectors for ϱ → 1

Advantages of DCT-II
  Transform matrix does not depend on the input signal
  Fast algorithms for computing the forward and inverse transforms (butterfly structure)
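
The matrix definition can be turned into a short construction routine. The sketch below builds the DCT-II matrix directly from the formula and forms the product $A \cdot A^T$ to check orthogonality; it is a straightforward evaluation of the definition, not one of the fast butterfly algorithms, and the function name is illustrative.

```python
import math


def dct2_matrix(n):
    """Orthonormal DCT-II matrix: a[k][m] = alpha_k / sqrt(n) * cos(pi / n * k * (m + 1/2))."""
    rows = []
    for k in range(n):
        alpha = 1.0 if k == 0 else math.sqrt(2.0)
        rows.append([alpha / math.sqrt(n) * math.cos(math.pi / n * k * (m + 0.5))
                     for m in range(n)])
    return rows


A = dct2_matrix(8)

# Gram matrix A * A^T; for an orthogonal transform this is the identity matrix.
gram = [[sum(A[i][m] * A[j][m] for m in range(8)) for j in range(8)]
        for i in range(8)]
```

The first row is constant (the DC basis vector); each following row is a cosine sampled at a higher frequency.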

SLIDE 27

Block Transform / The Discrete Cosine Transform

Basis Functions of the DCT-II (Example for N = 8)

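\[ b_k[n] = \frac{\alpha_k}{\sqrt{N}} \cdot \cos\left( \frac{\pi}{N} \, k \left( n + \frac{1}{2} \right) \right) \]

[Figure: plots of the eight basis functions $b_0$ to $b_7$]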

SLIDE 28

Block Transform / The Discrete Cosine Transform

AR(1) Sources: Energy Compaction of KLT and DCT-II for N = 8

[Plot: loss of the DCT-II versus the KLT in energy compaction $G_T$; x-axis: correlation factor ϱ (0.1 to 1); y-axis: difference in energy compaction [dB] (0.00 to 0.07)]

SLIDE 29

Block Transform / 2D Transforms

Image Coding: 2D Transform

Image Coding
  Statistical dependencies in multiple directions (e.g., vertically and horizontally adjacent samples)
  Images are typically coded using N×N blocks of samples

Straightforward Extension to Two Dimensions
  Arrange the samples of an N×N block into a vector of size $N^2$

  \[ s_{blk} = \begin{bmatrix} s_{00} & s_{01} & s_{02} & s_{03} \\ s_{10} & s_{11} & s_{12} & s_{13} \\ s_{20} & s_{21} & s_{22} & s_{23} \\ s_{30} & s_{31} & s_{32} & s_{33} \end{bmatrix} \]
  \[ s_{vec} = \begin{bmatrix} s_{00} & s_{01} & s_{02} & s_{03} & s_{10} & s_{11} & s_{12} & s_{13} & s_{20} & s_{21} & s_{22} & s_{23} & s_{30} & s_{31} & s_{32} & s_{33} \end{bmatrix}^T \]

  Design transform matrix A for vectors $s_{vec}$ of size $N^2$
  Transform matrix has the size $N^2 \times N^2$
  Requires $N^2$ multiplications per sample (for N = 8: 64 multiplications per sample)

SLIDE 30

Block Transform / 2D Transforms

Image Coding: Separable 2D Transform

Separable 2D Transform
  Successive 1D transforms for the rows and columns of an N×N image block
  Separable orthogonal transform

  \[ \begin{bmatrix} u_{00} & u_{01} & u_{02} & u_{03} \\ u_{10} & u_{11} & u_{12} & u_{13} \\ u_{20} & u_{21} & u_{22} & u_{23} \\ u_{30} & u_{31} & u_{32} & u_{33} \end{bmatrix} = A_{ver} \cdot \begin{bmatrix} s_{00} & s_{01} & s_{02} & s_{03} \\ s_{10} & s_{11} & s_{12} & s_{13} \\ s_{20} & s_{21} & s_{22} & s_{23} \\ s_{30} & s_{31} & s_{32} & s_{33} \end{bmatrix} \cdot A_{hor}^T \]

  with $A_{ver}$ being an N×N transform matrix for transforming the columns, and $A_{hor}$ being an N×N transform matrix for transforming the rows
  Inverse transform is also separable: $s' = A_{ver}^T \cdot u' \cdot A_{hor}$

Impact on Complexity
  Requires 2N multiplications per sample (instead of $N^2$ for the non-separable design)
    N = 8: 16 multiplications (instead of 64 multiplications)
    N = 16: 32 multiplications (instead of 256 multiplications)

SLIDE 31

Block Transform / 2D Transforms

Transform in JPEG

\[ u = A_{DCT} \cdot s \cdot A_{DCT}^T \quad \text{and} \quad s' = A_{DCT}^T \cdot u' \cdot A_{DCT} \]

Separable DCT-II for 8×8 blocks
  Successive 1D transforms of rows and columns (both orders yield the same result)
  DCT-II transform matrix $A_{DCT}$ for both transforms

[Figure: original block → horizontal DCT-II → block after horizontal DCT → vertical DCT-II → block after 2D DCT]
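
A minimal sketch of the separable 8×8 transform, written with plain matrix products so that the relation $u = A_{DCT} \cdot s \cdot A_{DCT}^T$ stays visible. The sample block is made up, the helper names are illustrative, and a practical codec would use a fast DCT instead of full matrix multiplications.

```python
import math


def dct2_matrix(n):
    """Orthonormal DCT-II matrix built directly from its definition."""
    return [[(1.0 if k == 0 else math.sqrt(2.0)) / math.sqrt(n)
             * math.cos(math.pi / n * k * (m + 0.5)) for m in range(n)]
            for k in range(n)]


def matmul(left, right):
    """Matrix product for matrices given as lists of rows."""
    columns = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in left]


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def forward_2d(block, a):
    """Separable forward transform: u = A * s * A^T."""
    return matmul(matmul(a, block), transpose(a))


def inverse_2d(coeffs, a):
    """Separable inverse transform: s' = A^T * u' * A."""
    return matmul(matmul(transpose(a), coeffs), a)


A = dct2_matrix(8)
block = [[(x + 2 * y) % 16 for x in range(8)] for y in range(8)]   # made-up 8x8 block
coeffs = forward_2d(block, A)
restored = inverse_2d(coeffs, A)
```

With this normalization the DC coefficient `coeffs[0][0]` equals the sum of all 64 samples divided by 8, that is, a scaled block average.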

SLIDE 32

Block Transform / 2D Transforms

Basis Images of Separable 8×8 DCT-II (used in JPEG)

SLIDE 33

Block Transform / 2D Transforms

Separable 8×8 DCT-II for an Example Image

[Figure: original image (256×256); image after block-wise 8×8 DCT; sorted DCT coefficients]

Transform gain (energy compaction): $G_T = 22.5$ dB

SLIDE 34

Scalar Quantization / Uniform Reconstruction Quantizer

Scalar Quantization of Transform Coefficients

[Figure: pdf f(u) of a transform coefficient with the reconstruction levels 0, ±∆, ±2∆, ±3∆, ±4∆ and the associated quantization indexes q = 0, ±1, ±2, ±3, ±4]

JPEG: Uniform Reconstruction Quantizer
  Reconstruction levels: uniformly spaced and centered around zero (quantization step size ∆)
  Simple decoder operation: $u' = q \cdot \Delta$ (q: quantization index)
  Encoder: freedom to adapt the decision to source and entropy coding
    Simplest encoder: $q = \mathrm{round}(u / \Delta)$
    Better encoder: will be discussed later (rate-distortion optimized quantization)
  Quantization step size ∆ determines the trade-off between quality and bit rate
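
The simplest encoder decision and the decoder operation fit into two small functions. This is a sketch of the rounding variant only; the tie-breaking rule (halves rounded away from zero) is an assumption, since the slides do not specify it.

```python
import math


def quantize(u, delta):
    """Simplest encoder decision: q = round(u / delta), with ties rounded away from zero."""
    magnitude = math.floor(abs(u) / delta + 0.5)
    return int(math.copysign(magnitude, u))


def reconstruct(q, delta):
    """Decoder operation of the uniform reconstruction quantizer: u' = q * delta."""
    return q * delta


delta = 2.0
coefficients = [7.4, -7.4, 0.9, 5.0, -0.2]
indexes = [quantize(u, delta) for u in coefficients]
reconstructed = [reconstruct(q, delta) for q in indexes]
```

With rounding, every reconstruction error is at most ∆/2, so a larger step size trades reconstruction quality for smaller index magnitudes.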

SLIDE 35

Scalar Quantization / Uniform Reconstruction Quantizer

Uniform Reconstruction Quantizer versus Optimal Scalar Quantizer

General scalar quantizers provide more freedom than URQs (optimization of reconstruction levels)
Compare optimal scalar quantizers (ECSQs) and URQs with optimal encoder decision

Gaussian pdf
  [Plot: optimal URQ vs ECSQ; x-axis: bit rate (entropy) [bits per sample]; y-axis: SNR loss [dB]]
  $\Delta \mathrm{SNR} < 0.0063$ dB, $\mathrm{MSE}_{URQ} / \mathrm{MSE}_{opt} < 1.0015$

Laplacian pdf
  [Plot: optimal URQ vs ECSQ; x-axis: bit rate (entropy) [bits per sample]; y-axis: SNR loss [dB]]
  $\Delta \mathrm{SNR} < 0.00081$ dB, $\mathrm{MSE}_{URQ} / \mathrm{MSE}_{opt} < 1.0002$

Restriction to URQs has (typically) a very small impact on coding efficiency

SLIDE 36

Scalar Quantization / Quantization Tables

Quantization Tables

Freedom for Quantization of Transform Blocks
  Unique quantization step size per frequency position
  Can emulate the behavior of contrast sensitivity functions

Quantization Tables in JPEG
  Different step sizes $\Delta_{ik}$ for frequency positions (i, k)
  Need to be transmitted in the image header (no defaults)
  Example tables for the YCbCr format are specified in Annex K (empirically derived based on psychovisual experiments)

Quantization Tables in Practice
  Most JPEG encoders use flat tables $\Delta_{ik} = \Delta$
  Quantization step size ∆ is selected by a quality parameter
  Yields the best coding efficiency in terms of PSNR

Example table for luma blocks

   16   11   10   16   24   40   51   61
   12   12   14   19   26   58   60   55
   14   13   16   24   40   57   69   56
   14   17   22   29   51   87   80   62
   18   22   37   56   68  109  103   77
   24   35   55   64   81  104  113   92
   49   64   78   87  103  121  120  101
   72   92   95   98  112  100  103   99

Example table for chroma blocks

   17   18   24   47   99   99   99   99
   18   21   26   66   99   99   99   99
   24   26   56   99   99   99   99   99
   47   66   99   99   99   99   99   99
   99   99   99   99   99   99   99   99
   99   99   99   99   99   99   99   99
   99   99   99   99   99   99   99   99
   99   99   99   99   99   99   99   99

SLIDE 37

Scalar Quantization / Summary of JPEG Quantization

JPEG: Transform and Quantization

JPEG Encoder
  Select quantization table $\{ \Delta_{ik} \}$
  For each 8×8 block
    Apply separable DCT-II
    Quantize transform coefficients u (simple: $q_{ik} = \mathrm{round}(u_{ik} / \Delta_{ik})$)

JPEG Decoder
  Read quantization table $\{ \Delta_{ik} \}$
  For each 8×8 block
    Reconstruct transform coefficients u′: $u'_{ik} = q_{ik} \cdot \Delta_{ik}$
    Apply separable inverse DCT-II

[Figure: original block s → DCT-II → transform coefficients u → quantization $u' = \Delta \cdot \mathrm{round}(u / \Delta)$ → reconstructed coefficients u′ → IDCT-II → reconstructed block s′]
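
The per-position quantization step can be sketched as a pair of block functions. The table is the example luma table from Annex K quoted earlier in the lecture; the coefficient values are invented, the function names are illustrative, and ties are rounded away from zero as an assumption.

```python
import math

# Example luma quantization table (Annex K), indexed as LUMA_TABLE[i][k].
LUMA_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize_block(coeffs, table):
    """Encoder side: q_ik = round(u_ik / delta_ik), ties rounded away from zero."""
    return [[int(u / d + math.copysign(0.5, u)) for u, d in zip(c_row, t_row)]
            for c_row, t_row in zip(coeffs, table)]


def dequantize_block(indexes, table):
    """Decoder side: u'_ik = q_ik * delta_ik."""
    return [[q * d for q, d in zip(q_row, t_row)]
            for q_row, t_row in zip(indexes, table)]


# Invented coefficient block: a few non-zero values, everything else zero.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[7][7] = -415.4, -30.2, 40.0

indexes = quantize_block(coeffs, LUMA_TABLE)
reconstructed = dequantize_block(indexes, LUMA_TABLE)
```

The high-frequency coefficient at position (7, 7) is quantized to zero because its step size of 99 is more than twice its magnitude.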

SLIDE 38

Entropy Coding / Scanning

Entropy Coding of Quantization Indexes

Scanning of Quantization Indexes
  Decorrelating transform concentrates signal energy into low-frequency transform coefficients (top-left corner)
  Quantization indexes for high-frequency positions (bottom-right) are more likely to be equal to zero
  Scanning: traverse positions from low to high frequencies
  JPEG uses a so-called zig-zag scan

Coding of Quantization Indexes
  Different concepts for the DC coefficient and the AC coefficients
  DC coefficient: prediction + variable-length coding
  AC coefficients: zig-zag scan + run-level coding

[Figure: 8×8 block with the DC position in the top-left corner and the AC positions traversed in zig-zag order]
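
The zig-zag order can be generated by walking the anti-diagonals of the block and alternating the direction. The sketch below returns (row, column) positions; the function names are illustrative.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zig-zag scan order."""
    order = []
    for d in range(2 * n - 1):                     # d = row + column (anti-diagonal)
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        if d % 2 == 0:
            rows = reversed(rows)                  # even diagonals run bottom-left to top-right
        order.extend((row, d - row) for row in rows)
    return order


def zigzag_scan(block):
    """Convert a square block (list of rows) into a 1D sequence in zig-zag order."""
    return [block[row][col] for row, col in zigzag_order(len(block))]


order = zigzag_order(8)
# Raster indexes of the scan positions, convenient for comparing with published tables.
raster = [8 * row + col for row, col in order]
```

The first entry is the DC position (0, 0); the remaining 63 entries are the AC positions from low to high frequencies.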

SLIDE 39

Entropy Coding / DC Coefficient

Entropy Coding of DC Coefficient

Prediction of DC Quantization Index
  DC (direct current) represents the scaled average of the block samples (often similar to the DC of neighboring blocks)
  Prediction using the DC quantization index of the preceding block
  Prediction difference $\mathrm{DIFF} = DC_k - DC_{k-1}$ is entropy coded

[Figure: block k − 1 with $DC_{k-1}$ and block k with $DC_k$; $\mathrm{DIFF} = DC_k - DC_{k-1}$]

Entropy Coding of Difference DIFF
  Combination of variable-length and fixed-length coding
  Variable-length coding of category C
    Specifies range of values for DIFF
    Codeword table has to be transmitted in header
  Fixed-length coding of the value inside category C
    Number of bits = category C

C    range of DIFF                    example codeword
0    0                                00
1    -1, 1                            010
2    -3 .. -2, 2 .. 3                 011
3    -7 .. -4, 4 .. 7                 100
4    -15 .. -8, 8 .. 15               101
5    -31 .. -16, 16 .. 31             110
6    -63 .. -32, 32 .. 63             1110
7    -127 .. -64, 64 .. 127           1111 0
8    -255 .. -128, 128 .. 255         1111 10
9    -511 .. -256, 256 .. 511         1111 110
10   -1023 .. -512, 512 .. 1023       1111 1110
11   -2047 .. -1024, 1024 .. 2047     1111 1111 0
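
The category scheme can be sketched in a few lines. The category of DIFF is the number of bits of its magnitude, and the codeword list is the example table from this slide. How the value inside a category is mapped to bits is not stated on the slide; the sketch assumes the convention of the JPEG standard, where a positive DIFF is written as is and a negative DIFF is written as DIFF + 2^C − 1.

```python
# Example variable-length codewords for the categories 0..11 (from the table on this slide).
DC_CODEWORDS = ["00", "010", "011", "100", "101", "110", "1110",
                "11110", "111110", "1111110", "11111110", "111111110"]


def category(diff):
    """Category C: number of bits needed for the magnitude of diff (0 for diff == 0)."""
    return abs(diff).bit_length()


def encode_dc_diff(diff):
    """Return the bit string for one DC prediction difference."""
    c = category(diff)
    if c == 0:
        return DC_CODEWORDS[0]                     # no additional bits for DIFF = 0
    # Assumed convention: negative values are offset into the range 0 .. 2^(C-1) - 1.
    value = diff if diff > 0 else diff + (1 << c) - 1
    return DC_CODEWORDS[c] + format(value, "0{}b".format(c))


dc_indexes = [52, 55, 50]                          # made-up DC quantization indexes
# Prediction from the preceding block; the first block is predicted from 0 in this sketch.
diffs = [dc - prev for dc, prev in zip(dc_indexes, [0] + dc_indexes[:-1])]
bits = [encode_dc_diff(d) for d in diffs]
```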

SLIDE 40

Entropy Coding / AC Coefficients

Run-Level Coding of AC Coefficients

[Figure: block of quantization indexes with the non-zero values 6, 2, 2, 1 and its zig-zag scan]

Scanned sequence: 6, 2, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0
(run, level) pairs: (0, 6) (0, 2) (2, 2) (3, 1) (eob)

Run-Level Coding of AC Quantization Indexes
  Convert into a sequence using the zig-zag scan
  Represent the scanned sequence as (run, level) pairs (with a special end-of-block symbol)
    run: number of zero indexes before the next non-zero index
    level: value of the next non-zero index
    eob: special (run, level) pair specifying end-of-block (all following indexes are equal to zero)
  Variable-length coding of the (run, category) pair (codeword table includes the eob symbol)
    Category has the same meaning as for the DC quantization index
    Codeword table is transmitted in the image header
  Fixed-length coding of the level inside the category (same as for the DC quantization index)
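
The conversion into (run, level) pairs can be sketched as follows. This is a simplified illustration: it always appends the end-of-block symbol and does not handle the upper limit on the run length that the actual JPEG syntax imposes.

```python
def run_level_pairs(indexes):
    """Convert zig-zag scanned AC quantization indexes into (run, level) pairs plus 'eob'."""
    pairs = []
    run = 0
    # Position of the last non-zero index; everything after it is covered by eob.
    last_nonzero = max((i for i, q in enumerate(indexes) if q != 0), default=-1)
    for q in indexes[:last_nonzero + 1]:
        if q == 0:
            run += 1                   # count zeros preceding the next non-zero index
        else:
            pairs.append((run, q))
            run = 0
    pairs.append("eob")
    return pairs


scanned = [6, 2, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0]
pairs = run_level_pairs(scanned)
```

For the sequence of this slide the result is (0, 6), (0, 2), (2, 2), (3, 1), followed by eob.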

SLIDE 41

Entropy Coding / AC Coefficients

Example Codeword Table for Run-Category Pairs

run/category   codeword
eob            1010
0/1            00
0/2            01
0/3            100
0/4            1011
0/5            11010
0/6            1111000
0/7            11111000
0/8            1111110110
0/9            1111111110000010
0/10           1111111110000011
1/1            1100
1/2            11011
1/3            1111001
1/4            111110110
1/5            11111110110
1/6            1111111110000100
1/7            1111111110000101
1/8            1111111110000110
1/9            1111111110000111
1/10           1111111110001000
2/1            11100
2/2            11111001
2/3            1111110111
2/4            111111110100
2/5            1111111110001001
2/6            1111111110001010
2/7            1111111110001011
2/8            1111111110001100
2/9            1111111110001101
2/10           1111111110001110
3/1            111010
3/2            111110111
3/3            111111110101
3/4            1111111110001111
3/5            1111111110010000
3/6            1111111110010001
3/7            1111111110010010
3/8            1111111110010011
3/9            1111111110010100
3/10           1111111110010101
...            ...

SLIDE 42

Coding Efficiency

JPEG: Coding Efficiency and Visual Quality

Quantization step size (or quantization table) controls the trade-off between quality and bit rate

[Plot: rate-distortion curve for JPEG; x-axis: bit rate [bits per pixel] (0.5 to 2); y-axis: PSNR_RGB [dB] (24 to 38)]

SLIDE 43

Summary

Summary of Lecture

Image Coding Standard “JPEG”
  Separate coding of color components (typically in Y’CbCr 4:2:0 format)
  Transform coding of 8×8 blocks of samples (transform, quantization, entropy coding)

Orthogonal Block Transform
  Separable block transform: successive transformation of rows and columns of a block
  DCT-II: high energy compaction for highly correlated sources

Scalar Quantization
  Uniform Reconstruction Quantizers (possible usage of quantization tables)
  Lossy part: controls the trade-off between reconstruction quality and bit rate

Entropy Coding of Quantization Indexes
  DC coefficient: prediction and variable-length coding
  AC coefficients: zig-zag scan and run-level coding (with end-of-block symbol)
