Transform Coding

Transform Coding - Overview
Principle of block-wise transform coding
Properties of orthonormal transforms
Discrete cosine transform (DCT)
Bit allocation for transform coefficients
Threshold coding
Typical coding artifacts
Fast implementation of the DCT
6-1 Girod: Image and Video Compression

Transform Coding
[Block diagram: original image → original image block → transform $A$ → transform coefficients → quantization & transmission → quantized transform coefficients → inverse transform $A^{-1}$ → reconstructed image block → reconstructed image]

Properties of Orthonormal Transforms
Forward transform: $\vec{y} = A\,\vec{x}$, where $\vec{x}$ is the input signal block of size N×N arranged as a vector of length $N^2$, $A$ is the transform matrix of size $N^2 \times N^2$, and $\vec{y}$ holds the N·N transform coefficients arranged as a vector.
Inverse transform: $\vec{x} = A^{-1}\,\vec{y} = A^{T}\,\vec{y}$
Linearity: $\vec{x}$ is represented as a linear combination of "basis functions".
Parseval's theorem holds: the transform is a rotation of the signal vector around the origin of an $N^2$-dimensional vector space, so the signal energy is preserved.

Separable Orthonormal Transforms, I
An orthonormal transform is separable if the transform of a signal block of size N×N can be expressed by
$$y = A\,x\,A^{T}$$
where $x$ is the N×N block of the input signal, $A$ is an orthonormal transform matrix of size N×N, and $y$ contains the N·N transform coefficients.
Note: the corresponding $N^2 \times N^2$ transform matrix acting on the vectorized block is the Kronecker product $A \otimes A$.
The inverse transform is
$$x = A^{T}\,y\,A$$
Great practical importance: the transform requires 2 matrix multiplications of size N×N instead of one multiplication of a vector of size $1 \times N^2$ with a matrix of size $N^2 \times N^2$. This reduces the complexity from $O(N^4)$ to $O(N^3)$.
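
The equivalence between the separable form and the Kronecker-product form can be checked numerically. The following is a minimal sketch in plain Python, assuming a 2×2 orthonormal matrix and a small test block chosen purely for illustration; the helper names (`matmul`, `kron`, `separable_forward`, `separable_inverse`) are not from the slides.

```python
import math

def matmul(p, q):
    """Multiply two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*m)]

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]

def separable_forward(a, x):
    """y = A x A^T, computed with two N x N matrix multiplications."""
    return matmul(matmul(a, x), transpose(a))

def separable_inverse(a, y):
    """x = A^T y A."""
    return matmul(matmul(transpose(a), y), a)

# Illustrative orthonormal matrix and test block (assumptions, not slide data).
s = 1.0 / math.sqrt(2.0)
A = [[s, s], [s, -s]]
x = [[1.0, 2.0], [3.0, 4.0]]

y = separable_forward(A, x)

# Same coefficients from one N^2 x N^2 matrix applied to the
# row-major vectorized block.
x_vec = [[v] for row in x for v in row]
y_vec = [r[0] for r in matmul(kron(A, A), x_vec)]

# Inverse transform and energy on both sides (Parseval).
x_back = separable_inverse(A, y)
energy_x = sum(v * v for row in x for v in row)
energy_y = sum(v * v for row in y for v in row)
```

For this block, `y` and `y_vec` hold the same four coefficients, `x_back` recovers `x`, and both energies agree up to rounding error.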

Separable Orthonormal Transforms, II
The 2D transform is realized by two one-dimensional transforms (along the columns and rows of the signal block):
N×N block of pixels $x$ → column-wise N-transform → $A\,x$ → row-wise N-transform → $A\,x\,A^{T}$ (N×N block of transform coefficients)

Criteria for the Selection of a Particular Transform
Decorrelation, energy concentration (e.g., KLT, DCT, ...)
Visually pleasant basis functions (e.g., pseudo-random noise, m-sequences, lapped transforms)
Low complexity of computation

Karhunen-Loève Transform (KLT)
The Karhunen-Loève Transform (KLT) yields decorrelated transform coefficients.
Its basis functions are the eigenvectors of the covariance matrix of the input signal.
The KLT achieves optimum energy concentration.
Disadvantages:
  The KLT depends on the signal statistics.
  The KLT is not separable for image blocks.
  The transform matrix cannot be factored into sparse matrices.

Comparison of Various Transforms, I
Karhunen-Loève transform (1948/1960)
Haar transform (1910)
Walsh-Hadamard transform (1923)
Slant transform (Enomoto, Shibata, 1971)
Discrete Cosine Transform (DCT) (Ahmed, Natarajan, Rao, 1974)
[Figure: comparison of 1D basis functions for block size N=8]

Comparison of Various Transforms, II
Energy concentration measured for typical natural images, block size 1x32 (Lohscheller):
  The KLT is optimum.
  The DCT performs only slightly worse than the KLT.

Discrete Cosine Transform and Discrete Fourier Transform
Transform coding of images using the Discrete Fourier Transform (DFT):
For stationary image statistics, the energy concentration properties of the DFT converge to those of the KLT for large block sizes.
Problem of blockwise DFT coding: blocking effects due to the circular topology of the DFT and Gibbs phenomena.
Remedy: reflect the image at the block boundaries and take the DFT of the larger symmetric block -> "DCT"
[Figure labels: edge, folded, pixel folded]

DCT
The Type II DCT of block size M×M is defined by the transform matrix $A$ containing the elements
$$a_{ik} = \alpha_i \cos\frac{\pi\,(2k+1)\,i}{2M}, \qquad i, k = 0, \ldots, M-1$$
with
$$\alpha_0 = \sqrt{\frac{1}{M}}, \qquad \alpha_i = \sqrt{\frac{2}{M}} \quad \forall\, i \neq 0$$
[Figure: 2D basis functions of the DCT]
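
The definition of the matrix elements translates directly into code. The sketch below builds the matrix for M = 8 in plain Python and checks two properties that follow from the definition: the rows are orthonormal, and a constant signal is mapped entirely onto the first (DC) coefficient. The function names `dct_matrix` and `apply` are illustrative choices, not part of the slides.

```python
import math

def dct_matrix(m):
    """Return the M x M Type II DCT matrix with elements a_ik."""
    rows = []
    for i in range(m):
        alpha = math.sqrt(1.0 / m) if i == 0 else math.sqrt(2.0 / m)
        rows.append([alpha * math.cos(math.pi * (2 * k + 1) * i / (2 * m))
                     for k in range(m)])
    return rows

def apply(a, x):
    """Matrix-vector product y = A x."""
    return [sum(a_ik * x_k for a_ik, x_k in zip(row, x)) for row in a]

A = dct_matrix(8)

# Orthonormality check: A A^T should be the identity matrix.
gram = [[sum(p * q for p, q in zip(r1, r2)) for r2 in A] for r1 in A]
max_dev = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
              for i in range(8) for j in range(8))

# A constant signal has all of its energy in the DC coefficient.
y = apply(A, [1.0] * 8)
```

Here `max_dev` is on the order of floating-point rounding error, `y[0]` equals the square root of 8, and the remaining entries of `y` vanish up to rounding.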

Bit Allocation for Transform Coefficients I
Problem: divide the bit rate $R$ among the M×M transform coefficients such that the resulting distortion $D$ is minimized.
Assumptions:
$$R = \sum_i R_i \qquad D = \sum_i D_i$$
where $R$ is the total rate, $R_i$ the rate for coefficient $i$, $D_i$ the distortion contributed by coefficient $i$, and $D$ the total distortion. These assumptions lead to the "Pareto condition"
$$\frac{\partial D_i}{\partial R_i} = \frac{\partial D_j}{\partial R_j} \quad \text{for all } i, j$$
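
One standard way to see where this condition comes from is a Lagrange-multiplier argument. The sketch below assumes that each $D_i$ depends only on its own rate $R_i$ and is differentiable, and it ignores the constraint $R_i \geq 0$. The multiplier $\lambda$ is introduced here for illustration and does not appear on the slide.

```latex
\begin{align}
J &= \sum_i D_i(R_i) + \lambda \Big( \sum_i R_i - R \Big) \\
\frac{\partial J}{\partial R_i} &= \frac{\partial D_i}{\partial R_i} + \lambda = 0
\;\Longrightarrow\;
\frac{\partial D_i}{\partial R_i} = -\lambda \quad \text{for all } i
\end{align}
```

Since the same $\lambda$ applies to every coefficient, all slopes $\partial D_i / \partial R_i$ must be equal at the optimum.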

Bit Allocation for Transform Coefficients II
The additional assumptions "Gaussian r.v." and mse distortion yield the optimum rate for each transform coefficient $i$:
$$R_i = \max\left[\frac{1}{2}\log_2\frac{\sigma_i^2}{D},\; 0\right] \text{ bit}$$
where $\sigma_i^2$ is the variance of transform coefficient $i$ and $D$ is the maximum acceptable mean squared error.
The literature contains many practical bit allocation schemes that are based on this insight.
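
As a small numerical illustration of the rate formula, the sketch below evaluates it for a handful of coefficient variances. The variances and the target distortion are hypothetical values picked so that the logarithms come out as whole numbers; they are not measurements from the slides, and `optimum_rates` is an illustrative name.

```python
import math

def optimum_rates(variances, d):
    """R_i = max(0.5 * log2(sigma_i^2 / D), 0) in bits per coefficient."""
    if d <= 0:
        raise ValueError("D must be positive")
    return [max(0.5 * math.log2(v / d), 0.0) if v > 0 else 0.0
            for v in variances]

# Hypothetical coefficient variances and target distortion (assumptions).
variances = [256.0, 64.0, 16.0, 4.0, 1.0]
rates = optimum_rates(variances, d=4.0)
print(rates)  # [3.0, 2.0, 1.0, 0.0, 0.0]
```

Coefficients whose variance does not exceed $D$ receive zero bits, which means they are not transmitted at all.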

Amplitude Distribution of the DCT Coefficients
[Figure: histograms of 8x8 DCT coefficient amplitudes measured for natural images (from Mauersberger)]
The DC coefficient is typically uniformly distributed.
For the other coefficients, the distribution resembles a Laplacian pdf.

Threshold Coding, I
Transform coefficients that fall below a threshold are discarded.
Implementation by a uniform quantizer with threshold characteristic.
[Figure: quantizer characteristic, quantizer output vs. quantizer input]
The positions of the non-zero transform coefficients are transmitted in addition to their amplitude values.
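
The exact quantizer characteristic is only given as a figure, so the following is a sketch of one plausible form rather than the quantizer from the slide: coefficients with magnitude below the threshold map to zero, and the rest are rounded to the nearest multiple of the step size. The function names, the step size, the threshold, and the sample coefficients are all assumptions made for illustration.

```python
def threshold_quantize(coeff, step, threshold):
    """Map a coefficient to an integer level; small coefficients become 0."""
    if abs(coeff) < threshold:
        return 0
    level = int(abs(coeff) / step + 0.5)  # round magnitude to nearest level
    return level if coeff > 0 else -level

def dequantize(level, step):
    """Reconstruct the coefficient value from its level."""
    return level * step

# Hypothetical coefficients, step size, and threshold (assumptions).
coeffs = [49, 33, -15, -14, 10, -52]
levels = [threshold_quantize(c, step=16, threshold=16) for c in coeffs]
recon = [dequantize(q, step=16) for q in levels]
print(levels)  # [3, 2, 0, 0, 0, -3]
print(recon)   # [48, 32, 0, 0, 0, -48]
```

Only the positions and amplitudes of the non-zero levels then need to be transmitted.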

Threshold Coding, II
Efficient encoding of the positions of the non-zero transform coefficients: zig-zag scan + run-level coding
[Figure: ordering of the transform coefficients by zig-zag scan]

Threshold Coding, III
Encoder: original 8x8 block → DCT → Q → zig-zag scan → run-level coding → transmission
Original 8x8 block:
  201 195 188 193 169 157 196 190
  193 188 187 201 195 193 213 193
  184 192 180 195 182 151 199 193
  176 172 179 179 152 148 198 183
  196 195 169 171 159 185 218 175
  214 213 205 170 173 185 206 150
  207 205 207 184 180 167 173 160
  198 203 205 186 196 149 159 163
DCT coefficients:
  1480   49   33  -15  -14   33  -38   20
    10  -52   11  -12   16   17  -13  -12
    19   32  -22  -10   22  -20    9    8
    16   10   17   27  -31   12    6   -5
   -30   -6   13  -12    8    4   -3   -3
   -25   16    6  -24    9    3    3    3
    -2   17    4   -6    0   -4   -9    8
     1   -2    6    0    7   -5   -8   -7
Quantized coefficients (after Q):
  185   3   1   1  -3   2  -1   0
    1   1  -1   0  -1   0   0   1
    0   0   1   0  -1   0   0   0
    1   1   0  -1   0   0   0  -1
    0   0   1   0   0   0  -1   0
    0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0   0
Zig-zag scan:
  (185 3 1 0 1 1 1 -1 0 1 0 1 1 0 -3 2 -1 0 0 0 0 0 0 1 -1 -1 0 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 -1 -1 EOB)
Run-level coding:
  Mean of block: 185
  (0,3) (0,1) (1,1) (0,1) (0,1) (0,-1) (1,1) (1,1) (0,1) (1,-3) (0,2) (0,-1) (6,1) (0,-1) (0,-1) (1,-1) (14,1) (9,-1) (0,-1) (EOB)
Decoder: transmission → run-level decoding → inverse zig-zag scan → scaling and inverse DCT → reconstructed 8x8 block
Run-level decoding and the inverse zig-zag scan recover the same zig-zag sequence and the same block of quantized coefficients as shown above.
Reconstructed 8x8 block:
  196 193 187 192 179 176 196 189
  198 188 182 198 196 192 208 200
  185 189 191 197 174 159 184 189
  167 181 182 177 154 153 187 189
  201 199 178 165 163 185 206 179
  220 217 193 176 165 179 197 170
  194 198 195 193 169 156 180 179
  210 196 192 209 185 149 157 160
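
The zig-zag scan and run-level coding steps of this example can be reproduced with a short sketch. It takes the quantized coefficient block of the worked example as input, treats the first scanned value as the block mean, and encodes every following non-zero value as a (run of preceding zeros, level) pair, with the trailing zeros implied by EOB. The function names are illustrative, and the scan order is the common zig-zag pattern, which is assumed here because the slide shows it only as a figure.

```python
def zigzag_order(n):
    """Return (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):          # s = row + col indexes an anti-diagonal
        lo, hi = max(0, s - n + 1), min(s, n - 1)
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        rows = range(lo, hi + 1) if s % 2 else range(hi, lo - 1, -1)
        order.extend((i, s - i) for i in rows)
    return order

def run_level_encode(block):
    """Return (mean, [(run, level), ...]); trailing zeros are implied by EOB."""
    seq = [block[i][j] for i, j in zigzag_order(len(block))]
    pairs, run = [], 0
    for value in seq[1:]:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return seq[0], pairs

def run_level_decode(mean, pairs, n):
    """Rebuild the n x n block from the mean and the (run, level) pairs."""
    seq = [mean]
    for run, level in pairs:
        seq.extend([0] * run)
        seq.append(level)
    seq.extend([0] * (n * n - len(seq)))  # zeros after EOB
    block = [[0] * n for _ in range(n)]
    for value, (i, j) in zip(seq, zigzag_order(n)):
        block[i][j] = value
    return block

# Quantized coefficient block from the worked example.
Q = [
    [185, 3, 1, 1, -3, 2, -1, 0],
    [1, 1, -1, 0, -1, 0, 0, 1],
    [0, 0, 1, 0, -1, 0, 0, 0],
    [1, 1, 0, -1, 0, 0, 0, -1],
    [0, 0, 1, 0, 0, 0, -1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

dc, pairs = run_level_encode(Q)
print(dc)         # 185
print(pairs[:4])  # [(0, 3), (0, 1), (1, 1), (0, 1)]
```

Applied to this block, the sketch yields the 19 pairs listed in the example, and `run_level_decode(dc, pairs, 8)` returns the block `Q` again.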

Detail in a Block vs. DCT Coefficients Transmitted
[Figure: three example blocks, each shown as image block, DCT coefficients of block, quantized DCT coefficients of block, and block reconstructed from quantized DCT coefficients]

Typical DCT Coding Artifacts
DCT coding with increasingly coarse quantization, block size 8x8
[Figure: three coded images with quantizer stepsize for AC coefficients: 25, 100, and 200]

Adaptive Transform Coding
[Block diagram: input signal → transform → quantization → entropy coding; block classification]
Quantization and entropy coding are optimized separately for each class.
Typical classes:
  Blocks without detail
  Horizontal structures
  Vertical structures
  Diagonals
  Textures without preferred orientation

Influence of DCT Block Size
Efficiency as a function of block size N×N, measured for 8 bit quantization in the original domain and equivalent quantization in the transform domain:
$$G = \frac{\text{memoryless entropy of original signal}}{\text{mean entropy of transform coefficients}}$$
Block size 8x8 is a good compromise.
