Information Theory and Coding: Image, Video and Audio Compression

Information Theory and Coding: Image, Video and Audio Compression
Markus Kuhn, Computer Laboratory
Lent 2003, Part II
http://www.cl.cam.ac.uk/Teaching/2002/InfoTheory/

Structure of modern audiovisual communication systems

[Diagram: Signal → Sensor + sampling → Perceptual coding → Entropy coding → Channel coding → Channel (noise added) → Channel decoding → Entropy decoding → Perceptual decoding → Display → Human senses]

The dashed box marks the focus of the main part of this course as taught by Neil Dodgson.

Sampling, aliasing and Nyquist limit

[Figure: frequency axis from −3fs to 3fs showing the alias frequencies i·fs ± f]

A wave $\cos(2\pi t f)$ sampled with frequency $f_s$ cannot be distinguished from $\cos(2\pi t (i f_s \pm f))$ for any $i \in \mathbb{Z}$, therefore ensure $|f| < f_s/2$.

Quantization

Uniform: [Figure: uniform quantization function]

Non-uniform (e.g., logarithmic): [Figure: non-uniform quantization function]
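The aliasing statement on the sampling slide can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper name `sample_cosine` and the values fs = 8000 Hz and f = 1000 Hz are assumptions chosen only for illustration. It samples cos(2πtf) and several candidates cos(2πt(i·fs ± f)) at the instants t = n/fs and compares the sample sequences.

```python
import math

def sample_cosine(freq_hz, sample_rate_hz, num_samples):
    """Sample cos(2*pi*t*f) at the instants t = n / fs, n = 0 .. num_samples-1."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

fs = 8000.0  # sampling frequency in Hz (assumed for this example)
f = 1000.0   # signal frequency in Hz, chosen below fs / 2

baseband = sample_cosine(f, fs, 16)

# Candidate alias frequencies i*fs + f and i*fs - f for i = 1, 2, 3.
alias_freqs = [i * fs + sign * f for i in (1, 2, 3) for sign in (+1, -1)]

# Largest sample-wise difference between any alias and the baseband wave.
max_diff = max(
    abs(a - b)
    for alias_f in alias_freqs
    for a, b in zip(sample_cosine(alias_f, fs, 16), baseband)
)

# Only floating-point rounding error remains: the sample sequences coincide.
print("largest sample difference:", max_diff)
```

Because all of these frequencies produce the same samples, a reconstruction can only be unambiguous if the input is known to satisfy |f| < fs/2, which is why the signal is band-limited before sampling.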

Example for non-uniform quantization: digital telephone network

[Figure: signal voltage against byte value (−128 to 128) for µ-law (US) and A-law (Europe)]

Simple logarithm fails for values ≤ 0 → apply µ-law compression

$$y = \frac{V \log(1 + \mu |x| / V)}{\log(1 + \mu)} \operatorname{sgn}(x)$$

before uniform quantization (µ = 255, V maximum value).

Lloyd's algorithm: finds least-square-optimal non-uniform quantization function for a given probability distribution of sample values.

S.P. Lloyd: Least Squares Quantization in PCM. IEEE Trans. on Information Theory. Vol. 28, March 1982, pp 129–137.

Psychophysics of perception

Sensation limit (SL) = lowest intensity stimulus that can still be perceived

Difference limit (DL) = smallest perceivable stimulus difference at given intensity level

Weber's law

Difference limit ∆φ is proportional to the intensity φ of the stimulus (except for a small correction constant a describing the deviation of experimental results near SL):

$$\Delta\phi = c \cdot (\phi + a)$$

Fechner's scale

Define a perception intensity scale ψ using the sensation limit φ0 as the origin and the respective difference limit ∆φ = c · φ as a unit step. The result is a logarithmic relationship between stimulus intensity and scale value:

$$\psi = \log_c \frac{\phi}{\phi_0}$$

Fechner's scale matches older subjective intensity scales that follow differentiability of stimuli, e.g. the astronomical magnitude numbers for star brightness introduced by Hipparchos (≈ 150 BC).

Stevens' law

A sound that is 20 DL over SL is perceived as more than twice as loud as one that is 10 DL over SL, i.e. Fechner's scale does not describe perceived intensity well. A rational scale attempts to reflect subjective relations perceived between different values of stimulus intensity φ. Stevens observed that such rational scales ψ follow a power law:

$$\psi = k \cdot (\phi - \phi_0)^a$$

Example coefficients a: temperature 1.6, weight 1.45, loudness 0.6, brightness 0.33.

Decibel

Communications engineers love logarithmic units:

→ Quantities often vary over many orders of magnitude → difficult to agree on a common SI prefix
→ Quotient of quantities (amplification/attenuation) usually more interesting than difference
→ Signal strength usefully expressed as field quantity (voltage, current, pressure, etc.) or power, but quadratic relationship between these two ($P = U^2/R = I^2 R$) rather inconvenient
→ Weber/Fechner: perception is logarithmic

Plus: Using magic special-purpose units has its own odd attractions (→ typographers, navigators)

Neper (Np) denotes the natural logarithm of the quotient of a field quantity F and a reference value F0.

Bel (B) denotes the base-10 logarithm of the quotient of a power P and a reference power P0. Common prefix: 10 decibel (dB) = 1 bel.
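Returning to the µ-law compression from the telephone-network slide, y = V · log(1 + µ|x|/V) / log(1 + µ) · sgn(x), the following is a minimal sketch of compression, 8-bit uniform quantization and expansion. It is an illustration under stated assumptions rather than the bit-exact telephone codec: the function names, the normalisation V = 1, the mid-point reconstruction in `quantize_uniform` and the test amplitude 0.01 are all hypothetical choices made here.

```python
import math

MU = 255.0  # value of mu given on the slide

def mu_law_compress(x, v_max=1.0, mu=MU):
    """y = V * log(1 + mu*|x|/V) / log(1 + mu) * sgn(x)."""
    sign = -1.0 if x < 0 else 1.0
    return sign * v_max * math.log1p(mu * abs(x) / v_max) / math.log1p(mu)

def mu_law_expand(y, v_max=1.0, mu=MU):
    """Inverse of mu_law_compress: |x| = V/mu * ((1 + mu)**(|y|/V) - 1)."""
    sign = -1.0 if y < 0 else 1.0
    return sign * (v_max / mu) * math.expm1(abs(y) / v_max * math.log1p(mu))

def quantize_uniform(y, v_max=1.0, levels=256):
    """Uniform quantizer on [-V, V]; returns the mid-point of the chosen cell."""
    step = 2.0 * v_max / levels
    index = min(levels - 1, int((y + v_max) / step))
    return -v_max + (index + 0.5) * step

x = 0.01  # a quiet sample, 1% of full scale (assumed test value)

# Direct 8-bit uniform quantization of the sample.
uniform_error = abs(quantize_uniform(x) - x)

# Compress, quantize uniformly, then expand again.
companded = mu_law_expand(quantize_uniform(mu_law_compress(x)))
companded_error = abs(companded - x)

print("uniform quantization error:  ", uniform_error)
print("mu-law companded error:      ", companded_error)
```

For quiet samples the companded path gives a smaller absolute error than direct uniform quantization at the same number of levels, at the price of a larger error near full scale; this is the trade-off that non-uniform quantization makes.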

Where P is some power and P0 a 0 dB reference power, or F is a field quantity and F0 the reference:

$$10\,\mathrm{dB} \cdot \log_{10} \frac{P}{P_0} = 20\,\mathrm{dB} \cdot \log_{10} \frac{F}{F_0}$$

Common reference values indicated with additional letter after dB:

0 dBW = 1 W
0 dBm = 1 mW = −30 dBW
0 dBµV = 1 µV
0 dB SPL = 20 µPa (sound pressure level)
0 dB SL = perception threshold (sensation level)

3 dB = double power, 6 dB = double pressure/voltage/etc.
10 dB = 10 × power, 20 dB = 10 × pressure/voltage/etc.

RGB video colour coordinates

Hardware interface (VGA): red, green, blue signals with 0–0.7 V

Electron-beam current and photon count of cathode-ray display are proportional to $(v - v_0)^\gamma$, where v is the video-interface or screen-grid voltage and γ is usually in the range 1.5–3.0. CRT non-linearity is compensated electronically in TV cameras and approximates Stevens' scale.

Software interfaces map RGB voltage linearly to {0, 1, ..., 255} or 0–1

Mapping of numeric RGB values to colour and luminosity is at present still highly hardware and sometimes even operating-system or device-driver dependent.

New specification "sRGB" aims to fix meaning of RGB with γ = 2.2 and standard primary colour coordinates.

http://www.w3.org/Graphics/Color/sRGB
http://www.srgb.com/
IEC 61966

YCrCb video colour coordinates

Human eye processes colour and luminosity at different resolutions, therefore use colour space with luminance coordinate

$$Y = 0.3R + 0.6G + 0.1B$$

and colour components

$$V = R - Y = 0.7R - 0.6G - 0.1B$$
$$U = B - Y = -0.3R - 0.6G + 0.9B$$

Since −0.7 ≤ V ≤ 0.7 and −0.9 ≤ U ≤ 0.9, a more convenient normalized encoding of chrominance is:

$$Cb = \frac{U}{2.0} + 0.5$$
$$Cr = \frac{V}{1.6} + 0.5$$

Modern image compression techniques operate on Y, Cr, Cb channels separately, using half the resolution of Y for storing Cr, Cb.

Correlation of neighbour pixels

[Figure: four scatter plots of the values of neighbour pixels at distance 1, 2, 4 and 8; both axes run from 0 to 250]
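The YCrCb coordinates can be written out as a short conversion routine. This is a minimal sketch that uses only the rounded coefficients from the slide (Y = 0.3R + 0.6G + 0.1B, Cb = U/2.0 + 0.5, Cr = V/1.6 + 0.5) with R, G, B assumed to lie in the range 0–1; the function names are hypothetical, and real video standards use more precise coefficients than these.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert R, G, B in [0, 1] to (Y, Cb, Cr) with the slide's coefficients."""
    y = 0.3 * r + 0.6 * g + 0.1 * b   # luminance
    u = b - y                          # blue colour difference, in [-0.9, 0.9]
    v = r - y                          # red colour difference, in [-0.7, 0.7]
    cb = u / 2.0 + 0.5                 # normalised to lie within [0, 1]
    cr = v / 1.6 + 0.5
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Invert rgb_to_ycbcr by undoing each step in reverse order."""
    u = (cb - 0.5) * 2.0
    v = (cr - 0.5) * 1.6
    r = v + y
    b = u + y
    g = (y - 0.3 * r - 0.1 * b) / 0.6  # solve the luminance equation for G
    return r, g, b

for name, rgb in [("grey", (0.5, 0.5, 0.5)),
                  ("red", (1.0, 0.0, 0.0)),
                  ("blue", (0.0, 0.0, 1.0))]:
    ycc = rgb_to_ycbcr(*rgb)
    back = ycbcr_to_rgb(*ycc)
    print(name, "Y/Cb/Cr =", ycc, "round trip =", back)
```

Any grey value maps to Cb = Cr = 0.5, so all brightness information of a colourless image ends up in Y. That separation is what allows Cb and Cr to be stored at reduced resolution.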

Karhunen-Loève transform (KLT)

Two random variables x, y are not correlated if their covariance

$$\operatorname{cov}(x, y) = E\{(x - E\{x\}) \cdot (y - E\{y\})\} = 0.$$

Take an image (or in practice a small 8 × 8 pixel block) as a random-variable vector b. The components of a random-variable vector $\mathbf{b} = (b_1, \ldots, b_k)$ are decorrelated if the covariance matrix cov(b) with

$$(\operatorname{cov}(\mathbf{b}))_{i,j} = E\{(b_i - E\{b_i\}) \cdot (b_j - E\{b_j\})\} = \operatorname{cov}(b_i, b_j)$$

is a diagonal matrix. The Karhunen-Loève transform of b is the matrix A with which cov(Ab) is diagonal.

Since cov(b) is symmetric, its eigenvectors are orthogonal. Using these eigenvectors as the rows of A and the corresponding eigenvalues as the diagonal elements of the diagonal matrix D, we obtain the decomposition $\operatorname{cov}(\mathbf{b}) = A^T D A$, and therefore $\operatorname{cov}(A\mathbf{b}) = D$.

The Karhunen-Loève transform is the orthogonal matrix of the singular-value decomposition of the covariance matrix of its input.

Discrete cosine transform (DCT)

The forward and inverse discrete cosine transform

$$S(u) = \frac{C(u)}{\sqrt{N/2}} \sum_{x=0}^{N-1} s(x) \cos\frac{(2x+1)u\pi}{2N}$$

$$s(x) = \sum_{u=0}^{N-1} \frac{C(u)}{\sqrt{N/2}} S(u) \cos\frac{(2x+1)u\pi}{2N}$$

with

$$C(u) = \begin{cases} \frac{1}{\sqrt{2}} & u = 0 \\ 1 & u > 0 \end{cases}$$

is an orthonormal transform:

$$\sum_{x=0}^{N-1} \frac{C(u)}{\sqrt{N/2}} \cos\frac{(2x+1)u\pi}{2N} \cdot \frac{C(u')}{\sqrt{N/2}} \cos\frac{(2x+1)u'\pi}{2N} = \begin{cases} 1 & u = u' \\ 0 & u \neq u' \end{cases}$$

The 2-dimensional variant of the DCT applies the 1-D transform on both rows and columns of an image:

$$S(u, v) = \frac{C(u)}{\sqrt{N/2}} \frac{C(v)}{\sqrt{N/2}} \sum_{y=0}^{N-1} \sum_{x=0}^{N-1} s(y, x) \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N}$$

Breakthrough:

Ahmed/Natarajan/Rao discovered the DCT as an excellent approximation of the KLT for typical photographic images, but far more efficient to calculate.

Ahmed, Natarajan, Rao: Discrete Cosine Transform. IEEE Transactions on Computers, Vol. 23, January 1974, pp. 90–93.

A range of fast algorithms have been found for calculating 1-D and 2-D DCTs (e.g., Ligtenberg/Vetterli).

Whole-image DCT

[Figure: original image and its 2D discrete cosine transform (log10 scale)]
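The DCT definitions can be turned into a direct, unoptimised implementation. The following is a minimal sketch: it evaluates S(u) = C(u)/√(N/2) · Σ s(x) cos((2x+1)uπ/2N) and its inverse literally in O(N²) time rather than using one of the fast algorithms mentioned in the slides, the function names are hypothetical, and the eight sample values are made up for the example.

```python
import math

def c(u):
    """Normalisation factor C(u): 1/sqrt(2) for u = 0, otherwise 1."""
    return 1.0 / math.sqrt(2.0) if u == 0 else 1.0

def dct_1d(s):
    """Forward DCT of a sequence s(0..N-1), evaluated directly."""
    n = len(s)
    return [c(u) / math.sqrt(n / 2.0)
            * sum(s[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                  for x in range(n))
            for u in range(n)]

def idct_1d(coeffs):
    """Inverse DCT, reconstructing s(0..N-1) from S(0..N-1)."""
    n = len(coeffs)
    return [sum(c(u) / math.sqrt(n / 2.0) * coeffs[u]
                * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]

def dct_2d(block):
    """2-D DCT: 1-D DCT along every row, then along every column.

    block[y][x] holds s(y, x); the result is indexed result[v][u] = S(u, v).
    """
    rows = [dct_1d(row) for row in block]          # rows[y][u]
    cols = [dct_1d(col) for col in zip(*rows)]     # cols[u][v]
    return [list(r) for r in zip(*cols)]           # result[v][u]

samples = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, 2.5, 7.0]  # made-up test data
coeffs = dct_1d(samples)
restored = idct_1d(coeffs)

# An orthonormal transform is invertible and preserves the signal energy.
print("round-trip error:", max(abs(a - b) for a, b in zip(samples, restored)))
print("energy before:", sum(v * v for v in samples))
print("energy after: ", sum(v * v for v in coeffs))

# A constant 8 x 8 block has all of its energy in the single DC coefficient.
flat = dct_2d([[1.0] * 8 for _ in range(8)])
print("DC coefficient of a constant block:", flat[0][0])
```

Applying `dct_1d` first to rows and then to columns works because the 2-D basis functions are products of 1-D cosines, so the double sum separates.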

Whole-image DCT, 80% coefficient cutoff

[Figure: 80% truncated 2D DCT (log10 scale) and the image reconstructed from it]

Whole-image DCT, 90% coefficient cutoff

[Figure: 90% truncated 2D DCT (log10 scale) and the image reconstructed from it]

Whole-image DCT, 95% coefficient cutoff

[Figure: 95% truncated 2D DCT (log10 scale) and the image reconstructed from it]

Whole-image DCT, 99% coefficient cutoff

[Figure: 99% truncated 2D DCT (log10 scale) and the image reconstructed from it]
