  1. Compression: Information Theory
     Greg Plaxton
     Theory in Programming Practice, Spring 2005
     Department of Computer Science, University of Texas at Austin

  2. Coding Theory
     • Encoder
       – Input: a message over some finite alphabet such as {0, 1} or {a, ..., z}
       – Output: an encoded message
     • Decoder
       – Input: an encoded message produced by the encoder
       – Output: (a good approximation to) the associated input message
     • Motivation?

  3. Some Applications of Coding Theory
     • Compression
       – Goal: produce a short encoding of the input message
     • Error detection/correction
       – Goal: produce a fault-tolerant encoding of the input message
     • Cryptography
       – Goal: produce an encoding of the input message that can only be decoded by the intended recipient(s) of the message
     • It is desirable for the encoding and decoding algorithms to be efficient in terms of time and space
       – Various tradeoffs are appropriate for different applications

  4. Compression
     • Lossless: the decoder recovers the original input message
     • Lossy: the decoder recovers an approximation to the original input message
     • The application dictates how much loss, if any, we can tolerate
       – Text compression is usually required to be lossless
       – Image/video compression is often lossy
     • We will focus on techniques for lossless compression

  5. Text Compression
     • Practical question: I’m running out of disk space; how much can I compress my files?
     • A (naive?) idea:
       – Any file can be compressed to the empty string: just write a decoder that outputs the file when given the empty string as input!
       – The problem with this approach is that we also need to store the decoder, and the naive decoder (which simply stores the original file in some static data structure within the decoder program) is at least as large as the original file
       – Can this idea be salvaged?

  6. Kolmogorov Complexity
     • In some cases, a large file can be generated by a very small program running on the empty string; e.g., a file containing a list of the first trillion prime numbers (see the sketch after this slide)
     • Your files can be compressed down to the size of the smallest program that (when given the empty string as input) produces them as output
       – How do I figure out this shortest program?
       – Won’t it be time-consuming to write/debug/maintain?
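A minimal sketch of this idea (in Python, not part of the original slides): the program below is only a few lines long, yet run on empty input it prints a list of primes far larger than its own source. The count of 100,000 primes and the trial-division method are illustrative choices only.

```python
# Illustrative sketch: a tiny program whose output is much larger than its source.
def primes(count):
    """Yield the first `count` primes by simple trial division."""
    found = []
    candidate = 2
    while len(found) < count:
        if all(candidate % p != 0 for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1

if __name__ == "__main__":
    for p in primes(100_000):  # 100,000 rather than a trillion, to keep the demo practical
        print(p)
```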

  7. Information Theory
     • May be viewed as providing a practical way to (approximately) carry out the strategy suggested by Kolmogorov complexity
     • Consider a file that you would like to compress
       – Assume that this file can be viewed, to a reasonable degree of approximation, as being drawn from a particular probability distribution (e.g., we will see that this is true of English text)
       – Perhaps many other people have files drawn from this distribution, or from distributions in a similar class
       – If so, a good encoder/decoder pair for that class of distributions may already exist; with luck, it will already be installed on your system

  8. Example: English Text
     • In what sense can we view English text as being (approximately) drawn from a probability distribution?
     • English text is one of the example applications discussed in Shannon’s 1948 paper “A Mathematical Theory of Communication”
       – On page 7 we find a sequence of successively more accurate probabilistic models of English text
       – Claude Shannon (1916–2001) is known as the “father of information theory”

  9. Entropy in Thermodynamics
     • In thermodynamics, entropy is a measure of energy dispersal
       – The more “spread out” the energy of a system is, the higher the entropy
       – A system in which the energy is concentrated at a single point has zero entropy
       – A system in which the energy is uniformly distributed has reached its maximum possible entropy
     • Second law of thermodynamics: the entropy of an isolated system can only increase
       – Bad news: the entropy of the universe can only increase as matter and energy degrade to an ultimate state of inert uniformity
       – Good news: this process is likely to take a while

  10. Entropy in Information Theory (Shannon)
     • A measure of the uncertainty associated with a probability distribution
       – The more “spread out” the distribution is, the higher the entropy
       – A probability distribution in which all of the probability is concentrated on a single outcome has zero entropy
       – For any given set of possible outcomes, the probability distribution with the maximum entropy is the uniform distribution
     • Consider a distribution over a set of n outcomes in which the i-th outcome has probability p_i; Shannon defined the entropy of this distribution as
           H = Σ_i p_i log (1/p_i) = − Σ_i p_i log p_i
     • The logarithm above is normally assumed to be taken base 2, in which case the units of entropy are bits (binary digits); a small sketch computing this quantity follows
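A short sketch of this definition (assumed Python, not from the slides), confirming the two extreme cases above: a point mass has zero entropy, and the uniform distribution attains the maximum log2 n.

```python
import math

def entropy(probs):
    """Shannon entropy H = -sum_i p_i * log2(p_i), in bits.
    Outcomes with p_i == 0 contribute nothing, by the usual convention."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([1.0, 0.0, 0.0]))    # 0.0  (all probability on one outcome)
print(entropy([0.25] * 4))         # 2.0  (uniform over 4 outcomes: log2(4) bits)
print(entropy([0.5, 0.25, 0.25]))  # 1.5  (somewhere in between)
```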

  11. Entropy of an I.I.D. Source
     • Consider a message in which each successive symbol is independently drawn from the same probability distribution over n symbols, where the probability of drawing the i-th symbol is p_i
     • The entropy of such a source is − Σ_i p_i log p_i bits per symbol
     • Example: Shannon’s first-order model of English text yields an entropy of 4.07 bits per symbol (a sketch of such an estimate follows)
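A sketch (assumed, not from the slides) of how a first-order estimate like this is obtained empirically: take the text’s letter frequencies as the p_i and apply the formula above. Run over a large English corpus this comes out near Shannon’s 4.07 bits per symbol; the tiny sample below is only a placeholder.

```python
from collections import Counter
import math

def first_order_entropy(text):
    """Bits per symbol of an i.i.d. source whose symbol probabilities are
    the empirical character frequencies of `text`."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

sample = "the quick brown fox jumps over the lazy dog"  # placeholder, far too short
print(first_order_entropy(sample))
```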

  12. Discrete Markov Process
     • A more general notion of a source
     • Includes as special cases the k-th order processes discussed earlier in connection with Shannon’s modeling of English text (a small k-th order model is sketched below)
     • Closely related to the concept of finite state machines to be discussed later in this course
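As an illustration (my sketch, not from the slides) of a k-th order process: the distribution of the next symbol is conditioned on the preceding k symbols, and sampling from the trained model produces text resembling the training text, in the spirit of Shannon’s approximations.

```python
import random
from collections import defaultdict, Counter

def train(text, k):
    """For each length-k context, count how often each next symbol follows it."""
    model = defaultdict(Counter)
    for i in range(len(text) - k):
        model[text[i:i + k]][text[i + k]] += 1
    return model

def generate(model, seed, length, k):
    """Sample `length` further symbols starting from `seed` (len(seed) >= k)."""
    out = list(seed)
    for _ in range(length):
        counts = model.get("".join(out[-k:]))
        if not counts:
            break  # context never seen in the training text
        symbols, weights = zip(*counts.items())
        out.append(random.choices(symbols, weights=weights)[0])
    return "".join(out)

corpus = "the quick brown fox jumps over the lazy dog "  # placeholder training text
print(generate(train(corpus, k=2), seed="th", length=40, k=2))
```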

  13. Entropy of a Discrete Markov Process
     • Under certain (relatively mild) technical assumptions, for any k > 0 and any X in A^k, where A denotes the set of symbols, the fraction of all length-k sequences in the output that are equal to X converges to a particular number p(X)
     • We may then define H_k as
           H_k = (1/k) Σ_{X ∈ A^k} p(X) log (1/p(X)) = −(1/k) Σ_{X ∈ A^k} p(X) log p(X)
     • Theorem (Shannon): If a given discrete Markov process satisfies the technical assumptions alluded to above, then its entropy is equal to lim_{k→∞} H_k bits per symbol
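A sketch (assumed Python) of estimating H_k empirically: replace p(X) by the observed frequency of each length-k block in a sample of the source’s output. With enough data, and as k grows, the estimate approaches the entropy of the source.

```python
from collections import Counter
import math

def block_entropy_rate(text, k):
    """Empirical H_k = (1/k) * sum over length-k blocks X of p(X) * log2(1/p(X))."""
    blocks = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    total = sum(blocks.values())
    return -sum((c / total) * math.log2(c / total) for c in blocks.values()) / k

sample = "ababababababababababab"  # placeholder output; a real sample would be long
for k in (1, 2, 3):
    print(k, block_entropy_rate(sample, k))  # decreases toward the source entropy
```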

  14. Example: English Text
     • Zero-order approximation: log 27 ≈ 4.75 bits per symbol
     • First-order approximation: 4.07 bits per symbol
     • Second-order approximation: 3.36 bits per symbol
     • Third-order approximation: 2.77 bits per symbol
     • Approximation based on experiments involving humans: 0.6 to 1.3 bits per symbol

  15. Entropy as a Measure of Compressibility
     • Fundamental Theorem for a Noiseless Channel (Shannon): Let a source have entropy H (bits per symbol) and a channel have capacity C (bits per second). Then it is possible to encode the output of the source in such a way as to transmit at the average rate C/H − ε symbols per second, where ε is arbitrarily small. It is not possible to transmit at an average rate greater than C/H.
     • What does this imply regarding how much we can hope to compress a given file containing n symbols, where n is large?
       – Suppose the file content is similar in structure to the output of a source with entropy H
       – Then we cannot hope to encode the file using fewer than about nH bits
       – Furthermore, this bound can be achieved to within an arbitrarily small factor (see the sketch below comparing the bound with a real compressor)
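A sketch comparing the nH bound, computed here from the file’s first-order byte distribution, against what an off-the-shelf compressor achieves (the filename and the use of zlib are illustrative assumptions, not from the slides). A real compressor can beat the first-order figure because it also exploits higher-order structure; the true limit is nH for the entropy H of the better (higher-order) source model.

```python
import math
import zlib
from collections import Counter

def first_order_bound_bits(data: bytes) -> float:
    """n*H, where H is the entropy of the file's empirical byte distribution."""
    counts = Counter(data)
    n = len(data)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return n * h

data = open("example.txt", "rb").read()  # hypothetical input file
print("first-order entropy bound:", first_order_bound_bits(data) / 8, "bytes")
print("zlib (level 9) actual    :", len(zlib.compress(data, 9)), "bytes")
```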
