Huffman Codes Idea: The straightforward coding an alphabet A of - PowerPoint PPT Presentation

Huffman Codes Idea: • The straightforward coding an alphabet A of ASize characters into binary is to associate each character a string of  log 2 ( ASize )  bits. • For example, the standard ASCII character set associates with each of the 128 characters ( ASize =128) a string of 7 bits. • However, any method for encoding characters as variable length bit strings may be acceptable in practice as long as strings are always uniquely decodable ; that is, it is always possible to determine where the code for one character ends and the code for the next one begins. - 1 -

Prefix Codes A sufficient (but not necessary) condition for an encoding method to be uniquely decodable is that the code for one character is never a prefix of a code for another; such codes are called prefix codes . For example, items {A, B, C, D, E} could be represented by the following trie: a c A = a b B = ba A F C = bb a c b D = bca B C a E = bcb b F = c D E - 2 -

Here, we will be interested in tries that represent binary prefix codes , and where all internal vertices have two children. For example, items {A, B, C, D, E, F} could be represented by the following trie: - 3 -

Binary prefix code encoding algorithm: initialize a stack to be empty Read a data item ("a character") c . p = the leaf corresponding to c while p is not the root do begin Push on to the stack the bit that connects p to its parent. p = PARENT( p ) end while the stack is not empty do write POP Binary prefix code decoding algorithm: p = the root while p is not a leaf do begin Read a bit b . p = the child of p corresponding to b end write the character corresponding to p - 4 -

Huffman Tries A trie is a Huffman trie for coding with an alphabet B of BSize ≥ 2 characters if: a. All edges are labelled by a character of B . b. Each non-leaf vertex has exactly BSize children. c. Positive weights (probabilities) are associated with each leaf such that the sum of these weights is 1. d. The sum over all the leaves of the leaf's depth times its weight is minimum among all tries satisfying conditions a, b, and c, with the same collection of leaf weights. Notes : • We will limit our attention for the moment to encoding in binary ( B = {0,1} and BSize =2), which reflects the vast majority of all practical applications. • When drawing Huffman tries, we adopt the convention that the edge to the left leaf of a vertex is labelled 0 and the edge to its right leaf is labelled 1. - 5 -

*** Huffman tries have become so well known that prefix codes in general are often referred to as Huffman codes, even when they are not optimal. Will often refer to a Huffman trie as an optimal Huffman trie if we want to add emphasis. - 6 -

Some examples for ASize =8 : �� - 7 -

Building an optimal Huffman trie: If an internal vertex v is labelled with the sum w of the weights of all of its descandant leaves, to the parent of v , v "looks" just like a leaf of weight w. Use a "bottom-up" greedy algorithm that starts with each leaf being a separate tree in the forest, and then repeatedly combines the two trees of lowest weight until only one tree remains. Huffman trie construction algorithm: Initialize a FOREST of single-vertex binary tries, one for each weight. while the FOREST has more than one tree do begin Let X and Y be tries in the FOREST of lowest weight (ties broken arbitrarily). Create a new root r and make the roots of X and Y the children of r . Define the weight of r to be the sum of the weights of the roots of X and Y . end - 8 -

Higher Order Huffman Codes It may be that the probability distribution for the next character is highly dependent on the previous characters. For example, if we have just received the charaters elephan in a stream of English text, it is more likely that the next character will be t than s, but if we had just received the characters hippopotamu , s would be more likely than t . One way to perform better on data with higher order dependicies is to encode more than one chararacter at a time. For example, to compress English text, if the alphabet was taken to be all possible pairs of ASCII characters, then a Huffman trie of 128 ∗ 128 = 16,384 leaves could be constructed based on well known statistics for English bi-grams. In general, a Huffman trie for blocks of m characters for any m ≥ 1 can be employed, but at a cost of O ( ASize m ) space. A slightly more "graceful" approach based on m th order statistics is to have a different Huffman trie for each possible string of m –1 characters. That is, use the previous m –1 characters to access one of ASize m –1 Huffman tries, each using O ( ASize ) memory. - 9 -

Adaptive Huffman Codes The first n characters of an input file define a probability distribution; that is if a given character c occurrs m times in the first n characters, its probability can be defined to be m / n . So it is natural to consider coding a string where after each character is encoded by the encoder and decoded by the decoder, the encoder and decoder compute a new Huffman trie, based on the probability distribution of the characters seen thus far. In theory, if we assume the alphabet size is a constant independant of the number of characters encoded and decoded, then a simple linear time adaptive Huffman encoding and decoding algorithm is to run the Huffman trie construction algorithm after each character processed. However, it is possible to much better than this in practice by making only small modification to an optimal Huffman trie based on the first i characters to obtain an optimal Huffman trie for the first i characters. - 10 -

Huffman Codes Idea: The straightforward coding an alphabet A of - PowerPoint PPT Presentation

Huffman Codes Idea: The straightforward coding an alphabet A of ASize characters into binary is to associate each character a string of log 2 ( ASize ) bits. For example, the standard ASCII character set associates with each of the

Huffman Coding Variable Rate Codes Example: David A. Huffman (1951) Huffman coding uses

Objectives Review Huffman Codes Introducing Divide and Conquer Algorithms March 6, 2019

Welcome to M2 SCCI 2014-2015 Promotion David Albert Huffman David A. Huffman(1925-1999) [Photo:

1 Data structures for decoder: Construction of canonical Huffman: (sketch) The array

PTAS for Huffman coding with unequal letter costs Mordecai Golin (HKUST), Claire Mathieu (Brown)

Building Codes Building Codes Building Codes Building Codes 1 1 Builder Responsibilities

ECEN 5682 Theory and Practice of Error Control Codes Cyclic Codes Peter Mathys University of

Formal Modeling in Cognitive Science Source Codes Lecture 30: Codes; Kraft Inequality; Source

CODES FOR ALL SEASONS Emina Soljanin, Bell Labs IN THE CLOUD? CODES Emina @ Bell Labs Codes at

G ENERALIZED R EED -S OLOMON CODES (GRS CODES ) A CHARACTERIZATION OF MDS CODES THAT HAVE AN ERROR

Lattices from Codes or Codes from Lattices Amin Sakzad Dept of Electrical and Computer Systems

Error-Correcting codes: Application of convolutional codes to Video Streaming Diego Napp

Information Theory Lecture 8 BCH codes BCH codes: R8.45 (R5.6) Decoding BCH (and

SITE REPORT: THE FRANCIS CRICK INSTITUTE ADAM HUFFMAN Senior HPC and Cloud Systems Engineer

Huffman Coding with Gap Arrays for GPU Acceleration Naoya Yamamoto, Koji Nakano, Yasuaki Ito,

THE MARRIAGE OF CLOUD, HPC AND CONTAINERS ...AND SERVERLESS? ADAM HUFFMAN Senior HPC and Cloud

NPTEL VIDEO COURSES (672) IN SUPPLEMENTARY FORMATS PDF Slides of MP4, Audio Lectures (MP3),

Automated Analysis of Wireless Communication Protocols via SDR Bachelor thesis colloquium Roman

Section 8: Carrier Modulated Transmission Convert binary data into form suited to channel

Formal System Design Process with UML Use a formal process & tools to facilitate and

Huffman Encoding 13-Oct-11 Entropy Entropy is a measure of information content: the number of

Compression Programs File Compression: Gzip, Bzip Archivers :Arc, Pkzip, Winrar,

4. Source Encoding Methods Called also entropy coders , because the methods try to get

4.4. Arithmetic coding Advantages: Reaches the entropy (within computing precision)

Huffman Codes Idea: The straightforward coding an alphabet A of - PowerPoint PPT Presentation

Huffman Codes Idea: The straightforward coding an alphabet A of ASize characters into binary is to associate each character a string of log 2 ( ASize ) bits. For example, the standard ASCII character set associates with each of the

Huffman Coding Variable Rate Codes Example: David A. Huffman (1951) Huffman coding uses

Objectives Review Huffman Codes Introducing Divide and Conquer Algorithms March 6, 2019

Welcome to M2 SCCI 2014-2015 Promotion David Albert Huffman David A. Huffman(1925-1999) [Photo:

1 Data structures for decoder: Construction of canonical Huffman: (sketch) The array

PTAS for Huffman coding with unequal letter costs Mordecai Golin (HKUST), Claire Mathieu (Brown)

Building Codes Building Codes Building Codes Building Codes 1 1 Builder Responsibilities

ECEN 5682 Theory and Practice of Error Control Codes Cyclic Codes Peter Mathys University of

Formal Modeling in Cognitive Science Source Codes Lecture 30: Codes; Kraft Inequality; Source

CODES FOR ALL SEASONS Emina Soljanin, Bell Labs IN THE CLOUD? CODES Emina @ Bell Labs Codes at

G ENERALIZED R EED -S OLOMON CODES (GRS CODES ) A CHARACTERIZATION OF MDS CODES THAT HAVE AN ERROR

Lattices from Codes or Codes from Lattices Amin Sakzad Dept of Electrical and Computer Systems

Error-Correcting codes: Application of convolutional codes to Video Streaming Diego Napp

Information Theory Lecture 8 BCH codes BCH codes: R8.45 (R5.6) Decoding BCH (and

SITE REPORT: THE FRANCIS CRICK INSTITUTE ADAM HUFFMAN Senior HPC and Cloud Systems Engineer

Huffman Coding with Gap Arrays for GPU Acceleration Naoya Yamamoto, Koji Nakano, Yasuaki Ito,

THE MARRIAGE OF CLOUD, HPC AND CONTAINERS ...AND SERVERLESS? ADAM HUFFMAN Senior HPC and Cloud

NPTEL VIDEO COURSES (672) IN SUPPLEMENTARY FORMATS PDF Slides of MP4, Audio Lectures (MP3),

Automated Analysis of Wireless Communication Protocols via SDR Bachelor thesis colloquium Roman

Section 8: Carrier Modulated Transmission Convert binary data into form suited to channel

Formal System Design Process with UML Use a formal process &amp; tools to facilitate and

Huffman Encoding 13-Oct-11 Entropy Entropy is a measure of information content: the number of

Compression Programs File Compression: Gzip, Bzip Archivers :Arc, Pkzip, Winrar,

4. Source Encoding Methods Called also entropy coders , because the methods try to get

4.4. Arithmetic coding Advantages: Reaches the entropy (within computing precision)

Formal System Design Process with UML Use a formal process & tools to facilitate and