
Huffman Coding

  • David A. Huffman (1951)
  • Huffman coding uses frequencies of symbols in a string to build a variable rate prefix code
  • Each symbol is mapped to a binary string
  • More frequent symbols have shorter codes
  • No code is a prefix of another

  • Example:

A → 0; B → 100; C → 101; D → 11

[Figure: Huffman tree with leaves A, B, C, D]

Variable Rate Codes

Example:

1) A → 00; B → 01; C → 10; D → 11
2) A → 0; B → 100; C → 101; D → 11


Two different encodings of AABDDCAA

1) 0000011111100000 (16 bits)
2) 00100111110100 (14 bits)
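For illustration, here is a minimal sketch that reproduces both encodings (the code tables are from the slide; the helper name is my own):

    def encode(text, table):
        # Concatenate the code words for each symbol
        return "".join(table[c] for c in text)

    fixed    = {"A": "00", "B": "01", "C": "10", "D": "11"}
    variable = {"A": "0",  "B": "100", "C": "101", "D": "11"}

    s = "AABDDCAA"
    print(encode(s, fixed), len(encode(s, fixed)))        # 0000011111100000 16
    print(encode(s, variable), len(encode(s, variable)))  # 00100111110100 14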

Cost of Huffman Trees

Let A = {a1, a2, .., am} be the alphabet in which each symbol ai has probability pi.

We can define the cost of the Huffman tree HT as:

    C(HT) = Σi=1..m pi·ri,

where ri is the length of the path from the root to ai.

The cost C(HT) is the expected length (in bits) of a code word represented by the tree HT. The value of C(HT) is called the bit rate of the code.

Cost of Huffman Trees - example

Example:

Let a1=A, p1=1/2; a2=B, p2=1/8; a3=C, p3=1/8; a4=D, p4=1/4

with code word lengths r1=1, r2=3, r3=3, and r4=2:

[Figure: Huffman tree HT with code A → 0, B → 100, C → 101, D → 11]

C(HT) = 1·1/2 + 3·1/8 + 3·1/8 + 2·1/4 = 1.75
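A one-line check of this computation (a sketch; the variable names are my own):

    probs  = {"A": 1/2, "B": 1/8, "C": 1/8, "D": 1/4}
    depths = {"A": 1, "B": 3, "C": 3, "D": 2}          # r_i: path lengths from the root
    print(sum(probs[s] * depths[s] for s in probs))    # 1.75 bits per symbol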


Huffman Tree Property

Input: Given probabilities p1, p2, .., pm for symbols a1, a2, .., am from alphabet A

Output: A tree that minimizes the average number of bits (bit rate) to code a symbol from A

I.e., the goal is to minimize the function C(HT) = Σi=1..m pi·ri, where ri is the length of the path from the root to leaf ai. This is called a Huffman tree or Huffman code for alphabet A


Construction of Huffman Trees

Form a (tree) node for each symbol ai with weight pi
Insert all nodes into a priority queue PQ (e.g., a heap), ordered by the nodes' probabilities

while (the priority queue has more than one node)

  • min1 ← remove-min(PQ); min2 ← remove-min(PQ);
  • create a new (tree) node T;
  • T.weight ← min1.weight + min2.weight;
  • T.left ← min1; T.right ← min2;
  • insert(PQ, T)

return (last node in PQ)
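This pseudocode translates almost directly into Python. Below is a minimal sketch using the standard heapq module (the function and variable names are my own, not from the slides):

    import heapq
    import itertools

    def huffman_tree(probabilities):
        # Build a Huffman tree from {symbol: probability}; returns the root.
        # Leaves are 1-tuples (symbol,); internal nodes are pairs (left, right).
        counter = itertools.count()   # tie-breaker so equal weights never compare nodes
        pq = [(p, next(counter), (sym,)) for sym, p in probabilities.items()]
        heapq.heapify(pq)
        while len(pq) > 1:
            w1, _, min1 = heapq.heappop(pq)   # min1 <- remove-min(PQ)
            w2, _, min2 = heapq.heappop(pq)   # min2 <- remove-min(PQ)
            # T.weight <- min1.weight + min2.weight; T.left <- min1; T.right <- min2
            heapq.heappush(pq, (w1 + w2, next(counter), (min1, min2)))
        return pq[0][2]                       # last node in PQ

    def codes(tree, prefix=""):
        # Read the code off the tree: left edge = 0, right edge = 1
        if len(tree) == 1:                    # leaf: (symbol,)
            return {tree[0]: prefix or "0"}
        left, right = tree
        table = codes(left, prefix + "0")
        table.update(codes(right, prefix + "1"))
        return table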

Construction of Huffman Trees

P(A)= 0.4, P(B)= 0.1, P(C)= 0.3, P(D)= 0.1, P(E)= 0.1

[Figure: initial nodes A (0.4), B (0.1), C (0.3), D (0.1), E (0.1); the first step merges the two smallest, D and E, into a node of weight 0.2]


Construction of Huffman Trees

[Figure: B (0.1) and the (D,E) node (0.2) are merged into a node of weight 0.3; A (0.4) and C (0.3) remain]

Construction of Huffman Trees

[Figure: C (0.3) and the B-(D,E) node (0.3) are merged into a node of weight 0.6; A (0.4) remains]

Construction of Huffman Trees

[Figure: A (0.4) and the 0.6 node are merged into the root of weight 1.0, completing the tree]

Construction of Huffman Trees

[Figure: final Huffman tree; labeling left edges 0 and right edges 1 gives the code]

A = 0, B = 100, C = 11, D = 1010, E = 1011
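Assuming the huffman_tree/codes sketch above, the same example can be run directly:

    probs = {"A": 0.4, "B": 0.1, "C": 0.3, "D": 0.1, "E": 0.1}
    table = codes(huffman_tree(probs))
    # Heap tie-breaking may swap 0/1 labels or equal-weight subtrees, so the
    # exact bit strings (and which of B, D, E gets the length-3 word) can
    # differ from the slides, but the multiset of code word lengths, and
    # hence the bit rate, stays the same.
    print(sorted(len(c) for c in table.values()))   # [1, 2, 3, 4, 4]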


Huffman Codes

Theorem: For any source S the Huffman code can be computed efficiently in time O(n·log n), where n is the size of the source S.

Proof: The time complexity of the Huffman coding algorithm is dominated by the use of priority queues: the loop performs O(n) merge steps, and each remove-min and insert costs O(log n).

One can also prove that Huffman coding creates the most efficient set of prefix codes for a given text.

It is also one of the most efficient entropy coders.

Basics of Information Theory

The entropy of an information source (string) S built over alphabet A = {a1, a2, .., am} is defined as:

    H(S) = Σi pi·log2(1/pi),

where pi is the probability that symbol ai will occur in S.

log2(1/pi) indicates the amount of information contained in ai, i.e., the number of bits needed to code ai.

For example, in an image with a uniform distribution of gray-level intensities, i.e., all pi = 1/256, the number of bits needed to encode each gray level is 8. The entropy of this image is 8.

Huffman Code vs. Entropy

P(A) = 0.4, P(B) = 0.1, P(C) = 0.3, P(D) = 0.1, P(E) = 0.1

Entropy:

    0.4·log2(10/4) + 0.1·log2(10) + 0.3·log2(10/3) + 0.1·log2(10) + 0.1·log2(10) = 2.05 bits per symbol

Huffman Code:

    0.4·1 + 0.1·3 + 0.3·2 + 0.1·4 + 0.1·4 = 2.10 bits per symbol

Not bad, not bad at all.
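These two numbers are easy to verify (a sketch, not from the slides):

    import math

    probs   = {"A": 0.4, "B": 0.1, "C": 0.3, "D": 0.1, "E": 0.1}
    lengths = {"A": 1, "B": 3, "C": 2, "D": 4, "E": 4}   # code word lengths from the tree

    entropy  = sum(p * math.log2(1 / p) for p in probs.values())
    bit_rate = sum(probs[s] * lengths[s] for s in probs)
    print(f"entropy  = {entropy:.2f} bits/symbol")   # 2.05
    print(f"bit rate = {bit_rate:.2f} bits/symbol")  # 2.10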

Error detection and correction

Hamming codes:

  • codewords in Hamming (error detecting and error correcting) codes consist of m data bits and r redundant bits.
  • the Hamming distance between two strings is the number of bit positions in which the two bit patterns differ (similar to pattern matching with k mismatches; see the sketch below).
  • the Hamming distance of the code is determined by the two codewords whose Hamming distance is the smallest.
  • error detection involves determining whether codewords in the received message match legal codewords closely enough.
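A minimal sketch of both notions (the function names are my own):

    from itertools import combinations

    def hamming_distance(x, y):
        # Number of bit positions in which two equal-length strings differ
        assert len(x) == len(y)
        return sum(a != b for a, b in zip(x, y))

    def code_distance(codewords):
        # The Hamming distance of a code: the smallest pairwise distance
        return min(hamming_distance(u, v) for u, v in combinations(codewords, 2))

    print(hamming_distance("10110", "10011"))        # 2
    print(code_distance(["0000", "0111", "1011"]))   # 2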


Error detection and correction

[Figure: (a) a code with poor distance properties; (b) a code with good distance properties. x = codewords, • = non-codewords; the code distance is the smallest gap between two codewords]

Error detection and correction

To properly detect d single bit errors, one needs to apply a code of distance d+1.

To properly correct d single bit errors, one needs to apply a code of distance 2d+1.

In general, the price for redundant bits is too expensive (!!) to do error correction for all network messages. Thus the safety and integrity of network communication is based on error detecting codes and extra transmissions in case any errors are detected.

Error-Detection System using Check Bits

[Figure: block diagram: the sender calculates check bits from the information bits and sends both over the channel; the receiver recalculates check bits from the received information bits, compares them with the received check bits, and accepts the information if the check bits match]

Cyclic Redundancy Checking (CRC)

Cyclic redundancy check (CRC) is a popular technique for detecting data transmission errors. Transmitted messages are divided into predetermined lengths that are divided by a fixed divisor. According to the calculation, the remainder number is appended onto and sent with the message. When the message is received, the computer recalculates the remainder and compares it to the transmitted remainder. If the numbers do not match, an error is detected.
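As a concrete illustration, here is a minimal sketch of a CRC computed by binary long division over GF(2) (the generator polynomial below is an arbitrary example, not one specified in the slides):

    def crc_remainder(message_bits, generator_bits):
        # Append zeros and divide by the generator over GF(2); return the remainder
        padded = message_bits + [0] * (len(generator_bits) - 1)
        for i in range(len(message_bits)):
            if padded[i]:   # leading bit set: subtract (XOR) the aligned generator
                for j, g in enumerate(generator_bits):
                    padded[i + j] ^= g
        return padded[-(len(generator_bits) - 1):]

    msg = [1, 0, 1, 1, 0, 1]
    gen = [1, 0, 1, 1]                 # x^3 + x + 1 (example generator)
    check = crc_remainder(msg, gen)    # sender appends this remainder
    received = msg + check
    # receiver recalculates the remainder and compares:
    assert crc_remainder(received[:len(msg)], gen) == check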


Error detection -- via parity of subsets of bits

Consider 4-bit words. We add 3 parity bits P2P1P0, each computed on a subset of the bits.

Example data word: D3D2D1D0 = 0110.

    P0 = D3 xor D1 xor D0 = 0 xor 1 xor 0 = 1
    P1 = D3 xor D2 xor D0 = 0 xor 1 xor 0 = 1
    P2 = D3 xor D2 xor D1 = 0 xor 1 xor 1 = 0

Use the word bit arrangement D3D2D1P2D0P1P0 = 0110011; the check bits occupy the power-of-2 slots!
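The parity equations in code form (a sketch; the function name is my own):

    def hamming_encode(d3, d2, d1, d0):
        # Compute the three check bits and arrange the 7-bit word
        p0 = d3 ^ d1 ^ d0
        p1 = d3 ^ d2 ^ d0
        p2 = d3 ^ d2 ^ d1
        # layout D3 D2 D1 P2 D0 P1 P0: check bits sit in power-of-2 slots (4, 2, 1)
        return [d3, d2, d1, p2, d0, p1, p0]

    print(hamming_encode(0, 1, 1, 0))   # [0, 1, 1, 0, 0, 1, 1]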

Detection via parity of subsets of bits - no error occurred

First, we send: D3D2D1P2D0P1P0 = 0110011. Later, someone gets: 0110011. No error occurred. But how do we know that?

The receiver computes:

    B0 = P0 xor D3 xor D1 xor D0 = 1 xor 0 xor 1 xor 0 = 0
    B1 = P1 xor D3 xor D2 xor D0 = 1 xor 0 xor 1 xor 0 = 0
    B2 = P2 xor D3 xor D2 xor D1 = 0 xor 0 xor 1 xor 1 = 0

If all of B2, B1, B0 = 0, there are no errors! These equations come from how we computed the parity bits:

    P0 = D3 xor D1 xor D0 = 0 xor 1 xor 0 = 1
    P1 = D3 xor D2 xor D0 = 0 xor 1 xor 0 = 1
    P2 = D3 xor D2 xor D1 = 0 xor 1 xor 1 = 0

Detection via parity of subsets of bits - single bit is twisted

First, we send: D3D2D1P2D0P1P0 = 0110011. What if a cosmic ray hit D1? Later, someone gets: 0010011. How would we know that?

The receiver computes:

    B0 = P0 xor D3 xor D1 xor D0 = 1 xor 0 xor 0 xor 0 = 1
    B1 = P1 xor D3 xor D2 xor D0 = 1 xor 0 xor 1 xor 0 = 0
    B2 = P2 xor D3 xor D2 xor D1 = 0 xor 0 xor 1 xor 0 = 1

    B2B1B0 = 101 = 5

And what does 101 = 5 mean? The position of the flipped bit! To repair, just flip it back. In the arrangement D3D2D1P2D0P1P0 the positions are numbered 7 6 5 4 3 2 1; we number the least significant bit with 1, not 0! 0 is reserved for "no errors".
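Continuing the sketch above, the receiver's side might look like this (a hypothetical helper, consistent with the slide's equations):

    def hamming_syndrome(word):
        # word = [d3, d2, d1, p2, d0, p1, p0]; returns the flipped position (0 = no error)
        d3, d2, d1, p2, d0, p1, p0 = word
        b0 = p0 ^ d3 ^ d1 ^ d0
        b1 = p1 ^ d3 ^ d2 ^ d0
        b2 = p2 ^ d3 ^ d2 ^ d1
        return (b2 << 2) | (b1 << 1) | b0

    sent = hamming_encode(0, 1, 1, 0)   # [0, 1, 1, 0, 0, 1, 1]
    received = sent.copy()
    received[2] ^= 1                    # cosmic ray hits D1 (position 5)
    pos = hamming_syndrome(received)    # -> 5
    if pos:
        received[7 - pos] ^= 1          # positions run 7..1 left to right: flip it back
    assert received == sent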


Detection via parity of subsets of bits - magic trick revealed

For any 4-bit word D3D2D1D0 we add 3 parity bits P2P1P0.

Observation: The parity bits need to encode the "no error" scenario, plus a number for each bit (both data and parity bits). For p parity bits and d data bits:

    d + p + 1 ≤ 2^p

Question: Why do we arrange the bits the way we do?
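This bound is easy to check (a sketch, not from the slides):

    def parity_bits_needed(d):
        # Smallest p with d + p + 1 <= 2**p
        p = 1
        while d + p + 1 > 2 ** p:
            p += 1
        return p

    print(parity_bits_needed(4))   # 3 parity bits suffice for 4 data bits (4+3+1 = 8 = 2^3)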

Detection via parity of subsets of bits - magic trick revealed

Start by numbering the bit positions 1 to 7:

    D3 D2 D1 P2 D0 P1 P0
     7  6  5  4  3  2  1

With this order, an odd parity over positions 1, 3, 5, 7 (P0, D0, D1, D3) means an error in one of those positions, so P0 is the right parity bit to use.

An odd parity over positions 2, 3, 6, 7 (P1, D0, D2, D3) means a mistake must be in one of the four positions whose number has its 2s bit set; that is what P1 covers.

etc. ... each parity bit narrows down the suspect bits, until the position is certain:

    P0 = D3 xor D1 xor D0
    P1 = D3 xor D2 xor D0
    P2 = D3 xor D2 xor D1

Detection via parity of subsets of bits - magic trick revealed

7 bits can code 128 numbers, but only 16 of these numbers are legal codewords. It takes 3 bit flips to move from one legal number to another (for all 16 numbers). If only one bit flips, we can always figure out the "closest" legal number, and correct it.