SLIDE 1

Computing and Communications

2. Information Theory: Data Compression

Ying Cui
Department of Electronic Engineering
Shanghai Jiao Tong University, China
2017, Autumn

SLIDE 2

Outline

  • Examples of codes
  • Kraft inequality for instantaneous codes
  • Kraft inequality for uniquely decodable codes
  • Optimal codes
  • Huffman codes

SLIDE 3

Reference

  • T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley

SLIDE 4

EXAMPLES OF CODES

SLIDE 5

Source Code

  • D-ary alphabet {0, 1, …, D-1}
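The formal definitions on this slide did not survive extraction; as in Cover & Thomas, a source code and its expected length can be stated as:

    C:\ \mathcal{X}\to\{0,1,\dots,D-1\}^{*},\qquad
    L(C)=\sum_{x\in\mathcal{X}} p(x)\,l(x)

where l(x) is the length of the codeword C(x).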

SLIDE 6

Examples


– case 1: H(X) = 1.75 bits, L(C) = 1.75 bits, so H(X) = L(C)
– case 2: H(X) = 1.58 bits, L(C) = 1.66 bits, so H(X) < L(C)
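The two source distributions behind these numbers were lost in extraction; a minimal Python sketch, assuming the usual Cover & Thomas examples (a dyadic distribution (1/2, 1/4, 1/8, 1/8) with code lengths (1, 2, 3, 3), and a uniform distribution on 3 symbols with code lengths (1, 2, 2)), reproduces them:

    import math

    def entropy(p):
        """Shannon entropy of a probability vector p, in bits."""
        return -sum(pi * math.log2(pi) for pi in p if pi > 0)

    def expected_length(p, lengths):
        """Expected codeword length: sum_i p_i * l_i."""
        return sum(pi * li for pi, li in zip(p, lengths))

    # Case 1: dyadic probabilities, l_i = -log2 p_i, so L(C) = H(X) exactly.
    p1, l1 = [0.5, 0.25, 0.125, 0.125], [1, 2, 3, 3]   # e.g. 0, 10, 110, 111
    print(entropy(p1), expected_length(p1, l1))        # 1.75 1.75

    # Case 2: uniform on 3 symbols; no integer lengths match -log2 p_i.
    p2, l2 = [1/3, 1/3, 1/3], [1, 2, 2]                # e.g. 0, 10, 11
    print(round(entropy(p2), 2), round(expected_length(p2, l2), 2))
    # 1.58 1.67 (the slide's 1.66 truncates 5/3)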

SLIDE 7

Conditions on Codes

  • Guarantee decodability of a single value of X?
  • Describe a sequence of values of X?

– example: if C(x1)=00 and C(x2)=11, then C(x1x2)=0011

SLIDE 8

Conditions on Codes

  • Guarantee decodability of a sequence of values of X w/o adding a special symbol between any two codewords?
  • Guarantee decodability of a sequence of values of X w/o reference to future codewords?

– end of a codeword is immediately recognizable
– an instantaneous code is a self-punctuating code
– example: codewords C(1) = 0, C(2) = 10, C(3) = 110, C(4) = 111; the binary string 01011111010 is parsed as 0, 10, 111, 110, 10 (see the sketch below)
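A minimal Python sketch of this self-punctuating property (the function name and integer symbol labels are illustrative, not from the slides):

    def decode_prefix(code, bits):
        """Greedy left-to-right parse of a bit string under a prefix code.

        code maps symbol -> codeword; because no codeword is a prefix of
        another, each match is unambiguous and can be emitted immediately.
        """
        inverse = {w: s for s, w in code.items()}
        out, buf = [], ""
        for b in bits:
            buf += b
            if buf in inverse:          # end of a codeword recognized
                out.append(inverse[buf])
                buf = ""
        if buf:
            raise ValueError("trailing bits do not form a codeword")
        return out

    code = {1: "0", 2: "10", 3: "110", 4: "111"}
    print(decode_prefix(code, "01011111010"))
    # [1, 2, 4, 3, 2], i.e. the parse 0, 10, 111, 110, 10 from the slide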

SLIDE 9

Classes of Codes

– nonsingular: a single value of X is decodable
– uniquely decodable: a sequence of values of X is decodable w/o adding a special symbol between any two codewords
– instantaneous: a sequence of values of X is decodable w/o reference to future codewords
– the classes nest: all codes ⊃ nonsingular ⊃ uniquely decodable ⊃ instantaneous

SLIDE 10

Example

– code 1: the string 0 could encode source symbol 1, 2, 3, or 4 (singular)
– code 2: the string 010 could encode the source sequence 2, 14, or 31 (nonsingular but not uniquely decodable; see the sketch below)
– code 3: a string starting 11… decodes as 3 if the next bit is 1, as 4 if the following bits are an odd number of 0's, and as 3 if they are an even number of 0's (uniquely decodable but not instantaneous)
– code 4: prefix-free (instantaneous)
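A small Python sketch of the code-2 ambiguity; the codeword assignment below is an assumption chosen to reproduce exactly the parses the slide names (2, 14, 31):

    def parses(code, s):
        """All ways to split bit string s into codewords of code
        (a dict symbol -> codeword). More than one parse means the
        code is not uniquely decodable."""
        if s == "":
            return [[]]
        out = []
        for sym, w in code.items():
            if s.startswith(w):
                out.extend([sym] + rest for rest in parses(code, s[len(w):]))
        return out

    code2 = {1: "0", 2: "010", 3: "01", 4: "10"}
    print(parses(code2, "010"))  # [[1, 4], [2], [3, 1]]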

SLIDE 11

KRAFT INEQUALITY FOR INSTANTANEOUS CODES

SLIDE 12

Kraft Inequality

  • Wish to construct instantaneous codes of minimum expected length to describe a given source

– cannot assign short codewords to all source symbols and still be prefix-free
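The inequality itself did not survive extraction; as stated in Cover & Thomas, an instantaneous D-ary code with codeword lengths l_1, …, l_m exists if and only if

    \sum_{i=1}^{m} D^{-l_i} \le 1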

SLIDE 13

Idea of Proof

  • Consider a D-ary tree in which each node has D children. Let the branches of the tree represent the symbols of the codewords. Each codeword is then represented by a leaf on the tree, and the path from the root traces out the symbols of the codeword. The prefix condition on the codewords implies that no codeword is an ancestor of any other codeword on the tree.
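A quick numeric check of the inequality for the instantaneous code 0, 10, 110, 111 from slide 8 (D = 2):

    lengths = [1, 2, 3, 3]
    print(sum(2 ** -l for l in lengths))  # 1.0, satisfying the sum <= 1 with equality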

SLIDE 14

KRAFT INEQUALITY FOR UNIQUELY DECODABLE CODES

SLIDE 15

McMillan Inequality

  • Expect uniquely decodable codes to offer more possibilities for the set of codeword lengths than instantaneous codes?

– the class of uniquely decodable codes is larger
– NO! (McMillan: the same Kraft inequality must hold)
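The formal statement, lost in extraction but standard (Cover & Thomas): the codeword lengths of any uniquely decodable D-ary code satisfy

    \sum_{i} D^{-l_i} \le 1

and conversely any lengths satisfying it are achieved by some instantaneous code, so restricting attention to instantaneous codes costs nothing in codeword lengths.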

SLIDE 16

OPTIMAL CODES

SLIDE 17

Expected Code Length Minimization

  • The exact problem is an integer program: minimize the expected length over integer codeword lengths subject to the Kraft inequality
  • A continuous relaxation turns it into a convex optimization problem: the objective is a linear function, the constraint set is convex, and the larger feasible set gives a lower minimum value
  • Rounding up the optimal solution of the relaxation yields a near-optimal integer solution
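Filling in the formulas the extraction dropped (standard in Cover & Thomas), the relaxed problem is

    \min_{l_1,\dots,l_m} \sum_i p_i l_i \quad \text{s.t.} \quad \sum_i D^{-l_i} \le 1

whose optimum (by Lagrange multipliers) is l_i^* = -\log_D p_i, giving minimum expected length H_D(X).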

SLIDE 18

Expected Length

SLIDE 19

Another Proof

SLIDE 20

Minimum Expected Length

– there is an overhead of at most 1 bit due to the non-integer case
– reduce the overhead per symbol by spreading it out over many symbols
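The bound behind the "at most 1 bit" remark, using the rounded-up (Shannon) lengths l_i = \lceil \log_D(1/p_i) \rceil:

    H_D(X) \le L < H_D(X) + 1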

SLIDE 21

Minimum Expected Length per Symbol

  • Send a sequence of n symbols from X as a supersymbol with expected length per symbol L_n

– i.i.d. case: entropy rate = H(X)

– another justification for entropy rate: the expected number of bits per symbol required to describe the process
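The per-symbol bound this slide summarizes (Cover & Thomas), obtained by applying the one-symbol bound to the supersymbol (X_1, …, X_n):

    \frac{H(X_1,\dots,X_n)}{n} \le L_n < \frac{H(X_1,\dots,X_n)}{n} + \frac{1}{n}

so for a stationary process L_n converges to the entropy rate as n grows.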

SLIDE 22

HUFFMAN CODES

SLIDE 23

Huffman Algorithm
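The algorithm steps on this slide did not survive extraction. Below is a minimal Python sketch of the standard binary Huffman procedure (repeatedly merge the two least-probable nodes until one remains); the function name, symbol labels, and example distribution are illustrative assumptions:

    import heapq
    from itertools import count

    def huffman(freqs):
        """Binary Huffman code for a dict {symbol: probability}.

        Each heap entry carries a partial code; merging two entries
        prepends a 0 bit to one and a 1 bit to the other."""
        tiebreak = count()                   # avoids comparing the dict payloads
        heap = [(p, next(tiebreak), {s: ""}) for s, p in freqs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p0, _, c0 = heapq.heappop(heap)  # least probable node
            p1, _, c1 = heapq.heappop(heap)  # second least probable node
            merged = {s: "0" + w for s, w in c0.items()}
            merged.update({s: "1" + w for s, w in c1.items()})
            heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
        return heap[0][2]

    # The dyadic distribution from slide 6; with this tie-breaking the
    # result is {'a': '0', 'b': '10', 'c': '110', 'd': '111'},
    # i.e. lengths 1, 2, 3, 3 and L(C) = H(X) = 1.75 bits.
    print(huffman({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))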

SLIDE 24

Example

SLIDE 25

Example

SLIDE 26

Optimality of Huffman Code
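The slide body was lost in extraction; the standard statement (Cover & Thomas) is that Huffman coding is optimal: if C* is a Huffman code and C' is any other uniquely decodable code for the same source, then

    L(C^{*}) \le L(C')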

SLIDE 27

History of Huffman Code

In 1951, David A. Huffman (then 25) and his MIT information theory classmates were given the choice of a term paper or a final exam. The professor, Robert M. Fano, assigned a term paper on the problem of finding the most efficient binary code. Huffman, unable to prove any codes were the most efficient, was about to give up and start studying for the final when he hit upon the idea of using a frequency-sorted binary tree and quickly proved this method the most efficient. In doing so, Huffman outdid Fano, who had worked with information theory inventor Claude Shannon to develop a similar code. Building the tree from the bottom up guaranteed optimality, unlike top-down Shannon-Fano coding.

SLIDE 28

Summary

SLIDE 29

Summary
