Previously... Forward and converse proof of the rate-distortion - PowerPoint PPT Presentation

Lecture 14, Method of types: Example
S. Cheng (OU-Tulsa), November 28, 2017

Let $X \in \{1, 2, 3\}$ and $x^N = 11321$, so $N = 5$. The empirical distribution (type) of $x^N$ is
$$p_{x^N}(1) = \tfrac{3}{5}, \quad p_{x^N}(2) = \tfrac{1}{5}, \quad p_{x^N}(3) = \tfrac{1}{5}.$$
The type class $T(p_{x^N}) = \{11123, 11132, 11231, 11321, \dots\}$ contains all sequences with three 1's, one 2, and one 3, so
$$|T(p_{x^N})| = \frac{5!}{3!\,1!\,1!} = 20.$$
In general,
$$|T(p)| = \frac{N!}{(Np(x_1))!\,(Np(x_2))!\,(Np(x_3))!\cdots}.$$
We do not care much about the exact value of $|T(p)|$; bounds on it are given later.
Every sequence $y$ in $T(p_{x^N})$ has the same probability, $q(1)^3\, q(2)\, q(3)$, where $q(\cdot)$ is the true distribution.
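The example above can be checked with a short Python sketch. The helper names `type_of` and `type_class_size` are illustrative and not from the lecture; the brute-force enumeration is only there to confirm the multinomial count on this small case.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def type_of(seq):
    """Empirical distribution (type) of a sequence, as exact fractions."""
    n = len(seq)
    return {a: Fraction(c, n) for a, c in sorted(Counter(seq).items())}


def type_class_size(seq):
    """|T(p)| = N! / prod_a (N p(a))!, computed from the symbol counts."""
    size = factorial(len(seq))
    for count in Counter(seq).values():
        size //= factorial(count)
    return size


x = "11321"
p = type_of(x)                       # type of the example sequence
size = type_class_size(x)            # multinomial coefficient
brute = len(set(permutations(x)))    # distinct rearrangements of x
print(p, size, brute)
```

For longer sequences the brute-force count becomes infeasible, which is one reason the bounds on $|T(p)|$ are more useful than the exact formula.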

Lecture 14, Method of types: Type sequence probability

Even though we have seen this in the coin toss example, let's restate it more formally.

Theorem 1. If $x^N \in T(p)$ and $q(\cdot)$ is the true distribution of $X$, then the probability of getting $x^N$ by sampling $q(\cdot)$ $N$ times, denoted $q^N(x^N)$, is
$$q^N(x^N) = 2^{-N(H(p) + KL(p\|q))}.$$

Proof. Let $N(a|x^N)$ be the number of occurrences of $a$ in $x^N$, so that $N(a|x^N) = N p(a)$ for $x^N \in T(p)$. Then
$$q^N(x^N) = \prod_{i=1}^{N} q(x_i) = 2^{\sum_{i=1}^{N} \log q(x_i)} = 2^{\sum_{a \in \mathcal{X}} N(a|x^N) \log q(a)}$$
$$= 2^{-N \sum_{a \in \mathcal{X}} -p_{x^N}(a) \log q(a)} = 2^{-N\left(-\sum_{a \in \mathcal{X}} p(a) \log p(a) + \sum_{a \in \mathcal{X}} p(a) \log \frac{p(a)}{q(a)}\right)}$$
$$= 2^{-N(H(p) + KL(p\|q))}.$$
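Theorem 1 can be sanity-checked numerically. The sketch below reuses the sequence 11321 from the earlier example; the distribution `q` is an arbitrary assumption chosen for illustration, not a value from the lecture.

```python
from collections import Counter
from math import isclose, log2, prod


def entropy(p):
    """H(p) in bits."""
    return -sum(pa * log2(pa) for pa in p.values() if pa > 0)


def kl(p, q):
    """KL(p || q) in bits; assumes q(a) > 0 wherever p(a) > 0."""
    return sum(pa * log2(pa / q[a]) for a, pa in p.items() if pa > 0)


seq = "11321"
q = {"1": 0.5, "2": 0.3, "3": 0.2}   # assumed true distribution
n = len(seq)
p = {a: c / n for a, c in Counter(seq).items()}   # type of seq

direct = prod(q[x] for x in seq)                  # q(1)^3 q(2) q(3)
via_theorem = 2 ** (-n * (entropy(p) + kl(p, q)))
print(direct, via_theorem)
```

Both numbers agree up to floating-point rounding, as the theorem predicts for any sequence in the same type class.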

Lecture 14, Method of types: Probability of a sequence in the "typical" class

If $x^N \in T(q)$, where $q(\cdot)$ is the true distribution of $X$, then Theorem 1 with $p = q$ (so that $KL(q\|q) = 0$) gives
$$q^N(x^N) = 2^{-NH(q)} = 2^{-NH(X)}.$$

Remarks
Note that the probability is exactly equal to $2^{-NH(X)}$.
Recall that this is what the probability of a typical sequence is supposed to be. Therefore, any $x^N$ in $T(q)$ is a typical sequence ($T(q) \subset A_\epsilon^N(X)$).

Lecture 14, Method of types: Set of all empirical distributions $\mathcal{P}_N(X)$

Denote by $\mathcal{P}_N(X)$ the set of all empirical distributions of $X$ in a length-$N$ sequence.

Example. If $X \in \{0, 1\}$,
$$\mathcal{P}_N(X) = \left\{ (p_X(0), p_X(1)) : \left(\tfrac{0}{N}, \tfrac{N}{N}\right), \left(\tfrac{1}{N}, \tfrac{N-1}{N}\right), \dots, \left(\tfrac{N}{N}, \tfrac{0}{N}\right) \right\}.$$
Note that $|\mathcal{P}_N(X)| = N + 1$.

Since a type is uniquely characterized by an empirical distribution of $X$ in a length-$N$ sequence:
Each element $p$ of $\mathcal{P}_N(X)$ corresponds to a type class $T(p)$.
The number of types is $|\mathcal{P}_N(X)|$.

Lecture 14, Method of types: Number of types

It is not too difficult to count the exact number of types. In practice, however, we do not bother with it as long as we know that the number is relatively "small".

Theorem 2. $|\mathcal{P}_N(X)| \le (N + 1)^{|\mathcal{X}|}$.

Proof. Each type is specified by the empirical probability of each outcome of $X$. The possible values of each empirical probability are $\tfrac{0}{N}, \tfrac{1}{N}, \dots, \tfrac{N}{N}$ ($N + 1$ of them). Since there are $|\mathcal{X}|$ outcomes, the number of types is bounded by $(N + 1)^{|\mathcal{X}|}$.
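As an illustration of how loose but convenient the bound is, the sketch below counts types exactly and compares the count with $(N+1)^{|\mathcal{X}|}$. The closed form used here is the standard stars-and-bars count, which does not appear on the slides; the function names are illustrative.

```python
from itertools import product
from math import comb


def num_types(n, alphabet_size):
    """Exact number of types: ways to split n into alphabet_size counts."""
    return comb(n + alphabet_size - 1, alphabet_size - 1)


def num_types_brute(n, alphabet_size):
    """Same count by enumerating all count vectors that sum to n."""
    return sum(
        1
        for counts in product(range(n + 1), repeat=alphabet_size)
        if sum(counts) == n
    )


for n, k in [(5, 3), (10, 2), (8, 4)]:
    exact = num_types(n, k)
    bound = (n + 1) ** k
    print(n, k, exact, num_types_brute(n, k), bound)
```

The exact count grows polynomially in $N$, just like the bound, which is all the later proofs need.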

Lecture 14, Method of types: Size of a type class

Recall that
$$|T(p)| = \frac{N!}{(Np(x_1))!\,(Np(x_2))!\,(Np(x_3))!\cdots},$$
but the following bounds are much more useful in practice.

Theorem 3.
$$\frac{1}{(N + 1)^{|\mathcal{X}|}}\, 2^{NH(p)} \le |T(p)| \le 2^{NH(p)}.$$

Proof. Assume $p(\cdot)$ is the actual distribution of $X$. By Theorem 1 with $q = p$, every sequence in $T(p)$ has probability $2^{-NH(p)}$.

Upper bound:
$$1 \ge \sum_{x^N \in T(p)} p^N(x^N) = \sum_{x^N \in T(p)} 2^{-NH(p)} = |T(p)|\, 2^{-NH(p)}.$$

Lower bound:
$$1 = \sum_{\hat{p} \in \mathcal{P}_N} \Pr(T(\hat{p})) \le \sum_{\hat{p} \in \mathcal{P}_N} \max_{\tilde{p}} \Pr(T(\tilde{p})) = \sum_{\hat{p} \in \mathcal{P}_N} \Pr(T(p)) \le (N + 1)^{|\mathcal{X}|} \Pr(T(p)) = (N + 1)^{|\mathcal{X}|}\, |T(p)|\, 2^{-NH(p)}.$$
The middle equality uses the fact that, when $p$ is the true distribution, $T(p)$ is the most probable type class; the slide uses this without proof. The last inequality is Theorem 2.
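A brute-force check of Theorem 3 over every type of a small alphabet can make the bounds concrete. This is a minimal sketch; the function names and the small relative `slack` for floating-point rounding are assumptions of the example.

```python
from itertools import product
from math import factorial, log2


def type_class_size(counts):
    """|T(p)| for the type with the given symbol counts."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size


def entropy_of_counts(counts):
    """H(p) in bits for p(a) = counts[a] / N."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)


def theorem3_holds(n, alphabet_size, slack=1e-9):
    """Check 2^{NH(p)} / (N+1)^{|X|} <= |T(p)| <= 2^{NH(p)} for all types."""
    for counts in product(range(n + 1), repeat=alphabet_size):
        if sum(counts) != n:
            continue
        size = type_class_size(counts)
        upper = 2 ** (n * entropy_of_counts(counts))
        lower = upper / (n + 1) ** alphabet_size
        if not (lower <= size * (1 + slack) and size <= upper * (1 + slack)):
            return False
    return True


print(theorem3_holds(12, 3), theorem3_holds(20, 2))
```

The upper bound is tight for degenerate types (a single repeated symbol), where both sides equal 1.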

Lecture 14, Method of types: Probability of a type class

Theorem 4. Let the true distribution of $X$ be $q(\cdot)$. Then
$$\frac{2^{-N\, KL(p\|q)}}{(N + 1)^{|\mathcal{X}|}} \le \Pr(T(p)) \le 2^{-N\, KL(p\|q)}.$$

Proof. From Theorem 1, each sequence in $T(p)$ has probability $2^{-N(H(p) + KL(p\|q))}$, so $\Pr(T(p)) = |T(p)|\, 2^{-N(H(p) + KL(p\|q))}$. Since
$$\frac{1}{(N + 1)^{|\mathcal{X}|}}\, 2^{NH(p)} \le |T(p)| \le 2^{NH(p)}$$
from Theorem 3, we get
$$\frac{1}{(N + 1)^{|\mathcal{X}|}}\, 2^{NH(p)}\, 2^{-N(H(p) + KL(p\|q))} \le \Pr(T(p)) \le 2^{NH(p)}\, 2^{-N(H(p) + KL(p\|q))}.$$
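For a binary alphabet the probability of a type class is just a binomial probability, so Theorem 4 is easy to check numerically. The sketch below assumes a Bernoulli source with $q(1) = 0.3$ purely for illustration; the function names and the rounding `slack` are also assumptions.

```python
from math import comb, log2


def kl_binary(p1, q1):
    """KL(p || q) in bits for Bernoulli distributions with P(1) = p1, q1."""
    total = 0.0
    for p, q in ((p1, q1), (1 - p1, 1 - q1)):
        if p > 0:
            total += p * log2(p / q)
    return total


def type_class_probability(n, k, q1):
    """Pr(T(p)) for the type with k ones: |T(p)| times the per-sequence probability."""
    return comb(n, k) * q1 ** k * (1 - q1) ** (n - k)


def theorem4_holds(n, q1, slack=1e-9):
    """Check 2^{-N KL} / (N+1)^2 <= Pr(T(p)) <= 2^{-N KL} for every binary type."""
    for k in range(n + 1):
        prob = type_class_probability(n, k, q1)
        upper = 2 ** (-n * kl_binary(k / n, q1))
        lower = upper / (n + 1) ** 2
        if not (lower <= prob * (1 + slack) and prob <= upper * (1 + slack)):
            return False
    return True


total = sum(type_class_probability(30, k, 0.3) for k in range(31))
print(theorem4_holds(30, 0.3), total)
```

The type class probabilities sum to 1 because the type classes partition the set of all length-$N$ sequences.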

Lecture 14, Method of types: Summary of types

The type class $T(p)$ contains all sequences with empirical distribution $p$. That is,
$$T(p) = \left\{ x^N : \frac{N(a|x^N)}{N} = p(a) \text{ for all } a \in \mathcal{X} \right\}.$$
All sequences in the type class $T(p)$ have the same probability ($q(\cdot)$ is the true distribution):
$$q^N(x^N) = 2^{-N(H(p) + KL(p\|q))}.$$
There are about $2^{NH(p)}$ sequences in $T(p)$:
$$\frac{1}{(N + 1)^{|\mathcal{X}|}}\, 2^{NH(p)} \le |T(p)| \le 2^{NH(p)}.$$
The probability of getting a sequence in $T(p)$ is about $2^{-N\, KL(p\|q)}$. More precisely,
$$\frac{2^{-N\, KL(p\|q)}}{(N + 1)^{|\mathcal{X}|}} \le \Pr(T(p)) \le 2^{-N\, KL(p\|q)}.$$
There are at most $(N + 1)^{|\mathcal{X}|}$ types.

Lecture 14, Universal source coding: Rationale

For the compression schemes (such as Huffman coding) that we discussed earlier in this class, one needs to know the source distribution ahead of time to design the encoder and decoder.
Question: Is it possible to construct a compression scheme that does not know the source distribution and still performs as well?
Answer: Yes, at least theoretically → universal source coding.

Lecture 14, Universal source coding: Theory of universal source coding

Given any source $Q$ with $H(Q) < R$, there exists a length-$N$ universal code of rate $R$ such that the source can be decoded with error probability tending to zero as $N \to \infty$.

Proof. Let
$$R_N = R - \frac{|\mathcal{X}| \log(N + 1)}{N},$$
and use the set of sequences $A = \{ x^N : H(p_{x^N}) < R_N \}$ as the code book. The rate is at most $R$, since
$$|A| = \sum_{p : H(p) < R_N} |T(p)| \le \sum_{p : H(p) < R_N} 2^{NH(p)} < \sum_{p : H(p) < R_N} 2^{N R_N}$$
$$\le (N + 1)^{|\mathcal{X}|}\, 2^{N R_N} = 2^{N\left(R_N + \frac{|\mathcal{X}| \log(N + 1)}{N}\right)} = 2^{NR}.$$
Encoder: given an input, check whether it is in $A$; if so, output its index, otherwise declare failure.
Decoder: simply map the index back to the sequence.
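The code book construction can be sketched directly for a tiny binary example. This is a minimal illustration under stated assumptions: the parameters $N = 16$ and $R = 0.9$, the function names, and the exhaustive enumeration of all $2^N$ sequences are choices made for the example, not part of the lecture, and the enumeration is only feasible for small $N$.

```python
from itertools import product
from math import log2


def empirical_entropy(seq, alphabet):
    """H(p_{x^N}) in bits for the type of seq."""
    n = len(seq)
    h = 0.0
    for a in alphabet:
        c = seq.count(a)
        if c:
            h -= c / n * log2(c / n)
    return h


def build_codebook(n, rate, alphabet="01"):
    """A = {x^N : H(p_{x^N}) < R_N}, with R_N = R - |X| log2(N+1) / N."""
    r_n = rate - len(alphabet) * log2(n + 1) / n
    return [
        "".join(s)
        for s in product(alphabet, repeat=n)
        if empirical_entropy("".join(s), alphabet) < r_n
    ]


def encode(seq, codebook):
    """Index of seq in the code book, or None to declare failure."""
    return codebook.index(seq) if seq in codebook else None


def decode(index, codebook):
    """Map an index back to its sequence."""
    return codebook[index]


N, R = 16, 0.9
codebook = build_codebook(N, R)
print(len(codebook), 2 ** (N * R))
```

Note that the code book depends only on $N$, $R$, and the alphabet, never on the source distribution, which is what makes the scheme universal.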

Lecture 14, Universal source coding: Theory of universal source coding, proof (continued)

An error occurs exactly when the input is not in $A$, that is, when $H(p_{x^N}) \ge R_N$. The probability of error $P_e$ is therefore
$$P_e = \sum_{p : H(p) \ge R_N} \Pr(T(p)) \le \sum_{p : H(p) \ge R_N} \max_{\tilde{p} : H(\tilde{p}) \ge R_N} \Pr(T(\tilde{p})) \le (N + 1)^{|\mathcal{X}|}\, 2^{-N \min_{\tilde{p} : H(\tilde{p}) \ge R_N} KL(\tilde{p}\|q)},$$
where the last step uses Theorem 2 for the number of terms and Theorem 4 for each term.

If $H(q) < R$, then since $R_N \to R$ as $N$ increases, we can find some $N_0$ such that $H(q) < R_N$ for all $N \ge N_0$.
Therefore, any $p$ in $\{ p : H(p) \ge R_N \}$ cannot be the same as $q$, which implies
$$\min_{\tilde{p} : H(\tilde{p}) \ge R_N} KL(\tilde{p}\|q) > 0 \quad \text{for } N \ge N_0.$$
Moreover, because $R_N$ increases toward $R$, this minimum is bounded below by the minimum over $\{ p : H(p) \ge R_{N_0} \}$, which is positive and does not depend on $N$, so the exponential decay dominates the polynomial factor.
Hence, $P_e \to 0$ as $N \to \infty$.
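The decay of $P_e$ can be observed numerically for a binary source, where the type is determined by the number of ones and $\Pr(T(p))$ is a binomial probability. The sketch below is illustrative only: the source $q(1) = 0.1$ (so $H(q) \approx 0.469$), the rate $R = 0.8$, and the function names are assumptions, and probabilities are computed in the log domain to avoid overflow for large $N$.

```python
from math import exp, lgamma, log, log2


def binary_entropy(p):
    """H(p) in bits for a Bernoulli(p) distribution."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def binom_pmf(n, k, q1):
    """Pr(exactly k ones in n draws), via log-gamma for numerical safety."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_comb + k * log(q1) + (n - k) * log(1 - q1))


def error_probability(n, rate, q1):
    """P_e = sum of Pr(T(p)) over binary types with H(p) >= R_N."""
    r_n = rate - 2 * log2(n + 1) / n
    return sum(
        binom_pmf(n, k, q1)
        for k in range(n + 1)
        if binary_entropy(k / n) >= r_n
    )


for n in (10, 100, 1000):
    print(n, error_probability(n, rate=0.8, q1=0.1))
```

For small $N$ the penalty term $|\mathcal{X}| \log(N+1)/N$ makes $R_N$ very small, so the code book is nearly empty and $P_e$ is large; the guarantee is asymptotic.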

Lecture 14, Universal source coding: Lempel-Ziv coding

Variants of Lempel-Ziv coding are used by compression tools almost everywhere (zip, pkzip, tiff, etc.).

Main ideas
Construct a dictionary containing all previously seen segments.
The bits needed to send a new segment can be reduced by taking advantage of known segments in the dictionary.

Example: let's compress 10110111011110111.
First, parse the sequence into segments that have not been seen before (dictionary index in parentheses):
⇒ (1) 1, (2) 0, (3) 11, (4) 01, (5) 110, (6) 111, (7) 10, (8) 111
The final segment, 111, is already in the dictionary as entry 6 because the input ends before a new segment is completed.
Next, encode each segment as a pair of numbers: 1) the dictionary index of the segment with its last bit removed (0 for the empty segment); 2) the last bit.
⇒ (0, 1), (0, 0), (1, 1), (2, 1), (3, 0), (3, 1), (1, 0), (6, ∅)
Finally, encode the pairs into a bit stream. Note that as the dictionary grows, the number of bits needed to store the index increases.
⇒ 0100011101011001110010110
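The parsing and encoding steps above can be sketched in Python. The function names are illustrative. The slides do not state the index width rule; the rule used here (the $k$-th pair stores its index in $\max(1, \lceil \log_2 k \rceil)$ bits) is an assumption inferred from the bit stream shown in the example.

```python
def lz_parse(source):
    """Split source into previously unseen segments and (index, last bit) pairs."""
    dictionary = {"": 0}          # index 0 stands for the empty segment
    phrases, pairs = [], []
    current = ""
    for bit in source:
        candidate = current + bit
        if candidate in dictionary:
            current = candidate   # keep extending until the segment is new
        else:
            pairs.append((dictionary[current], bit))
            phrases.append(candidate)
            dictionary[candidate] = len(dictionary)
            current = ""
    if current:                   # input ended inside a known segment
        pairs.append((dictionary[current], ""))
        phrases.append(current)
    return phrases, pairs


def lz_to_bits(pairs):
    """Serialize pairs; the k-th index uses max(1, ceil(log2 k)) bits (assumed rule)."""
    out = []
    for k, (index, bit) in enumerate(pairs, start=1):
        width = max(1, (k - 1).bit_length())
        out.append(format(index, f"0{width}b") + bit)
    return "".join(out)


phrases, pairs = lz_parse("10110111011110111")
bits = lz_to_bits(pairs)
print(phrases)
print(pairs)
print(bits)
```

On this short input the 25-bit output is longer than the 17-bit source; the benefit of the dictionary only appears on longer inputs.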

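Lecture 14 Universal source coding: Lempel-Ziv decoding

Decode the bit stream back to the pair representation:
0100011101011001110010110 ⇒ (0,1), (0,0), (1,1), (2,1), (3,0), (3,1), (1,0), (6,∅)

Build the dictionary and decode, one pair at a time:
(0,1): entry 1 = 1 ⇒ 1
(0,0): entry 2 = 0 ⇒ 10
(1,1): entry 3 = 11 ⇒ 1011
(2,1): entry 4 = 01 ⇒ 101101
(3,0): entry 5 = 110 ⇒ 101101110
(3,1): entry 6 = 111 ⇒ 101101110111
(1,0): entry 7 = 10 ⇒ 10110111011110
(6,∅): copy entry 6 = 111 ⇒ 10110111011110111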
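The decoding procedure can be sketched in the same way. Again this is only an illustrative sketch under stated assumptions: the function names are hypothetical, None stands for ∅, the index of the k-th pair is assumed to occupy max(1, bit length of k-1) bits (a rule inferred from the slide's bit stream), and a pair whose index ends the stream is taken to have no trailing bit.

```python
def index_width(k):
    """Assumed width rule for the index of the k-th pair (1-based)."""
    return max(1, (k - 1).bit_length())


def lz_read_pairs(stream):
    """Split the bit stream into (index, bit) pairs.
    If the stream ends right after an index, the bit is None (the empty symbol)."""
    pairs, pos, k = [], 0, 1
    while pos < len(stream):
        w = index_width(k)
        idx = int(stream[pos:pos + w], 2)
        pos += w
        if pos < len(stream):
            bit = stream[pos]
            pos += 1
        else:
            bit = None
        pairs.append((idx, bit))
        k += 1
    return pairs


def lz_decode(pairs):
    """Rebuild the dictionary pair by pair and concatenate the segments."""
    dictionary = [""]                # entry 0 is the empty segment
    out = []
    for idx, bit in pairs:
        if bit is None:
            seg = dictionary[idx]    # bare index: copy an existing entry
        else:
            seg = dictionary[idx] + bit
            dictionary.append(seg)   # new entry gets the next free index
        out.append(seg)
    return "".join(out)


stream = "0100011101011001110010110"
pairs = lz_read_pairs(stream)
print(pairs)
print(lz_decode(pairs))
```

The decoder needs no side information beyond the bit stream itself, because it grows the same dictionary in the same order as the encoder and so always knows how many bits the next index uses.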
