Entropy and Shannon's Theorem

Entropy and Shannon's Theorem, Lecture 24, November 18, 2015, Sariel - PowerPoint PPT Presentation

NEW CS 473: Theory II, Fall 2015
Entropy and Shannon's Theorem
Lecture 24, November 18, 2015
Sariel (UIUC)

Part I: Entropy

Part II: Extracting randomness


Proof...
1. There are $\binom{n}{j}$ input strings with exactly $j$ heads.
2. Each has probability $p^j (1-p)^{n-j}$.
3. Map such a string $s$ to its index number in the set $S_j = \left\{ 1, \ldots, \binom{n}{j} \right\}$.
4. Conditioned on the input string $s$ having $j$ ones (out of $n$ bits), this defines a uniform distribution on $S_j$.
5. $x \leftarrow \textsf{EncodeBinomCoeff}(s)$ (a sketch of one possible implementation follows this slide).
6. $x$ is uniformly distributed in $\{1, \ldots, N\}$, where $N = \binom{n}{j}$.
7. Seen in previous lecture...
8. ...one can extract, in expectation, $\lfloor \lg N \rfloor - 1$ bits from a uniform random variable in the range $1, \ldots, N$.
9. Extract bits using $\textsf{ExtractRandomness}(x, N)$.
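
The slide treats $\textsf{EncodeBinomCoeff}$ as a black box. Below is a minimal sketch of one standard way to realize it, lexicographic ranking of the length-$n$ strings with exactly $j$ ones; the function name is from the slide, while the particular bijection is an assumption for illustration.

```python
from math import comb

def encode_binom_coeff(s: str) -> int:
    """Map a bit string s with exactly j ones to its 1-based lexicographic
    rank among all length-n strings with j ones (an index in S_j).
    Lexicographic ranking is one standard choice of bijection; the slide
    does not fix a particular one."""
    n, j = len(s), s.count("1")
    rank, ones_left = 0, j
    for i, bit in enumerate(s):
        if bit == "1":
            # Strings that put a '0' here (with all ones_left ones placed in
            # the remaining n-i-1 positions) come earlier lexicographically.
            rank += comb(n - i - 1, ones_left)
            ones_left -= 1
    return rank + 1  # 1-based index in {1, ..., C(n, j)}

# Sanity check: the three 3-bit strings with two ones get distinct ranks.
assert sorted(encode_binom_coeff(s) for s in ["011", "101", "110"]) == [1, 2, 3]
```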

Exciting proof continued...
1. $Z$: random variable: the number of heads in the input string $s$.
2. $B$: the number of random bits extracted, so
$$\mathbb{E}[B] = \sum_{k=0}^{n} \Pr[Z = k]\; \mathbb{E}\!\left[ B \,\middle|\, Z = k \right].$$
3. Know: $\mathbb{E}\!\left[ B \,\middle|\, Z = k \right] \ge \left\lfloor \lg \binom{n}{k} \right\rfloor - 1$.
4. $\varepsilon < p - 1/2$: a sufficiently small constant.
5. For $n(p - \varepsilon) \le k \le n(p + \varepsilon)$:
$$\binom{n}{k} \ge \binom{n}{\lfloor n(p+\varepsilon) \rfloor} \ge \frac{2^{nH(p+\varepsilon)}}{n+1},$$
6. ...since $2^{nH(p)}$ is a good approximation to $\binom{n}{np}$, as proved in the previous lecture.
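
The bound in item 5 is easy to sanity-check numerically. The parameters below are illustrative assumptions, chosen so that $p > 1/2$ and $\varepsilon < p - 1/2$ as item 4 requires:

```python
from math import comb, log2, floor

def H(q: float) -> float:
    """Binary entropy in bits."""
    return -q * log2(q) - (1 - q) * log2(1 - q)

# Illustrative parameters (assumed, not from the slide):
n, p, eps = 200, 0.7, 0.05          # note p > 1/2 and eps < p - 1/2
k = floor(n * (p + eps))
lhs = comb(n, k)
rhs = 2 ** (n * H(p + eps)) / (n + 1)
print(lhs >= rhs)                   # True: C(n, k) >= 2^{nH(p+eps)} / (n+1)
```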

Super exciting proof continued...
$$
\begin{aligned}
\mathbb{E}[B] &= \sum_{k=0}^{n} \Pr[Z = k]\; \mathbb{E}\!\left[ B \,\middle|\, Z = k \right] \\
&\ge \sum_{k=\lfloor n(p-\varepsilon) \rfloor}^{\lceil n(p+\varepsilon) \rceil} \Pr[Z = k]\; \mathbb{E}\!\left[ B \,\middle|\, Z = k \right] \\
&\ge \sum_{k=\lfloor n(p-\varepsilon) \rfloor}^{\lceil n(p+\varepsilon) \rceil} \Pr[Z = k] \left( \left\lfloor \lg \binom{n}{k} \right\rfloor - 1 \right) \\
&\ge \sum_{k=\lfloor n(p-\varepsilon) \rfloor}^{\lceil n(p+\varepsilon) \rceil} \Pr[Z = k] \left( \lg \frac{2^{nH(p+\varepsilon)}}{n+1} - 2 \right) \\
&= \left( nH(p+\varepsilon) - \lg(n+1) - 2 \right) \Pr\!\left[ |Z - np| \le \varepsilon n \right] \\
&\ge \left( nH(p+\varepsilon) - \lg(n+1) - 2 \right) \left( 1 - 2\exp\!\left( -\frac{n\varepsilon^2}{4p} \right) \right),
\end{aligned}
$$
since $\mu = \mathbb{E}[Z] = np$ and
$$\Pr\!\left[ |Z - np| \ge \frac{\varepsilon}{p} \cdot pn \right] \le 2\exp\!\left( -\frac{np (\varepsilon/p)^2}{4} \right) = 2\exp\!\left( -\frac{n\varepsilon^2}{4p} \right),$$
by the Chernoff inequality.
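
A quick numerical check of the Chernoff step, comparing the exact binomial tail against the bound $2\exp(-n\varepsilon^2/4p)$; the parameters are again illustrative assumptions, and the bound is valid though quite loose at these values:

```python
from math import comb, exp

# Illustrative parameters (assumed): n coin flips with heads probability p.
n, p, eps = 200, 0.7, 0.05

# Exact probability that Z (number of heads) deviates from np by more than eps*n.
tail = sum(comb(n, k) * p**k * (1 - p)**(n - k)
           for k in range(n + 1) if abs(k - n * p) > eps * n)

bound = 2 * exp(-n * eps**2 / (4 * p))   # the Chernoff bound used on the slide
print(tail <= bound)                      # True: the bound holds
```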

Hyper super exciting proof continued...
1. Fix $\varepsilon > 0$ such that $H(p + \varepsilon) > (1 - \delta/4)\, H(p)$ ($p$ is fixed).
2. $\implies nH(p) = \Omega(n)$.
3. For $n$ sufficiently large: $-\lg(n+1) \ge -\frac{\delta}{10}\, nH(p)$.
4. ...also $2\exp\!\left( -\frac{n\varepsilon^2}{4p} \right) \le \frac{\delta}{10}$.
5. For $n$ large enough:
$$\mathbb{E}[B] \ge nH(p) \left( 1 - \frac{\delta}{4} - \frac{\delta}{10} \right) \left( 1 - \frac{\delta}{10} \right) \ge (1 - \delta)\, nH(p).$$
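
For completeness, the last inequality holds for every $\delta \in (0, 1)$, since
$$\left( 1 - \frac{\delta}{4} - \frac{\delta}{10} \right) \left( 1 - \frac{\delta}{10} \right) = 1 - \frac{9\delta}{20} + \frac{7\delta^2}{200} \ge 1 - \delta.$$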

Hyper super duper exciting proof continued...
1. Need to prove the upper bound.
2. If input sequence $x$ has probability $\Pr[X = x]$, then $y = \textsf{Ext}(x)$ has probability at least $\Pr[X = x]$ of being generated.
3. All sequences of length $|y|$ have equal probability of being generated (by definition).
4. $2^{|\textsf{Ext}(x)|} \Pr[X = x] \le 2^{|\textsf{Ext}(x)|} \Pr[y = \textsf{Ext}(x)] \le 1$.
5. $\implies |\textsf{Ext}(x)| \le \lg \left( 1 / \Pr[X = x] \right)$.
6. $\mathbb{E}[B] = \sum_x \Pr[X = x]\, |\textsf{Ext}(x)| \le \sum_x \Pr[X = x] \lg \frac{1}{\Pr[X = x]} = H(X)$.
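
The slides invoke $\textsf{Ext}$ / $\textsf{ExtractRandomness}$ from the previous lecture without restating it. For reference, here is a minimal sketch of the standard block-decomposition extractor that achieves the $\lfloor \lg N \rfloor - 1$ expected bits quoted earlier and has exactly the equal-probability property used in item 3; the name is from the slides, the implementation details are an assumed reconstruction.

```python
def extract_randomness(x: int, N: int) -> str:
    """Extract unbiased bits from x uniform on {1, ..., N}.
    Split {1,...,N} into a block of size 2^k (k = floor(lg N)) and a
    remainder; if x lands in the block, output its k-bit binary form,
    otherwise recurse on the remainder. In expectation this yields at
    least floor(lg N) - 1 bits, and every output string of a given
    length is equally likely."""
    while N > 1:
        k = N.bit_length() - 1        # 2^k is the largest power of two <= N
        if x <= (1 << k):
            # x is uniform on {1, ..., 2^k}: output its k-bit representation.
            return format(x - 1, f"0{k}b")
        # Otherwise x - 2^k is uniform on {1, ..., N - 2^k}: recurse on it.
        x -= 1 << k
        N -= 1 << k
    return ""  # N == 1 carries no randomness

# Example: for N = 5, inputs 1..4 yield the four 2-bit strings, input 5 yields "".
assert sorted(extract_randomness(x, 5) for x in range(1, 6)) == ["", "00", "01", "10", "11"]
```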

Part III
Coding: Shannon's Theorem

Shannon's Theorem

Definition
A binary symmetric channel with parameter $p$ takes as input a sequence of bits $x_1, x_2, \ldots$ and outputs a sequence of bits $y_1, y_2, \ldots$ such that $\Pr[x_i = y_i] = 1 - p$, independently for each $i$.
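
For concreteness, a direct simulation of such a channel (a small sketch; the function name is mine):

```python
import random

def binary_symmetric_channel(bits, p, rng=random):
    """Flip each input bit independently with probability p,
    so Pr[x_i = y_i] = 1 - p, exactly as in the definition."""
    return [b ^ (rng.random() < p) for b in bits]

# Example: send 10 bits through a BSC with parameter p = 0.1.
sent = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
received = binary_symmetric_channel(sent, p=0.1)
```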

Encoding/decoding with noise

Definition
1. A $(k, n)$ encoding function $\textsf{Enc} : \{0,1\}^k \to \{0,1\}^n$ takes as input a sequence of $k$ bits and outputs a sequence of $n$ bits.
2. A $(k, n)$ decoding function $\textsf{Dec} : \{0,1\}^n \to \{0,1\}^k$ takes as input a sequence of $n$ bits and outputs a sequence of $k$ bits.
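
The simplest instance of these definitions is a repetition code, sketched below purely to make the interface concrete; it is not from the slides, and its rate is far below what Shannon's theorem (next slide) promises.

```python
def enc_repetition(msg, r=3):
    """A (k, r*k) encoding function: repeat each message bit r times."""
    return [b for b in msg for _ in range(r)]

def dec_repetition(word, r=3):
    """The matching (k, r*k) decoding function: majority vote per block."""
    return [int(sum(word[i:i + r]) > r // 2) for i in range(0, len(word), r)]

msg = [1, 0, 1]
codeword = enc_repetition(msg)          # [1,1,1, 0,0,0, 1,1,1]
noisy = [1, 0, 1, 0, 0, 0, 1, 1, 0]     # two bits flipped, in different blocks
assert dec_repetition(noisy) == msg     # majority vote corrects one flip per block
```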

Claude Elwood Shannon
Claude Elwood Shannon (April 30, 1916 - February 24, 2001), an American electrical engineer and mathematician, has been called "the father of information theory". His master's thesis showed how to build Boolean circuits computing any Boolean function.

Shannon's theorem (1948)

Theorem (Shannon's theorem)
For a binary symmetric channel with parameter $p < 1/2$ and for any constants $\delta, \gamma > 0$, where $n$ is sufficiently large, the following holds:
(i) For any $k \le n(1 - H(p) - \delta)$ there exist $(k, n)$ encoding and decoding functions such that the probability that the receiver fails to obtain the correct message is at most $\gamma$, for every possible $k$-bit input message.
(ii) There are no $(k, n)$ encoding and decoding functions with $k \ge n(1 - H(p) + \delta)$ such that the probability of decoding correctly is at least $\gamma$ for a $k$-bit input message chosen uniformly at random.
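
To get a feel for the numbers in part (i): with an assumed crossover probability $p = 0.1$, the capacity $1 - H(p)$ is about $0.531$, so roughly 521 message bits fit in a block of $n = 1000$ channel bits (with $\delta = 0.01$ slack). The parameters here are illustrative, not from the slide:

```python
from math import log2

def H(q):
    """Binary entropy in bits."""
    return -q * log2(q) - (1 - q) * log2(1 - q)

p, n, delta = 0.1, 1000, 0.01         # illustrative assumptions
k_max = int(n * (1 - H(p) - delta))   # part (i): achievable message length
print(H(p), k_max)                    # H(0.1) ~ 0.469, so k_max ~ 521 bits
```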

When the sender sends a string...
[Figure: the sent string $S = s_1 s_2 \ldots s_n$, with the received string concentrated in a ring around $S$ of Hamming radius about $np$, between $(1-\delta)np$ and $(1+\delta)np$. "One ring to rule them all!"]

Some intuition...
1. The sender sent the string $S = s_1 s_2 \ldots s_n$.
2. The receiver got the string $T = t_1 t_2 \ldots t_n$.
3. $p = \Pr[t_i \ne s_i]$, for all $i$.
4. $U$: the Hamming distance between $S$ and $T$: $U = \left| \left\{ i \;\middle|\; s_i \ne t_i \right\} \right|$.
5. By assumption: $\mathbb{E}[U] = pn$, and $U$ is a binomial variable.
6. By the Chernoff inequality: $U \in \left[ (1-\delta)np,\, (1+\delta)np \right]$ with high probability, where $\delta$ is a tiny constant.
7. $T$ is in a ring $R$ centered at $S$, with inner radius $(1-\delta)np$ and outer radius $(1+\delta)np$.
8. This ring has
$$\sum_{i=(1-\delta)np}^{(1+\delta)np} \binom{n}{i} \le 2 \binom{n}{(1+\delta)np} \le \alpha = 2 \cdot 2^{nH((1+\delta)p)}$$
strings in it.
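
The ring-size bound in item 8 can be checked numerically for small parameters (again illustrative assumptions):

```python
from math import comb, log2, ceil, floor

def H(q):
    """Binary entropy in bits, as in the earlier snippets."""
    return -q * log2(q) - (1 - q) * log2(1 - q)

n, p, delta = 200, 0.1, 0.1           # illustrative assumptions
lo, hi = ceil((1 - delta) * n * p), floor((1 + delta) * n * p)
ring = sum(comb(n, i) for i in range(lo, hi + 1))
alpha = 2 * 2 ** (n * H((1 + delta) * p))
print(ring <= alpha)                  # True: the ring holds at most alpha strings
```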

Many rings for many codewords...
[Figure: disjoint rings, one around each codeword, packed into the hypercube.]

Some more intuition...
1. Pick as many disjoint rings as possible: $R_1, \ldots, R_\kappa$.
2. If every word in the hypercube would be covered...
3. ...then the $2^n$ words of the hypercube would give
$$\kappa \ge \frac{2^n}{|R|} \ge \frac{2^n}{2 \cdot 2^{nH((1+\delta)p)}} \approx 2^{n(1 - H((1+\delta)p))}.$$
4. Consider all possible strings of length $k$ such that $2^k \le \kappa$.
5. Map the $i$th string in $\{0,1\}^k$ to the center $C_i$ of the $i$th ring $R_i$.
6. If we send $C_i$ $\implies$ the receiver gets a string in $R_i$.
7. Decoding is easy: find the ring $R_i$ containing the received string, take its center string $C_i$, and output the original string it was mapped to (see the sketch after this slide).
8. How many bits?
$$k = \lfloor \lg \kappa \rfloor = n \left( 1 - H\!\left( (1+\delta) p \right) \right) \approx n (1 - H(p)).$$
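
The decoding rule in item 7 is literally a membership test over the rings; a minimal sketch, where the codeword list and parameters are assumptions for illustration:

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit sequences."""
    return sum(x != y for x, y in zip(a, b))

def decode_by_ring(received, codewords, n, p, delta):
    """Ring decoding as described above: output the index of the codeword
    whose ring contains the received string (unique if rings are disjoint)."""
    lo, hi = (1 - delta) * n * p, (1 + delta) * n * p
    for i, c in enumerate(codewords):
        if lo <= hamming(received, c) <= hi:
            return i       # index of the message mapped to center C_i
    return None            # received string fell outside every ring
```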

What is wrong with the above?
1. One cannot find such a large set of disjoint rings.
2. The reason is that packing rings (or balls) inevitably leaves wasted space between them.
3. To overcome this: allow the rings to overlap somewhat.
4. This makes things considerably more involved.
5. Details are in the class notes.
