
Information Theory

Lecture 9

  • Error Exponents
  • The part on discrete channels of R. Gallager, “A Simple Derivation of the Coding Theorem and Some Applications,” IEEE Trans. on Inform. Theory, Jan. 1965
  • In addition, some concepts found in R. Gallager, Information Theory and Reliable Communication, Wiley, 1968

Mikael Skoglund


Discrete Channels (recap)

[Diagram: $X^n \rightarrow$ channel $\rightarrow Y^n$]

  • Let $\mathcal{X}$ and $\mathcal{Y}$ be finite sets. A discrete channel is a random mapping from $\mathcal{X}^n$ to $\mathcal{Y}^n$ described by the conditional pmfs $p_n(y_1^n|x_1^n)$ for all $n \geq 1$, $x_1^n \in \mathcal{X}^n$ and $y_1^n \in \mathcal{Y}^n$.
  • The channel is (stationary and) memoryless if
    $$p_n(y_1^n|x_1^n) = \prod_{m=1}^{n} p(y_m|x_m), \quad n = 2, 3, \ldots$$
  • A discrete memoryless channel (DMC) is completely described by the triple $(\mathcal{X}, p(y|x), \mathcal{Y})$
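
As a small illustration of these definitions, the sketch below represents a DMC by its transition matrix and evaluates the memoryless product formula; the BSC and all numerical values are only assumed examples, not part of the lecture.

```python
import numpy as np

# A DMC (X, p(y|x), Y) represented as a |X| x |Y| row-stochastic matrix.
# Example (assumed): binary symmetric channel with crossover probability eps.
eps = 0.1
p = np.array([[1 - eps, eps],
              [eps, 1 - eps]])   # p[x, y] = p(y|x)

def p_n(y_seq, x_seq, p):
    """Memoryless extension: p_n(y_1^n | x_1^n) = prod_m p(y_m | x_m)."""
    return np.prod([p[x, y] for x, y in zip(x_seq, y_seq)])

# p_3(011 | 000) for the BSC above: (1 - eps) * eps * eps
print(p_n([0, 1, 1], [0, 0, 0], p))   # 0.009
```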


Block Channel Codes (recap)

[Diagram: $\omega \rightarrow$ encoder $\alpha \rightarrow x_1^n(\omega) \rightarrow$ channel $\rightarrow Y_1^n \rightarrow$ decoder $\beta \rightarrow \hat\omega$]

  • Define an $(M, n)$ block channel code for a DMC $(\mathcal{X}, p(y|x), \mathcal{Y})$ by
    1. An index set $\mathcal{I}_M \triangleq \{1, \ldots, M\}$
    2. An encoder mapping $\alpha : \mathcal{I}_M \rightarrow \mathcal{X}^n$. The set
       $$\mathcal{C} = \{x_1^n : x_1^n = \alpha(i), \; \forall\, i \in \mathcal{I}_M\}$$
       of codewords is called the codebook.
    3. A decoder mapping $\beta : \mathcal{Y}^n \rightarrow \mathcal{I}_M$, as characterized by the decoding subsets
       $$\mathcal{Y}^n(i) = \{y_1^n \in \mathcal{Y}^n : \beta(y_1^n) = i\}, \quad i = 1, \ldots, M$$


  • The rate of the code is
    $$R \triangleq \frac{\log M}{n} \quad \text{[bits per channel use]}$$
  • A code is often represented by its codebook only; the decoder can often be derived from the codebook using a specific rule (joint typicality, maximum a posteriori, maximum likelihood, . . . )
  • Assume, in the following, that $\omega \in \mathcal{I}_M$ is drawn according to $p(m) = \Pr(\omega = m)$


Error Probabilities (recap)

  • For a given code
  • Conditional:
    $$P_{e,m} = \sum_{y_1^n \in (\mathcal{Y}^n(m))^c} p_n(y_1^n|x_1^n(m)) \quad (= \lambda_m \text{ in CT})$$
  • Maximal:
    $$P_{e,\max} = P^{(n)}_{e,\max} = \max_m P_{e,m} \quad (= \lambda^{(n)} \text{ in CT})$$
  • Overall/average/total:
    $$P_e = P_e^{(n)} = \sum_{m=1}^{M} p(m)\, P_{e,m}$$
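
A brute-force sketch of how these error probabilities can be computed for a small code by enumerating $\mathcal{Y}^n$; the length-3 repetition code, the majority-vote decoder and the BSC parameters are assumptions made only for illustration.

```python
import itertools
import numpy as np

eps = 0.1
p = np.array([[1 - eps, eps], [eps, 1 - eps]])            # BSC(eps), p[x, y] = p(y|x)

n, M = 3, 2
codebook = {1: (0, 0, 0), 2: (1, 1, 1)}                   # encoder alpha: I_M -> X^n
def beta(y):                                              # decoder: majority vote
    return 2 if sum(y) >= 2 else 1

def p_n(y, x):
    return np.prod([p[xi, yi] for xi, yi in zip(x, y)])

# Conditional error probabilities: sum over all y^n not decoded to m
Pe = {m: sum(p_n(y, codebook[m])
             for y in itertools.product((0, 1), repeat=n)
             if beta(y) != m)
      for m in codebook}

prior = {1: 0.5, 2: 0.5}                                   # p(m)
Pe_max = max(Pe.values())                                  # maximal error probability
Pe_avg = sum(prior[m] * Pe[m] for m in codebook)           # overall/average error probability
print(Pe, Pe_max, Pe_avg)    # each P_{e,m} = 3*eps^2*(1-eps) + eps^3 = 0.028
```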


“Random Coding” (recap)

  • Assume that the $M$ codewords $x_1^n(m)$, $m = 1, \ldots, M$, of a codebook $\mathcal{C}$ are drawn independently according to $q_n(x_1^n)$, $x_1^n \in \mathcal{X}^n$
    $$\Longrightarrow \quad P(\mathcal{C}) = q_n\big(x_1^n(1)\big) \cdots q_n\big(x_1^n(M)\big).$$
  • Error probabilities over an ensemble of codes,
  • Conditional:
    $$\bar{P}_{e,m} = \sum_{\mathcal{C}} P(\mathcal{C})\, P_{e,m}(\mathcal{C})$$
  • Overall/average/total:
    $$\bar{P}_e = \sum_{\mathcal{C}} P(\mathcal{C})\, P_e(\mathcal{C})$$
  • Note: In addition to $\mathcal{C}$ a decoder needs to be specified
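
A Monte Carlo sketch of the ensemble average: codebooks are drawn i.i.d. from $q_n = \prod q_1$, a maximum-likelihood decoder (introduced further below) is fixed in advance, and $P_e(\mathcal{C})$ is computed exactly for each draw. All parameter values are illustrative assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
eps, n, M = 0.1, 4, 2
p = np.array([[1 - eps, eps], [eps, 1 - eps]])             # BSC(eps)
q1 = np.array([0.5, 0.5])                                  # q_n = product of q1

def p_n(y, x):
    return np.prod([p[xi, yi] for xi, yi in zip(x, y)])

def Pe_of_code(codebook):
    # exact average error probability with ML decoding, equiprobable messages
    Pe = 0.0
    for m, xm in enumerate(codebook):
        for y in itertools.product((0, 1), repeat=n):
            likelihoods = [p_n(y, xw) for xw in codebook]
            if int(np.argmax(likelihoods)) != m:           # ties broken arbitrarily
                Pe += p_n(y, xm) / M
    return Pe

# \bar P_e estimated by averaging P_e(C) over independently drawn codebooks
draws = [Pe_of_code([tuple(rng.choice(2, size=n, p=q1)) for _ in range(M)])
         for _ in range(1000)]
print(np.mean(draws))        # Monte Carlo estimate of the ensemble average
```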


The Channel Coding Theorem (recap)

  • A rate $R$ is achievable if there exists a sequence of $(M, n)$ codes, with $M = \lceil 2^{nR} \rceil$, such that $P^{(n)}_{e,\max} \rightarrow 0$ as $n \rightarrow \infty$. Capacity $C$ is the supremum of all achievable rates.
  • For a discrete memoryless channel,
    $$C = \max_{p(x)} I(X; Y)$$
  • Previous proof (in CT) based on typical sequences $\Longrightarrow$ limited insight, e.g., into how fast $P^{(n)}_{e,\max} \rightarrow 0$ as $n \rightarrow \infty$ for $R < C$. . .
  • In fact, for any $n > 0$,
    $$P^{(n)}_{e,\max} < 4 \cdot 2^{-nE_r(R)}$$
    where $E_r(R)$ is the random coding exponent


Exponential Bounds

  • A code $\mathcal{C}(n, R)$ of length $n$ and rate $R$
  • Assume $p(m) = M^{-1}$, a DMC, and consider the average error probability
    $$P_e^{(n)} = \frac{1}{M} \sum_{m=1}^{M} P_{e,m}$$
  • Bounds easily extended to $P^{(n)}_{e,\max}$
  • Non-zero lower bound may not exist for arbitrary $p(m)$
  • Upper bounds (there exists a code):
    $$P_e^{(n)} \leq 2^{-nE_{\max}(R)}, \quad \text{any } n > 0$$
  • Lower bounds (for all codes):
    $$P_e^{(n)} \geq 2^{-nE_{\min}(R)}, \quad \text{as } n \rightarrow \infty$$
    (the exponents $E_{\max}$ and $E_{\min}$ are defined below)


Reliability Function, Error Exponents

  • The reliability function of a channel,
    $$E(R) = \lim_{n \rightarrow \infty} \frac{-\log P_e^*(n, R)}{n},$$
    where $P_e^*(n, R)$ is the minimum over all codes $\mathcal{C}(n, R)$
  • Lower bounds to $E(R)$ yield upper bounds to $P_e^{(n)}$ (as $n \rightarrow \infty$)
  • “random coding” $E_r(R)$ and “expurgated” $E_{ex}(R)$ exponents
  • Upper bounds to $E(R)$ yield lower bounds to $P_e^{(n)}$ (as $n \rightarrow \infty$)
  • “sphere-packing” $E_{sp}(R)$ and “straight-line” $E_{sl}(R)$ exponents


  • With $E_{\max} = \max(E_r, E_{ex})$ and $E_{\min} = \min(E_{sp}, E_{sl})$,
    $$E_{\max}(R) \leq E(R) \leq E_{\min}(R)$$
  • The critical rate $R_{cr}$ is the smallest $R$ in $[0, C]$ such that $E_{\max}(R) = E_{\min}(R) = E(R)$ for $R_{cr} \leq R \leq C$;
  • For $R \in [R_{cr}, C)$ the exponent $E(R) > 0$ in $P_e^{(n)} \approx 2^{-nE(R)}$ as $n \rightarrow \infty$ for the best possible existing code is known!


Decoding Rules

  • Joint typicality ($A_\epsilon^{(n)}$ the jointly typical set):
    $$\mathcal{Y}^n(m) = \{y_1^n \in \mathcal{Y}^n : (x_1^n(m'), y_1^n) \in A_\epsilon^{(n)} \iff m' = m\}$$
  • Maximum a posteriori (minimum error probability):
    $$\mathcal{Y}^n(m) = \{y_1^n \in \mathcal{Y}^n : m = \arg\max_{m'} \Pr(m'|y_1^n)\}$$
  • Maximum likelihood (prior unknown / not meaningful / uniform):
    $$\mathcal{Y}^n(m) = \{y_1^n \in \mathcal{Y}^n : m = \arg\max_{m'} p_n(y_1^n|x_1^n(m'))\}$$
  • To derive existence results it suffices to consider a specific rule
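
A sketch of how the MAP and ML decoding subsets can be built by enumeration for a toy code; the codebook, crossover probability and the (deliberately skewed) prior are assumptions for illustration only.

```python
import itertools
import numpy as np

eps, n = 0.2, 3
p = np.array([[1 - eps, eps], [eps, 1 - eps]])               # BSC(eps)
codebook = [(0, 0, 0), (0, 1, 1), (1, 0, 1)]                  # x_1^n(m), m = 0, 1, 2
prior = np.array([0.9, 0.05, 0.05])                           # p(m), deliberately skewed

def p_n(y, x):
    return np.prod([p[xi, yi] for xi, yi in zip(x, y)])

Y_ml = {m: [] for m in range(len(codebook))}
Y_map = {m: [] for m in range(len(codebook))}
for y in itertools.product((0, 1), repeat=n):
    lik = np.array([p_n(y, x) for x in codebook])
    Y_ml[int(np.argmax(lik))].append(y)                       # argmax_m p_n(y|x(m))
    Y_map[int(np.argmax(prior * lik))].append(y)              # argmax_m Pr(m|y), i.e. p(m) p_n(y|x(m))

# with this heavily skewed prior the MAP rule assigns every y^n to message 0,
# while the ML rule does not; with a uniform prior the two rules coincide
print(len(Y_ml[0]), len(Y_map[0]))
```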


Two Codewords

  • Two codewords, $\mathcal{C} = \{x_1^n(1), x_1^n(2)\}$, and any channel $p_n(y_1^n|x_1^n)$
  • Assume maximum likelihood decoding,
    $$\mathcal{Y}^n(1) = \{y_1^n \in \mathcal{Y}^n : p_n(y_1^n|x_1^n(1)) > p_n(y_1^n|x_1^n(2))\}$$
    Hence, for any $s \in (0, 1)$ it holds that
    $$P_{e,1} = \sum_{y_1^n \in \mathcal{Y}^n(1)^c} p_n(y_1^n|x_1^n(1)) \leq \sum_{y_1^n \in \mathcal{Y}^n(1)^c} p_n(y_1^n|x_1^n(1))^{1-s}\, p_n(y_1^n|x_1^n(2))^s \leq \sum_{y_1^n \in \mathcal{Y}^n} p_n(y_1^n|x_1^n(1))^{1-s}\, p_n(y_1^n|x_1^n(2))^s$$
    (the first inequality holds since $p_n(y_1^n|x_1^n(2)) \geq p_n(y_1^n|x_1^n(1))$ on $\mathcal{Y}^n(1)^c$, the second since the added terms are nonnegative)
  • An equivalent bound applies to $P_{e,2}$

  • For a memoryless channel we get (with $\bar{m} = (m \bmod 2) + 1$)
    $$P_{e,m} \leq \prod_{i=1}^{n} \sum_{y_i \in \mathcal{Y}} p(y_i|x_i(m))^{1-s}\, p(y_i|x_i(\bar{m}))^s = \prod_{i=1}^{n} g_i(s), \quad m = 1, 2$$
  • For a BSC($\epsilon$) with two codewords at distance $d$,
    $$P_{e,m} \leq \min_{s \in (0,1)} \prod_{i=1}^{n} g_i(s) = \left(2\sqrt{\epsilon(1-\epsilon)}\right)^d, \quad m = 1, 2$$
    $\Rightarrow$ For a “best” pair of codewords ($d = n$):
    $$P_{e,m} \leq \left(2\sqrt{\epsilon(1-\epsilon)}\right)^n, \quad m = 1, 2$$
    $\Rightarrow$ For a “typical” pair of codewords ($d = n/2$):
    $$P_{e,m} \leq \left(2\sqrt{\epsilon(1-\epsilon)}\right)^{n/2}, \quad m = 1, 2$$
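
A numerical sketch of the BSC bound above (parameter values are assumptions): each position where the codewords differ contributes the factor $\epsilon^{1-s}(1-\epsilon)^s + (1-\epsilon)^{1-s}\epsilon^s$, and a grid search confirms that the minimum over $s$ sits at $s = 1/2$, giving $(2\sqrt{\epsilon(1-\epsilon)})^d$.

```python
import numpy as np

eps, n = 0.1, 10

def bound(d, s):
    # positions where the codewords agree contribute a factor 1,
    # each of the d differing positions contributes g(s) below
    g = eps**(1 - s) * (1 - eps)**s + (1 - eps)**(1 - s) * eps**s
    return g**d

s_grid = np.linspace(0.01, 0.99, 99)
for d in (n, n // 2):                      # "best" (d = n) and "typical" (d = n/2) pairs
    numeric = min(bound(d, s) for s in s_grid)
    closed_form = (2 * np.sqrt(eps * (1 - eps)))**d
    print(d, numeric, closed_form)         # the two agree; the minimum is at s = 1/2
```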


Ensemble Average – Two Codewords

  • Pick a probability assignment $q_n$ on $\mathcal{X}^n$, and choose $M$ codewords in $\mathcal{C} = \{x_1^n(1), \ldots, x_1^n(M)\}$ independently;
    $$P(\mathcal{C}) = \prod_{m=1}^{M} q_n\big(x_1^n(m)\big)$$
  • For memoryless channels, we take $q_n$ of the form
    $$q_n(x_1^n) = \prod_{i=1}^{n} q_1(x_i)$$

  • Thus, for $m = 1, 2$,
    $$\bar{P}_{e,m} = \sum_{x_1^n(1) \in \mathcal{X}^n} \sum_{x_1^n(2) \in \mathcal{X}^n} q_n\big(x_1^n(1)\big)\, q_n\big(x_1^n(2)\big)\, P_{e,m} \leq \sum_{y_1^n \in \mathcal{Y}^n} \left(\sum_{x_1^n(1) \in \mathcal{X}^n} q_n\big(x_1^n(1)\big)\, p_n(y_1^n|x_1^n(1))^{1-s}\right) \times \left(\sum_{x_1^n(2) \in \mathcal{X}^n} q_n\big(x_1^n(2)\big)\, p_n(y_1^n|x_1^n(2))^{s}\right)$$
    Minimum over $s \in (0, 1)$ at $s = 1/2$ $\Longrightarrow$
    $$\bar{P}_{e,m} \leq \sum_{y_1^n \in \mathcal{Y}^n} \left(\sum_{x_1^n \in \mathcal{X}^n} q_n(x_1^n)\, \sqrt{p_n(y_1^n|x_1^n)}\right)^2$$


  • For a memoryless channel,
    $$\bar{P}_{e,m} \leq \left(\sum_{y \in \mathcal{Y}} \left(\sum_{x \in \mathcal{X}} q_1(x)\, \sqrt{p_1(y|x)}\right)^2\right)^n, \quad m = 1, 2$$
  • In particular, for a BSC($\epsilon$) with $q_1(x) = 1/2$,
    $$\bar{P}_{e,m} \leq \left(\tfrac{1}{2}\left(\sqrt{\epsilon} + \sqrt{1-\epsilon}\right)^2\right)^n, \quad m = 1, 2$$

[Figure: the three per-letter bound factors versus $\epsilon$. Solid: $\tfrac{1}{2}(\sqrt{\epsilon} + \sqrt{1-\epsilon})^2$ (random); dashed: $\big(2\sqrt{\epsilon(1-\epsilon)}\big)^{1/2}$ (typical); dotted: $2\sqrt{\epsilon(1-\epsilon)}$ (best).]
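
A short sketch that tabulates the three per-letter factors from the figure as functions of $\epsilon$ (purely illustrative; raising them to the power $n$ gives the corresponding $n$-letter bounds).

```python
import numpy as np

eps = np.linspace(0.01, 0.5, 50)
random_pair  = 0.5 * (np.sqrt(eps) + np.sqrt(1 - eps))**2      # ensemble average (solid)
typical_pair = (2 * np.sqrt(eps * (1 - eps)))**0.5             # d = n/2 pair (dashed)
best_pair    = 2 * np.sqrt(eps * (1 - eps))                    # d = n pair (dotted)

# per-letter factors; the n-letter bounds are these raised to the power n
for e, r, t, b in zip(eps[::10], random_pair[::10], typical_pair[::10], best_pair[::10]):
    print(f"eps={e:.2f}  random={r:.3f}  typical={t:.3f}  best={b:.3f}")
```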


Alternative Derivation — Still Two Codewords

  • Examine the ensemble average directly,
    $$\bar{P}_{e,1} = \sum_{x_1^n(1) \in \mathcal{X}^n} q_n\big(x_1^n(1)\big) \sum_{y_1^n \in \mathcal{Y}^n} p_n(y_1^n|x_1^n(1))\, \Pr\big(y_1^n \in \mathcal{Y}^n(1)^c\big)$$
  • Since the codewords are randomly chosen, for any $s > 0$,
    $$\Pr\big(y_1^n \in \mathcal{Y}^n(1)^c\big) = \sum_{x_1^n(2):\, p_n(y_1^n|x_1^n(1)) \leq p_n(y_1^n|x_1^n(2))} q_n\big(x_1^n(2)\big) \leq \sum_{x_1^n(2) \in \mathcal{X}^n} q_n\big(x_1^n(2)\big) \left(\frac{p_n(y_1^n|x_1^n(2))}{p_n(y_1^n|x_1^n(1))}\right)^{s}$$
  • Substituting this into the first equation yields the result
  • This method generalizes more easily!


Bound on $\bar{P}_{e,m}$ – Many Codewords

  • As before,
    $$\bar{P}_{e,m} = \sum_{x_1^n(m) \in \mathcal{X}^n} q_n\big(x_1^n(m)\big) \sum_{y_1^n \in \mathcal{Y}^n} p_n(y_1^n|x_1^n(m))\, \Pr\big(y_1^n \in \mathcal{Y}^n(m)^c\big)$$
  • For $M \geq 2$ codewords, any $\rho \in [0, 1]$ and $s > 0$,
    $$\Pr\big(y_1^n \in \mathcal{Y}^n(m)^c\big) \leq \Pr\Big(\bigcup_{m' \neq m} \{y_1^n \in \mathcal{Y}^n(m')\}\Big) \leq \left(\sum_{m' \neq m} \Pr\big(y_1^n \in \mathcal{Y}^n(m')\big)\right)^{\rho} \leq \left((M-1) \sum_{x_1^n \in \mathcal{X}^n} q_n(x_1^n)\, \frac{p_n(y_1^n|x_1^n)^s}{p_n(y_1^n|x_1^n(m))^s}\right)^{\rho}$$

  • Substitute back into the first equation,
    $$\bar{P}_{e,m} \leq (M-1)^{\rho} \sum_{y_1^n \in \mathcal{Y}^n} \left(\sum_{x_1^n \in \mathcal{X}^n} q_n(x_1^n)\, p_n(y_1^n|x_1^n)^{s}\right)^{\rho} \times \sum_{x_1^n(m) \in \mathcal{X}^n} q_n\big(x_1^n(m)\big)\, p_n(y_1^n|x_1^n(m))^{1-s\rho}$$
    Minimize over $s > 0$ (see HW prob.); with $s = 1/(1+\rho)$ this gives
    $$\bar{P}_{e,m} \leq (M-1)^{\rho} \sum_{y_1^n \in \mathcal{Y}^n} \left(\sum_{x_1^n \in \mathcal{X}^n} q_n(x_1^n)\, p_n(y_1^n|x_1^n)^{1/(1+\rho)}\right)^{1+\rho}$$


  • For memoryless channels,
    $$\bar{P}_{e,m} \leq (M-1)^{\rho} \left(\sum_{y \in \mathcal{Y}} \left(\sum_{x \in \mathcal{X}} q_1(x)\, p_1(y|x)^{1/(1+\rho)}\right)^{1+\rho}\right)^n$$
  • Define
    $$E_0(\rho, q_1) \triangleq -\log \sum_{y \in \mathcal{Y}} \left(\sum_{x \in \mathcal{X}} q_1(x)\, p_1(y|x)^{1/(1+\rho)}\right)^{1+\rho}$$
  • Using $M - 1 < 2^{nR}$, we get
    $$\bar{P}_{e,m} \leq 2^{-n[E_0(\rho, q_1) - \rho R]}$$
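
A sketch of $E_0(\rho, q_1)$ and the resulting exponent $E_0(\rho, q_1) - \rho R$ for a DMC given as a transition matrix; the BSC, the input distribution, the rate and the block length below are assumed values chosen only for illustration.

```python
import numpy as np

def E0(rho, q, p):
    """E_0(rho, q_1) = -log2 sum_y ( sum_x q(x) p(y|x)^(1/(1+rho)) )^(1+rho)."""
    inner = (q[:, None] * p**(1.0 / (1.0 + rho))).sum(axis=0)   # sum over x, one value per y
    return -np.log2((inner**(1.0 + rho)).sum())

eps = 0.1
p = np.array([[1 - eps, eps], [eps, 1 - eps]])    # p[x, y] = p(y|x), here a BSC
q = np.array([0.5, 0.5])                          # input distribution q_1

R, n = 0.3, 100                                   # rate [bits/use] and block length (assumed)
for rho in (0.25, 0.5, 1.0):
    exponent = E0(rho, q, p) - rho * R
    print(rho, E0(rho, q, p), exponent, 2.0**(-n * exponent))   # bound on \bar P_{e,m}
```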


Random Coding Exponent

  • To minimize the upper bound on $\bar{P}_{e,m}$, define the random coding (Gallager) exponent
    $$E_r(R) = \max_{\rho \in [0,1],\, q_1} \big(E_0(\rho, q_1) - \rho R\big)$$
  • Thus, for the ensemble average error probabilities,
    $$\bar{P}_{e,m} \leq 2^{-nE_r(R)} \;\Longrightarrow\; \bar{P}_e^{(n)} \leq 2^{-nE_r(R)}$$
  • Since at least one code in the ensemble has error probability $\bar{P}_e^{(n)}$ (or less), there exists a “good” code satisfying
    $$P_e^{(n)} \leq 2^{-nE_r(R)}$$
  • But, this says nothing about $P_{e,m}$!


  • To bound $P_{e,m}$, take a code with $2M = 2\lceil 2^{nR} \rceil$ codewords which satisfies the inequality for equiprobable messages,
    $$P_e^{(n)} = \frac{1}{2M} \sum_{m=1}^{2M} P_{e,m} \leq 2^{-nE_r\left(\frac{\log 2M}{n}\right)}$$
  • Throw away the worst $M$ codewords, including all that satisfy
    $$P_{e,m} \geq 2 \cdot 2^{-nE_r\left(\frac{1 + \log M}{n}\right)}$$
  • Since the decoding subsets didn’t get smaller, the remaining $M$ codewords satisfy (since $\rho \in [0, 1]$)
    $$P_{e,m} \leq 2 \cdot 2^{-nE_r\left(R + \frac{1}{n}\right)} \leq 2 \cdot 2^{-n\left[E_r(R) - \frac{1}{n}\right]}$$
    $\Rightarrow$ There exists at least one code such that for any $n > 0$,
    $$\forall m: \; P_{e,m} \leq 4 \cdot 2^{-nE_r(R)} \;\Longrightarrow\; P_{e,\max} \leq 4 \cdot 2^{-nE_r(R)}$$
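
The expurgation step is just Markov's inequality: at most half of the $2M$ codewords can have $P_{e,m}$ at least twice the average, so removing the worst $M$ leaves only codewords below twice the average. A tiny numeric check with made-up (hypothetical) error probabilities:

```python
import numpy as np

rng = np.random.default_rng(1)
Pe = rng.exponential(scale=1e-3, size=2 * 64)      # hypothetical P_{e,m}, m = 1, ..., 2M
avg = Pe.mean()                                    # plays the role of 2^{-n E_r(log 2M / n)}

kept = np.sort(Pe)[:len(Pe) // 2]                  # throw away the worst M codewords
print(avg, kept.max(), kept.max() <= 2 * avg)      # every kept codeword satisfies P_{e,m} <= 2 * avg
```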


The Coding Theorem Based on Er(R)

  • Theorem: For any DMC $(\mathcal{X}, p(y|x), \mathcal{Y})$ the random coding exponent $E_r(R)$ is a convex, decreasing and positive function of $R$ for $0 \leq R < C$, where
    $$C = \max_{p(x)} I(X; Y),$$
    where $I(X; Y) = \sum_{x,y} p(y|x)\, p(x) \log \frac{p(y|x)}{p(y)}$ with $p(y) = \sum_x p(y|x)\, p(x)$, and where the maximum is over all possible pmfs on $\mathcal{X}$.


Examples of Er(R)

  • Binary symmetric channel with crossover probability $\epsilon$,
    $$E_r(R) = \begin{cases} 1 - 2\log\left(\sqrt{\epsilon} + \sqrt{1-\epsilon}\right) - R & R \leq R_{cr} \\ d\big(h^{-1}(1-R)\,\|\,\epsilon\big) & R_{cr} \leq R \leq C \\ 0 & C \leq R \end{cases}$$
    where
    $$R_{cr} = 1 - h\!\left(\frac{\sqrt{\epsilon}}{\sqrt{\epsilon} + \sqrt{1-\epsilon}}\right) \quad \text{(critical rate)}$$
    $$C = 1 - h(\epsilon) \quad \text{(capacity)}$$
    $$h(\epsilon) = -\epsilon \log \epsilon - (1-\epsilon) \log(1-\epsilon) \quad \text{(binary entropy)}$$
    $$d(\delta\,\|\,\epsilon) = \delta \log \frac{\delta}{\epsilon} + (1-\delta) \log \frac{1-\delta}{1-\epsilon} \quad \text{(binary relative entropy)}$$
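
A sketch that evaluates the closed-form expression above and cross-checks it against a direct maximization of $E_0(\rho, q_1) - \rho R$ over $\rho \in [0, 1]$ with the uniform input distribution (optimal for the BSC); the crossover probability, the rate grid and the bisection for $h^{-1}$ are implementation assumptions, not part of the lecture.

```python
import numpy as np

log2 = np.log2
def h(d):                        # binary entropy
    return -d * log2(d) - (1 - d) * log2(1 - d)
def h_inv(v):                    # inverse of h on [0, 1/2], by bisection
    lo, hi = 1e-12, 0.5
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if h(mid) < v else (lo, mid)
    return 0.5 * (lo + hi)
def d_rel(delta, eps):           # binary relative entropy d(delta || eps)
    return delta * log2(delta / eps) + (1 - delta) * log2((1 - delta) / (1 - eps))

eps = 0.05
C = 1 - h(eps)
Rcr = 1 - h(np.sqrt(eps) / (np.sqrt(eps) + np.sqrt(1 - eps)))

def Er_closed(R):                # the piecewise closed form from the slide
    if R <= Rcr:
        return 1 - 2 * log2(np.sqrt(eps) + np.sqrt(1 - eps)) - R
    if R <= C:
        return d_rel(h_inv(1 - R), eps)
    return 0.0

def E0(rho):                     # E_0(rho, q_1) for the BSC with q_1 uniform
    inner = 0.5 * (eps**(1 / (1 + rho)) + (1 - eps)**(1 / (1 + rho)))
    return -log2(2 * inner**(1 + rho))

def Er_numeric(R):               # brute-force max over rho in [0, 1]
    rhos = np.linspace(0.0, 1.0, 2001)
    return max(E0(r) - r * R for r in rhos)

for R in (0.1, 0.3, Rcr, 0.6, 0.9 * C):
    # the two columns agree (up to grid/bisection accuracy)
    print(f"R={R:.3f}  closed={Er_closed(R):.4f}  numeric={Er_numeric(R):.4f}")
```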

  • Very noisy channels,
    $$p_1(y|x) = p(y)(1 + \epsilon_{x,y}), \quad |\epsilon_{x,y}| \ll 1$$
    Using a second-order approximation in $\epsilon_{x,y}$,
    $$E_r(R) \approx \begin{cases} \frac{C}{2} - R & R < \frac{C}{4} \\ \left(\sqrt{C} - \sqrt{R}\right)^2 & \frac{C}{4} \leq R \leq C \\ 0 & C \leq R \end{cases}$$


Some Comments on Other Error Exponents

  • Expurgated exponent $E_{ex}$
  • strengthens $E_r$ for small rates
  • generally agrees with $E_r$ on part of its linear portion ($R < R_{cr}$)
  • can be infinite!
  • Sphere-packing exponent $E_{sp}$
  • agrees with $E_r$ on its non-linear part ($R > R_{cr}$)
  • can also be infinite!
  • Straight-line exponent $E_{sl}$
  • line through $(0, E_{ex}(0))$, tangent to $E_{sp}$ (when $E_{ex}(0) < \infty$)


Large Deviations Theory. . .

  • Tight connections to large deviations theory, Chernoff bounds, . . .


A Typical Scenario – BSC(0.05)

[Figure: error exponents for a BSC(0.05) versus $R$.]

  • curves top-to-bottom: $E_{sp}(R)$, $E_{ex}(R)$, $E_r(R)$
  • marks right-to-left: capacity, critical rate, ??


Rates Above Capacity

Theorem (Wolfowitz (1957)): For an arbitrary DMC of capacity $C$ bits and any length-$n$, rate $R > C$ code,
$$P_e^{(n)} \geq 1 - \frac{4A}{n(R - C)^2} - 2^{-\frac{n(R - C)}{2}}$$
where $A$ is a constant depending on the channel but not on $n$ or $R$.

  • Check, e.g., for $R = C + \delta/\sqrt{n}$ with any $\delta > \sqrt{8A} + 2$:
    $$\Longrightarrow \quad P_e^{(n)} > 0, \quad \forall\, n > 0$$
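
A small sketch evaluating the Wolfowitz bound along $R = C + \delta/\sqrt{n}$; the constant $A$ is channel dependent and not specified here, so the value below (and the capacity) are arbitrary placeholders chosen only to show how the bound behaves.

```python
import numpy as np

C, A = 0.5, 1.0                       # capacity [bits] and channel constant (both placeholders)
delta = np.sqrt(8 * A) + 2 + 1e-3     # just above the threshold quoted on the slide

for n in (1, 10, 100, 1000):
    R = C + delta / np.sqrt(n)        # rate slightly above capacity, shrinking toward C
    lower = 1 - 4 * A / (n * (R - C)**2) - 2 ** (-n * (R - C) / 2)
    print(n, R, lower)                # a strictly positive lower bound on P_e^(n) for every n
```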
