Information Theory Lecture 9

• Error Exponents
• The part on discrete channels of
  • R. Gallager, "A Simple Derivation of the Coding Theorem and Some Applications," IEEE Trans. on Inform. Theory, Jan. 1965
• In addition, some concepts found in
  • R. Gallager, Information Theory and Reliable Communication, Wiley, 1968

Mikael Skoglund, Information Theory

Discrete Channels (recap)

[Block diagram: $X^n$ → channel → $Y^n$]

• Let $\mathcal{X}$ and $\mathcal{Y}$ be finite sets. A discrete channel is a random mapping from $\mathcal{X}^n$ to $\mathcal{Y}^n$, described by the conditional pmfs $p_n(y_1^n \mid x_1^n)$ for all $n \geq 1$, $x_1^n \in \mathcal{X}^n$ and $y_1^n \in \mathcal{Y}^n$.
• The channel is (stationary and) memoryless if
    $p_n(y_1^n \mid x_1^n) = \prod_{m=1}^{n} p(y_m \mid x_m), \quad n = 2, 3, \ldots$
• A discrete memoryless channel (DMC) is completely described by the triple $(\mathcal{X}, p(y \mid x), \mathcal{Y})$.
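
As a concrete illustration of the memoryless property, a minimal Python sketch follows, assuming a hypothetical BSC with crossover probability eps (the channel and all parameters are illustrative, not from the lecture): the n-letter pmf factors into per-letter pmfs and sums to one over all output sequences.

    from itertools import product

    eps = 0.1  # assumed crossover probability of a hypothetical BSC

    def p(y, x):
        """Per-letter transition pmf p(y|x) of the BSC(eps)."""
        return 1 - eps if y == x else eps

    def p_n(y_seq, x_seq):
        """n-letter pmf of the memoryless channel: product of per-letter pmfs."""
        prob = 1.0
        for x, y in zip(x_seq, y_seq):
            prob *= p(y, x)
        return prob

    x_seq = (0, 1, 1, 0)
    total = sum(p_n(y_seq, x_seq) for y_seq in product((0, 1), repeat=len(x_seq)))
    print(p_n((0, 1, 0, 0), x_seq), total)  # 0.0729 and 1.0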

Block Channel Codes (recap)

[Block diagram: $\omega$ → encoder $\alpha$ → $x_1^n(\omega)$ → channel → $Y_1^n$ → decoder $\beta$ → $\hat{\omega}$]

• Define an $(M, n)$ block channel code for a DMC $(\mathcal{X}, p(y \mid x), \mathcal{Y})$ by
  1. An index set $I_M \triangleq \{1, \ldots, M\}$
  2. An encoder mapping $\alpha : I_M \to \mathcal{X}^n$. The set
       $\mathcal{C} \triangleq \{x_1^n : x_1^n = \alpha(i), \ \forall i \in I_M\}$
     of codewords is called the codebook.
  3. A decoder mapping $\beta : \mathcal{Y}^n \to I_M$, as characterized by the decoding subsets
       $\mathcal{Y}^n(i) = \{y_1^n \in \mathcal{Y}^n : \beta(y_1^n) = i\}, \quad i = 1, \ldots, M$

• The rate of the code is
    $R \triangleq \frac{\log M}{n}$  [bits per channel use]
• A code is often represented by its codebook only; the decoder can often be derived from the codebook using a specific rule (joint typicality, maximum a posteriori, maximum likelihood, ...).
• Assume, in the following, that $\omega \in I_M$ is drawn according to $p(m) = \Pr(\omega = m)$.
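
A minimal sketch of these definitions, with an assumed toy codebook ($M = 4$, $n = 5$) on a BSC: the encoder $\alpha$ is a table lookup in the codebook, the rate is $(\log M)/n$, and a maximum-likelihood decoder $\beta$ is derived from the codebook; for a BSC with eps < 1/2, ML decoding reduces to minimum Hamming distance.

    from math import log2

    codebook = {1: (0, 0, 0, 0, 0), 2: (1, 1, 1, 1, 1),
                3: (0, 0, 1, 1, 1), 4: (1, 1, 0, 0, 1)}  # assumed example, M = 4, n = 5
    M, n = len(codebook), len(codebook[1])
    R = log2(M) / n  # rate in bits per channel use
    print(f"R = {R:.2f} bits per channel use")

    def alpha(m):
        """Encoder mapping: index set I_M -> X^n."""
        return codebook[m]

    def beta(y_seq):
        """Decoder mapping: Y^n -> I_M (min Hamming distance = ML on a BSC with eps < 1/2)."""
        dist = lambda a, b: sum(ai != bi for ai, bi in zip(a, b))
        return min(codebook, key=lambda m: dist(codebook[m], y_seq))

    print(beta((1, 1, 1, 0, 1)))  # decodes to message 2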

Error Probabilities (recap)

• For a given code
  • Conditional
      $P_{e,m} = \sum_{y_1^n \in (\mathcal{Y}^n(m))^c} p_n(y_1^n \mid x_1^n(m))$   (= $\lambda_m$ in CT)
  • Maximal
      $P_{e,\max} = P_{e,\max}^{(n)} = \max_m P_{e,m}$   (= $\lambda^{(n)}$ in CT)
  • Overall/average/total
      $P_e = P_e^{(n)} = \sum_{m=1}^{M} p(m) P_{e,m}$

"Random Coding" (recap)

• Assume that the $M$ codewords $x_1^n(m)$, $m = 1, \ldots, M$, of a codebook $\mathcal{C}$ are drawn independently according to $q_n(x_1^n)$, $x_1^n \in \mathcal{X}^n$
    $\Rightarrow \ P(\mathcal{C}) = q_n(x_1^n(1)) \cdots q_n(x_1^n(M))$
• Error probabilities over an ensemble of codes,
  • Conditional
      $\bar{P}_{e,m} = \sum_{\mathcal{C}} P(\mathcal{C}) P_{e,m}(\mathcal{C})$
  • Overall/average/total
      $\bar{P}_e = \sum_{\mathcal{C}} P(\mathcal{C}) P_e(\mathcal{C})$
• Note: in addition to $\mathcal{C}$, a decoder needs to be specified.
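
To make the definitions of $P_{e,m}$ and $P_e$ concrete, here is a minimal sketch with an assumed two-codeword repetition code on a BSC(eps), not taken from the lecture: it computes each conditional error probability exactly by summing $p_n(y \mid x(m))$ over the complement of the ML decoding set, then averages with the uniform $p(m) = 1/M$.

    from itertools import product
    from math import prod

    eps = 0.1
    codebook = {1: (0, 0, 0), 2: (1, 1, 1)}       # assumed (M, n) = (2, 3) example
    p = lambda y, x: 1 - eps if y == x else eps   # per-letter BSC pmf

    def p_n(ys, xs):
        return prod(p(y, x) for y, x in zip(ys, xs))

    def beta(ys):
        """ML decoder; max() keeps the first maximiser, so ties go to the lower index."""
        return max(codebook, key=lambda m: p_n(ys, codebook[m]))

    P_em = {m: sum(p_n(ys, codebook[m])
                   for ys in product((0, 1), repeat=3) if beta(ys) != m)
            for m in codebook}
    P_e = sum(P_em[m] / len(codebook) for m in codebook)  # uniform p(m) = 1/M
    print(P_em, P_e)  # each P_{e,m} = 3*eps^2*(1-eps) + eps^3 = 0.028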

The Channel Coding Theorem (recap)

• A rate $R$ is achievable if there exists a sequence of $(M, n)$ codes, with $M = \lceil 2^{nR} \rceil$, such that $P_{e,\max}^{(n)} \to 0$ as $n \to \infty$. Capacity $C$ is the supremum of all achievable rates.
• For a discrete memoryless channel, $C = \max_{p(x)} I(X; Y)$.
• The previous proof (in CT), based on typical sequences, gives limited insight, e.g., into how fast $P_{e,\max}^{(n)} \to 0$ as $n \to \infty$ for $R < C$ ...
• In fact, for any $n > 0$,
    $P_{e,\max}^{(n)} < 4 \cdot 2^{-n E_r(R)}$
  where $E_r(R)$ is the random coding exponent.

Exponential Bounds

• A code $\mathcal{C}(n, R)$ of length $n$ and rate $R$.
• Assume $p(m) = M^{-1}$, a DMC, and consider the average error probability
    $P_e^{(n)} = \frac{1}{M} \sum_{m=1}^{M} P_{e,m}$
  • Bounds are easily extended to $P_{e,\max}^{(n)}$.
  • A non-zero lower bound may not exist for arbitrary $p(m)$.
• Upper bounds (there exists a code), with $E_{\max}$ as defined two slides ahead:
    $P_e^{(n)} \leq 2^{-n E_{\max}(R)}$, for any $n > 0$
• Lower bounds (for all codes):
    $P_e^{(n)} \geq 2^{-n E_{\min}(R)}$, as $n \to \infty$
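
A minimal sketch of the capacity formula $C = \max_{p(x)} I(X; Y)$ for an assumed BSC(eps), not from the lecture: sweep the input pmf, compute $I(X; Y) = H(Y) - H(Y \mid X)$, and compare the maximum with the closed form $1 - h(\epsilon)$, attained by the uniform input.

    from math import log2

    eps = 0.1
    h = lambda t: 0.0 if t in (0.0, 1.0) else -t * log2(t) - (1 - t) * log2(1 - t)

    def mutual_information(q):             # q = Pr(X = 1)
        r = q * (1 - eps) + (1 - q) * eps  # output distribution Pr(Y = 1)
        return h(r) - h(eps)               # I(X;Y) = H(Y) - H(Y|X) for the BSC

    grid = [k / 1000 for k in range(1001)]
    C = max(mutual_information(q) for q in grid)
    print(C, 1 - h(eps))  # both approximately 0.531, maximum at q = 1/2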

Reliability Function, Error Exponents

• The reliability function of a channel is
    $E(R) = \lim_{n \to \infty} \frac{-\log P_e^*(n, R)}{n}$,
  where $P_e^*(n, R)$ is the minimum of $P_e^{(n)}$ over all codes $\mathcal{C}(n, R)$.
• Lower bounds to $E(R)$ yield upper bounds to $P_e^{(n)}$ (as $n \to \infty$):
  • the "random coding" $E_r(R)$ and "expurgated" $E_{ex}(R)$ exponents
• Upper bounds to $E(R)$ yield lower bounds to $P_e^{(n)}$ (as $n \to \infty$):
  • the "sphere-packing" $E_{sp}(R)$ and "straight-line" $E_{sl}(R)$ exponents

• With $E_{\max} = \max(E_r, E_{ex})$ and $E_{\min} = \min(E_{sp}, E_{sl})$,
    $E_{\max}(R) \leq E(R) \leq E_{\min}(R)$
• The critical rate $R_{cr}$ is the smallest $R$ in $[0, C]$ such that $E_{\max}(R) = E_{\min}(R) = E(R)$ for $R_{cr} \leq R \leq C$.
• Hence, for $R \in [R_{cr}, C)$ the exponent $E(R) > 0$ in
    $P_e^{(n)} \approx 2^{-n E(R)}$ as $n \to \infty$
  for the best possible code is known!
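
A minimal numerical sketch of the random coding exponent for an assumed BSC(eps = 0.05), using Gallager's expression $E_r(R) = \max_{0 \leq \rho \leq 1} [E_0(\rho, q) - \rho R]$ with $E_0(\rho, q) = -\log_2 \sum_y \left(\sum_x q(x) p(y \mid x)^{1/(1+\rho)}\right)^{1+\rho}$ and the uniform input pmf (optimal for the BSC). The critical rate $R_{cr}$ shows up as the largest rate at which the maximising $\rho$ is still 1; above it the maximiser moves into the interior and $E_r$ coincides with $E_{sp}$.

    from math import log2

    eps = 0.05
    X = Y = (0, 1)
    q = {0: 0.5, 1: 0.5}                        # uniform input pmf, optimal for the BSC
    p = lambda y, x: 1 - eps if y == x else eps

    def E0(rho):
        """Gallager's E_0(rho, q), in bits."""
        return -log2(sum(sum(q[x] * p(y, x) ** (1 / (1 + rho)) for x in X) ** (1 + rho)
                         for y in Y))

    def Er(R, steps=2000):
        """Random coding exponent and the maximising rho, by grid search over [0, 1]."""
        return max((E0(k / steps) - (k / steps) * R, k / steps) for k in range(steps + 1))

    for R in (0.2, 0.4, 0.6, 0.7):
        val, rho_star = Er(R)
        print(f"R = {R:.2f}: E_r(R) = {val:.4f}, maximising rho = {rho_star:.3f}")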

Decoding Rules

• Joint typicality ($A_\epsilon^{(n)}$ the jointly typical set):
    $\mathcal{Y}^n(m) = \{y_1^n \in \mathcal{Y}^n : (x_1^n(m'), y_1^n) \in A_\epsilon^{(n)} \Leftrightarrow m' = m\}$
• Maximum a posteriori (minimum error probability):
    $\mathcal{Y}^n(m) = \{y_1^n \in \mathcal{Y}^n : m = \arg\max_{m'} \Pr(m' \mid y_1^n)\}$
• Maximum likelihood (when the prior is unknown, not meaningful, or uniform):
    $\mathcal{Y}^n(m) = \{y_1^n \in \mathcal{Y}^n : m = \arg\max_{m'} p_n(y_1^n \mid x_1^n(m'))\}$
• To derive existence results it suffices to consider a specific rule.

Two Codewords

• Two codewords, $\mathcal{C} = \{x_1^n(1), x_1^n(2)\}$, and any channel $p_n(y_1^n \mid x_1^n)$.
• Assume maximum likelihood decoding,
    $\mathcal{Y}^n(1) = \{y_1^n \in \mathcal{Y}^n : p_n(y_1^n \mid x_1^n(1)) > p_n(y_1^n \mid x_1^n(2))\}$
  Hence, for any $s \in (0, 1)$ it holds that
    $P_{e,1} = \sum_{y_1^n \in \mathcal{Y}^n(1)^c} p_n(y_1^n \mid x_1^n(1))$
    $\quad \leq \sum_{y_1^n \in \mathcal{Y}^n(1)^c} p_n(y_1^n \mid x_1^n(1))^{1-s} \, p_n(y_1^n \mid x_1^n(2))^s$
    $\quad \leq \sum_{y_1^n \in \mathcal{Y}^n} p_n(y_1^n \mid x_1^n(1))^{1-s} \, p_n(y_1^n \mid x_1^n(2))^s$
• An equivalent bound applies to $P_{e,2}$.
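
A minimal numerical check of the two-codeword bound, with an assumed pair of length-4 codewords on a BSC(eps = 0.1): the exact ML error probability $P_{e,1}$ is compared with $\sum_y p_n(y \mid x(1))^{1-s} p_n(y \mid x(2))^s$ over a grid of $s \in (0, 1)$; the bound holds for every $s$ and is tightest near $s = 1/2$ in this symmetric example.

    from itertools import product
    from math import prod

    eps = 0.1
    x1, x2 = (0, 0, 0, 0), (0, 1, 1, 1)           # assumed codeword pair
    p = lambda y, x: 1 - eps if y == x else eps
    p_n = lambda ys, xs: prod(p(y, x) for y, x in zip(ys, xs))

    ys_all = list(product((0, 1), repeat=len(x1)))
    # Exact P_{e,1}: with Y^n(1) = {y : p_n(y|x1) > p_n(y|x2)}, an error occurs
    # whenever p_n(y|x1) <= p_n(y|x2).
    P_e1 = sum(p_n(ys, x1) for ys in ys_all if p_n(ys, x1) <= p_n(ys, x2))

    def bound(s):
        return sum(p_n(ys, x1) ** (1 - s) * p_n(ys, x2) ** s for ys in ys_all)

    print(P_e1, min(bound(k / 100) for k in range(1, 100)))  # 0.028 <= 0.216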

• For a memoryless channel we get (with $\bar{m} = (m \bmod 2) + 1$)
    $P_{e,m} \leq \prod_{i=1}^{n} \sum_{y_i \in \mathcal{Y}} p(y_i \mid x_i(m))^{1-s} \, p(y_i \mid x_i(\bar{m}))^s = \prod_{i=1}^{n} g_i(s), \quad m = 1, 2$
• For a BSC($\epsilon$) with two codewords at distance $d$,
    $P_{e,m} \leq \min_{s \in (0,1)} \prod_{i=1}^{n} g_i(s) = \left(2\sqrt{\epsilon(1-\epsilon)}\right)^d, \quad m = 1, 2$
  ⇒ For a "best" pair of codewords ($d = n$),
    $P_{e,m} \leq \left(2\sqrt{\epsilon(1-\epsilon)}\right)^n, \quad m = 1, 2$
  ⇒ For a "typical" pair of codewords ($d = n/2$),
    $P_{e,m} \leq \left(2\sqrt{\epsilon(1-\epsilon)}\right)^{n/2}, \quad m = 1, 2$

Ensemble Average – Two Codewords

• Pick a probability assignment $q_n$ on $\mathcal{X}^n$, and choose $M$ codewords in $\mathcal{C} = \{x_1^n(1), \ldots, x_1^n(M)\}$ independently;
    $P(\mathcal{C}) = \prod_{m=1}^{M} q_n(x_1^n(m))$
• For memoryless channels, we take $q_n$ of the form
    $q_n(x_1^n) = \prod_{i=1}^{n} q_1(x_i)$
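
A minimal sketch for the distance-$d$ bound above, with assumed parameters: for two codewords at Hamming distance $d$ on a BSC(eps), the exact ML error probability depends only on the flips in the $d$ differing positions (an error when at least $d/2$ of them flip, the tie counting as an error), and is compared with $\left(2\sqrt{\epsilon(1-\epsilon)}\right)^d$.

    from math import comb, sqrt

    eps = 0.1

    def exact_P_em(d):
        """Exact ML error probability for two codewords at Hamming distance d on a BSC."""
        k_min = -(-d // 2)  # ceil(d/2); for even d the tie k = d/2 counts as an error
        return sum(comb(d, k) * eps ** k * (1 - eps) ** (d - k) for k in range(k_min, d + 1))

    for d in (4, 8, 16):
        print(d, exact_P_em(d), (2 * sqrt(eps * (1 - eps))) ** d)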

• Thus, for $m = 1, 2$,
    $\bar{P}_{e,m} = \sum_{x_1^n(1) \in \mathcal{X}^n} \sum_{x_1^n(2) \in \mathcal{X}^n} q_n(x_1^n(1)) \, q_n(x_1^n(2)) \, P_{e,m}$
    $\quad \leq \sum_{y_1^n \in \mathcal{Y}^n} \left[ \sum_{x_1^n(1) \in \mathcal{X}^n} q_n(x_1^n(1)) \, p_n(y_1^n \mid x_1^n(1))^{1-s} \right] \times \left[ \sum_{x_1^n(2) \in \mathcal{X}^n} q_n(x_1^n(2)) \, p_n(y_1^n \mid x_1^n(2))^s \right]$
  The minimum over $s \in (0, 1)$ is at $s = 1/2$ ⇒
    $\bar{P}_{e,m} \leq \sum_{y_1^n \in \mathcal{Y}^n} \left[ \sum_{x_1^n \in \mathcal{X}^n} q_n(x_1^n) \sqrt{p_n(y_1^n \mid x_1^n)} \right]^2$

• For a memoryless channel,
    $\bar{P}_{e,m} \leq \left\{ \sum_{y \in \mathcal{Y}} \left[ \sum_{x \in \mathcal{X}} q_1(x) \sqrt{p_1(y \mid x)} \right]^2 \right\}^n, \quad m = 1, 2$
• In particular, for a BSC($\epsilon$) with $q_1(x) = 1/2$,
    $\bar{P}_{e,m} \leq \left[ \tfrac{1}{2} \left( \sqrt{\epsilon} + \sqrt{1-\epsilon} \right)^2 \right]^n, \quad m = 1, 2$

[Plot of the three per-letter factors as functions of $\epsilon \in (0, 0.5)$:]
  – Solid: $\tfrac{1}{2}(\sqrt{\epsilon} + \sqrt{1-\epsilon})^2$ (random)
  – Dashed: $\left(2\sqrt{\epsilon(1-\epsilon)}\right)^{1/2}$ (typical)
  – Dotted: $2\sqrt{\epsilon(1-\epsilon)}$ (best)
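
A small sketch tabulating the three per-letter factors behind the plot (the random-coding ensemble, a "typical" pair with $d = n/2$, and a "best" pair with $d = n$) as functions of eps; the smaller the factor, the faster the corresponding bound decays with $n$.

    from math import sqrt

    print(f"{'eps':>5} {'random':>9} {'typical':>9} {'best':>9}")
    for k in range(1, 11):
        eps = 0.05 * k
        random_factor = 0.5 * (sqrt(eps) + sqrt(1 - eps)) ** 2  # (1/2)(sqrt(eps)+sqrt(1-eps))^2
        best_factor = 2 * sqrt(eps * (1 - eps))                 # 2 sqrt(eps(1-eps))
        typical_factor = sqrt(best_factor)                      # (2 sqrt(eps(1-eps)))^(1/2)
        print(f"{eps:5.2f} {random_factor:9.4f} {typical_factor:9.4f} {best_factor:9.4f}")
    # All three factors approach 1 as eps -> 1/2; for each eps, best < typical < random,
    # so the ensemble-average bound decays slowest in n.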

Alternative Derivation – Still Two Codewords

• Examine the ensemble average directly:
    $\bar{P}_{e,1} = \sum_{x_1^n(1) \in \mathcal{X}^n} \sum_{y_1^n \in \mathcal{Y}^n} q_n(x_1^n(1)) \, p_n(y_1^n \mid x_1^n(1)) \, \Pr(y_1^n \in \mathcal{Y}^n(1)^c)$
• Since the codewords are randomly chosen,
    $\Pr(y_1^n \in \mathcal{Y}^n(1)^c) = \sum_{x_1^n(2) :\, p_n(y_1^n \mid x_1^n(1)) \leq p_n(y_1^n \mid x_1^n(2))} q_n(x_1^n(2))$
    $\quad \leq \sum_{x_1^n(2) \in \mathcal{X}^n} q_n(x_1^n(2)) \left( \frac{p_n(y_1^n \mid x_1^n(2))}{p_n(y_1^n \mid x_1^n(1))} \right)^s$
• Substituting this into the first equation yields the result.
• This method generalizes more easily!

Bound on $\bar{P}_{e,m}$ – Many Codewords

• As before,
    $\bar{P}_{e,m} = \sum_{x_1^n(m) \in \mathcal{X}^n} \sum_{y_1^n \in \mathcal{Y}^n} q_n(x_1^n(m)) \, p_n(y_1^n \mid x_1^n(m)) \, \Pr(y_1^n \in \mathcal{Y}^n(m)^c)$
• For $M \geq 2$ codewords, any $\rho \in [0, 1]$ and $s > 0$,
    $\Pr(y_1^n \in \mathcal{Y}^n(m)^c) \leq \Pr\!\left( \bigcup_{m' \neq m} \{y_1^n \in \mathcal{Y}^n(m')\} \right)$
    $\quad \leq \left[ \sum_{m' \neq m} \Pr(y_1^n \in \mathcal{Y}^n(m')) \right]^{\rho}$
    $\quad \leq \left[ (M - 1) \sum_{x_1^n \in \mathcal{X}^n} q_n(x_1^n) \, \frac{p_n(y_1^n \mid x_1^n)^s}{p_n(y_1^n \mid x_1^n(m))^s} \right]^{\rho}$
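
A minimal Monte Carlo sketch with assumed parameters (BSC eps = 0.05, n = 16, M = 4, uniform $q_1$), not from the lecture: the ensemble-average error probability $\bar{P}_{e,1}$ is estimated by drawing random codebooks and ML-decoding (ties counted as errors, matching the strict decoding regions), and compared with the bound obtained by inserting the last display with $\rho = 1$ and $s = 1/2$ into the expression for $\bar{P}_{e,m}$, which by the memoryless factorization above evaluates to $(M-1)\left[\tfrac{1}{2}(\sqrt{\epsilon} + \sqrt{1-\epsilon})^2\right]^n$.

    import random
    from math import sqrt

    random.seed(0)
    eps, n, M, trials = 0.05, 16, 4, 20000  # assumed parameters

    def hamming(a, b):
        return sum(ai != bi for ai, bi in zip(a, b))

    errors = 0
    for _ in range(trials):
        # Draw a fresh codebook with i.i.d. uniform letters and send message 1.
        codebook = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(M)]
        y = tuple(bit ^ (random.random() < eps) for bit in codebook[0])
        d1 = hamming(y, codebook[0])
        # ML on the BSC = minimum Hamming distance; a tie with another codeword
        # counts as an error, consistent with the strict region Y^n(m).
        if any(hamming(y, cw) <= d1 for cw in codebook[1:]):
            errors += 1

    bound = (M - 1) * (0.5 * (sqrt(eps) + sqrt(1 - eps)) ** 2) ** n
    print(f"Monte Carlo estimate: {errors / trials:.4f}   bound: {bound:.4f}")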
