Error probability bounds in information theory: Role of structure, performance criteria and decision rules




  1. Error probability bounds in information theory: Role of structure, performance criteria and decision rules
     Eli Haim, Tel Aviv University
     ACC Annual Workshop, January 21, 2018

  2. Outline
     - Introduction: error exponent for the single-user channel
     - Overview of linear codes in network problems
     - Contribution I: distributed expurgation using structured codes for network problems (terminals use different linear codes)
     - Contribution II: distributed hypothesis testing using structured codes (terminals use the same linear code)

  3. Introduction: Single-User Channel
     A transmitter sends $x_1, \dots, x_n$ over the channel $p(y|x)$; the receiver observes $y_1, \dots, y_n$.
     Memoryless channel: $p(y_1, \dots, y_n \mid x_1, \dots, x_n) = \prod_{t=1}^{n} p(y_t \mid x_t)$
     Basic definitions:
     - Blocklength $n$: number of channel uses
     - Codebook $\mathcal{C}$: a set of $M = 2^{nR}$ codewords (vectors of length $n$)
     - Average error probability: $P_e = \Pr[\hat{C} \neq C]$, where $C \sim \mathrm{Uniform}(\mathcal{C})$

  4. Introduction: Single-User Channel (cont.)
     Memoryless channel: $p(y_1, \dots, y_n \mid x_1, \dots, x_n) = \prod_{t=1}^{n} p(y_t \mid x_t)$
     Basic tradeoff: between the number of codewords, the blocklength, and the average error probability.

  5. Introduction: Single-User Channel (cont.)
     Memoryless channel: $p(y_1, \dots, y_n \mid x_1, \dots, x_n) = \prod_{t=1}^{n} p(y_t \mid x_t)$
     First order (capacity): asymptotics in the blocklength. The capacity $C$ is the highest achievable rate with vanishing $P_e$ as $n \to \infty$.
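
The following sketch is not from the slides; it is a minimal Monte Carlo illustration of this tradeoff, estimating the average error probability of an i.i.d. random codebook under maximum-likelihood decoding over a binary symmetric channel (BSC). The blocklength, rate, and crossover probability are illustrative choices.

```python
# A minimal sketch (not from the slides): estimating the average error probability
# P_e of a random codebook over a BSC with maximum-likelihood (minimum-distance)
# decoding. n, R, and delta are illustrative parameters.
import numpy as np

rng = np.random.default_rng(0)
n, R, delta = 16, 0.25, 0.05                  # blocklength, rate (bits/use), crossover
M = int(2 ** (n * R))                         # number of codewords
codebook = rng.integers(0, 2, size=(M, n))    # i.i.d. uniform random code

def error_rate(msg, trials=200):
    """Estimate Pr[decoding error] when codeword `msg` is transmitted."""
    errors = 0
    for _ in range(trials):
        noise = (rng.random(n) < delta).astype(int)
        y = codebook[msg] ^ noise                          # channel output
        dists = np.count_nonzero(codebook ^ y, axis=1)     # Hamming distances
        if np.argmin(dists) != msg:                        # ML decoding for the BSC
            errors += 1
    return errors / trials

# Average error probability: average over a uniformly drawn message
P_e = np.mean([error_rate(m) for m in range(M)])
print(f"estimated P_e ≈ {P_e:.3f}")
```

At a fixed rate below capacity, increasing the blocklength drives this estimate toward zero, which is the first-order statement above; how fast it decays is the refined question addressed in the next slides.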

  6. Introduction: Shannon Theory
     Random code: symbol-wise (and codeword-wise) i.i.d. $p(x)$.
     Information density: $i(X;Y) \triangleq \log \frac{p(X,Y)}{p(X)\,p(Y)}$
     Mutual information: $I(X;Y) \triangleq \mathbb{E}\,[i(X;Y)]$
     Shannon's channel coding theorem ['48] (first-order characterization): $C = \max_{p(x)} I(X;Y)$, where the maximization is over all input distributions $p(x)$.
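
As an aside not in the slides, both definitions are easy to evaluate numerically; the sketch below computes the information density and the mutual information for a BSC with uniform input (the crossover probability is an illustrative choice).

```python
# A minimal sketch (not from the slides): information density i(x;y) and mutual
# information I(X;Y) = E[i(X;Y)] for a BSC with uniform input.
import numpy as np

delta = 0.1                                    # BSC crossover probability
p_x = np.array([0.5, 0.5])                     # uniform input distribution
p_y_given_x = np.array([[1 - delta, delta],    # rows indexed by x, columns by y
                        [delta, 1 - delta]])

p_xy = p_x[:, None] * p_y_given_x              # joint distribution p(x, y)
p_y = p_xy.sum(axis=0)                         # output marginal

# Information density (in bits) for every (x, y) pair
i_xy = np.log2(p_xy / (p_x[:, None] * p_y[None, :]))

# Mutual information is the expectation of the information density under p(x, y)
I = np.sum(p_xy * i_xy)
print(f"I(X;Y) = {I:.4f} bits")                # equals 1 - h2(delta) for the BSC
```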

  7. Introduction: Tradeoff, Refined Analysis
     There is a long history of finite-blocklength bounds: Elias, Feinstein, Gallager, ...
     Polyanskiy et al. [2010] gave two simple achievability bounds (DT and RCU). Disturbing point: neither dominates the other. We have resolved this issue (but not in this talk...).
     Asymptotic analysis: except for low rates, the error event amounts to
     $\frac{1}{n}\, i(X^n;Y^n) = \frac{1}{n} \sum_{k=1}^{n} i(X_k;Y_k) < R$

  8. Introduction: Asymptotic Bounds on the Information Density
     The following asymptotics are with respect to the blocklength (for high rates):
     - Central limit theorem (CLT): good for high $P_e$; dispersion [Strassen 1962, Polyanskiy et al. 2010]. We have derived results on the extension to network problems (but not in this talk...).
     - Large deviations principle (LDP): good for low $P_e$; exponent:
       $\Pr\left\{ \tfrac{1}{n}\, i(X^n;Y^n) < R \right\} \le \exp\{-n E(R)\}$
     Similar lower bounds are known.
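
A minimal sketch, not from the slides, illustrating the large-deviations behaviour: for a BSC with uniform input, the probability that the normalized information density falls below the rate R decays exponentially with the blocklength. The crossover probability, rate, and blocklengths are illustrative choices.

```python
# A minimal sketch (not from the slides): Monte Carlo estimate of
# Pr{(1/n) i(X^n;Y^n) < R} for a BSC with uniform input, showing its
# exponential decay in the blocklength n.
import numpy as np

rng = np.random.default_rng(1)
delta, R = 0.1, 0.4                   # crossover and rate (bits/use); C ≈ 0.531 here

# Per-symbol information density for a BSC with uniform input:
# i(x;y) = 1 + log2(1 - delta) if y == x, and 1 + log2(delta) otherwise.
i_same = 1 + np.log2(1 - delta)
i_diff = 1 + np.log2(delta)

for n in (50, 100, 200, 400):
    trials = 500_000
    k = rng.binomial(n, delta, size=trials)       # number of channel flips per block
    i_sum = (n - k) * i_same + k * i_diff         # i(X^n;Y^n) for each block
    p_hat = np.mean(i_sum / n < R)
    print(f"n = {n:3d}:  Pr[(1/n) i(X^n;Y^n) < R] ≈ {p_hat:.2e}")
```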

  9. Introduction: Error Exponent: Code Structure May Matter
     High rates: the typical error is due to a "bad" channel, $\tfrac{1}{n}\, i(X^n;Y^n) < R$; random coding achieves the exponent.
     Low rates: the typical error is due to "bad" codewords (e.g., for the BSC, the minimum distance dominates). This can be solved by expurgation of random codes, or by (almost all) linear codes.
     Who cares about expurgation? For almost-noiseless (binary-input) channels, $R_{\mathrm{ex}}/C \to 1$ as $C \to 1$.
     [Figure: error exponent vs. rate (nats), comparing the random-coding exponent with the best known exponent.]
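
The sketch below, not from the slides, illustrates expurgation in the simplest possible way: starting from twice as many random codewords as needed, it greedily discards one codeword from each closest pair until the target size is reached, which typically raises the minimum distance. The parameters and the greedy rule are illustrative; this is not the expurgation argument used in the error-exponent proofs.

```python
# A minimal sketch (not from the slides): greedy expurgation of a random binary code.
# Start with 2M random codewords and repeatedly discard one codeword of the closest
# pair until M remain; compare the minimum distance to an unexpurgated code of size M.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
n, M = 32, 64
code = rng.integers(0, 2, size=(2 * M, n))

def min_distance(codebook):
    return min(np.count_nonzero(a ^ b) for a, b in combinations(codebook, 2))

print("random code of size M:      d_min =", min_distance(code[:M]))

keep = list(range(2 * M))
while len(keep) > M:
    # find the closest pair among the surviving codewords and drop one of them
    i, j = min(combinations(keep, 2),
               key=lambda p: np.count_nonzero(code[p[0]] ^ code[p[1]]))
    keep.remove(j)

print("expurgated code of size M:  d_min =", min_distance(code[keep]))
```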

  10. Distributed Structure: Outline
      - Introduction: error exponent for the single-user channel
      - Overview of linear codes in network problems
      - Contribution I: distributed expurgation using structured codes for network problems (terminals use different linear codes)
      - Contribution II: distributed hypothesis testing using structured codes (terminals use the same linear code)

  11. Distributed Structure: But First, Why Linear Codes in the Single-User Channel?
      Whenever the uniform input distribution is optimal, linear codes achieve the capacity, error exponents and dispersion, but there is no theoretical gain. Historically, the interest was due to practical (complexity) advantages.

  12. Distributed Structure: Why Linear Codes in Networks?
      Contribution II (in this talk...): recent interest, reviving a theme introduced by Körner-Marton [1979]: a first-order (capacity) advantage in some network settings (Nazer & Gastpar, Wilson et al., Philosof et al., ...). In this work: distributed hypothesis testing, where the terminals use the same linear code.
      Contribution I (in this talk...): an error-probability advantage in network settings (even when there is no first-order gain), for the multiple-access (MAC) channel, where the terminals use different linear codes. The prospect of such an improvement was hinted at in a distributed source coding context by Csiszár [1982, "Linear Codes for Sources and Source Networks: Error Exponents, Universal Coding"].

  13. Distributed Structure: Outline
      - Introduction: error exponent for the single-user channel
      - Overview of linear codes in network problems
      - Contribution I: distributed expurgation using structured codes for network problems (terminals use different linear codes)
      - Contribution II: distributed hypothesis testing using structured codes (terminals use the same linear code)

  14. Distributed Expurgation: MAC Channel
      For simplicity, 2 users: inputs $X_1$, $X_2$, channel $P_{Y|X_1,X_2}$, output $Y$.
      Capacity region: the closure of the convex hull of all $(R_1, R_2)$ satisfying
      $R_1 \le I(X_1;Y \mid X_2)$,
      $R_2 \le I(X_2;Y \mid X_1)$,
      $R_1 + R_2 \le I(X_1,X_2;Y)$,
      over some product distribution $p(x_1, x_2) = p(x_1)\,p(x_2)$.
      [Figure: the rate region in the $(R_1, R_2)$ plane, bounded by $R_1 \le I(X_1;Y|X_2)$, $R_2 \le I(X_2;Y|X_1)$, and the sum-rate face $R_1 + R_2 = I(X_1,X_2;Y)$.]
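
As a concrete illustration (not in the slides), the sketch below evaluates the three constraints for the noiseless binary adder MAC, Y = X1 + X2 over the integers, with independent uniform inputs; the channel choice is only an example.

```python
# A minimal sketch (not from the slides): the MAC capacity-region constraints for
# the noiseless binary adder channel Y = X1 + X2 with independent uniform inputs.
import numpy as np

def H(p):
    """Entropy in bits of a probability array, ignoring zero entries."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Joint distribution p(x1, x2, y): deterministic channel, uniform independent inputs
p = np.zeros((2, 2, 3))
for x1 in (0, 1):
    for x2 in (0, 1):
        p[x1, x2, x1 + x2] = 0.25

# The channel is deterministic, so H(Y | X1, X2) = 0 and the mutual-information
# terms reduce to (conditional) output entropies.
I_sum = H(p.sum(axis=(0, 1)))                                       # I(X1,X2;Y) = H(Y)
I_1 = sum(0.5 * H(p[:, x2, :].sum(axis=0) / 0.5) for x2 in (0, 1))  # I(X1;Y|X2) = H(Y|X2)
I_2 = sum(0.5 * H(p[x1, :, :].sum(axis=0) / 0.5) for x1 in (0, 1))  # I(X2;Y|X1) = H(Y|X1)

print(f"R1 <= {I_1:.2f}, R2 <= {I_2:.2f}, R1 + R2 <= {I_sum:.2f}")  # 1.00, 1.00, 1.50
```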

  15. Distributed Expurgation: Toy Example: Erasure-Additive MAC Channel
      [Figure: $X_1$ and $X_2$ are added to form $X$, which passes through an erasure channel to produce $Y$.]
      Obvious bounds on $P_e$:
      - Lower bound: the single-user erasure channel.
      - Upper bound: the same channel with half the blocklength (time sharing).
      Is either of these bounds tight?

  16. Distributed Expurgation: What Can Be Achieved Using Random Codes? [Slepian & Wolf '73, Gallager '85]
      Receiver's perspective: a sum of codebooks, $\mathcal{C} = \mathcal{C}_1 + \mathcal{C}_2$.
      For random codes, the summation preserves pairwise independence, so most standard bounds (RCU, DT, dispersion, random-coding exponent) hold. Codebook structure (e.g., minimum distance) is not preserved, but recall that the minimum distance dictates the error exponent at low rates.
      Recent expurgation attempts by Nazari et al.: expurgate one user (even for a MAC channel with many users).

  17. Distributed Expurgation: Solution: Use Linear Codes
      - Create a linear sum-codebook (recall: it is inherently expurgated) by simply splitting the generating matrix between the users; a minimal sketch of the split follows below.
      - At the receiver, the summation is indistinguishable from a single-user channel at the sum rate, so the performance is identical to that of a single user at the sum rate: any performance attainable via linear codes over the single-user channel is also attainable for the considered MAC.
      - The generation process is equivalent to generating two different linear codes.
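
A minimal sketch, not from the slides, of the generating-matrix split over GF(2): user 1 encodes with the first k1 rows and user 2 with the remaining k2 rows, and the modulo-2 sum of their codewords is exactly a codeword of the full (n, k1+k2) linear code. The dimensions and the random generator matrix are illustrative.

```python
# A minimal sketch (not from the slides): splitting one GF(2) generator matrix
# between two users, so that the sum codebook seen by the receiver is the full
# linear code at the sum rate.
import numpy as np

rng = np.random.default_rng(3)
n, k1, k2 = 16, 3, 2
G = rng.integers(0, 2, size=(k1 + k2, n))        # generator matrix of the sum code
G1, G2 = G[:k1], G[k1:]                          # rows split between the two users

def encode(msg, Gen):
    """Linear encoding over GF(2): codeword = msg * Gen (mod 2)."""
    return (np.asarray(msg) @ Gen) % 2

m1 = rng.integers(0, 2, size=k1)                 # user 1's message
m2 = rng.integers(0, 2, size=k2)                 # user 2's message
x1, x2 = encode(m1, G1), encode(m2, G2)

x_sum = (x1 + x2) % 2                            # what the additive MAC delivers
x_full = encode(np.concatenate([m1, m2]), G)     # single-user encoding of (m1, m2)
assert np.array_equal(x_sum, x_full)             # sum codebook = full linear code
print("sum codeword:", x_sum)
```

The receiver can therefore run whatever decoder it would use for the single-user linear code at rate (k1 + k2)/n, which is the sense in which the summation is indistinguishable from a single-user channel.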

  18. Distributed Expurgation: The Error Exponent of MAC Channels
      In the toy example, the single-user (random-coding + expurgated) exponents are achievable.
      This extends to any MAC channel that is a finite-field summation followed by a single-user channel (e.g., the BSC MAC), and gives an advantage for any "similar" channel (by continuity).
      [Figure: error exponent vs. rate (nats), comparing the random-coding exponent with the best known exponent.]
      AWGN MAC channel: the power constraints are a challenge, since $\sqrt{nP_1} + \sqrt{nP_2} \neq \sqrt{n(P_1 + P_2)}$. For certain parameters we improve on Gallager ['85]; the general case is wide open.

  19. Distributed Hypothesis Testing: Outline
      - Introduction: error exponent for the single-user channel
      - Overview of linear codes in network problems
      - Contribution I: distributed expurgation using structured codes for network problems (terminals use different linear codes)
      - Contribution II: distributed hypothesis testing using structured codes (terminals use the same linear code)

  20. Distributed Hypothesis Testing [Berger '79]
      [Figure: block diagram. Encoder $\varphi_X$ compresses $X$ to an index $i_X \in \mathcal{M}_X$ and encoder $\varphi_Y$ compresses $Y$ to an index $i_Y \in \mathcal{M}_Y$; the detector $\psi$ outputs the decision $\hat{H}$.]
      $H_0: (X, Y) \sim \text{i.i.d. } P_0(x, y)$
      $H_1: (X, Y) \sim \text{i.i.d. } P_1(x, y)$
      Rates: $R_X = \tfrac{1}{n} \log |\mathcal{M}_X|$, $R_Y = \tfrac{1}{n} \log |\mathcal{M}_Y|$
      Error probabilities $\epsilon_0$, $\epsilon_1$ as in standard hypothesis testing, but now there is a tradeoff between the rates, the error probabilities and the blocklength.
      Long history: Ahlswede & Csiszár '81, '86; Han '87; Shalaby & Papamarcou '92; Shimokawa et al. '94; Han & Amari '98; Rahman & Wagner 2012; ...
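
For orientation (not in the slides), the sketch below simulates the centralized benchmark for this problem: when the detector sees both sequences uncompressed, the optimal test is the log-likelihood ratio test, and the distributed question is how much of this performance survives when X and Y must first be described at rates R_X and R_Y. The distributions P0 and P1, the blocklength, and the threshold are illustrative choices.

```python
# A minimal sketch (not from the slides): the centralized (uncompressed) benchmark
# for distributed hypothesis testing, i.e. a log-likelihood ratio test between two
# i.i.d. joint distributions on (X, Y).
import numpy as np

rng = np.random.default_rng(4)
n, trials = 20, 20_000
# Joint distributions on {0,1} x {0,1}: correlated under H0, independent under H1.
P0 = np.array([[0.4, 0.1],
               [0.1, 0.4]])
P1 = np.array([[0.25, 0.25],
               [0.25, 0.25]])
llr = np.log(P0 / P1)                           # per-sample log-likelihood ratio

def statistics(P):
    """Draw `trials` blocks of n i.i.d. pairs from P and return the LLR sums."""
    idx = rng.choice(4, size=(trials, n), p=P.ravel())
    return llr.ravel()[idx].sum(axis=1)

T0, T1 = statistics(P0), statistics(P1)         # test statistic under each hypothesis
threshold = 0.0                                 # decide H0 when the statistic exceeds it
eps0 = np.mean(T0 < threshold)                  # type-I error: reject H0 when it is true
eps1 = np.mean(T1 >= threshold)                 # type-II error: accept H0 when it is false
print(f"centralized LLR test: eps0 ≈ {eps0:.3f}, eps1 ≈ {eps1:.3f}")
```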
