SLIDE 1

Algebraic Structure in Network Information Theory

Michael Gastpar

EPFL / Berkeley

European Information Theory School, Antalya, Turkey, April 2012
Slides jointly with Bobak Nazer (Boston Univ.)
Download slides from linx.epfl.ch under “Teaching”

SLIDE 2

Motivation

[Figure: point-to-point channel p_{Y|X}]


SLIDE 4

Motivation

[Figure: point-to-point channel p_{Y|X} and multiple-access channel p_{Y|X1X2}]

SLIDE 5

Motivation

[Figure: point-to-point channel p_{Y|X}, multiple-access channel p_{Y|X1X2}, and a general network p_{Y1Y2Y3|X1X2X3}]

SLIDE 6

Outline

  • I. Discrete Alphabets
  • II. AWGN Channels
  • III. Network Applications
SLIDE 7

Point-to-Point Channels

[Diagram: w → E → x → p_{Y|X} → y → D → ŵ]

The Usual Suspects:

  • Message w ∈ {0, 1}^k
  • Encoder E : {0, 1}^k → X^n
  • Input x ∈ X^n
  • Output y ∈ Y^n
  • Decoder D : Y^n → {0, 1}^k
  • Estimate ŵ ∈ {0, 1}^k
  • Memoryless channel: p(y|x) = ∏_{i=1}^n p(yi|xi)
  • Rate R = k/n
  • (Average) probability of error: P{ŵ ≠ w} → 0 as n → ∞. Assume w is
    uniform over {0, 1}^k.

SLIDE 8

i.i.d. Random Codes

  • Generate 2^{nR} codewords x = [X1 X2 · · · Xn] independently and
    elementwise i.i.d. according to some distribution pX:
    p(x) = ∏_{i=1}^n pX(xi)
  • Bound the average error probability for a random codebook.
  • If the average performance over codebooks is good, there must exist
    at least one good fixed codebook.

[Figure: i.i.d. codewords scattered over the grid Fq × Fq]

SLIDE 9

(Weak) Joint Typicality

  • Two sequences x and y are (weakly) jointly typical if
      | −(1/n) log p(x) − H(X) | ≤ ε
      | −(1/n) log p(y) − H(Y) | ≤ ε
      | −(1/n) log p(x, y) − H(X, Y) | ≤ ε
  • For our considerations, weak typicality is convenient as it can also be
    stated in terms of differential entropies.
  • If x and y are drawn i.i.d. from p(x, y), the probability that they are
    jointly typical goes to 1 as n goes to infinity.
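As a concrete illustration, here is a small numerical sketch (my own, not from the slides): for long sequences from the doubly symmetric binary source that appears later in these slides, the empirical log-probability rates concentrate around the true entropies, so the three conditions above hold with high probability.

```python
# A minimal numerical illustration of weak joint typicality: x is
# Bernoulli(1/2) and y = x XOR Bernoulli(p) noise.
import numpy as np

def entropy_rate_check(n=100_000, p=0.11, eps=0.02, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, n)
    y = (x + (rng.random(n) < p)) % 2

    hB = lambda t: -t * np.log2(t) - (1 - t) * np.log2(1 - t)  # binary entropy
    H_X, H_Y, H_XY = 1.0, 1.0, 1.0 + hB(p)   # true entropies for this source

    # Empirical -(1/n) log2 p(x), p(y), p(x, y), computed per symbol.
    lx = -np.log2(0.5) * np.ones(n)          # p(x_i) = 1/2 always
    ly = lx.copy()                           # y is also Bernoulli(1/2)
    z = (x != y)                             # noise realization
    lxy = lx - np.log2(np.where(z, p, 1 - p))  # p(x_i, y_i) = (1/2) p(z_i)

    return (abs(lx.mean() - H_X) <= eps and
            abs(ly.mean() - H_Y) <= eps and
            abs(lxy.mean() - H_XY) <= eps)

print(entropy_rate_check())  # True with high probability for large n
```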

SLIDE 10

Joint Typicality Decoding

Decoder looks for a codeword that is jointly typical with the received sequence y

Error Events

  • 1. Transmitted codeword x is not jointly typical with y.
       ⇒ Low probability by the Weak Law of Large Numbers.
  • 2. Another codeword x̃ is jointly typical with y.

Cuckoo’s Egg Lemma

Let x̃ be an i.i.d. sequence that is independent of the received sequence y. Then
    P{(x̃, y) is jointly typical} ≤ 2^{−n(I(X;Y)−3ε)}

See Cover and Thomas.

SLIDE 11

Point-to-Point Capacity

  • We can upper bound the probability of error via the union bound:
    P{ŵ ≠ w} ≤ Σ_{w̃ ≠ w} P{(x(w̃), y) is jointly typical}
             ≤ 2^{−n(I(X;Y)−R−3ε)}      ← Cuckoo’s Egg Lemma
  • If R < I(X; Y), then the probability of error can be driven to zero
    as the blocklength increases.

Theorem (Shannon ’48)

The capacity of a point-to-point channel is C = max_{pX} I(X; Y).

SLIDE 12

Linear Codes

  • Linear Codebook: A linear map between messages and codewords

(instead of a lookup table).

q-ary Linear Codes

  • Represent message w as a length-k vector over Fq.
  • Codewords x are length-n vectors over Fq.
  • Encoding process is just a matrix multiplication, x = Gw.

    \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} =
    \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1k} \\ g_{21} & g_{22} & \cdots & g_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ g_{n1} & g_{n2} & \cdots & g_{nk} \end{bmatrix}
    \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_k \end{bmatrix}

  • Recall that, for prime q, operations over Fq are just mod-q operations
    over the reals.
  • Rate R = (k/n) log q
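For concreteness, a minimal sketch (mine, not from the slides) of q-ary linear encoding; the parameters q, k, n below are arbitrary illustrative choices.

```python
# q-ary linear encoding x = G w over F_q, with G drawn i.i.d. uniform
# as on the next slide.
import numpy as np

q, k, n = 5, 4, 10                      # field size (prime), message/code length
rng = np.random.default_rng(1)

G = rng.integers(0, q, size=(n, k))     # generator matrix over F_q
w = rng.integers(0, q, size=k)          # message vector over F_q

x = (G @ w) % q                         # encoding: matrix multiply mod q
print(x)                                # codeword, a length-n vector over F_q

# Linearity: the sum of two codewords is the codeword of the summed messages.
w2 = rng.integers(0, q, size=k)
assert np.array_equal(((G @ w) + (G @ w2)) % q, (G @ ((w + w2) % q)) % q)
```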

SLIDE 13

Random Linear Codes

  • Linear code looks like a regular subsampling of the elements of Fq^n.
  • Random linear code: Generate each element gij of the generator
    matrix G elementwise i.i.d. according to a uniform distribution over
    {0, 1, 2, . . . , q − 1}.
  • How are the codewords distributed?

[Figure: codewords of a linear code form a regular grid in Fq × Fq]


SLIDE 15

Codeword Distribution

It is convenient to instead analyze the shifted ensemble x̄ = Gw ⊕ v, where v is an i.i.d. uniform sequence. (See Gallager.)

Shifted Codeword Properties

  • 1. Marginally uniform over Fq^n. For a given message w, the codeword x̄
       looks like an i.i.d. uniform sequence:
       P{x̄ = x} = 1/q^n for all x ∈ Fq^n
  • 2. Pairwise independent. For w1 ≠ w2, codewords x̄1, x̄2 are independent:
       P{x̄1 = x1, x̄2 = x2} = 1/q^{2n} = P{x̄1 = x1} P{x̄2 = x2}
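These two properties can be checked empirically; below is a small Monte Carlo sketch (my own) over random pairs (G, v), for toy parameters q = 2, k = 2, n = 3.

```python
# Over random (G, v), the codeword pair for two distinct messages should
# be uniform over all q^{2n} = 64 combinations (uniform marginals plus
# pairwise independence).
import numpy as np
from collections import Counter

q, k, n, trials = 2, 2, 3, 100_000
rng = np.random.default_rng(2)
w1, w2 = np.array([1, 0]), np.array([0, 1])   # two fixed distinct messages

pairs = Counter()
for _ in range(trials):
    G = rng.integers(0, q, size=(n, k))
    v = rng.integers(0, q, size=n)
    pairs[(tuple((G @ w1 + v) % q), tuple((G @ w2 + v) % q))] += 1

freqs = np.array(list(pairs.values())) / trials
print(len(pairs), freqs.min(), freqs.max())   # 64 pairs, each with freq ≈ 1/64
```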

SLIDE 16

Achievable Rates

  • Cuckoo’s Egg Lemma only requires independence between the true
    codeword x(w) and the other codeword x(w̃). From the union bound:
    P{ŵ ≠ w} ≤ Σ_{w̃ ≠ w} P{(x(w̃), y) is jointly typical}
             ≤ 2^{−n(I(X;Y)−R−3ε)}
  • This is exactly what we get from pairwise independence.
  • Thus, there exists a good fixed generator matrix G and shift v for
    any rate R < I(X; Y) where X is uniform.

SLIDE 17

Removing the Shift

[Diagram: w → E → x̄ → (⊕ z) → ȳ → D → ŵ]

  • For a binary symmetric channel (BSC), the output can be written as
    the modulo sum of the input plus i.i.d. Bernoulli(p) noise:
    ȳ = x̄ ⊕ z = Gw ⊕ v ⊕ z
  • Due to this symmetry, the probability of error depends only on the
    realization of the noise vector z.
    ⇒ For a BSC, x = Gw is a good code as well.
  • We can now assume the existence of good generator matrices for
    channel coding.

SLIDE 18

Random I.I.D. vs. Random Linear

  • What have we gotten for linearity (so far)?
    Simplified encoding. (Decoder is still quite complex.)
  • What have we lost?
    Can only achieve R = I(X; Y) for uniform X instead of max_{pX} I(X; Y).
  • In fact, this is a fundamental limitation of group codes, Ahlswede ’71.
  • Workarounds: symbol remapping (Gallager ’68), nested linear codes.
  • Are random linear codes strictly worse than random i.i.d. codes?
SLIDE 19

Slepian-Wolf Problem

[Diagram: s1 → E1 → R1; s2 → E2 → R2; joint decoder D → ŝ1, ŝ2]

  • Joint i.i.d. sources: p(s1, s2) = ∏_{i=1}^m pS1S2(s1i, s2i)
  • Rate Region: Set of rates (R1, R2) such that the encoders can
    send s1 and s2 to the decoder with vanishing probability of error:
    P{(ŝ1, ŝ2) ≠ (s1, s2)} → 0 as m → ∞

SLIDE 20

Random Binning

  • Codebook 1: Independently and uniformly assign each source
    sequence s1 to a label in {1, 2, . . . , 2^{mR1}}
  • Codebook 2: Independently and uniformly assign each source
    sequence s2 to a label in {1, 2, . . . , 2^{mR2}}
  • Decoder: Look for a jointly typical pair (ŝ1, ŝ2) within the received
    bin. Union bound:
    P{∃ jointly typical (s̃1, s̃2) ≠ (s1, s2) in bin (ℓ1, ℓ2)}
      ≤ Σ_{jointly typical (s̃1, s̃2)} 2^{−m(R1+R2)}
      ≤ 2^{m(H(S1,S2)+ε)} 2^{−m(R1+R2)}
  • Need R1 + R2 > H(S1, S2).
  • Similarly, R1 > H(S1|S2) and R2 > H(S2|S1).
SLIDE 21

Slepian-Wolf Problem: Binning Illustration

[Figure: bins indexed 1 . . . 2^{nR1} by 1 . . . 2^{nR2}]


SLIDE 23

Random Linear Binning

  • Assume source symbols take values in Fq.
  • Codebook 1: Generate matrix G1 with i.i.d. uniform entries drawn

from Fq. Each sequence s1 is binned via matrix multiplication, w1 = G1s1.

  • Codebook 2: Generate matrix G2 with i.i.d. uniform entries drawn

from Fq. Each sequence s2 is binned via matrix multiplication, w2 = G2s2.

  • Bin assignments are uniform and pairwise independent (except for

sℓ = 0)

  • Can apply the same union bound analysis as random binning.
SLIDE 24

Slepian-Wolf Rate Region

Slepian-Wolf Theorem

Reliable compression is possible if and only if:
    R1 ≥ H(S1|S2) = hB(p)
    R2 ≥ H(S2|S1) = hB(p)
    R1 + R2 ≥ H(S1, S2) = 1 + hB(p)

Random linear binning is as good as random i.i.d. binning!

[Figure: Slepian-Wolf rate region in the (R1, R2) plane, corner points at hB(p), boundary R1 + R2 = 1 + hB(p)]

Example: Doubly Symmetric Binary Source
    S1 ∼ Bern(1/2), U ∼ Bern(p), S2 = S1 ⊕ U

SLIDE 25

Körner-Marton Problem

  • Binary sources:
  • s1 is i.i.d. Bernoulli(1/2)
  • s2 is s1 corrupted by Bernoulli(p) noise
  • Decoder wants the modulo-2 sum u = s1 ⊕ s2.

[Diagram: s1 → E1 → R1; s2 → E2 → R2; decoder D → û]

Rate Region: Set of rates (R1, R2) such that there exist encoders and decoders with vanishing probability of error, P{û ≠ u} → 0 as m → ∞.

Are any rate savings possible over sending s1 and s2 in their entirety?

SLIDE 26

Random Binning

  • Sending s1 and s2 with random binning requires R1 + R2 > 1 + hB(p).
  • What happens if we use rates such that R1 + R2 < 1 + hB(p)?
  • There will be exponentially many pairs (s1, s2) in each bin!
  • This would be fine if all pairs in a bin had the same sum, s1 ⊕ s2.
    But the probability of this goes to zero exponentially fast!

SLIDE 27

Körner-Marton Problem: Random Binning Illustration

[Figure: bins indexed 1 . . . 2^{nR1} by 1 . . . 2^{nR2}]


SLIDE 29

Linear Binning

  • Use the same random matrix G for linear binning at each encoder:

w1 = Gs1 w2 = Gs2

  • Idea from Körner-Marton ’79: Decoder adds up the bins.

    w1 ⊕ w2 = Gs1 ⊕ Gs2 = G(s1 ⊕ s2) = Gu

  • G is good for compressing u if R > H(U) = hB(p).

Körner-Marton Theorem

Reliable compression of the sum is possible if and only if:
    R1 ≥ hB(p) and R2 ≥ hB(p).
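A toy numerical sketch (mine, not from the slides) of the Körner-Marton trick; the source length, number of bin bits, and noise level are arbitrary illustrative choices.

```python
# Both encoders bin with the SAME matrix G, so the decoder can form
# w1 ⊕ w2 = G(s1 ⊕ s2) and only ever needs a good code for the sum u.
import numpy as np

m, r = 1000, 600                        # source length, number of bin bits
rng = np.random.default_rng(3)

s1 = rng.integers(0, 2, m)
s2 = (s1 + (rng.random(m) < 0.05)) % 2  # s2 = s1 XOR Bern(0.05) noise
u = (s1 + s2) % 2                       # the sum the decoder wants

G = rng.integers(0, 2, size=(r, m))     # common binning matrix over F_2
w1 = (G @ s1) % 2                       # encoder 1's bin index
w2 = (G @ s2) % 2                       # encoder 2's bin index

# Decoder: add the bins. This equals the bin index of u itself.
assert np.array_equal((w1 + w2) % 2, (G @ u) % 2)
```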

SLIDE 30

Körner-Marton Problem: Linear Binning Illustration

[Figure: both sources binned with the same linear map; bins indexed 1 . . . 2^{nR1} by 1 . . . 2^{nR2}]


SLIDE 32

Körner-Marton Rate Region

[Figure: (R1, R2) plane comparing the S-W and K-M regions; K-M corner at (hB(p), hB(p))]

Linear codes can improve performance!
(for distributed computation of dependent sources)

SLIDE 33

Multiple-Access Channels

[Diagram: w1 → E1 → x1; w2 → E2 → x2; p_{Y|X1X2} → y → D → ŵ1, ŵ2]

  • Rate Region: Set of rates (R1, R2) such that the encoders can
    send w1 and w2 to the decoder with vanishing probability of error:
    P{(ŵ1, ŵ2) ≠ (w1, w2)} → 0 as n → ∞

SLIDE 34

Multiple-Access Channels

  • Cuckoo’s Egg Lemma applies to all three error events.
  • For example, the event that only ŵ1 is wrong:
    P{ŵ1 ≠ w1, ŵ2 = w2} ≤ Σ_{w̃1 ≠ w1} P{(x1(w̃1), x2(w2), y) jointly typical}
                        ≤ 2^{−n(I(X1;Y|X2)−R1−3ε)}

Rate Region (Ahlswede, Liao)

Convex closure of all (R1, R2) satisfying
    R1 < I(X1; Y|X2)
    R2 < I(X2; Y|X1)
    R1 + R2 < I(X1, X2; Y)
for some p(x1)p(x2).

SLIDE 35

Finite-Field Multiple-Access Channels

  • Linear codes can achieve any rate available for uniform p(x1), p(x2).
  • For finite field MACs, can achieve the whole capacity region.
  • Receiver observes the noisy modulo sum of the codewords: y = x1 ⊕ x2 ⊕ z

[Diagram: w1 → E1 → x1; w2 → E2 → x2; y = x1 ⊕ x2 ⊕ z → D → ŵ1, ŵ2]
[Figure: (R1, R2) region bounded by R1 + R2 = log q − H(Z)]

Finite Field MAC Rate Region

All rates (R1, R2) satisfying R1 + R2 ≤ log q − H(Z)

SLIDE 36

Computation over Finite Field Multiple-Access Channels

  • Independent messages w1, w2 ∈ Fq^k.
  • Want the sum u = w1 ⊕ w2 with vanishing probability of error,
    P{û ≠ u} → 0.

[Diagram: w1 → E1 → x1; w2 → E2 → x2; y = x1 ⊕ x2 ⊕ z → D → û]

I.I.D. Random Coding

  • Generate 2nR1 i.i.d. uniform codewords for user 1.
  • Generate 2nR2 i.i.d. uniform codewords for user 2.
  • With high probability, (nearly) all sums of codewords are distinct.
  • This is ideal for multiple-access but not for computation.
  • Need R1 + R2 ≤ log q − H(Z)
SLIDE 37

Random i.i.d. codes are not good for computation

[Figure: 2^{nR1} codewords and 2^{nR2} codewords yield up to 2^{n(R1+R2)} distinct modulo sums at the channel output]

SLIDE 38

Computation over Finite Field Multiple-Access Channels

Independent messages w1, w2. Want the sum u = w1 ⊕ w2 with vanishing probability of error, P{û ≠ u} → 0.

[Diagram: w1 → E1 → x1; w2 → E2 → x2; y = x1 ⊕ x2 ⊕ z → D → û]

Random Linear Coding

  • Same linear code at both transmitters x1 = Gw1, x2 = Gw2.
  • Sums of codewords are themselves codewords:

y = x1 ⊕ x2 ⊕ z = Gw1 ⊕ Gw2 ⊕ z = G(w1 ⊕ w2) ⊕ z = Gu ⊕ z

  • Need max(R1, R2) ≤ log q − H(Z)
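To see the algebra concretely, here is a short numerical sketch (my own, not from the slides) showing that with a common linear code the channel output is itself a noisy encoding of the desired sum.

```python
# With the same linear code at both users, the noiseless part of the
# finite-field MAC output is the codeword of u = w1 ⊕ w2, i.e. y = Gu ⊕ z.
import numpy as np

q, k, n = 3, 4, 12
rng = np.random.default_rng(4)

G = rng.integers(0, q, size=(n, k))     # common linear code at both users
w1 = rng.integers(0, q, size=k)
w2 = rng.integers(0, q, size=k)
z = rng.integers(0, q, size=n)          # additive channel noise over F_q

x1, x2 = (G @ w1) % q, (G @ w2) % q
y = (x1 + x2 + z) % q                   # finite-field MAC output

u = (w1 + w2) % q
assert np.array_equal(y, ((G @ u) + z) % q)   # y = Gu ⊕ z exactly
# So decoding u from y is a single-user decoding problem for the code G.
```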
SLIDE 39

Random linear codes are good for computation

[Figure: with a common linear code, the modulo sums of codewords collapse onto just 2^{n max(R1,R2)} codewords]

SLIDE 40

Computation over Finite Field Multiple-Access Channels

[Figure: (R1, R2) regions for linear vs. i.i.d. coding, both bounded by log q − H(Z) on each axis]

  • I.I.D. Random Coding: R1 + R2 ≤ log q − H(Z)
  • Random Linear Coding: max (R1, R2) ≤ log q − H(Z)
  • Linear codes double the sum rate without any dependency.
  • Is this useful for sending messages (no computation)?
SLIDE 41

Two-Way Relay Channel

[Diagram: user 1 has w1 and wants w2; user 2 has w2 and wants w1; all communication passes through a relay]

  • Elegant example proposed by Wu-Chou-Kung ’04.
  • Closely related to butterfly network from Ahlswede-Cai-Li-Yeung ’00.
SLIDE 42

Two-Way Relay Channel – Time-Division

[Figure: time-division schedule, phases (a)–(d): each message travels to the relay and is forwarded separately]

SLIDE 43

Two-Way Relay Channel – Network Coding

[Figure: network coding, phases (a)–(c): the relay broadcasts w1 ⊕ w2]

SLIDE 44

Two-Way Relay Channel – Physical-Layer Network Coding

[Figure: physical-layer network coding, phases (a)–(b): simultaneous uplink, then broadcast of w1 ⊕ w2]

SLIDE 45

Two-Way Relay Channel – Physical-Layer Network Coding

[Figure: physical-layer network coding, phases (a)–(b): simultaneous uplink, then broadcast of w1 ⊕ w2]

  • Physical-layer network coding: exploiting the wireless medium for

network coding. Independently and concurrently proposed by

Zhang-Liew-Lam ’06, Popovski-Yomo ’06, Nazer-Gastpar ’06.

  • Sometimes referred to as Analog Network Coding

Katti-Gollakota-Katabi ’08.

  • Some recent surveys Liew-Zhang-Lu ’11, Nazer-Gastpar ’11.
SLIDE 46

q-ary Two-Way Relay Channel

[Diagram: user 1 has w1 and wants w2; user 2 has w2 and wants w1; all communication passes through a relay]

SLIDE 47

q-ary Two-Way Relay Channel

[Diagram: uplink multiple-access channel from the users to the relay, downlink broadcast channel from the relay to the users; user 1 decodes ŵ2 and user 2 decodes ŵ1]

SLIDE 48

q-ary Two-Way Relay Channel

[Diagram: uplink yMAC = x1 ⊕ x2 ⊕ zMAC at the relay; downlink broadcast xBC with noise z1, z2 at the users]

  • i.i.d. noise sequences with entropy H(Z).
  • Rates R1 and R2.
  • Upper Bound: max(R1, R2) ≤ log q − H(Z)
  • Random i.i.d.: Relay decodes w1, w2 and transmits w1 ⊕ w2.
    Requires R1 + R2 ≤ log q − H(Z).
  • Random linear: Relay decodes and retransmits w1 ⊕ w2 directly.
    Requires max(R1, R2) ≤ log q − H(Z).

SLIDE 49

q-ary Two-Way Relay Channel

[Figure: (R1, R2) regions for linear vs. i.i.d. coding, both bounded by log q − H(Z) on each axis]

  • I.I.D. Random Coding: R1 + R2 ≤ log q − H(Z)
  • Random Linear Coding: max (R1, R2) ≤ log q − H(Z)
  • Linear codes can double the sum rate for exchanging messages.
SLIDE 50

Generalizing Linear Codes...

  • Observation: For linear codes, the codeword statistics are uniform.

This follows straightforwardly from the fact that the sum of any two codewords is again a codeword.

  • Question: Can we retain some algebraic structure and have

non-uniform codeword statistics?

  • Idea: Nested Linear Codes (see, for instance, Conway and Sloane ’92,
    Forney ’89, Zamir-Shamai-Erez ’02, . . . ).

SLIDE 51

Exercise: Beyond Linear Models

Independent messages w1, w2 of equal rate R. Want the sum u = w1 ⊕ w2 with vanishing probability of error, P{û ≠ u} → 0.

[Diagram: w1 → E1 → x1; w2 → E2 → x2 → p_{Y|X1X2} → y → D → û]

Prove that an achievable rate is R < I(X1 ⊕ X2; Y), where X1 and X2 are independent and uniformly distributed. (Nazer-Gastpar ’08)

SLIDE 52

Outline

  • I. Discrete Alphabets
  • II. AWGN Channels
  • III. Network Applications
SLIDE 53

Main References

Nested lattice results in this section are almost entirely drawn from:

  • U. Erez and R. Zamir, “Achieving 1/2 log(1 + SNR) on the AWGN
    channel with lattice encoding and decoding,” IEEE Transactions on
    Information Theory, vol. 50, pp. 2293–2314, October 2004.
  • U. Erez, S. Litsyn, and R. Zamir, “Lattices which are good for (almost)
    everything,” IEEE Transactions on Information Theory, vol. 51,
    pp. 3401–3416, October 2005.
  • R. Zamir, “Lattices are everywhere,” in Proceedings of the 4th Annual
    Workshop on Information Theory and its Applications, La Jolla, CA,
    February 2009.

SLIDE 54

Gaussian MMSE Estimation

  • Signal X is a scalar Gaussian r.v. with mean 0 and variance P.
  • Noise Z is an independent scalar Gaussian r.v. with mean 0 and

variance N.

  • Estimate X from noisy observation Y = X + Z.
  • Mean-squared error: E[(Y − X)2] = E[Z2] = N.
  • Minimum mean-squared error (MMSE): scale Y by α before estimating,
    E[(αY − X)²] = E[(αX + αZ − X)²]
                 = E[α²Z² + (1 − α)²X²]     ← second term is the part of the error due to X
                 = α²N + (1 − α)²P
  • Optimal α = P/(N + P) yields E[(αY − X)²] = PN/(N + P).
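A quick Monte Carlo sketch (mine, not from the slides) confirming these two facts numerically, for the arbitrary choice P = 4, N = 1.

```python
# alpha = P/(N+P) minimizes E[(alpha*Y - X)^2], achieving P*N/(N+P).
import numpy as np

P, N, n = 4.0, 1.0, 1_000_000
rng = np.random.default_rng(5)

X = rng.normal(0, np.sqrt(P), n)
Z = rng.normal(0, np.sqrt(N), n)
Y = X + Z

alpha = P / (N + P)
mse = np.mean((alpha * Y - X) ** 2)
print(mse, P * N / (N + P))            # both ≈ 0.8

# Any other alpha does worse, e.g. alpha = 1 (raw observation):
print(np.mean((Y - X) ** 2))           # ≈ N = 1.0 > 0.8
```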

SLIDE 55

Point-to-Point AWGN Channels

  • Codewords must satisfy the power constraint ‖x‖² ≤ nP.
  • i.i.d. Gaussian noise with variance N: z ∼ N(0, N I).
  • Shannon ’48: Channel capacity C = (1/2) log(1 + P/N)

[Diagram: w → E → x → (+ z) → y → D → ŵ]

(Cover and Thomas, Elements of Information Theory)

  • In high dimensions, the noise starts to look spherical.
SLIDE 56

Lattices

  • A lattice Λ is a discrete subgroup of R^n.
  • Can write a lattice as a linear transformation of the integer vectors,
    Λ = BZ^n, for some B ∈ R^{n×n}.

Lattice Properties

  • Closed under addition: λ1, λ2 ∈ Λ ⇒ λ1 + λ2 ∈ Λ.
  • Symmetric: λ ∈ Λ ⇒ −λ ∈ Λ.

[Figures: the simple lattice Z^n; a general lattice BZ^n]


SLIDE 58

Voronoi Regions

  • Nearest neighbor quantizer: Q_Λ(x) = arg min_{λ∈Λ} ‖x − λ‖²
  • The Voronoi region of a lattice point is the set of all points that
    quantize to that lattice point.
  • Fundamental Voronoi region V: points that quantize to the origin,
    V = {x : Q_Λ(x) = 0}
  • Each Voronoi region is just a shift of the fundamental Voronoi region V.
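For the integer lattice Z^n, the nearest-neighbor quantizer is just componentwise rounding; here is a minimal sketch (mine, not from the slides).

```python
# Lattice quantizer and fundamental Voronoi region for Λ = Z^n.
import numpy as np

def quantize_Zn(x):
    """Nearest-neighbor quantizer Q_Λ for Λ = Z^n (round-half-up, componentwise)."""
    return np.floor(x + 0.5)

x = np.array([0.3, -1.7, 2.5])
print(quantize_Zn(x))                   # [ 0. -2.  3.]

# x lies in the fundamental Voronoi region V = [-1/2, 1/2)^n
# exactly when it quantizes to the origin.
print(np.all(quantize_Zn(x) == 0))      # False for this x
```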


SLIDE 60

Nested Lattices

  • Two lattices Λ and Λ_FINE are nested if Λ ⊂ Λ_FINE.
  • Nested Lattice Code: All lattice points from Λ_FINE that fall in the
    fundamental Voronoi region V of Λ.
  • V acts like a power constraint.

    Rate = (1/n) log( Vol(V) / Vol(V_FINE) )


SLIDE 65

Nested Lattice Codes from q-ary Linear Codes

  • Choose an n × k generator matrix G ∈ Fq^{n×k} for a q-ary code.
  • Integers serve as the coarse lattice, Λ = Z^n.
  • Map the elements {0, 1, 2, . . . , q − 1} to equally spaced points
    between −1/2 and 1/2.
  • Place codewords x = Gw into the fundamental Voronoi region
    V = [−1/2, 1/2)^n.

[Figure: q-ary codeword grid mapped into the square [−1/2, 1/2)²]
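A sketch of this construction (my own; the scaling γ = 1/q is a concrete illustrative choice, not fixed by the slide), following the formula x = [γGw] mod Z^n that appears later in the deck.

```python
# Building a nested lattice codebook from a q-ary linear code: the coarse
# lattice Z^n does the shaping, the scaled code provides the fine points.
import numpy as np

q, k, n = 5, 2, 4
rng = np.random.default_rng(6)
G = rng.integers(0, q, size=(n, k))

def mod_Zn(x):
    """[x] mod Z^n: reduce each coordinate into [-1/2, 1/2)."""
    return x - np.floor(x + 0.5)

def encode(w, gamma=1.0 / q):
    return mod_Zn(gamma * ((G @ w) % q))

# Enumerate the whole codebook: fine-lattice points inside [-1/2, 1/2)^n.
codebook = {tuple(encode(np.array([a, b]))) for a in range(q) for b in range(q)}
print(len(codebook))   # ≤ q^k = 25; equals q^k when G has full column rank over F_q
```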

SLIDE 66

Modulo Operation

  • Modulo operation with respect to lattice Λ is just the residual
    quantization error, [x] mod Λ = x − Q_Λ(x).
  • Mimics the role of mod q in a q-ary alphabet.
  • Distributive Law: [x1 + [x2] mod Λ] mod Λ = [x1 + x2] mod Λ
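A numerical check (mine, not from the slides) of the mod-Λ operation and the distributive law, for Λ = Z^n with rounding as the quantizer.

```python
import numpy as np

def Q(x):                               # nearest-neighbor quantizer for Z^n
    return np.floor(x + 0.5)

def mod_L(x):                           # [x] mod Λ = x − Q_Λ(x)
    return x - Q(x)

rng = np.random.default_rng(7)
x1, x2 = rng.normal(size=4), rng.normal(size=4)

lhs = mod_L(x1 + mod_L(x2))
rhs = mod_L(x1 + x2)
print(np.allclose(lhs, rhs))            # True: the distributive law
```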


SLIDE 69

mod Λ AWGN Channel

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

  • Codebook lives on Voronoi region V of coarse lattice Λ.
  • Take mod Λ of received signal prior to decoding.
  • What is the capacity of the mod Λ channel?
SLIDE 71

mod Λ AWGN Channel

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

  • Codebook lives on Voronoi region V of coarse lattice Λ.
  • Take mod Λ of received signal prior to decoding.
  • What is the capacity of the mod Λ channel?

Using a random i.i.d. code drawn over V:  C = (1/n) max_{p(x)} I(x; ỹ)

SLIDE 72

mod Λ AWGN Channel Capacity

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

nC = max_{p(x)} I(x; ỹ) = max_{p(x)} [ h(ỹ) − h(ỹ|x) ]

SLIDE 73

mod Λ AWGN Channel Capacity

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

nC = max_{p(x)} I(x; ỹ)
   = max_{p(x)} [ h(ỹ) − h(ỹ|x) ]
   = max_{p(x)} [ h(ỹ) − h([z] mod Λ) ]      ← Distributive Law
SLIDE 74

mod Λ AWGN Channel Capacity

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

nC = max_{p(x)} I(x; ỹ)
   = max_{p(x)} [ h(ỹ) − h(ỹ|x) ]
   = max_{p(x)} [ h(ỹ) − h([z] mod Λ) ]      ← Distributive Law
   ≥ max_{p(x)} [ h(ỹ) − h(z) ]              ← Point Symmetry of Voronoi Region
SLIDE 75

mod Λ AWGN Channel Capacity

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

nC = max_{p(x)} I(x; ỹ)
   = max_{p(x)} [ h(ỹ) − h(ỹ|x) ]
   = max_{p(x)} [ h(ỹ) − h([z] mod Λ) ]      ← Distributive Law
   ≥ max_{p(x)} [ h(ỹ) − h(z) ]              ← Point Symmetry of Voronoi Region
   = max_{p(x)} h(ỹ) − (n/2) log(2πeN)       ← Entropy of Gaussian Noise
SLIDE 76

mod Λ AWGN Channel Capacity

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

  • Channel output entropy equals the logarithm of the Voronoi region
    volume if the output is uniform over V: h(ỹ) = log(Vol(V)) if ỹ ∼ Unif(V).
  • ỹ = [x + z] mod Λ is uniform over V if x is uniform over V.
  • Random i.i.d. coding over the Voronoi region V can achieve:
    R = (1/n) log(Vol(V)) − (1/2) log(2πeN)

SLIDE 77

Power Constraints and Second Moments

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

  • Must scale lattice Λ so that the uniform distribution over the
    Voronoi region V meets the power constraint P.
  • Set the second moment σ²_Λ = (1/(n Vol(V))) ∫_V ‖x‖² dx equal to P.

SLIDE 78

Power Constraints and Second Moments

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

  • Must scale lattice Λ so that the uniform distribution over the
    Voronoi region V meets the power constraint P.
  • Set the second moment σ²_Λ = (1/(n Vol(V))) ∫_V ‖x‖² dx equal to P.

Normalized Second Moment:
    G(Λ) = σ²_Λ / (Vol(V))^{2/n}
    ⇒ (1/n) log(Vol(V)) = (1/2) log(σ²_Λ / G(Λ)) = (1/2) log(P / G(Λ))
SLIDE 79

mod Λ AWGN Channel Capacity

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

  • Random i.i.d. coding over the Voronoi region V can achieve:
    C ≥ (1/n) log(Vol(V)) − (1/2) log(2πeN)
      = (1/2) log(P / G(Λ)) − (1/2) log(2πeN)
      = (1/2) log(P/N) − (1/2) log(2πe G(Λ))
SLIDE 80

What is G(Λ)?

[Diagram: w → E → x → (+ z) → y → mod Λ → ỹ → D → ŵ]

  • The normalized second moment G(Λ) is a dimensionless quantity
    that captures the shaping gain.
  • The integer lattice is not so bad: G(Z^n) = 1/12.
  • Capacity under mod Z^n is at least
    C ≥ (1/2) log(P/N) − (1/2) log(2πe/12) ≈ (1/2) log(P/N) − 0.255
SLIDE 81

Asymptotically Good G(Λ)

Theorem (Zamir-Feder-Poltyrev ’94)

There exists a sequence of lattices Λ^(n) such that lim_{n→∞} G(Λ^(n)) = 1/(2πe).

[Figure: Voronoi regions becoming rounder as n grows, from n = 1 and n = 2 toward n → ∞]

  • Best possible normalized second moment is that of a sphere.
  • Using a sequence Λ^(n) with an asymptotically good G(Λ^(n)) allows us
    to approach
    R = (1/2) log(P/N) − (1/2) log(2πe/(2πe)) = (1/2) log(P/N)
SLIDE 82

Asymptotically Good G(Λ)

  • Can actually get this with a linear code tiled over Zn (see, for

instance, Erez-Litsyn-Zamir ’05.)

  • Many works looking at this from different perspectives.
  • We will just assume existence.
SLIDE 83

Properties of Random Linear Codes

Recall the two key properties of random linear codes G from earlier:

Codeword Properties

  • 1. Marginally uniform over Fq^n. For a given message w ≠ 0, the
       codeword x = Gw looks like an i.i.d. uniform sequence:
       P{x = x} = 1/q^n for all x ∈ Fq^n
  • 2. Pairwise independent. For w1, w2 ≠ 0 with w1 ≠ w2, codewords x1, x2
       are independent:
       P{x1 = x1, x2 = x2} = 1/q^{2n} = P{x1 = x1} P{x2 = x2}

SLIDE 84

Linear Codes for mod Λ Channels

  • Instead of an “inner” random code, we can use a q-ary linear code.
  • This is exactly a nested lattice.
  • Each codeword has a uniform marginal distribution over the grid.
  • Rate loss due to the finite constellation goes to 0 as q → ∞.
  • Codewords are pairwise independent, so we can apply the union bound.

[Figure: q-ary codeword grid mapped into the square [−1/2, 1/2)²]

    x = [γGw] mod Z^n

SLIDE 85

Linear Codes for mod Λ Channels

  • General coarse lattice Λ = BZ^n.
  • First, apply the generator matrix for the linear code: Gw. Then scale
    down by γ and tile over Z^n.
  • Multiply by B and apply mod Λ to get the codebook.
  • As q gets large, each codeword’s marginal distribution looks uniform over V.
  • Codewords are pairwise independent, so we can apply the union bound.

    x = [BγGw] mod Λ

SLIDE 86

MMSE Scaling

  • Erez-Zamir ’04: Prior to taking mod Λ, scale by α:
    ỹ = [αy] mod Λ = [αx + αz] mod Λ = [x + αz − (1 − α)x] mod Λ
    where αz − (1 − α)x is the effective noise.
  • For now, ignore that the effective noise is not independent of the
    codeword. Effective noise variance: N_EFFEC = α²N + (1 − α)²P.
  • Optimal choice of α is the MMSE coefficient α_MMSE = P/(N + P):
    N_EFFEC = α²_MMSE N + (1 − α_MMSE)² P = PN/(N + P)
    C = (1/2) log(P / N_EFFEC) = (1/2) log(1 + P/N)
SLIDE 87

Dithering

  • Now the noise is dependent on the

codeword.

  • Dithering can solve this problem (just as in

the discrete case).

  • Map message w to a lattice codeword t.
  • Generate a random dither vector d

uniformly over V.

  • Transmitter sends a dithered codeword:

x = [t + d] mod Λ

  • x is now independent of the codeword t.
SLIDE 89

Decoding – Remove Dither First

  • Transmitter sends dithered codeword x = [t + d] mod Λ.
  • After scaling the channel output y by α, the decoder subtracts the dither d:
    ỹ = [αy − d] mod Λ
      = [αx + αz − d] mod Λ
      = [x − d + αz − (1 − α)x] mod Λ
      = [[t + d] mod Λ − d + αz − (1 − α)x] mod Λ
      = [t + αz − (1 − α)x] mod Λ      ← Distributive Law
  • Effective noise is now independent of the codeword t.
  • By the probabilistic method, (at least) one good fixed dither exists.
    No common randomness is necessary.
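A numerical check (mine, for Λ = Z^n with rounding as the quantizer) that the dither-removal chain above holds exactly.

```python
# Verify [αy − d] mod Λ == [t + αz − (1 − α)x] mod Λ for Λ = Z^n.
import numpy as np

def mod_L(x):                           # [x] mod Z^n
    return x - np.floor(x + 0.5)

rng = np.random.default_rng(9)
n, alpha = 6, 0.8

t = mod_L(rng.normal(size=n))           # codeword (any point in V works here)
d = rng.uniform(-0.5, 0.5, size=n)      # dither, uniform over V
z = rng.normal(0, 0.1, size=n)          # channel noise

x = mod_L(t + d)                        # transmitted signal
y = x + z                               # channel output

lhs = mod_L(alpha * y - d)
rhs = mod_L(t + alpha * z - (1 - alpha) * x)
print(np.allclose(lhs, rhs))            # True
```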

SLIDE 90

Summary

  • Linear code embedded in the integer lattice:
    R = (1/2) log(P/N) − (1/2) log(2πe/12)
  • Linear code embedded in the integer lattice, with MMSE scaling:
    R = (1/2) log(1 + P/N) − (1/2) log(2πe/12)
  • Linear code embedded in a good shaping lattice, with MMSE scaling:
    R = (1/2) log(1 + P/N)

Theorem (Erez-Zamir ’04)

Nested lattice codes can achieve the AWGN capacity.