# Joint Source and Channel Coding: Fundamental Bounds and Connections

## Joint Source and Channel Coding: Fundamental Bounds and Connections to Machine Learning

Deniz Gündüz, Imperial College London

18 April 2019, European School of Information Theory (ESIT)

Overview. Part I: Information theoretic limits

1. Converse Proof

   If $P_e^{m,n} \to 0$, then $H(S) \le rC$ for any sequence of encoder-decoder pairs with $n \le r \cdot m$.

   From Fano's inequality:

   $$H(S^m \mid \hat{S}^m) \le 1 + P_e^{m,n} \log|\mathcal{S}^m| = 1 + P_e^{m,n}\, m \log|\mathcal{S}|$$

   Hence,

   $$
   \begin{aligned}
   H(S) &= \frac{1}{m} H(S^m \mid \hat{S}^m) + \frac{1}{m} I(S^m; \hat{S}^m) && \text{(chain rule)} \\
   &\le \frac{1}{m}\left(1 + P_e^{m,n}\, m \log|\mathcal{S}|\right) + \frac{1}{m} I(S^m; \hat{S}^m) && \text{(Fano's inequality)} \\
   &\le \frac{1}{m}\left(1 + P_e^{m,n}\, m \log|\mathcal{S}|\right) + \frac{1}{m} I(X^n; Y^n) && \text{(data processing inequality, } S^m - X^n - Y^n - \hat{S}^m \text{)} \\
   &\le \frac{1}{m} + P_e^{m,n} \log|\mathcal{S}| + rC && \text{(capacity theorem)}
   \end{aligned}
   $$

   Letting $m, n \to \infty$, if $P_e^{m,n} \to 0$, we get $H(S) \le rC$.

   Optimality of separation continues to hold in the presence of feedback!

   Deniz Gündüz, Joint Source and Channel Coding

3. Benefits and Limitations of Separation

   Separation is good because it brings modularity: we can benefit from existing source and channel coding techniques.

   [Figure: image and audio encoders and decoders sharing a common channel encoder, channel and channel decoder.]

   But it requires infinite delay and complexity, it assumes an ergodic source and channel, and there is no separation theorem for multi-user networks.

5. Receiver Side Information

   [Figure: Encoder, Channel, Decoder.]

   - The receiver has correlated side information, e.g., in a sensor network.
   - Separation is optimal (Shamai, Verdú, '95). Optimal source-channel rate: $r = \frac{H(S \mid T)}{C}$.
   - Lossy transmission: the minimum distortion is $D_{WZ}(rC)$, where $D_{WZ}$ is the Wyner-Ziv distortion-rate function.
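
As a rough numerical illustration of the rate $r = H(S \mid T)/C$, the sketch below assumes a binary example that is not on the slide: $S$ is a uniform bit, the side information $T$ is $S$ flipped with probability `q`, and the channel is a binary symmetric channel with crossover `p`. The helper names and the values of `q` and `p` are hypothetical.

```python
from math import log2


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def min_channel_uses_per_sample(entropy_bits: float, capacity_bits: float) -> float:
    """Smallest r = n/m that satisfies entropy <= r * C."""
    return entropy_bits / capacity_bits


q = 0.1   # assumed crossover between S and the side information T
p = 0.11  # assumed crossover probability of the binary symmetric channel

capacity = 1 - binary_entropy(p)                                        # C of the BSC
r_no_side_info = min_channel_uses_per_sample(1.0, capacity)             # H(S) = 1 bit
r_side_info = min_channel_uses_per_sample(binary_entropy(q), capacity)  # H(S|T) = h(q)

print(f"C = {capacity:.4f} bits per channel use")
print(f"r without side information = {r_no_side_info:.4f}")
print(f"r with side information    = {r_side_info:.4f}")
```

Under these assumed parameters the receiver side information roughly halves the number of channel uses needed per source sample.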

6. No Side Information (reminder)

   When there is no side information, there is no need for binning.

   [Figure: source space and channel space.]

7. With Side Information: Binning

   When there is side information at the receiver, we map multiple source codewords to the same channel codeword.

   [Figure: source space and channel space.]

8. With Side Information: Binning

   First decode the channel codeword. There are multiple candidates for the source codeword from the same bin.

   [Figure: source space.]

9. With Side Information: Binning

   Correlated side information $T^m$: choose the source codeword in the bin that is jointly typical with $T^m$.

   [Figure: source space.]

10. Random Binning (Slepian-Wolf Coding)

    [Figure: typical sets.]

    - Randomly assign source vectors to bins such that there are $\sim 2^{m[I(S;T) - \epsilon]}$ elements in each bin.
    - There are sufficiently few elements in each bin to decode $S^m$ using typicality.
    - Even if the sender knew $T^m$, the source coding rate could not be lower than $H(S \mid T)$.
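
The binning idea can be tried on a toy scale. The sketch below is an illustration under stated assumptions, not the asymptotic scheme itself: it uses a short block length (`m = 10`), 64 bins, and nearest-Hamming-distance decoding as a stand-in for joint typicality decoding. The function names are hypothetical.

```python
import random
from itertools import product


def make_bins(m, num_bins, seed=0):
    """Randomly assign every length-m binary sequence to one of num_bins bins."""
    rng = random.Random(seed)
    bins = {}
    for seq in product((0, 1), repeat=m):
        bins.setdefault(rng.randrange(num_bins), []).append(seq)
    index_of = {seq: b for b, members in bins.items() for seq in members}
    return bins, index_of


def hamming(a, b):
    """Number of positions in which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bin_members, side_info):
    """Pick the bin member closest to the side information (proxy for typicality)."""
    return min(bin_members, key=lambda seq: hamming(seq, side_info))


m = 10
bins, index_of = make_bins(m, num_bins=2 ** 6)  # rate R = 6/10 bit per sample

s = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1)
bin_index = index_of[s]          # the encoder sends only this index

t = s                            # noiseless side information: s is the unique closest member
s_hat = decode(bins[bin_index], t)
print("bin size:", len(bins[bin_index]), "decoded correctly:", s_hat == s)
```

With noisy side information the decoder can fail when another member of the bin happens to be closer to `t`; the larger the bin relative to the correlation between `s` and `t`, the more likely that becomes.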

11. Lossy Compression: Wyner-Ziv Coding

    In lossy transmission, we first quantize, then bin:

    - Fix $P_{W|S}$. Create a codebook of length-$m$ codewords $W^m$ of size $\sim 2^{m[I(S;W) + \epsilon]}$.
    - Randomly assign these codewords to bins such that there are $\sim 2^{m[I(T;W) - \epsilon]}$ elements in each bin.
    - There are sufficiently few elements in each bin to decode $W^m$ using typicality.
    - Since $T - S - W$, the correct $W^m$ satisfies typicality (conditional typicality lemma).
    - Once $W^m$ is decoded, use it with the side information $T^m$ through a single-letter function $\hat{S}_i = \phi(T_i, W_i)$.

    Minimum source coding rate within distortion $D$:

    $$
    R_{WZ}(D) = \min_{\substack{W, \phi:\ T - S - W, \\ E[d(S, \phi(T, W))] \le D}} \big[ I(S;W) - I(W;T) \big]
    = \min_{\substack{W, \phi:\ T - S - W, \\ E[d(S, \phi(T, W))] \le D}} I(S;W \mid T)
    $$
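
The equality between the two forms of the Wyner-Ziv rate follows from the chain rule and the Markov chain $T - S - W$. A short derivation, using only the symbols already defined on the slide:

```latex
\begin{aligned}
I(S;W \mid T) &= I(W; S, T) - I(W;T) && \text{(chain rule)} \\
              &= I(W;S) + I(W;T \mid S) - I(W;T) && \text{(chain rule)} \\
              &= I(S;W) - I(W;T) && \text{(} I(W;T \mid S) = 0 \text{ since } T - S - W \text{)}
\end{aligned}
```

This matches the codebook construction: $I(S;W)$ bits per sample index the quantization codeword, and binning removes the $I(W;T)$ bits that the side information can resolve.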

12. Generalized Coding Scheme

    - Generate $M = 2^{mR}$ bins with $H(S \mid T) \le R \le H(S)$.
    - Randomly allocate source sequences to bins. $\mathcal{B}(i)$: sequences in the $i$th bin.

    [Figure: source space and channel space.]

    Joint decoding: find the bin index $i$ such that

    1. the corresponding channel input $x^n(i)$ is jointly typical with the channel output $Y^n$, and
    2. there exists exactly one sequence in the bin that is jointly typical with the side information $T^m$.

    Probability of error (the probability that another bin satisfies the above conditions):

    $$2^{mR}\, 2^{-n(I(X;Y) - 3\epsilon)}\, |\mathcal{B}(i) \cap A_\epsilon^m(S)|\, 2^{-m(I(S;T) - 3\epsilon)} \le 2^{-n(I(X;Y) - 3\epsilon)}\, 2^{m(H(S \mid T) + 4\epsilon)},$$

    where the bound uses $2^{mR}\, |\mathcal{B}(i) \cap A_\epsilon^m(S)| \approx |A_\epsilon^m(S)| \le 2^{m(H(S) + \epsilon)}$. This goes to zero if $m H(S \mid T) \le n I(X;Y)$.

14. Generalized Coding Scheme

    Separate decoding: list the indices $i$ such that $x^n(i)$ and $Y^n$ are jointly typical. The source decoder then finds the bin containing a sequence jointly typical with $T^m$.

    - Separate source and channel coding is the special case $R = H(S \mid T)$: a single element in the list.
    - The scheme works without any binning at all: generate an i.i.d. channel codeword for each source outcome, i.e., $R = \log|\mathcal{S}|$.
    - The decoder outputs only typical sequences, so there is no point in having $\ge 2^{m(H(S) + \epsilon)}$ bins. $R = H(S)$ is equivalent to no binning.
    - This transfers the complexity of binning from the encoder to the decoder.

17. Virtual Binning

    The channel is virtually binning the channel codewords; equivalently, the source codewords (or outcomes).

    [Figure: source space and channel space.]

18. Virtual Binning

    When the channel is good, there will be fewer candidates in the list.

    [Figure: source space and channel space.]

19. Virtual Binning

    When the channel is weak, there will be more candidates.

    [Figure: source space and channel space.]

20. When Does It Help?

    Multiple receivers with different side information: strict separation is suboptimal.

    [Figure: Encoder, Channel, Decoder 1 and Decoder 2.]

    Source-channel capacity:

    $$\max_{p(x)} \min_{i = 1, 2} \frac{I(X; Y_i)}{H(S \mid T_i)}$$

    If $p(x)$ maximizes both $I(X; Y_1)$ and $I(X; Y_2)$, then we can use the channel at full capacity for each user.

    E. Tuncel, Slepian-Wolf coding over broadcast channels, IEEE Trans. Information Theory, Apr. 2006.
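
The max-min expression can be evaluated numerically for a small example. The sketch below assumes a setting that is not on the slide: a binary input $X \sim \text{Bernoulli}(a)$ broadcast over two binary symmetric channels, with conditional entropies $H(S \mid T_i)$ given as plain numbers. The crossover values and function names are hypothetical. Because the uniform input maximizes both mutual informations here, the grid search should settle on $a = 1/2$.

```python
from math import log2


def h(p: float) -> float:
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def mutual_information_bsc(a: float, p: float) -> float:
    """I(X;Y) for X ~ Bernoulli(a) sent over a BSC with crossover p."""
    prob_y_is_one = a * (1 - p) + (1 - a) * p
    return h(prob_y_is_one) - h(p)


def source_channel_capacity(crossovers, cond_entropies, grid=1001):
    """Grid search of max over a of min over receivers of I(X;Y_i) / H(S|T_i)."""
    best_value, best_a = 0.0, 0.0
    for k in range(grid):
        a = k / (grid - 1)
        value = min(mutual_information_bsc(a, p) / hs
                    for p, hs in zip(crossovers, cond_entropies))
        if value > best_value:
            best_value, best_a = value, a
    return best_value, best_a


crossovers = (0.05, 0.15)            # assumed channel qualities of receivers 1 and 2
cond_entropies = (h(0.2), h(0.1))    # assumed H(S|T_1), H(S|T_2)

rate, a_star = source_channel_capacity(crossovers, cond_entropies)
print(f"best input bias a = {a_star:.3f}, source samples per channel use = {rate:.4f}")
```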

22. Separate Source and Channel Coding with Backward Decoding

    Randomly partition all source outputs into

    - $M_1 = 2^{mH(S \mid T_1)}$ bins for Receiver 1,
    - $M_2 = 2^{mH(S \mid T_2)}$ bins for Receiver 2.

    Fix $p(x)$. Generate $M_1 M_2$ length-$n$ codewords with $\prod_{i=1}^n p(x_i)$: $x^n(w_1, w_2)$, $w_i \in [1 : M_i]$.

    | | $w_2 = 1$ | $\cdots$ | $w_2 = M_2$ |
    |---|---|---|---|
    | $w_1 = 1$ | $x^n(1, 1)$ | $\cdots$ | $x^n(1, M_2)$ |
    | $\vdots$ | $\vdots$ | | $\vdots$ |
    | $w_1 = M_1$ | $x^n(M_1, 1)$ | $\cdots$ | $x^n(M_1, M_2)$ |

    D. Gunduz, E. Erkip, A. Goldsmith and H. V. Poor, Reliable joint source-channel cooperative transmission over relay networks, IEEE Trans. Information Theory, Apr. 2013.

23. Backward decoding

    Send $Bm$ samples over $(B + 1)n$ channel uses with $n/m = r$.

    - $w_{1,i} \in [1 : M_1]$: bin index for receiver 1, $i = 1, \ldots, B$
    - $w_{2,i} \in [1 : M_2]$: bin index for receiver 2, $i = 1, \ldots, B$

    | Block 1 | Block 2 | $\cdots$ | Block $i$ | $\cdots$ | Block $B + 1$ |
    |---|---|---|---|---|---|
    | $x^n(w_{1,1}, 1)$ | $x^n(w_{1,2}, w_{2,1})$ | $\cdots$ | $x^n(w_{1,i}, w_{2,i-1})$ | $\cdots$ | $x^n(1, w_{2,B})$ |

    - Receiver 1 decodes reliably if $H(S \mid T_1) \le r \cdot I(X; Y_1)$.
    - Receiver 2 decodes reliably if $H(S \mid T_2) \le r \cdot I(X; Y_2)$.
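
The block schedule is easy to get wrong by one index, so here is a minimal sketch that generates it. It assumes the bin indices are already available as lists, and that index `1` is the fixed dummy index used in the first and last blocks, as on the slide. The function name and the example indices are hypothetical.

```python
def block_schedule(w1, w2):
    """Return the (receiver-1 index, receiver-2 index) pair sent in each of the B + 1 blocks.

    w1[i] and w2[i] are the bin indices of source block i + 1 for receivers 1 and 2.
    Index 1 is the dummy index known to both receivers.
    """
    if len(w1) != len(w2):
        raise ValueError("both index lists must cover the same B source blocks")
    num_blocks = len(w1)
    schedule = []
    for b in range(num_blocks + 1):
        first = w1[b] if b < num_blocks else 1   # no new source block in block B + 1
        second = w2[b - 1] if b > 0 else 1       # receiver-2 index is delayed by one block
        schedule.append((first, second))
    return schedule


w1 = [4, 2, 7]  # hypothetical receiver-1 bin indices for B = 3 source blocks
w2 = [5, 9, 3]  # hypothetical receiver-2 bin indices for the same blocks

schedule = block_schedule(w1, w2)
print(schedule)
```

Reading the schedule, one index of every transmitted pair is always recoverable from a neighbouring block: a receiver that has decoded a source block can recompute both of its bin indices, which is what allows one receiver to proceed forward from block 1 and the other backward from block $B + 1$.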

26. Lossy Broadcasting

    First quantize, then broadcast the quantized codeword.

    [Figure: Encoder, Channel, Decoder 1 and Decoder 2.]

    $(D_1, D_2)$ is achievable at rate $r$ if there exist $W$ satisfying $W - S - (T_1, T_2)$, an input distribution $p_X(x)$ and reconstruction functions $\phi_1, \phi_2$ such that

    $$I(S; W \mid T_i) \le r\, I(X; Y_i), \qquad E[d_i(S, \phi_i(W, T_i))] \le D_i$$

    for $i = 1, 2$.

    J. Nayak, E. Tuncel, D. Gunduz, Wyner-Ziv coding over broadcast channels: Digital schemes, IEEE Trans. Information Theory, Apr. 2010.

27. Time-varying Channel and Side Information

    [Figure: Encoder, fading channel and Decoder, with time-varying side information at the decoder.]

    I. E. Aguerri and D. Gunduz, Joint source-channel coding with time-varying channel and side-information, IEEE Trans. Information Theory, vol. 62, no. 2, pp. 736-753, Feb. 2016.

28. Two-way MIMO Relay Channel

    - Compress-and-forward at the relay
    - Lossy broadcasting with side information
    - Achieves the optimal diversity-multiplexing trade-off

    D. Gunduz, A. Goldsmith, and H. V. Poor, MIMO two-way relay channel: Diversity-multiplexing trade-off analysis, Asilomar Conference, Oct. 2008.

    D. Gunduz, E. Tuncel, and J. Nayak, Rate regions for the separated two-way relay channel, Allerton Conf. on Comm., Control, and Computing, Sep. 2008.

29. Multi-user Networks: No Separation

    Separation does not hold for multi-user channels.

    [Figure: Encoder, Channel, Decoder.]

    - Binary two-way multiplying channel: $X_i \in \{0, 1\}$, $i = 1, 2$, $Y = X_1 \cdot X_2$.
    - The capacity is still open: Shannon provided inner and outer bounds.
    - Consider correlated sources $S_1$ and $S_2$ with joint distribution:

      | | 0 | 1 |
      |---|---|---|
      | 0 | 0 | 0.275 |
      | 1 | 0.275 | 0.45 |

    - With separation, they need to exchange rates $H(S_1 \mid S_2) = H(S_2 \mid S_1) = 0.6942$ bpss.

    C. E. Shannon, Two-way communication channels, in Proc. 4th Berkeley Symp. Math. Statist. Probability, vol. 1, 1961, pp. 611-644.
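
The quoted conditional entropy can be reproduced directly. The sketch below assumes the joint distribution $p(0,0) = 0$, $p(0,1) = p(1,0) = 0.275$, $p(1,1) = 0.45$, which is the reading of the slide's table that is consistent with the stated value of 0.6942 bits. The function name is hypothetical.

```python
from math import log2

# Joint pmf p(s1, s2); this assignment of the table entries is an assumption
# chosen because it reproduces the value quoted on the slide.
joint = {(0, 0): 0.0, (0, 1): 0.275, (1, 0): 0.275, (1, 1): 0.45}


def conditional_entropy(pmf, given_index):
    """Entropy in bits of the other coordinate given the coordinate at given_index."""
    marginal = {}
    for pair, prob in pmf.items():
        key = pair[given_index]
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(prob * log2(prob / marginal[pair[given_index]])
                for pair, prob in pmf.items() if prob > 0)


h_s1_given_s2 = conditional_entropy(joint, given_index=1)
h_s2_given_s1 = conditional_entropy(joint, given_index=0)
print(f"H(S1|S2) = {h_s1_given_s2:.4f} bits, H(S2|S1) = {h_s2_given_s1:.4f} bits")
```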

30. Two-way Channel with Correlated Sources

    [Figure: rate region in the $(R_1, R_2)$ plane showing the Shannon outer bound, the Hekstra-Willems outer bound, the Shannon inner bound and the point $(H(S_1 \mid S_2), H(S_2 \mid S_1))$.]

    - The symmetric transmission rate with independent channel inputs is bounded by 0.64628 bpcu (Hekstra and Willems).
    - Uncoded transmission allows reliable decoding!

    A. P. Hekstra and F. M. W. Willems, Dependence balance bounds for single-output two-way channels, IEEE Trans. Inform. Theory, Jan. 1989.

31. Multiple Access Channel (MAC) with Correlated Sources

    [Figure: Encoder 1 and Encoder 2 transmitting to a single Decoder.]

    - Binary input adder channel: $X_i \in \{0, 1\}$, $Y = X_1 + X_2$.
    - $p(s_1, s_2)$: $p(0, 0) = p(1, 0) = p(0, 1) = 1/3$.
    - $H(S_1, S_2) = \log 3 = 1.58$ bits/sample.
    - Maximum sum rate with independent inputs: 1.5 bits/channel use.
    - Separation fails, while uncoded transmission is optimal.

    T. M. Cover, A. El Gamal and M. Salehi, Multiple access channels with arbitrarily correlated sources, IEEE Trans. Information Theory, Nov. 1980.
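
Both numbers on this slide can be checked with a few lines. The sketch below assumes the noiseless adder channel, for which the sum rate with independent inputs equals $H(Y)$, and searches a grid of Bernoulli input biases. The grid resolution and function names are hypothetical choices.

```python
from math import log2


def entropy(pmf):
    """Entropy in bits of a probability vector."""
    return -sum(p * log2(p) for p in pmf if p > 0)


def adder_output_entropy(a, b):
    """H(Y) for Y = X1 + X2 with independent X1 ~ Bernoulli(a), X2 ~ Bernoulli(b)."""
    p0 = (1 - a) * (1 - b)
    p1 = a * (1 - b) + (1 - a) * b
    p2 = a * b
    return entropy((p0, p1, p2))


grid = [k / 100 for k in range(101)]
max_sum_rate = max(adder_output_entropy(a, b) for a in grid for b in grid)
joint_source_entropy = entropy((1 / 3, 1 / 3, 1 / 3))

print(f"max sum rate with independent inputs = {max_sum_rate:.4f} bits/channel use")
print(f"H(S1, S2) = {joint_source_entropy:.4f} bits/sample")
```

Since the joint source entropy exceeds the best sum rate achievable with independent inputs, a separation-based scheme at one channel use per sample cannot deliver both sources.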

32. Relay Channel

    [Figure: Source, Relay and Destination connected through a channel.]

    - Introduced by van der Meulen.
    - Characterized by $p(y_1, y_2 \mid x_1, x_2)$.
    - The capacity of the relay channel is not known.
    - Multi-letter capacity given by van der Meulen:

      $$C = \lim_{k \to \infty} C_k = \sup_k C_k, \quad \text{where} \quad C_k \triangleq \max_{p(x_1^k),\ \{x_{2i}(y_1^{i-1})\}_{i=1}^k} \frac{1}{k} I(X_1^k; Y_2^k)$$

    - Various achievable schemes: amplify-and-forward, decode-and-forward, compress-and-forward.

    T. M. Cover and A. El Gamal, Capacity theorems for the relay channel, IEEE Trans. Inf. Theory, Sep. 1979.

35. Relay Channel with Destination Side Information

    [Figure: Source, Relay and Destination connected through a channel, with side information at the destination.]

    - Separation is still optimal.
    - Proof of separation in a network whose capacity is not known!

    D. Gunduz, E. Erkip, A. Goldsmith and H. Poor, Reliable joint source-channel cooperative transmission over relay networks, IEEE Trans. Inform. Theory, Apr. 2013.

36. Relay and Destination Side Information

    [Figure: Source, Relay and Destination connected through a channel, with side information at the relay and the destination.]

    Source-channel rate $r$ is achievable if

    $$r \cdot H(S \mid T_1) \le I(X_1; Y_1 \mid X_2), \qquad r \cdot H(S \mid T_2) \le I(X_1, X_2; Y_2)$$

    for some $p(x_1, x_2)$.

    - Decode-and-forward transmission.
    - Optimal for the physically degraded relay channel ($X_1 - (X_2, Y_1) - Y_2$) with degraded side information ($S - T_1 - T_2$).

38. Achievability

    Block Markov encoding, with two decoding options:

    - Regular encoding and joint source-channel sliding window decoding: more complicated decoder, less delay.
    - Regular encoding and separate source-channel backward decoding: simpler decoder, more delay.

39. Backward decoding

    Randomly partition all source outputs into

    - $M_1 = 2^{mH(S \mid T_1)}$ bins: relay bins,
    - $M_2 = 2^{mH(S \mid T_2)}$ bins: destination bins.

    Fix $p(x_1, x_2)$.

    - Generate $M_2$ codewords of length $n$ with $\prod_{i=1}^n p(x_{2,i})$. Enumerate them as $x_2^n(w_2)$.
    - For each $x_2^n(w_2)$, generate $M_1$ codewords of length $n$ with $\prod_{i=1}^n p(x_{1,i} \mid x_{2,i}(w_2))$. Enumerate them as $x_1^n(w_1, w_2)$.

40. Backward decoding

    Send $Bm$ samples over $(B + 1)n$ channel uses with $n/m = r$.

    - $w_{1,i} \in [1, M_1]$: relay bin index of source block $i = 1, \ldots, B$
    - $w_{2,i} \in [1, M_2]$: destination bin index of block $i = 1, \ldots, B$

    | Block 1 | Block 2 | $\cdots$ | Block $i$ | $\cdots$ | Block $B + 1$ |
    |---|---|---|---|---|---|
    | $x_1^n(w_{1,1}, 1)$ | $x_1^n(w_{1,2}, w_{2,1})$ | $\cdots$ | $x_1^n(w_{1,i}, w_{2,i-1})$ | $\cdots$ | $x_1^n(1, w_{2,B})$ |
    | $x_2^n(1)$ | $x_2^n(w'_{2,1})$ | $\cdots$ | $x_2^n(w'_{2,i-1})$ | $\cdots$ | $x_2^n(w'_{2,B})$ |

    - The relay decodes reliably if $H(S \mid T_1) \le r \cdot I(X_1; Y_1 \mid X_2)$.
    - The destination decodes reliably if $H(S \mid T_2) \le r \cdot I(X_1, X_2; Y_2)$.

43. Do we need coding?

    [Figure: Encoder, Channel, Decoder.]

    - Let $S_i \sim \mathcal{N}(0, 1)$ be i.i.d. Gaussian.
    - Memoryless Gaussian channel: $Y_i = X_i + Z_i$, $Z_i \sim \mathcal{N}(0, N)$, with power constraint $\frac{1}{m} E[X^m (X^m)^T] \le P$.
    - Capacity: $C = \frac{1}{2} \log\left(1 + \frac{P}{N}\right)$.
    - Distortion-rate function: $D(R) = 2^{-2R}$.
    - $D_{\min} = \left(1 + \frac{P}{N}\right)^{-1}$.
    - What about uncoded/analog transmission? $X_i = \sqrt{P}\, S_i$, with MMSE estimation at the receiver.
    - Uncoded symbol-by-symbol transmission is optimal!

    T. J. Goblick, Theoretical limitations on the transmission of data from analog sources, IEEE Trans. Inf. Theory, vol. 11, pp. 558-567, Oct. 1965.
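
The optimality claim can be checked by simulation. The sketch below is a Monte Carlo estimate under assumed values $P = N = 1$: it sends $X_i = \sqrt{P}\, S_i$, applies the linear MMSE estimate $\hat{S}_i = \frac{\sqrt{P}}{P + N} Y_i$, and compares the empirical distortion with $D_{\min} = (1 + P/N)^{-1}$. The function name, sample count and seed are hypothetical choices.

```python
import random
from math import sqrt


def uncoded_gaussian_mse(power, noise_var, num_samples=200_000, seed=0):
    """Empirical MSE of uncoded transmission with a linear MMSE receiver."""
    rng = random.Random(seed)
    gain = sqrt(power) / (power + noise_var)  # E[S Y] / E[Y^2]
    total = 0.0
    for _ in range(num_samples):
        s = rng.gauss(0.0, 1.0)                                   # unit-variance source sample
        y = sqrt(power) * s + rng.gauss(0.0, sqrt(noise_var))     # channel output
        total += (s - gain * y) ** 2
    return total / num_samples


P, N = 1.0, 1.0                        # assumed power and noise variance
empirical = uncoded_gaussian_mse(P, N)
d_min = 1.0 / (1.0 + P / N)            # D(R) evaluated at R = C

print(f"empirical MSE = {empirical:.4f}, D_min = {d_min:.4f}")
```

The match relies on one source sample per channel use and on the Gaussian source being matched to the Gaussian channel.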

48. To Code or Not To Code

    [Figure: Encoder, Channel, Decoder.]

    $S$ can be communicated over the channel $p(y \mid x)$ uncoded if

    - $X \sim p_S(x)$ attains the capacity $C = \max_{p(x)} I(X; Y)$, and
    - the test channel $p_{Y|X}(\hat{s} \mid s)$ attains the rate-distortion function $R(D) = \min_{p(\hat{s} \mid s):\ E[d(S, \hat{S})] \le D} I(S; \hat{S})$.

    Then, we have $C = R(D)$.

    M. Gastpar, B. Rimoldi, and M. Vetterli, To code, or not to code: Lossy source-channel communication revisited, IEEE Trans. Inf. Theory, May 2003.

49. Gaussian Sources over Gaussian MAC

    [Figure: Encoder 1 and Encoder 2 transmitting to a single Decoder.]

    - Correlated Gaussian sources:

      $$\begin{pmatrix} S_1 \\ S_2 \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)$$

    - Memoryless Gaussian MAC: $Y_j = X_{1,j} + X_{2,j} + Z_j$, $Z_j \sim \mathcal{N}(0, 1)$, with $\frac{1}{m} E[X_i^m (X_i^m)^T] \le P$.
    - Mean squared-error distortion measure:

      $$D_i = \frac{1}{m} E\left[ \sum_{j=1}^m |S_{i,j} - \hat{S}_{i,j}|^2 \right], \quad i = 1, 2.$$

    - Necessary condition: $R_{S_1, S_2}(D_1, D_2) \le \frac{1}{2} \log(1 + 2P(1 + \rho))$.
    - Corollary: uncoded transmission is optimal in the low SNR regime, i.e., if $P \le \frac{\rho}{1 - \rho^2}$.

    A. Lapidoth and S. Tinguely, Sending a bivariate Gaussian over a Gaussian MAC, IEEE Transactions on Information Theory, Jun. 2010.

51. Gaussian Sources over Weak Interference Channel

    - Correlated Gaussian sources with correlation coefficient $\rho$.
    - Memoryless Gaussian weak interference channel ($c \le 1$):

      $$Y_{1,j} = X_{1,j} + c X_{2,j} + Z_{1,j}, \qquad Y_{2,j} = c X_{1,j} + X_{2,j} + Z_{2,j},$$

      with $\frac{1}{m} E[X_i^m (X_i^m)^T] \le P$.
    - Corollary: uncoded transmission is optimal in the low SNR regime, i.e., if $cP \le \frac{\rho}{1 - \rho^2}$.

    I. E. Aguerri and D. Gunduz, Correlated Gaussian sources over Gaussian weak interference channels, IEEE Inform. Theory Workshop (ITW), Oct. 2015.

52. Remote Estimation

    [Figure: Encoder 1 and Encoder 2 transmitting to a single Decoder.]

    - Memoryless Gaussian MAC: $Y_j = X_{1,j} + X_{2,j} + Z_j$, $Z_j \sim \mathcal{N}(0, 1)$, with $\frac{1}{m} E[X_i^m (X_i^m)^T] \le P$.
    - Uncoded transmission is always optimal!

    M. Gastpar, Uncoded transmission is exactly optimal for a simple Gaussian sensor network, IEEE Trans. Inf. Theory, Nov. 2008.

53. Beyond Bandwidth Match

    - How do we map 2 Gaussian samples into 1 channel use? Or 1 sample to 2 channel uses?
    - Optimal mappings (encoder and decoder) are either both linear or both nonlinear. They can be optimized numerically.
    - What about 1 sample and unlimited bandwidth?

    E. Akyol, K. B. Viswanatha, K. Rose, T. A. Ramstad, On zero-delay source-channel coding, IEEE Transactions on Information Theory, Dec. 2012.

    E. Koken, E. Tuncel, and D. Gunduz, Energy-distortion exponents in lossy transmission of Gaussian sources over Gaussian channels, IEEE Trans. Information Theory, Feb. 2017.

54. What About in Practice?

    SoftCast: uncoded image/video transmission

    - Divide DCT coefficients into blocks.
    - Find the empirical variance ("energy") of each block.
    - Compression: remove blocks with low energy.
    - The remaining blocks are transmitted uncoded.
    - Power allocation according to block energies.

    S. Jakubczak and D. Katabi, Softcast: One-size-fits-all wireless video, in Proc. ACM SIGCOMM, New York, NY, Aug. 2010, pp. 449-450.
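
The SoftCast steps above can be sketched in a few lines. The slide only says that power is allocated according to block energies; the sketch below assumes the scaling $g_i \propto \lambda_i^{-1/4}$, where $\lambda_i$ is the energy of block $i$, normalized so that the total transmit power equals a budget. The function name, the energy values and the number of kept blocks are hypothetical.

```python
from math import sqrt


def softcast_allocation(block_energies, keep, total_power):
    """Keep the `keep` highest-energy blocks and return {block index: gain}.

    Assumed rule: gain_i = energy_i ** (-1/4) * sqrt(total_power / sum_j sqrt(energy_j)),
    which makes sum_i gain_i**2 * energy_i equal to total_power.
    """
    ranked = sorted(range(len(block_energies)),
                    key=lambda i: block_energies[i], reverse=True)
    kept = sorted(ranked[:keep])                      # low-energy blocks are dropped
    norm = sum(sqrt(block_energies[i]) for i in kept)
    return {i: block_energies[i] ** -0.25 * sqrt(total_power / norm) for i in kept}


energies = [900.0, 0.01, 25.0, 4.0, 0.2]   # hypothetical per-block empirical variances
gains = softcast_allocation(energies, keep=3, total_power=3.0)
used_power = sum(g ** 2 * energies[i] for i, g in gains.items())

for i, g in gains.items():
    print(f"block {i}: energy {energies[i]:7.2f}, gain {g:.4f}")
print(f"total transmit power = {used_power:.4f}")
```

Each kept coefficient would then be multiplied by its block gain and sent directly as a channel symbol, with the receiver needing the block energies as metadata to invert the scaling.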

55. SoftCast: Uncoded Video Transmission

    S. Jakubczak and D. Katabi, Softcast: One-size-fits-all wireless video, in Proc. ACM SIGCOMM, New York, NY, Aug. 2010, pp. 449-450.

56. SparseCast: Hybrid Digital-Analog Image Transmission

    - Block-based DCT transform
    - One vector for each frequency component
    - Thresholding for compression (remove small components)
    - Compressive sensing for transmission

    Tung and Gunduz, SparseCast: Hybrid Digital-Analog Wireless Image Transmission Exploiting Frequency Domain Sparsity, IEEE Comm. Letters, 2018.

57. Exploit Sparsity for Bandwidth Efficiency

    - $N \times N$ grayscale image
    - $B \times B$ block DCT transform
    - $B^2$ vectors (of length $N^2/B^2$ each)
    - Thresholding for compression
    - Compressive transmission: $Y_k = A_k x_k + Z_k$, with measurement matrix $A_k$
      - dimension chosen according to the sparsity of $x_k$
      - finite set of sparsity levels
      - variance according to power allocation
    - Approximate message passing (AMP) receiver

    Tung and Gunduz, SparseCast: Hybrid Digital-Analog Wireless Image Transmission Exploiting Frequency Domain Sparsity, IEEE Comm. Letters, 2018.

58. SparseCast: Hybrid Digital-Analog Image Transmission

    131 K channel symbols transmitted.

    [Figure: PSNR (dB) versus CSNR (dB) for SparseCast, SoftCast and BCS-SPL, compared with digital schemes using BPSK 1/2, QPSK 1/2, QPSK 3/4, 16 QAM 1/2, 16 QAM 3/4, 64 QAM 2/3 and 64 QAM 3/4.]

    Metadata size: SoftCast: 17 Kbits; SparseCast: 10-16 Kbits (depending on block threshold).

    Tung and Gunduz, SparseCast: Hybrid Digital-Analog Wireless Image Transmission Exploiting Frequency Domain Sparsity, IEEE Comm. Letters, 2018.

59. SparseCast: USRP Implementation

    75 K channel symbols transmitted.

    Tung and Gunduz, SparseCast: Hybrid Digital-Analog Wireless Image Transmission Exploiting Frequency Domain Sparsity, IEEE Comm. Letters, 2018.

60. Learning to Communicate

    - Forget about compression, channel coding, modulation, channel estimation, equalization, etc.
    - Deep neural networks for code design.

61. Autoencoder: Dimensionality Reduction with Neural Networks (NNs)

    - Example of unsupervised learning.
    - Two NNs trained together: the goal is to reconstruct the original input with the highest fidelity.

62. Deep JSCC Architecture

    E. Bourtsoulatze, D. Burth Kurka and D. Gunduz, Deep joint source-channel coding for wireless image transmission, submitted, IEEE TCCN, Sep. 2018.
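
The slide names the architecture without spelling it out. As a rough sketch of the part that sits between an encoder network and a decoder network in an autoencoder-style scheme, the code below assumes an average power normalization of the encoder output followed by a real-valued AWGN channel. Both are simplifying assumptions for illustration; the encoder output is replaced by random numbers, and all function names are hypothetical.

```python
import random
from math import sqrt


def normalize_power(z, power=1.0):
    """Scale the vector z so that its average power per channel use equals `power`."""
    energy = sum(v * v for v in z)
    scale = sqrt(len(z) * power / energy)
    return [scale * v for v in z]


def awgn(x, snr_db, rng, power=1.0):
    """Add white Gaussian noise whose variance is set by the target SNR in dB."""
    noise_std = sqrt(power / 10 ** (snr_db / 10))
    return [v + rng.gauss(0.0, noise_std) for v in x]


rng = random.Random(0)
encoder_output = [rng.uniform(-3.0, 3.0) for _ in range(256)]  # stand-in for NN encoder output

x = normalize_power(encoder_output)   # channel input that meets the power constraint
y = awgn(x, snr_db=10.0, rng=rng)     # what a decoder network would receive

avg_power = sum(v * v for v in x) / len(x)
print(f"average transmit power = {avg_power:.4f}, channel uses = {len(y)}")
```

Because both operations are differentiable, a layer of this kind can be placed inside the network so that encoder and decoder are trained end to end through the channel.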

63. Deep JSCC - PSNR vs. Channel Bandwidth

    [Figure: PSNR (dB) versus bandwidth ratio $k/n$ over an AWGN channel, comparing Deep JSCC, JPEG and JPEG2000 at SNR = 0, 10 and 20 dB.]

    E. Bourtsoulatze, D. Burth Kurka and D. Gunduz, Deep joint source-channel coding for wireless image transmission, submitted, IEEE TCCN, Sep. 2018.

64. Deep JSCC - PSNR vs. Test SNR

    [Figure: PSNR (dB) versus test SNR (dB) over an AWGN channel with $k/n = 1/12$, for Deep JSCC trained at SNR = 1, 4, 7, 13 and 19 dB.]

    - Provides graceful degradation with channel SNR!
    - More like analog communications than digital.

    E. Bourtsoulatze, D. Burth Kurka and D. Gunduz, Deep joint source-channel coding for wireless image transmission, submitted, IEEE TCCN, Sep. 2018.

65. Deep JSCC over a Rayleigh Fading Channel

    [Figure: PSNR (dB) versus bandwidth ratio $k/n$ over a slow Rayleigh fading channel, comparing Deep JSCC, JPEG and JPEG2000 at SNR = 0, 10 and 20 dB.]

    No pilot signal or explicit channel estimation is needed!

    E. Bourtsoulatze, D. Burth Kurka and D. Gunduz, Deep joint source-channel coding for wireless image transmission, submitted, IEEE TCCN, Sep. 2018.

66. Larger Images. [Figure: PSNR (dB) versus test SNR (dB), from -2 to 25 dB. Panel title: AWGN channel (k/n = 1/12), JPEG. Curves: Deep JSCC trained at SNR_train = -2, 1, 4, 7, 13 and 19 dB; 1/2 rate and 2/3 rate LDPC with 4QAM, 16QAM and 64QAM.] Trained on ImageNet, tested on the Kodak dataset (24 images of size 768 x 512). E. Bourtsoulatze, D. Burth Kurka and D. Gunduz, "Deep joint source-channel coding for wireless image transmission," submitted, IEEE TCCN, Sep. 2018.

67. Larger Images. [Figure: visual comparison of an original image with its Deep JSCC, JPEG and JPEG2000 reconstructions. PSNR labels, in order of appearance: N/A, 30.9 dB, 22.68 dB, 31.92 dB, 31.65 dB, 36.40 dB, 32.90 dB, 34.36 dB, 38.46 dB, 35.34 dB, 36.45 dB, 40.5 dB.]

68. Larger Images. [Figure: visual comparison of an original image with its Deep JSCC, JPEG and JPEG2000 reconstructions. PSNR labels, in order of appearance: 25.07 dB, 20.63 dB, 24.11 dB, 26.86 dB, 24.78 dB, 27.5 dB, 28.45 dB, 27.14 dB, 30.15 dB, 31.46 dB, 29.81 dB, 33.03 dB.]

69. Quality vs. Compression Rate

70. Deep Wireless Successive Refinement. [Diagram blocks: NN Encoder 1, Channel, NN Decoder 1; NN Encoder 2, Channel, NN Decoder 2.]

72. Two-layer Successive Refinement

73. Five-layer Successive Refinement

74. First Two Layer Comparison

75. Hypothesis Testing over a Noisy Channel. [Diagram: Observer → Channel → Detector.] Null hypothesis $H_0: U^k \sim \prod_{i=1}^{k} P_U$; alternate hypothesis $H_1: U^k \sim \prod_{i=1}^{k} Q_U$. Acceptance region for $H_0$: $A^{(n)} \subseteq \mathcal{Y}^n$. Definition: a type-2 error exponent $\kappa$ is $(\tau, \epsilon)$ achievable if there exist $k, n$ such that $n \le \tau \cdot k$, and $\liminf_{k,n \to \infty} -\frac{1}{k} \log\left( Q_{Y^n}(A^{(n)}) \right) \ge \kappa$, $\limsup_{k,n \to \infty} -\frac{1}{k} \log\left( 1 - P_{Y^n}(A^{(n)}) \right) \le \epsilon$. Define $\kappa(\tau, \epsilon) \triangleq \sup\{\kappa' : \kappa' \text{ is } (\tau, \epsilon) \text{ achievable}\}$.

76. Hypothesis Testing over a Noisy Channel. [Diagram: Observer → Channel → Detector.] Null hypothesis $H_0: U^k \sim \prod_{i=1}^{k} P_U$; alternate hypothesis $H_1: U^k \sim \prod_{i=1}^{k} Q_U$. With $E_c \triangleq \max_{(x,x') \in \mathcal{X} \times \mathcal{X}} D(P_{Y|X=x} \,\|\, P_{Y|X=x'})$, we have $\kappa(\tau, \epsilon) = \min\left( D(P_U \,\|\, Q_U),\ \tau E_c \right)$. Making decisions locally at the observer and communicating them to the detector is optimal.
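
As a worked instance of the expression for $\kappa(\tau, \epsilon)$, the sketch below evaluates $\min(D(P_U \| Q_U), \tau E_c)$ numerically. The source distributions, the binary symmetric channel with crossover probability 0.1, the use of natural logarithms, and the function names are all assumptions chosen for illustration; none of them come from the slides.

```python
import math


def kl_divergence(p, q):
    """D(p || q) in nats for two finite distributions given as lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue          # 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf   # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def channel_exponent(channel):
    """E_c: largest divergence between the output laws of two inputs.

    `channel[x]` is the output distribution P_{Y|X=x}.
    """
    return max(kl_divergence(row_x, row_xp) for row_x in channel for row_xp in channel)


def type2_exponent(p_u, q_u, channel, tau):
    """kappa = min(D(P_U || Q_U), tau * E_c), for tau > 0."""
    return min(kl_divergence(p_u, q_u), tau * channel_exponent(channel))


p_u = [0.5, 0.5]                  # assumed source law under H0
q_u = [0.9, 0.1]                  # assumed source law under H1
bsc = [[0.9, 0.1], [0.1, 0.9]]    # assumed channel: BSC with crossover 0.1

for tau in (0.2, 1.0):
    print("tau =", tau, "kappa =", type2_exponent(p_u, q_u, bsc, tau))
```

For the small bandwidth ratio the channel term $\tau E_c$ is the bottleneck, while for the larger one the exponent is limited by the source divergence $D(P_U \| Q_U)$.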

78. Distributed Hypothesis Testing. [Diagram: Observer → Channel → Detector.] $H_0: (U^k, E^k, Z^k) \sim \prod_{i=1}^{k} P_{UEZ}$; $H_1: (U^k, E^k, Z^k) \sim \prod_{i=1}^{k} Q_{UEZ}$. The problem is open for general $Q$. Let $\kappa(\tau) = \lim_{\epsilon \to 0} \kappa(\tau, \epsilon)$. Testing against conditional independence: $Q_{UEZ} = P_{UZ} P_{E|Z}$, for which $\kappa(\tau) = \sup\left\{ I(E; W | Z) : \exists\, W \text{ s.t. } I(U; W | Z) \le \tau C(P_{Y|X}),\ (Z, E) - U - W,\ |\mathcal{W}| \le |\mathcal{U}| + 1 \right\}$, $\tau \ge 0$. The optimal performance is achieved by a separation-based scheme.

79. Machine Learning (ML) at the Edge. A significant amount of data will be collected by IoT devices at the network edge. Standard approach: powerful centralized ML algorithms make sense of the data. This requires sending the data to the cloud, which is costly in terms of bandwidth and energy and may conflict with privacy requirements. Alternative: distributed/federated learning. [Diagram label: Master Server.]

81. Distributed Machine Learning. Data set: $(u_1, y_1), \ldots, (u_N, y_N)$. Objective: $F(\theta) = \frac{1}{N} \sum_{n=1}^{N} f(\theta, u_n)$. [Diagram label: Master Server.] Gradient-descent update: $\theta_{t+1} = \theta_t - \eta_t \frac{1}{N} \sum_{n=1}^{N} \nabla f(\theta_t, u_n)$.
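
The averaged-gradient update can be sketched as follows. The example assumes a scalar parameter, a squared-error loss $f(\theta, (u, y)) = (\theta u - y)^2$, a constant step size in place of $\eta_t$, and a fixed split of the data across three workers; these choices and the function names are illustrative only.

```python
def local_gradient_sum(theta, shard):
    """Worker side: sum of per-sample gradients of (theta * u - y)^2."""
    return sum(2.0 * (theta * u - y) * u for u, y in shard)


def distributed_gradient_descent(shards, steps=100, lr=0.1, theta=0.0):
    """Server side: average all workers' gradients, then take one step."""
    n_total = sum(len(shard) for shard in shards)
    for _ in range(steps):
        # (1/N) * sum over all N samples, gathered from the workers.
        grad = sum(local_gradient_sum(theta, shard) for shard in shards) / n_total
        theta -= lr * grad
    return theta


# Noiseless toy data with true slope 3, split across three workers.
data = [(0.25 * i, 3.0 * 0.25 * i) for i in range(1, 9)]
shards = [data[0:3], data[3:6], data[6:8]]
theta_hat = distributed_gradient_descent(shards)
print("estimated slope:", theta_hat)
```

Since the server only needs the sum of local gradients, the result matches centralized gradient descent on the pooled data, up to floating-point summation order.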

82. Wireless Edge Learning. Communication is the bottleneck in distributed learning. The ML literature focuses on reducing the number and size of the gradient messages transmitted from each worker, while the underlying channel is ignored. In edge learning, the wireless channel is limited in bandwidth and may suffer from interference. [Diagram labels: Worker 1, Worker 2, ..., Worker K; noise; Parameter server.]
