
Chapter 7 Channel Capacity

Peng-Hua Wang

Graduate Inst. of Comm. Engineering National Taipei University


Chapter Outline

  • Chap. 7 Channel Capacity

    7.1 Examples of Channel Capacity
    7.2 Symmetric Channels
    7.3 Properties of Channel Capacity
    7.4 Preview of the Channel Coding Theorem
    7.5 Definitions
    7.6 Jointly Typical Sequences
    7.7 Channel Coding Theorem
    7.8 Zero-Error Codes
    7.9 Fano’s Inequality and the Converse to the Coding Theorem


7.1 Examples of Channel Capacity


Channel Model

■ Operational channel capacity: the logarithm of the maximum number of
  distinguishable signals for n uses of a communication channel, per use.
  ◆ If we can send M signals without error in n transmissions, the rate is
    (log M)/n bits per transmission.
■ Information channel capacity: the maximum mutual information between the
  channel input and output.
■ The operational channel capacity is equal to the information channel
  capacity.
  ◆ This is a fundamental theorem and central success of information theory.


Channel Capacity

Definition 1 (Discrete channel) A system consisting of an input alphabet X,
an output alphabet Y, and a probability transition matrix p(y|x).

Definition 2 (Channel capacity) The “information” channel capacity of a
discrete memoryless channel is

    C = max_{p(x)} I(X; Y)

where the maximum is taken over all possible input distributions p(x).

■ Operational definition of channel capacity: the highest rate, in bits per
  channel use, at which information can be sent with arbitrarily low
  probability of error.
■ Shannon’s second theorem: the information channel capacity is equal to the
  operational channel capacity.
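Definition 2 is a concrete optimization problem, so it can be evaluated
numerically. Below is a minimal sketch (not from the slides; the function name
and parameters are our own) of the classical Blahut–Arimoto iteration for
C = max_{p(x)} I(X; Y):

```python
import numpy as np

def channel_capacity(P, tol=1e-9, max_iter=10_000):
    """Blahut-Arimoto iteration for C = max_{p(x)} I(X; Y), in bits.

    P[x, y] = p(y|x) is the channel transition matrix.
    """
    m = P.shape[0]
    p = np.full(m, 1.0 / m)                  # start from the uniform input
    for _ in range(max_iter):
        q = p @ P                            # output distribution p(y)
        with np.errstate(divide="ignore", invalid="ignore"):
            # D(p(y|x) || p(y)) for each input symbol x, in bits
            d = np.where(P > 0, P * np.log2(P / q), 0.0).sum(axis=1)
        p_new = p * np.exp2(d)
        p_new /= p_new.sum()
        done = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if done:
            break
    return float(p @ d), p                   # capacity and the maximizing p(x)

# Noiseless binary channel (Example 1 below): C = 1 bit at p(x) = (1/2, 1/2).
C, p_opt = channel_capacity(np.eye(2))
print(C, p_opt)
```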


Example 1

Noiseless binary channel:

    p(Y = 0) = p(X = 0) = π_0,  p(Y = 1) = p(X = 1) = π_1 = 1 − π_0

    I(X; Y) = H(Y) − H(Y|X) = H(Y) ≤ 1

with equality iff π_0 = π_1 = 1/2. Hence C = 1 bit per transmission.


Example 2

Noisy channel with non-overlapping outputs:

    p(X = 0) = π_0,  p(X = 1) = π_1 = 1 − π_0
    p(Y = 1) = π_0 p,  p(Y = 2) = π_0 (1 − p),  p = 1/2
    p(Y = 3) = π_1 q,  p(Y = 4) = π_1 (1 − q),  q = 1/3

    I(X; Y) = H(Y) − H(Y|X) = H(Y) − π_0 H(p) − π_1 H(q)
            = H(π_0)          [since H(Y) = H(π_0) + π_0 H(p) + π_1 H(q)]
            = H(X) ≤ 1

Hence C = 1 bit, achieved at π_0 = 1/2: the outputs never overlap, so the
input can always be recovered.


Noisy Typewriter

[Figure: the noisy typewriter. Each input letter is received either unchanged
or as the next letter of the alphabet, each with probability 1/2.]


    I(X; Y) = H(Y) − H(Y|X)
            = H(Y) − Σ_x p(x) H(Y|X = x)
            = H(Y) − Σ_x p(x) H(1/2)     [two equally likely outputs per input]
            = H(Y) − 1
            ≤ log 26 − 1 = log 13

    C = max I(X; Y) = log 13 bits, achieved by the uniform input distribution.


Binary Symmetric Channel (BSC)

[Figure: the BSC. Each input bit is received correctly with probability 1 − p
and flipped with crossover probability p.]


    I(X; Y) = H(Y) − H(Y|X)
            = H(Y) − Σ_x p(x) H(Y|X = x)
            = H(Y) − Σ_x p(x) H(p)
            = H(Y) − H(p)
            ≤ 1 − H(p)

    C = max I(X; Y) = 1 − H(p) bits, achieved by the uniform input distribution.
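The closed form is immediate to evaluate; a small sketch (our own, with an
illustrative crossover probability):

```python
import numpy as np

def Hb(p):
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

def bsc_capacity(p):
    return 1.0 - Hb(p)

print(bsc_capacity(0.11))   # ~0.5 bits per channel use
```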


Binary Erasure Channel

[Figure: the BEC. Each input bit is received correctly with probability 1 − α
and erased (output e) with probability α.]


    I(X; Y) = H(Y) − H(Y|X)
            = H(Y) − Σ_x p(x) H(Y|X = x)
            = H(Y) − Σ_x p(x) H(α)
            = H(Y) − H(α)

With p(X = 0) = π_0, grouping the outputs gives H(Y) = (1 − α) H(π_0) + H(α),
so

    I(X; Y) = (1 − α) H(π_0)  and  C = max I(X; Y) = 1 − α,

achieved at π_0 = 1/2.


7.3 Properties of Channel Capacity


Properties of Channel Capacity

■ C ≥ 0.
■ C ≤ log |X|.
■ C ≤ log |Y|.
■ I(X; Y) is a continuous function of p(x).
■ I(X; Y) is a concave function of p(x), so a local maximum of the capacity
  problem is a global maximum.


7.4 Preview of the Channel Coding Theorem


Preview of the Channel Coding Theorem

■ For each (typical) input n-sequence, there are approximately 2^{nH(Y|X)}
  possible Y sequences.
■ The total number of possible (typical) Y sequences is 2^{nH(Y)}.
■ This set has to be divided into sets of size 2^{nH(Y|X)} corresponding to
  the different input X sequences.
■ The total number of disjoint sets is therefore at most

      2^{nH(Y)} / 2^{nH(Y|X)} = 2^{n(H(Y) − H(Y|X))} = 2^{nI(X;Y)}.

■ Hence we can send at most 2^{nI(X;Y)} distinguishable sequences of length n.


Example

■ 6 typical sequences for X^n; 4 typical sequences for Y^n.
■ 12 jointly typical sequences for (X^n, Y^n).
■ For every typical X^n, we have

      2^{nH(X,Y)} / 2^{nH(X)} = 2^{nH(Y|X)} = 2

  typical Y^n, e.g., X^n = 001100 ⇒ Y^n = 010100 or 101011.


Example

■ Since there are 2^{nH(Y)} = 4 typical Y^n in total, to how many typical X^n
  can these typical Y^n be assigned without overlap?

      2^{nH(Y)} / 2^{nH(Y|X)} = 2^{n(H(Y) − H(Y|X))} = 2^{nI(X;Y)} = 2.

■ Can we assign more typical X^n? No. For some received Y^n we could not
  determine which X^n was sent, e.g., if we use 001100, 101101, and 101000 as
  codewords, we cannot determine which codeword was sent when we receive
  101011.


7.5 Definitions


Communication Channel

■ Message W ∈ {1, 2, ..., M}.
■ Encoder: input W, output X^n ≡ X^n(W) ∈ X^n.
  ◆ n is the length of the signal. We transmit the signal by using the
    channel n times, sending one symbol per use.
■ Channel: input X^n, output Y^n with distribution p(y^n|x^n).
■ Decoder: input Y^n, output Ŵ = g(Y^n), where g is a deterministic decoding
  rule.
■ If Ŵ ≠ W, an error occurs.


Definitions

Definition 3 (Discrete channel) A discrete channel, denoted by
(X, p(y|x), Y), consists of two finite sets X and Y and a collection of
probability mass functions p(y|x).

■ X: input; Y: output; for every input x ∈ X, Σ_y p(y|x) = 1.

Definition 4 (Discrete memoryless channel, DMC) The nth extension of the
discrete memoryless channel is the channel (X^n, p(y^n|x^n), Y^n), where

    p(y_k | x^k, y^{k−1}) = p(y_k | x_k),  k = 1, 2, ..., n.

■ Without feedback: p(x_k | x^{k−1}, y^{k−1}) = p(x_k | x^{k−1}).
■ For the nth extension of a DMC without feedback,

    p(y^n | x^n) = Π_{i=1}^{n} p(y_i | x_i).
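The memoryless property makes the channel trivial to simulate symbol by
symbol. A minimal sketch (our own; the function name and the BSC numbers are
illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def dmc_output(P, xn, rng):
    """Pass x^n through a DMC: p(y^n | x^n) = prod_i p(y_i | x_i).

    P[x, y] = p(y|x); each symbol sees the channel independently.
    """
    return np.array([rng.choice(P.shape[1], p=P[x]) for x in xn])

# A BSC with crossover probability 0.1.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
print(dmc_output(P, np.array([0, 1, 1, 0, 1]), rng))
```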


Definitions

Definition 5 ((M, n) code) An (M, n) code for the channel (X, p(y|x), Y)
consists of the following:

1. An index set {1, 2, ..., M}.
2. An encoding function X^n : {1, 2, ..., M} → X^n. The codewords are
   x^n(1), x^n(2), ..., x^n(M); the set of codewords is called the codebook.
3. A decoding function g : Y^n → {1, 2, ..., M}.

Definitions

Definition 6 (Conditional probability of error)

    λ_i = Pr(g(Y^n) ≠ i | X^n = x^n(i))
        = Σ_{y^n : g(y^n) ≠ i} p(y^n | x^n(i))
        = Σ_{y^n} p(y^n | x^n(i)) I(g(y^n) ≠ i)

■ I(·) is the indicator function.


Definitions

Definition 7 (Maximal probability of error)

    λ^{(n)} = max_{i ∈ {1, 2, ..., M}} λ_i

Definition 8 (Average probability of error)

    P_e^{(n)} = (1/M) Σ_{i=1}^{M} λ_i

■ The decoding error is

      Pr(g(Y^n) ≠ W) = Σ_{i=1}^{M} Pr(W = i) Pr(g(Y^n) ≠ i | W = i).

  If the index W is chosen uniformly from {1, 2, ..., M}, then
  P_e^{(n)} = Pr(g(Y^n) ≠ W).


Definitions

Definition 9 (Rate) The rate R of an (M, n) code is

    R = (log M) / n  bits per transmission.

Definition 10 (Achievable rate) A rate R is said to be achievable if there
exists a sequence of (⌈2^{nR}⌉, n) codes such that the maximal probability of
error λ^{(n)} tends to 0 as n → ∞.

Definition 11 (Channel capacity) The capacity of a channel is the supremum of
all achievable rates.
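To see why achievability at a positive rate is the interesting claim, consider
(our own illustration, not from the slides) the n-fold repetition code over a
BSC(0.1): its error probability vanishes as n grows, but its rate 1/n vanishes
too.

```python
import numpy as np

rng = np.random.default_rng(1)

def repetition_error(n, p=0.1, trials=100_000):
    """Empirical error of the (2, n) repetition code with majority decoding."""
    flips = rng.random((trials, n)) < p
    return float(np.mean(flips.sum(axis=1) > n / 2))  # majority of bits flipped

for n in (1, 3, 5, 11):
    print(n, "rate:", 1 / n, "error:", repetition_error(n))
```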


7.6 Jointly Typical Sequences


Definitions

Definition 12 (Jointly typical sequences) The set A_ε^{(n)} of jointly typical
sequences {(x^n, y^n)} with respect to the distribution p(x, y) is defined by

    A_ε^{(n)} = { (x^n, y^n) ∈ X^n × Y^n :
        | −(1/n) log p(x^n) − H(X) | < ε,
        | −(1/n) log p(y^n) − H(Y) | < ε,
        | −(1/n) log p(x^n, y^n) − H(X, Y) | < ε }

where

    p(x^n, y^n) = Π_{i=1}^{n} p(x_i, y_i).
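Definition 12 translates directly into a membership test. A minimal sketch
(our own names and interface):

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a pmf given as an array."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def jointly_typical(xn, yn, pxy, eps):
    """Test whether (x^n, y^n) lies in A_eps^(n) for the joint pmf pxy[x, y]."""
    n = len(xn)
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    HX, HY, HXY = entropy(px), entropy(py), entropy(pxy.ravel())
    lx = -np.log2(px[xn]).sum() / n        # -(1/n) log p(x^n)
    ly = -np.log2(py[yn]).sum() / n        # -(1/n) log p(y^n)
    lxy = -np.log2(pxy[xn, yn]).sum() / n  # -(1/n) log p(x^n, y^n)
    return abs(lx - HX) < eps and abs(ly - HY) < eps and abs(lxy - HXY) < eps
```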


Joint AEP

Theorem 1 (Joint AEP) Let (X^n, Y^n) be sequences of length n drawn i.i.d.
according to p(x^n, y^n) = Π_{i=1}^{n} p(x_i, y_i). Then:

1. Pr( (X^n, Y^n) ∈ A_ε^{(n)} ) → 1 as n → ∞.
2. |A_ε^{(n)}| ≤ 2^{n(H(X,Y) + ε)}.
3. If (X̃^n, Ỹ^n) ∼ p(x^n) p(y^n) [i.e., X̃^n and Ỹ^n are independent with
   the same marginals as (X^n, Y^n)], then

       Pr( (X̃^n, Ỹ^n) ∈ A_ε^{(n)} ) ≤ 2^{−n(I(X;Y) − 3ε)}.

   Also, for sufficiently large n,

       Pr( (X̃^n, Ỹ^n) ∈ A_ε^{(n)} ) ≥ (1 − ε) 2^{−n(I(X;Y) + 3ε)}.

Joint AEP

Theorem 2 (Joint AEP, part 1) Pr( (X^n, Y^n) ∈ A_ε^{(n)} ) → 1 as n → ∞.

Proof. Given ε > 0, define the events

    A = { X^n : | −(1/n) log p(X^n) − H(X) | ≥ ε },
    B = { Y^n : | −(1/n) log p(Y^n) − H(Y) | ≥ ε },
    C = { (X^n, Y^n) : | −(1/n) log p(X^n, Y^n) − H(X, Y) | ≥ ε }.

Joint AEP

Then, by the weak law of large numbers, there exist n_1, n_2, n_3 such that

    Pr(A) < ε/3 for all n > n_1,
    Pr(B) < ε/3 for all n > n_2,
    Pr(C) < ε/3 for all n > n_3.

Thus,

    Pr( (X^n, Y^n) ∈ A_ε^{(n)} ) = Pr(A^c ∩ B^c ∩ C^c)
        = 1 − Pr(A ∪ B ∪ C)
        ≥ 1 − (Pr(A) + Pr(B) + Pr(C))
        ≥ 1 − ε

for all n > max{n_1, n_2, n_3}.


Joint AEP

Theorem 3 (Joint AEP, part 2) |A_ε^{(n)}| ≤ 2^{n(H(X,Y) + ε)}.

Proof.

    1 = Σ_{(x^n, y^n)} p(x^n, y^n)
      ≥ Σ_{(x^n, y^n) ∈ A_ε^{(n)}} p(x^n, y^n)
      ≥ |A_ε^{(n)}| 2^{−n(H(X,Y) + ε)}.

Thus |A_ε^{(n)}| ≤ 2^{n(H(X,Y) + ε)}.

Joint AEP

Theorem 4 (Joint AEP, part 3) If (X̃^n, Ỹ^n) ∼ p(x^n) p(y^n) [i.e., X̃^n and
Ỹ^n are independent with the same marginals], then

    Pr( (X̃^n, Ỹ^n) ∈ A_ε^{(n)} ) ≤ 2^{−n(I(X;Y) − 3ε)}.

Also, for sufficiently large n,

    Pr( (X̃^n, Ỹ^n) ∈ A_ε^{(n)} ) ≥ (1 − ε) 2^{−n(I(X;Y) + 3ε)}.

Joint AEP

Proof.

    Pr( (X̃^n, Ỹ^n) ∈ A_ε^{(n)} ) = Σ_{(x^n, y^n) ∈ A_ε^{(n)}} p(x^n) p(y^n)
        ≤ 2^{n(H(X,Y) + ε)} · 2^{−n(H(X) − ε)} · 2^{−n(H(Y) − ε)}
        = 2^{−n(I(X;Y) − 3ε)}.

For sufficiently large n, Pr(A_ε^{(n)}) ≥ 1 − ε, and therefore

    1 − ε ≤ Σ_{(x^n, y^n) ∈ A_ε^{(n)}} p(x^n, y^n)
          ≤ |A_ε^{(n)}| 2^{−n(H(X,Y) − ε)},

so

    |A_ε^{(n)}| ≥ (1 − ε) 2^{n(H(X,Y) − ε)}.

Joint AEP

    Pr( (X̃^n, Ỹ^n) ∈ A_ε^{(n)} ) = Σ_{(x^n, y^n) ∈ A_ε^{(n)}} p(x^n) p(y^n)
        ≥ (1 − ε) 2^{n(H(X,Y) − ε)} · 2^{−n(H(X) + ε)} · 2^{−n(H(Y) + ε)}
        = (1 − ε) 2^{−n(I(X;Y) + 3ε)}


Joint AEP: Conclusion

■ There are about 2^{nH(X)} typical X sequences and about 2^{nH(Y)} typical
  Y sequences.
■ There are about 2^{nH(X,Y)} jointly typical sequences.
■ If we choose a typical X^n and a typical Y^n independently at random, the
  probability that the pair is jointly typical is about 2^{−nI(X;Y)}.
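This contrast is easy to see empirically. The sketch below (our own; it reuses
the `jointly_typical` test defined after Definition 12, with a hypothetical
BSC(0.1) joint pmf) draws pairs jointly and independently: jointly drawn pairs
land in A_ε^{(n)} with high probability, independently drawn pairs essentially
never do.

```python
import numpy as np

rng = np.random.default_rng(2)

# Joint pmf p(x, y) of a BSC(0.1) with uniform input.
pxy = np.array([[0.45, 0.05],
                [0.05, 0.45]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)
n, eps, trials = 1000, 0.05, 2000

joint_hits = indep_hits = 0
for _ in range(trials):
    idx = rng.choice(4, size=n, p=pxy.ravel())   # (X^n, Y^n) ~ p(x, y)
    joint_hits += jointly_typical(idx // 2, idx % 2, pxy, eps)
    xi = rng.choice(2, size=n, p=px)             # independent marginals
    yi = rng.choice(2, size=n, p=py)
    indep_hits += jointly_typical(xi, yi, pxy, eps)

print(joint_hits / trials)   # high; -> 1 as n grows
print(indep_hits / trials)   # ~0: about 2^{-n I(X;Y)} with I ~ 0.53 bits
```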


7.7 Channel Coding Theorem


Channel Coding Theorem

Theorem 5 (Channel coding theorem) For every rate R < C, there exists a
sequence of (2^{nR}, n) codes with maximum probability of error λ^{(n)} → 0.
Conversely, any sequence of (2^{nR}, n) codes with λ^{(n)} → 0 must have
R ≤ C.

■ We have to prove two parts:
  ◆ R < C → achievable.
  ◆ Achievable → R ≤ C.
■ Main ideas:
  ◆ Random encoding (random codes).
  ◆ Jointly typical decoding.


Random Code

■ Generate a (2^{nR}, n) code at random according to a fixed distribution
  p(x). That is, the 2^{nR} codewords are drawn i.i.d. with

      p(x^n) = Π_{i=1}^{n} p(x_i).

■ A particular code C is the matrix whose 2^{nR} rows are the codewords:

      C = [ x_1(1)       x_2(1)       · · ·  x_n(1)
            x_1(2)       x_2(2)       · · ·  x_n(2)
              ...                             ...
            x_1(2^{nR})  x_2(2^{nR})  · · ·  x_n(2^{nR}) ]

■ The code C is revealed to both sender and receiver. Both are also assumed
  to know the channel transition matrix p(y|x).


Random Code

■ There are (|X|^n)^{2^{nR}} different codes.
■ The probability of a particular code C is

      Pr(C) = Π_{w=1}^{2^{nR}} Π_{i=1}^{n} p(x_i(w)).
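Generating such a random code is a one-liner; a sketch with hypothetical
parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

n, R = 10, 0.5
M = int(np.ceil(2 ** (n * R)))        # 2^{nR} codewords
p = np.array([0.5, 0.5])              # a fixed input distribution p(x)
codebook = rng.choice(2, size=(M, n), p=p)   # row w is x^n(w), symbols i.i.d. ~ p
print(codebook.shape)                 # (32, 10)
```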


Transmission and Channel

■ A message W is chosen according to a uniform distribution:

      Pr[W = w] = 2^{−nR},  w = 1, 2, ..., 2^{nR}.

■ The wth codeword X^n(w), corresponding to the wth row of C, is sent over
  the channel.
■ The receiver receives a sequence Y^n according to the distribution

      P(y^n | x^n(w)) = Π_{i=1}^{n} p(y_i | x_i(w)).

  That is, we use the DMC n times.


Jointly Typical Decoding

■ The receiver declares that message Ŵ was sent if
  ◆ (X^n(Ŵ), Y^n) is jointly typical, and
  ◆ there is no other jointly typical pair for Y^n; that is, there is no
    other W′ ≠ Ŵ such that (X^n(W′), Y^n) is jointly typical.
■ If no such Ŵ exists, or if there is more than one, an error is declared
  (Ŵ = 0).
■ A decoding error occurs if Ŵ ≠ W. Let E be the event {Ŵ ≠ W}.
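The whole construction fits in a short simulation. The sketch below (our own;
all parameters are illustrative) draws a random code, sends codeword 1 through
a BSC(0.05), and decodes by joint typicality. The rate 0.02 is kept far below
C = 1 − H(0.05) ≈ 0.71 so the codebook stays small enough to enumerate at a
blocklength where typicality has concentrated; with a uniform input the
marginal typicality conditions hold exactly, so only the joint condition is
tested.

```python
import numpy as np

rng = np.random.default_rng(4)

n, R, p, eps, trials = 500, 0.02, 0.05, 0.1, 200
M = int(2 ** (n * R))                        # 2^{nR} = 1024 codewords
HXY = 1.0 - p * np.log2(p) - (1 - p) * np.log2(1 - p)   # H(X,Y) = 1 + H(p)

errors = 0
for _ in range(trials):
    book = rng.integers(0, 2, size=(M, n))   # random codebook, p(x) uniform
    noise = (rng.random(n) < p).astype(book.dtype)
    yn = book[0] ^ noise                     # send codeword 1 through the BSC
    # -(1/n) log p(x^n(w), y^n) for every codeword w at once: per symbol,
    # p(x, y) = 0.5(1-p) on agreement and 0.5p on disagreement.
    agree = (book == yn).sum(axis=1)
    lxy = (-agree * np.log2(0.5 * (1 - p))
           - (n - agree) * np.log2(0.5 * p)) / n
    typical = np.abs(lxy - HXY) < eps
    # error unless codeword 1 is jointly typical with y^n, and uniquely so
    errors += not (typical[0] and typical.sum() == 1)

print(errors / trials)                       # small; -> 0 as n grows
```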


Proof of R < C → Achievable

■ Consider the probability of error averaged over all codewords in the
  codebook and over all codebooks:

      Pr(E) = Σ_C Pr(C) P_e^{(n)}(C)
            = Σ_C Pr(C) (1/2^{nR}) Σ_{w=1}^{2^{nR}} λ_w(C)
            = (1/2^{nR}) Σ_{w=1}^{2^{nR}} Σ_C Pr(C) λ_w(C)

  ◆ P_e^{(n)}(C) is defined with respect to jointly typical decoding.
  ◆ By the symmetry of the code construction, Σ_C Pr(C) λ_w(C) does not
    depend on w.


Proof of R < C → Achievable

■ Therefore,

      Pr(E) = (1/2^{nR}) Σ_{w=1}^{2^{nR}} Σ_C Pr(C) λ_w(C)
            = Σ_C Pr(C) λ_w(C)    (for any w)
            = Σ_C Pr(C) λ_1(C) = Pr(E | W = 1).

■ Define E_i = {(X^n(i), Y^n) is a jointly typical pair} for
  i = 1, 2, ..., 2^{nR}, where Y^n is the channel output when the first
  codeword X^n(1) was sent. A decoding error occurs on the event
  ◆ E_1^c: the transmitted codeword and the received sequence are not
    jointly typical, or
  ◆ E_2 ∪ E_3 ∪ · · · ∪ E_{2^{nR}}: a wrong codeword is jointly typical with
    the received sequence.




Proof of R < C → Achievable

■ The average error satisfies

      Pr(E | W = 1) = P(E_1^c ∪ E_2 ∪ E_3 ∪ · · · ∪ E_{2^{nR}} | W = 1)
                    ≤ P(E_1^c | W = 1) + Σ_{i=2}^{2^{nR}} P(E_i | W = 1).

■ By the joint AEP,

      P(E_1^c | W = 1) ≤ ε  for n sufficiently large,

      P(E_i | W = 1) ≤ 2^{−n(I(X;Y) − 3ε)}  for i ≠ 1,

  since X^n(i) is independent of Y^n (joint AEP, part 3).


Proof of R < C → Achievable

■ We have

      Pr(E | W = 1) ≤ ε + (2^{nR} − 1) 2^{−n(I(X;Y) − 3ε)}
                    ≤ ε + 2^{nR} 2^{−n(I(X;Y) − 3ε)}
                    = ε + 2^{−n(I(X;Y) − R − 3ε)}.

■ If I(X;Y) − R − 3ε > 0, then 2^{−n(I(X;Y) − R − 3ε)} ≤ ε for n sufficiently
  large, and

      Pr(E | W = 1) ≤ 2ε.

■ So far we have proved: for any ε > 0, if R < I(X;Y) − 3ε and n is
  sufficiently large, the average decoding error satisfies
  Pr(E) = Pr(E | W = 1) ≤ 2ε.
■ What do we need? If R < C, the maximum error probability λ^{(n)} → 0.
  (We are almost there. Almost...)


Proof of R < C → Achievable, final part

■ Choose p(x) such that I(X;Y) is maximized, i.e., such that I(X;Y) achieves
  the channel capacity C. The condition R < I(X;Y) − 3ε can then be replaced
  by the achievability condition R < C − 3ε.
■ Since the average probability of error over codebooks is less than 2ε,
  there exists at least one codebook C* such that Pr(E | C*) ≤ 2ε.
  ◆ C* can be found by an exhaustive search over all codes.
■ Since W is chosen uniformly, we have

      Pr(E | C*) = (1/2^{nR}) Σ_{i=1}^{2^{nR}} λ_i(C*) ≤ 2ε,

  which implies that the maximal error probability of the better half of the
  codewords is less than 4ε.
  ◆ (Analogy: if 10 students have an average score of 40, then at least half
    of them must score below 80.)

Proof of R < C → Achievable, final part

■ We throw away the worst half of the codewords in the best codebook C*. The
  new code has a maximal probability of error less than 4ε. However, we now
  have a (2^{nR}/2, n) = (2^{n(R − 1/n)}, n) code: the rate of the new code
  is R − 1/n, which is negligibly different from R for large n.
■ Summary: if R − 1/n < C − 3ε for arbitrary ε > 0, then λ^{(n)} ≤ 4ε for n
  sufficiently large.


7.8 Zero-Error Codes


No error → R ≤ C

■ Assume that we have a (2^{nR}, n) code with zero probability of error.
  ◆ Then W is determined by Y^n: Pr(g(Y^n) = W) = 1, so H(W | Y^n) = 0.
■ To obtain a strong bound, assume that W is uniformly distributed over
  {1, 2, ..., 2^{nR}}. Then

      nR = H(W) = H(W | Y^n) + I(W; Y^n)
         = I(W; Y^n)
         ≤ I(X^n; Y^n)              (data processing ineq.; W → X^n(W) → Y^n)
         ≤ Σ_{i=1}^{n} I(X_i; Y_i)  (see Lemma 1 below)
         ≤ nC                       (definition of channel capacity)

■ That is, no error → R ≤ C.


No error → R ≤ C

Lemma 1 Let Y^n be the result of passing X^n through a discrete memoryless
channel of capacity C. Then for all p(x^n),

    I(X^n; Y^n) ≤ nC.

Proof.

    I(X^n; Y^n) = H(Y^n) − H(Y^n | X^n)
                = H(Y^n) − Σ_{i=1}^{n} H(Y_i | Y_1, ..., Y_{i−1}, X^n)
                = H(Y^n) − Σ_{i=1}^{n} H(Y_i | X_i)     (definition of DMC)
                ≤ Σ_{i=1}^{n} H(Y_i) − Σ_{i=1}^{n} H(Y_i | X_i)
                = Σ_{i=1}^{n} I(X_i; Y_i) ≤ nC


7.9 Fano’s Inequality and the Converse to the Coding Theorem


Fano’s Inequality

Theorem 6 (Fano’s inequality) Let X and W have the same sample space
X = {1, 2, ..., M} and joint p.m.f. p(x, w). Let

    P_e = Pr[X ≠ W] = Σ_{x ∈ X} Σ_{w ≠ x} p(x, w).

Then

    P_e log(M − 1) + H(P_e) ≥ H(X | W)

where

    H(P_e) = −P_e log P_e − (1 − P_e) log(1 − P_e).
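A quick numerical sanity check of the bound (our own sketch; the random joint
pmf is purely illustrative):

```python
import numpy as np

def H(v):
    """Entropy in bits of a pmf given as an array."""
    v = v[v > 0]
    return float(-(v * np.log2(v)).sum())

rng = np.random.default_rng(5)
M = 4
pxw = rng.random((M, M))
pxw /= pxw.sum()                          # a random joint pmf p(x, w)

Pe = 1.0 - np.trace(pxw)                  # Pr[X != W]
H_X_given_W = H(pxw.ravel()) - H(pxw.sum(axis=0))   # H(X|W) = H(X,W) - H(W)
lhs = Pe * np.log2(M - 1) + H(np.array([Pe, 1 - Pe]))
print(lhs >= H_X_given_W)                 # True: Fano's bound holds
```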


Fano’s Inequality

Proof. We will prove that H(X|W) − H(P_e) − P_e log(M − 1) ≤ 0. Write each
term as a double sum:

    H(X|W) = Σ_x Σ_w p(x, w) log [1 / p(x|w)]
           = Σ_x Σ_{w=x} p(x, w) log [1 / p(x|w)]
             + Σ_x Σ_{w≠x} p(x, w) log [1 / p(x|w)]

    −P_e log(M − 1) = Σ_x Σ_{w≠x} p(x, w) log [1 / (M − 1)]

    −H(P_e) = P_e log P_e + (1 − P_e) log(1 − P_e)
            = Σ_x Σ_{w≠x} p(x, w) log P_e + Σ_x Σ_{w=x} p(x, w) log(1 − P_e)

Add the above three terms together.


Fano’s Inequality

Proof (cont.)

    H(X|W) − P_e log(M − 1) − H(P_e)
      = Σ_x Σ_{w≠x} p(x, w) log [ P_e / ((M − 1) p(x|w)) ]
        + Σ_x Σ_{w=x} p(x, w) log [ (1 − P_e) / p(x|w) ]
      ≤ log [ Σ_x Σ_{w≠x} p(x, w) P_e / ((M − 1) p(x|w))
              + Σ_x Σ_{w=x} p(x, w) (1 − P_e) / p(x|w) ]
      = log [ (P_e / (M − 1)) Σ_x Σ_{w≠x} p(w) + (1 − P_e) Σ_x Σ_{w=x} p(w) ]
      = log [P_e + (1 − P_e)] = 0


Fano’s Inequality

Corollary 1
1. P_e log M + H(P_e) ≥ H(X|W), where P_e = Pr[X ≠ W].
2. 1 + P_e log M ≥ H(X|W), where P_e = Pr[X ≠ W].
3. If X → Y → X̂ and P_e = Pr[X ≠ X̂], then
   H(P_e) + P_e log M ≥ H(X | X̂) ≥ H(X | Y).

Remarks.
1. H(X|W) ≤ P_e log(M − 1) + H(P_e) ≤ P_e log M + H(P_e).
2. H(X|W) ≤ P_e log(M − 1) + H(P_e) ≤ P_e log M + 1, since H(P_e) ≤ 1.
3. The second inequality in 3 follows from the data processing inequality.

Data Processing Inequality

Lemma 2 (Data processing inequality) If X → Y → Z, then

    I(X; Z) ≤ I(X; Y).

Proof.

    I(X; Z) − I(X; Y) = H(X) − H(X|Z) − [H(X) − H(X|Y)]
      = H(X|Y) − H(X|Z)
      = Σ_x Σ_y p(x, y) log [1 / p(x|y)] − Σ_x Σ_z p(x, z) log [1 / p(x|z)]
      = Σ_x Σ_y Σ_z p(x, y, z) log [1 / p(x|y)]
        − Σ_x Σ_y Σ_z p(x, y, z) log [1 / p(x|z)]
      = Σ_x Σ_y Σ_z p(x, y, z) log [ p(x|z) / p(x|y) ]
      ≤ log Σ_x Σ_y Σ_z p(x, y, z) p(x|z) / p(x|y)   (concavity of the logarithm)

Data Processing Inequality

Proof (cont.) Since X → Y → Z, we have

    p(x, y, z) = p(x, y) p(z|x, y) = p(x, y) p(z|y) = p(x, y) p(y, z) / p(y)

and

    p(x, y, z) p(x|z) / p(x|y)
      = [p(x, y) p(y, z) / p(y)] × [p(x, z) p(y) / (p(z) p(x, y))]
      = p(x, z) p(y, z) / p(z).

Therefore,

    Σ_x Σ_y Σ_z p(x, y, z) p(x|z) / p(x|y)
      = Σ_x Σ_z [p(x, z) / p(z)] Σ_y p(y, z)
      = Σ_x Σ_z p(x, z) = 1,

so I(X; Z) − I(X; Y) ≤ log 1 = 0.


Data Processing Inequality (Summary)

Lemma 3
1. If X → Y → Z, then

       I(X; Z) ≤ min{ I(X; Y), I(Y; Z) }  and  H(X|Y) ≤ H(X|Z).

2. If X → Y → Z → W, then

       I(X; Z) + I(Y; W) ≤ I(X; W) + I(Y; Z)  and  I(X; W) ≤ I(Y; Z).


Achievable → R ≤ C

Theorem 7 (Converse to the channel coding theorem) Any sequence of
(2^{nR}, n) codes with λ^{(n)} → 0 must have R ≤ C.

Proof.

■ For a fixed encoding rule X^n(W) and a fixed decoding rule Ŵ = g(Y^n), we
  have W → X^n(W) → Y^n → Ŵ.
■ For each n, let W be drawn according to a uniform distribution over
  {1, 2, ..., 2^{nR}}.
■ Since W has a uniform distribution,

      Pr[Ŵ ≠ W] = P_e^{(n)} = (1/2^{nR}) Σ_{i=1}^{2^{nR}} λ_i.


Achievable → R ≤ C

Proof (cont.)

    nR = H(W)                               (W is uniformly distributed)
       = H(W | Ŵ) + I(W; Ŵ)
       ≤ 1 + P_e^{(n)} nR + I(W; Ŵ)        (Fano’s inequality)
       ≤ 1 + P_e^{(n)} nR + I(X^n; Y^n)     (data processing inequality)
       ≤ 1 + P_e^{(n)} nR + nC              (Lemma 1)

    ⇒ P_e^{(n)} ≥ 1 − C/R − 1/(nR)

That is, if R > C, the probability of error is bounded below by a positive
constant for sufficiently large n; it cannot be made arbitrarily small.
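The final bound makes the converse quantitative. A tiny illustration (our own
numbers; e.g., a BSC(0.11) has C ≈ 0.5 bits):

```python
# Lower bound on the error probability when signaling above capacity:
# P_e^(n) >= 1 - C/R - 1/(nR).
C, R = 0.5, 0.75
for n in (10, 100, 1000):
    print(n, 1 - C / R - 1 / (n * R))   # -> 1/3 as n grows
```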