

SLIDE 1

Channel Coding over Continuous Memoryless Channels
Lecture 6: Channel Coding over Continuous Channels

I-Hsiang Wang
Department of Electrical Engineering, National Taiwan University
ihwang@ntu.edu.tw

November 2, 2015

SLIDE 2

We have investigated the measures of information for continuous r.v.'s:
- The amount of uncertainty (entropy) is mostly infinite.
- Mutual information and KL divergence are well defined.
- Differential entropy is a useful entity to compute and manage measures of information for continuous r.v.'s.

Question: How about coding theorems? Is there a general way or framework to extend coding theorems from discrete (memoryless) sources/channels to continuous (memoryless) sources/channels?

SLIDE 3

Recall the capacity of a discrete memoryless channel (DMC) with input cost:

[Diagram: w → Channel Encoder → x^N → DMC p_{Y|X} → y^N → Channel Decoder → ŵ]

    C(B) = max_{X: E[b(X)] ≤ B} I(X; Y).

For a continuous memoryless channel, does the analogous formula hold?

[Diagram: w → Channel Encoder → x^N → CMC f_{Y|X} → y^N → Channel Decoder → ŵ]

    C(B) = sup_{X: E[b(X)] ≤ B} I(X; Y)?

SLIDE 4

Coding Theorems: from Discrete to Continuous (1)

Two main techniques for extending the achievability part of coding theorems from the discrete world to the continuous world:

1. Discretization: discretize the source and channel input/output to create a discrete system, and then make the discretization finer and finer to prove achievability.

2. New typicality: extend weak typicality to continuous r.v.'s and repeat the arguments in a similar way. In particular, replace the entropy terms in the definitions of weakly typical sequences by differential entropy terms.

Using discretization to derive the achievability of Gaussian channel capacity follows Gallager [2] and El Gamal & Kim [6]. Cover & Thomas [1] and Yeung [5] use weak typicality for continuous r.v.'s. Moser [4] uses a threshold decoder, similar in spirit to weak typicality.

SLIDE 5

Coding Theorems: from Discrete to Continuous (2)

In this lecture, we use discretization for the achievability proof.

Pros: no need for new tools (e.g., typicality) for continuous r.v.'s; extends naturally to multi-terminal settings, so one can focus on discrete memoryless networks.

Cons: technical, and not much insight into how to achieve capacity. Hence, we also use a geometric argument to provide insight into how to achieve capacity.

Disclaimer: we will not be 100% rigorous in deriving the results in this lecture. Instead, you can find rigorous treatments in the references.

SLIDE 6

Outline

1. First, we formulate the channel coding problem over continuous memoryless channels (CMC), state the coding theorem, and sketch the converse and achievability proofs.

2. Second, we introduce the additive Gaussian noise (AGN) channel, derive the Gaussian channel capacity, and provide insights based on geometric arguments.

3. We then explore extensions, including parallel Gaussian channels, correlated Gaussian channels, and continuous-time bandlimited Gaussian channels.

SLIDE 7

1. Channel Coding over Continuous Memoryless Channels
   - Continuous Memoryless Channel
   - Gaussian Channel Capacity

SLIDE 8

1. Channel Coding over Continuous Memoryless Channels
   - Continuous Memoryless Channel
   - Gaussian Channel Capacity

SLIDE 9

Continuous Memoryless Channel

[Diagram: w → Channel Encoder → x^N → f_{Y|X} → y^N → Channel Decoder → ŵ]

1. Input/output alphabet X = Y = R.
2. Continuous memoryless channel (CMC):
   - Channel law: governed by the conditional density (p.d.f.) f_{Y|X}.
   - Memoryless: Y_k − X_k − (X^{k−1}, Y^{k−1}) forms a Markov chain.
3. Average input cost constraint B: (1/N) ∑_{k=1}^{N} b(x_k) ≤ B, where b: R → [0, ∞) is the (single-letter) cost function.

The definitions of error probability, achievable rate, and capacity are the same as those in channel coding over a DMC.

SLIDE 10

Channel Coding Theorem

Theorem 1 (Continuous Memoryless Channel Capacity)
The capacity of the CMC (R, f_{Y|X}, R) with input cost constraint B is

    C = sup_{X: E[b(X)] ≤ B} I(X; Y).    (1)

Note: the input distribution of the r.v. X need not have a density. In other words, it could also be discrete.

How to compute h(Y|X) when X has no density? Recall

    h(Y|X) = E_X[ −∫_{supp Y} f(y|X) log f(y|X) dy ],

where f(y|x) is the conditional density of Y given X.

Converse proof: exactly the same as that in the DMC case.
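As a concrete illustration of the formula above, here is a small numerical sketch (ours, not from the lecture) evaluating h(Y|X) for a discrete, binary input over an additive Gaussian channel with f(y|x) = N(x, σ²); the parameter values are arbitrary. The answer matches h(Z) = ½ log(2πeσ²), since once X is given the noise is the only remaining randomness.

```python
# Numerical sketch (ours, not from the lecture): h(Y|X) for a *discrete*
# binary input X over an additive Gaussian channel f(y|x) = N(x, sigma2),
# evaluated via the expectation-over-X formula above. Parameters arbitrary.
import numpy as np
from scipy.integrate import quad

sigma2 = 0.5                       # noise variance (assumed)
xs, px = [-1.0, 1.0], [0.5, 0.5]   # X has a pmf, not a density

def f_cond(y, x):
    """Conditional density f(y|x) of Y = X + Z given X = x."""
    return np.exp(-(y - x) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

sd = np.sqrt(sigma2)
h_cond = sum(
    p * quad(lambda y, x=x: -f_cond(y, x) * np.log2(f_cond(y, x)),
             x - 12 * sd, x + 12 * sd)[0]   # tails beyond 12 sd are negligible
    for x, p in zip(xs, px)
)

print(h_cond)                                    # ≈ 1.547 bits
print(0.5 * np.log2(2 * np.pi * np.e * sigma2))  # h(Z), the same value
```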


SLIDE 11

Sketch of the Achievability (1): Discretization

[Diagram: w → ENC → x^N → f_{Y|X} → y^N → DEC → ŵ]

The proof of achievability makes use of discretization, so that one can apply the result for the DMC with input cost:

SLIDE 12

Sketch of the Achievability (1): Discretization

[Diagram: w → ENC → Q_in → f_{Y|X} → Q_out → DEC → ŵ]

The proof of achievability makes use of discretization, so that one can apply the result for the DMC with input cost:
- Q_in: (single-letter) discretization that maps X ∈ R to X_d ∈ 𝒳_d.
- Q_out: (single-letter) discretization that maps Y ∈ R to Y_d ∈ 𝒴_d.
Note that both 𝒳_d and 𝒴_d are discrete (countable) alphabets.

SLIDE 13

Sketch of the Achievability (1): Discretization

[Diagram: w → new ENC (= ENC followed by Q_in) → equivalent DMC (= f_{Y|X} followed by Q_out) → DEC → ŵ]

The proof of achievability makes use of discretization, so that one can apply the result for the DMC with input cost:
- Q_in: (single-letter) discretization that maps X ∈ R to X_d ∈ 𝒳_d.
- Q_out: (single-letter) discretization that maps Y ∈ R to Y_d ∈ 𝒴_d.
Note that both 𝒳_d and 𝒴_d are discrete (countable) alphabets.

Idea: with the two discretization blocks Q_in and Q_out, one can build an equivalent DMC (𝒳_d, p_{Y_d|X_d}, 𝒴_d), as shown above.

SLIDE 14

Sketch of the Achievability (2): Arguments

[Diagram: w → new ENC → x_d^N → equivalent DMC p_{Y_d|X_d} → y_d^N → DEC → ŵ]

1. Random codebook generation: generate the codebook randomly based on the original (continuous) r.v. X, satisfying E[b(X)] ≤ B.

2. Choice of discretization: choose Q_in such that the cost constraint is not violated after discretization. Specifically, E[b(X_d)] ≤ B.

3. Achievability for the equivalent DMC: by the achievability part of the channel coding theorem for the DMC with input constraint, any rate R < I(X_d; Y_d) is achievable.

4. Achievability for the original CMC: prove that as the discretization in Q_in and Q_out gets finer and finer, I(X_d; Y_d) → I(X; Y). (For the Gaussian channel, a numerical illustration of this convergence follows Slide 21.)

SLIDE 15

1. Channel Coding over Continuous Memoryless Channels
   - Continuous Memoryless Channel
   - Gaussian Channel Capacity

SLIDE 16

Additive White Gaussian Noise (AWGN) Channel

[Diagram: w → Channel Encoder → x^N → ⊕ (noise z^N) → y^N → Channel Decoder → ŵ]

1. Input/output alphabet X = Y = R.
2. AWGN channel:
   - The conditional p.d.f. f_{Y|X} is given by Y = X + Z, where Z ∼ N(0, σ²) is independent of X.
   - {Z_k} form an i.i.d. (white) Gaussian random process with Z_k ∼ N(0, σ²) for all k.
   - Memoryless: Z_k ⊥ (W, X^{k−1}, Z^{k−1}). Without feedback: Z^N ⊥ X^N.
3. Average input power constraint P: (1/N) ∑_{k=1}^{N} |x_k|² ≤ P.
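A minimal simulation sketch of this channel model (our code; parameter values arbitrary). It also previews the sphere picture used later: by the LLN, the empirical input power concentrates near P and the output power near P + σ².

```python
# Minimal sketch (ours): one block of the AWGN channel y = x + z with an
# i.i.d. N(0, P) input, checking the LLN behavior of the power constraint.
import numpy as np

rng = np.random.default_rng(0)
N, P, sigma2 = 100_000, 1.0, 0.25   # blocklength, power budget, noise variance

x = rng.normal(0.0, np.sqrt(P), N)        # random codeword symbols
z = rng.normal(0.0, np.sqrt(sigma2), N)   # white Gaussian noise
y = x + z                                 # memoryless: one use per symbol

print(np.mean(x ** 2))   # ≈ P       (average input power constraint)
print(np.mean(y ** 2))   # ≈ P + σ²  (output power; see the sphere picture later)
```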


SLIDE 17

Channel Coding Theorem for the Gaussian Channel

Theorem 2 (Gaussian Channel Capacity)
The capacity of the AWGN channel with input power constraint P and noise variance σ² is given by

    C = sup_{f(x): E[|X|²] ≤ P} I(X; Y) = ½ log(1 + P/σ²).    (2)

Note: for the AWGN channel, the supremum is actually attained, by the Gaussian input distribution f(x) = (1/√(2πP)) e^{−x²/(2P)}, i.e., X ∼ N(0, P), as shown on the next slide.

SLIDE 18

Evaluation of Capacity

Let us compute the capacity of the AWGN channel (2) as follows:

    I(X; Y) = h(Y) − h(Y|X)
            = h(Y) − h(X + Z | X)
            = h(Y) − h(Z|X)
            = h(Y) − h(Z)                            (since Z ⊥ X)
            = h(Y) − ½ log(2πe σ²)
            ≤ ½ log(2πe (P + σ²)) − ½ log(2πe σ²)    (a)
            = ½ log(1 + P/σ²).

Here (a) is due to the fact that h(Y) ≤ ½ log(2πe Var[Y]), and Var[Y] = Var[X] + Var[Z] ≤ P + σ², since Var[X] ≤ E[X²] ≤ P. Finally, note that the above inequalities hold with equality when X ∼ N(0, P).
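To complement the derivation, here is a quick Monte Carlo sanity check (ours; parameters arbitrary) that the Gaussian input attains ½ log(1 + P/σ²): since Y ∼ N(0, P + σ²), we estimate h(Y) = E[−log f_Y(Y)] from samples and subtract h(Z) in closed form.

```python
# Monte Carlo sanity check (ours): with X ~ N(0, P) and Z ~ N(0, sigma2),
# Y ~ N(0, P + sigma2), so h(Y) = E[-log2 f_Y(Y)] can be estimated from
# samples; subtracting h(Z) recovers 1/2 log2(1 + P/sigma2).
import numpy as np

rng = np.random.default_rng(1)
n, P, sigma2 = 1_000_000, 1.0, 0.5

y = rng.normal(0.0, np.sqrt(P), n) + rng.normal(0.0, np.sqrt(sigma2), n)

var_y = P + sigma2
neg_log2_f_y = 0.5 * np.log2(2 * np.pi * var_y) + y ** 2 / (2 * var_y * np.log(2))
h_y = np.mean(neg_log2_f_y)                     # Monte Carlo estimate of h(Y)
h_z = 0.5 * np.log2(2 * np.pi * np.e * sigma2)  # closed form for h(Z)

print(h_y - h_z)                      # ≈ 0.79 bits
print(0.5 * np.log2(1 + P / sigma2))  # capacity: 1/2 log2(1 + P/σ²)
```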


SLIDE 19

Achievability Proof (1): Discretization

Here we use a simple quantizer to construct the discretization blocks Q_in and Q_out. For each m ∈ N, let

    𝒬_m := { l/√m : l = 0, ±1, …, ±m }

be the set of quantization points. For any r ∈ R, quantize r to the closest point [r]_m ∈ 𝒬_m such that |[r]_m| ≤ |r|.

Discretization: for two given m, n ∈ N, define
- Channel input discretization: Q_in(·) = [·]_m.
- Channel output discretization: Q_out(·) = [·]_n.

In other words, 𝒳_d = 𝒬_m, 𝒴_d = 𝒬_n, X_d = [X]_m, and Y_d = [X_d + Z]_n = [[X]_m + Z]_n.
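A minimal sketch of this quantizer (our code). Rounding toward zero enforces |[r]_m| ≤ |r|, which is exactly the property that preserves the power constraint on the next slide.

```python
# Sketch of the quantizer [.]_m above (ours). Grid points l/sqrt(m) for
# l = 0, ±1, ..., ±m; rounding toward zero guarantees |[r]_m| <= |r|, so
# quantization can only decrease the input power.
import numpy as np

def quantize(r, m):
    """Closest point of Q_m = {l/sqrt(m)} with magnitude at most |r|."""
    step = 1.0 / np.sqrt(m)
    l = np.trunc(np.asarray(r, dtype=float) / step)  # round toward zero
    return np.clip(l, -m, m) * step                  # saturate at ±sqrt(m)

print(quantize([-3.7, -0.2, 0.26, 5.0], m=16))  # spacing 1/4, range [-4, 4]
# -> [-3.5  -0.    0.25  4.  ]
```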


SLIDE 20

Achievability Proof (2): Equivalent DMC

Now we have an equivalent DMC with
- input X_d = [X]_m, and
- output Y_d = [Y^(m)]_n, where Y^(m) ≜ [X]_m + Z.

Note that for any original input r.v. X with E[|X|²] ≤ P, the discretized [X]_m also satisfies the power constraint: E[|[X]_m|²] ≤ E[|X|²] ≤ P. Hence, by the achievability result for the DMC with input cost constraint, any

    R < I([X]_m ; [Y^(m)]_n)   (evaluated under f_X(x) = (1/√(2πP)) e^{−x²/(2P)})

is indeed achievable for the equivalent DMC under power constraint P.

The only thing left to be shown is that I([X]_m ; [Y^(m)]_n) can be made arbitrarily close to I(X; Y) = ½ log(1 + P/σ²) as m, n → ∞.

SLIDE 21

Achievability Proof (3): Convergence

By the data processing inequality and the Markov chain [X]_m − Y^(m) − [Y^(m)]_n, we have

    I([X]_m ; [Y^(m)]_n) ≤ I([X]_m ; Y^(m)) = h(Y^(m)) − h(Z).

Since Var[Y^(m)] ≤ P + σ², we have h(Y^(m)) ≤ ½ log(2πe (P + σ²)), and hence the upper bound

    I([X]_m ; [Y^(m)]_n) ≤ ½ log(1 + P/σ²).

For the lower bound, we would like to prove

    liminf_{m→∞} lim_{n→∞} I([X]_m ; [Y^(m)]_n) ≥ ½ log(1 + P/σ²).

We skip the details here; see Appendix 3A of El Gamal & Kim [6].
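To make the convergence concrete, here is a small plug-in experiment (ours, not from the lecture): sample X ∼ N(0, P), quantize input and output with [·]_m (taking n = m for simplicity), and estimate I([X]_m ; [Y^(m)]_m) from the empirical joint pmf. The plug-in estimator has a finite-sample bias, so treat the numbers as illustrative only.

```python
# Numerical sketch (ours): plug-in estimate of I([X]_m ; [Y^(m)]_n), with
# n = m, illustrating the convergence toward 1/2 log2(1 + P/σ²).
import numpy as np

rng = np.random.default_rng(2)
n_samples, P, sigma2 = 500_000, 1.0, 0.5

def quantize(r, m):                  # the same [.]_m as in the earlier sketch
    step = 1.0 / np.sqrt(m)
    return np.clip(np.trunc(np.asarray(r) / step), -m, m) * step

def plugin_mi(xd, yd):
    """Plug-in mutual information (bits) from paired discrete samples."""
    xs, xi = np.unique(xd, return_inverse=True)
    ys, yi = np.unique(yd, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (xi, yi), 1.0)      # empirical joint counts
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))

x = rng.normal(0.0, np.sqrt(P), n_samples)
z = rng.normal(0.0, np.sqrt(sigma2), n_samples)

for m in [2, 8, 32, 128]:
    xd = quantize(x, m)
    yd = quantize(xd + z, m)
    print(m, plugin_mi(xd, yd))          # creeps up toward capacity as m grows

print(0.5 * np.log2(1 + P / sigma2))     # ≈ 0.7925 bits
```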


SLIDE 22

Geometric Intuition: Sphere Packing

[Figure: received vectors y = x + z in R^N, contained in a sphere of radius √(N(P + σ²)).]

By the LLN, as N → ∞, most output sequences y (= y^N) will lie inside the N-dimensional sphere of radius √(N(P + σ²)).

SLIDE 23

Geometric Intuition: Sphere Packing

[Figure: as before, now with a noise sphere of radius √(Nσ²) around the transmitted codeword x.]

By the LLN, as N → ∞, most output sequences y will lie inside the N-dimensional sphere of radius √(N(P + σ²)). Also by the LLN, as N → ∞, y will lie near the surface of the N-dimensional sphere centered at x with radius √(Nσ²).

SLIDE 24

Geometric Intuition: Sphere Packing

[Figure: the large sphere of radius √(N(P + σ²)) packed with noise spheres of radius √(Nσ²).]

By the LLN, as N → ∞, most output sequences y will lie inside the N-dimensional sphere of radius √(N(P + σ²)), and y will lie near the surface of the N-dimensional sphere centered at x with radius √(Nσ²).

A vanishing error probability criterion ⟹ non-overlapping noise spheres.

Question: how many non-overlapping spheres can be packed into the large sphere? The maximum number of non-overlapping spheres = the maximum number of codewords that can be reliably delivered.

SLIDE 25

Geometric Intuition: Sphere Packing

Back-of-the-envelope calculation:

    2^{NR} ≤ (√(N(P + σ²)))^N / (√(Nσ²))^N

    ⟹ R ≤ (1/N) log[ (√(N(P + σ²)))^N / (√(Nσ²))^N ] = ½ log(1 + P/σ²).

Hence, intuitively, any achievable rate R cannot exceed C = ½ log(1 + P/σ²).
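A quick numeric rendering of this calculation (ours): the volume of an N-ball is V_N·r^N, the constant V_N cancels in the ratio, and the resulting per-symbol rate bound is the same for every N.

```python
# Numeric rendering of the sphere-packing bound (ours): the N-ball volume
# constant cancels in the ratio, so the rate bound is independent of N.
import numpy as np

P, sigma2 = 1.0, 0.5
for N in [2, 10, 100, 1000]:
    log2_ratio = (N / 2) * (np.log2(N * (P + sigma2)) - np.log2(N * sigma2))
    print(N, log2_ratio / N)             # = 1/2 log2(1 + P/σ²) for every N

print(0.5 * np.log2(1 + P / sigma2))     # ≈ 0.7925 bits per channel use
```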

How to achieve it?


SLIDE 26

Achieving Capacity via Good Packing

[Figure: the "x-sphere" of radius √(NP), containing codewords x_1, x_2, ….]

Random codebook generation: generate 2^{NR} N-dimensional vectors (codewords) {x_1, …, x_{2^{NR}}} lying in the x-sphere of radius √(NP).

SLIDE 27

Achieving Capacity via Good Packing

[Figure: the x-sphere of radius √(NP), with the scaled output αy near the transmitted codeword x_1.]

Random codebook generation: generate 2^{NR} N-dimensional vectors (codewords) {x_1, …, x_{2^{NR}}} lying in the x-sphere of radius √(NP).

Decoding: let α ≜ P/(P + σ²) (the MMSE coefficient); then

    y → MMSE scaling → αy → nearest neighbor → x̂

SLIDE 28

Achieving Capacity via Good Packing

[Figure: as before, with an uncertainty sphere of radius √(N Pσ²/(P + σ²)) around αy.]

Random codebook generation: generate 2^{NR} N-dimensional vectors (codewords) {x_1, …, x_{2^{NR}}} lying in the x-sphere of radius √(NP).

Decoding: let α ≜ P/(P + σ²) (the MMSE coefficient); then

    y → MMSE scaling → αy → nearest neighbor → x̂

By the LLN, we have

    ‖αy − x_1‖² = ‖αz + (α − 1)x_1‖² ≈ α²Nσ² + (α − 1)²NP = N Pσ²/(P + σ²).
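A Monte Carlo check of this LLN statement (ours; parameters arbitrary): the MMSE-scaled residual concentrates at N·Pσ²/(P + σ²), the squared radius of the uncertainty sphere.

```python
# Monte Carlo check (ours): the MMSE-scaled residual ||alpha*y - x1||^2
# concentrates at N * P * sigma2 / (P + sigma2), as claimed above.
import numpy as np

rng = np.random.default_rng(3)
N, P, sigma2 = 100_000, 1.0, 0.5
alpha = P / (P + sigma2)                # MMSE coefficient

x1 = rng.normal(0.0, np.sqrt(P), N)     # transmitted codeword
z = rng.normal(0.0, np.sqrt(sigma2), N)
y = x1 + z

print(np.sum((alpha * y - x1) ** 2))    # ≈ N * P * σ² / (P + σ²)
print(N * P * sigma2 / (P + sigma2))
```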


SLIDE 29

Achieving Capacity via Good Packing

[Figure: as before, with another codeword x_2 falling inside the uncertainty sphere around αy.]

Performance analysis: when does an error occur? When another codeword, say x_2, falls inside the uncertainty sphere centered at αy. What is that probability? It is the ratio of the volumes of the two spheres:

    P{x_1 → x_2} = (√(N Pσ²/(P + σ²)))^N / (√(NP))^N = (σ²/(P + σ²))^{N/2}.

SLIDE 30

Achieving Capacity via Good Packing

By the union of events bound, the total probability of error

    P{E} ≤ 2^{NR} (σ²/(P + σ²))^{N/2} = 2^{N(R + ½ log(1/(1 + P/σ²)))} = 2^{N(R − ½ log(1 + P/σ²))},

which vanishes as N → ∞ if R < ½ log(1 + P/σ²). Hence, any R < ½ log(1 + P/σ²) is achievable.
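Finally, an end-to-end sketch of the whole scheme (ours, not from the slides): a random Gaussian codebook with MMSE scaling and nearest-neighbor decoding. The blocklength is kept tiny, so error rates do not vanish; the point is the trend, low error for rates well below capacity (≈ 0.79 bits here) and high error above it.

```python
# End-to-end sketch (ours): random Gaussian codebook + MMSE scaling +
# nearest-neighbor decoding, as in the geometric argument above. N is tiny,
# so finite-blocklength effects are large; only the trend with R matters.
import numpy as np

rng = np.random.default_rng(4)
P, sigma2 = 1.0, 0.5
alpha = P / (P + sigma2)              # MMSE coefficient
N = 16                                # blocklength (kept small on purpose)

for R in [0.25, 0.5, 0.75, 1.0]:      # rates in bits per channel use
    M = 2 ** int(N * R)               # 2^{NR} codewords
    # i.i.d. N(0, P) rows: by the LLN their norms concentrate near sqrt(N*P),
    # i.e., they lie essentially in the x-sphere of the picture above.
    codebook = rng.normal(0.0, np.sqrt(P), size=(M, N))
    errors, trials = 0, 300
    for _ in range(trials):
        w = rng.integers(M)                       # uniform message
        y = codebook[w] + rng.normal(0.0, np.sqrt(sigma2), N)
        w_hat = np.argmin(np.sum((codebook - alpha * y) ** 2, axis=1))
        errors += (w_hat != w)
    print(f"R = {R:4.2f}, M = {M:5d}, empirical error ≈ {errors / trials:.2f}")
```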
