Lecture 1

Capacity of the Gaussian Channel

  • Basic concepts in information theory: Appendix B
  • Capacity of the Gaussian channel: Appendix B, Ch. 5.1–3

Mikael Skoglund, Theoretical Foundations of Wireless

Entropy and Mutual Information I

  • Entropy for a discrete random variable X with alphabet 𝒳 and pmf
    p(x) ≜ Pr(X = x), ∀x ∈ 𝒳:

      H(X) ≜ −Σ_{x∈𝒳} p(x) log p(x)

  • H(X) = the average amount of uncertainty removed when observing
    X = the average information obtained by observing X
  • It holds that 0 ≤ H(X) ≤ log |𝒳|
  • Entropy for an n-tuple X₁ⁿ = (X₁, . . . , Xₙ):

      H(X₁ⁿ) = H(X₁, . . . , Xₙ) = −Σ_{x₁ⁿ} p(x₁ⁿ) log p(x₁ⁿ)
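As a quick numerical companion to the definition (not part of the original slides), the sketch below computes H(X) for a small example pmf; the pmf values and the helper name `entropy` are made up for illustration.

```python
import numpy as np

def entropy(pmf, base=2):
    """H(X) = -sum_x p(x) log p(x); zero-probability outcomes contribute 0."""
    p = np.asarray(pmf, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)) / np.log(base))

# Example pmf on an alphabet of size 4 (values chosen only for illustration)
p = [0.5, 0.25, 0.125, 0.125]
print(entropy(p))        # 1.75 bits
print(np.log2(len(p)))   # the upper bound log|X| = 2 bits
```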



Entropy and Mutual Information II

  • Conditional entropy of Y given X = x:

      H(Y|X = x) ≜ −Σ_{y∈𝒴} p(y|x) log p(y|x)

  • H(Y|X = x) = the average information obtained when observing Y
    when it is already known that X = x
  • Conditional entropy of Y given X (on the average):

      H(Y|X) ≜ Σ_{x∈𝒳} p(x) H(Y|X = x)

  • Define g(x) = H(Y |X = x). Then H(Y |X) = Eg(X).
  • Chain rule

H(X, Y) = H(Y|X) + H(X)   (cf. p(x, y) = p(y|x)p(x))


Entropy and Mutual Information III

  • Mutual information:

      I(X; Y) ≜ Σ_x Σ_y p(x, y) log [ p(x, y) / (p(x)p(y)) ]

  • I(X; Y) = the average information about X obtained when
    observing Y (and vice versa)



Entropy and Mutual Information IV

  [Venn-diagram relationship between H(X), H(Y), H(X, Y), H(X|Y), H(Y|X) and I(X; Y)]

  • I(X; Y) = I(Y; X)
  • I(X; Y) = H(Y) − H(Y|X) = H(X) − H(X|Y)
  • I(X; Y) = H(X) + H(Y) − H(X, Y)
  • I(X; X) = H(X)
  • H(X, Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)
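The identities above are easy to check numerically. A minimal sketch (the 2×2 joint pmf is arbitrary, chosen only for the example):

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf given as an array (joint or marginal)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Arbitrary joint pmf p(x, y) for illustration
pxy = np.array([[0.30, 0.10],
                [0.20, 0.40]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

Hxy = H(pxy)
H_y_given_x = Hxy - H(px)          # chain rule: H(X, Y) = H(X) + H(Y|X)
I_xy = H(px) + H(py) - Hxy         # I(X; Y) = H(X) + H(Y) - H(X, Y)

print(I_xy, H(py) - H_y_given_x)   # both expressions give the same I(X; Y)
```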


Entropy and Mutual Information V

  • For a continuous random variable X with pdf f(x), the differential entropy is

      h(X) = −∫ f(x) log f(x) dx

  • For E[X²] = σ²,

      h(X) ≤ ½ log(2πeσ²) [bits], with equality only for X Gaussian

  • Mutual information:

      I(X; Y) = ∫∫ f(x, y) log [ f(x, y) / (f(x)f(y)) ] dx dy
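A minimal sketch checking the Gaussian case of the bound: the closed form ½ log₂(2πeσ²) against a Monte Carlo estimate of −E[log₂ f(X)] (σ², the sample size, and the seed are arbitrary choices for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 2.0

# Closed form: h(X) = 0.5 * log2(2*pi*e*sigma^2) for X ~ N(0, sigma^2)
h_closed = 0.5 * np.log2(2 * np.pi * np.e * sigma2)

# Monte Carlo estimate of h(X) = E[-log2 f(X)]
x = rng.normal(0.0, np.sqrt(sigma2), size=200_000)
log_f = -0.5 * np.log(2 * np.pi * sigma2) - x**2 / (2 * sigma2)   # natural log of the pdf
h_mc = float(-np.mean(log_f) / np.log(2))

print(h_closed, h_mc)   # the two values agree to a few decimals
```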



Jensen’s Inequality

  • For f : Rⁿ → R convex and a random X ∈ Rⁿ,

      f(E[X]) ≤ E[f(X)]

  • Reverse inequality for f concave
  • For f strictly convex (or strictly concave),

      f(E[X]) = E[f(X)] ⇒ Pr(X = E[X]) = 1
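A one-line numeric illustration of the inequality for the convex function f(x) = x² (the distribution of X is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=100_000)   # any non-degenerate distribution works

# Jensen for f(x) = x^2: f(E[X]) <= E[f(X)]
print(np.mean(x) ** 2, np.mean(x ** 2))        # the left value is the smaller one
```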


Fano’s Inequality

  • Consider the following estimation problem (discrete RVs):

      X = random variable of interest
      Y = observed random variable
      X̂ = f(Y) = estimate of X based on Y

  • Define the probability of error as

      Pe = Pr(X̂ ≠ X)

  • Fano's inequality lower bounds Pe:

      h(Pe) + Pe log(|𝒳| − 1) ≥ H(X|Y)

    [h(x) = −x log x − (1 − x) log(1 − x) is the binary entropy function]
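Fano's inequality implicitly lower-bounds Pe in terms of H(X|Y). A small sketch that finds the smallest admissible Pe on a grid (the values H(X|Y) = 1 bit and |𝒳| = 4 are placeholders):

```python
import numpy as np

def binary_entropy(p):
    """h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

def fano_pe_bound(H_cond, alphabet_size, grid=100_001):
    """Smallest Pe on a grid with h(Pe) + Pe*log2(|X|-1) >= H(X|Y)."""
    for pe in np.linspace(0.0, 1.0, grid):
        if binary_entropy(pe) + pe * np.log2(alphabet_size - 1) >= H_cond:
            return float(pe)
    return None   # only happens if H_cond exceeds log2(|X|)

# Placeholder numbers: H(X|Y) = 1 bit, |X| = 4
print(fano_pe_bound(1.0, 4))   # roughly 0.19
```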



The Gaussian Channel I

  [Block diagram: message ω → encoder α → xₘ → + (noise wₘ) → yₘ → decoder β → estimate ω̂]

  • At time m: transmitted symbol xₘ ∈ 𝒳 = R, received symbol
    yₘ ∈ 𝒴 = R, noise wₘ ∈ R
  • The noise {wₘ} is i.i.d. Gaussian N(0, σ²)
  • A memoryless Gaussian transition density (noise variance σ²),

      f(y|x) = (1/√(2πσ²)) exp( −(y − x)²/(2σ²) )
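A minimal simulation of n uses of this channel (the input sequence, σ², and n are placeholders chosen only for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2 = 10, 0.5                           # placeholder block length and noise variance

x = rng.uniform(-1.0, 1.0, size=n)            # arbitrary channel input, just for illustration
w = rng.normal(0.0, np.sqrt(sigma2), size=n)  # i.i.d. N(0, sigma^2) noise
y = x + w                                     # memoryless Gaussian channel: y_m = x_m + w_m

print(y)
```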


The Gaussian Channel II

  • Coding for the Gaussian channel, subject to an average power
    constraint
  • Equally likely information symbols ω ∈ I_M = {1, . . . , M}
  • An (M, n) code with power constraint P:

      1. Power-limited codebook

           C = { x₁ⁿ(1), . . . , x₁ⁿ(M) },  with  n⁻¹ Σ_{m=1}ⁿ xₘ²(i) ≤ P,  i ∈ I_M

      2. Encoding: ω = i ⇒ x₁ⁿ = α(i) = x₁ⁿ(i) transmitted
      3. Decoding: y₁ⁿ received ⇒ ω̂ = β(y₁ⁿ)

  • One symbol → one codeword → n channel uses



Capacity

  • A rate R ≜ (log M)/n is achievable (subject to the power constraint P)
    if there exists a sequence of (⌈2^{nR}⌉, n) codes with codewords satisfying
    the power constraint, and such that the average probability of error
    P_e^(n) = Pr(ω̂ ≠ ω) tends to 0 as n → ∞.

  • The capacity C is the supremum of all achievable rates.


A Lower Bound for C I

  • Gaussian random code design: Fix

      f(x) = (1/√(2π(P − ε))) exp( −x²/(2(P − ε)) )

    for a small ε > 0, and draw a codebook Cₙ = { x₁ⁿ(1), . . . , x₁ⁿ(M) }
    i.i.d. according to f(x₁ⁿ) = Πₘ f(xₘ).
  • Mutual information: Let

      Iε ≜ ∫∫ f(y|x) f(x) log [ f(y|x) / ∫ f(y|x′) f(x′) dx′ ] dx dy
         = ½ log( 1 + (P − ε)/σ² )

    the mutual information between input and output when the channel is
    “driven by” f(x) = N(0, P − ε)
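A sketch of this random code construction with small placeholder numbers (P, σ², ε, the block length n, and the rate R are arbitrary; the actual achievability argument lets n → ∞):

```python
import numpy as np

rng = np.random.default_rng(0)
P, sigma2, eps = 1.0, 0.5, 0.01
n = 20                                           # toy block length so the codebook stays small

I_eps = 0.5 * np.log2(1 + (P - eps) / sigma2)    # mutual information with N(0, P - eps) input
R = 0.8 * I_eps                                  # some rate below I_eps
M = int(np.ceil(2 ** (n * R)))                   # number of codewords

# Rows are the codewords x_1^n(1), ..., x_1^n(M), drawn i.i.d. N(0, P - eps)
codebook = rng.normal(0.0, np.sqrt(P - eps), size=(M, n))
print(I_eps, M, codebook.shape)
```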



A Lower Bound for C II

  • Encoding: A message ω ∈ I_M is encoded as x₁ⁿ(ω)
  • Transmission: Received sequence

      y₁ⁿ = x₁ⁿ(ω) + w₁ⁿ

    where the wₘ are i.i.d. zero-mean Gaussian, E[wₘ²] = σ²
  • Decoding: For any sequences x₁ⁿ and y₁ⁿ, let

      fₙ = fₙ(x₁ⁿ, y₁ⁿ) = (1/n) log [ f(y₁ⁿ|x₁ⁿ) / f(y₁ⁿ) ]
         = (1/n) Σ_{m=1}ⁿ log [ f(yₘ|xₘ) / f(yₘ) ]

    and let T_ε^(n) be the set of (x₁ⁿ, y₁ⁿ) such that fₙ > Iε − ε. Declare
    ω̂ = i if x₁ⁿ(i) is the only codeword such that (x₁ⁿ(i), y₁ⁿ) ∈ T_ε^(n),
    and in addition n⁻¹ Σ_{m=1}ⁿ xₘ²(i) ≤ P; otherwise set ω̂ = 0.
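A self-contained sketch of this threshold decoder for one transmitted message. All parameters (n, M, P, σ², ε, the seed, and the message index) are toy placeholders; the point is the decoding rule, not a performance claim. Here f(y) is the N(0, P − ε + σ²) marginal induced by the Gaussian input.

```python
import numpy as np

rng = np.random.default_rng(1)
P, sigma2, eps = 1.0, 0.5, 0.2
n, M = 200, 256                                     # toy sizes; log2(M)/n is well below I_eps
I_eps = 0.5 * np.log2(1 + (P - eps) / sigma2)

codebook = rng.normal(0.0, np.sqrt(P - eps), size=(M, n))     # random Gaussian codebook
omega = 7                                                     # transmitted message (arbitrary index)
y = codebook[omega] + rng.normal(0.0, np.sqrt(sigma2), size=n)

def log2_pdf(z, var):
    """log2 of the N(0, var) density evaluated at z."""
    return (-0.5 * np.log(2 * np.pi * var) - z ** 2 / (2 * var)) / np.log(2)

# f_n(x_1^n(i), y_1^n) = (1/n) sum_m log2 [ f(y_m | x_m(i)) / f(y_m) ] for every codeword i
f_n = np.mean(log2_pdf(y - codebook, sigma2) - log2_pdf(y, P - eps + sigma2), axis=1)
power_ok = np.mean(codebook ** 2, axis=1) <= P                # per-codeword power check

candidates = np.where((f_n > I_eps - eps) & power_ok)[0]
omega_hat = int(candidates[0]) if candidates.size == 1 else None   # None plays the role of "omega_hat = 0"
print(omega, omega_hat)
```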


A Lower Bound for C III

  • Average probability of error:

      πₙ = Pr(ω̂ ≠ ω) = [symmetry] = Pr(ω̂ ≠ 1 | ω = 1)

    with “Pr” over the random codebook and the noise
  • Let

      E₀ = { n⁻¹ Σₘ xₘ²(1) > P }  and  Eᵢ = { (x₁ⁿ(i), x₁ⁿ(1) + w₁ⁿ) ∈ T_ε^(n) }

    then

      πₙ = P(E₀ ∪ E₁ᶜ ∪ E₂ ∪ · · · ∪ E_M) ≤ P(E₀) + P(E₁ᶜ) + Σ_{i=2}^M P(Eᵢ)



A Lower Bound for C IV

  • For n sufficiently large, we have
      • P(E₀) < ε
      • P(E₁ᶜ) < ε
      • P(Eᵢ) ≤ 2^(−n(Iε−ε)),  i = 2, . . . , M

    that is,

      πₙ ≤ 2ε + 2^(−n(Iε−R−ε))

    ⇒ For the average code: R < Iε − ε ⇒ πₙ → 0 as n → ∞
    ⇒ There exists at least one code with P_e^(n) → 0 for R < Iε − ε

    ⇒ C ≥ ½ log( 1 + P/σ² )


An Upper Bound for C I

  • Consider any sequence of codes that can achieve the rate R
  • Fano ⇒

      R ≤ (1/n) Σ_{m=1}ⁿ I(xₘ(ω); yₘ) + αₙ

    where αₙ = n⁻¹ + R·P_e^(n) → 0 as n → ∞, and where

      I(xₘ(ω); yₘ) = h(yₘ) − h(wₘ) = h(yₘ) − ½ log 2πeσ²

  • Since E[yₘ²] = Pₘ + σ², where Pₘ = M⁻¹ Σ_{i=1}^M xₘ²(i), we get
    h(yₘ) ≤ ½ log 2πe(σ² + Pₘ) and hence I(xₘ(ω); yₘ) ≤ ½ log(1 + Pₘ/σ²).



An Upper Bound for C II

Thus

      R ≤ (1/n) Σ_{m=1}ⁿ ½ log( 1 + Pₘ/σ² ) + αₙ
        ≤ ½ log( 1 + (n⁻¹ Σₘ Pₘ)/σ² ) + αₙ
        ≤ ½ log( 1 + P/σ² ) + αₙ → ½ log( 1 + P/σ² )  as n → ∞

for all achievable R, due to Jensen and the power constraint

      ⇒ C ≤ ½ log( 1 + P/σ² )


Coding Theorem for the Gaussian Channel

Theorem

A memoryless Gaussian channel with noise variance σ² and power
constraint P has capacity

      C = ½ log( 1 + P/σ² )

That is, all rates R < C and no rates R > C are achievable.
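For orientation, the capacity formula evaluated at a few SNR values P/σ² (the SNR grid below is arbitrary):

```python
import numpy as np

for snr_db in (0.0, 10.0, 20.0):                  # arbitrary example SNRs, P/sigma^2 in dB
    snr = 10 ** (snr_db / 10)
    C = 0.5 * np.log2(1 + snr)                    # capacity in bits per channel use
    print(f"SNR = {snr_db:5.1f} dB  ->  C = {C:.3f} bits/channel use")
```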



The Gaussian Waveform Channel I

  [Block diagram: x(t), confined to (−T/2, T/2) → linear filter H(f) → + noise with spectral density N(f) → y(t), observed over (−T/2, T/2)]

  • Linear-filter waveform channel with Gaussian noise,
  • independent Gaussian noise with spectral density N(f)
  • linear filter H(f)
  • input confined to (−T/2, T/2)
  • output measured over (−T/2, T/2)
  • codebook

      C = { x₁(t), . . . , x_M(t) }

  • power constraint

      (1/T) ∫_{−T/2}^{T/2} xᵢ²(t) dt ≤ P

  • rate

      R = (log M)/T


The Gaussian Waveform Channel II

  • Capacity (in bits per second):

      C = ½ ∫_{F(β)} log( |H(f)|² · β / N(f) ) df
      P = ∫_{F(β)} ( β − N(f)/|H(f)|² ) df

    where F(β) = { f : N(f) · |H(f)|⁻² ≤ β }, for different β ∈ (0, ∞).
  • That is, there exist codes such that arbitrarily low error probability is
    possible as long as R = (log M)/T < C and as T → ∞. For R > C the error
    probability is > 0.



The Gaussian Waveform Channel III

  • In the achievability proof: random Gaussian codewords,
    with spectral density

      S(f) = [ β − N(f)/|H(f)|² ]₊,  where [x]₊ = x for x ≥ 0 and 0 for x < 0

  • The famous special case of a bandlimited AWGN channel:
      • Perfect lowpass filter of bandwidth W:

          H(f) = 1 for |f| ≤ W,  H(f) = 0 for |f| > W

      • White Gaussian noise, with N(f) = N₀/2


The Gaussian Waveform Channel IV

  • The capacity of this channel is (Shannon ’48):

      C = W log( 1 + P/(W N₀) )  [bits per second]

  • Fundamental resources: power P and bandwidth W
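Evaluating C = W log₂(1 + P/(W N₀)) for a few example numbers (the power, noise density, and bandwidths below are placeholders):

```python
import numpy as np

P  = 1e-3      # transmit power [W] (placeholder)
N0 = 1e-9      # one-sided noise spectral density [W/Hz] (placeholder)

for W in (1e3, 1e4, 1e5):                         # bandwidth [Hz]
    C = W * np.log2(1 + P / (W * N0))             # bits per second
    print(f"W = {W:8.0f} Hz  ->  C = {C:,.0f} bit/s")
```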



Waterfilling

  • A frequency-selective Gaussian waveform channel (H(f) arbitrary),
    white Gaussian noise (N(f) = N₀/2) ⇒

      C = ½ ∫_{F(β)} log( β|H(f)|² ) df
      P = (N₀/2) ∫_{F(β)} ( β − 1/|H(f)|² ) df

    where F(β) = { f : β ≥ |H(f)|⁻² }
  • Optimal signal spectrum (see the numerical sketch below)

      S(f) = (N₀/2) [ β − 1/|H(f)|² ]₊

  • “sample in frequency” ⇒ OFDM. . .
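A numerical waterfilling sketch in the slide's parameterization: choose a frequency grid and an arbitrary |H(f)|², bisect on β until the power constraint is met, then evaluate C and S(f). The channel shape, P, N₀, and the grid are all placeholder choices.

```python
import numpy as np

# Frequency grid and an arbitrary example channel |H(f)|^2 (placeholders)
f = np.linspace(-1e4, 1e4, 2001)                 # Hz
df = f[1] - f[0]
H2 = 1.0 / (1.0 + (f / 4e3) ** 2)                # a lowpass-like |H(f)|^2, for illustration
N0, P = 1e-6, 1e-2                               # noise density [W/Hz] and power [W]

def power_used(beta):
    """P(beta) = (N0/2) * integral over F(beta) of (beta - 1/|H(f)|^2) df."""
    return (N0 / 2) * np.sum(np.maximum(beta - 1.0 / H2, 0.0)) * df

# Bisection on beta so that the waterfilling power matches P
lo, hi = 0.0, 1.0
while power_used(hi) < P:
    hi *= 2.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if power_used(mid) < P else (lo, mid)
beta = 0.5 * (lo + hi)

# C = 0.5 * integral over F(beta) of log2(beta * |H(f)|^2) df   [bits per second]
mask = beta >= 1.0 / H2
C = 0.5 * np.sum(np.log2(beta * H2[mask])) * df

# Optimal transmit spectrum S(f) = (N0/2) * [beta - 1/|H(f)|^2]_+
S = (N0 / 2) * np.maximum(beta - 1.0 / H2, 0.0)

print(beta, C)
```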
