


Chapter 10 Rate Distortion Theory

Peng-Hua Wang

Graduate Inst. of Comm. Engineering National Taipei University


Chapter Outline

  • Chap. 10 Rate Distortion Theory

10.1 Quantization
10.2 Definitions
10.3 Calculation of the Rate Distortion Function
10.4 Converse to the Rate Distortion Theorem
10.5 Achievability of the Rate Distortion Function
10.6 Strongly Typical Sequences and Rate Distortion
10.7 Characterization of the Rate Distortion Function
10.8 Computation of Channel Capacity and the Rate Distortion Function


10.1 Quantization


Introduction

■ Finite representation of a continuous r.v.
◆ It can't be perfect.
◆ How well can we do? ⇒ We need to define "goodness": a distortion measure, the distance between a r.v. and its representation.
◆ This is, in fact, lossy compression.


Quantization

■ Let $X$ be a r.v. and $\hat X = \hat X(X)$ be its representation.

■ If we quantize $X$ into $R$ bits, we use $2^R$ distinct values to represent $X$.

■ Problem. Find the optimal set of values of $\hat X$, called the representation points or code points, and the region associated with each value of $\hat X$, such that a given error measure is minimized.

■ Example. $X \sim N(0, \sigma^2)$, $R = 1$, minimize $E[(X - \hat X)^2]$. Since each representation point should be the conditional mean of its region, the optimal one-bit quantizer is
$$\hat X = \begin{cases} \sqrt{2/\pi}\,\sigma, & X \ge 0 \\ -\sqrt{2/\pi}\,\sigma, & X < 0. \end{cases}$$
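A quick Monte Carlo check of this example (the code below is a sketch of mine, not from the slides):

```python
import numpy as np

# Verify that xhat = +/- sqrt(2/pi)*sigma minimizes E[(X - xhat)^2]
# among symmetric one-bit quantizers of X ~ N(0, sigma^2).
rng = np.random.default_rng(0)
sigma = 1.0
x = rng.normal(0.0, sigma, size=1_000_000)

def mse(level):
    """MSE of the one-bit quantizer with output levels +/- level."""
    xhat = np.where(x >= 0, level, -level)
    return np.mean((x - xhat) ** 2)

opt = np.sqrt(2 / np.pi) * sigma      # conditional mean E[X | X >= 0]
for level in (0.5, opt, 1.0):
    print(f"level={level:.4f}  MSE={mse(level):.4f}")
# Minimum MSE is sigma^2 * (1 - 2/pi) ~= 0.3634, attained at level = opt.
```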


10.2 Definitions


Definitions

Definition 1 (Distortion Function) A distortion function is a mapping
$$d : \mathcal{X} \times \hat{\mathcal{X}} \to \mathbb{R}^+,$$
that is, a function $d$ with $d(x, \hat x) \ge 0$ for $x \in \mathcal{X}$, $\hat x \in \hat{\mathcal{X}}$.

■ A distortion function $d(x, \hat x)$ is bounded if $\max_{x, \hat x} d(x, \hat x) < \infty$.

■ The distortion between sequences $x^n$ and $\hat x^n$ is defined by
$$d(x^n, \hat x^n) = \frac{1}{n} \sum_{i=1}^{n} d(x_i, \hat x_i).$$

■ Examples.
◆ Hamming distortion: $d(x, \hat x) = 0$ if $x = \hat x$ and $d(x, \hat x) = 1$ if $x \ne \hat x$.
◆ Squared-error distortion: $d(x, \hat x) = (x - \hat x)^2$.
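The two example distortions, applied to sequences as above, can be sketched as follows (the function names are mine):

```python
import numpy as np

# Per-symbol sequence distortion d(x^n, xhat^n) = (1/n) sum_i d(x_i, xhat_i).
def hamming(x, xhat):
    """Average Hamming distortion: fraction of mismatched symbols."""
    return np.mean(np.asarray(x) != np.asarray(xhat))

def squared_error(x, xhat):
    """Average squared-error distortion over the sequence."""
    return np.mean((np.asarray(x) - np.asarray(xhat)) ** 2)

print(hamming([0, 1, 1, 0], [0, 1, 0, 0]))      # 0.25 (one mismatch in four)
print(squared_error([1.0, 2.0], [1.5, 2.0]))    # 0.125
```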


Definitions

Definition 2 (Rate Distortion Code) A $(2^{nR}, n)$-rate distortion code consists of an encoding function
$$f_n : \mathcal{X}^n \to \{1, 2, \ldots, 2^{nR}\}$$
and a decoding function
$$g_n : \{1, 2, \ldots, 2^{nR}\} \to \hat{\mathcal{X}}^n.$$
The distortion $D$ associated with the code is the average distortion over all source sequences:
$$D = E[d(X^n, g_n(f_n(X^n)))] = \sum_{x^n} p(x^n)\, d(x^n, g_n(f_n(x^n))).$$
We may call $\hat X^n = g_n(f_n(X^n))$ the vector quantization, reproduction, reconstruction, source code, or estimate of $X^n$.
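A toy instance of such a code (my own construction, reusing the one-bit Gaussian levels from Section 10.1) for $n = 2$, $R = 1$:

```python
import numpy as np

# A (2^{nR}, n) code with n = 2, R = 1: four reproduction sequences.
n, R = 2, 1
c = np.sqrt(2 / np.pi)   # per-symbol levels from the one-bit quantizer
codebook = np.array([[s1 * c, s2 * c] for s1 in (-1, 1) for s2 in (-1, 1)])

def f_n(xn):
    """Encoder: index of the nearest codeword in squared error."""
    return int(np.argmin(np.sum((codebook - xn) ** 2, axis=1)))

def g_n(index):
    """Decoder: reproduction sequence xhat^n for the given index."""
    return codebook[index]

rng = np.random.default_rng(1)
x = rng.normal(size=(10_000, n))
D = np.mean([np.mean((xi - g_n(f_n(xi))) ** 2) for xi in x])
print(f"average distortion D ~= {D:.4f}")   # ~ 1 - 2/pi ~= 0.3634
```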


Definitions

Definition 3 (Achievable) A rate distortion pair $(R, D)$ is said to be achievable if there exists a sequence of $(2^{nR}, n)$-rate distortion codes $(f_n, g_n)$ with
$$\lim_{n \to \infty} E[d(X^n, g_n(f_n(X^n)))] \le D.$$

■ The rate distortion region for a source is the closure of the set of achievable rate distortion pairs $(R, D)$.

■ The rate distortion function $R(D)$ for a source is the infimum of rates $R$ such that $(R, D)$ is in the rate distortion region of the source for a given distortion $D$.

■ The distortion rate function $D(R)$ for a source is the infimum of all distortions $D$ such that $(R, D)$ is in the rate distortion region of the source for a given rate $R$.


Definitions

Definition 4 (Information Rate Distortion Function) The information rate distortion function $R^{(I)}(D)$ for a source $X$ with distortion measure $d(x, \hat x)$ is defined as
$$R^{(I)}(D) = \min_{p(\hat x \mid x):\ \sum_{x, \hat x} p(x) p(\hat x \mid x) d(x, \hat x) \le D} I(X; \hat X).$$

Theorem 1 (Rate Distortion Function) The rate distortion function for an i.i.d. source $X$ with distribution $p(x)$ and bounded distortion function $d(x, \hat x)$ is equal to the associated information rate distortion function. Thus, $R(D) = R^{(I)}(D)$ is the minimum achievable rate at distortion $D$.
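The minimization in Definition 4 can be done numerically; below is a minimal Blahut-Arimoto sketch (the topic of Section 10.8; the implementation and its parameter names are mine) for a discrete source:

```python
import numpy as np

# Blahut-Arimoto iteration for R(D): the multiplier beta > 0 picks a slope
# of the R(D) curve, and sweeping beta traces out (D, R) points.
def blahut_arimoto(p_x, d, beta, iters=200):
    """Return a point (D, R) on the rate distortion curve for a discrete
    source p_x and distortion matrix d[x, xhat]; R is in bits."""
    q = np.full(d.shape[1], 1.0 / d.shape[1])    # output marginal q(xhat)
    for _ in range(iters):
        Q = q * np.exp(-beta * d)                # test channel Q(xhat|x),
        Q /= Q.sum(axis=1, keepdims=True)        # normalized per row
        q = p_x @ Q                              # updated output marginal
    D = np.sum(p_x[:, None] * Q * d)             # expected distortion
    R = np.sum(p_x[:, None] * Q * np.log2(Q / q))  # I(X; Xhat)
    return D, R

# Bernoulli(1/2) source, Hamming distortion: theory gives R(D) = 1 - H(D).
p_x = np.array([0.5, 0.5])
d = np.array([[0.0, 1.0], [1.0, 0.0]])
for beta in (1.0, 2.0, 4.0):
    D, R = blahut_arimoto(p_x, d, beta)
    print(f"beta={beta}:  D={D:.3f}  R={R:.3f}")
```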


10.3 Calculation of the Rate Distortion Function


Binary Source

Theorem 2 The rate distortion function for a Bernoulli($p$) source with Hamming distortion is given by
$$R(D) = \begin{cases} H(p) - H(D), & 0 \le D \le \min\{p, 1 - p\} \\ 0, & D > \min\{p, 1 - p\}. \end{cases}$$

■ If $D \ge p$ (taking $p \le 1/2$), we can achieve $R(D) = 0$ (one codeword to represent both values) by letting $\hat X = 0$, since the distortion is then
$$E[d(X, \hat X)] = p(X = 1)\, p(\hat X = 0 \mid X = 1)\, d(1, 0) = p \cdot 1 \cdot 1 = p \le D.$$
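A quick numerical reading of Theorem 2 (a sketch of mine):

```python
import numpy as np

# R(D) = H(p) - H(D) for a Bernoulli(p) source under Hamming distortion.
def H(q):
    """Binary entropy in bits."""
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

p = 0.3
for D in (0.0, 0.1, 0.2, 0.3, 0.4):
    R = H(p) - H(D) if D <= min(p, 1 - p) else 0.0
    print(f"D={D:.1f}  R(D)={max(R, 0.0):.3f}")
# R(0) = H(0.3) ~= 0.881 bits; R(D) reaches 0 at D = p = 0.3.
```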


Binary Source

Proof. Let $Z = 0$ if $X = \hat X$ and $Z = 1$ if $X \ne \hat X$, and denote $p(Z = 1) = t$. The distortion is
$$E[d(X, \hat X)] = p(X = 0, \hat X = 1) + p(X = 1, \hat X = 0) = p(Z = 1) = t \le D.$$
Then
$$\begin{aligned} I(X; \hat X) &= H(X) - H(X \mid \hat X) = H(p) - H(Z \mid \hat X) \\ &\ge H(p) - H(Z) = H(p) - H(t) \ge H(p) - H(D). \end{aligned}$$
Equality holds when $Z$ is independent of $\hat X$ and $t = D$, that is, when $H(X \mid \hat X) = H(D)$.
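The equality case corresponds to a concrete test channel; a Monte Carlo sketch of that construction (the construction is the standard one, the code is mine):

```python
import numpy as np

# Build Xhat ~ Bernoulli(r) with r = (p - D) / (1 - 2D), then flip it with
# probability D to get X. Then X ~ Bernoulli(p) and P(X != Xhat) = D, with
# Z = X XOR Xhat independent of Xhat, so the bound holds with equality.
p, D = 0.3, 0.1
r = (p - D) / (1 - 2 * D)

rng = np.random.default_rng(2)
m = 1_000_000
xhat = (rng.random(m) < r).astype(int)
z = (rng.random(m) < D).astype(int)    # independent Bernoulli(D) noise
x = xhat ^ z                           # X = Xhat XOR Z

print(np.mean(x))          # ~= p = 0.3  (correct source marginal)
print(np.mean(x != xhat))  # ~= D = 0.1  (Hamming distortion exactly D)
```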


Gaussian Source

Theorem 3 The rate distortion function for a $N(0, \sigma^2)$ source with squared-error distortion is
$$R(D) = \begin{cases} \frac{1}{2} \log \frac{\sigma^2}{D}, & 0 \le D \le \sigma^2 \\ 0, & D > \sigma^2. \end{cases}$$

■ If $D \ge \sigma^2$, we can achieve $R(D) = 0$ (one codeword to represent ALL values) by letting $\hat X = 0$, since the distortion is then
$$\int (x - \hat x)^2 \phi(x)\, dx = \int x^2 \phi(x)\, dx = \sigma^2 \le D,$$
where $\phi(x)$ is the pdf of $N(0, \sigma^2)$.
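Theorem 3 evaluated numerically (a sketch of mine, with rate in bits):

```python
import numpy as np

# R(D) for a N(0, sigma2) source under squared-error distortion.
def gaussian_RD(D, sigma2=1.0):
    """Rate distortion function, in bits per symbol."""
    return 0.5 * np.log2(sigma2 / D) if D < sigma2 else 0.0

for D in (0.05, 0.25, 0.5, 1.0, 2.0):
    print(f"D={D:.2f}  R(D)={gaussian_RD(D):.3f} bits")
# Each extra bit of rate divides the attainable distortion by 4 (~6 dB).
```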


Gaussian Source

Proof.
$$\begin{aligned} I(X; \hat X) &= h(X) - h(X \mid \hat X) = h(X) - h(X - \hat X \mid \hat X) \\ &\ge h(X) - h(X - \hat X) \\ &\ge h(X) - h\!\left(N(0, E[(X - \hat X)^2])\right) \\ &= \frac{1}{2} \log(2\pi e \sigma^2) - \frac{1}{2} \log\!\left(2\pi e\, E[(X - \hat X)^2]\right) \\ &\ge \frac{1}{2} \log(2\pi e \sigma^2) - \frac{1}{2} \log(2\pi e D) = \frac{1}{2} \log \frac{\sigma^2}{D}. \end{aligned}$$
Equality holds when $Z = X - \hat X$ has a normal distribution with zero mean and variance $D$, independent of $\hat X$.
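Here too equality comes from a concrete test channel; a Monte Carlo sketch (my code) of the construction $X = \hat X + Z$ with $\hat X \sim N(0, \sigma^2 - D)$ and $Z \sim N(0, D)$ independent:

```python
import numpy as np

# With Xhat ~ N(0, sigma^2 - D) and independent Z ~ N(0, D), X = Xhat + Z
# has the right marginal N(0, sigma^2), the distortion is exactly D, and
# I(X; Xhat) = (1/2) log(sigma^2 / D).
sigma2, D = 1.0, 0.25
rng = np.random.default_rng(3)
m = 1_000_000
xhat = rng.normal(0.0, np.sqrt(sigma2 - D), size=m)
z = rng.normal(0.0, np.sqrt(D), size=m)
x = xhat + z

print(np.var(x))                        # ~= sigma2 = 1.0
print(np.mean((x - xhat) ** 2))         # ~= D = 0.25
print(0.5 * np.log2(sigma2 / D))        # R(D) = 1 bit for sigma^2/D = 4
```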


Sphere Packing for Channel Coding

For each sent codeword, the received vector lies (with high probability) in a sphere of radius $\sqrt{nN}$ around the codeword. The received vectors have energy no greater than $n(P + N)$, so they lie in a sphere of radius $\sqrt{n(P + N)}$. How many codewords can we use without intersection of the decoding spheres?
$$M = \frac{A_n \left(\sqrt{n(P+N)}\right)^n}{A_n \left(\sqrt{nN}\right)^n} = \left(1 + \frac{P}{N}\right)^{n/2},$$
where $A_n$ is the constant in the volume formula for an $n$-dimensional sphere; for example, $A_2 = \pi$ and $A_3 = \frac{4}{3}\pi$. Therefore, the capacity is
$$\frac{1}{n} \log M = \frac{1}{2} \log \left(1 + \frac{P}{N}\right).$$
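The counting argument in numbers (a sketch of mine):

```python
import numpy as np

# Per-channel-use exponent of M = (1 + P/N)^(n/2): the AWGN capacity.
def capacity(P, N):
    """(1/n) log2 M = 0.5 * log2(1 + P/N), bits per channel use."""
    return 0.5 * np.log2(1 + P / N)

for snr in (1.0, 10.0, 100.0):
    print(f"P/N={snr:g}  C={capacity(snr, 1.0):.3f} bits/use")
# At P/N = 100 (20 dB), the sphere count allows ~3.33 bits per use.
```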

Sphere Packing for Rate Distortion

Consider a Gaussian source of variance $\sigma^2$. A $(2^{nR}, n)$ rate distortion code for this source with distortion $D$ is a set of $M = 2^{nR}$ sequences in $\mathbb{R}^n$. All the source sequences lie within a sphere of radius $\sqrt{n\sigma^2}$, and each must lie within distance $\sqrt{nD}$ of some codeword. How many spheres of radius $\sqrt{nD}$ do we need to cover the source sphere?
$$M = \frac{A_n \left(\sqrt{n\sigma^2}\right)^n}{A_n \left(\sqrt{nD}\right)^n} = \left(\frac{\sigma^2}{D}\right)^{n/2}.$$
Therefore, the rate is
$$\frac{1}{n} \log M = \frac{1}{2} \log \frac{\sigma^2}{D}.$$
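A random-codebook experiment (my own sketch) matching the covering picture:

```python
import numpy as np

# Draw M = 2^{nR} codewords i.i.d. N(0, sigma^2 - D); nearest-codeword
# distortion drifts down toward D = sigma^2 * 2^{-2R} as n grows (slowly
# here, since the rate R sits exactly at R(D)).
rng = np.random.default_rng(4)
sigma2, R = 1.0, 1.0
D_target = sigma2 * 2.0 ** (-2 * R)          # 0.25 for R = 1

for n in (2, 4, 8, 12):
    M = int(2 ** (n * R))
    cb = rng.normal(0.0, np.sqrt(sigma2 - D_target), size=(M, n))
    x = rng.normal(0.0, np.sqrt(sigma2), size=(1000, n))
    # Squared distances via ||x||^2 - 2 x.c + ||c||^2 (no 3-D array needed).
    d2 = (x ** 2).sum(1)[:, None] - 2 * x @ cb.T + (cb ** 2).sum(1)[None, :]
    D_hat = d2.min(axis=1).mean() / n        # per-symbol distortion
    print(f"n={n:2d}  M={M:5d}  distortion ~= {D_hat:.3f}  target {D_target}")
```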