SLIDE 1

High-Rate Sparse Superposition Codes with Iteratively Optimal Estimates

Andrew Barron, Sanghee Cho

Department of Statistics, Yale University

2012 IEEE International Symposium on Information Theory, July 2, 2012, MIT

SLIDES 2–7

Sparse Superposition Code for the Gaussian Channel

u: input bits (length K) → β: sparse coefficient vector (length N) with L non-zero entries and ‖β‖² = P → X: dictionary, n × N, with independent N(0, 1) entries → codeword Xβ (length n) → channel adds noise ε ∼ N(0, σ²I) → Y: received vector (length n) → decoder output û.

Linear model: Y = Xβ + ε, with snr = P/σ².

  • Partitioned coefficients: β = (00∗0000, 000∗000, …, 0∗00000)
  • L sections of size M = N/L, one non-zero entry in each
  • Rate R = K/n = (L log M)/n; capacity C = (1/2) log(1 + snr)
  • Ultra-sparse case: impractical M = 2^{nR/L} with L constant (successive decoder reliable for R < C: Cover 1972 IT)
  • Moderately sparse case: M = L^a with n = (L log M)/R (reliable for R < C)

Decoders for the moderately sparse case: the maximum likelihood decoder (Joseph & Barron 2010a ISIT, 2012a IT); the adaptive successive decoder with thresholding (J&B 2010b ISIT, 2012b); the adaptive successive decoder with soft decisions (B&C, this talk).
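For concreteness, a minimal sketch of this forward model in Python (the helper name and the flat power split P_ℓ = P/L are our own simplifications; the exponentially decaying allocation of Slides 11–12 refines it):

```python
import numpy as np

def encode_and_transmit(L=32, M=512, snr=7.0, R_frac=0.8, seed=0):
    """Sketch of the sparse superposition forward model Y = X beta + eps.
    A flat power split P_l = P/L is used here for simplicity."""
    rng = np.random.default_rng(seed)
    N = L * M
    C = 0.5 * np.log2(1.0 + snr)          # capacity in bits
    R = R_frac * C                        # operating rate in bits
    n = int(np.ceil(L * np.log2(M) / R))  # codeword length from R = (L log M)/n
    P, sigma2 = snr, 1.0                  # normalize the noise power to 1

    X = rng.normal(size=(n, N))           # dictionary: i.i.d. N(0, 1) entries
    j_sent = rng.integers(0, M, size=L)   # one term chosen per section
    beta = np.zeros(N)
    beta[np.arange(L) * M + j_sent] = np.sqrt(P / L)   # ||beta||^2 = P

    eps = rng.normal(scale=np.sqrt(sigma2), size=n)
    return X, beta, X @ beta + eps, j_sent

X, beta, Y, j_sent = encode_and_transmit()
```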

SLIDES 8–10

Progression of success rate

Figure: progression of the success rate, soft decision versus thresholding with a = 0.5. Setting: M = 2⁹, L = M, snr = 7, C = 1.5 bits, R = 1.05 bits (0.7C).

SLIDES 11–12

Power Allocation

  • Power control: Σ_{ℓ=1}^{L} P_ℓ = P, so that ‖β‖² = P
  • Special choice: P_ℓ proportional to e^{−2Cℓ/L} for ℓ = 1, …, L

Figure: power allocation P_ℓ versus section index ℓ, decaying exponentially from about 0.02 toward 0.
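A small sketch of this choice (our own helper; note that C enters in nats here, so that e^{−2C} = 1/(1 + snr)):

```python
import numpy as np

def power_allocation(L, C_nats, P):
    """Exponentially decaying allocation P_l proportional to exp(-2*C*l/L),
    normalized so that the section powers sum to P."""
    l = np.arange(1, L + 1)
    w = np.exp(-2.0 * C_nats * l / L)
    return P * w / w.sum()

P_l = power_allocation(L=100, C_nats=0.5 * np.log(8.0), P=7.0)  # snr = 7
assert np.isclose(P_l.sum(), 7.0)
```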

SLIDE 13

Coefficient vectors β

With the power allocation above:

  • Coefficients sent: β = (00√P_1 0000, 000√P_2 000, …, 0√P_L 00000)
  • Terms sent: (j_1, j_2, …, j_L)
  • β_j = √P_ℓ · 1{j = j_ℓ} for j in section ℓ, for ℓ = 1, …, L
  • B = the set of such allowed vectors β for codewords Xβ
SLIDE 14

Coefficient Estimates β̂

  • β̂ is restricted to B or to the convex hull of B
  • β̂_j = √P_ℓ ŵ_j for j in sec_ℓ, with ŵ_j ≥ 0 and Σ_{j∈sec_ℓ} ŵ_j = 1
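In code, this parameterization might look as follows (helper name is ours):

```python
import numpy as np

def beta_hat_from_weights(w, P_l):
    """Map per-section weights (each row of w on the probability simplex)
    to an estimate with beta_hat_j = sqrt(P_l) * w_j within section l."""
    L, M = w.shape
    assert np.allclose(w.sum(axis=1), 1.0) and (w >= 0).all()
    return (np.sqrt(P_l)[:, None] * w).reshape(L * M)
```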

SLIDES 15–17

Iterative Estimation

For k ≥ 1:

  • Coefficient fits β̂_{k,j} (initially 0)
  • Codeword fits F_k = Xβ̂_k; also the leave-one-out fits F_{k,−j} = Xβ̂_{k,−j}
  • Vector of statistics: stat_k = function of (X, Y, F_1, …, F_k), e.g. stat_{k,j} proportional to X_jᵀ(Y − F_{k,−j})
  • Update β̂_{k+1} as a function of stat_k

Two update rules (sketched in code below):

  • Thresholding (adaptive successive decoder): β̂_{k+1,j} = √P_ℓ if stat_{k,j} is above threshold, in sections ℓ not previously decoded
  • Soft decision: β̂_{k+1,j} = E[β_j | stat_k], with thresholding on the last step
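A compact sketch of the soft-decision iteration, combining the residual-based statistic of Slide 20 with the logit weights of Slides 29–30 (the function name, the iteration count, the final argmax hard decision, and the use of P − ‖β̂_k‖² for the remaining power are our choices):

```python
import numpy as np

def soft_decision_decode(X, Y, P_l, M, sigma2, n_iter=20):
    """Iterative soft-decision decoding sketch.
    X is n x (L*M); P_l has length L and sums to P."""
    n, N = X.shape
    L = N // M
    P = P_l.sum()
    beta_hat = np.zeros(N)
    for k in range(n_iter):
        resid = Y - X @ beta_hat
        n_ck = resid @ resid                 # n*c_k = ||Y - X beta_hat||^2
        col_sq = (X ** 2).sum(axis=0)        # ||X_j||^2, approximately n
        stat = (X.T @ resid + col_sq * beta_hat) / np.sqrt(n_ck)
        # shift alpha_{l,k}; ||beta - beta_hat||^2 is replaced by its
        # expectation P - ||beta_hat||^2 (cf. Lemma 2)
        P_rem = max(P - beta_hat @ beta_hat, 0.0)
        alpha = np.sqrt(n * P_l / (sigma2 + P_rem))     # one shift per section
        s = alpha[:, None] * stat.reshape(L, M)         # logit scores
        s -= s.max(axis=1, keepdims=True)               # stabilize the softmax
        w = np.exp(s)
        w /= w.sum(axis=1, keepdims=True)               # posterior weights
        beta_hat = (np.sqrt(P_l)[:, None] * w).reshape(N)
    # hard decision on the last step: the max-weight term in each section
    return w.argmax(axis=1)
```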

SLIDES 18–19

Statistics

  • stat_k = function of (X, Y, F_1, …, F_k), where F_k = Xβ̂_k
  • Orthogonalization: let G_0 = Y and, for k ≥ 1, G_k = the part of F_k orthogonal to G_0, G_1, …, G_{k−1}
  • Components of the statistics: Z_{k,j} = X_jᵀ G_k / ‖G_k‖
  • The class of statistics stat_k is formed by combining Z_0, …, Z_k:

stat_{k,j} = Z^comb_{k,j} + √(n/c_k) β̂_{k,j}, where Z^comb_k = λ_{k,0} Z_0 − λ_{k,1} Z_1 − … − λ_{k,k} Z_k, with λ_{k,0} + λ_{k,1} + … + λ_{k,k} = 1.
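A direct transcription of the orthogonalization step (a plain Gram–Schmidt sketch; numerical safeguards omitted):

```python
import numpy as np

def z_statistics(X, Y, fits):
    """G_0 = Y; each G_k is the part of the fit F_k orthogonal to
    G_0, ..., G_{k-1}; then Z_{k,j} = X_j^T G_k / ||G_k||.
    `fits` is the list of codeword fits F_1, ..., F_k.
    (The extra noise coordinate of Slide 25 is for analysis only.)"""
    G = [Y.copy()]
    for F in fits:
        g = F.astype(float).copy()
        for prev in G:
            g -= (prev @ g) / (prev @ prev) * prev  # remove earlier components
        G.append(g)
    return [X.T @ g / np.linalg.norm(g) for g in G]  # Z_0, Z_1, ..., Z_k
```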

SLIDE 20

Statistics based on residuals

Let stat_{k,j} be proportional to X_jᵀ(Y − Xβ̂_{k,−j}):

stat_{k,j} = X_jᵀ(Y − Xβ̂_k)/√(n c_k) + (‖X_j‖²/√(n c_k)) β̂_{k,j}.

This arises with λ_k proportional to

(‖Y‖ − Z_0ᵀβ̂_k)², (Z_1ᵀβ̂_k)², …, (Z_kᵀβ̂_k)²

and with n c_k = ‖Y − Xβ̂_k‖². Here c_k is typically between σ² and σ² + P.

SLIDES 21–22

Idealized Statistics

There exists λ_k yielding stat^ideal_k with the distributional representation

stat^ideal_k = (√n / √(σ² + ‖β − β̂_k‖²)) β + 𝒵^comb_k, with 𝒵^comb_k ∼ N(0, I).

This is a normal shift that improves with decreasing ‖β − β̂_k‖². For the terms sent, the shift α_{ℓ,k} has an effective-snr interpretation:

α_{ℓ,k} = √( n P_ℓ / (σ² + P_remaining,k) ), where P_remaining,k = ‖β − β̂_k‖².

SLIDES 23–25

Distributional Analysis

Consider the full vector Z_kᵀ = ( X_1ᵀG_k/‖G_k‖, …, X_NᵀG_k/‖G_k‖, εᵀG_k/(σ‖G_k‖) ).

Lemma 1 (shifted normal conditional distribution): Given 𝓕_{k−1} = (G_0, …, G_{k−1}, Z_0, Z_1, …, Z_{k−1}), Z_k has the distributional representation

Z_k = (‖G_k‖/σ_k) b_k + 𝒵_k

  • ‖G_k‖²/σ_k² ∼ Chi-square(n − k)
  • 𝒵_k ∼ N(0, Σ_k), independent of ‖G_k‖
  • b_0, b_1, …, b_k: the successive orthonormal components of β/σ_Y, β̂_1, …, β̂_k (∗)
  • Σ_k = I − b_0b_0ᵀ − b_1b_1ᵀ − … − b_kb_kᵀ = projection onto the space orthogonal to (∗)
  • σ_k² = β̂_kᵀ Σ_{k−1} β̂_k

SLIDES 26–27

Idealized Statistics

Weights of combination based on λ_k proportional to

(σ_Y − b_0ᵀβ̂_k)², (b_1ᵀβ̂_k)², …, (b_kᵀβ̂_k)²

produce the desired distributional representation

stat^ideal_k = (√n / √(σ² + ‖β − β̂_k‖²)) β + 𝒵^comb_k

with 𝒵^comb_k ∼ N(0, I) and σ_Y² = σ² + P.

  • ‖β − β̂_k‖² is close to its known expectation.
  • This provides an approximation of the distribution of the stat_{k,j} as independent shifted normals.

SLIDE 28

Relationship between statistics

The statistics based on residuals estimate the idealized statistics. Why? For stat^ideal_k the λ_k are proportional to

(σ_Y − b_0ᵀβ̂_k)², (b_1ᵀβ̂_k)², …, (b_kᵀβ̂_k)²,

whereas for the residual-based stat_k they are proportional to

(‖Y‖ − Z_0ᵀβ̂_k)², (Z_1ᵀβ̂_k)², …, (Z_kᵀβ̂_k)².

Here Z_{k′}ᵀβ̂_k/√n is approximately b_{k′}ᵀβ̂_k for k′ ≤ k. Indeed, with the chi-square factor replaced by its expectation,

Z_{k′}ᵀβ̂_k/√n = b_{k′}ᵀβ̂_k + 𝒵_{k′}ᵀβ̂_k/√n.

The term 𝒵_{k′}ᵀβ̂_k has mean 0 and is stochastically dominated by 𝒵_{k′}ᵀβ.

SLIDES 29–30

Iteratively Bayes optimal coefficient estimates

With prior j_ℓ ∼ Unif on sec_ℓ, the Bayes estimate based on stat_k,

β̂_{k+1} = E[β | stat_k],

has the representation β̂_{k+1,j} = √P_ℓ ŵ_{k,j} with ŵ_{k,j} = Prob{j_ℓ = j | stat_k}. When the stat_{k,j} are independent N(α_{ℓ,k} 1{j=j_ℓ}, 1), we have the logit representation

ŵ_{k,j} = e^{α_{ℓ,k} stat_{k,j}} / Σ_{j′∈sec_ℓ} e^{α_{ℓ,k} stat_{k,j′}}.

In our setting, α_{ℓ,k} is the shift given by

α_{ℓ,k} = √( n P_ℓ / (σ² + E‖β − β̂_k‖²) ).
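A sketch of the logit weights for one section, with a small check that they agree with the Bayes posterior under the stated shifted-normal model (names and sample values are ours):

```python
import numpy as np

def posterior_weights(stat_l, alpha):
    """Logit (softmax) form of w_{k,j} = Prob{j_l = j | stat_k} within one
    section, assuming stat_{k,j} ~ independent N(alpha*1{j=j_l}, 1)."""
    s = alpha * stat_l
    s -= s.max()                 # numerical stabilization only
    w = np.exp(s)
    return w / w.sum()

# check of the Bayes rule it encodes: with a uniform prior over j_l, the
# likelihood ratio for candidate j is exp(alpha*stat_j - alpha^2/2), and
# normalizing those ratios gives exactly the softmax above.
rng = np.random.default_rng(1)
alpha, M = 3.0, 8
stat = rng.normal(size=M)
stat[0] += alpha                 # term 0 sent in this toy section
lik = np.exp(alpha * stat - alpha**2 / 2)
assert np.allclose(lik / lik.sum(), posterior_weights(stat, alpha))
```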

SLIDES 31–34

Relating error rate and squared distance

  • The error of the posterior weight is (1 − ŵ_{k,j_ℓ}) if j_ℓ is sent.
  • The power-weighted error: Σ_{ℓ=1}^{L} P_ℓ (1 − ŵ_{k,j_ℓ})
  • The squared distance from β̂_{k+1,j} = √P_ℓ ŵ_{k,j} to β_j = √P_ℓ 1{j=j_ℓ}: ‖β̂_{k+1} − β‖²

Lemma 2:
  • The power-weighted error and the squared distance have the same expectation.
  • Equivalently, the success rate Σ_{ℓ=1}^{L} (P_ℓ/P) ŵ_{k,j_ℓ}, which equals βᵀβ̂_{k+1}/P, and ‖β̂_{k+1}‖²/P have the same expectation.

Proof: Use β̂_{k+1} = E[β | stat_k]; conditioning on stat_k gives E[βᵀβ̂_{k+1}] = E[ E[β | stat_k]ᵀ β̂_{k+1} ] = E‖β̂_{k+1}‖².

Expected success rate: x_{k+1} = Σ_{ℓ=1}^{L} (P_ℓ/P) E[ŵ_{k,j_ℓ}].

SLIDE 35

Consequence for expected success rate

If the expected success rate was x_k, then using the representation stat_{k,j} = α_{ℓ,k} 1{j=j_ℓ} + 𝒵_{k,j} with α_{ℓ,k} = √( n P_ℓ / (σ² + P(1 − x_k)) ), at the next step we have x_{k+1} = g(x_k), where g(x) is the success update function

g(x) = Σ_{ℓ=1}^{L} (P_ℓ/P) success(α_ℓ(x)),

where

success(α) = E[ e^{α² + αZ_1} / ( e^{α² + αZ_1} + Σ_{j=2}^{M} e^{αZ_j} ) ],

evaluated at α_ℓ(x) = √( n P_ℓ / (σ² + P(1 − x)) ), assuming w.l.o.g. that the first term is sent in each section.
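A Monte Carlo sketch of success(α) and of the update g(x) (sample sizes and helper names are our choices):

```python
import numpy as np

def success(alpha, M, n_mc=20000, rng=None):
    """Monte Carlo estimate of success(alpha): the expected posterior weight
    on the term sent, E[e^{a^2+aZ1} / (e^{a^2+aZ1} + sum_{j>=2} e^{aZj})]."""
    rng = rng or np.random.default_rng(0)
    s = alpha * rng.normal(size=(n_mc, M))
    s[:, 0] += alpha ** 2                # shift on the term sent (term 1)
    s -= s.max(axis=1, keepdims=True)    # stabilize the exponentials
    w = np.exp(s)
    return float((w[:, 0] / w.sum(axis=1)).mean())

def g_update(x, P_l, sigma2, n, M):
    """Success update function g(x) = sum_l (P_l/P) * success(alpha_l(x))."""
    P = P_l.sum()
    alpha = np.sqrt(n * P_l / (sigma2 + P * (1.0 - x)))
    return sum(P_l[l] / P * success(alpha[l], M) for l in range(len(P_l)))

# fixed-point iteration x_{k+1} = g(x_k), starting from x_0 = 0
```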

SLIDE 36

Decoding progression

Figure: plot of g(x) and the sequence x_k. Setting: M = 2⁹, L = M, snr = 7, C = 1.5 bits, R = 1.2 bits (0.8C).

SLIDE 37

Integral representation of g(x)

Change variables from t = ℓ/L to u = (1 − e^{−2Ct})/(1 − e^{−2C}) ∼ Uniform on [0, 1]. Then α_ℓ(x) becomes

α(u, x) = τ √( (C/R) · (1 + snr(1 − u)) / (1 + snr(1 − x)) ),

which can be compared to τ = √(2 log M). We have the integral representation

g(x) = E_U[g(U, x)] = ∫₀¹ g(u, x) du, where g(u, x) = success(α(u, x)).
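The same update computed via this representation, reusing the Monte Carlo success() sketched after Slide 35 (grid size is our choice):

```python
import numpy as np

def alpha_ux(u, x, snr, C_over_R, M):
    """alpha(u, x) = tau * sqrt((C/R)(1 + snr(1-u)) / (1 + snr(1-x)))."""
    tau = np.sqrt(2.0 * np.log(M))
    return tau * np.sqrt(C_over_R * (1 + snr * (1 - u)) / (1 + snr * (1 - x)))

def g_from_integral(x, snr, C_over_R, M, n_grid=200):
    """g(x) = integral over u in [0,1] of success(alpha(u, x)), by a
    midpoint rule; success() is the Monte Carlo sketch given earlier."""
    u = (np.arange(n_grid) + 0.5) / n_grid
    return float(np.mean([success(alpha_ux(ui, x, snr, C_over_R, M), M)
                          for ui in u]))
```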

SLIDE 38

Transition plots

Figure: the expected weight of the terms sent, g(u, x), plotted for x = 0, 0.2, 0.4, 0.6, 0.8, 1. Setting: M = 2⁹, L = M, C = 1.5 bits, R = 0.8C. Horizontal axis: u(ℓ) = (1 − e^{−2Cℓ/L})/(1 − e^{−2C}). Black curves: our soft-decision decoder. Red curves: the thresholding decoder with threshold √(2 log M) + a. The area under each curve is g(x).

SLIDE 39

Lower bound for the update function

Using Jensen's inequality (replacing Σ_{j=2}^{M} e^{αZ_j} in the denominator by its expectation (M − 1)e^{α²/2}),

success(α) = E[ e^{α² + αZ_1} / ( e^{α² + αZ_1} + Σ_{j=2}^{M} e^{αZ_j} ) ] ≥ E[ e^{α² + αZ_1} / ( e^{α² + αZ_1} + (M − 1)e^{α²/2} ) ],

so that, bounding log(M − 1) by log M = τ²/2,

g(x) ≥ P{ ξ ≤ α_U²/2 − τ²/2 + α_U Z },

where ξ ∼ logistic(0, 1) and α_u = α(u, x).

SLIDE 40

The Logit representation

  • By McFadden (1974): let s_1, …, s_m be a fixed sequence and let the ε_j be independent Gumbel random variables. Then

P{ s_1 + ε_1 ≥ max_{2≤j≤m} (s_j + ε_j) } = e^{s_1} / Σ_{j=1}^{m} e^{s_j}.

Thus we can write g(x) as

g(x) = P{ α_U² + α_U Z_1 + ε_1 ≥ max_{2≤j≤M} (α_U Z_j + ε_j) }.
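A quick Monte Carlo check of McFadden's identity (sample size and scores are ours):

```python
import numpy as np

# With independent Gumbel noise, P{s_1 + e_1 >= max_j (s_j + e_j)} equals
# the softmax weight e^{s_1} / sum_j e^{s_j}.
rng = np.random.default_rng(2)
s = np.array([2.0, 1.0, 0.5, 0.0])
eps = rng.gumbel(size=(200000, s.size))
wins = ((s + eps).argmax(axis=1) == 0).mean()
softmax_1 = np.exp(s[0]) / np.exp(s).sum()
print(wins, softmax_1)   # the two should agree to Monte Carlo accuracy
```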
SLIDE 41

Extreme value representation of the update function

  • Using the logit representation, an approximation of the update function:

g(x) = P{ V_1 ≤ α_U }, where V_1 = max_{2≤j≤M} [ −(Z_1 − Z_j)/2 + √( ( ε_j − ε_1 + (Z_1 − Z_j)²/4 )_+ ) ].

  • For the lower bound:

g(x) ≥ P{ V_2 ≤ α_U }, where V_2 = −Z_1 + √( ( τ² + 2ξ + Z_1² )_+ ).

SLIDE 42

Analysis of the update function

  • x* solves g(x) = x, and yields mistake rate 1 − x*.
  • Communication rate R = C/(1 + r/τ²), with r = E[(V_+² − τ²) 1_B]; the mistake rate is 1 − x* = (1/snr)(r/τ²).
  • Here r grows no faster than order τ.
  • B = { α(1, x*) ≤ V ≤ α(0, x*) }.
SLIDE 43

Summary

u: input bits (length K) → Sparse Superposition Encoder → Gaussian channel with noise ε → Y: received (length n) → Adaptive Successive Decoder → û.

Reliable for rates R < C for the adaptive successive decoder:

  • with thresholding (J&B 2010b ISIT, 2012b)
  • with iteratively optimal soft decisions (shown here)
SLIDES 44–46

Update functions

Figure: comparison of update functions. The soft-decision g(x) and its lower bound, against {0, 1} threshold decisions with threshold √(2 log M) + a, for a = 0 and a = 0.5 (blue and light blue). Setting: M = 2⁹, L = M, snr = 7, C = 1.5 bits, R = 1.2 bits (0.8C). A zoomed view over x ∈ [0.80, 1.00] shows the behavior near the fixed point.

SLIDE 47

Iteratively Bayes optimal coefficient estimates

With prior j_ℓ ∼ Unif on sec_ℓ, the Bayes estimate based on stat_k satisfies

β̂_{k+1} = E[β | stat_k] ≅ E[β | stat_k, stat_{k−1}, …, stat_1] ≅ E[β | 𝓕_k],

where 𝓕_k = { standardized inner products of the columns of X with Y and with the components of the fits F_1, …, F_k } and F_k = Xβ̂_k.

SLIDE 48

Fraction of Mistakes

Translating the power-weighted value of (1 − ŵ_{k,j_ℓ}) into the fraction of occurrences of {1 − ŵ_{k,j_ℓ} ≥ 1/2}:

δ_mis ≤ (snr/C)(1 − x*) = r/(Cτ²), at rate R = C/(1 + r/τ²).