SLIDE 1

Revisiting Zero-Rate Bounds on the Reliability Function of Discrete Memoryless Channels

Marco Bondaschi & Marco Dalai
Department of Information Engineering, University of Brescia, Italy

ISIT 2020

SLIDES 2–9

Setting

(No text beyond the slide title was extractable from these slides.)

SLIDES 10–14

Setting

Code: a set of $M$ codewords $\mathcal{C} = \{x_1, x_2, \ldots, x_M\} \subset \mathcal{X}^n$.

Rate: $R = \frac{\log M}{n}$.

Decoding regions (the decoder outputs a list $L(y)$ of at most $L$ messages):
$$\mathcal{Y}_m = \{\, y \in \mathcal{Y}^n : m \in L(y) \,\}.$$

Probability of error:
$$P_{e,m} = \sum_{y \notin \mathcal{Y}_m} P_m(y), \qquad P_{e,\max} = \max_{m \in \mathcal{M}} P_{e,m}.$$

$L$-list reliability function:
$$E_L(R) = \lim_{n \to \infty} \frac{-\log P_{e,\max}}{n}, \qquad P_{e,\max} \approx e^{-n E_L(R)}.$$

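To make the setting concrete, here is a minimal numerical sketch (my own illustration, not part of the talk): it computes the rate and the maximal error probability of a toy three-codeword code over a binary symmetric channel, using a maximum-likelihood list decoder. The channel matrix, the codewords, and the list size are arbitrary assumptions chosen only for the example.

```python
# Minimal sketch (toy assumptions): brute-force P_e,max for a small code over a DMC
# with list decoding of size L.
import itertools
import numpy as np

P = np.array([[0.9, 0.1],      # assumed P(y|x): binary symmetric channel, crossover 0.1
              [0.1, 0.9]])
code = [(0, 0, 0), (1, 1, 1), (0, 1, 1)]   # assumed M = 3 codewords of length n = 3
L = 1                                       # assumed list size

n, M = len(code[0]), len(code)
rate = np.log(M) / n

P_err = np.zeros(M)
for y in itertools.product(range(P.shape[1]), repeat=n):
    # likelihood of y under each message m
    lik = [np.prod([P[x_c, y_c] for x_c, y_c in zip(x, y)]) for x in code]
    # ML list decoder (one possible choice): the L messages with the largest likelihood
    decoded = set(np.argsort(lik)[-L:])
    for m in range(M):
        if m not in decoded:
            P_err[m] += lik[m]      # y falls outside Y_m, so add P_m(y)

print(f"R = {rate:.3f} nats, P_e,max = {P_err.max():.4f}")
```
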

SLIDES 15–19

Reliability Function

(No text beyond the slide title was extractable from these slides.)

SLIDES 20–23

Outline of the Proof

1. Lower-bound $P_{e,\max}$ for codes with $L + 1$ codewords.
   - Berlekamp and Blinovsky's approach: study of a gradient on the boundary of the probability simplex → cumbersome for $L > 1$.
   - Our approach: method of types + a trick by Shayevitz → straightforward for $L > 1$.

2. For $M \ge L + 1$ codewords, $P_{e,\max}$ is lower-bounded by the largest bound over all subsets of $L + 1$ codewords.

SLIDES 24–28

Outline of the Proof

3. Upper-bound the smallest error exponent by the average over all $(L + 1)$-subcodes.

4. Bound the average over a carefully selected subcode.
   - Berlekamp and Blinovsky's approach: selection of an ordered subcode + a complex iterative concatenation of codewords → cumbersome for $L > 1$.
   - Our approach: selection of a subcode using Ramsey theory + a theorem by Komlós (Blinovsky's idea for $L = 1$) → straightforward for $L > 1$.

5. Show that for the selected subcode $E_L(0) = E_{L,\mathrm{ex}}(0)$.

SLIDES 29–32

1. Probability of Error for L + 1 Codewords

For any vector $\mathbf{x} = (x_1, \ldots, x_{L+1}) \in \mathcal{X}^{L+1}$, let $q(\mathbf{x})$ be the fraction of columns of the code equal to $\mathbf{x}$.

Fundamental concave function: for any probability vector $\alpha$,
$$\mu(\alpha) = \sum_{\mathbf{x} \in \mathcal{X}^{L+1}} q(\mathbf{x})\, \mu_{\mathbf{x}}(\alpha), \qquad \mu_{\mathbf{x}}(\alpha) = -\log \sum_{y \in \mathcal{Y}} P(y|x_1)^{\alpha_1} \cdots P(y|x_{L+1})^{\alpha_{L+1}}.$$

Objective: prove that $P_{e,\max} \ge e^{-n D_{\mathcal{M}}}$, where $D_{\mathcal{M}} = \max_{\alpha} \mu(\alpha)$.
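As an illustration of the definition above (my own sketch, not from the talk), the following code evaluates $\mu(\alpha)$ for $L + 1 = 3$ toy codewords over an assumed binary symmetric channel and approximates $D_{\mathcal{M}} = \max_\alpha \mu(\alpha)$ by a crude grid search over the simplex.

```python
# Minimal sketch (toy assumptions): evaluate mu(alpha) and grid-search its maximum.
import numpy as np

P = np.array([[0.9, 0.1],
              [0.1, 0.9]])            # assumed P(y|x), binary symmetric channel
codewords = [(0, 0, 0, 1), (1, 1, 0, 1), (0, 1, 1, 1)]   # assumed L + 1 = 3 codewords, n = 4

n = len(codewords[0])
columns = list(zip(*codewords))       # each column is a vector x in X^{L+1}

def mu(alpha):
    """mu(alpha) = sum_x q(x) * (-log sum_y prod_i P(y|x_i)^alpha_i)."""
    total = 0.0
    for x in columns:                 # summing over columns weights each x by q(x)
        s = sum(np.prod([P[xi, y] ** ai for xi, ai in zip(x, alpha)])
                for y in range(P.shape[1]))
        total += -np.log(s)
    return total / n

# crude grid search over the probability simplex for D_M = max_alpha mu(alpha)
best = max((mu((a, b, 1 - a - b)) for a in np.linspace(0, 1, 51)
            for b in np.linspace(0, 1 - a, max(int(51 * (1 - a)), 1))), default=0.0)
print(f"D_M ≈ {best:.4f}, lower bound P_e,max ≳ exp(-n D_M) = {np.exp(-n * best):.4f}")
```
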

SLIDES 33–35

1. Probability of Error for L + 1 Codewords

Berlekamp & Blinovsky's approach: an auxiliary distribution on $\mathcal{Y}^n$ that depends on $\nabla\mu(\alpha)$. It requires a careful analysis of the behavior of $\nabla\mu(\alpha)$ when $\mu(\alpha)$ is maximized on the boundary of the probability simplex $\{\alpha\}$: easy for $L = 1$, complicated for $L > 1$.

Our approach: method of types + a result by Shayevitz, which is much more straightforward to generalize to $L > 1$.

SLIDES 36–38

1. Probability of Error for L + 1 Codewords

Case $L = 1$: two messages, $\mathcal{M} = \{1, 2\}$.

Method of types: output sequences $y$ with the same conditional type $V$ given $(x_1, x_2)$ have the same conditional probabilities $P_1(y)$ and $P_2(y)$.

Decoding regions expressed on conditional types:
$$\mathcal{Y}_1 = \{\, y : P_1(y) > P_2(y) \,\} \;\Longrightarrow\; \mathcal{T}_1 = \{\, V : D(V\|P_1) < D(V\|P_2) \,\},$$
where
$$D(V\|P_1) = \sum_{a \in \mathcal{X}} \sum_{b \in \mathcal{X}} q(a, b)\, D\big(V(\cdot\,|\,a, b)\,\big\|\, P(\cdot\,|\,a)\big).$$
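The type-class grouping can be checked numerically. The sketch below (my illustration; the BSC and the two codewords are assumptions) groups all output sequences by their conditional type given $(x_1, x_2)$ and verifies that the pair $(P_1(y), P_2(y))$ is constant within each class.

```python
# Toy sketch (assumed channel and codewords): outputs with the same conditional type
# given (x1, x2) have identical probabilities under both codewords.
import itertools
from collections import Counter, defaultdict
import numpy as np

P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
x1, x2 = (0, 0, 1, 1), (0, 1, 0, 1)
n = len(x1)

groups = defaultdict(list)
for y in itertools.product(range(2), repeat=n):
    # conditional type of y given (x1, x2): counts of triples (x1_c, x2_c, y_c)
    V = frozenset(Counter(zip(x1, x2, y)).items())
    p1 = np.prod([P[a, yc] for a, yc in zip(x1, y)])
    p2 = np.prod([P[b, yc] for b, yc in zip(x2, y)])
    groups[V].append((p1, p2))

for vals in groups.values():          # within a type class, (P1(y), P2(y)) is constant
    assert all(np.allclose(v, vals[0]) for v in vals)
print(f"{len(groups)} conditional type classes cover all {2 ** n} output sequences")
```
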
SLIDES 39–40

1. Probability of Error for L + 1 Codewords

Binary hypothesis testing (Cover & Thomas): by the standard type-based result, the probability that the conditional type falls outside $\mathcal{T}_m$ decays with exponent $\min_{Q \notin \mathcal{T}_m} D(Q\|P_m)$, which is used on the next slides. (The derivation shown on these slides was not extractable.)

SLIDES 41–44

1. Probability of Error for L + 1 Codewords

Case $L > 1$: the same approach with $L + 1$ messages, $\mathcal{M} = \{1, 2, \ldots, L+1\}$. One message is left out from each list.

Decoding regions and the corresponding type regions:
$$\mathcal{Y}_m = \{\, y : P_m(y) > P_i(y) \text{ for some } i \,\}, \qquad \mathcal{T}_m = \{\, V : D(V\|P_m) < D(V\|P_i) \text{ for some } i \,\},$$
where
$$D(V\|P_m) = \sum_{\mathbf{x} \in \mathcal{X}^{L+1}} q(\mathbf{x})\, D\big(V(\cdot\,|\,\mathbf{x})\,\big\|\, P(\cdot\,|\,x_m)\big).$$
SLIDES 45–46

1. Probability of Error for L + 1 Codewords

When $n \to \infty$, the same dominant exponent appears in all regions:
$$D_{\mathcal{M}} = \min_{Q \notin \mathcal{T}_1} D(Q\|P_1) = \cdots = \min_{Q \notin \mathcal{T}_{L+1}} D(Q\|P_{L+1}), \qquad P_{e,\max} \ge e^{-n (D_{\mathcal{M}} + o(1))}.$$

An alternative expression for $\mu(\alpha)$ by Shayevitz (2010) + the minimax theorem give
$$D_{\mathcal{M}} = \max_{\alpha} \mu(\alpha).$$

SLIDES 47–50

2. Probability of Error for M ≥ L + 1 Codewords

For a code $\mathcal{C}$ with $M \ge L + 1$ messages, $\mathcal{M} = \{1, \ldots, M\}$:

Pick the $(L+1)$-subcode $\hat{C} \subset \mathcal{C}$ with the smallest error exponent:
$$D_{\min}(\mathcal{C}) = \min_{C \subset \mathcal{C}} \max_{\alpha} \mu_C(\alpha).$$

Pick the message $\hat{m}$ with the maximal probability of error for the code $\hat{C}$ alone:
$$P_{e,\hat{m}}(\hat{C}) = P_{e,\max}(\hat{C}) \ge e^{-n (D_{\min}(\mathcal{C}) + o(1))}.$$

When the whole code $\mathcal{C}$ is considered, $P_{e,\hat{m}}$ can only increase:
$$P_{e,\max}(\mathcal{C}) \ge P_{e,\hat{m}}(\mathcal{C}) \ge e^{-n (D_{\min}(\mathcal{C}) + o(1))}.$$
SLIDES 51–53

3. Averaging the Error Exponents

$$D_{\min}(\mathcal{C}) = \min_{C \subset \mathcal{C}} \max_{\alpha} \mu_C(\alpha) = \min_{C \subset \mathcal{C}} \max_{\alpha} \sum_{\mathbf{x} \in \mathcal{X}^{L+1}} q_C(\mathbf{x})\, \mu_{\mathbf{x}}(\alpha).$$

We need an upper bound on $D_{\min}(\mathcal{C})$ valid for all codes $\mathcal{C}$.

Standard idea: bound the minimum error exponent by the average over all $(L+1)$-subcodes $C$:
$$D_{\min}(\mathcal{C}) = \min_{C \subset \mathcal{C}} \max_{\alpha} \mu_C(\alpha) \le \mathbb{E}\Big[\max_{\alpha} \mu_C(\alpha)\Big].$$
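A quick numerical illustration of this "minimum ≤ average" step (my own sketch, not from the talk; the channel and the code are toy assumptions, with $L = 1$): for every pair of codewords compute $\max_\alpha \mu_C(\alpha)$ by a grid search, then compare the smallest value with the average over pairs.

```python
# Toy sketch (assumed BSC and code): min over (L+1)-subcodes vs. average over subcodes.
import itertools
import numpy as np

P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
code = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 0, 1, 1), (0, 1, 0, 1)]
alphas = [(a, 1 - a) for a in np.linspace(0.01, 0.99, 99)]

def mu_pair(xi, xj, alpha):
    # mu_C(alpha) for the subcode C = {xi, xj}: average of mu_x(alpha) over columns
    vals = [-np.log(sum(P[a, y] ** alpha[0] * P[b, y] ** alpha[1]
                        for y in range(2)))
            for a, b in zip(xi, xj)]
    return float(np.mean(vals))

exponents = [max(mu_pair(xi, xj, al) for al in alphas)
             for xi, xj in itertools.combinations(code, 2)]
print(f"min over subcodes  = {min(exponents):.4f}")
print(f"mean over subcodes = {np.mean(exponents):.4f}   (min <= mean, as used in the proof)")
```
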
SLIDES 54–56

3. Averaging the Error Exponents

$$D_{\min}(\mathcal{C}) \le \left(\frac{1}{M - L}\right)^{L+1} \sum_{C \subset \mathcal{C}} \max_{\alpha} \sum_{\mathbf{x} \in \mathcal{X}^{L+1}} q_C(\mathbf{x})\, \mu_{\mathbf{x}}(\alpha).$$

Problem: the maximum sits inside the sum, and different subcodes $C$ can have very different maximizing $\alpha$.

Idea: average only over a subcode $\mathcal{C}' \subset \mathcal{C}$ for which all $C$ have the same maximizing $\alpha$:
$$D_{\min}(\mathcal{C}) = \min_{C \subset \mathcal{C}} \max_{\alpha} \mu_C(\alpha) \le \min_{C \subset \mathcal{C}'} \max_{\alpha} \mu_C(\alpha) = D_{\min}(\mathcal{C}').$$

SLIDES 57–59

3. Averaging the Error Exponents

Berlekamp & Blinovsky: ordered subcode + iterative concatenation of codewords.
This work: Ramsey theory (Blinovsky for $L = 1$) + symmetric subcode (Ramsey + Komlós).

SLIDES 60–63

4. Extraction of Symmetric Subcode

What kind of subcode do we need? A subcode $\mathcal{C}'$ such that for all $(L+1)$-subcodes $C \subset \mathcal{C}'$
$$q_C(\mathbf{x}) \simeq q_C(\mathbf{x}') \quad \forall\, \mathbf{x}' \in S(\mathbf{x}),$$
where $S(\mathbf{x})$ is the set of permutations of $\mathbf{x}$.

If this property is satisfied, then $\mu_C(\alpha) = \sum_{\mathbf{x}} q_C(\mathbf{x})\, \mu_{\mathbf{x}}(\alpha)$ is invariant to permutations of $\alpha$; since $\mu_C$ is concave, it is therefore maximized at $\tilde{\alpha} = \left(\frac{1}{L+1}, \ldots, \frac{1}{L+1}\right)$.
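A small sanity check of this symmetry argument (illustration only; the channel and the symmetric composition $q$ are toy assumptions, with $L = 1$): if $q$ is invariant under swapping the two codeword roles, $\mu_C(\alpha)$ is symmetric in $\alpha$, and a grid search locates its maximum at the uniform point $\tilde{\alpha}$.

```python
# Toy sketch (assumed channel and symmetric composition q): mu_C is symmetric in alpha
# and maximized at the uniform point.
import numpy as np

P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
# assumed symmetric composition over X^{L+1} with L = 1: q(a, b) = q(b, a)
q = {(0, 0): 0.2, (1, 1): 0.2, (0, 1): 0.3, (1, 0): 0.3}

def mu(alpha):
    return sum(w * -np.log(sum(P[a, y] ** alpha[0] * P[b, y] ** alpha[1]
                               for y in range(2)))
               for (a, b), w in q.items())

grid = np.linspace(0.01, 0.99, 199)
best_a = max(grid, key=lambda a: mu((a, 1 - a)))
print(f"maximizer ≈ ({best_a:.2f}, {1 - best_a:.2f})   (uniform point is (0.50, 0.50))")
print(f"mu(0.3, 0.7) = {mu((0.3, 0.7)):.6f},  mu(0.7, 0.3) = {mu((0.7, 0.3)):.6f}")
```
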
SLIDES 64–70

4. Extraction of Symmetric Subcode

Generalization of Komlós (1990): extraction of a subcode $\mathcal{C}'$ of size $M'$.

(No further text was extractable from these slides.)

SLIDES 71–77

4. Extraction of Symmetric Subcode

Ramsey's theorem for graphs; the same holds when edges consist of $L + 1$ vertices (hypergraphs).

Color each edge $C$ with the vector $\{q_C(\mathbf{x})\}$ (quantized). A monochromatic subgraph $\mathcal{C}'$ then gives the same $q_C(\mathbf{x})$ for all $C \subset \mathcal{C}'$, hence
$$q_C(\mathbf{x}) \simeq q_C(\mathbf{x}') \quad \forall\, \mathbf{x}' \in S(\mathbf{x}),$$
and therefore
$$\max_{\alpha} \mu_C(\alpha) = \mu_C(\tilde{\alpha}).$$
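As a toy stand-in for the Ramsey argument (my illustration only; a brute-force search on an assumed five-codeword code with $L = 1$ and a coarse quantization, not the actual extraction procedure): color every pair of codewords by its quantized composition $q_C$ and look for a largest monochromatic subset, within which all pairs share the same composition.

```python
# Toy sketch (assumed code, L = 1): edge coloring by quantized q_C and brute-force
# search for a monochromatic subset of codewords.
import itertools
from collections import Counter

code = [(0, 0, 0, 0), (0, 0, 1, 1), (1, 1, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1)]
n = len(code[0])

def color(ci, cj, bins=4):
    # quantized joint composition q_C of the pair (ci, cj), used as the edge color
    counts = Counter(zip(ci, cj))
    return frozenset((x, round(bins * c / n)) for x, c in counts.items())

best = ()
for size in range(len(code), 1, -1):
    for subset in itertools.combinations(range(len(code)), size):
        colors = {color(code[i], code[j]) for i, j in itertools.combinations(subset, 2)}
        if len(colors) == 1:          # monochromatic: every pair has the same q_C
            best = subset
            break
    if best:
        break
print(f"monochromatic subset of codeword indices: {best}")
```
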

SLIDES 78–80

5. Upper Bound on E_L(0)

$$D_{\min}(\mathcal{C}) \le \left(\frac{1}{M' - L}\right)^{L+1} \sum_{C \subset \mathcal{C}'} \sum_{\mathbf{x}} q_C(\mathbf{x})\, \mu_{\mathbf{x}}(\tilde{\alpha}) + o(1).$$

Applying Plotkin's double-counting trick:
$$D_{\min}(\mathcal{C}) \le \left(\frac{M'}{M' - L}\right)^{L+1} \frac{1}{n} \sum_{c=1}^{n} \sum_{\mathbf{x}} \frac{M'_c(x_1)}{M'} \cdots \frac{M'_c(x_{L+1})}{M'}\, \mu_{\mathbf{x}}(\tilde{\alpha}) + o(1),$$
where $M'_c(x)$ is the number of times the symbol $x$ occurs in column $c$.
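The identity behind the double-counting step can be checked on a toy code (my illustration; the code is an arbitrary assumption, $L = 1$): averaging the pair composition $q_C(a, b)$ over all ordered pairs of codewords, repetitions included, equals $\frac{1}{n}\sum_c \frac{M'_c(a)}{M'}\frac{M'_c(b)}{M'}$. Bounding the sum over distinct pairs by the sum over all pairs and comparing the normalizations is, as I read the (garbled) slide, where the $\left(\frac{M'}{M'-L}\right)^{L+1}$ factor comes from.

```python
# Toy sketch (assumed code, L = 1): double-counting identity relating pair compositions
# to column compositions M'_c(x)/M'.
import itertools
from collections import Counter
import numpy as np

code = [(0, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 0), (0, 0, 0, 1)]
M, n = len(code), len(code[0])

# left side: average of q_{(xi, xj)}(a, b) over all ordered pairs (i, j), repetitions included
lhs = Counter()
for xi, xj in itertools.product(code, repeat=2):
    for a, b in zip(xi, xj):
        lhs[(a, b)] += 1 / (n * M ** 2)

# right side: (1/n) * sum_c (M_c(a)/M) * (M_c(b)/M)
rhs = Counter()
for col in zip(*code):
    cnt = Counter(col)
    for a in cnt:
        for b in cnt:
            rhs[(a, b)] += (cnt[a] / M) * (cnt[b] / M) / n

assert all(np.isclose(lhs[k], rhs[k]) for k in set(lhs) | set(rhs))
print("double-counting identity verified:", dict(rhs))
```
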

SLIDES 81–84

5. Upper Bound on E_L(0)

$$D_{\min}(\mathcal{C}) \le \left(\frac{M'}{M' - L}\right)^{L+1} \frac{1}{n} \sum_{c=1}^{n} \sum_{\mathbf{x}} \frac{M'_c(x_1)}{M'} \cdots \frac{M'_c(x_{L+1})}{M'}\, \mu_{\mathbf{x}}(\tilde{\alpha}) + o(1).$$

Since $M'_c(\cdot)/M'$ is a probability distribution on $\mathcal{X}$ for each column $c$,
$$D_{\min}(\mathcal{C}) \le \left(\frac{M'}{M' - L}\right)^{L+1} \max_{Q \in \mathcal{P}(\mathcal{X})} \sum_{\mathbf{x}} Q(x_1) \cdots Q(x_{L+1})\, \mu_{\mathbf{x}}(\tilde{\alpha}) + o(1).$$

Letting $M' \to \infty$, the right-hand side converges to $E_{\mathrm{ex}}(0)$.
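A closing numerical sketch (my illustration, not the authors' code): for $L = 1$ and an assumed binary symmetric channel, the limiting expression $\max_{Q} \sum_{\mathbf{x}} Q(x_1)\cdots Q(x_{L+1})\, \mu_{\mathbf{x}}(\tilde{\alpha})$ can be evaluated by a grid search over $Q$.

```python
# Toy sketch (assumed BSC, L = 1): evaluate max_Q sum_x Q(x1)...Q(x_{L+1}) mu_x(alpha_tilde).
import itertools
import numpy as np

P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
L = 1
alpha_tilde = np.full(L + 1, 1.0 / (L + 1))

def mu_x(x):
    # mu_x(alpha_tilde) = -log sum_y prod_i P(y|x_i)^{1/(L+1)}
    return -np.log(sum(np.prod([P[xi, a] for xi in [0]] and
                               [P[xi, y] ** a for xi, a in zip(x, alpha_tilde)])
                       for y in range(P.shape[1]))) if False else \
           -np.log(sum(np.prod([P[xi, y] ** a for xi, a in zip(x, alpha_tilde)])
                       for y in range(P.shape[1])))

def objective(Q):
    return sum(np.prod([Q[xi] for xi in x]) * mu_x(x)
               for x in itertools.product(range(len(Q)), repeat=L + 1))

best = max(objective(np.array([q, 1 - q])) for q in np.linspace(0, 1, 201))
print(f"upper bound on E_L(0) as M' -> infinity: {best:.4f}")
```
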