Revisiting Zero-Rate Bounds on the Reliability Function of Discrete Memoryless Channels

Marco Bondaschi & Marco Dalai
Department of Information Engineering, University of Brescia, Italy
ISIT 2020
Setting

Code: a set of $M$ codewords $\mathcal{C} = \{x_1, x_2, \ldots, x_M\} \subset \mathcal{X}^n$.

Rate: $R = \frac{\log M}{n}$.

Decoding regions: $\mathcal{Y}_m = \{\, y \in \mathcal{Y}^n : m \in \mathcal{L}(y) \,\}$.

Probability of error:
$$P_{e,m} = \sum_{y \notin \mathcal{Y}_m} P_m(y), \qquad P_{e,\max} = \max_{m \in \mathcal{M}} P_{e,m}$$

$L$-list reliability function:
$$E_L(R) = \lim_{n \to \infty} \frac{-\log P_{e,\max}}{n}, \qquad P_{e,\max} \approx e^{-n\, E_L(R)}$$
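The definitions above can be checked numerically. A minimal sketch (not from the talk; the channel, code, and list size are toy assumptions): exact computation of $P_{e,m}$ and $P_{e,\max}$ for a 3-codeword code on a binary symmetric channel under optimal 1-list (maximum-likelihood) decoding, with ties broken toward the lower message index.

```python
import itertools

# Toy assumptions: BSC with crossover 0.1, M = 3 codewords of length n = 4,
# list size L = 1, ML list decoding (ties broken toward the lower index).
p = 0.1
code = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1)]
L = 1
n = len(code[0])

def likelihood(y, x):
    """P_m(y) for a BSC: p^d (1-p)^(n-d), d = Hamming distance."""
    d = sum(a != b for a, b in zip(y, x))
    return p ** d * (1 - p) ** (n - d)

# Y_m = {y : m in L(y)};  P_{e,m} = sum over y outside Y_m of P_m(y).
pe = [0.0] * len(code)
for y in itertools.product([0, 1], repeat=n):
    liks = [likelihood(y, x) for x in code]
    decoded_list = sorted(range(len(code)), key=lambda m: -liks[m])[:L]
    for m in range(len(code)):
        if m not in decoded_list:
            pe[m] += liks[m]

pe_max = max(pe)
print(pe, pe_max)
```

With the convention above, $-\log P_{e,\max} / n$ then gives a finite-$n$ estimate of this particular code's error exponent.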
Reliability Function
Outline of the Proof

1. Lower-bound $P_{e,\max}$ for codes with $L + 1$ codewords.
   Berlekamp and Blinovsky's approach: study of a gradient on the boundary of the probability simplex → cumbersome for $L > 1$.
   Our approach: method of types + a trick by Shayevitz → straightforward for $L > 1$.
2. For $M \ge L + 1$ codewords, $P_{e,\max}$ is lower-bounded by the largest bound over all subsets of $L + 1$ codewords.
3. Upper-bound the smallest error exponent with the average over all $(L+1)$-subcodes.
4. Bound the average over a carefully selected subcode.
   Berlekamp and Blinovsky's approach: selection of an ordered subcode + complex iterative concatenation of codewords → cumbersome for $L > 1$.
   Our approach: selection of a subcode using Ramsey theory + a theorem by Komlós (Blinovsky's idea for $L = 1$) → straightforward for $L > 1$.
5. Show that for the subcode $E_L(0) = E_{L,\mathrm{ex}}(0)$.
1. Probability of Error for L + 1 Codewords

For any vector $x = (x_1, \ldots, x_{L+1}) \in \mathcal{X}^{L+1}$, let $q(x)$ be the fraction of positions in which the code has $x$ as a column.

Fundamental concave function: for any probability vector $\alpha$,
$$\mu(\alpha) = \sum_{x \in \mathcal{X}^{L+1}} q(x)\, \mu_x(\alpha), \qquad \mu_x(\alpha) = -\log \sum_{y \in \mathcal{Y}} P(y|x_1)^{\alpha_1} \cdots P(y|x_{L+1})^{\alpha_{L+1}}$$

Objective: prove that
$$P_{e,\max} \ge e^{-n D_M}, \qquad D_M = \max_{\alpha} \mu(\alpha)$$
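For a concrete feel, a sketch with toy assumptions (BSC, $L = 1$, a codeword pair differing in every position so that $q((0,1)) = 1$) that evaluates $\mu_x(\alpha)$ and locates $D_M = \max_\alpha \mu(\alpha)$ by grid search; for this symmetric channel the maximum falls at $\alpha = (1/2, 1/2)$, where $\mu_x$ is the Bhattacharyya distance.

```python
import math

# Toy assumptions: BSC with crossover p, L = 1, q((0, 1)) = 1.
p = 0.1
W = [[1 - p, p], [p, 1 - p]]  # W[x][y] = P(y|x)

def mu_x(x, alpha):
    # mu_x(alpha) = -log sum_y P(y|x_1)^a_1 ... P(y|x_{L+1})^a_{L+1}
    return -math.log(sum(
        math.prod(W[xi][y] ** a for xi, a in zip(x, alpha))
        for y in range(2)))

# D_M = max over probability vectors alpha; grid search over alpha = (a, 1-a).
grid = [i / 1000 for i in range(1, 1000)]
best_a, D_M = max(((a, mu_x((0, 1), (a, 1 - a))) for a in grid),
                  key=lambda t: t[1])
print(best_a, D_M)
```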
Berlekamp & Blinovsky's approach: an auxiliary distribution on $\mathcal{Y}^n$ that depends on $\nabla \mu(\alpha)$. It requires a careful analysis of the behavior of $\nabla \mu(\alpha)$ when $\mu(\alpha)$ is maximized on the boundary of the probability simplex $\{\alpha\}$ → easy for $L = 1$, complicated for $L > 1$.

Our approach: method of types + a result by Shayevitz → much more straightforward to generalize to $L > 1$.
Case $L = 1$: two messages $\mathcal{M} = \{1, 2\}$.

Method of types: output sequences $y$ with the same conditional type $V$ given $(x_1, x_2)$ have the same conditional probabilities $P_1(y)$ and $P_2(y)$.

Decoding regions in terms of conditional types:
$$\mathcal{Y}_1 = \{\, y : P_1(y) > P_2(y) \,\} \iff \mathcal{T}_1 = \{\, V : D(V\|P_1) < D(V\|P_2) \,\}$$
where
$$D(V\|P_1) = \sum_{a \in \mathcal{X}} \sum_{b \in \mathcal{X}} q(a,b)\, D\big(V(\cdot\,|a,b)\,\big\|\,P(\cdot\,|a)\big)$$
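The weighted conditional divergence above is easy to compute directly. A sketch with hypothetical numbers (BSC rows, a codeword pair differing in every position):

```python
import math

def kl(v, w):
    """D(v || w) in nats."""
    return sum(vi * math.log(vi / wi) for vi, wi in zip(v, w) if vi > 0)

# Hypothetical numbers: BSC rows and the joint composition q(a, b)
# of the codeword pair (x_1, x_2).
p = 0.1
P = {0: (1 - p, p), 1: (p, 1 - p)}      # P[a] = P(.|a)
q = {(0, 1): 0.5, (1, 0): 0.5}          # codewords differ in every position

def D_cond(V, m):
    # D(V || P_m) = sum_{a,b} q(a,b) D( V(.|a,b) || P(.|x_m) ),
    # where x_1 = a and x_2 = b, so the reference row is P[ab[m]].
    return sum(q[ab] * kl(V[ab], P[ab[m]]) for ab in q)

# Sanity check: if V(.|a,b) = P(.|a), the divergence from P_1 vanishes
# while the divergence from P_2 stays strictly positive.
V1 = {ab: P[ab[0]] for ab in q}
print(D_cond(V1, 0), D_cond(V1, 1))
```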
Binary hypothesis testing (Cover & Thomas).
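The standard fact from binary hypothesis testing (Cover & Thomas, Ch. 11) used here is that the best joint error exponent is the Chernoff information $C(P_1, P_2) = \max_\lambda -\log \sum_y P_1(y)^\lambda P_2(y)^{1-\lambda}$, which also equals $\min_Q \max\{D(Q\|P_1), D(Q\|P_2)\}$. A numerical sketch with hypothetical distributions:

```python
import math

def kl(v, w):
    """D(v || w) in nats."""
    return sum(vi * math.log(vi / wi) for vi, wi in zip(v, w) if vi > 0)

P1, P2 = (0.9, 0.1), (0.2, 0.8)  # hypothetical binary distributions

# Chernoff information via the tilted-moment form.
lam_grid = [i / 2000 for i in range(1, 2000)]
C = max(-math.log(sum(a ** l * b ** (1 - l) for a, b in zip(P1, P2)))
        for l in lam_grid)

# Equivalent divergence form: min over Q of the larger of the two divergences.
q_grid = [i / 2000 for i in range(1, 2000)]
C_div = min(max(kl((u, 1 - u), P1), kl((u, 1 - u), P2)) for u in q_grid)
print(C, C_div)
```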
Case $L > 1$: the same approach with $L + 1$ messages $\mathcal{M} = \{1, 2, \ldots, L+1\}$; one message is left out from each list.

Decoding regions:
$$\mathcal{Y}_m = \{\, y : P_m(y) > P_i(y) \text{ for some } i \,\}, \qquad \mathcal{T}_m = \{\, V : D(V\|P_m) < D(V\|P_i) \text{ for some } i \,\}$$
where
$$D(V\|P_m) = \sum_{x \in \mathcal{X}^{L+1}} q(x)\, D\big(V(\cdot\,|x)\,\big\|\,P(\cdot\,|x_m)\big)$$
When $n \to \infty$, the same dominant exponent appears in all regions:
$$D_M = \min_{Q \notin \mathcal{T}_1} D(Q\|P_1) = \cdots = \min_{Q \notin \mathcal{T}_{L+1}} D(Q\|P_{L+1}), \qquad P_{e,\max} \ge e^{-n\,(D_M + o(1))}$$

An alternative expression for $\mu(\alpha)$ by Shayevitz (2010) + the minimax theorem then give
$$D_M = \max_{\alpha} \mu(\alpha)$$
2. Probability of Error for M ≥ L + 1 Codewords

For a code $\mathcal{C}$ with $M \ge L + 1$ messages $\mathcal{M} = \{1, \ldots, M\}$:

Pick the $(L+1)$-subcode $\hat{\mathcal{C}} \subset \mathcal{C}$ with the smallest error exponent:
$$D_{\min}(\mathcal{C}) = \min_{\hat{\mathcal{C}} \subset \mathcal{C}}\, \max_{\alpha}\, \mu_{\hat{\mathcal{C}}}(\alpha)$$

Pick the message $\hat{m}$ with the maximal probability of error for the code $\hat{\mathcal{C}}$ alone:
$$P_{e,\hat{m}}(\hat{\mathcal{C}}) = P_{e,\max}(\hat{\mathcal{C}}) \ge e^{-n\,(D_{\min}(\mathcal{C}) + o(1))}$$

When the whole code $\mathcal{C}$ is considered, $P_{e,\hat{m}}$ can only increase:
$$P_{e,\max}(\mathcal{C}) \ge P_{e,\hat{m}}(\mathcal{C}) \ge e^{-n\,(D_{\min}(\mathcal{C}) + o(1))}$$
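The "can only increase" step can be checked directly: enlarging the code can only shrink each decoding region. A sketch with toy numbers (BSC, $L = 1$, ML list decoding with ties broken toward the lower index):

```python
import itertools

# Toy assumptions: BSC crossover 0.1, L = 1, ML list decoding.
p = 0.1
full = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1)]
sub = full[:2]                     # an (L+1)-subcode of the full code
n, L = 4, 1

def pe_message(code, m):
    """Exact P_{e,m} under optimal L-list decoding (ties -> lower index)."""
    def lik(y, x):
        d = sum(a != b for a, b in zip(y, x))
        return p ** d * (1 - p) ** (n - d)
    total = 0.0
    for y in itertools.product([0, 1], repeat=n):
        liks = [lik(y, x) for x in code]
        decoded = sorted(range(len(code)), key=lambda i: -liks[i])[:L]
        if m not in decoded:
            total += liks[m]
    return total

# For each shared message, the error probability in the full code is at
# least the one computed in the subcode alone.
pe_sub = [pe_message(sub, m) for m in range(len(sub))]
pe_full = [pe_message(full, m) for m in range(len(sub))]
print(pe_sub, pe_full)
```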
3. Averaging the Error Exponents

$$D_{\min}(\mathcal{C}) = \min_{\hat{\mathcal{C}} \subset \mathcal{C}}\, \max_{\alpha}\, \mu_{\hat{\mathcal{C}}}(\alpha) = \min_{\hat{\mathcal{C}} \subset \mathcal{C}}\, \max_{\alpha} \sum_{x \in \mathcal{X}^{L+1}} q_{\hat{\mathcal{C}}}(x)\, \mu_x(\alpha)$$

We need an upper bound on $D_{\min}(\mathcal{C})$ valid for all codes $\mathcal{C}$.

Standard idea: bound the minimum error exponent with the average over all $(L+1)$-subcodes $\hat{\mathcal{C}}$:
$$D_{\min}(\mathcal{C}) = \min_{\hat{\mathcal{C}} \subset \mathcal{C}}\, \max_{\alpha}\, \mu_{\hat{\mathcal{C}}}(\alpha) \le \mathbb{E}\Big[ \max_{\alpha}\, \mu_{\hat{\mathcal{C}}}(\alpha) \Big]$$
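The min ≤ average step is elementary but worth seeing with numbers. A sketch (toy assumptions: BSC, $L = 1$; for a BSC pair, $\max_\alpha \mu_{\hat{\mathcal{C}}}(\alpha)$ is attained at $\alpha = (1/2, 1/2)$ and reduces to the normalized Hamming distance times the per-position Bhattacharyya distance):

```python
import itertools
import math

# Toy assumptions: BSC crossover 0.1, L = 1, four codewords of length 6.
p = 0.1
dB = -math.log(2 * math.sqrt(p * (1 - p)))  # per-position Bhattacharyya dist.
code = [(0, 0, 0, 0, 0, 0), (1, 1, 1, 1, 1, 1),
        (0, 0, 0, 1, 1, 1), (1, 1, 0, 0, 1, 0)]
L = 1

def exponent(sub):
    # For a BSC pair: max_alpha mu(alpha) = (d_H / n) * dB at alpha = (1/2, 1/2).
    x1, x2 = sub
    d = sum(a != b for a, b in zip(x1, x2))
    return d / len(x1) * dB

exps = [exponent(sub) for sub in itertools.combinations(code, L + 1)]
d_min = min(exps)               # D_min(C)
avg = sum(exps) / len(exps)     # average over all (L+1)-subcodes
print(d_min, avg)
```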
Explicitly,
$$D_{\min}(\mathcal{C}) \le \frac{1}{(M - L)^{L+1}} \sum_{\hat{\mathcal{C}} \subset \mathcal{C}}\, \max_{\alpha} \sum_{x \in \mathcal{X}^{L+1}} q_{\hat{\mathcal{C}}}(x)\, \mu_x(\alpha)$$

Problem: the maximum sits inside the sum! Different subcodes $\hat{\mathcal{C}}$ can have very different maximizing $\alpha$.

Idea: average only over a subcode $\mathcal{C}' \subset \mathcal{C}$ for which all $\hat{\mathcal{C}}$ have the same maximizing $\alpha$:
$$D_{\min}(\mathcal{C}) = \min_{\hat{\mathcal{C}} \subset \mathcal{C}}\, \max_{\alpha}\, \mu_{\hat{\mathcal{C}}}(\alpha) \le \min_{\hat{\mathcal{C}} \subset \mathcal{C}'}\, \max_{\alpha}\, \mu_{\hat{\mathcal{C}}}(\alpha) = D_{\min}(\mathcal{C}')$$
Berlekamp & Blinovsky: ordered subcode + iterative concatenation of codewords.

Ramsey theory (Blinovsky for $L = 1$): symmetric subcode (Ramsey + Komlós).
4. Extraction of Symmetric Subcode

What kind of subcode do we need? A subcode $\mathcal{C}'$ such that for all $(L+1)$-subcodes $\hat{\mathcal{C}} \subset \mathcal{C}'$,
$$q_{\hat{\mathcal{C}}}(x) \simeq q_{\hat{\mathcal{C}}}(x') \quad \forall\, x' \in S(x),$$
where $S(x)$ is the set of permutations of $x$.

If this property is satisfied, then $\mu_{\hat{\mathcal{C}}}(\alpha) = \sum_x q_{\hat{\mathcal{C}}}(x)\, \mu_x(\alpha)$ is invariant to permutations of $\alpha$, so $\mu_{\hat{\mathcal{C}}}(\alpha)$ is maximized at $\tilde{\alpha} = \big( \tfrac{1}{L+1}, \ldots, \tfrac{1}{L+1} \big)$.
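The symmetry argument can be checked numerically. A sketch (toy assumptions: BSC rows, $L = 2$, composition $q$ uniform over the permutations of $(0, 0, 1)$): since $q$ is permutation-symmetric, $\mu$ is invariant under permutations of $\alpha$, and by concavity its maximum over the simplex is attained at the uniform point $\tilde\alpha$.

```python
import itertools
import math

# Toy assumptions: BSC rows, L = 2, q uniform over permutations of (0, 0, 1).
p = 0.1
W = [[1 - p, p], [p, 1 - p]]
support = sorted(set(itertools.permutations((0, 0, 1))))
q = {x: 1 / len(support) for x in support}

def mu(alpha):
    total = 0.0
    for x, qx in q.items():
        s = sum(math.prod(W[xi][y] ** a for xi, a in zip(x, alpha))
                for y in range(2))
        total += -qx * math.log(s)
    return total

uniform = (1 / 3, 1 / 3, 1 / 3)
# Grid over the interior of the 2-simplex (includes the uniform point).
grid = [(i / 60, j / 60, (60 - i - j) / 60)
        for i in range(1, 60) for j in range(1, 60 - i)]
grid_max = max(mu(a) for a in grid)
print(mu(uniform), grid_max)
```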
Generalization of Komlós (1990): a subcode $\mathcal{C}'$ of size $M'$.
Ramsey's theorem for graphs also holds when the edges consist of $L + 1$ vertices (hypergraphs).

Color each edge $\hat{\mathcal{C}}$ with the vector $\{q_{\hat{\mathcal{C}}}(x)\}$ (quantized). A monochromatic subgraph $\mathcal{C}'$ then has the same $q_{\hat{\mathcal{C}}}(x)$ for all $\hat{\mathcal{C}} \subset \mathcal{C}'$, hence $q_{\hat{\mathcal{C}}}(x) \simeq q_{\hat{\mathcal{C}}}(x')$ for all $x' \in S(x)$, and therefore
$$\max_{\alpha}\, \mu_{\hat{\mathcal{C}}}(\alpha) = \mu_{\hat{\mathcal{C}}}(\tilde{\alpha})$$
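For the graph case ($L = 1$) the extraction step can be brute-forced. A sketch with toy codewords: each pair is colored by a quantized statistic of its joint composition (here simply the binned Hamming-distance fraction, an illustrative stand-in); with two colors and $R(3,3) = 6$ vertices, Ramsey's theorem guarantees a monochromatic triangle.

```python
import itertools

# Toy codewords; the color is a quantized statistic of the pair's joint
# composition (here: Hamming-distance fraction binned into two classes).
code = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1),
        (1, 0, 1, 0), (0, 0, 1, 1), (1, 1, 0, 0)]
n = len(code[0])

def color(c1, c2):
    d = sum(a != b for a, b in zip(c1, c2)) / n
    return 0 if d <= 0.5 else 1

# Ramsey's theorem (R(3,3) = 6): some triangle must be monochromatic.
mono = None
for triple in itertools.combinations(code, 3):
    colors = {color(a, b) for a, b in itertools.combinations(triple, 2)}
    if len(colors) == 1:
        mono = triple
        break
print(mono)
```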
5. Upper Bound on E_L(0)

$$D_{\min}(\mathcal{C}) \le \frac{1}{(M' - L)^{L+1}} \sum_{\hat{\mathcal{C}} \subset \mathcal{C}'}\, \sum_{x} q_{\hat{\mathcal{C}}}(x)\, \mu_x(\tilde{\alpha}) + o(1)$$

Applying the Plotkin double-counting trick,
$$D_{\min}(\mathcal{C}) \le \left( \frac{M'}{M' - L} \right)^{L+1} \frac{1}{n} \sum_{c=1}^{n} \sum_{x} \frac{M'_c(x_1)}{M'} \cdots \frac{M'_c(x_{L+1})}{M'}\, \mu_x(\tilde{\alpha}) + o(1)$$
where $M'_c(x)$ is the number of times symbol $x$ occurs in column $c$.
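The double-counting step replaces a sum over distinct codeword tuples by products of column counts. A sketch with toy numbers verifying, for one column, that the count over ordered tuples of *distinct* codewords never exceeds the product $M'_c(x_1) \cdots M'_c(x_{L+1})$ obtained by allowing repetitions:

```python
import itertools

# Toy numbers: one column of a 7-codeword binary code, L = 1.
L = 1
column = [0, 0, 1, 1, 1, 0, 1]
Mp = len(column)
counts = {s: column.count(s) for s in (0, 1)}

rows = []
for x in itertools.product([0, 1], repeat=L + 1):
    # Ordered tuples of distinct codewords showing pattern x in this column.
    distinct = sum(
        all(column[i] == xi for i, xi in zip(tup, x))
        for tup in itertools.permutations(range(Mp), L + 1))
    # Allowing repetitions factorizes into the product of column counts.
    with_rep = 1
    for xi in x:
        with_rep *= counts[xi]
    rows.append((x, distinct, with_rep))
print(rows)
```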
Since each $M'_c(\cdot)/M'$ is a probability distribution on $\mathcal{X}$,
$$D_{\min}(\mathcal{C}) \le \left( \frac{M'}{M' - L} \right)^{L+1} \max_{Q \in \mathcal{P}(\mathcal{X})} \sum_{x} Q(x_1) \cdots Q(x_{L+1})\, \mu_x(\tilde{\alpha}) + o(1)$$

Letting $M' \to \infty$, the right-hand side tends to $E_{L,\mathrm{ex}}(0)$.
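As a closing sanity check, the final maximization can be evaluated in the simplest case. A sketch (toy assumptions: BSC, $L = 1$, where $\mu_x(\tilde\alpha)$ is the Bhattacharyya distance, zero on the diagonal): for this toy objective the optimal $Q$ is uniform and the maximum equals $d_B / 2$.

```python
import math

# Toy assumptions: BSC crossover 0.1, L = 1; mu_x(alpha~) is the
# Bhattacharyya distance, which vanishes when x_1 = x_2.
p = 0.1
dB = -math.log(2 * math.sqrt(p * (1 - p)))
mu_tilde = {(a, b): (dB if a != b else 0.0) for a in (0, 1) for b in (0, 1)}

def objective(q0):
    Q = (q0, 1 - q0)
    return sum(Q[a] * Q[b] * mu_tilde[(a, b)]
               for a in (0, 1) for b in (0, 1))

# Grid search over Q = (q0, 1 - q0).
best_q, best_val = max(((i / 1000, objective(i / 1000)) for i in range(1001)),
                       key=lambda t: t[1])
print(best_q, best_val)
```

The optimizer lands on the uniform distribution, consistent with the symmetry of the channel.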