Error-correcting codes and Cryptography - Henk van Tilborg



SLIDE 1

Error-correcting codes and Cryptography

Henk van Tilborg

Code-based Cryptography Workshop Eindhoven, May 11-12, 2011

SLIDE 2

CONTENTS

I   Error-correcting codes; the basics
II  Quasi-cyclic codes; codes generated by circulants
III Cyclic codes
IV  The McEliece cryptosystem
V   Burst-correcting array codes

SLIDE 3

I Error-correcting codes; the basics

Sender --m--> Encode --c--> Channel (+ noise) --r--> Decode --m--> Receiver

• Error-correcting codes are (mostly) used to correct independent, random errors that occur during transmission of data or during storage of data, i.e. error patterns of the form:

(0, . . . , 0, 1, 0, . . . , 0, 1, 0, . . . , 0)
              ↑                ↑
              i                j

We shall also briefly discuss codes that correct bursts (clusters) of errors, i.e. error patterns of the form:

(0, . . . , 0, 1, ∗, . . . , ∗, 1, 0, . . . , 0)
              ↑                ↑
              i              i+b−1

SLIDE 4

m0  → c0  = 0 0 0 0 0 0 0
m1  → c1  = 0 0 0 1 1 1 1
m2  → c2  = 0 0 1 0 0 1 1
m3  → c3  = 0 0 1 1 1 0 0
m4  → c4  = 0 1 0 0 1 0 1
m5  → c5  = 0 1 0 1 0 1 0
m6  → c6  = 0 1 1 0 1 1 0
m7  → c7  = 0 1 1 1 0 0 1
m8  → c8  = 1 0 0 0 1 1 0
m9  → c9  = 1 0 0 1 0 0 1
m10 → c10 = 1 0 1 0 1 0 1
m11 → c11 = 1 0 1 1 0 1 0
m12 → c12 = 1 1 0 0 0 1 1
m13 → c13 = 1 1 0 1 1 0 0
m14 → c14 = 1 1 1 0 0 0 0
m15 → c15 = 1 1 1 1 1 1 1

16 codewords of length 7

SLIDE 5

A code C is such a (well-chosen) subset of {0, 1}^n. So codes here will be binary codes; the generalization to other field sizes is easy. The weight of a word is the number of non-zero coordinates. Example: a code C of length 5 with the following four codewords:

c0 = 0 0 0 0 0
c1 = 0 0 1 1 1
c2 = 1 1 0 0 1
c3 = 1 1 1 1 0

SLIDE 6

Suppose that each two codewords differ in at least d coordinates (have distance at least d) and put t = ⌊(d−1)/2⌋.

d = 3, t = 1:
c0 = 0 0 0 0 0
c1 = 0 0 1 1 1
c2 = 1 1 0 0 1
c3 = 1 1 1 1 0

Then the code C is said to be t-error-correcting, because if you transmit (or store) a codeword and not more than t errors have occurred upon reception (or read-out) due to noise or damage, then the received word will still be closer to the original codeword than to any other. For instance, if you receive

r = 0 1 0 0 1

you know that c2 is the most likely transmitted codeword.
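This nearest-codeword rule can be sketched in a few lines (an illustrative sketch; the function names are ours, not the slides'):

```python
# Minimal nearest-codeword decoding for the 4-word code of length 5 above.

codewords = {
    "c0": "00000",
    "c1": "00111",
    "c2": "11001",
    "c3": "11110",
}

def hamming(a, b):
    """Number of coordinates in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(r):
    """Return the label of the codeword closest to the received word r."""
    return min(codewords, key=lambda name: hamming(codewords[name], r))

print(decode("01001"))  # c2: the received word is at distance 1 from c2
```

With d = 3, any received word with at most t = 1 error has a unique closest codeword, which is what the minimisation finds.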

SLIDE 7

From now on codes will be linear, meaning that C is a linear subspace of {0, 1}^n. We use the notation [n, k, d] codes, where k denotes the dimension of the code C and d the so-called minimum distance of C: the minimum of all distances between codewords.

The quantity r = n − k is called the redundancy of the code. This is the number of additional coordinates (apart from the actual information being transmitted) that make error-correction possible. It follows from the linear structure of C that an appropriate choice of k codewords forms a basis for the code. A basis of the code C = {00000, 00111, 11001, 11110} is given by the rows of

0 0 1 1 1
1 1 0 0 1
SLIDE 8

A basis of the linear (!) [7, 4, 3] code introduced before is given by c1, c2, c4, c8:

c0  = 0 0 0 0 0 0 0
c1  = 0 0 0 1 1 1 1
c2  = 0 0 1 0 0 1 1
c3  = 0 0 1 1 1 0 0
c4  = 0 1 0 0 1 0 1
c5  = 0 1 0 1 0 1 0
c6  = 0 1 1 0 1 1 0
c7  = 0 1 1 1 0 0 1
c8  = 1 0 0 0 1 1 0
c9  = 1 0 0 1 0 0 1
c10 = 1 0 1 0 1 0 1
c11 = 1 0 1 1 0 1 0
c12 = 1 1 0 0 0 1 1
c13 = 1 1 0 1 1 0 0
c14 = 1 1 1 0 0 0 0
c15 = 1 1 1 1 1 1 1

SLIDE 9

A matrix G whose rows form a basis of an [n, k, d] code C is called a generator matrix G of C. Its size is k × n. The basis c1, c2, c4, c8 of the code on the previous page results in the generator matrix:

G = | 0 0 0 1 1 1 1 |
    | 0 0 1 0 0 1 1 |
    | 0 1 0 0 1 0 1 |
    | 1 0 0 0 1 1 0 |

So, in general, a linear code C with k × n generator matrix G consists of all linear combinations of the rows of G:

C = {mG | m ∈ {0, 1}^k}
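Encoding is just this row combination over GF(2); a minimal sketch with the generator matrix above (the variable names are ours):

```python
# Encode all 16 messages m with the 4 x 7 generator matrix G over GF(2).
from itertools import product

G = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 1, 0],
]

def encode(m):
    """Compute c = mG over GF(2): XOR of the rows of G selected by m."""
    c = [0] * 7
    for bit, row in zip(m, G):
        if bit:
            c = [a ^ b for a, b in zip(c, row)]
    return c

codewords = {tuple(encode(m)) for m in product([0, 1], repeat=4)}
print(len(codewords))  # 16
```

Running over all m ∈ {0, 1}^4 reproduces exactly the 16 codewords of the table on SLIDE 4.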

SLIDE 10

If k is large compared to n, it is often advantageous to describe C as the null-space of an (n − k) × n matrix H, called a parity check matrix:

C = {x ∈ {0, 1}^n | Hx^T = 0^T}.

Typically, you transmit a codeword c and you receive r, which can be written as r = c ⊕ e, where e is called the error vector and is caused by the noise. The decoder cannot do better than look for the closest codeword to r, i.e. look for an e of lowest weight such that r − e ∈ C. Note that s^T := Hr^T = Hc^T ⊕ He^T = He^T. This value is called the syndrome of the received word. It depends only on the error vector.

SLIDE 11

Example: The matrix

H = | 0 0 0 1 1 1 1 |
    | 0 1 1 0 0 1 1 |
    | 1 0 1 0 1 0 1 |

is the parity check matrix of a linear code C = {x ∈ {0, 1}^7 | Hx^T = 0^T} of length 7 and dimension 4.

Moreover, this code can correct a single error (d = 3, t = 1). We give a decoding algorithm. Let r be a received word. Compute its syndrome s, i.e. compute s^T = Hr^T. If s^T = 0^T, then r ∈ C, so (most likely) no error occurred.

SLIDE 12

Example continued: Suppose you receive

r = 1 0 0 0 1 1 1

Its syndrome with

H = | 0 0 0 1 1 1 1 |
    | 0 1 1 0 0 1 1 |
    | 1 0 1 0 1 0 1 |

is s^T = (1, 0, 1)^T, which is the 5-th column of H. Note that e = 0 0 0 0 1 0 0 gives the same syndrome, so H(r^T − e^T) = 0^T. So, the most likely transmitted codeword is r − e, i.e.

c = 1 0 0 0 0 1 1
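The single-error decoder for this H can be sketched directly (a minimal illustration; the helper names are ours):

```python
# Single-error syndrome decoding with the 3 x 7 parity check matrix H above:
# the syndrome of a weight-1 error equals the corresponding column of H.

H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def syndrome(r):
    """s^T = H r^T over GF(2)."""
    return tuple(sum(h * x for h, x in zip(row, r)) % 2 for row in H)

def decode(r):
    """Flip the coordinate whose column of H equals the syndrome."""
    s = syndrome(r)
    c = list(r)
    if s != (0, 0, 0):
        cols = [tuple(row[j] for row in H) for j in range(7)]
        c[cols.index(s)] ^= 1
    return c

r = [1, 0, 0, 0, 1, 1, 1]
print(decode(r))  # [1, 0, 0, 0, 0, 1, 1]
```

For r = 1 0 0 0 1 1 1 the syndrome is (1, 0, 1)^T, the 5-th column, so the 5-th bit is flipped, exactly as on the slide.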

SLIDE 13

II Quasi-cyclic codes; codes generated by circulants

Consider the 15 × 15 circulant

U = | 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 |
    | 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 |
    | 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 |
    | 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 |
    | 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 |
    | 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 |
    | 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 |
    | 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 |
    | 1 1 0 0 0 0 0 0 1 0 0 0 1 0 1 |
    | 1 1 1 0 0 0 0 0 0 1 0 0 0 1 0 |
    | 0 1 1 1 0 0 0 0 0 0 1 0 0 0 1 |
    | 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0 |
    | 0 1 0 1 1 1 0 0 0 0 0 0 1 0 0 |
    | 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 |
    | 0 0 0 1 0 1 1 1 0 0 0 0 0 0 1 |

SLIDE 14

Note that in U, with rows u0, u1, u2, . . . ,

u0 = 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0
u4 = 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0
u6 = 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1
u7 = 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1

rows u0, u4, and u6 add up (modulo 2) to row u7. So, row u7 is a linear combination of the preceding rows. But then, because of the cyclic structure, row u8 is also a linear combination of the top 7 rows, etc.

We conclude that the rows of U generate a [15, 7] code.
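This conclusion can be checked mechanically; the sketch below (representations and names are ours) row-reduces the 15 cyclic shifts of u over GF(2):

```python
# Verify that the 15 cyclic shifts of u(x) = 1 + x^4 + x^6 + x^7 + x^8
# span a 7-dimensional space over GF(2). Rows are bit-masks with bit i
# holding the coefficient of x^i.

N = 15
u = (1 << 0) | (1 << 4) | (1 << 6) | (1 << 7) | (1 << 8)

def cyclic_shift(v):
    """Multiply by x modulo x^15 - 1: rotate the bit-mask left by one."""
    v <<= 1
    if v >> N:
        v ^= (1 << N) | 1  # wrap x^15 back to x^0
    return v

rows = []
v = u
for _ in range(N):
    rows.append(v)
    v = cyclic_shift(v)

def gf2_rank(vectors):
    """Gaussian elimination over GF(2) on bit-mask rows; count pivots."""
    rank = 0
    vecs = list(vectors)
    for bit in reversed(range(N)):
        pivot = None
        for idx, r in enumerate(vecs):
            if (r >> bit) & 1:
                pivot = vecs.pop(idx)
                break
        if pivot is None:
            continue
        vecs = [r ^ pivot if (r >> bit) & 1 else r for r in vecs]
        rank += 1
    return rank

print(gf2_rank(rows))  # 7
```

The rank 7 matches the dimension claimed on the slide.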

SLIDE 15

U = | 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 |   u(x)
    | 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 |   x·u(x)
    |              . . .             |   . . .
    | 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 |   x^6·u(x)
    | 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 |   x^7·u(x)
    |              . . .             |   . . .

Each row in U is a cyclic shift of the previous row. Define u(x) by the top row u0:

u(x) = Σ_{i=0}^{14} U_{0,i} x^i = 1 + x^4 + x^6 + x^7 + x^8.

Then x·u(x) corresponds to row u1, x^2·u(x) corresponds to row u2, etc., where these polynomials have to be taken modulo x^15 − 1. For example,

u6 corresponds to x^6·u(x) = x^6 + x^10 + x^12 + x^13 + x^14,
u7 corresponds to x^7·u(x) = x^7 + x^11 + x^13 + x^14 + 1.

SLIDE 16

The reason that U generates a [15, 7] code (2nd proof) is that:

1. u(x) has degree 8, so the first 7 rows of U are clearly linearly independent.
2. u(x) divides x^15 − 1.

Indeed x^15 − 1 = u(x)(1 + x^4 + x^6 + x^7), as one can easily check. So,

x^7·u(x) ≡ x^7·u(x) + (x^15 − 1) ≡ (x^7 + (1 + x^4 + x^6 + x^7)) u(x)
         ≡ (1 + x^4 + x^6) u(x) ≡ u(x) + x^4·u(x) + x^6·u(x)   (mod x^15 − 1).

This shows why rows u0, u4, u6 add up (modulo 2) to row u7. This argument holds in general when u(x) divides x^n − 1.
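The factorisation "as one can easily check" can indeed be checked with a carry-less multiply (bit-mask representation and names are ours):

```python
# Check x^15 - 1 = u(x) * (1 + x^4 + x^6 + x^7) over GF(2).
# Polynomials are bit-masks: bit i = coefficient of x^i.

def clmul(a, b):
    """Carry-less (GF(2)) polynomial multiplication of bit-masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

u = (1 << 0) | (1 << 4) | (1 << 6) | (1 << 7) | (1 << 8)  # 1+x^4+x^6+x^7+x^8
q = (1 << 0) | (1 << 4) | (1 << 6) | (1 << 7)             # 1+x^4+x^6+x^7
print(clmul(u, q) == (1 << 15) | 1)  # True: the product is x^15 + 1
```

Over GF(2), x^15 + 1 and x^15 − 1 are the same polynomial, so this is exactly the identity used on the slide.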

SLIDE 17

How about the rank of a code generated by a circulant U with top row u0, corresponding to a polynomial u(x) that does not divide x^n − 1?

U = | 1 0 0 1 1 1 0 |   u0
    | 0 1 0 0 1 1 1 |
    | 1 0 1 0 0 1 1 |
    | 1 1 0 1 0 0 1 |
    | 1 1 1 0 1 0 0 |
    | 0 1 1 1 0 1 0 |
    | 0 0 1 1 1 0 1 |

u(x) = 1 + x^3 + x^4 + x^5 does not divide x^7 − 1.

SLIDE 18

Let U be the circulant of u(x), where u(x) does not divide x^n − 1.

Define g(x) = gcd(u(x), x^n − 1) and use the extended version of Euclid's Algorithm to write:

g(x) = a(x)u(x) + b(x)(x^n − 1).

Then

g(x) ≡ (Σ_{i=0}^{n−1} a_i x^i) u(x) ≡ Σ_{i=0}^{n−1} a_i (x^i u(x))   (mod x^n − 1).

So,

g = Σ_{i=0}^{n−1} a_i u_i.

So, g is a linear combination of the rows of U.

SLIDE 19

The vector g (and each of its cyclic shifts) is a linear combination of the rows of U. Since g(x) = gcd(u(x), x^n − 1) divides u(x), we also know that u0 (and each of its shifts) is a linear combination of cyclic shifts of g. We conclude that G, the circulant with g as top row, generates the same code as U does:

the circulant U of u(x) and the circulant G of g(x) generate the same code.

But now g(x) divides x^n − 1, so the code generated by U has dimension

n − degree(g(x)).
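For the 7 × 7 example above this gives a concrete answer; a GF(2) gcd sketch (bit-mask representation and names are ours):

```python
# Compute g(x) = gcd(u(x), x^n - 1) over GF(2) for the 7 x 7 circulant
# example with u(x) = 1 + x^3 + x^4 + x^5, and the resulting dimension.

def gf2_mod(a, b):
    """Remainder of polynomial division a mod b over GF(2) (bit-masks)."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def gf2_gcd(a, b):
    while b:
        a, b = b, gf2_mod(a, b)
    return a

n = 7
u = 0b111001          # 1 + x^3 + x^4 + x^5
xn1 = (1 << n) | 1    # x^7 + 1, i.e. x^7 - 1 over GF(2)
g = gf2_gcd(xn1, u)
print(bin(g), n - (g.bit_length() - 1))  # 0b10111 3
```

So g(x) = 1 + x + x^2 + x^4, and the code generated by this circulant has dimension 7 − 4 = 3.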

SLIDE 20

How about a code that is the linear span of two (or more) circulants underneath each other?

| U |      U the circulant of u(x),
| V |      V the circulant of v(x)

Codewords are linear combinations of rows of U and V. Things are easy here:

1. Compute g(x) = gcd(u(x), v(x), x^n − 1).
2. The circulant with g as top row generates the same code.
3. This code has dimension n − degree(g(x)).

Indeed, writing g(x) = a(x)u(x) + b(x)v(x) (mod x^n − 1) gives g (and all its cyclic shifts) as a linear combination of the rows of U and V.

SLIDE 21

How about a code that is the linear span of two (or more) circulants next to each other?

( U1  U2  · · ·  Um )      with Ui the circulant of ui(x)

Some things are still easy here:

1. Compute g(x) = gcd(u1(x), u2(x), · · · , um(x), x^n − 1).
2. The code has dimension n − degree(g(x)).
SLIDE 22

How about a code that is the linear span of two (or more) rows of circulants next to each other, so-called quasi-cyclic codes?

| u1,1(x)  u1,2(x)  · · ·  u1,m(x) |
| u2,1(x)  u2,2(x)  · · ·  u2,m(x) |
|    .        .     .  .      .    |
| ul,1(x)  ul,2(x)  · · ·  ul,m(x) |

Things are difficult here. Little to nothing can be said about the rank or the minimum distance, let alone decoding. See the Ph.D. thesis: Kristine Lally, Application of the theory of Gröbner bases to the study of quasi-cyclic codes, National University of Ireland, Cork, 6-15-2000, especially for the case of a single row of circulants.

SLIDE 23

III Cyclic codes

The codes generated by a column of circulants are commonly called cyclic codes.

| U |
| V |   ⇄   G, the circulant with top row g, where g(x) := gcd(u(x), v(x), · · · , x^n − 1)
| ⋮ |

Only the top n − degree(g(x)) rows of G are needed; the remaining rows are commonly left out. The real question is how to select a divisor g(x) of x^n − 1 such that the code generated by it has good properties:

1. large minimum distance,
2. easy error-correction.
SLIDE 24

Consider the irreducible polynomial f(x) = 1 + x + x^4 and let α be a zero of f(x) in some extension field of GF(2) = {0, 1}. So 1 + α + α^4 = 0.

Then α can be assumed to be in GF(2^4), with as elements all binary polynomials in α of degree less than 4:

GF(2^4) = { Σ_{i=0}^{3} a_i α^i  |  a_i ∈ {0, 1}, 0 ≤ i ≤ 3 }.

Arithmetic is modulo 2 and modulo 1 + α + α^4. For instance:

(1 + α^2) + (1 + α^3) = α^2 + α^3
(1 + α^2)(1 + α^3) = 1 + α^2 + α^3 + α^5
                   = 1 + α^2 + α^3 + α^5 + α(1 + α + α^4)
                   = 1 + α + α^3
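This arithmetic is easy to sketch in code, representing field elements as 4-bit masks (bit i = coefficient of α^i; the names are ours):

```python
# GF(2^4) multiplication modulo f(x) = 1 + x + x^4.

F = 0b10011  # the modulus 1 + x + x^4 as a bit-mask

def gf16_mul(a, b):
    """Carry-less multiply, then reduce modulo 1 + x + x^4."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    for bit in range(prod.bit_length() - 1, 3, -1):
        if (prod >> bit) & 1:
            prod ^= F << (bit - 4)
    return prod

# (1 + alpha^2)(1 + alpha^3) = 1 + alpha + alpha^3, as on the slide:
print(bin(gf16_mul(0b0101, 0b1001)))  # 0b1011
```

The reduction loop replaces each α^4 by 1 + α, exactly the substitution done by hand on the slide.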

SLIDE 25

But α has the additional property of being primitive: α generates GF(2^4) \ {0} (remember that α^4 = 1 + α):

α^1  = α                     α^8  = 1 + α^2
α^2  = α^2                   α^9  = α + α^3
α^3  = α^3                   α^10 = 1 + α + α^2
α^4  = 1 + α                 α^11 = α + α^2 + α^3
α^5  = α + α^2               α^12 = 1 + α + α^2 + α^3
α^6  = α^2 + α^3             α^13 = 1 + α^2 + α^3
α^7  = 1 + α + α^3           α^14 = 1 + α^3
                             α^15 = 1

Note that indeed α^15 = 1. Thus α and each of its powers is a zero of x^15 − 1. Hence

x^15 − 1 = Π_{i=0}^{14} (x − α^i)
SLIDE 26

In general: when gcd(2, n) = 1 there exists an α in some extension field of GF(2) = {0, 1} such that x^n − 1 can be written as

x^n − 1 = Π_{i=0}^{n−1} (x − α^i).

It follows that g(x) = Π_{i∈I} (x − α^i) for some I ⊂ {0, 1, . . . , n − 1}.

The challenge is to choose a suitable I ⊂ {0, 1, . . . , n − 1} to give the code generated by g(x) good properties.

SLIDE 27

Now consider the parity check matrix

H = | 1 α   α^2     α^3     α^4     α^5     · · ·  α^14     |
    | 1 α^3 α^{3·2} α^{3·3} α^{3·4} α^{3·5} · · ·  α^{3·14} |

which really stands for (each power of α written as a binary column vector of length 4):

H = | 1 0 0 0 1 0 0 1 1 0 1 0 1 1 1 |
    | 0 1 0 0 1 1 0 1 0 1 1 1 1 0 0 |
    | 0 0 1 0 0 1 1 0 1 0 1 1 1 1 0 |
    | 0 0 0 1 0 0 1 1 0 1 0 1 1 1 1 |
    | 1 0 0 0 1 1 0 0 0 1 1 0 0 0 1 |
    | 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 |
    | 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 |
    | 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 |

So, we consider the binary [15, 7, ?] code C defined by

C = { c ∈ {0, 1}^15 | Hc^T = 0^T }
SLIDE 28

H = | 1 α   α^2     α^3     α^4     α^5     · · ·  α^14     |
    | 1 α^3 α^{3·2} α^{3·3} α^{3·4} α^{3·5} · · ·  α^{3·14} |

Let c(x) correspond to c = (c0, c1, . . . , c14), so c(x) = Σ_{i=0}^{14} c_i x^i. Then

c ∈ C ⇔ Hc^T = 0^T ⇔ c(α) = c(α^3) = 0.

We shall now show that the minimum distance of this code is 5 and that there exists an easy decoding algorithm to correct up to 2 errors. Suppose that r(x) (corresponding to vector r) is received, while codeword c(x) was transmitted. Write r(x) = c(x) + e(x), where e(x) stands for the error vector e = (e0, e1, . . . , e14). As always for decoding, we compute the syndrome:

s1 = r(α) = c(α) + e(α) = e(α)
s3 = r(α^3) = c(α^3) + e(α^3) = e(α^3)

SLIDE 29

s1 = r(α) = c(α) + e(α) = e(α)
s3 = r(α^3) = c(α^3) + e(α^3) = e(α^3)

We can distinguish three possibilities:

No error: e(x) = 0, so s1 = s3 = 0.
A single error at coordinate i: e(x) = x^i, so s1 = α^i, s3 = α^{3i}.
Two errors, one at coordinate i and the other at coordinate j: e(x) = x^i + x^j, so s1 = α^i + α^j, s3 = α^{3i} + α^{3j}.

These cases are easy to distinguish:

no error:    s1 = 0 & s3 = 0
one error:   s1 ≠ 0 & s3 = (s1)^3
two errors:  s1 ≠ 0 & s3 ≠ (s1)^3

Finding e(x) in these three cases is also elementary.
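A small sketch of this two-error decoder, locating the error positions from (s1, s3) by exhaustive search over the 15 (or 105 pairs of) coordinates; brute force replaces the "elementary" algebra of the slide, and all names are ours:

```python
# Locate up to 2 errors in the length-15 code with zeros alpha, alpha^3,
# from the syndromes s1 = e(alpha), s3 = e(alpha^3). GF(2^4), alpha^4 = 1+alpha.

def gf16_mul(a, b):
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    for bit in range(prod.bit_length() - 1, 3, -1):
        if (prod >> bit) & 1:
            prod ^= 0b10011 << (bit - 4)
    return prod

ALPHA = [1]                      # ALPHA[i] = alpha^i as a 4-bit mask
for _ in range(14):
    ALPHA.append(gf16_mul(ALPHA[-1], 0b0010))

def syndromes(e):
    """s1 = e(alpha), s3 = e(alpha^3) for a binary error vector of length 15."""
    s1 = s3 = 0
    for i, bit in enumerate(e):
        if bit:
            s1 ^= ALPHA[i]
            s3 ^= ALPHA[(3 * i) % 15]
    return s1, s3

def locate_errors(s1, s3):
    """Return the set of <= 2 error positions matching (s1, s3)."""
    if (s1, s3) == (0, 0):
        return set()
    for i in range(15):                       # one error?
        if ALPHA[i] == s1 and ALPHA[(3 * i) % 15] == s3:
            return {i}
    for i in range(15):                       # two errors
        for j in range(i + 1, 15):
            if (ALPHA[i] ^ ALPHA[j] == s1 and
                    ALPHA[(3 * i) % 15] ^ ALPHA[(3 * j) % 15] == s3):
                return {i, j}
    return None                               # more than 2 errors

e = [0] * 15
e[2] = e[9] = 1                               # two errors, at coordinates 2 and 9
print(sorted(locate_errors(*syndromes(e))))  # [2, 9]
```

Because the minimum distance is 5, any syndrome of at most 2 errors matches exactly one error pattern of weight ≤ 2, so the search result is unique.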

SLIDE 30

The technique on the previous sheets can easily be generalized to construct codes that correct more errors and still allow efficient decoding methods. So,

H = | 1 α   α^2     α^3     α^4     α^5     · · ·  α^{n−1}    |
    | 1 α^3 α^{3·2} α^{3·3} α^{3·4} α^{3·5} · · ·  α^{3(n−1)} |
    | 1 α^5 α^{5·2} α^{5·3} α^{5·4} α^{5·5} · · ·  α^{5(n−1)} |

yields a 3-error-correcting code, etc. The family of BCH codes does this. Also the Reed-Solomon codes that are used on CDs and DVDs are related to this construction.

Patterson's decoding algorithm does the decoding in t × n operations, where n is the length of the code and t the number of errors that can be corrected.

SLIDE 31

IV The McEliece cryptosystem

History: Berlekamp, McEliece, and van Tilborg proved in 1978 that the general decoding problem is NP-complete.

Coset weights problem:
Input: a matrix H, a vector s, and an integer w.
Property: there exists a vector e of weight ≤ w such that He^T = s^T.

Take w = 0, 1, 2, . . . until you find a YES. You do not find e (the/a most likely error pattern with syndrome s), but at least you know its existence and weight.

SLIDE 32

NP: a decision problem whose YES-instances can be verified in polynomial time (but for which no known algorithm finds the answer in polynomial time). Complete: any other NP problem can be converted to this one (in polynomial time). Famous other NP-complete problems are the Boolean satisfiability problem and the traveling salesman problem.

The relevance of being NP-complete to cryptography is limited, as the story of the knapsack-based cryptosystems teaches us.

Elwyn Berlekamp, Bob McEliece and Henk van Tilborg, On the inherent intractability of certain coding problems, IEEE Trans. Inf. Theory IT-24, 1978, pp. 384-386.
Michael R. Garey and David S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1978.

SLIDE 33

The Coset Weights Problem is about arbitrary (parity check) matrices, not the well-structured parity check matrices that allow easy decoding, like

H = | 0 0 0 1 1 1 1 |
    | 0 1 1 0 0 1 1 |
    | 1 0 1 0 1 0 1 |

and

H = | 1 α   α^2     · · ·  α^{n−1}    |
    | 1 α^3 α^{3·2} · · ·  α^{3(n−1)} |

SLIDE 34

Instead think of a large, unstructured {0, 1} parity check matrix H together with an all-one syndrome vector s. [The slide shows a concrete random-looking example of H and s.]

Write s as a linear combination/sum of as few columns of H as possible.

SLIDE 35

McEliece based his cryptosystem on this:

  • Decoding linear codes is, in general, very hard.
  • But linear codes with a nice structure are easy to decode.

He needed a trapdoor to hide the nice structure.

Robert McEliece, A public-key cryptosystem based on algebraic coding theory, JPL DSN Progress Report 42-44, pp. 114-116, Jan.-Feb. 1978.

SLIDE 36

Set up

• Select a generator matrix G of an [n, k, 2t + 1] linear code C with an efficient decoding algorithm Dec_G.
• Select a random k × k invertible matrix S and a random n × n permutation matrix P. Compute Ĝ = SGP.
• Make Ĝ and t public, but keep S, P, and G secret.

Encryption

• Message m ∈ {0, 1}^k will be encrypted into r = mĜ + e, where e is a random vector of weight t.

Decryption

• Compute rP^{−1} = (mĜ + e)P^{−1} = mSGPP^{−1} + eP^{−1} = (mS)G + e′.
• Apply Dec_G to this vector to find mS (note that e′ also has weight t).
• Retrieve m from (mS)S^{−1}.
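The whole scheme can be walked through at toy scale with the [7, 4, 3] code from the earlier slides (t = 1). This is purely illustrative: real McEliece uses Goppa codes with much larger parameters; the matrices S, P, the derived parity check matrix H, and all names below are our own choices, not the slides':

```python
# Toy McEliece sketch with the [7, 4, 3] code (corrects t = 1 error).
import random
from itertools import product

G = [[0,0,0,1,1,1,1],        # generator matrix from the slides
     [0,0,1,0,0,1,1],
     [0,1,0,0,1,0,1],
     [1,0,0,0,1,1,0]]
H = [[1,1,0,1,1,0,0],        # a parity check matrix derived for this G
     [1,0,1,1,0,1,0],
     [0,1,1,1,0,0,1]]

S    = [[1,1,0,0],[0,1,1,0],[0,0,1,1],[0,0,0,1]]   # invertible over GF(2)
Sinv = [[1,1,1,1],[0,1,1,1],[0,0,1,1],[0,0,0,1]]   # its inverse
pi    = [2, 6, 0, 3, 5, 1, 4]                      # the permutation P
piinv = [pi.index(j) for j in range(7)]

def mat_vec(m, M):
    """Row vector times matrix over GF(2)."""
    out = [0] * len(M[0])
    for bit, row in zip(m, M):
        if bit:
            out = [a ^ b for a, b in zip(out, row)]
    return out

def decode(r):
    """Single-error syndrome decoding for H."""
    s = tuple(sum(h * x for h, x in zip(row, r)) % 2 for row in H)
    c = list(r)
    if s != (0, 0, 0):
        c[[tuple(row[j] for row in H) for j in range(7)].index(s)] ^= 1
    return c

def encrypt(m):
    c = [mat_vec(mat_vec(m, S), G)[i] for i in pi]  # m S G P
    e = [0] * 7
    e[random.randrange(7)] = 1                      # random error of weight t = 1
    return [a ^ b for a, b in zip(c, e)]

def decrypt(r):
    y = [r[i] for i in piinv]                       # r P^-1 = (mS)G + e'
    c = decode(y)                                   # remove the error e'
    ms = next(m for m in product([0, 1], repeat=4)
              if mat_vec(list(m), G) == c)          # recover mS (toy-size lookup)
    return mat_vec(list(ms), Sinv)                  # m = (mS) S^-1

m = [1, 0, 1, 1]
print(decrypt(encrypt(m)) == m)  # True
```

The brute-force lookup for mS stands in for the efficient decoder Dec_G of the real scheme; it only works because this toy code has 16 codewords.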
SLIDE 37

Of course, the adversary should not be able to "guess" the code C that was used (or the S or P). There are too few BCH codes and Reed-Solomon codes for given parameters. That is why McEliece chose the large class of Goppa codes: their number grows exponentially in the length of the code. In his original proposal (1978): n = 1024, t = 50, and k ≈ 524. Since 2008 these parameters are no longer safe.

Dan Bernstein, Tanja Lange, and Christiane Peters, Attacking and Defending the McEliece Cryptosystem, in: Johannes Buchmann and Jintai Ding (eds.), PQCrypto 2008, Springer-Verlag, Berlin Heidelberg, LNCS 5299, pp. 31-46, 2008.

SLIDE 38

V Burst-correcting array codes

Definition: An (n1, n2)-array code C consists of all n1 × n2 {0, 1}-arrays C whose row and column sums are all congruent to zero modulo 2 (each of the n1 rows and each of the n2 columns has even parity).

It follows directly from this definition that an (n1, n2)-array code C is a linear code with length n1 × n2 and dimension (n1 − 1)(n2 − 1).

SLIDE 39

Example: n1 = 5, n2 = 8.

0 1 0 1 1 1 0 0
1 1 1 1 0 1 1 0
1 0 1 0 0 0 1 1
0 0 0 1 0 1 1 1
0 0 0 1 1 1 1 0

is a "codeword". This code has length 5 × 8 = 40 and dimension 4 × 7 = 28. Any fixed read-out of these 40 coordinates is fine.

SLIDE 40

Let R be a received word. The horizontal syndromes h1, h2, . . . , h_{n1} of R are defined by the row sums, and the vertical syndromes v1, v2, . . . , v_{n2} by the column sums. Decoding a single error in this code is extremely simple.

SLIDE 41

Example continued: Look at the received word (row parities shown at the right, column parities at the bottom):

1 1 0 0 0 1 0 1 | 0
0 1 0 0 1 0 1 0 | 1
1 0 1 0 1 1 0 0 | 0
0 0 1 1 0 1 1 0 | 0
0 0 1 1 0 1 0 1 | 0
---------------
0 0 1 0 0 0 0 0

It is clear where the error occurred: at the intersection of the odd-parity row and the odd-parity column (row 2, column 3). So, decoding a single error is easy (but not very impressive). The actual minimum distance of this code is 4. How about decoding bursts?
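The row/column-parity decoder is a few lines (a minimal sketch; the array is the received word above and the names are ours):

```python
# Single-error decoding in an array code via row and column parities.

R = [[1,1,0,0,0,1,0,1],
     [0,1,0,0,1,0,1,0],
     [1,0,1,0,1,1,0,0],
     [0,0,1,1,0,1,1,0],
     [0,0,1,1,0,1,0,1]]

h = [sum(row) % 2 for row in R]        # horizontal syndromes (row parities)
v = [sum(col) % 2 for col in zip(*R)]  # vertical syndromes (column parities)

i, j = h.index(1), v.index(1)          # the unique odd row and odd column
R[i][j] ^= 1                           # flip the erroneous bit
print(i + 1, j + 1)  # 2 3
```

After the flip, all row and column parities are even again, so R is a codeword of the array code.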

SLIDE 42

For burst-correction the particular read-out of the array is important. We follow diagonals, one after another. Example: n1 = 5, n2 = 6, so n = 30; position k of the read-out is the cell numbered k below:

 1  6 11 16 21 26
27  2  7 12 17 22
23 28  3  8 13 18
19 24 29  4  9 14
15 20 25 30  5 10

Without loss of generality we shall assume that n2 ≥ n1.

SLIDE 43

It is not so difficult to see that C cannot correct all bursts of length up to n1. Indeed, in our example, two bursts of length 5 (and many more) have the same syndrome. Both have burst pattern (1, 0, 0, 0, 1). [On the slide, the positions of the ones are indicated in color in two copies of the diagonal read-out array above; the highlighting does not survive in this rendering.]

SLIDE 44

Let us now see when C can correct all bursts of length ≤ n1 − 1. With a little bit of work one can check that for n2 < 2n1 − 3 there are always two different weight-two bursts of length ≤ n1 − 1 with the same syndrome. [On the slide, two such bursts are depicted in red resp. blue in the diagonal read-out array; the highlighting does not survive in this rendering.]

SLIDE 45

Theorem: Let C be the n1 × n2 array code, n2 ≥ n1, with +1-diagonal read-out as defined above. Then C can correct all bursts of length ≤ n1 − 1 if and only if n2 ≥ 2n1 − 3.

Proof by example: n1 = 11, n2 = 19.

Mario Blaum, Paddy Farrell, and Henk van Tilborg, A class of burst error-correcting array codes, IEEE Trans. Information Theory IT-32, 1986, pp. 836-839.