1/45
Error-correcting codes and Cryptography
Henk van Tilborg
Code-based Cryptography Workshop Eindhoven, May 11-12, 2011
2/45
CONTENTS

I Error-correcting codes; the basics
II Quasi-cyclic codes; codes generated by circulants
III Cyclic codes
IV The McEliece cryptosystem
V Burst-correcting array codes
3/45
I Error-correcting codes; the basics
[Diagram: Sender → Encode → Channel (noise) → Decode → Receiver. The message m is encoded into a codeword c; noise on the channel turns c into the received word r, which is decoded back to m.]
Error-correcting codes are designed to correct errors that occur during transmission of data or during storage of data. A random error pattern with, say, two errors at coordinates i and j looks like:

(0, . . . , 0, 1, 0, . . . , 0, 1, 0, . . . , 0)

with the ones at coordinates i and j.
We shall also briefly discuss codes that correct bursts (clusters) of errors, i.e. error patterns of the form:
(0, . . . , 0, 1, ∗, . . . , ∗, 1, 0, . . . , 0)

which is non-zero only on the coordinates i up to i + b − 1.
4/45
m0  → c0  = 0 0 0 0 0 0 0
m1  → c1  = 0 0 0 1 1 1 1
m2  → c2  = 0 0 1 0 0 1 1
m3  → c3  = 0 0 1 1 1 0 0
m4  → c4  = 0 1 0 0 1 0 1
m5  → c5  = 0 1 0 1 0 1 0
m6  → c6  = 0 1 1 0 1 1 0
m7  → c7  = 0 1 1 1 0 0 1
m8  → c8  = 1 0 0 0 1 1 0
m9  → c9  = 1 0 0 1 0 0 1
m10 → c10 = 1 0 1 0 1 0 1
m11 → c11 = 1 0 1 1 0 1 0
m12 → c12 = 1 1 0 0 0 1 1
m13 → c13 = 1 1 0 1 1 0 0
m14 → c14 = 1 1 1 0 0 0 0
m15 → c15 = 1 1 1 1 1 1 1
16 codewords of length 7
5/45
A code C is such a (well-chosen) subset of {0, 1}^n. So codes here will be binary codes. The generalization to other field sizes is easy. The weight of a word is the number of non-zero coordinates. Example: a code C of length 5 with the following four codewords:
c0 = 0 0 0 0 0
c1 = 0 0 1 1 1
c2 = 1 1 0 0 1
c3 = 1 1 1 1 0
6/45
Suppose that each two codewords differ in at least d coordinates (have distance at least d) and put t = ⌊(d − 1)/2⌋.
d = 3, t = 1

c0 = 0 0 0 0 0
c1 = 0 0 1 1 1
c2 = 1 1 0 0 1
c3 = 1 1 1 1 0
Then the code C is said to be t-error-correcting, because if you transmit (or store) a codeword and not more than t errors have occurred upon reception (or read-out) due to noise or damage, then the received word will still be closer to the original codeword than to any other. For instance, if you receive
r = 0 1 0 0 1
you know that c2 is the most likely transmitted codeword.
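The nearest-codeword rule just described can be sketched in a few lines (a minimal illustration, not part of the slides; the code and the received word are the ones above):

```python
# Minimum-distance decoding of the length-5 example code.
CODE = ["00000", "00111", "11001", "11110"]

def hamming_distance(a, b):
    """Number of coordinates in which two words differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(r):
    """Return the codeword closest to the received word r."""
    return min(CODE, key=lambda c: hamming_distance(c, r))

print(decode("01001"))  # 11001, i.e. c2 is the most likely codeword
```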
7/45
From now on codes will be linear, meaning that C is a linear subspace of {0, 1}^n. We use the notation [n, k, d] codes, where k denotes the dimension of C and d its minimum distance.
The quantity r = n − k is called the redundancy of the code. This is the number of additional coordinates (apart from the actual information being transmitted) that make error-correction possible. It follows from the linear structure of C that an appropriate choice of k codewords forms a basis for the code. A basis of the code C = {00000, 00111, 11001, 11110} is given by the rows
0 0 1 1 1
1 1 0 0 1
8/45
A basis of the linear (!) [7, 4, 3] code introduced before is given by c1, c2, c4, c8:
c0  = 0 0 0 0 0 0 0
c1  = 0 0 0 1 1 1 1
c2  = 0 0 1 0 0 1 1
c3  = 0 0 1 1 1 0 0
c4  = 0 1 0 0 1 0 1
c5  = 0 1 0 1 0 1 0
c6  = 0 1 1 0 1 1 0
c7  = 0 1 1 1 0 0 1
c8  = 1 0 0 0 1 1 0
c9  = 1 0 0 1 0 0 1
c10 = 1 0 1 0 1 0 1
c11 = 1 0 1 1 0 1 0
c12 = 1 1 0 0 0 1 1
c13 = 1 1 0 1 1 0 0
c14 = 1 1 1 0 0 0 0
c15 = 1 1 1 1 1 1 1
9/45
A matrix G whose rows form a basis of an [n, k, d] code C is called a generator matrix G of C. Its size is k × n. The basis c1, c2, c4, c8 of the code on the previous page results in the generator matrix:
G = [ 0 0 0 1 1 1 1 ]
    [ 0 0 1 0 0 1 1 ]
    [ 0 1 0 0 1 0 1 ]
    [ 1 0 0 0 1 1 0 ]
So, in general, a linear code C with k × n generator matrix G consists of all linear combinations of the rows of G.
C = {mG | m ∈ {0, 1}^k}
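This description of C can be made concrete (an illustrative sketch, using the generator matrix G above):

```python
import itertools

# Generate all codewords mG over GF(2), with G the 4x7 generator matrix above.
G = [[0,0,0,1,1,1,1],
     [0,0,1,0,0,1,1],
     [0,1,0,0,1,0,1],
     [1,0,0,0,1,1,0]]

def encode(m, G):
    """Codeword mG: XOR of the rows of G selected by the message bits."""
    c = [0] * len(G[0])
    for bit, row in zip(m, G):
        if bit:
            c = [x ^ y for x, y in zip(c, row)]
    return c

codewords = {tuple(encode(m, G)) for m in itertools.product([0, 1], repeat=4)}
print(len(codewords))  # 16 codewords of length 7
```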
10/45
If k is large compared to n, it is often advantageous to describe C as the null-space of an (n − k) × n matrix H called a parity check matrix:
C = {x ∈ {0, 1}^n | HxT = 0T}.
Typically, you transmit a codeword c and you receive r, which can be written as r = c ⊕ e, where e is called the error vector and is caused by the noise. The decoder cannot do better than look for the closest codeword to r, i.e. look for e of lowest weight such that r − e ∈ C. Note that sT := HrT = HcT ⊕ HeT = HeT. This value is called the syndrome of the received word. It only depends on the error-vector.
11/45
Example: The matrix
H = [ 0 0 0 1 1 1 1 ]
    [ 0 1 1 0 0 1 1 ]
    [ 1 0 1 0 1 0 1 ]
is the parity check matrix of a linear code C = {x ∈ {0, 1}^7 | HxT = 0T}.
Moreover, this code can correct a single error (d = 3, t = 1). We give a decoding algorithm. Let r be a received word. Compute its syndrome s, i.e. compute sT = HrT. If sT = 0T, then r ∈ C, so (most likely) no error occurred. Otherwise sT equals the i-th column of H for some i, and (most likely) a single error occurred at coordinate i.
12/45
Example continued: Suppose you receive
r = 1 0 0 0 1 1 1
Its syndrome with
H = [ 0 0 0 1 1 1 1 ]
    [ 0 1 1 0 0 1 1 ]
    [ 1 0 1 0 1 0 1 ]
is sT = (1, 0, 1)T, which is the 5-th column of H. Note that

e = 0 0 0 0 1 0 0

gives the same syndrome, so H(rT − eT) = 0T. So, the most likely transmitted codeword is r − e, i.e.
c = 1 0 0 0 0 1 1
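The whole single-error decoding procedure of this example can be sketched as follows (illustrative code, using the parity check matrix H above):

```python
# Single-error decoding for the [7,4,3] code with the 3x7 matrix H above.
H = [[0,0,0,1,1,1,1],
     [0,1,1,0,0,1,1],
     [1,0,1,0,1,0,1]]

def syndrome(H, r):
    """s^T = H r^T over GF(2)."""
    return [sum(h * x for h, x in zip(row, r)) % 2 for row in H]

def decode(H, r):
    """A non-zero syndrome equals a column of H; flip the bit at that column."""
    s = syndrome(H, r)
    if any(s):
        i = [[row[j] for row in H] for j in range(7)].index(s)
        r = r[:]          # copy before correcting
        r[i] ^= 1
    return r

r = [1, 0, 0, 0, 1, 1, 1]
print(decode(H, r))  # [1, 0, 0, 0, 0, 1, 1], the codeword c on the slide
```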
13/45
II Quasi-cyclic codes; codes generated by circulants
Consider the 15 × 15 circulant

U = [ 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 ]
    [ 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 ]
    [ 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 ]
    [ 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 ]
    [ 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 ]
    [ 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 ]
    [ 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 ]
    [ 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 ]
    [ 1 1 0 0 0 0 0 0 1 0 0 0 1 0 1 ]
    [ 1 1 1 0 0 0 0 0 0 1 0 0 0 1 0 ]
    [ 0 1 1 1 0 0 0 0 0 0 1 0 0 0 1 ]
    [ 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0 ]
    [ 0 1 0 1 1 1 0 0 0 0 0 0 1 0 0 ]
    [ 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 ]
    [ 0 0 0 1 0 1 1 1 0 0 0 0 0 0 1 ]
14/45
Note that in U the rows u0, u4, and u6 add up (modulo 2) to row u7:

u0 = 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0
u4 = 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0
u6 = 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1
u7 = 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1

So, row u7 is a linear combination of the preceding rows. But then, because of the cyclic structure, also row u8 is a linear combination of the preceding rows, and the same holds for u9, . . . , u14.
We conclude that the rows of U generate a [15, 7] code.
15/45
Each row in U is a cyclic shift of the previous row: row u0 corresponds to u(x), row u1 to xu(x), . . . , row u6 to x^6u(x), row u7 to x^7u(x), and so on. Define u(x) by the top row u0:

u(x) = Σ_{i=0}^{14} U_{0,i} x^i = 1 + x^4 + x^6 + x^7 + x^8.
Then xu(x) corresponds to row u1, x^2u(x) corresponds to row u2, etc., where these polynomials have to be taken modulo x^15 − 1. For example,
u6 corresponds to x^6u(x) = x^6 + x^10 + x^12 + x^13 + x^14
u7 corresponds to x^7u(x) = x^7 + x^11 + x^13 + x^14 + 1
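The correspondence between cyclic shifts and multiplication by x modulo x^15 − 1 is easy to check mechanically (a small sketch; polynomials over GF(2) are represented as sets of exponents):

```python
# A cyclic shift of a row equals multiplying its polynomial by x mod x^15 - 1.
n = 15
u = {0, 4, 6, 7, 8}           # u(x) = 1 + x^4 + x^6 + x^7 + x^8

def shift(poly, k, n):
    """x^k * poly(x) mod x^n - 1 (binary coefficients)."""
    return {(e + k) % n for e in poly}

print(sorted(shift(u, 6, n)))  # [6, 10, 12, 13, 14], i.e. x^6 u(x)
print(sorted(shift(u, 7, n)))  # [0, 7, 11, 13, 14], i.e. x^7 u(x)
```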
16/45
The reason that U generates a [15, 7] code (2nd proof) is that the rows u0, . . . , u6, i.e. the polynomials u(x), xu(x), . . . , x^6u(x), are linearly independent, while every later row depends on them.
Indeed x^15 − 1 = u(x)(1 + x^4 + x^6 + x^7), as one can easily check. So,

x^7u(x) ≡ x^7u(x) + (x^15 − 1)
        ≡ (x^7 + (1 + x^4 + x^6 + x^7)) u(x)
        ≡ (1 + x^4 + x^6)u(x)
        ≡ u(x) + x^4u(x) + x^6u(x)   (mod x^15 − 1).
This shows why rows u0, u4, u6 add up (modulo 2) to row u7. This argument holds in general when u(x) divides x^n − 1.
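The factorization used above is quickly verified by machine (a sketch; polynomials over GF(2) are encoded as integers, with bit i holding the coefficient of x^i):

```python
# Verify x^15 - 1 = u(x)(1 + x^4 + x^6 + x^7) over GF(2).

def gf2_mul(a, b):
    """Carry-less (GF(2)) polynomial multiplication."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

u = 0b111010001                  # 1 + x^4 + x^6 + x^7 + x^8
q = 0b11010001                   # 1 + x^4 + x^6 + x^7
assert gf2_mul(u, q) == (1 << 15) | 1   # x^15 + 1 (= x^15 - 1 over GF(2))
print("factorization checks out")
```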
17/45
How about the rank of a code generated by a circulant U with top row u0, corresponding to a polynomial u(x) that does not divide x^n − 1?
U = [ 1 0 0 1 1 1 0 ]   ← u0
    [ 0 1 0 0 1 1 1 ]
    [ 1 0 1 0 0 1 1 ]
    [ 1 1 0 1 0 0 1 ]
    [ 1 1 1 0 1 0 0 ]
    [ 0 1 1 1 0 1 0 ]
    [ 0 0 1 1 1 0 1 ]

u(x) = 1 + x^3 + x^4 + x^5 does not divide x^7 − 1.
18/45
[U: the circulant with top row corresponding to u(x).]
Define g(x) = gcd(u(x), x^n − 1) and use the extended version of Euclid's Algorithm to write:

g(x) = a(x)u(x) + b(x)(x^n − 1).
Then

g(x) ≡ a(x)u(x) ≡ (Σ_{i=0}^{n−1} a_i x^i) u(x) ≡ Σ_{i=0}^{n−1} a_i (x^i u(x))   (mod x^n − 1).

So,

g = Σ_{i=0}^{n−1} a_i u_i.

So, g is a linear combination of the rows of U.
19/45
The vector g (and each of its cyclic shifts) is a linear combination of the rows of U. Since g(x) = gcd(u(x), x^n − 1) divides u(x), we also know that u0 (and each of its shifts) is a linear combination of cyclic shifts of g. We conclude that G, the circulant with g as top row, generates the same code as U does:
[U: the circulant with top row u(x); G: the circulant with top row g(x).]
But now g(x) divides x^n − 1, so the code generated by U has dimension n − degree(g(x)).
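As a sketch, the dimension formula can be checked on the 7 × 7 example above, with u(x) = 1 + x^3 + x^4 + x^5, computing g(x) by Euclid's algorithm over GF(2) (integers encode polynomials, bit i being the coefficient of x^i):

```python
# Compute g(x) = gcd(u(x), x^7 - 1) over GF(2) and the code dimension.

def gf2_mod(a, b):
    """Remainder of a divided by b over GF(2)."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def gf2_gcd(a, b):
    while b:
        a, b = b, gf2_mod(a, b)
    return a

n = 7
u = 0b111001                     # 1 + x^3 + x^4 + x^5
g = gf2_gcd((1 << n) | 1, u)     # gcd(x^7 - 1, u(x))
print(bin(g))                    # 0b10111: g(x) = 1 + x + x^2 + x^4
print(n - (g.bit_length() - 1))  # dimension of the code: 3
```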
20/45
How about a code that is the linear span of two (or more) circulants underneath each other?
[U: circulant with top row u(x), stacked on V: circulant with top row v(x).]
Codewords are linear combinations of rows of U and V. Things are easy here: the code is again generated by the circulant with top row g(x) = gcd(u(x), v(x), x^n − 1). Indeed, g(x) = a(x)u(x) + b(x)v(x) gives g (and all its cyclic shifts) as linear combination of the rows of U and V.
21/45
How about a code that is the linear span of two (or more) circulants next to each other?
[A single row of circulants, with top rows u1(x), u2(x), · · · , um(x).]
22/45
How about a code that is the linear span of two (or more) rows of circulants next to each other, so-called quasi-cyclic codes?
[An l × m array of circulants, with top rows:]

u1,1(x) u1,2(x) · · · u1,m(x)
  ...     ...           ...
ul,1(x) ul,2(x) · · · ul,m(x)
Things are difficult here. Little to nothing can be said about rank, minimum distance, let alone decoding. See Ph.D. thesis: Kristine Lally, Application of the theory of Gröbner bases to the study of quasi-cyclic codes, National University of Ireland, Cork, 6-15-2000, especially for the case of a single row of circulants.
23/45
III Cyclic codes
The codes generated by a column of circulants are commonly called cyclic codes.
A column of circulants U, V, . . . generates the same code as the single circulant G with top row

g(x) := gcd(u(x), v(x), · · · , x^n − 1).

Because g(x) divides x^n − 1, only the first n − degree(g(x)) rows of G are linearly independent; the dependent rows are commonly left out. The real question is how to select a divisor g(x) of x^n − 1 such that the code generated by it has good properties:
24/45
Consider the irreducible polynomial f(x) = 1 + x + x^4 and let α be a zero of f(x). Then α can be assumed to be in GF(2^4) with as elements all binary polynomials in α of degree less than 4:

GF(2^4) = { a0 + a1α + a2α^2 + a3α^3 | ai ∈ {0, 1} }.
Arithmetic is modulo 2 and modulo 1 + α + α^4. For instance:

(1 + α^2) + (1 + α^3) = α^2 + α^3
(1 + α^2)(1 + α^3) = 1 + α^2 + α^3 + α^5
                   = 1 + α^2 + α^3 + α^5 + α(1 + α + α^4)
                   = 1 + α + α^3
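These computations are easy to mechanize (a sketch; field elements are 4-bit integers, bit i holding the coefficient of α^i):

```python
# Arithmetic in GF(2^4) modulo 1 + x + x^4.
MOD = 0b10011                    # 1 + x + x^4

def gf16_mul(a, b):
    r = 0
    while b:                     # carry-less multiply ...
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    for i in range(r.bit_length() - 1, 3, -1):   # ... then reduce mod 1 + x + x^4
        if r & (1 << i):
            r ^= MOD << (i - 4)
    return r

a = 0b0101                       # 1 + alpha^2
b = 0b1001                       # 1 + alpha^3
print(bin(a ^ b))                # 0b1100: alpha^2 + alpha^3
print(bin(gf16_mul(a, b)))       # 0b1011: 1 + alpha + alpha^3
```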
25/45
But α has the additional property of being primitive: α generates GF(2^4) \ {0} (remember that α^4 = 1 + α).

        1   α   α^2 α^3
1       1   0   0   0
α       0   1   0   0
α^2     0   0   1   0
α^3     0   0   0   1
α^4     1   1   0   0
α^5     0   1   1   0
α^6     0   0   1   1
α^7     1   1   0   1
α^8     1   0   1   0
α^9     0   1   0   1
α^10    1   1   1   0
α^11    0   1   1   1
α^12    1   1   1   1
α^13    1   0   1   1
α^14    1   0   0   1
Note that indeed α^15 = 1. Thus α and each of its powers is a zero of x^15 − 1. Hence

x^15 − 1 = Π_{i=0}^{14} (x − α^i).
26/45
In general: when gcd(2, n) = 1 there exists an α in some extension field of GF(2) such that

x^n − 1 = Π_{i=0}^{n−1} (x − α^i).

It follows that g(x) = Π_{i∈I} (x − α^i) for some I ⊂ {0, 1, . . . , n − 1}.
The challenge is to choose a suitable I ⊂ {0, 1, . . . , n − 1} to give the code generated by g(x) good properties.
27/45
Now consider the parity check matrix

H = [ 1  α    α^2    α^3    α^4    α^5    · · ·  α^14   ]
    [ 1  α^3  α^3×2  α^3×3  α^3×4  α^3×5  · · ·  α^3×14 ]

Writing each power of α as a binary column (its coordinates with respect to 1, α, α^2, α^3) gives:

H = [ 1 0 0 0 1 0 0 1 1 0 1 0 1 1 1 ]
    [ 0 1 0 0 1 1 0 1 0 1 1 1 1 0 0 ]
    [ 0 0 1 0 0 1 1 0 1 0 1 1 1 1 0 ]
    [ 0 0 0 1 0 0 1 1 0 1 0 1 1 1 1 ]
    [ 1 0 0 0 1 1 0 0 0 1 1 0 0 0 1 ]
    [ 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 ]
    [ 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 ]
    [ 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 ]
So, we consider the binary [15, 7, ?] code C defined by C = {x ∈ {0, 1}^15 | HxT = 0T}.
28/45
H = [ 1  α    α^2    α^3    · · ·  α^14   ]
    [ 1  α^3  α^3×2  α^3×3  · · ·  α^3×14 ]

Write c(x) = Σ_{i=0}^{14} c_i x^i. Then

c ∈ C ⇔ HcT = 0T ⇔ c(α) = c(α^3) = 0.
We shall now show that the minimum distance of this code is 5 and that there exists an easy decoding algorithm to correct up to 2 errors. Suppose that r(x) (corresponding to vector r) is received, while codeword
c(x) was transmitted. Write r(x) = c(x) + e(x), where e(x) stands for the
error vector e = (e0, e1, . . . , e14). As always for decoding, we compute the syndrome
s1 = r(α) = c(α) + e(α) = e(α)
s3 = r(α^3) = c(α^3) + e(α^3) = e(α^3)
29/45
s1 = r(α) = c(α) + e(α) = e(α)
s3 = r(α^3) = c(α^3) + e(α^3) = e(α^3)
We can distinguish three possibilities:

No error: e(x) = 0, so s1 = s3 = 0.

A single error at coordinate i:
e(x) = x^i, s1 = α^i, s3 = α^3i.

Two errors, one on coordinate i and the other on coordinate j:
e(x) = x^i + x^j, s1 = α^i + α^j, s3 = α^3i + α^3j.
These cases are easy to distinguish:

no error:   s1 = 0 and s3 = 0
one error:  s1 ≠ 0 and s3 = (s1)^3
two errors: s1 ≠ 0 and s3 ≠ (s1)^3

Finding e(x) in these three cases is also elementary.
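The case analysis can be sketched as follows (an assumed implementation, not from the slides; it builds the powers of α in GF(2^4) modulo 1 + x + x^4 and tests the syndrome conditions):

```python
# Syndrome case analysis for the 2-error-correcting [15,7,5] code.
MOD = 0b10011                        # 1 + x + x^4

def mul(a, b):
    """Multiply two GF(2^4) elements: carry-less product, then reduce."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    for i in range(r.bit_length() - 1, 3, -1):
        if r & (1 << i):
            r ^= MOD << (i - 4)
    return r

ALPHA = [1]                          # ALPHA[i] = alpha^i
for _ in range(14):
    ALPHA.append(mul(ALPHA[-1], 0b0010))

def syndromes(errors):
    """s1 = e(alpha), s3 = e(alpha^3) for a set of error coordinates."""
    s1 = s3 = 0
    for i in errors:
        s1 ^= ALPHA[i % 15]
        s3 ^= ALPHA[(3 * i) % 15]
    return s1, s3

def classify(s1, s3):
    if s1 == 0 and s3 == 0:
        return "no error"
    if s1 != 0 and s3 == mul(mul(s1, s1), s1):
        return "one error"
    return "two errors"

print(classify(*syndromes(set())))    # no error
print(classify(*syndromes({5})))      # one error
print(classify(*syndromes({5, 11})))  # two errors
```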
30/45
The technique on the previous sheets can easily be generalized to construct codes that correct more errors and allow efficient decoding methods. So,

H = [ 1  α    α^2    α^3    α^4    α^5    · · ·  α^(n−1)   ]
    [ 1  α^3  α^3×2  α^3×3  α^3×4  α^3×5  · · ·  α^3×(n−1) ]
    [ 1  α^5  α^5×2  α^5×3  α^5×4  α^5×5  · · ·  α^5×(n−1) ]

generates a 3-error-correcting code, etc. The family of BCH codes does this. Also the Reed-Solomon codes that are used on CDs and DVDs are related to this construction. Peterson's decoding algorithm does the decoding in t × n operations, where n is the length of the code and t the number of errors that can be corrected.
31/45
IV The McEliece cryptosystem
History: Berlekamp, McEliece, and vT proved in 1978 that the general decoding problem is NP-complete.

Coset weights problem:
Input: a matrix H, a vector s, and an integer w.
Property: there exists a vector e of weight ≤ w such that HeT = sT.

Take w = 0, 1, 2, . . . until you find a YES. You do not find e (the/a most likely error pattern with syndrome s) but at least you know its existence and weight.
32/45
NP: a decision problem for which a YES answer can be verified in polynomial time (but no known algorithm answers it in polynomial time). Complete: any other NP problem can be converted to this one (in polynomial time). Famous other NP-complete problems are the Boolean satisfiability problem and the traveling salesman problem. The relevance of being NP-complete to cryptography is limited, as the story below will show.

Elwyn Berlekamp, Bob McEliece and Henk van Tilborg, On the inherent intractability of certain coding problems, IEEE Trans. Inf. Theory IT-24, 1978, pp. 384-386.
Michael R. Garey and David S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979.
33/45
The Coset Weights Problem is about arbitrary (parity check) matrices, not the well structured parity check matrices that allow easy decoding, like
H = [ 0 0 0 1 1 1 1 ]
    [ 0 1 1 0 0 1 1 ]
    [ 1 0 1 0 1 0 1 ]
and
H = [ 1  α    α^2    · · ·  α^(n−1)   ]
    [ 1  α^3  α^3×2  · · ·  α^3×(n−1) ]
34/45
Instead think of
[A large binary parity check matrix H with random-looking entries and no visible structure, together with the syndrome s = (1, 1, 1, 1, 1).]
Write s as linear combination/sum of as few columns of H as possible.
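The naive approach suggested above (try w = 0, 1, 2, . . .) can be sketched as a brute-force search; the matrix H and syndrome s below are small made-up stand-ins for the large instance on the slide:

```python
import itertools

# Brute-force sketch of the coset weights problem: find a lowest-weight e
# with H e^T = s^T. Exponential in general; fine for toy sizes.
H = [[1, 0, 0, 1, 0, 1],
     [0, 1, 0, 1, 1, 1],
     [0, 0, 1, 0, 1, 1]]
s = [1, 0, 1]

def syndrome(H, e):
    return [sum(h * x for h, x in zip(row, e)) % 2 for row in H]

def min_weight_error(H, s):
    """Try w = 0, 1, 2, ... until some weight-w vector e has syndrome s."""
    n = len(H[0])
    for w in range(n + 1):
        for support in itertools.combinations(range(n), w):
            e = [1 if i in support else 0 for i in range(n)]
            if syndrome(H, e) == s:
                return w, e
    return None

w, e = min_weight_error(H, s)
print(w, e)  # 2 [1, 0, 1, 0, 0, 0]
```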
35/45
McEliece based his cryptosystem on this:
He needed a trapdoor to hide the nice structure. Robert McEliece, A public–key cryptosystem based on algebraic coding theory, JPL DSN Progress Report 42–44, pp. 114–116, Jan–Febr. 1978.
36/45
Set up
Choose a generator matrix G of a binary [n, k] code C that has an efficient decoding algorithm DecG.
Choose a random invertible k × k matrix S and a random n × n permutation matrix P. Compute Ĝ = SGP.
Make Ĝ and t public, but keep S, P, and G secret.

Encryption
The ciphertext of a message m ∈ {0, 1}^k is c = mĜ + e, where e is a random vector of weight t.

Decryption
Compute cP^−1 = (mĜ + e)P^−1 = mSGPP^−1 + eP^−1 = (mS)G + e′, where e′ = eP^−1 also has weight t.
Apply DecG to recover the codeword (mS)G, read off mS, and compute m = (mS)S^−1.
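The mechanics of the scheme can be sketched with the [7, 4, 3] Hamming code from part I standing in for a Goppa code (an insecure toy with t = 1; the matrices S and P below are arbitrary illustrative choices, and S happens to be its own inverse over GF(2)):

```python
# Toy sketch of the McEliece mechanics (NOT secure).
G = [[0,0,0,1,1,1,1], [0,0,1,0,0,1,1], [0,1,0,0,1,0,1], [1,0,0,0,1,1,0]]
H = [[0,0,0,1,1,1,1], [0,1,1,0,0,1,1], [1,0,1,0,1,0,1]]
S = [[1,1,0,0], [0,1,0,0], [0,0,1,1], [0,0,0,1]]  # invertible over GF(2)
P = [2, 0, 6, 1, 4, 5, 3]                         # permutation of the 7 columns

def vecmat(v, M):
    """v * M over GF(2): XOR of the rows of M selected by v."""
    out = [0] * len(M[0])
    for bit, row in zip(v, M):
        if bit:
            out = [a ^ b for a, b in zip(out, row)]
    return out

def permute(v, p):
    return [v[p[j]] for j in range(len(p))]

def unpermute(v, p):
    out = [0] * len(p)
    for j, x in enumerate(v):
        out[p[j]] = x
    return out

G_hat = [permute(vecmat(row, G), P) for row in S]  # public key: SGP

def encrypt(m, e):                                 # c = m*G_hat + e, weight(e) = 1
    return [a ^ b for a, b in zip(vecmat(m, G_hat), e)]

def decode_hamming(r):                             # DecG for the [7,4,3] code
    s = [sum(h * x for h, x in zip(row, r)) % 2 for row in H]
    if any(s):
        i = [[row[j] for row in H] for j in range(7)].index(s)
        r = r[:]
        r[i] ^= 1
    return r

def decrypt(c):
    c1 = decode_hamming(unpermute(c, P))           # undo P, then remove e'
    mS = [c1[3], c1[2], c1[1], c1[0]]              # read mS off this particular G
    return vecmat(mS, S)                           # multiply by S^-1 (= S here)

m = [1, 0, 1, 1]
e = [0, 0, 0, 0, 1, 0, 0]
print(decrypt(encrypt(m, e)) == m)  # True
```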
37/45
Of course, the adversary should not be able to "guess" the code C that was used (or the S or P). There are too few BCH codes and Reed-Solomon codes for given parameters. That is why McEliece chose the large class of Goppa codes. Their number grows exponentially in the length of the code. In his original proposal (1978): n = 1024, t = 50, and k ≈ 524. Since 2008 these parameters are no longer safe. Dan Bernstein, Tanja Lange, and Christiane Peters, Attacking and Defending the McEliece Cryptosystem, in: Johannes Buchmann and Jintai Ding (eds.), PQCrypto 2008, Springer-Verlag, Berlin Heidelberg, LNCS-5299, pp. 31-46, 2008.
38/45
V Burst-correcting array codes

Definition: An (n1, n2)-array code C consists of all n1 × n2 {0, 1}-arrays C whose row and column sums are all congruent to zero modulo 2.
[Diagram: an n1 × n2 array; each of the n1 rows and each of the n2 columns must have even parity.]

It follows directly from this definition that an (n1, n2) array code C is a linear code with length n1 × n2 and dimension (n1 − 1)(n2 − 1).
39/45
Example: n1 = 5, n2 = 8.
0 1 0 1 1 1 0 0
1 1 1 1 0 1 1 0
1 0 1 0 0 0 1 1
0 0 0 1 0 1 1 1
0 0 0 1 1 1 1 0
is a “codeword”. This code has length 5 × 8 = 40 and dimension 4 × 7 = 28. Any fixed read-out of these 40 coordinates is fine.
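Membership in the code is trivial to test (a sketch, using the array above):

```python
# Test the defining property: all row and column sums even.
C = [[0,1,0,1,1,1,0,0],
     [1,1,1,1,0,1,1,0],
     [1,0,1,0,0,0,1,1],
     [0,0,0,1,0,1,1,1],
     [0,0,0,1,1,1,1,0]]

def is_codeword(A):
    rows_even = all(sum(row) % 2 == 0 for row in A)
    cols_even = all(sum(col) % 2 == 0 for col in zip(*A))
    return rows_even and cols_even

print(is_codeword(C))  # True
```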
40/45
Let R be a received word. The horizontal syndromes h1, . . . , hn1 of R are its row sums modulo 2, and the vertical syndromes v1, . . . , vn2 are its column sums modulo 2. Decoding a single error in this code is extremely simple.
41/45
Example continued: Look at the received word:
1 1 0 0 0 1 0 1 | 0
0 1 0 0 1 0 1 0 | 1
1 0 1 0 1 1 0 0 | 0
0 0 1 1 0 1 1 0 | 0
0 0 1 1 0 1 0 1 | 0
----------------
0 0 1 0 0 0 0 0

(the horizontal syndromes are shown on the right, the vertical syndromes below)
It is clear where the error occurred. So, decoding a single error is easy (but not very impressive). The actual minimum distance of this code is 4. How about decoding bursts?
42/45
For burst-correction the particular read-out of the array is important. We follow diagonals, one after another. Example: n1 = 5, n2 = 6, so n = 30.

 0  5 10 15 20 25
26  1  6 11 16 21
22 27  2  7 12 17
18 23 28  3  8 13
14 19 24 29  4  9
Without loss of generality we shall assume that n2 ≥ n1.
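The read-out can be generated programmatically (a sketch; the indexing convention, coordinate k = 5d + i placed at row i and column (d + i) mod 6, is inferred from the numbering on the slide):

```python
# Diagonal read-out for n1 = 5, n2 = 6: follow diagonals one after another.
n1, n2 = 5, 6
A = [[None] * n2 for _ in range(n1)]
for k in range(n1 * n2):
    d, i = divmod(k, n1)        # diagonal d, position i on that diagonal
    A[i][(d + i) % n2] = k
for row in A:
    print(row)                  # first row: [0, 5, 10, 15, 20, 25]
```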
43/45
It is not so difficult to see that C cannot correct all bursts of length up to n1. Indeed, in our example, the two bursts of length 5 indicated below (and many more) have the same syndrome.
[Two copies of the 5 × 6 diagonal numbering, each with a burst of length 5 and burst-pattern (1, 0, 0, 0, 1) highlighted; the positions of the ones were indicated in color on the original slides and the coloring is lost in this transcript.]
44/45
Let us now see when C can correct all bursts of length ≤ n1 − 1. With a little bit of work one can check that for n2 < 2n1 − 3 there are always two different weight-two bursts of length ≤ n1 − 1 with the same syndrome.

[The slide shows the 5 × 6 diagonal numbering with two such weight-two bursts overlaid in red resp. blue; the overlay is lost in this transcript.]
45/45
Theorem: Let C be the n1 × n2 array code, n2 ≥ n1, with +1-diagonal read-out. Then C corrects all bursts of length at most n1 − 1 if and only if n2 ≥ 2n1 − 3.

Proof by example: n1 = 11, n2 = 19.

Mario Blaum, Paddy Farrell, and Henk van Tilborg, A class of burst error-correcting array codes, IEEE Trans. Information Theory IT-32, 1986, pp. 836-839.