06-20008 Cryptography
The University of Birmingham, School of Computer Science
Autumn Semester 2012
Eike Ritter
22 November, 2012

Handout 9

Summary of this handout: RSA, Generating Prime Numbers, Arithmetic Modulo a Composite

IV.4 RSA

RSA was the first public key cipher, invented by Ronald Rivest, Adi Shamir and Leonard Adleman and published in 1978, and it is probably the most widely used public key cipher today. Its security is based on the difficulty of factoring large integers, a problem that has been studied for more than 2000 years.

RSA has properties similar to the Diffie-Hellman key exchange, since it is also based on discrete exponentiation. However, instead of using a one-way function as Diffie-Hellman does ($g^x \bmod p$ can be easily computed but not inverted), it uses a trap-door one-way function: given the public information $e$ and $n$, it is easy to compute $M^e \bmod n$ for a message $M$. This is still infeasible to invert, but given the factorisation of $n$ as additional information, it is easy to invert the function. Thus the factorisation of $n$ is the trapdoor, and the function itself is often called the RSA trapdoor function.

65. Key Generation

As usual the RSA cipher consists of three algorithms (G, E, D) for key generation, encryption and decryption. The key generator G works as follows:

  • Choose two distinct large random prime numbers $p$ and $q$.
  • Compute $n = p \cdot q$.
  • Compute $\varphi(n) = (p - 1) \cdot (q - 1)$.
  • Choose an integer $e > 1$ such that $\gcd(e, \varphi(n)) = 1$.
  • Compute a $d$ such that $d \cdot e \equiv 1 \pmod{\varphi(n)}$.
  • Publish the public key $K = (e, n)$.
  • Retain the private key $K = d$.
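The steps of the key generator G can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `is_prime_trial` and `generate_keys` and the prime range are made up for this example, the primes are far too small to be secure, and trial division stands in for a proper primality test.

```python
import math
import random

def is_prime_trial(n: int) -> bool:
    """Trial division; adequate only for the tiny primes used in this sketch."""
    if n < 2:
        return False
    for f in range(2, math.isqrt(n) + 1):
        if n % f == 0:
            return False
    return True

def generate_keys(lo: int = 100, hi: int = 1000):
    """Toy key generator G: returns ((e, n), d) following the steps above."""
    primes = [x for x in range(lo, hi) if is_prime_trial(x)]
    p, q = random.sample(primes, 2)        # two distinct random primes
    n = p * q
    phi = (p - 1) * (q - 1)
    while True:
        e = random.randrange(2, phi)       # candidate exponent with e > 1
        if math.gcd(e, phi) == 1:
            break
    d = pow(e, -1, phi)                    # d * e ≡ 1 (mod phi); needs Python 3.8+
    return (e, n), d

public_key, private_key = generate_keys()
```

Here `pow(e, -1, phi)` computes the modular inverse that the handout obtains with the extended Euclidean algorithm.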
66. Encryption

The encryption algorithm E takes the public key $K = (e, n)$ to encrypt the message $M$. This time we assume that $M$ is a number with $M < n$. If the actual message is larger than $n$, it is split into blocks of the right size and each block is encrypted separately. In addition we can use some padding scheme. The algorithm E is a simple exponentiation of the message $M$ to get the ciphertext $C$:

$M^e \equiv C \pmod{n}$.

67. Decryption

Given the ciphertext $C$, Alice can now use her secret key $K = d$ to recover the original message $M$ by computing $C^d \equiv M \pmod{n}$.
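As a minimal sketch, encryption and decryption are each a single modular exponentiation. The function names `encrypt` and `decrypt` are chosen for this illustration; the numbers are those of the toy key used in the worked example of this handout ($n = 77$, $e = 43$, $d = 7$), and no padding is applied.

```python
def encrypt(message: int, public_key) -> int:
    """E: compute C = M^e mod n for a message M with 0 <= M < n."""
    e, n = public_key
    if not 0 <= message < n:
        raise ValueError("message must satisfy 0 <= M < n")
    return pow(message, e, n)

def decrypt(ciphertext: int, d: int, n: int) -> int:
    """D: compute M = C^d mod n using the private exponent d."""
    return pow(ciphertext, d, n)

c = encrypt(14, (43, 77))
m = decrypt(c, 7, 77)
print(c, m)   # 49 14
```

Python's three-argument `pow` performs the exponentiation modulo `n` without ever building the huge intermediate number $M^e$.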

68. Correctness

The obvious next question is, of course: why does that work? Let's have a look. First we note that $C^d \equiv (M^e)^d \equiv M^{ed} \pmod{n}$. We now want to show that

$M^{ed} \equiv M \pmod{n}$.

We know that $ed \equiv 1 \pmod{(p-1)(q-1)}$. We therefore have $ed - k(p-1)(q-1) = 1$ for some $k \in \mathbb{Z}$ and hence get

$M^{ed} = M^{1 + k(p-1)(q-1)} = M \cdot M^{k(p-1)(q-1)} = M \cdot (M^{(p-1)(q-1)})^k \equiv M \pmod{n}$,

where the last step follows

  1. either from Euler's theorem, i.e., $M^{\varphi(n)} = M^{(p-1)(q-1)} \equiv 1 \pmod{n}$ for $M, n \in \mathbb{Z}$, when $\gcd(M, n) = 1$,
  2. or from the Chinese Remainder Theorem, when $0 \le M < n = pq$ and $\gcd(M, n) \ne 1$. In this case $M$ can only be a multiple of $p$ or of $q$ (the case $M = 0$ is trivial). Suppose it is a non-zero multiple of $q$. Then we have $\gcd(M, p) = 1$ and with Euler's theorem we know that $M^{p-1} \equiv 1 \pmod{p}$, as $\varphi(p) = p - 1$; thus $M \cdot M^{k(p-1)(q-1)} \equiv M \pmod{p}$. Since $M$ is a multiple of $q$ we have $M \equiv 0 \pmod{q}$ and also $M \cdot M^{k(p-1)(q-1)} \equiv 0 \equiv M \pmod{q}$. Therefore, by the Chinese Remainder Theorem, we get $M \cdot M^{k(p-1)(q-1)} \equiv M \pmod{n}$. Similar reasoning holds if $M$ is a multiple of $p$.

And what does all this have to do with factorisation? Only if we know $\varphi(n)$ can we efficiently compute $d$ and thus decrypt the message. However, Eve could only learn $\varphi(n)$ if she was able to factorise $n$ into $p$ and $q$. For large enough primes this is, however, infeasible. Nevertheless some precautions need to be taken in the choice of the RSA parameters, as we will see later.

Example: Here is a very simple example of RSA encryption. The key generation G consists of the steps:

  • We choose $p = 7$ and $q = 11$.
  • We compute $n = p \cdot q = 77$.
  • And also $\varphi(n) = (p - 1)(q - 1) = 6 \cdot 10 = 60$.
  • We now need an exponent $e$ such that $\gcd(e, 60) = 1$. We pick $e = 43$, for which it is easy to see that $\gcd(43, 60) = 1$, as 43 is a prime number that does not divide 60.
  • To obtain the secret key we have to compute $d$ using the extended Euclidean algorithm. This results in $d = 7$, since $43 \cdot 7 = 301 \equiv 1 \pmod{60}$.
  • We now publish the public key $K = (43, 77)$.
  • We retain the secret key $K = 7$.

Suppose we now want to transmit the message $M = 14$. We encrypt it by computing $M^e \equiv C \pmod{n}$ as $14^{43} \equiv 49 \pmod{77}$. To decrypt the ciphertext $C = 49$ we compute $C^d \equiv M \pmod{n}$ as $49^7 \equiv 14 \pmod{77}$.

Note that not all elements of $\mathbb{Z}_{77}$ are invertible, and therefore we cannot always generate subgroups. However, we can still use counting for computing the discrete exponentiations. For example,

$\langle 14 \rangle = \{14^1 = 14,\ 14^2 = 42,\ 14^3 = 49,\ 14^4 = 70,\ 14^5 = 56,\ 14^6 = 14, \ldots\} = \{14, 42, 49, 70, 56\}$,

so the powers of 14 repeat with period 5 and $14^{43} = 14^{5 \cdot 8 + 3} \equiv 14^3 \equiv 49 \pmod{77}$.
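The correctness argument can be checked numerically on the toy key. The sketch below (variable names are chosen for this illustration) confirms that $M^{ed} \equiv M \pmod{77}$ holds for every $M \in \mathbb{Z}_{77}$, including the 17 values that share a factor with $n$ and are therefore covered by the Chinese Remainder Theorem case rather than by Euler's theorem.

```python
from math import gcd

p, q, e, d = 7, 11, 43, 7
n = p * q
phi = (p - 1) * (q - 1)

# d is the inverse of e modulo phi(n)
assert (e * d) % phi == 1

# Decryption undoes encryption for every message in Z_n
round_trip_ok = all(pow(pow(M, e, n), d, n) == M for M in range(n))

# Messages with gcd(M, n) != 1: zero and the multiples of p or q
non_coprime = [M for M in range(n) if gcd(M, n) != 1]

print(round_trip_ok, len(non_coprime))   # True 17
```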

IV.4.1 Attacks on RSA

There are many attacks on RSA, most of them challenging its structure with sophisticated mathematical tools. Discussing these attacks would go beyond the scope of this lecture. However, we can easily demonstrate that RSA as defined above is vulnerable to a chosen-ciphertext attack, i.e., an attack in which Eve can obtain decrypted plaintexts for ciphertexts of her choice.

Suppose RSA is used as defined above, that is, messages are encrypted with $E_K(M) = M^e \bmod n$ and decrypted with $D_K(C) = C^d \bmod n$. Suppose also that Eve has intercepted a ciphertext $C$ that she wants to decrypt. She then mounts a chosen-ciphertext attack as follows:

  • Choose an arbitrary invertible $C_1 \in \mathbb{Z}_n$ and ask for its decryption, yielding a message $M_1$.
  • Let $C_2 := C \cdot C_1^{-1}$ and ask for its decryption, yielding a message $M_2$.
  • Compute $M_1 \cdot M_2 = C_1^d \cdot C_2^d = C_1^d \cdot C^d \cdot C_1^{-d} = 1 \cdot C^d = M^{ed} = M$.

Eve can do even better and break the message with only one chosen ciphertext. Again we assume she has intercepted the ciphertext $C$.

  • Choose an arbitrary invertible $M_1 \in \mathbb{Z}_n$ and let $C_1 := M_1^e \bmod n$.
  • Let $C_2 := C \cdot C_1^{-1}$ and ask for its decryption, yielding a message $M_2$.
  • Compute $M_1 \cdot M_2 = M_1 \cdot C_2^d = M_1 \cdot C^d \cdot C_1^{-d} = M_1 \cdot C^d \cdot M_1^{-1} = M_1 \cdot (M^e)^d \cdot M_1^{-1} = M^{ed} = M$.

All computations are carried out modulo $n$.

IV.4.2 Using RSA in Practice

While brute force attacks on RSA are still computationally infeasible, many attacks become possible if only a tiny bit of information is leaked or if the RSA parameters have not been chosen carefully. Here are a number of criteria that should be met in order to operate RSA securely:

  • Choose $e$ such that both $e$ and $d$ are large numbers.
  • Never leak even a small number of bits of $p$, $q$, and $d$.
  • Never encrypt small messages (corresponding to small numbers in $\mathbb{Z}_n$), as they are easy to attack.

But despite all these precautions the main problem with RSA is that it has too much mathematical structure and is therefore vulnerable. Recall that for symmetric ciphers we always required a cipher to act as randomly as possible, i.e., there should be no obvious connection between the plaintext and the ciphertext, and small changes in the plaintext should lead to big changes in the ciphertext. Ideally, the same plaintext should also be enciphered differently at different times (for block ciphers we have achieved this, for example, with different modes of operation). For RSA and most public key ciphers this is, however, not the case.
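The mathematical structure in question is what makes the one-query chosen-ciphertext attack of Section IV.4.1 work. The following sketch replays it on the toy key $n = 77$, $e = 43$, $d = 7$. The function `decryption_oracle` is a hypothetical stand-in for Alice decrypting any ciphertext other than the intercepted one, and the choice $M_1 = 2$ is arbitrary.

```python
from math import gcd

p, q, e, d = 7, 11, 43, 7
n = p * q

def decryption_oracle(c: int) -> int:
    """Hypothetical oracle: Alice decrypts a ciphertext chosen by Eve."""
    return pow(c, d, n)

secret_message = 14
C = pow(secret_message, e, n)        # the ciphertext Eve intercepted

M1 = 2                               # arbitrary invertible element of Z_n
assert gcd(M1, n) == 1
C1 = pow(M1, e, n)                   # Eve can compute this from the public key
C2 = (C * pow(C1, -1, n)) % n        # blinded ciphertext, differs from C
assert C2 != C

M2 = decryption_oracle(C2)           # the single chosen-ciphertext query
recovered = (M1 * M2) % n            # M1 * M2 = M (mod n)

print(recovered)   # 14
```

Eve never submits $C$ itself, yet recovers $M$, because textbook RSA satisfies $D_K(C \cdot C') = D_K(C) \cdot D_K(C')$ modulo $n$.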

69. Encryption Schemes

One way to overcome the mathematical regularity of many public key ciphers is by using so-called encryption schemes. They aim at

  1. adding randomness to the encryption, and
  2. preventing attacks of the form presented above.

There are several general encryption schemes for public key ciphers as well as specialist schemes for RSA, known as PKCS standards. They pre-process the messages before applying the RSA function in order to achieve the above goals. This can be done, for example, by adding random bit strings to a message or applying hash functions, in order to vary the resulting ciphertext.

70. Combining Asymmetric and Symmetric Ciphers

Another problem with RSA and other public key ciphers is that they are computationally very expensive. For efficiency and also security reasons they are therefore often used in combination with symmetric ciphers, e.g., block or stream ciphers, which can be computed much more efficiently. For example, Alice can use RSA to communicate with Bob using a symmetric cipher without having to exchange keys first.

Here is a simple scenario: Let $K$ be a randomly generated key for a symmetric cipher (E, D). Alice can then encrypt the message $M$ with key $K$ in the chosen symmetric cipher, use RSA to encrypt the key $K$, and send it as the prefix of the ciphertext to Bob:

$E_{RSA}(K) \,\|\, E_K(M)$
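The scenario can be sketched as follows, under several loudly stated assumptions. The symmetric cipher is a toy XOR keystream built from SHA-256, used here only as a stand-in for a real block or stream cipher. The RSA modulus is the product of the two Mersenne primes $2^{61} - 1$ and $2^{89} - 1$, chosen purely because they are known primes, not because they are secure. The names `keystream_xor`, `hybrid_encrypt` and `hybrid_decrypt` are made up for this illustration.

```python
import hashlib
import secrets

# Toy RSA key (insecure: small, specially structured primes)
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # needs Python 3.8+

def keystream_xor(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 based keystream.
    Encryption and decryption are the same operation."""
    key_bytes = key.to_bytes(16, "big")
    out = bytearray()
    for start in range(0, len(data), 32):
        pad = hashlib.sha256(key_bytes + start.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[start:start + 32], pad))
    return bytes(out)

def hybrid_encrypt(message: bytes):
    """Return (E_RSA(K), E_K(M)) for a fresh random 128-bit key K < N."""
    k = secrets.randbits(128)
    return pow(k, E, N), keystream_xor(k, message)

def hybrid_decrypt(wrapped_key: int, body: bytes) -> bytes:
    """Recover K with the RSA private exponent, then decrypt the body."""
    k = pow(wrapped_key, D, N)
    return keystream_xor(k, body)

wrapped, body = hybrid_encrypt(b"meet me at noon")
print(hybrid_decrypt(wrapped, body))   # b'meet me at noon'
```

Only the short key passes through the expensive RSA operation; the message itself, however long, is handled by the cheap symmetric cipher.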

IV.5 Computing Large Prime Numbers

One necessary prerequisite for all the public key ciphers we have seen so far is the ability to generate large prime numbers. Unfortunately, this is not a straightforward task. Due to the nature of prime numbers, it is not possible to use a composite approach, i.e., starting with small primes and deriving larger prime numbers from them. The opposite approach is to pick a number and then test whether or not it is prime. Done naively, by searching for a factor of the picked number, this is as hard as factoring numbers in general and therefore computationally infeasible for large numbers. Moreover, there are not many number theoretic guidelines on which numbers should be picked, apart from the fact that the number should be odd.

This leaves probabilistic test algorithms as the practical approach. A probabilistic test algorithm picks a number and tests whether or not it is prime to a certain extent only, until it is very likely that the picked number is actually a prime number. However, there always remains a certain small probability that the generated number is not a prime number. We will have a look at two such algorithms.

IV.5.1 Fermat's Test

The algorithm is based on Fermat's little theorem. Recall that this states that for a given prime $p$ and $1 \le a < p$ we have $a^{p-1} \equiv 1 \pmod{p}$. If we want to test whether or not a given $n$ is prime, we pick random values for $a$ between 1 and $n - 1$ and check the congruence $a^{n-1} \equiv 1 \pmod{n}$. If it does not hold, then $n$ is composite. However, if it holds, $n$ is a candidate for a prime number. And the more values of $a$ it holds for, the higher the probability gets that $n$ is indeed a prime. Here is the algorithm to test $n$ for primality:

    for i := 0 to k − 1 do
        Pick a ∈ {2, …, n − 1}
        if a^(n−1) ≢ 1 (mod n) then
            return ("n is composite")
    end
    return ("n is probably prime")

Obviously the larger we pick $k$ the likelier it is that $n$ is a prime. In fact, unless $n$ is one of the Carmichael numbers discussed under the Miller-Rabin test, the probability that a composite $n$ passes all $k$ tests is at most $(1/2)^k$.
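A direct Python transcription of the pseudo-code might look as follows. The function name `fermat_test` and the default `k = 20` are choices made for this sketch; the handling of $n < 4$ is added so that the random range is never empty.

```python
import random

def fermat_test(n: int, k: int = 20) -> bool:
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 4:
        return n in (2, 3)
    for _ in range(k):
        a = random.randrange(2, n)       # a in {2, ..., n - 1}
        if pow(a, n - 1, n) != 1:        # Fermat's congruence fails
            return False
    return True

print(fermat_test(97))          # True: a prime passes for every base a
print(pow(2, 560, 561) == 1)    # True, although 561 = 3 * 11 * 17 is composite
```

The last line shows a base for which the composite number 561 satisfies the congruence, which is the weakness the Miller-Rabin test addresses.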

While Fermat's primality test is not perfect, it is nevertheless used in practice, for instance in the Pretty Good Privacy (PGP) implementations.

IV.5.2 Miller-Rabin Test

The main problem with Fermat's primality test is that there are numbers $n$, so-called Carmichael numbers, for which we have $a^{n-1} \equiv 1 \pmod{n}$ for all $1 \le a < n$ with $\gcd(a, n) = 1$, but $n$ is nevertheless composite. This problem is addressed by the Miller-Rabin test. Apart from some more intricate number theory (which we will not discuss in detail here) it relies on the fact that we can easily rewrite $n - 1$ as $2^r \cdot s$, where $s$ is an odd number. (For example, for $n = 17$ we have $n - 1 = 16 = 2^4 \cdot 1$, and for $n = 19$ we have $n - 1 = 18 = 2^1 \cdot 9$.)

Thus given an odd integer $n$ and $n - 1 = 2^r \cdot s$, the Miller-Rabin algorithm tests for the primality of $n$ as follows:

    for i := 0 to k − 1 do
        Pick a ∈ {1, …, n − 1}
        if a^s ≢ 1 (mod n) and a^(2^j·s) ≢ −1 (mod n) for all j ∈ {0, …, r − 1} then
            return ("n is composite")
        end
    end
    return ("n is probably prime")

The Miller-Rabin algorithm gives us a much higher probability of not mistaking a Carmichael number for a prime. In addition it also computes a "highly probable prime" (i.e., a number for which we can be fairly sure that it is indeed a prime) more quickly, since the probability that a composite $n$ passes all $k$ tests is at most $(1/4)^k$. In practice it is sufficient to test with a value $k = 20$. Observe that the bound for $k$ is independent of the actual size of $n$.
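The following is a sketch of the Miller-Rabin test in Python, assuming the standard formulation in which $n$ is declared composite when $a^s \not\equiv 1$ and none of $a^{s}, a^{2s}, \ldots, a^{2^{r-1}s}$ is congruent to $-1$ modulo $n$. The names `decompose` and `miller_rabin` are chosen for this illustration, and the base is drawn from $\{2, \ldots, n - 2\}$ because the bases 1 and $n - 1$ never reveal anything.

```python
import random

def decompose(n: int):
    """Write n - 1 as 2**r * s with s odd and return (r, s)."""
    r, s = 0, n - 1
    while s % 2 == 0:
        r += 1
        s //= 2
    return r, s

def miller_rabin(n: int, k: int = 20) -> bool:
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    r, s = decompose(n)
    for _ in range(k):
        a = random.randrange(2, n - 1)   # a in {2, ..., n - 2}
        x = pow(a, s, n)
        if x == 1 or x == n - 1:
            continue                     # this base does not expose n
        for _ in range(r - 1):
            x = pow(x, 2, n)             # x runs through a^(2^j * s)
            if x == n - 1:
                break
        else:
            return False                 # no power was -1: n is composite
    return True

print(decompose(17), decompose(19))   # (4, 1) (1, 9)
print(miller_rabin(97))               # True
```

To generate a large probable prime one would repeatedly draw random odd numbers of the desired size and keep the first one for which `miller_rabin` returns `True`.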


Mathematics 6 – Arithmetic Modulo a Composite

So far our algorithms have worked in $\mathbb{Z}_p$ with $p$ prime, for which we knew that it was a field and therefore, in particular, that $(\mathbb{Z}_p^*, \cdot)$ is a group, where $\mathbb{Z}_p^*$ denotes the non-zero elements of $\mathbb{Z}_p$. In the following we are moving into $\mathbb{Z}_n$ where $n$ is composite, i.e., not a prime number. We will be especially interested in the case where $n = p \cdot q$, with $p, q$ prime numbers.

We recall that for $n$ we have $\mathbb{Z}_n = \{0, 1, 2, \ldots, n - 1\}$ and that the arithmetic operations (i.e. addition, subtraction, multiplication, and exponentiation) are defined modulo $n$ as usual. But we also recall that while $(\mathbb{Z}_n, +, \cdot)$ is a ring, it is in general not a field and therefore the multiplicative part $(\mathbb{Z}_n^*, \cdot)$ is not a group. Nevertheless, $\mathbb{Z}_n$ has some very interesting properties that are exploited for the RSA public key cipher, which we shall explore in this handout.

Multiplicative Inverses in Zn

We already know that $(\mathbb{Z}_n^*, \cdot)$ is not a group and, therefore, that not all elements of $\mathbb{Z}_n^*$ have inverses with respect to $\cdot$. Nevertheless we can find some elements in $\mathbb{Z}_n^*$ that have multiplicative inverses, that is, elements $x \in \mathbb{Z}_n$ for which an element $y \in \mathbb{Z}_n$ exists such that $x \cdot y \equiv 1 \pmod{n}$. Provided that the inverse exists, we write $x^{-1}$ instead of $y$.

Example: Let $n = 2 \cdot 3 = 6$. We can easily check that $1 \in \mathbb{Z}_6$ and $5 \in \mathbb{Z}_6$ have inverses, since $1 \cdot 1 \equiv 1 \pmod{6}$ and $5 \cdot 5 \equiv 1 \pmod{6}$, and that no other element in $\mathbb{Z}_6$ has an inverse.

It is trivial to see that 1 has an inverse in every $\mathbb{Z}_n$. However, our next question is: can we somehow characterise the other elements that have inverses in $\mathbb{Z}_n$? In order to do that we need to define the concept of greatest common divisor.

Definition 14 (Greatest Common Divisor) Let $a, b \in \mathbb{Z}$ with $a \ne 0$ and $b \ne 0$. The greatest common divisor of $a$ and $b$, written $\gcd(a, b)$, is the largest positive integer that divides both numbers without remainder.

Example: Let $a = 8$ and $b = 12$, then we get $\gcd(a, b) = 4$.

Having the gcd available, we can now characterise the elements in $\mathbb{Z}_n$ that are invertible with respect to multiplication:

Theorem 15 (Inverse) Let $x \in \mathbb{Z}_n$. Then $x$ has an inverse in $\mathbb{Z}_n$ if and only if $\gcd(x, n) = 1$.

In order to be able to decide whether or not an element in $\mathbb{Z}_n$ has an inverse, we need a way to compute the gcd. This is done with the Euclidean algorithm, which has been around for more than two millennia.
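Theorem 15 can be checked by brute force for small moduli. In the sketch below, `invertible_elements` and `coprime_elements` are names chosen for this illustration; the first searches for an inverse directly, the second applies the gcd criterion.

```python
from math import gcd

def invertible_elements(n: int):
    """Elements x of Z_n for which some y in Z_n has x * y ≡ 1 (mod n)."""
    return [x for x in range(n) if any((x * y) % n == 1 for y in range(n))]

def coprime_elements(n: int):
    """Elements x of Z_n with gcd(x, n) = 1."""
    return [x for x in range(n) if gcd(x, n) == 1]

print(invertible_elements(6))   # [1, 5]

# The two characterisations agree for every small modulus tried here
agree = all(invertible_elements(n) == coprime_elements(n) for n in range(2, 80))
print(agree)                    # True
```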

Euclidean Algorithm

Given two positive integers $a, b$ we can compute $\gcd(a, b)$ by means of one of the two algorithms below. The first is the classical Euclidean algorithm, whereas the second is a more efficient variant that uses modular arithmetic:

    while a ≠ b do
        if a > b then a := a − b
        else b := b − a
        end
    end
    return a

    while b ≠ 0 do
        c := b
        b := a mod b
        a := c
    end
    return a

Example: Let a = 90 and b = 126. With the classical algorithm we compute gcd(a, b) as follows:

    a               b
    90              126
    90              126 − 90 = 36
    90 − 36 = 54    36
    54 − 36 = 18    36
    18              36 − 18 = 18
    gcd(90, 126) = 18

With the modular variant we get:

    a      b
    90     126
    126    90 mod 126 = 90
    90     126 mod 90 = 36
    36     90 mod 36 = 18
    18     36 mod 18 = 0
    gcd(90, 126) = 18
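Both variants translate almost line by line into Python. The function names `gcd_subtract` and `gcd_mod` are chosen for this sketch; the subtraction variant assumes positive inputs, since it would not terminate if one argument were zero.

```python
def gcd_subtract(a: int, b: int) -> int:
    """Classical Euclidean algorithm by repeated subtraction (a, b > 0)."""
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
    return a

def gcd_mod(a: int, b: int) -> int:
    """Euclidean algorithm using the remainder operation."""
    while b != 0:
        a, b = b, a % b
    return a

print(gcd_subtract(90, 126), gcd_mod(90, 126))   # 18 18
```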

Extended Euclidean Algorithm

Using the Euclidean algorithm we can determine if $a$ has an inverse modulo $n$ by testing whether $\gcd(a, n) = 1$. But we still do not know how to determine the inverse when it exists. To do this we use a variant of Euclid's gcd algorithm, called the extended Euclidean algorithm. It relies on the fact that for every two integers $a$ and $b$ with $\gcd(a, b) = r$ there exist $x, y \in \mathbb{Z}$ such that $x \cdot a + y \cdot b = r$.

Example: For $a = 8$ and $b = 12$ we had $\gcd(a, b) = 4$. With $x = -1$ and $y = 1$ we get $-1 \cdot 8 + 1 \cdot 12 = 4$.

Here is a schematic overview of the algorithm:

  1. We start with integers $a$ and $b$, together with two corresponding $(x, y)$ pairs $(x_a, y_a)$ and $(x_b, y_b)$, that are initialised as $(x_a, y_a) = (1, 0)$ and $(x_b, y_b) = (0, 1)$.
  2. Divide the larger of the two numbers $a$ and $b$ by the smaller using integer division. Call this quotient $q$.
  3. Subtract $q$ times the smaller from the larger number.
  4. Subtract $q$ times the vector corresponding to the smaller number from the vector corresponding to the larger number.
  5. Repeat steps 2 through 4 until one of the numbers equals zero. The vector that corresponds to the number that is not zero contains the two desired numbers $x, y$.

Here is the corresponding algorithm in pseudo-code for given $a, b \in \mathbb{Z}$. Observe that "div" is integer division and that subtraction and scalar multiplication on (xa, ya) and (xb, yb) are to be considered component-wise.

    (xa, ya) := (1, 0)
    (xb, yb) := (0, 1)
    while a ≠ 0 and b ≠ 0 do
        if a > b then
            q := a div b
            a := a − q · b
            (xa, ya) := (xa, ya) − q · (xb, yb)
        else
            q := b div a
            b := b − q · a
            (xb, yb) := (xb, yb) − q · (xa, ya)
        end
    end
    if a = 0 then return b, xb, yb
    else return a, xa, ya
    end

Example: And finally, here is an example of the algorithm applied to a = 53 and b = 30:

    a                  b                  (xa, ya)                               (xb, yb)
    53                 30                 (1, 0)                                 (0, 1)
    53 − 1 · 30 = 23   30                 (1, 0) − 1 · (0, 1) = (1, −1)          (0, 1)
    23                 30 − 1 · 23 = 7    (1, −1)                                (0, 1) − 1 · (1, −1) = (−1, 2)
    23 − 3 · 7 = 2     7                  (1, −1) − 3 · (−1, 2) = (4, −7)        (−1, 2)
    2                  7 − 3 · 2 = 1      (4, −7)                                (−1, 2) − 3 · (4, −7) = (−13, 23)
    2 − 2 · 1 = 0      1                  (4, −7) − 2 · (−13, 23) = (30, −53)    (−13, 23)

The final result is therefore $x_b \cdot a + y_b \cdot b = r$ for the original values of $a$ and $b$, which is $-13 \cdot 53 + 23 \cdot 30 = -689 + 690 = 1$.

We can now solve our original problem of determining the inverse of $a$ modulo $n$, when such an inverse exists. We first apply the extended Euclidean algorithm to $a$ and $n$ so as to compute $r, x, y$ such that $r = \gcd(a, n) = xa + yn$. Since $r = xa + yn \equiv xa \pmod{n}$, we can solve the equation $ax \equiv 1 \pmod{n}$ precisely when $r = 1$, and in that case $x = a^{-1}$. Observe that in many cases the computed $x$ will actually be a negative number, which we might need to convert into the corresponding positive number modulo $n$.

Example: From the previous example of the extended Euclidean algorithm we can see that 23 is the inverse of 30 modulo 53, i.e., we have $30 \cdot 23 \equiv 1 \pmod{53}$.
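The pseudo-code and the inverse computation can be sketched in Python as follows. The function names `extended_gcd` and `mod_inverse` are chosen for this illustration, and non-negative inputs are assumed.

```python
def extended_gcd(a: int, b: int):
    """Return (r, x, y) with r = gcd(a, b) = x * a + y * b."""
    xa, ya = 1, 0        # vector belonging to the current value of a
    xb, yb = 0, 1        # vector belonging to the current value of b
    while a != 0 and b != 0:
        if a > b:
            q = a // b
            a -= q * b
            xa, ya = xa - q * xb, ya - q * yb
        else:
            q = b // a
            b -= q * a
            xb, yb = xb - q * xa, yb - q * ya
    return (b, xb, yb) if a == 0 else (a, xa, ya)

def mod_inverse(a: int, n: int) -> int:
    """Return the inverse of a modulo n as a number in {0, ..., n - 1}."""
    r, x, _ = extended_gcd(a % n, n)
    if r != 1:
        raise ValueError("a has no inverse modulo n")
    return x % n         # converts a negative x into its positive representative

print(extended_gcd(53, 30))   # (1, -13, 23)
print(mod_inverse(30, 53))    # 23
print(mod_inverse(43, 60))    # 7
```

The last call reproduces the private exponent $d = 7$ of the toy RSA key with $e = 43$ and $\varphi(n) = 60$.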

Euler’s Totient Function

We now know how to check whether or not an element of $\mathbb{Z}_n$ has a multiplicative inverse and also how to compute that inverse. We now investigate how many elements in $\mathbb{Z}_n$ are invertible. The number theoretic tool to count the invertible elements in $\mathbb{Z}_n^*$ is called Euler's totient function and is traditionally denoted by $\varphi(n)$. ($\varphi$ is the small Greek letter "phi".)

The function $\varphi(n)$ turns out to be easily computable, provided that the factorisation of $n$ is known. Given all the prime factors of $n$ and their multiplicity there exists a single formula to compute the value of $\varphi(n)$. But since the general formula is rather complicated we restrict ourselves to the two special cases that are of interest to us:

  1. For a prime $p$ we already know that $\varphi(p) = p - 1$ or, in other words, that $\mathbb{Z}_p^*$ has $p - 1$ elements.
  2. For distinct primes $p, q$ we have $\varphi(pq) = (p - 1)(q - 1)$ or, in other words, $\mathbb{Z}_{pq}^*$ has $(p - 1)(q - 1)$ invertible elements.

We finish this section with the following theorem, important for the correctness of the RSA cipher:

Theorem 16 (Euler) Let $n \in \mathbb{N}$ and $a \in \mathbb{Z}$ with $\gcd(a, n) = 1$. Then we have $a^{\varphi(n)} \equiv 1 \pmod{n}$.
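Both special cases and Euler's theorem can be verified numerically for small values. The sketch below counts coprime elements directly, which is only feasible for small $n$; the function name `phi` is chosen for this illustration.

```python
from math import gcd

def phi(n: int) -> int:
    """Euler's totient function, computed by counting (small n only)."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

print(phi(7), phi(11), phi(77))   # 6 10 60

# Euler's theorem for n = 77: a^phi(n) ≡ 1 (mod n) whenever gcd(a, n) = 1
euler_holds = all(pow(a, phi(77), 77) == 1
                  for a in range(1, 77) if gcd(a, 77) == 1)
print(euler_holds)                # True
```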

Counting Exponents

Recall from Mathematics 5 how we can use generated subgroups to compute discrete exponentiation. But since $\mathbb{Z}_{pq}^*$ is not a multiplicative group, its elements do not necessarily generate subgroups. In particular, all non-invertible elements, i.e. multiples of $p$ and $q$, do not generate subgroups! Nevertheless we can compute discrete exponentiations using a slightly more general counting technique:

Let $n = pq$, where $p$ and $q$ are prime numbers. Then each element $g \in \mathbb{Z}_n^*$ generates a multiplicative subset of $\mathbb{Z}_n^*$ (which is generally not a group!) by taking $S = \{g^1, g^2, \ldots\}$. Since $\mathbb{Z}_n^*$ is finite this set has to be finite and thus the sequence of exponentiations will eventually cycle, i.e., $g^n \in \{g^1, g^2, \ldots, g^{n-1}\}$.

For example take $14 \in \mathbb{Z}_{77}^*$:

$\langle 14 \rangle = \{14^1 = 14,\ 14^2 = 42,\ 14^3 = 49,\ 14^4 = 70,\ 14^5 = 56,\ 14^6 = 14, \ldots\} = \{14, 42, 49, 70, 56\}$

Thus we can compute $14^{43} \equiv 49 \pmod{77}$.

[Observe that in general we have no guarantee that the cycle will "restart" at the first element $g^1$. However, if $n$ is of the form $pq$ this will always be the case. For more general $n$ the "restart" has to be taken into account when counting exponents. For instance, for $2 \in \mathbb{Z}_{60}^*$ we have $\langle 2 \rangle = \{2, 4, 8, 16, 32, 4, \ldots\}$ and thus $2^8 \equiv 16 \pmod{60}$.]
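The counting technique, including the case where the cycle does not restart at $g^1$, can be sketched as follows. The function name `power_by_counting` is chosen for this illustration; it lists the powers until a value repeats and then reduces the exponent using the position and length of the cycle.

```python
def power_by_counting(g: int, exponent: int, n: int) -> int:
    """Compute g**exponent mod n (exponent >= 1) by listing powers of g."""
    first_seen = {}      # value -> smallest exponent that produced it
    powers = []          # powers[i] holds g^(i + 1) mod n
    x = g % n
    while x not in first_seen:
        first_seen[x] = len(powers) + 1
        powers.append(x)
        x = (x * g) % n
    start = first_seen[x]                  # exponent at which the cycle begins
    length = len(powers) + 1 - start       # number of elements in the cycle
    if exponent <= len(powers):
        return powers[exponent - 1]
    return powers[start - 1 + (exponent - start) % length]

print(power_by_counting(14, 43, 77))   # 49
print(power_by_counting(2, 8, 60))     # 16
```

For $14 \in \mathbb{Z}_{77}^*$ the cycle starts at exponent 1 and has length 5; for $2 \in \mathbb{Z}_{60}^*$ it starts at exponent 2 and has length 4.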

Chinese Remainder Theorem

The Chinese Remainder Theorem was first published by the Chinese mathematician Sun Tzu, probably between the third and fifth century AD, but is probably a lot older. It is a statement about simultaneously solving congruence equations with different moduli. Here is the statement:

Let $m, n \in \mathbb{Z}$ with $\gcd(m, n) = 1$. Then for any given $a, b \in \mathbb{Z}$ there exists an $x \in \mathbb{Z}$ such that

$x \equiv a \pmod{m}$ and $x \equiv b \pmod{n}$.

Moreover, all solutions $x$ are congruent modulo $m \cdot n$. Or in other words, the solution $x \in \mathbb{Z}_{mn}$ is unique.

To elaborate the last point: If we have two solutions $x_1, x_2$ such that $x_1 \equiv a \pmod{m}$ and $x_1 \equiv b \pmod{n}$ as well as $x_2 \equiv a \pmod{m}$ and $x_2 \equiv b \pmod{n}$, then we know that $x_1 = x_2 + knm$ for some $k \in \mathbb{Z}$. This is actually quite easy to see: Solutions for the first congruence have to be multiples of $m$ apart, i.e. with $x$ a solution to $x \equiv a \pmod{m}$, so is $x + lm$ for every $l \in \mathbb{Z}$. And, to satisfy the second congruence, they have to be multiples of $n$ apart. Since $m, n$ have no common factor (recall $\gcd(n, m) = 1$) they can only coincide when they are multiples of $mn$ apart.

The Chinese Remainder Theorem can easily be generalised to systems of simultaneous congruences. However, for our purposes it is enough to consider pairs. Particularly useful is the following alternative formulation:

Let $m, n \in \mathbb{Z}$ with $\gcd(m, n) = 1$. Then for every pair $a, b \in \mathbb{Z}$ such that $a \equiv b \pmod{m}$ and $a \equiv b \pmod{n}$ we have $a \equiv b \pmod{mn}$.

Finally, we can also give the following efficient algorithm to compute the solution guaranteed by the Chinese Remainder Theorem, using the extended Euclidean algorithm: Let $n \ne m$ with $\gcd(n, m) = 1$ and let $a, b \in \mathbb{Z}$. Then compute $u, v \in \mathbb{Z}$ with $um + vn = 1$ using the extended Euclidean algorithm and compute $x$ as $umb + vna \equiv x \pmod{nm}$.

Example: Let $m = 4$, $n = 5$, $a = 3$, $b = 4$. A suitable pair is $u = 4$ and $v = -3$, since $4 \cdot 4 + (-3) \cdot 5 = 1$. This yields $4 \cdot 4 \cdot 4 + (-3) \cdot 5 \cdot 3 = 64 - 45 \equiv 19 \pmod{20}$. This means that 19 is a solution of the individual congruence relations, i.e., $19 \equiv 3 \pmod{4}$ and $19 \equiv 4 \pmod{5}$, and so is every integer of the form $19 + 20k$, $k \in \mathbb{Z}$, and only those integers.
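The construction $x \equiv umb + vna \pmod{nm}$ can be sketched in Python as follows. As a simplification, the coefficient $u$ is obtained with the built-in `pow(m, -1, n)` (Python 3.8+) instead of a hand-written extended Euclidean algorithm, and $v$ is then derived from $um + vn = 1$; the function name `crt` is chosen for this illustration.

```python
def crt(a: int, m: int, b: int, n: int) -> int:
    """Return x in Z_{mn} with x ≡ a (mod m) and x ≡ b (mod n), gcd(m, n) = 1."""
    u = pow(m, -1, n)          # u * m ≡ 1 (mod n); raises ValueError if gcd != 1
    v = (1 - u * m) // n       # exact division, so that u * m + v * n = 1
    assert u * m + v * n == 1
    return (u * m * b + v * n * a) % (m * n)

x = crt(3, 4, 4, 5)
print(x, x % 4, x % 5)   # 19 3 4
```

For $m = 4$ and $n = 5$ this yields $u = 4$ and $v = -3$, the same pair as in the example.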

Cryptography Glossary 9

Carmichael Number: A composite number $n$ for which $a^{n-1} \equiv 1 \pmod{n}$ for all $1 \le a < n$ with $\gcd(a, n) = 1$.

Encryption Schemes: Techniques to introduce a degree of randomness into public key ciphers, often by preprocessing the messages.

Fermat's Test: A probabilistic test algorithm to establish the likelihood that a given number is a prime, based on Fermat's little theorem.

Miller-Rabin Test: A probabilistic test algorithm to establish the likelihood that a given number is a prime, based on number theory. It is an improvement over Fermat's test, since it can successfully detect Carmichael numbers as composite.

RSA: The first, and probably most widely used, public key cipher, based on discrete exponentiation modulo a composite number.

RSA Trapdoor Function: RSA's function $M^e \bmod n$, which can only be easily inverted if the factorisation of $n$ is known.