SLIDE 1

Introduction Square Always Parallelization Conclusion

Square Always Exponentiation

Christophe Clavier¹, Benoit Feix¹,², Georges Gagnerot¹,², Mylène Roussellet², Vincent Verneuil²,³

¹ XLIM-Université de Limoges, France

² INSIDE Secure, Aix-en-Provence, France

³ Univ. Bordeaux, IMB, France

Indocrypt 2011 - December 12, 2011

Vincent Verneuil - Square Always Exponentiation 1 / 38

SLIDE 2

Outline

1. Introduction: Motivation, Recalls, Contribution
2. Square Always: Principle, Algorithms
3. Parallelization: Generalities, Algorithms
4. Conclusion

SLIDE 5

Motivation

  • Exponentiation is the core operation of the RSA, DSA, and Diffie-Hellman protocols.
  • It is embedded in constrained devices (smart cards, etc.) with limited resources.
  • In this sensitive context, it is targeted by side-channel attacks.

SLIDE 6

Context

Consider the computation of $m^d \bmod n$ with $d = (d_{k-1} d_{k-2} \ldots d_0)_2$.

  • M: the cost of a modular multiplication.
  • S: the cost of a modular squaring.

Two cases: fast squaring (S/M = 0.8) or not (S/M = 1).

SLIDE 10

Basic Exponentiation

Square-and-Multiply Algorithms

Left-to-right:

$$m^d = m^{d_0} \times \left( m^{d_1} \times \left( \cdots \left( m^{d_{k-1}} \right)^2 \cdots \right)^2 \right)^2$$

Input: m, n, d ∈ N
Output: m^d mod n
a ← 1
for i = k − 1 to 0 do
    a ← a^2 mod n
    if d_i = 1 then
        a ← a × m mod n
return a

Right-to-left:

$$m^d = m^{d_{k-1} 2^{k-1}} \times m^{d_{k-2} 2^{k-2}} \times \cdots \times m^{d_0}$$

Input: m, n, d ∈ N
Output: m^d mod n
a ← 1 ; b ← m
for i = 0 to k − 1 do
    if d_i = 1 then
        a ← a × b mod n
    b ← b^2 mod n
return a
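
The two square-and-multiply variants can be sketched in Python as follows. This is only an illustration: the function names and the `bits_of` helper are assumptions, not part of the slides, and Python integers stand in for the k-bit registers of a real device.

```python
def bits_of(d):
    """Bits of d, least significant first: d_0, d_1, ..., d_{k-1}."""
    return [(d >> i) & 1 for i in range(d.bit_length())]


def left_to_right(m, d, n):
    """Scan the exponent from d_{k-1} down to d_0."""
    a = 1 % n
    for bit in reversed(bits_of(d)):
        a = a * a % n          # squaring at every bit
        if bit:
            a = a * m % n      # multiplication only on 1 bits
    return a


def right_to_left(m, d, n):
    """Scan the exponent from d_0 up to d_{k-1}."""
    a, b = 1 % n, m % n
    for bit in bits_of(d):
        if bit:
            a = a * b % n      # multiplication only on 1 bits
        b = b * b % n          # b holds m^(2^i)
    return a
```

Both functions perform a multiplication only when the scanned bit is 1, which is exactly the irregularity that the following slides exploit.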

SLIDE 11

Side-Channel Threats

When a computation involving a secret runs on an embedded device, side channels (power, EM) may be monitored to search for leakage.

In 1999, Kocher introduced simple and differential side-channel analysis.

SLIDE 12

Simple Side-Channel Analysis on Exponentiation (SPA)

Side-channel leakage: power, EM, etc.

The whole exponent may be recovered from a single trace.

SLIDE 17

Regular Exponentiation

Montgomery ladder

  • Square & multiply: S, M, S, S, M, S, M, S, S, M, ...
  • Square & multiply always: S, M, S, M, S, M, S, M, S, M, S, M, ...
  • Montgomery ladder: S, M, S, M, S, M, S, M, S, M, S, M, ...

Input: m, n, d ∈ N
Output: m^d mod n
R0 ← 1
R1 ← m
for i = k − 1 to 0 do
    R_{1−d_i} ← R0 × R1 mod n
    R_{d_i} ← R_{d_i}^2 mod n
return R0
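
A minimal Python sketch of the Montgomery ladder above, assuming plain Python integers for the registers (the function name is illustrative). Every iteration performs one multiplication followed by one squaring, whatever the value of the bit.

```python
def montgomery_ladder(m, d, n):
    """Compute m^d mod n with one multiplication and one squaring per bit."""
    R = [1 % n, m % n]                      # invariant: R[1] == R[0] * m mod n
    for i in range(d.bit_length() - 1, -1, -1):
        di = (d >> i) & 1
        R[1 - di] = R[0] * R[1] % n         # multiplication
        R[di] = R[di] * R[di] % n           # squaring
    return R[0]
```

Indexing the registers by the bit value avoids a conditional branch on the secret bit, although a real implementation would also need constant-time memory access.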

SLIDE 20

Regular Exponentiation

Atomic Exponentiation "Multiply Always"

  • Square & multiply: S, M, S, S, M, S, M, S, S, M, ...
  • Multiply always: M, M, M, M, M, M, M, M, M, M, ...

Input: m, n, d ∈ N
Output: m^d mod n
R0 ← 1
R1 ← m
i ← k − 1
t ← 0
while i ≥ 0 do
    R0 ← R0 × R_t mod n
    t ← t ⊕ d_i    [⊕ is bitwise XOR]
    i ← i − 1 + t
return R0
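
The atomic loop above can be sketched in Python as follows. The operation counter `ops` is an addition for illustration only; it shows that the loop body runs once per squaring and once per multiplication of the underlying square-and-multiply sequence.

```python
def multiply_always(m, d, n):
    """Atomic 'multiply always' exponentiation; returns (m^d mod n, loop count)."""
    R = [1 % n, m % n]
    i, t, ops = d.bit_length() - 1, 0, 0
    while i >= 0:
        R[0] = R[0] * R[t] % n    # a squaring when t == 0, a multiplication by m when t == 1
        t ^= (d >> i) & 1         # t becomes 1 only right after squaring on a 1 bit
        i = i - 1 + t             # stay on the same bit while a multiplication is pending
        ops += 1
    return R[0], ops
```

With this sketch the number of loop iterations is the bit length of d plus its Hamming weight, which is consistent with the 1.5M per bit average quoted for this algorithm.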

SLIDE 21

Squaring-Multiplication Discrimination Attack

In [Distinguishing Multiplications from Squaring Operations, SAC 2008], Amiel et al. observed that the expected Hamming weight $E_{x,y}(\mathrm{HW}(x \times y))$ takes a different value depending on whether:

  • $x = y$, uniformly distributed in $[0, 2^k - 1]$, or
  • $x$ and $y$ are independent and uniformly distributed in $[0, 2^k - 1]$.
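
The difference in expected Hamming weight can be checked exhaustively for small operand sizes. The sketch below is a simplification under stated assumptions: it uses plain integer products rather than modular results or real power traces, and the function names are illustrative.

```python
from itertools import product


def hamming_weight(v):
    """Number of 1 bits in the non-negative integer v."""
    return bin(v).count("1")


def mean_hw_square(k):
    """Exact mean of HW(x * x) over all k-bit values x."""
    values = range(2 ** k)
    return sum(hamming_weight(x * x) for x in values) / 2 ** k


def mean_hw_product(k):
    """Exact mean of HW(x * y) over all pairs of k-bit values (x, y)."""
    values = range(2 ** k)
    total = sum(hamming_weight(x * y) for x, y in product(values, repeat=2))
    return total / 4 ** k


if __name__ == "__main__":
    k = 8  # small enough for an exhaustive enumeration
    print(mean_hw_square(k), mean_hw_product(k))
```

For k = 2 the two means are 1.0 and 0.875, so the distributions already differ for very small operands.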

SLIDE 22

Squaring-Multiplication Discrimination Attack

Attack: subtract two (averaged) power traces of consecutive atomic multiplications.

Countermeasure: exponent blinding, $d^* \leftarrow d + r\,\varphi(n)$ for a random integer $r$.
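
A small sketch of exponent blinding, assuming φ(n) is known to the device and using toy RSA parameters chosen only for illustration. The helper name and the `r_bits` parameter are hypothetical; a real implementation would draw r from a cryptographic random source.

```python
import random


def blind_exponent(d, phi_n, r_bits=32):
    """Return d* = d + r * phi(n) for a fresh random r (illustrative RNG only)."""
    r = random.getrandbits(r_bits)
    return d + r * phi_n


if __name__ == "__main__":
    p, q = 61, 53                 # toy primes, far too small for real use
    n, phi_n = p * q, (p - 1) * (q - 1)
    d, m = 2753, 65
    d_star = blind_exponent(d, phi_n)
    print(pow(m, d_star, n) == pow(m, d, n))
```

Because the blinded exponent changes at every execution, averaging traces over several runs no longer lines up the same sequence of operations.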

SLIDE 23

Cost Summary

Algorithm                    Cost / bit   S/M = 1   S/M = 0.8   # reg
Square & multiply (1,2,3)    0.5M + 1S    1.5M      1.3M        2
Multiply always (2,3)        1.5M         1.5M      1.5M        2
Montgomery ladder            1M + 1S      2M        1.8M        2

(1) algorithm unprotected against SPA
(2) algorithm sensitive to S – M discrimination
(3) possible sliding window optimization

SLIDE 25

Our Contribution

  • Atomic exponentiation algorithms immune to the S – M discrimination
  • Better efficiency than ladder algorithms
  • Study of algorithms for parallelized (co)processors and space/time trade-offs

SLIDE 29

Replacing Multiplications by Squarings

$$x \times y = \frac{(x+y)^2 - x^2 - y^2}{2} \qquad (1)$$

$$x \times y = \left(\frac{x+y}{2}\right)^2 - \left(\frac{x-y}{2}\right)^2 \qquad (2)$$
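
Both identities can be checked modulo an odd n, where division by 2 is well defined. In the sketch below, `half_mod` is an assumed helper (not named in the slides) that halves a residue by adding n first when the value is odd.

```python
def half_mod(v, n):
    """Return v / 2 mod n for odd n."""
    v %= n
    return (v + n) >> 1 if v & 1 else v >> 1


def mul_via_eq1(x, y, n):
    """x * y mod n using identity (1): three squarings and one halving."""
    return half_mod((x + y) ** 2 - x ** 2 - y ** 2, n)


def mul_via_eq2(x, y, n):
    """x * y mod n using identity (2): two squarings of halved operands."""
    return (half_mod(x + y, n) ** 2 - half_mod(x - y, n) ** 2) % n
```

Identity (1) needs three squarings unless some of them are already available, while identity (2) needs only two.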

SLIDE 31

Left-to-Right Algorithm Using (1)

Input: m, n, d ∈ N
Output: m^d mod n
a ← 1
for i = k − 1 to 0 do
    a ← a^2 mod n
    if d_i = 1 then
        a ← ((a + m)^2 − a^2)/2 − m^2/2 mod n
return a
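
A Python sketch of the left-to-right algorithm using (1), assuming an odd modulus and a precomputed value of m²/2 mod n. The `half_mod` helper and the squaring counter are illustrative additions.

```python
def half_mod(v, n):
    """Return v / 2 mod n for odd n."""
    v %= n
    return (v + n) >> 1 if v & 1 else v >> 1


def square_always_ltr(m, d, n):
    """Return (m^d mod n, number of squarings in the main loop)."""
    m %= n
    half_m2 = half_mod(m * m, n)          # m^2 / 2 mod n, computed once
    a, squarings = 1 % n, 0
    for i in range(d.bit_length() - 1, -1, -1):
        a = a * a % n                     # squaring for every bit
        squarings += 1
        if (d >> i) & 1:
            s = (a + m) ** 2 % n          # (a + m)^2
            t = a * a % n                 # a^2
            squarings += 2
            a = (half_mod(s - t, n) - half_m2) % n   # equals a * m mod n
    return a, squarings
```

In this sketch a 0 bit costs one squaring and a 1 bit costs three, which averages to the 2S per bit given in the cost comparison.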

SLIDE 32

Atomic Left-to-Right Algorithm

Input: m, n, d ∈ N
Output: m^d mod n
R0 ← 1 ; R1 ← m ; R2 ← 1
R3 ← m^2/2 mod n
j ← 0 ; i ← k − 1
while i ≥ 0 do
    R_{M_{j,0}} ← R_{M_{j,1}} + R_{M_{j,2}} mod n
    R_{M_{j,3}} ← R_{M_{j,3}}^2 mod n
    R_{M_{j,4}} ← R_{M_{j,5}}/2 mod n
    R_{M_{j,6}} ← R_{M_{j,7}} − R_{M_{j,8}} mod n
    j ← d_i (1 + (j mod 3))
    i ← i − M_{j,9}
return R0

[State diagram: states j = 0, 1, 2, 3, with transitions labeled "0 bit" and "1 bit"]

$$M = \begin{pmatrix} 1 & 1 & 1 & 0 & 2 & 1 & 1 & 1 & 2 & 1 \\ 2 & 0 & 1 & 2 & 2 & 2 & 2 & 2 & 3 & 0 \\ 1 & 1 & 3 & 0 & 0 & 0 & 0 & 2 & 0 & 0 \\ 3 & 3 & 3 & 0 & 3 & 3 & 1 & 1 & 3 & 1 \end{pmatrix}$$
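
The atomic loop can be exercised with the following Python sketch. The matrix M used here is an assumption: its zero entries do not appear in the extracted slide text and were filled in from the four atomic patterns listed for j = 0 to 3. The `half_mod` helper is also illustrative.

```python
# Register-index matrix, one row per state j (zero entries reconstructed).
M = [
    [1, 1, 1, 0, 2, 1, 1, 1, 2, 1],
    [2, 0, 1, 2, 2, 2, 2, 2, 3, 0],
    [1, 1, 3, 0, 0, 0, 0, 2, 0, 0],
    [3, 3, 3, 0, 3, 3, 1, 1, 3, 1],
]


def half_mod(v, n):
    """Return v / 2 mod n for odd n."""
    v %= n
    return (v + n) >> 1 if v & 1 else v >> 1


def atomic_square_always_ltr(m, d, n):
    """Same addition, squaring, halving, subtraction sequence in every iteration."""
    m %= n
    R = [1 % n, m, 1 % n, half_mod(m * m, n)]   # R3 holds m^2 / 2 mod n
    j, i = 0, d.bit_length() - 1
    while i >= 0:
        row = M[j]
        R[row[0]] = (R[row[1]] + R[row[2]]) % n
        R[row[3]] = R[row[3]] * R[row[3]] % n
        R[row[4]] = half_mod(R[row[5]], n)
        R[row[6]] = (R[row[7]] - R[row[8]]) % n
        j = ((d >> i) & 1) * (1 + (j % 3))      # next state depends on the current bit
        i -= M[j][9]                            # advance only when the bit is finished
    return R[0]
```

Comparing the result with Python's built-in `pow(m, d, n)` for many inputs is a quick way to validate the reconstructed matrix.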

SLIDE 33

Atomic Left-to-Right Algorithm

Atomic Patterns (⋆ marks a dummy operation)

j = 0 (d_i = 0 or 1)
    R1 ← R1 + R1 mod n ⋆
    R0 ← R0^2 mod n
    R2 ← R1/2 mod n ⋆
    R1 ← R1 − R2 mod n ⋆
    j ← d_i [⋆ if d_i = 0]
    i ← i − (1 − d_i) [⋆ if d_i = 1]

j = 1 (d_i = 1)
    R2 ← R0 + R1 mod n
    R2 ← R2^2 mod n
    R2 ← R2/2 mod n
    R2 ← R2 − R3 mod n
    j ← 2
    i ← i ⋆

j = 2 (d_i = 1)
    R1 ← R1 + R3 mod n ⋆
    R0 ← R0^2 mod n
    R0 ← R0/2 mod n
    R0 ← R2 − R0 mod n
    j ← 3
    i ← i − 1

j = 3 (d_i = 0 or 1)
    R3 ← R3 + R3 mod n ⋆
    R0 ← R0^2 mod n
    R3 ← R3/2 mod n ⋆
    R1 ← R1 − R3 mod n ⋆
    j ← d_i
    i ← i − (1 − d_i) [⋆ if d_i = 1]

SLIDE 34

Right-to-Left Algorithm Using (2)

Input: m, n, d ∈ N
Output: m^d mod n
a ← 1 ; b ← m
for i = 0 to k − 1 do
    if d_i = 1 then
        a ← ((a + b)/2)^2 − ((a − b)/2)^2 mod n
    b ← b^2 mod n
return a
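
A Python sketch of the right-to-left algorithm using (2), assuming an odd modulus. The `half_mod` helper is an illustrative addition.

```python
def half_mod(v, n):
    """Return v / 2 mod n for odd n."""
    v %= n
    return (v + n) >> 1 if v & 1 else v >> 1


def square_always_rtl(m, d, n):
    """Compute m^d mod n using squarings, additions, subtractions and halvings only."""
    a, b = 1 % n, m % n
    for i in range(d.bit_length()):
        if (d >> i) & 1:
            # a * b mod n as a difference of two squarings, identity (2)
            a = (half_mod(a + b, n) ** 2 - half_mod(a - b, n) ** 2) % n
        b = b * b % n                     # b holds m^(2^i)
    return a
```

The two squarings of a 1 bit do not depend on each other, which is what the parallelization slides later rely on.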

SLIDE 35

Atomic Right-to-Left Algorithm

Input: m, n, d ∈ N
Output: m^d mod n
R0 ← m ; R1 ← 1 ; R2 ← 1
i ← 0 ; j ← 0
while i ≤ k − 1 do
    j ← d_i (1 + (j mod 3))
    R_{M_{j,0}} ← R_{M_{j,1}} + R0 mod n
    R_{M_{j,2}} ← R_{M_{j,3}}/2 mod n
    R_{M_{j,4}} ← R_{M_{j,5}} − R_{M_{j,6}} mod n
    R_{M_{j,3}} ← R_{M_{j,3}}^2 mod n
    i ← i + M_{j,7}
return R1

[State diagram: states j = 0, 1, 2, 3, with transitions labeled "0 bit" and "1 bit"]

$$M = \begin{pmatrix} 0 & 0 & 2 & 0 & 0 & 0 & 2 & 1 \\ 2 & 1 & 2 & 2 & 1 & 0 & 1 & 0 \\ 0 & 2 & 1 & 1 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 & 2 & 1 & 1 \end{pmatrix}$$
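
The atomic right-to-left loop can be sketched the same way. As before, the matrix M is an assumption: the zero entries are missing from the extracted slide text and were filled in from the atomic patterns given for j = 0 to 3.

```python
# Register-index matrix, one row per state j (zero entries reconstructed).
M = [
    [0, 0, 2, 0, 0, 0, 2, 1],
    [2, 1, 2, 2, 1, 0, 1, 0],
    [0, 2, 1, 1, 0, 0, 2, 0],
    [0, 0, 0, 0, 1, 2, 1, 1],
]


def half_mod(v, n):
    """Return v / 2 mod n for odd n."""
    v %= n
    return (v + n) >> 1 if v & 1 else v >> 1


def atomic_square_always_rtl(m, d, n):
    """Same addition, halving, subtraction, squaring sequence in every iteration."""
    R = [m % n, 1 % n, 1 % n]             # R0: powers of m, R1: accumulator, R2: scratch
    i = j = 0
    k = d.bit_length()
    while i <= k - 1:
        j = ((d >> i) & 1) * (1 + (j % 3))
        row = M[j]
        R[row[0]] = (R[row[1]] + R[0]) % n
        R[row[2]] = half_mod(R[row[3]], n)
        R[row[4]] = (R[row[5]] - R[row[6]]) % n
        R[row[3]] = R[row[3]] * R[row[3]] % n
        i += row[7]                       # advance only when the bit is finished
    return R[1]
```

A 0 bit takes one iteration (state 0) and a 1 bit takes three (states 1, 2, 3), each containing exactly one squaring.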

SLIDE 36

Atomic Right-to-Left Algorithm

Atomic Patterns (⋆ marks a dummy operation)

j = 0 (d_i = 0)
    j ← 0 [⋆ if j was 0]
    R0 ← R0 + R0 mod n ⋆
    R2 ← R0/2 mod n ⋆
    R0 ← R0 − R2 mod n ⋆
    R0 ← R0^2 mod n
    i ← i + 1

j = 1 (d_i = 1)
    j ← 1
    R2 ← R1 + R0 mod n
    R2 ← R2/2 mod n
    R1 ← R0 − R1 mod n
    R2 ← R2^2 mod n
    i ← i ⋆

j = 2 (d_i = 1)
    j ← 2
    R0 ← R2 + R0 mod n ⋆
    R1 ← R1/2 mod n
    R0 ← R0 − R2 mod n ⋆
    R1 ← R1^2 mod n
    i ← i ⋆

j = 3 (d_i = 1)
    j ← 3
    R0 ← R0 + R0 mod n ⋆
    R0 ← R0/2 mod n ⋆
    R1 ← R2 − R1 mod n
    R0 ← R0^2 mod n
    i ← i + 1

SLIDE 37

Cost Comparison

Algorithm                    Cost / bit   S/M = 1   S/M = 0.8   # reg
Square & multiply (1,2,3)    0.5M + 1S    1.5M      1.3M        2
Multiply always (2,3)        1.5M         1.5M      1.5M        2
Montgomery ladder            1M + 1S      2M        1.8M        2
L.-to-r. square always (3)   2S           2M        1.6M        4
R.-to-l. square always (3)   2S           2M        1.6M        3

→ 11 % speed-up over the Montgomery ladder (for S/M = 0.8)

(1) algorithm unprotected against SPA
(2) algorithm sensitive to S – M discrimination
(3) possible sliding window optimization
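
The per-bit figures in the table follow from simple arithmetic on the S/M ratio. The sketch below reproduces them; the dictionary layout and function names are illustrative assumptions.

```python
# (multiplications, squarings) per exponent bit, taken from the table
COSTS = {
    "Square & multiply": (0.5, 1),
    "Multiply always": (1.5, 0),
    "Montgomery ladder": (1, 1),
    "Square always": (0, 2),
}


def cost_in_M(name, s_over_m):
    """Average cost per bit expressed in multiplications, for a given S/M ratio."""
    mults, squares = COSTS[name]
    return mults + squares * s_over_m


def speedup(fast, slow, s_over_m):
    """Relative time saved by `fast` compared with `slow`."""
    return 1 - cost_in_M(fast, s_over_m) / cost_in_M(slow, s_over_m)


if __name__ == "__main__":
    for name in COSTS:
        print(name, cost_in_M(name, 1.0), cost_in_M(name, 0.8))
    print(speedup("Square always", "Montgomery ladder", 0.8))
```

With S/M = 1 the square always algorithms bring no gain over the ladder; the speed-up comes entirely from a fast squaring.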

SLIDE 38

Implementation

AT90SC chip @ 30 MHz with AdvX arithmetic coprocessor:

Algorithm       Key len. (b)   Code (B)   RAM (B)   Timing (ms)
Mont. ladder    512            360        128       30
                1024           360        256       200
                2048           360        512       1840
Square Always   512            510        192       28
                1024           510        384       190
                2048           510        768       1740

→ about 5 % speed-up obtained in practice

SLIDE 39

Security Considerations

  • Immune to any S – M discrimination
  • Compatible with classical DPA/FA countermeasures
  • Right-to-left is more robust than left-to-right
  • Although DPA countermeasures already prevent safe-error attacks, we provide algorithms immune to C safe-errors

SLIDE 42

Motivation

  • Trendy topic
  • Parallelized Montgomery ladder: 1M / bit
  • Squarings are independent in eq. (1) and (2)

What can we do if two parallel squarings are available?

SLIDE 43

Basic Parallelization

[Figure: squaring slots (S) scheduled on Core 1 and Core 2 for a 0 bit and for a 1 bit]

Many wasted squaring slots :-(

SLIDE 44

Scanning Direction

Left-to-right:

$$m^d = m^{d_0} \times \left( m^{d_1} \times \left( \cdots \left( m^{d_{k-1}} \right)^2 \cdots \right)^2 \right)^2$$

Right-to-left:

$$m^d = m^{d_{k-1} 2^{k-1}} \times m^{d_{k-2} 2^{k-2}} \times \cdots \times m^{d_0}$$

→ Right-to-left is more flexible

SLIDE 47

Better Parallelization

Strategy: use the wasted 1-bit squaring slot to compute one squaring in advance.

if (d_i = 1) a ← ((a + b)/2)^2 − ((a − b)/2)^2 mod n
b ← b^2 mod n
if (d_{i+1} = 1) a ← ((a + b)/2)^2 − ((a − b)/2)^2 mod n
b ← b^2 mod n

[Figure: squaring slots (S) scheduled on Core 1 and Core 2]

Remark: requires 1 additional memory register to store the precomputed squaring.

SLIDE 49

Right-to-Left Parallelized Algorithm

Input: m, n ∈ N, m < n, d = (d_{k−1} ... d_0)_2; requires 5 k-bit registers a, b, R0, R1, R2
Output: m^d mod n
a ← 1 ; b ← m ; extra ← 0
for i = 0 to k − 1 do
    if d_i = 1 then
        if extra = 0 then
            R0 ← (a − b)^2 mod n  ||  R1 ← b^2 mod n
            a ← (a + b)^2 mod n  ||  R2 ← R1^2 mod n
            a ← (a − R0)/4 mod n ; b ← R1 ; R1 ← R2
            extra ← 1
        else
            R0 ← (a − b)^2 mod n  ||  a ← (a + b)^2 mod n
            a ← (a − R0)/4 mod n ; b ← R1
            extra ← 0
    else
        if extra = 0 then
            b ← b^2 mod n  ||  <nothing>
        else
            b ← R1
            extra ← 0
return a
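
The parallelized algorithm can be simulated sequentially in Python by counting one step for each pair of squarings executed side by side. This is a sketch under assumptions: the step counter, the optional `k` parameter (number of exponent bits to scan) and the `quarter_mod` helper are illustrative additions, and no real parallelism is involved.

```python
def quarter_mod(v, n):
    """Return v / 4 mod n for odd n (two successive halvings)."""
    v %= n
    for _ in range(2):
        v = (v + n) >> 1 if v & 1 else v >> 1
    return v


def parallel_rtl(m, d, n, k=None):
    """Return (m^d mod n, number of parallel squaring steps)."""
    k = d.bit_length() if k is None else k
    a, b, r1 = 1 % n, m % n, None
    extra, steps = 0, 0
    for i in range(k):
        if (d >> i) & 1:
            if extra == 0:
                r0, r1 = (a - b) ** 2 % n, b * b % n       # step 1: two squarings
                a, r2 = (a + b) ** 2 % n, r1 * r1 % n      # step 2: two squarings
                a, b, r1 = quarter_mod(a - r0, n), r1, r2
                extra, steps = 1, steps + 2
            else:
                r0, a = (a - b) ** 2 % n, (a + b) ** 2 % n  # one step, both cores busy
                a, b = quarter_mod(a - r0, n), r1
                extra, steps = 0, steps + 1
        elif extra == 0:
            b = b * b % n                                   # one step, second core idle
            steps += 1
        else:
            b, extra = r1, 0                                # squaring already available
    return a, steps
```

For example, scanning five bits of d = (00110)₂ takes 6 steps in this simulation.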

SLIDE 50

Atomic Parallelized Algorithm

Input: m, n ∈ N, m < n, d = (d_{k−1} d_{k−2} ... d_0)_2; requires 7 k-bit registers R0 to R6
Output: m^d mod n
R0 ← 1 ; R1 ← m ; v ← (0, 0, 0) ; u ← 1
while v_0 ≤ k − 1 do
    j ← d_{v_0} (v_1 + u + 1)
    R5 ← (R0 − R1)/2 mod n
    R6 ← (R0 + R1)/2 mod n
    R_{M_{j,0}} ← R_{M_{j,1}}^2 mod n  ||  R_{M_{j,2}} ← R_{M_{j,3}}^2 mod n
    R_{M_{j,4}} ← R0 − R2 mod n
    R_{M_{j,5}} ← R3
    R_{M_{j,6}} ← R4
    v_1 ← M_{j,7}
    u ← M_{j,8}
    t ← 1 − v_1 (1 − d_{v_0+1})
    R_{N_{t,0}} ← R3
    v_{N_{t,1}} ← 0
    v_{N_{t,2}} ← v_{N_{t,2}} + 1
    v_0 ← v_0 + u
return R0

SLIDE 51

Atomic Parallelized Algorithm

Matrices

M =     1 1 5 6 5 5 5 1 6 4 3 1 3 1 1 2 5 3 1 5 5 5 2 5 6 1 5 1     N = 1 1 5 2 2

SLIDE 52

Even Better Parallelization

Strategy: use the wasted 1-bit squaring slots to compute several squarings in advance.

The variable extra, 0 ≤ extra ≤ extramax, stores the number of precomputed squarings.

Remark: requires extramax additional memory registers to store the precomputed squarings.

SLIDE 53

Even Better Parallelization

Example: $d = (\ldots 00110)_2$, extramax = 1

cost: 1S 2S 1S 1S 1S

[Diagram: value of extra (0 to extramax = 1) along the exponent bits]

Cost: 6S

SLIDE 54

Even Better Parallelization

Example: $d = (\ldots 01100)_2$, extramax ≥ 2

cost: 1S 2S 2S 0S 0S

[Diagram: value of extra (0, 1, 2, ..., extramax) along the exponent bits]

Cost: 5S

SLIDE 55

Generalized Parallel Square Always

Input: m, n ∈ N, m < n, d = (d_{k−1} d_{k−2} ... d_0)_2, extramax ∈ N*; requires extramax + 4 k-bit registers a, R0, R1, ..., R_{extramax+2}
Output: m^d mod n
a ← 1 ; R1 ← m ; extra ← 0
for i = 0 to k − 1 do
    if d_i = 1 then
        if extra < extramax then
            R0 ← (a − R1)^2 mod n  ||  R_{extra+2} ← R_{extra+1}^2 mod n
            a ← (a + R1)^2 mod n  ||  R_{extra+3} ← R_{extra+2}^2 mod n
            a ← (a − R0)/4 mod n
            (R1, R2, ..., R_{extramax+1}) ← (R2, R3, ..., R_{extramax+2})
            extra ← extra + 1
        else
            R0 ← (a − R1)^2 mod n  ||  a ← (a + R1)^2 mod n
            a ← (a − R0)/4 mod n
            (R1, R2, ..., R_{extramax+1}) ← (R2, R3, ..., R_{extramax+2})
            extra ← extra − 1
    else
        if extra = 0 then
            R1 ← R1^2 mod n
        else
            (R1, R2, ..., R_{extramax+1}) ← (R2, R3, ..., R_{extramax+2})
            extra ← extra − 1
return a
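
A sequential Python simulation of the generalized algorithm, written as a sketch under assumptions: a Python list replaces the shifting registers R1 to R_{extramax+2}, one step is counted per pair of parallel squarings, and the `k` parameter and `quarter_mod` helper are illustrative additions.

```python
def quarter_mod(v, n):
    """Return v / 4 mod n for odd n (two successive halvings)."""
    v %= n
    for _ in range(2):
        v = (v + n) >> 1 if v & 1 else v >> 1
    return v


def generalized_parallel(m, d, n, extramax, k=None):
    """Return (m^d mod n, number of parallel squaring steps)."""
    k = d.bit_length() if k is None else k
    a = 1 % n
    queue = [m % n]          # queue[0] plays R1; the rest are precomputed squarings
    steps = 0
    for i in range(k):
        extra = len(queue) - 1
        if (d >> i) & 1:
            r0 = (a - queue[0]) ** 2 % n
            s = (a + queue[0]) ** 2 % n
            if extra < extramax:
                queue.append(queue[-1] ** 2 % n)   # alongside r0
                queue.append(queue[-1] ** 2 % n)   # alongside s
                steps += 2
            else:
                steps += 1                         # r0 and s computed side by side
            a = quarter_mod(s - r0, n)
            queue.pop(0)                           # register shift
        elif extra == 0:
            queue[0] = queue[0] ** 2 % n
            steps += 1
        else:
            queue.pop(0)                           # next squaring is already stored
    return a, steps
```

Scanning five bits of d = (00110)₂ takes 6 steps with extramax = 1 and 5 steps with extramax = 2 in this simulation.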

SLIDE 56

Cost Comparison

We show that the cost per bit of the parallelized algorithms tends to:

$$\left( 1 + \frac{1}{4\,\mathit{extramax} + 2} \right) S$$

Algorithm                        General cost   S/M = 1   S/M = 0.8
Parallel Montgomery ladder       1M             1M        1M
Parallel Sq. Al. extramax = 1    7S/6           1.17M     0.93M
Parallel Sq. Al. extramax = 2    11S/10         1.10M     0.88M
Parallel Sq. Al. extramax = 3    15S/14         1.07M     0.86M
...                              ...            ...       ...
Parallel Sq. Al. extramax → ∞    1S             1M        0.8M
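
The table rows follow directly from the limit formula. The sketch below evaluates it with exact fractions; the function names are illustrative.

```python
from fractions import Fraction


def parallel_cost_in_S(extramax):
    """Asymptotic cost per bit, in squarings: 1 + 1 / (4 * extramax + 2)."""
    return 1 + Fraction(1, 4 * extramax + 2)


def parallel_cost_in_M(extramax, s_over_m):
    """Same cost expressed in multiplications for a given S/M ratio."""
    return float(parallel_cost_in_S(extramax) * Fraction(s_over_m).limit_denominator(1000))


if __name__ == "__main__":
    for extramax in (1, 2, 3):
        print(extramax, parallel_cost_in_S(extramax), parallel_cost_in_M(extramax, 0.8))
```

The gain from each additional register shrinks quickly: most of the benefit is already obtained with extramax = 1 or 2.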

SLIDE 58

Conclusion

  • Square Always is an alternative countermeasure to S – M discrimination.
  • It provides better efficiency than the Montgomery ladder.
  • To our knowledge, parallelization yields the best exponentiation performance.
  • Two parallel squaring blocks yield faster exponentiation than two parallel multiplication blocks!

SLIDE 59

Thank you for your attention!

SLIDE 60

Turning Squarings into Multiplications

With $r$, $r_1$, $r_2$ being random integers $< n$:

$$\begin{aligned} x \times x &= (x + r) \times x - x \times r \\ &= (x + r) \times (x - r) + r^2 \\ &= (x + r_1) \times (x + r_2) - x \times (r_1 + r_2) - r_1 \times r_2 \end{aligned}$$
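
The three rewritings can be checked numerically modulo n. The sketch below is illustrative only: the function name is an assumption and Python's `random` module stands in for whatever random source a device would use.

```python
import random


def masked_squares(x, n, rng=random):
    """Evaluate x^2 mod n through the three masked-product expressions."""
    r, r1, r2 = (rng.randrange(n) for _ in range(3))
    v1 = ((x + r) * x - x * r) % n
    v2 = ((x + r) * (x - r) + r * r) % n
    v3 = ((x + r1) * (x + r2) - x * (r1 + r2) - r1 * r2) % n
    return v1, v2, v3
```

Each returned value equals x² mod n, while the operands of the main product differ from each other thanks to the random masks.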