Efficient Ring-LWE Encryption on 8-bit AVR Processors



SLIDE 1


Short Overview Ring-LWE Encryption Scheme Our Implementation Implementation Results Conclusion

Efficient Ring-LWE Encryption on 8-bit AVR Processors

Zhe Liu¹, Hwajeong Seo², Sujoy Sinha Roy³, Johann Großschädl¹, Howon Kim², Ingrid Verbauwhede³

¹University of Luxembourg, ²Pusan National University, ³Katholieke Universiteit Leuven

2015/09/16

1 / 60

SLIDE 2

Outline

1 Short Overview
2 Ring-LWE Encryption Scheme
3 Our Implementation
    Optimization Techniques for NTT Computation
    Optimization of the Knuth-Yao Sampler
4 Implementation Results
    Comparison with Related Work
5 Conclusion

2 / 60

SLIDE 5

Lattice-based Cryptography

RSA and ECC: integer factorization and ECDLP
These hard problems can be solved by Shor's algorithm on a quantum computer
Lattice-based cryptography: believed to be hard even for quantum computers
Ring-LWE encryption schemes: proposed [EUROCRYPT'10] → optimized [CHES'14] (reducing the polynomial arithmetic)

4 / 60

SLIDE 6

Implementation Platform

8-bit XMEGA128 Microcontroller

Target applications: wireless sensor networks, Internet of Things
Operating frequency: 32 MHz
Memory: 128 KB flash, 8 KB RAM, 32 registers
Core instructions: 8-bit mul/add (2/1 cycles)
AES/DES crypto engine (used for the PRNG)

5 / 60

SLIDE 9

Previous Works

Hardware implementations
Göttert et al. [CHES'12]: first hardware implementation of Ring-LWE
Pöppelmann et al. [Latincrypt'12] → Roy et al. [CHES'14]

Software implementations on embedded processors
32-bit ARM processor: Oder et al. [DATE'14]: BLISS → Boorghany et al. [ACM TEC'15]: NTRU, Ring-LWE → De Clercq et al. [DATE'15]: Ring-LWE
8-bit AVR processor: Boorghany et al. [ACM TEC'15]: NTRU, Ring-LWE → Pöppelmann et al. [Latincrypt'15]: BLISS → This work [CHES'15]: Ring-LWE

6 / 60

SLIDE 11

Motivation & Contributions

Motivation
Few 8-bit AVR implementations exist
“Cryptosystem of the Future” for the “Internet of the Future”

Contributions: efficient implementation of Ring-LWE
Fast NTT computation: “MOV-and-ADD” + “SAMS2”
Reduced RAM consumption for coefficient storage
Efficient techniques for the Knuth-Yao sampler: “Byte-Scanning”

7 / 60

SLIDE 12

Outline

1 Short Overview
2 Ring-LWE Encryption Scheme
3 Our Implementation
    Optimization Techniques for NTT Computation
    Optimization of the Knuth-Yao Sampler
4 Implementation Results
    Comparison with Related Work
5 Conclusion

8 / 60

SLIDE 13

Ring-LWE Encryption Scheme

Key generation stage: Gen($\tilde{a}$)

Two error polynomials $r_1, r_2 \in R_q$ are sampled from the discrete Gaussian distribution $\mathcal{X}_\sigma$ by running the Knuth-Yao sampler twice
$\tilde{r}_1 = \mathrm{NTT}(r_1)$, $\tilde{r}_2 = \mathrm{NTT}(r_2)$, $\tilde{p} = \tilde{r}_1 - \tilde{a} \cdot \tilde{r}_2 \in R_q$
The public key $(\tilde{a}, \tilde{p})$ and the private key $\tilde{r}_2$ are obtained

9 / 60

SLIDE 15

Ring-LWE Encryption Scheme

Encryption stage: Enc($\tilde{a}, \tilde{p}, M$)

The message $M \in \{0, 1\}^n$ is encoded into a polynomial $M'$ in the ring
Three error polynomials $e_1, e_2, e_3 \in R_q$ are sampled
$(\tilde{C}_1, \tilde{C}_2) = (\tilde{a} \cdot \tilde{e}_1 + \tilde{e}_2,\ \tilde{p} \cdot \tilde{e}_1 + \mathrm{NTT}(e_3 + M'))$

Decryption stage: Dec($\tilde{C}_1, \tilde{C}_2, \tilde{r}_2$)

The inverse NTT is performed to recover $M'$: $M' = \mathrm{INTT}(\tilde{r}_2 \cdot \tilde{C}_1 + \tilde{C}_2)$
A decoder then recovers the original message $M$ from $M'$

10 / 60
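A short worked check of why decryption returns the message, using only the symbols from the key generation, encryption and decryption stages. It is a sketch: products are taken in $R_q$, and the error term in the last line is assumed to be small enough for the decoder to remove.

```latex
\begin{aligned}
\tilde{r}_2 \cdot \tilde{C}_1 + \tilde{C}_2
  &= \tilde{r}_2 (\tilde{a} \cdot \tilde{e}_1 + \tilde{e}_2)
     + (\tilde{r}_1 - \tilde{a} \cdot \tilde{r}_2) \cdot \tilde{e}_1
     + \mathrm{NTT}(e_3 + M') \\
  &= \tilde{r}_1 \cdot \tilde{e}_1 + \tilde{r}_2 \cdot \tilde{e}_2
     + \mathrm{NTT}(e_3 + M') \\
\mathrm{INTT}(\tilde{r}_2 \cdot \tilde{C}_1 + \tilde{C}_2)
  &= M' + (r_1 e_1 + r_2 e_2 + e_3)
\end{aligned}
```

The two terms containing $\tilde{a}$ cancel, so what is left is the encoded message plus a sum of products of Gaussian error polynomials.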

SLIDE 16

Ring-LWE Encryption Scheme

Number Theoretic Transform

Polynomial multiplication: $a(x) = \sum_{i=0}^{n-1} a_i x^i$ with coefficients in $\mathbb{Z}_q$ is evaluated at the powers $\omega_n^i$ of an $n$-th root of unity

Algorithm 1: Iterative Number Theoretic Transform
Require: Polynomial a(x), n-th root of unity ω_n
Ensure: Polynomial a(x) = NTT(a)
a ← BitReverse(a)
for i from 2 by 2i to n do
    ω_i ← ω_n^(n/i), ω ← 1
    for j from 0 by 1 to i/2 − 1 do
        for k from 0 by i to n − 1 do
            ① U ← a[k + j], ② V ← ω · a[k + j + i/2]
            ③ a[k + j] ← U + V, ④ a[k + j + i/2] ← U − V
        end for
        ω ← ω · ω_i
    end for
end for
return a

11 / 60
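Algorithm 1 can be sketched in Python as follows. This is a minimal illustration and not the AVR implementation: the function names `find_root_of_unity`, `bit_reverse` and `ntt` are made up for this example, and the root of unity is found by brute-force search instead of being read from a lookup table.

```python
def find_root_of_unity(n, q):
    """Return a primitive n-th root of unity mod prime q (n a power of two >= 2)."""
    assert (q - 1) % n == 0, "need n | q - 1"
    for g in range(2, q):
        w = pow(g, (q - 1) // n, q)
        # For n a power of two, w has order exactly n iff w^(n/2) = -1 (mod q).
        if pow(w, n // 2, q) == q - 1:
            return w
    raise ValueError("no primitive n-th root of unity found")


def bit_reverse(a):
    """Return a copy of a with indices permuted by bit reversal."""
    n = len(a)
    bits = n.bit_length() - 1
    out = [0] * n
    for i, x in enumerate(a):
        rev = int(format(i, "0{}b".format(bits))[::-1], 2)
        out[rev] = x
    return out


def ntt(a, omega_n, q):
    """Iterative NTT following Algorithm 1 (j-loop outside the k-loop)."""
    n = len(a)
    a = bit_reverse(a)
    i = 2
    while i <= n:
        omega_i = pow(omega_n, n // i, q)
        omega = 1
        for j in range(i // 2):
            for k in range(0, n, i):
                u = a[k + j]
                v = omega * a[k + j + i // 2] % q
                a[k + j] = (u + v) % q
                a[k + j + i // 2] = (u - v) % q
            omega = omega * omega_i % q
        i *= 2
    return a
```

With the j-loop outside the k-loop, each twiddle factor is computed once per value of j and reused for every butterfly that needs it.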

SLIDE 17

Ring-LWE Encryption Scheme

Gaussian Sampler

Random walk on the discrete distribution generating (DDG) tree

Algorithm 2: Low-level implementation of Knuth-Yao sampling
Require: Probability matrix Pmat, random number r, modulus q
Ensure: Sample value s
d ← 0
for col from 0 by 1 to MAXCOL do
    d ← 2d + (r & 1); r ← r ≫ 1
    for row from MAXROW by −1 to 0 do
        d ← d − Pmat[row][col]
        if d = −1 then
            if (r & 1) = 1 then
                return q − row
            else
                return row
            end if
        end if
    end for
end for
return 0

12 / 60
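A small Python sketch of Algorithm 2 on a toy probability matrix. The matrix, the function name `knuth_yao` and the bit-iterator interface are assumptions made for illustration; the real sampler uses a much larger matrix derived from the discrete Gaussian. The toy matrix encodes magnitudes 0, 1, 2 with probabilities 1/2, 1/4, 1/4, one row per magnitude and one column per binary digit.

```python
def knuth_yao(pmat, bits, q):
    """Sample one value by walking the columns of pmat, as in Algorithm 2.

    pmat[row][col] is binary digit `col` of the probability of magnitude `row`;
    bits is an iterator that yields random bits (0 or 1).
    """
    max_row = len(pmat) - 1
    d = 0
    for col in range(len(pmat[0])):
        d = 2 * d + next(bits)
        for row in range(max_row, -1, -1):
            d -= pmat[row][col]
            if d == -1:
                sign = next(bits)
                # Assumption: reduce mod q so that a "negative zero" maps to 0.
                return (q - row) % q if sign == 1 else row
    return 0


# Toy matrix: P(0) = 0.10b, P(1) = 0.01b, P(2) = 0.01b
TOY_PMAT = [
    [1, 0],
    [0, 1],
    [0, 1],
]
```

For example, `knuth_yao(TOY_PMAT, iter([1, 1, 1]), 7681)` walks both columns, stops at row 1 and, because the sign bit is 1, returns `q - 1`.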

SLIDE 18

Outline

1 Short Overview
2 Ring-LWE Encryption Scheme
3 Our Implementation
    Optimization Techniques for NTT Computation
    Optimization of the Knuth-Yao Sampler
4 Implementation Results
    Comparison with Related Work
5 Conclusion

13 / 60

SLIDE 24

Optimization Techniques for NTT Computation

Parameter selection (n, q, σ) for Ring-LWE

128-bit security level: $(256, 7681, 11.31/\sqrt{2\pi})$
256-bit security level: $(512, 12289, 12.18/\sqrt{2\pi})$
Discrete Gaussian sampler: limited to $12\sigma$

LUT-based twiddle factors: $\omega_n$ and $\omega \cdot \omega_i$ [LATINCRYPT'12]
Negative wrapped convolution: reduces the number of coefficients [CHES'14]
Interchanging the j- and k-loops in the NTT [HOST'13]
Merging the scaling by $n^{-1}$ into the INTT [CHES'14]

15 / 60
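The two parameter sets can be checked with a few lines of Python. The negative wrapped convolution needs a 2n-th root of unity modulo q, which exists when q ≡ 1 (mod 2n); the sketch verifies this and evaluates σ. The dictionary `PARAMS` and the function `describe` are names chosen for this illustration.

```python
import math

# security level -> (n, q, s) where sigma = s / sqrt(2 * pi)
PARAMS = {
    128: (256, 7681, 11.31),
    256: (512, 12289, 12.18),
}


def describe(level):
    """Return derived facts about one parameter set."""
    n, q, s = PARAMS[level]
    return {
        "n": n,
        "q": q,
        "sigma": s / math.sqrt(2 * math.pi),
        # A 2n-th root of unity mod q exists iff 2n divides q - 1.
        "has_2n_th_root": (q - 1) % (2 * n) == 0,
        # Number of bits needed to store one coefficient.
        "coeff_bits": q.bit_length(),
    }
```

Both moduli satisfy the divisibility condition, and a coefficient needs 13 bits for q = 7681 and 14 bits for q = 12289.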

SLIDE 25

Optimization Techniques for NTT Computation

MOV-and-ADD Coefficient Multiplication (Step 1): 1 mul, 1 movw

[Figure: operand bytes aL, aH, bL, bH; partial product aL × bL; result registers r0, r1]

16 / 60

SLIDE 26

Optimization Techniques for NTT Computation

MOV-and-ADD Coefficient Multiplication (Step 2): 1 mul, 1 movw

[Figure: operand bytes aL, aH, bL, bH; partial products aL × bL, aH × bH; result registers r0, r1, r2, r3]

17 / 60

SLIDE 27

Optimization Techniques for NTT Computation

MOV-and-ADD Coefficient Multiplication (Step 3): 1 mul, 3 add

[Figure: operand bytes aL, aH, bL, bH; partial products aL × bL, aH × bH, aH × bL; result registers r0, r1, r2, r3]

18 / 60

SLIDE 28

Optimization Techniques for NTT Computation

MOV-and-ADD Coefficient Multiplication (Step 4): 1 mul, 3 add

[Figure: operand bytes aL, aH, bL, bH; partial products aL × bL, aH × bH, aH × bL, aL × bH; result registers r0, r1, r2, r3]

19 / 60

SLIDE 29

Optimization Techniques for NTT Computation

MOV-and-ADD Coefficient Multiplication (Total): 4 mul, 2 movw, 6 add instructions (16 cycles)

[Figure: operand bytes aL, aH, bL, bH; partial products aL × bL, aH × bH, aH × bL, aL × bH; result registers r0, r1, r2, r3]

20 / 60
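The four steps above can be modelled in Python with byte-sized "registers". This is a behavioural sketch, not AVR assembly: the function name `mul16_mov_and_add` is hypothetical, and the add/adc/adc carry pattern is an assumption about how the "3 add" of steps 3 and 4 are spent. With 2 cycles per mul and 1 cycle per movw or add, the instruction count gives 4·2 + 2·1 + 6·1 = 16 cycles.

```python
def mul16_mov_and_add(a, b):
    """Multiply two 16-bit values using four 8x8 products and registers r0..r3."""
    a_lo, a_hi = a & 0xFF, a >> 8
    b_lo, b_hi = b & 0xFF, b >> 8

    # Steps 1 and 2: the two products that do not overlap are only moved
    # (movw) into the result registers, so no addition is needed for them.
    p = a_lo * b_lo
    r0, r1 = p & 0xFF, p >> 8
    p = a_hi * b_hi
    r2, r3 = p & 0xFF, p >> 8

    # Steps 3 and 4: each cross product is added into (r3, r2, r1)
    # with three byte additions (add, then two add-with-carry).
    for p in (a_hi * b_lo, a_lo * b_hi):
        t = r1 + (p & 0xFF)
        r1, carry = t & 0xFF, t >> 8
        t = r2 + (p >> 8) + carry
        r2, carry = t & 0xFF, t >> 8
        r3 = (r3 + carry) & 0xFF

    return (r3 << 24) | (r2 << 16) | (r1 << 8) | r0
```

Moving the two non-overlapping products first is what saves additions: only the two cross products have to be accumulated.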

SLIDE 34

Optimization Techniques for NTT Computation

Approximation-based reduction [ACM TEC'15]

Positions of the 1 bits in $(2^w \times 1/q)$ → $p_1, \ldots, p_l$
$\lfloor z/q \rfloor \approx \sum_{i=1}^{l} (z \gg (w - p_i))$
$z \bmod q \approx z - q \times \lfloor z/q \rfloor$
For $q = 7681$: $\lfloor z/7681 \rfloor \approx (z \gg 13) + (z \gg 17) + (z \gg 21)$

21 / 60
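The shift amounts can be derived mechanically from the binary expansion of ⌊2^w / q⌋. The sketch below does this in Python; the choice w = 21 is an assumption inferred from the largest shift in the q = 7681 example, and the function names are illustrative. Because each shift rounds down and the binary expansion is truncated, the approximate quotient never exceeds the true one, so a few final subtractions of q complete the reduction.

```python
def shift_amounts(q, w):
    """Shifts (w - p_i) for every 1 bit p_i of floor(2^w / q), largest bit first."""
    m = (1 << w) // q
    return [w - p for p in range(m.bit_length() - 1, -1, -1) if (m >> p) & 1]


def approx_quotient(z, shifts):
    """Approximate floor(z / q) as a sum of right shifts of z."""
    return sum(z >> s for s in shifts)


def reduce_mod(z, q, shifts):
    """Reduce z modulo q using the shift-based quotient estimate."""
    r = z - q * approx_quotient(z, shifts)
    # The estimate never overshoots, so r >= 0; subtract q until r < q.
    while r >= q:
        r -= q
    return r
```

For q = 7681 and w = 21, ⌊2^21 / 7681⌋ = 273 = 100010001 in binary, whose 1 bits sit at positions 8, 4 and 0.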

SLIDE 35

Optimization Techniques for NTT Computation

SAMS2 (Step 1-1): shifting (z ≫ 17)

[Figure: registers r0, r1, r2, r3; shift (r3, r2, r1) ≫ 1; result (s1, s0)]

22 / 60

SLIDE 36

Optimization Techniques for NTT Computation

SAMS2 (Step 1-2): shifting (z ≫ 13)

[Figure: registers r0, r1, r2, r3; shifts (r3, r2, r1) ≫ 1 and (s1, s0, sx) ≫ 4; results (s1, s0), (t1, t0)]

23 / 60

SLIDE 37

Optimization Techniques for NTT Computation

SAMS2 (Step 1-3): shifting (z ≫ 21)

[Figure: registers r0, r1, r2, r3; shifts (r3, r2, r1) ≫ 1 and (s1, s0, sx) ≫ 4; results (s1, s0), (t1, t0), u0]

24 / 60

SLIDE 38

Optimization Techniques for NTT Computation

SAMS2 (Step 2): addition (z ≫ 13) + (z ≫ 17) + (z ≫ 21)

[Figure: shifted values (s1, s0), (t1, t0), u0 and their sum (s1, s0) + (t1, t0) + u0]

25 / 60

SLIDE 39

Optimization Techniques for NTT Computation

SAMS2 (Step 3): multiplication

[Figure: the sum (s1, s0) + (t1, t0) + u0 is multiplied by 0x1e: 0x1e × [(s1, s0) + (t1, t0) + u0]]

26 / 60

SLIDE 40

Optimization Techniques for NTT Computation

SAMS2 method, ①: shifting; ②: addition; ③: multiplication

[Figure: ① shifts (r3, r2, r1) ≫ 1 and (s1, s0, sx) ≫ 4 give (s1, s0), (t1, t0), u0; ② sum (s1, s0) + (t1, t0) + u0; ③ product 0x1e × [(s1, s0) + (t1, t0) + u0]]

27 / 60
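A word-level Python model of the SAMS2 sequence for q = 7681. The slides show the shifting, addition and multiplication steps; the two subtractions that finish z − q·⌊z/q⌋ and the final correction loop are assumptions made here to obtain a complete reduction. The multiplication by 0x1e works because q = 7681 = 0x1E01, so q·t = ((0x1e·t) ≪ 8) + t. The function name `sams2` is illustrative.

```python
Q = 7681  # 0x1E01


def sams2(z):
    """Reduce a product z < 2**26 modulo 7681 (word-level model)."""
    # (1) Shifting and (2) addition: approximate quotient floor(z / q).
    quot = (z >> 13) + (z >> 17) + (z >> 21)
    # (3) Multiplication by the high byte of q.
    prod = 0x1E * quot
    # Assumed completion: two subtractions give z - q * quot,
    # since q * quot = (prod << 8) + quot.
    r = z - (prod << 8)
    r = r - quot
    # Assumed final correction, because the quotient is only approximate.
    while r >= Q:
        r -= Q
    return r
```

Only one 8-bit constant multiplication is needed instead of a full multiplication by q, which suits a processor with an 8-bit multiplier.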

SLIDE 49

Optimization Techniques for NTT Computation

Incomplete modular arithmetic

Complete: $s = a + b \bmod q$
Incomplete: $s = a + b \bmod 2^m$, where $m = \lceil \log_2 q \rceil$

Taking $q = 7681$ as an example ($m = 13$):
Perform a normal coefficient addition
Compare the result $r$ with $2^{13}$
Subtract $q$ where $r \ge 2^{13}$
The operands are kept in $[0, 2^{13} - 1]$
In the last iteration, the result is brought back into the range $[0, q - 1]$

28 / 60
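The incomplete addition for q = 7681 can be sketched in Python as below. The function names are illustrative. One detail is an assumption of this sketch: the subtraction is written as a loop, because for two operands close to 2^13 − 1 (for example 8191 + 8191 = 16382) a single subtraction of q still leaves a value above 2^13 − 1.

```python
Q = 7681
M = 13  # ceil(log2(Q))


def incomplete_add(a, b):
    """Add two operands from [0, 2**13 - 1]; the result stays in that range."""
    s = a + b
    # Compare with 2^13 and subtract q while the result does not fit in 13 bits.
    while s >= (1 << M):
        s -= Q
    return s


def final_reduce(s):
    """Bring an incompletely reduced value back into [0, q - 1]."""
    return s - Q if s >= Q else s
```

The comparison against a power of two only needs to look at the bits above position 12, which is cheaper than a full comparison against q after every addition.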

SLIDE 50

Optimization Techniques for NTT Computation

Reducing the RAM for coefficients (Step 1): Initialized registers

29 / 60

SLIDE 51

Optimization Techniques for NTT Computation

Reducing the RAM for coefficients (Step 2): 13-bit coefficient (a0) is stored

[Figure: memory layout holding a0]

30 / 60

SLIDE 52

Optimization Techniques for NTT Computation

Reducing the RAM for coefficients (Step 3): Other coefficients (a1 to a12) are stored

[Figure: memory layout holding a0 to a12]

31 / 60

SLIDE 53

Optimization Techniques for NTT Computation

Reducing the RAM for coefficients (Step 4): Coefficient (a13) is stored

[Figure: memory layout with a0 to a12 and a13]

32 / 60

SLIDE 54

Optimization Techniques for NTT Computation

Reducing the RAM for coefficients (Step 5): Remaining coefficients (a14 and a15) are stored

[Figure: memory layout with a0 to a12 and a13, a14, a15]

33 / 60

SLIDE 55

Optimization Techniques for NTT Computation

Updating coefficient (a12)

[Figure: memory layout with a0 to a12 and a13, a14, a15]

34 / 60

SLIDE 56

Optimization Techniques for NTT Computation

Updating coefficient (a12) (Step 1): Clear the lower 13 bits

[Figure: memory layout with a0 to a12 and a13, a14, a15]

slide-57
SLIDE 57

Optimization Techniques for NTT Computation

(Step 2): Add with target register

a12 a11 a10 a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 a13 a14 a15
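
The two steps for updating a coefficient held in the lower 13 bits of a 16-bit word can be modelled as follows. This is a minimal Python sketch, not the AVR assembly used in the talk; the function name `update_low13` and the mask constants are illustrative assumptions.

```python
LOW13 = 0x1FFF   # mask for the lower 13 bits of a 16-bit word
HIGH3 = 0xE000   # mask for the upper 3 bits, which hold parts of other coefficients


def update_low13(word, coeff):
    """Replace the 13-bit coefficient stored in the low bits of `word`."""
    cleared = word & HIGH3            # Step 1: clear the lower 13 bits
    return cleared + (coeff & LOW13)  # Step 2: add the new coefficient


# The upper 3 bits survive the update untouched.
print(hex(update_low13(0xFFFF, 0x0123)))
```

Because the lower 13 bits are zero after Step 1, the addition in Step 2 cannot carry into the upper 3 bits.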

slide-58
SLIDE 58

Optimization Techniques for NTT Computation

Updating coefficient (a13)

slide-59
SLIDE 59

Optimization Techniques for NTT Computation

(Step 1): Divide the coefficient into 5 limbs

slide-60
SLIDE 60

Optimization Techniques for NTT Computation

(Step 2): Shift the coefficient

» 13 » 10 » 7 » 4 » 1

slide-61
SLIDE 61

Optimization Techniques for NTT Computation

(Step 3): Select the memory

a12 a11 a10 a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 a13 a14 a15

slide-62
SLIDE 62

Optimization Techniques for NTT Computation

(Step 4): Clear the 14th bit

a12 a11 a10 a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 a13 a14 a15

slide-63
SLIDE 63

Optimization Techniques for NTT Computation

(Step 5): Add with target register

a12 a11 a10 a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 a13 a14 a15

slide-64
SLIDE 64

Optimization Techniques for NTT Computation

(Step 6): Select the memory

a12 a11 a10 a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 a13 a14 a15

slide-65
SLIDE 65

Optimization Techniques for NTT Computation

(Step 7): Clear the upper 3 bits

a12 a11 a10 a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 a13 a14 a15

slide-66
SLIDE 66

Optimization Techniques for NTT Computation

(Step 8): Add with target register

a12 a11 a10 a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 a13 a14 a15

slide-67
SLIDE 67

Optimization Techniques for NTT Computation

Optimized storage: 16 13-bit elements in 26 bytes

a12 a11 a10 a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 a13 a14 a15
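
The storage scheme, 16 coefficients of 13 bits each in 13 16-bit words (26 bytes), can be sketched as below. The exact bit placement is an assumption chosen to be consistent with the steps shown: a0 to a12 sit in the lower 13 bits of their words, and a13, a14, a15 are cut into limbs that fill the 3 spare upper bits of each word. With this layout, a13 yields five limbs placed by left shifts of 13, 10, 7, 4 and 1. The names `pack16` and `unpack16` are hypothetical, and Python is used only to model the bit manipulation.

```python
MASK13 = 0x1FFF  # lower 13 bits of a 16-bit word


def pack16(coeffs):
    """Pack 16 values below 2**13 into 13 16-bit words (26 bytes)."""
    assert len(coeffs) == 16 and all(0 <= c <= MASK13 for c in coeffs)
    words = list(coeffs[:13])                      # a0..a12 in the low 13 bits
    # a13, a14, a15 form a 39-bit stream that fills the 3 spare bits per word
    stream = coeffs[13] | (coeffs[14] << 13) | (coeffs[15] << 26)
    for w in range(13):
        limb = (stream >> (3 * w)) & 0x7           # next 3 bits of the stream
        words[w] |= limb << 13                     # place them in bits 13..15
    return words


def unpack16(words):
    """Inverse of pack16: recover the 16 coefficients."""
    low = [w & MASK13 for w in words]
    stream = 0
    for w, word in enumerate(words):
        stream |= (word >> 13) << (3 * w)
    high = [(stream >> (13 * k)) & MASK13 for k in range(3)]
    return low + high


coeffs = [(7 * i + 4000) & MASK13 for i in range(16)]
words = pack16(coeffs)
print(len(words) * 2, "bytes;", "round trip ok:", unpack16(words) == coeffs)
```

Storing each coefficient in its own 16-bit word would take 32 bytes, so the packed form saves 6 bytes per 16 coefficients at the cost of the extra masking and shifting when a13 to a15 are accessed.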

slide-69
SLIDE 69

Optimization of the Knuth-Yao Sampler

Algorithm 3: Bit Scanning

for row from MAXROW by −1 to 0 do
    d ← d − Pmat[row][col] {Bit wise computations}
    . . . omit . . .
end for

Algorithm 4: Byte Scanning

for row from MAXROW by −8 to 0 do
    if (Pmat[row][col] ∥ . . . ∥ Pmat[row − 7][col]) > 0 then
        sum ← ∑ (i = row − 7 .. row) Pmat[i][col]
        d ← d − sum {Byte wise computations}
        . . . omit . . .
    end if
end for

Byte scanning saves 7 branch operations at the expense of 1 sub
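
The difference between the two scanning strategies can be illustrated with a small Python model. It is a sketch under stated assumptions, not the authors' AVR code: the steps omitted in Algorithms 3 and 4 are assumed to be the usual Knuth-Yao termination test (the walk ends at the row where d reaches −1), d is assumed to be non-negative on entry, and the helper names `bit_scan` and `byte_scan` are hypothetical.

```python
def bit_scan(column, d):
    """Scan one Pmat column bit by bit, from the last row down to row 0.

    column[row] is the bit Pmat[row][col]. Returns (hit_row, d); hit_row is
    None when the walk does not terminate in this column.
    """
    for row in range(len(column) - 1, -1, -1):
        d -= column[row]
        if d == -1:              # assumed termination test
            return row, d
    return None, d


def byte_scan(column, d):
    """Same result as bit_scan, but up to 8 rows are handled per step."""
    row = len(column) - 1
    while row >= 0:
        low = max(row - 7, 0)
        ones = sum(column[low:row + 1])      # number of set bits in this byte
        if ones > 0:                         # all-zero bytes cost one test
            if ones > d:                     # the hit lies inside this byte
                for r in range(row, low - 1, -1):
                    d -= column[r]
                    if d == -1:
                        return r, d          # always reached when ones > d
            d -= ones                        # one subtraction for 8 rows
        row = low - 1
    return None, d


column = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
print(bit_scan(column, 1), byte_scan(column, 1))
```

Columns of the probability matrix are mostly zero, so most bytes are skipped by the single `ones > 0` test, which is where the saved branches come from.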

slide-70
SLIDE 70

Optimization of the Knuth-Yao Sampler

Comparison between BitScanning and ByteScanning

Bit Scanning Byte Scanning

slide-71
SLIDE 71

Optimization of the Knuth-Yao Sampler

(Step 1): BitScanning (1-bit), ByteScanning (1-byte)

Bit Scanning Byte Scanning

slide-72
SLIDE 72

Optimization of the Knuth-Yao Sampler

(Step 2): BitScanning (2-bit), ByteScanning (2-byte)

Bit Scanning Byte Scanning

slide-79
SLIDE 79

Optimization of the Knuth-Yao Sampler

Pseudo Random Number Generation

- AES block cipher in counter mode
- The ATxmega128A1 provides a hardware AES engine (375 cycles)
- A software AES requires 1.9K cycles and 2 KB ROM

Parallel Computations

- The AES engine and the processor run simultaneously
- PRNG and Knuth-Yao (KY) sampling are performed at the same time

slide-83
SLIDE 83

Performance Evaluation

[Figure: bar chart of execution time (in clock cycles, axis up to 4,500,000) for Key_Gen, Enc and Dec; series HS-256, ME-256, HS-512, ME-512]

- High speed (HS) is 2.3x faster than memory efficient (ME)
- ME version requires sophisticated memory alignments

slide-86
SLIDE 86

Memory Evaluation

[Figure: bar chart of RAM requirements (in bytes, axis up to 7,000) for Key_Gen, Enc, Dec and Total; series HS-256, ME-256, HS-512, ME-512]

- Compared to HS, the ME version reduces the RAM by 21 %
- HS and ME use 77 % and 56 % of the 8 KB RAM, respectively

slide-90
SLIDE 90

Comparison with Related Work

Execution time in clock cycles (K = thousand):

Implementation        Key-Gen   Enc      Dec
Boorghany (256)       2,770K    3,042K   1,368K
Pöppelmann (256)      n/a       1,314K   381K
This work (HS-256)    589K      671K     275K
Pöppelmann (512)      n/a       3,279K   1,019K
This work (HS-512)    2,165K    2,617K   686K

- Compared with Boorghany et al.: 4.5x faster (ENC, 256)
- Compared with Pöppelmann et al.: 2x and 1.25x faster (ENC, 256/512)
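
The quoted encryption speedups can be re-derived from the cycle counts in the table. The short Python check below copies the Enc column (in thousands of clock cycles); the dictionary and the helper name `speedup` are illustrative only, and because the table values are rounded, the ratios are approximate.

```python
# Enc cycle counts in thousands of clock cycles, copied from the table above.
enc_kcycles = {
    "Boorghany (256)": 3042,
    "Pöppelmann (256)": 1314,
    "This work (HS-256)": 671,
    "Pöppelmann (512)": 3279,
    "This work (HS-512)": 2617,
}


def speedup(baseline, ours):
    """How many times fewer cycles `ours` needs than `baseline`."""
    return enc_kcycles[baseline] / enc_kcycles[ours]


print(round(speedup("Boorghany (256)", "This work (HS-256)"), 2))   # 4.53
print(round(speedup("Pöppelmann (256)", "This work (HS-256)"), 2))  # 1.96
print(round(speedup("Pöppelmann (512)", "This work (HS-512)"), 2))  # 1.25
```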

slide-93
SLIDE 93

Comparison with Related Work

Execution time in clock cycles (K = thousand):

Implementation       Scheme     Enc       Dec
Liu et al.           RSA-1024   n/a       75,680K
Düll et al. (HS)     ECC-255    27,800K   13,900K
Düll et al. (ME)     ECC-255    28,293K   14,146K
Aranha et al.        ECC-233    11,796K   5,898K
This work (HS)       LWE-256    671K      275K
This work (ME)       LWE-256    1,532K    673K

- Compared with RSA: 278x faster (DEC, 1024)
- Compared with ECC: 41x faster (ENC, 255)

slide-102
SLIDE 102

Conclusion

Compact Ring-LWE encryption for 8-bit platforms

- Fast NTT computation: “MOV-and-ADD” + “SAMS2”
- Reduced RAM consumption for coefficient storage
- Efficient technique for the Knuth-Yao sampler: “Byte-Scanning”

Faster than RSA (278x) and ECC (41x)

More information:

- Software is available by contacting the authors

Thank you for your attention