Public key cryptography on IoT devices

Sujoy Sinha Roy, COSIC, KU Leuven



  1. Public key cryptography on IoT devices. Sujoy Sinha Roy, COSIC, KU Leuven

  2. • Small area for HW implementations
     • Small code size for SW implementation
     • Low power or energy or both
     • Reasonably fast computation time

  3. This talk
     Lightweight hardware implementation of Elliptic Curve Cryptography (ECC)
     ➢ Over binary field
     ➢ Over prime field

  4. Elliptic curves over binary field $\mathbb{F}_{2^m}$
     • Generic elliptic curves: $y^2 + xy = x^3 + ax^2 + b$, where $a$ and $b$ are from $\mathbb{F}_{2^m}$
     • Point addition (finite field operations): $P_3(x_3, y_3) = P_1(x_1, y_1) + P_2(x_2, y_2)$
       ➢ $\lambda = (y_1 + y_2)/(x_1 + x_2)$
       ➢ $x_3 = \lambda^2 + \lambda + x_1 + x_2 + a$
       ➢ $y_3 = \lambda(x_1 + x_3) + x_3 + y_1$
     • Point doubling: $P_3(x_3, y_3) = 2P_1(x_1, y_1)$
       ➢ $\lambda = x_1 + y_1/x_1$
       ➢ $x_3 = \lambda^2 + \lambda + a$
       ➢ $y_3 = x_1^2 + \lambda x_3 + x_3$
     • Scalar multiplication: base point $P(x, y)$ on the curve and scalar $n$; $nP = P + P + P + \dots + P$
     • Scalar multiplication using the double-and-add algorithm: for scalar bits … 0 1 0 1 1 …, one point doubling (PD) per bit and one point addition (PA) per 1 bit
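The affine formulas and the double-and-add loop from this slide can be sketched in a few lines of Python. This is only an illustration: the field $\mathbb{F}_{2^4}$ with reduction polynomial $x^4 + x + 1$, the curve parameters `A = B = 1`, and all function names are assumptions chosen to keep the example tiny, not parameters from the talk.

```python
# Toy binary field GF(2^4); elements are ints whose bits are polynomial coefficients.
M = 4
POLY = 0b10011  # x^4 + x + 1 (assumed toy reduction polynomial)
A, B = 1, 1     # assumed toy curve y^2 + xy = x^3 + A*x^2 + B

def fmul(a, b):
    """Shift-and-add multiplication in GF(2^M) with interleaved reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r

def finv(a):
    """Inverse of a nonzero element, computed as a^(2^M - 2)."""
    r, e = 1, (1 << M) - 2
    while e:
        if e & 1:
            r = fmul(r, a)
        a = fmul(a, a)
        e >>= 1
    return r

def on_curve(P):
    if P is None:  # None stands for the point at infinity
        return True
    x, y = P
    x2 = fmul(x, x)
    return fmul(y, y) ^ fmul(x, y) == fmul(x2, x) ^ fmul(A, x2) ^ B

def add(P, Q):
    """Affine point addition (PA); falls back to doubling (PD) when P == Q."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2:
        if y1 != y2 or x1 == 0:  # Q == -P, or P has order 2
            return None
        lam = x1 ^ fmul(y1, finv(x1))
        x3 = fmul(lam, lam) ^ lam ^ A
        y3 = fmul(x1, x1) ^ fmul(lam, x3) ^ x3
        return (x3, y3)
    lam = fmul(y1 ^ y2, finv(x1 ^ x2))
    x3 = fmul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
    y3 = fmul(lam, x1 ^ x3) ^ x3 ^ y1
    return (x3, y3)

def scalar_mul(n, P):
    """Left-to-right double-and-add: PD for every bit, PA for every 1 bit."""
    R = None
    for bit in bin(n)[2:]:
        R = add(R, R)
        if bit == "1":
            R = add(R, P)
    return R
```

Note that the conditional `add` in `scalar_mul` is exactly the kind of scalar-dependent behaviour that the countermeasures discussed later in the talk try to avoid.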

  5. Lightweight ECC: common tricks
     • Choice of elliptic curve, finite field, etc.
       ➢ Special arithmetic such as endomorphism
       ➢ Sparse irreducible polynomial
     • Efficient point multiplication algorithm
       ➢ Reduces the number of field operations
       ➢ Also the number of registers
       ➢ E.g., Montgomery ladder, special encoding of the scalar, etc.
     • Projective coordinate system
       ➢ Inversion-free
     • Affordable lightweight countermeasures
       ➢ Constant-time arithmetic, e.g., Montgomery ladder
       ➢ Random projective coordinates
       ➢ Scalar randomization (maybe?)

  6. Uses NIST 163-bit ECC over $\mathbb{F}_{2^{163}}$, ~80-bit security
     “A 5.1 μJ per point-multiplication elliptic curve cryptographic processor” by V. Rozic, O. Reparaz, and I. Verbauwhede, published in IJCTA 2016.

  7. • Co-processor architecture
     • Components within the ‘- -’ rectangle are implemented on the chip

  8. • Algorithm: Montgomery ladder with projective coordinates
     • Circular register file
     • Digit-serial arithmetic unit (MALU)
     • Full-custom balanced layout for the register file and MALU

  9. Measurement results
     • UMC 130 nm
     • Core area: 0.54 mm²
     • Scalar multiplication: 86K cycles (102 ms at 847.5 kHz)
     • Power: 50.4 µW at 847.5 kHz
     • Energy per scalar multiplication: 5.1 µJ

  10. Uses NIST 283-bit Koblitz curve over $\mathbb{F}_{2^{283}}$, ~140-bit security

  11. Elliptic curves over $\mathbb{F}_{2^m}$
      • Generic elliptic curves: $y^2 + xy = x^3 + ax^2 + b$
        ➢ Point addition $P_3(x_3, y_3) = P_1(x_1, y_1) + P_2(x_2, y_2)$: $x_3 = \lambda^2 + \lambda + x_1 + x_2 + a$, $y_3 = \lambda(x_1 + x_3) + x_3 + y_1$
        ➢ Point doubling $P_3(x_3, y_3) = 2P_1(x_1, y_1)$: $x_3 = \lambda^2 + \lambda + a$, $y_3 = x_1^2 + \lambda x_3 + x_3$
        ➢ Scalar multiplication: for scalar bits … 0 1 0 1 1 …, PD for every bit and PA for every 1 bit
      • Koblitz curves: $y^2 + xy = x^3 + ax^2 + 1$, $a = 0$ or $1$
        ➢ Point addition $P_3(x_3, y_3) = P_1(x_1, y_1) + P_2(x_2, y_2)$: $x_3 = \lambda^2 + \lambda + x_1 + x_2 + a$, $y_3 = \lambda(x_1 + x_3) + x_3 + y_1$
        ➢ Frobenius endomorphism (FE) in place of point doubling: $x_3 = x_1^2$, $y_3 = y_1^2$. Cheap!
        ➢ Scalar multiplication: for scalar digits … 0 1 0 1 1 …, FE for every digit and PA for every nonzero digit
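To see why the Frobenius map can stand in for doublings, it helps to check its characteristic relation on a small example. The sketch below uses the standard identity $\tau^2 P + 2P = \mu\,\tau P$ with $\mu = (-1)^{1-a}$, which is textbook material for Koblitz curves rather than something stated on the slide. The toy field $\mathbb{F}_{2^4}$, the polynomial $x^4 + x + 1$, and the function names are assumptions for illustration only.

```python
# Toy check of the Frobenius relation on y^2 + xy = x^3 + A*x^2 + 1 over GF(2^4).
M = 4
POLY = 0b10011  # x^4 + x + 1 (assumed toy reduction polynomial)
A, B = 1, 1     # Koblitz form with a = 1, hence mu = +1

def fmul(a, b):
    """Multiplication in GF(2^M)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r

def finv(a):
    """Inverse of a nonzero element as a^(2^M - 2)."""
    r, e = 1, (1 << M) - 2
    while e:
        if e & 1:
            r = fmul(r, a)
        a = fmul(a, a)
        e >>= 1
    return r

def on_curve(P):
    if P is None:  # point at infinity
        return True
    x, y = P
    x2 = fmul(x, x)
    return fmul(y, y) ^ fmul(x, y) == fmul(x2, x) ^ fmul(A, x2) ^ B

def add(P, Q):
    """Affine addition, including the doubling and infinity cases."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2:
        if y1 != y2 or x1 == 0:
            return None
        lam = x1 ^ fmul(y1, finv(x1))
        x3 = fmul(lam, lam) ^ lam ^ A
        return (x3, fmul(x1, x1) ^ fmul(lam, x3) ^ x3)
    lam = fmul(y1 ^ y2, finv(x1 ^ x2))
    x3 = fmul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
    return (x3, fmul(lam, x1 ^ x3) ^ x3 ^ y1)

def frobenius(P):
    """tau(x, y) = (x^2, y^2): two squarings, no inversion."""
    if P is None:
        return None
    x, y = P
    return (fmul(x, x), fmul(y, y))

def frobenius_relation_holds(P):
    """Check tau^2(P) + 2P == tau(P), i.e. the relation for mu = +1 (a = 1)."""
    lhs = add(frobenius(frobenius(P)), add(P, P))
    return lhs == frobenius(P)
```

Because $2P$ can be rewritten in terms of $\tau P$ and $\tau^2 P$, a scalar expressed in base $\tau$ needs no doublings at all, which is what the scalar conversion step on the following slides provides.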

  12. But there is a catch …
      • Generic elliptic curve: the scalar bits … 0 1 0 1 1 … drive the PD and PA sequence directly
      • Koblitz curve: the scalar bits … 0 1 0 1 1 … first go through scalar conversion, which then drives the FE and PA sequence

  13. But there is a catch …
      • Generic elliptic curve: the scalar bits … 0 1 0 1 1 … drive the PD and PA sequence directly
      • Koblitz curve: the scalar bits … 0 1 0 1 1 … first go through scalar conversion, which then drives the FE and PA sequence
      • Several implementations of lightweight ECC over $\mathbb{F}_{2^m}$

  14. Scalar conversion
      • Step 1: the scalar is reduced using the lazy reduction by Brumley and Järvinen
      • Step 2: zero-free expansion by Okeya, Takagi, and Vuillaume
      ⇒ For Koblitz curve K283, integer add/sub of size 283 bits

  15. Optimization
      • We avoid negations
        ➢ We compute $(d_0, d_1) \leftarrow (d_0/2 - d_1, d_0/2)$
        ➢ We compute $(a_0, a_1) \leftarrow (2a_1, a_1 - a_0)$
        ➢ We compute $(b_0, b_1) \leftarrow (b_0/2 - b_1, b_0/2)$
        ➢ The sign is corrected at the end of the loop
      • Saves 1/3 of the cycles!
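One way to read the first update is as a division by $\tau$ with the sign flipped. The sketch below rests on assumptions that the slide does not state: that $(d_0, d_1)$ stands for $d_0 + d_1\tau$, that $d_0$ is even, and that $\mu = -1$ (K-283 has $a = 0$), so that $\tau^2 = \mu\tau - 2$. The function names are hypothetical.

```python
MU = -1  # assumption: a = 0 as for NIST K-283, so mu = (-1)^(1-a) = -1

def mul_tau(q0, q1):
    """tau * (q0 + q1*tau), reduced with tau^2 = MU*tau - 2."""
    return (-2 * q1, q0 + MU * q1)

def div_tau(d0, d1):
    """Exact (d0 + d1*tau) / tau; requires d0 even. Needs a negation."""
    h = d0 // 2
    return (d1 + MU * h, -h)

def div_tau_negated(d0, d1):
    """Update in the form given on the slide: (d0/2 - d1, d0/2), no negation."""
    h = d0 // 2
    return (h - d1, h)
```

Under these assumptions `div_tau_negated` returns exactly the negative of `div_tau`, so after an even number of steps the two agree and after an odd number they differ only in sign. That is consistent with correcting the sign once at the end of the loop instead of negating in every iteration.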

  16. SPA resistance
      • 0 or 1: a conditional multi-precision addition reveals information about the secret scalar

  17. SPA resistance
      • 0 or 1: a conditional multi-precision addition reveals information about the secret scalar
      • We generate $u \in \{-1, 1\}$ using the zero-free function Ψ( )
        ➢ $u = -1$: then $b_0 - a_0$
        ➢ $u = +1$: then $b_0 + a_0$
      • Similar operations ⇒ increased SPA resistance!

  18. Scalar multiplication
      • Scalar conversion produces a zero-free representation
      • The zero-free representation is generated in (almost) constant time
      • Conversion is done once per scalar ⇒ the attacker has one trace
      • The accumulator point is randomized as shown by Coron: $(X; Y; Z) = (xr; yr^2; r)$, where $r$ is random
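The randomization formula can be illustrated with a short sketch. It assumes projective coordinates in which the affine point is recovered as $x = X/Z$, $y = Y/Z^2$, which is what the formula $(xr; yr^2; r)$ suggests. The toy field $\mathbb{F}_{2^4}$ and the function names are assumptions for illustration only.

```python
import secrets

M = 4
POLY = 0b10011  # x^4 + x + 1 (assumed toy reduction polynomial)

def fmul(a, b):
    """Multiplication in GF(2^M)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= POLY
    return r

def finv(a):
    """Inverse of a nonzero element as a^(2^M - 2)."""
    r, e = 1, (1 << M) - 2
    while e:
        if e & 1:
            r = fmul(r, a)
        a = fmul(a, a)
        e >>= 1
    return r

def randomize(x, y):
    """Map affine (x, y) to (x*r, y*r^2, r) for a random nonzero r."""
    r = 1 + secrets.randbelow((1 << M) - 1)  # uniform in 1 .. 2^M - 1
    return (fmul(x, r), fmul(y, fmul(r, r)), r)

def to_affine(X, Y, Z):
    """Undo the scaling: x = X/Z, y = Y/Z^2."""
    zi = finv(Z)
    return (fmul(X, zi), fmul(Y, fmul(zi, zi)))
```

Every run produces a different internal representation of the same point, so intermediate values in the scalar multiplication change from one execution to the next even when the scalar and base point stay fixed.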

  19. Lightweight 283-bit Koblitz curve processor
      • Area: 4.3 KGE (without RAM), ~10 KGE (with RAM)
      • RAM size: 4032 bits
      • Time: 1,566,000 cycles, 98 ms (16 MHz)
      • Energy: 9.6 µJ
      • Power: 98 µW (1 MHz)
      “Lightweight coprocessor for Koblitz curves: 283-bit ECC including scalar conversion with only 4300 gates” by SS Roy, K Järvinen, I Verbauwhede in CHES 2015

  20. An implementation over prime field

  21. Curve25519
      • $E : y^2 = x^3 + 486662x^2 + x$
      • 128-bit security
      • Montgomery curve
      • Efficient prime $p = 2^{255} - 19$
      • Known for fast arithmetic

  22. Curve25519
      • $E : y^2 = x^3 + 486662x^2 + x$
      • 128-bit security
      • Montgomery curve
      • Efficient prime $p = 2^{255} - 19$
      • Known for fast arithmetic
      • Montgomery ladder
        ➢ Combined PA-PD
        ➢ No need to store the y-coordinate!
        ➢ $4S + 5M + M_A + 8A$
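A minimal sketch of an x-only Montgomery ladder for this curve follows, written after the ladder formulas in RFC 7748 rather than taken from the talk. Each loop iteration performs 4 squarings, 5 general multiplications, 1 multiplication by the constant 121665 and 8 additions or subtractions, which matches the operation count on the slide. Scalar clamping and byte encoding are left out, the conditional swap is a plain Python `if` and therefore not constant time, and the function name `ladder` is an assumption.

```python
P = 2**255 - 19
A24 = 121665  # (486662 - 2) / 4

def ladder(k, u, bits=255):
    """Return the x-coordinate of k*Q, where Q has x-coordinate u."""
    x1 = u % P
    x2, z2 = 1, 0   # point at infinity
    x3, z3 = x1, 1  # Q
    swap = 0
    for t in reversed(range(bits)):
        kt = (k >> t) & 1
        swap ^= kt
        if swap:  # NOT constant time; hardware would use a cswap
            x2, x3 = x3, x2
            z2, z3 = z3, z2
        swap = kt
        # Combined differential addition and doubling
        a = (x2 + z2) % P
        aa = a * a % P
        b = (x2 - z2) % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3 = x3, x2
        z2, z3 = z3, z2
    # Single inversion at the very end (Fermat's little theorem)
    return x2 * pow(z2, P - 2, P) % P
```

Only x and z values are carried through the loop, which is why the y-coordinate never needs to be stored.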

  23. Curve25519
      • Efficient prime $p = 2^{255} - 19$
      • Modular reduction is easier: $C = AB = C_1 \cdot 2^{255} + C_0$, so $C \bmod p = (C_1 \cdot 19 + C_0) \bmod p$
      • $15 \times 17 = 255$
      • Special acceleration on HW by processing words of 17 bits
      • E.g., Xilinx FPGAs have 25×18 DSP multipliers
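The reduction rule on this slide follows from $2^{255} \equiv 19 \pmod p$ and can be sketched directly on Python integers. The limb helpers show the 15-word by 17-bit split mentioned above. All function names are assumptions, and real hardware would operate on the limbs rather than on big integers.

```python
P = 2**255 - 19
MASK255 = 2**255 - 1

def reduce_mod_p(c):
    """Reduce a non-negative integer modulo 2^255 - 19 by folding the high part."""
    while c >> 255:
        # c = c1 * 2^255 + c0  ->  c1 * 19 + c0, since 2^255 = 19 (mod p)
        c = (c >> 255) * 19 + (c & MASK255)
    # Now c < 2^255, so at most one final subtraction is needed
    return c - P if c >= P else c

def to_limbs(x, w=17, n=15):
    """Split a 255-bit value into n limbs of w bits, least significant first."""
    return [(x >> (w * i)) & ((1 << w) - 1) for i in range(n)]

def from_limbs(limbs, w=17):
    """Recombine limbs produced by to_limbs."""
    return sum(limb << (w * i) for i, limb in enumerate(limbs))
```

A 17-bit limb fits the unsigned input width of an 18-bit signed multiplier port, which is one plausible reason the slide pairs the 15 × 17 split with 25×18 DSP blocks.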

  24. • Throughput: 25,000 point multiplications per second
      • Area of point multiplier: 2,783 LUTs, 3,592 FFs, 20 DSP multipliers
      • Parallel processing for high throughput
      • Modular multiplier for Curve25519
      “Efficient Elliptic-Curve Cryptography using Curve25519 on Reconfigurable Devices” by Sasdrich and Güneysu in ARC 2014

  25. Lightweight architecture for Curve25519
      • 32-bit word-serial architecture
      • Single-port memory
      • 32-bit multiplier parameterized for digit width w = 2, 4, 8, 12 and 16
        ➢ Speed vs area
      • ASIP ⇒ programmable
      “NaCl’s crypto_box in hardware” by M. Hutter, J. Schilling, P. Schwabe, and W. Wieser in CHES 2015. Architecture diagram taken from CHES 2015 presentation.

  26. Lightweight architecture for Curve25519: results
      • Note: unified implementation of Curve25519, Salsa20 and Poly1305
      • Smallest configuration: area 14,648 GE, power 40 µW (including optimized RAM); key exchange takes 3,455,394 cycles
      • Fastest configuration: area 17,966 GE, power 70 µW (including optimized RAM); key exchange takes 811,170 cycles

  27-34. Of general interest … [product scanning]
      • Classical product scanning example: $a_3 a_2 a_1 a_0 \times b_3 b_2 b_1 b_0$
      • Slides 27 to 31 add the partial products one result column at a time:
        ➢ $c_0$: $a_0 b_0$
        ➢ $c_1$: $a_1 b_0$, $a_0 b_1$
        ➢ $c_2$: $a_2 b_0$, $a_1 b_1$, $a_0 b_2$
        ➢ $c_3$: $a_3 b_0$, $a_2 b_1$, $a_1 b_2$, $a_0 b_3$
        ➢ $c_4$: $a_3 b_1$, $a_2 b_2$, …
      • Slides 32 to 34 step through a multiply (×) and accumulate (+) unit processing $a_0 b_0$, $a_1 b_0$ and $a_0 b_1$ in turn
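The column-by-column order in the product scanning slides can be sketched as a multi-precision multiplication over word arrays. The word size `w`, the little-endian limb layout, and the function name are assumptions for illustration. A Python integer stands in for the wide accumulator that a hardware multiply-accumulate unit would provide.

```python
def product_scanning(a, b, w=32):
    """Multiply two equal-length little-endian word arrays column by column.

    For each result word c[k], all partial products a[i] * b[k - i] are
    accumulated first; only then is one word written out and the rest
    carried into the next column.
    """
    n = len(a)
    assert len(b) == n
    mask = (1 << w) - 1
    c = [0] * (2 * n)
    acc = 0
    for k in range(2 * n - 1):
        lo = max(0, k - n + 1)
        hi = min(k, n - 1)
        for i in range(lo, hi + 1):
            acc += a[i] * b[k - i]  # multiply-accumulate
        c[k] = acc & mask           # store one result word
        acc >>= w                   # carry into the next column
    c[2 * n - 1] = acc
    return c

def words_to_int(words, w=32):
    """Helper for checking: recombine little-endian words into an integer."""
    return sum(word << (w * i) for i, word in enumerate(words))
```

Compared with operand scanning (row by row), each result word is written exactly once, which reduces memory writes at the cost of a wider accumulator.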
