The complete cost of cofactor h = 1: Implementing Weierstrass curves with complete formulas

Peter Schwabe and Daan Sprenkels, Radboud University (peter@cryptojedi.org, daan@dsprenkels.com), 18 December 2019

Introduction: Some history


  1. Benchmarks
     Figure: cycle counts in kcc (kilocycles)

     Implementation                  SB              H        M4
     Chou16 [Cho16]                  159 a, 156 b    –        –
     Faz-Hernández-López15 [FL15]    –               156 a    –
     OLHF18 [OLH+18]                 –               139 a    –
     Fujii-Aranha19 [FA19]           –               –        907 a
     Haase-Labrique19 [HL19]         –               –        625 a
     Curve13318 (this work)          390 b           205 b    1 797 b
     Slowdown                        2.45×           1.47×    2.87×

     a As reported in the respective publication.
     b From own measurements.
     SB = Sandy Bridge, H = Haswell, M4 = Cortex-M4.

  2. Future work
     ◮ Use formulas from [SM17]
     ◮ Benchmark with ristretto255

  3. Thank you!
     The code is at https://github.com/dsprenkels/curve13318-all (public domain)
     Extra reading:
     ◮ Paper: https://dsprenkels.com/files/curve13318.pdf
     ◮ Monero vulnerability (1): https://nickler.ninja/blog/2017/05/23/exploiting-low-order-generators-in-one-time-ring-signatures/
     ◮ Monero vulnerability (2): https://moderncrypto.org/mail-archive/curves/2017/000898.html

  4. References
     Paulo S. L. M. Barreto. Tweet, 2017. https://twitter.com/pbarreto/status/869103226276134912
     Daniel J. Bernstein. Floating-point arithmetic and message authentication, 2004. http://cr.yp.to/papers.html#hash127
     Daniel J. Bernstein. Curve25519: new Diffie-Hellman speed records. In Moti Yung, Yevgeniy Dodis, Aggelos Kiayias, and Tal Malkin, editors, Public Key Cryptography – PKC 2006, volume 3958 of LNCS, pages 207–228. Springer, 2006. http://cr.yp.to/papers.html#curve25519
     Daniel J. Bernstein and Tanja Lange. eBACS: ECRYPT Benchmarking of Cryptographic Systems. https://bench.cr.yp.to/results-sign.html (accessed 2019-10-03).
     Daniel J. Bernstein and Peter Schwabe. NEON crypto. In Emmanuel Prouff and Patrick Schaumont, editors, Cryptographic Hardware and Embedded Systems – CHES 2012, volume 7428 of LNCS, pages 320–339. Springer, 2012. http://cryptojedi.org/papers/#neoncrypto
     Tung Chou. Sandy2x: New Curve25519 speed records. In Orr Dunkelman and Liam Keliher, editors, Selected Areas in Cryptography – SAC 2015, volume 9566 of LNCS, pages 145–160. Springer, 2016. https://www.win.tue.nl/~tchou/papers/sandy2x.pdf
     Cas Cremers and Dennis Jackson. Prime, order please! Revisiting small subgroup and invalid curve attacks on protocols using Diffie-Hellman. In 2019 IEEE 32nd Computer Security Foundations Symposium (CSF), pages 78–93, 2019. https://eprint.iacr.org/2019/526
     Henry de Valence, Jack Grigg, George Tankersley, Filippo Valsorda, and Isis Lovecruft. The ristretto255 group. IETF CFRG Internet Draft, 2019. https://tools.ietf.org/html/draft-hdevalence-cfrg-ristretto-01 (accessed 2019-07-31).
     Hayato Fujii and Diego F. Aranha. Curve25519 for the Cortex-M4 and Beyond. In Tanja Lange and Orr Dunkelman, editors, Progress in Cryptology – LATINCRYPT 2017, volume 11368 of LNCS, pages 109–127. Springer, 2019. http://www.cs.haifa.ac.il/~orrd/LC17/paper39.pdf
     Armando Faz-Hernández and Julio López. Fast implementation of Curve25519 using AVX2. In Kristin Lauter and Francisco Rodríguez-Henríquez, editors, Progress in Cryptology – LATINCRYPT 2015, volume 9230 of LNCS, pages 329–345. Springer, 2015.
     Mike Hamburg. Decaf: Eliminating cofactors through point compression. In Rosario Gennaro and Matthew Robshaw, editors, Advances in Cryptology – CRYPTO 2015, volume 9215 of LNCS, pages 705–723. Springer, 2015. https://www.shiftleft.org/papers/decaf/
     Björn Haase and Benoît Labrique. AuCPace: Efficient verifier-based PAKE protocol tailored for the IIoT. IACR Transactions on Cryptographic Hardware and Embedded Systems, pages 1–48, 2019. https://tches.iacr.org/index.php/TCHES/article/view/7384
     luigi1111 and Riccardo “fluffypony” Spagni. Disclosure of a major bug in CryptoNote based currencies. Post on the Monero website, 2017. https://www.getmonero.org/2017/05/17/disclosure-of-a-major-bug-in-cryptonote-based-currencies.html (accessed 2019-07-31).
     Thomaz Oliveira, Julio López, Hüseyin Hışıl, Armando Faz-Hernández, and Francisco Rodríguez-Henríquez. How to (Pre-)Compute a Ladder. In Carlisle Adams and Jan Camenisch, editors, Selected Areas in Cryptography – SAC 2017, volume 10719 of LNCS, pages 172–191. Springer, 2018. https://eprint.iacr.org/2017/264.pdf
     Joost Renes, Craig Costello, and Lejla Batina. Complete addition formulas for prime order elliptic curves. In Marc Fischlin and Jean-Sébastien Coron, editors, Advances in Cryptology – Eurocrypt 2016, volume 9665 of LNCS, pages 403–428. Springer, 2016. http://eprint.iacr.org/2015/1060
     Ruggero Susella and Sofia Montrasio. A compact and exception-free ladder for all short Weierstrass elliptic curves. In Kerstin Lemke-Rust and Michael Tunstall, editors, Smart Card Research and Advanced Applications, volume 10146 of LNCS, pages 156–173. Springer, 2017.

  15. Preliminaries

  16. Elliptic curves
      $E : y^2 = x^3 + ax + b$
      [Figure: plot of E over the real numbers; the x-axis and y-axis both range from −4 to 4.]

  18. Elliptic curves: addition
      $E : y^2 = x^3 + ax + b$
      [Figure: plot of E over the real numbers with the points P, Q, −R, and R = P + Q marked; both axes range from −4 to 4.]

  19. Elliptic curves: doubling
      $E : y^2 = x^3 + ax + b$
      [Figure: plot of E over the real numbers with the points P, −R, and R = [2]P marked; both axes range from −4 to 4.]

  20. Elliptic curves
      ◮ The set of points includes the point at infinity O
        • Define P + O = P
      ◮ Curve equation: $E : y^2 = x^3 + ax + b$
      ◮ Coordinates are defined over a field $\mathbb{F}_q$
        • I.e. integers modulo q (for prime q)

  23. Elliptic curves: actually
      $E : y^2 = x^3 - 3x + 1$ defined over $\mathbb{F}_{11}$
      [Figure: scatter plot of the points of E; x ranges from 0 to 11, y from −5 to 5.]

  24. Elliptic curves: actual addition
      $E : y^2 = x^3 - 3x + 1$ defined over $\mathbb{F}_{11}$
      [Figure: the same scatter plot with the points P, Q, R, and −R marked; x ranges from 0 to 11, y from −5 to 5.]
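To make the addition on the toy curve over $\mathbb{F}_{11}$ concrete, here is a minimal Python sketch of the textbook affine chord-and-tangent rule. The encoding of O as `None`, the function names, and the sample points are assumptions made for this illustration; the slides do not say which points P and Q the figure uses.

```python
# Toy curve from the slides: y^2 = x^3 - 3x + 1 over F_11.
P_MOD, A, B = 11, -3, 1
O = None  # point at infinity (encoding chosen for this sketch)


def on_curve(pt):
    """Check the curve equation for an affine point (O counts as on the curve)."""
    if pt is O:
        return True
    x, y = pt
    return (y * y - (x**3 + A * x + B)) % P_MOD == 0


def add(p, q):
    """Chord-and-tangent addition in affine coordinates."""
    if p is O:
        return q
    if q is O:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O  # P + (-P) = O
    if p == q:
        # Tangent slope; needs Python 3.8+ for the modular inverse via pow().
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        # Chord slope.
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


P, Q = (0, 1), (4, 3)   # two sample points on the curve
R = add(P, Q)
print(R, on_curve(R))   # (10, 5) True
```

Note the special cases (O, P + (−P), doubling) that this affine rule has to branch on; avoiding such branches is what the complete formulas later in the talk are for.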

  25. Group arithmetic
      ◮ We can do arithmetic with these rules! :)
      ◮ Addition: P + Q
      ◮ Subtraction: P − Q
      ◮ Neutral element: O, i.e. “zero”
      ◮ Scalar multiplication: $[k]P = \underbrace{P + P + \dots + P}_{k \text{ times}}$
      ◮ Discrete log problem: given P and Q with [k]P = Q, it is hard to find k

  28. Elliptic curves are cyclic
      ◮ Points form a cycle:
        $\underbrace{O \xrightarrow{+P} P \xrightarrow{+P} [2]P \xrightarrow{+P} [3]P \xrightarrow{+P} \dots \xrightarrow{+P} [n-1]P \xrightarrow{+P} O}_{n \text{ steps}}$
      ◮ The order n should contain a large prime factor
      ◮ Only one cycle if n is prime
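The cycle can be checked by brute force on the toy curve $y^2 = x^3 - 3x + 1$ over $\mathbb{F}_{11}$ from the earlier slides. The sketch below counts all points and walks the cycle generated by one of them; the helper names and the encoding of O as `None` are assumptions of this example.

```python
P_MOD, A, B = 11, -3, 1   # y^2 = x^3 - 3x + 1 over F_11, as on the slides
O = None                  # point at infinity (encoding chosen for this sketch)


def add(p, q):
    """Affine chord-and-tangent addition."""
    if p is O:
        return q
    if q is O:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def order(pt):
    """Number of '+ pt' steps needed to get from O back to O."""
    steps, acc = 1, pt
    while acc is not O:
        acc = add(acc, pt)
        steps += 1
    return steps


# Enumerate every point: all affine solutions plus the point at infinity.
points = [O] + [(x, y) for x in range(P_MOD) for y in range(P_MOD)
                if (y * y - (x**3 + A * x + B)) % P_MOD == 0]
n = len(points)
print(n)               # 17
print(order((0, 1)))   # 17
```

Since n = 17 is prime, every point other than O generates the same single cycle of 17 steps.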

  30. Cofactors
      ◮ If n is not prime, then n = h · ℓ, where ℓ is the large prime factor
      ◮ I.e. small loops are possible. E.g. if 4 | n, then there is a point $T_4$:
        $\underbrace{O \xrightarrow{+T_4} T_4 \xrightarrow{+T_4} [2]T_4 \xrightarrow{+T_4} [3]T_4 \xrightarrow{+T_4} O}_{\text{only 4 steps!}}$
      ◮ h is called the cofactor
      ◮ This property is often harmless
        • I.e. sometimes it’s the opposite of harmless
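A small loop is easy to exhibit on a toy curve. The curve $y^2 = x^3 + x$ over $\mathbb{F}_{11}$ used below is not from the slides; it is a hypothetical example picked for this sketch because it has n = 12 = 4 · 3 points and a point of order 4. The sketch also shows why such a point can be dangerous: multiplying it by a secret scalar k reveals k mod 4.

```python
P_MOD, A, B = 11, 1, 0    # hypothetical toy curve y^2 = x^3 + x over F_11
O = None                  # point at infinity (encoding chosen for this sketch)


def add(p, q):
    """Affine chord-and-tangent addition."""
    if p is O:
        return q
    if q is O:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def mul(k, pt):
    """Naive scalar multiplication [k]pt by repeated addition."""
    acc = O
    for _ in range(k):
        acc = add(acc, pt)
    return acc


def order(pt):
    steps, acc = 1, pt
    while acc is not O:
        acc = add(acc, pt)
        steps += 1
    return steps


points = [O] + [(x, y) for x in range(P_MOD) for y in range(P_MOD)
                if (y * y - (x**3 + A * x + B)) % P_MOD == 0]
n = len(points)
T4 = (10, 3)
print(n, order(T4))                         # 12 4
print([mul(k, T4) for k in range(1, 5)])    # [(10, 3), (0, 0), (10, 8), None]

# [k]T4 only depends on k mod 4, so it leaks two bits of a secret k.
print(all(mul(k, T4) == mul(k % 4, T4) for k in range(40)))  # True
```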

  34. Double-and-add

  35. Double-and-add algorithm
      (Here n is the number of bits of k, and k_i is bit i of k.)

      function DoubleAndAdd(k, P)              ⊲ Compute [k]P
          R ← O
          for i from n − 1 down to 0 do
              R ← [2]R                         ⊲ Doubling
              if k_i = 1 then
                  R ← R + P                    ⊲ Addition
              else
                  R ← R + O                    ⊲ Addition
              end if
          end for
          return R
      end function
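As a rough illustration of the DoubleAndAdd pseudocode, here is a minimal Python sketch. The group is passed in through an `add` function and a `neutral` element. These parameter names, and the use of plain integers under addition as a stand-in group (where [k]P is simply k · P), are assumptions made for this example and are not part of the slides.

```python
def double_and_add(k, P, add, neutral, nbits):
    """Compute [k]P, scanning the bits of k from most to least significant."""
    R = neutral
    for i in range(nbits - 1, -1, -1):
        R = add(R, R)                # doubling
        if (k >> i) & 1:
            R = add(R, P)            # addition
        else:
            R = add(R, neutral)      # dummy addition, mirroring the slide
    return R


def int_add(a, b):
    """Stand-in group law: integers under ordinary addition."""
    return a + b


print(double_and_add(45678, 7, int_add, 0, 16))   # 319746
```

The dummy addition keeps the sequence of group operations independent of the bits of k. Python itself gives no timing guarantees, so this only illustrates the control flow, not a side-channel-safe implementation.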

  36. Fixed-window double-and-add
      function FixedWindow(k, P)               ⊲ Compute [k]P
          k′ ← Windows_w(k)
          Precompute([2]P, ..., [2^w − 1]P)
          R ← O
          for i from n/w − 1 down to 0 do
              for j from 0 to w − 1 do
                  R ← [2]R                     ⊲ w doublings
              end for
              if k′_i ≠ 0 then
                  R ← R + [k′_i]P              ⊲ Addition
              else
                  R ← R + O                    ⊲ Addition
              end if
          end for
          return R
      end function
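The fixed-window variant can be sketched the same way. As before, the `add` and `neutral` parameters and the integer stand-in group are assumptions of this example; the table here also stores the neutral element at index 0, so a zero window needs no separate branch.

```python
def fixed_window(k, P, add, neutral, nbits, w):
    """Compute [k]P by processing k in windows of w bits."""
    # table[d] = [d]P for d = 0 .. 2^w - 1
    table = [neutral]
    for _ in range(2**w - 1):
        table.append(add(table[-1], P))

    nwin = -(-nbits // w)            # ceil(nbits / w) windows
    R = neutral
    for i in range(nwin - 1, -1, -1):
        for _ in range(w):
            R = add(R, R)            # w doublings
        digit = (k >> (w * i)) & (2**w - 1)
        R = add(R, table[digit])     # digit 0 adds the neutral element
    return R


def int_add(a, b):
    """Stand-in group law: integers under ordinary addition."""
    return a + b


# k = 1011 0010 0110 1110 in binary, the scalar used on the later slides.
print(fixed_window(0b1011001001101110, 7, int_add, 0, 16, 4))   # 319746
```

Compared with plain double-and-add, this performs one addition per w doublings at the cost of a table with 2^w entries.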

  37. Signed double-and-add
      function SignedFixedWindow(k, P)         ⊲ Compute [k]P
          k′ ← RecodeSigned(Windows_w(k))
          Precompute([2]P, ..., [2^(w−1)]P)
          R ← O
          for i from n/w − 1 down to 0 do
              for j from 0 to w − 1 do
                  R ← [2]R                     ⊲ w doublings
              end for
              if k′_i > 0 then
                  R ← R + [k′_i]P              ⊲ Addition
              else if k′_i < 0 then
                  R ← R − [−k′_i]P             ⊲ Addition
              else
                  R ← R + O                    ⊲ Addition
              end if
          end for
          return R
      end function

  38. Implemented signed double-and-add
      function ScalarMultiplication(k, P)      ⊲ Compute [k]P
          T ← (O, P, ..., [16]P)               ⊲ Precompute ([2]P, ..., [16]P)
          k′ ← RecodeSigned(Windows_5(k))
          R ← O
          for i from 50 down to 0 do
              for j from 0 to 4 do
                  R ← [2]R                     ⊲ 5 doublings
              end for
              if k′_i < 0 then
                  R ← R − T_{−k′_i}            ⊲ Addition
              else
                  R ← R + T_{k′_i}             ⊲ Addition
              end if
          end for
          return R                             ⊲ R = (X_R : Y_R : Z_R)
      end function

  39. Signed windows
      window:    k′_3    k′_2    k′_1    k′_0
      k (bits):  1011    0010    0110    1110

  40. Signed window recoding
      window:        k″_4    k″_3    k″_2    k″_1    k″_0
      k (bits):              1011    0010    0110    1110
      k″ (signed):   1       −101    010     111     −010
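The recoding on this slide can be reproduced with a short Python sketch. It assumes the common rule that a window value larger than 2^(w−1) is replaced by that value minus 2^w, with a carry into the next window; this rule gives exactly the digits shown above, but the slides do not spell it out. The function names, the `ndigits` parameter, and the integer stand-in group are likewise assumptions of this example.

```python
def recode_signed(k, w, ndigits):
    """Radix-2^w digits of k in [-(2^(w-1) - 1), 2^(w-1)], least significant first."""
    digits, carry = [], 0
    for i in range(ndigits):
        d = ((k >> (w * i)) & (2**w - 1)) + carry
        carry = 0
        if d > 2**(w - 1):
            d -= 2**w                # borrow 2^w from the next window
            carry = 1
        digits.append(d)
    assert carry == 0 and k >> (w * ndigits) == 0, "k does not fit in ndigits"
    return digits


def signed_fixed_window(k, P, add, neg, neutral, w, ndigits):
    """Compute [k]P from signed digits; the table only goes up to [2^(w-1)]P."""
    table = [neutral]
    for _ in range(2**(w - 1)):
        table.append(add(table[-1], P))      # table[d] = [d]P
    R = neutral
    for d in reversed(recode_signed(k, w, ndigits)):
        for _ in range(w):
            R = add(R, R)                    # w doublings
        T = table[abs(d)]
        R = add(R, neg(T) if d < 0 else T)   # subtract for negative digits
    return R


def int_add(a, b):
    return a + b


def int_neg(a):
    return -a


k = 0b1011001001101110
print(recode_signed(k, 4, 5))                                  # [-2, 7, 2, -5, 1]
print(signed_fixed_window(k, 7, int_add, int_neg, 0, 4, 5))    # 319746
```

Because negating a curve point is cheap, signed digits halve the table size: with w = 5 the table ends at [16]P, which matches the table T in the implemented algorithm above.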

  41. Sandy Bridge details

  42. [Figure: layout of an IEEE 754 double-precision float: sign (bit 63), exponent (bits 62 to 52), mantissa (bits 51 to 0).]

  43. Depiction of top(f)
      [Figure: bit patterns for computing top(f_i). A constant c_i whose two leading mantissa bits are set is first added to f_i, giving z′, and then subtracted again. The low bits of f_i are rounded away in z′, so the result keeps only the high bits of f_i followed by zeros.]
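The add-then-subtract trick depicted here can be tried directly in Python, whose `float` is an IEEE 754 double on common platforms. The sketch below assumes the constant has the form 3 · 2^(51+b), which matches the two leading 1 bits of c_i in the figure, and assumes round-to-nearest arithmetic; the function name and the parameter b (the position of the lowest bit to keep) are chosen for this example.

```python
def top(f, b):
    """Round the double f to a multiple of 2**b.

    Assumes round-to-nearest doubles and |f| < 2**(51 + b). The constant c
    lies in [2**(52 + b), 2**(53 + b)), where consecutive doubles are exactly
    2**b apart, so f + c cannot hold any bits of f below 2**b.
    """
    c = 3.0 * 2.0**(51 + b)
    return (f + c) - c


f = 12345678.0
hi = top(f, 22)   # the part that is carried into the next limb
lo = f - hi       # the part that stays behind (may be negative)
print(hi, lo)     # 12582912.0 -237234.0
```

Here `hi` equals 3 · 2^22, the multiple of 2^22 nearest to f. Because rounding is to nearest, the remainder `lo` is signed and at most 2^21 in absolute value.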

  44. Sandy Bridge: field element representation
      ◮ Use double-precision floating-point numbers
      ◮ Allows 4× vectorized operations using SIMD instructions
      ◮ Radix-$2^{21.25}$ redundant representation
      ◮ Use 12 limbs to represent 255-bit numbers
        • I.e. $f = f_0 + f_1 + \dots + f_{11}$
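To see what a radix-$2^{21.25}$ representation can look like, the following Python sketch splits a 255-bit integer into 12 doubles. It assumes the usual convention for fractional radices, namely that limb i is a multiple of $2^{\lceil 21.25 i \rceil}$, so that each limb already carries its own weight and the value is just the sum of the limbs. That convention and the helper names are assumptions of this example; the slides only state the radix and the number of limbs.

```python
# Limb i starts at bit ceil(21.25 * i); 12 limbs cover exactly 255 bits.
EXP = [-(-85 * i // 4) for i in range(13)]


def to_limbs(x):
    """Split 0 <= x < 2^255 into 12 doubles with x = f_0 + f_1 + ... + f_11."""
    limbs = []
    for i in range(12):
        width = EXP[i + 1] - EXP[i]              # 21 or 22 bits
        chunk = (x >> EXP[i]) & ((1 << width) - 1)
        # Exact conversion: at most 22 significant bits fit easily in a double.
        limbs.append(float(chunk << EXP[i]))
    return limbs


def from_limbs(limbs):
    """Recombine the limbs; int() is exact for integer-valued doubles."""
    return sum(int(limb) for limb in limbs)


x = 2**255 - 20
f = to_limbs(x)
print(EXP[:4])               # [0, 22, 43, 64]
print(from_limbs(f) == x)    # True
```

This sketch produces the canonical limbs only. A redundant representation additionally lets limbs grow beyond 22 bits between carries, which the 53-bit mantissa of a double leaves room for.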

  49. Sandy Bridge: field element representation
      ◮ Carry
        • top(f_i): force loss of precision
        • Then, move “high” bits to next limb
      ◮ Addition and subtraction
        • $(f + g)_i = f_i + g_i$
        • $(f - g)_i = f_i - g_i$
      ◮ Multiplication
        • $(f \cdot g)_k = \sum_{i+j=k} f_i g_j + \sum_{i+j=k+12} \left(2^{-255} \cdot 19\right) f_i g_j$
        • Optimized using Karatsuba’s multiplication
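The multiplication rule can be checked with exact integer arithmetic. The sketch below assumes the field is the integers modulo $2^{255} - 19$, which is what the reduction factor $2^{-255} \cdot 19$ suggests, and that limb i is a multiple of $2^{\lceil 21.25 i \rceil}$. It uses Python integers instead of doubles and plain schoolbook multiplication instead of Karatsuba, so it illustrates the formula only, not the optimized implementation.

```python
P = 2**255 - 19                               # assumed field modulus
EXP = [-(-85 * i // 4) for i in range(13)]    # limb i starts at bit ceil(21.25 * i)


def to_limbs(x):
    """Split 0 <= x < 2^255 into 12 weighted integer limbs that sum to x."""
    return [((x >> EXP[i]) & ((1 << (EXP[i + 1] - EXP[i])) - 1)) << EXP[i]
            for i in range(12)]


def mul(f, g):
    """Schoolbook limb product with the wrap-around terms folded back in."""
    h = [0] * 12
    for i in range(12):
        for j in range(12):
            prod = f[i] * g[j]
            if i + j < 12:
                h[i + j] += prod
            else:
                # 2^255 = 19 (mod P): scale by 2^-255 * 19. The shift is exact
                # because prod is a multiple of 2^255 whenever i + j >= 12.
                h[i + j - 12] += (prod >> 255) * 19
    return h


x, y = P - 1, P - 2
h = mul(to_limbs(x), to_limbs(y))
print(sum(h) % P == (x * y) % P)   # True
```

The limbs of h are larger than canonical limbs, which is where the carry step from the first bullet comes in.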

  52. Addition formulas
      ◮ Use the Renes-Costello-Batina formulas
      ◮ Use graphs to rearrange them into vectorized operations
      ◮ Implement them using the field arithmetic functions
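For orientation, here is a minimal Python sketch of the complete projective addition law for short Weierstrass curves in its general, unoptimized form, applied to the toy curve $y^2 = x^3 - 3x + 1$ over $\mathbb{F}_{11}$ from the earlier slides. It is not the operation schedule used in the talk's implementation; the function names are chosen for this example. The law is complete on this curve because the group has odd order (17), so there are no points of order 2.

```python
P_MOD, A, B = 11, -3, 1     # toy curve from the slides
B3 = 3 * B


def complete_add(p, q):
    """Add projective points (X : Y : Z) with one formula and no special cases."""
    X1, Y1, Z1 = p
    X2, Y2, Z2 = q
    xx, yy, zz = X1 * X2, Y1 * Y2, Z1 * Z2
    xy = X1 * Y2 + X2 * Y1
    xz = X1 * Z2 + X2 * Z1
    yz = Y1 * Z2 + Y2 * Z1
    u = yy - A * xz - B3 * zz
    v = A * xx + B3 * xz - A * A * zz
    s = yy + A * xz + B3 * zz
    t = 3 * xx + A * zz
    X3 = (xy * u - yz * v) % P_MOD
    Y3 = (t * v + s * u) % P_MOD
    Z3 = (yz * s + xy * t) % P_MOD
    return (X3, Y3, Z3)


def to_affine(p):
    """Convert to affine (x, y); returns None for the point at infinity (Z = 0)."""
    X, Y, Z = p
    if Z % P_MOD == 0:
        return None
    zinv = pow(Z % P_MOD, -1, P_MOD)
    return (X * zinv % P_MOD, Y * zinv % P_MOD)


INF = (0, 1, 0)             # the point at infinity O
P1, Q1 = (0, 1, 1), (4, 3, 1)

print(to_affine(complete_add(P1, Q1)))           # (10, 5)
print(to_affine(complete_add(P1, P1)))           # doubling uses the same formula
print(to_affine(complete_add(P1, INF)))          # (0, 1)
print(to_affine(complete_add(P1, (0, 10, 1))))   # None, since P + (-P) = O
```

Unlike affine chord-and-tangent addition, the same straight-line code handles doubling, the neutral element, and inverse points, which is what makes such formulas attractive for constant-time implementations.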

  53. Point doubling dbl_generic
      [Figure: operation graph with numbered nodes, inputs x, y, z and outputs x3, y3, z3. Legend: add, subtract, triple, multiply by small constant, multiply, square.]

  54. Point doubling dbl_4x (3M + 4c)
      [Figure: operation graph with numbered nodes, inputs x, y, z and outputs x3, y3, z3. Legend: add, subtract, triple, extra carry operation, multiply by small constant, multiply, square. Small constants shown in the graph: ⟦−3⟧, ⟦8⟧, ⟦−6⟧, ⟦2b⟧, ⟦−b/2⟧, ⟦3⟧.]

  55. Point addition add_generic
      [Figure: operation graph with numbered nodes, inputs x1, y1, z1, x2, y2, z2 and outputs x3, y3, z3. Legend: add, subtract, triple, multiply by small constant, multiply.]
