New speed records for point multiplication


  1. New speed records for point multiplication. D. J. Bernstein. Thanks to: University of Illinois at Chicago; NSF CCR–9983950; Alfred P. Sloan Foundation.

  2. 640838 Pentium M cycles to compute a 32-byte secret shared by Dan and Tanja, given Dan's 32-byte secret key and Tanja's 32-byte public key. All known attacks: $2^{128}$ cycles. This is the new speed record for high-security Diffie-Hellman. Encrypt and authenticate messages using hash of shared secret as key. Diffie-Hellman is the bottleneck if total message length is short.

  3. 640838 Pentium M (695) cycles to compute the $x$-coordinate of the $n$th multiple of $(x, y)$ on Curve25519, given $x \in \{0, 1, \ldots, 2^{256} - 1\}$ and $n \in 2^{254} + 8\{0, 1, \ldots, 2^{251} - 1\}$. Curve25519 is the elliptic curve $y^2 = x^3 + 486662x^2 + x$ mod the prime $2^{255} - 19$. 624786 Athlon (622) cycles; 832457 Pentium III (686) cycles; 957904 Pentium 4 (f12) cycles. I anticipate similar cycle counts for UltraSPARC, PowerPC, etc.
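The secret-key set $2^{254} + 8\{0, 1, \ldots, 2^{251} - 1\}$ can be reached from any 32 bytes by fixing five bits. A minimal Python sketch follows. The helper names `clamp` and `in_secret_key_set` are hypothetical, and the little-endian byte order is an assumption, since the slides do not state an encoding.

```python
def clamp(secret: bytes) -> int:
    """Map 32 arbitrary bytes to an integer n in 2^254 + 8*{0, ..., 2^251 - 1}.

    Assumption: bytes are read little-endian (not stated on the slides).
    """
    if len(secret) != 32:
        raise ValueError("expected exactly 32 bytes")
    n = int.from_bytes(secret, "little")
    n &= ~7                # clear bits 0..2, so n is a multiple of 8
    n &= (1 << 255) - 1    # clear bit 255
    n |= 1 << 254          # set bit 254
    return n


def in_secret_key_set(n: int) -> bool:
    """Check membership in 2^254 + 8*{0, ..., 2^251 - 1}."""
    return (1 << 254) <= n < (1 << 255) and n % 8 == 0


smallest = clamp(bytes(32))       # all-zero input
largest = clamp(b"\xff" * 32)     # all-ones input
print(in_secret_key_set(smallest), in_secret_key_set(largest))
```

Every output has the same bit length, which is one reason a fixed number of main-loop iterations is possible.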

  5. Immune to timing attacks, including cache-timing attacks, including hyperthreading attacks. No data-dependent branches; no data-dependent indexing. Software is in public domain. 16 kilobytes when compiled. cr.yp.to/ecdh.html No known patent problems. For comparison, Brown et al.: much smaller prime, $2^{192} - 2^{64} - 1$; 780000 PII cycles; given; no timing-attack protection.

  7. Where are the cycles going? Focus today on Pentium M. Fastest arithmetic on Pentium M uses floating-point operations: fp adds, fp subs, fp mults. Each Pentium M cycle does $\le 1$ fp op. Point multiplication: 640838 cycles. 589825 fp ops; 0.92 per cycle. Understand cycle counts fairly well by simply counting fp ops.

  9. Avoiding all time variability to stop timing attacks:
     1. For $b \in \{0, 1\}$, compute $x[b]$ as $b\,x[1] + (1 - b)\,x[0]$ or similar. Avoids data-dependent indexing. Costs 36210 fp ops (6%).
     2. Compute final reciprocal by Fermat, not extended Euclid. Avoids data-dependent branching.
     3. Don't branch for remainders. Allow non-least remainders. No cost: this saves time!
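The selection trick in step 1 can be sketched in Python. This only illustrates the arithmetic shape: Python integers are not constant-time, and the function names `select` and `select_masked` are hypothetical. The masked variant is a common integer analogue added for comparison, not something the slides describe.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit words for the masked variant


def select(b: int, x0: int, x1: int) -> int:
    """Return x1 if b == 1, else x0, using arithmetic only.

    Both inputs are always read, so there is no branch on b and no
    memory index derived from b.
    """
    return b * x1 + (1 - b) * x0


def select_masked(b: int, x0: int, x1: int) -> int:
    """Same selection on 64-bit words with an all-ones / all-zeros mask."""
    mask = (-b) & MASK64          # b = 1 -> 0xFFFF...FFFF, b = 0 -> 0
    return x0 ^ (mask & (x0 ^ x1))


for bit in (0, 1):
    print(bit, select(bit, 5, 9), select_masked(bit, 5, 9))
```

In compiled code the same pattern replaces a table lookup `x[b]`, whose address would otherwise depend on a secret bit.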

  11. Main loop: 545700 fp ops (92.5%). 2140 times 255 iterations.
      Reciprocal: 43821 fp ops (7.4%). $41148 = 254 \cdot 162$ for 254 squarings; $2673 = 11 \cdot 243$ for 11 more mults.
      Additional work: 304 fp ops.
      Inside one main-loop iteration: $80 = 8 \cdot 10$ for 8 adds/subs; 55 for mult by 121665; $648 = 4 \cdot 162$ for 4 squarings; $1215 = 5 \cdot 243$ for 5 more mults; 142 for $b\,x[1] + (1 - b)\,x[0]$ etc.
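The operation counts can be re-derived with a short Python sketch. It computes a reciprocal by Fermat's little theorem, $z^{p-2} \bmod p$, using one well-known fixed addition chain that happens to need 254 squarings and 11 multiplications. Whether the software uses this exact chain is an assumption, as the slides give only the counts. The per-operation costs (162 and 243 fp ops) are taken from the slides, and the function name `invert` is hypothetical.

```python
P = 2**255 - 19   # the prime from the slides
SQ_COST = 162     # fp ops per squaring (from the slides)
MUL_COST = 243    # fp ops per general multiplication (from the slides)


def invert(z, counts):
    """Return z^(P-2) mod P with a fixed sequence of squarings and mults."""
    def sq(a, times=1):
        for _ in range(times):
            a = a * a % P
            counts["sq"] += 1
        return a

    def mul(a, b):
        counts["mul"] += 1
        return a * b % P

    z2 = sq(z)                        # z^2
    z9 = mul(sq(z2, 2), z)            # z^9
    z11 = mul(z9, z2)                 # z^11
    a5 = mul(sq(z11), z9)             # z^(2^5 - 1)
    a10 = mul(sq(a5, 5), a5)          # z^(2^10 - 1)
    a20 = mul(sq(a10, 10), a10)       # z^(2^20 - 1)
    a40 = mul(sq(a20, 20), a20)       # z^(2^40 - 1)
    a50 = mul(sq(a40, 10), a10)       # z^(2^50 - 1)
    a100 = mul(sq(a50, 50), a50)      # z^(2^100 - 1)
    a200 = mul(sq(a100, 100), a100)   # z^(2^200 - 1)
    a250 = mul(sq(a200, 50), a50)     # z^(2^250 - 1)
    return mul(sq(a250, 5), z11)      # z^(2^255 - 21) = z^(P - 2)


counts = {"sq": 0, "mul": 0}
z = 123456789
z_inv = invert(z, counts)

reciprocal_ops = counts["sq"] * SQ_COST + counts["mul"] * MUL_COST
per_iteration = 8 * 10 + 55 + 4 * SQ_COST + 5 * MUL_COST + 142
main_loop_ops = per_iteration * 255
total_ops = main_loop_ops + reciprocal_ops + 304

print(counts)                                   # {'sq': 254, 'mul': 11}
print(reciprocal_ops, per_iteration, main_loop_ops, total_ops)
print(round(total_ops / 640838, 2))             # fp ops per Pentium M cycle
```

The totals come out to 43821, 2140, 545700 and 589825 fp ops, matching the slides, and 589825 fp ops over 640838 cycles gives the quoted 0.92 per cycle. Because the chain is fixed, the sequence of operations does not depend on the value being inverted, unlike extended Euclid.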
