A Novel Modular Adder for One Thousand Bits and More Using Fast Carry Chains of Modern FPGAs

  1. A Novel Modular Adder for One Thousand Bits and More Using Fast Carry Chains of Modern FPGAs. Marcin Rogawski, Ekawat Homsirikamol & Kris Gaj, George Mason University, USA

  2. Co-Authors
     • Ekawat Homsirikamol, a.k.a. “Ice”: PhD Student
     • Marcin Rogawski: PhD @ GMU, Summer 2013; currently @ Cadence Design Systems, San Jose, CA

  3. Motivation
     • Adders used in multiple branches of science & engineering
     • Basic building block of more complex arithmetic computations (multiplication, modular reduction, etc.)
     • Need for long-operand adders (≥ 1024 bits) in cryptography (RSA, Diffie-Hellman, Elliptic Curve Cryptography, Pairing-Based Cryptography, post-quantum cryptography)
     • FPGAs contain special dedicated resources (fast carry chains) supporting fast addition, but only for operands in the range of 32-64 bits

  4. Fast Carry Chains of Modern FPGAs
     [Figure: carry-chain structure for (a) Xilinx FPGAs and (b) Altera FPGAs; labels: cin, a_0, b_0, a_1, b_1, LUT, FA, s_0, s_1, cout]
     • Minimize delays
     • Save reconfigurable resources

  5. Parallel Prefix Network (PPN) Adder

  6. Parallel Prefix Network – Major Concept (1)
     Given: generate-propagate signals for each bit position: $(g_{n-1}, p_{n-1}), \ldots, (g_2, p_2), (g_1, p_1), (g_0, p_0)$
     Calculate (in parallel): generate-propagate signals for each block of bits starting at position 0: $(g_{[0,n-1]}, p_{[0,n-1]}), \ldots, (g_{[0,2]}, p_{[0,2]}), (g_{[0,1]}, p_{[0,1]}), (g_{[0,0]}, p_{[0,0]})$

  7. Parallel Prefix Network – Major Concept (2)
     Calculate: projected carry at position $i$: $pc_i = g_{[0,i-1]} + c_0 \, p_{[0,i-1]}$
     Assuming $c_0 = 0$ (no need to cascade adders that are already very long): $pc_i = g_{[0,i-1]}$, where $i = 1, \ldots, n$
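The two concept slides above can be sketched as a small bit-level model. This is only an illustration: the per-bit definitions g_i = a_i AND b_i and p_i = a_i XOR b_i, the function names, and the serial evaluation order are assumptions, since the slides give only the block signals and the projected-carry formula.

```python
from typing import List, Tuple

GP = Tuple[int, int]  # (generate, propagate) of one bit or one block of bits


def bit_gp(a: int, b: int, n: int) -> List[GP]:
    """Per-bit signals, assuming g_i = a_i AND b_i and p_i = a_i XOR b_i."""
    return [((a >> i) & (b >> i) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]


def combine(hi: GP, lo: GP) -> GP:
    """Merge block `lo` with the adjacent, more significant block `hi`."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def block_gp(gp: List[GP]) -> List[GP]:
    """Return (g_[0,i], p_[0,i]) for every i (serial reference, not parallel)."""
    out = [gp[0]]
    for i in range(1, len(gp)):
        out.append(combine(gp[i], out[-1]))
    return out


def projected_carries(a: int, b: int, n: int, c0: int = 0) -> List[int]:
    """Return [pc_1, ..., pc_n] with pc_i = g_[0,i-1] OR (c0 AND p_[0,i-1])."""
    return [g | (c0 & p) for g, p in block_gp(bit_gp(a, b, n))]


def add(a: int, b: int, n: int) -> int:
    """Rebuild the (n+1)-bit sum: s_i = p_i XOR pc_i, with pc_0 = c0 = 0."""
    gp = bit_gp(a, b, n)
    pc = [0] + projected_carries(a, b, n)
    total = sum((gp[i][1] ^ pc[i]) << i for i in range(n))
    return total | (pc[n] << n)
```

A parallel prefix network computes the same `block_gp` values, but arranges the `combine` operations in a tree so that independent merges happen in the same level.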

  8. Kogge-Stone PPN
     • Minimum latency ($\log_2 N$)
     • Large area
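A minimal sketch of the Kogge-Stone pattern, assuming N is the number of (generate, propagate) pairs fed to the network. The function names and the operator count are illustrative additions; the count is used here only as a rough proxy for area.

```python
from typing import List, Tuple

GP = Tuple[int, int]  # (generate, propagate)


def serial_prefix(gp: List[GP]) -> List[GP]:
    """Reference result: (g_[0,i], p_[0,i]) computed one position at a time."""
    out = [gp[0]]
    for g, p in gp[1:]:
        g_lo, p_lo = out[-1]
        out.append((g | (p & g_lo), p & p_lo))
    return out


def kogge_stone(gp: List[GP]) -> Tuple[List[GP], int, int]:
    """Return (prefixes, levels, operators) for a Kogge-Stone network."""
    cur = list(gp)
    levels = ops = 0
    dist = 1
    while dist < len(cur):
        nxt = list(cur)
        # Every position i >= dist merges with the position dist places below it.
        for i in range(dist, len(cur)):
            g_hi, p_hi = cur[i]
            g_lo, p_lo = cur[i - dist]
            nxt[i] = (g_hi | (p_hi & g_lo), p_hi & p_lo)
            ops += 1
        cur = nxt
        levels += 1
        dist *= 2
    return cur, levels, ops
```

For N = 8 this model uses 3 levels and 17 prefix operators; for N = 1024 it uses 10 levels, which is where the "large area" remark comes from.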

  9. Brent-Kung PPN
     • Good trade-off between latency ($2 \log_2 N - 2$) and area
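A minimal sketch of the Brent-Kung pattern, assuming N is a power of two (N ≥ 4 for the latency figure). The names are illustrative. Depth is measured here as the longest chain of dependent prefix operators, which lets the last up-sweep merge and the first down-sweep merge share a level.

```python
from typing import List, Tuple

GP = Tuple[int, int]  # (generate, propagate)


def combine(hi: GP, lo: GP) -> GP:
    """Prefix operator: merge block `lo` with the more significant block `hi`."""
    return (hi[0] | (hi[1] & lo[0]), hi[1] & lo[1])


def serial_prefix(gp: List[GP]) -> List[GP]:
    """Reference result computed one position at a time."""
    out = [gp[0]]
    for item in gp[1:]:
        out.append(combine(item, out[-1]))
    return out


def brent_kung(gp: List[GP]) -> Tuple[List[GP], int, int]:
    """Return (prefixes, depth, operators); len(gp) is assumed a power of two."""
    n = len(gp)
    x = list(gp)
    depth = [0] * n  # longest dependency chain ending at each position
    ops = 0

    def merge(i: int, j: int) -> None:
        nonlocal ops
        x[i] = combine(x[i], x[j])
        depth[i] = max(depth[i], depth[j]) + 1
        ops += 1

    d = 1
    while d < n:  # up-sweep: build a binary tree of block signals
        for i in range(2 * d - 1, n, 2 * d):
            merge(i, i - d)
        d *= 2
    d = n // 4
    while d >= 1:  # down-sweep: fill in the remaining prefixes
        for i in range(3 * d - 1, n, 2 * d):
            merge(i, i - d)
        d //= 2
    return x, max(depth), ops
```

For N = 8 this gives depth 4 and 11 operators, compared with 3 levels and 17 operators for a Kogge-Stone network of the same size.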

  10. Parallel Prefix Network (PPN) Adder in FPGA
      • All logic must be implemented using LUTs!
      • Large PPN required (e.g., n=1024)

  11. Our High-Radix Parallel Prefix Network Adder

  12. GPS: Generate-Propagate-Sum in Xilinx FPGAs

  13. S: Sum unit in Xilinx FPGAs

  14. Our High-Radix Parallel Prefix Network Adder
      • GPS and S units implemented using Fast Carry Chains
      • The size of PPN reduced from n=1024 to N=1024/w
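The high-radix idea can be modeled in software as follows. This is a behavioral sketch under stated assumptions, not the circuit from the slides: the GPS and S figures are not reproduced in this transcript, so the word-level definitions used here (generate = carry-out of the w-bit addition with carry-in 0, propagate = word sum is all ones) and the name `high_radix_add` are assumptions. The w-bit additions stand in for the fast carry chains, and the prefix network only has to handle N = n/w pairs.

```python
def high_radix_add(a: int, b: int, n: int = 1024, w: int = 16) -> int:
    """Add two n-bit operands word by word; returns the (n+1)-bit sum."""
    assert n % w == 0
    N = n // w
    mask = (1 << w) - 1

    # GPS stage (assumed behavior): one w-bit addition per word, carry-in 0.
    sums, gp = [], []
    for i in range(N):
        t = ((a >> (i * w)) & mask) + ((b >> (i * w)) & mask)
        s = t & mask
        sums.append(s)
        gp.append((t >> w, int(s == mask)))  # (word generate, word propagate)

    # PPN stage over N word-level pairs (Kogge-Stone pattern chosen arbitrarily).
    dist = 1
    while dist < N:
        gp = [gp[i] if i < dist else
              (gp[i][0] | (gp[i][1] & gp[i - dist][0]), gp[i][1] & gp[i - dist][1])
              for i in range(N)]
        dist *= 2

    # S stage (assumed behavior): add the projected carry into each word sum.
    result = 0
    for i in range(N):
        cin = gp[i - 1][0] if i > 0 else 0
        result |= ((sums[i] + cin) & mask) << (i * w)
    return result | (gp[N - 1][0] << n)
```

With n = 1024 and w = 16 the prefix network sees 64 inputs instead of 1024, which is the reduction stated on the slide.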

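  15. General Construction for the Modular Adder
      $R = (A + B) \bmod P$:
      • $R = A + B - P$ when $A + B \ge 2^n > P$ (cout#1) or $A + B - P \ge 0$ (cout#2)
      • $R = A + B$ otherwise
      [Figure: datapath with inputs A and B, constant $2^n - P$, carry-outs cout#1 and cout#2, and a 0/1 multiplexer producing R; all buses n bits wide]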
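The selection rule above can be checked with a short behavioral model. It assumes 0 ≤ A, B < P < 2^n and treats each n-bit adder as an integer addition whose bit n is the carry-out; the function name `mod_add` is illustrative.

```python
def mod_add(a: int, b: int, p: int, n: int) -> int:
    """Return (a + b) mod p using two n-bit additions and a final select."""
    mask = (1 << n) - 1

    t1 = a + b                    # first addition
    cout1 = t1 >> n               # set when A + B >= 2^n
    s1 = t1 & mask

    t2 = s1 + ((1 << n) - p)      # adding 2^n - P subtracts P modulo 2^n
    cout2 = t2 >> n               # set when A + B - P >= 0 (and cout1 was 0)
    s2 = t2 & mask

    # Take the reduced value if either carry-out fired, else the plain sum.
    return s2 if (cout1 | cout2) else s1
```

For example, with n = 4, P = 13, A = 9 and B = 8, the first addition overflows (cout#1 = 1) and the second yields 17 − 13 = 4.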

  16. Our Construction for the High-Radix PPN Modular Adder
      [Figure: block diagram operating on w-bit words A_i, B_i (i = 0..N−1) and producing R_i; unit labels: GPS, GPSc, S, IP, two PPNs, 0/1 multiplexers; signal labels: fg, fp, fpc, sg, sp, spc, sel]

  17. GPSc: Generate-Propagate-Sum with carry

  18. Our Construction for the High-Radix PPN Modular Adder
      [Figure: same block diagram as on slide 16]
      Two additions:
      • overlapped in time
      • sharing resources (S units)

  19. Target FPGA Families
      • Xilinx FPGAs: Virtex-5 (65 nm, high-performance); Spartan-6 (45 nm, low-cost)
      • Altera FPGAs: Stratix III (65 nm, high-performance); Cyclone IV (40 nm, low-cost)

  20. Design Flow
      [Figure: design-flow diagram; block labels: Specification, Test Vectors, RTL Design, VHDL Code, Functional Verification, Option Optimization & Parameter Exploration (GMU ATHENa, FPL 2010), FPGA Tools, Post Place & Route Results, Netlist, Timing Verification]
      ATHENa used to simplify parameter exploration (multiple values of generics) and option optimization for both Xilinx and Altera FPGAs

  21. Choosing the Best PPN & Word Size: Adders – Altera Cyclone IV

  22. Choosing the Best PPN & Word Size: Adders – Xilinx Spartan 6

  23. Choosing the Best PPN & Word Size: Modular Adders – Altera Cyclone IV

  24. Choosing the Best PPN & Word Size: Modular Adders – Xilinx Spartan 6

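  25. The Best Choices of PPN Type & Word Size

      | Family      | Adders: PPN | Adders: (w, N) | Modular Adders: PPN | Modular Adders: (w, N) |
      |-------------|-------------|----------------|---------------------|------------------------|
      | Cyclone IV  | KS          | (16, 64)       | KS                  | (16, 64)               |
      | Stratix III | BK          | (16, 64)       | BK                  | (16, 64)               |
      | Spartan 6   | BK          | (32, 32)       | KS                  | (128, 8)               |
      | Virtex 5    | KS          | (16, 64)       | KS                  | (64, 16)               |

      KS: Kogge-Stone Parallel Prefix Network; BK: Brent-Kung Parallel Prefix Network; w – word size; N – size of PPN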

  26. The Best Long-Operand Adders Proposed to Date
      H.D. Nguyen, B. Pasca, T.B. Preußer, FPGA-Specific Arithmetic Optimizations of Short-Latency Adders, FPL 2011, Chania, Greece
      • Adders based on Carry-Select Architecture (rather than PPN Architecture)
      • Three specific architectures proposed
        • AAM: Add-Add-Multiplex
        • CAI: Compare-Add-Increment
        • CCA: Compare-Compare-Add
      • Limited results (only Virtex 5) included in the original paper
      • AAM architecture re-implemented and results collected for different FPGAs
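For readers unfamiliar with the carry-select idea behind these competing designs, here is a generic textbook-style sketch. It is not the AAM, CAI, or CCA architecture from the cited paper, whose details are not given on this slide; the block size `w` and the function name are assumptions. Each block is added twice (carry-in 0 and carry-in 1), and a multiplexer picks one result once the real carry is known.

```python
def carry_select_add(a: int, b: int, n: int, w: int) -> int:
    """Generic carry-select addition of two n-bit operands in w-bit blocks."""
    assert n % w == 0
    mask = (1 << w) - 1
    result, carry = 0, 0
    for i in range(n // w):
        ai = (a >> (i * w)) & mask
        bi = (b >> (i * w)) & mask
        t0 = ai + bi          # add assuming carry-in = 0
        t1 = ai + bi + 1      # add assuming carry-in = 1
        t = t1 if carry else t0   # multiplex on the actual incoming carry
        result |= (t & mask) << (i * w)
        carry = t >> w
    return result | (carry << n)
```

In hardware both block additions run in parallel for all blocks, so only the chain of multiplexer selects remains sequential; the loop above models the result, not the timing.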

  27. Comparison with Other Adders – Virtex 5

  28. Comparison with Other Adders – Virtex 5

  29. Comparison with Other Adders – Spartan 6

  30. Comparison with Other Adders – Spartan 6

  31. Comparison with Other Adders – Cyclone IV

  32. Comparison with Other Adders – Stratix III

  33. Comparison Between Modular Adders – Xilinx Virtex 5

  34. Comparison Between Modular Adders – Xilinx Virtex 5

  35. Comparison Between Modular Adders – Xilinx Spartan 6

  36. Comparison Between Modular Adders – Xilinx Spartan 6

  37. Comparison Between Modular Adders – Altera Stratix III

  38. Comparison Between Modular Adders – Altera Stratix III

  39. Comparison Between Modular Adders – Altera Cyclone IV

  40. Comparison Between Modular Adders – Cyclone IV

  41. Modular Adder/Subtractor
      [Figure: modular adder/subtractor datapath diagrams; labels: A, B, SUB, P, $2^n - P$, cout#1, cout#2, 0/1 multiplexers, R; all buses n bits wide]
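One plausible behavioral reading of the adder/subtractor extension is sketched below. The figure itself is not recoverable from this transcript, so the exact multiplexer arrangement is an assumption: here SUB selects the complemented B with a carry-in of 1 for the first addition, and selects P instead of 2^n − P as the correction constant. The name `mod_add_sub` is hypothetical, and 0 ≤ A, B < P < 2^n is assumed.

```python
def mod_add_sub(a: int, b: int, p: int, n: int, sub: int) -> int:
    """Return (a + b) mod p when sub == 0, (a - b) mod p when sub == 1."""
    mask = (1 << n) - 1

    # First addition: A + B, or A + NOT(B) + 1 (two's-complement subtraction).
    operand = (~b & mask) if sub else b
    t1 = a + operand + sub
    cout1 = t1 >> n
    s1 = t1 & mask

    # Second addition: subtract P after an add, add P back after a subtract.
    correction = p if sub else ((1 << n) - p)
    t2 = s1 + correction
    cout2 = t2 >> n
    s2 = t2 & mask

    if sub:
        use_corrected = (cout1 == 0)        # no carry-out means A < B (borrow)
    else:
        use_corrected = bool(cout1 | cout2)
    return s2 if use_corrected else s1
```

For example, with n = 4 and P = 13, `mod_add_sub(3, 9, 13, 4, 1)` computes 3 − 9 + 13 = 7.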

  42. Overhead of Modular Adder/Subtractor – Altera Cyclone IV

  43. Overhead of Modular Adder/Subtractor – Altera Stratix III

  44. Overhead of Modular Adder/Subtractor – Xilinx Spartan 6

  45. Overhead of Modular Adder/Subtractor – Xilinx Virtex 5

  46. Proposed New Dedicated Resources of Modern FPGAs
      • Dedicated (hardwired) PPNs (Kogge-Stone and/or Brent-Kung)
      • Standard sizes (e.g., 32 and/or 64)
      • Support for fast addition and modular addition for large operand sizes (as described in this paper)
      • Support for fast addition and modular addition of medium operand sizes (up to 64), using classical PPN adders
      • Pipelined registers that can be activated or bypassed

  47. Conclusions
      • A new family of High-Radix Parallel Prefix Network Adders using fast carry chains of modern FPGAs
      • New family outperforming the best previously known FPGA-specific adders and modular adders for Xilinx FPGAs
      • Very small performance penalty for an extension to adders/subtractors
      • A proposal for embedding medium-size hardwired PPN structures in the new generations of FPGAs

  48. Future Work
      • Possible optimizations for Altera FPGAs
      • Better (preferably analytical) method of choosing an optimum word size for
        • Other FPGA families
        • Other operand sizes
      • Optimal method of pipelining for adders and modular adders
      • Extended and more detailed proposal of new FPGA resources supporting fast addition

  49. Thank you! Suggestions? Questions?
      ATHENa: http://cryptography.gmu.edu/athena
      CERG: http://cryptography.gmu.edu
