
A Novel Modular Adder for One Thousand Bits and More Using Fast Carry Chains of Modern FPGAs
Marcin Rogawski, Ekawat Homsirikamol & Kris Gaj
George Mason University, USA

Co-Authors
• Ekawat Homsirikamol, a.k.a. “Ice”: PhD Student
• Marcin Rogawski: PhD @ GMU, Summer 2013; currently @ Cadence Design Systems, San Jose, CA

Motivation
• Adders are used in multiple branches of science & engineering
• Basic building block of more complex arithmetic computations (multiplication, modular reduction, etc.)
• Need for long-operand adders (≥ 1024 bits) in cryptography (RSA, Diffie-Hellman, Elliptic Curve Cryptography, Pairing-Based Cryptography, post-quantum cryptography)
• FPGAs contain special dedicated resources (fast carry chains) supporting fast addition, but only for operands in the range of 32-64 bits

Fast Carry Chains of Modern FPGAs
[Figure: carry-chain logic in a) Xilinx FPGAs and b) Altera FPGAs, built from LUTs, full adders (FA), and 0/1 multiplexers; inputs a0, b0, a1, b1, cin; outputs s0, s1, cout]
• Minimize delays
• Save reconfigurable resources

Parallel Prefix Network (PPN) Adder

Parallel Prefix Network – Major Concept (1)
Given: generate-propagate signals for each bit position
$(g_{n-1}, p_{n-1}), \ldots, (g_2, p_2), (g_1, p_1), (g_0, p_0)$
Calculate (in parallel): generate-propagate signals for each block of bits starting at position 0
$(g_{[0,n-1]}, p_{[0,n-1]}), \ldots, (g_{[0,2]}, p_{[0,2]}), (g_{[0,1]}, p_{[0,1]}), (g_{[0,0]}, p_{[0,0]})$

Parallel Prefix Network – Major Concept (2)
Calculate: projected carry at position $i$
$pc_i = g_{[0,i-1]} + c_0 \, p_{[0,i-1]}$
Assuming $c_0 = 0$ (no need to cascade adders that are already very long):
$pc_i = g_{[0,i-1]}$, where $i = 1..n$
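The two concept slides can be illustrated with a minimal Python sketch. It assumes the usual definitions g_i = a_i AND b_i and p_i = a_i XOR b_i and the standard prefix operator for merging adjacent blocks; the slides do not spell these out, and the function names (`gp_bits`, `combine`, `projected_carries`) are hypothetical. The blocks are merged serially here, only to show what a PPN computes, not how it computes it in parallel.

```python
def gp_bits(a, b, n):
    """Per-bit generate/propagate signals; index 0 is the least significant bit."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    return g, p


def combine(hi, lo):
    """Prefix operator: merge a higher block (g, p) with the adjacent lower block."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def projected_carries(a, b, n, c0=0):
    """Return [pc_1, ..., pc_n], where pc_i = g[0,i-1] + c0 * p[0,i-1]."""
    g, p = gp_bits(a, b, n)
    block = (g[0], p[0])          # block signals for bits [0, 0]
    pcs = []
    for i in range(1, n + 1):
        g_blk, p_blk = block      # block signals for bits [0, i-1]
        pcs.append(g_blk | (c0 & p_blk))
        if i < n:
            block = combine((g[i], p[i]), block)
    return pcs


# 1011 + 0110: carries into bit positions 1..4
print(projected_carries(0b1011, 0b0110, 4))
```

With c0 = 0 the propagate term drops out, which matches the simplified formula on the slide.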

Kogge-Stone PPN
• Minimum latency ($\log_2 N$)
• Large area
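As a rough illustration of the Kogge-Stone structure, the following Python sketch applies the prefix operator at distances 1, 2, 4, ... and counts the levels. It is a behavioral model under the standard textbook formulation, not the authors' VHDL; `combine` and `kogge_stone` are hypothetical names.

```python
def combine(hi, lo):
    """Prefix operator: merge a higher block (g, p) with the adjacent lower block."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def kogge_stone(gp):
    """Take (g, p) pairs (index 0 = LSB); return block signals for [0, i] and the level count."""
    n = len(gp)
    cur = list(gp)
    levels = 0
    dist = 1
    while dist < n:
        nxt = list(cur)
        # Every position at or above `dist` merges with the block `dist` places below it.
        for i in range(dist, n):
            nxt[i] = combine(cur[i], cur[i - dist])
        cur = nxt
        dist *= 2
        levels += 1
    return cur, levels
```

For 8 inputs the loop runs 3 times, in line with the log2 N latency quoted on the slide; the large area shows up as nearly N operators per level.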

Brent-Kung PPN
• Good trade-off between latency ($2 \log_2 N - 2$) and area
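A comparable Python sketch of the Brent-Kung structure is given below, assuming the textbook up-sweep/down-sweep formulation and an input length that is a power of two. It counts prefix operators as a rough proxy for area rather than modeling latency; `combine` and `brent_kung` are hypothetical names, not taken from the presentation.

```python
def combine(hi, lo):
    """Prefix operator: merge a higher block (g, p) with the adjacent lower block."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def brent_kung(gp):
    """Take (g, p) pairs (index 0 = LSB, length a power of two).

    Returns the block signals for [0, i] and the number of prefix operators used.
    """
    n = len(gp)
    cur = list(gp)
    ops = 0
    # Up-sweep: build blocks of size 2, 4, 8, ... ending at positions 2d-1, 4d-1, ...
    d = 1
    while d < n:
        for i in range(2 * d - 1, n, 2 * d):
            cur[i] = combine(cur[i], cur[i - d])
            ops += 1
        d *= 2
    # Down-sweep: fill in the remaining positions from the completed blocks.
    d = n // 4
    while d >= 1:
        for i in range(3 * d - 1, n, 2 * d):
            cur[i] = combine(cur[i], cur[i - d])
            ops += 1
        d //= 2
    return cur, ops
```

For 8 inputs this uses 11 operators, against 17 for a Kogge-Stone network of the same size, which is the area side of the trade-off.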

Parallel Prefix Network (PPN) Adder in FPGA
• All logic must be implemented using LUTs!
• Large PPN required (e.g., n=1024)

Our High-Radix Parallel Prefix Network Adder

GPS: Generate-Propagate-Sum in Xilinx FPGAs

S: Sum unit in Xilinx FPGAs

Our High-Radix Parallel Prefix Network Adder
• GPS and S units implemented using Fast Carry Chains
• The size of PPN reduced from n=1024 to N=1024/w
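The high-radix idea can be sketched behaviorally in Python. The sketch assumes that each GPS unit adds one pair of w-bit words and reports a word-level generate (carry-out), a word-level propagate (sum of all ones), and the partial sum, and that each S unit adds the incoming word carry to that partial sum. These roles are inferred from the unit names on the slides; the function name and the serial carry scan standing in for the PPN are illustrative assumptions.

```python
def high_radix_add(a, b, n=1024, w=16):
    """Add two n-bit integers word by word; return (sum mod 2**n, carry-out)."""
    num_words = n // w                  # N on the slides
    mask = (1 << w) - 1

    # GPS stage: one short addition per word (a fast carry chain in hardware).
    g, p, s = [], [], []
    for i in range(num_words):
        t = ((a >> (i * w)) & mask) + ((b >> (i * w)) & mask)
        g.append(t >> w)                          # word generates a carry
        p.append(1 if (t & mask) == mask else 0)  # word would propagate a carry
        s.append(t & mask)

    # PPN stage: word-level carries (a serial scan stands in for the network).
    carry = [0] * (num_words + 1)
    for i in range(num_words):
        carry[i + 1] = g[i] | (p[i] & carry[i])

    # S stage: add the incoming carry to each partial sum.
    result = 0
    for i in range(num_words):
        result |= ((s[i] + carry[i]) & mask) << (i * w)
    return result, carry[num_words]


x = (1 << 1024) - 1
print(high_radix_add(x, 1))   # all-ones plus one wraps to zero with carry-out 1
```

The prefix network now works on N = n/w word-level signal pairs instead of n bit-level pairs, which is the size reduction stated on the slide.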

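General Construction for the Modular Adder
$R = (A + B) \bmod P$, with $n$-bit $A$, $B$, and $P$:
$$R = \begin{cases} A + B - P & \text{when } A + B \ge 2^n > P \ (\text{cout\#1}) \text{ or } A + B - P \ge 0 \ (\text{cout\#2}) \\ A + B & \text{otherwise} \end{cases}$$
[Block diagram: two cascaded n-bit adders with inputs A, B, and the constant $2^n - P$, carry-outs cout#1 and cout#2, and a 0/1 multiplexer selecting the n-bit result R]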
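The selection rule on this slide can be checked with a short Python model. It assumes 0 ≤ A, B < P < 2^n and implements the subtraction of P as an addition of the constant 2^n − P, so that both conditions appear as carry-outs; the function name `mod_add` is hypothetical.

```python
def mod_add(a, b, p, n):
    """Return (a + b) mod p for 0 <= a, b < p < 2**n using two n-bit additions."""
    mask = (1 << n) - 1
    t1 = a + b
    cout1 = t1 >> n                       # set when A + B >= 2**n
    t2 = (t1 & mask) + ((1 << n) - p)     # A + B - P, computed as A + B + (2**n - P)
    cout2 = t2 >> n                       # set when A + B - P >= 0
    return (t2 & mask) if (cout1 | cout2) else (t1 & mask)


print(mod_add(20, 15, 23, 5))   # (20 + 15) mod 23
```

Either carry-out being set means the reduced value A + B − P is the correct result; otherwise A + B is already below P.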

Our Construction for the High-Radix PPN Modular Adder
[Block diagram: N word slices of width w with inputs A_i, B_i and outputs R_i; units GPS, GPSc, IP, PPN, and S; signals fg, fp, fpc (first addition), sg, sp, spc (second addition), and sel driving 0/1 multiplexers]

GPSc: Generate-Propagate-Sum with carry

Our Construction for the High-Radix PPN Modular Adder
[Block diagram: same construction as on the earlier slide]
Two additions:
• overlapped in time
• sharing resources (S units)

Target FPGA Families

Xilinx FPGAs
Technology | Low-cost  | High-performance
65 nm      |           | Virtex-5
45 nm      | Spartan-6 |

Altera FPGAs
Technology | Low-cost   | High-performance
65 nm      |            | Stratix III
40 nm      | Cyclone IV |

Design Flow
[Flow diagram with the stages: Specification; Test Vectors; RTL Design; VHDL Code; Functional Verification; Option Optimization & Parameter Exploration with GMU ATHENa (FPL 2010); FPGA Tools; Post Place & Route Results; Netlist; Timing Verification]
ATHENa is used to simplify parameter exploration (multiple values of generics) and option optimization for both Xilinx and Altera FPGAs.

Choosing the Best PPN & Word Size: Adders – Altera Cyclone IV

Choosing the Best PPN & Word Size: Adders – Xilinx Spartan 6

Choosing the Best PPN & Word Size: Modular Adders – Altera Cyclone IV

Choosing the Best PPN & Word Size: Modular Adders – Xilinx Spartan 6

The Best Choices of PPN Type & Word Size

Family      | Adders: PPN (w, N) | Modular Adders: PPN (w, N)
Cyclone IV  | KS (16, 64)        | KS (16, 64)
Stratix III | BK (16, 64)        | BK (16, 64)
Spartan 6   | BK (32, 32)        | KS (128, 8)
Virtex 5    | KS (16, 64)        | KS (64, 16)

KS: Kogge-Stone Parallel Prefix Network
BK: Brent-Kung Parallel Prefix Network
w – word size
N – size of PPN

The Best Long-Operand Adders Proposed to Date
H.D. Nguyen, B. Pasca, T.B. Preußer, "FPGA-Specific Arithmetic Optimizations of Short-Latency Adders," FPL 2011, Chania, Greece
• Adders based on Carry-Select Architecture (rather than PPN Architecture)
• Three specific architectures proposed
  • AAM: Add-Add-Multiplex
  • CAI: Compare-Add-Increment
  • CCA: Compare-Compare-Add
• Limited results (only Virtex 5) included in the original paper
• AAM architecture re-implemented and results collected for different FPGAs

Comparison with Other Adders – Virtex 5

Comparison with Other Adders – Virtex 5

Comparison with Other Adders – Spartan 6

Comparison with Other Adders – Spartan 6

Comparison with Other Adders – Cyclone IV

Comparison with Other Adders – Stratix III

Comparison Between Modular Adders – Xilinx Virtex 5

Comparison Between Modular Adders – Xilinx Virtex 5

Comparison Between Modular Adders – Xilinx Spartan 6

Comparison Between Modular Adders – Xilinx Spartan 6

Comparison Between Modular Adders – Altera Stratix III

Comparison Between Modular Adders – Altera Stratix III

Comparison Between Modular Adders – Altera Cyclone IV

Comparison Between Modular Adders – Altera Cyclone IV

Modular Adder/Subtractor
[Block diagrams: the modular adder extended with a SUB control input; labels include the n-bit inputs A, B, and P, the constant $2^n - P$, carry-outs cout#1 and cout#2, 0/1 multiplexers, and the n-bit result R]
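The slide gives only a block diagram, so the following Python model is a behavioral guess at what a modular adder/subtractor with a SUB control computes, not a transcription of the authors' datapath. It assumes 0 ≤ A, B < P < 2^n, computes A − B in two's complement, and adds P back when a borrow occurs; the function name and the exact use of the carry-outs are assumptions.

```python
def mod_add_sub(a, b, p, n, sub):
    """Return (a - b) mod p if sub is true, else (a + b) mod p; 0 <= a, b < p < 2**n."""
    mask = (1 << n) - 1
    if sub:
        t1 = a + ((~b) & mask) + 1        # A - B in n-bit two's complement
        cout1 = t1 >> n                   # 1 means no borrow, i.e. A >= B
        t2 = (t1 & mask) + p              # correction: A - B + P
        return (t1 & mask) if cout1 else (t2 & mask)
    t1 = a + b
    cout1 = t1 >> n                       # A + B >= 2**n
    t2 = (t1 & mask) + ((1 << n) - p)     # A + B - P
    cout2 = t2 >> n                       # A + B - P >= 0
    return (t2 & mask) if (cout1 | cout2) else (t1 & mask)


print(mod_add_sub(3, 10, 23, 5, sub=True))   # (3 - 10) mod 23
```

In both modes the result is chosen between the output of a first addition and a corrected value from a second one, which is consistent with the small overhead reported for the adder/subtractor extension.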

Overhead of Modular Adder/Subtractor – Altera Cyclone IV

Overhead of Modular Adder/Subtractor – Altera Stratix III

Overhead of Modular Adder/Subtractor – Xilinx Spartan 6

Overhead of Modular Adder/Subtractor – Xilinx Virtex 5

Proposed New Dedicated Resources of Modern FPGAs
• Dedicated (hardwired) PPNs (Kogge-Stone and/or Brent-Kung)
• Standard sizes (e.g., 32 and/or 64)
• Support for fast addition and modular addition of large operand sizes (as described in this paper)
• Support for fast addition and modular addition of medium operand sizes (up to 64), using classical PPN adders
• Pipeline registers that can be activated or bypassed

Conclusions
• A new family of High-Radix Parallel Prefix Network Adders using fast carry chains of modern FPGAs
• New family outperforming the best previously known FPGA-specific adders and modular adders for Xilinx FPGAs
• Very small performance penalty for an extension to adders/subtractors
• A proposal for embedding medium-size hardwired PPN structures in the new generations of FPGAs

Future Work
• Possible optimizations for Altera FPGAs
• Better (preferably analytical) method of choosing an optimum word size for
  • Other FPGA families
  • Other operand sizes
• Optimal method of pipelining for adders and modular adders
• Extended and more detailed proposal of new FPGA resources supporting fast addition

Thank you! Suggestions? Questions?
ATHENa: http://cryptography.gmu.edu/athena
CERG: http://cryptography.gmu.edu
