optimal sparseness in binary adders

Optimal Sparseness in Binary Adders, ARITH 22, Lyon, France, 2015


  1. Optimal Sparseness in Binary Adders ARITH 22 Lyon, France 2015

  2. Outline
     • Parallel Adders
       – Structural features
       – Recurrence algorithms
         • Weinberger
         • Ling
       – Minimum depth structures
         • Kogge-Stone
         • Ladner-Fischer
     • Sparse Adders
       – Sparse adders in literature
     • Energy Optimal Sparseness
       – Limits on sparseness
       – Effect of increased sparseness on adder energy
     • Implementation results
     • Conclusion

  3. Parallel Adder Structure

  4. Structural Features of Parallel Adders
     • Logic Depth (LD): the maximum number of stages from input to output.
     • Prefix (P): the number of signals (the maximum fan-in) processed at each stage.
       – Prefix 2 means two signals are processed in a node.
       – Logic depth changes with the prefix.
     • Minimum possible number of stages: $\log_R N$ for an $N$-bit adder with prefix $R$.
       – For $N = 64$: $LD_{min} = 6$ for prefix 2, $LD_{min} = 3$ for prefix 4.
     • Fan-out (F): the maximum amount of logical branching in the prefix tree.
     • Wiring Complexity (WC): the maximum number of wire tracks passing along a bit-pitch of the technology in any stage of the prefix tree.
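The minimum-depth figures above can be checked with a few lines of Python. This is a minimal sketch: the function name `min_logic_depth` is illustrative, and it rounds up to a whole stage when $N$ is not an exact power of the prefix, which is an assumption beyond what the slide states.

```python
def min_logic_depth(n_bits: int, prefix: int) -> int:
    """Return the smallest stage count d such that prefix**d >= n_bits."""
    if n_bits < 1 or prefix < 2:
        raise ValueError("need n_bits >= 1 and prefix >= 2")
    depth, span = 0, 1
    while span < n_bits:  # each stage multiplies the covered span by the prefix
        span *= prefix
        depth += 1
    return depth


for prefix in (2, 4):
    print(f"N=64, prefix {prefix}: LD_min = {min_logic_depth(64, prefix)}")
```

For a 64-bit adder this reproduces the slide's values of 6 stages for prefix 2 and 3 stages for prefix 4.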

  5. Recurrence Algorithms: Weinberger, Ling

  6. Minimum Depth Adders
     Kogge-Stone
       - Minimum depth ($\log_2 N$)
       - Minimum fanout (2)
       - Maximum wiring ($N/2$)
       P.M. Kogge and H.S. Stone, "A parallel algorithm for the efficient solution of a general class of recurrence equations", IEEE Trans. Computers, Vol. C-22, No. 8, Aug. 1973, pp. 786-793.
     Ladner-Fischer
       - Minimum depth ($\log_2 N$)
       - Maximum fanout ($N/2$)
       - Minimum wiring (1)
       R.E. Ladner and M.J. Fischer, "Parallel Prefix Computation", JACM, 27(4):831-838, Oct. 1980.
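To make the two minimum-depth structures concrete, the sketch below builds both prefix networks as lists of stages and uses each to add two integers. It is a behavioural model under stated assumptions, not the authors' implementation: the names `kogge_stone`, `ladner_fischer`, and `prefix_add` are hypothetical, the carry-in is fixed at 0, and the Ladner-Fischer network is taken to be its minimum-depth form.

```python
import random


def kogge_stone(n):
    """Stages of (i, j) pairs: node i absorbs the group ending at j = i - 2**s."""
    stages, d = [], 1
    while d < n:
        stages.append([(i, i - d) for i in range(d, n)])
        d *= 2
    return stages


def ladner_fischer(n):
    """Minimum-depth form: the upper half of each 2**(s+1)-bit block reads the
    top bit of the lower half, so one node can drive up to n/2 others."""
    stages, d = [], 1
    while d < n:
        stages.append([(i, (i // d) * d - 1) for i in range(n) if i & d])
        d *= 2
    return stages


def prefix_add(stages, a, b, n):
    """Add two n-bit integers (carry-in 0) using the given prefix network."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]    # bit generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # bit propagate
    G, P = g[:], p[:]
    for stage in stages:
        nG, nP = G[:], P[:]
        for i, j in stage:              # prefix-2 node: combine group i with group j
            nG[i] = G[i] | (P[i] & G[j])
            nP[i] = P[i] & P[j]
        G, P = nG, nP
    total = G[n - 1] << n               # carry out of the top bit
    for i in range(n):
        carry_in = G[i - 1] if i else 0
        total |= (p[i] ^ carry_in) << i
    return total


n = 64
for name, build in (("Kogge-Stone", kogge_stone), ("Ladner-Fischer", ladner_fischer)):
    stages = build(n)
    a, b = random.getrandbits(n), random.getrandbits(n)
    assert prefix_add(stages, a, b, n) == a + b
    print(name, "depth:", len(stages), "prefix nodes:", sum(len(s) for s in stages))
```

Both networks have the same number of stages; they differ in how many nodes each stage contains and in how far and how widely each signal is distributed.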

  7. SPARSE ADDER

  8. Sparse Adder Structure
     • Critical path in prefix adder
       – Sum block: 1 gate
       – Carry block: $1 + \log_2 N$ gates
     • The critical path length cannot be reduced below $\log_2 N$; however, complexity can be moved to the less critical sum block.
     • Solution: Sparse adder
       – Generate every $M$th carry signal
       – Pre-compute sum signals for missing carry signals
       – Select true sum signal based on computed carry signals
     • Dilutes carry block, complicates sum block
     • Saves area and power without changing critical path length
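The three steps of the sparse adder can be sketched as a behavioural model. This is an illustration only: `every_mth_carry` and `sparse_add` are hypothetical names, the block carries are computed by a simple loop standing in for the sparse carry tree, the carry-in is assumed to be 0, and $N$ is assumed to be a multiple of $M$.

```python
def every_mth_carry(a, b, n, m):
    """Carry into each m-bit block; a loop stands in for the sparse carry tree."""
    mask = (1 << m) - 1
    carries = [0]                       # carry into block 0 is assumed to be 0
    for lo in range(0, n, m):
        x, y = (a >> lo) & mask, (b >> lo) & mask
        block_g = (x + y) >> m          # block produces a carry on its own
        block_p = (x + y + 1) >> m      # block produces a carry given carry-in 1
        carries.append(block_g | (block_p & carries[-1]))
    return carries


def sparse_add(a, b, n, m):
    """Add two n-bit integers using only every m-th carry."""
    if n % m:
        raise ValueError("n must be a multiple of m")
    mask = (1 << m) - 1
    carries = every_mth_carry(a, b, n, m)
    total = carries[-1] << n            # carry out of the last block
    for k, lo in enumerate(range(0, n, m)):
        x, y = (a >> lo) & mask, (b >> lo) & mask
        sum0 = (x + y) & mask           # conditional sum for carry-in 0
        sum1 = (x + y + 1) & mask       # conditional sum for carry-in 1
        total |= (sum1 if carries[k] else sum0) << lo  # select the true sum
    return total


assert sparse_add(0xFFFF_FFFF, 1, 32, 4) == 0x1_0000_0000
```

With `m = 1` every carry is generated and the model reduces to a full carry tree; larger `m` moves more of the work into the conditional-sum blocks.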

  9. Prefix Graphs for Sparse Adders

  10. SPARSE ADDERS IN LITERATURE

  11. Conditional Sum (COS) Adder
      32-bit prefix 2 COS adder prefix scheme.
      Sklansky, J., "Conditional-Sum Addition Logic," IRE Transactions on Electronic Computers, vol. EC-9, no. 2, pp. 226-231, June 1960.

  12. Carry Select (CSL) Adder
      64-bit prefix 4 sparse 4 CSL adder prefix scheme.
      Bedrij, O. J., "Carry-Select Adder," IRE Transactions on Electronic Computers, vol. EC-11, no. 3, pp. 340-346, June 1962.

  13. Sparse Adder [Mathew, 2003]
      32-bit prefix 2 sparse 4 LF prefix scheme, Weinberger adder.
      Mathew, S.; Anders, M.; Krishnamurthy, R.K.; Borkar, S., "A 4-GHz 130-nm address generation unit with 32-bit sparse-tree adder core," IEEE Journal of Solid-State Circuits, vol. 38, no. 5, pp. 689-695, May 2003.

  14. ENERGY OPTIMAL SPARSENESS

  15. Carry Tree Sparseness
      • Sparse carry trees reduce energy in parallel adders.
      • The energy improvement comes from reduced carry-path complexity: less wiring and fewer gates.
      • A certain amount of complexity is moved to the sum path, which implies a limit on the sparseness of the carry tree.

  16. Carry Tree Sparseness (cont.)
      • Making the carry tree sparse does not change the critical path length of the carry block.
      • However, it increases the critical path length of the sum block.
      • The critical path length of the carry block for an $N$-bit Ling adder using prefix 2 computations is $\log_2 N$.
      • A sparse-$M$ adder uses $M$-bit parallel adders in the sum block to compute conditional sum signals.
      • Hence, the critical path length of the sum block is $2 + \log_2 M$.

  17. Limit on Sparseness
      • Weinberger recurrence
        – Carry critical path: $1 + \log_2 N$
        – Sum critical path: $2 + \log_2 M$
        $2 + \log_2 M \le 1 + \log_2 N \Rightarrow M \le N/2$
      • Ling recurrence
        – Carry critical path: $\log_2 N$
        – Sum critical path: $2 + \log_2 M$
        $2 + \log_2 M \le \log_2 N \Rightarrow M \le N/4$
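The two bounds follow directly from requiring the sum path to be no deeper than the carry path. A minimal sketch of that arithmetic, assuming $N$ is a power of two and using the hypothetical helper name `max_sparseness`:

```python
def max_sparseness(n_bits: int, recurrence: str) -> int:
    """Largest M whose sum path (2 + log2 M) is no deeper than the carry path."""
    if n_bits < 4 or n_bits & (n_bits - 1):
        raise ValueError("n_bits must be a power of two and at least 4")
    log_n = n_bits.bit_length() - 1
    carry_depth = {"weinberger": 1 + log_n, "ling": log_n}[recurrence]
    # 2 + log2(M) <= carry_depth  =>  log2(M) <= carry_depth - 2
    return 1 << (carry_depth - 2)


for n in (32, 64, 128, 256):
    print(n, max_sparseness(n, "weinberger"), max_sparseness(n, "ling"))
```

For a 64-bit adder this gives an upper limit of 32 for the Weinberger recurrence and 16 for the Ling recurrence, matching $N/2$ and $N/4$.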

  18. SUM PATH DESIGN IN A SPARSE ADDER

  19. Sum Path
      Weinberger
      Ling: $c_i = t_{i-1} h_{i-1}$
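The Ling relation $c_i = t_{i-1} h_{i-1}$ can be checked against the conventional carry recurrence. The slide does not define its symbols, so this sketch assumes the usual textbook definitions: $g_i = a_i b_i$, $t_i = a_i + b_i$ (OR), and pseudo-carry $h_i = g_i + t_{i-1} h_{i-1}$, with the initial term $t_{-1} h_{-1}$ taken to be the carry-in. The function names are illustrative.

```python
def carries_weinberger(a, b, n, cin=0):
    """Conventional recurrence: c[i+1] = g[i] | (t[i] & c[i])."""
    c = [cin]
    for i in range(n):
        g = (a >> i) & (b >> i) & 1
        t = ((a >> i) | (b >> i)) & 1
        c.append(g | (t & c[i]))
    return c                            # c[i] is the carry into bit i


def carries_ling(a, b, n, cin=0):
    """Ling form: h[i] = g[i] | (t[i-1] & h[i-1]), then c[i+1] = t[i] & h[i]."""
    c = [cin]
    t_h_prev = cin                      # assumed initial value of t[-1] & h[-1]
    for i in range(n):
        g = (a >> i) & (b >> i) & 1
        t = ((a >> i) | (b >> i)) & 1
        h = g | t_h_prev                # pseudo-carry, one term simpler than c
        t_h_prev = t & h
        c.append(t_h_prev)
    return c


# Exhaustive agreement check for 5-bit operands and both carry-in values.
for a in range(32):
    for b in range(32):
        for cin in (0, 1):
            assert carries_ling(a, b, 5, cin) == carries_weinberger(a, b, 5, cin)
```

The two functions agree because $g_i = g_i t_i$, so $t_i (g_i + c_i) = g_i + t_i c_i$; the final AND with $t$ is what gets deferred to the sum path.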

  20. RCA vs PPA in Partial Sum Computation
      • RCA (Ripple Carry Adder): Depth = 5
      • PPA (Parallel Prefix Adder): Depth = 4

  21. RCA vs PPA: Critical path length

      Degree of Sparseness ($M$)   Ripple carry ($1 + M$)   Parallel prefix ($2 + \log_2 M$)
      2                            3                        3
      4                            5                        4
      8                            9                        5
      16                           17                       6
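The rows of this comparison follow from the two depth formulas. A short sketch that regenerates them, assuming $M$ is a power of two (the function names are illustrative):

```python
def rca_depth(m: int) -> int:
    """Critical path of an m-bit ripple carry partial-sum block: 1 + M."""
    return 1 + m


def ppa_depth(m: int) -> int:
    """Critical path of an m-bit parallel prefix partial-sum block: 2 + log2 M."""
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a power of two")
    return 2 + (m.bit_length() - 1)


print(f"{'M':>3} {'RCA':>4} {'PPA':>4}")
for m in (2, 4, 8, 16):
    print(f"{m:>3} {rca_depth(m):>4} {ppa_depth(m):>4}")
```

The two structures tie at sparseness 2; from sparseness 4 upward the ripple carry depth grows linearly while the parallel prefix depth grows by one stage per doubling.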

  22. 8-bit Partial Sum Computation using PPA Structure

  23. EFFECT OF INCREASED SPARSENESS (Theoretical results)

  24. Total gate count
      - Gate counts are equal for KS and LF adders.

  25. Total Gate Complexity
      - The complexity of a gate is defined as its number of inputs (1 for an inverter, 2 for a two-input NAND, etc.).
      - For KS, sparse 4 gives the least complexity for 32- to 256-bit adders.
      - For LF, sparse 2 gives the least complexity for 32- and 64-bit adders, and sparse 4 for 128- and 256-bit adders.

  26. Normalized Gate Complexity
      - Complexities are normalized to their full carry tree (sparseness 1) complexities.
      - For KS, sparseness achieves a 30% reduction in complexity.
      - For LF, sparseness achieves a 20% reduction in complexity.

  27. Total Wire Complexity
      - Wire complexity is defined as the total wire length (e.g. a wire from bit 32 to bit 64 has a length of 32 units).
      - For KS, complexity decreases as sparseness increases.
      - For LF, the optimum sparseness for wire complexity is 2 for 32- and 64-bit adders, and 4 for 128- and 256-bit adders.
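The wire-length definition can be illustrated on full (sparseness 1) carry trees. This is a rough sketch under loudly stated assumptions, not the authors' counting method: every prefix node is charged one wire from its source column, fan-out branches are not shared, and sparse trees are not modelled. The function names are hypothetical.

```python
def kogge_stone_wires(n):
    """(source, destination) bit columns for every prefix node of a full KS tree."""
    d = 1
    while d < n:
        for i in range(d, n):
            yield (i - d, i)
        d *= 2


def ladner_fischer_wires(n):
    """Same for a full minimum-depth LF tree."""
    d = 1
    while d < n:
        for i in range(n):
            if i & d:
                yield ((i // d) * d - 1, i)
        d *= 2


def total_wire_length(wires):
    """Sum of horizontal spans, in bit-pitch units (assumed: no wire sharing)."""
    return sum(dst - src for src, dst in wires)


for n in (32, 64, 128, 256):
    print(n, total_wire_length(kogge_stone_wires(n)),
          total_wire_length(ladner_fischer_wires(n)))
```

Under these assumptions the KS tree carries far more wire than the LF tree at every width, which is consistent with the slides' observation that KS has the most wiring to shed as the tree becomes sparser.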

  28. Normalized Wire Complexity
      - Complexities are normalized to their full carry tree (sparseness 1) complexities.
      - For KS, sparseness achieves an 80% reduction in complexity.
      - For LF, sparseness achieves a 20% reduction in complexity.

  29. Theoretical Results
      • For 64-bit LF adders, sparse 2 yields both minimum gate complexity and minimum total wire length.
        – The reduction in gate complexity in the LF adder is due to the removal of buffers, as opposed to the more complex AND-OR gates in the KS adder.
        – Hence, the improvement in gate complexity is smaller for the LF adder than for the KS adder.
        – The increase in gate complexity beyond sparse 8 in the KS adder will offset the energy savings achieved through reduced wiring complexity.
      • The energy optimum sparseness degree is determined by the ratio of gate capacitance to wire capacitance.
        – In the low performance design region, gate sizes are small, so wire capacitances dominate and KS sparse 8 is expected to outperform KS sparse 4 in energy at the same performance.
        – For the LF adder, on the other hand, it is not worth going beyond sparse 4 because complexity increases in both measures.
      • For 128- and 256-bit adders, sparse 4 yields the most savings for both KS and LF structures.

  30. RESULTS

  31. Technology
      Technology
      • 45nm TSMC
      • VDD = 1.1V
      • Temp = 25°C
      • Typical process corner
      • Multi-Vth standard cell library (low, standard, high)

      STDCELL Library
      Gate     Available Strength
      AOI21    1x,2x,4x,6x,8x
      AOI22    1x,2x,4x,6x,8x
      INV      1x,2x,4x,6x,8x,12x,16x,32x
      NAND2    1x,2x,4x,6x,8x
      NOR2     1x,2x,4x,6x,8x
      OAI21    1x,2x,4x,6x,8x
      OAI22    1x,2x,4x,6x,8x

  32. Design Environment
      • Designed adders
        – KS adder w/ full, sparse 2, sparse 4, and sparse 8 carry trees
        – LF adder w/ full, sparse 2, sparse 4, and sparse 8 carry trees
      • Circuit sizing using Design Compiler
      • Placement and routing using Encounter
      • Post layout simulations using Primetime
      • Input driver: 16x inverter
      • Output load: 16x inverter
      • 25% activity at inputs
      • Adders designed for minimum energy using delay targets between 300 ps and 400 ps.

  33. Energy-Delay

  34. Leakage Power

  35. Wire Energy

  36. Conclusion
      • Energy savings of 50% and 22%, and leakage power savings of 70% and 40%, are achieved with increased sparseness of the carry trees for KS and LF adders, respectively.
      • For the 64-bit KS Ling adder, the energy optimal sparseness is 4. For the 64-bit LF Ling adder, it is 2.
      • Both optimal KS and LF adders reach the same minimum delay target of 300 ps.
      • Experimental results suggest that LF S2 is 7% more energy efficient than KS S4 at the minimum delay point.
      • Theoretical results suggest that a sparse 4 carry tree should be used for both KS and LF adders of 128 bits and above.

  37. Questions? THANK YOU …
