
Seongmoo Heo and Krste Asanović, MIT Laboratory for Computer Science



  1. Seongmoo Heo and Krste Asanović, MIT Laboratory for Computer Science, http://www.cag.lcs.mit.edu/scale, WVLSI 2001, April 19, 2001

  2. Motivation
     • Flip-flops are one of the most important components in synchronous VLSI designs.
       o Critical effect on cycle time
       o Large fraction of total system power
     • Previously published work has failed to consider the effect of circuit loading on the relative ranking of flip-flop structures. [Kawaguchi et al. ’98] [Ko and Balsara ’00] [Kong et al ’00] [Lang et al ’97] [Nikolic et al ’00] [Nogawa and Ohtomo ’98] [Stojanovic and Oklobdzija ’99] [Stollo et al ’00] [Yuan and Svensson ’91] [Zyuban and Kogge ’99] [H.P. et al ’96] [J.M. et al ’96]
       o Fixed and usually overly large output load
       o Large or non-specified input drive
       o No output buffering

  3. Observation 1. Different flip-flop designs have different inherent parasitics and output drive strength.
       o Different number and complexity of logic gates
       o Different kinds of feedback

  4. Observation 1. Different flip-flop designs have different inherent parasitics and output drive strength.
       o Different number and complexity of logic gates
       o Different kinds of feedback
     [Figure: three flip-flop symbols, each with a D input and a Q output]

  5. Observation 2. Output loads in a circuit vary significantly.
     [Figure: histogram of flip-flop output load instances in a microprocessor datapath (a custom-designed 32-bit MIPS CPU in a 0.25 µm process). Vertical axis: # of instances (0 to 120). Labelled loads: 1.8fF (min inv gate cap), 7.2fF (4 min inv gate cap), 28.8fF (16 min inv gate cap), 115.2fF (64 min inv gate cap).]

  6. Our Proposal
     Load effects must be considered in flip-flop characterization to avoid sub-optimal selection.
     • We will present energy and delay measurements for various flip-flops across a range of output loading conditions (electrical effort, EE, and absolute load size) and show that the relative rankings of structures vary.
     • We will show that output buffering at high load can lead to better performance and lower energy consumption for some structures.

  7. Related Work
     • Traditional Buffer Sizing
     • Logical Effort [Sutherland and Sproull]
       o Logical Effort (LE): drive strength of a circuit structure
       o Electrical Effort (EE): the ratio of output load to input load
       o Delay = intrinsic parasitic delay + LE × EE
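The delay expression on this slide can be written out as a small calculation. The following is a minimal sketch only: the function name `stage_delay` and the numeric values (a stage with parasitic delay 1 and logical effort 1, in normalized delay units) are illustrative assumptions, not figures from the talk.

```python
def stage_delay(parasitic: float, logical_effort: float,
                c_out: float, c_in: float) -> float:
    """Delay = intrinsic parasitic delay + LE x EE (normalized units)."""
    electrical_effort = c_out / c_in  # ratio of output load to input load
    return parasitic + logical_effort * electrical_effort


if __name__ == "__main__":
    # Hypothetical stage with p = 1 and LE = 1, input load of 1 unit.
    for load in (4, 16, 64):
        print(load, stage_delay(1.0, 1.0, c_out=load, c_in=1.0))
```

In this model the parasitic term is independent of load, so two structures with different parasitic delay and logical effort can swap order as the electrical effort grows.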

  8. Overview • Flip-Flop Designs • Test Bench & Simulation Setup • Delay and Energy Characterization • Delay Analysis • Energy-versus-Delay Analysis • Summary

  9. Flip-Flop Designs: fully static and single-ended [Nikolic et al ’00]

  10. Test Bench
      • Sized clock buffer to give equal rise/fall time
      • Used a fixed, realistic input driver
      • Varied output load from 4 min inv cap (7.2fF) to 64 min inv cap (115.2fF).
      • 4 Load and Drive Configurations
        o EE4-min: min input drive, 4 min inv load (7.2fF)
        o EE16-min: min input drive, 16 min inv load (28.8fF)
        o EE64-min: min input drive, 64 min inv load (115.2fF)
        o EE4-big: 16x min input drive, 64 min inv load (115.2fF)
      [Figure: test bench with the flip-flop (FF) driving 4, 16, and 64 min inv cap loads]
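The four configurations can be tabulated to check that the names match the stated loads. This is a sketch under two assumptions: the minimum inverter gate capacitance is the 1.8fF quoted on the load histogram slide, and the electrical effort in each name is taken as the load divided by the input drive, both in minimum-inverter units. The names `CONFIGS` and `describe` are hypothetical.

```python
MIN_INV_CAP_FF = 1.8  # min inverter gate cap in fF, as quoted in the slides

# name -> (input drive, output load), both in multiples of a min inverter
CONFIGS = {
    "EE4-min": (1, 4),
    "EE16-min": (1, 16),
    "EE64-min": (1, 64),
    "EE4-big": (16, 64),
}


def describe(name: str) -> dict:
    """Return electrical effort and absolute load (fF) for a configuration."""
    drive, load = CONFIGS[name]
    return {
        "electrical_effort": load / drive,  # assumed: load / input drive
        "load_fF": round(load * MIN_INV_CAP_FF, 1),
    }


if __name__ == "__main__":
    for name in CONFIGS:
        print(name, describe(name))
```

EE4-min and EE4-big share the same electrical effort but differ by 16x in absolute load, which is what lets the later slides separate the two effects.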

  11. Simulation Setup
      • 0.25 µm TSMC CMOS process, Vdd = 2.5V, T = 25°C
      • The Hspice Levenberg-Marquardt method was used for transistor size optimization.
        o Transistor widths were optimized for each load and drive configuration to give min delay or min energy for a given delay (transistor lengths were fixed at minimum).
        o Parasitic capacitances were included in the circuit netlists.

  12. Delay and Energy Characterization
      • Minimum D-Q delay [Stojanovic et al. ’99] (.Measure command)
      • Total energy = input energy + internal energy + clock energy – output energy
      • A single test waveform with ungated clock and data toggling every cycle
        o For a full characterization of energy dissipation, more realistic activity patterns should be considered [Heo, Krashinsky, Asanovic ARVLSI’01].
      [Figure: flip-flop (FF) driving 4, 16, and 64 min inv loads]
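The two metrics on this slide can be sketched as follows. The energy function follows the formula as stated. The delay function assumes that the minimum D-Q delay is found by sweeping the data arrival time relative to the clock edge and minimizing the sum of data-to-clock time and clock-to-Q delay; the slides do not spell this procedure out, and the sweep values below are invented for illustration, not simulation results.

```python
def total_energy(input_e: float, internal_e: float,
                 clock_e: float, output_e: float) -> float:
    """Total energy = input + internal + clock - output energy."""
    return input_e + internal_e + clock_e - output_e


def min_dq_delay(samples):
    """Smallest D-Q delay over a sweep.

    samples: iterable of (data_to_clock, clock_to_q) pairs in ps.
    D-Q delay for one sample is assumed to be data_to_clock + clock_to_q.
    """
    return min(d2c + c2q for d2c, c2q in samples)


if __name__ == "__main__":
    # Hypothetical sweep: clock-to-Q grows as data arrives closer to the clock.
    sweep = [(300, 200), (200, 210), (100, 250), (50, 350), (20, 600)]
    print(min_dq_delay(sweep))
    print(total_energy(10.0, 50.0, 20.0, 30.0))
```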

  13. Speed Ranking Without Buffering (transistors sized at each load point, but only for min delay)
      • Delay = const. intrinsic parasitic delay + output drive delay (= load size × driving capability)
      • Driving Capability = f(# of stages, complexity)
      [Figure: speed ranking plot; series: PPCFF, SAFF, MSAFF, HLFF, SSAPL]
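The delay model on this slide implies that rankings can cross as the load grows. Below is a minimal sketch with two hypothetical structures, "A" and "B"; their coefficients are made up to show a crossover and do not correspond to any of the flip-flops in the talk.

```python
def ff_delay(parasitic: float, drive_coeff: float, load: float) -> float:
    """Delay = constant parasitic delay + load size x driving capability."""
    return parasitic + drive_coeff * load


# Hypothetical (parasitic delay in ns, driving capability in ns per min inv cap):
# "A" has more internal delay but drives loads well; "B" is the opposite.
FLOPS = {"A": (0.30, 0.002), "B": (0.15, 0.010)}


def ranking(load: float) -> list:
    """Names ordered from fastest to slowest at the given load."""
    return sorted(FLOPS, key=lambda name: ff_delay(*FLOPS[name], load))


if __name__ == "__main__":
    for load in (4, 16, 64):
        print(load, ranking(load))
```

With these numbers "B" is faster at a load of 4 min inv caps and "A" is faster at 64, so a comparison made at a single fixed load would pick a different winner depending on which load was chosen.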

  14. Influence of Buffering on Performance (assuming no penalty for inverting output; min. input drive was used)
      [Figure: delay (ns) versus load (min inv cap) for SAFF, PPCFF, MSAFF, HLFF, and SSAPL; curves: unbuffered, one inverter, two inverters]
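The comparison between unbuffered, one-inverter, and two-inverter outputs can be illustrated with a brute-force search over buffer sizes. This is a sketch under stated assumptions: a linear delay model for both the flip-flop and the inverters, a small discrete set of inverter sizes, and made-up coefficients. The names `path_delay` and `best_buffering` are hypothetical, and the sizing method is not the Hspice optimization used in the talk.

```python
from itertools import product

SIZES = (1, 2, 4, 8, 16, 32)  # candidate inverter sizes (multiples of min inv)


def path_delay(p_ff, k_ff, load, inv_sizes, p_inv=0.02, t_inv=0.02):
    """Delay (ns) of a flip-flop followed by a chain of inverters.

    The flip-flop sees the first inverter's input cap (or the load if there
    are no inverters); each inverter sees the next stage's cap.
    """
    caps = list(inv_sizes) + [load]
    delay = p_ff + k_ff * caps[0]
    for size, c_out in zip(inv_sizes, caps[1:]):
        delay += p_inv + t_inv * c_out / size
    return delay


def best_buffering(p_ff, k_ff, load, max_inverters=2):
    """Return (delay, number of inverters, sizes) with the smallest delay."""
    best = None
    for n in range(max_inverters + 1):
        for sizes in product(SIZES, repeat=n):
            d = path_delay(p_ff, k_ff, load, sizes)
            if best is None or d < best[0]:
                best = (d, n, sizes)
    return best


if __name__ == "__main__":
    # Hypothetical weak-drive flip-flop: 0.2 ns parasitic, 0.02 ns per min inv cap.
    for load in (4, 16, 64):
        print(load, best_buffering(0.2, 0.02, load))
```

Allowing a single inverter corresponds to the slide's assumption that an inverted output carries no penalty.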

  15. Speed Ranking With Buffering Allowed
      • Less speed variation compared to original flip-flops
      [Figure: speed ranking plot; series: PPCFF, SAFF, MSAFF, HLFF, SSAPL]

  16. Energy-Delay Curve: EE4-min. Each point sized for min energy for a given delay.
      [Figure: energy (fJ) versus delay (ns). EE4-min: min. drive + 4 min inv load (7.2fF)]

  17. Energy-Delay Curve: EE4-min
      [Figure: energy (fJ) versus delay (ns); labelled curve: PPCFF-unbuf. EE4-min: min. drive + 4 min inv load (7.2fF)]

  18. Energy-Delay Curve: EE16-min
      [Figure: energy (fJ) versus delay (ns); labelled curve: SSAPL-unbuf. EE16-min: min. drive + 16 min inv load (28.8fF)]

  19. Energy-Delay Curve: EE16-min
      [Figure: energy (fJ) versus delay (ns); labelled curves: SSAPL-unbuf, SSAPL-buf. EE16-min: min. drive + 16 min inv load (28.8fF)]

  20. Energy-Delay Curve: EE16-min
      [Figure: energy (fJ) versus delay (ns); labelled curve: HLFF-unbuf. EE16-min: min. drive + 16 min inv load (28.8fF)]

  21. Energy-Delay Curve: EE16-min
      [Figure: energy (fJ) versus delay (ns); labelled curves: HLFF-unbuf, HLFF-buf. EE16-min: min. drive + 16 min inv load (28.8fF)]

  22. Energy-Delay Curve: EE16-min
      [Figure: energy (fJ) versus delay (ns); labelled curve: PPCFF-unbuf. EE16-min: min. drive + 16 min inv load (28.8fF)]

  23. Energy-Delay Curve: EE64-min
      [Figure: energy (fJ) versus delay (ns); labelled curve: MSAFF-unbuf. EE64-min: min. drive + 64 min inv load (115.2fF)]

  24. Energy-Delay Curve: EE64-min
      [Figure: energy (fJ) versus delay (ns); labelled curves: HLFF-buf, HLFF-unbuf. EE64-min: min. drive + 64 min inv load (115.2fF)]

  25. Energy-Delay Curve: EE4-big
      [Figure: energy (fJ) versus delay (ns). EE4-big: 16x min. drive + 64 min inv load (115.2fF)]

  26. Energy-Delay Curve: EE4-min vs EE4-big
      [Figure: energy-delay curves for EE4-min and EE4-big; labelled curves: PPCFF-unbuf, MSAFF-unbuf, SAFF-unbuf]

  27. Energy-Delay Curve: EE4-min vs EE4-big
      [Figure: energy-delay curves for EE4-min and EE4-big, each with labelled curves MSAFF-unbuf, PPCFF-unbuf, and SAFF-unbuf]

  28. Summary
      • Different flip-flops have different gains and parasitics.
      • Real VLSI designs exhibit a variety of flip-flop output loads.
      • The output load size affects the relative performance and energy consumption of different flip-flop designs.
      • Therefore, output load effects should be accounted for when comparing flip-flops.
        1. Electrical effort
        2. Absolute output load size
        3. Output buffering
