SLIDE 1
Seongmoo Heo and Krste Asanović
MIT Laboratory for Computer Science
http://www.cag.lcs.mit.edu/scale
WVLSI 2001 April 19, 2001
SLIDE 2 Motivation
- Flip-flops are one of the most important components in
synchronous VLSI designs.
- Critical effect on cycle time
- Large fraction of total system power
- Previously published work has failed to consider the effect of
circuit loading on the relative ranking of flip-flop structures.
[Kawaguchi et al. ’98] [Ko and Balsara ’00] [Kong et al ’00] [Lang et al ’97] [Nikolic et al ’00] [Nogawa and Ohtomo ’98] [Stojanovic and Oklobdzija ’99] [Stollo et al ’00] [Yuan and Svensson ’91] [Zyuban and Kogge ’99] [H.P. et al ’96] [J.M. et al ’96]
- Fixed and usually overly large output load
- Large or non-specified input drive
- No output buffering
SLIDE 3 Observation
- 1. Different flip-flop designs have different inherent parasitics and
output drive strength.
- Different number and complexity of logic gates
- Different kinds of feedback
SLIDE 4 Observation
[Figure: three flip-flop blocks, each with a D input and a Q output]
SLIDE 5 Observation
- 2. Output loads in a circuit vary significantly.
[Figure: histogram of flip-flop output load instances in a microprocessor datapath (a custom-designed 32-bit MIPS CPU in a 0.25µm process). Vertical axis: # of instances, ticks from 20 to 120. Marked loads: 1.8fF (min inv gate cap), 7.2fF (4 min inv gate cap), 28.8fF (16 min inv gate cap), 115.2fF (64 min inv gate cap).]
SLIDE 6 Our Proposal
Load effects must be considered in flip-flop characterization to avoid sub-optimal selection.
- We will present energy and delay measurements for various flip-flops
across a range of output loading conditions (electrical effort, EE, and absolute load size) and show that the relative rankings of structures vary.
- We will show that output buffering at high load can lead to
better performance and lower energy consumption for some structures.
SLIDE 7 Related Work
- Traditional Buffer Sizing
- Logical Effort [Sutherland and Sproull]
- Logical Effort: drive strength of a circuit structure
- Electrical Effort: the ratio of output load to input load
- Delay = intrinsic parasitic delay + LE × EE
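The delay expression above can be sketched numerically. This is a minimal illustration, not taken from the slides: the function names are hypothetical, delays are in units of the process time constant, and the inverter and 2-input NAND values are the usual textbook logical-effort numbers.

```python
def electrical_effort(c_out, c_in):
    """EE: ratio of output load capacitance to input capacitance."""
    return c_out / c_in


def stage_delay(parasitic, logical_effort, ee):
    """Delay = intrinsic parasitic delay + LE x EE (normalized units)."""
    return parasitic + logical_effort * ee


# Textbook values (assumed here, not from the slides):
# inverter: LE = 1, parasitic = 1; 2-input NAND: LE = 4/3, parasitic = 2.
ee = electrical_effort(c_out=4.0, c_in=1.0)   # fanout of 4
inv_delay = stage_delay(1.0, 1.0, ee)          # 1 + 1 * 4
nand_delay = stage_delay(2.0, 4.0 / 3.0, ee)   # 2 + (4/3) * 4

print(f"inverter: {inv_delay:.2f}, NAND2: {nand_delay:.2f}")
```

The same EE gives different delays for structures with different LE and parasitics, which is the effect the talk examines for flip-flops.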
SLIDE 8 Overview
- Flip-Flop Designs
- Test Bench & Simulation Setup
- Delay and Energy Characterization
- Delay Analysis
- Energy-versus-Delay Analysis
- Summary
SLIDE 9
Flip-Flop Designs
Fully static and single-ended
[Nikolic et al ’00]
SLIDE 10 Test Bench
to give equal rise/fall time
- Used a fixed, realistic input driver
- Varied output load from
4 min inv cap (7.2fF) to 64 min inv cap (115.2fF).
- 4 Load and Drive Configurations
- EE4-min: min input drive, 4 min inv load (7.2fF)
- EE16-min: min input drive, 16 min inv load (28.8fF)
- EE64-min: min input drive, 64 min inv load (115.2fF)
- EE4-big: 16x min input drive, 64 min inv load (115.2fF)
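The four configurations above can be tabulated with a short script. This is only a sketch: it takes the 1.8fF min inverter gate cap quoted in the deck, and it assumes that EE can be approximated as the load multiple divided by the input drive multiple (which is what the configuration names suggest). The names `CONFIGS` and `summarize` are hypothetical.

```python
MIN_INV_CAP_FF = 1.8  # one min inverter gate cap in fF, as quoted in the deck

# (name, input drive as a multiple of min drive, load in min inv caps)
CONFIGS = [
    ("EE4-min", 1, 4),
    ("EE16-min", 1, 16),
    ("EE64-min", 1, 64),
    ("EE4-big", 16, 64),
]


def summarize(configs):
    """Return absolute load (fF) and approximate EE for each configuration."""
    rows = []
    for name, drive, load in configs:
        rows.append({
            "name": name,
            "load_fF": round(load * MIN_INV_CAP_FF, 1),
            "ee": load / drive,  # assumption: EE ~ load multiple / drive multiple
        })
    return rows


for row in summarize(CONFIGS):
    print(f'{row["name"]:9s} load = {row["load_fF"]:6.1f} fF, EE = {row["ee"]:.0f}')
```

EE4-min and EE4-big share the same EE but differ in absolute load size, so comparing them separates the two effects.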
[Figure: test bench with the flip-flop (FF) driving a 4, 16, or 64 min inv cap load]
SLIDE 11 Simulation Setup
- 0.25µm TSMC CMOS process, Vdd=2.5V, T=25°C
- The Hspice Levenberg-Marquardt method was used for transistor size
optimization.
- Transistor widths were optimized for each load and drive configuration
to give min delay or min energy for a given delay (transistor lengths were fixed at minimum).
- Parasitic capacitances included in the circuit netlists.
SLIDE 12 Delay and Energy Characterization
- Minimum D-Q delay [Stojanovic et al. ’99], measured with the Hspice .Measure command
- Total energy = input energy + internal energy + clock energy
– output energy
- A single test waveform with ungated clock and data toggling every cycle
- For a full characterization of energy dissipation, more realistic
activity patterns should be considered [Heo, Krashinsky, Asanovic ARVLSI’01].
[Figure: flip-flop (FF) driving a 4, 16, or 64 min inv load]
SLIDE 13 Speed Ranking Without Buffering
- Delay = const. intrinsic parasitic delay
+ output drive delay (= load size × driving capability)
- Driving Capability = f(# of stages, complexity)
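The linear delay model above implies that speed rankings can flip as the load grows. The sketch below illustrates this with two made-up flip-flops; the names `FF_A` and `FF_B` and all coefficients are hypothetical and are not measurements from this work.

```python
# Hypothetical (parasitic delay in ns, delay per min inv cap of load in ns).
# These numbers are illustrative only, NOT measured data.
MODELS = {
    "FF_A": (0.20, 0.010),  # small parasitic delay, weaker output drive
    "FF_B": (0.35, 0.004),  # larger parasitic delay, stronger output drive
}


def delay(model, load):
    """Delay = constant parasitic delay + load-dependent output drive delay."""
    parasitic, per_load = model
    return parasitic + per_load * load


def ranking(models, load):
    """Names sorted from fastest to slowest at the given load."""
    return sorted(models, key=lambda name: delay(models[name], load))


for load in (4, 16, 64):
    print(load, ranking(MODELS, load))
```

With these assumed coefficients, FF_A is faster at a load of 4 min inv caps and FF_B is faster at 64, with the crossover at 25 min inv caps.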
[Figure: speed ranking of PPCFF, SAFF, MSAFF, HLFF, and SSAPL without buffering; axis ticks from 0.5 to 3.5]
(Transistors sized at each load point, but only for min delay)
SLIDE 14 Influence of Buffering on Performance
[Figure: delay (ns) versus load (min inv cap) for SSAPL, HLFF, MSAFF, SAFF, and PPCFF, each shown unbuffered, with one inverter, and with two inverters]
(Min. input drive was used.)
(Assuming no penalty for inverting output)
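Why buffering can help at high load can be sketched with a simple first-order model. Everything here is an assumption for illustration: the coefficients are made up, the inverter is modeled as a parasitic delay plus a fanout-proportional term, and, as on the slide, no penalty is charged for the inverted output.

```python
import math

# Hypothetical coefficients in ns; NOT measured data.
P_FF = 0.30    # flip-flop parasitic delay
K_FF = 0.020   # flip-flop delay per min inv cap of load
P_INV = 0.03   # inverter parasitic delay
R_INV = 0.03   # inverter delay per unit of fanout (load / inverter size)


def unbuffered_delay(load):
    """Flip-flop drives the load directly."""
    return P_FF + K_FF * load


def buffered_delay(load):
    """Flip-flop drives one inverter, which drives the load.

    The inverter size (in min inv caps) is chosen to minimize
    K_FF * size + R_INV * load / size, and is at least 1.
    """
    size = max(1.0, math.sqrt(R_INV * load / K_FF))
    return P_FF + K_FF * size + P_INV + R_INV * load / size


def best_option(load):
    """Pick the faster of the two options at the given load."""
    if unbuffered_delay(load) <= buffered_delay(load):
        return "unbuffered"
    return "one inverter"


for load in (4, 16, 64):
    print(load, best_option(load))
```

In this model the extra inverter costs more than it saves at 4 min inv caps, but pays off at 64, which matches the qualitative trend the slide describes for some structures.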
SLIDE 15 Speed Ranking With Buffering Allowed
- Less speed variation compared to original flip-flops
[Figure: speed ranking of PPCFF, SAFF, MSAFF, HLFF, and SSAPL with buffering allowed; axis ticks from 0.5 to 3.5]
SLIDE 16
Energy-Delay Curve: EE4-min
EE4-min: min. drive + 4 min inv load (7.2fF)
[Figure: energy (fJ) versus delay (ns)]
Each point sized for min energy for a given delay
SLIDE 17
Energy-Delay Curve: EE4-min
EE4-min: min. drive + 4 min inv load (7.2fF)
[Figure: energy (fJ) versus delay (ns); labeled curve: PPCFF-unbuf]
SLIDE 18
Energy-Delay Curve: EE16-min
EE16-min: min. drive + 16 min inv load (28.8fF)
[Figure: energy (fJ) versus delay (ns); labeled curve: SSAPL-unbuf]
SLIDE 19
Energy-Delay Curve: EE16-min
EE16-min: min. drive + 16 min inv load (28.8fF)
[Figure: energy (fJ) versus delay (ns); labeled curves: SSAPL-buf, SSAPL-unbuf]
SLIDE 20
Energy-Delay Curve: EE16-min
EE16-min: min. drive + 16 min inv load (28.8fF)
[Figure: energy (fJ) versus delay (ns); labeled curve: HLFF-unbuf]
SLIDE 21
Energy-Delay Curve: EE16-min
EE16-min: min. drive + 16 min inv load (28.8fF)
[Figure: energy (fJ) versus delay (ns); labeled curves: HLFF-buf, HLFF-unbuf]
SLIDE 22
Energy-Delay Curve: EE16-min
EE16-min: min. drive + 16 min inv load (28.8fF)
[Figure: energy (fJ) versus delay (ns); labeled curve: PPCFF-unbuf]
SLIDE 23
Energy-Delay Curve: EE64-min
EE64-min: min. drive + 64 min inv load (115.2fF)
[Figure: energy (fJ) versus delay (ns); labeled curve: MSAFF-unbuf]
SLIDE 24
Energy-Delay Curve: EE64-min
EE64-min: min. drive + 64 min inv load (115.2fF)
[Figure: energy (fJ) versus delay (ns); labeled curves: HLFF-unbuf, HLFF-buf]
SLIDE 25
Energy-Delay Curve: EE4-big
EE4-big: 16x min. drive + 64 min inv load (115.2fF)
[Figure: energy (fJ) versus delay (ns)]
SLIDE 26
Energy-Delay Curve: EE4-min vs EE4-big
[Figure: two panels, EE4-min and EE4-big; labeled curves: MSAFF-unbuf, SAFF-unbuf, PPCFF-unbuf]
SLIDE 27
Energy-Delay Curve: EE4-min vs EE4-big
[Figure: two panels, EE4-min and EE4-big; curves MSAFF-unbuf, SAFF-unbuf, and PPCFF-unbuf are labeled in both panels]
SLIDE 28 Summary
- Different flip-flops have different gains and parasitics.
- Real VLSI designs exhibit a variety of flip-flop output loads.
- The output load size affects the relative performance and energy
consumption of different flip-flop designs.
- Therefore, output load effects should be accounted for when
comparing flip-flops.
- 1. Electrical effort
- 2. Absolute output load size
- 3. Output buffering