1
Asynchronous Computing in Sense Amplifier-based Pass Transistor Logic
Tsung-Te Liu, Louis P. Alarcón, Matthew D. Pierson, and Jan M. Rabaey
University of California, Berkeley
2
Outline
- Motivation
- Sense Amplifier-based Pass Transistor Logic (SAPTL)
- Self-timed SAPTL
– Bundled-data self-timed SAPTL design
– Dual-rail self-timed SAPTL design
- Simulation results
– Area
– Energy-delay
– Leakage
- Conclusion
3
Focus: Ultra Low Voltage Design
Voltage scaling:
- Most effective technique for power reduction
- But at low supply voltage
– Leakage dominates
– Delay is large and variable
[Figure: Supply voltage vs. CMOS technology node [ITRS]. Axes: channel length L (µm), supply voltage VDD (V). Annotations: High Performance, Low Power, Goal of Ultra Low Voltage Design: < 300mV.]
4
Issue-1:
Leakage dominates at low supply voltage
- Leakage dominates the power and energy at low VDD
- Leakage determines standby power for portable devices
- 203-stage NAND2 ring oscillator
- 90-nm CMOS standard cell
[Figure: Energy vs. supply voltage. Axes: supply voltage [V] from 0.2 to 1, energy [fJ] on a log scale.] [JOLPE]
5
Issue-2:
Delay is large & variable at low supply voltage
- Supply = 300mV
- Delay spread due to transistor mismatches: -10% to +15%
- Delay spread due to process variations: -60% to +100%
[Figure: two histograms of count vs. normalized delay, one for transistor mismatches and one for process variations.]
[Figure: Delay vs. supply voltage. Axes: supply voltage [V] from 0.0 to 1.0, delay [s] on a log scale from 10^-8 to 10^-5.]
- Delay increases exponentially at low VDD
- Large delay spread in actual technology environment
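The two issues above can be illustrated with a toy subthreshold model. This is only a sketch under stated assumptions: on-current is taken as exponential in (VDD - Vth), which is a common textbook approximation valid below threshold, and every parameter value (`vth`, `n`, `c_load`, `i0`, `idle_gates`) is hypothetical rather than taken from the slides or from the 90-nm process they describe.

```python
import math

THERMAL_VOLTAGE = 0.026  # kT/q near room temperature, in volts


def subthreshold_delay(vdd, vth=0.35, n=1.5, c_load=1e-15, i0=1e-6):
    """Gate delay ~ C*VDD / I_on, with I_on = i0 * exp((VDD - Vth) / (n*vT)).

    All default values are illustrative assumptions.
    """
    i_on = i0 * math.exp((vdd - vth) / (n * THERMAL_VOLTAGE))
    return c_load * vdd / i_on


def leakage_to_dynamic_ratio(vdd, idle_gates=100, n=1.5):
    """Ratio of leakage energy to switching energy per operation.

    With E_dyn = C*VDD^2 and E_leak = idle_gates * I_off * VDD * delay,
    the model above reduces the ratio to idle_gates * exp(-VDD / (n*vT)).
    """
    return idle_gates * math.exp(-vdd / (n * THERMAL_VOLTAGE))


for vdd in (0.2, 0.25, 0.3):
    print(f"VDD={vdd:.2f} V  delay={subthreshold_delay(vdd):.3e} s  "
          f"E_leak/E_dyn={leakage_to_dynamic_ratio(vdd):.3f}")
```

In this model both the delay and the leakage share of the energy grow as VDD is lowered, which matches the qualitative trend on the two slides; the absolute numbers carry no meaning.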
6
Outline
- Motivation
- Sense Amplifier-based Pass Transistor Logic (SAPTL)
- Self-timed SAPTL
– Bundled-data self-timed SAPTL design
– Dual-rail self-timed SAPTL design
- Simulation results
– Area
– Energy-delay
– Leakage
- Conclusion
7
Solution: Use pass transistor logic!
[Figure: pass transistor network between drivers and receivers.]
- No VDD and GND connections in the logic path
– Nearly zero leakage through the logic path
– Can keep scaling down the threshold voltage
[JOLPE]
8
SAPTL:
Sense Amplifier-based Pass Transistor Logic
- Leakage path only exists in the root node driver and sense amplifier
- Sense amplifier to recover voltage swing and reduce delay
[Figure: SAPTL block diagram. Labels: pass transistor network (stack), sense amplifier, timing control (CK), data inputs, data outputs.]
[JOLPE]
9
SAPTL Computation:
to contain leakage while reducing threshold
- Current steering
- Works for very low Ion to Ioff ratio
- Could be programmable
- Outputs pre-charge to VDD to reset
- Latch retains the computation results
- Low voltage operation down to 300mV
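[Figure: SAPTL stack and sense amplifier schematic. Labels: root input; data inputs A and B with their complements; stack outputs to the sense amplifier; sense amplifier outputs to the next SAPTL.]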
[JOLPE]
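The current-steering idea can be pictured with a purely behavioral sketch. It assumes the stack is a binary tree of pass transistors whose leaves are wired to either the true or the complementary stack output according to a truth table, so that changing the table reprograms the function. The names `evaluate_stack`, `truth_table`, and `root_driven` are illustrative and not from the slides, and no electrical behavior (thresholds, Ion/Ioff ratio, sensing) is modeled.

```python
def evaluate_stack(truth_table, inputs, root_driven=True):
    """Steer the root signal to one of two stack outputs.

    truth_table: list of 2**len(inputs) bits; inputs[0] is the most
                 significant index bit (an arbitrary convention here).
    Returns (s, s_bar). Both are 0 when the root is not driven, which
    stands in for the reset state in this sketch.
    """
    if len(truth_table) != 2 ** len(inputs):
        raise ValueError("truth table size must be 2**len(inputs)")
    if not root_driven:
        return (0, 0)
    index = 0
    for bit in inputs:
        # Each input selects one branch of the tree.
        index = (index << 1) | (1 if bit else 0)
    return (1, 0) if truth_table[index] else (0, 1)


XOR2 = [0, 1, 1, 0]  # hypothetical programming of a 2-input stack
print(evaluate_stack(XOR2, [1, 0]))  # (1, 0)
```

Exactly one output is asserted whenever the root is driven, which is what lets a differential sense amplifier resolve the result.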
10
SAPTL Control:
to deal with delay variation & uncertainty
Fact:
- Computation delay varies in actual technology environment
– Ion to Ioff current ratio varies and changes the stack delay
– Offset variations in the sense amplifier reduce the stack output margins
- Energy efficiency is a function of timing accuracy
– Turn off the sense amplifier when inactive
- Synchronous timing is too conservative
– Unable to track local delay variations
– Additional skew and jitter in the clock distribution
[Figure: stack and sense amplifier with timing control, data inputs, and data outputs.]
Solution: Use Self-timed control!
– Exploit local timing information
– Low implementation cost in SAPTL
11
Outline
- Motivation
- Sense Amplifier-based Pass Transistor Logic (SAPTL)
- Self-timed SAPTL
– Bundled-data self-timed SAPTL design
– Dual-rail self-timed SAPTL design
- Simulation results
– Area
– Energy-delay
– Leakage
- Conclusion
12
Data Evaluation
in Bundled-Data Self-Timed SAPTL
Sequence of events:
- 1. Din
- 2. Reqin
- 3. Sout
- 4. Enable
- 5. Dout
- 6. Ackout & Reqout
RTA1: Din < Reqin
RTA2: Sout < Enable
* RTA : Relative Timing Assumption
[Figure: bundled-data self-timed SAPTL block diagram with waveforms. Labels: data inputs Din (true and complement), driver, root, stack, sense amplifier, Sout, Dout, delay, C-element, timing control, Enable, Reqin, Reqout, Ackin, Ackout. Waveform levels: H, L.]
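The six-step evaluation sequence and its two relative timing assumptions can be checked mechanically against a set of event times. The sketch below is an assumption-laden illustration: `check_evaluation` and the timestamp dictionary are hypothetical, "<" is read as "occurs before", and the steps that are not labeled as RTAs are reported simply as sequence violations.

```python
EVALUATION_ORDER = ["Din", "Reqin", "Sout", "Enable", "Dout", "Ackout/Reqout"]

# Adjacent pairs that the slide names as relative timing assumptions.
RTAS = {("Din", "Reqin"): "RTA1", ("Sout", "Enable"): "RTA2"}


def check_evaluation(times):
    """Return the ordering constraints that a timeline violates.

    times maps each event name in EVALUATION_ORDER to a timestamp
    (arbitrary units).
    """
    problems = []
    for earlier, later in zip(EVALUATION_ORDER, EVALUATION_ORDER[1:]):
        if not times[earlier] < times[later]:
            label = RTAS.get((earlier, later), "sequence")
            problems.append(f"{label}: {earlier} < {later}")
    return problems


good = {"Din": 1, "Reqin": 2, "Sout": 3, "Enable": 4, "Dout": 5,
        "Ackout/Reqout": 6}
early_enable = dict(good, Enable=2.5)  # sense amplifier fires too soon

print(check_evaluation(good))          # []
print(check_evaluation(early_enable))  # ['RTA2: Sout < Enable']
```

The second timeline shows why RTA2 matters: if Enable arrives before the stack output has developed, the sense amplifier would resolve an undeveloped signal.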
13
Data Reset
in Bundled-Data Self-Timed SAPTL
- 1. Din
- 2. Sout: unwanted GLITCH!
Issue: Opposite stack output charges up if Reqin is still low
- Degrades the reliability of the self-timed operation
- Increases the energy dissipation
[Figure: bundled-data self-timed SAPTL block diagram with reset waveforms. Labels: data inputs Din (true and complement), driver, root, stack, sense amplifier, Sout, Dout, delay, C-element, Enable, Reqin, Reqout, Ackin, Ackout. Waveform levels: H, L, logical low.]
14
Solution to Avoid Data Reset Glitch
Use different handshake protocols
- Use Din < Reqin for data evaluation
- Use Reqin < Din for data reset
Implementation-1: Late reset
- Increase data delay: delay data input signals
- Easy to implement
- But...retards the data reset operation
Implementation-2: Early reset
- Decrease control delay: trigger driver before data inputs reset
- Maintains the original data reset latency and throughput
- But…requires an additional relative timing assumption, which is easy to meet
Our choice: early reset
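The phase-dependent protocol can be stated as a one-line rule per phase. The following is a minimal sketch, assuming "<" means "occurs before" and using hypothetical names (`data_request_order_ok`, `phase`); it captures only the ordering rule, not how late reset or early reset achieves it in the circuit.

```python
def data_request_order_ok(phase, t_din, t_reqin):
    """Check the Din/Reqin ordering required in each handshake phase.

    evaluate: data must arrive before the request (Din < Reqin).
    reset:    the request must reset before the data (Reqin < Din),
              which is the condition for avoiding the reset glitch.
    """
    if phase == "evaluate":
        return t_din < t_reqin
    if phase == "reset":
        return t_reqin < t_din
    raise ValueError(f"unknown phase: {phase!r}")


print(data_request_order_ok("evaluate", t_din=1.0, t_reqin=2.0))  # True
print(data_request_order_ok("reset", t_din=1.0, t_reqin=2.0))     # False
```

The second call reproduces the glitch scenario: the same ordering that is correct for evaluation is the wrong one for reset.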
15
Glitch-Free
Bundled-Data Self-Timed SAPTL
- Replace the global clock signal with local handshake logic
- Delay line controls the local timing
- Still have a delay-matching issue (TControl > TData)
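[Figure: glitch-free bundled-data self-timed SAPTL block diagram. Labels: data inputs Din (true and complement), driver, stack, sense amplifier, Dout, delay, C-element, Reqin, Reqout, Ackin, Ackout, with the TControl and TData paths marked.]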
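The delay-matching constraint TControl > TData has a cost that a small numeric sketch can make visible. It assumes normalized data-path delays drawn uniformly over the -60% to +100% process spread quoted earlier in the deck; the uniform distribution, the 5% `margin`, and the function names are all assumptions made for illustration.

```python
import random


def delay_line_setting(t_data_samples, margin=0.05):
    """Smallest fixed control delay that exceeds every sampled data delay.

    margin is a hypothetical safety factor, not a value from the slides.
    """
    return max(t_data_samples) * (1.0 + margin)


def mean_slack(t_control, t_data_samples):
    """Average time spent waiting after the data path has already settled."""
    return sum(t_control - t for t in t_data_samples) / len(t_data_samples)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Normalized data delays, assumed uniform over the -60% .. +100% spread.
t_data = [rng.uniform(0.4, 2.0) for _ in range(1000)]
t_control = delay_line_setting(t_data)
print(f"TControl = {t_control:.2f}, "
      f"mean slack = {mean_slack(t_control, t_data):.2f}")
```

Because a fixed delay line must cover the slowest case, most instances in this model wait well beyond their actual data delay.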
16
Outline
- Motivation
- Sense Amplifier-based Pass Transistor Logic (SAPTL)
- Self-timed SAPTL
– Bundled-data self-timed SAPTL design
– Dual-rail self-timed SAPTL design
- Simulation results
– Area
– Energy-delay
– Leakage
- Conclusion
17
Dual-Rail Self-Timed SAPTL Design
[Figure: dual-rail self-timed SAPTL block diagram. Labels: data inputs Din (true and complement), driver, stack, sense amplifier, Dout, delay, two C-elements, Reqin, Reqout, Ackin, Ackout.]
Use stack outputs as trigger signals for sense amplifier
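Triggering from the stack outputs relies on the data carrying its own validity. A minimal sketch of that idea, assuming the conventional dual-rail encoding in which both rails low means "no data yet" and exactly one rail high means valid data; the function names and the 0/1 abstraction are illustrative and not taken from the slides.

```python
def decode_dual_rail(s, s_bar):
    """Interpret a pair of stack outputs under the assumed encoding."""
    if (s, s_bar) == (0, 0):
        return "reset"    # nothing has been steered to either rail yet
    if (s, s_bar) == (1, 0):
        return 1
    if (s, s_bar) == (0, 1):
        return 0
    return "invalid"      # both rails high should not occur


def sense_trigger(s, s_bar):
    """Fire the sense amplifier only once exactly one rail is asserted."""
    return bool(s) != bool(s_bar)


for rails in [(0, 0), (1, 0), (0, 1)]:
    print(rails, decode_dual_rail(*rails), sense_trigger(*rails))
```

Since the trigger is derived from the data rails themselves, no separately matched delay is needed in this abstraction.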
18
Dual-Rail Self-Timed SAPTL
[Figure: dual-rail self-timed SAPTL block diagram. Labels: data inputs Din (true and complement), driver, stack, two C-elements, Dout, Reqin, Reqout, Ackin, Ackout.]
- No delay matching issue anymore
- C-element combines several functions
– Protocol control
– Signal recovery
- Hardware complexity similar to the original synchronous SAPTL design
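For readers unfamiliar with the C-element, its textbook behavior is easy to state in code. This sketch models the standard two-input Muller C-element only; the C-elements in the dual-rail SAPTL design also perform signal recovery, which is an analog function this logical model does not attempt to represent.

```python
class CElement:
    """Two-input Muller C-element: the output follows the inputs when
    they agree and holds its previous value when they differ."""

    def __init__(self, initial=0):
        self.output = initial

    def update(self, a, b):
        if a == b:
            self.output = a
        return self.output


c = CElement()
print(c.update(1, 0))  # 0: inputs disagree, output holds
print(c.update(1, 1))  # 1: both high, output rises
print(c.update(0, 1))  # 1: inputs disagree, output holds
print(c.update(0, 0))  # 0: both low, output falls
```

The hold behavior is what makes the element useful for handshaking: the output changes only after both parties have moved to the same state.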
19
Outline
- Motivation
- Sense Amplifier-based Pass Transistor Logic (SAPTL)
- Self-timed SAPTL
– Bundled-data self-timed SAPTL design
– Dual-rail self-timed SAPTL design
- Simulation results
– Area
– Energy-delay
– Leakage
- Conclusion
20
SAPTL5 Area Comparison in 90-nm CMOS
[Figure: layouts of the three designs. Labels: stack, sense amplifier, driver, delay line, C-element, AND.]
- Synchronous SAPTL: 20 µm x 18 µm
- Bundled-data SAPTL: 25 µm x 19 µm (~32% larger area)
- Dual-rail SAPTL: 20 µm x 21 µm (~17% larger than synchronous SAPTL)
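The quoted area overheads follow directly from the layout dimensions. As a quick arithmetic check, with the dimensions read as micrometers and the dictionary keys chosen here for illustration:

```python
# Layout dimensions from the slide, as (width, height).
dimensions = {
    "synchronous": (20, 18),
    "bundled-data": (25, 19),
    "dual-rail": (20, 21),
}

areas = {name: w * h for name, (w, h) in dimensions.items()}


def overhead_percent(name):
    """Area increase relative to the synchronous design, in percent."""
    return 100.0 * (areas[name] / areas["synchronous"] - 1.0)


for name in ("bundled-data", "dual-rail"):
    print(f"{name}: {areas[name]} vs {areas['synchronous']} "
          f"-> {overhead_percent(name):.1f}% larger")
```

The areas are 360, 475, and 420 square units, giving about 31.9% and 16.7%, consistent with the rounded ~32% and ~17% on the slide.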
21
Energy-Delay Simulations
[Figure: SAPTL5 energy-delay plot for the synchronous, bundled-data, and dual-rail designs. Axes: energy [fJ] and delay [FO4 (@VDD = 1V)], both on log scales. Annotations: VDD = 300mV, VDD = 1V, 100 FO4.]
Note: Synchronous SAPTL plot excludes the clock distribution energy
- Synchronous & bundled-data must slow down to account for process variations
- Early reset of self-timed SAPTL reduces latency and energy
22
Leakage Simulations
[Figure: SAPTL5 leakage current plot for the synchronous, bundled-data, and dual-rail designs. Axes: supply voltage [V] from 0.2 to 1, leakage current [nA] on a log scale. Annotations: VDD = 300mV, VDD = 1V.]
Note: Synchronous SAPTL plot excludes the clock distribution leakage
- Dual-rail design requires larger C-elements and thus suffers higher leakage
- Leakage difference diminishes with lower supply voltages
23
Conclusion
- SAPTL is a promising candidate for low energy computation at low supply voltage
- Self-timed operation improves reliability and energy-delay performance without increasing hardware complexity
- Dual-rail self-timed SAPTL achieves better energy and speed performance in technologies with process variations
24
Acknowledgement
- Marly Roncken and Ivan Sutherland
- Berkeley Wireless Research Center
- NSF Infrastructure Grant
- STMicroelectronics
- Gigascale Systems Research Center