Asynchronous Computing in Sense Amplifier-based Pass Transistor Logic



SLIDE 1

Asynchronous Computing in Sense Amplifier-based Pass Transistor Logic

Tsung-Te Liu, Louis P. Alarcón, Matthew D. Pierson, and Jan M. Rabaey University of California, Berkeley

SLIDE 2

Outline

  • Motivation
  • Sense Amplifier-based Pass Transistor Logic (SAPTL)
  • Self-timed SAPTL
    – Bundled-data self-timed SAPTL design
    – Dual-rail self-timed SAPTL design
  • Simulation results
    – Area
    – Energy-delay
    – Leakage
  • Conclusion

SLIDE 3

Focus: Ultra Low Voltage Design

Voltage scaling:

  • Most effective technique for power reduction
  • But at low supply voltage
    – Leakage dominates
    – Delay is large and variable

Goal of Ultra Low Voltage Design: < 300mV

[Figure: Supply voltage vs. CMOS technology node [ITRS]. Axes: channel length L (µm), supply voltage VDD (V). Regions labeled High Performance and Low Power.]

SLIDE 4

Issue-1: Leakage dominates at low supply voltage

  • Leakage dominates the power and energy at low VDD
  • Leakage determines standby power for portable devices
  • 203-stage NAND2 ring oscillator
  • 90-nm CMOS standard cell

[Figure: Energy vs. Supply voltage. Axes: energy [fJ], supply voltage [V].]

[JOLPE]
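
A rough sketch of why leakage can take over at low VDD, using the 203-stage ring oscillator above as the setting. The current model (an EKV-style smooth interpolation), the threshold voltage, the slope factor, and the gate-delay estimate C·VDD/I_on are all assumptions chosen for illustration; none of them come from the measured data.

```python
import math

# All device parameters below are illustrative assumptions.
THERMAL_V = 0.026   # kT/q near room temperature, in volts
SLOPE_N = 1.5       # subthreshold slope factor (assumed)
V_TH = 0.35         # threshold voltage in volts (assumed)
STAGES = 203        # ring-oscillator length quoted on the slide


def drive(v_gs: float) -> float:
    """Normalized drain current from an EKV-style smooth interpolation."""
    x = (v_gs - V_TH) / (2.0 * SLOPE_N * THERMAL_V)
    return math.log1p(math.exp(x)) ** 2


def leak_to_dynamic_ratio(vdd: float) -> float:
    """Leakage energy divided by dynamic energy over one oscillation period.

    Assumes gate delay ~ C*VDD/I_on, period = 2*STAGES*delay, every stage
    leaking I_off for the whole period, and C*VDD^2 of dynamic energy per
    stage per period. The capacitance C cancels out of the ratio.
    """
    i_on = drive(vdd)
    i_off = drive(0.0)
    return 2.0 * STAGES * i_off / i_on


for vdd in (1.0, 0.6, 0.4, 0.3, 0.2):
    print(f"VDD = {vdd:.1f} V  leakage/dynamic = {leak_to_dynamic_ratio(vdd):.4f}")
```

In this toy model the ratio grows as VDD drops because the on-current falls exponentially near and below threshold, so each cycle takes longer while the off-current keeps flowing.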

SLIDE 5

Issue-2: Delay is large & variable at low supply voltage

  • Supply = 300mV
    – Delay spread due to transistor mismatches: -10% ~ +15%
    – Delay spread due to process variations: -60% ~ +100%

[Figure: two histograms of count vs. normalized delay, one per source of delay spread.]

[Figure: Delay vs. Supply Voltage. Axes: delay [s], supply voltage [V].]

  • Delay increases exponentially at low VDD
  • Large delay spread in actual technology environment
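
One way to see why the spread is skewed toward longer delays: a minimal sketch, assuming the gates operate in subthreshold at 300 mV so that delay scales as exp(Vth / (n·vT)). The slope factor n and the ±25 mV threshold shifts are illustrative assumptions, not figures from the slides.

```python
import math

THERMAL_V = 0.026  # kT/q near room temperature, in volts
SLOPE_N = 1.5      # subthreshold slope factor (assumed)


def relative_delay_change(delta_vth: float) -> float:
    """Fractional delay change for a threshold shift of delta_vth volts.

    Assumes subthreshold operation, where delay ~ 1/I_on and
    I_on ~ exp((VDD - Vth) / (n * vT)).
    """
    return math.exp(delta_vth / (SLOPE_N * THERMAL_V)) - 1.0


for shift in (-0.025, 0.025):  # hypothetical threshold shifts in volts
    change = relative_delay_change(shift)
    print(f"dVth = {shift * 1e3:+.0f} mV -> delay change {change:+.0%}")
```

Equal and opposite threshold shifts give unequal delay changes under this model, with the slow side larger, which is the same shape as the asymmetric ranges quoted above.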

SLIDE 6

Outline

  • Motivation
  • Sense Amplifier-based Pass Transistor Logic (SAPTL)
  • Self-timed SAPTL
    – Bundled-data self-timed SAPTL design
    – Dual-rail self-timed SAPTL design
  • Simulation results
    – Area
    – Energy-delay
    – Leakage
  • Conclusion

SLIDE 7

Solution: Use pass transistor logic!

[Figure: block diagram labeled drivers, Pass Transistor Network, receivers.]

  • No VDD and GND connections in the logic path
    – Nearly zero leakage through the logic path
    – Can keep scaling down the threshold voltage

[JOLPE]

SLIDE 8

SAPTL: Sense Amplifier-based Pass Transistor Logic

  • Leakage path only exists in the root node driver and sense amplifier
  • Sense amplifier to recover voltage swing and reduce delay
  • Pass transistor network

[Figure: block diagram labeled Stack, Sense Amplifier, Timing control (CK), Data inputs, Data outputs.]

[JOLPE]

SLIDE 9

SAPTL Computation: to contain leakage while reducing threshold

  • Current steering
  • Works for very low Ion to Ioff ratio
  • Could be programmable
  • Outputs pre-charge to VDD to reset
  • Latch retains the computation results
  • Low voltage operation down to 300mV

[Figure: Stack and Sense Amplifier schematics. Labels: Root Input, A, B, S, P0, CK, Out; annotations "to sense amp" and "to next SAPTL".]

[JOLPE]

SLIDE 10

SAPTL Control: to deal with delay variation & uncertainty

Fact:

  • Computation delay varies in actual technology environment
    – Ion to Ioff current ratio varies and changes the stack delay
    – Offset variations in the sense amplifier reduce the stack output margins
  • Energy efficiency is a function of timing accuracy
    – Turn off the sense amplifier when inactive
  • Synchronous timing is too conservative
    – Unable to track local delay variations
    – Additional skew and jitter in the clock distribution

Solution: Use Self-timed control!

    – Exploit local timing information
    – Low implementation cost in SAPTL

[Figure: block diagram labeled Stack, Sense Amplifier, Timing control, Data inputs, Data outputs.]

SLIDE 11

Outline

  • Motivation
  • Sense Amplifier-based Pass Transistor Logic (SAPTL)
  • Self-timed SAPTL
    – Bundled-data self-timed SAPTL design
    – Dual-rail self-timed SAPTL design
  • Simulation results
    – Area
    – Energy-delay
    – Leakage
  • Conclusion

SLIDE 12

Data Evaluation in Bundled-Data Self-Timed SAPTL

[Figure: block diagram labeled Driver (Root), Stack, Sense Amplifier, Timing control with Delay and C-element (C). Signals: data inputs Din and its complement, Sout and its complement, Dout and its complement, Reqin, Ackin, Enable, Ackout, Reqout.]

Relative timing assumptions (RTA):

  • RTA1: Din < Reqin
  • RTA2: Sout < Enable

Event sequence:

  • 1. Din (H, L)
  • 2. Reqin
  • 3. Sout
  • 4. Enable
  • 5. Dout (H, L)
  • 6. Ackout & Reqout
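
The two relative timing assumptions can be read as ordering checks on event arrival times. The sketch below is only an illustration of that reading: the function name and the arrival times (arbitrary units) are hypothetical, and only the two orderings themselves come from the slide.

```python
from typing import Dict, List, Tuple

# Relative timing assumptions from the slide, as (earlier event, later event).
RTAS: Dict[str, Tuple[str, str]] = {
    "RTA1": ("Din", "Reqin"),
    "RTA2": ("Sout", "Enable"),
}


def violated_rtas(arrival: Dict[str, float]) -> List[str]:
    """Return the names of the assumptions that a set of arrival times breaks."""
    return [
        name
        for name, (earlier, later) in RTAS.items()
        if not arrival[earlier] < arrival[later]
    ]


# Hypothetical arrival times in arbitrary units, following steps 1 to 6.
good = {"Din": 0.0, "Reqin": 1.0, "Sout": 2.0, "Enable": 3.0, "Dout": 4.0, "Ackout": 5.0}
# Same trace, but the sense amplifier is enabled before the stack output arrives.
bad = dict(good, Enable=1.5)

print(violated_rtas(good))  # []
print(violated_rtas(bad))   # ['RTA2']
```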

SLIDE 13

Data Reset in Bundled-Data Self-Timed SAPTL

  • 1. Din (H, L)
  • 2. Sout: unwanted GLITCH!

Issue: Opposite stack output charges up if Reqin is still low

  • Degrades the reliability of the self-timed operation
  • Increases the energy dissipation

[Figure: block diagram labeled Driver (Root), Stack, Sense Amplifier, Delay, C-element (C). Signals: data inputs Din and its complement, Sout, Dout, Reqin, Ackin, Enable, Ackout, Reqout; annotation "Logical low".]

SLIDE 14

Solution to Avoid Data Reset Glitch

Use different handshake protocols

  • Use Din < Reqin for data evaluation
  • Use Reqin < Din for data reset

Implementation-1: Late reset

  • Increase data delay: delay data input signals
  • Easy to implement
  • But... retards the data reset operation

Implementation-2: Early reset (Our choice!)

  • Decrease control delay: trigger driver before data inputs reset
  • Maintains the original data reset latency and throughput
  • But... requires an additional relative timing assumption, which is easy to meet

SLIDE 15

Glitch-Free Bundled-Data Self-Timed SAPTL

  • Replace the global clock signal with local handshake logic
  • Delay line controls the local timing
  • Still have a delay-matching issue (TControl > TData)

[Figure: block diagram labeled Driver, Stack, Sense Amplifier, C-element (C), Delay. Signals: data inputs Din and its complement, Dout and its complement, Reqin, Ackin, Reqout, Ackout; timing paths TControl and TData.]
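
The delay-matching constraint TControl > TData can be written as a simple margin check. This is a sketch under stated assumptions: the function name and the nominal delays are hypothetical, and the slow-corner spread is applied to the data path only, which is the pessimistic case where the delay line does not track the stack. The +100% value is the process-variation spread quoted earlier for a 300 mV supply.

```python
def bundling_holds(t_control: float, t_data_nominal: float, slow_spread: float) -> bool:
    """True if the control path still outlasts the slowest expected data path."""
    return t_control > t_data_nominal * (1.0 + slow_spread)


T_DATA = 1.0       # nominal stack delay in arbitrary units (hypothetical)
SLOW_SPREAD = 1.0  # +100% slow-corner spread

print(bundling_holds(1.2 * T_DATA, T_DATA, SLOW_SPREAD))  # False
print(bundling_holds(2.2 * T_DATA, T_DATA, SLOW_SPREAD))  # True
```

Under these assumptions a delay line with a 20% margin over the nominal data delay is not enough; it has to be sized for the slow corner, which costs latency whenever the data path is actually faster.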

SLIDE 16

Outline

  • Motivation
  • Sense Amplifier-based Pass Transistor Logic (SAPTL)
  • Self-timed SAPTL
    – Bundled-data self-timed SAPTL design
    – Dual-rail self-timed SAPTL design
  • Simulation results
    – Area
    – Energy-delay
    – Leakage
  • Conclusion

SLIDE 17

Dual-Rail Self-Timed SAPTL Design

Use stack outputs as trigger signals for sense amplifier

[Figure: block diagram labeled Driver, Stack, Sense Amplifier, Delay, and two C-elements (C). Signals: data inputs Din and its complement, Dout and its complement, Reqin, Ackin, Reqout, Ackout.]

SLIDE 18

Dual-Rail Self-Timed SAPTL

[Figure: block diagram labeled Driver, Stack, and two C-elements (C). Signals: data inputs Din and its complement, Dout and its complement, Reqin, Ackin, Reqout, Ackout.]

  • No delay matching issue anymore
  • C-element combines several functions
    – Protocol control
    – Signal recovery
  • Hardware complexity similar to the original synchronous SAPTL design
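
For reference, the textbook behavior of a two-input Muller C-element can be modeled in a few lines. This is a generic behavioral sketch, not the transistor-level cell used in this design, and it does not model the signal-recovery role; the class and method names are made up for illustration.

```python
class CElement:
    """Behavioral model of a two-input Muller C-element."""

    def __init__(self, initial: int = 0) -> None:
        self.out = initial

    def update(self, a: int, b: int) -> int:
        # The output follows the inputs only when they agree; otherwise it holds.
        if a == b:
            self.out = a
        return self.out


c = CElement()
trace = [c.update(a, b) for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]]
print(trace)  # [0, 1, 1, 0]
```

The hold behavior is what lets a C-element wait for two handshake events before moving on, which is why it is a natural fit for protocol control.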

SLIDE 19

Outline

  • Motivation
  • Sense Amplifier-based Pass Transistor Logic (SAPTL)
  • Self-timed SAPTL
    – Bundled-data self-timed SAPTL design
    – Dual-rail self-timed SAPTL design
  • Simulation results
    – Area
    – Energy-delay
    – Leakage
  • Conclusion

SLIDE 20

SAPTL5 Area Comparison in 90-nm CMOS

  • Synchronous SAPTL: 20 µm x 18 µm
  • Bundled-data SAPTL: 25 µm x 19 µm (~32% larger area)
  • Dual-rail SAPTL: 20 µm x 21 µm (~17% larger than synchronous SAPTL)

[Figure: cell layouts. Labels: Stack, Sense Amplifier, Driver, Delay Line, C-Element, AND.]
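
The quoted overheads can be checked directly from the cell dimensions. A small worked example, taking the dimensions as micrometres (the unit is garbled in the extracted text) and the synchronous cell as the baseline; the function names are illustrative.

```python
# Cell dimensions (width, height), taken to be in micrometres.
CELLS = {
    "synchronous": (20, 18),
    "bundled-data": (25, 19),
    "dual-rail": (20, 21),
}


def area(name: str) -> int:
    width, height = CELLS[name]
    return width * height


def overhead_percent(name: str, baseline: str = "synchronous") -> float:
    """Area increase relative to the baseline cell, in percent."""
    return 100.0 * (area(name) - area(baseline)) / area(baseline)


for name in ("bundled-data", "dual-rail"):
    print(f"{name}: {area(name)} um^2, {overhead_percent(name):.1f}% larger")
```

This gives 475 µm² (31.9% larger) for bundled-data and 420 µm² (16.7% larger) for dual-rail against 360 µm² for the synchronous cell, consistent with the rounded figures on the slide.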

SLIDE 21

Energy-Delay Simulations

[Figure: SAPTL5 Energy-Delay Plot. Axes: energy [fJ], delay [FO4 (@VDD = 1V)]. Curves: Synchronous, Bundled-Data, Dual-Rail, each spanning VDD = 1V to VDD = 300mV; a 100 FO4 marker is shown.]

Note: Synchronous SAPTL plot excludes the clock distribution energy

  • Synchronous & bundled-data must slow down to account for process variations
  • Early reset of self-timed SAPTL reduces latency and energy

SLIDE 22

Leakage Simulations

[Figure: SAPTL5 Leakage Current Plot. Axes: leakage current [nA], supply voltage [V]. Curves: Synchronous, Bundled-Data, Dual-Rail, spanning VDD = 300mV to VDD = 1V.]

  • Dual-rail design requires larger C-elements and thus suffers higher leakage
  • Leakage difference diminishes with lower supply voltages

Note: Synchronous SAPTL plot excludes the clock distribution leakage

SLIDE 23

Conclusion

  • SAPTL is a promising candidate for low energy computation at low supply voltage
  • Self-timed operation improves reliability and energy-delay performance without increasing hardware complexity
  • Dual-rail self-timed SAPTL achieves better energy and speed performance in technologies with process variations

SLIDE 24

Acknowledgement

  • Marly Roncken and Ivan Sutherland
  • Berkeley Wireless Research Center
  • NSF Infrastructure Grant
  • STMicroelectronics
  • Gigascale Systems Research Center

Thank you!