Pushing Ultra-Low-Power Digital Circuits into the Nanometer Era
David Bol
Microelectronics Laboratory
Ph.D. public defense
December 16, 2008
High-performance circuits: performance 10 GOp/s, power < 100 W
Low-power circuits: performance 1 GOp/s, power < 1 W
Ultra-low-power circuits: performance 10 kOp/s to 10 MOp/s, power < 1 µW
Ultra-low-power applications: RFID tags, wearable electronics, hearing aids and biomedical devices, sensor networks, Smart Dust [Berkeley]
[Intel]
Moore’s law without technology scaling
Moore’s law with technology scaling
[Figure: clock frequency (1 MHz to 1 GHz) versus transistor count (10^4 to 10^9), with technology nodes from 130 nm to 45 nm (2008).]
[Figure: technology node [nm] (100 to 400) versus year (1998 to 2008).]
Latest chips [IEEE ISSCC’08]:
Ultra-low-power 0.3 V µC for biomedical applications, 65 nm [Kwong]
Ultra-low-power 0.32 V motion estimator, 65 nm [Kaul]
[Figure label: ITRS]
[Figure: CMOS inverter (Vdd, IN, OUT, load capacitance CL) switching between ‘0’ and ‘1’; labels: kg, Ion.]
[Figure: 8-bit RCA multiplier in 130nm technology. Minimum Vdd [V] (0.5 to 1.5) and power [W] versus throughput [Op/s] (10^4 to 10^9), with the functional limit and the speed limit marked. Curves: frequency scaling, frequency/voltage scaling.]
$P_{dyn} \sim f_{clk} \times C_L \times V_{dd}^2$
ULP applications = subthreshold logic
[Figure: same plot for the 8-bit RCA multiplier in 130nm technology, with static power added and the ULP application range marked at low throughput.]
$P_{dyn} \sim f_{clk} \times C_L \times V_{dd}^2$
$P_{stat} = V_{dd} \times I_{leak}$
ULP applications = subthreshold logic
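The two power expressions above can be tried out numerically. The sketch below is only an illustration: the capacitance, leakage current and operating points are made-up values, the activity factor is taken as 1, and the leakage current is treated as independent of Vdd, which the slides do not claim.

```python
# Illustrative power model; all numeric values are hypothetical.

def dynamic_power(f_clk, c_load, vdd):
    """P_dyn ~ f_clk * C_L * Vdd^2 (activity factor assumed to be 1)."""
    return f_clk * c_load * vdd ** 2


def static_power(vdd, i_leak):
    """P_stat = Vdd * I_leak (I_leak assumed constant here)."""
    return vdd * i_leak


C_LOAD = 1e-12   # switched capacitance [F], assumed
I_LEAK = 1e-9    # leakage current [A], assumed

# Nominal operating point: 100 MHz at 1.2 V.
nominal = dynamic_power(100e6, C_LOAD, 1.2)

# Frequency scaling only: 10x lower clock, same supply.
freq_only = dynamic_power(10e6, C_LOAD, 1.2)

# Frequency/voltage scaling: 10x lower clock and half the supply.
freq_and_volt = dynamic_power(10e6, C_LOAD, 0.6)

print(f"frequency scaling:         {freq_only / nominal:.3f} x nominal P_dyn")
print(f"frequency/voltage scaling: {freq_and_volt / nominal:.3f} x nominal P_dyn")
print(f"static power at 0.6 V:     {static_power(0.6, I_LEAK):.2e} W")
```

With these numbers, lowering the clock alone cuts dynamic power by 10x, while halving Vdd as well cuts it by 40x, which is the reason for scaling the voltage together with the frequency.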
[Figure: 8-bit RCA multiplier in 130nm technology. Minimum Vdd [V] (0.5 to 1.5) and energy per operation [J] versus throughput [Op/s] (10^4 to 10^9) under frequency/voltage scaling, showing Edyn, Estat, the minimum-energy point Emin, the functional limit, the speed limit and the ULP application range.]
$E_{dyn} \sim C_L V_{dd}^2$
ULP applications = subthreshold logic
[Figure: MOSFET cross-section (gate, source, drain) with gate-oxide thickness Tox, gate length L and width W, before and after scaling.]
$T_{ox}, L, W \sim 1/S$
Speed
Reduce Vdd
[Figure label: kg]
$E_{dyn} \sim C_L V_{dd}^2$
$E_{stat} \sim I_{leak} V_{dd}^2$
[Figure: MOSFET (gate, source, drain) in 130nm technology with continuous doping and straight line edges, compared with 45nm technology with discrete dopants and rough line edges.]
[Figure: 8-bit RCA multiplier, 130nm versus 45nm. Minimum Vdd [V] (0.5 to 1.5) and energy per operation [J] versus throughput [Op/s] (10^4 to 10^9), with the functional limit, the speed limit and the Edyn curves for both nodes.]
Variability
[Figure: same 130nm versus 45nm comparison for the 8-bit RCA multiplier, with Estat added to Edyn and the ULP application range marked.]
[Figure: energy per operation versus throughput, with Emin and the points marked 1 and 2.]
Famous Intel co-founder Scale,
scale,
45nm technology flavors: Low-Power (LP) and High-Performance/General-Purpose (GP)
[Figure: 8-bit RCA multiplier in 45 nm technology. Minimum Vdd [V] (0.2 to 0.8) and energy per operation [J] versus throughput [Op/s] (10^4 to 10^8) for the LP and GP flavors, including high-Vt curves, with the ULP application range marked.]
[Figure: paths between clocked registers. First version: all std-Vt. Second version: high-Vt gates on the non-critical paths, std-Vt elsewhere.]
[Figure: maximum N versus Vdd [V] (0.2 to 1.2), typical and with variability. Data labels: 2, 3, 19, 11, 7, 8, 32, 27.]
[Figure: std-Vt multiplier (A, B, OUT) as critical path next to a chain of N high-Vt gates (IN, OUT).]
Inefficient
[Figure: energy per operation at the target throughput, actual versus model: +40% and +90%. A second version adds the adaptive (adapt.) case.]
[Figure: adaptation to the target throughput (actual versus adapt.), using Vdd, VBB and Vdd-VBB.]
[Figure: 8-bit benchmark multiplier in 45 nm LP technology. Minimum Vdd [V] (0.2 to 0.5) for ASV (VBB = 0 V) and minimum VBB [V] (0.3 to 0.6) for ABB (Vdd = 0.35 V), on a 0.1 to 10 logarithmic axis, with a second axis from 0.8 to 2; +70% marked; regions labeled "ABB better" and "ASV better".]
Reverse body bias is fine in 45 nm LP technology.
Problem in 45 nm GP!
What about 32 nm?
[Hanson, IEEE TED, pp. 175-185, 2008]: 90nm, 45nm
[Figure: Emin [fJ] (10 to 60) versus technology node (130nm, 90nm, 65nm, 45nm).]
New effects in nanometer technologies, in all flavors
[Figure: Emin [fJ] (10 to 60) versus technology node (130nm, 90nm, 65nm, 45nm), with $C_L S^2$ marked.]
[Figure: Emin [fJ] (5 to 30) for bulk, optimized bulk (low Vt + long Lg) and undoped FD SOI, broken down by effect: Var., Igate, DIBL, Sshort, Slong.]
New effects: Igate, DIBL, $C_L S^2$
FD SOI
[Figure: FD SOI MOSFET cross-section (gate, source, drain) with undoped channel on buried oxide and substrate; Igate marked.]
Low variability, low Cj, low S, mid DIBL
Optimum bulk MOSFET
FD SOI
[Figure: Ileak [A/µm] (logarithmic axis) versus Vdd [V] (0.2 to 1) at 25 °C and 200 °C, for a transistor (D, G, S) in 130nm PD SOI technology.]
Leakage x 100
Low-leakage SOI technology: 1 or 0.5 µm, 5 or 3.3 V
Standard SOI technology: 0.13 µm, 1 V
ULP transistor
[Figure: Ileak [A/µm] (logarithmic axis) versus Vdd [V] (0.2 to 1) at 25 °C and 200 °C, for the ULP transistor (D, G, S, internal node X, Vgs < 0) in 130nm PD SOI technology.]
ULP logic style
[Figure: VOUT [V] versus VIN [V] (0.2 to 1) for rising and falling input, showing hysteresis, with VX1 and VX2.]
[Figure: ULP inverter schematic (IN, OUT, VDD, GND, internal nodes X1 and X2) and its layout in SOI.]
ULP logic style at 200° C:
[ITRS07]
Reaching Emin / Reducing Emin
Technology level:
  Low CL, S, DIBL, variability (I0)
  Igate, Ijunc < Isub @ 0.3-0.4 V
  Single device type for all logic gates
  I0 tuning
  Relaxed constraints
Circuit level:
  Sleep-mode technique
  Stand-by / active mode operation, with coarse granularity or with fine granularity
Node: 130 / 90 nm, 65 / 45 nm, 32 / 22 nm
Applications: standard ULP applications, ULP mode in LP applications, high-temperature ULP industrial applications
Subthreshold logic @ GP flavor
ULP logic style @ GP flavor
Subthreshold logic + adapt. dual-BG bias @ dedicated flavor
Subthreshold logic @ HP/GP flavor
Subthreshold logic @ LP flavor
Architectural techniques (//, pipe) for meeting throughput constraint
Issues: performance issues, reliability issues, economic issues
Acknowledgements:
by FNRS and Walloon region of Belgium.
[Figure: 8-bit RCA multiplier in 0.13 µm technology. Energy per operation [J] (logarithmic axis) versus Vdd [V] (0.2 to 1.2), with annotations: sub-Vt, Tdel increase, Emin.]
$E_{dyn} \sim C_L V_{dd}^2$
$E_{stat} = V_{dd} \times I_{leak} \times T_{del}$
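The interplay between Edyn and Estat that produces Emin can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not the model used in the thesis: it takes a purely exponential subthreshold on-current, a delay proportional to C_L Vdd / Ion, and a hypothetical constant `LEAK_RATIO` that lumps logic depth and the number of idle gates. All parameter values are invented, and the sweep starts at an assumed functional limit of 0.15 V.

```python
import math

# All parameters below are hypothetical, chosen only for illustration.
N_SLOPE = 1.5         # subthreshold slope factor (assumed)
U_T = 0.02585         # thermal voltage kT/q near 300 K [V]
C_L = 1e-12           # switched capacitance per operation [F] (assumed)
I_0 = 1e-9            # current of the switching gate at Vgs = 0 [A] (assumed)
LEAK_RATIO = 1000.0   # total leakage relative to I_0 (assumed)


def i_on(vdd):
    """Subthreshold on-current at Vgs = Vdd (exponential model)."""
    return I_0 * math.exp(vdd / (N_SLOPE * U_T))


def t_del(vdd):
    """Time per operation, taken as C_L * Vdd / I_on."""
    return C_L * vdd / i_on(vdd)


def e_dyn(vdd):
    """E_dyn ~ C_L * Vdd^2."""
    return C_L * vdd ** 2


def e_stat(vdd):
    """E_stat = Vdd * I_leak * T_del."""
    i_leak = LEAK_RATIO * I_0
    return vdd * i_leak * t_del(vdd)


def e_total(vdd):
    return e_dyn(vdd) + e_stat(vdd)


# Sweep Vdd from an assumed functional limit (0.15 V) up to 0.60 V.
vdds = [0.15 + 0.005 * i for i in range(91)]
v_min = min(vdds, key=e_total)

print(f"Vdd at Emin: {v_min:.3f} V")
print(f"Emin:        {e_total(v_min):.3e} J")
print(f"Edyn share:  {e_dyn(v_min) / e_total(v_min):.0%}")
```

Below the minimum, Tdel grows exponentially as Vdd drops, so Estat dominates; above it, the quadratic Edyn term dominates. The exponential current model is only meaningful in the subthreshold region, which is why the sweep stops at 0.60 V.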