- Prof. Jörg Henkel, Low Power Design, SS2014 ces.itec.kit.edu
Low Power Design
- Prof. Dr. J. Henkel
CES - Chair for Embedded Systems KIT, Germany
- V. Thermal Aspects of Low Power Design, Part 1
2
- Portable Systems
  - Notebooks, palm-tops, PDA, cellular phones, pagers, etc.
    - 32% of PC market, and growing
  - Battery-driven: long battery life crucial
  - System cost, weight limited by batteries
    - 40 W, 10 hrs @ 20-35 W-hr/pound = 7-20 pounds
    - Slow growth in battery technology
- Must reduce energy drain from batteries
- Thermal Considerations
  - 10 °C increase in operating temperature => component failure rate doubles
  - Packaging: ceramic vs. plastic
  - Cooling requirements
- Increasing levels of integration / clock frequencies make the problem worse
  - 10 cm^2, 500 MHz => 315 Watts
- Reliability Issues
  - Electro-migration
  - IR drops on supply lines
  - Inductive effects
- Tied to peak/average power consumption
- Environmental Concerns
  - EPA estimate: 80% of office equipment electricity is used in computers
  - “Energy Star” program to recognize power efficient PCs
  - Power management standard for desktops and laptops
- Drive towards “Green PC”
LOW POWER (Src: A. Raghunathan, NEC)
3
(Src: F. Pollack, Intel)
4
- Activity migration, DVFS, task scheduling, task mapping
- Multi-core architectures, 3D architectures
5
$\Delta Q = c \cdot \Delta T$

Material   c
Silica     1.55
Cu         3.45
H2O        4.17
Air        0.0012
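As a small worked example of the heat capacity relation, the sketch below computes the heat needed to warm a block of each material. The slide does not state the unit of c; the values are treated here as volumetric heat capacities in J/(cm^3*K), which is an assumption, and the explicit volume factor and the function name `heat_required` are likewise introduced only for illustration.

```python
# Heat capacities c from the slide's table. The unit is not given on the
# slide; J/(cm^3*K) (volumetric heat capacity) is an assumption.
HEAT_CAPACITY = {"Silica": 1.55, "Cu": 3.45, "H2O": 4.17, "Air": 0.0012}


def heat_required(c, volume_cm3, delta_t):
    """Return dQ = c * V * dT in joules (reduces to c * dT for unit volume)."""
    return c * volume_cm3 * delta_t


if __name__ == "__main__":
    for material, c in HEAT_CAPACITY.items():
        q = heat_required(c, volume_cm3=1.0, delta_t=10.0)
        print(f"{material:7s} {q:8.3f} J to heat 1 cm^3 by 10 K")
```

Under this assumption, water and copper store far more heat per unit volume than air, which is why an air gap heats up almost immediately.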
6
Thermal conductivity k (W/(m·K)) is a material constant

Material   k
Silica     1.5
Cu         401
H2O        0.6
Air        0.025
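To make the conductivity values concrete, the following sketch applies the standard steady-state conduction relation for a flat slab, R = t / (k * A), to the materials in the table. The slab geometry (1 mm thick, 1 cm^2 area), the 10 W heat flow and the function names are assumptions chosen for illustration, not values from the lecture.

```python
# Thermal conductivities k in W/(m*K), taken from the slide's table.
CONDUCTIVITY = {"Silica": 1.5, "Cu": 401.0, "H2O": 0.6, "Air": 0.025}


def thermal_resistance(k, thickness_m, area_m2):
    """Conduction resistance of a flat slab in K/W: R = t / (k * A)."""
    return thickness_m / (k * area_m2)


def temperature_drop(power_w, k, thickness_m, area_m2):
    """Steady-state temperature difference across the slab: dT = P * R."""
    return power_w * thermal_resistance(k, thickness_m, area_m2)


if __name__ == "__main__":
    # Hypothetical slab: 1 mm thick, 1 cm^2 cross-section, 10 W flowing through.
    for material, k in CONDUCTIVITY.items():
        drop = temperature_drop(10.0, k, thickness_m=1e-3, area_m2=1e-4)
        print(f"{material:7s} dT = {drop:10.2f} K")
```

The same heat flow causes a temperature drop several hundred times larger across silica than across copper, which is the reason the choice of material in the heat path matters.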
7
[Figure: measured temperature (°C) over time (s); the accompanying equation is not recoverable from the extracted text]
8
“Circuit heat generation is the main limiting factor for scaling of device speed and switch circuit density”
By Jeff Welser, Director SRC Nanoelectronics Research Initiative, IBM, Opening Keynote Address ICCAD 2007
(Src: K. Skadron: Low-Power Design and Temperature Management; IEEE Micro, Vol. 27, No. 6, 2007)
[Figure: MTTF [years] versus temperature (Celsius); plotted values 6.85, 7.24, 7.73, 8.34, 9.06; temperature axis ticks 5, 10, 15, 20, 25]
- Only one layer directly interfaces with the heat sink
- Heat needs to dissipate through multiple layers
- The heat sink is located on top of the chip
- Hot cores distant to the heat sink dissipate their heat through other layers
- Silicon has a low thermal conductivity!
  - 150 W/(m·K) (Silicon)
  - 401 W/(m·K) (Copper)
Scaling by factor S         Vdd power 1/S^2    Vdd power ~1
Device count                S^2                S^2
Device frequency            S                  S
Device power (cap)          1/S                1/S
Device power (Vdd)          1/S^2              ~1
Power density               1                  S^2
(Src: “Dennard Scaling”)
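The power density rows follow from multiplying the four scaling factors listed for each case. A minimal sketch of that arithmetic, assuming the chip area stays constant and using a hypothetical function name `power_density`:

```python
from fractions import Fraction


def power_density(s, vdd_scales):
    """Relative power density after scaling by an integer factor s.

    Multiplies the factors from the table: device count (s^2), frequency (s),
    capacitance term (1/s) and the Vdd term (1/s^2 if the supply voltage
    still scales, otherwise ~1). Constant chip area is assumed.
    """
    device_count = Fraction(s) ** 2
    frequency = Fraction(s)
    capacitance_term = Fraction(1, s)
    vdd_term = Fraction(1, s ** 2) if vdd_scales else Fraction(1)
    return device_count * frequency * capacitance_term * vdd_term


if __name__ == "__main__":
    for s in (2, 4):
        print(s, power_density(s, vdd_scales=True), power_density(s, vdd_scales=False))
```

With a scaling Vdd term the product is 1 for every S; once that term stays at about 1, the product grows as S^2.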
11
Silicon and the End of Multicore Scaling”, in International Symposium on Computer Architecture (ISCA), 2011
utilization wall as a result of
a power greater than their power budgets given as TDP (thermal design power)
density are the ultimate limiting factors, thus determining the amount of Dark Silicon
power limited
Methodological Perspective on Energy Efficient Systems”, in ISLPED, 2012
12
Scaling Limits when Dark Silicon Dominates
Even if there is unlimited parallelism, the speedup is limited by the power constraint => there will be dark silicon
13
Node    Cores   Frequency   Dark cores
65nm    2       < 2GHz
45nm    4       >= 2GHz
32nm    8       > 2GHz
22nm    16      >= 2GHz     8
22nm    16      >= 4GHz     12

Assumption: scaling factor S=2
Tradeoff between #cores and frequency
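The tradeoff between core count and frequency can be sketched with a very simple power model. The assumptions here are not from the lecture: per-core power grows linearly with frequency, and the power budget of 16 arbitrary units is picked so that the two 22nm configurations (8 and 12 dark cores) come out as on the slide. The function name `dark_cores` is hypothetical.

```python
def dark_cores(total_cores, freq_ghz, power_budget, power_per_core_per_ghz=1.0):
    """Number of cores that must stay dark under a fixed power budget.

    Assumes per-core power = power_per_core_per_ghz * freq_ghz (arbitrary
    units), so doubling the frequency halves the number of cores that fit.
    """
    power_per_core = power_per_core_per_ghz * freq_ghz
    active = min(total_cores, int(power_budget // power_per_core))
    return total_cores - active


if __name__ == "__main__":
    for freq in (2.0, 4.0):
        print(f"16 cores @ {freq} GHz: {dark_cores(16, freq, power_budget=16.0)} dark")
```

Under these assumptions, raising the frequency from 2 GHz to 4 GHz leaves 12 instead of 8 of the 16 cores dark.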
14
A 10°C increase in operating temperature cuts product lifetimes in half [http://www.nanowerk.com]
Temperature increase reduces the lifetime of the chip
[Figure: MTTF curve, failure rate over time with the phases infant mortality, useful time and wearout]
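The rule of thumb that a 10°C increase halves product lifetime can be written as a simple exponential. The sketch below is only an illustration of that rule; the function name and the baseline normalisation to 1.0 are assumptions.

```python
def relative_lifetime(delta_t_celsius, halving_step=10.0):
    """Lifetime relative to the baseline after a temperature rise.

    Implements the rule of thumb: every `halving_step` degrees of additional
    operating temperature cut the lifetime in half.
    """
    return 0.5 ** (delta_t_celsius / halving_step)


if __name__ == "__main__":
    for rise in (0, 10, 20, 30):
        print(f"+{rise:2d} C -> {relative_lifetime(rise):.3f} x baseline lifetime")
```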
15
- Basic mean time to failure modeled by Black’s equation [wikipedia]:
  $\mathrm{MTTF} = A \cdot J^{-n} \cdot e^{Q/(kT)}$
- MTTF decreases exponentially with temperature
- Goal: reduce peak temperatures
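A short numerical sketch of Black's equation, MTTF = A * J^(-n) * exp(Q / (k * T)), shows how strongly temperature enters. The activation energy Q = 0.7 eV, the exponent n = 2 and the temperatures used below are illustrative assumptions, not values from the lecture; the function names are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mttf(a, current_density, n, q_ev, temp_kelvin):
    """Black's equation: MTTF = A * J**(-n) * exp(Q / (k * T))."""
    return a * current_density ** (-n) * math.exp(q_ev / (BOLTZMANN_EV_PER_K * temp_kelvin))


def mttf_ratio(t_cool_k, t_hot_k, q_ev):
    """MTTF(t_cool) / MTTF(t_hot) for equal A, J and n (those factors cancel)."""
    return math.exp(q_ev / BOLTZMANN_EV_PER_K * (1.0 / t_cool_k - 1.0 / t_hot_k))


if __name__ == "__main__":
    # Assumed example: Q = 0.7 eV, chip at 60 C versus 70 C.
    print(f"MTTF(60 C) / MTTF(70 C) = {mttf_ratio(333.15, 343.15, 0.7):.2f}")
```

With these assumed parameters the ratio comes out at roughly 2, in line with the rule of thumb that a 10°C increase halves lifetime.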
16
Temporal gradients
- Due to: a) low-frequency power change, b) workload change, c) power management
- Affects MTTF
[Figure: temperature [°C] over time [sec]]

Spatial gradients
- Virtex-5 with two PowerPC CPUs
- Temperature analysis of Virtex-5 FPGA using infrared thermal camera showing peak spatial thermal gradients of 0.12°C/μm, resulting in an increase of electromigration and accelerated aging
(Src: Amrouch, Ebi, Henkel)

Goal: balance spatial/temporal gradients
17
[Figure: thermal image (dimensions labeled 18 mm and 16 mm) and corresponding spatial gradient map [°C/mm]; peak spatial gradient of 32°C/mm]
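A spatial gradient map of this kind can be derived from a thermal image by differencing neighbouring pixels. The sketch below is one simple way to do that with forward differences; it is not the method used for the measurement on the slide, and the tiny temperature grid and the 0.5 mm pixel pitch are made-up inputs.

```python
import math


def peak_spatial_gradient(temps, pitch_mm):
    """Largest gradient magnitude (deg C per mm) in a 2D temperature grid.

    Uses forward differences, evaluated at every pixel that has both a right
    and a lower neighbour. `temps` is a list of equally long rows and
    `pitch_mm` is the pixel spacing in both directions.
    """
    peak = 0.0
    rows, cols = len(temps), len(temps[0])
    for r in range(rows - 1):
        for c in range(cols - 1):
            dx = (temps[r][c + 1] - temps[r][c]) / pitch_mm
            dy = (temps[r + 1][c] - temps[r][c]) / pitch_mm
            peak = max(peak, math.hypot(dx, dy))
    return peak


if __name__ == "__main__":
    # Hypothetical 3x3 thermal image with one hot pixel in the centre.
    image = [[50.0, 50.0, 50.0],
             [50.0, 53.0, 50.0],
             [50.0, 50.0, 50.0]]
    print(f"peak gradient: {peak_spatial_gradient(image, pitch_mm=0.5):.2f} C/mm")
```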
18
[Figure: temperature [°C] of Core1 and Core2 over time [sec], measured with a thermal camera on a Virtex-5 FPGA]
Activity migration between two cores at the rate of 154 MCycle
(Src: Amrouch, Ebi, Henkel)
19
- Holes/cracks and hillocks lead to open failures or short circuit failures, respectively
- Failures may be temperature dependent due to material expansion
  - Holes may function normally at high temperatures but fail at low temperatures
  - Hillocks may function normally at low temperatures but short circuit at high temperatures
[Figure: hole/crack and hillock] [W.D. Nix, 1992]
$\mathrm{MTTF} = A \cdot J^{-n} \cdot e^{Q/(kT)}$
20
Negative Bias Temperature Instability
- Breakdown of Si-H bonds at the silicon-oxide interface due to voltage/thermal stress causes interface traps
- Affects mostly P-MOSFETs because of negative gate bias
- Effect in N-MOSFETs is negligible
- Despite research focus: NBTI is not yet fully understood!
[Figure: P-type MOSFET reaction-diffusion model, showing stress at Vg < 0 versus Vg = 0 and the resulting interface trap]
21
- NBTI manifests itself as a shift in Vth
  - Causes increase in transistor delay
  - Delay faults are responsible for NBTI-induced bit-flips and resulting circuit failure
- Recovery effect in periods of no stress
  - When voltage and temperature are low, Vth can shift back towards its ...
  - Full recovery from a stress period
- In practice, the overall Vth shift increases monotonically over longer periods, e.g. months/years
[Figure: Vth shift [V] and Vg [V] over time, with alternating stress and recovery phases]
22
23
- Mean Vth shift mainly due to temperature/voltage
  - Small technology nodes have less Vth shift due to lower voltages
- However: standard deviation of Vth shift mainly due to structure size
  - Small technology nodes and small P-MOSFETs (e.g. SRAM) show large deviations from the mean Vth shift => increased reliability concern
[Figure: SRAM Vth shift over time [years], two panels: standard deviation in 65nm SRAM P-MOSFETs and standard deviation at 32nm]
24
NBTI in a PMOS transistor
[Figure: PMOS transistor (P-type source and drain, N-type substrate, gate) with Si-H bonds and a trap at the interface; Vg < 0: stress phase, Vg = 0: self-recovery phase]
[Figure: SRAM cell with VDD, word line (WL), bit line (BL) and its complement, and the PMOS transistors P1 and P2]
Example: assuming this SRAM cell holds the value `0` for 70% of the time, the PMOS transistor P1 is in the stress phase during 70% of the runtime, so it will age faster than P2.
- DutyCycle(P1) = 0.7
- DutyCycle(P2) = 0.3
[Src: IBM, KIT]
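The duty cycles in this example can be derived from a trace of the values a cell holds over time. The sketch below assumes, as the example implies, that P1 is stressed while the cell holds 0 and P2 while it holds 1; the sampled trace and the function name `pmos_duty_cycles` are hypothetical.

```python
def pmos_duty_cycles(stored_bits):
    """Return (duty_p1, duty_p2) for a sequence of sampled cell values.

    Follows the slide's example: P1 is under stress while the cell holds 0,
    P2 while it holds 1, so the two duty cycles add up to 1.
    """
    if not stored_bits:
        raise ValueError("need at least one sample")
    zeros = sum(1 for bit in stored_bits if bit == 0)
    duty_p1 = zeros / len(stored_bits)
    return duty_p1, 1.0 - duty_p1


if __name__ == "__main__":
    # Hypothetical trace: the cell holds 0 in 7 of 10 equally spaced samples.
    trace = [0, 0, 0, 1, 0, 0, 1, 0, 1, 0]
    d1, d2 = pmos_duty_cycles(trace)
    print(f"DutyCycle(P1) = {d1:.1f}, DutyCycle(P2) = {d2:.1f}")
```

A duty cycle far from 0.5 means the two transistors of the cell age unevenly.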
25
SNM degradation due to NBTI
[Figure: SRAM transfer characteristics during read operation over 11 years, showing the SNM for the cases α = 0.5 and α = 0.3]
[Src: IBM, KIT]
26
- Progressive reduction of carrier mobility
- Increase in CMOS threshold voltage
- Switching speed slower => leads to timing problems => increases temperature
27
- Timing errors result from spatial temperature variations [Xie 2006]
  - Localized hotspots need to be avoided
- Clock trees are particularly vulnerable
  - Span across multiple thermal areas
  - Additional buffers can be inserted to cope with thermal clock skew
- Clock skew compensation using a thermal management unit to control tunable delay buffers inserted into the clock tree [Chakraborty, 2008]
28
- Increase in temperature leads to increase in leakage power
  - Feedback loop possible!
- Sub-threshold leakage approximated by
  $I_{sub} = A \cdot e^{-B/T}$
  where A and B are constants
  - Exponential growth!
[Zhang 2003]
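The feedback loop between temperature and leakage can be sketched as a fixed-point iteration. The code assumes a leakage term of the form A * exp(-B / T) and a single thermal resistance between chip and ambient; all parameter values (A, B, dynamic power, thermal resistance, runaway limit) are hypothetical and chosen only to show a stable and an unstable case.

```python
import math


def leakage_power(temp_k, a, b):
    """Sub-threshold leakage approximated as A * exp(-B / T), in watts."""
    return a * math.exp(-b / temp_k)


def settle_temperature(p_dynamic, r_thermal, t_ambient, a, b,
                       max_iter=1000, tol=1e-6, t_limit=500.0):
    """Iterate T = T_amb + R * (P_dyn + P_leak(T)) until it settles.

    Returns the steady-state temperature in kelvin, or None if the
    temperature exceeds t_limit (thermal runaway) or does not converge.
    """
    temp = t_ambient
    for _ in range(max_iter):
        new_temp = t_ambient + r_thermal * (p_dynamic + leakage_power(temp, a, b))
        if new_temp > t_limit:
            return None
        if abs(new_temp - temp) < tol:
            return new_temp
        temp = new_temp
    return None


if __name__ == "__main__":
    # Hypothetical parameters: A = 2.5e4 W, B = 3000 K, 20 W dynamic power.
    for r in (1.0, 4.0):
        result = settle_temperature(20.0, r, 300.0, a=2.5e4, b=3000.0)
        print(f"R = {r} K/W ->", "runaway" if result is None else f"{result:.1f} K")
```

With good cooling the loop settles a few kelvin above the leakage-free temperature; with poor cooling each temperature increase adds more leakage than the package can remove, and the iteration runs away.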
29
[Figure: activity of components over time, alternating between active and idle phases]
30
[Figure: activity migration between two units (L and R) over time (t0 to t4, step Δt); each unit alternates between active and idle, with write (w) and read (r) accesses, and between hot and cool]
- Slow migration rate: higher thermal variance, high peak temperature
- Fast migration rate: lower thermal variance, possibly lower peak temperature
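The effect of the migration rate on peak temperature can be illustrated with a toy simulation. The sketch below models each of two cores as an independent first-order thermal RC node and swaps a single activity between them at a fixed period; the thermal parameters, power values and the absence of heat exchange between the cores are all assumptions, and `simulate_migration` is a hypothetical helper.

```python
def simulate_migration(period_s, total_s=200.0, dt=0.01, t_ambient=45.0,
                       r_thermal=2.0, tau_s=10.0, p_active=10.0, p_idle=1.0):
    """Peak temperature of two cores that swap one activity every period_s.

    Each core is a first-order thermal RC node without coupling to the other:
        dT/dt = (T_amb + R * P - T) / tau
    integrated with the explicit Euler method.
    """
    temps = [t_ambient, t_ambient]
    peak = t_ambient
    for step in range(round(total_s / dt)):
        # Index of the core that currently runs the activity.
        active = int(step * dt / period_s) % 2
        for core in range(2):
            power = p_active if core == active else p_idle
            target = t_ambient + r_thermal * power
            temps[core] += dt * (target - temps[core]) / tau_s
        peak = max(peak, max(temps))
    return peak


if __name__ == "__main__":
    for period in (50.0, 1.0):
        print(f"migration every {period:5.1f} s -> peak {simulate_migration(period):.1f} C")
```

In this model a period much longer than the thermal time constant lets the active core heat up almost to its steady-state temperature, while a short period keeps both cores near the temperature of the average power.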
31
Effect of ω on temperature in an FPGA measured using infrared camera
2000 T-FF design at 600 MHz (Virtex-5 FPGA)
[Figure: temperature [°C] and temperature gradients [°C/mm] versus migration rate (ω), from fast to slow migration; rate axis ticks 1, 10M, 20M, 40M, 80M, 160M, 320M; thermal images labeled from 48 °C to 70 °C]
32
Multi-dimensional case: two activities, four cores (FPGA)
[Figure: temperature versus migration rate (ω), from fast to slow migration]
33
Effect of ω on temperature in a register file for the libquantum benchmark (SPEC 2006)
[Figure: peak temperature [°C] (axis from 66.8 to 68.6) versus migration rate (ω) (axis from 0K to 500K)]
34
[Hanson 2007] H. Hanson et al., Thermal Response to DVFS: Analysis with an Intel Pentium M, ISLPED 2007.
[Xie 2006] V. Narayanan et al., Reliability Concerns in Embedded System Designs, Computer 39, 1 (Jan. 2006).
[Yang 2008] J. Yang et al., Dynamic Thermal Management through Task Scheduling, ISPASS 2008.
[Zhang 2003] Y. Zhang et al., HotLeakage: A Temperature-Aware Model
[Chakraborty 2008] A. Chakraborty et al., Dynamic Thermal Clock Skew Compensation Using Tunable Delay Buffers, TVLSI 2008.
[Skadron 2005] K. Skadron, A Quick Thermal Tutorial, invited talk.