Low Power Design Prof. Dr. J. Henkel CES - Chair for Embedded - - PowerPoint PPT Presentation



slide-1
SLIDE 1
  • Prof. Jörg Henkel, Low Power Design, SS2014 ces.itec.kit.edu

Low Power Design

  • Prof. Dr. J. Henkel

CES - Chair for Embedded Systems KIT, Germany

  • V. Thermal Aspects of Low Power Design, Part 1

slide-2
SLIDE 2


Why design for low power/energy?

  • Portable Systems
      • Notebooks, palm-tops, PDA, cellular phones, pagers, etc.
          • 32% of PC market, and growing
      • Battery-driven: long battery life crucial
      • System cost, weight limited by batteries
          • 40 W, 10 hrs @ 20-35 W-hr/pound = 7-20 pounds
          • Slow growth in battery technology
  • Must reduce energy drain from batteries
  • Thermal Considerations
      • 10 °C increase in operating temperature => component failure rate doubles
      • Packaging: ceramic vs. plastic
      • Cooling requirements
  • Increasing levels of integration / clock frequencies make the problem worse
      • 10 cm², 500 MHz => 315 W
  • Reliability Issues
      • Electro-migration
      • IR drops on supply lines
      • Inductive effects
  • Tied to peak/average power consumption
  • Environmental Concerns
      • EPA estimate: 80% of office equipment electricity is used in computers
      • “Energy Star” program to recognize power efficient PCs
      • Power management standard for desktops and laptops
  • Drive towards “Green PC”

(Src: A. Raghunathan, NEC)

slide-3
SLIDE 3


(Src: F. Pollack, Intel)

slide-4
SLIDE 4


Overview

  • Why do we need thermal management?: motivation
  • Thermal basics
  • “Dark Silicon”
  • Reliability concerns
  • Means to handle temperature 1:
      • Activity Migration
      • Cooling Methods
  • Means to handle temperature 2:
      • DVFS
      • Task Scheduling
      • Task mapping
  • Multi-core architectures
  • 3D architectures
  • Thermal Management at CES

Part 1

slide-5
SLIDE 5


Basic definitions

  • Heat is thermal energy (joules)
  • Heat transfer Q (watt = joules/s)
  • Heat flux is the heat transfer rate through a given surface area
  • Specific thermal capacity c (J/(cm³·K)) of a material determines the amount of heating
  • Temperature: T

$\Delta Q = c \cdot \Delta T$

Material    c [J/(cm³·K)]
Silica      1.55
Cu          3.45
H2O         4.17
Air         0.0012
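As a rough worked example of the relation ΔQ = c·ΔT, the sketch below estimates how much each listed material warms up when the same amount of heat is put into the same volume. It assumes c is a volumetric heat capacity in J/(cm³·K), so the volume enters as an extra factor; the function name and the 1 J / 1 cm³ figures are illustrative choices, not part of the lecture.

```python
# Volumetric heat capacities c in J/(cm^3*K), copied from the slide's table.
HEAT_CAPACITY = {"Silica": 1.55, "Cu": 3.45, "H2O": 4.17, "Air": 0.0012}


def temperature_rise(heat_joules, volume_cm3, c):
    """Return delta-T in kelvin, assuming delta-Q = c * V * delta-T."""
    return heat_joules / (c * volume_cm3)


if __name__ == "__main__":
    # Hypothetical scenario: 1 J deposited into 1 cm^3 of each material.
    for material, c in HEAT_CAPACITY.items():
        rise = temperature_rise(1.0, 1.0, c)
        print(f"{material:6s}: {rise:10.3f} K")
```

Under these assumptions, air heats up far more than water or copper for the same energy, which is why the heat capacity of the material matters for cooling.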

slide-6
SLIDE 6


Methods of heat transfer

  • Conduction
      • Thermal conductivity k (W/(m·K)) is a material constant
  • Radiation: heat emitted as electromagnetic waves due to temperature-dependent kinetic motion of charged particles in matter
  • Convection: transfer of energy through movement of fluids

Material    k [W/(m·K)]
Silica      1.5
Cu          401
H2O         0.6
Air         0.025

slide-7
SLIDE 7


Power in Relation to Temperature

Basic temperature equation:

$C \cdot \frac{dT}{dt} = -Q + P$

$T(t_1) = T(t_0) + \frac{1}{C} \int_{t_0}^{t_1} \left( -Q(t) + P(t) \right) dt$

where Q is the heat dissipation rate and P is the power.

  • T_SS is the steady-state temperature the system will asymptotically reach with the current power configuration
  • Ambient temperature T_A is the minimum reachable temperature

Heating: $T(t) = T_{SS} - (T_{SS} - T_0) \, e^{-t/\tau_h}$

Cooling: $T(t) = T_A + (T_0 - T_A) \, e^{-t/\tau_c}$

(T_0 is the temperature at t = 0; τ_h and τ_c are the heating and cooling time constants.)

[Figure: measured temperature (°C) over time (s), showing heating and cooling phases]
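The heating and cooling curves can be reproduced with a small numerical sketch. It assumes the heat dissipation rate follows Q = (T − T_A)/R with a thermal resistance R, which the slide does not state; under that assumption T_SS = T_A + P·R and the time constant is R·C. All parameter values are hypothetical.

```python
import math


def simulate(t_start, power, t_ambient, r_th, c_th, duration, dt=0.001):
    """Euler integration of C*dT/dt = P - Q, with Q = (T - T_A)/R (assumed)."""
    temp = t_start
    for _ in range(int(round(duration / dt))):
        q = (temp - t_ambient) / r_th  # heat dissipation rate
        temp += dt * (power - q) / c_th
    return temp


def closed_form(t_start, power, t_ambient, r_th, c_th, t):
    """Exponential approach to T_SS = T_A + P*R with time constant R*C."""
    t_ss = t_ambient + power * r_th
    return t_ss - (t_ss - t_start) * math.exp(-t / (r_th * c_th))


if __name__ == "__main__":
    # Hypothetical values: T_A = 25 C, R = 2 K/W, C = 5 J/K, P = 4 W.
    print("heating, numeric:", simulate(25.0, 4.0, 25.0, 2.0, 5.0, 10.0))
    print("heating, formula:", closed_form(25.0, 4.0, 25.0, 2.0, 5.0, 10.0))
    # Cooling is the same model with P = 0, so T_SS collapses to T_A.
    print("cooling, formula:", closed_form(33.0, 0.0, 25.0, 2.0, 5.0, 10.0))
```

Setting the power to zero turns the heating expression into the cooling one, so one function covers both phases.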

slide-8
SLIDE 8


Heat Remains a Problem …

Conclusion?

“Circuit heat generation is the main limiting factor for scaling of device speed and switch circuit density”

By Jeff Welser, Director SRC Nanoelectronics Research Initiative, IBM, Opening Keynote Address ICCAD 2007 (Src: K. Skadron: Low-Power Design and Temperature Management; IEEE Micro, Vol. 27, No. 6, 2007)

[Figure: MTTF [years] vs. Temp (Celsius); plotted values 6.85, 7.24, 7.73, 8.34, 9.06. Src: K. Skadron et al., ICCAD 2004]

slide-9
SLIDE 9

Temperature in 3D

Problem: vertical heat flow

  • Only one layer directly interfaces with the heat sink
  • Heat needs to dissipate through multiple layers
  • The heat sink is located on top of the chip
  • Hot cores distant from the heat sink dissipate their heat through other layers
  • Silicon has a low thermal conductivity: 150 W/(m·K) (silicon) vs. 401 W/(m·K) (copper)

slide-10
SLIDE 10

Heat Remains a Problem …

Conclusion?

“Circuit heat generation is the main limiting factor for scaling of device speed and switch circuit density”

By Jeff Welser, Director SRC Nanoelectronics Research Initiative, IBM, Opening Keynote Address ICCAD 2007

Quantity              Classical scaling    Limited scaling
Device count          S²                   S²
Device frequency      S                    S
Device power (cap)    1/S                  1/S
Device power (Vdd)    1/S²                 ~1
Power density         1                    S²

(Src: “Dennard Scaling”)
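The power-density row follows from multiplying the other four rows. The sketch below spells out that arithmetic for a given scaling factor S; the function name and the boolean flag are illustrative.

```python
def power_density(s, vdd_scales):
    """Relative power density after one scaling step with factor s.

    vdd_scales=True models classical scaling (the Vdd term shrinks by 1/s^2);
    vdd_scales=False models limited scaling (the Vdd term stays at ~1).
    """
    device_count = s ** 2
    frequency = s
    power_cap = 1.0 / s
    power_vdd = 1.0 / s ** 2 if vdd_scales else 1.0
    return device_count * frequency * power_cap * power_vdd


if __name__ == "__main__":
    for s in (1.4, 2.0):
        print(f"S={s}: classical={power_density(s, True):.2f}, "
              f"limited={power_density(s, False):.2f}")
```

With S = 2, classical scaling keeps the power density at 1, while limited scaling raises it by S² = 4.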

slide-11
SLIDE 11


Power/Temperature Wall

  • The Dark Silicon Problem emerged due to the utilization wall as a result of
      • Memory wall
      • Parallelism wall
      • Power wall => chips cannot be driven with a power greater than their power budgets given as TDP (thermal design power)
  • Power dissipation, peak power, and power density are the ultimate limiting factors, thus determining the amount of Dark Silicon
  • Power is more expensive than area
  • Regardless of available parallelism, chip organization & topology, multicore scaling is power limited
  • @22 nm: Dark Silicon ≈ 20%-50%
  • @8 nm: Dark Silicon > 50%-70%
  • Dark Silicon must be kept powered off

Sources:

  • H. Esmaeilzadeh, E. Blem, R. St. Amant, K. Sankaralingam, D. Burger, “Dark Silicon and the End of Multicore Scaling”, in International Symposium on Computer Architecture (ISCA), 2011
  • J. Allred, S. Roy, K. Chakraborty, “Designing for Dark Silicon: A Methodological Perspective on Energy Efficient Systems”, in ISLPED, 2012

slide-12
SLIDE 12


Power/Temperature Wall: Trends

  • H. Esmaeilzadeh, E. Blem, R. St. Amant, K. Sankaralingam, D. Burger, “Dark Silicon and the End of Multicore Scaling”, in International Symposium on Computer Architecture (ISCA), 2011

Scaling Limits when Dark Silicon Dominates

Even with unlimited parallelism, the speedup is limited by the power constraint => there will be Dark Silicon
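A minimal sketch of this argument, assuming a simple model that is not taken from the cited paper: every core draws the same power when active, the chip may not exceed its TDP, and speedup follows Amdahl's law over the active cores only. All names and numbers are hypothetical.

```python
def active_cores(total_cores, core_power_w, tdp_w):
    """Number of cores that can run at once without exceeding the TDP."""
    return min(total_cores, int(tdp_w // core_power_w))


def dark_fraction(total_cores, core_power_w, tdp_w):
    """Fraction of cores that must stay powered off (dark)."""
    return 1.0 - active_cores(total_cores, core_power_w, tdp_w) / total_cores


def speedup(parallel_fraction, cores):
    """Amdahl's law restricted to the cores that are actually powered."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical chip: 16 cores at 10 W each under an 80 W budget.
    n_active = active_cores(16, 10.0, 80.0)
    print("active cores:", n_active)
    print("dark fraction:", dark_fraction(16, 10.0, 80.0))
    # Even a perfectly parallel workload cannot use the dark cores.
    print("speedup at f=1.0:", speedup(1.0, n_active))
```

In this toy model, a fully parallel workload on a 16-core chip still tops out at the number of cores the power budget allows.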

slide-13
SLIDE 13


Power/Temperature Wall: Is Multi-Core Scaling Promising?

Node    Configuration
65nm    2 cores @ < 2GHz
45nm    4 cores @ >= 2GHz
32nm    8 cores @ > 2GHz
22nm    16 cores @ >= 2GHz, 8 dark cores
22nm    16 cores @ >= 4GHz, 12 dark cores

  • Assumption: scaling factor S=2
  • Tradeoff between #cores and frequency

slide-14
SLIDE 14


Impact of High Temperature

Higher temperatures result in a lower Mean Time To Failure (MTTF)

A 10°C increase in operating temperature cuts product lifetimes in half [http://www.nanowerk.com]

[Figure: MTTF curve (failure rate over time) with the phases Infant Mortality, Useful Time, and Wearout. Temperature increase reduces the lifetime of the chip.]

slide-15
SLIDE 15


Temperature and Reliability

  • Electromigration: aging effect due to transport of mass in metal interconnects, directly linked to temperature
  • Basic mean time to failure is modeled by Black’s Equation [wikipedia]:

$\mathrm{MTTF} = A \, j^{-n} \, e^{\frac{Q}{kT}}$

  • MTTF decreases exponentially with temperature => Goal: reduce peak temperatures
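To get a feel for the temperature dependence in Black's equation, the sketch below compares the MTTF at two temperatures. In the usual reading of the equation, j is the current density, Q the activation energy, k the Boltzmann constant and T the absolute temperature. The values A = 1, n = 2 and Q = 0.7 eV are assumptions chosen for illustration only.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def black_mttf(j, temp_kelvin, a=1.0, n=2.0, q_ev=0.7):
    """Black's equation: MTTF = A * j^(-n) * exp(Q / (k*T)).

    a, n and q_ev are assumed illustrative values, not measured constants.
    """
    return a * j ** (-n) * math.exp(q_ev / (BOLTZMANN_EV_PER_K * temp_kelvin))


def mttf_ratio(t_hot_celsius, t_cool_celsius, q_ev=0.7):
    """MTTF at the hotter temperature relative to the cooler one (same j)."""
    hot = black_mttf(1.0, t_hot_celsius + 273.15, q_ev=q_ev)
    cool = black_mttf(1.0, t_cool_celsius + 273.15, q_ev=q_ev)
    return hot / cool


if __name__ == "__main__":
    print(f"MTTF(85 C) / MTTF(75 C) = {mttf_ratio(85.0, 75.0):.2f}")
```

With the assumed activation energy, a 10 °C increase around 80 °C cuts the modeled MTTF to roughly half, in line with the rule of thumb quoted earlier in the lecture.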

slide-16
SLIDE 16


Thermal Gradients

  • Due to: a) low-frequency power change, b) workload change, c) power management
  • Affects MTTF

Spatial gradients

  • Virtex-5 with two PowerPC CPUs
  • Temperature analysis of Virtex-5 FPGA using infrared thermal camera, showing peak spatial thermal gradients of 0.12°C/μm, resulting in an increase of electromigration and accelerated aging

Temporal gradients

[Figure: temperature [°C] over time [sec]. Src: Amrouch, Ebi, Henkel]

=> Goal: balance spatial/temporal gradients

slide-17
SLIDE 17


Thermal Gradients

[Figure: thermal image of an 18 mm × 16 mm area and the corresponding spatial gradient map in [°C/mm]; peak spatial gradient of 32°C/mm]

slide-18
SLIDE 18


Example of Potential Thermal Cycling in FPGAs

[Figure: temperature [°C] over time [sec] for Core1 and Core2, measured with a thermal camera on a Virtex-5 FPGA]

Activity migration between two cores at the rate of 154 MCycle

Src: Amrouch, Ebi, Henkel

slide-19
SLIDE 19


Temperature-Dependent Effects

  • Process variations and electromigration can result in holes and hillocks
      • These lead to open failures or short-circuit failures, respectively
  • Failures may be temperature dependent due to material expansion
      • Holes may function normally at high temperatures but fail at low temperatures
      • Hillocks may function normally at low temperatures but short circuit at high temperatures

[Figure: hole/crack and hillock in an interconnect. Src: W.D. Nix, 1992]

$\mathrm{MTTF} = A \, j^{-n} \, e^{\frac{Q}{kT}}$

slide-20
SLIDE 20


What is NBTI?

  • Negative Bias Temperature Instability
  • Breakdown of Si-H bonds at the silicon-oxide interface due to voltage/thermal stress => causes interface traps
  • Affects mostly P-MOSFETs because of negative gate bias
  • Effect in N-MOSFETs is negligible
  • Despite research focus: NBTI is not yet fully understood!

[Figure: P-type MOSFET (source S, gate, oxide, drain D) with the reaction-diffusion model: Si-H bonds at the oxide interface break and leave traps. Vg < 0 => STRESS!; Vg = 0 otherwise]

slide-21
SLIDE 21


NBTI - Effect

  • NBTI manifests itself as a shift in Vth
  • Causes increase in transistor delay
  • Delay faults are responsible for NBTI-induced bit-flips and resulting circuit failure
  • Recovery effect in periods of no stress
  • When voltage and temperature are low, Vth can shift back towards its original value
  • Full recovery from a stress period is only possible in infinite time
  • In practice, the overall Vth shift increases monotonically over longer periods, e.g. months/years

[Figure: Vth shift [V] and Vg [V] over time, alternating between stress and recovery phases]
slide-22
SLIDE 22


NBTI and Temperature

  • Temperature is an important factor in NBTI modeling
  • Higher temperatures increase the shift in threshold voltage
  • ΔVth is approximately 50% higher at 75°C than at 55°C
  • The NBTI effect at 75°C is approximately equal to that of alternating between 85°C and 25°C

slide-23
SLIDE 23


NBTI - Analysis

  • Mean Vth shift mainly due to temperature/voltage
  • Small technology nodes have less Vth shift due to lower voltages
  • However: standard deviation of Vth shift mainly due to structure size
  • Small technology nodes and small P-MOSFETs (e.g. SRAM) show large deviations from the mean Vth shift => increased reliability concern

[Figure, two panels: SRAM Vth shift and its std. deviation over time [years]; std deviation in 65nm SRAM P-MOSFETs and std deviation at 32nm]
slide-24
SLIDE 24


NBTI Impact in 6-T SRAM Cells

[Figure: PMOS transistor (P-type source and drain, N-type body, gate, oxide) with Si-H bonds at the oxide interface breaking into traps under NBTI; 6-T SRAM cell with VDD, word line WL, bit lines BL and inverted BL, and PMOS transistors P1 and P2]

  • Vg < 0 => stress phase
  • Vg = 0 => self-recovery phase
  • Ex: During 70% of the runtime, the PMOS transistor P1 is in the stress phase, so it will age faster than P2
      • DutyCycle (P1) = 0.7
      • DutyCycle (P2) = 0.3
      • e.g. assuming the fraction of time that this SRAM cell holds the value `0` is 0.7

[Src: IBM, KIT]
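The duty cycles in this example can be derived from a trace of the values a cell holds. The sketch below follows the slide's convention that P1 is under stress while the cell stores `0` and P2 while it stores `1`; the function name and the sample trace are hypothetical.

```python
def duty_cycles(stored_bits):
    """Return (duty_p1, duty_p2) for a trace of values held by one SRAM cell.

    Assumes equal-length time slots and that P1 is stressed while the cell
    holds 0 and P2 while it holds 1, as in the slide's example.
    """
    if not stored_bits:
        raise ValueError("trace must not be empty")
    zeros = sum(1 for bit in stored_bits if bit == 0)
    duty_p1 = zeros / len(stored_bits)
    return duty_p1, 1.0 - duty_p1


if __name__ == "__main__":
    # Hypothetical trace: the cell holds 0 in 7 of 10 time slots.
    trace = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
    p1, p2 = duty_cycles(trace)
    print(f"DutyCycle(P1) = {p1:.1f}, DutyCycle(P2) = {p2:.1f}")
```

A balanced trace would give both transistors a duty cycle of 0.5, so neither PMOS ages faster than the other.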

slide-25
SLIDE 25


NBTI Impact in 6-T SRAM Cells

  • Static Noise Margin (SNM) is one of the critical reliability metrics in an SRAM cell. It represents the immunity of the SRAM against noise during the read or write operation
  • NBTI highly affects the SNM, making the SRAM more susceptible to failure

[Figure: SNM degradation due to NBTI. SRAM transfer characteristics during read operation for α = 0.5 during the first year, for α = 0.5 over 11 years, and for α = 0.3 over 11 years]

[Src: IBM, KIT]

slide-26
SLIDE 26


Temperature and Reliability

  • Hot-Carrier Effect: build-up of trapped charges in the gate-channel interface region
  • Progressive reduction of carrier mobility => increase in CMOS threshold voltage
  • Switching speed slower => leads to timing problems => increases temperature

slide-27
SLIDE 27


Temperature and Reliability

  • Transient errors may result due to timing errors
      • Approx. 5% increase in delay for every 10°C temperature increase [Xie 2006]
  • Timing errors result from spatial temperature variations => localized hotspots need to be avoided
  • Clock trees are particularly vulnerable
      • Span across multiple thermal areas
      • Additional buffers can be inserted to cope with thermal clock skew

[Figure: clock skew compensation using a thermal management unit to control tunable delay buffers inserted into the clock tree [Chakraborty, 2008]]

slide-28
SLIDE 28


Temperature and Leakage

Thermal “runaway” problem:

  • Increase in temperature leads to increase in leakage power => feedback loop possible!
  • Sub-threshold leakage is approximated by [Zhang 2003]

$I_{sub} \approx A \cdot e^{B T}$

    where A and B are constants => exponential growth!
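The feedback loop can be illustrated with a fixed-point iteration, assuming the exponential form I_sub ≈ A·e^(B·T) and a steady-state temperature T = T_A + R·(P_dyn + Vdd·I_sub(T)). The thermal resistance R, the dynamic power and the constants A and B below are hypothetical values picked only to show both outcomes.

```python
import math


def steady_temperature(r_th, t_ambient=45.0, p_dyn=20.0, vdd=1.0,
                       a=0.5, b=0.02, t_limit=200.0, max_iter=1000):
    """Iterate T -> T_A + R*(P_dyn + Vdd*A*exp(B*T)) until it settles.

    Returns the settled temperature in degrees C, or None if the temperature
    climbs past t_limit (treated here as thermal runaway). All defaults are
    assumed illustrative values.
    """
    temp = t_ambient
    for _ in range(max_iter):
        leakage_power = vdd * a * math.exp(b * temp)
        new_temp = t_ambient + r_th * (p_dyn + leakage_power)
        if new_temp > t_limit:
            return None  # leakage and temperature keep feeding each other
        if abs(new_temp - temp) < 1e-6:
            return new_temp
        temp = new_temp
    return None


if __name__ == "__main__":
    print("good cooling (R = 0.5 K/W):", steady_temperature(0.5))
    print("poor cooling (R = 4.0 K/W):", steady_temperature(4.0))
```

With good cooling the loop settles at a stable temperature; with poor cooling no stable point exists in this toy model and the iteration runs away.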

slide-29
SLIDE 29


Activity Balancing/Migration

  • Activity Migration introduces additional instances of a system component to the design or divides existing components into sub-components
  • Activity is limited to one component at a time
  • Allows a cool-down of the idle component
  => Influence thermal behavior through load balancing

[Figure: components alternating between Active and Idle over time, switching at ω, 2ω, 3ω, 4ω]

slide-30
SLIDE 30


Activity Balancing/Migration: Principle

A more specific scenario: activity migration in register file

[Figure: register-file copies L and R alternating between active and idle, with writes (w) and reads (r), over time steps t0 to t4 separated by Δt; regions shown as hot or cool]

  • Slow migration rate: higher thermal variance, high peak temperature
  • Fast migration rate: lower thermal variance, possibly lower peak temperature
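The effect of the migration rate on peak temperature can be sketched with two identical components that take turns being active. Each component is modeled as an independent first-order thermal system (no heat exchange between the two), which is a simplifying assumption, and all parameter values are hypothetical.

```python
def peak_temperature(period, power=4.0, t_ambient=25.0, r_th=2.0,
                     c_th=5.0, total_time=200.0, dt=0.01):
    """Peak temperature when activity swaps between two components.

    period is the time one component stays active before activity migrates.
    Each component follows C*dT/dt = P - (T - T_A)/R (assumed model).
    """
    temps = [t_ambient, t_ambient]
    peak = t_ambient
    for step in range(int(round(total_time / dt))):
        active = int(step * dt // period) % 2  # index of the active component
        for i in range(2):
            p = power if i == active else 0.0
            q = (temps[i] - t_ambient) / r_th
            temps[i] += dt * (p - q) / c_th
        peak = max(peak, temps[0], temps[1])
    return peak


if __name__ == "__main__":
    print("slow migration (50 s):", round(peak_temperature(50.0), 2))
    print("fast migration (0.5 s):", round(peak_temperature(0.5), 2))
```

In this model the thermal time constant is R·C = 10 s. Migrating much faster than that keeps both components near the average load, while slow migration lets the active one approach its full steady-state temperature.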

slide-31
SLIDE 31


Control-based Balancing/Migration: Analysis

[Figure: effect of ω on temperature in an FPGA, measured using an infrared camera, for a 2000 T-FF design at 600 MHz (Virtex-5 FPGA). Migration rate ranges from fast (1) to slow (10M, 20M, 40M, 80M, 160M, 320M); axes: Temperature [°C] and Temperature gradients [°C/mm]; thermal images labeled between 48 °C and 70 °C]

slide-32
SLIDE 32


Control-based Balancing: Tradeoff between absolute temperature and gradients

Multi-dimensional case: two activities, four cores (FPGA)

[Figure: optimal migration rate (ω) marked on an axis from fast to slow migration]

slide-33
SLIDE 33


Control-based Balancing: finding the right rate

[Figure: peak temperature [°C] (about 66.8 to 68.6) vs. migration rate (ω) [cycle] (0K to 500K), with the optimal migration rate (ω) marked. Effect of ω on temperature in a register file for the libquantum benchmark (SPEC 2006)]

slide-34
SLIDE 34


References and sources

  • [Hanson 2007] H. Hanson et al., Thermal Response to DVFS: Analysis with an Intel Pentium M, ISLPED 2007
  • [Xie 2006] V. Narayanan et al., Reliability Concerns in Embedded System Designs, Computer 39, 1 (Jan. 2006)
  • [Yang 2008] J. Yang et al., Dynamic Thermal Management through Task Scheduling, ISPASS 2008
  • [Zhang 2003] Y. Zhang et al., HotLeakage: A Temperature-Aware Model of Subthreshold and Gate Leakage for Architects, Tech Report 2003
  • [Chakraborty 2008] A. Chakraborty et al., Dynamic Thermal Clock Skew Compensation Using Tunable Delay Buffers, TVLSI 2008
  • [Skadron 2005] K. Skadron, A Quick Thermal Tutorial, invited talk