Mitigating Parameter Variation with Dynamic Fine-Grain Body Biasing


SLIDE 1

Mitigating Parameter Variation with Dynamic Fine-Grain Body Biasing*

Radu Teodorescu, Jun Nakano, Abhishek Tiwari and Josep Torrellas
University of Illinois at Urbana-Champaign

*To appear in MICRO-40, December 2007

http://iacoma.cs.uiuc.edu

Tuesday, October 9, 2007

SLIDE 2
  • Radu Teodorescu

Intel PhD Fellowship Forum, October 2007

Motivation

  • Technology scaling continues
      • More and more transistors every generation!
  • However...
      • Chips are increasingly affected by parameter variation

[Figure credit: Intel Corp.]

SLIDE 3

Parameter Variation

  • Process variation
      • Manufacturing at low feature sizes
  • Temperature variation
      • Uneven activity distribution
  • Supply voltage variation
      • IR drop, di/dt noise

[Figure credit: Intel Corp.]

SLIDE 4

Effects of Parameter Variation

  • Higher power consumption
  • Lower frequency
  • Uncertainty in the design process

SLIDE 5

Outline

  • A Model of Process Variation
  • Dynamic Fine-Grain Body Biasing
  • Evaluation
  • Conclusions

SLIDE 7

A Model For Process Variation

  • Fast, simple and parameterizable model
  • We model two key process parameters:
      • Transistor critical dimension (Leff) and threshold voltage (Vth)
  • We also model temperature effects

SLIDE 8

Variation Components

  • Granularity:
      • Die-to-die (D2D)
      • Within die (WID)
  • WID variation:
      • Systematic variation
      • Random variation

[Figure: die-to-die versus within-die variation]

SLIDE 9

A Model For Process Variation

  • Variation in any parameter P:

$\Delta P = \Delta P_{D2D} + \Delta P_{WID} = \Delta P_{D2D} + \Delta P_{rand} + \Delta P_{sys}$

  • We focus on WID variation
      • D2D is a chip-wide offset to $\Delta P_{WID}$
  • Random and systematic components
      • Modeled as normal distributions
      • Treated separately: they impact different levels of the microarchitecture

SLIDE 10

Systematic Variation

  • Characterized by a correlation function:

$\mathrm{corr}(P_x, P_y) = \rho(r), \quad r = |x - y|$

  • Correlation is position independent and isotropic
  • For $\rho(r)$ we choose the spherical model
  • Multivariate normal distribution ($\mu_{sys} = 0$, $\sigma_{sys}$)
  • We divide the chip into a grid of points
  • Each point has one random value of $\Delta P_{sys}$

[Figure: two points $P_x$ and $P_y$ separated by distance r]

SLIDE 11

Spherical Model

  • Matches measured data [Friedberg et al. 05]

$$\rho(r) = \begin{cases} 1 - \dfrac{3r}{2\phi} + \dfrac{r^3}{2\phi^3} & (r \le \phi) \\ 0 & \text{otherwise} \end{cases}$$

[Figure: $\rho(r)$ versus r, with 1 marked on the $\rho$ axis and $\phi$ on the r axis; a nearby pair $P_x$, $P_y$ is labeled "Stronger correlation" and a distant pair "Weaker correlation"]
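
The systematic component described on the last two slides can be sketched in a few lines of Python. This is only an illustration, not the authors' code (the evaluation slides say the variation maps were generated with R). The unit-square chip, the 4 x 4 grid, the value phi = 0.5, the seed, and all function names are assumptions made for the example; the hand-written Cholesky factorization is used only to keep the sketch free of third-party libraries.

```python
import math
import random

def spherical_rho(r, phi):
    """Spherical correlation: 1 at r = 0, falling to 0 for r >= phi."""
    if r >= phi:
        return 0.0
    return 1.0 - 1.5 * (r / phi) + 0.5 * (r / phi) ** 3

def correlation_matrix(points, phi):
    """Pairwise correlation of grid points given as (x, y) tuples."""
    return [[spherical_rho(math.dist(p, q), phi) for q in points] for p in points]

def cholesky(a):
    """Lower-triangular L with L * L^T = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def sample_systematic_map(n, phi, sigma_sys, rng):
    """One draw of the systematic variation on an n x n grid (unit-square chip assumed)."""
    points = [((i + 0.5) / n, (j + 0.5) / n) for i in range(n) for j in range(n)]
    low = cholesky(correlation_matrix(points, phi))
    z = [rng.gauss(0.0, 1.0) for _ in points]  # independent standard normals
    # Correlated values: sigma_sys * L * z has covariance sigma_sys^2 * rho
    values = [sigma_sys * sum(low[i][k] * z[k] for k in range(i + 1))
              for i in range(len(points))]
    return [values[i * n:(i + 1) * n] for i in range(n)]

# Hypothetical parameters, chosen only for illustration
grid = sample_systematic_map(n=4, phi=0.5, sigma_sys=1.0, rng=random.Random(0))
```

A larger phi makes neighboring grid points more alike, which is the "stronger correlation" case in the figure.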

SLIDE 12

Random Variation

  • Random variation: transistor level
  • We model it analytically as a normal distribution
  • Both $\Delta P_{rand}$ and $\Delta P_{sys}$ are normal and independent, with $\sigma_{rand}$ and $\sigma_{sys}$

$\sigma_{total} = \sqrt{\sigma_{rand}^2 + \sigma_{sys}^2}$
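
As a small worked example of the formula above, the sketch below combines the two components and, going the other way, splits a total sigma into equal random and systematic parts (the evaluation later in the talk sets σsys = σrand). The function names and the example value 0.12 are chosen here for illustration.

```python
import math

def total_sigma(sigma_rand, sigma_sys):
    """Standard deviation of the sum of two independent normal components."""
    return math.hypot(sigma_rand, sigma_sys)

def split_equal(sigma_total):
    """Return (sigma_rand, sigma_sys) when both components are equal."""
    component = sigma_total / math.sqrt(2.0)
    return component, component

# Example: a total sigma/mu of 0.12 split into equal parts
sigma_rand, sigma_sys = split_equal(0.12)
```

Because the variances add, each equal component is the total divided by the square root of 2, not the total divided by 2.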

SLIDE 13

Outline

  • A Model of Process Variation
  • Dynamic Fine-Grain Body Biasing
  • Evaluation
  • Conclusions

SLIDE 14

Body Biasing

  • Well-known technique for Vth control
  • A voltage is applied between source/drain and substrate of a transistor
      • Forward body bias (FBB): Vth ↓, Freq ↑, Leak ↑
      • Reverse body bias (RBB): Vth ↑, Freq ↓, Leak ↓
  • Useful knob to control frequency and leakage

SLIDE 17

Body Bias Design Space

  • Time:
      • Static: BB fixed for chip lifetime
      • Simple adaptation: FBB in active mode, RBB in standby
      • Dynamic: BB changes with T and workload
  • Space:
      • Chip-wide
      • Fine-grain
  • Entries in the design space:
      • D2D variation [Intel Xscale]
      • D2D variation, power, performance [Intel’s 80-core chip]
      • WID variation [Tschanz et al]
      • WID variation, power, performance
      • WID variation, T variation (space and time)
  • Marked design points: S-FGBB, D-FGBB

SLIDE 19

Motivation for D-FGBB

  • Body bias trades off frequency for leakage
  • Optimal body bias: the lowest FBB or highest RBB such that circuit delay meets the frequency target
  • Circuit delay changes with temperature
  • Therefore optimal BB changes with temperature

[Figure: relative switching frequency (0.6 to 1.4) versus temperature (50 to 100 C) for Vth = 0.120V, 0.135V, 0.150V, 0.165V and 0.180V]

The goal of D-FGBB is to keep the body bias optimal as T changes
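
The definition of the optimal body bias above can be read as a simple search: try bias levels from the most reverse to the most forward and keep the first one that still meets timing. The sketch below is only an illustration under stated assumptions: the sign convention (negative = RBB, positive = FBB), the 50 mV step, and the linear `toy_delay_ps` model are all hypothetical and are not taken from the talk. The search also assumes that more forward bias never makes the circuit slower.

```python
def optimal_body_bias(candidates_mv, meets_target):
    """Return the most reverse (least forward) bias that meets the frequency target.

    candidates_mv: bias levels in mV; negative = RBB, positive = FBB (assumed convention).
    meets_target:  callable(bias_mv) -> bool, True if circuit delay meets the target.
    """
    for bias in sorted(candidates_mv):
        if meets_target(bias):
            return bias
    return None  # even the strongest FBB is too slow

def toy_delay_ps(bias_mv, temp_c):
    """Hypothetical delay model: slower when hot, faster with forward bias."""
    return 1000 + 2 * (temp_c - 60) - bias_mv // 5

levels = range(-500, 501, 50)  # a +/-500 mV range in assumed 50 mV steps
target_ps = 1000
hot = optimal_body_bias(levels, lambda b: toy_delay_ps(b, 100) <= target_ps)
cool = optimal_body_bias(levels, lambda b: toy_delay_ps(b, 60) <= target_ps)
print(hot, cool)  # 400 0
```

In this toy model the hot chip needs 400 mV of FBB while the cooler chip needs none, which is the temperature dependence the slide points out.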

SLIDE 20

Finding the Optimal BB

  • Measure the delay of each BB cell
  • Critical path replicas to sample cell delay
  • Phase detector “times” the critical path replica
      • If slow: FBB signal raised
      • If fast: RBB signal raised

[Figure: sample point. Labels: CLK, critical path replica, extra delay, phase detector, fast, slow, RBB, FBB]
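
One way to picture the phase detector's decision is as two comparisons against the clock period. The sketch below is an interpretation, not the circuit from the talk: it assumes the "extra delay" in the figure acts as a guard band, so a replica counts as "fast" only if it still meets the clock edge with the extra delay added. The function name and the numeric values are hypothetical.

```python
def phase_detect(replica_delay, clock_period, extra_delay):
    """Return (fbb, rbb) signals for one sample point.

    fbb is raised when the critical path replica is slow (misses the clock edge).
    rbb is raised when the replica is fast (meets the edge even with extra delay).
    Neither is raised when the replica falls inside the guard band.
    """
    fbb = replica_delay > clock_period
    rbb = replica_delay + extra_delay <= clock_period
    return fbb, rbb

# Delays in arbitrary units, chosen only for illustration
print(phase_detect(1.10, 1.0, 0.05))  # (True, False): slow, ask for FBB
print(phase_detect(0.90, 1.0, 0.05))  # (False, True): fast, ask for RBB
print(phase_detect(0.97, 1.0, 0.05))  # (False, False): leave the bias alone
```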

SLIDE 21

Applying Fine Grain BB

[Figure: body bias cell. Labels: sample points raising RBB/FBB signals; AND and OR gates; INC and DEC; local bias generator with N-CNT and P-CNT counters and D2A converters; NMOS Vbb and PMOS Vbb outputs]
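
The figure's labels suggest how the per-sample-point signals could drive the bias counters. The sketch below is one plausible reading, not a description confirmed by the slides: it assumes the FBB signals are ORed into INC (any slow sample point forces more forward bias) and the RBB signals are ANDed into DEC (the cell moves toward reverse bias only when every sample point is fast). It also collapses N-CNT and P-CNT into a single hypothetical counter whose value runs from 0 (most reverse) to `max_count` (most forward).

```python
def update_bias_counter(count, samples, max_count):
    """Advance one cell's bias counter by one step.

    samples: non-empty list of (fbb, rbb) signal pairs, one per sample point.
    """
    inc = any(fbb for fbb, _ in samples)  # OR: at least one sample point is slow
    dec = all(rbb for _, rbb in samples)  # AND: every sample point is fast
    if inc:
        return min(count + 1, max_count)
    if dec:
        return max(count - 1, 0)
    return count

# One slow sample point is enough to push the whole cell toward FBB
print(update_bias_counter(5, [(False, True), (True, False)], 10))  # 6
```

Favoring INC over DEC keeps the slowest path in the cell within the timing target, at the cost of some extra leakage in the faster parts of the cell.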

SLIDE 24

Applications of D-FGBB

Operating environment | S-FGBB                       | D-FGBB
Normal                | Improve chip operating point | Save leakage power
High Performance      | Improve chip operating point | Increase average frequency
Low Power             | Save leakage power           | Save leakage power

SLIDE 26

Improving a Chip’s Operating Point

[Figure: (a) NoBB chips, frequency (0.812 to 1.037) versus leakage (0.5 to 3.0). Marked: acceptable region, leakage power limit, high power, low frequency]

SLIDE 27

Improving a Chip’s Operating Point

  • Post-manufacturing calibration phase:
      1. Bring chip to Tcal
      2. Set target frequency Fcal0, and run at full load
      3. BB is adjusted automatically
      4. Measure total power Pcal: if Pcal < Ptarget, raise the frequency (Fcal1 = Fcal0++), else lower it (Fcal1 = Fcal0--)
      5. Repeat if needed, until Pcal ≈ Ptarget
  • Fcali becomes the chip’s frequency
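
The calibration steps above amount to a simple feedback loop on frequency. The sketch below shows that loop under stated assumptions: `measure_power` stands in for running the chip at Tcal and full load after the body bias has settled (steps 1 to 3), and the linear power model, the MHz units, the step size and the tolerance are all hypothetical values chosen for the example.

```python
def calibrate_frequency(measure_power, f_start, p_target, f_step, tolerance,
                        max_iters=100):
    """Step the frequency up or down until measured power is close to the target.

    measure_power: callable(freq) -> total power at Tcal, full load, BB settled.
    Returns the last frequency tried (Fcal_i in the slide's notation).
    """
    freq = f_start
    for _ in range(max_iters):
        power = measure_power(freq)
        if abs(power - p_target) <= tolerance:
            break  # Pcal is approximately Ptarget
        freq = freq + f_step if power < p_target else freq - f_step
    return freq

def toy_power(freq_mhz):
    """Hypothetical chip: total power in watts grows linearly with frequency."""
    return freq_mhz / 50

f_cal = calibrate_frequency(toy_power, f_start=3000, p_target=80.0,
                            f_step=100, tolerance=1.0)
print(f_cal)  # 4000
```

The iteration cap guards against a step size so coarse that the loop would oscillate around the target without ever landing inside the tolerance.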

SLIDE 28

D-FGBB Adapts to Changes in T

  • Calibration temperature Tcal is conservative
  • Average T much lower:

[Figure: Tmax and Tavg (20 to 100 C) for each functional unit (IntQ, IntReg, LdStQ, IntExec, IntMap, DTB, ITB, FPQ, FPReg, FPMap, Bpred, FPAdd, FPMul, Dcache, Icache, L2Cache), with Tcal marked]

SLIDE 29

D-FGBB Saves Leakage Power

  • S-FGBB finds and sets Fcal
  • D-FGBB adjusts dynamically to T changes to save power while running at Fcal

[Figure: frequency versus leakage. Marked: Fcal, leakage limit at Tcal, original chip, S-FGBB at Tcal, S-FGBB at Tavg, D-FGBB at Tavg]

SLIDE 31

D-FGBB Improves Performance

  • Average power Pavg < Pmax
  • D-FGBB is used to push the chip to Favg > Fcal, as long as P < Pmax

[Figure: frequency versus leakage. Marked: Fcal, leakage limit at Tcal, original chip, S-FGBB at Tcal, S-FGBB at Tavg, D-FGBB at Tavg]

SLIDE 33

D-FGBB Saves Leakage Power

  • The chip runs at its original Forig
  • D-FGBB adjusts dynamically to T changes to save power while running at Forig

[Figure: frequency versus leakage. Marked: Forig, leakage limit at Tcal, original chip, D-FGBB at Tavg]

SLIDE 34

Outline

  • A Model of Process Variation
  • Dynamic Fine-Grain Body Biasing
  • Evaluation
  • Conclusions

SLIDE 35

Evaluation Infrastructure

  • Statistical package R to generate variation maps for 200 chips
  • SESC, a cycle-accurate microarchitectural simulator: execution time, dynamic power
  • Mix of SPECint and SPECfp benchmarks
  • HotLeakage, SPICE model: leakage power
  • Hotspot: temperature estimation

SLIDE 36

Evaluation Infrastructure

[Figure: tool flow. Blocks: R statistical package (variation model, BB); SESC simulator; CMP floorplan; HotLeakage; BSIM3 SPICE; Hotspot. Signals: dynamic power, leakage power, leakage power under variation, T, F. Output: frequency and power of each chip, under variation]

SLIDE 37

Evaluation Methodology

  • 4-core CMP, based on Alpha 21364
  • 45nm technology, 4GHz
  • Vth variation: σVth/μVth = 0.03-0.12, σsys = σrand
  • Leff variation: σLeff = σVth/2
  • Vdd = 1V, Vth0 = 150mV, Vbb = ±500mV

SLIDE 38

CMP Architecture

[Figure: core floorplan with functional units L2 Cache, DCache, Bpred, FPReg, FPAdd, FPMul, DTB, ITB, LdSTQ, IntExec, IntReg, FPMap, IntMap, IntQ, FPQ, ICache]

SLIDE 39

Body Bias Cells

  • We partition each core into BB cells
  • Shapes and sizes follow functional units

[Figure: BB cell partitions FGBB16, FGBB64 and FGBB144]

SLIDE 40

Variation Impact

[Figure: (a) frequency and (b) leakage power versus Vth σ/μ (0.03 to 0.12), for φ = 0.1, φ = 0.2 and φ = 0.5]

SLIDE 42

S-FGBB Improves the Chip’s Operating Point

[Figure: frequency (0.812 to 1.037) versus leakage (0.5 to 3.0), with panels labeled NoBB, S-FGBB64 and S-FGBB144; percentages shown in the panels: 0%, 11%, 28%, 55%, 36%, 57%]

SLIDE 43

D-FGBB Reduces Leakage

  • Large leakage reduction after binning: 28-42%
  • More BB cells result in higher savings

[Figure: leakage power versus number of BB cells (1, 16, 64, 144) for NoBB, S-FGBB and D-FGBB]

SLIDE 45

D-FGBB Improves Frequency

  • Average frequency improvement 7-9% over S-FGBB and 7-16% over NoBB
  • More BB cells result in higher increase

[Figure: frequency versus number of BB cells (1, 16, 64, 144) for NoBB, S-FGBB and D-FGBB; caption fragment: "Figure 15. Average frequency of the chips for different"]

SLIDE 46

Power Cost

  • Significant power cost, but still within the power budget

[Figure: total power versus number of BB cells (1, 16, 64, 144) for NoBB, S-FGBB and D-FGBB, with Pmax marked]

SLIDE 48

D-FGBB Reduces Leakage

  • Large leakage reduction at constant frequency: 10-51% vs. S-FGBB and 12-69% vs. NoBB
  • More BB cells result in higher savings

[Figure: leakage power versus number of BB cells (1, 16, 64, 144) for NoBB, S-FGBB and D-FGBB]

SLIDE 49

Combining D-FGBB with DVFS

  • D-FGBB targets leakage power
  • DVFS targets mostly dynamic power
  • Can they be combined effectively?

SLIDE 50

Combining D-FGBB with DVFS

  • D-FGBB scales well with DVFS
  • S-FGBB does not scale unless calibrated at multiple voltages

[Figure: leakage power and total power versus Vdd (1 V, 0.8 V, 0.6 V) for NoBB, D-FGBB1, D-FGBB16, D-FGBB64 and D-FGBB144]

SLIDE 51

Conclusions

  • D-FGBB is an effective and versatile tool to address parameter variation
  • We show three scenarios:
      • Normal: 28-42% leakage savings vs. S-FGBB
      • High performance: 7-9% frequency increase
      • Low power: 10-51% leakage reduction vs. S-FGBB
  • Combines well with DVFS

SLIDE 52

More in our MICRO 2007 paper

  • More details on the variation model
  • A solution for combining D-FGBB with DVS
  • Estimated overheads of D-FGBB
  • More implementation details

http://iacoma.cs.uiuc.edu

Thank you! Questions?