
SLIDE 1

Trade-off Analysis between Timing Error Rate and Power Dissipation for Adaptive Speed Control with Timing Error Prediction

Hiroshi Fuketa† , Masanori Hashimoto†, Yukio Mitsuyama†, and Takao Onoye†

Dept. Information Systems Engineering, Osaka University

†JST, CREST

SLIDE 2

Background

Circuit speed is becoming more sensitive to:

manufacturing variability

operating environment (supply voltage, temperature, etc.)

aging (NBTI, HCI, etc)

Timing margin of a chip varies chip by chip.

“Worst case design” is inefficient for large variation. Run-time adaptive speed control is promising.

[Figure: speeds of Chip A, Chip B, and Chip C relative to the target speed, with arrows labeled "Slow down" and "Speed up".]

SLIDE 5

Adaptive speed control

Adaptive speed control with timing error prediction

[Figure: canary FF [1] with delay buffer, warning signal, and speed control unit.]

Because of the delay buffer, a timing error occurs at the canary FF before the main FF captures a wrong value.

[1] T.Sato, et al., “A Simple Flip-Flop Circuit for Typical-Case Designs for DFM,” in Proc. ISQED, 2007

During a monitoring period, if the warning signal is:

detected → speed up

not detected → slow down

[Figure labels: canary FF, comparator]

SLIDE 6

Problem of adaptive speed control

A timing error cannot be completely eliminated.

If the path activation probability is extremely low, a warning signal may not occur during the monitoring period.

The circuit is slowed down excessively.

Timing errors could occur before a warning signal emerges.

When the occurrence of timing errors is extremely rare, some systems could accept the errors.

Need to estimate the occurrence of timing errors systematically and quantitatively.

SLIDE 7

Timing error rate and power dissipation

How to improve timing error rate?

Insert larger buffer delay

Timing margin of canary FF is much severer than that of main FF → sped up more than required → increase in power dissipation

Change inserted location

Lengthen the monitoring periods

[Figure: chain of full adders (FA)]

e.g., 32b ripple carry adder (* FA = Full Adder)

Path activation probability: $\left(\frac{1}{2}\right)^{32} \approx \frac{1}{10^{9}}$

Trade-off relations between timing error rate and power
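As a rough illustration of why a rarely activated path is a problem, the sketch below evaluates the activation probability of the 32b ripple carry adder example and the chance that such a path is never activated within one monitoring period. It assumes activations are independent from cycle to cycle, and the monitoring length `n_mon` is an illustrative value; neither assumption is stated on the slide.

```python
import math

# Per-cycle activation probability of the longest carry path in the example.
p_act = 0.5 ** 32

def prob_no_activation(p: float, n_cycles: int) -> float:
    """Probability that a path with per-cycle activation probability p is
    never activated in n_cycles cycles (assumes independent cycles)."""
    # (1 - p) ** n_cycles, computed via log1p for accuracy when p is tiny.
    return math.exp(n_cycles * math.log1p(-p))

n_mon = 10 ** 8  # illustrative monitoring period in cycles (assumption)
print(f"p_act = {p_act:.3e}")
print(f"P(no activation in {n_mon} cycles) = {prob_no_activation(p_act, n_mon):.4f}")
```

With these numbers the path stays silent for a whole monitoring period with a probability of roughly 0.98, so a canary FF placed on it would almost never raise a warning.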

SLIDE 8

Contributions

Propose a framework that systematically evaluates power dissipation and occurrence of timing errors.

Explore the design space of the adaptive speed control with canary FF

Examine the relationship between the timing error rate and the power dissipation

SLIDE 9

Assumed system

Only one canary FF is inserted.

Circuit speed is controlled digitally (“speed level”).

[Figure: canary FF]

Focus on a path activation probability

Goal: Reveal the relationship between power dissipation and timing error rate of this system

Goal: Find the optimum design parameters satisfying the required power dissipation and timing error rate

Where should canary FF be inserted?

How large should buffer delay be set?

How long should monitoring period be set?

SLIDE 11

Path activation probability

Path activation probability depends on:

circuit structure

speed level

operating condition (e.g., temperature)

$P_i(t)$: Probability that at least one of the paths terminating at the $i$-th FF whose delays are larger than $t$ is activated.

$P_{all}(t)$: Probability that at least one path in the circuit whose delay is larger than $t$ is activated.

Functions of speed level $l$ and condition $X$: $P_i(t, l, X)$ and $P_{all}(t, l, X)$

SLIDE 12

Framework overview

Given parameters ($P_i$, $P_{all}$, $P_w$):

Path activation probability at each speed level and operating condition
Power dissipation at each speed level and operating condition

Design parameters:

Inserted location of canary FF
Delay time of the delay buffer
Monitoring period

Evaluation flow:

given parameters → warning probability $P_w$, error probability $P_{err}$ → transition probability $P_d$ → state probability $\pi$ → expected power $P_{w,avg}$, timing error rate $N_{err}$

Results:

Expected power dissipation of the system
Timing error rate of the system

SLIDE 13

Warning and error probability

Let $P_w(l, X)$ be the occurrence probability of a warning signal at speed level $l$ and condition $X$ in a cycle:

$$P_w(l, X) = P_i(T_c - D_d, l, X) - P_i(T_c, l, X)$$

Canary FF is inserted at the $i$-th FF

$D_d$ is the buffer delay in the canary FF
$T_c$ is the clock cycle

Let $P_{err}(l, X)$ be the occurrence probability of a timing error at speed level $l$ and condition $X$ in a cycle:

$$P_{err}(l, X) = P_{all}(T_c, l, X)$$
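The two per-cycle probabilities can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the speed level and condition are held fixed, and `make_activation` with its exponential shape and constants is a hypothetical stand-in for the fitted path activation probabilities.

```python
import math
from typing import Callable

def make_activation(scale: float, tau_ns: float) -> Callable[[float], float]:
    """Hypothetical P(t): probability that a path with delay larger than t [ns]
    is activated. The exponential form is an assumption for illustration."""
    def p(t_ns: float) -> float:
        return min(1.0, scale * math.exp(-t_ns / tau_ns))
    return p

def warning_probability(p_i: Callable[[float], float], t_c: float, d_d: float) -> float:
    """P_w = P_i(T_c - D_d) - P_i(T_c): the data is too late for the canary FF,
    which sees the extra buffer delay D_d, but still in time for the main FF."""
    return p_i(t_c - d_d) - p_i(t_c)

def error_probability(p_all: Callable[[float], float], t_c: float) -> float:
    """P_err = P_all(T_c): some activated path is longer than the clock cycle."""
    return p_all(t_c)

p_i = make_activation(1e-3, 5.0)    # paths ending at the i-th FF (assumed values)
p_all = make_activation(2e-3, 5.0)  # all paths in the circuit (assumed values)
t_c, d_d = 100.0, 40.0              # clock cycle and buffer delay in ns
print(warning_probability(p_i, t_c, d_d), error_probability(p_all, t_c))
```

A larger `d_d` widens the window between `t_c - d_d` and `t_c`, so warnings become more frequent while the error probability itself is unchanged.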


SLIDE 14

Speed level transition

Speed level: how fast or slow the circuit is controlled

A higher speed level means the circuit is controlled faster.

Once a warning signal is detected during the monitoring period, the speed level is incremented by 1.

Speed level transition satisfies the Markov property.

The next speed level is determined by the present speed level and by the detection of the warning signal.

[Figure: state transition diagram over speed levels $l-1$, $l$, and $l+1$; "Warning" arrows point to the next higher level and "No warnings" arrows point to the next lower level.]

SLIDE 15

Speed level transition probability

Probability that speed level transits

Let $P_d(l, X)$ be the probability that at least one warning signal is detected during the monitoring period $N_{mon}$ at speed level $l$ and condition $X$:

$$P_d(l, X) = 1 - \left(1 - P_w(l, X)\right)^{N_{mon}}$$

$P_w$: warning probability

$1 - P_w$: Probability that warning is not detected in a cycle
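Evaluating this expression naively loses precision when the warning probability is tiny and the monitoring period is very long. The sketch below is one way to compute it with `math.log1p` and `math.expm1`; the function name and the sample values are illustrative assumptions.

```python
import math

def transition_probability(p_w: float, n_mon: int) -> float:
    """P_d = 1 - (1 - P_w) ** N_mon: probability that at least one warning
    is detected during a monitoring period of N_mon cycles."""
    if p_w <= 0.0:
        return 0.0
    if p_w >= 1.0:
        return 1.0
    # log1p/expm1 keep accuracy for very small p_w and very large n_mon.
    return -math.expm1(n_mon * math.log1p(-p_w))

n_mon = 10 ** 8  # illustrative monitoring period in cycles
for p_w in (1e-12, 1e-9, 1e-7):
    print(f"P_w = {p_w:.0e} -> P_d = {transition_probability(p_w, n_mon):.6f}")
```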

SLIDE 16

Transition Matrix

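Transition matrix of the Markov chain: $P$ (rows and columns ordered $l_{max}, l_{max}-1, \ldots, l_{min}$)

$$
P = \begin{pmatrix}
P_d(l_{max}) & 1 - P_d(l_{max}) & 0 & \cdots & 0 \\
P_d(l_{max}-1) & 0 & 1 - P_d(l_{max}-1) & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & P_d(l_{min}+1) & 0 & 1 - P_d(l_{min}+1) \\
0 & \cdots & 0 & P_d(l_{min}) & 1 - P_d(l_{min})
\end{pmatrix}
$$

* $l_{max}$ and $l_{min}$ are maximum and minimum speed level

[Figure: state transition diagram over speed levels $l+1$, $l$, and $l-1$; the "Warning" arrows carry $P_d(l-1)$ and $P_d(l)$, and the "No warnings" arrows carry $1 - P_d(l)$ and $1 - P_d(l+1)$.]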
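The matrix can be assembled mechanically from the per-level detection probabilities. The sketch below assumes the state ordering $l_{max}, l_{max}-1, \ldots, l_{min}$ and that the speed level saturates at both ends; the function name and the three sample probabilities are illustrative, not taken from the slides.

```python
from typing import List

def transition_matrix(p_d: List[float]) -> List[List[float]]:
    """Build the transition matrix from per-level detection probabilities.
    p_d[0] belongs to l_max and p_d[-1] to l_min; entry [r][c] is the
    probability of moving from the level of row r to the level of column c."""
    n = len(p_d)
    mat = [[0.0] * n for _ in range(n)]
    for row, p in enumerate(p_d):
        up = max(row - 1, 0)        # warning: one level faster, saturating at l_max
        down = min(row + 1, n - 1)  # no warning: one level slower, saturating at l_min
        mat[row][up] += p
        mat[row][down] += 1.0 - p
    return mat

# Three speed levels; warnings are rare at the fastest level (assumed values).
for row in transition_matrix([0.1, 0.5, 0.9]):
    print(row)
```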

SLIDE 17

State probability

Let $\pi(n)$ be a state probability vector in the $n$-th time step:

$$\pi(n+1) = \pi(n) \cdot P$$

$P$: transition matrix

Let $\pi_l(X)$ be a steady state probability of being at speed level $l$ under condition $X$:

$$\pi(n \to \infty) = \begin{pmatrix} \pi_{l_{max}}(X) & \pi_{l_{max}-1}(X) & \cdots & \pi_{l_{min}}(X) \end{pmatrix}$$

* $l_{max}$ and $l_{min}$ are maximum and minimum speed level
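Because the speed level only moves to a neighbouring level, the steady state can be obtained without iterating the update. The sketch below uses the balance of probability flow between adjacent levels, which is a standard property of such chains rather than something stated on the slide. It assumes every detection probability lies strictly between 0 and 1, and the sample values are illustrative.

```python
from typing import List

def steady_state(p_d: List[float]) -> List[float]:
    """Stationary distribution ordered (l_max, ..., l_min).
    Balance between neighbours: pi[k] * (1 - p_d[k]) == pi[k + 1] * p_d[k + 1].
    Assumes 0 < p_d[k] < 1 for every level."""
    weights = [1.0]
    for k in range(len(p_d) - 1):
        weights.append(weights[-1] * (1.0 - p_d[k]) / p_d[k + 1])
    total = sum(weights)
    return [w / total for w in weights]

def step(pi: List[float], p_d: List[float]) -> List[float]:
    """One update pi(n+1) = pi(n) . P, used here only to check the result."""
    n = len(p_d)
    nxt = [0.0] * n
    for k, (mass, p) in enumerate(zip(pi, p_d)):
        nxt[max(k - 1, 0)] += mass * p              # warning detected: speed up
        nxt[min(k + 1, n - 1)] += mass * (1.0 - p)  # no warning: slow down
    return nxt

p_d = [0.1, 0.5, 0.9]  # l_max first; assumed values
pi = steady_state(p_d)
print(pi)
print(step(pi, p_d))  # matches pi up to rounding
```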

SLIDE 18

Average cycle of a single stay

State probability $\pi_l(X)$ is not suitable to evaluate the power dissipation and the timing error rate.

$\pi_l(X)$ is not directly related to actual time.

Speed level is changed immediately once a warning signal is observed.

Periods (# cycles) of being at a certain speed level are not always the same.

Need “Time”-based state probability

$N_{rem}(l)$: the average cycle of a single stay at speed level $l$

$$N_{rem}(l) = 1 \cdot P_w + 2 \cdot (1 - P_w) P_w + \cdots + N_{mon} (1 - P_w)^{N_{mon}-1} P_w + N_{mon} (1 - P_w)^{N_{mon}}$$

$P_w$: warning probability
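The series for the average stay is expensive to sum term by term when the monitoring period is long. The sketch below evaluates it directly and also through the closed form $(1 - (1 - P_w)^{N_{mon}}) / P_w$, which follows from summing the geometric series; that simplification is derived here as a check and does not appear on the slide. Function names are illustrative.

```python
import math

def n_rem_sum(p_w: float, n_mon: int) -> float:
    """Direct evaluation: sum over k = 1..N_mon of k * (1 - P_w)**(k - 1) * P_w
    (warning seen in cycle k), plus N_mon * (1 - P_w)**N_mon (no warning)."""
    q = 1.0 - p_w
    total = sum(k * q ** (k - 1) * p_w for k in range(1, n_mon + 1))
    return total + n_mon * q ** n_mon

def n_rem_closed(p_w: float, n_mon: int) -> float:
    """Closed form (1 - (1 - P_w)**N_mon) / P_w of the same series."""
    if p_w <= 0.0:
        return float(n_mon)  # never a warning: always stays the full period
    if p_w >= 1.0:
        return 1.0           # warning in the first cycle every time
    return -math.expm1(n_mon * math.log1p(-p_w)) / p_w

print(n_rem_sum(0.01, 500), n_rem_closed(0.01, 500))
print(n_rem_closed(1e-9, 10 ** 8))  # cheap even for a long monitoring period
```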

SLIDE 19

Conversion to time based state probability

[Figure: an example sequence of speed levels between $l_{min}$ and $l_{max}$. The state probabilities $\pi_{l_{max}}, \pi_{l_{max}-1}, \ldots, \pi_{l_{min}}$ are weighted by the average stays $N_{rem}(l_{max}), N_{rem}(l_{max}-1), \ldots, N_{rem}(l_{min})$ to obtain the time-based probabilities $P_{time}(l_{max}), P_{time}(l_{max}-1), \ldots, P_{time}(l_{min})$.]

Time-based probability:

$$P_{time}(l) = \frac{\pi_l \cdot N_{rem}(l)}{\sum_j \pi_j \cdot N_{rem}(j)}$$
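The conversion is a weighted normalization, as the following minimal sketch shows. The function name and the sample numbers are illustrative assumptions.

```python
from typing import List

def time_based_probability(pi: List[float], n_rem: List[float]) -> List[float]:
    """P_time(l) = pi_l * N_rem(l) / sum_j pi_j * N_rem(j).
    Both lists use the same speed-level ordering."""
    weighted = [p * n for p, n in zip(pi, n_rem)]
    total = sum(weighted)
    return [w / total for w in weighted]

# A level visited rarely but held for a long time can still dominate in time.
print(time_based_probability([0.25, 0.5, 0.25], [4.0, 1.0, 2.0]))
```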

SLIDE 20

Expected power and timing error rate

Expected power dissipation of the system with canary FF:

$$P_{w,avg}(X) = \sum_{l=l_{min}}^{l_{max}} P_w(l, X) \cdot P_{time}(l, X)$$

$P_w(l, X)$ is the power dissipation at speed level $l$ and condition $X$ (given parameter).

Timing error rate $N_{err}(X)$:

Average interval between timing errors

Similarly defined to MTBF (Mean Time Between Failures)

$$N_{err}(X) = \frac{\text{Operating time}}{\text{Number of failures}} = \frac{\sum_l \pi_l \cdot N_{rem}(l)}{\sum_l \pi_l \cdot N_{rem}(l) \cdot P_{err}(l)}$$
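Both figures of merit reduce to weighted sums over the speed levels. The sketch below is a minimal illustration for a single operating condition; the function names and all numeric inputs are assumptions made for the example.

```python
from typing import List

def expected_power(power: List[float], p_time: List[float]) -> float:
    """Expected power: sum over levels of (power at level) * (time-based probability)."""
    return sum(w * t for w, t in zip(power, p_time))

def error_interval(pi: List[float], n_rem: List[float], p_err: List[float]) -> float:
    """N_err = operating time / number of failures, in cycles.
    Operating time: sum_l pi_l * N_rem(l).
    Failures:       sum_l pi_l * N_rem(l) * P_err(l)."""
    operating = sum(p * n for p, n in zip(pi, n_rem))
    failures = sum(p * n * e for p, n, e in zip(pi, n_rem, p_err))
    return float("inf") if failures == 0.0 else operating / failures

# Two speed levels with assumed values; the fast level errs less but burns more power.
print(expected_power([2.0, 1.0], [0.75, 0.25]))
print(error_interval([0.5, 0.5], [10.0, 10.0], [1e-5, 1e-3]))
```

Raising the share of time spent at faster levels increases `expected_power` and lengthens `error_interval`, which is the trade-off the framework quantifies.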

SLIDE 21

Experimental setup

Circuit: 32-bit ripple carry adder (S[0] – S[32])

Supply voltage: 300 mV (subthreshold operation)

Clock period $T_c$: 100 ns (10 MHz)

Focus on adaptive speed control for temperature

Consider a temperature variation from 0 °C to 80 °C

Sweeping temperature from 0 °C to 80 °C by 1 °C, evaluate worst $N_{err}$ and average $P_{w,avg}$

$P_i(t)$, $P_{all}(t)$, and $P_w$ are given as closed-form expressions derived by numerical fitting based on circuit simulations.

Speed control is implemented by body biasing.

SLIDE 23

Experimental results (1/4)

Trade-off between power dissipation and timing error rate

Buffer delay is changed in 5 ns steps at each inserted location.

[Figure: normalized power (1 to 1.5) versus worst $N_{err}$ (error rate) in cycles ($10^{7}$ to $10^{15}$, with time markers from 1 min to 1000 days) for inserted locations S[32], S[16], and S[10].]

Optimum design parameters vary depending on the required error rate

* Monitoring cycle: $N_{mon} = 10^{8}$
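The error-rate axis is given in cycles, and the time markers follow from the 100 ns clock period of the experimental setup. The sketch below does that conversion; the function name is an illustrative assumption.

```python
CLOCK_PERIOD_S = 100e-9  # 100 ns clock period (10 MHz) from the experimental setup
SECONDS_PER_DAY = 86400.0

def cycles_to_days(n_cycles: float) -> float:
    """Convert an error interval in clock cycles to days of operation."""
    return n_cycles * CLOCK_PERIOD_S / SECONDS_PER_DAY

for exponent in (11, 13, 14, 15):
    print(f"1e{exponent} cycles = {cycles_to_days(10.0 ** exponent):.2f} days")
```

For example, $10^{14}$ cycles corresponds to about 116 days of continuous operation, which lines up with the "100 days" marker on the plot.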

SLIDE 24

Experimental results (2/4)

Inserted location of canary FF and power dissipation

Constraint – worst timing error rate $N_{err} > 10^{14}$ cycles

Insert canary FF with minimum buffer delay

[Figure: normalized power (1.24 to 1.4) and delay time of delay buffer [ns] (40 to 90) versus inserted location (S[8], S[16], S[24], S[32]), with the optimum marked.]

* Monitoring cycle: $N_{mon} = 10^{8}$

SLIDE 25

Experimental results (3/4)

Monitoring cycle and power dissipation

Power can be reduced by lengthening monitoring cycle.

[Figure: normalized power (1.2 to 1.4) versus inserted location (S[8], S[16], S[24], S[32]) for $N_{mon} = 10^{7}$, $10^{8}$, and $10^{9}$.]

Constraint – worst timing error rate $N_{err} > 10^{14}$ cycles

Too large $N_{mon}$ deteriorates adjustment response to temperature change.

SLIDE 26

Experimental results (4/4)

Optimal design

Inserted location of canary FF is freely selected with optimum delay buffer.

[Figure: normalized power (1.2 to 1.4) versus worst $N_{err}$ (error rate) in cycles ($10^{11}$ to $10^{15}$) for the optimal design and for S[32] fixed (critical path). Labeled points: S[20] / Dd0=37ns, S[9] / Dd0=69ns, S[13] / Dd0=57ns.]

Power dissipation can be reduced by optimally selecting the inserted location and buffer delay

* Monitoring cycle: $N_{mon} = 10^{9}$

SLIDE 27

Conclusion

Propose a framework that systematically evaluates power dissipation and timing error rate for self-adaptive circuits with timing error prediction

Experiments using a 32-bit ripple carry adder

Reveal the trade-off between the timing error rate and the power dissipation

Demonstrate that the trade-off depends on design
