A Statistical Framework for Designing On-chip Thermal Sensing Infrastructure

SLIDE 1

A Statistical Framework for Designing On-chip Thermal Sensing Infrastructure

Yufu Zhang, Bing Shi, Ankur Srivastava

University of Maryland, College Park

{yufuzh, bingshi, ankurs}@umd.edu

SLIDE 2

Outline

• Motivation/overview
• Fusion center design
• Sensor design/compression
  ◦ Noisy sensor behavior
  ◦ Exploiting the correlation
• Sensor placement
• Overall flow and interplay
• Results and conclusion

SLIDE 3

Motivation

• Thermal/power stress
  ◦ Heavy task execution
  ◦ Increasing chip density
  ◦ Leakage power
• Dynamic thermal management (DTM)
  ◦ Essentially sacrifices performance for lower temperature
  ◦ Needs accurate runtime thermal information

SLIDE 4

Motivation

• Need sensors to provide accurate runtime thermal input
• On-chip thermal sensors
  ◦ On-chip sensors can sample the thermal state of the chip during runtime

[Figure: a simple ring oscillator-based thermal sensor. Labels: EN (enabled for a fixed period of time t_p), Counter, Sensor Output.]

SLIDE 5

Motivation

• Several problems for a naïve thermal sensing scheme:
  ◦ Sensors cannot go everywhere
  ◦ Sensors are subject to noise
  ◦ Resources are limited
• Our goal: a complete thermal sensing infrastructure that includes:
  ◦ Sensor design/compression
  ◦ Sensor placement
  ◦ Data fusion

SLIDE 6

Overall structure

SLIDE 7

Fusion Center Design

• Central register (finite size M)
  ◦ Could be a single or multiple actual registers
• Fusion algorithm
  ◦ Model the thermal profile as a random vector T
  ◦ Predict T given the sensor observation vector T_S
  ◦ Exploit statistical information (mean, variance, correlation, etc.)
• Bayesian estimation philosophy

Scalar case:

$$E(T \mid T_S) = \mu_T + \rho_{TS}\,\frac{\sigma_T}{\sigma_S}\,(T_S - \mu_S)$$

Vector case:

$$E(T \mid T_S) = \mu_T + \Sigma_{TS}\,\Sigma_{SS}^{-1}\,(T_S - \mu_S)$$

[Figure: histogram of temperature samples. x-axis: Temperature (degrees Celsius); y-axis: Number of samples.]
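The vector-case estimator can be sketched in a few lines of NumPy. This is only an illustration: the three-location chip, its mean vector, its covariance matrix, and the function name `fuse` are hypothetical values chosen for the example, not data from the presentation.

```python
import numpy as np

def fuse(mu_T, mu_S, Sigma_TS, Sigma_SS, t_S):
    """Conditional mean E(T | T_S) = mu_T + Sigma_TS Sigma_SS^{-1} (t_S - mu_S)."""
    # Solve Sigma_SS x = (t_S - mu_S) rather than forming the inverse explicitly.
    return mu_T + Sigma_TS @ np.linalg.solve(Sigma_SS, t_S - mu_S)

# Hypothetical chip with 3 locations and one sensor at location 0.
mu = np.array([90.0, 95.0, 100.0])          # mean temperatures (deg C), assumed
Sigma = np.array([[16.0, 12.0,  8.0],
                  [12.0, 25.0, 15.0],
                  [ 8.0, 15.0, 36.0]])      # covariance of the thermal profile, assumed
S = [0]                                     # indices of sensor locations

t_S = np.array([94.0])                      # sensor reads 4 degrees above its mean
estimate = fuse(mu, mu[S], Sigma[:, S], Sigma[np.ix_(S, S)], t_S)
print(estimate)
```

With a single sensor the vector formula reduces to the scalar case at every location: for location 1, ρ = 12 / (4 · 5) = 0.6 and σ_T / σ_S = 5 / 4, so the estimate moves by 0.6 · 1.25 · 4 = 3 degrees.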

SLIDE 8

Fusion Center Design

• Given sensor input, the variance of T is reduced to:

$$\tilde{\Sigma}_{TT} = E\left[(T - E(T \mid T_S))\,(T - E(T \mid T_S))^{\top}\right] = \Sigma_{TT} - \Sigma_{TS}\,\Sigma_{SS}^{-1}\,\Sigma_{ST}$$

  ◦ Diagonal elements: variances of the thermal estimates
  ◦ Reflects the fundamental uncertainty of our estimation (how far our estimates are from the real temperature)
  ◦ Used to drive sensor placement
• A better metric to drive sensor placement?
  ◦ Sensors are not like cameras
  ◦ Generate the probability of capturing all hotspots
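As a rough numerical illustration of the reduced covariance, the sketch below evaluates Σ_TT − Σ_TS Σ_SS⁻¹ Σ_ST for a hypothetical three-location covariance matrix with one noiseless sensor. The matrix and the helper name `posterior_cov` are assumptions made for this example.

```python
import numpy as np

def posterior_cov(Sigma, S):
    """Covariance of T after observing the locations in S (noiseless sensors assumed)."""
    Sigma_TS = Sigma[:, S]
    Sigma_SS = Sigma[np.ix_(S, S)]
    return Sigma - Sigma_TS @ np.linalg.solve(Sigma_SS, Sigma_TS.T)

# Hypothetical prior covariance of a 3-location thermal profile.
Sigma = np.array([[16.0, 12.0,  8.0],
                  [12.0, 25.0, 15.0],
                  [ 8.0, 15.0, 36.0]])

post = posterior_cov(Sigma, [0])
print(np.diag(post))                      # per-location variance of the estimates
print(np.trace(Sigma), np.trace(post))    # total uncertainty before and after
```

The variance at the sensed location drops to zero because the sensor is modelled as noiseless here; the other diagonal entries shrink according to how strongly each location correlates with the sensor.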

SLIDE 9

Sensor Design

• Noisy sensor behavior (Monte Carlo simulation)
• Sensor readings are also compressed because of the central register size constraint
• Hypothesis testing

$$f = \frac{1}{P} = \frac{1}{N\,(t_{PHL} + t_{PLH})}$$

$$t_{PHL} = \frac{2C}{\mu_n C_{ox}\,(W/L)_n\,(V_{DD} - V_t)} \left[ \frac{V_t}{V_{DD} - V_t} + \frac{1}{2} \ln\!\left( \frac{3V_{DD} - 4V_t}{V_{DD}} \right) \right]$$

$$V_t = V_{t0} + 0.002\,(T_0 - T)$$

$$\mu_{n/p} = \mu_0 \left( \frac{T_0}{T} \right)^{1.5}$$

(Subscript 0 denotes the value at the reference temperature $T_0$.)

[Figure: histograms of sensor frequency (MHz) vs. number of samples at T = 20°C, 40°C, 60°C, 80°C, and 100°C.]

[Figure: ring oscillator-based sensor. Labels: EN (enabled for a fixed period of time t_p), Counter, Sensor Output.]
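A Monte Carlo experiment of this kind could look roughly like the sketch below. Everything numeric is an assumption made for illustration: the supply voltage, threshold voltage, load capacitance, the lumped factor `K0` standing in for μ₀·C_ox·(W/L), the stage count, the enable window, and the Gaussian spread on the threshold voltage used as the noise source. It also assumes t_PLH equals t_PHL and a threshold voltage that falls by 2 mV per degree above the reference temperature.

```python
import numpy as np

VDD, VT0, T0_K = 1.8, 0.4, 293.15   # assumed supply (V), threshold (V), reference temp (K)
C_LOAD, K0 = 10e-15, 2e-4           # assumed load (F) and mu0*Cox*(W/L) at T0 (A/V^2)
N_STAGES, T_P = 11, 1e-6            # assumed number of stages and enable window t_p (s)

def ring_freq(temp_c, vt0=VT0):
    """Ring-oscillator frequency (Hz) at temp_c degrees Celsius."""
    temp_k = np.asarray(temp_c, dtype=float) + 273.15
    vt = vt0 + 0.002 * (T0_K - temp_k)        # threshold voltage shift with temperature
    k = K0 * (T0_K / temp_k) ** 1.5           # mobility degradation with temperature
    ov = VDD - vt
    t_phl = 2 * C_LOAD / (k * ov) * (vt / ov + 0.5 * np.log((3 * VDD - 4 * vt) / VDD))
    return 1.0 / (N_STAGES * 2 * t_phl)       # assumes t_PLH == t_PHL

def sample_counts(temp_c, n=1000, sigma_vt=0.01, seed=0):
    """Counter outputs for n noisy sensor instances at one temperature."""
    rng = np.random.default_rng(seed)
    vt0 = VT0 + sigma_vt * rng.standard_normal(n)   # assumed noise model on V_t0
    return np.floor(ring_freq(temp_c, vt0) * T_P).astype(int)

for temp in (20, 60, 100):
    counts = sample_counts(temp)
    print(temp, counts.mean(), counts.std())
```

With these assumed parameters the mobility term dominates, so the count falls as temperature rises, and the spread of each distribution is what makes neighbouring temperatures hard to tell apart.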

SLIDE 10

Sensor Design

• Target: minimize the expected prediction error:

$$\text{Cost} = E\left(|T_{pred} - T_{real}| \;\middle|\; T_{obs}\right) = \sum_{i=1}^{n} |T_{pred} - H_i| \cdot \mathrm{prob}(T_{real} = H_i \mid T_{obs})$$

Bayes rule:

$$\mathrm{prob}(T_{real} = H_i \mid T_{obs}) = \frac{\mathrm{prob}(T_{obs} \mid T_{real} = H_i) \cdot P_i}{\mathrm{prob}(T_{obs})} = \frac{\mathrm{prob}(T_{obs} \mid T_{real} = H_i) \cdot P_i}{\sum_{j=1}^{n} \mathrm{prob}(T_{obs} \mid T_{real} = H_j) \cdot P_j}$$

• Optimal decision rule:

$$\delta(T_{obs}) = T_{pred} = \arg\min_{T_{pred} = H_1 \ldots H_n} E\left(|T_{pred} - T_{real}| \;\middle|\; T_{obs}\right)$$

• Implement as an encoder at the sensor output
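The decision rule can be illustrated with a small numerical sketch. The hypotheses, priors, likelihood values, and the function name `bayes_decision` below are hypothetical; in practice the likelihoods would come from a sensor noise characterization.

```python
import numpy as np

def bayes_decision(H, prior, likelihood):
    """Pick the hypothesis minimising the expected |T_pred - T_real| given T_obs.

    H          : candidate temperatures H_1..H_n
    prior      : P_i = prob(T_real = H_i)
    likelihood : prob(T_obs | T_real = H_i) for the observed reading
    """
    H = np.asarray(H, dtype=float)
    joint = np.asarray(likelihood, dtype=float) * np.asarray(prior, dtype=float)
    posterior = joint / joint.sum()                       # Bayes rule
    # cost[k] = sum_i |H_k - H_i| * prob(T_real = H_i | T_obs)
    cost = np.abs(H[:, None] - H[None, :]) @ posterior
    return H[np.argmin(cost)], posterior

H = [60.0, 80.0, 100.0]            # hypothetical temperature hypotheses (deg C)
prior = [0.5, 0.3, 0.2]            # hypothetical P_i
likelihood = [0.1, 0.6, 0.3]       # hypothetical prob(T_obs | T_real = H_i)

t_pred, posterior = bayes_decision(H, prior, likelihood)
print(t_pred, posterior)
```

Because the set of possible readings is finite, the mapping from each reading to its best hypothesis can be computed offline and stored as a lookup table, which is one way to read the slide's suggestion of an encoder at the sensor output.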

SLIDE 11

Sensor and fusion center co-design

• How do we compress sensors so that...
  ◦ They fit into the central register
  ◦ Collectively they provide better accuracy (more compressed sensors vs. fewer non-compressed ones)
• Bit allocation problem:
  ◦ Decide how to allocate a total of M bits to n sensors so that the overall expected estimation error is minimum (suppose s_i is the number of bits allocated to sensor i):

$$\text{Minimize } E(\mathrm{error}(s_1, s_2, \ldots, s_n)) \quad \text{subject to} \quad 0 \le s_i \le b_i, \quad \sum_i s_i = M$$
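For a handful of sensors the bit allocation problem can be explored by brute force, as in the sketch below. This is not the optimization method from the paper: the exhaustive search, the function name `allocate_bits`, and in particular the toy error model (each sensor contributing weight · 2^(−2·s_i)) are assumptions chosen only to make the constraint structure concrete.

```python
from itertools import product

def allocate_bits(M, b, expected_error):
    """Search all allocations s with 0 <= s_i <= b_i and sum(s) == M."""
    best_s, best_err = None, float("inf")
    for s in product(*(range(bi + 1) for bi in b)):
        if sum(s) != M:
            continue                      # must use exactly M bits of the register
        err = expected_error(s)
        if err < best_err:
            best_s, best_err = s, err
    return best_s, best_err

# Hypothetical error model: a more important sensor has a larger weight.
weights = (4.0, 1.0, 0.25)

def toy_error(s):
    return sum(w * 2.0 ** (-2 * si) for w, si in zip(weights, s))

s_opt, err = allocate_bits(M=6, b=(4, 4, 4), expected_error=toy_error)
print(s_opt, err)
```

The search space grows exponentially with the number of sensors, so this only serves to check a smarter allocator on small cases.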

SLIDE 12

Sensor Compression

• Target: reduce the overall expected error caused by sensor compression.
  ◦ Different compression schemes lead to different overall error.
  ◦ Can be formulated as an optimization problem (see details in our paper).

$$\text{TotalCost} = E(\mathrm{error}(s_1, s_2, \ldots, s_n)) = E\Big[\sum_{\forall i = 1:\text{grids}} \big| E(T_i \mid T_{s_c}) - E(T_i \mid T_{s_a}) \big|\Big] = E\Big[\sum_{\forall \text{rows}} \big| \Sigma_{TS}\,\Sigma_{SS}^{-1}\,(T_{s_c} - T_{s_a}) \big|\Big]$$

$$E(T_i \mid T_S) = \mu_{T_i} + \left[\Sigma_{TS}\,\Sigma_{SS}^{-1}\,(T_S - \mu_S)\right]_i$$
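One way to get a feel for this cost is to estimate it by simulation. The sketch below reads T_{s_a} as the actual sensor reading and T_{s_c} as its compressed version, which is an interpretation of the subscripts rather than something the slide states. The uniform quantizer, the temperature range, the example statistics, and the function names are likewise assumptions.

```python
import numpy as np

def quantize(x, bits, lo, hi):
    """Uniform quantiser: map x in [lo, hi] to the centre of one of 2**bits bins."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    idx = np.clip(np.floor((x - lo) / step), 0, levels - 1)
    return lo + (idx + 0.5) * step

def compression_cost(Sigma, mu, S, bits, lo=60.0, hi=130.0, n=5000, seed=0):
    """Monte Carlo estimate of E[ sum_rows |Sigma_TS Sigma_SS^{-1} (T_sc - T_sa)| ]."""
    rng = np.random.default_rng(seed)
    Sigma_SS = Sigma[np.ix_(S, S)]
    gain = Sigma[:, S] @ np.linalg.inv(Sigma_SS)            # Sigma_TS Sigma_SS^{-1}
    t_sa = rng.multivariate_normal(mu[S], Sigma_SS, size=n)  # actual readings
    t_sc = np.column_stack(
        [quantize(t_sa[:, j], bits[j], lo, hi) for j in range(len(S))]
    )                                                        # compressed readings
    return np.abs((t_sc - t_sa) @ gain.T).sum(axis=1).mean()

# Hypothetical statistics for a 3-location chip with sensors at locations 0 and 2.
mu = np.array([90.0, 95.0, 100.0])
Sigma = np.array([[16.0, 12.0,  8.0],
                  [12.0, 25.0, 15.0],
                  [ 8.0, 15.0, 36.0]])
S = [0, 2]

coarse = compression_cost(Sigma, mu, S, bits=(2, 2))
fine = compression_cost(Sigma, mu, S, bits=(6, 6))
print(coarse, fine)
```

A function like `compression_cost` could serve as the expected-error term in the bit allocation problem, since it maps a tuple of per-sensor bit widths to a single cost.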

SLIDE 13

Sensor Placement

• Let "S" and "T" represent the set of sensor locations and all chip locations, respectively.
• Problem formulation:

$$\text{choose } S \subset T \text{ with } |S| = n \text{ such that } \operatorname{trace}(\tilde{\Sigma}_{TT}) \text{ is minimized}$$

• As mentioned earlier, $\tilde{\Sigma}_{TT}$ represents the fundamental uncertainty/variance associated with our thermal estimates:

$$\tilde{\Sigma}_{TT} = \Sigma_{TT} - \Sigma_{TS}\,\Sigma_{SS}^{-1}\,\Sigma_{ST}$$
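The placement objective can be tried out with a simple greedy heuristic: add one sensor at a time, each time picking the location that lowers the trace the most. This is a stand-in used for illustration and should not be taken as the algorithm shown in the presentation; the covariance matrix and function names are hypothetical, and sensors are assumed noiseless.

```python
import numpy as np

def posterior_trace(Sigma, S):
    """trace(Sigma_TT - Sigma_TS Sigma_SS^{-1} Sigma_ST) for sensor locations S."""
    S = list(S)
    Sigma_TS = Sigma[:, S]
    reduced = Sigma - Sigma_TS @ np.linalg.solve(Sigma[np.ix_(S, S)], Sigma_TS.T)
    return float(np.trace(reduced))

def greedy_placement(Sigma, n):
    """Greedily choose n sensor locations that reduce the trace the most."""
    chosen = []
    for _ in range(n):
        candidates = [j for j in range(Sigma.shape[0]) if j not in chosen]
        best = min(candidates, key=lambda j: posterior_trace(Sigma, chosen + [j]))
        chosen.append(best)
    return chosen, posterior_trace(Sigma, chosen)

# Hypothetical covariance of a 3-location thermal profile.
Sigma = np.array([[16.0, 12.0,  8.0],
                  [12.0, 25.0, 15.0],
                  [ 8.0, 15.0, 36.0]])

one_sensor, trace_one = greedy_placement(Sigma, 1)
two_sensors, trace_two = greedy_placement(Sigma, 2)
print(one_sensor, trace_one)
print(two_sensors, trace_two)
```

Greedy selection is not guaranteed to find the best subset of size n, but it needs far fewer trace evaluations than checking every subset, and for a tiny example like this one it can be compared against exhaustive search.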

SLIDE 14

Sensor Placement Algorithm

SLIDE 15

Overall flow and interplay

[Flowchart. Boxes: Design spec; Statistical info; Fusion center design (total size of CR = M); Sensor placement; Bit allocation / sensor compression; Evaluate overall E(error). Decision: Too much error? Yes: increase number of sensors. No: done. Annotation: $\operatorname{trace}(\tilde{\Sigma}_{TT})$.]

SLIDE 16

Experimental Results

Fig. 1 Dynamic temperature tracking curves
[Plot: Temperature (degrees Celsius) vs. Time (seconds). Curves: Actual temperature, Our estimates, Range-based estimates.]

SLIDE 17

Experimental Results

Fig. 2 RMS error comparison when increasing the number of sensors

SLIDE 18

Conclusion

• We presented a unified statistical framework for designing a complete thermal sensing infrastructure.
• Significant improvement in thermal sensing accuracy can be achieved with very small overhead.
• Our methodology can trade off complexity for accuracy at will. It also takes into account various design considerations such as sensor noise and area constraints.

SLIDE 19

Thank you!