Online Aging Monitoring and Resilience Hao-Chun Chang, Li-An Huang, - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Selective Sensor Placement for Cost-Effective Online Aging Monitoring and Resilience

Hao-Chun Chang, Li-An Huang, Kai-Chang Wu

Department of Computer Science, National Chiao Tung University

Yu-Guang Chen

Department of Electrical Engineering, National Central University

slide-2
SLIDE 2

Outline

Introduction

  • Background
  • Motivation
  • Contribution

Preliminaries

Proposed Framework

Experimental Results

Conclusion

slide-3
SLIDE 3

Background

⚫ Aging effects are the major challenge for reliability-aware IC/SoC design techniques
⚫ The effects of device aging
  ⚫ Performance degradation
  ⚫ Potential failure

slide-4
SLIDE 4

Motivation

⚫ Traditional design methods adopt a guard band, i.e., extra timing margin
⚫ The Razor flip-flop is a well-known timing-speculation technique
  ⚫ May need to be deployed widely throughout a circuit
⚫ The sensitization rate of critical paths is negligibly small
⚫ Many critical paths share common sub-paths in the forward section

slide-5
SLIDE 5

The Deployment

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3, monitored by Razor flip-flops RFF1, RFF2, RFF3]

slide-6
SLIDE 6

The Deployment

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3; labels shown: RFF1, RFF2, RFF3, and a transition detector at G3 (TDG3)]

slide-7
SLIDE 7

The Deployment

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3; labels shown: RFF1, RFF2, RFF3, and a transition detector at G2 (TDG2)]

slide-8
SLIDE 8

Contribution

⚫ Reduced hardware cost with insignificant performance loss
⚫ Tradeoff between extra hardware cost and performance by adjusting the weights of the cost function

slide-9
SLIDE 9

Outline

Introduction

Preliminaries

  • Razor Flip-Flop
  • Transition Detector

Proposed Framework

Experimental Results

Conclusion

slide-10
SLIDE 10

Razor Flip-Flop (RFF)

⚫ Consists of a regular main flip-flop, an additional shadow latch, and some control logic

slide-11
SLIDE 11

Transition Detector (TD)

⚫ Detects whether a signal transition has happened lately
⚫ Forecasts excessive aging from the forward gates
⚫ Sends a warning signal to stall for one clock cycle

[Figure: TD circuit (D and Q pins) and timing diagrams showing CLK, the monitored signal (D), the detection window, and the warning signal at times T1 and T2]

slide-12
SLIDE 12

TD: Detection Window

⚫ Detect whether the signal transition has happened lately
⚫ Reserve the execution time with worst-case aging for the remaining sub-paths

$DW_g = t + T_c - \varepsilon_w \times \max_p (RD_{p,g})$, ∀p monitored by TDg

⚫ TDg: Transition detector at gate g
⚫ DWg: Detection window on TDg
⚫ Tc: Clock period
⚫ t: Beginning time of the current clock cycle
⚫ εw: (Worst-case) 10-year aging of the long path
⚫ RDp,g: The remaining delay of vulnerable path p beyond gate g

slide-13
SLIDE 13

Outline

Introduction

Preliminaries

Proposed Framework

  • Problem Formulation
  • Exact Set Cover
  • Maximum Satisfiability

Experimental Results

Conclusion

slide-14
SLIDE 14

Problem Formulation

⚫ Deploy TDs to monitor vulnerable paths
⚫ A TD monitors the vulnerable paths passing through it
⚫ It's a covering problem!

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3]

slide-15
SLIDE 15

Exact Set Cover (ESC)

⚫ The universal set U: {1, 2, 3, 4}
⚫ Given a collection C of subsets of U
  ⚫ C: {{1, 3}, {2, 3}, {2, 4}, {1, 2}}
⚫ An exact cover is a sub-collection C* of C
  ⚫ Each element in U is covered by exactly one subset in C*
  ⚫ C*: {{1, 3}, {2, 4}}

[Figure: the universal set U with elements 1, 2, 3, 4]
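The exact-cover definition on this slide can be checked by brute force. The sketch below is only an illustration on the slide's toy instance; the function name `exact_covers` is hypothetical and is not taken from the presentation, which solves the problem with MAX-SAT rather than enumeration.

```python
from itertools import combinations

def exact_covers(universe, collection):
    """Return every sub-collection that covers each element of U exactly once."""
    universe = frozenset(universe)
    subsets = [frozenset(s) for s in collection]
    found = []
    for size in range(1, len(subsets) + 1):
        for combo in combinations(subsets, size):
            # The chosen subsets are pairwise disjoint and cover U exactly when
            # their sizes add up to |U| and their union equals U.
            total = sum(len(s) for s in combo)
            if total == len(universe) and frozenset().union(*combo) == universe:
                found.append(combo)
    return found

U = {1, 2, 3, 4}
C = [{1, 3}, {2, 3}, {2, 4}, {1, 2}]
covers = exact_covers(U, C)
print([sorted(sorted(s) for s in cover) for cover in covers])
```

For this instance the only exact cover found is {{1, 3}, {2, 4}}, the C* given on the slide.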

slide-16
SLIDE 16

ESC: The Universal Set U

⚫ Each vulnerable path is an element of the universal set
  ⚫ U: {P1, P2, P3}

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3]

slide-17
SLIDE 17

ESC: A Collection C of Subsets of U

⚫ Each gate (including FFs and the PI) on a vulnerable path defines a set Sg of the vulnerable paths passing through it; Sg is a subset of the universal set
  ⚫ C: {SPI, SG1, SG2, …, SG8, SFF1, …, SFF3}
  ⚫ SG2: {P1, P2, P3}
  ⚫ SG3: {P1, P2}
  ⚫ SFF3: {P3}

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3]

slide-18
SLIDE 18

ESC: A Sub-collection C* of C

⚫ Each element in U is covered by exactly one subset in C*
  ⚫ C*: {SG1} = {{P1, P2, P3}}
  ⚫ C*: {SG3, SG6} = {{P1, P2}, {P3}}
  ⚫ C*: {SFF1, SFF2, SFF3} = {{P1}, {P2}, {P3}}

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3]

slide-19
SLIDE 19

Maximum Satisfiability

⚫ Find an assignment that makes the maximum number of clauses true
⚫ Weighted MAX-SAT solver
  ⚫ Minimizes the sum of the costs of the unsatisfied clauses
⚫ CNF: Conjunctive Normal Form
  ⚫ For each candidate g, introduce a new variable Vg
  ⚫ Hard clauses I & II
    ⚫ Boolean constraint
  ⚫ Soft clause
    ⚫ Cost function

slide-20
SLIDE 20

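MAX-SAT: Hard Clause I

⚫ If two subsets Sg and Sh intersect, produce a hard clause
⚫ $(\lnot V_g \lor \lnot V_h)$, ∀g, h; g ≠ h and Sg ∩ Sh ≠ ∅

Example (TD deployed at G2): SG2 = {P1, P2, P3} and SG3 = {P1, P2} intersect, so $(\lnot V_{G3} \lor \lnot V_{G2})$ evaluates to T ∨ F

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3 and a transition detector at G2 (TDG2)]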

slide-21
SLIDE 21

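MAX-SAT: Hard Clause II

⚫ For each vulnerable path p, produce a hard clause
⚫ $\bigvee_{g:\, p \in S_g} V_g$

Example (path P2, RFF deployed at FF2): $V_{PI} \lor V_{G1} \lor V_{G2} \lor V_{G3} \lor V_{G5} \lor V_{FF2}$ evaluates to F ∨ F ∨ F ∨ F ∨ F ∨ T

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3 and a Razor flip-flop at FF2 (RFF2)]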
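The two hard-clause rules (one clause per intersecting pair of subsets, one clause per vulnerable path) can be generated mechanically from the subsets Sg. The following is a minimal sketch, assuming a reduced candidate list that contains only subsets quoted on the slides (SG2, SG3, SG6, SFF1, SFF2, SFF3); the integer encoding of Vg and the helper `satisfies` are illustrative choices, not part of the presentation.

```python
from itertools import combinations

# Subsets S_g quoted on the slides. Assumption: the remaining members of C
# (S_PI, S_G1, S_G4, ...) are left out to keep the example small.
S = {
    "G2": {"P1", "P2", "P3"},
    "G3": {"P1", "P2"},
    "G6": {"P3"},
    "FF1": {"P1"},
    "FF2": {"P2"},
    "FF3": {"P3"},
}
var = {g: i + 1 for i, g in enumerate(S)}  # V_g encoded as a positive integer

# Hard clause I: (not V_g or not V_h) for every pair g != h whose subsets intersect.
hard1 = [[-var[g], -var[h]] for g, h in combinations(S, 2) if S[g] & S[h]]

# Hard clause II: for each path p, the disjunction of V_g over all g with p in S_g.
paths = sorted(set().union(*S.values()))
hard2 = [[var[g] for g in S if p in S[g]] for p in paths]

def satisfies(selected):
    """Check a set of selected candidates against both families of hard clauses."""
    true_vars = {var[g] for g in selected}

    def holds(lit):
        # A positive literal holds if its variable is selected, a negative one if not.
        return (abs(lit) in true_vars) == (lit > 0)

    return all(any(holds(lit) for lit in clause) for clause in hard1 + hard2)

print(satisfies({"G3", "G6"}), satisfies({"G2", "G3"}))
```

Together the two families accept exactly the exact covers: {G3, G6} passes, while {G2, G3} violates a clause of type I and {G3} alone violates the type II clause for P3.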

slide-22
SLIDE 22

MAX-SAT: Soft Clause

⚫ For each candidate g with COST(g), produce a soft clause
⚫ $(\lnot V_g)$ with weight COST(g)
⚫ The minimum sum of costs of unsatisfied clauses
  ⚫ The minimum sum of costs of C*

slide-23
SLIDE 23

Definition: Cost

⚫ Cost function

$COST(g) = W_M \times \sum_{p \in S_g} M(g, p) + W_A \times A_g$

⚫ WM: The weight of misprediction rate
⚫ M(g, p): Misprediction rate at gate g on the vulnerable path p
⚫ WA: The weight of area cost
⚫ Ag: Area cost for deployment
  ⚫ TD: 1.5
  ⚫ RFF: 1

slide-24
SLIDE 24

Cost: Sensitization Rate

⚫ Sensitization rate
  ⚫ Probability that the consecutive side inputs from the starting point all take non-controlling values
⚫ SR(g, p): Sensitization rate at gate g on path p
⚫ SR(G2, P1) = SR(G2, P2) = SR(G2, P3) = 0.5*(1-0.7) = 0.15
⚫ SR(G3, P1) = SR(G3, P2) = 0.15*(1-0.6) = 0.06

[Figure: example circuit (PI, G1-G8, FF1-FF3) annotated with the probability of side inputs being 1: 0.4, 0.3, 0.6, 0.7, 0.1]
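The two products on this slide, 0.5*(1-0.7) and 0.15*(1-0.6), suggest that the sensitization rate is accumulated gate by gate along a path. The sketch below assumes exactly that reading: the annotated numbers are probabilities of a side input being 1, the non-controlling value is 0, and the starting value 0.5 is taken as given from the slide. The function name `propagate_sr` is hypothetical.

```python
def propagate_sr(sr_in, side_input_one_prob):
    """SR after a gate: incoming SR times P(side input is non-controlling).

    Assumes the non-controlling value is 0, so the factor is
    1 - P(side input = 1), matching (1 - 0.7) and (1 - 0.6) on the slide.
    """
    return sr_in * (1.0 - side_input_one_prob)

sr_g2 = propagate_sr(0.5, 0.7)    # SR(G2, p) = 0.5 * (1 - 0.7)
sr_g3 = propagate_sr(sr_g2, 0.6)  # SR(G3, p) = 0.15 * (1 - 0.6)
print(round(sr_g2, 3), round(sr_g3, 3))
```

Under these assumptions the values come out as 0.15 at G2 and 0.06 at G3.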

slide-25
SLIDE 25

Cost: Misprediction Rate

⚫ Misprediction rate
  ⚫ Probability that a transition is sensitized from the starting point to gate g but does not propagate to the endpoint
⚫ SR(g, p): Sensitization rate at gate g on path p
⚫ M(g, p): Misprediction rate at gate g on the vulnerable path p

$M(g, p) = SR(g, p) \times \left(1 - SR(FF, p) / SR(g, p)\right)$

⚫ SR(G2, p) = 0.15, ∀p ϵ SG2
⚫ SR(FF1, P1) = 0.06, SR(FF2, P2) = 0.006, SR(FF3, P3) = 0.042
⚫ M(G2, P1): 0.09, M(G2, P2): 0.144, M(G2, P3): 0.108
⚫ ∑∀pϵSG2 M(G2, p): 0.342

[Figure: example circuit (PI, G1-G8, FF1-FF3) with vulnerable paths P1, P2, P3]
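The misprediction numbers for G2 can be reproduced from the sensitization rates. This is a small worked check, not the authors' code; the pairing of 0.006 with FF2 and 0.042 with FF3 is inferred from the M(G2, p) values listed on the slide, and the function name `misprediction` is hypothetical.

```python
def misprediction(sr_g, sr_ff):
    """M(g, p) = SR(g, p) * (1 - SR(FF, p) / SR(g, p))."""
    return sr_g * (1.0 - sr_ff / sr_g)

sr_g2 = 0.15  # SR(G2, p), identical for P1, P2 and P3
# SR at the endpoint flip-flop of each path (pairing inferred, see above)
sr_ff = {"P1": 0.06, "P2": 0.006, "P3": 0.042}

m_g2 = {p: misprediction(sr_g2, s) for p, s in sr_ff.items()}
total = sum(m_g2.values())
print({p: round(m, 3) for p, m in m_g2.items()}, round(total, 3))
```

The formula simplifies to SR(g, p) - SR(FF, p), so the misprediction rate at a path's own endpoint flip-flop is 0, and the three values for G2 sum to 0.342.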

slide-26
SLIDE 26

The Input & Output of MAX-SAT

⚫ Finally, combine the three kinds of clauses: $HC_1 \land HC_2 \land SC$
⚫ The deployment is given by the output of the MAX-SAT solver
  ⚫ Expected: an acceptable solution exists, or no further improvement is possible
  ⚫ Unexpected: adjust WM and WA

slide-27
SLIDE 27

Weight Adjustment: Case 1

⚫ WM:WA = 1:10
  ⚫ More weight on hardware cost
  ⚫ Total cost: 15.342

[Figure: example circuit with a transition detector at G2 (TDG2), cost 15.342]

slide-28
SLIDE 28

Weight Adjustment: Case 2

⚫ WM:WA = 5:1
  ⚫ More weight on performance
  ⚫ Total cost: 1.77+1

[Figure: example circuit with a transition detector at G3 (TDG3) and a Razor flip-flop at FF3 (RFF3)]

slide-29
SLIDE 29

Weight Adjustment: Case 3

⚫ WM:WA = 10:1
  ⚫ More weight on performance
  ⚫ Total cost: 1+1+1

[Figure: example circuit with Razor flip-flops at FF1, FF2, FF3 (RFF1, RFF2, RFF3), each of cost 1]
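The three weight-adjustment cases can be reproduced with a brute-force search over exact covers, using COST(g) = WM × ΣM(g, p) + WA × Ag. This is a sketch under stated assumptions rather than the MAX-SAT flow of the presentation: only the candidates G2, G3, FF1, FF2, FF3 are considered; the misprediction sum 0.054 for G3 is derived from SR(G3, p) = 0.06 and the flip-flop rates 0.06 and 0.006 (it also matches the 1.77 shown for Case 2); and the misprediction at an endpoint flip-flop is 0. The names `CANDIDATES`, `cost`, and `best_deployment` are hypothetical.

```python
from itertools import combinations

AREA = {"TD": 1.5, "RFF": 1.0}  # area costs A_g from the cost definition
# candidate -> (sensor type, covered paths, sum of M(g, p) over covered paths)
CANDIDATES = {
    "G2": ("TD", {"P1", "P2", "P3"}, 0.342),
    "G3": ("TD", {"P1", "P2"}, 0.054),   # derived value, see the note above
    "FF1": ("RFF", {"P1"}, 0.0),
    "FF2": ("RFF", {"P2"}, 0.0),
    "FF3": ("RFF", {"P3"}, 0.0),
}
PATHS = ["P1", "P2", "P3"]

def cost(g, w_m, w_a):
    """COST(g) = W_M * sum of M(g, p) + W_A * A_g."""
    kind, _, m_sum = CANDIDATES[g]
    return w_m * m_sum + w_a * AREA[kind]

def best_deployment(w_m, w_a):
    """Enumerate exact covers of PATHS and return the cheapest (set, total cost)."""
    best = None
    for size in range(1, len(CANDIDATES) + 1):
        for combo in combinations(CANDIDATES, size):
            covered = sorted(p for g in combo for p in CANDIDATES[g][1])
            if covered != PATHS:  # every path must be covered exactly once
                continue
            total = sum(cost(g, w_m, w_a) for g in combo)
            if best is None or total < best[1]:
                best = (set(combo), total)
    return best

for w_m, w_a in [(1, 10), (5, 1), (10, 1)]:
    chosen, total = best_deployment(w_m, w_a)
    print(f"WM:WA = {w_m}:{w_a} ->", sorted(chosen), round(total, 3))
```

With these inputs the search selects a TD at G2 for 1:10 (cost 15.342), a TD at G3 plus an RFF at FF3 for 5:1 (cost 1.77 + 1), and three RFFs for 10:1 (cost 1 + 1 + 1), which agrees with the three cases on the slides.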

slide-30
SLIDE 30

Outline

Introduction

Preliminaries

Proposed Framework

Experimental Results

Conclusion

slide-31
SLIDE 31

Setting

⚫ The benchmarks are chosen from the IWLS'05 and ISCAS'89 suites
⚫ The technology used is the TSMC 65nm GP standard cell series
⚫ Timing reports are provided by ITRI
⚫ Workstation:
  ⚫ Linux
  ⚫ C++
  ⚫ MaxHS 2.9

slide-32
SLIDE 32
slide-33
SLIDE 33

Outline

Introduction

Preliminaries

Proposed Framework

Experimental Results

Conclusion

slide-34
SLIDE 34

Conclusion

⚫ Cost-effective TD deployment for aging resilience with insignificant performance loss
⚫ Tradeoff between extra hardware cost and performance

slide-35
SLIDE 35

Thanks for listening!

The End