Stochastic Analysis of Bubble Razor (Guowei Zhang, Peter A. Beerel)



SLIDE 1

Stochastic Analysis of Bubble Razor

Guowei Zhang
Department of Microelectronics and Nanoelectronics, Tsinghua University, China

Peter A. Beerel
Ming Hsieh Electrical Engineering Department, Univ. of Southern California, USA

SLIDE 2

Why Bubble Razor?

Guowei Zhang, Tsinghua University and Peter A. Beerel, USC

  • Notable delay variations in ICs
    – Process, temperature, and voltage variations
    – Aging effects
  • Traditional synchronous design
    – Too much timing margin
    – Performance and energy loss
  • Existing resilient designs: detect errors and change frequency or voltage
    – Canary circuits [Nakai 2005, Hirair 2012]
      • Delay chains mimic the critical path
    – Razor I, II, and Lite [Ernst 2003, Park 2013, Kim 2013, Tokunaga 2009]
      • In situ error detection and correction
      • Recover via architectural replay

26-Mar-14

SLIDE 3

Why Bubble Razor?

  • Problem: adoption challenge
    – Detection: requires analyzing the inserted error signals
    – Correction: requires implementing replay
  • Bubble Razor [Fojtik et al., ISSCC 2012] [Fojtik et al., JSSC 2013]
    – Architecture independent
    – Latch-based structure
      • Latches don't change together
      • No architectural replay requirement
    – Local correction mechanism
      • Single-stall recovery
      • Suitable for large circuits (no global control requirement)

SLIDE 4

Our focus: Analysis of Bubble Razor

  • Factors affecting performance are unclear
    – Number of stages in the pipeline
    – Delay variance
    – Probability of a timing error
  • Thus, effectiveness and sensitivities have not been quantified
  • Our proposed solution
    – Analyze performance using Markov chains
      • Focus on an N-stage pipeline ring
      • Consider both normal and log-normal delays
    – Propose a simplified model for other pipeline structures

SLIDE 5

Bubble Razor

  • Mechanism
    – 2-stage Bubble Razor ring

SLIDE 6

Quantifying Performance of Bubble Razor

  • Clock Cycle Time (C)
    – Cycle time of the global clock
  • Effective Clock Cycle Time (EC)
    – Average time to process each instruction
    – EC ≥ C because of pipeline stalls
  • Performance of Bubble Razor
    – Assume we can somehow find π(working)
      • Probability of a latch processing an instruction
    – Then, EC = C / π(working)

Examples
  • No timing violations: π(working) = 1, EC = C
  • Timing violation every instruction: π(working) = 1/2, EC = 2C
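
The relation EC = C / π(working) can be checked against the two examples with a few lines of Python. This is a minimal sketch; the function name `effective_cycle_time` and the normalization C = 1.0 are assumptions made for illustration, not part of the slides.

```python
def effective_cycle_time(clock_cycle: float, pi_working: float) -> float:
    """Return EC = C / pi(working), the relation stated on the slide."""
    if not 0.0 < pi_working <= 1.0:
        raise ValueError("pi_working must be in (0, 1]")
    return clock_cycle / pi_working


# The two examples from the slide, with C normalized to 1.0 (an assumption).
print(effective_cycle_time(1.0, 1.0))  # no timing violations: EC equals C
print(effective_cycle_time(1.0, 0.5))  # violation every instruction: EC equals 2C
```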

SLIDE 7

Markov Chain Analysis

  • Circuit state
    – If modeled correctly, the next circuit state depends only on the current state and the probability of error p
    – The system is then a Markov chain
  • Our approach
    – Define the circuit state as the combination of latch states

Description of all latch states
  • Working: is a timing violation detected?
    – No: W
    – Yes: E
  • Stalling: to whom does this latch send bubbles?
    – Neither: N
    – Right neighbor (RN): R
    – Left neighbor (LN): L
    – Both: B

SLIDE 8

Markov Chain Analysis

  • Circuit state transition rule
    – Easily derived from the latch state transition rule
  • Latch state transition rule
  • Leads to the stationary distribution of circuit states, and thus of latch states [ π(working) = π(W) + π(E) ]
    – Expressed as a function of p and N

LN RN Next State W L, B L E L (annihilation) B, R N (annihilation) B, R W, E, N, R R E B N, L W W W with probability of (1-p) E with probability of p
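
How a stationary distribution yields π(working) can be sketched on a much smaller chain. The two-state chain below is hypothetical and is NOT the latch model from the slides: it assumes a working latch stalls for exactly one step with probability p and then resumes. The names `stationary_distribution` and `toy_chain` are illustrative.

```python
def stationary_distribution(T, iterations=200):
    """Approximate pi with pi = pi T for a row-stochastic matrix T.

    Iterates on the lazy chain (I + T) / 2, which has the same stationary
    distribution as T but also converges when T is periodic.
    """
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        stepped = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        pi = [0.5 * (a + b) for a, b in zip(pi, stepped)]
    return pi


def toy_chain(p):
    """Hypothetical 2-state chain (an assumption, not the slide's model).

    State 0 = working, state 1 = stalling. A working latch sees a timing
    error with probability p, stalls for one step, then resumes.
    """
    return [[1.0 - p, p],
            [1.0, 0.0]]


for p in (0.0, 0.1, 1.0):
    pi_working = stationary_distribution(toy_chain(p))[0]
    print(f"p = {p}: pi(working) = {pi_working:.4f}, EC / C = {1.0 / pi_working:.4f}")
```

For this toy chain the stationary probability of working is 1 / (1 + p). The full analysis would instead build T from the latch state transition rule over all circuit states.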

SLIDE 9

Markov Chain Analysis Results

  • Closed-form formulas

[Figure: MC Model curves for N = 1, 2, 3, and 4]

SLIDE 10

Markov Chain: State Explosion Problem

  • Transition Probability Matrix (T)
    – Product of 2 transition matrices
      • Each represents one clock phase
    – Can be reduced
  • The problem
    – Not feasible for large N

Size of T

N   Theoretical size 6^(2N)   Optimized size (impossible states deleted)   Final size (unreachable states deleted)
1   36                        7                                            5
2   1,296                     45                                           21
3   46,656                    301                                          95
4   1,679,616                 2,017                                        449
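
The theoretical column follows directly from the six latch states. A small sketch, assuming the exponent 2N corresponds to 2N latches in an N-stage ring (the slides give the formula but do not spell this out); `theoretical_state_count` is an illustrative name.

```python
# The six latch states listed in the deck: W, E (working) and N, R, L, B (stalling).
LATCH_STATES = ("W", "E", "N", "R", "L", "B")


def theoretical_state_count(n_stages: int) -> int:
    """Upper bound 6^(2N) on circuit states, before any pruning."""
    return len(LATCH_STATES) ** (2 * n_stages)


for n in range(1, 5):
    print(n, f"{theoretical_state_count(n):,}")
```

The optimized and final sizes in the table depend on which states are impossible or unreachable under the transition rule, so they are not reproduced here.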

SLIDE 11

Our Solution: Simplified Analysis

  • Stalling of a latch depends on all other latches
    – But we can assume they are independent
    – 2N independent possible sources of a stall
  • Thus
    – Probability(Stalling) = 1 - (1 - p)^(2N)
    – EC = C + C * Probability(Stalling) = C [ 2 - (1 - p)^(2N) ]
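
The simplified model is easy to evaluate numerically. A minimal sketch of the two formulas above; the function names and the sample values of p and N are chosen for illustration only.

```python
def stall_probability(p: float, n_stages: int) -> float:
    """Probability(Stalling) = 1 - (1 - p)^(2N), assuming independent sources."""
    return 1.0 - (1.0 - p) ** (2 * n_stages)


def simplified_ec(clock_cycle: float, p: float, n_stages: int) -> float:
    """EC = C + C * Probability(Stalling) = C * (2 - (1 - p)^(2N))."""
    return clock_cycle * (1.0 + stall_probability(p, n_stages))


# Sample sweep with C normalized to 1.0 (illustrative values).
for n in (1, 4):
    for p in (0.0, 0.05, 0.25, 1.0):
        print(f"N = {n}, p = {p}: EC / C = {simplified_ec(1.0, p, n):.4f}")
```

The endpoints behave as expected: p = 0 gives EC = C and p = 1 gives EC = 2C for any N.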

SLIDE 12

Markov Chain versus Simplified Models

  • EC as a function of C, N, and p
    – Simplified model is conservative
    – Results for small p are close

[Figure: MC Model and Simplified Model curves for N = 1 and N = 4]

SLIDE 13

Delay Distribution

  • Delay of a pipeline stage (d) is a random variable
    – p = Probability(d > C/2)
    – Two kinds of delay distribution
      • Normal distribution
      • Log-normal distribution [Zhai 2005, Chandrakasan 2005]
    – μ: mean of the delay
    – σ: standard deviation of the delay
  • Final results
    – EC is a function of
      • C: clock cycle time
      • N: number of stages
      • σ/μ: represents delay variance
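
The error probability p = Probability(d > C/2) can be computed for both distributions with the standard library. This is a sketch under one stated assumption: μ and σ are taken as the mean and standard deviation of the delay itself, so for the log-normal case they are converted to the parameters of ln(d). The function names and the sample values (μ = 0.5, σ/μ = 0.4) are illustrative.

```python
import math


def normal_tail(x: float, mean: float, std: float) -> float:
    """P(d > x) for d ~ Normal(mean, std)."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2.0)))


def lognormal_tail(x: float, mean: float, std: float) -> float:
    """P(d > x) for a log-normal d whose own mean and std are given."""
    if x <= 0.0:
        return 1.0
    var_log = math.log(1.0 + (std / mean) ** 2)  # variance of ln(d)
    mu_log = math.log(mean) - 0.5 * var_log      # mean of ln(d)
    return 0.5 * math.erfc((math.log(x) - mu_log) / math.sqrt(2.0 * var_log))


def error_probability(clock_cycle, mean, cv, lognormal=True):
    """p = P(d > C/2), where cv = sigma / mu."""
    if mean <= 0.0 or cv <= 0.0:
        raise ValueError("mean and cv must be positive")
    tail = lognormal_tail if lognormal else normal_tail
    return tail(clock_cycle / 2.0, mean, cv * mean)


for c in (1.0, 1.5, 2.0):
    p_ln = error_probability(c, 0.5, 0.4, lognormal=True)
    p_n = error_probability(c, 0.5, 0.4, lognormal=False)
    print(f"C = {c}: p(log-normal) = {p_ln:.4f}, p(normal) = {p_n:.4f}")
```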

SLIDE 14

Systematic Error Rate

  • How do N, σ/μ, distribution type (normal / log-normal), and model type (Markov Chain / Simplified) influence EC?
    – EC as a function of C, N, and σ/μ
  • Is Bubble Razor better than traditional counterparts?
    – Condition
      • Constrained to the same Systematic Error Rate
        – For example, 0.1%
    – Performance metric
      • EC(Bubble Razor) / EC(Traditional circuit)
      • Set μ = 0.5

SLIDE 15

Performance Analysis Results (1 of 3)

  • Vary pipeline depth N
    – Set σ/μ = 0.4
    – Assume log-normal distribution
  • Results
    – BR always better than sync
    – As N↑, benefits drop slightly (2%)

[Figure: curves for N = 1, 2, 3, and 4]

SLIDE 16

Performance Analysis Results (2 of 3)

  • Vary σ/μ
    – Assume N = 4
    – Assume log-normal distribution
  • Results
    – As σ/μ↑, benefit improves
    – For σ/μ = 0.5, EC is reduced by 40.2%

[Figure: curves for σ/μ = 20%, 30%, 40%, and 50%]

SLIDE 17

Performance Analysis Results (3 of 3)

  • Vary distribution and model
    – Assume N = 4
    – Assume σ/μ = 0.25
  • Results
    – Distribution type (line color) impacts benefits
    – Simplified model tracks well (< 5%)
      • p is usually < 25%

[Figure: Log-Normal and Normal curves, each for the MC Model and the Simplified Model]

SLIDE 18

Optimizing Strategy

  • Point A
    – Constrained by SER
    – Easy to realize
      • Increase f until PoFF (point of first failure)
  • Point B
    – Local minimum Effective Cycle (EC) time

[Figure: curves for σ/μ = 20%, 30%, 40%, and 50%]

          Better than traditional sync     Optimal (for every N)
Point A   σ/μ ≥ 16% (moderate variance)    σ/μ ≥ 31% (high variance)
Point B   σ/μ ≥ 3%                         σ/μ ≤ 28%
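
A local minimum of EC like Point B can be located by sweeping C. The sketch below is an illustration, not the authors' procedure: it assumes the simplified model EC = C [ 2 - (1 - p)^(2N) ], a log-normal delay with μ = 0.5, σ/μ = 20%, and N = 4. It walks C downward from a safe value and stops where EC first starts rising. All function names and sweep bounds are hypothetical.

```python
import math


def lognormal_tail(x: float, mean: float, cv: float) -> float:
    """P(d > x) for a log-normal delay with the given mean and cv = sigma/mu."""
    var_log = math.log(1.0 + cv ** 2)
    mu_log = math.log(mean) - 0.5 * var_log
    return 0.5 * math.erfc((math.log(x) - mu_log) / math.sqrt(2.0 * var_log))


def simplified_ec(c: float, n_stages: int, mean: float, cv: float) -> float:
    """Simplified-model EC with p = P(d > C/2)."""
    p = lognormal_tail(c / 2.0, mean, cv)
    return c * (2.0 - (1.0 - p) ** (2 * n_stages))


def local_min_ec(n_stages, mean, cv, c_hi=4.0, c_lo=0.5, steps=3500):
    """Walk C downward from c_hi; return (C, EC) at the first local minimum.

    Returns None if EC keeps falling all the way down to c_lo.
    """
    prev_c = c_hi
    prev_ec = simplified_ec(c_hi, n_stages, mean, cv)
    for i in range(1, steps + 1):
        c = c_hi - (c_hi - c_lo) * i / steps
        ec = simplified_ec(c, n_stages, mean, cv)
        if ec > prev_ec:
            return prev_c, prev_ec  # EC started rising: previous point is the minimum
        prev_c, prev_ec = c, ec
    return None


result = local_min_ec(n_stages=4, mean=0.5, cv=0.2)
if result is not None:
    print(f"local minimum near C = {result[0]:.3f}, EC = {result[1]:.3f}")
```

The search deliberately looks for a local minimum. In this model EC approaches 2C once every instruction stalls, so an unconstrained search would push C toward zero, which the error-rate constraint on Point A rules out.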

SLIDE 19

Summary and Conclusions

  • Proposed analytical methods for analyzing Bubble Razor
    – Markov Chain and Simplified Model
    – The latter is a conservative and close approximation
  • Bubble Razor indeed has better performance than traditional circuits, especially under high delay variance
  • Setting the clock cycle time as short as possible is often efficient

SLIDE 20

Thank you!

Guowei Zhang
Department of Microelectronics and Nanoelectronics, Tsinghua University, China

Peter A. Beerel
Ming Hsieh Electrical Engineering Department, Univ. of Southern California, USA