SLIDE 1

Integrated Retiming and Simultaneous Vdd/Vth Scaling for Total Power Minimization

Mongkol Ekpanyapong Advisor: Prof. Sung Kyu Lim

School of Electrical and Computer Engineering Georgia Institute of Technology

SLIDE 2

May, 2006

Outline

  • Introduction and Motivation
  • Related Work
  • Methodology
  • Experimental Results
  • Conclusions

SLIDE 3

Introduction

  • Both static and dynamic power are important issues in deep-submicron design.
  • Performance is also an important issue.
  • The objective of this work is to minimize total power consumption while maintaining the target clock period.

SLIDE 4

Retiming Algorithm

Linear Programming

Can easily be modified to handle any linear objective

Bellman-Ford Algorithm

Can handle large circuits

SLIDE 5

Power Minimization

  • Minimize the total number of flip-flops to reduce flip-flop power.
  • Use dual Vdd and dual Vth to minimize static and dynamic power.

SLIDE 6

Outline

  • Introduction and Motivation
  • Related Work
  • Methodology
  • Experimental Results
  • Conclusions

SLIDE 7

Retiming and Voltage Scaling

  • C. E. Leiserson and J. B. Saxe, “Retiming synchronous circuitry,” Algorithmica, 1991.
  • K. Usami and M. Horowitz, “Clustered Voltage Scaling Technique for Low-Power Design,” ISLPED, 1995.
  • N. Chabini and W. Wolf, “Reducing Dynamic Power Consumption in Synchronous Sequential Digital Designs Using Retiming and Supply Voltage Scaling,” TVLSI, 2004.

SLIDE 8

Outline

  • Introduction and Motivation
  • Related Work
  • Methodology
  • Experimental Results
  • Conclusions

SLIDE 9

Power Minimization with Retiming

[Flow diagram: inputs "Circuit Description" and "Target Clock Period"; stages "Retiming" and "Voltage Scaling (LP)"; label "Fixed"]

SLIDE 10

Retiming Formulation

[Figure: three-gate example circuit (nodes 1, 2, 3) with flip-flops; labels r(v), w(e), w_r(e)]

Notation:

  • V is the set of gates and E is the set of edges; v ∈ V and e ∈ E.
  • r(v) is the number of FFs moved from the fanout of node v to the fanin of node v.
  • w(e_{u,v}) is the FF count on edge (u,v).
  • D(u,v) is the maximum delay on path u,v.
  • W(u,v) is the minimum number of FFs on path u,v.

Objective: Minimize the number of flip-flops (FFs).

Constraints:

  • The FF count on every edge must remain non-negative after retiming: r(u) ≤ w(e_{u,v}) + r(v)
  • The number of FFs on critical paths must be greater than zero.

SLIDE 11

Retiming Formulation

[Figure: example circuit with flip-flops, now with four gates (nodes 1, 2, 3, 4); label r(v)]

Notation:

  • V is the set of gates and E is the set of edges; v ∈ V and e ∈ E.
  • r(v) is the number of FFs moved from the fanout of node v to the fanin of node v.
  • w(e_{u,v}) is the FF count on edge (u,v).
  • D(u,v) is the maximum delay on path u,v.
  • W(u,v) is the minimum number of FFs on path u,v.

Objective: Minimize the number of flip-flops (FFs).

Constraints:

  • The FF count on every edge must remain non-negative after retiming: r(u) ≤ w(e_{u,v}) + r(v)
  • The number of FFs on critical paths must be greater than zero.
SLIDE 12

Retiming Formulation

[Figure: gate u driving gate v through flip-flops; annotation: "Only these 2 FFs can move out of u"]

Notation:

  • V is the set of gates and E is the set of edges; v ∈ V and e ∈ E.
  • r(v) is the number of FFs moved from the fanout of node v to the fanin of node v.
  • w(e_{u,v}) is the FF count on edge (u,v).
  • D(u,v) is the maximum delay on path u,v.
  • W(u,v) is the minimum number of FFs on path u,v.

Objective: Minimize the number of flip-flops (FFs).

Constraints:

  • The FF count on every edge must remain non-negative after retiming: r(u) ≤ w(e_{u,v}) + r(v)
  • The number of FFs on critical paths must be greater than zero.

SLIDE 13

Retiming Formulation

[Figure: three-gate example circuit (nodes 1, 2, 3) with flip-flops]

Notation:

  • V is the set of gates and E is the set of edges; v ∈ V and e ∈ E.
  • r(v) is the number of FFs moved from the fanout of node v to the fanin of node v.
  • w(e_{u,v}) is the FF count on edge (u,v).
  • D(u,v) is the maximum delay on path u,v.
  • W(u,v) is the minimum number of FFs on path u,v.

Objective: Minimize the number of flip-flops (FFs).

Constraints:

  • The FF count on every edge must remain non-negative after retiming: r(u) ≤ w(e_{u,v}) + r(v)
  • The number of FFs on critical paths must be greater than zero.

Example with cycle time L = 2:

  • W(1,2) = 0, W(1,3) = 1, W(2,3) = 1
  • D(1,2) = 2, D(1,3) = 3, D(2,3) = 2
  • Only D(1,3) = 3 exceeds L, so path 1,3 is critical: r(1) − r(3) ≤ 0, i.e. r(1) ≤ r(3)
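
The example values above can be reproduced with a short Python sketch. It assumes a circuit that the slide does not spell out: three unit-delay gates with edges 1→2 (no FF) and 2→3 (one FF), which is consistent with the W and D values shown. The general critical-path constraint r(u) − r(v) ≤ W(u,v) − 1 for D(u,v) > L follows the Leiserson and Saxe formulation cited in the related work; the function names are illustrative only.

```python
from itertools import product

INF = float("inf")

def wd_matrices(delay, edges):
    """Return W(u,v) and D(u,v) for every connected pair u != v.

    delay: {gate: delay}; edges: {(u, v): FF count on edge u -> v}.
    Self-pairs (paths around a cycle) are ignored in this sketch.
    """
    nodes = list(delay)
    # Path cost is (FF count, -delay of every gate except the last), so the
    # lexicographic minimum has the fewest FFs and, among those, max delay.
    cost = {(u, v): (INF, 0) for u in nodes for v in nodes}
    for (u, v), w in edges.items():
        cost[u, v] = min(cost[u, v], (w, -delay[u]))
    for k, u, v in product(nodes, repeat=3):  # Floyd-Warshall, k outermost
        a, b = cost[u, k], cost[k, v]
        if a[0] < INF and b[0] < INF:
            cost[u, v] = min(cost[u, v], (a[0] + b[0], a[1] + b[1]))
    W = {p: c[0] for p, c in cost.items() if c[0] < INF and p[0] != p[1]}
    D = {p: delay[p[1]] - cost[p][1] for p in W}
    return W, D

def retiming_constraints(delay, edges, period):
    """Each tuple (u, v, c) means r(u) - r(v) <= c."""
    W, D = wd_matrices(delay, edges)
    edge_cons = [(u, v, w) for (u, v), w in edges.items()]
    path_cons = [(u, v, W[u, v] - 1) for (u, v) in W if D[u, v] > period]
    return edge_cons, path_cons

# Assumed example circuit: unit-delay gates, one FF on edge 2 -> 3.
delay = {1: 1, 2: 1, 3: 1}
edges = {(1, 2): 0, (2, 3): 1}
W, D = wd_matrices(delay, edges)
edge_cons, path_cons = retiming_constraints(delay, edges, period=2)
print(W, D, path_cons)
```

With L = 2 the only critical-path constraint produced is r(1) − r(3) ≤ 0, matching the slide.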

SLIDE 15

Non-critical Gates for Power Minimization

Non-critical gates: what should we do with them? After retiming, we can apply voltage scaling to the non-critical gates to minimize total power consumption.

SLIDE 16

Low-to-High Vdd Conversion

Level Converter (LC) requirement

[Figure: level converter (LC)]

SLIDE 17

Voltage Scaling Formulation

Objective: Minimize gate power + LC power.

Constraints:

  • Each gate has to be assigned to only one voltage state.
  • Arrival time + gate delay of each node ≤ target clock period.
  • A level converter is inserted if a low-Vdd node drives a high-Vdd node.
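
To make the objective and the three constraints concrete, here is a minimal Python sketch that enumerates every assignment for a three-gate chain instead of solving an ILP or LP. Exhaustive search is a stand-in for illustration only; it is not the method used in this work. The per-state delay and power numbers, the level-converter cost, and the target period of 45 are all hypothetical.

```python
from itertools import product

# Hypothetical per-gate (delay, power) for each voltage state, in arbitrary
# integer units chosen only so that the qualitative ordering is plausible.
STATES = {
    "HL": (10, 10),  # high Vdd, low Vth
    "HH": (13, 7),   # high Vdd, high Vth
    "LL": (16, 5),   # low Vdd, low Vth
    "LH": (21, 3),   # low Vdd, high Vth
}
LC_DELAY, LC_POWER = 2, 2  # hypothetical level-converter cost

def evaluate(assignment, period):
    """Return total power of a chain assignment, or None if timing fails."""
    arrival, power, prev = 0, 0, None
    for state in assignment:
        # An LC is needed only where a low-Vdd gate drives a high-Vdd gate.
        if prev is not None and prev[0] == "L" and state[0] == "H":
            arrival += LC_DELAY
            power += LC_POWER
        gate_delay, gate_power = STATES[state]
        if arrival + gate_delay > period:  # arrival time + delay <= period
            return None
        arrival += gate_delay
        power += gate_power
        prev = state
    return power

def best_assignment(num_gates, period):
    """Try every one-state-per-gate assignment and keep the cheapest."""
    best = None
    for assignment in product(STATES, repeat=num_gates):
        power = evaluate(assignment, period)
        if power is not None and (best is None or power < best[0]):
            best = (power, assignment)
    return best

print(best_assignment(num_gates=3, period=45))
```

Enumeration grows as 4^n in the number of gates, which is why a real flow needs an ILP or LP formulation.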

SLIDE 18

Voltage Scaling Formulation

Each gate v takes exactly one of four voltage states:

  • x_{v,4} = 1: Vdd high, Vth low
  • x_{v,3} = 1: Vdd high, Vth high
  • x_{v,2} = 1: Vdd low, Vth low
  • x_{v,1} = 1: Vdd low, Vth high

SLIDE 19

Voltage Scaling Formulation

[Figure: gate u drives gate v]

d(u) = 1, s(u) = 0, s(v) = 1

SLIDE 20

Voltage Scaling Formulation

[Figure: gates u, w, and v]

d(u) = 1, d(w) = 1, s(u) = 0, s(w) = 1, s(v) = 2

SLIDE 21

Voltage Scaling Formulation

[Figure: gate u drives gate v]

d(u) = 1, d(v) = 1, s(u) = 0, s(v) = 1

Cycle time (L) = 2:

  • s(u) + d(u) ≤ 2
  • s(v) + d(v) ≤ 2
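
The arrival-time bookkeeping in these examples can be sketched in a few lines of Python. This reads s as arrival time and d as gate delay, assumes the three-gate example is the chain u→w→v, and assumes d(v) = 1 where the slide does not state it; the function names are illustrative.

```python
from graphlib import TopologicalSorter

def arrival_times(delay, fanin):
    """s(v) = max over fanin u of s(u) + d(u); gates without fanin start at 0."""
    s = {}
    for v in TopologicalSorter(fanin).static_order():
        s[v] = max((s[u] + delay[u] for u in fanin.get(v, ())), default=0)
    return s

def meets_period(delay, fanin, period):
    """Check s(v) + d(v) <= L for every gate."""
    s = arrival_times(delay, fanin)
    return all(s[v] + delay[v] <= period for v in delay)

# Two-gate example: u drives v, both with unit delay, L = 2.
two_gate_delay = {"u": 1, "v": 1}
two_gate_fanin = {"u": [], "v": ["u"]}
print(arrival_times(two_gate_delay, two_gate_fanin))
print(meets_period(two_gate_delay, two_gate_fanin, period=2))

# Three-gate example, assumed to be the chain u -> w -> v.
chain_delay = {"u": 1, "w": 1, "v": 1}
chain_fanin = {"u": [], "w": ["u"], "v": ["w"]}
print(arrival_times(chain_delay, chain_fanin))
```

`graphlib` is in the standard library from Python 3.9 onward; the mapping passed to `TopologicalSorter` lists each gate's predecessors.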

SLIDE 22

Voltage Scaling Formulation

[Figure: a low-Vdd (VL) gate and a high-Vdd (VH) gate with a level converter (LC) between them]

m(e) = 1

SLIDE 23

Convert from ILP to LP

Assume only two states for illustration purposes:

  • x(u) = x_{u,2}, where 0 = low Vdd and 1 = high Vdd
  • VL: x_{u,1} = 1, x_{u,2} = 0
  • VH: x_{u,1} = 0, x_{u,2} = 1

[Figure: example network annotated with fractional LP values for x(u) and m(e): 0.0 0.6 0.3 0.5 1 0.3 0.3 0.4 0.0 0.6 0.0 0.8 0.8 0.5]

SLIDE 24

Gradient Search Algorithm for LC Relaxation

Flowchart:

  • Solve LP
  • Relax LP solution by setting m(e) = 0 if m(e) < mth, otherwise m(e) = 1
  • Solve LP
  • Compute new mth
  • Repeat while |Gain| > Threshold
  • Return

mth = 0.5

[Figure: example network with fractional values 0.0 0.6 0.3 0.5 1 0.3 0.3 0.4 0.0 0.6 0.0 0.8 0.8 0.5; three m(e) values rounded to 1]
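
The rounding step of this flow can be sketched as follows. Only the thresholding rule comes from the slide; the edge names and fractional m(e) values are hypothetical, and the LP re-solve and the rule for computing the next mth are left out because the slides do not specify them.

```python
def round_lc(m, mth):
    """Round fractional LC indicators: 0 below the threshold, otherwise 1."""
    return {edge: 0 if value < mth else 1 for edge, value in m.items()}

# Hypothetical fractional m(e) values from a relaxed LP solution.
m = {"e1": 0.0, "e2": 0.6, "e3": 0.3, "e4": 0.5, "e5": 0.8}

for mth in (0.5, 0.6):
    rounded = round_lc(m, mth)
    print(mth, rounded, "LCs kept:", sum(rounded.values()))
```

Raising mth drops the borderline converter on e4, which is the kind of change the outer loop would evaluate by re-solving the LP and measuring the gain.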

SLIDE 25

Gradient Search Algorithm for LC Relaxation

Flowchart:

  • Solve LP
  • Relax LP solution by setting m(e) = 0 if m(e) < mth, otherwise m(e) = 1
  • Solve LP
  • Compute new mth
  • Repeat while |Gain| > Threshold
  • Return

mth = 0.5

Voltage Assignment Relaxation

[Figure: example network after LC rounding, with values 0.0 1 0.3 0.4 0.0 0.0 0.8 0.8 0.5 and 1 1 1 0.7 0.3 0.3 0.7]

SLIDE 26

Voltage Assignment

Four possible voltage assignments:

  • High Vdd, low Vth node: fastest gate, high dynamic power, high leakage power
  • High Vdd, high Vth node: high dynamic power, low leakage power
  • Low Vdd, low Vth node: low dynamic power, high leakage power
  • Low Vdd, high Vth node: slowest gate, low dynamic power, low leakage power

SLIDE 27

Possible Supply Voltage Assignment

[Figure: two VH/VL assignments of the same gate network with level converters (LC) marked, one labeled "Feasible Solution" and one labeled "Infeasible Solution"]

SLIDE 28

LP Relaxation for Voltage State Assignment

[Figure: gates u and v with a level converter (LC); axis from low Vdd (VL) to high Vdd (VH)]

SLIDE 29

LP Relaxation for Voltage State Assignment

[Figure: gates u and v with fractional value 0.7; axis from low Vdd (VL) to high Vdd (VH)]

SLIDE 30

LP Relaxation for Voltage State Assignment

[Figure: gates u and v with a level converter (LC); axis from low Vdd (VL) to high Vdd (VH)]

SLIDE 31

LP Relaxation for Voltage State Assignment

[Figure: gate v, assigned high Vdd, with slack Slk = 2.2; candidate states: high Vdd / low Vth (Dly = 1) and high Vdd / high Vth (Dly = 2.1)]

SLIDE 32

LP Relaxation for Voltage State Assignment

[Figure: gate v, assigned high Vdd, with slack Slk = 1.5; candidate states: high Vdd / low Vth (Dly = 1) and high Vdd / high Vth (Dly = 2.1)]

SLIDE 33

LP Relaxation for Voltage State Assignment

Assume only two states (VL, VH).

[Figure: example network with values 0.0 1 1 0.3 0.3 0.0 0.0 0.7 0.7 0.3 1 1]

SLIDE 34

Gradient Search Algorithm for LC Relaxation

Flowchart:

  • Solve LP
  • Relax LP solution by setting m(e) = 0 if m(e) < mth, otherwise m(e) = 1
  • Solve LP
  • Compute new mth
  • Repeat while |Gain| > Threshold
  • Return

mth = 0.5

Compute the next mth: mth = 0.6

[Figure: example network with values 0.0 1 0.3 0.0 0.0 and 1 1 1 0.7 0.3 0.3 0.7, alongside the earlier fractional values 0.6 0.3 0.5 0.4 0.6 0.8 0.8 0.5 0.3]

SLIDE 35

Post Refinement

SLIDE 36

Outline

  • Introduction and Motivation
  • Related Work
  • Methodology
  • Experimental Results
  • Conclusions

SLIDE 37

Impact of Retiming on Power

GL = Gate Power + LC Power
GLF = Gate Power + LC Power + FF Power
All power values in uW.

        Retiming + Scaling [Chabini04]    min FF. retiming
        Vdd             Vdd + Vth         Vdd             Vdd + Vth
ckt     GL      GLF     GL      GLF       GL      GLF     GL      GLF
s641    295.9   372.1   170     246.1     316.6   361.8   187.4   232.6
s713    311.1   399.2   181.9   269.9     330.5   375.8   199.4   244.6
s820    381     404.8   299     322.8     400.8   415     296     310.3
s832    384     407.8   307.4   331.2     403.9   418.2   299.9   314.2
s838    383.5   543     247.6   407       436.9   586.9   283.6   433.5
s1196   567.9   758.3   389.3   579.7     538.7   591     381.8   434.1
s1238   567     764.6   404.8   602.3     546.7   599.1   395.5   447.9
s1488   761.8   821.3   568.1   627.6     765     781.7   535.7   552.4
s1494   761.4   835.2   569.8   643.5     765     781.7   536.2   552.8
Ratio   -       1       -       0.76      -       0.93    -       0.66

SLIDE 38

Power Comparison on Different Voltage Scaling Techniques (in uW)

ckt     INIT     CVS [Usami95]   LP       ILP
s641    434.3    374.5           232.6    230.6
s713    458      392.3           244.6    243.3
s820    425.7    412.4           310.3    309.7
s832    428.9    415.6           314.2    312.5
s838    627.3    579.6           433.5    428.6
s1196   646.8    616.2           434.1    434.1
s1238   648.4    619.1           447.9    446.6
s1488   796.1    773.8           552.4    551.4
s1494   795.5    773.2           552.8    550.5
ratio   1        0.94            0.66     0.66
time    28 sec   29 sec          44 sec   1 day

INIT = all nodes at Vdd-H + Vth-L
CVS = Clustered Voltage Scaling
LP = Linear Programming
ILP = Integer Linear Programming

SLIDE 39

Outline

  • Introduction and Motivation
  • Related Work
  • Methodology
  • Experimental Results
  • Conclusions

SLIDE 40

Conclusions

  • Power minimization, covering both static and dynamic power, is an important VLSI design issue.
  • We propose a mathematical model that solves the power optimization problem while maintaining the target clock period.
  • The experimental results show up to 30% power reduction.

SLIDE 41

SLIDE 42

Delay and Power for Voltage Scaling

SLIDE 43

Retiming Algorithm

  • An FF edge has weight = clock period * number of FFs.
  • If the Bellman-Ford algorithm finds a feasible solution, the target clock period is feasible.
  • Binary search is used to identify the smallest feasible clock period (cycle time).

[Figure: example constraint graph with weighted edges; legend: gate and wire delay, flip-flop]
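
A minimal Python sketch of the feasibility check and the binary search is given below. It uses Bellman-Ford on difference constraints of the form r(u) − r(v) ≤ c, in the style of the Leiserson and Saxe retiming constraints, rather than the exact edge-weight construction on this slide. The example W and D values are the ones from the three-gate retiming example; the edge list, the candidate periods, and all function names are assumptions made for illustration.

```python
def solve_difference_constraints(nodes, constraints):
    """Bellman-Ford on constraints (u, v, c) meaning r(u) - r(v) <= c.

    Returns a feasible {node: r} or None if a negative cycle exists.
    """
    r = {n: 0 for n in nodes}        # virtual source reaches every node at 0
    for _ in range(len(nodes)):
        changed = False
        for u, v, c in constraints:  # constraint-graph edge v -> u, weight c
            if r[v] + c < r[u]:
                r[u] = r[v] + c
                changed = True
        if not changed:
            return r
    return None                      # still relaxing: negative cycle

def smallest_feasible_period(candidates, nodes, constraints_for):
    """Binary search over sorted candidates; feasibility is monotone in L."""
    lo, hi, best = 0, len(candidates) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        cons = constraints_for(candidates[mid])
        if solve_difference_constraints(nodes, cons) is not None:
            best, hi = candidates[mid], mid - 1
        else:
            lo = mid + 1
    return best

# W and D from the three-gate example; the edge list is an assumption.
W = {(1, 2): 0, (1, 3): 1, (2, 3): 1}
D = {(1, 2): 2, (1, 3): 3, (2, 3): 2}
edge_ffs = {(1, 2): 0, (2, 3): 1}

def constraints_for(period):
    cons = [(u, v, w) for (u, v), w in edge_ffs.items()]
    cons += [(u, v, W[u, v] - 1) for (u, v) in W if D[u, v] > period]
    return cons

print(solve_difference_constraints([1, 2, 3], constraints_for(2)))
# Candidates start at the largest single-gate delay (1 in this toy circuit).
print(smallest_feasible_period([1, 2, 3], [1, 2, 3], constraints_for))
```

Because the toy circuit has no feedback loop, retiming is free to pipeline it; a circuit with cycles would give a tighter lower bound on the period.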

SLIDE 44

Retiming LP Formulation

[Figure: example graph with nodes 1 to 5 and numeric delay and FF labels (2, 2, 20, 10, 10, 20, 10, 60, 1, 1, 1, 30)]

SLIDE 45

Retiming Formulation

[Figure: three-gate example circuit (nodes 1, 2, 3) with flip-flops; labels r(v), w(e), w_r(e)]

Notation:

  • r(v) is the number of FFs moved from the fanout of node v to the fanin of node v.
  • w(e_{u,v}) is the FF count on edge (u,v).
  • D(u,v) is the maximum delay on path u,v.
  • W(u,v) is the minimum number of FFs on path u,v.

Objective: Minimize the number of flip-flops (FFs).

Constraints:

  • The FF count on every edge must remain non-negative after retiming: r(u) ≤ w(e_{u,v}) + r(v)
  • The number of FFs on critical paths must be greater than zero.

Example with cycle time L = 2:

  • W(1,2) = 0, W(1,3) = 1, W(2,3) = 1
  • D(1,2) = 2, D(1,3) = 3, D(2,3) = 2

SLIDE 46

LC

SLIDE 47

LCFF

SLIDE 48

Power Comparison on Different Voltage Scaling Techniques (in uW)

ckt     INIT     CVS [Usami95]   LP       ILP      MVVS [Srivastava04]
s641    434.3    374.5           232.6    230.6    253.9
s713    458      392.3           244.6    243.3    268.7
s820    425.7    412.4           310.3    309.7    327.7
s832    428.9    415.6           314.2    312.5    331.6
s838    627.3    579.6           433.5    428.6    440.1
s1196   646.8    616.2           434.1    434.1    439.0
s1238   648.4    619.1           447.9    446.6    454.1
s1488   796.1    773.8           552.4    551.4    577.1
s1494   795.5    773.2           552.8    550.5    575.0
ratio   1        0.94            0.66     0.66     0.69
time    28 sec   29 sec          44 sec   1 day    35 sec

INIT = all nodes at Vdd-H + Vth-L
CVS = Clustered Voltage Scaling
MVVS = modified Vdd/Vth and Sizing
LP = Linear Programming
ILP = Integer Linear Programming