Interconnect Power and Delay Optimization by Dynamic Programming


Interconnect Power and Delay Optimization by Dynamic Programming in Gridded Design Rules. Konstantin Moiseev and Avinoam Kolodny (EE Dept., Technion, Israel Institute of Technology); Shmuel Wimer (Eng. School, Bar-Ilan University). ISPD, March 2010.


SLIDE 1

ISPD 2010, March 2010

Interconnect Power and Delay Optimization by Dynamic Programming in Gridded Design Rules

Konstantin Moiseev, Avinoam Kolodny
EE Dept., Technion, Israel Institute of Technology

Shmuel Wimer
Eng. School, Bar-Ilan University

SLIDE 2

Agenda

  • What is the problem?
  • Results for 32nm design
  • It is NP complete
  • Optimal solution by dynamic programming
  • How it works in practice
  • Further research problems

SLIDE 3

Interconnect Power and Delay

[Figure: wire cross-section with two "line-to-line coupling" labels]

  • Line-to-line coupling is the dynamic power killer
  • Delay uses the Elmore model: simple and inaccurate, but with high fidelity
  • AF is the signal's activity factor, 0 <= AF <= 1

SLIDE 4

Interconnect Bus Model

[Figure: bus model with labels A (total bus width), W_i (wire width), S_i and S_{i+1} (spaces), and L (bus length)]

SLIDE 5

Delay and Dynamic Power Minimization

Delay:

$$D_i(s_{i-1}, w_i, s_i) = \alpha_i + \beta_i w_i + \frac{\gamma_i}{w_i} + \left(\delta_i + \frac{\varepsilon_i}{w_i}\right)\left(\frac{1}{s_{i-1}^{\mu}} + \frac{1}{s_i^{\mu}}\right)$$

  • $\alpha_i, \beta_i, \gamma_i, \delta_i, \varepsilon_i$ depend on technology parameters, the driver's resistance, the capacitive load and the bus length $L$.

Dynamic power:

$$P_i(s_{i-1}, w_i, s_i) = \kappa_i w_i + \eta_i\left(\frac{1}{s_{i-1}^{\mu}} + \frac{1}{s_i^{\mu}}\right)$$

  • $\kappa_i, \eta_i$ depend on technology parameters, the signal's activity and the bus length $L$.
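The per-wire expressions can be sketched in Python as follows. This is a minimal illustration, assuming the delay has the form α + βw + γ/w + (δ + ε/w)(1/s_left + 1/s_right) and the power has the form κw + η(1/s_left + 1/s_right). The class name `WireCoeffs`, the helper `coupling`, the spacing exponent `mu` (defaulting to 1) and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WireCoeffs:
    """Per-wire coefficients (field names are illustrative, not from the slides)."""
    alpha: float
    beta: float
    gamma: float
    delta: float
    eps: float
    kappa: float
    eta: float


def coupling(s_left: float, s_right: float, mu: float = 1.0) -> float:
    """Spacing-dependent term shared by delay and power (mu = 1 is an assumption)."""
    return 1.0 / s_left**mu + 1.0 / s_right**mu


def wire_delay(c: WireCoeffs, s_left: float, w: float, s_right: float,
               mu: float = 1.0) -> float:
    """Delay of one wire of width w with spaces s_left and s_right beside it."""
    return (c.alpha + c.beta * w + c.gamma / w
            + (c.delta + c.eps / w) * coupling(s_left, s_right, mu))


def wire_power(c: WireCoeffs, s_left: float, w: float, s_right: float,
               mu: float = 1.0) -> float:
    """Dynamic power of one wire of width w with spaces s_left and s_right."""
    return c.kappa * w + c.eta * coupling(s_left, s_right, mu)


# Made-up coefficients, only to exercise the functions.
c = WireCoeffs(alpha=1.0, beta=0.5, gamma=2.0, delta=0.25, eps=1.0,
               kappa=0.5, eta=2.0)
print(wire_delay(c, 1.0, 2.0, 2.0))  # 4.125
print(wire_power(c, 1.0, 2.0, 2.0))  # 4.0
```

In this sketch, widening a space lowers both delay and power, while widening a wire trades the γ/w and ε/w terms against the βw and κw terms.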

SLIDE 6

Formulation of the Problem

Minimize delay:

$$D_{sum} = \sum_{i=1}^{n} D_i(s_{i-1}, w_i, s_i) \quad \text{or} \quad D_{max} = \max_{1 \le i \le n} D_i(s_{i-1}, w_i, s_i)$$

Minimize power:

$$P = \sum_{i=1}^{n} P_i(s_{i-1}, w_i, s_i)$$

Subject to constrained area:

$$\sum_{i=1}^{n} w_i + \sum_{i=0}^{n} s_i = A$$

In 32nm and 22nm:

$$s_i \in \{S_1, \ldots, S_p\}, \quad w_i \in \{W_1, \ldots, W_q\}$$

  • Discrete optimization: the problem is NP-complete
  • Dynamic programming works
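For a very small bus the formulation can be checked by exhaustive enumeration. The sketch below is not the authors' algorithm: it tries every grid allocation that fills the area exactly and keeps the non-dominated (delay, power) pairs. The function names, the toy per-wire models `delay_fn` and `power_fn`, and the grid values are all assumptions made for illustration.

```python
from itertools import product


def evaluate(ws, ss, delay_fn, power_fn, use_max=True):
    """ws = (w_1..w_n), ss = (s_0..s_n); returns (delay, power) of the bus."""
    delays = [delay_fn(ss[i], ws[i], ss[i + 1]) for i in range(len(ws))]
    powers = [power_fn(ss[i], ws[i], ss[i + 1]) for i in range(len(ws))]
    return (max(delays) if use_max else sum(delays)), sum(powers)


def pareto(points):
    """Keep (D, P) pairs that no other pair matches or beats in both values."""
    front = []
    for d, p in sorted(set(points)):
        if not front or p < front[-1][1]:
            front.append((d, p))
    return front


def brute_force(n, widths, spaces, area, delay_fn, power_fn, use_max=True):
    """Enumerate all allocations with sum(w) + sum(s) == area (exponential in n)."""
    points = []
    for ws in product(widths, repeat=n):
        for ss in product(spaces, repeat=n + 1):
            if sum(ws) + sum(ss) == area:
                points.append(evaluate(ws, ss, delay_fn, power_fn, use_max))
    return pareto(points)


# Toy models (assumptions): delay falls with width, power grows with width,
# and both fall as the neighbouring spaces grow.
delay_fn = lambda sl, w, sr: 4.0 / w + 1.0 / sl + 1.0 / sr
power_fn = lambda sl, w, sr: w + 1.0 / sl + 1.0 / sr

print(brute_force(2, (1, 2), (1, 2), 7, delay_fn, power_fn))
# -> [(4.0, 8.0), (5.5, 4.5)]
```

The two surviving pairs show the trade-off: spending the spare area on width gives the lowest maximal delay, while spending it on space gives the lowest power. The enumeration visits q^n p^(n+1) allocations, which is why it is only usable as a reference for tiny cases.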

SLIDE 7

Power-Delay “Shape Function”

SLIDE 8

Results Obtained for 32nm

  • Implemented in C++ / OpenAccess
  • Ran on 32nm control blocks of an Intel mobile processor
    – Routed by a Synopsys tool
    – Width and space re-allocated in metal 2, 3 and 4
    – Used effective drivers and loads from the netlist
    – Typical block size was 250 µm × 250 µm
  • Both dynamic power and delays are reduced
  • 10%-15% dynamic power reduction
    – Per optimized layer
  • 2%-5% delay reduction

SLIDE 9

MIN_DLYPWR Problem

Question: Is there a setting of widths and spaces such that the delay reduction from base is at least $\delta D$, while the power increase from base is at most $\delta P$?

MIN_DLYPWR is NPC by polynomial reduction of PARTITION, which attempts to answer whether, for a given set $B$ whose elements have sizes $s(b) \in \mathbb{Z}^+$, $\forall b \in B$, there is a subset $B' \subseteq B$ satisfying

$$\sum_{b \in B'} s(b) = \sum_{b \in B - B'} s(b).$$

SLIDE 10

MIN_MAX_DLYPWR Problem

Question: Is there a setting of widths and spaces such that the power decrease from base power is at least $\delta P$, while the maximal delay increases by at most $\delta D$?

MIN_MAX_DLYPWR is NPC by polynomial reduction of SUBSET_SUM, which answers whether, for a set $B$ whose elements have sizes $s(b) \in \mathbb{Z}^+$, $b \in B$, and given a number $N \in \mathbb{Z}^+$, there is $B' \subseteq B$ satisfying

$$\sum_{b \in B'} s(b) = N.$$

SLIDE 11

Dynamic Programming Solution

Minimization of power and delay from j+1 to n is independent of power and delay from 1 to j.

Delay is additive:

$$D(1,n) = \begin{cases} D(1,j) + D(j+1,n) & \text{for the sum of delays} \\ \max\{D(1,j),\, D(j+1,n)\} & \text{for the maximal delay} \end{cases}$$

Power is additive:

$$P(1,n) = P(1,j) + P(j+1,n)$$

Area is additive:

$$A(j,n) = A - \left(\sum_{i \le j} w_i + \sum_{i \le j} s_i\right)$$

This suggests dynamic programming. The algorithm generates only essential (P,D) pairs in progression from wire to wire.

SLIDE 12

Power-Delay Solution Space

[Figure: power-delay solution space with axes D and P; the line D = D0 and the levels Pmin and Pmax are marked, together with a red curve]

Dynamic programming finds the red curve progressively. The optimal solution is derived from the solution space of the last wire.

SLIDE 13

State Definition for Dynamic Programming

$$\left[\, A(j+1,n),\; s_j,\; D\big(A(j+1,n), s_j\big),\; P\big(A(j+1,n), s_j\big) \,\right]$$

  • $A(j+1,n)$: area left for n-(j+1) wires
  • $s_j$: rightmost allocated space
  • $D(A(j+1,n), s_j)$: accumulated delay
  • $P(A(j+1,n), s_j)$: accumulated power

SLIDE 14

State Dominancy and Redundancy

Consider two allocations of the first j wires:

$$\omega' : (w'_1, s'_1, \ldots, w'_j, s'_j)$$

$$\omega'' : (w''_1, s''_1, \ldots, w''_j, s''_j)$$

Allocation $\omega'$ is dominating allocation $\omega''$ if

$$A - \left(\sum_{i \le j} w'_i + \sum_{i \le j} s'_i\right) \ge A - \left(\sum_{i \le j} w''_i + \sum_{i \le j} s''_i\right),$$

$$s'_j \ge s''_j$$

and

$$D(\omega') \le D(\omega'') \wedge P(\omega') \le P(\omega'').$$

The dominated allocation $\omega''$ is redundant.
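The dominance test can be written as a small predicate over DP states. The sketch below assumes a state is the tuple (area left, rightmost space, accumulated delay, accumulated power); the names `State`, `dominates` and `prune` are hypothetical, and the quadratic pruning loop is chosen for clarity rather than speed.

```python
from typing import NamedTuple


class State(NamedTuple):
    area_left: float  # area still available for the remaining wires
    s_last: float     # rightmost allocated space s_j
    delay: float      # accumulated delay
    power: float      # accumulated power


def dominates(a: State, b: State) -> bool:
    """a is at least as good as b in area left, last space, delay and power."""
    return (a.area_left >= b.area_left and a.s_last >= b.s_last
            and a.delay <= b.delay and a.power <= b.power)


def prune(states):
    """Drop duplicates and every state dominated by a different state."""
    unique = list(dict.fromkeys(states))  # preserves first-seen order
    return [s for s in unique
            if not any(dominates(t, s) for t in unique if t != s)]


states = [
    State(10, 2, 5.0, 4.0),
    State(10, 2, 6.0, 4.0),   # same area and space, worse delay: redundant
    State(8, 1, 4.0, 3.0),    # less area left but better delay and power: kept
    State(10, 2, 5.0, 4.0),   # duplicate of the first state
]
print(prune(states))
```

Only the first and third states survive here, since neither is at least as good as the other in all four components.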

SLIDE 15

Stage Progression and State Augmentation

[Figure: the states of stage $\Lambda_j$ (wire j) are augmented with the width and space allocations of the next wire to form stage $\Lambda_{j+1}$ (wire j+1)]
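Stage progression can be sketched for a simple one-dimensional bus as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: states are grouped by the exact key (area used, rightmost space) and only non-dominated (delay, power) pairs are kept per key, which is a weaker pruning than the cross-state dominance rule. The names `dp_bus` and `pareto` and the toy models `delay_fn` and `power_fn` are hypothetical.

```python
from itertools import product


def pareto(points):
    """Keep (D, P) pairs that no other pair matches or beats in both values."""
    front = []
    for d, p in sorted(set(points)):
        if not front or p < front[-1][1]:
            front.append((d, p))
    return front


def dp_bus(n, widths, spaces, area, delay_fn, power_fn, use_max=True):
    """Return the non-dominated (delay, power) pairs for n wires filling `area`."""
    # Stage 0: only the leftmost space s_0 has been allocated.
    stage = {(s0, s0): [(0.0, 0.0)] for s0 in spaces}
    for _ in range(n):
        nxt = {}
        for (used, s_prev), pairs in stage.items():
            # Augment every state with each width/space choice of the next wire.
            for w, s in product(widths, spaces):
                new_used = used + w + s
                if new_used > area:
                    continue  # infeasible: exceeds the bus area
                d = delay_fn(s_prev, w, s)
                p = power_fn(s_prev, w, s)
                bucket = nxt.setdefault((new_used, s), [])
                for acc_d, acc_p in pairs:
                    new_d = max(acc_d, d) if use_max else acc_d + d
                    bucket.append((new_d, acc_p + p))
        # Discard redundant pairs before moving to the next wire.
        stage = {key: pareto(pts) for key, pts in nxt.items()}
    final = [pt for (used, _), pts in stage.items() if used == area
             for pt in pts]
    return pareto(final)


# Toy models (assumptions), with widths and spaces on an integer grid.
delay_fn = lambda sl, w, sr: 4.0 / w + 1.0 / sl + 1.0 / sr
power_fn = lambda sl, w, sr: w + 1.0 / sl + 1.0 / sr

print(dp_bus(2, (1, 2), (1, 2), 7, delay_fn, power_fn))
# -> [(4.0, 8.0), (5.5, 4.5)]
```

Pruning per key is safe because the cost of the remaining wires depends only on the area used and the last space, and both `max` and `+` are monotone in the accumulated values.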

SLIDE 16

Theorem (optimality): Stage $\Lambda_n$ of the DP algorithm contains all the feasible non-redundant, and hence optimal, power-delay pairs that can be obtained by any width and space allocation to n wires.

Theorem: Any power-delay function $f(P, D)$ monotonically increasing in $P$ and $D$ achieves its minimum on the boundary of the power-delay feasible region.
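The second theorem can be illustrated numerically: for an objective that increases in both P and D, searching only the non-dominated pairs gives the same minimizer as searching every feasible pair. The point set and the two objectives below are made up for illustration, and `frontier` is a hypothetical helper name.

```python
def frontier(points):
    """Non-dominated (P, D) pairs, sorted by increasing P."""
    front = []
    for p, d in sorted(set(points)):
        if not front or d < front[-1][1]:
            front.append((p, d))
    return front


# Made-up feasible (P, D) pairs.
points = [(8.0, 4.0), (4.5, 5.5), (5.0, 5.5), (6.5, 6.0), (6.0, 5.5)]

# Two objectives that increase in both P and D (illustrative choices).
objectives = {
    "product": lambda p, d: p * d,
    "weighted sum": lambda p, d: p + 2.0 * d,
}

results = {}
for name, f in objectives.items():
    best_all = min(points, key=lambda pd: f(*pd))
    best_front = min(frontier(points), key=lambda pd: f(*pd))
    results[name] = (best_all, best_front)

print(frontier(points))  # [(4.5, 5.5), (8.0, 4.0)]
print(results)
```

Both objectives pick (4.5, 5.5), which lies on the frontier, so the final stage of the DP is enough to optimize any such objective.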

SLIDE 17

Modeling Real Layout

[Figure: real layout with boundary nodes u_0 and u_{n+1}]

  • Use the transitive reduction of the wire visibility graph
  • Design rules are transitively closed
  • Process wires from left to right in topological order, with appropriate enhancement to the power-delay calculations

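SLIDE 18

Time and Storage

Time complexity: [bound garbled in extraction; surviving terms: $O(\cdot)$ in $p$, $q$, $n$ and $\log n$, with an exponent 3 and the symbols $\varepsilon$, $\alpha$, $\beta$], where $\alpha$ and $\beta$ are the max in-degree and out-degree, respectively, of a wire adjacency graph vertex.

Storage complexity: [bound garbled in extraction; surviving terms: $O(\cdot)$ in $q$ and $n$, with an exponent 3 and the symbols $\beta$, $\varepsilon$].

Time and storage in practice are manageable due to the power grid, which decomposes the problem into many independent smaller problems.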

SLIDE 19

Further Research Directions

  • Filling-aware optimization
    – Dynamic programming can generate filling patterns!
    – Line-to-line capacitance can be measured on the spot
  • Current algorithm works on P&R style only
    – Enhancement for full-custom design style
    – Cross-hierarchy dynamic programming
  • Is “bang-bang” sizing possible?
    – Using two values only is tremendous for litho!
  • Simultaneous cell and interconnect resizing
    – Use cell families with same footprint

SLIDE 20

Thank You!

SLIDE 21

Backup

SLIDE 22

NP Completeness Proof of Delay Sum

MIN_DLYPWR is in NP, since any guessed setting of widths and spaces can be substituted into the delay and power equations and checked for a YES or NO answer in polynomial time.

Reduction of PARTITION into MIN_DLYPWR:

  • 1. A wire is allocated for every element of PARTITION.
  • 2. Resistances of drivers and wires are set to 0 and 1, respectively, hence wire resistance does not affect delays.

SLIDE 23

  • 3. Only one spacing is allowed, hence spacing does not affect the problem.
  • 4. Only two width values are allowed: $\{W_1, W_2\}$.
  • 5. Bus area is set sufficiently large, hence it does not affect the problem.

Coefficients in the delay and power equations are set to zero or one, except load and activity, yielding $D_b = C_b / w_b$ and $P_b = F_b w_b$.

SLIDE 24

  • 6. Activity factors are set to $F_b = \dfrac{s(b)}{W_2 - W_1}$. Capacitive loads are set to $C_b = \dfrac{s(b)\, W_1 W_2}{W_2 - W_1}$.

Delay turns into:

$$D_{sum} = \sum_{b \in B} \frac{1}{w_b} \cdot \frac{s(b)\, W_1 W_2}{W_2 - W_1}$$

Power turns into:

$$P = \sum_{b \in B} w_b \cdot \frac{s(b)}{W_2 - W_1}$$

  • 7. Bounds of power increase and delay reduction are set to $\delta P = \delta D = \frac{1}{2} \sum_{b \in B} s(b)$.

Transformation consumes polynomial time.

SLIDE 25

Let the answer to $f(I)$ of MIN_DLYPWR be YES, and let $B'$ denote the wires whose width is set to $W_2$.

Instance $f(I)$ is set such that $P$ increases and $D$ decreases in $w$. There is a single $P$ and $D$ where:

$$\sum_{b \in B'} \delta D_b = \sum_{b \in B'} \delta P_b = \frac{1}{2} \sum_{b \in B} s(b)$$

We obtained

$$\frac{1}{2} \sum_{b \in B} s(b) = \sum_{b \in B'} \delta D_b = \sum_{b \in B'} \frac{W_1 W_2\, s(b)}{W_2 - W_1} \left(\frac{1}{W_1} - \frac{1}{W_2}\right) = \sum_{b \in B'} s(b),$$

implying that $B'$, $B - B'$ solves PARTITION.

SLIDE 26

Conversely, let $B' \subseteq B$ be a YES answer to PARTITION. Set the width $w_b$, $b \in B'$, to $W_2$, and let the rest of the wires stay at $W_1$.

$D$ reduction is $\delta D_b = C_b \left(\dfrac{1}{W_1} - \dfrac{1}{W_2}\right) = s(b)$, and $P$ increase is $\delta P_b = F_b (W_2 - W_1) = s(b)$, thus yielding a YES answer to the MIN_DLYPWR problem.

Q.E.D.
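The arithmetic of the reduction can be checked with exact fractions. The sketch below assumes the settings of the proof, with loads s(b) W1 W2 / (W2 - W1), activity factors s(b) / (W2 - W1), per-wire delay C_b / w_b and per-wire power F_b w_b. The PARTITION instance, the width values and the function names are made up for illustration.

```python
from fractions import Fraction


def reduction_instance(sizes, w1, w2):
    """Build loads C_b and activity factors F_b from PARTITION sizes s(b)."""
    w1, w2 = Fraction(w1), Fraction(w2)
    loads = [s * w1 * w2 / (w2 - w1) for s in sizes]
    activities = [s / (w2 - w1) for s in sizes]
    return loads, activities


def delay_sum(loads, widths):
    """Sum of per-wire delays D_b = C_b / w_b."""
    return sum(c / w for c, w in zip(loads, widths))


def total_power(activities, widths):
    """Sum of per-wire powers P_b = F_b * w_b."""
    return sum(f * w for f, w in zip(activities, widths))


sizes = [3, 1, 1, 2, 2, 1]      # total 10, so each half must sum to 5
w1, w2 = 2, 3                   # the two allowed widths (arbitrary values)
loads, acts = reduction_instance(sizes, w1, w2)

base = [Fraction(w1)] * len(sizes)   # every wire at W1
chosen = {0, 3}                      # subset B' with s = 3 + 2 = 5
widened = [Fraction(w2) if i in chosen else Fraction(w1)
           for i in range(len(sizes))]

delay_reduction = delay_sum(loads, base) - delay_sum(loads, widened)
power_increase = total_power(acts, widened) - total_power(acts, base)
print(delay_reduction, power_increase)  # 5 5
```

Both quantities equal the sum of s(b) over the chosen subset, so they reach the bound of half the total exactly when the subset solves PARTITION.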

SLIDE 27

NP Completeness Proof of Max Delay

Settings 1 to 5 of the MIN_MAX_DLYPWR instance are similar to those of the MIN_DLYPWR proof.

Base delay and power are obtained by setting widths to $W_2$, resulting in maximum power and minimum delay for each signal.

Base delays increase upon wire narrowing. Setting 6 in the MIN_DLYPWR proof is modified such that the load is set to $C_b = \dfrac{N W_1 W_2}{W_2 - W_1}$, yielding:

$$D_{max} = \max_{b \in B} \left\{ \frac{1}{w_b} \cdot \frac{N W_1 W_2}{W_2 - W_1} \right\}$$

SLIDE 28

For any $b \in B$, wire narrowing from $W_2$ to $W_1$ reduces power by $s(b)$. Delay grows by $\delta D_b = N$, so the maximum delay always increases by $N$.

Setting 7 in the MIN_DLYPWR proof turns to $\delta P = \delta D = N$.

Consequently, $\sum_{b \in B'} \delta P_b = \delta P = N$ iff $\max_{b \in B} \{\delta D_b\} = \delta D = N$. This holds iff the answer to SUBSET_SUM is YES.

Q.E.D.