Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence




SLIDE 1

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence

Chen-Wei Liu¹,² and Yao-Wen Chang²

¹ Synopsys Taiwan Limited
² Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan

April 11, 2006

SLIDE 2

Outline

․ Introduction
․ Proposed Design Flow
․ Floorplan and P/G Network Co-Synthesis Algorithm
․ Experimental Results

SLIDE 3

Outline

․ Introduction (current section)
․ Proposed Design Flow
․ Floorplan and P/G Network Co-Synthesis Algorithm
․ Experimental Results

SLIDE 4

Introduction

․ As technology advances, metal width decreases while global wirelength increases. Supply voltage is also decreasing while power density keeps increasing.
․ These trends cause serious P/G network problems:
⎯ Voltage (IR) drop violation: excessive IR-drop on a P/G network slows down the clock rate and causes functional errors
⎯ Electromigration (EM) violation: excessive current density through a P/G wire shortens chip lifetime

Power Integrity Constraints

․ IR-drop constraint: < 0.18 V
․ EM constraint: < 5 mA/µm

[Figure: P/G network of a chip showing the power pads, modules, power pins, power trunks, and power straps, annotated with node voltages (1.8 V, 1.79 V, 1.76 V, 1.73 V, 1.6 V, 1.59 V) and wire current densities (1.8 mA/µm, 1.8 mA/µm, 2.5 mA/µm, 5.8 mA/µm); the IR-drop violations and the EM violation are marked.]
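
The two constraints can be checked mechanically against the values annotated in the figure. The sketch below is only an illustration: it assumes a 1.8 V supply (the highest voltage shown), and the function names are hypothetical.

```python
SUPPLY_V = 1.8             # assumption: highest voltage annotated in the figure
IR_DROP_LIMIT_V = 0.18     # from the slide: IR-drop must stay below 0.18 V
EM_LIMIT_MA_PER_UM = 5.0   # from the slide: current density must stay below 5 mA/um


def ir_drop_violations(node_voltages, supply=SUPPLY_V, limit=IR_DROP_LIMIT_V):
    """Return the node voltages whose drop from the supply reaches the limit."""
    return [v for v in node_voltages if supply - v >= limit]


def em_violations(current_densities, limit=EM_LIMIT_MA_PER_UM):
    """Return the wire current densities (mA/um) that reach the EM limit."""
    return [d for d in current_densities if d >= limit]


voltages = [1.8, 1.79, 1.76, 1.73, 1.6, 1.59]
densities = [1.8, 1.8, 2.5, 5.8]
print(ir_drop_violations(voltages))  # nodes violating the IR-drop constraint
print(em_violations(densities))      # wires violating the EM constraint
```

Under the assumed supply, the 1.6 V and 1.59 V nodes and the 5.8 mA/µm wire are the ones flagged.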

SLIDE 5

Previous Work

․ In the traditional design flow, power integrity issues are dealt with at the post-layout stage
⎯ P/G network topology determination: Singh et al., ISPD-04
⎯ P/G wire sizing: Wang and Marek-Sadowska, DAC-03; Chowdhury et al., DAC-89
․ As design complexity increases, it becomes necessary to handle P/G network problems earlier
⎯ Dharchoudhury et al., DAC-98: pre-floorplan, post-floorplan, and post-layout P/G network analysis and fix
⎯ Yim et al., DAC-99: post-floorplan P/G network planning
⎯ Wu and Chang, DAC-04: iterative tree-structured P/G network verification and fix with floorplan optimization

[Figure: flowcharts of four design flows (traditional, DAC 1998, DAC 1999, DAC 2004). Each includes floorplanning, place and route, and post-layout verification, with an "OK?" check that loops back on failure; they differ in where P/G planning, analysis, optimization, and fixing take place.]

SLIDE 6

Our Contributions

․ Propose an automatic floorplan and P/G network co-synthesis method
․ Develop a sophisticated model for fast P/G analysis
⎯ Makes the co-synthesis design flow possible
․ Develop a P/G-network-aware method to reduce the floorplan solution space
⎯ Improves the runtime by an average of 68%
․ Integrate the method into a commercial design flow to develop a power-integrity-driven design flow
⎯ 2.56X faster than the generic Astro design flow

SLIDE 7

Outline

․ Introduction
․ Proposed Design Flow (current section)
․ Floorplan and P/G Network Co-Synthesis Algorithm
․ Experimental Results

SLIDE 8

Proposed Design Flow

․ Floorplan and P/G network co-synthesis problem formulation
⎯ Given a set of modules, power consumption data, and power integrity constraints
⎯ Generate a floorplan and a global P/G network that satisfies the power integrity constraints
․ Significantly improves design convergence
․ Flow steps
⎯ Perform co-synthesis: generate the floorplan and the P/G network plan
⎯ Perform place and route: route the P/G straps
⎯ Perform post-layout verification: detailed P/G analysis

[Figure: the proposed flow. Modules, power consumption, and power integrity constraints feed floorplan and P/G network co-synthesis, followed by place and route and post-layout verification (PASSED).]

SLIDE 9

Implementation of the Design Flow

Data preparation

․ Power profile
⎯ Power consumption data of the modules, generated by PrimePower
․ Hierarchical circuit partition
⎯ Organize the design into hard modules and soft modules according to the hierarchy

Post-layout verification

․ AstroRail
⎯ Static cell-level P/G analysis

[Figure: tool flow involving the RTL, cell library, Design Compiler, netlist, PrimePower, power profile, current consumption calculation, current model and constraints, hierarchical partition, our floorplanner, Astro P&R, and AstroRail.]

SLIDE 10

Flow Comparison

․ Previous work: only moves the iterative fix to an earlier stage
․ Our work: further combines P/G planning into floorplanning
⎯ Thousands of floorplans are evaluated in a second
⎯ A very efficient, yet sufficiently accurate, P/G network analysis method is needed

[Figure: flowcharts of three flows. DAC-99: floorplanning, P/G planning, and P/G analysis with an "OK?" check, then place and route and post-layout verification. DAC-04: floorplanning and P/G analysis with an "OK?" check, then place and route and post-layout verification. Ours: floorplan and P/G network co-synthesis, place and route, post-layout verification.]

SLIDE 11

Outline

․ Introduction
․ Proposed Design Flow
․ Floorplan and P/G Network Co-Synthesis Algorithm (current section)
․ Experimental Results

SLIDE 12

Overview of the Co-Synthesis Algorithm

․ B*-tree Floorplan Representation and Simulated Annealing (SA) Algorithm
․ P/G Network Analysis
⎯ Global P/G Network Construction
⎯ P/G Network Modeling
⎯ P/G Network Evaluation
․ Solution Space Reduction Technique (SSR)

SLIDE 13

B*-tree: Compacted Floorplan Representation

․ Chang et al., “B*-tree: A new representation for non-slicing floorplans,” DAC 2000.
⎯ Given a B*-tree, a legal floorplan can be obtained in amortized linear time
⎯ Root: the bottom-left module
⎯ Left child: the lowest adjacent block on the right ($x_j = x_i + w_i$)
⎯ Right child: the first block above with the same x-coordinate ($x_j = x_i$)

[Figure: a compacted floorplan of blocks b0 to b9 and the corresponding B*-tree with nodes n0 to n9. Block b0 sits at $(x_0, y_0)$ with width $w_0$; $x_1 = x_0$ and $x_7 = x_0 + w_0$.]
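
The child rules above fix every x-coordinate; the y-coordinate follows from whatever has already been placed underneath. A minimal packing sketch is shown below. It is not the authors' implementation: `Node` and `pack` are hypothetical names, and for brevity it scans all placed blocks (quadratic time) instead of maintaining the contour structure that gives the amortized linear bound.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    w: float
    h: float
    left: Optional["Node"] = None   # block to the right: x = parent.x + parent.w
    right: Optional["Node"] = None  # block above: x = parent.x
    x: float = 0.0
    y: float = 0.0


def pack(root: Node) -> Node:
    """Assign (x, y) to every block by a preorder walk of the B*-tree."""
    placed = []  # (x_left, x_right, y_top) of blocks placed so far

    def place(node: Node, x: float) -> None:
        node.x = x
        x_end = x + node.w
        # Rest on the highest block already occupying this x-range.
        node.y = max((top for (lo, hi, top) in placed if lo < x_end and hi > x),
                     default=0.0)
        placed.append((x, x_end, node.y + node.h))
        if node.left is not None:
            place(node.left, x_end)   # left child: adjacent on the right
        if node.right is not None:
            place(node.right, x)      # right child: same x, stacked above

    place(root, 0.0)
    return root


# Usage: b7 is the left child of b0, b1 the right child (as in the figure).
b7 = Node("b7", 2, 2)
b1 = Node("b1", 3, 1)
b0 = pack(Node("b0", 4, 3, left=b7, right=b1))
print((b7.x, b7.y), (b1.x, b1.y))
```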

SLIDE 14

Cost Function for Simulated Annealing

․ Cost function:

$$\Psi = \alpha \cdot W + \beta \cdot A + \gamma \cdot \Phi + \omega \cdot \frac{A}{D_{pitch}^{2}}$$

The four terms account for wirelength, area, P/G cost, and P/G density, respectively.

․ $W$: wirelength
․ $A$: area
․ $\Phi$: P/G network cost
․ $D_{pitch}$: pitch of the P/G network
⎯ Updated by multiplying it by the ratio of $\hat{\Phi}$ and $\Phi_{avg}$
⎯ $\Phi_{avg}$: average P/G network cost at a temperature
⎯ $\hat{\Phi}$: a budget factor, $0 < \hat{\Phi} < 1$, for adjusting the density of P/G networks (small for low P/G density, large for high P/G density)
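
Reading the slide's cost function as a weighted sum of wirelength, area, P/G cost, and a density term equal to area divided by the squared pitch, it can be evaluated as in the sketch below. The function name and the default weights of 1.0 are assumptions made for illustration; the slides do not give the weight values.

```python
def floorplan_cost(wirelength, area, pg_cost, d_pitch,
                   alpha=1.0, beta=1.0, gamma=1.0, omega=1.0):
    """Psi = alpha*W + beta*A + gamma*Phi + omega*A / D_pitch^2.

    The weights default to 1.0 purely for illustration (assumption).
    """
    pg_density = area / d_pitch ** 2  # a smaller pitch means a denser P/G mesh
    return (alpha * wirelength + beta * area
            + gamma * pg_cost + omega * pg_density)


# Halving the pitch quadruples the density term and leaves the rest unchanged.
print(floorplan_cost(wirelength=10.0, area=4.0, pg_cost=0.5, d_pitch=2.0))
print(floorplan_cost(wirelength=10.0, area=4.0, pg_cost=0.5, d_pitch=1.0))
```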

SLIDE 15

Pitch Updating: An Example

․ At the beginning of SA, $D_{pitch} = 2$ and $\hat{\Phi} = 0.02$
․ During the SA process, the ratio of $\hat{\Phi}$ and $\Phi_{avg}$ converges to 1 as the temperature cools down
․ Update rule: $D_{pitch}$ is multiplied by this ratio

[Figure: $D_{pitch}$ and the ratio of $\hat{\Phi}$ and $\Phi_{avg}$ plotted against temperature over the SA process.]

SLIDE 16

P/G Network Cost

$$\Phi = \theta \cdot \frac{|B_{em}|}{|B|} + (1 - \theta) \cdot \frac{\sum_{\forall p_{vi} \in P_v} v_{p_{vi}}}{\sum_{\forall p_i \in P} V_{lim,p_i}}, \qquad 0 < \theta < 1$$

The first term is the EM cost and the second term is the IR-drop cost.

․ $\Phi$: P/G network cost
․ $B_{em}$: set of branches violating the electromigration constraint
․ $B$: all branches of the P/G mesh
․ $v_{p_{vi}}$: amount of the violation at the pin $p_{vi}$
․ $P$: set of all P/G pins
․ $P_v$: set of violating P/G pins
․ $V_{lim,p_i}$: IR-drop constraint of the P/G pin $p_i$
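
As a rough illustration, the cost can be computed as below, reading the slide's formula as a weighted sum of two normalized terms: the fraction of branches violating EM, and the total IR-drop violation divided by the total IR-drop limit over all pins. The function and parameter names are hypothetical, and the default θ = 0.5 is an assumption.

```python
def pg_network_cost(n_em_violating, n_branches, violations, ir_limits, theta=0.5):
    """Phi = theta * |B_em|/|B| + (1 - theta) * sum(v) / sum(V_lim).

    violations: IR-drop violation amounts (V) of the violating pins only.
    ir_limits:  IR-drop limits (V) of all P/G pins.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    em_cost = n_em_violating / n_branches
    ir_cost = sum(violations) / sum(ir_limits)
    return theta * em_cost + (1.0 - theta) * ir_cost


# Hypothetical mesh: 2 of 10 branches violate EM; 2 of 5 pins violate IR-drop.
phi = pg_network_cost(2, 10, violations=[0.02, 0.01], ir_limits=[0.18] * 5)
print(round(phi, 4))
```

A violation-free network gives Φ = 0, so Φ acts as a penalty during annealing.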

SLIDE 17

P/G Network Construction

․ For each floorplan, we construct a uniform global P/G network according to $D_{pitch}$
․ The number of trunks in each direction is defined by round[width/$D_{pitch}$] + 1 and round[height/$D_{pitch}$] + 1

[Figure: calculating the P/G network dimension for a floorplan whose width spans 3 pitches and whose height spans 1 pitch: 3 + 1 = 4 and 1 + 1 = 2, so a 2×4 uniform P/G network is constructed.]
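
The trunk-count rule is simple enough to state in code. The sketch below assumes that "round" means round-half-up (the slide does not say), which is why it avoids Python's built-in `round`, which rounds halves to the nearest even integer. The names are hypothetical.

```python
import math


def round_half_up(value):
    """Round to the nearest integer, with halves rounded up (assumed mode)."""
    return math.floor(value + 0.5)


def trunk_counts(width, height, d_pitch):
    """Return (trunks across the width, trunks across the height)."""
    return (round_half_up(width / d_pitch) + 1,
            round_half_up(height / d_pitch) + 1)


# The slide's example: the width spans 3 pitches and the height spans 1 pitch.
print(trunk_counts(width=3.0, height=1.0, d_pitch=1.0))
```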

SLIDE 18

P/G Network Modeling

Apply static analysis for fast P/G network evaluation

․ Use a resistive P/G model
․ Model a P/G pin as a current source
⎯ Current value: maximum current drawn from the P/G pin
․ Reduce circuit size
⎯ Connect each current source to the nearest global trunk node

[Figure: a P/G network with power pads, modules, power pins, power trunks, power straps, and global trunk nodes, shown next to its reduced circuit.]
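
The circuit reduction step can be sketched as follows. This is an illustration under stated assumptions: trunk nodes are taken to sit on a uniform grid with spacing `d_pitch` and origin at the bottom-left corner of the floorplan, and all names are hypothetical.

```python
import math


def nearest_trunk_node(x, y, d_pitch, n_cols, n_rows):
    """Return (row, col) of the grid node closest to the point (x, y)."""
    col = min(max(math.floor(x / d_pitch + 0.5), 0), n_cols - 1)
    row = min(max(math.floor(y / d_pitch + 0.5), 0), n_rows - 1)
    return row, col


def lump_currents(pins, d_pitch, n_cols, n_rows):
    """Sum the pin currents onto their nearest trunk nodes.

    pins: iterable of (x, y, current_in_amps).
    Returns a dict mapping (row, col) to the total current drawn there.
    """
    currents = {}
    for x, y, amps in pins:
        node = nearest_trunk_node(x, y, d_pitch, n_cols, n_rows)
        currents[node] = currents.get(node, 0.0) + amps
    return currents


pins = [(0.4, 0.2, 0.003), (0.1, 0.0, 0.002), (0.6, 0.1, 0.005), (2.9, 1.2, 0.001)]
print(lump_currents(pins, d_pitch=1.0, n_cols=4, n_rows=2))
```

After lumping, the linear system has one unknown per trunk node rather than one per pin.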

SLIDE 19

Macro Current Modeling

․ Divide the floorplan into regions
․ For hard macros
⎯ Connect each P/G pin to the nearest node (center of the region)
․ For soft macros
⎯ Collect the largest current drawn by standard cells in the overlapping area of the region and the soft macro

[Figure: a hard module and a soft module overlapping a region around a node. With node spacing d, the border of the region lies d/2 from the node; current is assigned to the center node of the region, and the overlapping area of the region and the soft module is marked.]

SLIDE 20

Soft Macro Modeling

․ Derive the largest current drawn by the standard cells of the overlapping area
⎯ Maximize the current of the overlapping area
⎯ Constraint: total standard-cell area < the overlapping area
⎯ The problem is known as the 0-1 knapsack problem (NP-complete)
․ Approximate it with the fractional knapsack algorithm
⎯ Assume standard cells can be broken into arbitrarily small pieces
⎯ Rank cells by current-to-area ratio
⎯ Apply a greedy algorithm (complexity O(n lg n))

[Figure: standard cells of the soft module, drawing 3 mA, 5 mA, 4 mA, and 1 mA (four cells), with the overlapping area marked.]
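
The greedy fractional-knapsack estimate described above can be sketched as follows. The cell areas in the usage example are made up for illustration (the slide gives only currents), and the function name is hypothetical.

```python
def max_region_current(cells, overlap_area):
    """Upper-bound the current drawn inside the overlapping area.

    cells: list of (current_mA, area) pairs for the soft module's standard
    cells; every area must be positive. Cells may be taken fractionally.
    """
    remaining = overlap_area
    total = 0.0
    # Greedy step: highest current-to-area ratio first (the O(n lg n) sort).
    for current, area in sorted(cells, key=lambda c: c[0] / c[1], reverse=True):
        if remaining <= 0:
            break
        taken = min(area, remaining)
        total += current * taken / area  # fractional share of the cell's current
        remaining -= taken
    return total


# Hypothetical cell areas; the overlapping area holds 4 area units.
cells = [(5.0, 2.0), (3.0, 1.0), (4.0, 4.0), (1.0, 1.0)]
print(max_region_current(cells, overlap_area=4.0))
```

Because cells may be split, the result is never smaller than the 0-1 knapsack optimum, so the estimate is conservative for IR-drop analysis.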

SLIDE 21

Evaluation of P/G Network

․ The static analysis of a P/G network is formulated as the following modified nodal analysis (MNA) equation:

$$Gx = i$$

⎯ $G$: conductance matrix (sparse, positive definite)
⎯ $x$: vector of node voltages
⎯ $i$: vector of current and voltage sources
⎯ The dimensions of $G$, $i$, and $x$ are equal to the number of nodes in the P/G network
․ Solve the linear equation
⎯ Apply the Preconditioned Conjugate Gradient (PCG) method
⎯ The time complexity is linear
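
A small PCG solver illustrates the evaluation step. This is a sketch, not the authors' solver: it uses a Jacobi (diagonal) preconditioner and dense Python lists for readability, whereas a real implementation would use sparse storage. The three-node chain, the 1 S conductances, and the 10 mA loads are assumed values.

```python
def pcg_solve(G, b, tol=1e-10, max_iter=1000):
    """Solve G x = b for a symmetric positive definite G (Jacobi-preconditioned CG)."""
    n = len(b)
    inv_diag = [1.0 / G[k][k] for k in range(n)]

    def matvec(v):
        return [sum(G[r][c] * v[c] for c in range(n)) for r in range(n)]

    def dot(u, v):
        return sum(s * t for s, t in zip(u, v))

    x = [0.0] * n
    r = list(b)                                   # residual b - G x with x = 0
    z = [inv_diag[k] * r[k] for k in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Gp = matvec(p)
        alpha = rz / dot(p, Gp)
        x = [x[k] + alpha * p[k] for k in range(n)]
        r = [r[k] - alpha * Gp[k] for k in range(n)]
        z = [inv_diag[k] * r[k] for k in range(n)]
        rz_new = dot(r, z)
        p = [z[k] + (rz_new / rz) * p[k] for k in range(n)]
        rz = rz_new
    return x


# Chain: 1.8 V pad -(1 S)- n0 -(1 S)- n1 -(1 S)- n2, each node drawing 10 mA.
# The pad is folded into the right-hand side as a 1.8 A injection at n0.
G = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]
i = [1.8 - 0.01, -0.01, -0.01]
print([round(v, 4) for v in pcg_solve(G, i)])
```

For this chain the node voltages fall with distance from the pad, which is the effect the IR-drop constraint guards against.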

SLIDE 22

Solution Space Reduction

․ The IR-drop of a P/G pin is proportional to the effective resistance between the P/G pin and the P/G pad
⎯ The closer the P/G pin is placed to the P/G pad, the smaller the IR-drop
․ A technique to reduce the solution space
⎯ Place the modules consuming larger current (power-hungry modules) near the boundary of the floorplan
⎯ Place power pads near them

SLIDE 23

B*-tree Boundary Properties

․ Bottom-boundary condition
⎯ Bottom boundary modules are related to the leftmost branch
․ Left-boundary condition
⎯ Left boundary modules are related to the rightmost branch
․ Right-boundary condition
⎯ Right boundary modules are related to the bottom-left branch
․ Top-boundary condition
⎯ Top boundary modules are related to the bottom-right branch
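
The first two conditions follow directly from the B*-tree child rules: a chain of left children starting at the root keeps extending to the right along the bottom edge, and a chain of right children keeps stacking upward at the same x-coordinate as the root. The sketch below extracts those two branches; `Node`, `leftmost_branch`, and `rightmost_branch` are hypothetical names, and the right and top conditions are not covered.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    left: Optional["Node"] = None   # adjacent block on the right
    right: Optional["Node"] = None  # block above with the same x-coordinate


def leftmost_branch(root: Node) -> List[str]:
    """Modules reached from the root via left children only (bottom boundary)."""
    names, node = [], root
    while node is not None:
        names.append(node.name)
        node = node.left
    return names


def rightmost_branch(root: Node) -> List[str]:
    """Modules reached from the root via right children only (left boundary)."""
    names, node = [], root
    while node is not None:
        names.append(node.name)
        node = node.right
    return names


tree = Node("n0",
            left=Node("n7", left=Node("n8")),
            right=Node("n1", right=Node("n2")))
print(leftmost_branch(tree), rightmost_branch(tree))
```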

SLIDE 24

Power-Hungry Modules Handling

․ Power-hungry modules
⎯ Are clustered and restricted to satisfy the boundary properties during B*-tree perturbation
⎯ P/G pads are placed near these modules

[Figure: a floorplan of modules 1 to 9 and its B*-tree, with the clustered modules marked.]

SLIDE 25

Outline

․ Introduction
․ Proposed Design Flow
․ Floorplan and P/G Network Co-Synthesis Algorithm
․ Experimental Results (current section)

SLIDE 26

Experimental Settings

․ Implementation
⎯ GNU C++
․ Platform
⎯ Sun Blade 2000 with a single 1 GHz CPU and 8 GB of memory
․ OpenRISC
⎯ Open-source 32-bit RISC microprocessor (OpenCores)
⎯ UMC 0.18 µm technology
⎯ Compared with the Astro design flow with IR-drop driven placement (P/G network faults fixed manually and iteratively)
․ MCNC benchmark
⎯ TSMC 0.25 µm technology
⎯ Given large power consumption and a small power budget (low P/G network density and only a pair of P/G pads) in order to test the robustness of our floorplanner

SLIDE 27

Results on OpenRISC1200

| OpenRISC1200 | Astro Flow* | Astro w/ IR-drop Driven Placement* | Our Flow | Ours vs. Astro w/ IR-drop |
|---|---|---|---|---|
| Die Area (mm²) | 3.86 | 3.86 | 3.33 | 15.9% |
| Wirelength (µm) | 1655463 | 1539125 | 1540172 | -0.1% |
| Avg. Delay (ns) | 8.62 | 8.54 | 8.55 | -0.1% |
| Iterations | 4 | 3 | 1 | - |
| CPU Runtime (s) | 505 | 346 | 135 | 2.56X |
| Max IR-drop (mV) | 80.18 | 78.20 | 55.14 | 41.8% |
| Utilization (%) | 62 | 62 | 72 | 13.9% |

*Needs iterative and manual P/G network fixes

․ Improvements in runtime and the max IR-drop
․ Callouts: small overhead, better IR-drop, speed improvement

SLIDE 28

Resulting Voltage Map

․ Astro design flow: power-hungry blocks (register files A and B) are placed far from the power pad
․ Our design flow: power-hungry blocks are placed beside the power pad

[Figure: voltage maps of the two flows with blocks A and B marked.]

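SLIDE 29

Results on MCNC Benchmark

Plain = plain B*-tree floorplanner; Ours = our floorplanner without SSR.

| Circuit | Plain WL | Plain Area | Plain #Vio | Plain CPU | Ours WL | Ours Area | Ours #Vio | Ours CPU |
|---|---|---|---|---|---|---|---|---|
| apte | 435.5 | 48.21 | 6 | 1.1 | 440.4 | 49.8 | 0 | 165.2 |
| xerox | 387.6 | 20.42 | 39 | 3.3 | 401.5 | 21.3 | 0 | 122.3 |
| hp | 155.5 | 9.56 | 38 | 3.2 | 187.1 | 11.2 | 0 | 58.2 |
| ami33 | 58.4 | 1.31 | 99 | 8.8 | 69.0 | 1.4 | 0 | 43.4 |
| ami49 | 864.6 | 39.86 | 195 | 42.2 | 832.8 | 39.8 | 0 | 1412 |
| comp. | 0.98 | 0.97 | 195 | 0.033 | 1 | 1 | | 1 |

Compared with the plain B*-tree floorplanner, our floorplanner resolves all the violations with small overhead.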

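SLIDE 30

Results of Solution Space Reduction

The solution space reduction technique speeds up floorplanning 3x with similar quality.

| Circuit | With SSR WL | With SSR Area | With SSR #Vio | With SSR CPU | Without SSR WL | Without SSR Area | Without SSR #Vio | Without SSR CPU |
|---|---|---|---|---|---|---|---|---|
| apte | 452.1 | 48.8 | 0 | 43.2 | 440.4 | 49.8 | 0 | 165.2 |
| xerox | 410.2 | 22.4 | 0 | 47.3 | 401.5 | 21.3 | 0 | 122.3 |
| hp | 189.5 | 11.7 | 0 | 24.0 | 187.1 | 11.2 | 0 | 58.2 |
| ami33 | 73.2 | 1.2 | 0 | 20.2 | 69.0 | 1.4 | 0 | 43.4 |
| ami49 | 779.9 | 44.2 | 0 | 450.0 | 832.8 | 39.8 | 0 | 1412 |
| comp. | 0.99 | 1.04 | | 0.32 | 1 | 1 | | 1 |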
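
The "comp." CPU entry and the 3x claim can be cross-checked from the per-circuit CPU times. The sketch below assumes the flattened table is read with the first CPU column belonging to the run with SSR and the second to the run without it, and that "comp." is the ratio of the column totals; both readings are assumptions about the slide's layout.

```python
# CPU times per circuit as read from the table (assumed column assignment).
cpu_with_ssr = {"apte": 43.2, "xerox": 47.3, "hp": 24.0, "ami33": 20.2, "ami49": 450.0}
cpu_without_ssr = {"apte": 165.2, "xerox": 122.3, "hp": 58.2, "ami33": 43.4, "ami49": 1412.0}

ratio = sum(cpu_with_ssr.values()) / sum(cpu_without_ssr.values())
speedup = 1.0 / ratio

print(f"normalized CPU with SSR: {ratio:.2f}")
print(f"speedup: {speedup:.2f}x, runtime reduction: {100 * (1 - ratio):.0f}%")
```

Under this reading the totals reproduce the 0.32 entry, which corresponds to a speedup of roughly 3x and a runtime reduction of about 68%.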

SLIDE 31

Conclusions

․ Proposed a practical design flow compatible with commercial CAD design flows
․ Proposed an algorithm and a modeling technique that make floorplan and P/G network co-synthesis possible
․ Showed that the solution space reduction method speeds up the co-synthesis algorithm
․ Showed the efficiency and effectiveness of the proposed power-integrity-driven design flow and applied it to a real design


SLIDE 33

Simulated Annealing Process

․ Non-zero probability for up-hill climbing:

$$p = \min\left(1, e^{-\Delta\Psi / T}\right)$$

․ Perturbations (neighboring solutions)
⎯ Op1: Rotate a block
⎯ Op2: Move a node/block to another place
⎯ Op3: Swap two nodes/blocks
⎯ Op4: Resize a soft block
․ The cost function $\Psi$ is based on the floorplan cost and the P/G network cost
․ $T$ is decreased every $n$ cycles, where $n$ is proportional to the number of blocks

[Figure: SA flowchart with the steps: initialize the B*-tree and temperature T; perturb the B*-tree; pack the B*-tree; construct the P/G network; evaluate the cost Ψ; "Better?" and "Accept?" decisions leading to "keep solution" or "recover last solution"; update the P/G pitch Dpitch; update T; "Cool/Good enough?" termination check.]
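
The up-hill acceptance rule, p = min(1, e^(-ΔΨ/T)), can be sketched as below. The function names are hypothetical, and the early return for non-positive ΔΨ is an implementation choice that avoids overflow in `math.exp`; it does not change the probability.

```python
import math
import random


def acceptance_probability(delta_cost, temperature):
    """p = min(1, exp(-delta_cost / temperature)) for temperature > 0."""
    if delta_cost <= 0:
        return 1.0  # improvements (and ties) are always accepted
    return math.exp(-delta_cost / temperature)


def accept(delta_cost, temperature, rng=random):
    """Decide whether to keep a perturbed solution."""
    return rng.random() < acceptance_probability(delta_cost, temperature)


# A worse solution is accepted often when hot and rarely when cold.
for T in (10.0, 1.0, 0.1):
    print(T, round(acceptance_probability(1.0, T), 4))
```

As T decreases, the probability of accepting the same cost increase shrinks toward zero, so the search gradually becomes greedy.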