CALTECH CS137 Fall2005 -- DeHon 1

CS137: Electronic Design Automation

Day 17: November 11, 2005
Placement (Simulated Annealing…)

Today

  • Placement
  • Improving Quality
    – Avoiding local minima
  • Techniques:
    – Simulated Annealing
    – Exhaustive (Branch-and-bound)

Simulated Annealing

  • Physically motivated approach
  • Physical world has similar problems
    – objects/atoms seeking minimum cost arrangement
    – at high temperature (energy) can move around
    – at low temperature, no free energy to move
    – cool quickly → freeze in defects (weak structure)
    – cool slowly → allow to find minimum cost

Key Benefit

  • Avoid Local Minima

– Allowed to take locally non-improving moves in order to avoid being stuck

Simulated Annealing

  • At high temperature can move around
    – not trapped to only make “improving” moves
    – free energy from temperature allows exploration of non-minimum states
    – avoid being trapped in local minima
  • As temperature lowers
    – less energy to take big, non-minimizing moves
    – more local / greedy moves

Design Optimization

Components:
  1. “Energy” (Cost) function to minimize
    – represents entire state, drives system forward
  2. Moves
    – local rearrangement/transformation of solution
  3. Cooling schedule
    – initial temperature
    – temperature steps (sequence)
    – time at each temperature

Basic Algorithm Sketch

  • Pick an initial solution
  • Set temperature (T) to initial value
  • while (T > Tmin)
    – for time at T
      • pick a move at random
      • compute Δcost
      • if Δcost is less than zero, accept
      • else if (RND < e^(−Δcost/T)), accept
    – update T
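The sketch above can be written as a short Python routine. This is a minimal illustration under stated assumptions, not reference code from the course: the names `simulated_anneal`, `random_move`, and `moves_per_t`, the fixed-ratio cooling, and the toy 1D placement problem are all invented for the example, and the cost is recomputed from scratch on every move for clarity rather than speed.

```python
import math
import random

def simulated_anneal(initial, cost, random_move, t0,
                     t_min=1e-3, lam=0.85, moves_per_t=200, seed=0):
    """Generic annealing loop following the sketch; all names are illustrative."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t0
    while t > t_min:
        for _ in range(moves_per_t):                 # time at T
            candidate = random_move(current, rng)    # pick a move at random
            delta = cost(candidate) - current_cost   # compute the cost change
            # Accept improving moves; accept others with probability e^(-delta/T).
            if delta < 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= lam                                     # update T (fixed ratio)
    return best, best_cost

# Toy problem (hypothetical): order 8 cells in a row to shorten 2-pin nets.
NETS = [(i, i + 1) for i in range(7)]

def wirelength(order):
    pos = {cell: slot for slot, cell in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in NETS)

def swap_two(order, rng):
    i, j = rng.sample(range(len(order)), 2)
    new_order = list(order)
    new_order[i], new_order[j] = new_order[j], new_order[i]
    return new_order

start = [3, 7, 0, 5, 2, 6, 1, 4]
best, best_cost = simulated_anneal(start, wirelength, swap_two, t0=10.0)
print(wirelength(start), "->", best_cost)
```

The routine returns the best state seen rather than the final one, so the result is never worse than the starting point. That bookkeeping is a convenience added here, not part of the sketch.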

Details

  • Initial Temperature
    – T0 = −Δavg/ln(Paccept)
      • (Paccept < 1, so ln(Paccept) is negative and T0 is positive)
  • Cooling schedule
    – fixed ratio: Tnew = λ×Told
      • (e.g. λ=0.85)
    – temperature dependent
    – function of both temperature and acceptance rate
      • example to come
  • Time at each temperature
    – fixed number of moves?
    – fixed number of rejected moves?
    – fixed fraction of rejected moves?
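Setting the acceptance probability of an average uphill move, e^(−Δavg/T0), equal to the target Paccept and solving for T0 gives T0 = −Δavg/ln(Paccept). The sketch below checks that inversion numerically and enumerates a fixed-ratio schedule. The function names and the sample numbers are assumptions for illustration only.

```python
import math

def initial_temperature(avg_uphill_delta, p_accept):
    """T0 such that an average uphill move is accepted with probability p_accept."""
    return -avg_uphill_delta / math.log(p_accept)

def geometric_schedule(t0, t_min, lam=0.85):
    """Yield the temperatures visited by the fixed-ratio rule Tnew = lam * Told."""
    t = t0
    while t > t_min:
        yield t
        t *= lam

t0 = initial_temperature(avg_uphill_delta=10.0, p_accept=0.9)
print(t0, math.exp(-10.0 / t0))                   # second value recovers p_accept
print(len(list(geometric_schedule(t0, t_min=0.01))), "temperature steps")
```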

Cost Function

  • Can be very general
    – Combine area, timing, energy, routability…
  • Should drive entire solution in right direction
    – reward each good move
  • Should be cheap to compute delta costs
    – e.g. FM
    – Ideally O(1)

Example Cost Functions

  • Total Wire Length

– Linear, quadratic…

  • Bounding Box (semi-perimeter)

– Surrogate for routed net length

  • Channel widths

– probably wants to be more than just width

  • Cut width

[Figure: net bounding box with width bbx and height bby]
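As an illustration of the bounding-box (semi-perimeter) cost and of keeping delta costs cheap, the sketch below computes bbx + bby per net and evaluates a two-cell swap by re-examining only the nets attached to the swapped cells. The data layout (`placement` as a dict of coordinates, `nets_of` as a per-cell net list) and all function names are assumptions for this example. The delta here costs time proportional to the pins on the affected nets, not strictly O(1).

```python
def bounding_box_cost(net, placement):
    """Semi-perimeter (bbx + bby) of the box enclosing a net's cells."""
    xs = [placement[cell][0] for cell in net]
    ys = [placement[cell][1] for cell in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_cost(nets, placement):
    return sum(bounding_box_cost(net, placement) for net in nets)

def swap_delta(a, b, nets_of, placement):
    """Cost change from swapping cells a and b, touching only their nets."""
    affected = set(nets_of[a]) | set(nets_of[b])
    before = sum(bounding_box_cost(net, placement) for net in affected)
    placement[a], placement[b] = placement[b], placement[a]   # try the swap
    after = sum(bounding_box_cost(net, placement) for net in affected)
    placement[a], placement[b] = placement[b], placement[a]   # undo it
    return after - before

placement = {"a": (0, 0), "b": (3, 1), "c": (1, 2), "d": (2, 0)}
nets = [("a", "b"), ("b", "c", "d"), ("a", "d")]
nets_of = {cell: [net for net in nets if cell in net] for cell in placement}
print(total_cost(nets, placement), swap_delta("a", "c", nets_of, placement))
```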

Bad Cost Functions

  • Update cost
    – rerun maze route on every move
    – rerun timing analysis
    – recalculate critical path delay
  • Drive toward solution:
    – size < threshold ?
    – Critical path delay

VPR Wire Costs

  • VPR Bounding Box

$$\mathrm{Cost} = \sum_{i=1}^{N_{nets}} q(i) \times \left[ bb_x(i) + bb_y(i) \right]$$

Swartz, Betz, & Rose FPGA 1998
Original table: Cheng ICCAD 1994

VPR Timing Costs

  • Criticality(e) = 1 − Slack(e)/Dmax
  • TCost(e) = Delay(e) × Criticality(e)^CriticalityExp
  • Keep all edge delays in a table
  • Recompute Net Criticality at each Temperature

Marquardt, Betz, & Rose FPGA 2000

VPR Balance Wire and Time Cost

$$\Delta Cost = \lambda \left( \frac{\Delta TCost}{OldTCost} \right) + (1 - \lambda) \left( \frac{\Delta WCost}{OldWCost} \right)$$

Marquardt, Betz, & Rose FPGA 2000
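The timing cost and the wire/time balance can be sketched directly from the formulas: Criticality(e) = 1 − Slack(e)/Dmax, TCost(e) = Delay(e) × Criticality(e)^CriticalityExp, and ΔCost as a λ-weighted sum of the normalized timing and wire deltas. The function names, default values, and sample numbers below are assumptions for illustration and are not taken from VPR's source.

```python
def criticality(slack, d_max):
    """1 - Slack/Dmax: equals 1 for zero slack, smaller for connections with slack."""
    return 1.0 - slack / d_max

def timing_cost(delay, slack, d_max, crit_exp=1.0):
    """Delay weighted by criticality raised to the criticality exponent."""
    return delay * criticality(slack, d_max) ** crit_exp

def combined_delta(d_tcost, old_tcost, d_wcost, old_wcost, lam=0.5):
    """Blend the normalized timing and wire cost changes with weight lam."""
    return lam * (d_tcost / old_tcost) + (1.0 - lam) * (d_wcost / old_wcost)

# A connection with half of Dmax as slack, criticality exponent 2:
print(timing_cost(delay=2.0, slack=5.0, d_max=10.0, crit_exp=2))
# A move that worsens timing by 10% and improves wiring by 10%:
print(combined_delta(1.0, 10.0, -2.0, 20.0, lam=0.5))
```

Dividing each delta by the previous cost makes the two terms dimensionless, so λ trades off relative changes and is not affected by the units of delay or wire length.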

Initial Solution

  • Spectral Placement
  • Random
  • Constructive Placement

– Fast placers start at lower temperature; assume the constructive placement got the global structure right.

Moves

  • Swap two cells

– Within some distance limit? (ex. to come)

  • swap regions

– …rows, columns, subtrees

  • rotate cell (when feasible)
  • flip (mirror) cell
  • permute cell inputs (equivalent inputs)

Variant

  • Allow non-legal solutions
    – capture badness in cost function
    – e.g. allow cells to overlap
  • Just make sure cost function makes non-legal solutions very expensive as the temperature cools
    – settle out to legal solutions
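One way to realize this variant is a penalty term whose weight grows as the temperature falls. The linear ramp, the `w_max` value, and the function names below are assumptions chosen only to show the shape of the idea; the slides do not specify a particular penalty schedule.

```python
def overlap_weight(t, t0, w_max=100.0):
    """Penalty weight: 0 at the initial temperature t0, approaching w_max when cold."""
    return w_max * (1.0 - t / t0)

def penalized_cost(wire_cost, overlap_area, t, t0):
    """Wire cost plus an overlap penalty that becomes expensive as T drops."""
    return wire_cost + overlap_weight(t, t0) * overlap_area

for t in (10.0, 5.0, 1.0, 0.0):
    print(t, penalized_cost(wire_cost=50.0, overlap_area=4.0, t=t, t0=10.0))
```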

Variant: “Rejectionless”

  • Order moves by cost
    – compare FM
  • Pick random number first
  • Use random to define range of move costs will currently accept
  • Pick randomly within this range
  • Idea: never pick a costly move which will be rejected
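One possible reading of these steps: draw RND first, convert it into the largest Δcost that the test RND < e^(−Δcost/T) would still accept (Δcost ≤ −T·ln(RND)), and then choose uniformly among the moves at or below that bound. The sketch assumes the candidate moves are already available as a list sorted by Δcost; the function name and data layout are hypothetical, and published rejectionless schemes may differ in detail.

```python
import bisect
import math
import random

def pick_rejectionless(moves_by_delta, t, rng):
    """moves_by_delta: list of (delta, move) pairs sorted by ascending delta."""
    r = 1.0 - rng.random()                  # uniform in (0, 1], avoids log(0)
    max_delta = -t * math.log(r)            # largest delta this draw would accept
    deltas = [delta for delta, _ in moves_by_delta]
    hi = bisect.bisect_right(deltas, max_delta)
    if hi == 0:
        return None                         # no move falls inside the accepted range
    return moves_by_delta[rng.randrange(hi)][1]

rng = random.Random(1)
moves = [(-1.0, "a"), (0.5, "b"), (1e9, "c")]
print([pick_rejectionless(moves, 1.0, rng) for _ in range(10)])
```

Because max_delta is never negative, improving moves are always inside the range, while a move as costly as "c" is never selected at this temperature.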

Theory

  • If stay long enough at each cooling stage
    – will achieve tight error bound
  • If cool long enough
    – will find optimum

Practice

  • Good results
    – ultimately, what most commercial tools use...what VPR uses…
  • Slow convergence
  • Tricky to pick schedules to accelerate convergence

Range Limit

  • Want to tune so accepting 44% of the moves
    – Lam and Delosme DAC 1988
  • VPR
    – Define Rlimit
    – defines maximum Δx and Δy accepted
    – Tune Rlimit to maintain acceptance rate
    – Rlimit_new = Rlimit_old × (1 − 0.44 + α)
      • α is measured acceptance rate
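The update rule Rlimit_new = Rlimit_old × (1 − 0.44 + α) grows the window when the measured acceptance rate α is above 44% and shrinks it when α is below. A small sketch follows. The clamp to the range [1, max_dim] and the parameter names are assumptions added so the example stays well-behaved; they are not stated on the slide.

```python
def update_rlimit(rlimit_old, alpha, max_dim, target=0.44):
    """Scale the range limit toward the target acceptance rate, then clamp it."""
    rlimit_new = rlimit_old * (1.0 - target + alpha)
    return min(max(rlimit_new, 1.0), float(max_dim))   # clamp is an assumption

for alpha in (0.9, 0.44, 0.1):
    print(alpha, update_rlimit(10.0, alpha, max_dim=20))
```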

VPR Cooling Schedule

  • Moves at Temperature = c×N^(4/3)
  • Temperature Update
    – Tnew = Told × γ
    – Idea: advance slowly in good α range

Betz, Rose, & Marquardt Kluwer 1999
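The schedule can be sketched as two small functions: the number of moves per temperature, c×N^(4/3), and a cooling factor γ chosen from the measured acceptance rate α. The breakpoints and factors below are illustrative placeholders picked only to show the "advance slowly in the good α range" shape, and `c=10.0` is likewise a hypothetical default. Consult the cited reference for the actual table.

```python
def moves_per_temperature(n_cells, c=10.0):
    """Moves attempted at each temperature: c * N^(4/3), rounded to an integer."""
    return round(c * n_cells ** (4.0 / 3.0))

def gamma(alpha):
    """Cooling factor as a function of acceptance rate (placeholder values)."""
    if alpha > 0.96:
        return 0.5    # nearly everything accepted: cool quickly
    if alpha > 0.8:
        return 0.9
    if alpha > 0.15:
        return 0.95   # productive range: cool slowly
    return 0.8        # almost frozen: finish up

t = 100.0
for alpha in (0.99, 0.9, 0.5, 0.1):
    t *= gamma(alpha)                # Tnew = Told * gamma
    print(alpha, round(t, 3))
print(moves_per_temperature(1000))
```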

Range Limiting?

  • Eguro alternate [DAC 2005]
    – define P = D^(−M)
    – Tune M to control α

Range Limiting

Eguro, Hauck, & Sharma DAC 2005

Big Hammer

  • Costly, but general
  • Works for almost all problems
    – (partition, placement, route, retime, schedule…)
  • Can have hybrid/mixed cost functions
    – as long as weight to single potential
    – (e.g. wire/time from VPR)

  • With care, can attack multiple levels

– place and route

  • Ignores structure of problem

– resignation to finding/understanding structure

Optimal/Exhaustive

  • If you run simulated annealing long enough…
    – It should converge to optimum
  • If you have enough monkeys typing at keyboards long enough
    – They’ll eventually produce the works of Shakespeare…

Brute Force?

  • If you are really going to give up on structure and explore the entire space
    – …there are more efficient ways to do it
  • …and maybe they’re not terrible
    – For small/modest problems
    – As our computers get faster

Optimum Placement

  • Simplest case:
    – Gate/array (FPGA) w/ fixed cell locations
  • N locations
  • M cells
  • Try all permutations of N choose M
    – N!/(N−M)! cases
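Assigning M distinct cells to M of N fixed locations, with order mattering, gives N!/(N−M)! cases. The sketch below enumerates them with `itertools.permutations` and keeps the cheapest one under a bounding-box wire cost. The tiny instance, the cost function, and the names are assumptions for illustration. This is only practical for very small N and M.

```python
from itertools import permutations
from math import factorial

def count_placements(n_locations, m_cells):
    """Number of ways to assign m distinct cells to n locations: N!/(N-M)!."""
    return factorial(n_locations) // factorial(n_locations - m_cells)

def wire_cost(placement, nets):
    """Sum of semi-perimeter bounding boxes over all nets."""
    total = 0
    for net in nets:
        xs = [placement[cell][0] for cell in net]
        ys = [placement[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def exhaustive_place(cells, locations, nets):
    """Try every assignment of cells to locations and keep the cheapest."""
    best_cost, best = float("inf"), None
    for chosen in permutations(locations, len(cells)):
        placement = dict(zip(cells, chosen))
        cost = wire_cost(placement, nets)
        if cost < best_cost:
            best_cost, best = cost, placement
    return best_cost, best

cells = ["a", "b", "c"]
locations = [(x, 0) for x in range(4)]      # one row of four slots
nets = [("a", "b"), ("b", "c")]
best_cost, best = exhaustive_place(cells, locations, nets)
print(count_placements(len(locations), len(cells)), best_cost, best)
```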

Improving

  • Prune off symmetry cases
    – Rotate 90, 180, 270
    – Mirror X, Y, XY
  • Reject provably bad starts
    – (return to in a minute)

Exhaustive Placement

  • More general:
    – Modules have variable size
    – Modules can be rotated/flipped…
  • To explore all cases:
    – For each module
      • For each orientation

Branch-and-Bound

  • Consider dense 1D placement
    – Search space of all placements
    – Tree branch is choice of logical cell for physical cell position
    – Keep track of best solution so far as reach leaves
    – If partial solution worse than best solution, prune branch

Pruning: 1D example

  • Reducing channel width
    – Have solution with width=10
    – When find a partial solution with width>=10
      • Can abort that branch
  • Reducing Delay
    – Have solution with delay=20
    – When find a partial solution with delay>=20
      • Can abort branch
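A small branch-and-bound for the dense 1D case, minimizing channel (cut) width, can be sketched as follows. Each tree level chooses the cell for the next position. The cut at a boundary depends only on which cells lie to its left, so the maximum cut over a partial ordering is a valid lower bound on any completion, and a branch is abandoned as soon as that bound reaches the best width found so far. The names and the tiny netlists are assumptions for illustration.

```python
def cut_width(placed, nets):
    """Nets with at least one cell inside `placed` and at least one outside."""
    return sum(1 for net in nets
               if any(c in placed for c in net) and any(c not in placed for c in net))

def branch_and_bound(cells, nets):
    """Return (minimum channel width, an ordering achieving it)."""
    best = {"width": float("inf"), "order": None}

    def extend(order, placed, width_so_far):
        if width_so_far >= best["width"]:
            return                              # prune: cannot beat the best solution
        if len(order) == len(cells):
            best["width"], best["order"] = width_so_far, list(order)
            return
        for cell in cells:
            if cell in placed:
                continue
            order.append(cell)
            placed.add(cell)
            extend(order, placed, max(width_so_far, cut_width(placed, nets)))
            order.pop()
            placed.remove(cell)

    extend([], set(), 0)
    return best["width"], best["order"]

chain = [("a", "b"), ("b", "c"), ("c", "d")]
ring = chain + [("a", "d")]
print(branch_and_bound(["a", "b", "c", "d"], chain))
print(branch_and_bound(["a", "b", "c", "d"], ring))
```

The same skeleton applies to the delay example by replacing `cut_width` with any partial-solution delay measure that can only grow as more cells are placed.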

Viable

  • Only on small problems
    – But “small” growing with machine speed
  • Use for end-case in constructive placement
    – Flatten bottom of hierarchy
    – Maybe even in iteration/overlap relaxation

Caldwell et al. Results

[TRCAD v19n11p1304-13]

Runtime

[TRCAD v19n11p1304-13]

Runtime

[TRCAD v19n11p1304-13]

Faster?

  • Accounting and gain complexity
    – Make it linear time
    – But does make each update somewhat complex
    – Exhaustive case less bookkeeping

  • Not an atypical result…

Summary

  • Simulated Annealing
    – use randomness to explore space
    – accept “bad” moves to avoid local minima
    – decrease tolerance over time
  • General purpose solution
    – costly in runtime
  • Small (sub)problems
    – May solve exhaustively
    – Can prune to accelerate…

Admin

  • Class Monday
  • …but not W, F (finish assignment)

Big Ideas:

  • Use randomness to explore large (non-convex) space
    – Simulated Annealing
  • Use dominance to quickly skip over obviously bad solutions
    – Branch and Bound