MrDP: Multiple-row Detailed Placement of Heterogeneous-sized Cells


SLIDE 1

MrDP: Multiple-row Detailed Placement of Heterogeneous-sized Cells for Advanced Nodes

Yibo Lin¹, Bei Yu², Xiaoqing Xu¹, Jhih-Rong Gao³, Natarajan Viswanathan³, Wen-Hao Liu³, Zhuo Li³, Charles J. Alpert³, David Z. Pan¹

¹University of Texas at Austin, ²The Chinese University of Hong Kong, ³Cadence Design Systems

1 / 28

SLIDE 2

Outline

◮ Introduction
◮ Problem Formulation
◮ Detailed Placement Algorithms
◮ Experimental Results
◮ Conclusion

2 / 28

SLIDE 6

Introduction: Technology Scaling

[Figure: technology scaling roadmap, 2005 to 2025, with complexity rising along three axes: Interconnect, Patterning, and Transistors. Labels shown: Al / Cu / W wires, W LI, Cu doping, 3D IC, opto connect, graphene, CNT; LE, LELE, LELELE, SADP, SAQP, EUV, EBL, DSA; planar CMOS, FinFET, HNW, VNW, eNVM; nodes 10 nm, 7 nm, 5 nm, 3 nm. Courtesy ARM.]

3 / 28

SLIDE 7

Technology Scaling: Fewer Tracks

Track count per row decreases:

◮ From 10 to 7.5 tracks
◮ Exploring 7.5T for the 7 nm technology node
◮ Even with EUV, an additional metal layer may be required

[Figure: (a) and-or-invert (AOI); (b) 2-finger inverter [Liebmann+,SPIE’15].]

4 / 28

SLIDE 8

Motivation of Multiple-Row Cells 1

◮ Complex standard cells, such as flip-flops, MUXes, etc.
◮ Intra-cell routability

[Figure: (a) cell size 54 grids; (b) cell size 48 grids.]

5 / 28

SLIDE 9

Motivation of Multiple-Row Cells 2

Pin access problem [Taghavi+,ICCAD’10]

[Figure: two adjacent cells, Cell 1 and Cell 2, shown in two placements; legend: M1 pin, M2, M3, V1, V2, blocked pin. (a) Pin access failure; (b) pin access success [Xu+,DAC’14].]

6 / 28

SLIDE 10

Motivation of Multiple-Row Cells 3

Multi-bit flip-flops (MBFF)

[Jiang+,ISPD’11] [Pokala+,ASIC’92]

7 / 28

SLIDE 11

Power Line Alignment

Odd-row height cells

◮ Misalignment fixable with vertical flipping

Even-row height cells

◮ Misalignment NOT fixable with vertical flipping
◮ New placement techniques are highly necessary

[Figure: cells a to g placed across rows bounded by alternating VDD and GND power rails.]
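
The flipping argument above can be checked with a few lines of code. This is a minimal sketch under stated assumptions, not the authors' implementation: rows are assumed to alternate VDD and GND along their bottom edges, and the function name `power_alignment`, its parameters, and the `row0_bottom_rail` default are hypothetical.

```python
def other_rail(rail):
    """Return the opposite power rail."""
    return "GND" if rail == "VDD" else "VDD"


def power_alignment(height_rows, cell_bottom_rail, row_index, row0_bottom_rail="VDD"):
    """Return (legal, flip) for placing a cell with its bottom edge on row_index.

    Assumption: the rail on the bottom edge of row r alternates with r,
    starting from row0_bottom_rail on row 0. cell_bottom_rail is the rail
    on the cell's bottom edge in its unflipped orientation.
    """
    row_rail = row0_bottom_rail if row_index % 2 == 0 else other_rail(row0_bottom_rail)
    if height_rows % 2 == 1:
        # Odd height: top and bottom rails differ, so a vertical flip
        # swaps them and always repairs a mismatch.
        return True, row_rail != cell_bottom_rail
    # Even height: top and bottom rails are the same type, so flipping
    # changes nothing; the row itself must match.
    return row_rail == cell_bottom_rail, False
```

Under these assumptions, `power_alignment(1, "VDD", 1)` reports a legal placement that needs a flip, while `power_alignment(2, "VDD", 1)` reports an illegal one: the double-row cell has to move to a neighboring row instead.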

8 / 28

SLIDE 14

Previous Works

Double-row height cells [Wu+,TCAD’15]

◮ Groups and extends single-row height cells into double-row height blocks
◮ Re-uses existing detailed placement frameworks
◮ Cannot handle three- and four-row height cells
◮ Power alignment not addressed

Legalization for multiple-row height cells [Chow+,DAC’16]

◮ General to heterogeneous-sized cells
◮ Minimizes total displacement while removing overlaps
◮ Power alignment addressed
◮ No performance optimization

9 / 28

SLIDE 16

Wirelength and Density Metrics

Cell Density: ABU [ICCAD’13 Contest]

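\[
\mathrm{overflow}_\gamma = \max\left(0, \frac{\mathrm{ABU}_\gamma}{d_t} - 1\right), \qquad
\mathrm{ABU} = \frac{\sum_{\gamma \in \Gamma} w_\gamma \cdot \mathrm{overflow}_\gamma}{\sum_{\gamma \in \Gamma} w_\gamma}, \qquad
\Gamma = \{2, 5, 10, 20\}
\]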

Scaled wirelength (sHPWL)

sHPWL = HPWL · (1 + ABU)

APU

Average Pin Utilization (APU): captures the pin distribution of the layout.
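
The ABU and sHPWL definitions above can be turned into a short calculation. The sketch below assumes that ABU_γ is the average utilization of the densest γ% of bins and that d_t is the target density, following the ICCAD'13 contest convention; the function names and the weights used in the usage line are placeholders chosen for illustration, not values stated on the slide.

```python
def abu_gamma(bin_utils, gamma):
    """Average utilization of the densest gamma percent of bins (assumed definition)."""
    ranked = sorted(bin_utils, reverse=True)
    k = max(1, len(ranked) * gamma // 100)  # at least one bin
    return sum(ranked[:k]) / k


def abu_penalty(bin_utils, target_density, weights):
    """Weighted overflow over all gamma values; weights maps gamma -> w_gamma."""
    num = 0.0
    for gamma, w in weights.items():
        overflow = max(0.0, abu_gamma(bin_utils, gamma) / target_density - 1.0)
        num += w * overflow
    return num / sum(weights.values())


def scaled_hpwl(hpwl, abu):
    """sHPWL = HPWL * (1 + ABU)."""
    return hpwl * (1.0 + abu)


# Usage: 10 full bins and 90 half-full bins, target density 0.8.
bins = [1.0] * 10 + [0.5] * 90
abu = abu_penalty(bins, 0.8, {2: 10, 5: 4, 10: 2, 20: 1})  # placeholder weights
```

In this example the top 2%, 5%, and 10% of bins each overflow by 25%, while the top 20% average 0.75 and stay under the target, so the weighted result is 16 · 0.25 / 17 = 4/17.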

10 / 28

SLIDE 17

Problem Formulation: MrDP

Multi-row Detailed Placement (MrDP)

Input:

◮ A netlist with heterogeneous-sized cells
◮ Initial placement with fixed macro blocks

Output:

◮ Legal placement
◮ Minimize wirelength and density cost, i.e., sHPWL and APU

11 / 28

SLIDE 18

Conventional Global Move

◮ Pick a cell and move it to a better position
◮ More difficult with heterogeneous-sized cells

[Figure: a placement of cells a to m, with cell t looking for a better position.]

12 / 28

SLIDE 24

Chain Move

◮ Cell pool: a queue used for temporary storage of cells within a chain move
◮ Scoreboard: an array of chain move entries, each with the corresponding change in wirelength cost
◮ Inspired by the KL and FM partitioning algorithms [KL’70][FM,DAC’82]
◮ Look for cumulatively good cost

[Figure: cells c, j, g, f, d, k, h and cell t, with the cell pool and the scoreboard.]

Chain move entry:

\[
\left\{
\begin{array}{l}
\text{Cell } t:\ p_1^0 \to p_1 \\
\text{Cell } g:\ p_2^0 \to p_2 \\
\text{Cell } j:\ p_3^0 \to p_3
\end{array}
\right\},\ \Delta WL
\]

13 / 28

SLIDE 26

Chain Move Discussion

◮ Order is important
◮ Max prefix sum of wirelength improvement
◮ Discard long chains

Cost for a cell:

\[
\text{cost} = \Delta WL \cdot (1 + \alpha \cdot c_d) + \beta \cdot c_{ov}
\]

◮ \(\Delta WL\): wirelength cost
◮ \(c_d\): density cost (average of cell and pin densities)
◮ \(c_{ov}\): overlap cost
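
As a worked instance of the cost formula above, the following sketch evaluates it for one candidate position. The function names and the values chosen for α and β are illustrative assumptions; the slides do not state how the two weights are set.

```python
def density_cost(cell_density, pin_density):
    """c_d: average of the cell density and the pin density."""
    return 0.5 * (cell_density + pin_density)


def move_cost(delta_wl, c_d, c_ov, alpha, beta):
    """cost = delta_wl * (1 + alpha * c_d) + beta * c_ov."""
    return delta_wl * (1.0 + alpha * c_d) + beta * c_ov


# Usage with made-up numbers: a wirelength cost of 10, densities 0.6 and 0.4,
# no overlap, alpha = 2 and beta = 100 (both hypothetical).
example = move_cost(10.0, density_cost(0.6, 0.4), 0.0, alpha=2.0, beta=100.0)
```

With these numbers c_d = 0.5, so the cost is 10 · (1 + 2 · 0.5) = 20. A large β makes any overlapping position expensive, which is one plausible way to steer moves toward overlap-free positions.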

Theorem

If the input is legal, then the output is guaranteed legal
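
The "max prefix sum" rule from the chain move discussion can be sketched in the style of KL/FM passes: record every tentative move on the scoreboard, then keep only the leading run of entries whose cumulative wirelength improvement is largest and roll back the rest. The code below assumes the prefix is taken over scoreboard entries in recording order and that a negative ΔWL means shorter wires; `ChainMoveEntry` and `best_prefix` are hypothetical names.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class ChainMoveEntry:
    moves: list      # [(cell, old_position, new_position), ...]
    delta_wl: float  # change in wirelength; negative means improvement


def best_prefix(scoreboard):
    """Return (count, gain): how many leading entries to commit, and their total gain.

    A count of 0 means no prefix improves wirelength, so everything is rolled back.
    """
    best_count, best_gain = 0, 0.0
    gains = accumulate(-entry.delta_wl for entry in scoreboard)
    for count, gain in enumerate(gains, start=1):
        if gain > best_gain:
            best_count, best_gain = count, gain
    return best_count, best_gain


# Usage: the second move is harmful on its own, but accepting it enables the third.
board = [ChainMoveEntry([("t", 0, 5)], -3.0),
         ChainMoveEntry([("g", 5, 9)], 2.0),
         ChainMoveEntry([("j", 9, 2)], -4.0),
         ChainMoveEntry([("c", 2, 7)], 1.0)]
count, gain = best_prefix(board)
```

Here the cumulative gains are 3, 1, 5, 4, so the first three entries are committed and the fourth is undone. This is why order matters: a different ordering of the same moves can produce a different best prefix.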

14 / 28

SLIDE 27

Ordered Single-Row (OSR) Placement

Well explored for single-row height cells

◮ Free-to-move [Vygen,DATE’98] [Kahng+,ASPDAC’99]
◮ Max displacement [Taghavi+,ICCAD’10] [Lin+,ASPDAC’16]

How to deal with multiple-row height cells? Movement is limited by cells that span multiple rows.

[Figure: cells b, i, f, d, a, c, e.]

15 / 28

SLIDE 28

Ordered Double-Row (ODR) Placement

◮ Extend single-row to double-row placement
◮ Some definitions

[Figure: cells a to m in a double-row region, annotated with splitting cells, crossing cells, and Partitions 1 to 3.]

16 / 28

SLIDE 29

Problem Formulation: ODR Placement

Ordered Double-Row (ODR) Placement

Input:

◮ Two rows of cells in a double-row region
◮ Ordered from left to right within each row
◮ Maximum displacement M for each cell
◮ All other cells outside the double-row region are fixed

Output:

◮ Horizontally shift cells
◮ Optimize HPWL while keeping the order of cells within each row

17 / 28

SLIDE 30

ODR Placement: Ideal Cases

◮ Only double-row splitting cells
◮ No crossing cells
◮ No inter-row connections within the double-row region
◮ Solve the ideal case optimally

[Figure: cells a to m in a double-row region split into Partitions 1 to 3, with two cells labeled Fixed and the three partitions labeled Independent.]

18 / 28

SLIDE 32

Nested Dynamic Programming

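[Figure: cells a to m in a double-row region split into Partitions 1 to 3, with two cells labeled Fixed and the three partitions labeled Independent.]

Outer-level shortest path: a graph from source s to sink t through nodes \(e_1, \dots, e_k\) and \(i_1, \dots, i_k\), with edge costs such as \(f_e(s, e_1)\), \(f_e(s, e_k)\), \(f_i(e_1, i_1)\), \(f_i(e_k, i_k)\), \(f_t(i_1, t)\), and \(f_t(i_k, t)\), grouped by Partitions 1 to 3.

Inner-level shortest path: an edge cost such as \(f_i(e_1, i_1)\) is the sum of two shortest paths from \(s_{e_1}\) to \(t_{i_1}\), one through nodes \(f_1, \dots, f_k\) and one through nodes \(g_1, \dots, g_k\) and \(h_1, \dots, h_k\).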
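
The outer level described above can be read as a shortest path through a layered graph: one layer per splitting cell, one node per candidate position, and an edge cost supplied by the inner level. The sketch below illustrates that reading only. `layered_shortest_path` and the toy `edge_cost` are hypothetical, and the real inner cost would come from ordered single-row placement of each row in the partition.

```python
def layered_shortest_path(layers, edge_cost):
    """Shortest path through consecutive layers of candidate positions.

    layers[j] lists candidate positions for layer j (source and sink are
    single-node layers). edge_cost(j, p, q) is the inner-level optimum for
    the partition between position p in layer j and q in layer j + 1.
    Returns (total_cost, chosen_positions), or (inf, None) if infeasible.
    """
    INF = float("inf")
    dist = [0.0] * len(layers[0])
    back = []
    for j in range(len(layers) - 1):
        nxt, bp = [], []
        for q in layers[j + 1]:
            best, arg = INF, -1
            for a, p in enumerate(layers[j]):
                c = dist[a] + edge_cost(j, p, q)
                if c < best:
                    best, arg = c, a
            nxt.append(best)
            bp.append(arg)
        dist = nxt
        back.append(bp)
    k = min(range(len(dist)), key=dist.__getitem__)
    if dist[k] == INF:
        return INF, None
    total, path = dist[k], [layers[-1][k]]
    for j in range(len(back) - 1, -1, -1):
        k = back[j][k]
        path.append(layers[j][k])
    return total, path[::-1]


# Toy usage: source at 0, sink at 14, two splitting cells with three candidates
# each. The stand-in inner cost penalizes deviation from a preferred gap.
preferred = [4, 5, 5]
result = layered_shortest_path(
    [[0], [3, 4, 5], [8, 9, 10], [14]],
    lambda j, p, q: abs((q - p) - preferred[j]),
)
```

Each pair of adjacent layers examines every (p, q) combination, which is where a quadratic factor in the number of candidate positions comes from.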

19 / 28

SLIDE 33

Nested Dynamic Programming

◮ Any shortest-path algorithm can be applied
◮ Adopt dynamic programming [Lin+,ASPDAC’16]
◮ \(O(nM)\) for single-row placement
◮ \(O(nM^2)\) for double-row placement
◮ Flexible to any cost that depends only on the cell itself

[Figure: cells b, i, f, d, a, c, e.]

Supports an additional overlap cost: add a very large cost if there is an overlap.
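
To make the single-row bound concrete, here is a generic dynamic program for ordered single-row placement with a maximum displacement. It is a sketch of the general technique, not the algorithm of [Lin+,ASPDAC’16]: it assumes integer placement sites, a per-cell cost that depends only on that cell's position, and hypothetical names throughout. A running minimum over the previous cell's candidates keeps the work per cell proportional to the number of candidate sites.

```python
def ordered_single_row(widths, x_init, max_disp, cost, row_lo, row_hi):
    """Place cells in fixed left-to-right order without overlap.

    cost(i, x) is the cost of putting the left edge of cell i at site x.
    Each cell may move at most max_disp sites from x_init[i] and must stay
    inside [row_lo, row_hi]. Returns (total_cost, positions) or (inf, None).
    """
    INF = float("inf")
    n = len(widths)
    cand, back = [], []
    prev_xs, prev_dp = [], []
    for i in range(n):
        lo = max(row_lo, x_init[i] - max_disp)
        hi = min(row_hi - widths[i], x_init[i] + max_disp)
        xs = list(range(lo, hi + 1))
        dp, bp = [], []
        j, best, best_j = 0, INF, -1
        for x in xs:
            if i == 0:
                dp.append(cost(i, x))
                bp.append(-1)
                continue
            # Fold in every predecessor position that ends at or before x.
            while j < len(prev_xs) and prev_xs[j] + widths[i - 1] <= x:
                if prev_dp[j] < best:
                    best, best_j = prev_dp[j], j
                j += 1
            dp.append(best + cost(i, x))
            bp.append(best_j)
        cand.append(xs)
        back.append(bp)
        prev_xs, prev_dp = xs, dp
    if not prev_dp or min(prev_dp) == INF:
        return INF, None
    k = min(range(len(prev_dp)), key=prev_dp.__getitem__)
    total, pos = prev_dp[k], [0] * n
    for i in range(n - 1, -1, -1):
        pos[i] = cand[i][k]
        k = back[i][k]
    return total, pos


# Usage: three overlapping width-2 cells, cost = displacement from the start.
start = [0, 1, 2]
placed = ordered_single_row([2, 2, 2], start, 3,
                            lambda i, x: abs(x - start[i]), 0, 10)
```

An overlap penalty of the kind mentioned above fits the same interface: `cost(i, x)` can return a very large value whenever cell i at x would overlap a fixed multiple-row cell.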

20 / 28

SLIDE 34

ODR Placement: General Cases

◮ Multiple-row height splitting cells
◮ Multiple-row height crossing cells: add overlap cost
◮ Inter-row connections within the double-row region: optimality is lost

[Figure: cells a to m in a double-row region.]

21 / 28

SLIDE 35

Overall Flow

[Flow chart: Global Placement → Legal? (N: Chain Move in Overlap Reduction Mode → Legal? (N: Legalization)) → Chain Move in WL Mode → Multi-Row Placement → Converge? (N: back to Chain Move in WL Mode; Y: Final Placement)]
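
One possible reading of the flow chart is sketched below as a driver loop. The edge order is an interpretation of the flattened chart: overlap-reduction chain moves and legalization run only when the input is illegal, and the wirelength loop then repeats until convergence. Every callable is a hypothetical stand-in supplied by the caller, and the relative-improvement convergence test is an assumption.

```python
def mrdp_flow(placement, is_legal, chain_move_overlap, legalize,
              chain_move_wl, multi_row_place, cost, tol=1e-3, max_iters=50):
    """Run the assumed MrDP flow and return the final placement."""
    # Stage 1: reach a legal placement.
    if not is_legal(placement):
        placement = chain_move_overlap(placement)
        if not is_legal(placement):
            placement = legalize(placement)
    # Stage 2: alternate chain moves and multi-row placement until the
    # cost stops improving by more than a relative tolerance (assumed).
    prev = cost(placement)
    for _ in range(max_iters):
        placement = multi_row_place(chain_move_wl(placement))
        cur = cost(placement)
        if prev - cur <= tol * max(abs(prev), 1e-12):
            break
        prev = cur
    return placement


# Usage with numeric stubs: the "placement" is just its own cost.
final = mrdp_flow(
    12.0,
    is_legal=lambda p: p <= 10.0,
    chain_move_overlap=lambda p: p - 1.0,
    legalize=lambda p: 10.0,
    chain_move_wl=lambda p: max(p - 2.0, 4.0),
    multi_row_place=lambda p: max(p - 1.0, 4.0),
    cost=lambda p: p,
)
```

Passing the stages in as callables keeps the sketch independent of any particular data structure for the placement.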

22 / 28

SLIDE 41

Experimental Setup

◮ Implemented in C++
◮ 8-core 3.4 GHz Linux server
◮ 32 GB RAM
◮ ISPD 2005 Contest Benchmark:
    ◮ Double-row height cells [Wu+,TCAD’15]
    ◮ Benchmark sizes: 200K to 2M
    ◮ Utilization: 67% to 91%
    ◮ Double-row ratio: around 30%
◮ ICCAD 2014 Contest Benchmark:
    ◮ Multiple-row height cells (2–4 rows)
    ◮ Benchmark sizes: 133K to 961K
    ◮ Utilization: 47% to 65%
    ◮ Multiple-row ratio: 15% to 41%

24 / 28

SLIDE 42

Results on Double-row Height Cells

[Figure: (a) normalized sHPWL; (b) APU penalty; (c) runtime (s).]

MrDP vs. [Wu+,TCAD’15]

◮ 3% better sHPWL
◮ 13.2% better APU
◮ 23.5% runtime overhead

25 / 28

SLIDE 43

Results on Heterogeneous-sized Cells

[Figure: (a) normalized sHPWL; (b) APU penalty; (c) runtime (s).]

MrDP vs. GP

◮ 3.7% better sHPWL
◮ 15.3% better APU

26 / 28

SLIDE 45

Conclusion

Placement challenges with heterogeneous-sized standard cells in advanced technology nodes

◮ A placement framework to optimize wirelength and congestion
◮ Chain move scheme
◮ Ordered double-row placement

Future work

◮ Explore the impact of the legalization step
◮ Different configurations of placement flows

27 / 28

SLIDE 46

Thank You

Yibo Lin (yibolin@cerc.utexas.edu)
Bei Yu (byu@cse.cuhk.edu.hk)
Xiaoqing Xu (xiaoqingxu@cerc.utexas.edu)
Jhih-Rong Gao (jrgao@cadence.com)
Natarajan Viswanathan (nviswan@cadence.com)
Wen-Hao Liu (wliu@cadence.com)
Zhuo Li (zhuoli@cadence.com)
Charles J. Alpert (alpert@cadence.com)
David Z. Pan (dpan@ece.utexas.edu)

28 / 28