Pseudo-3D Approaches for Commercial-Grade RTL-to-GDS Tool Flow Targeting Monolithic 3D ICs

Heechun Park, Bon Woong Ku, Kyungwook Chang, Da Eun Shim, and Sung Kyu Lim
School of Electrical and Computer Engineering, Georgia Institute of Technology


slide-2
SLIDE 2

Monolithic 3D IC (M3D IC)

  • Massive vertical interconnects

– Nanoscale monolithic inter-tier vias (MIVs)
– Free from the large load of through-silicon vias (TSVs)

  • An electronic design automation (EDA) flow is required

– To obtain the full benefit of the 3D structure

[Figure: 2D product vs. TSV-based 3D vs. monolithic 3D]

slide-3
SLIDE 3

M3D IC Design Flow - Placement

  • True-3D placer

– Fundamental 3D solution
– Extends a 2D analytic placer to 3D

  • (x,y) → (x,y,z), z: tier assignment

– NOT ready for commercial adoption

  • Not compatible with the industrial RTL-to-GDS flow
  • Only reports HPWL

– Low quality

  • Pseudo-3D placer

– Utilizes 2D CAD engines

  • Focus on seamless application to 3D design

– Fully compatible with the industrial flow

  • GDS-based results (PPA reports)

– Commercial-grade quality

| Design | Metric | True-3D (Force-3D) | Pseudo-3D (Shrunk-2D) |
|---|---|---|---|
| AES-128 (3.4GHz) | WL (m) | 1.495 | 1.212 (-18.93%) |
| | WNS (ns) | -0.077 | -0.66 |
| | Pwr. (mW) | 274.16 | 268.00 (-2.24%) |
| LDPC (1.2GHz) | WL (m) | 2.132 | 1.226 (-42.50%) |
| | WNS (ns) | -0.195 | -0.086 |
| | Pwr. (mW) | 139.31 | 81.91 (-41.20%) |
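The true-3D placers on this slide report only HPWL (half-perimeter wirelength), while the pseudo-3D flows report GDS-based results. As a minimal sketch of what HPWL measures, assuming each net is given as a list of (x, y) pin coordinates (the function names and data layout are illustrative, not taken from any placer):

```python
def hpwl(pins):
    """Half-perimeter wirelength of one net: width + height of the pins' bounding box."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets):
    """Sum of HPWL over all nets; `nets` maps a net name to its pin list."""
    return sum(hpwl(pins) for pins in nets.values())


# Hypothetical three-pin and two-pin nets (coordinates in um).
nets = {
    "n1": [(0.0, 0.0), (3.0, 4.0), (1.0, 2.0)],
    "n2": [(5.0, 5.0), (5.0, 9.0)],
}
print(total_hpwl(nets))  # 11.0
```

HPWL ignores routing detours and congestion, so it is only an estimate of routed wirelength and says nothing about timing or power.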
slide-4
SLIDE 4

Contributions

  • Detailed discussion of state-of-the-art pseudo-3D design flows

– Cascade-2D [ICCAD’16]
– Shrunk-2D [ISLPED’14, TCAD’17]
– Compact-2D [ISPD’18, TCAD’20 (early access)]

  • Thorough comparison of pseudo-3D design flows

– Qualitative
– Quantitative (power-performance-area (PPA))

  • Limitation analysis and enhancements

– Cascade-2D: Placement-aware MIV planning
– Shrunk-2D & Compact-2D: Partitioning-first adoption

slide-5
SLIDE 5

State-of-the-art Pseudo-3D: Cascade-2D

  • Partitioning-first

– Tier partitioning (z) → 2D placement (x,y)
– More design freedom

  • MIV = 2 anchor cells + dummy wire
  • Simultaneous timing closure

– Both tiers are in the same design space

  • 1:2 plane

– Cut-and-slide for 3D IC

[Figure: Cascade-2D design flow and MIV planning, showing top and bottom tiers, anchor cells, and dummy wires]
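The 1:2 plane and the cut-and-slide step can be illustrated with a small coordinate transform. This is a sketch under assumptions the slide does not state: the two tiers are laid side by side along x in one 2D design of width 2W (top tier in the left half), and cut-and-slide maps each location back to (x, y, tier). The names `FOOTPRINT_W` and `cut_and_slide` are hypothetical.

```python
FOOTPRINT_W = 100.0  # assumed width of one tier's footprint (um)


def cut_and_slide(x, y, w=FOOTPRINT_W):
    """Map a location in the side-by-side 2D plane (width 2*w) to (x, y, tier).

    Assumption: the left half [0, w) holds the top tier and the right
    half [w, 2*w) holds the bottom tier, slid back by w after the cut.
    """
    if not 0.0 <= x < 2 * w:
        raise ValueError("x lies outside the 1:2 plane")
    if x < w:
        return (x, y, "top")
    return (x - w, y, "bottom")


# Two anchor cells modelling one MIV: in the 2D plane they sit w apart and
# are joined by a dummy wire; after cut-and-slide they share one (x, y).
top_anchor = cut_and_slide(30.0, 40.0)
bottom_anchor = cut_and_slide(130.0, 40.0)
print(top_anchor, bottom_anchor)
```

Because both halves live in one design during place and route, a single timing run sees paths that cross tiers, which is what the simultaneous timing closure bullet refers to.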

slide-6
SLIDE 6

State-of-the-art Pseudo-3D: Shrunk-2D

  • Partitioning-last

– 2D placement (x,y) → Tier partitioning (z)
– Less design freedom

  • Utilize 2D engine with shrunk cells

– ‘Projected’ 2D placement

  • Placement-aware tier partitioning

– Bin-based FM min-cut [DAC’82] algorithm

[Figure: Shrunk-2D design flow and resulting 3D placement]
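To make the bin-based FM min-cut step concrete, here is a deliberately simplified single FM-style pass. It is a sketch, not the partitioner used in the flow: gains are recomputed naively instead of using gain buckets, balance is measured in cell counts rather than area, and the names `fm_pass`, `bin_of`, and `max_skew` are illustrative assumptions.

```python
def cut_size(nets, tier):
    """Count nets whose cells sit on both tiers (each would need an MIV)."""
    return sum(1 for net in nets if len({tier[c] for c in net}) == 2)


def fm_pass(nets, tier, bin_of, max_skew=2):
    """One simplified FM pass; `tier` (cell -> 0/1) is updated in place.

    bin_of maps each cell to its placement bin. A move is legal only if the
    top/bottom cell-count difference inside that bin stays <= max_skew.
    """
    def legal(c):
        members = [k for k in tier if bin_of[k] == bin_of[c]]
        top = sum(tier[k] for k in members) + (1 - 2 * tier[c])
        return abs(2 * top - len(members)) <= max_skew

    def gain(c):
        before = cut_size(nets, tier)
        tier[c] ^= 1                      # trial move
        after = cut_size(nets, tier)
        tier[c] ^= 1                      # undo trial move
        return before - after

    locked, moves = set(), []
    best_cut, best_len = cut_size(nets, tier), 0
    while True:
        candidates = [c for c in tier if c not in locked and legal(c)]
        if not candidates:
            break
        c = max(candidates, key=gain)     # highest gain, even if negative
        tier[c] ^= 1
        locked.add(c)
        moves.append(c)
        cut = cut_size(nets, tier)
        if cut < best_cut:
            best_cut, best_len = cut, len(moves)
    for c in moves[best_len:]:            # roll back to the best prefix
        tier[c] ^= 1
    return best_cut


# Hypothetical netlist: four 2-pin nets, cells a-d in bin 0, e-h in bin 1.
nets = [("a", "b"), ("c", "d"), ("e", "f"), ("g", "h")]
tier = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1, "g": 0, "h": 1}
bin_of = {c: (0 if c in "abcd" else 1) for c in tier}
initial = cut_size(nets, tier)
final = fm_pass(nets, tier, bin_of)
print(initial, "->", final)  # 4 -> 0
```

The per-bin balance check is what makes the pass placement-aware in this sketch: cells can change tier only if their own placement bin keeps a roughly even split, so the result stays close to the projected 2D placement.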

slide-7
SLIDE 7

State-of-the-art Pseudo-3D: Compact-2D

  • Partitioning-last

– 2D placement (x,y) → Tier partitioning (z)

  • RC derating to mimic the smaller footprint

– Does NOT shrink cells → usable even at small technology nodes

  • Ex. Shrunk-2D at 7nm would need cells shrunk to about 5nm, which is impossible

➔ Compact-2D at 7nm is possible

  • Placement-aware tier partitioning

– Bin-based FM min-cut [DAC’82] algorithm

[Figure: Compact-2D design flow and resulting 3D placement]
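RC derating can be pictured as scaling per-unit-length parasitics so that wires routed in the full 2D footprint carry the RC of the shorter wires expected in the half-area 3D footprint. The sketch below assumes a linear-dimension factor of 1/√2 (two tiers, half the footprint area); the factor, the function name, and the unit parasitics are illustrative assumptions, not values from the slides.

```python
import math

# Assumed: a two-tier design halves the footprint area, so linear
# dimensions (and hence wire lengths) scale by 1/sqrt(2).
DERATE = 1.0 / math.sqrt(2.0)


def derated_wire_rc(length_um, r_per_um, c_per_um, factor=DERATE):
    """Return (R, C) of a wire drawn at 2D length but derated to mimic 3D length."""
    return (length_um * r_per_um * factor, length_um * c_per_um * factor)


# Hypothetical unit parasitics: 2 ohm/um and 0.2 fF/um, 100 um wire.
r, c = derated_wire_rc(100.0, 2.0, 0.2)
print(round(r, 2), round(c, 2))  # 141.42 14.14
```

Under the same assumption, shrinking 7nm geometries by this factor gives about 4.95nm, which is consistent with the 7nm → 5nm example above.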

slide-8
SLIDE 8

Comparison: Qualitative (1)

  • Macro blocks

– Cascade-2D: Fully handled
– Shrunk-2D: Partial blockages
– Compact-2D: Partial blockages

| | Cascade-2D | Shrunk-2D | Compact-2D |
|---|---|---|---|
| Tier partitioning | Beginning | Middle (after 2D placement) | Middle (after 2D placement) |
| Technology file fix | + Anchor cells, + Dummy metals (easy) | + Shrunk std. cells, + Shrunk metals (hard) | RC factor (easy) |
| Treat macro blocks? | Yes | Partially yes | Partially yes |
| Timing closure | Simultaneous | Per-tier | Per-tier |

[Figure: expected 3D floorplan (top and bottom tiers) and its representation in Cascade-2D, Shrunk-2D, and Compact-2D; labels: core util. 70%; 0%, 35%, 70%]

slide-9
SLIDE 9

Comparison: Qualitative (2)

  • Timing closure

– Cascade-2D: Simultaneous

  • Smaller WNS

– Shrunk-2D: Tier-by-tier
– Compact-2D: Tier-by-tier


[Figure: simultaneous timing closure (top and bottom tiers together) vs. per-tier timing closure (top.sdc, bottom.sdc) followed by a merge; annotations: driving cell changed, load cap. changed]

slide-10
SLIDE 10

Comparison: Quantitative (1)

  • PPA comparison: Cascade-2D vs. Shrunk-2D vs. Compact-2D

– Gate-level designs (OpenCores benchmark suite)

  • Cascade-2D

– Small #MIV → long WL (detours)
– Better timing (smaller WNS)
– Worse power (long WL)

  • Shrunk-2D & Compact-2D

– Large #MIV → short WL
– Worse timing (larger WNS)
– Less power

| Design | Metric | 2D | Cascade-2D | Shrunk-2D | Compact-2D |
|---|---|---|---|---|---|
| AES-128 (3.4GHz) | #MIV | | 429 | 39521 | 39772 |
| | WL (m) | 1.444 | 1.637 (+13.37%) | 1.212 (-16.09%) | 1.224 (-15.24%) |
| | WNS (ns) | -0.009 | -0.020 | -0.066 | -0.069 |
| | Pwr. (mW) | 274.21 | 292.92 (+6.82%) | 268.00 (-2.27%) | 266.58 (-2.78%) |
| LDPC (1.2GHz) | #MIV | | 5996 | 15390 | 16462 |
| | WL (m) | 1.527 | 1.707 (+11.81%) | 1.226 (-19.69%) | 1.197 (-21.61%) |
| | WNS (ns) | -0.044 | -0.056 | -0.086 | -0.102 |
| | Pwr. (mW) | 85.63 | 107.83 (+25.92%) | 81.91 (-4.35%) | 74.91 (-12.52%) |
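The percentages in the table are relative changes against the 2D baseline. A short sketch of that arithmetic, using the AES-128 wirelength and power values from the table (the helper name is illustrative):

```python
def pct_change(baseline, value):
    """Relative change of `value` versus `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


# AES-128 (3.4GHz): 2D baseline vs. Cascade-2D.
print(f"WL:   {pct_change(1.444, 1.637):+.2f}%")    # +13.37%
print(f"Pwr.: {pct_change(274.21, 292.92):+.2f}%")  # +6.82%
```

For wirelength and power, a positive value means the 3D flow is worse than the 2D design and a negative value means it is better.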

slide-11
SLIDE 11

Comparison: Quantitative (2)

  • (Table cont’d)

| Design | Metric | 2D | Cascade-2D | Shrunk-2D | Compact-2D |
|---|---|---|---|---|---|
| Nova (625MHz) | #MIV | | 264 | 33980 | 33658 |
| | WL (m) | 2.257 | 2.571 (+13.91%) | 2.145 (-4.96%) | 2.120 (-6.08%) |
| | WNS (ns) | -0.060 | -0.054 | -0.193 | -0.205 |
| | Pwr. (mW) | 110.54 | 113.47 (+2.66%) | 110.74 (+0.18%) | 110.75 (+0.19%) |
| TATE (1.4GHz) | #MIV | | 3199 | 52442 | 52672 |
| | WL (m) | 1.935 | 3.141 (+62.38%) | 1.853 (-4.23%) | 1.843 (-4.73%) |
| | WNS (ns) | -0.012 | -0.026 | -0.098 | -0.129 |
| | Pwr. (mW) | 315.44 | 319.42 (+1.26%) | 318.99 (+1.13%) | 318.53 (+0.98%) |
| ECG (1.0GHz) | #MIV | | 1010 | 21384 | 21270 |
| | WL (m) | 0.989 | 1.253 (+26.59%) | 0.909 (-8.13%) | 0.900 (-9.04%) |
| | WNS (ns) | -0.037 | -0.020 | -0.04 | -0.056 |
| | Pwr. (mW) | 104.56 | 104.24 (-0.30%) | 105.70 (+1.09%) | 105.42 (+0.83%) |

slide-12
SLIDE 12

Comparison: Quantitative (3)

  • PPA comparison: Cascade-2D vs. Shrunk-2D vs. Compact-2D

– Mixed-size design (RISC-V Rocketcore processor)

  • Same PPA trends

| Metric | 2D | Cascade-2D | Shrunk-2D | Compact-2D |
|---|---|---|---|---|
| Footprint (mm²) | 0.563 | 0.281 (-50%) | 0.281 (-50%) | 0.281 (-50%) |
| Std. cell area (mm²) | 0.238 | 0.241 (+1.00%) | 0.234 (-1.90%) | 0.231 (-3.10%) |
| #MIV | | 1313 | 46088 | 45624 |
| WL (m) | 2.896 | 3.517 (+21.40%) | 2.542 (-12.20%) | 2.592 (-10.50%) |
| WNS | -0.046 | -0.033 | -0.197 | -0.320 |
| Cell Pwr. (mW) | 334.5 | 343.0 (+2.50%) | 334.6 (+0.04%) | 332.8 (-0.50%) |
| Net Pwr. (mW) | 115.2 | 142.6 (+23.80%) | 107.0 (-7.10%) | 106.0 (-8.00%) |
| Leak Pwr. (mW) | 47.8 | 49.1 (+2.60%) | 45.7 (-4.40%) | 44.3 (-7.40%) |
| Tot. Pwr. (mW) | 497.5 | 534.6 (+7.50%) | 487.3 (-2.00%) | 483.0 (-2.90%) |
slide-13
SLIDE 13

Enhancement: Cascade-2D (1)

  • Poor MIV planning → long wire detours on 3D nets
  • Based on sequential placement → MIVs crowd at the boundary

– Gets worse as #MIV increases

  • Ex. AES-128, 3.4GHz

[Figure: AES-128 layouts for four partitions: LR (429 MIVs), FM (1,976 MIVs), FM_5K (5,238 MIVs), FM_10K (10,252 MIVs)]

slide-14
SLIDE 14

Enhancement: Cascade-2D (2)

  • Solution: Placement-aware MIV planning
  • Pre-place on a 2D plane → put each MIV at the center of mass (C.o.M.) of its 3D net
  • MIVs are spread across the middle

– Less WL (fewer detours) → Less power

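[Figure: layouts with placement-aware MIV planning for FM_5K (5,238 MIVs) and FM_10K (10,252 MIVs)]

[Figure: MIV planning flow; boxes: 2D Netlist, Place & trialRoute, 2D DEF, Partition Info., 3D Net C.o.M. → MIV, Project MIV locations, MIV Locations]

| | Orig | Place-aware | Orig | Place-aware |
|---|---|---|---|---|
| #MIV | 5,238 | 5,238 | 10,252 | 10,252 |
| Cell area (mm²) | 0.127 | 0.123 (-3.60%) | 0.133 | 0.126 (-4.89%) |
| WL (m) | 1.817 | 1.575 (-13.29%) | 1.894 | 1.737 (-8.29%) |
| Tot. Pwr. (mW) | 301 | 286.29 (-4.89%) | 305.61 | 295.8 (-3.21%) |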
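The center-of-mass step of placement-aware MIV planning can be sketched as follows. This is a simplified illustration, assuming an unweighted centroid of cell locations taken from the 2D pre-placement; the function name and the data layout (`nets`, `tier`, `xy`) are hypothetical and do not reflect the actual tool scripts.

```python
def miv_locations(nets, tier, xy):
    """Place one MIV per 3D net at the center of mass of the net's cells.

    nets: dict net name -> list of cell names
    tier: dict cell name -> "top" or "bottom" (partition info)
    xy:   dict cell name -> (x, y) from the 2D pre-placement
    """
    mivs = {}
    for name, cells in nets.items():
        if len({tier[c] for c in cells}) < 2:
            continue  # 2D net: all cells on one tier, no MIV needed
        cx = sum(xy[c][0] for c in cells) / len(cells)
        cy = sum(xy[c][1] for c in cells) / len(cells)
        mivs[name] = (cx, cy)
    return mivs


# Hypothetical partition and pre-placement data.
tier = {"u1": "top", "u2": "bottom", "u3": "bottom", "u4": "top"}
xy = {"u1": (0.0, 0.0), "u2": (6.0, 0.0), "u3": (3.0, 9.0), "u4": (1.0, 1.0)}
nets = {"n1": ["u1", "u2", "u3"], "n2": ["u1", "u4"]}
print(miv_locations(nets, tier, xy))  # {'n1': (3.0, 3.0)}
```

Placing each MIV near the cells it connects, rather than wherever sequential placement leaves it, is what removes the detours and spreads the MIVs over the die.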

slide-15
SLIDE 15

Enhancement: Shrunk-2D, Compact-2D (1)

  • Internal tier partitioning method (partitioning-last)

→ Less freedom to control the design

  • Adopt a partitioning-first scheme
  • Tier partitioning (TP) at the beginning → tier assignment at the original TP stage

[Figure: Shrunk-2D and Compact-2D flows with partitioning-first; Shrunk-2D boxes: 2D Netlist + Tech files, Technology Scaling, Shrunk-2D P&R, Cell Blow-up, Tier Partitioning / Tier assignment, Legalization; Compact-2D boxes: 2D Netlist, Compact-2D P&R, Placement Contraction, Tier Partitioning / Tier assignment, Legalization]

slide-16
SLIDE 16

Enhancement: Shrunk-2D, Compact-2D (2)

  • Local density skew → PPA degraded

– AES-128 (cell-dominated): Tolerable
– LDPC (wire-dominated): Severe

[Figure: partitioning-last vs. partitioning-first layouts for AES-128 and LDPC; empty spaces are marked in the LDPC layout]

| Design | Metric | Shrunk-2D Orig. | Partitioning-first |
|---|---|---|---|
| AES-128 (3.4GHz) | #MIV | 39,521 | 4,704 |
| | WL (m) | 1.212 | 1.278 (+5.43%) |
| | WNS (ns) | -0.066 | -0.072 |
| | Net Pwr. (mW) | 62.05 | 63.27 (+1.96%) |
| | Cell Pwr. (mW) | 181.21 | 181.92 (+0.39%) |
| LDPC (1.2GHz) | #MIV | 15,390 | 13,529 |
| | WL (m) | 1.226 | 1.365 (+11.36%) |
| | WNS (ns) | -0.086 | -0.623 |
| | Net Pwr. (mW) | 41.29 | 45.24 (+9.58%) |
| | Cell Pwr. (mW) | 30.61 | 31.26 (+2.12%) |
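One way to picture local density skew is to compare, bin by bin, how much cell area each tier received. The sketch below is an illustrative metric only, assuming cells are given as (x, y, area, tier) tuples and a uniform square bin grid; it is not the measure used in the slides.

```python
from collections import defaultdict


def bin_density_skew(cells, bin_size):
    """Per-bin |top area - bottom area| from a partitioned placement.

    cells: iterable of (x, y, area, tier) with tier in {"top", "bottom"}
    Returns dict (bin_x, bin_y) -> absolute area difference between tiers.
    """
    area = defaultdict(lambda: {"top": 0.0, "bottom": 0.0})
    for x, y, a, tier in cells:
        area[(int(x // bin_size), int(y // bin_size))][tier] += a
    return {b: abs(t["top"] - t["bottom"]) for b, t in area.items()}


# Hypothetical cells: bin (0, 0) is balanced, bin (1, 0) holds top-tier cells only.
cells = [
    (1.0, 1.0, 2.0, "top"), (2.0, 3.0, 2.0, "bottom"),
    (12.0, 1.0, 3.0, "top"), (14.0, 2.0, 1.0, "top"),
]
print(bin_density_skew(cells, bin_size=10.0))  # {(0, 0): 0.0, (1, 0): 4.0}
```

A bin with a large difference leaves empty space on one tier while the other is crowded, which is the situation the LDPC layout shows.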

slide-17
SLIDE 17

Conclusion

  • We provide a detailed discussion of state-of-the-art pseudo-3D design flows

– Cascade-2D, Shrunk-2D, Compact-2D

  • We provide thorough comparisons of pseudo-3D design flows, both qualitative and quantitative
  • We analyze the limitations of each flow and provide enhancements for better quality

– Cascade-2D: Detour WL → Placement-aware MIV planning
– Shrunk-2D & Compact-2D: Limited design freedom → Partitioning-first adoption

slide-18
SLIDE 18

Thank you!

Q&A: heechun@gatech.edu (Heechun Park)