Pseudo-3D Approaches for Commercial-Grade RTL-to-GDS Tool Flow Targeting Monolithic 3D ICs
Heechun Park, Bon Woong Ku, Kyungwook Chang, Da Eun Shim, and Sung Kyu Lim
School of Electrical and Computer Engineering, Georgia Institute of Technology
2
Monolithic 3D IC (M3D IC)
- Massive vertical interconnects
– Nanoscale monolithic inter-tier via (MIV)
– Free from the huge load of a through-silicon via (TSV)
- An electronic design automation (EDA) flow is required
– To obtain the full benefit of the 3D structure
[Figure: 2D product, TSV-based 3D, and monolithic 3D]
3
M3D IC Design Flow - Placement
- True-3D placer
– Fundamental 3D solution
– Extend 2D analytic placer to 3D
- (x,y) → (x,y,z), z: tier assignment
– NOT ready for commercial adoption
- Not compatible with the industrial RTL-to-GDS flow
- Only reports HPWL
– Low quality
- Pseudo-3D placer
– Utilize 2D CAD engines
- Focus on seamless application to 3D design
– Fully compatible with the industrial flow
- GDS-based results (PPA reports)
– Commercial-grade quality
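| Design | Metric | True-3D (Force-3D) | Pseudo-3D (Shrunk-2D) | Change |
|---|---|---|---|---|
| AES-128 (3.4GHz) | WL (m) | 1.495 | 1.212 | -18.93% |
| AES-128 (3.4GHz) | WNS (ns) | -0.077 | -0.66 | |
| AES-128 (3.4GHz) | Pwr. (mW) | 274.16 | 268.00 | -2.24% |
| LDPC (1.2GHz) | WL (m) | 2.132 | 1.226 | -42.50% |
| LDPC (1.2GHz) | WNS (ns) | -0.195 | -0.086 | |
| LDPC (1.2GHz) | Pwr. (mW) | 139.31 | 81.91 | -41.20% |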
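For reference, the half-perimeter wirelength (HPWL) that a true-3D placer reports can be sketched as below. This is a minimal illustration, not any specific placer's cost function: the `(x, y, tier)` pin layout and the optional per-tier-crossing penalty `miv_cost` are assumptions.

```python
from typing import Iterable, Tuple

Pin = Tuple[float, float, int]  # (x, y, tier): hypothetical pin record

def hpwl_3d(pins: Iterable[Pin], miv_cost: float = 0.0) -> float:
    """Half-perimeter wirelength of one net.

    The x and y spans form the usual 2D bounding-box half perimeter; the
    tier span is weighted by miv_cost (an assumed penalty per tier crossed).
    """
    pins = list(pins)
    if len(pins) < 2:
        return 0.0  # a single-pin net has no wire
    xs = [p[0] for p in pins]
    ys = [p[1] for p in pins]
    zs = [p[2] for p in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys)) + miv_cost * (max(zs) - min(zs))
```

HPWL is only an estimate of routed wirelength, which is why a flow that stops at HPWL cannot report GDS-based power, performance, and area.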
4
- Detailed discussion of state-of-the-art pseudo-3D design flows
– Cascade-2D [ICCAD’16]
– Shrunk-2D [ISLPED’14, TCAD’17]
– Compact-2D [ISPD’18, TCAD’20 (early access)]
- Thorough comparison of pseudo-3D design flows
– Qualitative
– Quantitative (power-performance-area (PPA))
- Limitation analysis and enhancements
– Cascade-2D: Placement-aware MIV planning
– Shrunk-2D & Compact-2D: Partitioning-first adoption
Contributions
5
- Partitioning-first
– Tier partitioning (z) → 2D placement (x,y)
– More design freedom
- MIV = 2 anchor cells + dummy wire
- Simultaneous timing closure
– Both tiers are in the same design space
- 1:2 plane
– Cut-and-slide for 3D IC
State-of-the-art Pseudo-3D: Cascade-2D
[Figures: Cascade-2D design flow; MIV planning with top and bottom tiers, anchor cells, and dummy wires]
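The anchor-cell model and the cut-and-slide step can be sketched as a pair of coordinate mappings. This is a minimal illustration under stated assumptions: the two tiers sit side by side along x on the 1:2 plane, the left half holds the top tier, and the function names are hypothetical rather than taken from the Cascade-2D flow.

```python
from typing import Tuple

Point = Tuple[float, float]

def miv_to_anchors(x: float, y: float, width: float) -> Tuple[Point, Point]:
    """Model one MIV at (x, y) as two anchor cells, one in each half of the
    cascaded 2D plane. A dummy wire would connect the pair."""
    if not 0.0 <= x <= width:
        raise ValueError("MIV must lie inside the 3D footprint")
    top_anchor = (x, y)             # left half holds the top tier (assumed)
    bottom_anchor = (x + width, y)  # right half holds the bottom tier (assumed)
    return top_anchor, bottom_anchor

def cut_and_slide(x: float, y: float, width: float) -> Tuple[float, float, str]:
    """Fold a location on the cascaded plane back into (x, y, tier)."""
    if x < width:
        return x, y, "top"
    return x - width, y, "bottom"
```

Both anchors of an MIV fold back onto the same (x, y) on different tiers, which is what lets a 2D engine see the whole 3D design in one design space.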
6
- Partitioning-last
– 2D placement (x,y) → Tier partitioning (z)
– Less design freedom
- Utilize 2D engine with shrunk cells
– ‘Projected’ 2D placement
- Placement-aware tier partitioning
– Bin-based FM min-cut [DAC’82] algorithm
State-of-the-art Pseudo-3D: Shrunk-2D
[Figures: Shrunk-2D design flow; 3D placement]
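The idea of a bin-based FM min-cut pass can be sketched as follows. This is a simplified illustration, not the partitioner used by the flow: it balances cell counts instead of cell area, recomputes gains naively instead of using gain buckets, and runs a single pass. All names are hypothetical.

```python
from collections import defaultdict

def cut_size(nets, tier):
    """Number of nets whose cells span both tiers (each needs an MIV)."""
    return sum(1 for net in nets if len({tier[c] for c in net}) == 2)

def bin_of(pos, bin_size):
    x, y = pos
    return (int(x // bin_size), int(y // bin_size))

def bin_fm_pass(cells, nets, tier, bin_size, max_skew=2):
    """One FM-style pass: move each cell at most once, keep the best prefix.

    cells: {name: (x, y)}, nets: list of cell-name lists, tier: {name: 0 or 1}.
    A move is legal only if the tier counts in the cell's placement bin
    differ by at most max_skew afterwards.
    """
    tier = dict(tier)
    count = defaultdict(lambda: [0, 0])  # bin -> [cells on tier 0, cells on tier 1]
    for c, pos in cells.items():
        count[bin_of(pos, bin_size)][tier[c]] += 1

    best, best_cut = dict(tier), cut_size(nets, tier)
    locked = set()
    while True:
        base = cut_size(nets, tier)
        choice = None
        for c in sorted(cells):
            if c in locked:
                continue
            n0, n1 = count[bin_of(cells[c], bin_size)]
            after = (n0 - 1, n1 + 1) if tier[c] == 0 else (n0 + 1, n1 - 1)
            if abs(after[0] - after[1]) > max_skew:
                continue  # move would unbalance this bin
            tier[c] ^= 1  # tentatively flip to measure the gain
            gain = base - cut_size(nets, tier)
            tier[c] ^= 1
            if choice is None or gain > choice[0]:
                choice = (gain, c)
        if choice is None:
            break  # every cell is locked or illegal to move
        c = choice[1]
        b = count[bin_of(cells[c], bin_size)]
        b[tier[c]] -= 1
        tier[c] ^= 1
        b[tier[c]] += 1
        locked.add(c)
        cut = cut_size(nets, tier)
        if cut < best_cut:
            best, best_cut = dict(tier), cut
    return best, best_cut
```

Checking balance per bin rather than globally is what makes the partitioning placement-aware: each local region of the projected 2D placement is split roughly evenly between the two tiers.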
7
- Partitioning-last
– 2D placement (x,y) → Tier partitioning (z)
- RC derating to mimic small footprint
– Do NOT shrink cells → applicable at small technology nodes
- Ex. Shrunk-2D at 7nm (shrinking 7nm → 5nm) is impossible
➔ Compact-2D at 7nm is possible
- Placement-aware tier partitioning
– Bin-based FM min-cut [DAC’82] algorithm
State-of-the-art Pseudo-3D: Compact-2D
[Figures: Compact-2D design flow; 3D placement]
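Halving the footprint area implies a linear scale factor of 1/√2 ≈ 0.707. The sketch below works through that arithmetic. It assumes, which the slide does not state, that the RC derating scales per-unit-length wire resistance and capacitance by this factor and that placed coordinates are later contracted by the same factor; the function names are hypothetical.

```python
import math

# Halving the footprint area scales each linear dimension by 1/sqrt(2) ~= 0.707.
LINEAR_SCALE = 1.0 / math.sqrt(2.0)

def derate_unit_rc(r_per_um, c_per_um, scale=LINEAR_SCALE):
    """Assumed derating: scale per-unit-length R and C so that a wire routed in
    the full-size 2D footprint carries the parasitics of its shorter 3D wire."""
    return r_per_um * scale, c_per_um * scale

def contract_placement(x, y, scale=LINEAR_SCALE):
    """Map a location in the 2D footprint onto the half-area 3D footprint."""
    return x * scale, y * scale
```

Because only the parasitics and coordinates are scaled, the standard cells keep their real dimensions, which is consistent with the point that no cell shrinking is needed.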
8
- Macro blocks
– Cascade-2D: Fully handled
– Shrunk-2D: Partial blockages
– Compact-2D: Partial blockages
| | Cascade-2D | Shrunk-2D | Compact-2D |
|---|---|---|---|
| Tier partitioning | Beginning | Middle (after 2D placement) | Middle (after 2D placement) |
| Technology file fix | + Anchor cells, + Dummy metals (easy) | + Shrunk std. cells, + Shrunk metals (hard) | RC factor (easy) |
| Treat macro blocks? | Yes | Partially yes | Partially yes |
| Timing closure | Simultaneous | Per-tier | Per-tier |
Comparison: Qualitative (1)
[Figure: expected 3D floorplan (top and bottom tiers, core util. 70%) compared with the floorplans used by Cascade-2D, Shrunk-2D, and Compact-2D; region labels 0%, 35%, and 70%]
9
- Timing closure
– Cascade-2D: Simultaneously
- Less WNS
– Shrunk-2D: Tier-by-tier
– Compact-2D: Tier-by-tier
Comparison: Qualitative (2)
[Figure: simultaneous timing closure (top and bottom tiers together) versus per-tier timing closure (top.sdc, bottom.sdc) followed by a merge, after which the driving cell and load capacitance have changed]
10
- PPA comparison: Cascade-2D vs. Shrunk-2D vs. Compact-2D
– Gate-level designs (OpenCores benchmark suite)
- Cascade-2D
– Small #MIV → long WL (detour)
– Better timing (less WNS)
– Worse power (long WL)
- Shrunk-2D & Compact-2D
– Large #MIV → short WL
– Worse timing (high WNS)
– Less power
Comparison: Quantitative (1)
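| Design | Metric | 2D | Cascade-2D | Shrunk-2D | Compact-2D |
|---|---|---|---|---|---|
| AES-128 (3.4GHz) | #MIV | | 429 | 39521 | 39772 |
| AES-128 (3.4GHz) | WL (m) | 1.444 | 1.637 (+13.37%) | 1.212 (-16.09%) | 1.224 (-15.24%) |
| AES-128 (3.4GHz) | WNS (ns) | -0.009 | -0.020 | -0.066 | -0.069 |
| AES-128 (3.4GHz) | Pwr. (mW) | 274.21 | 292.92 (+6.82%) | 268.00 (-2.27%) | 266.58 (-2.78%) |
| LDPC (1.2GHz) | #MIV | | 5996 | 15390 | 16462 |
| LDPC (1.2GHz) | WL (m) | 1.527 | 1.707 (+11.81%) | 1.226 (-19.69%) | 1.197 (-21.61%) |
| LDPC (1.2GHz) | WNS (ns) | -0.044 | -0.056 | -0.086 | -0.102 |
| LDPC (1.2GHz) | Pwr. (mW) | 85.63 | 107.83 (+25.92%) | 81.91 (-4.35%) | 74.91 (-12.52%) |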
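The percentages in the table appear to be changes relative to the 2D baseline. A small sketch of that arithmetic, using the AES-128 Cascade-2D entries as a check (the helper name is hypothetical):

```python
def pct_delta(value: float, baseline: float) -> float:
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0

# AES-128, Cascade-2D vs. 2D, using the rounded values printed in the table.
wl_delta = pct_delta(1.637, 1.444)     # wirelength in m
pwr_delta = pct_delta(292.92, 274.21)  # power in mW
print(f"WL {wl_delta:+.2f}%, Pwr {pwr_delta:+.2f}%")
```

Recomputing from the rounded table entries can differ from the printed percentage in the last digit, since the slides were presumably computed from unrounded data.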
11
- (Table cont’d)
Comparison: Quantitative (2)
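| Design | Metric | 2D | Cascade-2D | Shrunk-2D | Compact-2D |
|---|---|---|---|---|---|
| Nova (625MHz) | #MIV | | 264 | 33980 | 33658 |
| Nova (625MHz) | WL (m) | 2.257 | 2.571 (+13.91%) | 2.145 (-4.96%) | 2.120 (-6.08%) |
| Nova (625MHz) | WNS (ns) | -0.060 | -0.054 | -0.193 | -0.205 |
| Nova (625MHz) | Pwr. (mW) | 110.54 | 113.47 (+2.66%) | 110.74 (+0.18%) | 110.75 (+0.19%) |
| TATE (1.4GHz) | #MIV | | 3199 | 52442 | 52672 |
| TATE (1.4GHz) | WL (m) | 1.935 | 3.141 (+62.38%) | 1.853 (-4.23%) | 1.843 (-4.73%) |
| TATE (1.4GHz) | WNS (ns) | -0.012 | -0.026 | -0.098 | -0.129 |
| TATE (1.4GHz) | Pwr. (mW) | 315.44 | 319.42 (+1.26%) | 318.99 (+1.13%) | 318.53 (+0.98%) |
| ECG (1.0GHz) | #MIV | | 1010 | 21384 | 21270 |
| ECG (1.0GHz) | WL (m) | 0.989 | 1.253 (+26.59%) | 0.909 (-8.13%) | 0.900 (-9.04%) |
| ECG (1.0GHz) | WNS (ns) | -0.037 | -0.020 | -0.04 | -0.056 |
| ECG (1.0GHz) | Pwr. (mW) | 104.56 | 104.24 (-0.30%) | 105.70 (+1.09%) | 105.42 (+0.83%) |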
12
- PPA comparison: Cascade-2D vs. Shrunk-2D vs. Compact-2D
– Mixed-size design (RISC-V Rocketcore processor)
- Same PPA trends
Comparison: Quantitative (3)
| Metric | 2D | Cascade-2D | Shrunk-2D | Compact-2D |
|---|---|---|---|---|
| Footprint (mm²) | 0.563 | 0.281 (-50%) | 0.281 (-50%) | 0.281 (-50%) |
| Std. cell area (mm²) | 0.238 | 0.241 (+1.00%) | 0.234 (-1.90%) | 0.231 (-3.10%) |
| #MIV | | 1313 | 46088 | 45624 |
| WL (m) | 2.896 | 3.517 (+21.40%) | 2.542 (-12.20%) | 2.592 (-10.50%) |
| WNS | -0.046 | -0.033 | -0.197 | -0.320 |
| Cell Pwr. (mW) | 334.5 | 343.0 (+2.50%) | 334.6 (+0.04%) | 332.8 (-0.50%) |
| Net Pwr. (mW) | 115.2 | 142.6 (+23.80%) | 107.0 (-7.10%) | 106.0 (-8.00%) |
| Leak Pwr. (mW) | 47.8 | 49.1 (+2.60%) | 45.7 (-4.40%) | 44.3 (-7.40%) |
| Tot. Pwr. (mW) | 497.5 | 534.6 (+7.50%) | 487.3 (-2.00%) | 483.0 (-2.90%) |
13
- Poor MIV planning → long wire detours on 3D nets
- MIV planning based on sequential placement → MIVs crowd at the boundary
– Worsens as #MIV increases
- Ex. AES-128, 3.4GHz
Enhancement: Cascade-2D (1)
[Figure panels: LR, 429 MIVs; FM, 1,976 MIVs; FM_5K, 5,238 MIVs; FM_10K, 10,252 MIVs]
14
- Solution: Placement-aware MIV planning
- Pre-place on a 2D plane → place each MIV at the center of mass (C.o.M.) of its 3D net
- MIVs are spread in the middle
– Less WL (less detours) → Less power
Enhancement: Cascade-2D (2)
[Figure panels: FM_5K, 5,238 MIVs; FM_10K, 10,252 MIVs]
[Flow diagram labels: 2D Netlist; Place & trialRoute; 2D DEF; Partition Info.; 3D Net C.o.M. → MIV; Project MIV locations; MIV Locations]
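| Metric | Orig (5,238 MIVs) | Place-aware (5,238 MIVs) | Orig (10,252 MIVs) | Place-aware (10,252 MIVs) |
|---|---|---|---|---|
| Cell area (mm²) | 0.127 | 0.123 (-3.60%) | 0.133 | 0.126 (-4.89%) |
| WL (m) | 1.817 | 1.575 (-13.29%) | 1.894 | 1.737 (-8.29%) |
| Tot. Pwr. (mW) | 301 | 286.29 (-4.89%) | 305.61 | 295.8 (-3.21%) |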
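The center-of-mass step of placement-aware MIV planning can be sketched as follows. This is a minimal illustration, assuming an unweighted average of the pre-placed cell locations on each net that spans both tiers; the data layout and names are hypothetical, and the flow's later projection and legalization of MIV locations is omitted.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def plan_mivs(nets: Dict[str, List[str]],
              pos: Dict[str, Point],
              tier: Dict[str, int]) -> Dict[str, Point]:
    """Return one MIV location per 3D net, at the center of mass of its cells.

    nets maps net name -> cell names, pos holds pre-placed 2D locations,
    tier holds the tier assignment (0 or 1) of each cell.
    """
    mivs = {}
    for name, cells in nets.items():
        if len({tier[c] for c in cells}) < 2:
            continue  # net stays on one tier, so it needs no MIV
        xs = [pos[c][0] for c in cells]
        ys = [pos[c][1] for c in cells]
        mivs[name] = (sum(xs) / len(xs), sum(ys) / len(ys))
    return mivs
```

Placing each MIV near the cells it connects is what spreads the MIVs across the middle of the die instead of crowding them at the boundary.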
15
- Internal tier partitioning method (partitioning-last) → less freedom to control the design
- Adopt a partitioning-first scheme
- Tier partitioning (TP) at the beginning → apply the resulting tier assignment at the original TP stage
Enhancement: Shrunk-2D, Compact-2D (1)
[Flow diagram, Shrunk-2D labels: 2D Netlist + Tech files; Technology Scaling; Shrunk-2D P&R; Cell Blow-up; Tier Partitioning; Tier assignment; Legalization]
[Flow diagram, Compact-2D labels: 2D Netlist; Compact-2D P&R; Placement Contraction; Tier Partitioning; Tier assignment; Legalization]
16
- Local density skew → PPA degraded
– AES-128 (cell-dominated): Tolerable
– LDPC (wire-dominated): Severe
Enhancement: Shrunk-2D, Compact-2D (2)
[Figure: partitioning-last vs. partitioning-first layouts for AES-128 and LDPC, with empty spaces marked in the LDPC layout]
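| Design | Metric | Shrunk-2D Orig. | Partitioning-first |
|---|---|---|---|
| AES-128 (3.4GHz) | #MIV | 39,521 | 4,704 |
| AES-128 (3.4GHz) | WL (m) | 1.212 | 1.278 (+5.43%) |
| AES-128 (3.4GHz) | WNS (ns) | -0.066 | -0.072 |
| AES-128 (3.4GHz) | Net Pwr. (mW) | 62.05 | 63.27 (+1.96%) |
| AES-128 (3.4GHz) | Cell Pwr. (mW) | 181.21 | 181.92 (+0.39%) |
| LDPC (1.2GHz) | #MIV | 15,390 | 13,529 |
| LDPC (1.2GHz) | WL (m) | 1.226 | 1.365 (+11.36%) |
| LDPC (1.2GHz) | WNS (ns) | -0.086 | -0.623 |
| LDPC (1.2GHz) | Net Pwr. (mW) | 41.29 | 45.24 (+9.58%) |
| LDPC (1.2GHz) | Cell Pwr. (mW) | 30.61 | 31.26 (+2.12%) |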
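One way to quantify the local density skew is a per-bin imbalance between the two tiers. The metric below is an illustrative assumption, not one defined on the slides: for each placement bin it reports the absolute difference in cell area between the tiers, normalized by the bin's total cell area.

```python
from collections import defaultdict

def bin_density_skew(cells, bin_size):
    """cells: iterable of (x, y, area, tier) with tier 0 or 1.

    Returns {bin: skew}, where skew is 0.0 for a perfectly balanced bin and
    1.0 when all cell area in the bin sits on a single tier.
    """
    area = defaultdict(lambda: [0.0, 0.0])  # bin -> [area on tier 0, area on tier 1]
    for x, y, a, tier in cells:
        area[(int(x // bin_size), int(y // bin_size))][tier] += a
    return {
        b: abs(t[0] - t[1]) / (t[0] + t[1])
        for b, t in area.items()
        if t[0] + t[1] > 0
    }
```

A bin with skew near 1.0 corresponds to the empty spaces seen on one tier: the other tier is locally overfull, so cells must be displaced during legalization and wires lengthen.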
17
- We provide a detailed discussion of state-of-the-art pseudo-3D design flows
– Cascade-2D, Shrunk-2D, Compact-2D
- We provide thorough comparisons of pseudo-3D design flows, both qualitative and quantitative
- We analyze the limitations of each flow and provide enhancements