Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks
Aditya Agrawal, Josep Torrellas and Sachin Idgunji
University of Illinois at Urbana-Champaign and Nvidia Corporation
http://iacoma.cs.uiuc.edu
MICRO, October 2017
Xylem
Image source: http://repasosdeharold.blogspot.com/2015/02/plants-test-review.html
Motivation: Thermal Issues in 3D Stacking
Processor-Memory Stacks:
▪ Reduced interconnect length and power
▪ Higher memory bandwidth
▪ Smaller form factors
▪ Heterogeneous integration
▪ 2.5D processor-memory stacks exist
▪ 3D processor-memory stacking is the future
Major challenge: Thermals
Xylem, MICRO 2017
3D Stacking Technologies
[Figure: cross-section of two stacked dies. Labels: Upper Die (Face), Silicon, Devices, Frontside Metal Layers (M1 to Mn), D2D Layer, Electrical μbump, Dummy μbump, Lower Die (Back), Backside Metal Layers (BM1), Silicon, TSV, TTSV. Dimension labels: 2 μm, 20 μm, 2 μm.]
Face-to-back (f2b) die interface (not to scale)
Stack Organization: Memory on Top
✓ Processor power and I/O signals do not traverse TSVs
✓ Processor IR drop similar to current designs
✓ DRAM and processor die floorplans are independent because TSV count and location are governed by stacked DRAM standards
Thermal challenges
[Figure: memory-on-top stack. Labels: Heat Sink, Integrated Heat Spreader (IHS), Thermal Interface Material (TIM), Package, Processor Silicon, Processor Frontside Metal (Cu), DRAM Frontside Metal (Al), DRAM Silicon, Die to Die (D2D) Layer, Through Silicon Vias (TSVs), C4 pads.]
Contributions
▪ Identify the thermal bottleneck in 3D stacks: Die-to-Die (D2D) layers
▪ Improve vertical conduction through the D2D layer:
  – Align and short dummy μbumps with Thermal TSVs
  – Generic and custom TTSV placement schemes
▪ Use the resulting thermal headroom:
  – Boost processor frequency (400-720 MHz) & performance (11-18%)
▪ Exploit thermal heterogeneity: cores closer to TTSVs conduct heat better
  – Conductivity-aware thread placement and migration, and frequency boosting
Thermal Resistance in the Stack
Layer | Rth (mm²-K/W)
Bulk Silicon | 0.83
Proc. Metal | 1.00
D2D | 13.33
▪ Thermal resistance per unit area: D2D layer is 13-16x more resistive than bulk silicon or metal layers
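The 13-16x figure follows directly from the three table values. A minimal sketch of that arithmetic is below; the heat flux used for the temperature-drop illustration is a hypothetical number chosen only to make the comparison concrete, and the one-dimensional relation ΔT = q'' × Rth is the standard steady-state conduction formula, not something stated on the slide.

```python
# Per-unit-area thermal resistances from the slide's table, in mm^2-K/W.
RTH = {"Bulk Silicon": 0.83, "Proc. Metal": 1.00, "D2D": 13.33}


def resistance_ratio(layer, reference, table=RTH):
    """Return how many times more resistive `layer` is than `reference`."""
    return table[layer] / table[reference]


def temperature_drop(rth_mm2_k_per_w, heat_flux_w_per_mm2):
    """Steady-state 1D temperature drop across a layer: dT = q'' * Rth."""
    return rth_mm2_k_per_w * heat_flux_w_per_mm2


for ref in ("Bulk Silicon", "Proc. Metal"):
    print(f"D2D vs {ref}: {resistance_ratio('D2D', ref):.1f}x")

# Hypothetical uniform heat flux, for illustration only (not from the slides).
ASSUMED_HEAT_FLUX = 0.5  # W/mm^2
for layer, rth in RTH.items():
    print(f"{layer}: {temperature_drop(rth, ASSUMED_HEAT_FLUX):.2f} K drop")
```

The ratios come out to roughly 16x against bulk silicon and 13x against the processor metal, matching the range quoted on the slide.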
Shortcomings of Prior Work
▪ Underestimated the thermal resistance of D2D layer by assuming:
  – High conductivity
  – Small thickness
▪ Focused on increasing the conductivity of the bulk silicon using TTSVs
▪ Concluded that TTSVs alone are effective
Our approach: Combine TTSVs with a mechanism to reduce D2D resistance
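To see why those two assumptions matter, recall that the per-unit-area resistance of a uniform layer is its thickness divided by its thermal conductivity (Rth = t / λ). The sketch below applies that relation; every thickness and conductivity value in it is a hypothetical illustration, not a parameter taken from the slides or from prior work.

```python
def rth_per_unit_area(thickness_um, conductivity_w_per_m_k):
    """Return t / lambda for a uniform layer, in mm^2-K/W.

    thickness [um] * 1e-6 gives metres; dividing by lambda [W/m-K] gives
    m^2-K/W; multiplying by 1e6 converts to mm^2-K/W. The two scale
    factors cancel, so the result is simply thickness_um / lambda.
    """
    if thickness_um <= 0 or conductivity_w_per_m_k <= 0:
        raise ValueError("thickness and conductivity must be positive")
    return thickness_um / conductivity_w_per_m_k


# Hypothetical "optimistic" layer: thin and highly conductive.
optimistic = rth_per_unit_area(thickness_um=2.0, conductivity_w_per_m_k=100.0)

# Hypothetical "pessimistic" layer: thicker and poorly conducting.
pessimistic = rth_per_unit_area(thickness_um=20.0, conductivity_w_per_m_k=1.5)

print(f"optimistic:  {optimistic:.2f} mm^2-K/W")
print(f"pessimistic: {pessimistic:.2f} mm^2-K/W")
```

Under these assumed values the pessimistic layer lands near the 13.33 mm²-K/W listed for the D2D layer, while the optimistic one is hundreds of times smaller, which is the kind of gap that would make TTSVs alone look sufficient.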
Propose: Dummy μbump-TTSV Alignment & Shorting
[Figure: two cross-sections of the die-to-die interface, labeled "Before" and "Proposed". Labels in each: Upper Die (Face), Silicon, TTSV, Frontside Metal Layers, D2D, Underfill, Dummy μbump, Lower Die (Back), Backside Metal Layers, TTSV, Silicon.]
TTSV Placement: Constraints
TTSVs:
▪ Cannot disrupt regular DRAM arrays: Place in the DRAM peripheral logic
▪ Distribute TTSVs and avoid TTSV farms
▪ Maintain Keep Out Zone (KOZ) around each TTSV
Dummy μbumps:
▪ Anywhere in the D2D layer except the electrical μbump locations
DRAM and Processor Baseline Floorplans
[Figure: two floorplans. DRAM (Wide IO) die floorplan, with labels Bank, TSV Bus, Logic. Processor die floorplan, with labels Core 1 to Core 8, IL1, DL1, L2, Memory Controllers, Coherent Bus, TSV Bus.]
Proposal: TTSV Placement Schemes
[Figure: two DRAM die floorplans with TTSVs added, labeled "Generic (oblivious to hotspots)" and "Custom (aligned with hotspots)". Labels in each: Bank, TTSV, TSV Bus.]
Proposal: Frequency Boosting
▪ TTSV placement & TTSV-μbump alignment and shorting:
  – Increases thermal conduction from the processor die to the heat sink
  – Reduces the temperature of the processor die
▪ Proposal: Increase processor frequency to consume the thermal headroom
  – Increase application performance
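One way to picture "consuming the thermal headroom" is a search for the highest frequency whose predicted temperature still respects a limit. The sketch below is only an illustration of that idea: the function name `boost_frequency`, the step size, the temperature limit, and the linear toy thermal model are all assumptions, and the slides do not say how the boost was actually determined.

```python
def boost_frequency(temp_at, f_base_mhz, f_max_mhz, step_mhz, t_limit_c):
    """Return the highest frequency on the step grid, starting at f_base_mhz
    and not exceeding f_max_mhz, whose predicted temperature is <= t_limit_c.

    `temp_at` is a caller-supplied (hypothetical) model mapping a frequency
    in MHz to a steady-state temperature in deg C. It is assumed to be
    non-decreasing in frequency. If even the first step violates the limit,
    the base frequency is returned unchanged.
    """
    f = f_base_mhz
    while f + step_mhz <= f_max_mhz and temp_at(f + step_mhz) <= t_limit_c:
        f += step_mhz
    return f


def toy_temp(f_mhz):
    """Toy linear thermal model, purely illustrative (not from the slides)."""
    return 70 + f_mhz / 100


boosted = boost_frequency(toy_temp, f_base_mhz=2400, f_max_mhz=3500,
                          step_mhz=100, t_limit_c=95.5)
print(f"boosted frequency under the toy model: {boosted} MHz")
```

A stack that conducts heat better would correspond to a `temp_at` model that returns lower temperatures at the same frequency, so the same search would settle on a higher frequency.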
Proposal: Conductivity (λ) Aware Techniques
▪ TTSV-μbump alignment and shorting creates high conductivity paths
  – Areas closer to TTSVs dissipate heat more easily
  – Result is thermal spatial heterogeneity in the stack
▪ Proposal: Three λ-aware optimizations to further improve performance
  – λ-aware thread placement
  – λ-aware frequency boosting
  – λ-aware thread migration
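As an illustration of λ-aware thread placement, the sketch below pairs the most power-hungry threads with the cores that conduct heat best. The greedy pairing, the function name `place_threads`, and the per-core conductivity scores are assumptions made for this example; the slides name the technique but do not describe the actual policy.

```python
def place_threads(thread_power_w, core_conductivity):
    """Greedy conductivity-aware placement (illustrative assumption).

    thread_power_w:    dict mapping thread id -> estimated power in watts.
    core_conductivity: dict mapping core id -> relative vertical conduction
                       score (higher means closer to TTSVs).
    Returns a dict mapping thread id -> core id, giving the highest-power
    threads the best-conducting cores.
    """
    if len(thread_power_w) > len(core_conductivity):
        raise ValueError("more threads than cores")
    # Hottest threads first, best-conducting cores first.
    threads = sorted(thread_power_w, key=thread_power_w.get, reverse=True)
    cores = sorted(core_conductivity, key=core_conductivity.get, reverse=True)
    return dict(zip(threads, cores))


# Hypothetical inputs, for illustration only.
threads = {"t0": 3.0, "t1": 9.0, "t2": 6.0}
cores = {"core1": 0.2, "core2": 0.9, "core3": 0.5, "core4": 0.7}
print(place_threads(threads, cores))
```

λ-aware migration could reuse the same ranking by re-running the placement when thread power estimates change, though that too is an assumption rather than the method from the talk.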
Evaluation Setup
▪ 8-core OoO processor die @ 2.4 GHz
▪ 8-high Wide IO memory on top
▪ Processor timing and power: SESC & McPAT
▪ DRAM timing and power: DRAMSim2
▪ Thermal analysis: 3D HotSpot
▪ Applications: SPLASH-2, PARSEC & NAS
[Figure: memory-on-top configuration. Labels: Heat Sink, IHS, TIM, Motherboard, Proc. Silicon, Proc. Metal, DRAM Metal, DRAM Silicon, D2D Layer, TSVs, C4 pads.]
Result Summary
TTSV Placement | Generic | Custom
Area Overhead | 0.63% | 0.81%
Proc. Temp. Reduction | 5.0 °C | 8.4 °C
Avg. Frequency Boost | 400 MHz | 720 MHz
Avg. Performance Gain | 11% | 18%
λ-aware techniques enable further 100-200 MHz improvements
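For context, the frequency boosts can be expressed relative to the 2.4 GHz baseline given in the evaluation setup. The short sketch below does only that arithmetic on the reported numbers; the helper name `relative_boost_pct` is made up for this example.

```python
BASE_MHZ = 2400  # 2.4 GHz baseline from the evaluation setup

# Reported averages from the result summary.
BOOST_MHZ = {"Generic": 400, "Custom": 720}
PERF_GAIN_PCT = {"Generic": 11, "Custom": 18}


def relative_boost_pct(boost_mhz, base_mhz=BASE_MHZ):
    """Frequency boost as a percentage of the baseline frequency."""
    return 100.0 * boost_mhz / base_mhz


for scheme, boost in BOOST_MHZ.items():
    print(f"{scheme}: +{relative_boost_pct(boost):.1f}% frequency, "
          f"+{PERF_GAIN_PCT[scheme]}% performance")
```

The relative frequency increase is larger than the reported performance gain in both schemes, which is what one would expect if part of the execution time does not scale with core frequency.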
Conclusion
▪ Identified that D2D layer is the thermal bottleneck in 3D stacks
▪ Improved vertical conduction through the D2D layer:
  – Align and short dummy μbumps with TTSVs
  – Generic and custom TTSV placement schemes
▪ Used the resulting thermal headroom to
  – Boost processor frequency (400-720 MHz) & performance (11-18%)
▪ Exploited thermal heterogeneity: cores closer to TTSVs conduct heat better
  – Conductivity-aware thread placement and migration, and frequency boosting
  – Enable further 100-200 MHz improvements