Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks
Aditya Agrawal, Josep Torrellas and Sachin Idgunji
University of Illinois at Urbana-Champaign and Nvidia Corporation
http://iacoma.cs.uiuc.edu
MICRO, October 2017


SLIDE 1

Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks

Aditya Agrawal, Josep Torrellas and Sachin Idgunji
University of Illinois at Urbana-Champaign and Nvidia Corporation
http://iacoma.cs.uiuc.edu
MICRO, October 2017

SLIDE 2

Xylem

Image source: http://repasosdeharold.blogspot.com/2015/02/plants-test-review.html

SLIDE 3

Motivation: Thermal Issues in 3D Stacking

Processor-Memory Stacks:
▪ Reduced interconnect length and power
▪ Higher memory bandwidth
▪ Smaller form factors
▪ Heterogeneous integration
▪ 2.5D processor-memory stacks exist
▪ 3D processor-memory stacking is the future

Major challenge: Thermals

Xylem, MICRO 2017

SLIDE 4

3D Stacking Technologies

[Figure: Face-to-back (f2b) die interface (not to scale). Labels: Lower Die (Back), Upper Die (Face), Silicon, Devices, TSV, TTSV, Backside Metal Layers (BM1), Frontside Metal Layers (M1 to Mn), D2D Layer, Electrical μbump, Dummy μbump. Dimensions shown: 2 μm, 20 μm, 2 μm.]

SLIDE 5

Stack Organization: Memory on Top

✓ Processor power and I/O signals do not traverse TSVs
✓ Processor IR drop similar to current designs
✓ DRAM and processor die floorplans are independent because TSV count and location are governed by stacked DRAM standards
✗ Thermal challenges

[Figure: memory-on-top stack. Labels: Heat Sink, Integrated Heat Spreader (IHS), Thermal Interface Material (TIM), Package, Processor Silicon, Processor Frontside Metal (Cu), DRAM Frontside Metal (Al), DRAM Silicon, Die to Die (D2D) Layer, Through Silicon Vias (TSVs), C4 pads.]

SLIDE 6

Contributions

▪ Identify the thermal bottleneck in 3D stacks: Die-to-Die (D2D) layers
▪ Improve vertical conduction through the D2D layer:
  – Align and short dummy μbumps with Thermal TSVs (TTSVs)
  – Generic and custom TTSV placement schemes
▪ Use the resulting thermal headroom:
  – Boost processor frequency (400-720 MHz) & performance (11-18%)
▪ Exploit thermal heterogeneity: cores closer to TTSVs conduct heat better
  – Conductivity-aware thread placement and migration, and frequency boosting

SLIDE 7

Thermal Resistance in the Stack

Layer          Rth (mm²-K/W)
Bulk Silicon   0.83
Proc. Metal    1.00
D2D            13.33

▪ Thermal resistance per unit area: D2D layer is 13-16x more resistive than bulk silicon or metal layers
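The 13-16x figure can be checked directly from the three Rth values on this slide. The sketch below does that arithmetic. The series-stack share at the end is an illustrative assumption (one layer of each kind stacked in series), not a configuration stated on the slide.

```python
# Per-unit-area thermal resistances from the slide, in mm^2-K/W.
RTH = {"Bulk Silicon": 0.83, "Proc. Metal": 1.00, "D2D": 13.33}


def d2d_ratio(layer: str) -> float:
    """How many times more resistive the D2D layer is than `layer`."""
    return RTH["D2D"] / RTH[layer]


for layer in ("Bulk Silicon", "Proc. Metal"):
    print(f"D2D vs {layer}: {d2d_ratio(layer):.1f}x")

# Assumption for illustration only: one layer of each kind in series,
# so per-unit-area resistances simply add.
d2d_share = RTH["D2D"] / sum(RTH.values())
print(f"D2D share of the series resistance: {d2d_share:.0%}")
```

Under that simplifying assumption the D2D layer accounts for most of the vertical resistance, which is why the rest of the talk targets it.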

SLIDE 8

Shortcomings of Prior Work

▪ Underestimated the thermal resistance of the D2D layer by assuming:
  – High conductivity
  – Small thickness
▪ Focused on increasing the conductivity of the bulk silicon using TTSVs
▪ Concluded that TTSVs alone are effective

Our approach: Combine TTSVs with a mechanism to reduce D2D resistance

SLIDE 9

Proposal: Dummy μbump-TTSV Alignment & Shorting

[Figure: two panels, "Before" and "Proposed". Each shows the Upper Die (Face) with Silicon, a TTSV and Frontside Metal Layers; the D2D layer with Underfill and a Dummy μbump; and the Lower Die (Back) with Backside Metal Layers, a TTSV and Silicon.]

SLIDE 10

TTSV Placement: Constraints

TTSVs:
▪ Cannot disrupt regular DRAM arrays: place in the DRAM peripheral logic
▪ Distribute TTSVs and avoid TTSV farms
▪ Maintain a Keep Out Zone (KOZ) around each TTSV

Dummy μbumps:
▪ Anywhere in the D2D layer except the electrical μbump locations
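The constraints on this slide can be expressed as simple geometric checks. The following is a minimal sketch, not the authors' tool: the `Rect` type, the circular KOZ model, the coordinates and the function names are all hypothetical, and the "avoid TTSV farms" rule is only approximated by the KOZ-overlap test.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Rect:
    """Axis-aligned region of the DRAM die (hypothetical units)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def check_ttsvs(ttsvs, peripheral_regions, koz_radius):
    """Return a list of human-readable constraint violations."""
    problems = []
    # Constraint 1: TTSVs must sit in peripheral logic, not in DRAM arrays.
    for i, (x, y) in enumerate(ttsvs):
        if not any(r.contains(x, y) for r in peripheral_regions):
            problems.append(f"TTSV {i} is outside the peripheral logic")
    # Constraint 2: keep-out zones (modeled as circles) must not overlap.
    for i in range(len(ttsvs)):
        for j in range(i + 1, len(ttsvs)):
            (xi, yi), (xj, yj) = ttsvs[i], ttsvs[j]
            if hypot(xi - xj, yi - yj) < 2 * koz_radius:
                problems.append(f"TTSVs {i} and {j} have overlapping KOZs")
    return problems


def dummy_bump_allowed(site, electrical_sites):
    """Dummy μbumps may go anywhere except electrical μbump sites."""
    return site not in set(electrical_sites)


# Made-up example layout: one peripheral strip across the middle of the die.
periphery = [Rect(0.0, 4.0, 10.0, 6.0)]
ttsvs = [(1.0, 5.0), (5.0, 5.0), (5.1, 5.0), (2.0, 1.0)]
violations = check_ttsvs(ttsvs, periphery, koz_radius=0.2)
for v in violations:
    print(v)
```

In this made-up layout, the last TTSV falls inside a DRAM array region and two neighbouring TTSVs are closer than twice the KOZ radius, so both are reported.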

SLIDE 11

DRAM and Processor Baseline Floorplans

[Figure: DRAM (Wide IO) die floorplan with labels Bank, TSV Bus and Logic. Processor die floorplan with Core 1 to Core 8, each with IL1 and DL1 caches, eight L2 blocks, Memory Controllers, Coherent Bus and TSV Bus.]

SLIDE 12

Proposal: TTSV Placement Schemes

[Figure: two DRAM die panels, "Generic (oblivious to hotspots)" and "Custom (aligned with hotspots)". Labels in each panel: Bank, TTSV, TSV Bus.]

SLIDE 13

Proposal: Frequency Boosting

▪ TTSV placement & TTSV-μbump alignment and shorting:
  – Increases thermal conduction from the processor die to the heat sink
  – Reduces the temperature of the processor die
▪ Proposal: Increase processor frequency to consume the thermal headroom
  – Increase application performance
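The idea of consuming thermal headroom can be sketched as a search for the highest frequency that keeps the predicted peak temperature within a limit. This is an illustration only: the slides do not describe the boosting procedure, and the linear temperature model, the 80 °C limit, the step size and all function names below are made-up assumptions.

```python
def boost_frequency(base_mhz, step_mhz, max_mhz, peak_temp_c, limit_c):
    """Highest frequency, in step_mhz increments above base_mhz, whose
    predicted peak temperature stays at or below limit_c."""
    freq = base_mhz
    while freq + step_mhz <= max_mhz and peak_temp_c(freq + step_mhz) <= limit_c:
        freq += step_mhz
    return freq


def make_toy_model(temp_at_base_c, base_mhz=2400, mhz_per_degree=50.0):
    """Made-up linear model: +1 degree C of peak temperature per 50 MHz."""
    return lambda mhz: temp_at_base_c + (mhz - base_mhz) / mhz_per_degree


limit_c = 80.0
baseline_stack = make_toy_model(80.0)  # hypothetical: already at the limit
cooler_stack = make_toy_model(74.0)    # hypothetical: 6 degrees C of headroom

baseline_mhz = boost_frequency(2400, 100, 3600, baseline_stack, limit_c)
boosted_mhz = boost_frequency(2400, 100, 3600, cooler_stack, limit_c)
print(baseline_mhz, boosted_mhz)
```

With these toy numbers the cooler stack can run 300 MHz faster before reaching the same temperature limit. In a real evaluation the temperature callback would come from a thermal simulator rather than a linear formula.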

SLIDE 14

Proposal: Conductivity (λ) Aware Techniques

▪ TTSV-μbump alignment and shorting creates high conductivity paths
  – Areas closer to TTSVs dissipate heat more easily
  – Result is thermal spatial heterogeneity in the stack
▪ Proposal: Three λ-aware optimizations to further improve performance
  – λ-aware thread placement
  – λ-aware frequency boosting
  – λ-aware thread migration
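One plausible way to picture λ-aware thread placement is a greedy pairing of the most power-hungry threads with the best-conducting cores. The slides do not specify the placement policy, so the sketch below is an assumption, and the thread powers, core conductivity scores and names are hypothetical.

```python
def lambda_aware_placement(thread_power_w, core_conductivity):
    """Greedy sketch: hottest thread goes to the best-conducting core.

    thread_power_w:    {thread_id: estimated power in W}       (hypothetical)
    core_conductivity: {core_id: relative conductivity score}  (hypothetical)
    Returns {thread_id: core_id}.
    """
    if len(thread_power_w) > len(core_conductivity):
        raise ValueError("more threads than cores")
    # Sort threads by power and cores by conductivity, both descending.
    threads = sorted(thread_power_w, key=thread_power_w.get, reverse=True)
    cores = sorted(core_conductivity, key=core_conductivity.get, reverse=True)
    return dict(zip(threads, cores))


# Made-up inputs: core2 and core3 are assumed to sit closest to TTSVs.
threads = {"t0": 3.0, "t1": 5.5, "t2": 1.2}
cores = {"core1": 0.8, "core2": 1.4, "core3": 1.1, "core4": 0.9}
placement = lambda_aware_placement(threads, cores)
print(placement)
```

A migration policy could rerun the same pairing periodically as thread power estimates change. That extension is likewise an assumption rather than something shown on the slide.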

SLIDE 15

Evaluation Setup

▪ 8-core OoO processor die @ 2.4 GHz
▪ 8-high Wide IO memory on top
▪ Processor timing and power: SESC & McPAT
▪ DRAM timing and power: DRAMSim2
▪ Thermal analysis: 3D HotSpot
▪ Applications: SPLASH-2, PARSEC & NAS

[Figure: Memory-on-top configuration. Labels: Heat Sink, IHS, TIM, Motherboard, Proc. Silicon, Proc. Metal, DRAM Metal, DRAM Silicon, D2D Layer, TSVs, C4 pads.]

SLIDE 16

Result Summary

TTSV Placement          Generic    Custom
Area Overhead           0.63%      0.81%
Proc. Temp. Reduction   5.0 °C     8.4 °C
Avg. Frequency Boost    400 MHz    720 MHz
Avg. Performance Gain   11%        18%

λ-aware techniques enable further 100-200 MHz improvements
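To put the frequency boosts in proportion, they can be expressed relative to the 2.4 GHz baseline from the evaluation setup. The sketch below only rearranges numbers already given in the slides. The dictionary layout and function name are illustrative.

```python
BASE_MHZ = 2400  # baseline processor frequency from the evaluation setup

# Average results reported in the summary table.
RESULTS = {
    "Generic": {"boost_mhz": 400, "perf_gain_pct": 11},
    "Custom": {"boost_mhz": 720, "perf_gain_pct": 18},
}


def relative_boost_pct(boost_mhz, base_mhz=BASE_MHZ):
    """Frequency boost as a percentage of the baseline frequency."""
    return 100 * boost_mhz / base_mhz


for scheme, r in RESULTS.items():
    rel = relative_boost_pct(r["boost_mhz"])
    print(f"{scheme}: +{rel:.1f}% frequency, +{r['perf_gain_pct']}% performance")
```

In both schemes the reported performance gain is smaller than the relative frequency increase.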

SLIDE 17

Conclusion

▪ Identified that the D2D layer is the thermal bottleneck in 3D stacks
▪ Improved vertical conduction through the D2D layer:
  – Align and short dummy μbumps with TTSVs
  – Generic and custom TTSV placement schemes
▪ Used the resulting thermal headroom to:
  – Boost processor frequency (400-720 MHz) & performance (11-18%)
▪ Exploited thermal heterogeneity: cores closer to TTSVs conduct heat better
  – Conductivity-aware thread placement and migration, and frequency boosting
  – Enable further 100-200 MHz improvements

SLIDE 18

Xylem: Enhancing Vertical Thermal Conduction in 3D Processor-Memory Stacks

Aditya Agrawal, Josep Torrellas and Sachin Idgunji
University of Illinois at Urbana-Champaign and Nvidia Corporation
http://iacoma.cs.uiuc.edu