Future Scaling of Processor-Memory Interfaces
Jung Ho Ahn†§, Norman P. Jouppi†, Christos Kozyrakis‡, Jacob Leverich‡, Robert S. Schreiber†
†HP Labs, ‡Stanford University, §Seoul National University
Nov 19, 2009
SC09 – Future Scaling of Processor-Memory Interfaces
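[Figure: DRAM organization. Left: a 1T1C DRAM cell on a wordline (wl) and bitline (bit). Right: a DRAM chip with banks 0 to 7 receiving requests and returning data; each bank has a row decoder, sense amplifiers, and a column decoder around a memory array of 16,384 rows by 8,192 columns.]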
Overfetching problem
  DRAM row size = 8 Kb, 8 or 16 DRAMs per DIMM
  Cache line size = 512 b
  Over 99% of bits are unused if row/col = 1
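The "over 99%" figure can be checked with a short calculation. This is a minimal sketch that assumes every DRAM chip on the DIMM activates one full row per access, that 8 Kb means 8,192 bits, and that exactly one cache line is read per activation (row/col = 1); the function name is illustrative.

```python
def unused_fraction(row_bits_per_chip: int, chips: int, cache_line_bits: int) -> float:
    """Fraction of activated row bits that are never transferred."""
    activated_bits = row_bits_per_chip * chips  # one row opened in every chip
    return 1.0 - cache_line_bits / activated_bits


ROW_BITS = 8 * 1024   # 8 Kb row per DRAM chip (from the slide)
LINE_BITS = 512       # cache line size (from the slide)

for chips in (8, 16):  # 8 or 16 DRAMs per DIMM (from the slide)
    frac = unused_fraction(ROW_BITS, chips, LINE_BITS)
    print(f"{chips} chips: {frac:.2%} of activated bits unused")
```

Under these assumptions only 512 of 65,536 (or 131,072) activated bits are used, which is where the waste of more than 99% comes from.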
MCDIMM (Multicore DIMM) features
  VMD = Virtual Memory Device: rank subsetting
  Demux register
  Over 99% of bits are unused if row/col = 1
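The effect of rank subsetting on activated bits and transfer length can be sketched as follows. The sketch assumes a 64-bit rank data bus split evenly among the subsets, 8 chips per rank, an 8 Kb row per chip, and a 512 b cache line; the function and parameter names are hypothetical, not taken from the slides.

```python
def subset_tradeoff(subsets: int, chips_per_rank: int = 8,
                    row_bits_per_chip: int = 8192,
                    bus_bits: int = 64, line_bits: int = 512):
    """Return (bits activated per access, bus beats per cache line)."""
    if chips_per_rank % subsets or bus_bits % subsets:
        raise ValueError("subsets must evenly divide the chips and the bus")
    chips_per_subset = chips_per_rank // subsets
    subset_bus_bits = bus_bits // subsets        # narrower data path per subset
    activated_bits = chips_per_subset * row_bits_per_chip
    beats = line_bits // subset_bus_bits         # serialization grows with subsets
    return activated_bits, beats


for s in (1, 2, 4, 8):
    activated, beats = subset_tradeoff(s)
    print(f"S={s}: {activated} bits activated, {beats} beats per cache line")
```

Each doubling of the subset count halves the bits activated per access and doubles the number of beats needed to move one cache line, which is the energy versus serialization-latency trade-off.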
[Figure: Demux register. Block labels: Register, Demultiplexer (optional), Counter.]
MCDIMM vs. mini-rank
  Register for data path vs. control path
  Timing constraint due to access interference
  Load balancing between rank subsets
D: number of DRAM chips per subset
S: number of subsets per rank
R: number of ranks per channel
SP: static power of a DRAM chip
E_RW: energy needed to read/write a bit
BW_RW: read/write bandwidth per memory channel
E_AP: energy to activate/precharge a row
f_AP: frequency of activate/precharge per memory channel
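The equation that accompanied these definitions did not survive in the transcript. One plausible way to combine the listed terms, offered here as an assumption and not as the authors' formula, is static power over all D·S·R chips, plus read/write energy per bit times bandwidth, plus activate/precharge energy paid by the D chips of the accessed subset. The function name and the numbers in the example are hypothetical.

```python
def memory_channel_power(D, S, R, SP, E_RW, BW_RW, E_AP, f_AP):
    """Assumed power model for one memory channel, in watts.

    Assumes E_AP is the activate/precharge energy of one chip, so a
    subset of D chips pays D * E_AP per activate/precharge pair.
    """
    static = D * S * R * SP        # every chip on the channel leaks
    read_write = E_RW * BW_RW      # J/bit times bit/s
    act_pre = D * E_AP * f_AP      # only the chips of one subset activate
    return static + read_write + act_pre


# Hypothetical values: 8 chips per rank, 2 ranks, split into S subsets.
for S in (1, 2, 4, 8):
    p = memory_channel_power(D=8 // S, S=S, R=2, SP=0.1,
                             E_RW=1e-11, BW_RW=1e10,
                             E_AP=1e-9, f_AP=1e7)
    print(f"S={S}: {p:.4f} W")
```

With the chip count per rank held fixed, only the activate/precharge term shrinks as S grows in this model; the static and read/write terms are unchanged.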
BW_RW: read/write bandwidth per memory channel
f_AP: frequency of activate/precharge per memory channel
f_CM: frequency of cache miss
CL: line size of last-level cache
row/col (bank conflict ratio)
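The relations between these quantities are not preserved in the transcript. A minimal sketch, assuming that every cache miss transfers one cache line and that the row/col ratio gives the number of activate/precharge pairs per column access, is shown below; both relations and all names are assumptions made for illustration.

```python
def channel_rates(f_CM: float, CL: float, row_col: float):
    """Return (BW_RW in bit/s, f_AP in 1/s) under the stated assumptions."""
    BW_RW = f_CM * CL        # each miss moves one cache line of CL bits
    f_AP = row_col * f_CM    # fraction of accesses that must open a new row
    return BW_RW, f_AP


# Hypothetical miss rate of 10 million misses per second, 512 b lines.
for ratio in (1.0, 0.5, 0.1):
    bw, f_ap = channel_rates(f_CM=1e7, CL=512, row_col=ratio)
    print(f"row/col={ratio}: BW_RW={bw:.3g} bit/s, f_AP={f_ap:.3g} per second")
```

At row/col = 1 every access pays for a fresh activation, which is the worst case for the overfetching problem.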
2b + l additional bits are needed to correct b bits of bursty error and to detect l bits of bursty error
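The capacity cost implied by the 2b + l rule can be tabulated with a few lines. This sketch only evaluates the expression stated on the slide; the 64-bit data word and the values of b and l are hypothetical examples, not figures from the talk.

```python
def check_bits(b: int, l: int) -> int:
    """Additional bits for b-bit burst correction plus l-bit burst detection."""
    if b < 0 or l < 0:
        raise ValueError("b and l must be non-negative")
    return 2 * b + l


def storage_overhead(data_bits: int, b: int, l: int) -> float:
    """Check bits as a fraction of the data bits they protect."""
    return check_bits(b, l) / data_bits


DATA_BITS = 64  # hypothetical data word width
for b, l in ((1, 1), (4, 4), (8, 8)):
    extra = check_bits(b, l)
    print(f"b={b}, l={l}: {extra} extra bits, "
          f"{storage_overhead(DATA_BITS, b, l):.1%} overhead")
```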
[Figure: simulated system. MT cores with L1 caches (L1$), L2 cache banks (L2$) with directories (Dir), and memory controllers (MC) connected to DIMMs.]
[Figure: results grouped by benchmark suite: SPLASH-2, SPEC CPU 2006, PARSEC.]
[Figure: results grouped by benchmark suite: SPLASH-2, SPEC CPU 2006, PARSEC.]
[Figure: results grouped by benchmark suite: SPLASH-2, SPEC CPU 2006, PARSEC.]
Multicore DIMM
  Instantiation of rank subsetting
  Gain energy efficiency & concurrency
  Sacrifice serialization latency
  Advantage in EDP (energy-delay product) with proper subsetting
  Energy-efficient, capacity-inefficient reliability solution
Challenges on main memory systems
  Performance/capacity demands
  Energy-efficiency goals
  Reliability constraints
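The energy-delay product used in the comparison is simply energy multiplied by execution time, so a configuration that saves energy can still lose on EDP if it slows execution too much. The sketch below uses made-up normalized numbers purely to illustrate the metric; they are not results from the talk.

```python
def edp(energy: float, delay: float) -> float:
    """Energy-delay product: lower is better."""
    return energy * delay


# Hypothetical (energy, delay) pairs normalized to a baseline of 1.0.
configs = {
    "baseline": (1.00, 1.00),
    "moderate subsetting": (0.80, 1.05),    # saves energy, slightly slower
    "aggressive subsetting": (0.70, 1.50),  # saves more energy, much slower
}

for name, (energy, delay) in configs.items():
    print(f"{name}: EDP = {edp(energy, delay):.3f}")

best = min(configs, key=lambda name: edp(*configs[name]))
print("lowest EDP:", best)
```

In this illustrative setup the moderate configuration wins, mirroring the point that the EDP advantage depends on choosing the subsetting degree properly.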