Memory Hierarchy
Instructor: Jun Yang
11/19/2009
Motivation: the Processor-DRAM memory gap (latency).

[Figure: processor performance improves roughly 60%/yr (Moore's Law, 2X/1.5yr), while DRAM latency improves far more slowly; the Processor-Memory Performance Gap grows about 50% per year.]
The memory hierarchy (processor = control + datapath), fastest to slowest:

Processor Registers → On-Chip Caches → Off-Chip Caches (SRAM) → Main Memory (DRAM) → Secondary Storage (Disk) → Tertiary Storage (Tape)
[Figure: data moves between Upper Level Memory and Lower Level Memory in blocks (Blk X, Blk Y); the processor reads from and writes to the upper level.]
Direct mapping example: an 8-line cache (lines 000–111) and memory blocks 00001, 00101, 01001, 01101, 10001, 10101, 11001, 11101. All of these block addresses end in 001, so every one of them maps to cache line 001: a block maps to line (block address mod number of lines).
Address fields for a cache with n lines, s-way sets, and b-byte blocks:

Direct mapping:           tag | index (log2 n bits) | offset (log2 b bits)
Set-associative mapping:  tag | index (log2 (n/s) bits) | offset (log2 b bits)
Fully associative mapping: tag | offset (log2 b bits)
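A small sketch (parameter names are mine, not the slides') computing these field widths for a 32-bit address; note that the fully associative case yields the 27-bit tag used in the example later in these slides:

```python
# Field widths for a 32-bit address: offset = log2(b), index = log2(n/s)
# (fully associative is the special case s == n, so index = 0); the tag
# takes whatever bits remain.

from math import log2

ADDR_BITS = 32

def field_widths(n_lines: int, block_bytes: int, ways: int):
    """Return (tag, index, offset) bit widths for an s-way cache."""
    offset = int(log2(block_bytes))
    n_sets = n_lines // ways            # fully associative: ways == n_lines
    index = int(log2(n_sets))
    tag = ADDR_BITS - index - offset
    return tag, index, offset

print(field_widths(32, 32, 1))    # direct-mapped, 1 KB / 32 B blocks
print(field_widths(32, 32, 2))    # 2-way set-associative
print(field_widths(32, 32, 32))   # fully associative: 27-bit tag
```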
[Figure: a 1 KB direct-mapped cache with 32-byte blocks (Byte 0 ... Byte 1023; each block holds Byte 0 ... Byte 31). Each line stores a Valid Bit and the Cache Tag as part of the cache alongside the Cache Data. The 32-bit block address splits into: Cache Tag (bits 31–10, ex: 0x50), Cache Index (bits 9–5, ex: 0x01), and Byte Select (bits 4–0, ex: 0x00).]
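Decomposing an address for this 1 KB, 32-byte-block cache can be sketched as follows (an illustration assuming the 22/5/5 bit split above):

```python
# Split a 32-bit address for a 1 KB direct-mapped cache with 32 B blocks:
# 5 byte-select bits, 5 index bits, 22 tag bits.

def split_address(addr: int):
    byte_select = addr & 0x1F          # bits 4..0
    index = (addr >> 5) & 0x1F         # bits 9..5
    tag = addr >> 10                   # bits 31..10
    return tag, index, byte_select

# The slide's example values: tag 0x50, index 0x01, byte select 0x00
addr = (0x50 << 10) | (0x01 << 5) | 0x00   # = 0x14020
print([hex(f) for f in split_address(addr)])
```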
[Figure: a two-way set-associative cache. Each way has its own Valid, Cache Tag, and Cache Data arrays. The Cache Index selects one set in both ways; the address tag (Adr Tag) is compared against both stored tags in parallel; the two compare results are ORed to produce Hit and drive the mux selects (Sel1, Sel0) that pick the hitting way's Cache Block.]
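The parallel compare-and-select can be sketched as follows (data structures are assumptions for illustration, not the slides' hardware):

```python
# Two-way set-associative lookup: each set holds two (valid, tag) ways;
# both tags are compared at once and the results are ORed into Hit.

def lookup(sets, index: int, addr_tag: int):
    """Return (hit, way) after comparing both ways of set `index`."""
    way_hits = [valid and tag == addr_tag for (valid, tag) in sets[index]]
    hit = way_hits[0] or way_hits[1]          # OR of the two comparators
    way = 0 if way_hits[0] else (1 if way_hits[1] else None)  # mux select
    return hit, way

sets = {3: [(True, 0x12), (True, 0x34)]}      # one populated set
print(lookup(sets, 3, 0x34))   # hit in way 1
print(lookup(sets, 3, 0x99))   # miss
```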
[Figure: a fully associative cache with 32-byte blocks (Byte 0 ... Byte 31, Byte 32 ... Byte 63, ...). There is no Cache Index; the address splits into Cache Tag (27 bits, bits 31–5) and Byte Select (bits 4–0, ex: 0x01). The tag is compared (=) in parallel against every stored tag, each qualified by its Valid Bit.]
LRU example. Address stream: A B A C B A B, with a two-entry fully associative cache:

A → miss (load A)
B → miss (load B)
A → hit (A becomes most recently used; B is now LRU)
C → miss (replace LRU block B)
B → miss (replace LRU block A)
A → miss (replace LRU block C)
B → hit

Five misses in total. On a miss, we replace the LRU block; on a hit, we just update the LRU state.
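A short simulation of this policy (a sketch assuming a two-entry fully associative cache, which reproduces the slide's five misses):

```python
# LRU simulation: replace the least-recently-used block on a miss,
# update recency on a hit.

def simulate_lru(stream, capacity=2):
    cache = []                 # front = LRU, back = most recently used
    outcomes = []
    for addr in stream:
        if addr in cache:
            cache.remove(addr)
            outcomes.append("hit")
        else:
            if len(cache) == capacity:
                cache.pop(0)   # evict the LRU block
            outcomes.append("miss")
        cache.append(addr)     # addr is now most recently used
    return outcomes

print(simulate_lru("ABACBAB"))   # 5 misses, 2 hits
```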
Write buffer: [Figure: a write buffer sits between the cache and memory, so the processor can continue after a write without waiting for the memory write to complete.]
AMAT = Hit time_L1 + Miss rate_L1 × (Hit time_L2 + Local miss rate_L2 × Miss penalty_L2)

Average memory stall time = Miss rate_L1 × Hit time_L2 + Miss rate_L1 × Local miss rate_L2 × Miss penalty_L2

Average memory stalls per instruction = Misses per instruction_L1 × Hit time_L2 + Misses per instruction_L2 × Miss penalty_L2
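A worked instance of the two-level AMAT formula (the numbers below are hypothetical, chosen only to illustrate the arithmetic):

```python
# Two-level average memory access time:
# AMAT = HitTime_L1 + MissRate_L1 * (HitTime_L2 + LocalMissRate_L2 * MissPenalty_L2)

def amat(hit_l1, miss_rate_l1, hit_l2, local_miss_rate_l2, penalty_l2):
    return hit_l1 + miss_rate_l1 * (hit_l2 + local_miss_rate_l2 * penalty_l2)

# Hypothetical: 1-cycle L1 hit, 5% L1 miss rate, 10-cycle L2 hit,
# 20% L2 local miss rate, 100-cycle L2 miss penalty.
print(amat(1, 0.05, 10, 0.20, 100))   # 1 + 0.05 * (10 + 20) = 2.5 cycles
```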
[Figures: loads and stores are enqueued on their way to memory, e.g. St 35, 1000 followed by Ld r4, 1000 to the same address; a miss in the cache may still hit in the victim cache before the next instruction proceeds.]
Virtually indexed caches:
+ Index the cache using virtual addresses: avoids the TLB access, saving time.
− A context switch forces flushing the entire cache.
− Aliasing: different virtual addresses may map to the same physical address; a protection mechanism is needed!