Caches and Memory
Anne Bracy CS 3410 Computer Science Cornell University
See P&H Chapter: 5.1-5.4, 5.8, 5.10, 5.13, 5.15, 5.17
Slides by Anne Bracy with 3410 slides by Professors Weatherspoon, Bala, McKee, and Sirer.
Programs 101
int main (int argc, char* argv[ ]) {
    int i;
    int m = n;
    int sum = 0;
    for (i = 1; i <= m; i++) {
        sum += i;
    }
    printf ("...", n, sum);
}
main:
    addiu $sp,$sp,-48
    sw    $31,44($sp)
    sw    $fp,40($sp)
    move  $fp,$sp
    sw    $4,48($fp)
    sw    $5,52($fp)
    la    $2,n
    lw    $2,0($2)
    sw    $2,28($fp)
    sw    $0,32($fp)
    li    $2,1
    sw    $2,24($fp)
$L2:
    lw    $2,24($fp)
    lw    $3,28($fp)
    slt   $2,$3,$2
    bne   $2,$0,$L3
    . . .
Instructions that read from
[Figure: five-stage pipeline datapath. Stages: Instruction Fetch, Instruction Decode, Execute, Memory, Write-Back, separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers. Labeled components: PC (+4, new pc), two memories (inst; addr, din, dout), register file, extend, control, ALU, compute jump/branch targets, forward unit, detect hazard.]
Callouts on the two memories: "Code Stored in Memory (also, data and stack)" and "Stack, Data, Code Stored in Memory".
SandyBridge Motherboard, 2011 http://news.softpedia.com
[Figure: Intel Pentium 3, 1999, with labeled regions: Level 1 Data $, Level 1 Insn $, Level 2 $]
total = 0;
for (i = 0; i < n; i++)
    total += a[i];
return total;
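Memory hierarchy (Intel Haswell Processor, 2013)
Level | Access time | Capacity
Registers | 1 cycle | 128 bytes
L1 Caches | 4 cycles | 64 KB
L2 Cache | 12 cycles | 256 KB
L3 Cache | 36 cycles | 2-20 MB
Main Memory | 50-70 ns | 512 MB to 4 GB
Disk | 5-20 ms | 16 GB to 4 TB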
[Figure: the memory hierarchy, top to bottom: Registers, L1 Caches, L2 Cache, L3 Cache, Main Memory, Disk, with a marker for the ON CHIP levels. Each of the four processors has its own Regs, I$, and D$.]
Memory technology | Transistor count* | Access time | Access time in cycles | $ per GiB in 2012 | Capacity
SRAM (on chip) | 6-8 transistors | 0.5-2.5 ns | 1-3 cycles | $4k | 256 KB
SRAM (off chip) | | 1.5-30 ns | 5-15 cycles | $4k | 32 MB
DRAM | 1 transistor (needs refresh) | 50-70 ns | 150-200 cycles | $10-$20 | 8 GB
SSD (Flash) | | 5k-50k ns | Tens of thousands | $0.75-$1 | 512 GB
Disk | | 5M-20M ns | Millions | $0.05-$0.1 | 4 TB
*Registers, D-Flip Flops: 10-100's of registers
Memory (4-bit addresses, one byte per address):
addr  data    addr  data
0000  A       1000  J
0001  B       1001  K
0010  C       1010  L
0011  D       1011  M
0100  E       1100  N
0101  F       1101  O
0110  G       1110  P
0111  H       1111  Q
Cache (4 entries):
index  data
00     A
01     B
10     C
11     D
Analogy: Spice Wall (Memory) vs. Spice Rack (Cache)
Spice Wall (Memory): A B C D E F … Z
Spice Rack (Cache): each slot has an index, a tag, and the spice itself.
Example: Cinnamon, stored with the tag "innamon" (the name minus its first letter).
Image: http://www.bedbathandbeyond.com
Cache with tags (memory contents as in the table above):
index  tag  data
00     00   A
01     00   B
10     00   C
11     00   D
Cache with a valid bit (V), a tag, and data; memory contents as before. All four entries (index 00, 01, 10, 11) start out empty (data X).
Step 1: the first lookup finds no valid entry. Miss.
Step 2: the byte is fetched from memory. Index 00 now holds V = 1, tag = 11, data = N.
Step 3: the same location is looked up again. Hit!
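Lookup example (2-bit tag, 2-bit index, 8-bit data):
V  tag  data
1  00   1111 0000
1  11   1010 0101
   01   1010 1010
1  11   0000 0000
The stored tag is compared (=) with the tag from the address. Hit! data = 1010 0101.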
Trace on the direct-mapped cache (4 entries, 1-byte blocks, memory contents as before), starting empty:
access  result       cache after the access
1       Miss (cold)  index 00: V = 1, tag 11, data N
2       Miss (cold)  index 01: V = 1, tag 11, data O
3       Miss (cold)  index 00: V = 1, tag 01, data E (replaces N)
4       Miss         index 00: V = 1, tag 11, data N (replaces E)
Total: Miss Miss Miss Miss.
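The four misses above can be reproduced with a small simulator. This is a minimal sketch, assuming the addresses accessed are 1100, 1101, 0100, 1100 (inferred from the letters N, O, E, N that land in the cache). The names MEMORY, Line and dm_load are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* The 16-byte memory from the slides: address 0000 holds 'A', ...,
   1111 holds 'Q' (the slides skip the letter I). */
static const char MEMORY[17] = "ABCDEFGHJKLMNOPQ";

typedef struct {
    bool    valid;
    uint8_t tag;    /* upper 2 bits of the 4-bit address */
    char    data;   /* one-byte block */
} Line;

static Line cache[4];   /* zero-initialized: every line starts invalid */

/* Direct-mapped lookup. Returns true on a hit; on a miss the line is
   (re)filled from MEMORY. The loaded byte is stored in *out. */
bool dm_load(uint8_t addr, char *out)
{
    uint8_t index = addr & 0x3;          /* low 2 bits select the line   */
    uint8_t tag   = (addr >> 2) & 0x3;   /* high 2 bits name the block   */
    Line *line = &cache[index];

    bool hit = line->valid && line->tag == tag;
    if (!hit) {
        line->valid = true;
        line->tag   = tag;
        line->data  = MEMORY[addr];
    }
    *out = line->data;
    return hit;
}
```

Calling dm_load with 0xC, 0xD, 0x4, 0xC in that order misses every time: N and E both map to index 00 and keep evicting each other.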
Direct-mapped cache with 2-byte blocks (columns: V, tag, data):
index  data
00     A | B
01     C | D
10     E | F
11     G | H
Same trace on the direct-mapped cache with 2-byte blocks (4 entries, memory contents as before), starting empty:
access  result           cache after the access
1       Miss (cold)      index 10: V = 1, tag 1, data N | O
2       Hit!             unchanged
3       Miss (cold)      index 10: V = 1, data E | F (replaces N | O)
4       Miss (conflict)
Total: Miss Hit! Miss Miss.
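With 2-byte blocks the 4-bit address splits into 1 tag bit, 2 index bits, and 1 offset bit. The sketch below replays the same assumed trace (1100, 1101, 0100, 1100). The names MEMORY, Line2 and dm2_load are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory from the slides: 0000 -> 'A', ..., 1111 -> 'Q' (no letter I). */
static const char MEMORY[17] = "ABCDEFGHJKLMNOPQ";

typedef struct {
    bool    valid;
    uint8_t tag;       /* top bit of the 4-bit address */
    char    data[2];   /* two-byte block */
} Line2;

static Line2 cache2[4];   /* all lines start invalid */

/* Direct-mapped lookup with 2-byte blocks. Returns true on a hit. */
bool dm2_load(uint8_t addr, char *out)
{
    uint8_t offset = addr & 0x1;          /* byte within the block */
    uint8_t index  = (addr >> 1) & 0x3;   /* which cache line      */
    uint8_t tag    = (addr >> 3) & 0x1;   /* which block is there  */
    Line2 *line = &cache2[index];

    bool hit = line->valid && line->tag == tag;
    if (!hit) {
        uint8_t base = (uint8_t)(addr & 0xE);   /* block-aligned address */
        line->valid   = true;
        line->tag     = tag;
        line->data[0] = MEMORY[base];
        line->data[1] = MEMORY[base + 1];
    }
    *out = line->data[offset];
    return hit;
}
```

The second access now hits because O arrived together with N, but N | O and E | F still compete for index 10, so the last access misses.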
Cache with 4 lines and no index (fully associative): each line holds V, a 3-bit tag (xxx), and a 2-byte block (X | X). All lines start out empty. Memory contents as before.
Same trace on the 4-line cache with 3-bit tags and an LRU Pointer, starting empty:
access  result  cache after the access
1       Miss    line 0: V = 1, tag 110, data N | O
2       Hit!    unchanged
3       Miss    line 1: V = 1, tag 010, data E | F
4       Hit!    unchanged
Total: Miss Hit! Miss Hit!
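A sketch of this organization: every line is searched, and on a miss the least recently used line is replaced. The slides show an LRU Pointer. The per-line timestamp used here is a substitute chosen for brevity, and the trace 1100, 1101, 0100, 1100 is again an assumption. The names MEMORY, Line, lines, now and fa_load are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory from the slides: 0000 -> 'A', ..., 1111 -> 'Q' (no letter I). */
static const char MEMORY[17] = "ABCDEFGHJKLMNOPQ";

typedef struct {
    bool     valid;
    uint8_t  tag;         /* upper 3 bits of the address (no index bits) */
    char     data[2];     /* two-byte block */
    unsigned last_used;   /* timestamp of the most recent access */
} Line;

static Line     lines[4];   /* all lines start invalid */
static unsigned now;        /* access counter used as a clock */

/* Fully associative lookup with LRU replacement. Returns true on a hit. */
bool fa_load(uint8_t addr, char *out)
{
    uint8_t offset = addr & 0x1;
    uint8_t tag    = (addr >> 1) & 0x7;
    now++;

    /* Compare the tag against every line. */
    for (int i = 0; i < 4; i++) {
        if (lines[i].valid && lines[i].tag == tag) {
            lines[i].last_used = now;
            *out = lines[i].data[offset];
            return true;
        }
    }

    /* Miss: take the first invalid line, else the least recently used. */
    int victim = 0;
    for (int i = 0; i < 4; i++) {
        if (!lines[i].valid) { victim = i; break; }
        if (lines[i].last_used < lines[victim].last_used) victim = i;
    }

    uint8_t base = (uint8_t)(addr & 0xE);   /* block-aligned address */
    lines[victim].valid     = true;
    lines[victim].tag       = tag;
    lines[victim].data[0]   = MEMORY[base];
    lines[victim].data[1]   = MEMORY[base + 1];
    lines[victim].last_used = now;
    *out = lines[victim].data[offset];
    return false;
}
```

Because N | O and E | F can sit in different lines, the fourth access hits instead of causing a conflict miss.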
Two-way cache with 2-byte blocks (each way has columns V, tag, data; memory contents as before):
index  way 0   way 1
0      E | F   N | O
1      C | D   P | Q
Same trace on the two-way cache (2 sets, 2-byte blocks, 2-bit tags, LRU Pointer), starting empty:
access  result  cache after the access
1       Miss    set 0, way 0: V = 1, tag 11, data N | O
2       Hit!    unchanged
3       Miss    set 0, way 1: V = 1, tag 01, data E | F
4       Hit!    unchanged
Total: Miss Hit! Miss Hit!
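A sketch of the two-way organization: the address splits into 2 tag bits, 1 index bit, and 1 offset bit, and each set keeps one LRU pointer naming the way to replace next. The trace 1100, 1101, 0100, 1100 is assumed as before. MEMORY, Way, Set and sa_load are illustrative names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory from the slides: 0000 -> 'A', ..., 1111 -> 'Q' (no letter I). */
static const char MEMORY[17] = "ABCDEFGHJKLMNOPQ";

typedef struct {
    bool    valid;
    uint8_t tag;       /* upper 2 bits of the address */
    char    data[2];   /* two-byte block */
} Way;

typedef struct {
    Way way[2];
    int lru;           /* LRU pointer: the way to replace next */
} Set;

static Set sets[2];    /* all ways start invalid, lru starts at way 0 */

/* Two-way set-associative lookup with LRU replacement.
   Returns true on a hit. */
bool sa_load(uint8_t addr, char *out)
{
    uint8_t offset = addr & 0x1;
    uint8_t index  = (addr >> 1) & 0x1;
    uint8_t tag    = (addr >> 2) & 0x3;
    Set *set = &sets[index];

    /* Check both ways of the selected set. */
    for (int w = 0; w < 2; w++) {
        if (set->way[w].valid && set->way[w].tag == tag) {
            set->lru = 1 - w;             /* the other way is now LRU */
            *out = set->way[w].data[offset];
            return true;
        }
    }

    /* Miss: replace the way the LRU pointer names. */
    int w = set->lru;
    uint8_t base = (uint8_t)(addr & 0xE);  /* block-aligned address */
    set->way[w].valid   = true;
    set->way[w].tag     = tag;
    set->way[w].data[0] = MEMORY[base];
    set->way[w].data[1] = MEMORY[base + 1];
    set->lru = 1 - w;
    *out = set->way[w].data[offset];
    return false;
}
```

N | O and E | F both map to set 0 but occupy different ways, so the trace gives Miss, Hit, Miss, Hit. A third block mapping to set 0 would evict whichever of the two was used less recently.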
More Associative, Bigger Block Sizes, Larger Capacity
32 bits, 64 bytes
(what we usually mean when we ask “how big” is the cache)
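How the data capacity relates to the tag and valid-bit overhead can be worked out from the address width, block size, and number of lines. This is a minimal sketch, assuming the 32 bits and 64 bytes above are the address width and the block size. The 512 lines in the usage note are an invented example value, and CacheGeometry and cache_geometry are hypothetical names.

```c
/* Integer log2 for powers of two (block sizes and line counts). */
static unsigned log2u(unsigned long x)
{
    unsigned n = 0;
    while (x > 1) { x >>= 1; n++; }
    return n;
}

typedef struct {
    unsigned      offset_bits;    /* selects a byte within a block   */
    unsigned      index_bits;     /* selects a line                  */
    unsigned      tag_bits;       /* remaining address bits          */
    unsigned long data_bytes;     /* "how big" the cache is          */
    unsigned long overhead_bits;  /* tag + valid bit, over all lines */
} CacheGeometry;

/* Direct-mapped cache; block_bytes and num_lines must be powers of 2. */
CacheGeometry cache_geometry(unsigned addr_bits,
                             unsigned long block_bytes,
                             unsigned long num_lines)
{
    CacheGeometry g;
    g.offset_bits   = log2u(block_bytes);
    g.index_bits    = log2u(num_lines);
    g.tag_bits      = addr_bits - g.index_bits - g.offset_bits;
    g.data_bytes    = block_bytes * num_lines;
    g.overhead_bits = num_lines * (g.tag_bits + 1);   /* +1 for valid */
    return g;
}
```

For example, cache_geometry(32, 64, 512) gives 6 offset bits, 9 index bits, 17 tag bits, 32768 data bytes (32 KB), and 9216 bits of tag and valid-bit overhead.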
[Figure: CPU, Cache (SRAM), and Memory (DRAM), connected by addr and data lines]
Write policies:
No-Write: writes invalidate the cache and go directly to memory
Write-Through: writes go to main memory and cache
Write-Back: CPU writes only to cache; cache writes to main memory later (when block is evicted)
On a write miss:
Write-Allocate: allocate a cache line for new data (and maybe write-through)
No-Write-Allocate: ignore cache, just go to main memory
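The difference between writing through and writing back shows up in how many bytes reach main memory. The sketch below is a toy direct-mapped, write-allocate cache with 2 lines of 2 bytes each. That configuration and the names Sim, sim_init and sim_store are assumptions for illustration, not the cache used in the slides' worked example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { NUM_LINES = 2, BLOCK = 2, MEM_SIZE = 16 };

typedef struct {
    bool    valid, dirty;
    uint8_t tag;
    uint8_t data[BLOCK];
} Line;

typedef struct {
    bool     write_back;        /* false: write-through          */
    Line     lines[NUM_LINES];
    uint8_t  mem[MEM_SIZE];
    unsigned mem_writes;        /* bytes written to main memory  */
} Sim;

void sim_init(Sim *s, bool write_back)
{
    memset(s, 0, sizeof *s);
    s->write_back = write_back;
}

/* Find the line for addr. On a miss, write back a dirty victim,
   then fetch the new block (write-allocate). */
static Line *lookup(Sim *s, uint8_t addr)
{
    unsigned index = (addr / BLOCK) % NUM_LINES;
    unsigned tag   = addr / (BLOCK * NUM_LINES);
    Line *l = &s->lines[index];

    if (l->valid && l->tag == tag)
        return l;

    if (l->valid && l->dirty) {
        unsigned old_base = (l->tag * NUM_LINES + index) * BLOCK;
        memcpy(&s->mem[old_base], l->data, BLOCK);
        s->mem_writes += BLOCK;
    }
    memcpy(l->data, &s->mem[addr - addr % BLOCK], BLOCK);
    l->valid = true;
    l->dirty = false;
    l->tag   = (uint8_t)tag;
    return l;
}

void sim_store(Sim *s, uint8_t addr, uint8_t value)
{
    Line *l = lookup(s, addr);
    l->data[addr % BLOCK] = value;
    if (s->write_back) {
        l->dirty = true;          /* memory is updated on eviction */
    } else {
        s->mem[addr] = value;     /* memory is updated immediately */
        s->mem_writes += 1;
    }
}
```

Storing to address 5 three times costs three memory writes under write-through and none under write-back. The write-back cache pays later: a store to address 0 evicts the dirty block and writes both of its bytes to memory.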
[Worked example, animation frames. Each frame shows registers $0 to $3, a two-line cache with 3-bit tags and 2-byte blocks, and a 16-byte memory (addresses 0 to 15) whose initial values are 29 123 150 162 18 33 19 210 and 78 120 71 173 21 28 200 225, listed in the order they appear in the extraction.
Addresses accessed, in order of appearance: 0001, 0111, 0000, 0101, 1010, 0101, 1011.
Cache tags over time: 000; then 000 and 011; then 000 and 010; then 101 and 010.
Values loaded into registers: 29, 173, 33.
In these frames the memory contents change as the stores execute: 78 becomes 173, 150 becomes 29, and 33 becomes 29.]
V D Tag Byte 1 Byte 2 … Byte N
[Worked example, animation frames. Same registers ($0 to $3), memory values, and two-line cache as in the previous example, now with a dirty bit (D) per line.
Addresses accessed, in order of appearance: 0001, 0111, 0000, 0101, 1010 (shown on three consecutive frames), 0101.
Cache tags over time: 000; then 000 and 011; then 000 and 010; then 101 and 010.
In these frames the memory values shown stay at their initial contents.]