Computer Systems Lecture 17 Caching Continued CS 230 - Spring 2020 - - PowerPoint PPT Presentation

SLIDE 1

CS 230 - Spring 2020 3-1

CS 230 – Introduction to Computers and Computer Systems Lecture 17 – Caching Continued

SLIDE 2

Cache Writing Strategy

• Store word loads blocks just like load word
• Then what does it do with the new value?
• Write-through
  • store word updates both cache and main memory
• Write-back: suited to many updates in the same block
  • only update cache, mark block as dirty
  • add dirty bit column to cache diagram
  • 1 (dirty) if store word executed for this block, 0 (clean) if not
  • update main memory when dirty block is evicted, and set dirty bit back to 0 if the new block came from a load word
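The difference between the two strategies shows up in how often main memory is written. The following is a minimal sketch, not part of the lecture: `CachedBlock` and `writes_after` are hypothetical names, and the model tracks a single cached block and counts main-memory writes only.

```python
class CachedBlock:
    """Toy model of a single cached block; only counts main-memory writes."""

    def __init__(self, policy):
        if policy not in ("write-through", "write-back"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.dirty = False
        self.memory_writes = 0

    def store_word(self):
        if self.policy == "write-through":
            self.memory_writes += 1   # cache and main memory updated together
        else:
            self.dirty = True         # cache only; main memory is now stale

    def evict(self):
        if self.dirty:
            self.memory_writes += 1   # write the dirty block back once
            self.dirty = False


def writes_after(policy, stores):
    """Main-memory writes caused by `stores` store words followed by an eviction."""
    block = CachedBlock(policy)
    for _ in range(stores):
        block.store_word()
    block.evict()
    return block.memory_writes


print(writes_after("write-through", 5))  # 5
print(writes_after("write-back", 5))     # 1
```

With five stores to the same block, write-through touches main memory five times, while write-back touches it once, at eviction.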

SLIDE 3

Associative Caches

• n-way set associative
  • divide a cache with M cache-blocks into M/n sets
  • each set contains n cache-blocks
  • sets are contiguous
• addresses have cache-set instead of cache-block
  • for address P and block size B we have cache-set Cs = (P/B) MOD (M/n), using integer division
  • in binary it's the next log2(M/n) bits adjacent to the "within the block" bits
  • tag is T = P/(B*M/n), which is the remaining bits in binary
• when loading: check all tags in cache-set for a hit
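The set and tag formulas can be sketched in a few lines. The function name `split_address` and its parameter names are assumptions made for illustration. The sample call uses the parameters of the lecture's worked example (M = 8, B = 2, n = 2) and address 44.

```python
def split_address(p, block_size, num_blocks, ways):
    """Return (tag, cache_set, offset) for byte address p.

    block_size is B, num_blocks is M, ways is n; all divisions are integer.
    """
    num_sets = num_blocks // ways              # M/n sets
    offset = p % block_size                    # "within the block" bits
    cache_set = (p // block_size) % num_sets   # Cs = (P/B) MOD (M/n)
    tag = p // (block_size * num_sets)         # T = P/(B*M/n)
    return tag, cache_set, offset


tag, cache_set, offset = split_address(44, block_size=2, num_blocks=8, ways=2)
print(format(tag, "05b"), format(cache_set, "02b"), offset)  # 00101 10 0
```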

SLIDE 4

Fully Associative and Direct Mapped

• Direct-mapped cache is 1-way set associative
  • n = 1
  • no choice of where to put blocks
• Fully associative
  • n-way set associative with n = M (a single set holding every cache-block)
  • allow a given block to go in any cache entry
  • requires all entries to be searched at once
  • hardware comparator per entry (expensive)

SLIDE 5

Replacement Policy

• Where do we put loaded blocks?
  • associative caches give us a choice
• n = 1 (direct-mapped) has only one block per set
  • no choice, just put it in the one spot
• n > 1
  • fill any empty spots in order
  • if no empty spots: evict least recently used
    • the block that was interacted with (hit or miss) the longest time ago
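A least-recently-used policy for one cache set can be sketched with Python's `collections.OrderedDict`, which keeps keys in insertion order and can move a key to the end on each access. The class name `LRUSet` and its interface are hypothetical. Real hardware typically tracks recency with a few status bits per set rather than an ordered map.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set holding up to `ways` tags, least recently used first."""

    def __init__(self, ways):
        self.ways = ways
        self.tags = OrderedDict()

    def access(self, tag):
        """Touch `tag`; return (hit, evicted_tag_or_None)."""
        if tag in self.tags:
            self.tags.move_to_end(tag)       # a hit makes the block most recent
            return True, None
        evicted = None
        if len(self.tags) == self.ways:      # no empty spot left in this set
            evicted, _ = self.tags.popitem(last=False)   # drop the LRU block
        self.tags[tag] = True                # a miss also counts as an interaction
        return False, evicted


s = LRUSet(ways=2)
print(s.access(0b00101))  # (False, None)
print(s.access(0b01000))  # (False, None)
print(s.access(0b00101))  # (True, None)
print(s.access(0b00001))  # (False, 8)
```

In the last call the evicted tag is 8 (binary 01000), since tag 00101 was touched more recently.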

SLIDE 6

Example

• Assume 2-way set associative, M = 8, B = 2 bytes, 8-bit machine word, least-recently used eviction policy

Cache-set  Index  Valid  Dirty  Tag    Data
00         000    N      N
           001    N      N
01         010    N      N
           011    N      N
10         100    N      N
           101    N      N
11         110    N      N
           111    N      N

SLIDE 7

Example

Instruction  Address (base 10)  Binary addr  Hit/miss  Cache set
lw           44                 00101 10 0   Miss      10

Cache-set  Index  Valid  Dirty  Tag    Data
00         000    N      N
           001    N      N
01         010    N      N
           011    N      N
10         100    Y      N      00101  Mem[0010110X]
           101    N      N
11         110    N      N
           111    N      N

SLIDE 8

Example

Instruction  Address (base 10)  Binary addr  Hit/miss  Cache set
lw           68                 01000 10 0   Miss      10

Cache-set  Index  Valid  Dirty  Tag    Data
00         000    N      N
           001    N      N
01         010    N      N
           011    N      N
10         100    Y      N      00101  Mem[0010110X]
           101    Y      N      01000  Mem[0100010X]
11         110    N      N
           111    N      N

SLIDE 9

Example

Instruction  Address (base 10)  Binary addr  Hit/miss  Cache set
sw           45                 00101 10 1   Hit       10

Cache-set  Index  Valid  Dirty  Tag    Data
00         000    N      N
           001    N      N
01         010    N      N
           011    N      N
10         100    Y      Y      00101  Mem[0010110X]
           101    Y      N      01000  Mem[0100010X]
11         110    N      N
           111    N      N

SLIDE 10

Example

Instruction  Address (base 10)  Binary addr  Hit/miss  Cache set
sw           8                  00001 00 0   Miss      00

Cache-set  Index  Valid  Dirty  Tag    Data
00         000    Y      Y      00001  Mem[0000100X]
           001    N      N
01         010    N      N
           011    N      N
10         100    Y      Y      00101  Mem[0010110X]
           101    Y      N      01000  Mem[0100010X]
11         110    N      N
           111    N      N

SLIDE 11

Example

Instruction  Address (base 10)  Binary addr  Hit/miss  Cache set
lw           12                 00001 10 0   Miss      10

Cache-set  Index  Valid  Dirty  Tag    Data
00         000    Y      Y      00001  Mem[0000100X]
           001    N      N
01         010    N      N
           011    N      N
10         100    Y      Y      00101  Mem[0010110X]
           101    Y      N      00001  Mem[0000110X]
11         110    N      N
           111    N      N

Index 101: the block that was here before (tag 01000) got evicted.
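The five accesses of this example can be replayed with a small simulator. This is a sketch under stated assumptions, not the course's reference code: the function name `simulate` is hypothetical, the cache is assumed to be write-back with LRU eviction, and each set is kept as a list ordered from least to most recently used rather than by physical index.

```python
def simulate(accesses, num_blocks, block_size, ways):
    """Run (op, address) pairs through a write-back, LRU, set-associative cache.

    Returns the list of "hit"/"miss" results and the final contents as
    {set_index: [(tag, dirty), ...]}, least recently used entry first.
    """
    num_sets = num_blocks // ways
    sets = {s: [] for s in range(num_sets)}    # each entry is [tag, dirty]
    results = []
    for op, addr in accesses:
        block = addr // block_size
        s, tag = block % num_sets, block // num_sets
        lines = sets[s]
        entry = next((e for e in lines if e[0] == tag), None)
        if entry is not None:
            results.append("hit")
            lines.remove(entry)                # re-appended below as most recent
        else:
            results.append("miss")
            if len(lines) == ways:
                lines.pop(0)                   # evict the LRU block of this set
            entry = [tag, False]               # freshly loaded block is clean
        if op == "sw":
            entry[1] = True                    # store word marks the block dirty
        lines.append(entry)
    return results, {s: [tuple(e) for e in v] for s, v in sets.items()}


trace = [("lw", 44), ("lw", 68), ("sw", 45), ("sw", 8), ("lw", 12)]
results, final = simulate(trace, num_blocks=8, block_size=2, ways=2)
print(results)   # ['miss', 'miss', 'hit', 'miss', 'miss']
print(final[2])  # [(5, True), (1, False)]
```

Set 10 (index 2) ends with tag 00101 dirty and tag 00001 clean, which matches the final diagram.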

SLIDE 12

Try it Yourself

Draw a cache diagram showing what the cache would look like after loading or storing the following addresses from/to memory. What is the hit chance of this cache for this access sequence?

Assume 4-way assoc., M = 8, B = 4 bytes, 8-bit machine word.

lw 0x08, lw 0x3E, lw 0x83, lw 0x0B, sw 0x22, lw 0x32, sw 0x18

SLIDE 13

Try it Yourself

Initial (empty) cache:

Cache-set  Index  Valid  Dirty  Tag    Data
0          000    N      N
           001    N      N
           010    N      N
           011    N      N
1          100    N      N
           101    N      N
           110    N      N
           111    N      N

SLIDE 14

Try it Yourself

lw 0x08:  00001  0    00
          tag    set  xx

Cache-set  Index  Valid  Dirty  Tag    Data
0          000    Y      N      00001  Mem[000010XX]
           001    N      N
           010    N      N
           011    N      N
1          100    N      N
           101    N      N
           110    N      N
           111    N      N

Results so far: miss

SLIDE 15

Try it Yourself

lw 0x3E:  00111  1    10
          tag    set  xx

Cache-set  Index  Valid  Dirty  Tag    Data
0          000    Y      N      00001  Mem[000010XX]
           001    N      N
           010    N      N
           011    N      N
1          100    Y      N      00111  Mem[001111XX]
           101    N      N
           110    N      N
           111    N      N

Results so far: miss miss

SLIDE 16

Try it Yourself

lw 0x83:  10000  0    11
          tag    set  xx

Cache-set  Index  Valid  Dirty  Tag    Data
0          000    Y      N      00001  Mem[000010XX]
           001    Y      N      10000  Mem[100000XX]
           010    N      N
           011    N      N
1          100    Y      N      00111  Mem[001111XX]
           101    N      N
           110    N      N
           111    N      N

Results so far: miss miss miss

SLIDE 17

Try it Yourself

lw 0x0B:  00001  0    11
          tag    set  xx

Cache-set  Index  Valid  Dirty  Tag    Data
0          000    Y      N      00001  Mem[000010XX]
           001    Y      N      10000  Mem[100000XX]
           010    N      N
           011    N      N
1          100    Y      N      00111  Mem[001111XX]
           101    N      N
           110    N      N
           111    N      N

Results so far: miss miss miss hit

SLIDE 18

Try it Yourself

sw 0x22:  00100  0    10
          tag    set  xx

Cache-set  Index  Valid  Dirty  Tag    Data
0          000    Y      N      00001  Mem[000010XX]
           001    Y      N      10000  Mem[100000XX]
           010    Y      Y      00100  Mem[001000XX]
           011    N      N
1          100    Y      N      00111  Mem[001111XX]
           101    N      N
           110    N      N
           111    N      N

Results so far: miss miss miss hit miss

SLIDE 19

Try it Yourself

lw 0x32:  00110  0    10
          tag    set  xx

Cache-set  Index  Valid  Dirty  Tag    Data
0          000    Y      N      00001  Mem[000010XX]
           001    Y      N      10000  Mem[100000XX]
           010    Y      Y      00100  Mem[001000XX]
           011    Y      N      00110  Mem[001100XX]
1          100    Y      N      00111  Mem[001111XX]
           101    N      N
           110    N      N
           111    N      N

Results so far: miss miss miss hit miss miss

SLIDE 20

Try it Yourself

sw 0x18:  00011  0    00
          tag    set  xx

Cache-set  Index  Valid  Dirty  Tag    Data
0          000    Y      N      00001  Mem[000010XX]
           001    Y      Y      00011  Mem[000110XX]
           010    Y      Y      00100  Mem[001000XX]
           011    Y      N      00110  Mem[001100XX]
1          100    Y      N      00111  Mem[001111XX]
           101    N      N
           110    N      N
           111    N      N

Set 0 was full, so the least recently used block (index 001, tag 10000) is evicted.

Results so far: miss miss miss hit miss miss miss+eviction

SLIDE 21

Try it Yourself

Results: miss miss miss hit miss miss miss+eviction

7 accesses, 1 hit, 6 misses
1/7 ≈ 0.143 hit chance

SLIDE 22

Cache Performance

• CPU time
  • program execution cycles (includes cache hits)
  • memory stall cycles from cache misses
• With simplifying assumptions:

  Memory stall cycles = (Memory accesses / Program) * Miss rate * Miss penalty
                      = (Instructions / Program) * (Misses / Instruction) * Miss penalty

• Include this in CPI to get Effective CPI

SLIDE 23

Cache Performance Example

• For some program we have:
  • Miss rate = 4%
  • Miss penalty = 100 cycles to main memory
  • Base CPI = 2
  • 36% of instructions are memory accesses
• Miss cycles per instruction
  • 0.36 * 0.04 * 100 = 1.44
• Effective CPI = 2 + 1.44 = 3.44
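The same arithmetic as a small function, for trying other parameter values. The name `effective_cpi` and its parameter names are assumptions. Like the slide, it counts only data accesses and ignores instruction-fetch misses.

```python
def effective_cpi(base_cpi, mem_fraction, miss_rate, miss_penalty):
    """Base CPI plus memory stall cycles per instruction.

    mem_fraction: fraction of instructions that access memory (lw/sw)
    miss_rate:    fraction of those accesses that miss in the cache
    miss_penalty: stall cycles paid per miss
    """
    stall_cycles_per_instruction = mem_fraction * miss_rate * miss_penalty
    return base_cpi + stall_cycles_per_instruction


print(round(effective_cpi(2, 0.36, 0.04, 100), 2))  # 3.44
```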

SLIDE 24

Average Access Time

• Average memory access time (AMAT)
  • AMAT = hit time + miss rate * miss penalty
• Example
  • 1ns clock
  • hit time = 1 cycle
  • miss penalty = 20 cycles to main memory
  • cache miss rate = 5%
  • AMAT = 1 + 0.05 * 20 = 2 cycles = 2ns
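A short sketch of the AMAT formula that keeps the cycle count and the clock period separate, so the unit conversion is explicit. The function name `amat_ns` and its parameters are hypothetical.

```python
def amat_ns(clock_ns, hit_cycles, miss_rate, miss_penalty_cycles):
    """Average memory access time in nanoseconds.

    AMAT (cycles) = hit time + miss rate * miss penalty, then scaled by the
    clock period to convert cycles to nanoseconds.
    """
    amat_cycles = hit_cycles + miss_rate * miss_penalty_cycles
    return amat_cycles * clock_ns


print(round(amat_ns(clock_ns=1, hit_cycles=1, miss_rate=0.05, miss_penalty_cycles=20), 3))  # 2.0
```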

SLIDE 25

Multilevel Caches

• Primary (level-1) cache attached to CPU
  • small, but fast (usually instruction speed)
• Level-2 cache handles misses from L1 cache
  • larger, slower, but still faster than main memory
• Main memory handles L2 cache misses
• Most modern computers have even more levels

SLIDE 26

Multilevel Cache Example

• Assume
  • CPU base CPI = 1, clock rate = 4GHz (one cycle = 0.25ns)
  • miss rate = 2%
  • 50% of program is lw/sw
  • main memory access time = 100ns
• With just a one-level cache
  • miss penalty = 100ns/0.25ns = 400 cycles
  • effective CPI = 1 + 0.5 * 0.02 * 400 = 5

SLIDE 27

Multilevel Cache Example

• Now add L2 cache
  • access time = 5ns
  • global miss rate to main memory = 0.5%
    • this is the chance of missing L1 cache and L2 cache
• Primary miss with L2 hit
  • penalty = 5ns/0.25ns = 20 cycles
• Primary miss with L2 miss
  • 100ns main memory access time = 400 cycles
• Eff. CPI = 1 + 0.5 * 0.02 * 20 + 0.5 * 0.005 * 400 = 1 + 0.2 + 1 = 2.2
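The two-level calculation can be sketched as follows. The function name `two_level_cpi` and its parameter names are assumptions. The model follows the slide's accounting: every L1 miss pays the L2 access penalty, and every global miss additionally pays the main-memory penalty.

```python
def two_level_cpi(base_cpi, mem_fraction, l1_miss_rate, l2_penalty,
                  global_miss_rate, mem_penalty):
    """Effective CPI for an L1 + L2 hierarchy; penalties are in cycles."""
    l2_stalls = mem_fraction * l1_miss_rate * l2_penalty        # L1 misses go to L2
    mem_stalls = mem_fraction * global_miss_rate * mem_penalty  # misses in both levels
    return base_cpi + l2_stalls + mem_stalls


cycle_ns = 1 / 4                 # 4GHz clock: 0.25ns per cycle
l2_penalty = 5 / cycle_ns        # 5ns L2 access = 20 cycles
mem_penalty = 100 / cycle_ns     # 100ns main memory access = 400 cycles
print(round(two_level_cpi(1, 0.5, 0.02, l2_penalty, 0.005, mem_penalty), 2))  # 2.2
```

Compared with the one-level result of 5, the L2 cache cuts the effective CPI to 2.2 under these numbers.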

SLIDE 28

Try it Yourself

Consider a processor with a 2-level cache

• it has a base CPI of 3
• L1 cache has a miss rate of 20%
• L2 cache has a hit time of 10 cycles
• the global miss rate is 4%
• main memory access time is 50 cycles
• 40% of instructions are lw/sw

• What is the effective CPI of this processor?

SLIDE 29

Try it Yourself

Start from the base CPI:

3

SLIDE 30

Try it Yourself

Add the L2 access stalls paid on every L1 miss:

3 + 0.4 * 0.2 * 10

SLIDE 31

Try it Yourself

Add the main memory stalls paid on every global miss:

3 + 0.4 * 0.2 * 10 + 0.4 * 0.04 * 50

SLIDE 32

Try it Yourself

Effective CPI = 3 + 0.4 * 0.2 * 10 + 0.4 * 0.04 * 50 = 3 + 0.8 + 0.8 = 4.6