cache performance

Cache Performance Associativity Replacement Samira Khan Cache - PDF document



  1. 3/28/17. Cache Performance, Samira Khan, March 28, 2017

     Agenda
     • Review from last lecture
     • Cache access
     • Associativity
     • Replacement
     • Cache Performance

     Cache Abstraction and Metrics
     • Address → Tag Store (is the address in the cache? + bookkeeping) → Hit/miss?
     • Address → Data Store (stores memory blocks) → Data
     • Cache hit rate = (# hits) / (# hits + # misses) = (# hits) / (# accesses)
     • Average memory access time (AMAT) = ( hit-rate * hit-latency ) + ( miss-rate * miss-latency )

     Direct-Mapped Cache: Placement and Access
     • Assume byte-addressable memory: 256 bytes, 8-byte blocks → 32 blocks
     • Assume cache: 64 bytes, 8 blocks
     • Direct-mapped: A block can go to only one location
     • 8-bit address: tag (2 bits) | index (3 bits) | byte in block (3 bits)
     • The index selects one tag store entry (V, tag); the stored tag is compared (=?) with the address tag to produce Hit?, and a MUX selects the byte in block from the data store
     • [Figure: memory blocks that share index 000: 00|000|000 - 00|000|111 (A), 01|000|000 - 01|000|111 (B), 10|000|000 - 10|000|111, 11|000|000 - 11|000|111; last block in memory: 11|111|000 - 11|111|111]
     • Addresses with same index contend for the same location
     • Cause conflict misses
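The address split and the two metrics on this page can be sketched in Python. This is a minimal sketch that assumes the slide's 8-bit address layout (2-bit tag, 3-bit index, 3-bit byte in block). The names `split_address`, `hit_rate` and `amat` are illustrative and do not come from the lecture, and any latencies passed to `amat` are hypothetical numbers.

```python
OFFSET_BITS = 3  # 8-byte blocks -> 3 bits select the byte in the block
INDEX_BITS = 3   # 8 blocks in the cache -> 3 bits select the cache location


def split_address(addr):
    """Split an 8-bit address into (tag, index, byte_in_block)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def hit_rate(hits, misses):
    """Cache hit rate = hits / (hits + misses)."""
    return hits / (hits + misses)


def amat(rate, hit_latency, miss_latency):
    """AMAT = hit-rate * hit-latency + miss-rate * miss-latency."""
    return rate * hit_latency + (1 - rate) * miss_latency


# Last byte of block B from the slide: tag 01, index 000, byte 111.
print(split_address(0b01000111))
# Hypothetical latencies (2 and 10 cycles), chosen only for illustration.
print(amat(hit_rate(1, 1), 2, 10))
```

Under these assumptions, two addresses contend for the same location exactly when `split_address` returns the same index but different tags for them.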

  2. Direct-Mapped Cache: Placement and Access (continued)

     Access pattern: A, B, A, B, A, B, where A = 0b 00 000 xxx and B = 0b 01 000 xxx
     (8-bit address: tag 2 bits | index 3 bits | byte in block 3 bits)

     • Access A (00 000 XXX): index 000 selects entry 0 of the tag store, whose valid bit is 0 (all eight entries start invalid).
     • MISS: Fetch A and update tag. Entry 0 now holds V = 1, tag = 00, data = XXXXXXXXX.
     • Access B (01 000 XXX): index 000 selects entry 0 again, which holds tag 00. Tags do not match: MISS.
     • Fetch block B, update tag. Entry 0 now holds V = 1, tag = 01, data = YYYYYYYYYY.
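The A, B, A, B, A, B walkthrough can be replayed with a small simulation. This is a sketch under the slide's assumptions (8 cache locations, 3-bit index, 3-bit byte in block). The function name `simulate_direct_mapped` and the concrete addresses chosen for A and B (byte 0 of each block) are illustrative choices, not part of the lecture.

```python
OFFSET_BITS = 3
INDEX_BITS = 3
NUM_BLOCKS = 1 << INDEX_BITS  # 8 locations in the cache


def simulate_direct_mapped(addresses):
    """Return 'hit' or 'miss' for each access to a direct-mapped cache."""
    valid = [False] * NUM_BLOCKS
    tags = [None] * NUM_BLOCKS
    results = []
    for addr in addresses:
        index = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1)
        tag = addr >> (OFFSET_BITS + INDEX_BITS)
        if valid[index] and tags[index] == tag:
            results.append("hit")
        else:
            # Miss: fetch the block and update the tag store entry.
            results.append("miss")
            valid[index] = True
            tags[index] = tag
    return results


A = 0b00000000  # tag 00, index 000
B = 0b01000000  # tag 01, index 000
print(simulate_direct_mapped([A, B, A, B, A, B]))
```

Because A and B share index 000 and differ only in the tag, each access evicts the block the next access needs, so every access in this pattern misses.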

  3. Direct-Mapped Cache: Placement and Access (continued)

     Access pattern: A, B, A, B, A, B, where A = 0b 00 000 xxx and B = 0b 01 000 xxx

     • Access A (00 000 XXX): entry 0 holds V = 1, tag = 01 (block B). Tags do not match: MISS.
     • Fetch block A, update tag. Entry 0 now holds V = 1, tag = 00, data = XXXXXXXXX.

     Set Associative Cache
     • Access pattern: A, B, A, B, A, B, where A = 0b 000 00 xxx and B = 0b 010 00 xxx
     • 8-bit address: tag 3 bits | index 2 bits | byte in block 3 bits
     • Set 0 holds both blocks: V = 1, tag = 000 (data XXXXXXXXX) and V = 1, tag = 010 (data YYYYYYYYYY)
     • Access A (000 00 XXX): the tags of both ways in the set are compared (=?) and one matches: HIT

     Associativity (and Tradeoffs)
     • Degree of associativity: How many blocks can map to the same index (or set)?
     • Higher associativity
       ++ Higher hit rate
       -- Slower cache access time (hit latency and data access latency)
       -- More expensive hardware (more comparators)
     • Diminishing returns from higher associativity
     • [Figure: hit rate vs. associativity]
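The effect of associativity on the A, B, A, B, A, B pattern can be sketched as follows. The sketch assumes the slide's layout (3-bit tag, 2-bit index, 3-bit byte in block, two ways per set) and assumes LRU replacement within a set, which this slide does not specify. The name `simulate_set_associative` is illustrative.

```python
OFFSET_BITS = 3
INDEX_BITS = 2  # 4 sets


def simulate_set_associative(addresses, ways=2):
    """Return 'hit' or 'miss' per access; LRU within each set (assumed)."""
    num_sets = 1 << INDEX_BITS
    # Each set is a list of tags ordered from least to most recently used.
    sets = [[] for _ in range(num_sets)]
    results = []
    for addr in addresses:
        index = (addr >> OFFSET_BITS) & (num_sets - 1)
        tag = addr >> (OFFSET_BITS + INDEX_BITS)
        lines = sets[index]
        if tag in lines:
            results.append("hit")
            lines.remove(tag)  # re-appended below as most recently used
        else:
            results.append("miss")
            if len(lines) == ways:
                lines.pop(0)   # set is full: evict the least recently used tag
        lines.append(tag)
    return results


A = 0b00000000  # tag 000, index 00
B = 0b01000000  # tag 010, index 00
print(simulate_set_associative([A, B, A, B, A, B], ways=2))
print(simulate_set_associative([A, B, A, B, A, B], ways=1))
```

With two ways, only the first access to each block misses. Setting `ways=1` makes each set hold a single block, which reproduces the direct-mapped conflict misses for this pattern.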

  4. Issues in Set-Associative Caches
     • Think of each block in a set having a “priority”
       • Indicating how important it is to keep the block in the cache
     • Key issue: How do you determine/adjust block priorities?
     • There are three key decisions in a set:
       • Insertion, promotion, eviction (replacement)
     • Insertion: What happens to priorities on a cache fill?
       • Where to insert the incoming block, whether or not to insert the block
     • Promotion: What happens to priorities on a cache hit?
       • Whether and how to change block priority
     • Eviction/replacement: What happens to priorities on a cache miss?
       • Which block to evict and how to adjust priorities

     Eviction/Replacement Policy
     • Which block in the set to replace on a cache miss?
       • Any invalid block first
       • If all are valid, consult the replacement policy
         • Random
         • FIFO
         • Least recently used (how to implement?)
         • Not most recently used
         • Least frequently used
         • Hybrid replacement policies

     Least Recently Used Replacement Policy
     • 4-way, access pattern: ACBD
       • Set 0 holds A B C D with priorities A = LRU, B = MRU -1, C = MRU -2, D = MRU
     • 4-way, access pattern: ACBDE
       • E misses; the LRU block A is the victim, so Set 0 becomes E B C D

  5. Least Recently Used Replacement Policy (continued)
     • 4-way, access pattern: ACBDE
     • E takes the place of A in Set 0 (E B C D) and becomes MRU; the remaining priorities are updated one at a time:
       • D: MRU → MRU -1
       • B: MRU -1 → MRU -2
       • C: MRU -2 → LRU
     • Resulting priorities: E = MRU, B = MRU -2, C = LRU, D = MRU -1

  6. Least Recently Used Replacement Policy (continued)
     • 4-way, access pattern: ACBDEB
     • B hits in Set 0 (E B C D) and is promoted to MRU:
       • E: MRU → MRU -1
       • D: MRU -1 → MRU -2
       • C stays LRU
     • Resulting priorities: E = MRU -1, B = MRU, C = LRU, D = MRU -2

     Implementing LRU
     • Idea: Evict the least recently accessed block
     • Problem: Need to keep track of access ordering of blocks
     • Question: 2-way set associative cache:
       • What do you need to implement LRU perfectly?
     • Question: 16-way set associative cache:
       • What do you need to implement LRU perfectly?
       • What is the logic needed to determine the LRU victim?
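The ACBDEB walkthrough and the bookkeeping question can be sketched in Python. This is a minimal software model, not a description of how hardware tracks the ordering. The class `LRUSet` and the helper `min_lru_bits` are illustrative names. The bit count is one way to approach the question above: a set with n ways has n! possible access orderings, so tracking the full order needs at least ceil(log2(n!)) bits per set.

```python
import math


class LRUSet:
    """One cache set with perfect LRU ordering (a software model only)."""

    def __init__(self, ways):
        self.ways = ways
        self.order = []  # index 0 is the LRU block, the last element is MRU

    def access(self, block):
        """Access a block; return (hit, victim), victim is None if nothing is evicted."""
        hit = block in self.order
        victim = None
        if hit:
            self.order.remove(block)       # promotion: moved to MRU below
        elif len(self.order) == self.ways:
            victim = self.order.pop(0)     # eviction: drop the LRU block
        self.order.append(block)           # insertion or promotion at MRU
        return hit, victim


def min_lru_bits(ways):
    """Lower bound on bits per set needed to encode a full access ordering."""
    return math.ceil(math.log2(math.factorial(ways)))


lru_set = LRUSet(4)
trace = [(block,) + lru_set.access(block) for block in "ACBDEB"]
print(trace)
print(lru_set.order)                        # ordering from LRU to MRU
print(min_lru_bits(2), min_lru_bits(16))
```

In this model E evicts A and the final access to B is a hit, leaving the order C, D, E, B from LRU to MRU, which matches the priorities on the last LRU slide. The bound gives 1 bit per set for 2 ways and 45 bits per set for 16 ways.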

  7. Approximations of LRU
     • Most modern processors do not implement “true LRU” (also called “perfect LRU”) in highly-associative caches
     • Why?
       • True LRU is complex
       • LRU is an approximation to predict locality anyway (i.e., not the best possible cache management policy)
     • Examples:
       • Not MRU (not most recently used)

     Cache Replacement Policy: LRU or Random
     • LRU vs. Random: Which one is better?
       • Example: 4-way cache, cyclic references to A, B, C, D, E
         • 0% hit rate with LRU policy
     • Set thrashing: When the “program working set” in a set is larger than set associativity
       • Random replacement policy is better when thrashing occurs
     • In practice:
       • Depends on workload
       • Average hit rate of LRU and Random are similar
     • Best of both worlds: Hybrid of LRU and Random
       • How to choose between the two? Set sampling
       • See Qureshi et al., “A Case for MLP-Aware Cache Replacement,” ISCA 2006.

     What’s In A Tag Store Entry?
     • Valid bit
     • Tag
     • Replacement policy bits
     • Dirty bit?
       • Write back vs. write through caches

     Handling Writes (I)
     • When do we write the modified data in a cache to the next level?
       • Write through: At the time the write happens
       • Write back: When the block is evicted
     • Write-back
       + Can consolidate multiple writes to the same block before eviction
         • Potentially saves bandwidth between cache levels + saves energy
       -- Need a bit in the tag store indicating the block is “dirty/modified”
     • Write-through
       + Simpler
       + All levels are up to date. Consistent
       -- More bandwidth intensive; no coalescing of writes
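The bandwidth difference between the two write policies can be illustrated by counting writes that reach the next level. This is a simplified sketch: it models only writes and evictions of named blocks, ignores reads and fills, and treats the dirty bit as membership in a set. The function name `next_level_writes` and the operation format are hypothetical.

```python
def next_level_writes(ops, policy):
    """Count writes sent to the next level for a sequence of operations.

    ops is a list of ("write", block) or ("evict", block) tuples.
    policy is "write-through" or "write-back".
    """
    if policy not in ("write-through", "write-back"):
        raise ValueError("unknown policy: " + policy)
    dirty = set()  # blocks whose dirty bit is set (write-back only)
    writes = 0
    for op, block in ops:
        if op == "write":
            if policy == "write-through":
                writes += 1        # every write goes to the next level
            else:
                dirty.add(block)   # only mark the block as modified
        elif op == "evict":
            if policy == "write-back" and block in dirty:
                dirty.discard(block)
                writes += 1        # one write-back covers all earlier writes
    return writes


# Four writes to the same block X, followed by its eviction.
ops = [("write", "X")] * 4 + [("evict", "X")]
print(next_level_writes(ops, "write-through"))
print(next_level_writes(ops, "write-back"))
```

For this sequence the write-through count is 4 and the write-back count is 1, which is the consolidation of multiple writes that the write-back bullet describes.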
