CACHE ARCHITECTURE



  1. CACHE ARCHITECTURE. Mahdi Nazm Bojnordi, Assistant Professor, School of Computing, University of Utah. CS/ECE 6810: Computer Architecture

  2. Overview
     - Announcement
       - Homework 3 will be released on Oct. 31st
     - This lecture
       - Cache addressing and lookup
       - Cache optimizations
         - Techniques to improve miss rate
         - Replacement policies
         - Write policies

  3. Recall: Cache Addressing
     - Instead of specifying a cache address, we specify a main memory address
     - Simplest: direct-mapped cache
     - Note: each memory address maps to a single cache location determined by modulo hashing
     - How to exactly specify which blocks are in the cache?
     [Figure: a 16-entry memory (addresses 0000 to 1111) mapped onto a 4-entry cache (indices 00 to 11)]
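The modulo hashing on this slide can be sketched in a few lines of Python. The sizes (16 memory blocks, 4 cache blocks) are taken from the slide's figure; the function and variable names are made up for illustration.

```python
# Toy sizes matching the slide's figure: 16 memory blocks, 4 cache blocks.
NUM_MEMORY_BLOCKS = 16
NUM_CACHE_BLOCKS = 4


def cache_index(block_address: int, num_cache_blocks: int = NUM_CACHE_BLOCKS) -> int:
    """Direct-mapped placement: modulo hashing of the block address."""
    return block_address % num_cache_blocks


# Group every memory address by the single cache location it maps to.
mapping = {}
for address in range(NUM_MEMORY_BLOCKS):
    mapping.setdefault(cache_index(address), []).append(format(address, "04b"))

for index, addresses in sorted(mapping.items()):
    print(format(index, "02b"), "<-", ", ".join(addresses))
```

Because the number of cache blocks is a power of two, the modulo simply keeps the low-order address bits. Four memory addresses share each cache location, which is why the cache needs extra state (the tag) to record which block is currently resident.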

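  4. Direct-Mapped Lookup
     - Address fields: tag | index | byte offset
     - Byte offset: to select the requested byte
     - Tag: to maintain the address
     - Valid flag (v): whether content is meaningful
     - Data and tag are always accessed
     [Figure: 1024-entry cache (rows 0 to 1023) with valid, tag, and data columns; the stored tag is compared (=) with the address tag to produce the hit signal while the data is read out]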
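A minimal software sketch of this lookup, assuming 1024 entries as in the figure and 64-byte blocks (the block size is an assumption; the slide does not state it). `Line`, `split_address`, and `DirectMappedCache` are hypothetical names.

```python
from dataclasses import dataclass

OFFSET_BITS = 6   # assumption: 64-byte blocks
INDEX_BITS = 10   # 1024 entries, as in the slide's figure


@dataclass
class Line:
    valid: bool = False                      # valid flag (v)
    tag: int = 0                             # identifies which block is stored
    data: bytes = bytes(1 << OFFSET_BITS)    # one cache block


def split_address(address: int):
    """Split an address into (tag, index, byte offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class DirectMappedCache:
    def __init__(self):
        self.lines = [Line() for _ in range(1 << INDEX_BITS)]

    def lookup(self, address: int):
        """Return (hit, byte); the byte is None on a miss."""
        tag, index, offset = split_address(address)
        line = self.lines[index]                 # data and tag are read together
        hit = line.valid and line.tag == tag     # the "=" comparator
        return hit, (line.data[offset] if hit else None)

    def fill(self, address: int, block: bytes):
        """Install a block after a miss, overwriting the previous occupant."""
        tag, index, _ = split_address(address)
        self.lines[index] = Line(True, tag, block)


cache = DirectMappedCache()
addr = 0x12345678
first = cache.lookup(addr)             # cold cache: miss
cache.fill(addr, bytes(range(64)))
second = cache.lookup(addr)            # same tag and index: hit
other = cache.lookup(addr + (1 << 16)) # same index, different tag: miss
```

In hardware the tag compare and the data read happen in parallel; the sequential Python code only mirrors the logic, not the timing.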

  6. Example Problem
     - Find the size of tag, index, and offset bits for an 8MB, direct-mapped L3 cache with 64B cache blocks. Assume that the processor can address up to 4GB of main memory.
     - 4GB = 2^32 B → address bits = 32
     - 64B = 2^6 B → byte offset bits = 6
     - 8MB/64B = 2^17 → index bits = 17
     - tag bits = 32 - 6 - 17 = 9
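The arithmetic above can be checked with a short Python sketch. The helper names (`log2_exact`, `direct_mapped_fields`) are hypothetical, and the code assumes all sizes are powers of two.

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30


def log2_exact(value: int) -> int:
    """Return log2(value), requiring value to be a power of two."""
    if value <= 0 or value & (value - 1):
        raise ValueError(f"{value} is not a power of two")
    return value.bit_length() - 1


def direct_mapped_fields(cache_bytes: int, block_bytes: int, memory_bytes: int) -> dict:
    """Field widths for a direct-mapped cache: one block per index."""
    address_bits = log2_exact(memory_bytes)
    offset_bits = log2_exact(block_bytes)
    index_bits = log2_exact(cache_bytes // block_bytes)
    tag_bits = address_bits - offset_bits - index_bits
    return {"tag": tag_bits, "index": index_bits, "offset": offset_bits}


fields = direct_mapped_fields(8 * MB, 64, 4 * GB)
print(fields)  # {'tag': 9, 'index': 17, 'offset': 6}
```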

  12. Cache Optimizations
      - How to improve cache performance? $\mathrm{AMAT} = t_h + r_m \, t_p$
      - Reduce hit time ($t_h$)
        - Memory technology, critical access path
      - Improve hit rate ($1 - r_m$)
        - Size, associativity, placement/replacement policies
      - Reduce miss penalty ($t_p$)
        - Multi-level caches, data prefetching
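To see how each term moves the average memory access time, here is a small sketch of the AMAT formula. All latencies and miss rates below are made-up illustrative values, not numbers from the lecture.

```python
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time: t_h + r_m * t_p (all times in cycles)."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical single-level cache: 1-cycle hit, 5% misses, 100-cycle memory.
single_level = amat(hit_time=1, miss_rate=0.05, miss_penalty=100)

# Adding a hypothetical L2 (10-cycle hit, 50% local miss rate) reduces the
# L1 miss penalty, because an L1 miss now costs the AMAT of the L2.
l2_amat = amat(hit_time=10, miss_rate=0.5, miss_penalty=100)
two_level = amat(hit_time=1, miss_rate=0.05, miss_penalty=l2_amat)

print(f"single level: {single_level:.1f} cycles, two levels: {two_level:.1f} cycles")
```

With these assumed numbers the single-level AMAT is 6 cycles and the two-level AMAT is 4 cycles, which illustrates the "multi-level caches" entry under reducing miss penalty.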

  14. Set Associative Caches
      - Improve cache hit rate by allowing a memory location to be placed in more than one cache block
        - N-way set associative cache
        - Fully associative
      - For fixed capacity, higher associativity typically leads to higher hit rates
        - More places to simultaneously map cache lines
        - 8-way SA close to FA in practice
      - Example: for (i=0; i<10000; i++) { a++; b++; }
      [Figure: memory locations a and b, and a cache with two ways (way 0 and way 1)]
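The loop on this slide can be turned into a tiny miss-counting simulation. This is a sketch under several assumptions that are not in the slide: `a` and `b` sit in blocks that map to the same set, each increment is modeled as one access to the variable's block, the cache holds 8 blocks, and replacement is LRU.

```python
from collections import OrderedDict


def count_misses(block_addresses, num_blocks: int, ways: int) -> int:
    """Count misses in a cache of num_blocks blocks with LRU replacement."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in block_addresses:
        cache_set = sets[block % num_sets]
        if block in cache_set:
            cache_set.move_to_end(block)          # mark as most recently used
        else:
            misses += 1
            if len(cache_set) == ways:
                cache_set.popitem(last=False)     # evict least recently used
            cache_set[block] = True
    return misses


# Hypothetical block addresses chosen so that a and b collide.
A_BLOCK, B_BLOCK = 0, 8
trace = [A_BLOCK, B_BLOCK] * 10000    # a++; b++; repeated 10,000 times

direct_mapped = count_misses(trace, num_blocks=8, ways=1)
two_way = count_misses(trace, num_blocks=8, ways=2)
print(direct_mapped, two_way)
```

Under these assumptions the direct-mapped cache misses on every access (20,000 misses) because `a` and `b` keep evicting each other, while the 2-way cache of the same capacity only takes the two initial misses.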

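  15. n-Way Set Associative Lookup
      - Address fields: tag | index | byte offset
      - Index into cache sets
      - Multiple tag comparisons
      - Multiple data reads
      - Special cases
        - Direct mapped: single-block sets
        - Fully associative: single-set cache
      [Figure: 512 sets (rows 0 to 511); the tags of all ways are compared (=) in parallel, the results are ORed into the hit signal, and a mux selects the data]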
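A rough software model of this lookup follows. It uses 512 sets as in the figure; the 4 ways and 64-byte blocks are assumptions, and the class and function names are hypothetical. The caller picks the way to fill, so no replacement policy is modeled here.

```python
OFFSET_BITS = 6   # assumption: 64-byte blocks
INDEX_BITS = 9    # 512 sets, as in the slide's figure
WAYS = 4          # assumption: 4-way set associative


def split_address(address: int):
    """Split an address into (tag, index, byte offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class SetAssociativeCache:
    def __init__(self, ways: int = WAYS):
        # Each set holds `ways` entries of [valid, tag, data].
        self.sets = [[[False, 0, None] for _ in range(ways)]
                     for _ in range(1 << INDEX_BITS)]

    def lookup(self, address: int):
        """Return (hit, byte); the byte is None on a miss."""
        tag, index, offset = split_address(address)
        cache_set = self.sets[index]                    # index into cache sets
        matches = [valid and stored_tag == tag          # one comparator per way
                   for valid, stored_tag, _ in cache_set]
        if not any(matches):                            # OR of the comparators
            return False, None
        way = matches.index(True)                       # mux select
        return True, cache_set[way][2][offset]

    def fill(self, address: int, block: bytes, way: int):
        tag, index, _ = split_address(address)
        self.sets[index][way] = [True, tag, block]


cache = SetAssociativeCache()
addr_a = 0x8040
addr_b = addr_a + (1 << 15)          # same set index, different tag
cache.fill(addr_a, bytes([0xAA] * 64), way=0)
cache.fill(addr_b, bytes([0xBB] * 64), way=1)
results = [cache.lookup(addr_a), cache.lookup(addr_b), cache.lookup(addr_a + (1 << 16))]
```

Setting `WAYS = 1` gives the direct-mapped special case, and setting `INDEX_BITS = 0` leaves a single set, which is the fully associative case.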

  17. Example Problem
      - Find the size of tag, index, and offset bits for a 4MB, 4-way set associative cache with 32B cache blocks. Assume that the processor can address up to 4GB of main memory.
      - 4GB = 2^32 B → address bits = 32
      - 32B = 2^5 B → byte offset bits = 5
      - 4MB/(4 × 32B) = 2^15 → index bits = 15
      - tag bits = 32 - 5 - 15 = 12
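The same calculation can be repeated for other associativities to see how the field widths shift. This is a sketch with hypothetical helper names; only the 4-way row corresponds to the problem on the slide, and the other rows are added for comparison.

```python
MB = 1 << 20


def log2_exact(value: int) -> int:
    """Return log2(value), requiring value to be a power of two."""
    if value <= 0 or value & (value - 1):
        raise ValueError(f"{value} is not a power of two")
    return value.bit_length() - 1


def set_associative_fields(cache_bytes: int, block_bytes: int, ways: int,
                           address_bits: int = 32):
    """Return (tag, index, offset) bit widths for an n-way cache."""
    offset_bits = log2_exact(block_bytes)
    num_sets = cache_bytes // (ways * block_bytes)   # each set holds `ways` blocks
    index_bits = log2_exact(num_sets)
    tag_bits = address_bits - offset_bits - index_bits
    return tag_bits, index_bits, offset_bits


for ways in (1, 2, 4, 8):
    tag, index, offset = set_associative_fields(4 * MB, 32, ways)
    print(f"{ways}-way: tag={tag} index={index} offset={offset}")
```

Each doubling of associativity halves the number of sets, so one bit moves from the index to the tag while the offset stays fixed.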

  24. Cache Miss Classifications
      - Start by measuring miss rate with an ideal cache
        1. Ideal is fully associative and infinite capacity
        2. Then reduce capacity to size of interest
        3. Then reduce associativity to degree of interest
      - 1. Cold (compulsory)
        - Cold start: first access to block
        - How to improve: large blocks, prefetching
      - 2. Capacity
        - Cache is smaller than the program data
        - How to improve: large cache
      - 3. Conflict
        - Set size is smaller than mapped memory locations
        - How to improve: large cache, more associativity
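The three-step measurement above can be sketched as a small simulation: count misses with an infinite cache, then with a fully associative cache of the target size, then with the target associativity, and attribute the differences to each class. LRU replacement, the block trace, and all names are assumptions made for illustration.

```python
from collections import OrderedDict


def count_misses(trace, num_blocks: int, ways: int) -> int:
    """Count misses in a cache of num_blocks blocks with LRU replacement."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in trace:
        cache_set = sets[block % num_sets]
        if block in cache_set:
            cache_set.move_to_end(block)          # mark as most recently used
        else:
            misses += 1
            if len(cache_set) == ways:
                cache_set.popitem(last=False)     # evict least recently used
            cache_set[block] = True
    return misses


def classify_misses(trace, num_blocks: int, ways: int) -> dict:
    # Step 1: infinite, fully associative cache -> only first-touch misses.
    cold = len(set(trace))
    # Step 2: reduce capacity, keep full associativity.
    fully_associative = count_misses(trace, num_blocks, ways=num_blocks)
    # Step 3: reduce associativity to the degree of interest.
    actual = count_misses(trace, num_blocks, ways)
    return {
        "cold": cold,
        "capacity": fully_associative - cold,
        "conflict": actual - fully_associative,
    }


# Hypothetical block-address trace on a 4-block direct-mapped cache.
trace = [0, 4, 0, 4, 1, 2, 3, 5, 0]
breakdown = classify_misses(trace, num_blocks=4, ways=1)
print(breakdown)  # {'cold': 6, 'capacity': 1, 'conflict': 2}
```

In this trace, blocks 0 and 4 collide in the direct-mapped cache, which accounts for the two conflict misses; the final access to block 0 misses even with full associativity, so it counts as a capacity miss. With LRU, a fully associative cache can occasionally miss more than a set associative one on some traces, so the conflict count from this subtraction should be read as an approximation.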

  26. Miss Rates: Example Problem
      - 100,000 loads and stores are generated; L1 cache has 3,000 misses; L2 cache has 1,500 misses. What are the various miss rates?
      - L1 miss rates
        - Local/global: 3,000/100,000 = 3%
      - L2 miss rates
        - Local: 1,500/3,000 = 50%
        - Global: 1,500/100,000 = 1.5%
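The local and global rates differ only in the denominator, which a short sketch makes explicit. The function name is hypothetical, and it assumes every L1 miss becomes exactly one L2 access.

```python
def miss_rates(accesses: int, l1_misses: int, l2_misses: int) -> dict:
    """Local rate: misses / accesses seen by that cache.
    Global rate: misses / all accesses generated by the processor."""
    return {
        "l1_local": l1_misses / accesses,    # L1 sees every access,
        "l1_global": l1_misses / accesses,   # so local and global coincide
        "l2_local": l2_misses / l1_misses,   # L2 only sees L1 misses
        "l2_global": l2_misses / accesses,
    }


rates = miss_rates(accesses=100_000, l1_misses=3_000, l2_misses=1_500)
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

The L2 global miss rate equals the L1 miss rate times the L2 local miss rate (3% × 50% = 1.5%).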
