Slide Set #16: Exploiting Memory Hierarchy (Chapter 7 Reading: 7.1–7.3)

ADMIN
• 12-week exam next Wed
  – Do practice problems before Monday
• Homework due Friday
  – Last late turn-in Monday at 0800

Memory, Cost, and Performance
• Ideal world: we want a memory that is
  – Fast,
  – Big, &
  – Cheap!
• Real world (2004):
  – SRAM access times are 0.5–5 ns, at a cost of $4,000 to $10,000 per GB
  – DRAM access times are 50–70 ns, at a cost of $100 to $200 per GB
  – Disk access times are 5 to 20 million ns, at a cost of $0.50 to $2 per GB
• Solution?

Locality
• A principle that makes caching work
• If an item is referenced,
  1. it will tend to be referenced again soon (temporal locality) – why?
  2. nearby items will tend to be referenced soon (spatial locality) – why?

Caching Basics
• Definitions
  – Minimum unit of data: "block" or "cache line" (for now, assume a block is 1 byte)
  – Data requested is in the cache: a "hit"
  – Data requested is not in the cache: a "miss"
• Cache has a given number of blocks (N)
• Challenge: how to locate an item in the cache?
  – Simplest way: cache index = (data address) mod N
    e.g., N = 10, Address = 1024, Index = 4
    e.g., N = 16, Address = 33, Index = 1
  – Implications
    – For a given data address, there is exactly one possible cache index
    – But for a given cache index, there are many possible data items that could go there

Example – (Simplified) Direct-Mapped Cache (N = 5)
• Memory contents:
    Address: 20  21  22  23  24  25  26  27  28  29  30  31
    Data:     7   3  27  32 101  78  59  24  56  87  36  98
• Processor reads: 24, 25, 26, 24, 21, 26, 24, 26, 27
• Total hits? Total misses?

Exercise #1 – Direct-Mapped Cache (N = 4)
Exercise #2 – Direct-Mapped Cache (N = 5)
• Same memory contents as above; for each cache, the processor reads:
  30, 31, 30, 26, 25, 28, 23, 25, 28
• Total hits? Total misses?
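The worked example above can be checked with a short simulation. This is a minimal sketch of the direct-mapped scheme the slides describe (index = address mod N, one stored address per slot); the function name and structure are illustrative, not from the slides.

```python
# Direct-mapped cache with 1-byte blocks: index = address mod N.
# Each slot remembers which address it currently holds; a read hits
# only if the slot for that index already holds the same address.

def simulate_direct_mapped(addresses, n_blocks):
    """Return (hits, misses) for a read stream on a direct-mapped cache."""
    cache = {}  # index -> address currently stored in that slot
    hits = misses = 0
    for addr in addresses:
        index = addr % n_blocks
        if cache.get(index) == addr:
            hits += 1
        else:
            misses += 1
            cache[index] = addr  # whatever was in this slot is evicted
    return hits, misses

reads = [24, 25, 26, 24, 21, 26, 24, 26, 27]  # the example's read stream
print(simulate_direct_mapped(reads, 5))       # (3, 6)
```

Note how 21 and 26 both map to index 1 when N = 5, so they keep evicting each other; that conflict is what the later slides set out to fix.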

Exercise #3 – Stretch
• Suppose we want to minimize the total number of bits needed to implement a cache with N blocks. What is inefficient about our current design?
• (Hint – consider bigger addresses)

Exercise #4 – Stretch
• Look back at Exercises 1 and 2 and identify at least two different kinds of reasons why there might be a cache miss.
• How might you possibly address each type of miss?

Improving Our Basic Cache
• Why did we miss? How can we fix it?

Approach #1 – Increase Block Size

    Index = floor(ByteAddress / BytesPerBlock) mod N

• Same memory contents as before; processor reads: 24, 25, 26, 24, 21, 18, 24, 27, 26
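The index formula for Approach #1 can be sketched directly. The function name below is an assumption for illustration; it simply computes floor(ByteAddress / BytesPerBlock) mod N.

```python
# Block-size-aware index calculation from Approach #1:
# block number = byte_address // bytes_per_block, then take it mod N.

def cache_index(byte_address, bytes_per_block, n_blocks):
    return (byte_address // bytes_per_block) % n_blocks

print(cache_index(1024, 1, 10))  # 4  (the earlier example, 1-byte blocks)
print(cache_index(33, 1, 16))    # 1
print(cache_index(24, 2, 4))     # 0  -- addresses 24 and 25 share block 12,
print(cache_index(25, 2, 4))     # 0     so one miss brings both bytes in
```

The last two lines show why a bigger block exploits spatial locality: neighboring addresses land in the same block, so a read of 24 also caches 25.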

Approach #2 – Add Associativity

    Index = floor(ByteAddress / BytesPerBlock) mod (N / Associativity)

• Same memory contents as before; processor reads: 24, 25, 26, 24, 21, 18, 24, 27, 26

Performance Impact – Part 1
• To be fair, want to compare cache organizations with the same data size
  – E.g., increasing block size must decrease the number of blocks (N)
• Overall, increasing block size tends to decrease miss rate:

[Figure: miss rate (0–40%) vs. block size (1–256 bytes), one curve per total cache size: 1 KB, 8 KB, 16 KB, 64 KB, 256 KB]

Performance Impact – Part 2
• Increasing block size…
  – May help by exploiting _____________ locality
  – But may hurt by increasing _____________ (due to smaller __________)
  – Lesson – want block size > 1, but not too large
• Increasing associativity
  – Overall N stays the same, but smaller number of sets
  – May help by exploiting _____________ locality (due to fewer ____________)
  – May hurt because the cache gets slower
  – Do we want associativity?

Exercise #1 – Show final cache and total hits
Block size = 2, N = 4
• Same memory contents as before; processor reads: 16, 14, 17, 13, 24, 17, 15, 25, 27
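Approach #2 can also be sketched as a simulation. This is a hedged sketch, not the slides' implementation: it uses the set-index formula above with 1-byte blocks, and assumes LRU replacement within a set (the slides leave the eviction policy as an open question).

```python
# Set-associative cache: set index = (address // bytes_per_block) mod (N / assoc).
# Each set holds up to `associativity` blocks; on a miss to a full set,
# the least-recently-used block is evicted (LRU is an assumed policy).

def simulate_set_associative(addresses, n_blocks, associativity, bytes_per_block=1):
    """Return (hits, misses) for a read stream on an LRU set-associative cache."""
    n_sets = n_blocks // associativity
    sets = [[] for _ in range(n_sets)]  # each list is ordered LRU -> MRU
    hits = misses = 0
    for addr in addresses:
        block = addr // bytes_per_block
        ways = sets[block % n_sets]
        if block in ways:
            hits += 1
            ways.remove(block)           # refresh: re-append as most recent
        else:
            misses += 1
            if len(ways) == associativity:
                ways.pop(0)              # evict the least recently used block
        ways.append(block)
    return hits, misses

reads = [24, 25, 26, 24, 21, 26, 24, 26, 27]
print(simulate_set_associative(reads, 4, 2))  # (4, 5)
```

With 2 sets of 2 ways, addresses 21 and 26 fall in different sets and 24 and 26 can coexist in set 0, so the conflict misses from the direct-mapped example disappear.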

Exercise #2
• Show the correct formula for calculating the cache index, given the cache parameters below:
  1. N = 10, Block size = 4
  2. N = 8, Block size = 1, Associativity = 4
  3. N = 16, Block size = 8, Associativity = 2

Exercise #3 – Fill in blanks, show final cache & total hits
Block size = _____, N = _____, Assoc = _____
• Same memory contents as before; processor reads: 24, 25, 26, 24, 21, 26, 24, 26, 27

Exercise #4
• When the associativity is > 1 and the cache is full, then whenever there is a miss the cache will:
  – Find the set where the new data should go
  – Choose some existing data from that set to "evict"
  – Place the new data in the newly empty slot
• How should the cache decide which data to evict?

Further Issues
• How to deal with writes?
• Bit details – how can we store more efficiently?
• What happens on a miss? Evictions?
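The "how to deal with writes?" question is usually answered one of two standard ways: write-through (update memory on every write) or write-back (update memory only on eviction). A hedged sketch of both, with illustrative class names not taken from the slides:

```python
# Two standard write policies, sketched with a dict standing in for memory.

class WriteThroughCache:
    """Every write updates both the cache and memory immediately."""
    def __init__(self, memory):
        self.memory = memory
        self.cache = {}            # addr -> value

    def write(self, addr, value):
        self.cache[addr] = value
        self.memory[addr] = value  # memory is always up to date

class WriteBackCache:
    """Writes stay in the cache; memory is updated only when a dirty
    block is evicted."""
    def __init__(self, memory):
        self.memory = memory
        self.cache = {}            # addr -> (value, dirty)

    def write(self, addr, value):
        self.cache[addr] = (value, True)   # mark dirty, defer memory update

    def evict(self, addr):
        value, dirty = self.cache.pop(addr)
        if dirty:
            self.memory[addr] = value      # write back only now
```

Write-through keeps memory simple and consistent at the cost of traffic on every store; write-back saves traffic but needs a dirty bit per block, which is part of the "bit details" question above.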
