  1. ECE232: Hardware Organization and Design, Lecture 22: Introduction to Caches. Adapted from Computer Organization and Design, Patterson & Hennessy, UCB.

  2. Overview
  • Caches hold a subset of the data in main memory
  • Three types of caches: direct mapped, set associative, fully associative
  • Today: direct mapped
    • Each memory value can only be in one place in the cache
    • Is it there (hit)? Or is it not there (miss)?

  3. Direct Mapped Cache (Textbook)
  • Location is determined by the address
  • Direct mapped: only one choice
    cache block = (block address) modulo (#blocks in cache)
  • #Blocks is a power of 2, so use the low-order address bits

  4. Direct Mapped Cache (assume 1 byte/block)
  [Figure: 16 memory locations (addresses 0000–1111) mapped onto a 4-block direct-mapped cache]
  • Cache block 0 can be occupied by data from memory blocks 0, 4, 8, 12
  • Cache block 1 can be occupied by data from memory blocks 1, 5, 9, 13
  • Cache block 2 can be occupied by data from memory blocks 2, 6, 10, 14
  • Cache block 3 can be occupied by data from memory blocks 3, 7, 11, 15

  5. Direct Mapped Cache – Index and Tag
  [Figure: a memory block address split into tag and index fields]
  • The index determines the block in the cache:
    index = (address) mod (#blocks)
  • The number of cache blocks is a power of 2, so the cache index is the
    lower n bits of the memory address

  6. Direct Mapped Cache with Tag
  [Figure: each cache block stores a tag alongside its data]
  • The tag determines which memory block occupies a cache block
  • Hit: cache tag field = tag bits of the address
  • Miss: cache tag field ≠ tag bits of the address
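The index/tag split described above can be sketched in Python for the 4-block, 1-byte-per-block cache of the preceding slides (a minimal illustration; the deck itself contains no code):

```python
# Tag/index split for a 4-block, 1-byte-per-block direct-mapped cache:
# 16 memory blocks -> 4-bit addresses; the low 2 bits are the index,
# the high 2 bits are the tag.
NUM_BLOCKS = 4
INDEX_BITS = 2  # log2(NUM_BLOCKS)

def split_address(addr):
    index = addr % NUM_BLOCKS   # low-order bits select the cache block
    tag = addr >> INDEX_BITS    # remaining high-order bits are the tag
    return index, tag

# Memory block 14 (1110 in binary) maps to cache block 2 with tag 0b11.
print(split_address(14))  # -> (2, 3)
```

On a lookup, the stored tag at that index is compared with the address's tag bits: equal means hit, unequal means miss.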

  7. Direct Mapped Cache
  • The simplest mapping is a direct mapped cache
  • Each memory address is associated with one possible block within the cache
    • Therefore, we only need to look in a single location in the cache for
      the data, if it exists in the cache

  8. Finding an Item within a Block
  • In reality, a cache block consists of a number of bytes/words to
    (1) increase the cache hit rate due to the locality property and
    (2) reduce the cache miss time
  • Given the address of an item, the index tells which block of the cache
    to look in
  • Then, how do we find the requested item within the cache block?
    Equivalently: "What is the byte offset of the item within the cache
    block?"

  9. Selecting Part of a Block (block size > 1 byte)
  • If block size > 1, the rightmost bits of the index are really the offset
    within the indexed block:
    TAG | INDEX | OFFSET
    • Tag: check whether we have the correct block
    • Index: select a block in the cache
    • Offset: byte offset within the block
  • Example: block size of 8 bytes; select byte 4 (the 2nd word)
    Memory address 11 01 100 -> tag = 11, index = 01, offset = 100
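The full tag/index/offset decomposition from the example above can be written out as follows (an illustrative sketch assuming the slide's geometry of 4 cache blocks and 8-byte blocks):

```python
# Tag/index/offset split for the slide's example: 4 cache blocks
# (2 index bits) and 8-byte blocks (3 offset bits).
INDEX_BITS = 2
OFFSET_BITS = 3

def decompose(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)              # byte within block
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # cache block
    tag = addr >> (OFFSET_BITS + INDEX_BITS)              # identity check
    return tag, index, offset

# Address 11 01 100 from the slide: tag = 11, index = 01, offset = 100.
tag, index, offset = decompose(0b1101100)
print(bin(tag), bin(index), bin(offset))  # -> 0b11 0b1 0b100
```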

  10. Accessing Data in a Direct Mapped Cache
  Three types of events:
  • Cache hit: the cache block is valid and contains the proper address, so
    read the desired word
  • Cache miss: nothing is in the cache at the appropriate block, so fetch
    from memory
  • Cache miss, block replacement: the wrong data is in the cache at the
    appropriate block, so discard it and fetch the desired data from memory
  Cache access procedure:
  (1) Use the index bits to select a cache block
  (2) If the valid bit is 1, compare the tag bits of the address with the
      cache block's tag bits
  (3) If they match, use the offset to read out the word/byte

  11. Tags and Valid Bits
  • How do we know which particular block is stored in a cache location?
    • Store the block address as well as the data
    • Actually, only the high-order bits are needed; these are called the tag
  • What if there is no data in a location?
    • Valid bit: 1 = present, 0 = not present
    • Initially 0

  12. Cache Example
  • 8 blocks, 1 byte/block, direct mapped
  • Initial state:

    Index  V  Tag  Data
    000    N
    001    N
    010    N
    011    N
    100    N
    101    N
    110    N
    111    N

  13. Cache Example

    Addr  Binary addr  Hit/miss  Cache block
    22    10 110       Miss      110

    Index  V  Tag  Data
    000    N
    001    N
    010    N
    011    N
    100    N
    101    N
    110    Y  10   Mem[10110]
    111    N

  14. Cache Example

    Addr  Binary addr  Hit/miss  Cache block
    26    11 010       Miss      010

    Index  V  Tag  Data
    000    N
    001    N
    010    Y  11   Mem[11010]
    011    N
    100    N
    101    N
    110    Y  10   Mem[10110]
    111    N

  15. Cache Example

    Addr  Binary addr  Hit/miss  Cache block
    22    10 110       Hit       110
    26    11 010       Hit       010

    Index  V  Tag  Data
    000    N
    001    N
    010    Y  11   Mem[11010]
    011    N
    100    N
    101    N
    110    Y  10   Mem[10110]
    111    N

  16. Cache Example

    Addr  Binary addr  Hit/miss  Cache block
    16    10 000       Miss      000
    3     00 011       Miss      011
    16    10 000       Hit       000

    Index  V  Tag  Data
    000    Y  10   Mem[10000]
    001    N
    010    Y  11   Mem[11010]
    011    Y  00   Mem[00011]
    100    N
    101    N
    110    Y  10   Mem[10110]
    111    N

  17. Cache Example

    Addr  Binary addr  Hit/miss  Cache block
    18    10 010       Miss      010

    Index  V  Tag  Data
    000    Y  10   Mem[10000]
    001    N
    010    Y  10   Mem[10010]
    011    Y  00   Mem[00011]
    100    N
    101    N
    110    Y  10   Mem[10110]
    111    N
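The hit/miss trace of slides 13–17 can be reproduced with a minimal direct-mapped cache simulator (an illustrative sketch, not code from the deck):

```python
# Direct-mapped cache simulator for the running example:
# 8 blocks, 1 byte per block, 5-bit addresses.
NUM_BLOCKS = 8

def simulate(addresses):
    cache = {}  # index -> tag; an entry is "valid" iff its index is present
    results = []
    for addr in addresses:
        index = addr % NUM_BLOCKS   # low 3 bits of the address
        tag = addr // NUM_BLOCKS    # remaining high bits
        if cache.get(index) == tag:
            results.append("Hit")
        else:
            results.append("Miss")
            cache[index] = tag      # fetch the block, replacing any old one
    return results

trace = [22, 26, 22, 26, 16, 3, 16, 18]
print(simulate(trace))
# -> ['Miss', 'Miss', 'Hit', 'Hit', 'Miss', 'Miss', 'Hit', 'Miss']
```

The final access to 18 (10 010) misses because index 010 still holds tag 11 from address 26, illustrating block replacement.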

  18. Example: Larger Block Size
  • 64 blocks, 16 bytes/block
  • To what block number does address 1200 map?
    • Block address = ⌊1200/16⌋ = 75
    • Block number = 75 modulo 64 = 11
  • Address fields (32-bit address):
    bits 31–10: Tag (22 bits) | bits 9–4: Index (6 bits) | bits 3–0: Offset (4 bits)
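The slide's arithmetic checks out directly:

```python
# With 16-byte blocks and 64 cache blocks, byte address 1200 maps
# to cache block 11.
BLOCK_SIZE = 16   # bytes -> 4 offset bits
NUM_BLOCKS = 64   # -> 6 index bits

block_address = 1200 // BLOCK_SIZE      # 1200/16 = 75
block_number = block_address % NUM_BLOCKS  # 75 mod 64 = 11
print(block_address, block_number)      # -> 75 11
```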

  19. Block Size Considerations
  • Larger blocks should reduce the miss rate
    • Due to spatial locality
  • But in a fixed-sized cache:
    • Larger blocks -> fewer of them
      • More competition -> increased miss rate
    • Larger blocks -> pollution
  • Larger miss penalty
    • Can override the benefit of the reduced miss rate
    • Early restart and critical-word-first can help

  20. Cache Misses
  • On a cache hit, the CPU proceeds normally
  • On a cache miss:
    • Stall the CPU pipeline
    • Fetch the block from the next level of the hierarchy
    • Instruction cache miss: restart the instruction fetch
    • Data cache miss: complete the data access

  21. Write-Through
  • On a data-write hit, we could just update the block in the cache
    • But then cache and memory would be inconsistent
  • Write-through: also update memory
  • But this makes writes take longer
    • e.g., if base CPI = 1, 10% of instructions are stores, and a write to
      memory takes 100 cycles:
      Effective CPI = 1 + 0.1 × 100 = 11
  • Solution: write buffer
    • Holds data waiting to be written to memory
    • The CPU continues immediately
    • The CPU only stalls on a write if the write buffer is already full
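The slide's effective-CPI calculation, written out (every store pays the full 100-cycle memory write, with no write buffer):

```python
# Write-through penalty from the slide: base CPI 1, 10% of
# instructions are stores, each memory write costs 100 cycles.
base_cpi = 1.0
store_fraction = 0.10
write_cycles = 100

effective_cpi = base_cpi + store_fraction * write_cycles
print(effective_cpi)  # -> 11.0
```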

  22. Write-Back
  • Alternative: on a data-write hit, just update the block in the cache
    • Keep track of whether each block is dirty
  • When a dirty block is replaced:
    • Write it back to memory
    • Can use a write buffer to allow the replacing block to be read first

  23. Measuring Cache Performance
  • Components of CPU time:
    • Program execution cycles (includes cache hit time)
    • Memory stall cycles (mainly from cache misses)
  • With simplifying assumptions:

    Memory stall cycles   Memory accesses
    ------------------- = --------------- × Miss rate × Miss penalty
          Program             Program

                          Instructions      Misses
                        = ------------ × ----------- × Miss penalty
                            Program      Instruction
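A worked use of the stall-cycle formula, with hypothetical numbers (the instruction count, accesses per instruction, miss rate, and miss penalty below are illustrative; they are not from the slides):

```python
# Hypothetical workload: 1,000,000 instructions, 1.25 memory accesses
# per instruction, 2% miss rate, 100-cycle miss penalty.
instructions = 1_000_000
accesses_per_instruction = 1.25
miss_rate = 0.02
miss_penalty = 100

memory_accesses = instructions * accesses_per_instruction
stall_cycles = memory_accesses * miss_rate * miss_penalty
print(stall_cycles)  # -> 2500000.0
```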

  24. Average Access Time
  • Hit time is also important for performance
  • Average memory access time (AMAT):
    AMAT = Hit time + Miss rate × Miss penalty
  • Example:
    • CPU with a 1 ns clock, hit time = 1 cycle, miss penalty = 20 cycles,
      I-cache miss rate = 5%
    • AMAT = 1 + 0.05 × 20 = 2 ns
    • i.e., 2 cycles per instruction
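The slide's AMAT example, computed directly:

```python
# AMAT example from the slide: hit time 1 cycle, 5% miss rate,
# 20-cycle miss penalty; with a 1 ns clock, cycles equal nanoseconds.
hit_time = 1        # cycles
miss_rate = 0.05
miss_penalty = 20   # cycles

amat_cycles = hit_time + miss_rate * miss_penalty
print(amat_cycles)  # -> 2.0 (= 2 ns with a 1 ns clock)
```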

  25. Summary
  • Today: direct mapped caches
  • Performance is tied to whether values are located in the cache
    • Cache miss = bad performance
  • Need to understand how to numerically determine system performance based
    on the cache hit rate
  • Why might direct mapped caches be bad?
    • Lots of data map to the same location in the cache
  • Idea: maybe we should have multiple locations for each data value
    • Next time: set associative caches
