Adapted from Computer Organization and Design, Patterson & Hennessy, UCB
ECE232: Hardware Organization and Design, Lecture 23: Associative Caches
ECE232: Associative Caches 2
Overview
- Last time: Direct mapped cache
- Pretty simple to understand
- Every memory block goes in only one place in the cache
- Somewhat limiting
- May cause a lot of the cache to be unused
- Idea!
- Why not be more flexible: data can go into more than one place
- Associative caches
ECE232: Associative Caches 3
Cache addressing
- How do you know if something is in the cache? (Q1)
- If it is in the cache, how do you find it? (Q2)
- Traditional Memory
- Given an address, provide the data (has an address decoder)
- Associative Memory
- AKA “Content Addressable Memory”
- Each line contains the address (or part of it) and the data
[Figure: CPU, cache, and memory. Each cache line stores a tag (the full address or its MSBs) alongside the data; blocks X and Y move between memory and the cache, to and from the processor.]
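The two questions above (is it in the cache, and where) can be sketched with a tiny content-addressable lookup. This is only an illustration: the `Line` class, the `cam_lookup` function, and the sample addresses are hypothetical names and values, and the loop stands in for what the hardware does with one comparator per line in parallel.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Line:
    valid: bool = False          # line holds real data only if valid
    tag: int = 0                 # full address (or its MSBs)
    data: Optional[int] = None

def cam_lookup(lines, address):
    """Return (hit, data): Q1 is the hit flag, Q2 is answered by the matching line."""
    for line in lines:           # hardware compares all tags at once
        if line.valid and line.tag == address:
            return True, line.data
    return False, None

lines = [Line(True, 0x2A, 111), Line(True, 0x07, 222), Line()]
print(cam_lookup(lines, 0x07))   # (True, 222)
print(cam_lookup(lines, 0x10))   # (False, None)
```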
ECE232: Associative Caches 4
Cache Organization
- Fully-associative: any memory location can be stored anywhere in the cache
- Cache location and memory address are unrelated
- Direct-mapped: each memory location maps onto exactly one cache entry
- Some of the memory address bits are used to index the cache
- N-way set-associative: each memory location maps to exactly one set and can go into any of the N entries in that set
[Figure: the MSBs of the address are stored as the tag next to the data; the LSBs of the address index the cache.]
ECE232: Associative Caches 5
Direct mapped cache (assume 1 byte/Block)
- Cache Block 0 can be occupied by data from memory blocks 0, 4, 8, 12
- Cache Block 1 can be occupied by data from memory blocks 1, 5, 9, 13
- Cache Block 2 can be occupied by data from memory blocks 2, 6, 10, 14
- Cache Block 3 can be occupied by data from memory blocks 3, 7, 11, 15
[Figure: 4-Block Direct Mapped Cache (cache index 0-3) beside a 16-block memory (block index 0-15). Memory blocks 0000₂, 0100₂, 1000₂, and 1100₂ all map to cache block 0.]
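The mapping listed on this slide is the block number modulo the number of cache blocks, which for a power-of-two cache is just the low-order bits of the block number. A minimal sketch, assuming the 4-block cache and 16-block memory of the slide (the function names are illustrative):

```python
NUM_CACHE_BLOCKS = 4     # 4-block direct-mapped cache, as on the slide
NUM_MEMORY_BLOCKS = 16   # memory blocks 0..15

def cache_index(memory_block):
    """Direct-mapped placement: the only cache block this memory block may use."""
    return memory_block % NUM_CACHE_BLOCKS

def competitors(cache_block):
    """All memory blocks that compete for one cache block."""
    return [b for b in range(NUM_MEMORY_BLOCKS) if cache_index(b) == cache_block]

for i in range(NUM_CACHE_BLOCKS):
    print(f"cache block {i}: memory blocks {competitors(i)}")
```

Memory blocks 0, 4, 8, and 12 evict one another even when the other three cache blocks are empty, which is the limitation the associative organizations address.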
ECE232: Associative Caches 6
Fully Associative Cache
[Figure: fully associative cache beside a 16-block memory (block index 0-15). The memory block address is split into a tag and an offset; each cache entry stores a tag with 1 word of data.]
ECE232: Associative Caches 7
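Fully Associative Cache: Block=1 Byte
[Figure: the CPU sends an address to the cache and exchanges data with it; the cache holds Tag and Data columns and is backed by main memory.]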
ECE232: Associative Caches 8
Two-way Set Associative Cache
- Two direct-mapped caches operate in parallel
- Cache Index selects a “set” from the cache (a set includes 2 blocks)
- The two tags in the set are compared in parallel
- Data is selected based on the result of the tag comparison
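[Figure: two-way set-associative cache. The cache index selects one entry (Valid, Cache Tag, Cache Data) from each of the two ways; two comparators check the stored tags against the address tag; their outputs are ORed to produce Hit and drive the mux select lines (Sel1, Sel0) that choose the cache block.]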
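The lookup path described on this slide can be sketched in software as follows. The number of sets, the dictionary layout, and the `fill`/`lookup` names are assumptions made for illustration; in hardware both comparisons happen at the same time.

```python
NUM_SETS = 4  # hypothetical size: 4 sets x 2 ways = 8 blocks

def empty_way():
    return [{"valid": False, "tag": 0, "data": None} for _ in range(NUM_SETS)]

ways = [empty_way(), empty_way()]   # two direct-mapped arrays side by side

def fill(way, block_address, data):
    """Place a block in the chosen way of the set its index selects."""
    index = block_address % NUM_SETS
    ways[way][index] = {"valid": True, "tag": block_address // NUM_SETS, "data": data}

def lookup(block_address):
    index = block_address % NUM_SETS        # cache index selects the set
    tag = block_address // NUM_SETS
    entries = [way[index] for way in ways]  # one entry from each way
    sel = [e["valid"] and e["tag"] == tag for e in entries]  # two comparators
    hit = sel[0] or sel[1]                  # OR gate
    # mux: pick the data from whichever way matched
    data = entries[0]["data"] if sel[0] else entries[1]["data"] if sel[1] else None
    return hit, data

fill(0, 5, "A")    # blocks 5 and 9 both map to set 1 ...
fill(1, 9, "B")    # ... and can be resident together
print(lookup(5), lookup(9), lookup(13))
```

In a direct-mapped cache of the same size, blocks 5 and 9 would not collide either, but with only 4 entries they would; the point of the second way is that two blocks with the same index can coexist.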
ECE232: Associative Caches 9
4-way Set Associative Cache
- Allow a block to be placed anywhere in a set
- Advantage:
- Better hit rate
- Disadvantages:
- More tag bits
- More hardware
- Higher access time
A Four-Way Set-Associative Cache, Block size = 4 bytes
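[Figure: the 32-bit address is split into a 22-bit tag, an 8-bit index, and a 2-bit byte offset. The index selects one of 256 sets (0-255); each set holds four (V, Tag, Data) entries. Four 22-bit tag comparators drive a 4-to-1 multiplexor that outputs Hit and 32 bits of Data.]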
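The "more tag bits" cost can be checked with a little arithmetic. The sketch below assumes the parameters visible in the figure (32-bit addresses, 4-byte blocks, 256 sets of 4 ways, so 1024 blocks in total) and holds the number of blocks fixed while varying the associativity; `field_widths` is an illustrative helper, not part of the lecture.

```python
ADDRESS_BITS = 32
BLOCK_BYTES = 4
NUM_BLOCKS = 1024   # 256 sets x 4 ways, as in the figure

def log2_exact(n):
    """log2 of a power of two, using integer arithmetic only."""
    return n.bit_length() - 1

def field_widths(ways):
    """Return (tag_bits, index_bits, offset_bits) for a given associativity."""
    num_sets = NUM_BLOCKS // ways
    offset_bits = log2_exact(BLOCK_BYTES)
    index_bits = log2_exact(num_sets)
    tag_bits = ADDRESS_BITS - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits

for ways in (1, 4, NUM_BLOCKS):   # direct mapped, 4-way, fully associative
    tag, index, offset = field_widths(ways)
    print(f"{ways:4d}-way: tag={tag} index={index} offset={offset} "
          f"total tag storage={tag * NUM_BLOCKS} bits")
```

With these assumptions the tag grows from 20 bits (direct mapped) to 22 bits (4-way) to 30 bits (fully associative), because every index bit given up becomes a tag bit.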
ECE232: Associative Caches 10
Set Associative Cache - addressing
Address fields: TAG | INDEX/Set # | OFFSET
- Tag: checks whether the correct block is anywhere in the set
- Index: selects a set in the cache
- Offset: byte offset within the block
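Splitting an address into these three fields is a matter of shifts and masks. The sketch below assumes an 8-bit index and a 2-bit byte offset (the widths of the four-way example with 4-byte blocks); the constants, the function name, and the sample address are illustrative.

```python
INDEX_BITS = 8    # assumed: 256 sets
OFFSET_BITS = 2   # assumed: 4-byte blocks

def split_address(address):
    """Return (tag, index, offset) for a byte address."""
    offset = address & ((1 << OFFSET_BITS) - 1)                 # lowest bits
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # middle bits
    tag = address >> (OFFSET_BITS + INDEX_BITS)                 # remaining MSBs
    return tag, index, offset

tag, index, offset = split_address(0x12345678)
print(hex(tag), hex(index), offset)   # 0x48d15 0x9e 0
```

The index picks the set, the tag is compared against every entry in that set, and the offset selects the byte within the block once a match is found.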
ECE232: Associative Caches 11
Associative Caches
- Fully associative
- Allow a given block to go in any cache entry
- Requires all entries to be searched at once
- Comparator per entry (expensive)
- n-way set associative
- Each set contains n entries
- Block number determines which set
- (Block number) modulo (#Sets in cache)
- Search all entries in a given set at once
- n comparators (less expensive)
ECE232: Associative Caches 12
Spectrum of Associativity
- For a cache with 8 entries
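The rule (block number) modulo (#sets in cache) covers the whole spectrum for an 8-entry cache. In the sketch below the block number 12 and the entry numbering (entries of a set stored next to each other) are assumptions chosen for illustration.

```python
NUM_ENTRIES = 8

def placement(block_number, ways):
    """Return (num_sets, set_index, candidate_entries) for an n-way cache."""
    num_sets = NUM_ENTRIES // ways
    set_index = block_number % num_sets
    first = set_index * ways                 # assumed set-major entry numbering
    return num_sets, set_index, list(range(first, first + ways))

for ways in (1, 2, 4, 8):
    num_sets, set_index, entries = placement(12, ways)
    print(f"{ways}-way: {num_sets} sets, block 12 -> set {set_index}, entries {entries}")
```

At one end (1-way) the block has a single legal entry; at the other (8-way) there is one set and the block may occupy any of the 8 entries.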
ECE232: Associative Caches 13
How Much Associativity
- Increased associativity decreases miss rate
- But with diminishing returns
- Simulation of a system with a 64KB D-cache and 16-word blocks running SPEC2000; miss rates:
- 1-way: 10.3%
- 2-way: 8.6%
- 4-way: 8.3%
- 8-way: 8.1%
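The diminishing returns are easy to see by computing the drop in miss rate at each doubling of associativity. The figures are the ones listed above; the `gains` helper is only an illustrative name.

```python
# Miss rates from the slide, in percent, keyed by associativity.
miss_rate = {1: 10.3, 2: 8.6, 4: 8.3, 8: 8.1}

def gains(rates):
    """Percentage-point drop in miss rate at each doubling of associativity."""
    ways = sorted(rates)
    return {f"{a}->{b}": round(rates[a] - rates[b], 1) for a, b in zip(ways, ways[1:])}

print(gains(miss_rate))   # {'1->2': 1.7, '2->4': 0.3, '4->8': 0.2}
```

Going from 1-way to 2-way removes 1.7 percentage points of misses; the next two doublings together remove only 0.5.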
ECE232: Associative Caches 14
Set Associative Cache Organization
ECE232: Associative Caches 15
Types of Cache Misses (for 3 organizations)
- Compulsory (cold start): the location has never been accessed; this is the first access to a block not in the cache
- Capacity: since the cache cannot contain all the blocks of a program, some blocks will be replaced and later retrieved
- Conflict: when too many blocks try to load into the same set, some blocks will be replaced and later retrieved
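Conflict misses can be made visible by running one access trace through the three organizations at the same total size. The sketch below is a minimal simulator under stated assumptions: a 4-block cache, a short hypothetical trace of block addresses, and LRU replacement (the slides do not specify a replacement policy).

```python
from collections import OrderedDict

def count_misses(trace, num_blocks, ways):
    """Count misses for an n-way set-associative cache with LRU replacement."""
    num_sets = num_blocks // ways
    # One OrderedDict per set: keys are resident blocks, least recently used first.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in trace:
        resident = sets[block % num_sets]
        if block in resident:
            resident.move_to_end(block)        # hit: mark most recently used
        else:
            misses += 1
            if len(resident) == ways:
                resident.popitem(last=False)   # set full: evict the LRU block
            resident[block] = None
    return misses

trace = [0, 8, 0, 6, 8]   # hypothetical block addresses
for ways in (1, 2, 4):    # direct mapped, 2-way, fully associative (4 blocks)
    print(f"{ways}-way: {count_misses(trace, 4, ways)} misses")
```

The trace touches three distinct blocks, so three misses are compulsory in every organization. The direct-mapped cache takes 5 misses and the 2-way cache 4; the extra ones are conflict misses, since the fully associative cache of the same size holds all three blocks and misses only 3 times.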
ECE232: Associative Caches 16
Cache Design Decisions
- For a given cache size
- Block (Line) size
- Number of Blocks (Lines)
- How is the cache organized
- Write policy
- Replacement Strategy
- Increase cache size
- More Blocks (Lines)
- More lines means a higher hit rate
- But a larger memory is slower
- Use as many as practical
ECE232: Associative Caches 17
Summary
- Today: Associative caches
- Provide more choices for block storage
- More expensive in terms of hardware
- Require comparators for tags
- Many caches are set associative
- Remember:
- Direct mapped = 1-way set associative
- Fully associative = N-way set associative (N is the total number of blocks in the cache)