Memory Hierarchy (slides 28-76), Mark Heinrich & John Hennessy, EE182 Winter/98

Set-Associative Caches

  • Improve cache hit ratio by allowing a memory location to be placed in more than one cache block
    — An N-way associative cache allows placement in any block of a set with N elements
  • N is the set size
  • Number of blocks = N x number of sets
  • The set number is selected by a simple modulo function of the address bits; the set number is sometimes called the index (see the sketch after this list)
  • N comparators are needed to search all elements of the set in parallel
  • Fully-associative cache
    — A single set, allowing a memory location to be placed in any cache block
  • A direct-mapped organization can be considered a degenerate set-associative cache with set size 1
  • For fixed cache capacity, higher associativity leads to higher hit rates
    — Because more combinations of cache lines can be present in the cache at the same time
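To make the indexing concrete, here is a small sketch (my illustration, not code from the slides) that decomposes an address for a cache with 2^BLOCK_BITS-byte blocks and 2^SET_BITS sets; the parameters below match the 4KB 2-way example on the next slide (4B blocks, 512 sets, 21-bit tag).

    /* Address decomposition: tag | set number (index) | byte offset. */
    #include <stdio.h>
    #include <stdint.h>

    #define BLOCK_BITS 2   /* 4B blocks -> 2 offset bits */
    #define SET_BITS   9   /* 512 sets  -> 9 index bits  */

    int main(void) {
        uint32_t addr   = 8188;
        uint32_t offset = addr & ((1u << BLOCK_BITS) - 1);
        uint32_t set    = (addr >> BLOCK_BITS) & ((1u << SET_BITS) - 1); /* modulo # sets */
        uint32_t tag    = addr >> (BLOCK_BITS + SET_BITS);
        printf("addr %u -> tag %u, set %u, offset %u\n", addr, tag, set, offset);
        return 0;
    }

For address 8188 this prints tag 3, set 511, offset 0, which matches the worked trace later in these slides.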


2-Way Set-Associative Cache Example

4KB 2-Way Associative Cache with 4B Blocks

[Figure: two banks of 512 entries (set numbers 0..511), each entry holding a Valid bit, a 21-bit Cache Tag, and 4B of Cache Data. The address splits as bits 31..11 = 21-bit Address Tag, bits 10..2 = Set Number, bits 1..0 = Byte Offset. Two comparators check the Adr Tag against both blocks of the selected set; their results are ORed to form Hit, and Sel1/Sel0 drive a mux that selects the Cache Block data.]


Cache Initially Empty
[Figure: all 512 sets in both ways have Valid = 0; tags and data are undefined (X).]

Memory Reference Sequence

  • Look again at the following sequence of memory references for the previous 2-way associative cache: 0, 4, 8188, 0, 16384, 0
  • This sequence had 5 misses and 1 hit for the direct-mapped cache with the same capacity and block size; the sketch below replays it on the 2-way cache
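A minimal simulation sketch (my code, not the course's) replays that trace through the 2-way cache: 512 sets of 2 ways, LRU within each set. It reproduces the slide-by-slide outcome that follows, 4 misses and 2 hits.

    #include <stdio.h>
    #include <stdint.h>

    #define SETS 512
    #define WAYS 2

    int main(void) {
        struct { int valid; uint32_t tag; } cache[SETS][WAYS] = {0};
        int lru[SETS] = {0};            /* way to replace next in each set */
        uint32_t trace[] = {0, 4, 8188, 0, 16384, 0};
        int hits = 0, misses = 0;

        for (int i = 0; i < 6; i++) {
            uint32_t set = (trace[i] >> 2) % SETS;  /* bits 10..2 */
            uint32_t tag = trace[i] >> 11;          /* bits 31..11 */
            int way = -1;
            for (int w = 0; w < WAYS; w++)
                if (cache[set][w].valid && cache[set][w].tag == tag)
                    way = w;
            if (way >= 0) {
                hits++;
                printf("addr %5u: hit  (set %u, way %d)\n", trace[i], set, way);
            } else {
                misses++;
                way = lru[set];                     /* victim: the LRU way */
                cache[set][way].valid = 1;
                cache[set][way].tag = tag;
                printf("addr %5u: miss (set %u, placed in way %d)\n", trace[i], set, way);
            }
            lru[set] = 1 - way;                     /* the other way is now LRU */
        }
        printf("%d hits, %d misses\n", hits, misses);
        return 0;
    }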


After Reference 1 (address 0)
Address = 000000000000000000000 000000000 00 (tag | set number | byte offset)
Cache miss; place in the first block of set 0
[Figure: set 0, way 0 now holds Valid = 1, tag 0...0, and a copy of memory bytes 0..3; all other entries remain invalid.]


After Reference 2 (address 4)
Address = 000000000000000000000 000000001 00
Cache miss; place in the first block of set 1
[Figure: set 0, way 0 holds tag 0...0 and memory bytes 0..3; set 1, way 0 holds tag 0...0 and memory bytes 4..7; all other entries invalid.]


After Reference 3 (address 8188)
Address = 000000000000000000011 111111111 00
Cache miss; place in the first block of set 511
[Figure: additionally, set 511, way 0 now holds tag 0...011 and memory bytes 8188..8191.]


After Reference 4 (address 0)
Address = 000000000000000000000 000000000 00
Cache hit to the first block in set 0
[Figure: cache contents unchanged; the valid entry in set 0, way 0 with tag 0...0 matches.]


After Reference 5 (address 16384)
Address = 000000000000000001000 000000000 00
Cache miss; place in the second block of set 0
[Figure: set 0, way 1 now holds tag 0...01000 and memory bytes 16384..16387, alongside the earlier entries.]


After Reference 6 (address 0)
Address = 000000000000000000000 000000000 00
Cache hit to the first block in set 0
Total of 2 hits and 4 misses
[Figure: cache contents unchanged.]


Miss Rate vs. Set Size

  • Data is for gcc (compiler execution) on a DECStation 3100 with separate 64KB code and data caches using 16B blocks
  • In general, increasing associativity beyond 2-4 has minimal impact on miss ratio
    — 4-way associativity shows more benefit for combined code/data caches

    Associativity   Instruction Miss Rate   Data Miss Rate
    1               2.0%                    1.7%
    2               1.6%                    1.4%
    4               1.6%                    1.4%


Miss Rate vs. Set Size

[Figure 7.29 from the text: miss rate (0% to 15%) vs. associativity (one-way, two-way, four-way, eight-way) for cache sizes from 1 KB to 128 KB. Data for SPEC92 on a combined code/data cache with 32B blocks.]


Set-Associative Cache Disadvantages

  • N-way set-associative vs. direct-mapped cache
    — N comparators vs. 1
    — Extra mux delay for the data
    — Data available only after Hit/Miss is determined
  • Direct-mapped cache:
    — Data available before Hit/Miss is determined
      • Assume a hit and continue; recover later if it was a miss

[Figure: the 2-way set-associative datapath from the earlier example, repeated: two 512-entry tag/data arrays with 21-bit tags, two comparators, an OR forming Hit, and the data mux.]


Another Extreme: Fully Associative

  • Fully associative cache
    — Push set-associative to its limit: only one set!
      • => no set number (or index)
    — Compare the cache tags of all cache entries in parallel
    — Example: block size = 32B => N 27-bit comparators
    — Generally not used for caches because of cost, but fully-associative translation buffers (covered soon) are common

[Figure: a fully-associative cache. Each entry has a Valid bit, a 27-bit Cache Tag, and a 32B block (Byte 0 .. Byte 31, Byte 32 .. Byte 63, ...). The address splits at bit 5: bits 31..5 form the Cache Tag and bits 4..0 the Byte Select (e.g., 0x01). Every entry's tag is compared (=) against the address tag in parallel.]


Cache Miss Classification

  • Start by measuring miss rate with an idealized cache
    — Ideal is fully associative with infinite capacity
    — Then reduce capacity to the size of interest
    — Then reduce associativity to the degree of interest
  • Compulsory
    — First access to a block => cold start
    — Helps to increase block size; prefetching can also help
  • Capacity
    — Cache cannot contain all blocks accessed by the program
    — Helps to increase cache size
  • Conflict
    — Number of memory locations mapped to a set exceeds the set size
    — Helps to:
      • increase cache size, because there are more sets
      • increase associativity
  • Invalidation
    — Another processor or I/O invalidates the line
    — Helps to tune allocation and usage of shared data


Cache Block Replacement Policies (Part I)

  • Direct-mapped cache
    — Each memory location maps to a single cache location
    — No replacement policy is necessary: the new item replaces the previous item in that cache location
  • Set-associative caches
    — N-way set-associative cache: each memory location has a choice of N cache locations
    — Fully associative cache: each memory location can be placed in any cache location
    — Cache miss handling for set-associative caches:
      • Bring in the new block from memory
      • Identify a block in the selected set to replace
      • Need a policy to decide which block to replace


Cache Block Replacement Policies (Part II)

  • Random replacement
    — Hardware randomly selects a cache block to replace
    — Implementation is actually pseudo-random, typically driven by a counter or pointer
  • Least Recently Used (LRU)
    — Hardware keeps track of access history
      • Maintaining a complete usage order for set size N requires log2(N!) bits
      • Replace the entry that has not been used for the longest time
    — Simple for 2-way associative: a single bit per set records which block in the set was more recently used
    — Generally use pseudo-LRU for higher degrees of associativity
      • e.g., 4-way associative uses 3 bits for 2 pairs of entries: 1 bit per pair tracks LRU within the pair, and another bit tracks LRU between the pairs (see the sketch after this list)
  • Optimal replacement
    — Replace the block that will be used furthest in the future
  • In practice the replacement policy has a minor impact on miss rate
    — Especially for high associativity
    — Surprises are possible
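A sketch of the 3-bit tree pseudo-LRU scheme mentioned above (my illustration; the names are made up): each bit points toward the less recently used side, an access flips the bits away from the touched way, and the victim is found by following the bits.

    #include <stdio.h>

    typedef struct { unsigned root, pair0, pair1; } plru4;

    /* Record an access to way w (0..3): flip bits to point AWAY from w. */
    static void plru_touch(plru4 *p, int w) {
        if (w < 2) { p->root = 1; p->pair0 = (w == 0) ? 1 : 0; }
        else       { p->root = 0; p->pair1 = (w == 2) ? 1 : 0; }
    }

    /* Choose a victim: follow the bits toward the pseudo-LRU way. */
    static int plru_victim(const plru4 *p) {
        if (p->root == 0) return p->pair0 ? 1 : 0;
        else              return p->pair1 ? 3 : 2;
    }

    int main(void) {
        plru4 p = {0, 0, 0};
        int accesses[] = {0, 2, 1, 3};
        for (int i = 0; i < 4; i++) plru_touch(&p, accesses[i]);
        /* Way 0 is the true LRU here; the tree agrees in this case. */
        printf("victim = way %d\n", plru_victim(&p));
        return 0;
    }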


Cache Write Policy

  • A cache read is much easier to handle than a cache write
    — A read does not change the value of the data
  • Cache write
    — Need to keep the data in the cache and in memory consistent
  • Two options
    — Write-back: write to the cache only
      • Write the cache block to memory only when that cache block is being replaced on a cache miss
      • Reduces the memory bandwidth required
      • Keep a bit (called the dirty bit) per cache block to track whether the block has been modified; only modified lines need to be written back (see the sketch after this list)
      • Control can be complex
    — Write-through: write to both the cache and memory
      • What!!! How can this be? Isn’t memory too slow for this?
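A minimal sketch of the write-back bookkeeping just described (assumed structures, not the course's code): a store hit only sets the dirty bit, and memory is written only when a dirty line is evicted.

    #include <stdio.h>
    #include <stdint.h>

    enum { BLOCK_BYTES = 4 };

    struct line { int valid, dirty; uint32_t tag; uint8_t data[BLOCK_BYTES]; };

    /* Stand-in for the memory write path. */
    static void memory_write_block(const struct line *l) {
        printf("writing back dirty block with tag 0x%x\n", l->tag);
    }

    /* Write-back store: update the cache only and mark the line dirty. */
    static void store_byte(struct line *l, uint32_t offset, uint8_t value) {
        l->data[offset] = value;
        l->dirty = 1;                 /* memory is now stale; defer the write */
    }

    /* On replacement, only modified lines are written back. */
    static void evict(struct line *l) {
        if (l->valid && l->dirty) memory_write_block(l);
        l->valid = l->dirty = 0;
    }

    int main(void) {
        struct line l = { 1, 0, 0x1234, {0} };
        store_byte(&l, 0, 42);   /* hit: no memory traffic */
        evict(&l);               /* replacement: one block write */
        return 0;
    }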


[Figure: Processor -> Cache -> Write Buffer -> DRAM]

Write Buffer for Write Through

  • Use a write buffer between the cache and memory
    — Processor: writes data into the cache and the write buffer
    — Memory controller: writes buffer contents to memory
  • The write buffer is just a FIFO (first-in, first-out) queue
    — Typically a small number of entries (4-8)
    — Works fine if: rate of stores < 1 / DRAM write cycle
    — As the rate of stores approaches 1 / DRAM write cycle:
      • the write buffer fills up
      • the processor stalls to allow memory to catch up
      • this happens due to bursts in the write rate even if the average rate is OK (see the sketch below)
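A sketch of the write buffer as a bounded FIFO (my illustration, with an assumed 4-entry buffer): the processor enqueues stores, the memory controller drains one per DRAM write cycle, and a burst that fills the buffer stalls the processor.

    #include <stdio.h>
    #include <stdint.h>

    #define WB_ENTRIES 4

    struct wbuf {
        uint32_t addr[WB_ENTRIES], data[WB_ENTRIES];
        int head, tail, count;
    };

    /* Called by the processor on a store; returns 1 if it must stall. */
    static int wbuf_push(struct wbuf *b, uint32_t addr, uint32_t data) {
        if (b->count == WB_ENTRIES) return 1;        /* full: stall */
        b->addr[b->tail] = addr;
        b->data[b->tail] = data;
        b->tail = (b->tail + 1) % WB_ENTRIES;
        b->count++;
        return 0;
    }

    /* Called by the memory controller once per DRAM write cycle. */
    static void wbuf_drain(struct wbuf *b) {
        if (b->count == 0) return;
        printf("DRAM write: [0x%x] = 0x%x\n", b->addr[b->head], b->data[b->head]);
        b->head = (b->head + 1) % WB_ENTRIES;
        b->count--;
    }

    int main(void) {
        struct wbuf b = {0};
        for (uint32_t i = 0; i < 6; i++)             /* a burst of 6 stores */
            if (wbuf_push(&b, 0x1000 + 4 * i, i))
                printf("store %u stalls: buffer full\n", i);
        while (b.count) wbuf_drain(&b);
        return 0;
    }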

Cache Performance

  • CPU time = (CPU cycles + memory stall cycles) x clock cycle time
  • The memory system affects:
    — Memory stall cycles
      • cache miss stalls + write buffer stalls
    — Clock cycle time
      • since cache access often determines the clock speed of a processor
  • Memory stall cycles = read stall cycles + write stall cycles
  • Read stall cycles = read miss rate x #reads x read miss penalty
  • For a write-back cache:
    — Write stall cycles = write miss rate x #writes x write miss penalty
    — Or the read and write components can be combined:
      • Memory stall cycles = miss rate x #accesses x miss penalty
      • Memory stall CPI = misses/instruction x miss penalty
  • For write-through caches
    — Add write buffer stalls


Cache Performance: Example

  • Assume:
    — Miss rate for instructions = 5%
    — Miss rate for data = 8%
    — Data references per instruction = 0.4
    — CPI with a perfect cache = 1.4
    — Miss penalty = 20 cycles
  • Find performance relative to a perfect cache with no misses (same clock rate)
    — Misses/instruction = 0.05 + 0.08 x 0.4 = 0.082 ≈ 0.08
    — Miss stall CPI = 0.08 x 20 = 1.6
    — Performance is the ratio of CPIs:
      Performance(no misses) / Performance(with misses) = CPI(with misses) / CPI(no misses) = (1.4 + 1.6) / 1.4 ≈ 2.1


Improving Cache Performance

  • Cache performance is determined by:
    Average memory access time = hit time + miss rate x miss penalty
  • Use better technology
    — Use faster RAMs
    — Cost and availability are limitations
  • Decrease hit time
    — Make the cache smaller, but miss rate increases
    — Use direct mapping, but miss rate increases
  • Decrease miss rate
    — Make the cache larger, but this can increase hit time
    — Add associativity, but this can increase hit time
    — Increase block size, but this increases miss penalty
  • Decrease miss penalty
    — Reduce the transfer time component of the miss penalty
    — Add another level of cache


Reducing Memory Transfer Time

[Figure: three ways to reduce transfer time, each shown as CPU, cache ($), bus, and memory (M): Solution 1, high-bandwidth DRAM (examples: page mode DRAM, SDRAM, RAMbus); Solution 2, a wide path between memory and cache, with a mux narrowing the refill into the CPU; Solution 3, memory interleaving with multiple memory banks on the bus.]


Classical DRAM Organization

Row + column address together select a single bit.
[Figure: a RAM cell array with word (row) select lines and bit (data) lines. A row decoder driven by the row address selects a word line; the column address drives the column selector & I/O circuits, which pick the data bit. Each intersection represents a 1-T DRAM cell.]


Cycle Time versus Access Time

  • DRAM cycle time >> DRAM access time
  • DRAM (read/write) cycle time:
    — How frequently can you initiate an access?
  • DRAM (read/write) access time:
    — How quickly will you get what you want once you initiate an access?
[Figure: a timeline in which the access time (request to data) is contained within the longer cycle time (request to the earliest next request).]


Fast Page Mode DRAM

  • Regular DRAM organization
    — N rows x N columns x M bits wide
    — Reads and writes M bits at a time
    — Each M-bit access requires a RAS/CAS cycle
  • Fast page mode DRAM
    — An N x M “register” saves a row
    — Only CAS is needed for subsequent accesses within the row
[Figure: an N-row x N-column DRAM array; the row address loads one row into the N x M register, and successive column addresses deliver M-bit outputs without a new RAS.]


Increasing Bandwidth By Interleaving

Access pattern without interleaving: the CPU starts the access for D1, waits until D1 is available, and only then starts the access for D2; each access occupies the single memory for a full cycle.

Access pattern with 4-way interleaving: the CPU starts accesses to bank 0, bank 1, bank 2, and bank 3 on successive cycles; by the time bank 3 has been started, bank 0 can be accessed again.
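A small sketch of the bank selection behind 4-way interleaving (my illustration, with an assumed low-bits-pick-the-bank mapping): consecutive blocks land in consecutive banks, so a sequential access stream keeps all four banks busy.

    #include <stdio.h>
    #include <stdint.h>

    #define BANKS 4
    #define BLOCK_BYTES 4

    int main(void) {
        for (uint32_t addr = 0; addr < 32; addr += BLOCK_BYTES) {
            uint32_t block = addr / BLOCK_BYTES;
            uint32_t bank  = block % BANKS;   /* low bits pick the bank */
            uint32_t row   = block / BANKS;   /* high bits pick the word in the bank */
            printf("addr %2u -> bank %u, row %u\n", addr, bank, row);
        }
        return 0;
    }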


Cache Summary

  • Principle of locality:
    — Temporal locality: locality in time
    — Spatial locality: locality in space
  • Cost/performance tradeoff in memory technologies
  • Three major categories of cache misses
    — Compulsory misses: first-time access
    — Capacity misses: increase cache size
    — Conflict misses: increase associativity or size
  • Write policy
    — Write-through: use a write buffer; high memory bandwidth
    — Write-back: control can be complex, but low memory bandwidth


Virtual Memory

  • Provides the appearance of a very large memory
    — Total memory for all jobs >> physical memory
    — Address space of each job > physical memory
  • Allows the available (fast and expensive) physical memory to be utilized effectively
  • Simplifies memory management
  • Exploits hierarchy to reduce average access time
  • Uses 2 storage levels
    — primary (main) memory and secondary memory
  • Distinguish between virtual and physical addresses
    — A virtual address is used by the programmer to address memory within a process’s address space
    — A physical address is used by the hardware to access a physical memory location


Virtual Memory (VM) Basics

  • Maps virtual addresses to physical addresses
    — Called (dynamic) address translation
    — Provides protection and sharing between processes
  • Data missing from main memory must be transferred from secondary memory (disk)
    — Misses are handled by the operating system
      • The miss time is very large, so the OS manages the hierarchy and schedules another process instead of stalling (context switch)
    — Architectures support either variable or fixed-size data transfers
      • fixed-size is most common now
    — Placement of blocks in memory depends on the choice of fixed vs. variable
    — Blocks can be fetched when needed or prefetched in advance


Fixed vs. Variable Blocks

  • Allocation
    — It is easier to allocate fixed-size blocks, because a block from secondary storage can be placed in any free main memory location
    — Variable-size blocks require a search for a free main memory location of adequate size
  • Fragmentation => free memory that cannot be allocated
    — Variable-size blocks exhibit external fragmentation
      • There may be sufficient free main memory to hold a block, but the free locations are scattered among variable-size blocks
    — Fixed-size blocks exhibit internal fragmentation
      • A program is allocated an integral number of fixed-size blocks, so part of the last block is "wasted"
      • The impact can be minimized by selecting block sizes that are small relative to program requirements and main memory capacity


Address Translation

  • Programs use virtual addresses
    — Relocation => a program can be loaded anywhere in physical memory without recompiling or re-linking
  • Memory is accessed with physical addresses
  • The ISA defines the interface between the OS and the hardware to specify the virtual-to-physical translation
    — When a virtual address is missing from main memory, the OS handles the miss
      • read the missing data, create the translation, and return to re-execute the instruction that caused the miss
[Figure: a virtual address enters Address Translation; a valid translation yields a physical address into main memory, while a missing item raises a fault to the fault handler (in the OS), and the OS performs the transfer from secondary memory into main memory.]


Exceptions

  • Conditions and events that cause a transfer to the OS
    — Examples: integer overflow, translation fault, HW failure, I/O interrupt
  • The OS handles the exception and may restart execution
    — Saves the state of the executing process, including the address of the instruction that caused the exception
    — Determines the cause of the exception
      • exception vector address: the address where the OS is initiated
      • a register that records the reason (the MIPS Cause register)
    — May need to restart the instruction as if nothing happened
      • The hardest type of exception
      • Page faults are like this: transparent to the user
  • Hardware may implement precise exceptions
    — “Hardware stops on a dime”: no extra instructions are executed
    — Otherwise SW may need to do extra work to recover state changes

Implementing Protection

  • The address translation mechanism allows multiple users (and the OS) to use portions of main memory while maintaining protection from other users
  • Each process has its own address space
    — process = address space + processor state (registers + PC)
      • The address space contains code + data
      • A process is active when it is running on the CPU: its processor state is loaded (including the PC) and its address space is mapped
      • Process state = everything that needs to be saved and restored to stop and restart the process
    — The compiler, linker, and loader are simplified because they see only the virtual address space, abstracted from physical memory allocation
  • How does this happen?
    — Each process has its own translation table, set up by the OS
    — Therefore, a process can access only what the OS allocates and permits


[Figure: the storage hierarchy reg -> cache -> memory -> disk, with a "page" on disk mapped into a "frame" in memory.]

Paged Virtual Memory

  • The most common form of address translation
    — Virtual and physical address spaces are partitioned into blocks of equal size
    — Virtual address space blocks are called pages
    — Physical address space blocks are called frames (or page frames)
  • Placement
    — Any page can be placed in any frame (fully associative)
  • Pages are fetched on demand

Paging Organization

  • Paging can map any virtual page to any physical frame

[Figure: virtual addresses map through address translation either to physical addresses in memory or to disk addresses.]


Mapping Process

  • Mapping is done by a page table kept in memory
  • The page size is a power of two:
    — The physical frame address replaces the virtual page address
    — The lower portion (page offset or displacement) is unchanged

[Figure: the 32-bit virtual address splits into a 20-bit page number and a 12-bit offset. The page number indexes the page table (located in physical memory, via the Page Table Base Register); each entry holds V (valid), Access Rights, and a Frame number. The 20-bit frame # is concatenated with the unchanged 12-bit offset to form the physical address sent to memory.]
Valid bit = 1 => the virtual page is valid and mapped
Access rights define the permitted accesses (Read, Write, Execute)


Address Translation Algorithm

  • If V = 1, the mapping is valid (see the code sketch after this list)
    — The CPU checks the permissions (R, R/W, X) against the access type
      • If the access is permitted, it generates the physical address and proceeds
      • If the access is not permitted, it generates a protection fault
  • If V != 1, the mapping is invalid
    — The CPU generates a page fault
  • Faults are exceptions handled by the OS
    — Page faults
      • The OS fetches the missing page, creates a map entry, and restarts the process
      • Another user process is switched in to execute while the page is brought in from disk
    — Protection faults
      • The OS checks whether it is a programming error or a permission that needs to be changed
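The algorithm above as straight-line code: a sketch assuming the 20-bit page number / 12-bit offset split from the earlier figure (the names and the rights encoding are mine, not the course's).

    #include <stdio.h>
    #include <stdint.h>

    enum access { READ, WRITE, EXECUTE };
    enum result { OK, PAGE_FAULT, PROTECTION_FAULT };

    struct pte {                  /* one page table entry */
        unsigned valid  : 1;
        unsigned rights : 3;      /* one bit per permitted access: R, W, X */
        unsigned frame  : 20;
    };

    enum result translate(const struct pte *page_table, uint32_t vaddr,
                          enum access type, uint32_t *paddr) {
        uint32_t vpn    = vaddr >> 12;     /* virtual page number */
        uint32_t offset = vaddr & 0xFFF;   /* unchanged low 12 bits */
        struct pte e = page_table[vpn];

        if (!e.valid)
            return PAGE_FAULT;             /* OS fetches the page and retries */
        if (!(e.rights & (1u << type)))
            return PROTECTION_FAULT;       /* OS decides: bug or permission change */
        *paddr = ((uint32_t)e.frame << 12) | offset;
        return OK;
    }

    int main(void) {
        static struct pte table[1 << 20];  /* all entries zero => invalid */
        table[5] = (struct pte){1, 1u << READ, 0x00ABC};
        uint32_t pa;
        if (translate(table, (5u << 12) | 0x123, READ, &pa) == OK)
            printf("paddr = 0x%x\n", pa);  /* 0xABC123 */
        return 0;
    }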


Page Replacement and Write Policies

  • When a page fault occurs, choose a page to replace
    — Fully associative, so any frame/page is a candidate
    — Choose an empty one if it exists
    — Otherwise choose by policy (just as we did for caches):
      • LRU: approximated with a Reference or Use bit in the page table
        — True LRU is too hard, since we would need to track the time of use for every page frame
        — Commonly approximated with clock algorithms that reset the Use bits periodically (see the sketch below)
      • Random: approximate, as we did for caches
  • Write policy: always write-back
    — Keep a dirty bit
      • Set to 1 if the page is modified
      • When a modified page is replaced, the OS writes it back to disk
    — Many systems use a SW page cleaner to write back modified pages when the disk would otherwise be idle
      • Makes the selection of a replacement more efficient
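A sketch of the clock algorithm mentioned above (my illustration, not the course's): the hand sweeps the frames, clearing Use bits and giving recently referenced pages a second chance; the first frame found with Use = 0 becomes the victim.

    #include <stdio.h>

    #define FRAMES 8

    static int use_bit[FRAMES] = {1, 1, 0, 1, 0, 0, 1, 1};
    static int hand = 0;

    static int clock_victim(void) {
        for (;;) {
            if (use_bit[hand] == 0) {
                int victim = hand;
                hand = (hand + 1) % FRAMES;
                return victim;
            }
            use_bit[hand] = 0;          /* referenced recently: second chance */
            hand = (hand + 1) % FRAMES;
        }
    }

    int main(void) {
        printf("victim = frame %d\n", clock_victim());   /* frame 2 */
        return 0;
    }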

Segmentation

  • Virtual memory with variable-sized segments
  • Provides relocation as well as virtualization
  • Can be applied instead of, or combined with, paging

[Figure: the virtual address splits into a segment # and a displacement. The segment #, together with the Segment Table Base Register, indexes the segment table; each entry holds a Presence (Valid) bit, access rights, the segment length, and a physical address. The displacement is checked against the segment length (displacement > length => protection error) and added to the segment's physical base to form the physical address.]
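A sketch of the segmented translation in the figure (the names are mine): check presence, bounds-check the displacement against the segment length, then add the displacement to the segment's physical base (unlike paging, which concatenates).

    #include <stdio.h>
    #include <stdint.h>

    struct segment { int present; uint32_t length, base; };

    /* Returns 0 on success, -1 on a missing segment or protection error.
     * Valid displacements are taken as 0..length-1 here; the exact
     * bound convention is an assumption. */
    static int seg_translate(const struct segment *table, uint32_t seg,
                             uint32_t disp, uint32_t *paddr) {
        struct segment s = table[seg];
        if (!s.present || disp >= s.length) return -1;
        *paddr = s.base + disp;            /* add, not concatenate */
        return 0;
    }

    int main(void) {
        struct segment table[4] = { {1, 0x2000, 0x40000}, {0} };
        uint32_t pa;
        if (seg_translate(table, 0, 0x0150, &pa) == 0)
            printf("paddr = 0x%x\n", pa);  /* 0x40150 */
        if (seg_translate(table, 0, 0x3000, &pa) != 0)
            printf("protection error: displacement exceeds segment length\n");
        return 0;
    }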


Segmented Address Translation

  • Three major drawbacks:
    — The segment register is a separate part of the address
      • This is done to expand the address space beyond the word length
      • Segmentation yields a nonlinear address space
      • Segments are visible to the programmer, who must specify a segment number, and segments are not adjacent
      • Unless a segment is large enough for all code or data, this creates programming difficulties or inefficiencies
    — Placing variable-sized objects is hard
      • Not simple like fixed-size pages
    — External fragmentation wastes memory, especially if there is wide variation in block size
  • Another kind of segmentation
    — Fixed-size segments on top of pages (MSBs = segment #)
    — Goal: to conserve and manage page table space
    — Called “segments”, but different: none of the above drawbacks


Choosing a Page Size

  • Factors favoring large pages
    — Smaller page tables
    — More efficient for large programs and large memories
      • Fewer page faults, more efficient disk transfer
  • Factors favoring smaller pages
    — Start-up time for small processes (only a few pages)
    — Internal fragmentation (significant only for small programs)
  • The general trend is toward larger pages
    — 1978: 512 B
    — 1984: 4 KB
    — 1990: 16 KB
    — 199x: 64 KB
  • Many recent machines support multiple page sizes to balance the competing factors


[Figure: a typical TLB entry holds Valid, Virtual Address, Physical Address, Dirty, and Access Rights fields.]

Making VM Fast: TLBs

  • If the page table is kept in memory
    — All memory references require two accesses
      • one for the page table entry and one to get the actual data
  • Translation Lookaside Buffer (TLB)
    — Hardware maintains a cache of recently used page table translations
    — All accesses are looked up in the TLB (see the sketch below)
      • A hit in the TLB gives the physical page number
      • A miss in the TLB => get the translation from the page table and reload the TLB
    — The TLB is usually smaller than the cache (each entry maps a full page)
      • More associativity is possible and common
      • Similar speed to a cache access
      • Contains all the bits needed to translate the address and implement VM
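A sketch of the lookup flow (my illustration: a tiny fully-associative TLB with rotating replacement and a stand-in page walk): a hit returns the cached frame, while a miss walks the page table, an extra memory access, and reloads a TLB entry.

    #include <stdio.h>
    #include <stdint.h>

    #define TLB_ENTRIES 8

    struct tlbe { int valid; uint32_t vpn, frame; };
    static struct tlbe tlb[TLB_ENTRIES];
    static int next_slot = 0;              /* simple rotating replacement */

    /* Stand-in for the in-memory page table walk. */
    static uint32_t page_table_lookup(uint32_t vpn) { return vpn + 0x100; }

    static uint32_t translate(uint32_t vaddr) {
        uint32_t vpn = vaddr >> 12, offset = vaddr & 0xFFF;
        for (int i = 0; i < TLB_ENTRIES; i++)
            if (tlb[i].valid && tlb[i].vpn == vpn)
                return (tlb[i].frame << 12) | offset;  /* TLB hit */
        uint32_t frame = page_table_lookup(vpn);       /* miss: extra access */
        tlb[next_slot] = (struct tlbe){1, vpn, frame}; /* reload the TLB */
        next_slot = (next_slot + 1) % TLB_ENTRIES;
        return (frame << 12) | offset;
    }

    int main(void) {
        printf("0x%x\n", translate(0x00003123));   /* miss: page walk */
        printf("0x%x\n", translate(0x00003456));   /* hit: same page */
        return 0;
    }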

Virtual Memory and Cache

  • The OS manages the memory hierarchy between secondary storage and main memory
    — It allocates physical memory to virtual memory and specifies the mapping to the hardware through page tables
  • The hardware caches recently used page table entries in the Translation Lookaside Buffer
[Figure: the CPU's address unit sends virtual addresses through address translation (TLB); TLB misses and page faults go to the page tables managed by the OS; physical addresses go to the cache, with cache misses served from physical memory (main memory / primary storage), backed by virtual memory on secondary storage.]


Memory access with a TLB

  • Access the TLB to get the physical address
    — Miss: access memory to get the translation
  • Access the cache to get the data
    — Miss: access memory to get the data
  • Optimization: access the TLB and the cache in parallel
[Figure: the CPU sends the virtual address to the TLB lookup; a hit yields the physical address for the cache lookup, and a cache hit returns data. A TLB miss accesses the page tables for the translation (from memory!); since the page table is at a known physical address, it can be accessed with the page table directly or via a TLB lookup. A cache miss goes to main memory.]


Common Memory Hierarchy Framework

  • Block placement
    — Direct-mapped
    — Set-associative
    — Fully-associative
  • Replacement policy
    — Random
    — LRU
  • Write policy
    — Write-back
    — Write-through


Memory Hierarchy Characteristics

  • Performance and cost drive different characteristics through the hierarchy
    — Access times to levels
    — Size and cost
    — Function and program behavior

    Hierarchy   Blocks     Block Size   Miss Handling
    Caches      Fixed      16-128 B     Hardware
    Paging      Fixed      4KB-64KB     SW
    Segments    Variable                SW
    TLBs        Fixed      1-2 PTEs     SW or HW


Pentium Pro Processor Memory Hierarchy

  • Virtual Memory

— 32-bit virtual address, 32-bit physical address
— 4KB page size
— TLB organization

  • split code/data TLBs
  • 32-entry code; 64-entry data
  • 4-way associative
  • pseudo-LRU replacement
  • Cache

— Level-1 on CPU-chip

  • split code/data
  • 8KB capacity
  • 32B blocks
  • 4-way associative
  • pseudo-LRU replacement

— Level-2 cache off CPU-chip

  • combined code/data
  • 1MB capacity
  • 32B blocks
  • 4-way associative
  • pseudo-LRU replacement



Summary

  • Two different types of locality
    — Temporal locality (locality in time)
      • The location of a memory reference is likely to be the same as another recent reference
    — Spatial locality (locality in space)
      • The location of a memory reference is likely to be near another recent reference

  • Memory hierarchies exploit locality
    — They present the user with the memory capacity of the least expensive technology at a speed near that of the fastest technology
  • DRAM is inexpensive, but slow
    — A good choice for presenting the user with a BIG memory system
  • SRAM is fast, but expensive
    — A good choice for providing the user with FAST access time


The Future

  • Memory hierarchies will continue to be an area of great importance
    — Growing gap between CPU performance and disk/DRAM access time
    — Innovative approaches are needed to:
      • Avoid having performance degraded by the slowest technology
      • Keep costs and power consistent with system requirements
  • There are many issues in memory hierarchies we have not had time to discuss
    — multilevel caches
    — sub-blocking
    — MP consistency
    — TLB design
    — virtual caches
    — write combining