

SLIDE 1

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB

ECE232: Hardware Organization and Design

Lecture 23: Associative Caches

SLIDE 2

ECE232: Associative Caches 2

Overview

  • Last time: Direct-mapped cache
    • Pretty simple to understand
    • Every memory block goes in only one place in the cache
    • Somewhat limiting: may leave much of the cache unused
  • Idea: be more flexible and let data go into more than one place
    • Associative caches
SLIDE 3

Cache addressing

  • Q1: How do you know if something is in the cache?
  • Q2: If it is in the cache, how do you find it?
  • Traditional memory
    • Given an address, provide the data (has an address decoder)
  • Associative memory
    • AKA “Content Addressable Memory” (CAM)
    • Each line contains the address (or part of it) and the data

[Figure: CPU, tag memory, and cache; each tag entry holds the full address (or its MSBs) alongside the data; blocks X and Y move to and from the processor]

SLIDE 4

Cache Organization

  • Fully associative: any memory location can be stored anywhere in the cache
    • Cache location and memory address are unrelated
  • Direct-mapped: each memory location maps onto exactly one cache entry
    • Some of the memory address bits are used to index the cache
  • N-way set-associative: each memory location maps to one set and can go into any of the N entries in that set

[Figure: address split into tag (MSBs of address) and index (LSBs of address), stored alongside the data]

SLIDE 5

Direct-mapped cache (assume 1 byte/block)

  • Cache block 0 can be occupied by data from memory blocks 0, 4, 8, 12
  • Cache block 1 can be occupied by data from memory blocks 1, 5, 9, 13
  • Cache block 2 can be occupied by data from memory blocks 2, 6, 10, 14
  • Cache block 3 can be occupied by data from memory blocks 3, 7, 11, 15

[Figure: 4-block direct-mapped cache; memory blocks 0-15 map onto cache indices 0-3; addresses 0000₂, 0100₂, 1000₂, 1100₂ all map to cache index 0]
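The mapping above can be sketched in a few lines of Python (an illustration, not part of the original slides; `cache_index` and `placement` are names chosen here):

```python
# Direct-mapped placement: with 4 cache blocks, memory block i
# always lands in cache block (i % 4).
def cache_index(block, num_cache_blocks=4):
    return block % num_cache_blocks

# Group the 16 memory blocks of the slide's example by cache block.
placement = {}
for block in range(16):
    placement.setdefault(cache_index(block), []).append(block)

print(placement)
# {0: [0, 4, 8, 12], 1: [1, 5, 9, 13], 2: [2, 6, 10, 14], 3: [3, 7, 11, 15]}
```

Each memory block has exactly one possible home, which is what makes direct mapping both cheap and restrictive.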

SLIDE 6

Fully Associative Cache

[Figure: fully associative cache, block size = 1 word; any memory block (blocks 1-15 shown) can occupy any cache line, so the tag holds the memory block address (plus a byte offset); example tags shown: 0010, 0110, 1010, 1110]

SLIDE 7

Fully Associative Cache: Block = 1 Byte

[Figure: fully associative cache between CPU and main memory; each cache line stores a tag (the address) and the data]

SLIDE 8

Two-way Set Associative Cache

  • Two direct-mapped caches operate in parallel
  • Cache Index selects a “set” from the cache (each set holds 2 blocks)
  • The two tags in the set are compared in parallel
  • Data is selected based on the tag comparison result

[Figure: two-way set-associative cache; the Cache Index selects one set from two banks of (Valid, Cache Tag, Cache Data) entries; both tags are compared against the address tag, the compare results are ORed to produce Hit, and Sel0/Sel1 drive a mux that selects the hitting way’s Cache Block]
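As a software sketch of the lookup (Python standing in for hardware that compares both ways in parallel; `lookup` and the cache contents are hypothetical, not from the slides):

```python
# Sketch of a 2-way set-associative lookup: in hardware both ways of
# the selected set are checked simultaneously; here a loop models it.
def lookup(cache, index, tag):
    """cache[index] is a set: a list of (valid, tag, data) ways."""
    for valid, way_tag, data in cache[index]:
        if valid and way_tag == tag:
            return True, data        # hit: the mux selects this way's data
    return False, None               # miss in every way of the set

# Two sets, two ways each (made-up contents for illustration).
cache = [
    [(True, 0b1010, "A"), (True, 0b0110, "B")],
    [(True, 0b1111, "C"), (False, 0b0000, None)],
]
print(lookup(cache, 0, 0b0110))   # (True, 'B')
print(lookup(cache, 1, 0b0101))   # (False, None)
```

Note that an invalid way never hits, even if its stale tag happens to match.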

SLIDE 9

4-way Set Associative Cache

  • Allow a block anywhere within a set
  • Advantage:
    • Better hit rate
  • Disadvantages:
    • More tag bits
    • More hardware
    • Higher access time

[Figure: four-way set-associative cache, block size = 4 bytes; the address splits into a 22-bit tag, an 8-bit index selecting one of 256 sets (0-255), and a byte offset; four (V, Tag, Data) ways are read in parallel, four comparators check the tags, and a 4-to-1 multiplexor selects the 32-bit Data on a Hit]

SLIDE 10

Set Associative Cache - addressing

TAG | INDEX/Set # | OFFSET

  • Tag: checks whether the correct block is anywhere in the set
  • Index: selects a set in the cache
  • Offset: byte offset within the block
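The field split can be illustrated with the widths from the four-way example on the previous slide (32-bit address, 22-bit tag, 8-bit index, 2-bit byte offset for 4-byte blocks); `split_address` is an illustrative helper, not from the slides:

```python
# Split a 32-bit address into (tag, index, offset) using shifts and
# masks: offset = low 2 bits, index = next 8 bits, tag = the rest.
OFFSET_BITS, INDEX_BITS = 2, 8

def split_address(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

tag, index, offset = split_address(0x12345678)
print(hex(tag), index, offset)   # 0x48d15 158 0
```

The same three masks/shifts describe any set-associative cache; only the field widths change with block size and set count.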

SLIDE 11

Associative Caches

  • Fully associative
    • Allow a given block to go in any cache entry
    • Requires all entries to be searched at once
    • Comparator per entry (expensive)
  • n-way set associative
    • Each set contains n entries
    • Block number determines which set: (Block number) modulo (#Sets in cache)
    • Search all entries in a given set at once
    • n comparators (less expensive)
SLIDE 12

Spectrum of Associativity

  • For a cache with 8 entries

[Figure: the same 8-entry cache organized as direct mapped (1-way), 2-way, 4-way, and 8-way (fully associative)]
SLIDE 13

How Much Associativity

  • Increased associativity decreases miss rate
    • But with diminishing returns
  • Simulation of a system with 64KB D-cache, 16-word blocks, SPEC2000:
    • 1-way: 10.3%
    • 2-way: 8.6%
    • 4-way: 8.3%
    • 8-way: 8.1%
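The diminishing returns are easy to see by differencing the slide's numbers (a quick check, not part of the slides):

```python
# Miss rates from the slide's SPEC2000 simulation (64KB D-cache,
# 16-word blocks). Each doubling of associativity helps less.
miss_rate = {1: 10.3, 2: 8.6, 4: 8.3, 8: 8.1}

ways = sorted(miss_rate)
for prev, cur in zip(ways, ways[1:]):
    drop = miss_rate[prev] - miss_rate[cur]
    print(f"{prev}-way -> {cur}-way: -{drop:.1f} percentage points")
```

Going from 1-way to 2-way removes 1.7 percentage points of misses; the next two doublings together remove only 0.5 more.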
SLIDE 14

Set Associative Cache Organization

SLIDE 15

Types of Cache Misses (for 3 organizations)

  • Compulsory (cold start): location has never been accessed - first access to a block not in the cache
  • Capacity: since the cache cannot contain all the blocks of a program, some blocks will be replaced and later retrieved
  • Conflict: when too many blocks try to load into the same set, some blocks will be replaced and later retrieved
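A small simulation (illustrative Python; `direct_mapped_misses` and `fully_assoc_misses` are hypothetical helpers, not from the slides) shows the same access pattern producing conflict misses in a direct-mapped cache but only compulsory misses in a fully associative cache of the same size:

```python
# Two toy 4-block caches: direct-mapped vs. fully associative (LRU).
def direct_mapped_misses(blocks, num_lines=4):
    lines, misses = [None] * num_lines, 0
    for b in blocks:
        i = b % num_lines            # each block has exactly one slot
        if lines[i] != b:
            misses += 1
            lines[i] = b
    return misses

def fully_assoc_misses(blocks, num_lines=4):
    lru, misses = [], 0              # LRU order, most recent last
    for b in blocks:
        if b in lru:
            lru.remove(b)
        else:
            misses += 1
            if len(lru) == num_lines:
                lru.pop(0)           # evict least recently used
        lru.append(b)
    return misses

pattern = [0, 4, 0, 4, 0, 4]         # blocks 0 and 4 both map to index 0
print(direct_mapped_misses(pattern))  # 6: every access misses (conflict)
print(fully_assoc_misses(pattern))    # 2: only the two compulsory misses
```

Three of the four direct-mapped lines sit empty while blocks 0 and 4 evict each other, which is exactly the waste that associativity removes.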

SLIDE 16

Cache Design Decisions

  • For a given cache size, decide:
    • Block (line) size
    • Number of blocks (lines)
    • How the cache is organized (associativity)
    • Write policy
    • Replacement strategy
  • Increasing cache size
    • More blocks (lines)
    • More lines == higher hit rate
    • But larger caches are slower to access
    • Use as many lines as practical
SLIDE 17

Summary

  • Today: Associative caches
    • Provide more choices for block placement
    • More expensive in terms of hardware: require comparators for the tags
    • Many caches are set associative
  • Remember:
    • Direct mapped = 1-way set associative
    • Fully associative = N-way set associative (N = total number of blocks in the cache)