Cache Memories Thanks to Randal E. Bryant and David R. - PowerPoint PPT Presentation

Carnegie Mellon
Cache Memories
Thanks to Randal E. Bryant and David R. O'Hallaron from CMU
Reading Assignment: Computer Systems: A Programmer's Perspective, Third Edition, Chapter 6

Today
• Cache memory organization and operation
• Performance impact of caches
  - The memory mountain
  - Rearranging loops to improve spatial locality
  - Using blocking to improve temporal locality

Example Memory Hierarchy
Toward L0: smaller, faster, and costlier (per byte) storage devices. Toward L6: larger, slower, and cheaper (per byte) storage devices.
• L0: Registers. CPU registers hold words retrieved from the L1 cache.
• L1: L1 cache (SRAM). Holds cache lines retrieved from the L2 cache.
• L2: L2 cache (SRAM). Holds cache lines retrieved from the L3 cache.
• L3: L3 cache (SRAM). Holds cache lines retrieved from main memory.
• L4: Main memory (DRAM). Holds disk blocks retrieved from local disks.
• L5: Local secondary storage (local disks). Holds files retrieved from disks on remote servers.
• L6: Remote secondary storage (e.g., Web servers).

General Cache Concept
• Cache: smaller, faster, more expensive memory that caches a subset of the blocks
• Memory: larger, slower, cheaper memory, viewed as partitioned into "blocks"
• Data is copied in block-sized transfer units
[Figure: a four-line cache holding blocks 8, 9, 14, and 3, with blocks 4 and 10 being copied in from a memory of 16 blocks numbered 0 to 15.]

Cache Memories
• Cache memories are small, fast SRAM-based memories managed automatically in hardware
  - Hold frequently accessed blocks of main memory
• CPU looks first for data in cache
• Typical system structure:
[Figure: CPU chip containing the register file, ALU, cache memory, and bus interface; the bus interface connects over the system bus to the I/O bridge, which connects over the memory bus to main memory.]

General Cache Organization (S, E, B)
• S = 2^s sets
• E = 2^e lines per set
• B = 2^b bytes per cache block (the data)
• Each line holds a valid bit (v), a tag, and a data block of bytes 0, 1, 2, ..., B-1
• Cache size: C = S x E x B data bytes
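The (S, E, B) parameters above fix both the cache size and the widths of the address fields. The following is a minimal sketch, not taken from the slides: the function name `cache_geometry` and the parameter `m` (number of address bits) are assumptions introduced for illustration, and the tag width is taken to be whatever remains of the address after the set index and block offset bits.

```python
def cache_geometry(S, E, B, m):
    """Return (C, s, b, t) for a cache with S sets, E lines per set,
    B bytes per block, and m-bit addresses (m is an assumed name)."""
    # S and B must be powers of two so that S = 2^s and B = 2^b.
    for name, value in (("S", S), ("B", B)):
        if value <= 0 or value & (value - 1) != 0:
            raise ValueError(f"{name} must be a power of two")
    s = S.bit_length() - 1   # set index bits
    b = B.bit_length() - 1   # block offset bits
    t = m - s - b            # remaining bits form the tag
    C = S * E * B            # data bytes only; excludes valid bits and tags
    return C, s, b, t


# Parameters of the two small simulated caches in this deck (4-bit addresses).
print(cache_geometry(S=4, E=1, B=2, m=4))  # direct mapped
print(cache_geometry(S=2, E=2, B=2, m=4))  # 2-way set associative
```

Both configurations hold the same 8 data bytes; they differ only in how the address bits are divided between tag and set index.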

Cache Read
Address of word: t bits (tag) | s bits (set index) | b bits (block offset)
• Locate set
• Check if any line in set has matching tag
• Yes + line valid: hit
• Locate data starting at offset (the data begins at this offset within the B = 2^b byte block)
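The address split used by a cache read can be sketched with shifts and masks. This is an illustrative sketch only; `split_address` is a hypothetical helper name, and the field order (tag in the high bits, block offset in the low bits) follows the address layout shown on the slide.

```python
def split_address(addr, s, b):
    """Split an address into (tag, set_index, block_offset).

    s is the number of set index bits and b the number of block offset bits.
    """
    block_offset = addr & ((1 << b) - 1)       # low b bits
    set_index = (addr >> b) & ((1 << s) - 1)   # next s bits
    tag = addr >> (s + b)                      # everything above
    return tag, set_index, block_offset


# 4-bit addresses with s = 2 and b = 1, as in the direct-mapped simulation.
print(split_address(0b0111, s=2, b=1))
print(split_address(0b1000, s=2, b=1))
```

With these widths, address 7 (0111) lands in set 3 with tag 0, while address 8 (1000) lands in set 0 with tag 1.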

Example: Direct Mapped Cache (E = 1)
Direct mapped: one line per set
Assume: cache block size 8 bytes
Address of int: t bits (tag) | 0…01 (set index) | 100 (block offset)
• Find set: the set index bits 0…01 select one of the S = 2^s sets
• Valid? + tag match: assume yes = hit
• Block offset: 100 in binary is 4, so the int (4 bytes) occupies bytes 4 to 7 of the 8-byte block
• If tag doesn't match: old line is evicted and replaced

Direct-Mapped Cache Simulation
M = 16 bytes (4-bit addresses), B = 2 bytes/block, S = 4 sets, E = 1 block/set
Address bits: t = 1 (tag), s = 2 (set index), b = 1 (block offset)
Address trace (reads, one byte per read):
  0 [0000₂]  miss
  1 [0001₂]  hit
  7 [0111₂]  miss
  8 [1000₂]  miss
  0 [0000₂]  miss
Cache contents after the trace:
  Set    v  Tag  Block
  Set 0  1  0    M[0-1]
  Set 1  0  ?    ?
  Set 2  0  ?    ?
  Set 3  1  0    M[6-7]
(During the trace, set 0 holds M[0-1] with tag 0, then M[8-9] with tag 1, then M[0-1] again.)
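The trace above can be replayed with a few lines of code. This is a minimal sketch under the slide's parameters (s = 2, b = 1); the function name `simulate_direct_mapped` is hypothetical, and the sketch tracks only valid bits and tags, not the cached data.

```python
def simulate_direct_mapped(trace, s, b):
    """Replay byte reads on a direct-mapped cache (E = 1).

    Returns a list of "hit"/"miss" strings, one per address in trace.
    """
    num_sets = 1 << s
    valid = [False] * num_sets   # one valid bit per set
    tags = [0] * num_sets        # one tag per set
    results = []
    for addr in trace:
        set_index = (addr >> b) & (num_sets - 1)
        tag = addr >> (s + b)
        if valid[set_index] and tags[set_index] == tag:
            results.append("hit")
        else:
            # Miss: the old line (if any) is evicted and replaced.
            results.append("miss")
            valid[set_index] = True
            tags[set_index] = tag
    return results


print(simulate_direct_mapped([0, 1, 7, 8, 0], s=2, b=1))
# prints ['miss', 'hit', 'miss', 'miss', 'miss']
```

The final miss on address 0 is a conflict: addresses 0 and 8 map to set 0 with different tags, so each evicts the other even though sets 1 and 2 are empty.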

E-way Set Associative Cache (Here: E = 2)
E = 2: two lines per set
Assume: cache block size 8 bytes
Address of short int: t bits (tag) | 0…01 (set index) | 100 (block offset)
• Find set: the set index bits 0…01 select one set
• Compare both lines in the set: valid? + tag match: yes = hit
• Block offset: 100 in binary is 4, so the short int (2 bytes) occupies bytes 4 to 5 of the 8-byte block
No match:
• One line in set is selected for eviction and replacement
• Replacement policies: random, least recently used (LRU), …

2-Way Set Associative Cache Simulation
M = 16 byte addresses, B = 2 bytes/block, S = 2 sets, E = 2 blocks/set
Address bits: t = 2 (tag), s = 1 (set index), b = 1 (block offset)
Address trace (reads, one byte per read):
  0 [0000₂]  miss
  1 [0001₂]  hit
  7 [0111₂]  miss
  8 [1000₂]  miss
  0 [0000₂]  hit
Cache contents after the trace:
  Set    v  Tag  Block
  Set 0  1  00   M[0-1]
         1  10   M[8-9]
  Set 1  1  01   M[6-7]
         0  ?    ?
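The same trace can be replayed on an E-way set associative cache. The sketch below is illustrative only: `simulate_set_associative` is a hypothetical name, and LRU replacement is an assumption chosen from the policies listed earlier (this particular trace never fills a set beyond two lines, so the policy does not affect its outcome).

```python
from collections import OrderedDict


def simulate_set_associative(trace, s, b, E):
    """Replay byte reads on an E-way set associative cache.

    Assumes LRU replacement. Returns "hit"/"miss" per address.
    """
    # Each set maps tag -> True, ordered from least to most recently used.
    sets = [OrderedDict() for _ in range(1 << s)]
    results = []
    for addr in trace:
        set_index = (addr >> b) & ((1 << s) - 1)
        tag = addr >> (s + b)
        lines = sets[set_index]
        if tag in lines:
            lines.move_to_end(tag)         # mark as most recently used
            results.append("hit")
        else:
            if len(lines) == E:
                lines.popitem(last=False)  # evict the least recently used line
            lines[tag] = True
            results.append("miss")
    return results


print(simulate_set_associative([0, 1, 7, 8, 0], s=1, b=1, E=2))
# prints ['miss', 'hit', 'miss', 'miss', 'hit']
```

Compared with the direct-mapped run, the last read of address 0 now hits: blocks M[0-1] and M[8-9] share set 0 but can occupy its two lines at the same time.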
