IC220 Caching 2: Memory Hierarchy (more from Chapter 5 - specifically 5.7, 5.8)



SLIDE 1

IC220 Caching 2: Memory Hierarchy (more from Chapter 5 - specifically 5.7, 5.8)

SLIDE 2

Cache design overview

ANY cache can be viewed as k-way set associative (N = cache size in bytes, B = block size in bytes). What are the pros and cons of each?

  • Fully associative: k = N/B (one set holding every block)
  • 4-way set associative: k = 4
  • Direct-mapped: k = 1
SLIDE 3

Improving Cache Performance

Remember key metrics: Miss Rate, Hit Time, Miss Penalty.

What happens if we:

  • Increase the cache size (N)?
  • Increase the block size (keeping N the same)?
  • Increase associativity (keeping N the same)?
SLIDE 4

Cache performance key tradeoff

Inherent conflict:

HIT TIME vs. MISS RATE

SLIDE 5

More hierarchy – L2 cache?

  • Problem: CPUs get faster, DRAM gets bigger

– Must keep hit time small (1 or 2 cycles)
– But then the cache must be small too (fast SRAM is expensive)
– So the miss rate gets higher...

  • Solution: Add another level of cache:

– try to optimize the ____________ on the 1st-level cache
– try to optimize the ____________ on the 2nd-level cache

SLIDE 6

Memory Hierarchy

SLIDE 7

Questions

  • Will the miss rate of an L2 cache be higher or lower than for the L1 cache?
  • Claim: “The register file is really the lowest-level cache”

What are the reasons for and against this statement?

SLIDE 8

Split Caches

  • Instructions and data have different properties

– May benefit from different cache organizations (block size, associativity, …)

[Diagram: CPU → split L1 ICache and DCache → L2 Cache → Main memory; L3, L4, …?]

SLIDE 9

What does an address refer to?

The old way:

  • Address refers to a specific byte in main memory (DRAM).
  • This is called a physical address.

[Diagram: CPU → Cache → Memory, all addressed by the physical address]

Problems with this:

SLIDE 10

Virtual memory: Main idea

The CPU works with (fake) virtual addresses; the operating system translates them to physical addresses.

[Diagram: CPU issues a virtual address; after OS translation, the cache and memory see the physical address]

Advantages:

New challenge: OS Translation

SLIDE 11

Pages and virtual address translation

[Diagram: cache ↔ main memory ↔ disk]

  • Virtual AND physical addresses are divided into blocks called pages.
  • Typical page size is 4KiB (means 12 bits for the page offset).

SLIDE 12

Page Tables

  • The translation from virtual to physical pages is stored in a page table.
SLIDE 13

Pages: virtual memory blocks

  • Page faults: the data is not in memory; retrieve it from disk

– huge miss penalty (slow disk), thus:

  • pages should be fairly large
  • Replacement strategy:

– can handle the faults in software instead of hardware

  • Writeback or write-through?
SLIDE 14

Address Translation

Terminology:

  • Cache block ↔ page
  • Cache miss ↔ page fault
  • Cache tag ↔ virtual page number
  • Byte offset ↔ page offset
SLIDE 15

Making Address Translation Fast

  • A cache for address translations: translation lookaside buffer (TLB)

Typical values:

– 16–512 PTEs (page table entries)
– miss rate: 0.01%–1%
– miss penalty: 10–100 cycles

SLIDE 16

Virtual Memory Take-Aways

  • CPU/programs deal with virtual addresses (virtual page number + page offset).
  • Translated to physical addresses (physical page # + page offset) between CPU and cache.
  • Memory is divided into blocks called pages, commonly 4KiB (therefore 12 bits for the page offset).
  • Page tables, managed by the operating system for each process, store the virtual->physical page number mapping, as well as that process’s permissions (read/write).
  • TLB is a special CPU cache for page table lookups.
  • Physical addresses can reside in DRAM (typical), or be stored on disk (making RAM “look” larger to the CPU), or can even refer to other devices (memory-mapped I/O).
SLIDE 17

Modern Systems

SLIDE 18

Program Design: 2D array layout

  • Consider this C declaration:

    int A[4][3] = { {10, 11, 12},
                    {20, 21, 22},
                    {30, 31, 32},
                    {40, 41, 42} };

  • How is this array stored in memory?
SLIDE 19

Program Design for Caches – Example 1

  • Option #1

    for (j = 0; j < 20; j++)
        for (i = 0; i < 200; i++)
            x[i][j] = x[i][j] + 1;

  • Option #2

    for (i = 0; i < 200; i++)
        for (j = 0; j < 20; j++)
            x[i][j] = x[i][j] + 1;

SLIDE 20

Program Design for Caches – Example 2

  • Why might this code be problematic?

    int A[1024][1024];
    int B[1024][1024];

    for (i = 0; i < 1024; i++)
        for (j = 0; j < 1024; j++)
            A[i][j] += B[i][j];

  • How to fix it?
SLIDE 21

Concluding Remarks

  • Fast memories are small, large memories are slow

– We really want fast, large memories
– Caching gives this illusion

  • Principle of locality

– Programs use a small part of their memory space frequently

  • Memory hierarchy

– L1 cache ↔ L2 cache ↔ … ↔ DRAM memory ↔ disk

  • Memory system design is critical for multiprocessors