

SLIDE 1

The Memory Hierarchy

10/25/16

SLIDE 2

Transition

  • First half of course: hardware focus
      • How the hardware is constructed
      • How the hardware works
      • How to interact with hardware
  • Second half: performance and software systems
      • Memory performance
      • Operating systems
      • Standard libraries
      • Parallel programming
SLIDE 3

Making programs efficient

  • Algorithms matter
      • CS35
      • CS41
  • Hardware matters
      • Engineering
  • Using the hardware properly matters
      • CPU vs GPU
      • Parallel programming
      • Memory hierarchy
SLIDE 4

Memory so far: array abstraction

  • Memory is a big array of bytes.
  • Every address is an index into this array.

This is the level of abstraction at which an assembly programmer thinks. C programmers can think even more abstractly with variables.
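A minimal C sketch of this view (not from the slides): any object's address can be treated as an index into the big byte array by inspecting it through an unsigned char pointer.

    #include <stdio.h>

    int main(void) {
        int x = 42;
        /* &x is effectively an index into the big array of bytes;
           an unsigned char pointer lets us read one byte at a time. */
        unsigned char *bytes = (unsigned char *)&x;
        int i;
        for (i = 0; i < (int)sizeof(x); i++) {
            printf("byte %d of x: 0x%02x\n", i, bytes[i]);
        }
        return 0;
    }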

SLIDE 5

Memory Technologies

  • Latches (registers, cache): volatile, $$$
  • Capacitors (DRAM): volatile, $$
  • Magnetic (hard drives): non-volatile, $
  • Flash (SSDs): non-volatile, $$

Volatile: loses data without power. Non-volatile: maintains data when the computer is turned off.

SLIDE 6

The Memory Hierarchy

From fastest (and most expensive per byte) to slowest (and cheapest):

  • Registers: 1 cycle to access
  • Cache(s) (SRAM): a few cycles to access
  • Main memory (DRAM): ~100 cycles to access
  • Local secondary storage (disk, SSD): ~100,000,000 cycles to access

Moving down the hierarchy: slower to access, but cheaper per byte.

SLIDE 7

Key idea this week: caching

  • Store everything in cheap, slow storage.
  • Store a subset in fast, expensive storage.
  • Try to guess the most useful subset to cache.
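Why a small fast subset pays off: a back-of-the-envelope average access time. The cycle counts and hit rate below are made-up illustrative numbers, not measurements.

    #include <stdio.h>

    int main(void) {
        /* Hypothetical numbers for illustration only. */
        double hit_time = 4.0;        /* cycles when data is in the cache  */
        double miss_penalty = 200.0;  /* extra cycles to fetch from memory */
        double hit_rate = 0.97;       /* fraction of accesses that hit     */

        /* average = hit time + miss rate * miss penalty */
        double avg = hit_time + (1.0 - hit_rate) * miss_penalty;
        printf("average access time: %.1f cycles\n", avg);  /* 10.0 */
        return 0;
    }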

SLIDE 8

A note on terminology

  • Caching: the general principle of holding a small subset of your data in fast-access storage.
  • The cache: SRAM memory inside the CPU.
SLIDE 9

Connecting CPU and Memory

  • Components are connected by a bus:
      • A bus is a bundle of parallel wires that carry address, data, and control signals.
      • Buses are typically shared by multiple devices.

[Diagram: CPU chip (register file, ALU, cache, bus interface) connected over the system bus to an I/O bridge, which connects over the memory bus to main memory.]

SLIDE 10

How a Memory Read Works

Load operation: movl (A), %eax

(1) CPU places address A on the memory bus.

[Diagram: the CPU chip (register file with %eax, ALU, cache, bus interface) sends address A through the I/O bridge to main memory, where address A holds value x.]

SLIDE 11

How a Memory Read Works

Load operation: movl (A), %eax

(2) Main memory reads address A from the memory bus, fetches the data x stored at that address, and puts it on the bus.

[Diagram: main memory places value x on the bus back toward the CPU chip.]

SLIDE 12

How a Memory Read Works

Load operation: movl (A), %eax

(3) CPU reads x from the bus and copies it into register %eax. A copy also goes into the on-chip cache memory.

[Diagram: value x arrives at both register %eax and the on-chip cache.]

SLIDE 13

Write

Store operation: movl %eax, (A)

  • 1. CPU writes address A to the bus; memory reads it.
  • 2. CPU writes value y to the bus; memory reads it.
  • 3. Memory stores the value y at address A.

[Diagram: address A and value y travel from the CPU chip over the bus to main memory.]
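The load and store in slides 10-13 correspond to ordinary pointer operations in C. A minimal sketch (the variable names are made up for illustration):

    #include <stdio.h>

    int main(void) {
        int value = 7;
        int *A = &value;  /* A plays the role of the address on the slides */

        int x = *A;       /* load:  roughly movl (A), %eax */
        *A = x + 1;       /* store: roughly movl %eax, (A) */

        printf("value = %d\n", value);  /* prints 8 */
        return 0;
    }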

SLIDE 14

I/O Bus: connects Devices & Memory

[Diagram: the I/O bus connects the disk controller (disk), graphics controller (monitor), and USB controller (mouse, keyboard) to the I/O bridge, with expansion slots for other devices such as network controllers.]

OS moves data between main memory & devices.

SLIDE 15

Device Driver: OS device-specific code

[Diagram: the same CPU, memory, and I/O-bus layout as the previous slide.]

OS driver code running on the CPU makes read & write requests to the device controller via the I/O bridge.

SLIDE 16

Abstraction Goal

  • Reality: There is no one type of memory to rule them all!
  • Abstraction: hide the complex/undesirable details of reality.
  • Illusion: We have the speed of SRAM, with the capacity of disk, at reasonable cost.

SLIDE 17

What’s Inside A Disk Drive?

[Image from Seagate Technology: spindle, arm, actuator, platters, R/W head, controller electronics (includes processor & memory), and bus connector.]

Data is encoded as points of magnetism on the platter surfaces. The device driver (part of the OS code) interacts with the controller to read and write the disk.

SLIDE 18

Reading and Writing to Disk

The disk surface spins at a fixed rotational rate (~7200 rotations/min). The disk arm sweeps across the surface to position the read/write head over a specific track. Data blocks are located in some sector of some track on some surface.

  • 1. Disk arm moves to the correct track (seek time)
  • 2. Wait for the sector to spin under the R/W head (rotational latency)
  • 3. As the sector spins under the head, data are read or written (transfer time)
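These delays give a rough access-time estimate. A sketch assuming the 7200 RPM figure above and a made-up 9 ms average seek time (transfer time is typically small by comparison):

    #include <stdio.h>

    int main(void) {
        double rpm = 7200.0;       /* rotational rate from the slide        */
        double avg_seek_ms = 9.0;  /* assumed average seek; varies by drive */

        double rotation_ms = 60000.0 / rpm;     /* one full turn: ~8.33 ms      */
        double avg_rot_ms = rotation_ms / 2.0;  /* on average, wait half a turn */

        printf("avg rotational latency: %.2f ms\n", avg_rot_ms);
        printf("avg access (seek + rotation): %.2f ms\n",
               avg_seek_ms + avg_rot_ms);
        return 0;
    }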

SLIDE 19

Cache Basics

  • CPU real estate dedicated to cache
  • Usually two levels:
      • L1: smallest, fastest
      • L2: larger, slower
  • Same rules apply:
      • L1 is a subset of L2

[Diagram: CPU chip containing registers, ALU, L1 cache, and L2 cache, connected over the memory bus to main memory.]

SLIDE 20

Cache Basics

  • CPU real estate dedicated to cache
  • Usually two levels:
      • L1: smallest, fastest
      • L2: larger, slower
  • We'll assume one cache (same principles)

[Diagram: CPU chip with registers, ALU, and a single cache, connected over the memory bus to main memory. The cache holds a subset of main memory (not to scale; memory is much bigger!).]

SLIDE 21

Cache Basics: Read from memory

  • In parallel:
      • Issue read to memory
      • Check cache

[Diagram: the CPU checks "In cache?" while simultaneously requesting the data from main memory over the memory bus.]

SLIDE 22

Cache Basics: Read from memory

  • In parallel:
      • Issue read to memory
      • Check cache
  • Data in cache (hit):
      • Good, send to register
      • Cancel/ignore memory

[Diagram: on a hit, the value goes straight from the cache to a register.]

SLIDE 23

Cache Basics: Read from memory

  • In parallel:
      • Issue read to memory
      • Check cache
  • Data in cache (hit):
      • Good, send to register
      • Cancel/ignore memory
  • Data not in cache (miss), as sketched in code below:
      • 1. Load cache from memory (~200 cycles; might need to evict data)
      • 2. Send to register

[Diagram: on a miss, data is first loaded from main memory into the cache, then forwarded to a register.]
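A minimal sketch of this hit/miss logic in C. The slides don't specify a cache organization; the direct-mapped layout, the sizes and names, and the array standing in for main memory are all assumptions for illustration.

    #include <stdbool.h>
    #include <stdio.h>

    #define NLINES 64       /* assumed: tiny direct-mapped cache */
    #define MEM_WORDS 4096  /* stand-in for main memory          */

    static int memory[MEM_WORDS];  /* the slow path (~200 cycles in hardware) */

    struct line {
        bool valid;    /* does this line hold real data?    */
        unsigned tag;  /* which memory block is cached here */
        int data;
    };
    static struct line cache[NLINES];

    int cached_read(unsigned addr) {
        unsigned index = addr % NLINES;  /* which cache line to check */
        unsigned tag = addr / NLINES;    /* identifies the block      */

        if (cache[index].valid && cache[index].tag == tag) {
            return cache[index].data;    /* hit: fast path */
        }
        /* Miss: fetch from memory, evicting whatever occupied this line. */
        cache[index].valid = true;
        cache[index].tag = tag;
        cache[index].data = memory[addr];
        return cache[index].data;
    }

    int main(void) {
        memory[100] = 42;
        printf("%d (miss)\n", cached_read(100));  /* loads the line from memory */
        printf("%d (hit)\n", cached_read(100));   /* served from the cache      */
        return 0;
    }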

SLIDE 24

Cache Basics: Write to memory

  • Assume data already cached
      • Otherwise, bring it in like a read
  • 1. Update cached copy.
  • 2. Update memory?

[Diagram: the CPU writes data into the cache; when does it also go to main memory?]

SLIDE 25

When should we copy the written data from cache to memory? Why?

  • A. Immediately update the data in memory when we update the cache.
  • B. Update the data in memory when we evict the data from the cache.
  • C. Update the data in memory if the data is needed elsewhere (e.g., another core).
  • D. Update the data in memory at some other time. (When?)

SLIDE 26

When should we copy the written data from cache to memory? Why?

  • A. Immediately update the data in memory when we update the cache. ("Write-through")
  • B. Update the data in memory when we evict the data from the cache. ("Write-back")
  • C. Update the data in memory if the data is needed elsewhere (e.g., another core).
  • D. Update the data in memory at some other time. (When?)

SLIDE 27

Cache Basics: Write to memory

  • Both options (write-through, write-back) are viable
  • Write-through: write to memory immediately
      • simpler, but accesses memory more often (slower)
  • Write-back: only write to memory on eviction
      • more complex (cache inconsistent with memory)
      • potentially reduces memory accesses (faster)

Faster sells better: servers, desktops, and laptops typically use write-back.
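The two policies are easy to contrast in code. A minimal sketch using a one-line "cache" and an array standing in for main memory (all names and sizes are made up; a real cache tracks a dirty bit per line much as write_back does here):

    #include <stdbool.h>

    static int memory[4096];  /* stand-in for main memory */

    struct line {
        bool valid, dirty;
        unsigned addr;
        int data;
    };
    static struct line line0;  /* a one-line "cache" for illustration */

    /* Write-through: update the cache AND memory on every store.
       Simple, and memory always agrees with the cache, but every
       store pays the cost of a memory access. */
    void write_through(unsigned addr, int value) {
        line0 = (struct line){ true, false, addr, value };
        memory[addr] = value;
    }

    /* Write-back: update only the cache; memory is written when the
       line is evicted. The dirty bit records that memory is stale. */
    void write_back(unsigned addr, int value) {
        if (line0.valid && line0.dirty && line0.addr != addr) {
            memory[line0.addr] = line0.data;  /* flush the evicted dirty data */
        }
        line0 = (struct line){ true, true, addr, value };
    }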

SLIDE 28

Discussion Question

What data should we keep in the cache? What principles can we use to make a decent guess?

SLIDE 29

Problem: Prediction

  • We can’t know the future…
  • So… are we out of luck?

What might we look at to help us decide?

  • The past is often a pretty good predictor…
SLIDE 30

Analogy: two types of Netflix users

[Images: the recent viewing histories of two different Netflix users.]

What should be next in each user's queue?

SLIDE 31

Critical Concept: Locality

  • Locality: we tend to repeatedly access recently accessed items, or those that are nearby.
  • Temporal locality: An item accessed recently is likely to be accessed again soon.
  • Spatial locality: We’re likely to access an item that’s nearby others we just accessed.

SLIDE 32

In the following code, how many examples are there of temporal / spatial locality? Where are they?

    #include <stdio.h>

    void print_array(int *array, int num) {
        int i;
        for (i = 0; i < num; i++) {
            printf("%d : %d\n", i, array[i]);
        }
    }

  • A. 1 temporal, 1 spatial
  • B. 1 temporal, 2 spatial
  • C. 2 temporal, 1 spatial
  • D. 2 temporal, 2 spatial
  • E. Some other number
SLIDE 33

Example

Temporal locality?

  • array, num, and i are used over and over again in each iteration

Spatial locality?

  • array bucket accesses
  • program instructions

Programs with loops tend to have a lot of locality, and most programs have loops: it’s hard to write a long-running program without a loop.

    #include <stdio.h>

    void print_array(int *array, int num) {
        int i;
        for (i = 0; i < num; i++) {
            printf("%d : %d\n", i, array[i]);
        }
    }

SLIDE 34

Use Locality to Speed-up Memory Access

Caching key idea: keep a copy of “likely to be accessed soon” data in higher levels of the memory hierarchy to make future accesses to it faster:

  • recently accessed data (temporal locality)
  • data nearby recently accessed data (spatial locality)

If a program has a high degree of locality, its next data access is likely to be in the cache; with little or no locality, caching won’t help. Luckily, most programs have a high degree of locality. The loop-order sketch below shows spatial locality at work.
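Both functions below compute the same sum, but the first walks memory in address order (good spatial locality) while the second jumps N ints between accesses (poor spatial locality), so the first is typically several times faster on real hardware. The array size is arbitrary.

    #define N 1024
    static int grid[N][N];

    /* Good spatial locality: consecutive accesses are adjacent in
       memory, so each cache block fetched is fully used. */
    long sum_row_major(void) {
        long sum = 0;
        int i, j;
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                sum += grid[i][j];
        return sum;
    }

    /* Poor spatial locality: consecutive accesses are N ints apart,
       so most of each fetched cache block goes unused before eviction. */
    long sum_col_major(void) {
        long sum = 0;
        int i, j;
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                sum += grid[j][i];
        return sum;
    }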

SLIDE 35

Discussion Question

What data should we evict from the cache? What principles can we use to make a decent guess?