Memory Hierarchy (Performance Optimization)

Computer Systems and Networks
ECPE 170 – Jeff Shafer – University of the Pacific
Memory Hierarchy (Performance Optimization)

2 Lab Schedule
Activities
- This Week
  - Lab 6 – Perf Optimization
  - Lab 7 – Memory Hierarchy
- Next Tuesday
  - Intro to Python
- Next Thursday
  - ** Midterm Exam **
Assignments Due
- Lab 6
  - Due by Mar 6th 5:00am
- Lab 7
  - Due by Mar 20th 5:00am
Computer Systems and Networks Spring 2017

3 Your Personal Repository
2017_spring_ecpe170\
  lab02
  lab03
  lab04
  lab05
  lab06
  lab07
  lab08
  lab09
  lab10
  lab11
  lab12
  .hg
Hidden Folder! (name starts with period)
Used by Mercurial to track all repository history (files, changelogs, …)

4 Mercurial .hg Folder
- The existence of a .hg hidden folder is what turns a regular directory (and its subfolders) into a special Mercurial repository
- When you add/commit files, Mercurial looks for this .hg folder in the current directory or its parents
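The "current directory or its parents" search can be sketched in a few lines of Python. This is only an illustration of the idea, not Mercurial's actual code; the helper name find_repo_root is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains a .hg folder.

    Hypothetical helper for illustration; not part of Mercurial.
    """
    current = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in [current, *current.parents]:
        if (candidate / ".hg").is_dir():
            return candidate
    return None  # no .hg folder found: not inside a repository
```

Called from inside lab07, such a search would stop at 2017_spring_ecpe170, because that is the first directory on the way up that holds a .hg folder.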

5 Memory Hierarchy

6 Memory Hierarchy
Goal as system designers: Fast Performance and Low Cost
Tradeoff: Faster memory is more expensive than slower memory

7 Memory Hierarchy
- To provide the best performance at the lowest cost, memory is organized in a hierarchical fashion
  - Small, fast storage elements are kept in the CPU
  - Larger, slower main memory is outside the CPU (and accessed by a data bus)
  - Largest, slowest, permanent storage (disks, etc…) is even further from the CPU

8 To date, you’ve only cared about two levels: Main memory and Disks

9 Memory Hierarchy – Registers and Cache

10 Let’s examine the fastest memory available

11 Memory Hierarchy – Registers
- Storage locations available on the processor itself
- Manually managed by the assembly programmer or compiler
- You’ll become intimately familiar with registers when we do assembly programming

12 Memory Hierarchy – Caches
- What is a cache?
  - Speed up memory accesses by storing recently used data closer to the CPU
  - Closer than main memory – on the CPU itself!
  - Although cache is much smaller than main memory, its access time is much faster!
  - Cache is automatically managed by the hardware memory system
- Clever programmers can help the hardware use the cache more effectively

13 Memory Hierarchy – Caches
- How does the cache work?
  - Not going to discuss how caches work internally
    - If you want to learn that, take ECPE 173!
  - This class focuses on what the programmer needs to know about the underlying system

14 Memory Hierarchy – Access
- CPU wishes to read data (needed for an instruction)
  1. Does the instruction say it is in a register or memory?
     - If register, go get it!
  2. If in memory, send request to nearest memory (the cache)
  3. If not in cache, send request to main memory
  4. If not in main memory, send request to the disk
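Steps 2 through 4 can be sketched as a chain of lookups. This is a toy model under stated assumptions: each level is a plain Python dict keyed by address, the function name read is hypothetical, and copying data from disk into main memory on the way up is an assumption that the slides do not spell out.

```python
def read(address, cache, main_memory, disk):
    """Return (value, level), where level names the memory that served the read."""
    if address in cache:                  # step 2: ask the nearest memory first
        return cache[address], "cache"
    if address in main_memory:            # step 3: cache miss, try main memory
        value, level = main_memory[address], "main memory"
    else:                                 # step 4: not in main memory, go to disk
        value, level = disk[address], "disk"
        main_memory[address] = value      # assumption: data is copied up a level
    cache[address] = value                # keep a copy in the cache for next time
    return value, level


cache, main_memory, disk = {}, {0x10: "a"}, {0x10: "a", 0x20: "b"}
print(read(0x10, cache, main_memory, disk))  # first read is served by main memory
print(read(0x10, cache, main_memory, disk))  # repeat read is served by the cache
print(read(0x20, cache, main_memory, disk))  # never loaded before: served by disk
```

Registers are left out of the sketch because they are named directly by the instruction rather than searched.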

15 (Cache) Hits versus Misses
- Hit: When data is found at a given memory level (e.g. a cache)
- Miss: When data is not found at a given memory level (e.g. a cache)
You want to write programs that produce a lot of hits, not misses!

16 Memory Hierarchy – Cache
- Once the data is located and delivered to the CPU, it will also be saved into cache memory for future access
  - We often save more than just the specific byte(s) requested
  - Typical: Neighboring 64 bytes (called the cache line size)
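To make "neighboring 64 bytes" concrete, the sketch below computes which byte addresses travel together with a requested address. It assumes lines are aligned to multiples of the line size; the function name cache_line_range is hypothetical.

```python
LINE_SIZE = 64  # bytes; the typical value quoted above, real hardware may differ


def cache_line_range(address: int, line_size: int = LINE_SIZE) -> range:
    """Return the byte addresses that share a cache line with `address`."""
    base = address - (address % line_size)   # round down to a line boundary
    return range(base, base + line_size)


line = cache_line_range(1000)
print(line.start, line.stop - 1)  # first and last byte address of that line
```

Any later read that falls inside the same range can be served from the cache without going back to main memory.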

17 Cache Locality
Principle of Locality: Once a data element is accessed, it is likely that a nearby data element (or even the same element) will be needed soon

18 Cache Locality
- Temporal locality – Recently-accessed data elements tend to be accessed again
  - Imagine a loop counter …
- Spatial locality – Accesses tend to cluster in memory
  - Imagine scanning through all elements in an array, or running several sequential instructions in a program
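Both kinds of locality show up clearly in a small simulation. The sketch below is a toy model, not a description of real hardware: it assumes a tiny cache of 4 lines with least-recently-used (LRU) replacement, a policy the slides do not specify, and the name hit_rate is hypothetical.

```python
from collections import OrderedDict


def hit_rate(addresses, line_size=64, num_lines=4):
    """Fraction of accesses that hit in a tiny LRU cache holding `num_lines` lines."""
    cache = OrderedDict()             # keys are line numbers, oldest first
    hits = 0
    total = 0
    for address in addresses:
        total += 1
        line = address // line_size
        if line in cache:
            hits += 1
            cache.move_to_end(line)   # mark as most recently used
        else:
            cache[line] = True        # miss: load the whole line
            if len(cache) > num_lines:
                cache.popitem(last=False)   # evict the least recently used line
    return hits / total if total else 0.0


temporal = [0x100] * 1000                  # a loop counter read over and over
spatial = range(0, 8000, 8)                # scanning 1000 8-byte array elements
scattered = range(0, 1000 * 4096, 4096)    # 1000 reads, each 4096 bytes apart

for name, trace in [("temporal", temporal), ("spatial", spatial), ("scattered", scattered)]:
    print(name, hit_rate(trace))
```

The first two traces hit most of the time even with only four lines of cache, while the scattered trace never reuses a line and never hits.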

19 Programs with good locality run faster than programs with poor locality

20 A program that randomly accesses memory addresses (but never repeats) will gain no benefit from a cache
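A rough way to check this claim is to generate a random trace and count hits. The sketch makes two assumptions: "never repeats" is taken at cache-line granularity (no two accesses land in the same 64-byte line), and the cache is idealized so that it never evicts anything. The name count_hits is hypothetical.

```python
import random


def count_hits(addresses, line_size=64):
    """Count hits in an idealized cache that never evicts anything."""
    cached_lines = set()
    hits = 0
    for address in addresses:
        line = address // line_size
        if line in cached_lines:
            hits += 1
        else:
            cached_lines.add(line)    # miss: the whole line is loaded
    return hits


rng = random.Random(170)                        # fixed seed so runs are repeatable
lines = rng.sample(range(1_000_000), 10_000)    # 10,000 distinct line numbers
trace = [line * 64 + rng.randrange(64) for line in lines]
print(count_hits(trace))   # every access lands in a line never seen before
```

Even a cache of unlimited size cannot help here, because nothing it loads is ever needed a second time.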

21 Recap – Cache
- Which is bigger – a cache or main memory?
  - Main memory
- Which is faster to access – the cache or main memory?
  - Cache – It is smaller (which is faster to search) and closer to the processor (signals take less time to propagate to/from the cache)
- Why do we add a cache between the processor and main memory?
  - Performance – hopefully frequently-accessed data will be in the faster cache (so we don’t have to access slower main memory)

22 Recap – Cache
- Which is manually controlled – a cache or a register?
  - Registers are manually controlled by the assembly language program (or the compiler)
  - Cache is automatically controlled by hardware
- Suppose a program wishes to read from a particular memory address. Which is searched first – the cache or main memory?
  - Search the cache first – otherwise, there’s no performance gain

23 Recap – Cache
- Suppose there is a cache miss (data not found) during a 1 byte memory read operation. How much data is loaded into the cache?
  - Trick question – we always load data into the cache 1 “line” at a time.
  - Cache line size varies – 64 bytes on a Core i7 processor

24 Cache Q&A
- Imagine a computer system has only main memory (no cache is present). Is temporal or spatial locality important for performance when repeatedly accessing an array with 8-byte elements?
  - No. Locality is not important in a system without caching, because every memory access will take the same length of time.

25 Cache Q&A
- Imagine a memory system has main memory and a 1-level cache, but each cache line is only 8 bytes in size. Assume the cache is much smaller than main memory. Is temporal or spatial locality important for performance here when repeatedly accessing an array with 8-byte elements?
  - Only 1 array element is loaded at a time in this cache
  - Temporal locality is important (access will be faster if the same element is accessed again)
  - Spatial locality is not important (neighboring elements are not loaded into the cache when an earlier element is accessed)

26 Cache Q&A
- Imagine a memory system has main memory and a 1-level cache, and the cache line size is 64 bytes. Assume the cache is much smaller than main memory. Is temporal or spatial locality important for performance here when repeatedly accessing an array with 8-byte elements?
  - 8 elements (64B) are loaded into the cache at a time
  - Both forms of locality are useful here!
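The difference between 8-byte and 64-byte lines can be seen by scanning a small array twice and counting misses per pass. This sketch assumes the array starts on a line boundary and is small enough to stay in the cache between passes; the name misses_per_pass and the 64-element array are illustrative choices, not values from the slides.

```python
def misses_per_pass(num_elements, element_size, line_size, passes=2):
    """Return a list of miss counts, one per sequential pass over the array."""
    cached_lines = set()       # assumption: the whole array fits in the cache
    results = []
    for _ in range(passes):
        misses = 0
        for index in range(num_elements):
            line = (index * element_size) // line_size
            if line not in cached_lines:
                misses += 1
                cached_lines.add(line)
        results.append(misses)
    return results


print(misses_per_pass(64, 8, line_size=8))    # 8-byte lines: one element per line
print(misses_per_pass(64, 8, line_size=64))   # 64-byte lines: eight elements per line
```

With 8-byte lines the first pass misses on every element and only the repeat pass benefits (temporal locality). With 64-byte lines the first pass already misses on just one element in eight (spatial locality), and the repeat pass benefits as well.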

27 Cache Q&A
- Imagine your program accesses a 100,000 element array (of 8 byte elements) once from beginning to end with stride 1. The memory system has a 1-level cache with a line size of 64 bytes. No prefetching is implemented. How many cache misses would be expected in this system?
  - 12500 cache misses. The array has 100,000 elements. Upon a cache miss, 8 adjacent and aligned elements (one of which is the miss) are moved into the cache. Future accesses to those remaining elements should hit in the cache. Thus, only 1/8 of the 100,000 element accesses result in a miss
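The arithmetic behind this answer can be checked two ways: by dividing directly, and by walking the array and counting each new line touched. The walk assumes the array starts on a cache-line boundary.

```python
import math

NUM_ELEMENTS = 100_000
ELEMENT_SIZE = 8     # bytes
LINE_SIZE = 64       # bytes

elements_per_line = LINE_SIZE // ELEMENT_SIZE    # elements that share one line
expected_misses = math.ceil(NUM_ELEMENTS / elements_per_line)

# Cross-check: every distinct line touched by the scan costs exactly one miss.
# Assumption: element 0 sits at the start of a cache line.
lines_touched = {(i * ELEMENT_SIZE) // LINE_SIZE for i in range(NUM_ELEMENTS)}

print(expected_misses, len(lines_touched))
```

Both counts agree with the 1/8 ratio stated in the answer.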

28 Cache Q&A
- Imagine your program accesses a 100,000 element array (of 8 byte elements) once from beginning to end with stride 1. The memory system has a 1-level cache with a line size of 64 bytes. A hardware prefetcher is implemented. In the best-possible case, how many cache misses would be expected in this system?
  - 1 cache miss – This program has a trivial access pattern with stride 1. In the perfect world, the hardware prefetcher would begin guessing future memory accesses after the initial cache miss and loading them into the cache. Assuming the prefetcher can stay ahead of the program, then all future memory accesses with the trivial +1 pattern should result in cache hits
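The best case can be modeled with an idealized "next line" prefetcher: whenever a line is touched, the following line is assumed to arrive in the cache before the program needs it. Real prefetchers are more elaborate and are not described in the slides, so the policy here and the name misses_with_prefetch are assumptions for illustration only.

```python
def misses_with_prefetch(num_elements, element_size=8, line_size=64):
    """Count misses for a stride-1 scan with an idealized next-line prefetcher."""
    cached_lines = set()
    misses = 0
    for index in range(num_elements):
        line = (index * element_size) // line_size
        if line not in cached_lines:
            misses += 1
            cached_lines.add(line)
        # Best case: the next line is fetched early enough to be ready in time.
        cached_lines.add(line + 1)
    return misses


print(misses_with_prefetch(100_000))
```

Only the very first access misses in this model, because from then on the prefetcher stays one line ahead of the scan.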

29 Cache Example – Intel Core i7 980x
- 6 core processor with a sophisticated multi-level cache hierarchy
- 3.5GHz, 1.17 billion transistors
