MEMORY HIERARCHY DESIGN

Mahdi Nazm Bojnordi, Assistant Professor
School of Computing, University of Utah
CS/ECE 6810: Computer Architecture

Overview
- Announcement
  - Homework 3 will be released on Oct. 31st
- This lecture
  - Memory hierarchy
  - Memory technologies
  - Principle of locality
- Cache concepts

Memory Hierarchy

“Ideally one would desire an indefinitely large memory capacity such that any particular [...] word would be immediately available [...] We are [...] forced to recognize the possibility of constructing a hierarchy of memories, each of which has greater capacity than the preceding but which is less quickly accessible.”
-- Burks, Goldstine, and von Neumann, 1946

[Figure: a core backed by memory Levels 1, 2, and 3; each successive level has greater capacity and is less quickly accessible.]

The Memory Wall
- Processor-memory performance gap increased by over 50% per year
  - Processor performance historically improved ~60% per year
  - Main memory access time improves ~5% per year
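As a rough check of how the two bullet figures combine into the headline figure, the sketch below assumes both are read as annual performance multipliers (1.60 for the processor and 1.05 for main memory). That reading is an assumption; the slide does not state how the gap is computed.

```latex
% Assumed reading: performance multiplies by 1.60 (processor) and 1.05 (memory) each year.
\[
  \frac{\text{gap}(y+1)}{\text{gap}(y)} = \frac{1.60}{1.05} \approx 1.52
\]
% Under this assumption the gap widens by roughly 52% per year,
% which is consistent with "over 50% per year".
```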

Modern Memory Hierarchy
- Trade-off among memory speed, capacity, and cost

[Figure: hierarchy ordered from small, fast, and expensive to big, slow, and inexpensive: Register, Cache, Memory, SSD, Disk.]

Memory Technology
- Random access memory (RAM) technology
  - access time same for all locations (not so true anymore)
  - Static RAM (SRAM)
    - typically used for caches
    - 6T/bit; fast, but low density, high power, expensive
  - Dynamic RAM (DRAM)
    - typically used for main memory
    - 1T/bit; inexpensive, high density, low power, but slow

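RAM Cells
- 6T SRAM cell
  - internal feedback maintains data while power is on
- 1T-1C DRAM cell
  - needs regular refresh to preserve data

[Figure: circuit diagrams of the two cells; the SRAM cell is drawn with a wordline and two bitlines, the DRAM cell with a wordline and one bitline.]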

Processor Cache
- Occupies a large fraction of die area in modern microprocessors

[Figure: processor die photo; 3-3.5 GHz, ~$1000 (2014), 20 MB of cache. Source: Intel Core i7]

Cache Hierarchy
- Example three-level cache organization

  Level             Capacity   Access time
  L1                32 KB      1 cycle
  L2                256 KB     10 cycles
  L3                4 MB       30 cycles
  Off-chip memory   8 GB       ~300 cycles

- Application (inst., data)
  1. Where to put the application?
  2. Who decides?
     a. software (scratchpad)
     b. hardware (caches)

Principle of Locality
- Memory references exhibit localized accesses
- Types of locality
  - spatial: probability of access to A + d at time t + e is highest when d → 0
  - temporal: probability of access to A + e at time t + d is highest when d → 0
  - here A is an address referenced at time t, and e is a small offset

    for (i = 0; i < 1000; ++i) {
        sum = sum + a[i];
    }

  - in this loop, the accesses to a[i] exhibit spatial locality, while sum and i exhibit temporal locality
- Key idea: store local data in fast cache levels
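To make the two kinds of locality concrete, here is a minimal sketch in C that sums the same array in two traversal orders. The function names and the 64 x 64 size are illustrative assumptions, not taken from the slides. Both functions return the same value and differ only in the order of memory accesses.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 64   /* illustrative size, not from the slides */
#define COLS 64

/* Row-major order: consecutive iterations touch adjacent addresses
   (spatial locality); `sum` is reused on every iteration (temporal locality). */
long sum_row_major(const int *m) {
    assert(m != NULL);
    long sum = 0;
    for (size_t i = 0; i < ROWS; ++i)
        for (size_t j = 0; j < COLS; ++j)
            sum += m[i * COLS + j];
    return sum;
}

/* Column-major order: same result, but consecutive accesses are
   COLS * sizeof(int) bytes apart, so spatial locality is weaker. */
long sum_col_major(const int *m) {
    assert(m != NULL);
    long sum = 0;
    for (size_t j = 0; j < COLS; ++j)
        for (size_t i = 0; i < ROWS; ++i)
            sum += m[i * COLS + j];
    return sum;
}
```

On most cached machines the row-major version is expected to run faster for large arrays, though the size of the difference depends on the cache parameters and on compiler optimizations.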

Cache Terminology
- Block (cache line): unit of data access
- Hit: accessed data found at current level
  - hit rate: fraction of accesses that find the data
  - hit time: time to access data on a hit
- Miss: accessed data NOT found at current level
  - miss rate: 1 - hit rate
  - miss penalty: time to get the block from the lower level
- hit time << miss penalty

Cache Performance
- Average Memory Access Time (AMAT)

  Outcome   Rate    Access time
  Hit       $r_h$   $t_h$
  Miss      $r_m$   $t_h + t_p$

  $AMAT = r_h t_h + r_m (t_h + t_p)$
  $r_h = 1 - r_m$
  $AMAT = t_h + r_m t_p$

  [Figure: a request reaches the cache; a hit costs $t_h$, and a miss additionally costs $t_p$.]

- Problem: hit rate is 90%; hit time is 2 cycles; accessing the lower level takes 200 cycles. Find the average memory access time.
  $AMAT = 2 + 0.1 \times 200 = 22$ cycles
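The AMAT formula can be sketched as a small C function. The function and parameter names below are illustrative assumptions, and all times are taken to be in cycles. The second function evaluates the expanded two-term form, to show that both forms agree.

```c
#include <assert.h>

/* Simplified form: AMAT = t_h + r_m * t_p (times in cycles). */
double amat(double hit_time, double miss_rate, double miss_penalty) {
    assert(miss_rate >= 0.0 && miss_rate <= 1.0);
    return hit_time + miss_rate * miss_penalty;
}

/* Expanded form: AMAT = r_h * t_h + r_m * (t_h + t_p), with r_h = 1 - r_m. */
double amat_expanded(double hit_time, double miss_rate, double miss_penalty) {
    assert(miss_rate >= 0.0 && miss_rate <= 1.0);
    double hit_rate = 1.0 - miss_rate;
    return hit_rate * hit_time + miss_rate * (hit_time + miss_penalty);
}
```

For the slide's problem, `amat(2.0, 0.1, 200.0)` evaluates 2 + 0.1 x 200, which is the 22 cycles computed above.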

Example Problem
- Assume that the miss rate for instructions is 5%, the miss rate for data is 8%, there are 0.4 data references per instruction, and the miss penalty is 20 cycles. Find the performance relative to a perfect cache with no misses.
  - misses/instruction = 0.05 + 0.08 x 0.4 = 0.082
  - Assuming hit time = 1 cycle
    - AMAT = 1 + 0.082 x 20 = 2.64 (cycles per instruction, because the misses are counted per instruction)
    - Relative performance = 1/2.64 ≈ 0.38
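The arithmetic of this example can be sketched in C as follows. The function names are hypothetical, and the model assumes one instruction fetch per instruction and a base cost of 1 cycle per instruction with a perfect cache, as the worked solution does.

```c
#include <assert.h>

/* Every instruction is fetched once and makes data_refs_per_inst data references. */
double misses_per_instruction(double inst_miss_rate, double data_miss_rate,
                              double data_refs_per_inst) {
    return inst_miss_rate + data_miss_rate * data_refs_per_inst;
}

/* Cycles per instruction once miss stalls are added to the base cost. */
double cycles_with_misses(double base_cycles, double misses_per_inst,
                          double miss_penalty) {
    return base_cycles + misses_per_inst * miss_penalty;
}

/* Performance relative to a perfect cache (1.0 means no slowdown). */
double relative_performance(double base_cycles, double actual_cycles) {
    assert(actual_cycles > 0.0);
    return base_cycles / actual_cycles;
}
```

With the slide's inputs (0.05, 0.08, 0.4, a 20-cycle penalty, and a base of 1 cycle), these functions reproduce the 0.082 misses per instruction, the 2.64 cycles, and the 1/2.64 ratio.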

Summary: Cache Performance
- Bridging the processor-memory performance gap
  - Main memory access time: 300 cycles
  - Two-level cache
    - L1: 2 cycles hit time; 60% hit rate
    - L2: 20 cycles hit time; 70% hit rate
- What is the average memory access time?
  $AMAT = t_{h1} + r_{m1} t_{p1}$, where $t_{p1} = t_{h2} + r_{m2} t_{p2}$
  $t_{p1} = 20 + 0.3 \times 300 = 110$ cycles
  $AMAT = 2 + 0.4 \times 110 = 46$ cycles

[Figure: Core, Level-1, Level-2, Main Memory]
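The two-level calculation generalizes to any number of levels by evaluating the miss penalty from the last cache level back toward the core. The sketch below is one way to write that in C. The `cache_level` struct and the function name are illustrative assumptions, and each hit rate is treated as the local hit rate of its level.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double hit_time;   /* cycles */
    double hit_rate;   /* local hit rate in [0, 1] */
} cache_level;

/* levels[0] is closest to the core. Starting from main memory, apply
   t_p(i-1) = t_h(i) + r_m(i) * t_p(i) for each level, last to first. */
double amat_multilevel(const cache_level *levels, size_t n, double memory_time) {
    assert(levels != NULL || n == 0);
    double penalty = memory_time;
    for (size_t i = n; i-- > 0; ) {
        double miss_rate = 1.0 - levels[i].hit_rate;
        penalty = levels[i].hit_time + miss_rate * penalty;
    }
    return penalty;
}
```

For the slide's configuration (L1: 2 cycles, 60%; L2: 20 cycles, 70%; memory: 300 cycles), the loop first computes 20 + 0.3 x 300 = 110 and then 2 + 0.4 x 110 = 46.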

Cache Addressing
- Instead of specifying a cache address, we specify a main memory address
- Simplest: direct-mapped cache
- Note: each memory address maps to a single cache location determined by modulo hashing
- How to exactly specify which blocks are in the cache?

[Figure: a 16-entry memory (addresses 0000 to 1111) mapped onto a 4-entry cache (locations 00 to 11).]
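A minimal sketch of the modulo mapping for the 4-entry cache in the figure is shown below. The slide leaves its closing question open. One common answer, assumed here rather than taken from the slides, is to store a valid bit and a tag (the remaining upper address bits) with each cache location. All names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_LINES 4u   /* matches the 4-entry cache in the figure */

typedef struct {
    bool valid;        /* does this location hold a block? */
    unsigned tag;      /* upper address bits identifying which block */
} cache_line;

/* Modulo hashing: the low-order bits select the cache location. */
unsigned cache_index(unsigned block_addr) { return block_addr % NUM_LINES; }

/* The remaining upper bits distinguish blocks that share a location. */
unsigned cache_tag(unsigned block_addr) { return block_addr / NUM_LINES; }

/* Returns true on a hit; on a miss, installs the block and returns false. */
bool cache_access(cache_line *cache, unsigned block_addr) {
    assert(cache != NULL);
    cache_line *line = &cache[cache_index(block_addr)];
    unsigned tag = cache_tag(block_addr);
    if (line->valid && line->tag == tag)
        return true;
    line->valid = true;
    line->tag = tag;
    return false;
}
```

With a zero-initialized array of `NUM_LINES` entries, addresses 0110 and 1110 both map to location 10, so accessing them alternately evicts the other block each time.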
