MEMORY SYSTEM Mahdi Nazm Bojnordi Assistant Professor School of - PowerPoint PPT Presentation

MEMORY SYSTEM
Mahdi Nazm Bojnordi, Assistant Professor
School of Computing, University of Utah
CS/ECE 3810: Computer Organization

Overview
- Notes
  - Homework 9 is due tonight
    - Verify your submitted file before midnight
- This lecture
  - Direct-mapped
  - Set-associative

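Recall: Direct-Mapped Lookup
- Address fields: tag | index | byte
- Byte offset: to select the requested byte
- Tag: to maintain the address
- Valid flag (v): whether content is meaningful
- Data and tag are always accessed
[Figure: direct-mapped cache with entries 0 to 1023; the stored tag is compared (=) with the address tag to produce hit, and the data array supplies data.]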

Example Problem
- Find the size of tag, index, and offset bits for an 8MB, direct-mapped L3 cache with 64B cache blocks. Assume that the processor can address up to 4GB of main memory.
  - 4GB = 2^32 B → address bits = 32
  - 64B = 2^6 B → byte offset bits = 6
  - 8MB/64B = 2^17 blocks → index bits = 17
  - tag bits = 32 - 6 - 17 = 9
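The arithmetic in this example can be checked with a short script. This is a minimal sketch that assumes all sizes are powers of two; the function name `address_bits` and its parameters are illustrative and do not come from the lecture.

```python
def address_bits(cache_bytes, block_bytes, ways=1, addr_bits=32):
    """Return (tag, index, offset) bit counts for a power-of-two cache."""
    num_sets = cache_bytes // (block_bytes * ways)
    # For a power of two n, n.bit_length() - 1 equals log2(n) exactly.
    offset = block_bytes.bit_length() - 1
    index = num_sets.bit_length() - 1
    tag = addr_bits - index - offset
    return tag, index, offset


# 8MB direct-mapped cache (ways=1), 64B blocks, 4GB (32-bit) address space.
print(address_bits(8 * 2**20, 64))
```

Setting `ways` above 1 divides the number of sets accordingly, which is the only change needed for a set-associative cache.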

Set Associative Caches
- Improve cache hit rate by allowing a memory location to be placed in more than one cache block
  - N-way set associative cache
  - Fully associative
- For fixed capacity, higher associativity typically leads to higher hit rates
  - more places to simultaneously map cache lines
  - 8-way SA is close to FA in practice

    for (i = 0; i < 10000; i++) {
        a++;
        b++;
    }

[Figure: variables a and b in memory, and a cache with way 0 and way 1.]
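To illustrate why associativity helps the loop above, here is a small simulation sketch. It makes several assumptions that are not on the slide: `a` and `b` sit exactly one cache capacity apart (so they map to the same index), each `a++` or `b++` counts as a single access, replacement is LRU, and the cache is a toy 512B one. The function `count_misses` is hypothetical.

```python
from collections import OrderedDict


def count_misses(addresses, num_sets, ways, block_bytes=64):
    """Count misses in a set-associative cache with LRU replacement."""
    # One OrderedDict per set: keys are tags, ordered from LRU to MRU.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        block = addr // block_bytes
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)         # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)  # evict the least recently used tag
            lines[tag] = True
    return misses


# Assumed placement: a at address 0, b one cache capacity (512B) later.
a, b = 0, 512
trace = [a, b] * 10000

print("direct-mapped:", count_misses(trace, num_sets=8, ways=1))
print("2-way:        ", count_misses(trace, num_sets=4, ways=2))
```

Both configurations hold 512B in total. In the direct-mapped case `a` and `b` keep evicting each other, while the 2-way cache can hold both blocks in the same set.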

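n-Way Set Associative Lookup
- Address fields: tag | index | byte
- Index into cache sets
- Multiple tag comparisons
- Multiple data reads
- Special cases
  - Direct mapped
    - Single block sets
  - Fully associative
    - Single set cache
[Figure: set-associative cache with sets 0 to 511; the per-way tag comparisons (=) are ORed to produce hit, and a mux selects the data.]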

Example Problem
- Find the size of tag, index, and offset bits for a 4MB, 4-way set associative cache with 32B cache blocks. Assume that the processor can address up to 4GB of main memory.
  - 4GB = 2^32 B → address bits = 32
  - 32B = 2^5 B → byte offset bits = 5
  - 4MB/(4x32B) = 2^15 sets → index bits = 15
  - tag bits = 32 - 5 - 15 = 12
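Given these field widths, a concrete address can be split with shifts and masks. The sketch below assumes the tag occupies the high-order bits and the byte offset the low-order bits; `split_address` and the sample address are illustrative only.

```python
def split_address(addr, offset_bits, index_bits):
    """Split an address into (tag, index, offset) fields."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# 4MB, 4-way cache with 32B blocks: 5 offset bits and 15 index bits.
tag, index, offset = split_address(0x12345678, offset_bits=5, index_bits=15)
print(hex(tag), hex(index), hex(offset))
```

The index picks one of the 2^15 sets; the tag is then compared against the tags stored in all four ways of that set.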

Example
- Consider a 32 kilobyte (KB) 4-way set-associative data cache array with 32 byte line sizes
  - cache size = no. sets x no. ways x block size
  - How many sets?
    - no. sets = 32x1024 / (4 x 32) = 256
  - How many index bits, offset bits, tag bits?
    - index bits = 8, offset bits = 5, tag bits = 19 (for 32-bit addresses, as in the earlier examples)
  - How large is the tag array?
    - no. sets x no. ways x tag bits = 256 x 4 x 19 = 19456 bits = 19Kb
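The tag array size follows directly from the formula above. The sketch below assumes 32-bit addresses and counts tag bits only, as the slide does (no valid or dirty bits); `tag_array_bits` is a hypothetical helper.

```python
def tag_array_bits(cache_bytes, block_bytes, ways, addr_bits=32):
    """Total tag storage in bits: no. sets x no. ways x tag bits."""
    num_sets = cache_bytes // (ways * block_bytes)
    index_bits = num_sets.bit_length() - 1      # exact log2 for powers of two
    offset_bits = block_bytes.bit_length() - 1
    tag_bits = addr_bits - index_bits - offset_bits
    return num_sets * ways * tag_bits


bits = tag_array_bits(32 * 1024, 32, ways=4)
print(bits, "bits =", bits // 1024, "Kb")
```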

Example
- A pipeline’s CPI is 1 if all loads/stores hit in cache
- Question: 40% of all instructions are loads/stores; 80% of all loads/stores hit in cache (1-cycle); memory access takes 100 cycles; what is the CPI?
- Solution:
  - Consider 1000 instructions; 400 instructions are loads/stores, of which 0.8x400 = 320 are hits (1 cycle) and 0.2x400 = 80 are misses (101 cycles).
  - CPI = (1 x (600 + 320) + 101 x 80)/1000 = 9
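The same result comes from the equivalent closed form, CPI = base CPI + (fraction of loads/stores) x (miss rate) x (miss penalty). The sketch below assumes, as the solution does, that a miss costs the 1-cycle hit time plus 100 extra cycles; the function and parameter names are illustrative.

```python
def cpi_with_misses(base_cpi, mem_fraction, hit_rate, miss_penalty):
    """Average CPI when every cache miss adds miss_penalty stall cycles."""
    miss_rate = 1.0 - hit_rate
    return base_cpi + mem_fraction * miss_rate * miss_penalty


# 40% loads/stores, 80% hit rate, 100-cycle memory access.
print(round(cpi_with_misses(1.0, 0.4, 0.8, 100), 6))
```

Raising the hit rate to 100% in this formula returns the base CPI of 1, which matches the first bullet of the example.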

Cache Write Policies
- Write vs. read
  - Data and tag are accessed for both read and write
  - Only a write needs to update the data array
- Cache write policies
  - On a write hit: write the lower level? Write back or write through
  - On a write miss: read the lower level? Write allocate or write no allocate

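Write back
- On a write access, write to cache only
  - write the cache block to memory only when it is replaced from the cache
  - can greatly decrease bus bandwidth usage, since repeated writes to the same block reach memory only once
  - keep a bit (called the dirty bit) per cache block to record whether the block has been modified
[Figure: Core, Cache, Main Memory]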
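The bandwidth difference between the two write policies can be seen in a toy model. The sketch below assumes a small fully associative cache with LRU replacement and write allocate, counts only writes that reach memory, and flushes dirty blocks at the end; `memory_writes` and the trace are illustrative.

```python
from collections import OrderedDict


def memory_writes(written_blocks, policy, capacity=2):
    """Count writes reaching memory for a sequence of block-number writes."""
    cache = OrderedDict()  # block number -> dirty bit, ordered LRU to MRU
    writes = 0
    for block in written_blocks:
        if block in cache:
            cache.move_to_end(block)
        else:
            if len(cache) == capacity:
                _, dirty = cache.popitem(last=False)  # evict the LRU block
                writes += int(dirty)                  # write back only if dirty
            cache[block] = False
        if policy == "write-through":
            writes += 1            # every write also goes to memory
        else:
            cache[block] = True    # write-back: just set the dirty bit
    # Flush the dirty blocks still in the cache at the end of the trace.
    writes += sum(int(dirty) for dirty in cache.values())
    return writes


trace = [0] * 100 + [1] * 100 + [2]
print("write-through:", memory_writes(trace, "write-through"))
print("write-back:   ", memory_writes(trace, "write-back"))
```

With write through, every one of the 201 writes reaches memory. With write back, each modified block is written once, when it is evicted or flushed.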

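Write through
- Write to both cache and memory (or next level)
  - Improved miss penalty: an evicted block is never dirty, so a miss does not have to wait for a write-back
  - More reliable, because memory (or the next level) always holds a second, up-to-date copy
[Figure: Core, Cache, Main Memory]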

Write (No-)Allocate
- Write allocate
  - allocate a cache line for the new data, and replace old line
  - just like a read miss
- Write no allocate
  - do not allocate space in the cache for the data
  - only really makes sense in systems with write buffers
- How to handle read miss after write miss?
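The closing question can be made concrete with a tiny model. The sketch below assumes an unbounded cache (no evictions) and tracks only which blocks are present. It shows when a read miss follows a write miss, not how hardware resolves it. `access_outcomes` is a hypothetical helper.

```python
def access_outcomes(accesses, write_allocate):
    """Return 'hit' or 'miss' for each (op, block) access in order."""
    cached = set()
    outcomes = []
    for op, block in accesses:
        hit = block in cached
        outcomes.append("hit" if hit else "miss")
        # Read misses always allocate; write misses allocate only
        # under the write-allocate policy.
        if not hit and (op == "read" or write_allocate):
            cached.add(block)
    return outcomes


accesses = [("write", 7), ("read", 7)]
print("write allocate:   ", access_outcomes(accesses, write_allocate=True))
print("write no allocate:", access_outcomes(accesses, write_allocate=False))
```

Under write no allocate, the read that follows the write miss also misses, which is the situation the question refers to.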
