1
FlashVM: Virtual Memory Management on Flash
Mohit Saxena and Michael M. Swift
University of Wisconsin-Madison
2
Is Virtual Memory Relevant?
- There is never enough DRAM
– Price, power and DIMM slots limit amount
– Application memory footprints are ever-increasing
- VM is no longer DRAM+Disk
– New memory technologies: Flash, PCM, Memristor ….
3
Flash and Virtual Memory
[Figure: DRAM is expensive, disk is slow, and flash is cheap and fast. Instead of the disk serving both storage and VM, FlashVM dedicates flash to virtual memory while the disk keeps serving storage.]
4
In this talk
- Flash for Virtual Memory
– Does it improve system price/performance?
– What OS changes are required?
- FlashVM
– System architecture using dedicated flash for VM
– Extension to the core VM subsystem in the Linux kernel
– Improved performance, reliability and garbage collection
5
Outline
- Introduction
- Background
– Flash and VM
- Design
- Evaluation
- Conclusions
6
Flash 101
- Flash is not disk
– Faster random access: 0.1 ms vs. 2-3 ms for disk
– No in-place modify: write only to erased locations
- Flash blocks wear out
– Erasures limited to 10,000-100,000 per block
– Reliability drops as MLC flash density increases
- Flash devices age
– Log-structured writes leave few clean blocks after extensive use
– Performance drops by up to 85% on some SSDs
– Requires garbage collection to reclaim free blocks
7
Virtual Memory 101
[Figure: execution time T vs. memory size M, with curves for DiskVM and FlashVM. Between the no-locality regime (too little memory) and the unused-memory regime (more memory than the footprint), the FlashVM curve lies below DiskVM. Read two ways: at the same execution time T, FlashVM needs reduced DRAM, giving the same performance at a lower system price; at the same memory size M, FlashVM gives faster execution with no additional DRAM at a similar system price.]
8
Outline
- Introduction
- Background
- Design
– Performance
– Reliability
– Garbage Collection
- Evaluation
- Conclusions
9
FlashVM Hierarchy
[Figure: FlashVM hierarchy. The VM memory manager in DRAM swaps pages through the FlashVM manager, which handles performance, reliability and garbage collection, down through the block layer, disk scheduler and block device driver to dedicated MLC NAND flash; the file system keeps using the disk. Dedicating flash to VM is cost-effective and reduces file-system interference.]
10
VM Performance
- Challenge
– VM systems are optimized for disk performance
– Slow random reads, high access and seek costs, symmetric read/write performance
- FlashVM "de-diskifies" VM by tuning these parameters:
– Page write back
– Page scanning
– Disk scheduling
– Page prefetching
11
Page Prefetching
[Figure: Linux swap map with FREE and BAD slots interleaved among used slots around a faulting request.]
- VM assumption: seek and rotational delays are longer than the transfer cost of extra blocks
- Linux sequential prefetching
– Minimizes costly disk seeks
– Delimited by free and bad blocks
- FlashVM prefetching (sketched below)
– Exploits fast flash random reads and spatial locality in the reference pattern
– Seeks over free and bad blocks
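A minimal user-space sketch of this difference, assuming hypothetical slot states and names (the real logic lives in the kernel's swap readahead path):

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot states in the swap map. */
enum slot_state { SLOT_USED, SLOT_FREE, SLOT_BAD };

/* Collect up to 'want' populated slots starting at a faulting offset.
 * Sequential (disk) prefetching stops at the first free or bad slot;
 * FlashVM-style prefetching seeks over such holes, since random reads
 * on flash are cheap. */
size_t prefetch_slots(const enum slot_state *map, size_t map_len,
                      size_t fault, size_t want, size_t *out,
                      bool skip_holes)
{
    size_t n = 0;
    for (size_t off = fault; off < map_len && n < want; off++) {
        if (map[off] != SLOT_USED) {
            if (!skip_holes)
                break;        /* disk: a hole ends the sequential run */
            continue;         /* flash: hop over the hole             */
        }
        out[n++] = off;       /* schedule this slot for read-ahead */
    }
    return n;
}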
12
Stride Prefetching
- FlashVM uses stride prefetching (see the sketch below)
– Exploits temporal locality in the reference pattern
– Exploits cheap seeks for fast random access
– Fetches two extra blocks in the stride
[Figure: strided requests walking across the swap map.]
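A sketch of the idea with hypothetical names; the detector keys off the distance between consecutive major faults:

#include <stddef.h>

struct stride_state {
    long last_fault;   /* previous faulting swap offset         */
    long stride;       /* last observed distance between faults */
};

/* On each major fault, confirm a repeating stride; if the distance
 * matches the previous one, prefetch two extra blocks along it. */
size_t stride_prefetch(struct stride_state *s, long fault, long out[2])
{
    long d = fault - s->last_fault;
    size_t n = 0;

    if (d != 0 && d == s->stride) {   /* temporal pattern confirmed */
        out[n++] = fault + d;
        out[n++] = fault + 2 * d;     /* two extra blocks in stride */
    }
    s->stride = d;
    s->last_fault = fault;
    return n;
}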
13
The Reliability Problem
- Challenge: reduce the number of writes
– Flash chips lose durability after 10,000-100,000 writes
– Actual write-lifetime can be two orders of magnitude less
– Past solutions:
- Disk‐based write caches for streamed I/O
- De‐duplication and compression for storage
- FlashVM uses knowledge of page content and state
– Dirty page sampling
– Zero page sharing (zero-page test sketched below)
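For zero page sharing, the key primitive is a cheap all-zero test on eviction. A minimal sketch; the helper name is ours, not the kernel's:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

/* If a page headed for swap-out is all zeroes, the write can be
 * dropped entirely and the page recreated from a shared zero page
 * on the next fault. */
static bool page_is_zero(const void *page)
{
    const uint64_t *w = page;
    for (size_t i = 0; i < PAGE_SIZE / sizeof(*w); i++)
        if (w[i] != 0)
            return false;
    return true;
}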
14
Page Sampling
[Figure: pages leaving the inactive LRU page list are checked for dirtiness. On disk, Linux writes back all evicted dirty pages. On flash, FlashVM skips dirty pages at sampling rate sR, preferring to evict young clean pages over old dirty ones; clean pages move directly to the free page list. See the sketch below.]
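A user-space sketch of the sampling decision; the rate sR and the random test stand in for the kernel's actual bookkeeping:

#include <stdbool.h>
#include <stdlib.h>

/* Decide whether an evicted page is written back. Clean pages are
 * freed directly. Linux writes back every dirty page; FlashVM skips
 * a dirty page with probability sR, keeping it resident so that
 * young clean pages are evicted before old dirty ones. */
static bool should_write_back(bool dirty, double sR)
{
    if (!dirty)
        return false;                       /* clean: free directly */
    return (double)rand() / RAND_MAX >= sR; /* skip at rate sR      */
}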
15
Adaptive Sampling
- Challenge: reference pattern variations
– Write-mostly: many dirty pages
– Read-mostly: many clean pages
- FlashVM adapts the sampling rate (see the sketch below)
– Maintains a moving average of the write rate
– Low write rate → increase sR
- Aggressively skips dirty pages
– High write rate → converge to native Linux
- Evicts dirty pages to relieve memory pressure
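A sketch of the adaptation loop; the smoothing factor and thresholds are illustrative, not FlashVM's actual constants:

struct adapt {
    double avg_write_rate;  /* moving average of write-backs per scan */
    double sR;              /* current dirty-page skip rate           */
};

static void update_sampling(struct adapt *a, double writes_this_scan)
{
    const double alpha = 0.25;      /* EWMA smoothing (illustrative) */
    a->avg_write_rate = alpha * writes_this_scan
                      + (1.0 - alpha) * a->avg_write_rate;

    if (a->avg_write_rate < 10.0)   /* read-mostly: raise sR and     */
        a->sR = 0.8;                /* aggressively skip dirty pages */
    else                            /* write-mostly: converge to     */
        a->sR = 0.0;                /* native Linux, evict dirty     */
}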
16
Outline
- Introduction
- Why FlashVM?
- Design
– Performance
– Reliability
– Garbage Collection
- Evaluation
- Conclusions
17
Flash Cleaning
- All writes to flash go to a new location
- The discard command notifies the SSD that blocks are unused (illustrated below)
- Benefits:
– More free blocks for writing
– Avoids copying data for partial over-writes
[Figure: write sequence with and without discard. Clusters A, B and C are written at blocks 0, 100 and 200 and later freed; once cluster A is freed and discarded, a new write of cluster D can reuse block 0.]
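A minimal user-space illustration of the discard primitive via Linux's BLKDISCARD ioctl; inside the kernel the swap layer would call blkdev_issue_discard() instead, and the error handling here is simplified:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>           /* BLKDISCARD */

/* Tell the SSD that a byte range no longer holds live data. */
int discard_range(const char *dev, uint64_t offset, uint64_t len)
{
    int fd = open(dev, O_WRONLY);
    if (fd < 0) { perror("open"); return -1; }

    uint64_t range[2] = { offset, len };   /* start, length in bytes */
    int rc = ioctl(fd, BLKDISCARD, &range);
    if (rc < 0) perror("BLKDISCARD");

    close(fd);
    return rc;
}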
18
Discard is Expensive
[Chart: operation latency in ms on an OCZ-Vertex SSD (Indilinx controller) for 4 KB read, 4 KB write, 128 KB erase, and discards of 4 KB, 1 MB, 10 MB, 100 MB and 1 GB. Reads and writes finish in under 0.5-2 ms, while a discard costs about 55 ms even for small ranges, growing to 417 ms for 1 GB.]
19
Discard and VM
- Native Linux VM has limited discard support
– Invokes discard before reusing free page clusters
– Pays a high fixed cost for small sets of pages
- FlashVM optimizes to reduce discard cost
– Avoids unnecessary discards: dummy discard
– Discards larger sizes to amortize cost: merged discard
20
Dummy Discard
- Observation: overwriting a block
– notifies the SSD that its old contents are dead, and
– if it comes right after a discard, consumes the free space the discard just made available
- FlashVM implements dummy discard (see the sketch below)
– Monitors the rate of swap allocation
– Virtualizes discard by reusing blocks likely to be overwritten soon
[Figure: timeline of overwrite, discard and overwrite operations on the same block.]
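A sketch of the decision; the structure and the one-second threshold are ours, for illustration only:

#include <stdbool.h>

struct swap_stats {
    double alloc_rate;      /* swap clusters allocated per second */
    double free_clusters;   /* clusters currently free            */
};

/* If a freed cluster will likely be reallocated (and thus
 * overwritten) soon, skip the expensive discard: the overwrite
 * itself tells the SSD the old contents are dead. */
static bool use_dummy_discard(const struct swap_stats *s)
{
    double reuse_delay = s->free_clusters / s->alloc_rate;
    return reuse_delay < 1.0;   /* illustrative 1 s threshold */
}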
21
Merged Discard
- Native Linux invokes discard once per page cluster
– Result: 55 ms latency to free 32 pages (128 KB)
- FlashVM batches many free pages (see the sketch below)
– Defers discard until 100 MB of free pages are available
– Discarded pages may be non-contiguous
[Figure: many small per-cluster discards merged into one large discard.]
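A sketch of the batching policy; the types and the issue_discard callback are hypothetical, while the 100 MB threshold matches the slide:

#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ     4096ULL
#define BATCH_BYTES (100ULL << 20)        /* defer until ~100 MB free */
#define MAX_BATCH   (BATCH_BYTES / PAGE_SZ)

/* Accumulate freed, possibly non-contiguous swap pages and issue one
 * large discard, amortizing the command's high fixed cost. */
struct discard_batch {
    uint64_t page[MAX_BATCH];   /* device offsets of freed pages */
    size_t   n;
};

void queue_free_page(struct discard_batch *b, uint64_t page_off,
                     void (*issue_discard)(const uint64_t *, size_t))
{
    b->page[b->n++] = page_off;
    if (b->n == MAX_BATCH) {    /* threshold reached: merged discard */
        issue_discard(b->page, b->n);
        b->n = 0;
    }
}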
22
Design Summary
- Performance improvements
– Parameter tuning: page write back, page scanning, disk scheduling
– Improved sequential and stride prefetching
- Reliability improvements
– Reduced writes: page sampling and sharing
- Garbage collection improvements
– Merged and Dummy discard
23
Outline
- Introduction
- Motivation
- Design
- Evaluation
– Performance and memory savings
– Reliability and garbage collection
- Conclusions
24
Methodology
- System and Devices
– 2.5 GHz Intel Core 2 Quad, Linux 2.6.28 kernel
– IBM, Intel X-25M, and OCZ-Vertex trim-capable SSDs
- Application Workloads
– ImageMagick: resizing a large JPEG image by 500%
– Spin: model checking for 10 million states
– SpecJBB: 16 concurrent warehouses
– memcached server: key-value store for 1 million keys
25
Application Performance and Memory Savings
[Chart: runtime and memory use for ImageMagick, Spin, SpecJBB, memcached-store and memcached-lookup. Holding memory constant, FlashVM gives up to 94% less execution time; holding performance constant, it gives up to 84% memory savings.]
26
Write Reduction
[Chart: performance and write counts for ImageMagick and Spin under each write-reduction technique. Uniform page sampling cuts writes by 12% at a 7% performance overhead, adaptive page sampling cuts them by 14%, and zero page sharing cuts them by 93%.]
27
Garbage Collection
[Chart: elapsed time in seconds (log scale, 1-10,000) for ImageMagick, Spin and memcached under Linux with discard, FlashVM, and Linux without discard. FlashVM is 10x faster than Linux with discard and only 15% slower than Linux without discard.]
28
Conclusions
- FlashVM: virtual memory management on flash
– Dedicated flash for paging
– Improved performance, reliability and garbage collection
- More opportunities and challenges for OS design
– Scaling FlashVM to massive memory capacities (terabytes!)
– Future memory technologies: PCM and memristors
29