

SLIDE 1: SSDAlloc: Hybrid SSD/RAM Memory Management Made Easy

Anirudh Badam and Vivek S. Pai

Princeton University, 03/31/2011

SLIDE 2: Memory in Networked Systems

  • As a cache to reduce pressure on the disk
    • Memcache-like tools act as front-end caches for the Web data back-end
  • As an index to reduce pressure on the disk
    • Indexes for proxy caches, WAN accelerators, and inline data deduplicators
    • Help avoid false positives and use the disk effectively

SLIDE 3: Problem: Memory Density

[Chart: $/GB (total cost) versus total DRAM per server, from 8 GB to 1024 GB. Cost per GB climbs steeply past a "breaking point" in capacity, beyond which the options are unclear.]

SLIDE 4: Problem: Disk Speed Limits

  • Magnetic disk speed is not scaling well
    • Capacity is increasing, but seek latency is not decreasing
    • About 200 seeks/disk/sec
  • High-speed disk arrays use many smaller-capacity drives
    • Total cost is about 10x that of similar-capacity 7200 RPM drives
    • They use more rack space per byte

SLIDE 5: Proposal: Use Flash as Memory

  • Address the DRAM density limitation
    • Overcome per-system DRAM limits via flash memory
    • Provide a choice: more servers, or a single server plus flash memory
  • Reduce total cost of ownership
    • "Long-tailed" workloads are important
    • DRAM is too expensive and disk is too slow
    • The CPU is under-utilized due to the DRAM limit
  • How do we ease application development with flash memory?

SLIDE 6: Flash Memory Primer

  • Fast random reads (up to 1M IOPS per drive)
  • Writes happen only after an erase
    • Limited lifetime and endurance
  • No seek latency (only read/write latency)
  • Large capacity (a single 2.5" drive holds ~512GB; PCIe cards reach 10.2 TB, e.g. the Fusion-io ioDrive Octal)

SLIDE 7: Question of Hierarchy

  Disk:   block addressable, directly addressed, high latency, non-volatile
  Memory: byte addressable, virtually addressed, low latency, volatile

Where do SSDs fit? Flash has low latency, which places it between the two tiers.

SLIDE 8: Transparent Tiering Today

  • Use the SSD as memory via swap or mmap
    • The application need not be modified
    • Pages are transparently swapped in and out of DRAM based on usage
  • The native OS pager delivers only 10% of the SSD's performance
    • A flash-aware pager delivers only 30% of the SSD's performance
    • The OS pager is optimized for seek-limited disks and was designed as a "dead page storage"
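For illustration (not from the talk), a minimal sketch of the mmap route, assuming a file on the SSD serves as the backing store; the path and sizes are invented:

    /* Minimal sketch of transparent tiering via mmap: a file on the SSD
     * backs an ordinary memory region, and the kernel pages 4KB chunks
     * in and out on demand. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        size_t len = 1UL << 30;  /* 1GB region backed by the SSD */
        int fd = open("/mnt/ssd/backing.dat", O_RDWR | O_CREAT, 0600);
        if (fd < 0 || ftruncate(fd, (off_t)len) != 0) { perror("backing"); return 1; }

        char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (mem == MAP_FAILED) { perror("mmap"); return 1; }

        memset(mem, 0, 4096);  /* touching a page faults it in from the SSD file */
        munmap(mem, len);
        close(fd);
        return 0;
    }

The appeal is zero application change; the cost, as the slide notes, is that the pager moves whole 4KB pages and was tuned for seek-limited disks.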

SLIDE 9: Transparent Tiering Today

[Diagram: an animation of 4KB pages moving between a small RAM and a larger SSD. Pages are read in on faults, written out on eviction, and released with free(); an indirection table (a log-structured page store), maintained in the OS or in the FTL, tracks page locations.]

SLIDE 10: Non-Transparent Tiering

  • Redesign the application to be flash aware
    • Custom object store with custom pointers
    • Reads, writes, and garbage collection at application-object granularity
    • Avoid in-place writes (objects can be small)
    • Obtain the best performance and lifetime from the flash memory device
  • But intrusive modifications and flash-memory expertise are needed

SLIDE 11: Non-Transparent Tiering

malloc + SSD-swap:

    MyObject* obj = malloc( sizeof( MyObject ) );
    obj->x = 0;
    obj->y = 1;
    obj->z = 2;
    free( obj );

Application rewrite:

    MyObjectID oid = createObject( sizeof( MyObject ) );
    MyObject* obj = malloc( sizeof( MyObject ) );
    readObject( oid, obj );
    obj->x = 0;
    obj->y = 1;
    obj->z = 2;
    writeObject( oid, obj );
    free( obj );

SLIDE 12: Our Goal

  • Run mostly unmodified applications
    • Work via memory allocators in C-style programs
  • Use the DRAM effectively
    • Use it as an object cache (not as a page cache)
  • Use the SSD wisely
    • As a log-structured object store
  • Reorganize virtual memory allocation to discern object information

SLIDE 13: SSDAlloc Overview

[Diagram: the memory manager creates 64 objects of 1KB size; with object-per-page (OPP), each occupies its own page of application virtual memory. Physical memory is split into a small page buffer holding the pages in active use and a RAM object cache densely packed with recently used objects; the SSD holds all objects in a log-structured object store.]

SLIDE 14: SSDAlloc Options

  Object Per Page (OPP) vs. Memory Page (MP):
  • Data entity: application-defined objects (OPP) vs. 4KB objects, i.e. pages (MP)
  • Memory manager: pool allocator (OPP) vs. coalescing allocator (MP)
  • Virtual memory used: number of objects × page_size (OPP) vs. number of pages × page_size (MP)
  • Physical memory: separate page buffer and RAM object cache (OPP) vs. no such separation (MP)
  • SSD usage: log-structured object store (OPP) vs. log-structured page store (MP)
  • Code changes: minimal, restricted to memory allocation (OPP) vs. none needed (MP)
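To make OPP's virtual-memory accounting concrete, here is a toy sketch (not SSDAlloc's code; the names and layout are invented) of giving each object its own virtual page:

    /* Toy sketch of the object-per-page (OPP) idea: every object gets its
     * own virtual page, so a runtime can observe and protect objects
     * individually. Physical memory is only committed on first touch. */
    #include <stdio.h>
    #include <sys/mman.h>

    #define PAGE_SIZE 4096

    static void *opp_alloc(size_t object_size) {
        if (object_size > PAGE_SIZE) return NULL;   /* OPP: one object per page */
        void *page = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return page == MAP_FAILED ? NULL : page;
    }

    int main(void) {
        /* 64 one-KB objects consume 64 pages (256KB) of virtual memory,
         * matching the "number of objects x page_size" accounting above. */
        void *objs[64];
        for (int i = 0; i < 64; i++) objs[i] = opp_alloc(1024);
        printf("first object at %p\n", objs[0]);
        return 0;
    }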

SLIDE 15: SSDAlloc Overview

  • A small set of pages is kept in core (the page buffer)
  • Pages are materialized on demand from the RAM object cache or the SSD
  • The page buffer is restricted in size to minimize RAM wastage (from OPP)
  • Implemented using mprotect; a page is materialized in the seg-fault handler
  • The RAM object cache continuously flushes dirty objects to the SSD in LRU order

[Diagram: demand fetching materializes pages from the RAM object cache/SSD into the page buffer; dirty objects flow from the RAM object cache back to the SSD.]
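A minimal sketch of this mprotect-based demand fetching, with the actual object fetch stubbed out (everything here is illustrative, not the SSDAlloc runtime):

    /* Pages start PROT_NONE; the first touch raises SIGSEGV, and the
     * handler "materializes" the page before the access is retried. */
    #include <signal.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    #define PAGE_SIZE 4096
    static char *region;

    static void on_fault(int sig, siginfo_t *info, void *ctx) {
        (void)sig; (void)ctx;
        char *page = (char *)((uintptr_t)info->si_addr & ~(uintptr_t)(PAGE_SIZE - 1));
        /* Make the page accessible, then fill it from the RAM object
         * cache or the SSD (stubbed out here with a memset). */
        mprotect(page, PAGE_SIZE, PROT_READ | PROT_WRITE);
        memset(page, 0, PAGE_SIZE);  /* stand-in for the real fetch */
    }

    int main(void) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_flags = SA_SIGINFO;
        sa.sa_sigaction = on_fault;
        sigaction(SIGSEGV, &sa, NULL);

        region = mmap(NULL, 16 * PAGE_SIZE, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        region[0] = 'x';   /* faults; the handler materializes the page */
        printf("page materialized, region[0]=%c\n", region[0]);
        return 0;
    }

Dematerialization would run in reverse: protect the page back to PROT_NONE and copy its surviving objects into the RAM object cache.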

SLIDE 16: SSD Maintenance

[Diagram: Object Tables map virtual-memory objects to their locations on the SSD; the RAM object cache flushes dirty objects to the log.]

  • Copy-and-compact garbage collector / log writer
    • Seek optimizations are not needed
  • Read at the head of the log and rewrite objects that are live and dirty
    • Use the Object Tables to determine liveness
  • Garbage is discarded
    • Objects that have since been written elsewhere are garbage
    • An OPP object that has been freed is garbage
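A compact sketch of the copy-and-compact idea, with in-memory stand-ins for the log and the Object Table (illustrative only, not SSDAlloc's internals):

    /* Examine the object at the log head: live objects (the table still
     * points at this copy) are re-appended at the tail and repointed;
     * stale copies and freed objects are simply dropped. */
    #include <stdio.h>
    #include <stddef.h>

    #define NOBJ  4
    #define FREED ((size_t)-1)

    typedef struct { unsigned id; char payload[60]; } object_t;

    static size_t   object_table[NOBJ];   /* id -> current log offset */
    static object_t log_store[64];        /* toy log */
    static size_t   tail;                 /* next free log slot */

    static size_t log_append(const object_t *o) { log_store[tail] = *o; return tail++; }

    static void clean_head(size_t head) {
        object_t *o = &log_store[head];
        if (object_table[o->id] != head) return;   /* stale or freed: garbage */
        object_table[o->id] = log_append(o);       /* live: copy forward */
    }

    int main(void) {
        object_t a = {0, "A"}, b = {1, "B"};
        object_table[0] = log_append(&a);
        object_table[1] = log_append(&b);
        object_table[1] = FREED;                   /* application freed B */
        clean_head(0);                             /* A survives the pass */
        clean_head(1);                             /* B is discarded */
        printf("A now at offset %zu\n", object_table[0]);
        return 0;
    }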

SLIDE 17: Implementation

  • 11,000 lines of C++ code (runtime library)
  • Implemented using mprotect, mmap, and madvise
  • SSDAlloc-OPP: pool and array allocator
  • SSDAlloc-MP: coalescing allocator (array allocations)
  • SSDFree frees the allocated data
  • Can coexist with malloc'd pointers
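A hypothetical call-site sketch of what "minimal changes restricted to memory allocation" can look like. The names ssdalloc_opp and ssdfree are stand-ins for the slide's SSDAlloc-OPP and SSDFree, not the library's actual API, and the stubs exist only so the sketch compiles:

    /* Only the allocation calls change; the pointer-based code stays
     * as-is. Signatures are assumptions, stubbed with malloc/free here;
     * the real library would route these through the SSDAlloc runtime. */
    #include <stdlib.h>

    static void *ssdalloc_opp(size_t n) { return malloc(n); }  /* stand-in */
    static void  ssdfree(void *p)       { free(p); }           /* stand-in */

    typedef struct { int x, y, z; } MyObject;

    int main(void) {
        MyObject *obj = ssdalloc_opp(sizeof(MyObject));
        obj->x = 0; obj->y = 1; obj->z = 2;  /* unchanged application code */
        ssdfree(obj);
        return 0;
    }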

SLIDE 18: SSD Usage Techniques

  What each technique provides:
  • SSD swap: programming ease
  • SSD swap (write logged): write logging, programming ease
  • Application rewrite: write logging, sub-4KB access, fine-grained GC, avoids DRAM pollution, high performance
  • SSDAlloc: all six (write logging, sub-4KB access, fine-grained GC, avoids DRAM pollution, high performance, and programming ease)

SLIDE 19: SSDAlloc Runtime Overhead

  Overhead source            Max latency
  TLB miss (DRAM read)       0.014 μs
  Object Table lookup        0.046 μs
  Page materialization       0.138 μs
  Page dematerialization     0.172 μs
  Signal handling            0.666 μs
  Combined overhead          0.833 μs

  • Overhead of an SSDAlloc runtime intervention
  • NAND flash latency is ~30-50 μs, so the runtime adds under 3% to each SSD access
  • The runtime can reach 1 million IOPS

SLIDE 20: Experiments

  • Comparing three allocation methods
    • malloc replaced with SSDAlloc-OPP
    • malloc replaced with SSDAlloc-MP
    • SSD swap
  • 2.4 GHz quad-core CPU with 16GB RAM
  • SSDs tested: RiData, Kingston, Intel X25-E, Intel X25-V, and Intel X25-M

SLIDE 21: Results Overview

  Application    Original LOC   Modified LOC   OPP gain vs. swap   OPP gain vs. MP
  Memcached      11,193         21             5.5 - 17.4x         1.4 - 3.5x
  B+Tree Index   477            15             4.3 - 12.7x         1.4 - 3.2x
  Packet Cache   1,540          9              4.8 - 10.1x         1.3 - 2.3x
  HashCache      20,096         36             5.3 - 17.1x         1.3 - 3.3x

  • SSDAlloc applications write up to 32 times less data to the SSD than traditional VM-style applications

SLIDE 22: Microbenchmarks

  • 32GB array of 128-byte objects (32GB SSD, 2GB RAM)

[Chart: throughput gain factor of SSDAlloc-OPP over swap and over SSDAlloc-MP (up to 15x), across workload mixes from all reads through 25%, 50%, and 75% reads to all writes.]

SLIDE 23: Memcached Benchmarks

  • 30GB SSD, 4GB RAM, 4 memcache clients
  • The memcached server's slab allocator was modified to use SSDAlloc

[Chart: throughput (req/sec, up to 50,000) versus average object size (128 bytes to 8KB) for SSDAlloc-OPP, SSDAlloc-MP, and swap.]

SLIDE 24: Memcached Benchmarks

  • Performance for 50% reads and 50% writes

[Chart: throughput (req/sec, up to 30,000) for SSD-swap, SSDAlloc-MP, and SSDAlloc-OPP on each drive: RiData, Kingston, Intel X25-E, Intel X25-V, and Intel X25-M.]

SLIDE 25: Summary

  • SSDAlloc integrates the SSD naturally into the VM system
    • RAM serves as a compact object cache
    • Ordinary virtual memory addresses are used
    • Only memory allocation code changes (9 to 36 LOC); other approaches need intrusive modifications
  • The SSD serves as a log-structured object store
    • Delivers up to 90% of the raw SSD's random-read performance, where other transparent approaches deliver only 10-30%
    • Reduces write traffic by up to 32 times

SLIDE 26: Thanks

Anirudh Badam
abadam@cs.princeton.edu
http://tinyurl.com/ssdalloc