Memory Management Techniques for Large-Scale Persistent-Main-Memory Systems [VLDB 2017]


SLIDE 1

Memory Management Techniques for Large-Scale Persistent-Main-Memory Systems [VLDB 2017]

Ismail Oukid, Daniel Booss, Adrien Lespinasse, Wolfgang Lehner, Thomas Willhalm, Grégoire Gomes Non-Volatile Memories Workshop – March 12, 2018

SLIDE 2

Motivation

▪ NVM can replace both main memory and storage → a single-level database storage architecture without I/O
▪ Fail-safe persistent NVM memory management is a conditio sine qua non for enabling this novel architecture paradigm
▪ Existing persistent allocators are general-purpose and do not address the versatile needs of database systems
▪ We present PAllocator, a highly scalable fail-safe persistent allocator

SLIDE 3

Outline

▪ What is a persistent allocator?
▪ PAllocator's design decisions
▪ Experimental evaluation
▪ Conclusion

SLIDE 4

What Characterizes a Persistent Allocator?

A persistent allocator must:

1. Provide a recoverable addressing scheme
2. Avoid persistent memory leaks

[Figure: the application address space spans DRAM and NVM; a transient allocator manages DRAM and a persistent allocator manages NVM, both through the virtual memory subsystem]

SLIDE 5

1. Recoverable Addressing Scheme

▪ Program root stored at a known offset
▪ NVM files are mapped (mmap) into the virtual address space; each mapping has a start address
▪ Persistent pointer: PPtr = {File ID, Offset}
▪ Volatile pointer = file start address + offset
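A minimal sketch of this addressing scheme in C++ (the PPtr layout follows the slide; the per-process mapping table and the name fileStartAddress are illustrative assumptions):

```cpp
#include <cstdint>
#include <unordered_map>

// Persistent pointer as on the slide: {File ID, Offset} -- 16 bytes.
struct PPtr {
    std::uint64_t fileId;   // which NVM file (segment) the object lives in
    std::uint64_t offset;   // offset of the object within that file
};

// Hypothetical per-process table of current mmap start addresses; it is
// rebuilt on every restart because mapping addresses are not stable.
std::unordered_map<std::uint64_t, char*> fileStartAddress;

// Volatile pointer = file start address + offset (as on the slide).
inline void* toVolatile(const PPtr& pptr) {
    return fileStartAddress.at(pptr.fileId) + pptr.offset;
}
```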

SLIDE 6

2. Preventing Memory Leaks

▪ The traditional interface has a "blind spot": with pptr = allocate(size); persist(&pptr); a crash between the allocation and persisting pptr loses the only reference to the block → persistent memory leak
▪ Solution: reference passing → allocate(PPtr &pptr, size_t allocSize), where pptr is owned by the data structure
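A sketch contrasting the two interfaces (persist() stands in for the allocator's flush-and-fence primitive; the stub bodies and the ListNode type are illustrative assumptions, not PAllocator's actual API):

```cpp
#include <cstddef>
#include <cstdint>

struct PPtr { std::uint64_t fileId, offset; };

// Stub standing in for the cache-line flush + fence the real allocator issues.
static void persist(const void* /*addr*/, std::size_t /*len*/) { /* e.g., CLWB + SFENCE */ }

// Traditional interface: the result is returned by value.
static PPtr allocate(std::size_t /*size*/) { return PPtr{1, 4096}; }              // stub
// Reference-passing interface: the result is written into persistent memory
// that is already owned by the data structure.
static void allocate(PPtr& pptr, std::size_t /*allocSize*/) { pptr = {1, 4096}; } // stub

struct ListNode { PPtr next; };

void appendExample(ListNode* tail) {
    // Blind spot: a crash after allocate() returns but before persist()
    // completes leaves the new block unreachable -> persistent leak.
    tail->next = allocate(sizeof(ListNode));
    persist(&tail->next, sizeof(PPtr));

    // Reference passing: the allocator publishes the result directly into
    // tail->next (owned by the list), closing the blind spot.
    allocate(tail->next, sizeof(ListNode));
}
```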

SLIDE 7

PAllocator Design

We explore the following design dimensions

1. Pool structure (single file vs. multiple files)
2. Allocation strategies
3. Concurrency handling
4. Persistent fragmentation

We assume hardware-managed wear-leveling. We do not consider garbage collection.

SLIDE 8

1. Pool Structure: Single vs. Multiple Files

Pool as a single file
▪ Pros: 8-byte persistent pointers possible; easier to implement
▪ Cons: hard to shrink; huge block allocation is a problem

Pool as multiple files
▪ Pros: easier to grow and shrink; easy, fragmentation-free handling of huge allocations
▪ Cons: 16-byte persistent pointers

Multiple files better suited for database systems

SLIDE 9

2. Allocation Strategies

Three allocation strategies:
▪ One file per allocation
▪ Segregated-fit for small blocks (e.g., < 4 KB)
▪ Best-fit for medium and large blocks (e.g., [4 KB, 16 MB))

One file per allocation is not realistic in general:
▪ Significant overhead and wasted memory for small blocks
▪ The filesystem might struggle to handle a huge number of files
… except for huge blocks! → fragmentation handling is pushed to the filesystem
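A minimal sketch of routing a request by size (the thresholds come from the slide; the allocator class names mirror the architecture overview later in the deck, but their interfaces here are illustrative assumptions):

```cpp
#include <cstddef>

struct PPtr { unsigned long fileId, offset; };

// Illustrative allocator types, one per strategy.
struct SmallBlockAllocator { void allocate(PPtr&, std::size_t) { /* segregated-fit */ } };
struct BigBlockAllocator   { void allocate(PPtr&, std::size_t) { /* best-fit */ } };
struct HugeBlockAllocator  { void allocate(PPtr&, std::size_t) { /* one file per allocation */ } };

constexpr std::size_t kSmallLimit = 4ull * 1024;          // blocks < 4 KB
constexpr std::size_t kBigLimit   = 16ull * 1024 * 1024;  // blocks < 16 MB

// Route a request to the strategy that matches its size.
void allocate(PPtr& pptr, std::size_t size,
              SmallBlockAllocator& smallAlloc,
              BigBlockAllocator& bigAlloc,
              HugeBlockAllocator& hugeAlloc) {
    if (size < kSmallLimit)    smallAlloc.allocate(pptr, size);  // segregated-fit
    else if (size < kBigLimit) bigAlloc.allocate(pptr, size);    // best-fit
    else                       hugeAlloc.allocate(pptr, size);   // one file per allocation
}
```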

SLIDE 10

2. Allocation Strategies: Segregated-Fit Allocation Strategy

▪ Fixed-size memory chunk (e.g., 8 KB) divided into fixed-size blocks, tracked by a bitmap (allocated vs. free blocks)
▪ Multiple class sizes
▪ One allocation == one bit flip!
+ Reduced fragmentation with a moderate number of class sizes
– Not suitable for larger block allocations
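A minimal sketch of the one-bit-flip idea for a single size class (the chunk layout, field names, and the commented-out persist calls are illustrative assumptions; a real persistent allocator would flush the modified bitmap word and handle recovery):

```cpp
#include <cstddef>
#include <cstdint>

// One fixed-size chunk (e.g., 8 KB) serving a single size class; a bitmap
// records which blocks are allocated (1) or free (0).
struct Chunk {
    std::uint64_t bitmap[4];   // 256 bits -> up to 256 blocks
    std::uint32_t blockSize;   // size class served by this chunk
    std::uint32_t numBlocks;   // blocks that actually fit in the payload
    char          payload[8 * 1024 - 40];
};

// Allocate one block: find a clear bit and set it -- the single persistent
// bit flip the slide refers to. Returns the block's offset, or -1 if full.
long allocateBlock(Chunk& c) {
    for (std::size_t w = 0; w < 4; ++w) {
        if (c.bitmap[w] == ~0ull) continue;             // this word is full
        unsigned bit = __builtin_ctzll(~c.bitmap[w]);   // first free block
        std::size_t idx = w * 64 + bit;
        if (idx >= c.numBlocks) break;                  // past the last real block
        c.bitmap[w] |= (1ull << bit);                   // the one bit flip
        // persist(&c.bitmap[w], 8);                    // flush + fence on real NVM
        return static_cast<long>(idx * c.blockSize);
    }
    return -1;
}

// Freeing reverses the flip: clear the bit of the block at 'offset'.
void freeBlock(Chunk& c, std::size_t offset) {
    std::size_t idx = offset / c.blockSize;
    c.bitmap[idx / 64] &= ~(1ull << (idx % 64));
    // persist(&c.bitmap[idx / 64], 8);
}
```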

SLIDE 11

2. Allocation Strategies: Best-Fit Allocation Strategy

▪ Allocates multiples of a predetermined size (e.g., the system page size) out of segments (e.g., 128 MB)
▪ Free-blocks index sorted by block size (for allocation); global block index sorted by block offset (for coalescing)
▪ Indexes implemented with the FPTree, a hybrid NVM-DRAM B+-Tree [SIGMOD'16]: inner nodes in DRAM, leaves in NVM
+ Suitable for large blocks
– Prone to fragmentation
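A sketch of the best-fit lookup, with std::multimap standing in for the FPTree-backed free-blocks index sorted by size (the split/re-insert details are illustrative assumptions):

```cpp
#include <cstdint>
#include <map>

// Free block inside a segment, identified by its offset and size.
struct FreeBlock { std::uint64_t offset; std::uint64_t size; };

// Stand-in for the free-blocks index sorted by block size. PAllocator
// implements this index with the FPTree; an ordered multimap gives the
// same lower_bound semantics for the sketch.
using FreeBySize = std::multimap<std::uint64_t /*size*/, std::uint64_t /*offset*/>;

// Best fit: smallest free block whose size is >= the request.
// Returns true and fills 'result'; any remainder is re-inserted as free.
bool bestFit(FreeBySize& freeBlocks, std::uint64_t request, FreeBlock& result) {
    auto it = freeBlocks.lower_bound(request);
    if (it == freeBlocks.end()) return false;      // no block large enough

    result = {it->second, request};
    std::uint64_t remainder = it->first - request;
    std::uint64_t offset    = it->second;
    freeBlocks.erase(it);
    if (remainder > 0)                              // split: keep the tail free
        freeBlocks.emplace(remainder, offset + request);
    return true;
}
```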

SLIDE 12

3. Concurrency Handling

Thread-local allocation – one allocator object per thread
▪ The standard in general-purpose allocators
▪ Used for small block allocations → the local allocator requests chunks from the global pool
▪ Needs to be merged with the global pool when the thread terminates
▪ Does not scale under high concurrency → frequent chunk requests to the global pool
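A minimal sketch of the thread-local pattern using C++ thread_local (GlobalPool, Chunk, and the merge-on-exit destructor are illustrative assumptions; PAllocator itself uses the core-local scheme on the next slide):

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

struct Chunk { /* a fixed-size NVM chunk handed out by the global pool */ };

// Global pool shared by all threads; every chunk request takes its lock,
// which is what limits scalability under high concurrency.
struct GlobalPool {
    std::mutex m;
    std::vector<Chunk*> freeChunks;
    Chunk* getChunk() {
        std::lock_guard<std::mutex> g(m);
        if (freeChunks.empty()) return nullptr;   // real code would grow the pool
        Chunk* c = freeChunks.back();
        freeChunks.pop_back();
        return c;
    }
    void putChunk(Chunk* c) {
        std::lock_guard<std::mutex> g(m);
        freeChunks.push_back(c);
    }
};

GlobalPool globalPool;

// One allocator object per thread. Its chunks must be merged back into the
// global pool when the thread terminates -- done here by the destructor.
struct ThreadLocalAllocator {
    std::vector<Chunk*> myChunks;
    void* allocateSmall(std::size_t /*size*/) {
        if (myChunks.empty())
            if (Chunk* c = globalPool.getChunk()) myChunks.push_back(c);  // contended path
        /* carve a block out of the current chunk (segregated-fit) ... */
        return nullptr;
    }
    ~ThreadLocalAllocator() {
        for (Chunk* c : myChunks) globalPool.putChunk(c);
    }
};

thread_local ThreadLocalAllocator tlAllocator;   // one object per thread
```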

SLIDE 13

3. Concurrency Handling

Core-local allocation – one allocator object per physical core
▪ Local allocators request large files from the global pool
▪ Robust performance under high concurrency → stable local allocators
▪ Greedy

[Figure: two sockets connected via QPI, each with cores C1 and C2 and one allocator object (Alloc) per core]
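A minimal sketch of picking the core-local allocator object on Linux (sched_getcpu is the real glibc call; the allocator array and fallback are illustrative assumptions):

```cpp
#include <sched.h>      // sched_getcpu() -- Linux/glibc
#include <cstddef>
#include <thread>
#include <vector>

struct PPtr { unsigned long fileId, offset; };

struct CoreLocalAllocator {
    // In PAllocator this would be the core's own small/big block allocator
    // working on files it requested from the global pool.
    void allocate(PPtr& /*pptr*/, std::size_t /*size*/) { /* ... */ }
};

// One allocator object per core, created once at startup.
static const unsigned kCores =
    std::thread::hardware_concurrency() > 0 ? std::thread::hardware_concurrency() : 1;
std::vector<CoreLocalAllocator> coreAllocators(kCores);

void allocateSmall(PPtr& pptr, std::size_t size) {
    int cpu = sched_getcpu();          // core the calling thread currently runs on
    if (cpu < 0) cpu = 0;              // fallback if the call fails
    // No merge step on thread exit: the allocator belongs to the core, not the thread.
    coreAllocators[static_cast<std::size_t>(cpu) % coreAllocators.size()].allocate(pptr, size);
}
```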

SLIDE 14

4. Persistent Fragmentation

▪ Restart is a last resort, but a valid way of defragmenting volatile memory → does not apply to NVM
▪ File-system solutions do not apply to NVM either:

  • File systems benefit from an additional indirection layer
  • NVM is directly accessed with load/store instructions

Need new defragmentation mechanisms

SLIDE 15

4. Persistent Fragmentation

▪ Most file systems support sparse files
▪ Defragmentation idea: punch holes in free blocks
  1. Find the largest free block
  2. Punch a hole using fallocate
  3. Iterate until the target size is reached
▪ The file size must stay unchanged to maintain the validity of offsets

[Figure: a segment before and after hole punching – free blocks become holes while used blocks and the file size stay in place]
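A minimal sketch of the hole-punching step on Linux (fallocate with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE is the real API; the wrapper name and error handling are illustrative):

```cpp
// Linux-specific sketch: punch a hole over a free block in a segment file.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>   // fallocate(), FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE

// 'fd' is the open segment file; 'offset'/'length' describe the free block.
// FALLOC_FL_KEEP_SIZE leaves the file size unchanged, so persistent offsets
// into the segment stay valid; the punched range stops occupying space.
bool punchHole(int fd, off_t offset, off_t length) {
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, length) == 0;   // on failure, errno says why (e.g., EOPNOTSUPP)
}
```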

SLIDE 16

PAllocator: Architecture Overview

[Architecture figure] PAllocator consists of three persistent allocators – a Small Block Allocator, a Big Block Allocator, and a Huge Block Allocator – instantiated as allocator objects (Small Alloc 1 … n, Big Alloc 1 … n, and one Huge Alloc) that share a list of free segments. Below them, a Segment Manager acts as a failure-atomic segment provider (file creation, deletion, memory mapping, segment ownership map) on top of an NVM-aware filesystem.

SLIDE 17

PAllocator Performance Evaluation

[Chart: random-size allocation/deallocation (64 B – 128 KB); throughput [op/thread/sec] over 1–16 threads for PAllocator, NVML, and jemalloc; annotated factors: 7.6x and 1.7x]

PAllocator scales nearly linearly.

SLIDE 18

Allocator Performance Impact on the FPTree

[Charts: FPTree throughput (KOps/s) with PAllocator vs. NVML, for a 100% Insert workload and a 50% Find / 50% Insert workload; annotated factors: 1.4x and 1.2x]

Persistent allocators do impact database performance.

SLIDE 19

Allocator Recovery Time

[Chart: recovery time [sec] (log scale, 0.0001–100) over allocated data size [MB] (60–60000) for PAllocator, NVML, Makalu, and nvm_malloc; annotated factors: 4.6x, 29.5x, 516x]

For 1 TB of allocated data → PAllocator: 0.75 s, NVML: 3.5 s, Makalu: 394.5 s, nvm_malloc: 22.5 s.

SLIDE 20

Conclusion

NVM has the potential to disrupt the database storage architecture
➢ Memory management is a necessary building block

[PAllocator architecture overview figure, repeated from SLIDE 16]

We presented PAllocator:
▪ Designed for large NVM systems
▪ Highly scalable
▪ Fast recovery
▪ Defragmentation capability

SLIDE 21

State-of-the-Art

Salient differences in design decisions

Allocator | Purpose | Pool structure | Allocation strategies | Concurrency handling | Garbage collection | Defragmentation | Source
Mnemosyne | General | Multiple files | Segregated-fit + best-fit | Thread-local for small blocks | Yes | No | ASPLOS'11
NV-Heaps | General | Single file | Undefined | Thread-local | Yes | No | ASPLOS'11
nvm_malloc | General | Single file | Segregated-fit + best-fit | Thread-local for small blocks | No | No | ADMS'15
NVML | General | Single file | Segregated-fit + best-fit | Thread-local for small blocks | No | No | http://pmem.io/nvml/
Makalu | General | Single file | Segregated-fit + best-fit | Thread-local for small blocks | Yes (offline) | No | OOPSLA'16
PAllocator | Large systems | Multiple files | Segregated-fit + best-fit + one file per allocation | Core-local | No | Yes | VLDB'17

For completeness: NVMalloc and Walloc focus on wear-leveling.