Memory Management Techniques for Large-Scale Persistent-Main-Memory Systems [VLDB 2017]


SLIDE 1

Memory Management Techniques for Large-Scale Persistent-Main-Memory Systems [VLDB 2017]

Ismail Oukid, Daniel Booss, Adrien Lespinasse, Wolfgang Lehner, Thomas Willhalm, Grégoire Gomes Non-Volatile Memories Workshop – March 12, 2018

SLIDE 2

Motivation

▪ NVM can replace both main memory and storage → a single-level database storage architecture without I/O
▪ Fail-safe persistent NVM memory management is a conditio sine qua non for enabling this novel architecture paradigm
▪ Existing persistent allocators are general-purpose and do not address the versatile needs of database systems
▪ We present PAllocator, a highly scalable fail-safe persistent allocator

SLIDE 3

Outline

▪ What is a persistent allocator?
▪ PAllocator's design decisions
▪ Experimental evaluation
▪ Conclusion

SLIDE 4

What Characterizes a Persistent Allocator?

A persistent allocator must:

1. Provide a recoverable addressing scheme
2. Avoid persistent memory leaks

[Figure: the application address space spans DRAM and NVM; a transient allocator manages DRAM and a persistent allocator manages NVM, both through the virtual memory subsystem]

SLIDE 5

1. Recoverable Addressing Scheme

▪ Program root stored at a known offset
▪ NVM files are mapped (mmap) into the virtual address space; each mapping has a start address
▪ Persistent pointer: PPtr = {File ID, Offset}
▪ Volatile pointer = file start address + offset
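A minimal sketch of this addressing scheme in C++ (the PPtr layout follows the slide; the per-process mapping table and the name fileStartAddress are illustrative assumptions):

```cpp
#include <cstdint>
#include <unordered_map>

// Persistent pointer as on the slide: {File ID, Offset} -- 16 bytes.
struct PPtr {
    std::uint64_t fileId;   // which NVM file (segment) the object lives in
    std::uint64_t offset;   // offset of the object within that file
};

// Hypothetical per-process table of current mmap start addresses; it is
// rebuilt on every restart because mapping addresses are not stable.
std::unordered_map<std::uint64_t, char*> fileStartAddress;

// Volatile pointer = file start address + offset (as on the slide).
inline void* toVolatile(const PPtr& pptr) {
    return fileStartAddress.at(pptr.fileId) + pptr.offset;
}
```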

SLIDE 6

2. Preventing Memory Leaks

▪ The traditional interface has a "blind spot": with pptr = allocate(size); persist(&pptr); a crash between the allocation and persisting pptr loses the only reference to the block → persistent memory leak
▪ Solution: reference passing → allocate(PPtr &pptr, size_t allocSize), where pptr is owned by the data structure
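A sketch contrasting the two interfaces (persist() stands in for the allocator's flush-and-fence primitive; the stub bodies and the ListNode type are illustrative assumptions, not PAllocator's actual API):

```cpp
#include <cstddef>
#include <cstdint>

struct PPtr { std::uint64_t fileId, offset; };

// Stub standing in for the cache-line flush + fence the real allocator issues.
static void persist(const void* /*addr*/, std::size_t /*len*/) { /* e.g., CLWB + SFENCE */ }

// Traditional interface: the result is returned by value.
static PPtr allocate(std::size_t /*size*/) { return PPtr{1, 4096}; }              // stub
// Reference-passing interface: the result is written into persistent memory
// that is already owned by the data structure.
static void allocate(PPtr& pptr, std::size_t /*allocSize*/) { pptr = {1, 4096}; } // stub

struct ListNode { PPtr next; };

void appendExample(ListNode* tail) {
    // Blind spot: a crash after allocate() returns but before persist()
    // completes leaves the new block unreachable -> persistent leak.
    tail->next = allocate(sizeof(ListNode));
    persist(&tail->next, sizeof(PPtr));

    // Reference passing: the allocator publishes the result directly into
    // tail->next (owned by the list), closing the blind spot.
    allocate(tail->next, sizeof(ListNode));
}
```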

SLIDE 7

PAllocator Design

We explore the following design dimensions

1. Pool structure (single file vs. multiple files)
2. Allocation strategies
3. Concurrency handling
4. Persistent fragmentation

We assume hardware-managed wear-leveling. We do not consider garbage collection.

SLIDE 8

1. Pool Structure: Single vs. Multiple Files

Pool as a single file
▪ Pros: 8-byte persistent pointers possible; easier to implement
▪ Cons: hard to shrink; huge block allocation is a problem

Pool as multiple files
▪ Pros: easier to grow and shrink; easy, fragmentation-free handling of huge allocations
▪ Cons: 16-byte persistent pointers

Multiple files better suited for database systems

SLIDE 9

2. Allocation Strategies

Three allocation strategies:
▪ One file per allocation
▪ Segregated-fit for small blocks (e.g., < 4 KB)
▪ Best-fit for medium and large blocks (e.g., [4 KB, 16 MB))

One file per allocation is not realistic in general:
▪ Significant overhead and wasted memory for small blocks
▪ The filesystem might struggle to handle a huge number of files
… except for huge blocks! → fragmentation handling is pushed to the filesystem
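A minimal sketch of routing a request by size (the thresholds come from the slide; the allocator class names mirror the architecture overview later in the deck, but their interfaces here are illustrative assumptions):

```cpp
#include <cstddef>

struct PPtr { unsigned long fileId, offset; };

// Illustrative allocator types, one per strategy.
struct SmallBlockAllocator { void allocate(PPtr&, std::size_t) { /* segregated-fit */ } };
struct BigBlockAllocator   { void allocate(PPtr&, std::size_t) { /* best-fit */ } };
struct HugeBlockAllocator  { void allocate(PPtr&, std::size_t) { /* one file per allocation */ } };

constexpr std::size_t kSmallLimit = 4ull * 1024;          // blocks < 4 KB
constexpr std::size_t kBigLimit   = 16ull * 1024 * 1024;  // blocks < 16 MB

// Route a request to the strategy that matches its size.
void allocate(PPtr& pptr, std::size_t size,
              SmallBlockAllocator& smallAlloc,
              BigBlockAllocator& bigAlloc,
              HugeBlockAllocator& hugeAlloc) {
    if (size < kSmallLimit)    smallAlloc.allocate(pptr, size);  // segregated-fit
    else if (size < kBigLimit) bigAlloc.allocate(pptr, size);    // best-fit
    else                       hugeAlloc.allocate(pptr, size);   // one file per allocation
}
```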

SLIDE 10

2. Allocation Strategies: Segregated-Fit Allocation Strategy

▪ Fixed-size memory chunk (e.g., 8 KB) divided into fixed-size blocks, tracked by a bitmap (allocated vs. free blocks)
▪ Multiple class sizes
▪ One allocation == one bit flip!
+ Reduced fragmentation with a moderate number of class sizes
– Not suitable for larger block allocations
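A minimal sketch of the one-bit-flip idea for a single size class (the chunk layout, field names, and the commented-out persist calls are illustrative assumptions; a real persistent allocator would flush the modified bitmap word and handle recovery):

```cpp
#include <cstddef>
#include <cstdint>

// One fixed-size chunk (e.g., 8 KB) serving a single size class; a bitmap
// records which blocks are allocated (1) or free (0).
struct Chunk {
    std::uint64_t bitmap[4];   // 256 bits -> up to 256 blocks
    std::uint32_t blockSize;   // size class served by this chunk
    std::uint32_t numBlocks;   // blocks that actually fit in the payload
    char          payload[8 * 1024 - 40];
};

// Allocate one block: find a clear bit and set it -- the single persistent
// bit flip the slide refers to. Returns the block's offset, or -1 if full.
long allocateBlock(Chunk& c) {
    for (std::size_t w = 0; w < 4; ++w) {
        if (c.bitmap[w] == ~0ull) continue;             // this word is full
        unsigned bit = __builtin_ctzll(~c.bitmap[w]);   // first free block
        std::size_t idx = w * 64 + bit;
        if (idx >= c.numBlocks) break;                  // past the last real block
        c.bitmap[w] |= (1ull << bit);                   // the one bit flip
        // persist(&c.bitmap[w], 8);                    // flush + fence on real NVM
        return static_cast<long>(idx * c.blockSize);
    }
    return -1;
}

// Freeing reverses the flip: clear the bit of the block at 'offset'.
void freeBlock(Chunk& c, std::size_t offset) {
    std::size_t idx = offset / c.blockSize;
    c.bitmap[idx / 64] &= ~(1ull << (idx % 64));
    // persist(&c.bitmap[idx / 64], 8);
}
```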

SLIDE 11

2. Allocation Strategies: Best-Fit Allocation Strategy

▪ Allocates multiples of a predetermined size (e.g., the system page size) out of segments (e.g., 128 MB)
▪ Free-blocks index sorted by block size (for allocation); global block index sorted by block offset (for coalescing)
▪ Indexes implemented with the FPTree, a hybrid NVM-DRAM B+-Tree [SIGMOD'16]: inner nodes in DRAM, leaves in NVM
+ Suitable for large blocks
– Prone to fragmentation
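A sketch of the best-fit lookup, with std::multimap standing in for the FPTree-backed free-blocks index sorted by size (the split/re-insert details are illustrative assumptions):

```cpp
#include <cstdint>
#include <map>

// Free block inside a segment, identified by its offset and size.
struct FreeBlock { std::uint64_t offset; std::uint64_t size; };

// Stand-in for the free-blocks index sorted by block size. PAllocator
// implements this index with the FPTree; an ordered multimap gives the
// same lower_bound semantics for the sketch.
using FreeBySize = std::multimap<std::uint64_t /*size*/, std::uint64_t /*offset*/>;

// Best fit: smallest free block whose size is >= the request.
// Returns true and fills 'result'; any remainder is re-inserted as free.
bool bestFit(FreeBySize& freeBlocks, std::uint64_t request, FreeBlock& result) {
    auto it = freeBlocks.lower_bound(request);
    if (it == freeBlocks.end()) return false;      // no block large enough

    result = {it->second, request};
    std::uint64_t remainder = it->first - request;
    std::uint64_t offset    = it->second;
    freeBlocks.erase(it);
    if (remainder > 0)                              // split: keep the tail free
        freeBlocks.emplace(remainder, offset + request);
    return true;
}
```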

SLIDE 12

3. Concurrency Handling

Thread-local allocation – one allocator object per thread
▪ The standard in general-purpose allocators
▪ Used for small block allocations → the local allocator requests chunks from the global pool
▪ Needs to be merged with the global pool when the thread terminates
▪ Does not scale under high concurrency → frequent chunk requests to the global pool
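A minimal sketch of the thread-local pattern using C++ thread_local (GlobalPool, Chunk, and the merge-on-exit destructor are illustrative assumptions; PAllocator itself uses the core-local scheme on the next slide):

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

struct Chunk { /* a fixed-size NVM chunk handed out by the global pool */ };

// Global pool shared by all threads; every chunk request takes its lock,
// which is what limits scalability under high concurrency.
struct GlobalPool {
    std::mutex m;
    std::vector<Chunk*> freeChunks;
    Chunk* getChunk() {
        std::lock_guard<std::mutex> g(m);
        if (freeChunks.empty()) return nullptr;   // real code would grow the pool
        Chunk* c = freeChunks.back();
        freeChunks.pop_back();
        return c;
    }
    void putChunk(Chunk* c) {
        std::lock_guard<std::mutex> g(m);
        freeChunks.push_back(c);
    }
};

GlobalPool globalPool;

// One allocator object per thread. Its chunks must be merged back into the
// global pool when the thread terminates -- done here by the destructor.
struct ThreadLocalAllocator {
    std::vector<Chunk*> myChunks;
    void* allocateSmall(std::size_t /*size*/) {
        if (myChunks.empty())
            if (Chunk* c = globalPool.getChunk()) myChunks.push_back(c);  // contended path
        /* carve a block out of the current chunk (segregated-fit) ... */
        return nullptr;
    }
    ~ThreadLocalAllocator() {
        for (Chunk* c : myChunks) globalPool.putChunk(c);
    }
};

thread_local ThreadLocalAllocator tlAllocator;   // one object per thread
```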

SLIDE 13

3. Concurrency Handling

Core-local allocation – one allocator object per physical core
▪ Local allocators request large files from the global pool
▪ Robust performance under high concurrency → stable local allocators
▪ Greedy

[Figure: two sockets connected via QPI, each with cores C1 and C2 and one allocator object (Alloc) per core]
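A minimal sketch of picking the core-local allocator object on Linux (sched_getcpu is the real glibc call; the allocator array and fallback are illustrative assumptions):

```cpp
#include <sched.h>      // sched_getcpu() -- Linux/glibc
#include <cstddef>
#include <thread>
#include <vector>

struct PPtr { unsigned long fileId, offset; };

struct CoreLocalAllocator {
    // In PAllocator this would be the core's own small/big block allocator
    // working on files it requested from the global pool.
    void allocate(PPtr& /*pptr*/, std::size_t /*size*/) { /* ... */ }
};

// One allocator object per core, created once at startup.
static const unsigned kCores =
    std::thread::hardware_concurrency() > 0 ? std::thread::hardware_concurrency() : 1;
std::vector<CoreLocalAllocator> coreAllocators(kCores);

void allocateSmall(PPtr& pptr, std::size_t size) {
    int cpu = sched_getcpu();          // core the calling thread currently runs on
    if (cpu < 0) cpu = 0;              // fallback if the call fails
    // No merge step on thread exit: the allocator belongs to the core, not the thread.
    coreAllocators[static_cast<std::size_t>(cpu) % coreAllocators.size()].allocate(pptr, size);
}
```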

SLIDE 14

4. Persistent Fragmentation

▪ Restart is a last resort, but a valid way of defragmenting volatile memory → does not apply to NVM
▪ File-system solutions do not apply to NVM either:

  • File systems benefit from an additional indirection layer
  • NVM is directly accessed with load/store instructions

Need new defragmentation mechanisms

SLIDE 15

4. Persistent Fragmentation

▪ Most file systems support sparse files
▪ Defragmentation idea: punch holes in free blocks
  1. Find the largest free block
  2. Punch a hole using fallocate
  3. Iterate until the target size is reached
▪ The file size must stay unchanged to maintain the validity of offsets

[Figure: a segment before and after hole punching – free blocks become holes while used blocks and the file size stay in place]
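A minimal sketch of the hole-punching step on Linux (fallocate with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE is the real API; the wrapper name and error handling are illustrative):

```cpp
// Linux-specific sketch: punch a hole over a free block in a segment file.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>   // fallocate(), FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE

// 'fd' is the open segment file; 'offset'/'length' describe the free block.
// FALLOC_FL_KEEP_SIZE leaves the file size unchanged, so persistent offsets
// into the segment stay valid; the punched range stops occupying space.
bool punchHole(int fd, off_t offset, off_t length) {
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, length) == 0;   // on failure, errno says why (e.g., EOPNOTSUPP)
}
```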

SLIDE 16

PAllocator: Architecture Overview

[Architecture figure] PAllocator consists of three persistent allocators – a Small Block Allocator, a Big Block Allocator, and a Huge Block Allocator – instantiated as allocator objects (Small Alloc 1 … n, Big Alloc 1 … n, and one Huge Alloc) that share a list of free segments. Below them, a Segment Manager acts as a failure-atomic segment provider (file creation, deletion, memory mapping, segment ownership map) on top of an NVM-aware filesystem.

SLIDE 17

PAllocator Performance Evaluation

[Chart: random-size allocation/deallocation (64 B – 128 KB); throughput [op/thread/sec] over 1–16 threads for PAllocator, NVML, and jemalloc; annotated factors: 7.6x and 1.7x]

PAllocator scales nearly linearly.

SLIDE 18

Allocator Performance Impact on the FPTree

[Charts: FPTree throughput (KOps/s) with PAllocator vs. NVML, for a 100% Insert workload and a 50% Find / 50% Insert workload; annotated factors: 1.4x and 1.2x]

Persistent allocators do impact database performance.

SLIDE 19

Allocator Recovery Time

[Chart: recovery time [sec] (log scale, 0.0001–100) over allocated data size [MB] (60–60000) for PAllocator, NVML, Makalu, and nvm_malloc; annotated factors: 4.6x, 29.5x, 516x]

For 1 TB of allocated data → PAllocator: 0.75 s, NVML: 3.5 s, Makalu: 394.5 s, nvm_malloc: 22.5 s.

SLIDE 20

Conclusion

NVM has the potential to disrupt the database storage architecture
➢ Memory management is a necessary building block

[PAllocator architecture overview figure, repeated from SLIDE 16]

We presented PAllocator:
▪ Designed for large NVM systems
▪ Highly scalable
▪ Fast recovery
▪ Defragmentation capability

SLIDE 21

State-of-the-Art

Salient differences in design decisions

Allocator | Purpose | Pool structure | Allocation strategies | Concurrency handling | Garbage collection | Defragmentation | Source
Mnemosyne | General | Multiple files | Segregated-fit + best-fit | Thread-local for small blocks | Yes | No | ASPLOS'11
NV-Heaps | General | Single file | Undefined | Thread-local | Yes | No | ASPLOS'11
nvm_malloc | General | Single file | Segregated-fit + best-fit | Thread-local for small blocks | No | No | ADMS'15
NVML | General | Single file | Segregated-fit + best-fit | Thread-local for small blocks | No | No | http://pmem.io/nvml/
Makalu | General | Single file | Segregated-fit + best-fit | Thread-local for small blocks | Yes (offline) | No | OOPSLA'16
PAllocator | Large systems | Multiple files | Segregated-fit + best-fit + one file per allocation | Core-local | No | Yes | VLDB'17

For completeness: NVMalloc and Walloc focus on wear-leveling.