Object-Oriented Recovery for Non-volatile Memory




slide-1
SLIDE 1

Object-Oriented Recovery for Non-volatile Memory

Nachshon Cohen, David Aksun, James Larus
EPFL
10th Annual Non-volatile Memories Workshop
San Diego, CA
March 12, 2019

Cohen, Aksun, Larus. Object-Oriented Recovery for Non-Volatile Memory. OOPSLA 2018.

slide-2
SLIDE 2

Overview

  • Prior NVM recovery mechanisms are incomplete
  • Your carefully stored, consistent data may be unusable
  • Object-oriented recovery
  • LLVM extension to support complete recovery

2 James Larus

slide-3
SLIDE 3

NVM Lifecycle

[Figure: lifecycle diagram with stages Run, Terminate, Recovery]

  • 1. Code accesses NVM with load and store instructions
  • 2. NVM must record a consistent memory state before termination, planned or unexpected
  • 3. Ensure NVM state is consistent in the environment in which execution restarts

slide-4
SLIDE 4

Recovery Problems

  • 1. Non-persistent data
  • 2. NVM remapping
  • 3. Code remapping


slide-5
SLIDE 5

1. Non-Persistent Data in NVM

Network socket referenced from NVM is valid in current environment

[Figure: NVM region referencing a network socket]

slide-6
SLIDE 6


slide-7
SLIDE 7

Environment Can Change on Restart

Network socket is no longer usable

[Figure: NVM region referencing a network socket after restart]

slide-8
SLIDE 8

Environmentally-Specific Data

  • Network sockets
  • Locks
  • Process and thread IDs
  • File handles
  • Common practice to store [pointers to] these objects in NVM
  • Fast access
  • Must restore / reinitialize during recovery
  • Traverse all objects in NVM (= GC)

“Lesson 8: Initialization of semantically nonpersistent data colocated with persistent data is tricky. Programmers frequently find it convenient to co-locate nonpersistent data in persistent objects.”

-- Persistent Memcached: Bringing Legacy Code to Byte-Addressable Persistent Memory, HotStorage ’17.
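
The restore step on this slide (traverse all objects in NVM and reinitialize environment-specific fields) might look roughly like the sketch below. It is an illustration only, not the authors' implementation: `Connection`, its fields, and `reinitialize_all` are hypothetical names, and a `std::vector` stands in for an NVM region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object that co-locates persistent and environment-specific data.
struct Connection {
    char peer[32];   // persistent: still meaningful after a restart
    int  socket_fd;  // environment-specific: the descriptor dies with the process
    long owner_tid;  // environment-specific: thread ID from the previous run
};

// Hypothetical recovery pass: visit every object in the region (much like a
// GC traversal) and reset the fields that cannot survive a restart.
// Returns the number of objects visited.
std::size_t reinitialize_all(std::vector<Connection>& region) {
    std::size_t visited = 0;
    for (Connection& c : region) {
        c.socket_fd = -1;  // mark as "not connected"; must be reopened before use
        c.owner_tid = 0;   // no owning thread in the new process yet
        ++visited;
    }
    return visited;
}
```

The persistent field (`peer`) is left untouched; only the fields tied to the old process are reset.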

slide-9
SLIDE 9

2. NVM (Re)Mapping

base = mmap(0x1000, …, nvm_fd);

[Figure: objects A and B in NVM; addresses 0x1000, 0x1200]

slide-10
SLIDE 10


slide-11
SLIDE 11

Remapped To Different Address

base = mmap(0x1000, …, nvm_fd);

But, kernel may mmap to a different address

[Figure: objects A and B in NVM; addresses 0x1000, 0x2200, 0x2000]

slide-12
SLIDE 12

mmap

  • Always map to specified virtual address? NO
  • OS upgrade
  • NVM grows/shrinks
  • Execution under debugger/profiler/etc.
  • Earlier actions during recovery
  • Mapping in several NVM segments

“If addr is not NULL, then the kernel takes it as a hint about where to place the mapping…”

-- MMAP(2) man page

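
A small sketch of the hint behavior quoted above, assuming a Linux-like system. `Mapping` and `map_with_hint` are hypothetical names, and an anonymous mapping stands in for the NVM file so the example needs no `nvm_fd`. The code never passes `MAP_FIXED`, so the kernel is free to ignore the requested address.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical result type: where the mapping landed, and whether the
// kernel placed it exactly at the requested address.
struct Mapping {
    void* addr;
    bool  hint_honored;
};

// Ask for a mapping at `hint` WITHOUT MAP_FIXED. The address is only a hint,
// so the caller has to inspect the returned address.
// Assumption: an anonymous private mapping stands in for an NVM-backed file.
Mapping map_with_hint(void* hint, std::size_t length) {
    void* p = mmap(hint, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return Mapping{p, p != MAP_FAILED && p == hint};
}
```

A recovery routine could compare the returned address with the base address recorded by the previous execution and treat any difference as a remapping that requires pointer adjustment.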
slide-13
SLIDE 13

3. Code and Literal Pointers

  • Function pointers and virtual pointers are also execution specific
  • Address Space Layout Randomization (ASLR)
  • C++ objects contain method pointers
  • Object may not be well formed after restart


slide-14
SLIDE 14

Published Solutions

  • Forbid NVM to DRAM pointers [ASPLOS’11]
  • Impractical in real systems [HotStorage’17]
  • Ad-hoc, specific solutions
  • Generational locks [ASPLOS’11]
  • Self-relative pointers [NVML, NVM-Direct]
  • Comment code (and hope someone reads it)
  • Custom (re)initialization code


Data is not durable if it cannot survive system changes
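
To make the "self-relative pointers" entry above concrete, here is a minimal sketch of the general idea. It is not the NVML or NVM-Direct API; `SelfRelPtr` and `Node` are hypothetical. The pointer stores the distance from its own address to its target, so it stays valid when the whole region moves to a different base address.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical self-relative pointer: holds an offset from its own address
// rather than an absolute virtual address. Offset 0 encodes null.
template <typename T>
class SelfRelPtr {
    std::intptr_t offset_ = 0;

public:
    void set(T* target) {
        offset_ = target
            ? static_cast<std::intptr_t>(reinterpret_cast<std::uintptr_t>(target) -
                                         reinterpret_cast<std::uintptr_t>(this))
            : 0;
    }

    T* get() const {
        if (offset_ == 0) return nullptr;
        return reinterpret_cast<T*>(reinterpret_cast<std::uintptr_t>(this) +
                                    static_cast<std::uintptr_t>(offset_));
    }
};

// Hypothetical list node that links to its successor with a self-relative pointer.
struct Node {
    long key;
    SelfRelPtr<Node> next;
};
```

The cost is that every dereference does address arithmetic and that ordinary pointer types cannot be used; the deck's approach keeps standard pointers and fixes them during recovery instead.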

slide-15
SLIDE 15

NVM Reconstruction

  • Compiler support for object-level recovery
  • Recovery procedure for each object in NVM
  • Ensures that object is well-formed after recovery
  • Transparent to application

slide-16
SLIDE 16

LLVM Language Extension

struct … {
  void *CurrAllocAddr_;
  transient pthread_mutex_t lock;
  reconstructor(node *n) {
    pthread_mutex_init(&n->lock, NULL);
  }
  void addChild(long k) {
    left = pnew node(k);
  }
  …
};

  • transient: zero on restart
  • reconstructor: custom initialization code
  • pnew: allocate in NVM
  • Pointers: standard pointer (no relative addresses)

slide-17
SLIDE 17

NVM Reconstruction Workflow

[Figure: workflow diagram with labels Program, llvm*, Executable, DRAM, NVM, Code, NVM Object, Metadata, Reconstructor Runtime]

Clang/LLVM plugin

  • Extend objects with type information
  • Collect metadata

Runtime

  • Records runtime information, e.g., mapping address
  • Allocates header for each durable object

slide-18
SLIDE 18

Reconstruction After Failure

During recovery

  • Use type information from previous execution
  • Compute address space delta per page

For each live object:

  • Fix code pointers
  • Rebase NVM pointers
  • Zero transient fields
  • Invoke user-provided reconstructor
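
The per-object steps on this slide could be sketched as follows. This is a simplified illustration, not the NVM-Reconstruction runtime: `Node`, `reconstruct_node`, and `reconstruct_region` are hypothetical, a single delta is used for the whole region (the slide computes a delta per page), and the code-pointer fix is omitted because the example type has no virtual functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical durable object.
struct Node {
    long  key;         // persistent payload
    Node* left;        // absolute pointer into the NVM region
    int   lock_state;  // transient: meaningless after a restart
};

// Hypothetical user-provided reconstructor: re-create transient state.
void reconstruct_node(Node& n) {
    n.lock_state = 1;  // 1 = "initialized and unlocked" in this sketch
}

// Hypothetical eager recovery pass over `count` live objects.
// old_base: address of the region in the previous execution.
// new_base: address of the region in the current execution.
void reconstruct_region(Node* objects, std::size_t count,
                        std::uintptr_t old_base, std::uintptr_t new_base) {
    // Unsigned subtraction wraps, so this works for moves in either direction.
    const std::uintptr_t delta = new_base - old_base;
    for (std::size_t i = 0; i < count; ++i) {
        Node& n = objects[i];
        if (n.left != nullptr) {  // rebase NVM pointers
            n.left = reinterpret_cast<Node*>(
                reinterpret_cast<std::uintptr_t>(n.left) + delta);
        }
        n.lock_state = 0;         // zero transient fields
        reconstruct_node(n);      // invoke user-provided reconstructor
    }
}
```

Copying the objects to a second buffer and calling `reconstruct_region` with the old and new buffer addresses simulates a region that was mapped at a different address after restart.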

slide-19
SLIDE 19

Lazy Reconstruction

Initially: memory protect NVM region

On page-fault: For each live object in page

  • Apply system reconstruction: zero transient fields; fix NVM pointers, code pointers
  • Apply user-provided reconstruction
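
The protect-then-fault mechanism can be sketched with `mprotect` and a fault handler, assuming a Linux-like system. Everything in the `lazy` namespace is hypothetical; an anonymous mapping stands in for the NVM region, and a counter stands in for the per-object reconstruction work. Calling `mprotect` from a signal handler is common practice for this pattern, although POSIX does not list it as async-signal-safe.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <sys/mman.h>
#include <unistd.h>

namespace lazy {

char*       region     = nullptr;  // start of the protected region
std::size_t region_len = 0;
std::size_t page_size  = 0;
volatile std::sig_atomic_t pages_reconstructed = 0;

// Fault handler: unprotect the faulting page and "reconstruct" it once.
void on_fault(int, siginfo_t* info, void*) {
    const auto addr = reinterpret_cast<std::uintptr_t>(info->si_addr);
    const auto base = reinterpret_cast<std::uintptr_t>(region);
    if (addr < base || addr >= base + region_len) {
        _exit(EXIT_FAILURE);  // a genuine crash outside the region
    }
    char* page = reinterpret_cast<char*>(addr - (addr - base) % page_size);
    if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0) {
        _exit(EXIT_FAILURE);
    }
    // Placeholder for: for each live object in the page, apply system
    // reconstruction, then the user-provided reconstructor.
    pages_reconstructed = pages_reconstructed + 1;
}

// Map `pages` inaccessible pages and install the handler.
// Returns the region start, or nullptr on failure.
char* setup(std::size_t pages) {
    page_size  = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    region_len = pages * page_size;
    void* p = mmap(nullptr, region_len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return nullptr;
    region = static_cast<char*>(p);

    struct sigaction sa {};
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    // Some systems report protection faults as SIGBUS rather than SIGSEGV.
    if (sigaction(SIGSEGV, &sa, nullptr) != 0) return nullptr;
    if (sigaction(SIGBUS, &sa, nullptr) != 0) return nullptr;
    return region;
}

}  // namespace lazy
```

After the handler returns, the faulting instruction is retried and succeeds, so each page pays the reconstruction cost only on first touch and untouched pages pay nothing.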

slide-20
SLIDE 20

Performance Measurements: Atlas

  • Applied NVM-Reconstruction to Atlas [Chakrabarti 2014]
  • Support for transient fields, different mapping addresses, etc.
  • Negligible runtime cost
  • Measured simple Key-Value Store
  • Recovery time: up to 200 ms/GB, depends on number of items

slide-21
SLIDE 21

Code Change Measurements: Echo KV Store

  • Incorporated NVM-Reconstruction into Echo Key-Value Store
  • Original code: 22,503 SLOC, no recovery
  • NVMReconstructor: added 214 SLOC, full recovery

pnew pdelete realloc extra transient reconstructor total
Added SLOC 38 68 25 19 64 214

slide-22
SLIDE 22

Reconstruction Test

Environment:

  • 1. gcc -O3
  • 2. Original classes
  • 3. mmap @ 2^40

Environment:

  • 1. gcc -O0
  • 2. Add field to each class
  • 3. mmap @ 3 x 2^40

slide-23
SLIDE 23

Conclusions

  • Execution environment may differ after restart
  • Need to recover execution-specific data and adjust for environment changes
  • NVM-Reconstruction: system-level approach for object-level recovery
  • Transient fields, virtual address pointers, custom reconstructor functions
  • Low overhead
  • Easy to use


Questions?