Strand Persistency
Vaibhav Gogte, William Wang$, Stephan Diestelhorst$, Peter M. Chen, Satish Narayanasamy, Thomas F. Wenisch
NVMW 03/12/2019
Promise of persistent memory (PM)
– Performance
– Density
– Non-volatility
“Optane DC Persistent Memory will be
“… expanding memory per CPU socket to as much as 3TB.” *
* Source: www.extremetech.com
Byte-addressable, load-store interface to durable storage
Recovery can inspect PM data structures to restore the system to a consistent state
[Figure: CPU with write-back caches; labels: "Consistency model", "Persistency model", "for recovery"]
– Hardware ISA primitives to specify precise ordering constraints
– Can encode an arbitrary DAG
– Leverage strand persistency to build persistency models efficiently
Strand persistency improves the performance of language-level persistency models by 21.4% (avg.)
Init: x = 0; y = 0
atomic_begin()
  x = 1;
  y = 2;
atomic_end()
Undo-logging steps: persistUndoLog (L), mutateData (M), commitLog (C), persistData (P)
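The undo-logging steps named above can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation: the `PersistentMemory` class, the `crash_after` parameter, and the `committed` flag are hypothetical names introduced for illustration, and the step order used here (log, then mutate and persist, then commit) is the conventional undo-logging order, assumed rather than taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentMemory:
    """Toy stand-in for PM (hypothetical): everything stored here survives a crash."""
    data: dict
    log: list = field(default_factory=list)  # undo records: (address, old_value)
    committed: bool = True                   # False while a region is in flight


def atomic_update(pm, updates, crash_after=None):
    """Apply `updates` as one failure-atomic region.

    `crash_after=N` (hypothetical knob) stops after N stores to mimic a power failure.
    """
    pm.committed = False
    for done, (addr, new_value) in enumerate(updates.items()):
        if crash_after is not None and done == crash_after:
            return                            # crash: region stays uncommitted
        pm.log.append((addr, pm.data[addr]))  # persistUndoLog (L)
        pm.data[addr] = new_value             # mutateData (M) and persistData (P)
    pm.committed = True                       # commitLog (C)
    pm.log.clear()


def recover(pm):
    """Roll back a partially executed region using its undo records."""
    if not pm.committed:
        for addr, old_value in reversed(pm.log):
            pm.data[addr] = old_value
        pm.log.clear()
        pm.committed = True


pm = PersistentMemory({"x": 0, "y": 0})
atomic_update(pm, {"x": 1, "y": 2}, crash_after=1)  # crash after x is written
recover(pm)                                         # x is restored to its old value
```

The sketch only works because each undo record is written before the store it protects; that ordering requirement is what the hardware primitives discussed next have to enforce.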
Logging operations for the atomic region: Log(Lx,x); CLWB(Lx); Store(x,1); Log(Ly,y); CLWB(Ly); Store(y,2)
With SFENCE: Log(Lx,x); CLWB(Lx); SFENCE; Store(x,1); Log(Ly,y); CLWB(Ly); SFENCE; Store(y,2)
[Figure: persist-order diagram over the same six operations]
PersistBarrier: orders persists within a strand
NewStrand: initiates a new stream of persists
Persists on different strands can be issued concurrently to PM
[Figure: Persist A, Persist B, and Persist C placed on Strand 0 and Strand 1, separated by PersistBarrier and NewStrand]
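To make the two primitives concrete, the sketch below derives the persist-order edges implied by a short instruction stream. It is a simplified model under stated assumptions: the token names and the `persist_order` function are hypothetical, the example program (A, PersistBarrier, B, NewStrand, C) is one assumed reading of the figure, and any ordering between strands is deliberately left out.

```python
def persist_order(program):
    """Return the set of (earlier, later) edges implied within each strand.

    `program` is a list of tokens: "PersistBarrier", "NewStrand",
    or any other string naming a persist.
    """
    edges = set()
    before_barrier = []  # persists in this strand that precede the latest barrier
    since_barrier = []   # persists in this strand issued after the latest barrier
    for op in program:
        if op == "NewStrand":
            # A new strand starts with no ordering against earlier persists.
            before_barrier, since_barrier = [], []
        elif op == "PersistBarrier":
            # Everything issued so far in this strand now precedes later persists.
            before_barrier = before_barrier + since_barrier
            since_barrier = []
        else:
            edges.update((earlier, op) for earlier in before_barrier)
            since_barrier.append(op)
    return edges


program = ["Persist A", "PersistBarrier", "Persist B", "NewStrand", "Persist C"]
print(persist_order(program))  # → {('Persist A', 'Persist B')}
```

In this model Persist C has no incoming edge, so it may be issued to PM concurrently with Persist A and Persist B; dropping the NewStrand token would instead order Persist C after Persist A.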
[Figure: Strand 0 contains Persist A, Persist B, and a PersistBarrier; Strand 1, started by NewStrand, contains Persist A, Persist C, and a PersistBarrier; label: "Inter-strand"]
atomic_begin(); x = 1; y = 2; atomic_end()
Strand 0: Log(Lx,x); CLWB(Lx); PersistBarrier; Store(x,1)
NewStrand
Strand 1: Log(Ly,y); CLWB(Ly); PersistBarrier; Store(y,2)
Need to implement a log buffer that can manage concurrent log updates
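The mapping from an atomic region to strand primitives can be sketched as a tiny lowering pass. This is an illustrative assumption about how such a pass might look, not the compiler described in the talk; the function names `lower_with_sfence` and `lower_with_strands` are hypothetical, and the operations are emitted as plain strings.

```python
def lower_with_sfence(stores):
    """Emit the logging sequence with an SFENCE between each log and its store."""
    ops = []
    for var, value in stores:
        ops += [f"Log(L{var},{var})", f"CLWB(L{var})", "SFENCE", f"Store({var},{value})"]
    return ops


def lower_with_strands(stores):
    """Emit one strand per store: the barrier orders the log before the store only."""
    ops = []
    for index, (var, value) in enumerate(stores):
        if index > 0:
            ops.append("NewStrand")  # the next log/store pair is unordered with the last
        ops += [f"Log(L{var},{var})", f"CLWB(L{var})", "PersistBarrier", f"Store({var},{value})"]
    return ops


region = [("x", 1), ("y", 2)]  # the body of the atomic region above
for op in lower_with_strands(region):
    print(op)
```

Both lowerings keep each log ahead of its own store. The strand version differs only in that the operations on y are not ordered behind those on x.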
Log buffer: Invalid | Log 0 | Log 1 | Invalid
Persistent head atomically commits logs
Volatile tail for concurrent log creation
– Identify valid logs in case of failure
– Record order of log creation
– Recovery rolls back partial updates using valid logs
More details in the paper
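The log-buffer properties listed above can be sketched as follows. The paper holds the actual design; this is only a toy under assumptions: the `LogBuffer` class, its sequence numbers, and the rule "an entry is valid if its sequence number is at or beyond the head" are all hypothetical choices made for illustration, and the sketch does not guard against the buffer wrapping over uncommitted entries.

```python
import itertools
import threading


class LogBuffer:
    """Toy circular undo-log buffer (hypothetical layout)."""

    def __init__(self, capacity):
        self.entries = [None] * capacity  # assumed persistent; None marks an invalid slot
        self.head = 0                     # assumed persistent; logs below head are committed
        self._tail = itertools.count()    # volatile; hands out slots to concurrent writers
        self._lock = threading.Lock()

    def append(self, addr, old_value):
        """Create an undo record; the sequence number records creation order."""
        with self._lock:
            seq = next(self._tail)
        self.entries[seq % len(self.entries)] = (seq, addr, old_value)
        return seq

    def commit(self, upto_seq):
        """Commit every log up to `upto_seq` by advancing the head in one step."""
        self.head = upto_seq + 1

    def recover(self, data):
        """Roll back uncommitted updates, newest first, using the valid logs."""
        valid = [e for e in self.entries if e is not None and e[0] >= self.head]
        for seq, addr, old_value in sorted(valid, reverse=True):
            data[addr] = old_value
        if valid:
            self.head = max(seq for seq, _, _ in valid) + 1


data = {"x": 0, "y": 0}
buf = LogBuffer(capacity=4)
first = buf.append("x", data["x"])
data["x"] = 1
buf.commit(first)             # the update to x is now durable
buf.append("y", data["y"])
data["y"] = 2                 # a failure here leaves y uncommitted
buf.recover(data)             # y is rolled back, x is kept
```

Keeping the tail volatile is what lets several strands create logs without ordering their persists against each other, while the single persistent head gives recovery one place to decide which logs still matter.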
Hardware ISA: ISA primitives PersistBarrier and NewStrand
Compiler: logging implementations that map to hardware primitives
High-level languages: failure atomicity for language-level persistency models
L1.lock();
x -= 100;
y += 100;
L2.lock();
a -= 100;
b += 100;
L2.unlock();
L1.unlock();
– Queue: insert/delete entries in a queue
– Hashmap: update values in persistent hash table
– Array swaps: random swaps of array elements
– RBTree: insert/delete entries in red-black tree
– TPCC: new order transaction from TPCC
[Figure: bar chart for ATLAS and Coupled-SFR across Queue, Hashmap, Array swap, RBTree, TPCC, and Mean]
Improves performance of ATLAS by up to 29.9% (18.2% avg.)
Improves performance of Coupled-SFR by up to 34.5% (21.4% avg.)
– Work together to relax ordering constraints in undo logging