Soft Updates Made Simple and Fast on Non-volatile Memory


Mingkai Dong, Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University @ NVMW '18


slide-1
SLIDE 1

Soft Updates Made Simple and Fast on Non-volatile Memory

Mingkai Dong, Haibo Chen Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University

@ NVMW ‘18

slide-2
SLIDE 2

Non-volatile Memory (NVM)

  ✓ Non-volatile
  ✓ Byte-addressable
  ✓ High throughput and low latency

slide-3
SLIDE 3

NVM File Systems (NVMFS)

Existing NVMFS use journaling or copy-on-write for crash consistency
Synchronous cache flushes are necessary
Cache flushes are expensive!
Other options for crash consistency?

[Figure: file system metadata blocks A to E, a journal area holding copies A', C', E', and two inodes]


slide-5
SLIDE 5

Soft Updates

Latest metadata in DRAM
  • Updated in DRAM with dependencies tracked
      ✓ DRAM performance
      ✓ No synchronous disk writes
Consistent metadata on disk
  • Persisted to disk with dependencies enforced
      ✓ Always consistent
      ✓ Immediately usable after a crash

[Figure: traditional soft updates, with DRAM (page cache) and disk]

slide-6
SLIDE 6

Soft Updates

Update dependencies
  • E.g., allocating a new data block

  • 1. Allocate in bitmap
  • 2. Fill data in the block
  • 3. Update pointer to the block

[Figure: inode, new data block, and block bitmap]
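The ordering rule above can be checked with a small crash model. This is only a sketch: the consistency invariant (a persisted pointer must refer to a block that is both allocated and filled) and all names here are assumptions made for illustration, not taken from the slides.

```python
from itertools import permutations

# The three writes from the slide, in the order listed there.
SLIDE_ORDER = ("bitmap", "data", "pointer")


def state_after(order, steps_done):
    """Return the set of writes that reached stable storage before a crash."""
    return set(order[:steps_done])


def is_consistent(persisted):
    """Assumed invariant: a persisted pointer needs an allocated, filled block."""
    if "pointer" in persisted:
        return {"bitmap", "data"} <= persisted
    # Without the pointer, the worst case is a block that is marked
    # allocated but not referenced, which leaks space but corrupts nothing.
    return True


def crash_safe(order):
    """True if a crash after any prefix of `order` leaves a consistent state."""
    return all(is_consistent(state_after(order, n)) for n in range(len(order) + 1))


safe_orders = [o for o in permutations(SLIDE_ORDER) if crash_safe(o)]
print(crash_safe(SLIDE_ORDER), safe_orders)
```

Under this toy invariant the only requirement is that the pointer update comes last; the order on the slide is one of the orders that satisfies it.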

slide-7
SLIDE 7

Soft Updates Is Complicated

Delayed disk writes
  • Auxiliary structures for each update
  • More complex dependencies

[Figure: inode, new data block, and block bitmap. Figures from "Soft Updates: A Technique for Eliminating Most Synchronous Writes in the Fast Filesystem", ATC '99]

slide-8
SLIDE 8

Soft Updates Is Complicated

Delayed disk writes
  • Auxiliary structures for each operation
  • More complex dependencies
Cyclic dependencies
  • Rolling back/forward

[Figure: a directory block with entries <--, #0>, <A, #4>, <E, #7> and an inode block holding inodes #4, #5, #6, and #7 in the page cache. Step labels: rollback inode #6, flush block to disks, rollforward inode #6]
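The rollback/rollforward step can be sketched as follows. The sketch assumes a block is a dictionary from inode number to contents and that the caller already knows a safe earlier value for each inode whose dependencies are not yet satisfied; `flush_block` and its parameters are hypothetical names.

```python
def flush_block(cached_block, safe_values, disk_block):
    """Write a whole inode block while hiding updates that are not yet safe.

    cached_block: inode number -> latest contents (page cache copy)
    safe_values:  inode number -> contents that may reach disk right now
    disk_block:   inode number -> contents on disk (overwritten as a unit)
    """
    # Roll back: temporarily replace unsafe inodes with their safe values.
    saved = {ino: cached_block[ino] for ino in safe_values}
    cached_block.update(safe_values)

    # Flush: the disk interface only accepts the block as a whole.
    disk_block.clear()
    disk_block.update(cached_block)

    # Roll forward: restore the latest contents in the cache.
    cached_block.update(saved)


cache = {4: "A", 5: "free", 6: "new inode", 7: "E"}
disk = {}
flush_block(cache, {6: "free"}, disk)
print(disk[6], cache[6])
```

Inode #6 shares its block with three unrelated inodes, so writing any of them forces this dance. With byte-addressable NVM each inode can be written on its own, which is why the later slides can drop the mechanism.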

slide-9
SLIDE 9

Soft Updates Is Complicated

Delayed disk writes
  • Auxiliary structures for each operation
  • More complex dependencies
Cyclic dependencies
  • Rolling back/forward

The mismatch between per-pointer-based dependency tracking and the block-based interface of traditional disks

slide-10
SLIDE 10

Soft Updates Meets NVM

Soft Updates
  ✓ No synchronous cache flushes
  ✓ Immediately usable after crash
NVM: byte-addressable and fast
  ✓ Direct write to NVM without delays
  ✓ No false sharing => no rolling back/forward
  ✓ Simple dependency tracking/enforcement

slide-11
SLIDE 11

SoupFS

A simple and fast NVMFS derived from soft updates
  • Hashtable-based directories
  • No false sharing
  • Pointer-based dual views
  • No synchronous cache flushes
  • Semantic-aware dependency tracking/enforcement
  • Simple dependency tracking/enforcement

Get the best of both Soft Updates and NVM

slide-12
SLIDE 12

Overview

Background
Design & Implementation
  • Hashtable-based directories
  • Pointer-based dual views
  • Semantic-aware dependency tracking/enforcement
Evaluation
Conclusion


slide-14
SLIDE 14

Block-based Directories

Block-based file systems usually use block-based directories
  • False sharing
      ✘ Cyclic dependency
      ✘ Rolling back/forward
  • Slow access
      ✘ Linear scan

[Figure: a directory inode, an indirect block, and a directory block packed with name|number entries of varying length]

slide-15
SLIDE 15

Hashtable-based Directories

Optimized for cache lines
  ✓ No false sharing
  ✓ No cyclic dependency
Efficient access
  ✓ No linear scan

[Figure: a directory inode points to buckets (1, 2, 3, 4, …). Each dentry holds a filename pointer, an inode pointer, a consistent next pointer, and a latest next pointer; the filename record holds hash, length, and filename]
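A minimal sketch of a hashtable-based directory, assuming a fixed bucket count, CRC32 as the hash function, and a single next pointer per dentry. These choices and all names are illustrative and are not described on the slides.

```python
import zlib
from dataclasses import dataclass
from typing import Optional

NUM_BUCKETS = 8  # hypothetical size, chosen only for the example


@dataclass
class Dentry:
    name: str
    name_hash: int
    inode: int
    next: Optional["Dentry"] = None


class HashDir:
    def __init__(self):
        self.buckets = [None] * NUM_BUCKETS

    @staticmethod
    def _hash(name):
        return zlib.crc32(name.encode("utf-8"))

    def add(self, name, inode):
        h = self._hash(name)
        index = h % NUM_BUCKETS
        # Insert at the head of the chain; other buckets are not touched.
        self.buckets[index] = Dentry(name, h, inode, self.buckets[index])

    def lookup(self, name):
        h = self._hash(name)
        dentry = self.buckets[h % NUM_BUCKETS]
        while dentry is not None:
            # Compare the stored hash first, then the full name.
            if dentry.name_hash == h and dentry.name == name:
                return dentry.inode
            dentry = dentry.next
        return None


directory = HashDir()
directory.add("a.txt", 32)
directory.add("b.txt", 38)
print(directory.lookup("a.txt"), directory.lookup("missing"))
```

A lookup only walks one chain instead of scanning every entry, and an insertion only changes one bucket head, so two unrelated files do not end up sharing an update unit.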

slide-16
SLIDE 16

Overview

Background
Design & Implementation
  ✓ Hashtable-based directories
  • Pointer-based dual views
  • Semantic-aware dependency tracking/enforcement
Evaluation
Conclusion

slide-17
SLIDE 17

Dual Views

Latest view in page cache
Consistent view on disk
Dual views
  • Eliminate synchronous writes
  • Provide usability after crash

[Figure: traditional soft updates, with DRAM (page cache) and disk]

slide-18
SLIDE 18

Dual Views

Latest view in page cache
Consistent view in NVM (replacing disks)
Latest view? Another copy of metadata in DRAM
  ✗ Double writes
  ✗ Double storage overhead
  ✗ Unnecessary synchronizations

[Figure: soft updates on NVM, with DRAM (page cache) and NVM in place of disk]

Challenge: How to present the latest view efficiently?


slide-20
SLIDE 20

Pointer-based Dual Views

Reuse data structures in both views
Distinguish views by different pointers/structures

Data Structures   In Consistent View             In Latest View
inode             SoupFS inode                   VFS inode
dentry            consistent next pointer        latest next pointer
hash table        bucket                         latest bucket if exists
B-tree            root/height in SoupFS inode    root/height in VFS inode
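The dentry and hash table rows of the table can be sketched as one bucket chain read through two sets of pointers. The `SAME` sentinel (meaning "no latest pointer set, follow the consistent one") is an assumption made for this sketch; the slides do not say how SoupFS encodes an absent latest pointer.

```python
SAME = object()  # hypothetical sentinel: latest view equals consistent view here


class Dentry:
    def __init__(self, name, consistent_next=None):
        self.name = name
        self.consistent_next = consistent_next
        self.latest_next = SAME


class Bucket:
    def __init__(self, head=None):
        self.consistent_head = head
        self.latest_head = SAME  # "latest bucket if exists"


def first(bucket, latest):
    if latest and bucket.latest_head is not SAME:
        return bucket.latest_head
    return bucket.consistent_head


def following(dentry, latest):
    if latest and dentry.latest_next is not SAME:
        return dentry.latest_next
    return dentry.consistent_next


def names(bucket, latest):
    out, dentry = [], first(bucket, latest)
    while dentry is not None:
        out.append(dentry.name)
        dentry = following(dentry, latest)
    return out


def create(bucket, name):
    """Add a dentry to the latest view only."""
    dentry = Dentry(name)
    dentry.latest_next = first(bucket, latest=True)
    bucket.latest_head = dentry


def unlink(bucket, name):
    """Remove a dentry from the latest view only."""
    prev, dentry = None, first(bucket, latest=True)
    while dentry is not None and dentry.name != name:
        prev, dentry = dentry, following(dentry, latest=True)
    if dentry is None:
        return
    if prev is None:
        bucket.latest_head = following(dentry, latest=True)
    else:
        prev.latest_next = following(dentry, latest=True)


a = Dentry("A")
b = Dentry("B", a)
c = Dentry("C", b)
bucket = Bucket(c)
create(bucket, "E")
unlink(bucket, "B")
print(names(bucket, latest=True), names(bucket, latest=False))
```

Both views share the same dentry objects, so nothing is copied; only the pointer chosen at each hop differs. In this sketch the consistent chain is untouched until some later step copies the latest pointers over.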

slide-21
SLIDE 21

Pointer-based Dual Views

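[Figure: a directory inode points to buckets (1, 2, 3, 4, …); dentries D, C, B, and A are linked through latest next and consistent next pointers, and each dentry points to an inode and a filename. A directory VFS inode sits in DRAM. Legend: volatile in DRAM; updates to NVM without persistence guarantee; persisted in NVM. Both the latest view and the consistent view list Dir with File A, File B, File C, and File D]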

slide-22
SLIDE 22

Pointer-based Dual Views (animation, slides 22 to 37)

[Figure sequence: the diagram from slide 21 animated through two operations, "create E" and then "unlink B". Elements that appear along the way: File E in the view listing, a Latest Buckets array (entry 3) next to the directory VFS inode, and a new dentry E. In the final frame, both the latest view and the consistent view list Dir with File A, File C, File D, and File E; File B and dentry B no longer appear]

slide-38
SLIDE 38

Pointer-based Dual Views

Reuse data structures in both views
Distinguish views by different pointers/structures
  ✓ Eliminate synchronous writes
  ✓ Provide usability after crash
  ✓ No double write
  ✓ Little space overhead

[Figure: soft updates on NVM, with DRAM and NVM]

slide-39
SLIDE 39

Overview

Background
Design & Implementation
  ✓ Hashtable-based directories
  ✓ Pointer-based dual views
  • Semantic-aware dependency tracking/enforcement
Evaluation
Conclusion


slide-41
SLIDE 41

Dependency Tracking

Auxiliary structures for each update
The semantic gap between the page cache (where enforcement happens) and the file system (where tracking happens)
After removing the page cache, SoupFS involves semantics in dependency tracking/enforcement

slide-42
SLIDE 42

Semantic-aware Dependency Tracking

Track semantic operations with complementary information
  • Enough for dependency enforcement

Operation Type   Complementary Information (pointers/integers)
diradd           added dentry, source directory∗, overwritten inode∗
dirrem           removed dentry, destination directory∗
sizechg          the old and new file size
attrchg          nothing

Information tagged with ∗ is for the rename operation.
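The operation records in the table could be represented roughly as below. The field names, the `VfsInode` class, and the way a rename is split into one diradd on the destination and one dirrem on the source are assumptions made for this sketch; the slide only lists what information each operation type carries.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class OpType(Enum):
    DIRADD = "diradd"
    DIRREM = "dirrem"
    SIZECHG = "sizechg"
    ATTRCHG = "attrchg"


@dataclass
class Operation:
    op_type: OpType
    info: Dict[str, Any] = field(default_factory=dict)


@dataclass
class VfsInode:
    name: str
    operations: List[Operation] = field(default_factory=list)


def track_sizechg(inode, old_size, new_size):
    inode.operations.append(
        Operation(OpType.SIZECHG, {"old_size": old_size, "new_size": new_size})
    )


def track_attrchg(inode):
    # No complementary information is needed for this type.
    inode.operations.append(Operation(OpType.ATTRCHG))


def track_rename(src_dir, dst_dir, old_dentry, new_dentry, overwritten_inode=None):
    # Starred fields from the table are filled only for rename.
    dst_dir.operations.append(Operation(OpType.DIRADD, {
        "added_dentry": new_dentry,
        "source_directory": src_dir.name,
        "overwritten_inode": overwritten_inode,
    }))
    src_dir.operations.append(Operation(OpType.DIRREM, {
        "removed_dentry": old_dentry,
        "destination_directory": dst_dir.name,
    }))


src, dst = VfsInode("/src"), VfsInode("/dst")
track_rename(src, dst, old_dentry="a", new_dentry="b")
print(dst.operations[0].op_type.value, src.operations[0].op_type.value)
```

Each record is a type tag plus a few pointers or integers, which is much less state than one auxiliary structure per updated pointer.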

slide-43
SLIDE 43

Semantic-aware Dependency Tracking

Track semantic operations with complementary information
  • Enough for dependency enforcement
Operations are stored in the operation list of each VFS inode

[Figure: the dirty inode list links VFS inodes through list-next pointers; each VFS inode has an operation list; each operation list entry holds a list-next pointer, the operation type, and complementary information]

slide-44
SLIDE 44

Semantic-aware Dependency Enforcement

Persister daemons traverse the dirty inode list in the background
  • Persist each operation from the latest view to the consistent view with respect to update dependencies

[Figure: the dirty inode list links VFS inodes through list-next pointers; each VFS inode has an operation list; each operation list entry holds a list-next pointer, the operation type, and complementary information]
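A persister pass can be sketched as a loop over the dirty inode list that replays each recorded operation in a dependency-respecting order. The per-operation step orders below follow the general soft updates rules (initialize a structure before pointing to it; clear the pointer before reclaiming the structure) and are an assumption about SoupFS, as are all names in the code.

```python
from collections import deque


def persist_diradd(op, log):
    log.append(("flush", "inode", op["inode"]))      # target is durable first
    log.append(("flush", "dentry", op["name"]))      # then the dentry contents
    log.append(("link", "consistent_next", op["name"]))  # finally make it reachable


def persist_dirrem(op, log):
    log.append(("unlink", "consistent_next", op["name"]))  # drop the pointer first
    log.append(("free", "dentry", op["name"]))             # then reclaim the dentry


HANDLERS = {"diradd": persist_diradd, "dirrem": persist_dirrem}


def persister_pass(dirty_inodes):
    """Drain the dirty inode list and return the ordered NVM steps taken."""
    log = []
    while dirty_inodes:
        inode = dirty_inodes.popleft()
        while inode["ops"]:
            op = inode["ops"].popleft()
            HANDLERS[op["type"]](op, log)
    return log


directory = {"ops": deque([
    {"type": "diradd", "name": "E", "inode": 9},
    {"type": "dirrem", "name": "B"},
])}
for step in persister_pass(deque([directory])):
    print(step)
```

Because the pass runs in the background, the flushes it issues do not sit on the path of the system call that recorded the operation.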

slide-45
SLIDE 45

Overview

Background
Design & Implementation
  ✓ Hashtable-based directories
  ✓ Pointer-based dual views
  ✓ Semantic-aware dependency tracking/enforcement
Evaluation
Conclusion

slide-46
SLIDE 46

Evaluation Setup

Platform
  • Intel Xeon E5 server with two 8-core processors
  • 48 GB DRAM and 64 GB NVDIMM
File Systems
  • SoupFS, PMFS, NOVA, Ext4-DAX, Ext4
NVM Write Delay Simulation
  • ndelay() after clflush
Benchmarks
  • Micro-benchmarks: 100 iterations of 10^4 create/unlink/mkdir/rmdir
  • Filebench and Postmark
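The write delay simulation can be pictured with a toy accounting model: every cache line flush is charged a fixed delay, and what matters for operation latency is whether the flush happens in the foreground or in a background thread. The class, the function names, and the flush count of three are hypothetical and are not measurements from the talk.

```python
class SimulatedNVM:
    """Charges a fixed delay per flush, split by where the flush is issued."""

    def __init__(self, write_delay_ns):
        self.write_delay_ns = write_delay_ns
        self.foreground_stall_ns = 0
        self.background_stall_ns = 0

    def clflush(self, foreground):
        # Stands in for "ndelay() after clflush" from the setup slide.
        if foreground:
            self.foreground_stall_ns += self.write_delay_ns
        else:
            self.background_stall_ns += self.write_delay_ns


def op_with_synchronous_flushes(nvm, flushes=3):
    """An operation that must flush before returning (hypothetical count)."""
    for _ in range(flushes):
        nvm.clflush(foreground=True)


def op_with_deferred_flushes(nvm, flushes=3):
    """An operation whose flushes are left to a background persister."""
    for _ in range(flushes):
        nvm.clflush(foreground=False)


for delay in (200, 400, 600, 800):
    sync_nvm, deferred_nvm = SimulatedNVM(delay), SimulatedNVM(delay)
    op_with_synchronous_flushes(sync_nvm)
    op_with_deferred_flushes(deferred_nvm)
    print(delay, sync_nvm.foreground_stall_ns, deferred_nvm.foreground_stall_ns)
```

In this model the foreground stall grows linearly with the delay only when flushes are synchronous, which is the kind of sensitivity the delay sweep in the evaluation is designed to expose.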

slide-47
SLIDE 47

Micro-benchmark Latency

[Figure: bar chart of latency (us/op) for create, unlink, mkdir, and rmdir on Ext4, Ext4-DAX, PMFS, NOVA, and SoupFS; two bars are labeled 57.5 and 57.7. Annotations: "Inefficient Directory Organization" and "Lowest Latency"]

[Figure: CDF of latency (us) for Ext4, Ext4-DAX, PMFS, NOVA, and SoupFS]

slide-48
SLIDE 48

Sensitivity to NVM Write Delay

[Figure: Create latency (us) versus write delay (ns, 200 to 800) for NOVA and SoupFS, with a separate 55 to 70 axis segment labeled PMFS]

[Figure: Unlink latency (us) versus write delay (ns, 200 to 800) for PMFS, NOVA, and SoupFS]

Annotations: "No effect", "↑~200%", "↑~250%"

slide-49
SLIDE 49

Postmark & Filebench

[Figure: Postmark read and write throughput (MB/s) for Ext4, Ext4-DAX, PMFS, NOVA, and SoupFS; annotation "↑~50%"]

[Figure: Fileserver-1K throughput (x1000 ops/s) versus number of threads (1 to 20) for Ext4, Ext4-DAX, PMFS, NOVA, and SoupFS]


slide-51
SLIDE 51

Conclusion

  • Soft updates is complicated due to the mismatch between per-pointer-based dependency tracking and the block-based interface of traditional disks
  • We design and implement SoupFS
      ✓ Hashtable-based directories
      ✓ Pointer-based dual views
      ✓ Semantic-aware dependency tracking/enforcement
  • Soft updates can be made simple and fast on NVM

Thanks & Questions? ;-)
