SplitFS: Reducing Software Overhead in File Systems for Persistent Memory
Rohan Kadekodi, Se Kwon Lee, Sanidhya Kashyap*, Taesoo Kim, Aasheesh Kolli, Vijay Chidambaram
* on the job market
Persistent Memory (PM): non-volatile, fast.
PM file system: the application accesses File 1, File 2, … File n on PM through the POSIX API (read(), write()), served by the PM file system.
Existing PM file systems: ext4-DAX, PMFS, NOVA, Strata.

ext4-DAX: a modification of the ext4 file system for Persistent Memory. It works with modern Linux kernels, is under active development by the ext4 community, and is the only PM file system that is widely used.
Time (ns) for a file operation on each PM file system (raw device latency: 700 ns):

device: 700
PMFS: 2450 (2.5x)
NOVA: 3021 (3x)
Strata: 4150 (5x)
ext4-DAX: 9002 (12x)

Everything above the 700 ns device latency is software overhead. File systems suffer from high software overhead! ext4-DAX, although widely used, suffers from the highest software overhead and provides weak guarantees.
SplitFS: a POSIX file system aimed at reducing software overhead for PM.
Serves data operations from user space and metadata operations from the kernel.
Provides strong guarantees such as atomic and synchronous data operations.
Reduces software overhead by up to 17x compared to ext4-DAX.
Improves application throughput by up to 2x compared to NOVA.
https://github.com/utsaslab/splitfs
SplitFS targets POSIX applications that use read()/write() system calls to access their data on Persistent Memory. SplitFS does not optimize for the case where multiple processes concurrently access the same file.
SplitFS lies both in user space and in the kernel. The design space for PM file systems:

Data in kernel, metadata in kernel: ext4-DAX, PMFS [EuroSys 14], NOVA [FAST 16]. Low performance.
Data in user, metadata in user: Strata [SOSP 17]. High complexity.
Data in user, metadata in user, allocations in kernel: Aerie [EuroSys 14]. High complexity, low performance.
Data in user, metadata in kernel: SplitFS. High performance, low complexity.
How SplitFS achieves high performance with low complexity:
Accelerate data operations from user space.
Use ext4-DAX for metadata operations.
Architecture: the application runs with U-Split in user space, above K-Split (ext4-DAX) in the kernel. Data operations (read(), write()) on files (File 1, File 2, File 3) are served by U-Split directly against PM; metadata operations (creat(), delete()) are passed to K-Split. U-Split also maintains a log on PM.

SplitFS accelerates common-case data operations while leveraging the maturity of ext4-DAX for metadata operations.
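The U-Split/K-Split split can be pictured as a dispatch on each POSIX call: data operations on tracked PM files are served from a user-space mapping, and everything else falls through to the kernel. A minimal sketch with hypothetical names (tracked_fd and tracked_base stand in for U-Split's per-file mapping table; the real system interposes on libc calls rather than exposing a split_write function):

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-ins for U-Split's mapping table: one tracked file
 * whose contents are reachable through a user-space mapping. */
static char   tracked_base[4096]; /* would be a DAX mmap of the file */
static int    tracked_fd  = 100;  /* fd U-Split associates with a PM file */
static size_t tracked_off = 0;    /* current write offset */

/* Data path: if the fd belongs to a tracked PM file, service the write
 * with plain memory stores and never enter the kernel; otherwise fall
 * back to the real write() so K-Split (ext4-DAX) handles it. */
ssize_t split_write(int fd, const void *buf, size_t count) {
    if (fd == tracked_fd && tracked_off + count <= sizeof(tracked_base)) {
        memcpy(tracked_base + tracked_off, buf, count);
        tracked_off += count;
        return (ssize_t)count;        /* fast path: no syscall */
    }
    return write(fd, buf, count);     /* slow path: kernel */
}
```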
SplitFS uses logging and out-of-place updates to provide atomic and synchronous operations.
How SplitFS handles reads and updates: on the first access to a file, U-Split asks K-Split (ext4-DAX) to mmap it. A DAX mmap maps PM pages directly into the application's address space, so subsequent reads become loads and updates become stores on the mapping.

In the common case, file reads and updates do not pass through the kernel.
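Once the file is mapped, the common-case read path amounts to a bounds-checked memcpy out of the mapping. A sketch, assuming map and filesz describe a file that has already been DAX-mmaped (names are illustrative):

```c
#include <string.h>
#include <sys/types.h>

/* Serve a read of `len` bytes at offset `off` from an existing mapping
 * of the file; with DAX, the memcpy loads directly from PM with no
 * kernel entry. Returns the number of bytes read. */
ssize_t mmap_read(const char *map, size_t filesz, size_t off,
                  void *dst, size_t len) {
    if (off >= filesz) return 0;                 /* read past EOF */
    if (off + len > filesz) len = filesz - off;  /* truncate at EOF */
    memcpy(dst, map + off, len);
    return (ssize_t)len;
}
```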
How SplitFS handles appends (example: file foo, size 10, with its inode in the kernel):

1. At application start, U-Split obtains a pre-allocated staging file (e.g., size 100) from the kernel and mmaps it.
2. append(foo, "abc"): the data is written to the staging file with stores; the kernel is not entered.
3. read(foo): served with loads, including the appended data in the staging file.
4. fsync(foo): U-Split invokes relink(), which moves the appended blocks from the staging file to foo within a single ext4-journal transaction; no data is copied.

In the common case, file appends do not pass through the kernel.
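relink() can be thought of as moving block pointers between two per-file block maps under one journal transaction. A toy sketch of that bookkeeping (the inode_map struct and fixed-size arrays are illustrative; the real relink manipulates ext4 extent trees):

```c
/* Toy model of an inode's block mapping: an ordered list of physical
 * block numbers. SplitFS itself operates on ext4 extent trees. */
#define NBLK 8
typedef struct { int blocks[NBLK]; int nblocks; } inode_map;

/* Move the first `count` blocks of the staging file to the end of the
 * target file. Only pointers move; the data on PM is never copied.
 * In SplitFS this happens inside a single ext4 journal transaction,
 * which is what makes the append visible atomically at fsync(). */
void relink(inode_map *staging, inode_map *target, int count) {
    for (int i = 0; i < count; i++)
        target->blocks[target->nblocks++] = staging->blocks[i];
    for (int i = count; i < staging->nblocks; i++)  /* compact staging */
        staging->blocks[i - count] = staging->blocks[i];
    staging->nblocks -= count;
}
```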
Guarantees by mode (yes = provided):

Mode   | Metadata Atomicity | Synchronous Operations | Data Atomicity | File System
POSIX  | yes                | no                     | no             | ext4-DAX, SplitFS-POSIX
Sync   | yes                | yes                    | no             | PMFS, SplitFS-Sync
Strict | yes                | yes                    | yes            | NOVA, Strata, SplitFS-Strict

Optimized logging is used to provide the stronger guarantees in sync and strict modes.
SplitFS employs a per-application log in sync and strict modes, which logs every logical operation. Each application gets its own U-Split instance in its chosen mode (e.g., App 1 with U-Split-POSIX, App 2 with U-Split-sync, App 3 with U-Split-strict), all backed by a shared K-Split (ext4-DAX): data paths go directly to the files on PM, metadata paths go through the kernel.
Technique | Benefit
SplitFS architecture | Low-overhead data operations, correct metadata operations
Staging + relink | Optimized appends, no data copy
Optimized logging + out-of-place writes | Stronger guarantees
Setup:
File systems compared:
Time (ns) with SplitFS-strict included:

device: 700
SplitFS-strict: 1251 (0.8x)
PMFS: 2450 (2.5x)
NOVA: 3021 (3x)
Strata: 4150 (5x)
ext4-DAX: 9002 (12x)
Workloads evaluated:
Microbenchmarks: Seq reads, Rand reads, Seq writes, Rand writes, Appends
Data intensive: YCSB on LevelDB, TPCC on SQLite, Redis
Metadata intensive: Tar, Git, Rsync

YCSB: Yahoo! Cloud Serving Benchmark, an industry-standard macro-benchmark. Insert 5M keys, run 5M operations; key size = 16 bytes, value size = 1K.

Load A - 100% writes; Run A - 50% reads, 50% writes; Run B - 95% reads, 5% writes; Run C - 100% reads; Run D - 95% reads (latest), 5% writes; Load E - 100% writes; Run E - 95% range queries, 5% writes; Run F - 50% reads, 50% read-modify-writes

[Chart: throughput of SplitFS-Strict normalized to NOVA across the eight workloads; annotated absolute throughputs: Load A 13.39 kops/s, Run A 32.24 kops/s, Run B 139.94 kops/s, Run C 174.85 kops/s, Run D 191.54 kops/s, Load E 13.59 kops/s, Run E 17.75 kops/s, Run F 66.54 kops/s]

Read-heavy workloads are optimized because reads are converted to loads. Write-heavy workloads are optimized because of staging and relink.
SplitFS introduces a new architecture for building PM file systems that reduces software overhead, provides strong guarantees, and leverages the widely-used ext4-DAX.

https://github.com/utsaslab/splitfs
YCSB on LevelDB, SplitFS-Strict vs Strata: insert 5M keys, run 1M operations; key size = 16 bytes, value size = 1K. Workloads Load A through Run F as above.

[Chart: throughput of SplitFS-Strict normalized to Strata across the eight workloads]