


SLIDE 1

Direct-FUSE: Removing the Middleman for High-Performance FUSE File System Support

Yue Zhu*, Teng Wang*, Kathryn Mohror+, Adam Moody+, Kento Sato+, Muhib Khan*, Weikuan Yu*

Florida State University*

Lawrence Livermore National Laboratory+

SLIDE 2

PDSW-WIP’17 S-2

Introduction

■ An efficient file system is important for high-performance computing (HPC) systems in supporting large-scale scientific applications.

➢ Because of development complexity, reliability, and portability concerns, special-purpose file systems for particular I/O workloads are often built at user level.

➢ Different file systems are used for different kinds of data in a single job.

■ Filesystem in Userspace (FUSE)

➢ A software interface for Unix-like computer operating systems.

➢ It allows non-privileged users to create their own file systems without editing kernel code.

➢ A user-defined file system runs as a separate process in user space.

SLIDE 3


Breakdown of Metadata & Data Latency

■ create() and close() (metadata operations) and write() (a data operation) are taken as examples to show what percentage of a complete FUSE call is spent on the real operation.

➢ Tests are run on tmpfs and FUSE-tmpfs.

➢ Real Operation (metadata operations): the time spent conducting the operation itself.

➢ Data Movement: the actual time spent writing within a complete write function call.

➢ Overhead: the cost besides the above two, e.g., the time of context switches.

  • Fig. 1. The Time Expense in Metadata Operations. [Plot: latency (ns) of Create and Close on tmpfs and FUSE-tmpfs, split into Real Operation and Overhead; percentage labels shown: 11.18%, 2.17%.]
  • Fig. 2. The Time Expense in Data Operations. [Plot: latency (ns) versus transfer size (1, 4, 16, 64, 128, 256 KB), split into Data Movement and Overhead; percentage labels shown: 34.8%, 33.7%, 37.86%, 15.82%, 10.08%, 38.12%.]
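
The breakdown above amounts to splitting the latency of a complete call into the useful part (real operation or data movement) and everything else. A minimal sketch, assuming a hypothetical helper `latency_breakdown` and made-up latency values; the numbers are illustrative and are not taken from Fig. 1 or Fig. 2:

```python
def latency_breakdown(total_ns, real_ns):
    """Split a complete call's latency into real work and overhead.

    total_ns: latency of the complete call as seen by the application
    real_ns:  time spent on the real operation (or on data movement)
    Returns (real_share, overhead_share) as percentages of total_ns.
    """
    if total_ns <= 0 or not 0 <= real_ns <= total_ns:
        raise ValueError("expected total_ns > 0 and 0 <= real_ns <= total_ns")
    real_share = 100.0 * real_ns / total_ns
    return real_share, 100.0 - real_share


# Hypothetical measurement: a 200 ns call of which 50 ns is real work.
real_pct, overhead_pct = latency_breakdown(total_ns=200.0, real_ns=50.0)
print(f"real operation: {real_pct:.1f}%, overhead: {overhead_pct:.1f}%")
```

A small real-operation share means most of the call is spent on costs such as context switches rather than on the file system work itself.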
SLIDE 4


The Overview of Direct-FUSE

■ Direct-FUSE contains the adapted libsysio, lightweight-libfuse, and backend services.

➢ Adapted-libsysio

  • Supports multiple backends

➢ lightweight-libfuse

  • Not the real libfuse
  • Exposes file system operations to the underlying backend services, with support for the FUSE library.

➢ Backend services

  • Provide the defined FUSE operations.

[Architecture diagram: Application Program; Direct-FUSE (adapted-libsysio, lightweight-libfuse); Backend Services (FUSE-ext4, SSHFS, FusionFS, FTPFS, GlusterFS, …)]
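
The layering above can be illustrated with a small in-process dispatcher: a table maps mount prefixes to backends, and each call goes straight to the backend's operation as an ordinary function call, with no kernel round trip. This is only a sketch of the idea in Python. The names `MemBackend` and `DirectDispatcher` and the `/mem` and `/scratch` prefixes are hypothetical, and the code is not the actual adapted-libsysio or lightweight-libfuse implementation, which the slides do not show.

```python
class MemBackend:
    """Hypothetical in-memory backend exposing FUSE-like operations."""

    def __init__(self):
        self.files = {}

    def create(self, path):
        self.files[path] = bytearray()

    def write(self, path, data, offset):
        buf = self.files[path]
        # Zero-fill any gap between the current end of file and the offset.
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)

    def read(self, path, size, offset):
        return bytes(self.files[path][offset:offset + size])


class DirectDispatcher:
    """Routes calls to the backend mounted at the longest matching prefix."""

    def __init__(self):
        self.mounts = {}

    def mount(self, prefix, backend):
        self.mounts[prefix.rstrip("/")] = backend

    def _resolve(self, path):
        matches = [p for p in self.mounts
                   if path == p or path.startswith(p + "/")]
        if not matches:
            raise FileNotFoundError(path)
        prefix = max(matches, key=len)
        # Hand the backend a path relative to its own mount point.
        return self.mounts[prefix], path[len(prefix):] or "/"

    def call(self, op, path, *args):
        backend, rel = self._resolve(path)
        # Plain function call inside the process: no kernel involved.
        return getattr(backend, op)(rel, *args)


fs = DirectDispatcher()
fs.mount("/mem", MemBackend())
fs.mount("/scratch", MemBackend())  # a second backend in the same process

fs.call("create", "/mem/a.txt")
fs.call("write", "/mem/a.txt", b"hello", 0)
print(fs.call("read", "/mem/a.txt", 5, 0))
```

Mounting two backends side by side mirrors the point that a single job may use several file systems at once.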

SLIDE 5


Sequential Write Bandwidth

■ The bandwidth of Direct-FUSE is very close to that of the native file system.

➢ Ext4(tmpfs)-fuse: FUSE file system overlying Ext4 (tmpfs); Ext4(tmpfs)-direct: Direct-FUSE on Ext4 (tmpfs); Ext4(tmpfs)-native: original Ext4 (tmpfs).

➢ Ext4-direct outperforms Ext4-fuse by 11.9% on average.

➢ tmpfs-direct outperforms tmpfs-fuse by at least 2.26x.

[Plot: bandwidth (MB/s, axis ticks from 1 to 10000) versus write transfer size (4, 16, 64, 256, 1024 KB) for Ext4-fuse, Ext4-direct, Ext4-native, tmpfs-fuse, tmpfs-direct, and tmpfs-native.]
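
The two comparisons above ("by 11.9% on average" and "at least 2.26x") can be read as a mean relative improvement and a minimum speedup across transfer sizes. The slides do not state how the average was computed, so the sketch below assumes the mean of per-size bandwidth ratios. The helper name `improvement_stats` and all bandwidth values are hypothetical and are not read from the plot.

```python
def improvement_stats(direct_mbps, fuse_mbps):
    """Compare two bandwidth series measured at the same transfer sizes.

    Returns (mean_improvement_percent, min_speedup), where each speedup
    is the direct bandwidth divided by the FUSE bandwidth at one size.
    """
    if len(direct_mbps) != len(fuse_mbps) or not direct_mbps:
        raise ValueError("series must be non-empty and of equal length")
    ratios = [d / f for d, f in zip(direct_mbps, fuse_mbps)]
    mean_improvement = 100.0 * (sum(ratios) / len(ratios) - 1.0)
    return mean_improvement, min(ratios)


# Hypothetical bandwidths in MB/s, one value per transfer size.
fuse = [100.0, 200.0, 400.0]
direct = [125.0, 250.0, 400.0]
mean_pct, worst_case = improvement_stats(direct, fuse)
print(f"average improvement: {mean_pct:.1f}%, minimum speedup: {worst_case:.2f}x")
```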

SLIDE 6


Sponsors of This Research