SLIDE 1

Direct-FUSE: A User-level File System with Multiple Backends

Yue Zhu yzhu@cs.fsu.edu Florida State University

SLIDE 2

Outline

Ø Background & Motivation
Ø The Overview of Direct-FUSE
Ø Performance Evaluation
Ø Conclusions

SLIDE 3

User Space vs. Kernel-level File Systems

Ø Kernel-level and user-space file systems differ in development complexity, reliability, and portability.

Development Complexity
– Kernel-level: 1) System crashing and restarting during debugging. 2) Language limitations.
– User-level: 1) Few system crashes and restarts during debugging. 2) Numerous user-space tools for debugging. 3) Fewer language limitations and more useful libraries. 4) Systems can be mounted and developed by non-privileged users.

Reliability
– Kernel-level: 1) A kernel bug crashes the entire production system.
– User-level: 1) Lower possibility of a kernel crash.

Portability
– Kernel-level: 1) Significant effort to port a specific file system to a different system.
– User-level: 1) Easy to port to other systems.

SLIDE 4

Filesystem in Userspace

Ø What is Filesystem in Userspace (FUSE)?

– A software interface for Unix-like operating systems.
– Non-privileged users can create their own file systems without editing kernel code.
– However, the FUSE kernel module needs to be pre-installed by a system administrator.
– Examples:
  • SSHFS: a file system client that interacts with directories and files on a remote server over an SSH connection.
  • FusionFS (BigData'14): a distributed file system that supports metadata-intensive and write-intensive operations.
  • IndexFS Client (SC'14): the client of IndexFS, which redirects applications' file operations to the appropriate destination.
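To make the callback-based model concrete, here is a minimal, illustrative FUSE file system written against the high-level libfuse 2.x API. It is a sketch only: it exposes a single read-only file, the hello_* names are ours, and a real file system would implement many more operations.

```c
/* Minimal illustrative FUSE file system (libfuse 2.x, high-level API). */
#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <sys/stat.h>
#include <string.h>
#include <errno.h>

static const char *hello_str  = "Hello from user space\n";
static const char *hello_path = "/hello";

/* Report attributes for "/" and "/hello". */
static int hello_getattr(const char *path, struct stat *st)
{
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode = S_IFDIR | 0755;
        st->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        st->st_mode = S_IFREG | 0444;
        st->st_nlink = 1;
        st->st_size = (off_t) strlen(hello_str);
    } else {
        return -ENOENT;
    }
    return 0;
}

/* List the single file in the root directory. */
static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                         off_t offset, struct fuse_file_info *fi)
{
    (void) offset; (void) fi;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    filler(buf, hello_path + 1, NULL, 0);
    return 0;
}

/* Copy the requested range of the file contents into the caller's buffer. */
static int hello_read(const char *path, char *buf, size_t size, off_t offset,
                      struct fuse_file_info *fi)
{
    (void) fi;
    size_t len = strlen(hello_str);
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((size_t) offset >= len)
        return 0;
    if ((size_t) offset + size > len)
        size = len - (size_t) offset;
    memcpy(buf, hello_str + offset, size);
    return (int) size;
}

static struct fuse_operations hello_ops = {
    .getattr = hello_getattr,
    .readdir = hello_readdir,
    .read    = hello_read,
};

int main(int argc, char *argv[])
{
    /* Each file operation is forwarded by the FUSE kernel module to these callbacks. */
    return fuse_main(argc, argv, &hello_ops, NULL);
}
```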

SLIDE 5

How does a FUSE File System Work?

Ø Execution path of a function call

1. Send the request to the user-level file system
   • App program → VFS → FUSE kernel module → User-level file system process
2. Return the data back to the application program
   • User-level file system process → FUSE kernel module → VFS → App program

[Figure: request path crossing user space (application program, user-level file system) and kernel space (VFS, FUSE kernel module, in-built file system, storage device), with the six numbered hops of a FUSE call.]

SLIDE 6

FUSE File System vs. Native File System

Ø Overheads in FUSE file systems

– 4 user-kernel mode switches
  • App ↔ kernel
  • Kernel ↔ file system process
– 2 context switches
  • App ↔ file system process
– 2 or 3 memory copies
  • Write: App → page cache → file system process → page cache (if made to the native file system)

Ø Overhead in the native file system (Ext4)

– 2 user-kernel mode switches
  • App ↔ kernel
– 0 context switches
– 1 memory copy
  • Write: App → page cache

[Figure: the FUSE data-path diagram from the previous slide, repeated for reference.]

SLIDE 7

Number of Context Switches & I/O Bandwidth

Ø We measure the number of context switches and the bandwidth of a FUSE file system and a native file system.

– The dd microbenchmark and perf are used in the tests.
– FUSE-tmpfs is a FUSE file system deployed on top of tmpfs and mounted with tuned option values.

Block Size (KB)   FUSE-tmpfs Throughput (MB/s)   FUSE-tmpfs # Context Switches   tmpfs Throughput (GB/s)   tmpfs # Context Switches
      4                      163                             1012                          1.3                       7
     16                      372                             1012                          1.6                       7
     64                      519                             1012                          1.7                       7
    128                      549                             1012                          2.0                       7
    256                      569                             2012                          2.4                       7
   1024                      576                             8012                          2.5                       7

SLIDE 8

Breakdown of Metadata Operation Latency

Ø The create() and close() latency on tmpfs and FUSE-tmpfs.

– Real Operation: the time spent in the operation itself (the actual create or close time).
– Overhead: the cost besides the real operation, e.g., the involvement of the FUSE kernel module.

Ø The real operation accounts for only a small portion of a complete FUSE function call.

[Figure: latency breakdown (µs, 0-250) of create and close on tmpfs vs. FUSE-tmpfs; on FUSE-tmpfs the real operation accounts for only 11.18% (create) and 2.17% (close) of the total latency.]

SLIDE 9

Breakdown of Data Operation Latency

Ø The write latency on tmpfs and FUSE-tmpfs

– Data Movement: the actual write operation within a complete write function call.
– Overhead: the cost besides the data movement.

Ø The data movement accounts for only a small portion of a complete FUSE I/O call.

[Figure: write latency breakdown (µs, 0-600) on FUSE-tmpfs for transfer sizes of 1, 4, 16, 64, 128, and 256 KB; data movement accounts for roughly 10% to 38% of the total latency, depending on transfer size.]

SLIDE 10

Desirable Objectives

Ø Some file systems, such as TableFS (USENIX'13), are leveraged as libraries to avoid involving the FUSE kernel module.

– However, this approach may not support multiple FUSE libraries with distinct file paths and file descriptors.

Ø We propose Direct-FUSE to provide multiple backend services for one application without going through the FUSE kernel.

– To reduce the overheads from the FUSE modules, we adopt libsysio to provide FUSE client services without going through the kernel.
– libsysio:
  • Developed by the Scalable I/O team at Sandia National Laboratories.
  • POSIX-like file I/O.
  • Name-space support for file systems from the application's user-level address space.

SLIDE 11

Outline

Ø Background & Motivation
Ø The Overview of Direct-FUSE
Ø Performance Evaluation
Ø Conclusions

SLIDE 12

The Overview of Direct-FUSE

Ø Direct-FUSE's components include the adopted-libsysio, lightweight-libfuse, and backend services.

– Adopted-libsysio
  • Distinguishes file paths and descriptors for different backends.
– Lightweight-libfuse
  • Not the real libfuse.
  • Exposes file system operations to the underlying backend services.
– Backend services
  • Provide the defined file system operations.
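As a rough illustration of this layering, each backend can be thought of as a path prefix plus a table of operation pointers that the lightweight-libfuse layer exposes to adopted-libsysio. The sketch below is an assumption of ours for clarity, not the actual Direct-FUSE code; all type and field names are illustrative.

```c
/* Hypothetical sketch of a Direct-FUSE backend descriptor.
 * Names and fields are illustrative assumptions only. */
#include <sys/types.h>
#include <stddef.h>

struct dfuse_backend_ops {
    int     (*open)(const char *path, int flags, mode_t mode);
    ssize_t (*read)(int fd, void *buf, size_t count, off_t offset);
    ssize_t (*write)(int fd, const void *buf, size_t count, off_t offset);
    int     (*close)(int fd);
    int     (*unlink)(const char *path);
};

struct dfuse_backend {
    const char                     *prefix;    /* e.g. "sshfs:" or "fusionfs:" */
    const struct dfuse_backend_ops *ops;       /* the backend's defined operations */
    int (*init)(void);                         /* backend initialization */
    int (*unmount)(void);                      /* backend teardown */
};
```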
SLIDE 13

Path and File Descriptor Operations

Ø To support multiple FUSE backends, file system operations are divided into two categories: path operations and file descriptor operations.

– Path operations
  • Apply a prefix to the path (e.g., sshfs:/sshfs/test.txt).
  • Intercept the prefix and path to return the mount information, which contains pointers to the defined operations.
  • When a new file is opened, the file descriptor returned by the backend is mapped to a new file descriptor assigned by adopted-libsysio.
– File descriptor operations
  • The file record is found via the file descriptor in the open file table.
  • The file record contains pointers to the operations, the current stream position, etc.
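A minimal sketch of how such dispatching might look is given below. It is purely illustrative; the function names, structures, and fixed-size open file table are our assumptions, not the real implementation.

```c
/* Hypothetical sketch of prefix-based path resolution and file-descriptor
 * mapping in the spirit of Direct-FUSE. All identifiers are illustrative. */
#include <string.h>
#include <stddef.h>

#define MAX_OPEN 128

struct backend {                      /* one registered FUSE backend */
    const char *prefix;               /* e.g. "sshfs:" */
    int (*open)(const char *path, int flags);
};

struct open_file {                    /* open file table entry */
    struct backend *be;               /* which backend owns this file */
    int             backend_fd;       /* fd returned by the backend */
    long            pos;              /* current stream position */
};

static struct open_file open_table[MAX_OPEN];

/* Resolve a prefixed path ("sshfs:/sshfs/test.txt") to its backend. */
static struct backend *resolve(struct backend *bes, int n, const char *path,
                               const char **rest)
{
    for (int i = 0; i < n; i++) {
        size_t len = strlen(bes[i].prefix);
        if (strncmp(path, bes[i].prefix, len) == 0) {
            *rest = path + len;       /* the path seen by the backend */
            return &bes[i];
        }
    }
    return NULL;                      /* unknown prefix */
}

/* Open via the matching backend and hand out a library-level fd. */
static int dfuse_open(struct backend *bes, int n, const char *path, int flags)
{
    const char *rest;
    struct backend *be = resolve(bes, n, path, &rest);
    if (!be)
        return -1;
    int bfd = be->open(rest, flags);
    if (bfd < 0)
        return -1;
    for (int fd = 0; fd < MAX_OPEN; fd++) {
        if (open_table[fd].be == NULL) {
            open_table[fd] = (struct open_file){ be, bfd, 0 };
            return fd;                /* fd assigned by the library layer */
        }
    }
    return -1;                        /* open file table is full */
}
```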

SLIDE 14

Requirements for New Backends

Ø The file system operations work with paths and file names instead of inodes.

Ø An independent library that contains the FUSE file system operations, the initialization function, and the unmount function.

– If there is no existing library for the backend, we have to build the library ourselves.
– If there is a library for the backend, we have to wrap its APIs and provide the initialization function.

Ø No user data is passed to the FUSE module via the fuse_mount() function.

– If the file system passes user data via fuse_mount() when mounting, then additional effort is needed to globalize the user data for the other file system operations.

Ø Implemented in C or C++.
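A hypothetical header for such a backend library might look like the following. All identifiers are assumptions of ours; the only point being illustrated is that the library exports path-based operations plus explicit init and unmount entry points.

```c
/* Hypothetical interface a new Direct-FUSE backend library could export.
 * All identifiers are illustrative assumptions. */
#ifndef MYFS_BACKEND_H
#define MYFS_BACKEND_H

#include <sys/types.h>
#include <stddef.h>

/* Lifecycle: called once when the backend is mounted/unmounted. */
int myfs_init(const char *mount_args);
int myfs_unmount(void);

/* Path-based operations (no inodes exposed to the caller). */
int     myfs_open(const char *path, int flags, mode_t mode);
ssize_t myfs_read(int fd, void *buf, size_t count, off_t offset);
ssize_t myfs_write(int fd, const void *buf, size_t count, off_t offset);
int     myfs_close(int fd);
int     myfs_unlink(const char *path);

#endif /* MYFS_BACKEND_H */
```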

SLIDE 15

Outline

Ø Background & Motivation
Ø The Overview of Direct-FUSE
Ø Performance Evaluation
Ø Conclusions

SLIDE 16

Experimental Methodology

Ø We compare the bandwidth of Direct-FUSE with a local FUSE file system and the native file system on disk and RAM-disk using IOzone.

– Disk
  • Ext4-fuse: a FUSE file system overlying Ext4.
  • Ext4-direct: Ext4-fuse bypassing the FUSE kernel.
  • Ext4-native: the original Ext4 on disk.
– RAM-disk
  • tmpfs-fuse, tmpfs-direct, and tmpfs-native are analogous to the three tests on disk.

Ø We also compare the I/O bandwidth of a distributed FUSE file system with Direct-FUSE.

– FusionFS: a distributed file system that supports metadata-intensive and write-intensive operations.

Ø Breakdown analysis of I/O processing in Direct-FUSE.

SLIDE 17

Sequential Write Bandwidth

Ø The bandwidth of Direct-FUSE is very close to that of the native file system.

[Figure: sequential write bandwidth (MB/s, log scale, 1-10000) for transfer sizes from 4 KB to 16 MB, comparing Ext4-fuse, Ext4-direct, Ext4-native, tmpfs-fuse, tmpfs-direct, and tmpfs-native.]

SLIDE 18

Sequential Read Bandwidth

Ø Similar to the sequential write bandwidth, the read bandwidth of Direct-FUSE is close to that of the native file system.

[Figure: sequential read bandwidth (MB/s, log scale, 1-10000) for transfer sizes from 4 KB to 16 MB, comparing Ext4-fuse, Ext4-direct, Ext4-native, tmpfs-fuse, tmpfs-direct, and tmpfs-native.]

SLIDE 19

I/O Bandwidth of FusionFS

Ø According to the figure, doubling the number of nodes roughly doubles the throughput for both read and write, which demonstrates the near-linear scalability of FusionFS and Direct-FUSE up to 16 nodes.

[Figure: read and write bandwidth (MB/s, log scale, 1-10000) of fusionfs vs. direct-fusionfs for 1, 2, 4, 8, and 16 nodes.]

SLIDE 20

Breakdown Analysis of I/O Processing in Direct-FUSE

Ø The dummy read/write takes only about 38 ns, which occupies less than 3% of the complete I/O function time in Direct-FUSE, even when the I/O size is very small.

– Dummy write/read: no actual data movement; the call returns immediately after reaching the backend service.
– Real write/read: the actual Direct-FUSE read and write I/O calls.

[Figure: latency (ns, log scale, 1-10000) of dummy vs. real writes and dummy vs. real reads for transfer sizes from 1 B to 1 KB.]

SLIDE 21

Conclusions

Ø We have analyzed the additional overheads in FUSE file systems in detail.
Ø To facilitate multiple backend services, we propose Direct-FUSE.
Ø Direct-FUSE can largely reduce the overheads from the FUSE kernel module and support multiple FUSE backends simultaneously.

SLIDE 22

Thank you!