Supporting Transactions for Bulk NFSv4 Compounds




SLIDE 1

Supporting Transactions for Bulk NFSv4 Compounds (ACM SYSTOR 2020) 1 October 14, 2020

13th ACM International Systems and Storage Conference (SYSTOR 2020)

Wei Su¹, Akshay Aurora¹, Ming Chen², Erez Zadok¹

¹Stony Brook University; ²Google

Supporting Transactions for Bulk NFSv4 Compounds

SLIDE 2

Background: Vectorized NFS

  • Ideal utilization of compounding: Writing multiple files in one compound request

[Figure: the Application calls the Vectorized File-system API; the NFS Client sends one compound request to the NFS Server, which replies with status codes, file handles, and file attributes.]

Compound request:
SEQUENCE; PUTROOTFH; LOOKUP “etc”; GETFH; GETATTR; SAVEFH;
OPEN “passwd”; WRITE 1800 47; CLOSE; GETFH; GETATTR; RESTOREFH;
OPEN “group”; WRITE 878 11; CLOSE; GETFH; GETATTR; RESTOREFH;
OPEN “shadow”; WRITE 1170 124; CLOSE; GETFH; GETATTR; RESTOREFH;

vec_write( [‘/etc/passwd’, ‘/etc/group’, ‘/etc/shadow’ ], ... )
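The way one vectorized call expands into a single compound can be sketched as below. This is only an illustration: `build_write_compound` is a hypothetical helper, not part of vNFS, and the WRITE arguments are passed through as opaque strings copied from the slide without interpreting them.

```python
from typing import List, Tuple


def build_write_compound(directory: str, files: List[Tuple[str, str]]) -> List[str]:
    """Return the operation list for writing several files in one directory.

    Each entry in `files` is (name, write_args); write_args is kept as an
    opaque string because the slide does not explain the two numbers.
    """
    # Prologue: resolve the directory once and save its file handle.
    ops = ["SEQUENCE", "PUTROOTFH", f'LOOKUP "{directory}"',
           "GETFH", "GETATTR", "SAVEFH"]
    for name, write_args in files:
        # Per-file group, ending with RESTOREFH to return to the directory.
        ops += [f'OPEN "{name}"', f"WRITE {write_args}", "CLOSE",
                "GETFH", "GETATTR", "RESTOREFH"]
    return ops


compound = build_write_compound(
    "etc", [("passwd", "1800 47"), ("group", "878 11"), ("shadow", "1170 124")]
)
print("; ".join(compound))
```

The point of the sketch is the shape: one prologue plus a fixed group of operations per file, all sent in a single round trip.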

SLIDE 3

Background: Vectorized NFS

  • Performance evaluation: Metadata-intensive workload

◆ Recursive listing, symlink, and removal

[Figure: speedups on LAN and WAN. Annotations: 16~259⨉, 7~106⨉, 2.5~12⨉; extremes marked at 2.5⨉ and 259⨉]

SLIDE 4

Motivation

  • NFSv4 introduces the “Compound” procedure

◆ Clients can pack multiple NFS operations into one “compound”
◆ This amortizes network latency and improves I/O throughput
◆ Compounding speeds up NFS I/O by up to 2 orders of magnitude

  • Challenge for the client’s error handling

◆ If an operation in a compound fails, the NFS server only reports the error and does not roll back
◆ If the server crashes while executing a compound, nothing is done when it restarts
◆ Hard for applications to handle errors
▪ Any operation may fail, and a crash may occur at any time
▪ Hard to restore the initial state after a large compound fails
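The error-handling problem above can be shown with a toy model. This is a minimal sketch, assuming an in-memory dictionary in place of a file system; `run_compound` and the two operation kinds are hypothetical. It only mirrors the behavior described on the slide: the failure is reported, later operations are skipped, and earlier ones stay applied.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str, str]  # (kind, path, data); kind is "WRITE" or "REMOVE"


def run_compound(fs: Dict[str, str], ops: List[Op]) -> List[str]:
    """Apply ops in order; stop at the first failure without undoing anything."""
    statuses = []
    for kind, path, data in ops:
        if kind == "WRITE":
            fs[path] = data
            statuses.append("OK")
        elif kind == "REMOVE" and path in fs:
            del fs[path]
            statuses.append("OK")
        else:
            statuses.append("ERR")  # the failed operation is reported...
            break                   # ...and the rest of the compound is skipped
    return statuses


fs = {"a": "old-a", "b": "old-b"}
statuses = run_compound(
    fs, [("WRITE", "a", "new-a"), ("REMOVE", "missing", ""), ("WRITE", "b", "new-b")]
)
print(statuses, fs)
```

After the call, file "a" holds new data while "b" is untouched, so the application is left in a half-applied state that it has to repair itself.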

SLIDE 5

Design Overview

① Compound request reaches TCNFS server
② TCNFS writes the compound request into the metadata database as a Recovery Record (RR)
③ TCNFS backs up data blocks of files that will be changed by the compound request
④ TCNFS executes the operations
⑤ TCNFS removes backup data
⑥ TCNFS removes the recovery record

[Figure: the Vectorized NFSv4 Client sends Compound Request #23 (PUTFH NFH1; WRITE; PUTFH NFH2; WRITE; …) to the TCNFS Server. Labels in the figure: Transaction Layer (Write LFH1; LFH3; ⋯); Metadata Database holding <LFH1,NFH1> <LFH3,NFH2> … <NFH1,LFH1> <NFH1, Path1> <NFH2,LFH3> <NFH2, Path2> … <RR#23,...>; Original files; Clone; Files in backup directory #23. Steps ① to ⑥ are marked on the figure.]
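The six steps above can be sketched as a toy transaction layer. This is an illustration under stated assumptions, not the TC-NFS implementation: `ToyTransactionLayer` is hypothetical, dictionaries stand in for the metadata database, the backup directory, and the original files, and whole-file copies stand in for block-level backups.

```python
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, str, str]  # (kind, path, data); kind is "WRITE" or "REMOVE"


class ToyTransactionLayer:
    def __init__(self, files: Dict[str, str]):
        self.files = files                                         # original files
        self.recovery_records: Dict[int, List[Op]] = {}            # metadata database
        self.backups: Dict[int, Dict[str, Optional[str]]] = {}     # backup directories

    def _apply(self, op: Op) -> None:
        kind, path, data = op
        if kind == "WRITE":
            self.files[path] = data
        elif kind == "REMOVE":
            del self.files[path]  # raises KeyError if the file is missing
        else:
            raise ValueError(f"unsupported operation: {kind}")

    def _undo(self, txn_id: int) -> None:
        for path, old in self.backups[txn_id].items():
            if old is None:
                self.files.pop(path, None)  # file did not exist before
            else:
                self.files[path] = old

    def execute(self, txn_id: int, ops: List[Op]) -> bool:
        self.recovery_records[txn_id] = list(ops)                   # step 2
        touched = {path for _kind, path, _data in ops}
        self.backups[txn_id] = {p: self.files.get(p) for p in touched}  # step 3
        ok = True
        try:
            for op in ops:                                          # step 4
                self._apply(op)
        except (KeyError, ValueError):
            self._undo(txn_id)  # reverse the operations already executed
            ok = False
        del self.backups[txn_id]                                    # step 5
        del self.recovery_records[txn_id]                           # step 6
        return ok


layer = ToyTransactionLayer({"a": "old-a", "b": "old-b"})
failed = layer.execute(23, [("WRITE", "a", "new-a"), ("REMOVE", "missing", "")])
succeeded = layer.execute(24, [("WRITE", "a", "new-a"), ("WRITE", "c", "new-c")])
```

In this sketch the failed compound #23 leaves the files exactly as they were, while #24 applies fully; in both cases the backup and the recovery record are gone afterwards.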

SLIDE 6

Design: Error Handling

  • In case of an error…

◆ TCNFS reverses previously executed operations


  • In case of a server crash…

◆ The recovery record will be present in the metadata database
◆ TCNFS will parse the recovery record to retrieve the failed compound request
◆ TCNFS reverses the partially done compound request
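Crash recovery as described above can be sketched as a scan over leftover recovery records. This is a hedged illustration: `recover_after_crash` is a hypothetical function, and plain dictionaries stand in for the on-disk metadata database and backup directories that would survive a restart.

```python
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, str, str]  # (kind, path, data)


def recover_after_crash(
    files: Dict[str, str],
    recovery_records: Dict[int, List[Op]],
    backups: Dict[int, Dict[str, Optional[str]]],
) -> List[int]:
    """Undo every compound that still has a recovery record; return their ids."""
    undone = []
    for txn_id in sorted(recovery_records):
        # Parse the record to find which files the compound meant to change.
        touched = {path for _kind, path, _data in recovery_records[txn_id]}
        saved = backups.get(txn_id, {})
        for path in touched:
            if path not in saved:
                continue  # no backup was taken, so the file was never modified
            if saved[path] is None:
                files.pop(path, None)       # file was created by the compound
            else:
                files[path] = saved[path]   # restore the pre-compound content
        backups.pop(txn_id, None)
        del recovery_records[txn_id]
        undone.append(txn_id)
    return undone


# State as it might look after a crash in the middle of compound #23:
files = {"a": "new-a", "b": "old-b"}
records = {23: [("WRITE", "a", "new-a"), ("WRITE", "b", "new-b")]}
backups = {23: {"a": "old-a", "b": "old-b"}}
undone = recover_after_crash(files, records, backups)
```

A compound that finished cleanly has no recovery record left, so it is never touched by this scan.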

SLIDE 7

Prototype Architecture (1)

  • Lock Manager

◆ Coordinates multi-client conflicting access

  • Backup Manager

◆ Creates and cleans up backups

  • Undo Executor

◆ Reverts compounds left partially executed by a failure

  • Metadata Translator

◆ Maintains mappings between NFS file handles and local file handles

[Figure: prototype architecture. User space: NFS Ganesha with RPC + Protocol Layer (NFS v4); MDCACHE: Metadata Cache; File System Abstraction Layer; TC-NFS Transaction Layer (Lock Manager, Backup Manager, Undo Executor, Metadata Translator, Offline Undo Executor, Transaction Logger, Metadata Database); VFS: File System Wrapper. Kernel, reached via System Call/ioctl: Virtual File System; CoW-enabled File System: XFS, btrfs; Networking (TCP/IP). Also labeled: Vectorized NFSv4 API; SSD with Power-loss Protection.]
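The Metadata Translator's job can be pictured as a two-way lookup table. The sketch below is an assumption-laden illustration: `HandleMap` is a hypothetical class, handles are plain strings, and the three tables simply mirror the pairs shown in the Design Overview figure (<LFH,NFH>, <NFH,LFH>, <NFH,Path>).

```python
from typing import Dict


class HandleMap:
    """Two-way mapping between local and NFS file handles, plus handle-to-path."""

    def __init__(self) -> None:
        self._local_to_nfs: Dict[str, str] = {}
        self._nfs_to_local: Dict[str, str] = {}
        self._nfs_to_path: Dict[str, str] = {}

    def add(self, local_fh: str, nfs_fh: str, path: str) -> None:
        # Keep all three tables in step so either direction can be resolved.
        self._local_to_nfs[local_fh] = nfs_fh
        self._nfs_to_local[nfs_fh] = local_fh
        self._nfs_to_path[nfs_fh] = path

    def to_nfs(self, local_fh: str) -> str:
        return self._local_to_nfs[local_fh]

    def to_local(self, nfs_fh: str) -> str:
        return self._nfs_to_local[nfs_fh]

    def path_of(self, nfs_fh: str) -> str:
        return self._nfs_to_path[nfs_fh]


handles = HandleMap()
handles.add("LFH1", "NFH1", "Path1")
handles.add("LFH3", "NFH2", "Path2")
```

Storing the path alongside the handles is what lets an undo step find the file to restore even when only the NFS handle appears in the recovery record.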

SLIDE 8

Prototype Architecture (2)

  • Transaction Logger

◆ Creates and cleans up the Recovery Records

  • Offline Undo Executor

◆ Reverts partially executed compounds due to server crash

  • CoW-enabled File System

◆ Uses CoW to create backups to reduce I/O overhead

  • SSD with Power-loss Protection

◆ Ensures endurance and reduces the latency of fsync()

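A CoW-based backup can be sketched with the Linux `FICLONE` ioctl, which asks the file system to share data blocks between two files instead of copying them. Whether the prototype uses this exact call is not stated on the slides, so treat it as an assumption; `backup_file` is a hypothetical helper, and it falls back to a plain copy when the file system does not support cloning.

```python
import fcntl
import os
import shutil
import tempfile

# Linux ioctl request number for FICLONE, i.e. _IOW(0x94, 9, int).
FICLONE = 0x40049409


def backup_file(src_path: str, dst_path: str) -> str:
    """Back up src_path to dst_path; return "clone" or "copy"."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Share the source's blocks with the destination (no data copied).
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "clone"
        except OSError:
            pass  # e.g. the file system has no reflink support
    shutil.copyfile(src_path, dst_path)  # fallback: ordinary byte copy
    return "copy"


workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "passwd")
backup = os.path.join(workdir, "passwd.bak")
with open(original, "wb") as f:
    f.write(b"root:x:0:0\n")
method = backup_file(original, backup)
```

On XFS (with reflink enabled) or btrfs the call would normally take the "clone" path; elsewhere it degrades to a copy, which keeps the sketch runnable but loses the I/O saving.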

SLIDE 9

Experimental Setup

  • 3 identical machines: 1 server + 2 clients

◆ Each client machine runs 4 KVM virtual machines
◆ Each VM runs Ubuntu 18.04 and one vNFS/NFSv4 client

  • CPU: Intel Xeon X5650
  • RAM: 64GB
  • Storage

◆ 147GB hard drive for system disk (ext4)
◆ 200GB Intel DC-S3700 SSD for server’s TC-NFS backend storage (XFS)

  • Network

◆ 10GbE NIC connected via 10GbE switch
◆ Average RTT = 0.2ms

  • OS: Ubuntu 18.04 with Linux Kernel v4.15

SLIDE 10

Micro-Benchmark: Writefiles

  • Writefiles (Multi-client)

◆ Write 1,000 fixed-size files, with sizes from 1K to 16M, in parallel
◆ 1~8 clients, 0.2ms network latency

Large files (⩾ 256K): 4.1~20% overhead
Small files (⩽ 128K): 36%~26⨉ overhead

SLIDE 11

Explore the Bottleneck

  • Local “Writefiles” Workload Simulation

◆ Concurrently writes 1,000 equal-size files locally to the SSD using 1~8 threads, repeatedly for 30s
◆ fsync() is called after writing each file to simulate the behavior of the NFSv4 server (NFS-Ganesha)

  • Two types of workload

◆ Interleaving-backup: creates a backup of the target file before each write() operation using Copy-on-Write cloning
◆ No-backup: only write() and fsync(), no backups
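The two workloads can be sketched as a small multi-threaded writer. This is a scaled-down illustration, not the benchmark used in the talk: `run_workload` is hypothetical, it makes a single pass instead of repeating for 30s, and `shutil.copyfile` stands in for Copy-on-Write cloning.

```python
import os
import shutil
import tempfile
import threading
import time
from typing import List


def write_files(directory: str, names: List[str], size: int, with_backup: bool) -> None:
    payload = b"x" * size
    for name in names:
        path = os.path.join(directory, name)
        if with_backup and os.path.exists(path):
            # Interleaving-backup: back up the target before each write.
            shutil.copyfile(path, path + ".bak")
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the data to stable storage per file


def run_workload(directory: str, n_files: int, size: int,
                 n_threads: int, with_backup: bool) -> float:
    """Write n_files files split across n_threads; return elapsed seconds."""
    names = [f"file{i}" for i in range(n_files)]
    threads = [
        threading.Thread(target=write_files,
                         args=(directory, names[t::n_threads], size, with_backup))
        for t in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


workdir = tempfile.mkdtemp()
no_backup_s = run_workload(workdir, n_files=8, size=1024, n_threads=2, with_backup=False)
interleaved_s = run_workload(workdir, n_files=8, size=1024, n_threads=2, with_backup=True)
```

Comparing the elapsed times for different thread counts gives the kind of scaling curve the slides discuss; the absolute numbers depend entirely on the machine and file system.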

SLIDE 12

Exploring the Bottleneck

  • Solid lines: No-backup; dashed lines: Interleaving-backup
  • No-backup (NB) workload scales well with the number of threads
  • Interleaving-backup (IB) workload does not scale, and can even slow down, as threads are added
  • This reproduces the poor scalability of TC-NFS

Speedup ratio at 8 threads, NB: 2.0~4.6⨉
Speedup ratio at 8 threads, IB: 0.36~0.95⨉

SLIDE 13

Macro-Benchmark: Coreutils

  • Test target: Linux kernel 4.20.7 source tree

◆ 62,447 regular files (average size: 14.9 KB)
◆ 4,148 directories (average 15 children per directory)

  • Single client; network latency varied from 0.2ms to 30.2ms
  • Baseline: vNFS Client + Vanilla NFSv4 Server; measured total runtime

Overhead relative to the baseline:
Symlink: 8.8~50%
Removal: 7.9~42%
Copy: 7.9~35%
Listing: 1.1~18%

[Figure annotations: ⩽10%; 35~50%]

SLIDE 14

Conclusions

  • Provides transaction support for NFSv4 compounds

◆ Makes error handling easier for applications
◆ Currently supports the following operations: OPEN, CREATE, WRITE, LINK, REMOVE, and simple RENAME (non-directories)

  • Introduces modest overhead to single-client workloads and real-world applications

◆ Considering the improvement vNFS provides, vNFS Client + TC-NFS is still much faster than a traditional NFSv4 system

  • Higher overhead in multi-client workloads

◆ This is because CoW cloning + synced writes are slow on XFS
◆ Will be resolved once the CoW feature is optimized

SLIDE 15


Q&A

Paper: https://www.fsl.cs.sunysb.edu/docs/nfs4perf/tcnfs-systor2020.pdf
Project Source Code: https://github.com/sbu-fsl/fsl-tc-server