SLIDE 1

Connectathon 1997

Client-Side Direct I/O for NFS

Mike Kupfer

kupfer@Eng.Sun.COM

28 February 1997

SLIDE 2

Disclaimer This is not a product announcement.

SLIDE 3

Overview

  • Background
  • Changes
  • Performance Results
  • Future Work, Issues
SLIDE 4

Background

SLIDE 5

The Benchmark

What:

  • sequential I/O: mkfile a 60 MB file, then dd it to /dev/null
  • unmounts on client and server to flush caches

Why:

  • LADDIS (SPEC SFS) doesn’t measure client
  • LADDIS measures aggregate, not point-to-point
  • expect 6+ MB/s on SS10/20 with FastEthernet, only getting 5 MB/s (up from 3.4 MB/s)
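For context, a minimal C stand-in for the `dd ... of=/dev/null` half of the benchmark: read a file sequentially in fixed-size chunks and count the bytes transferred (the quantity the throughput numbers above are computed from). The function name, chunk size, and byte limit are illustrative assumptions, not from the slides.

```c
#include <fcntl.h>
#include <unistd.h>

/* Read up to `limit` bytes sequentially in dd-sized chunks and return
 * the number of bytes actually read. Dividing this by elapsed time
 * gives the MB/s figures quoted on this slide. */
long sequential_read(const char *path, long limit)
{
    char buf[32768];                 /* 32 KB chunk, as with dd bs=32k */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long total = 0;
    while (total < limit) {
        long want = limit - total;
        if (want > (long)sizeof buf)
            want = (long)sizeof buf;
        ssize_t n = read(fd, buf, (size_t)want);
        if (n <= 0)                  /* EOF or error ends the transfer */
            break;
        total += n;
    }
    close(fd);
    return total;
}
```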

SLIDE 6

Direct I/O

  • bypass page cache
  • best for large files, no locality of reference
  • avoid page cache overhead
  • avoid polluting page cache
  • UFS Direct I/O project in 2.6
  • databases, decision support software
  • might help NFS server; what about client?
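The slides don't show the Solaris API, so as a hedged illustration here is how an application can request cache-bypassing I/O on a modern Linux system via the O_DIRECT open flag (the same flag SGI used, per the next slide). The helper name and 4 KB block size are assumptions; direct I/O generally requires block-aligned buffers and transfer sizes, and some filesystems reject the flag, so this sketch falls back to buffered I/O in that case.

```c
#define _GNU_SOURCE              /* exposes O_DIRECT on Linux */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLK 4096                 /* assumed alignment granularity */

/* Write one aligned block, bypassing the page cache where possible.
 * Returns bytes written, or -1 on error. */
ssize_t direct_write_block(const char *path, const char *data)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0 && errno == EINVAL)   /* fs without direct I/O support */
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    void *buf;                       /* direct I/O needs an aligned buffer */
    if (posix_memalign(&buf, BLK, BLK) != 0) {
        close(fd);
        return -1;
    }
    memset(buf, 0, BLK);
    size_t len = strlen(data);
    memcpy(buf, data, len < BLK ? len : BLK);

    ssize_t n = write(fd, buf, BLK); /* whole aligned block */
    free(buf);
    close(fd);
    return n;
}
```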

SLIDE 7

Direct I/O (cont’d)

  • SGI’s Bulk Data Service
  • O_DIRECT flag combined with NFS file
  • stuff bytes into a TCP socket connection
  • 60 MB/s over HIPPI (March 1996)
  • uses private protocol, requires client and server changes

SLIDE 8

Changes

SLIDE 9

Overview of Changes

  • API support: make it look like UFS
  • add array of buffers to rnode
  • kmem_alloc, kmem_free buffers as needed
  • use buffers instead of VM segment
  • keep the pipe full
  • use readahead and write-behind
  • large transfer sizes
  • safe asynchronous writes
  • transparent to server except for larger transfer size
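"Keep the pipe full" with an array of buffers and async threads can be sketched as a bounded ring: one async thread reads ahead over the wire while the caller drains completed buffers. Everything here (names, buffer count, block count) is a hypothetical user-level analogy for illustration, not the kernel code the slides describe.

```c
#include <pthread.h>

#define NBUFS   4    /* "how many buffers" is a tuning knob, per the slides */
#define NBLOCKS 16   /* blocks in the simulated file */

typedef struct {
    int ring[NBUFS];           /* stand-in for the rnode's buffer array */
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_full, not_empty;
} pipe_state_t;

/* Async thread: keep the pipe full by fetching blocks ahead of the reader. */
static void *readahead(void *arg)
{
    pipe_state_t *p = arg;
    for (int blk = 0; blk < NBLOCKS; blk++) {
        pthread_mutex_lock(&p->lock);
        while (p->count == NBUFS)             /* all buffers in flight */
            pthread_cond_wait(&p->not_full, &p->lock);
        p->ring[p->head] = blk;               /* "read" block blk */
        p->head = (p->head + 1) % NBUFS;
        p->count++;
        pthread_cond_signal(&p->not_empty);
        pthread_mutex_unlock(&p->lock);
    }
    return NULL;
}

/* Consumer: drain every block; returns the sum of block ids as a checksum. */
int run_pipeline(void)
{
    pipe_state_t p = { .head = 0, .tail = 0, .count = 0,
                       .lock = PTHREAD_MUTEX_INITIALIZER,
                       .not_full = PTHREAD_COND_INITIALIZER,
                       .not_empty = PTHREAD_COND_INITIALIZER };
    pthread_t t;
    pthread_create(&t, NULL, readahead, &p);

    int sum = 0;
    for (int i = 0; i < NBLOCKS; i++) {
        pthread_mutex_lock(&p.lock);
        while (p.count == 0)                  /* wait for readahead */
            pthread_cond_wait(&p.not_empty, &p.lock);
        sum += p.ring[p.tail];                /* copy out to the caller */
        p.tail = (p.tail + 1) % NBUFS;
        p.count--;
        pthread_cond_signal(&p.not_full);
        pthread_mutex_unlock(&p.lock);
    }
    pthread_join(t, NULL);
    return sum;
}
```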

SLIDE 10

Client Structure

[Diagram: nfs3_read, nfs3_write; nfs3read, nfs3write; VM (page cache) code; direct I/O code (including VNOCACHE); async threads]

SLIDE 11

Performance Results

SLIDE 12

Issues, Future Work

SLIDE 13

Issues

  • to productize or not to productize
  • verify on UltraSPARC, other benchmarks
  • API for determining transfer size
  • tuning
  • how many buffers
  • when to issue COMMIT
  • MT support too hairy?
  • less arcane scheme for iterating over buffers

SLIDE 14

Things To Do

  • failover support
  • cache management, error handling
  • make direct I/O consistent with VM-based code (such as it is)

  • misc. cleanup
  • API for enabling/disabling direct I/O
  • code organization
  • plug into kmem reclaim logic
  • coexistence with mmap
  • etc.
SLIDE 15

Futures

  • application-directed readahead?
  • page flipping?
  • server-side direct I/O
  • assume client cache takes most hits for NFS

  • use UFS direct I/O
SLIDE 16

Conclusions

  • bypassing page cache is a win for sequential access, no locality of reference
  • the win gets bigger if the file doesn’t fit in memory
  • keeping the pipe full is more work, but necessary