File System Design for an NFS File Server Appliance
Dave Hitz, James Lau, and Michael Malcolm
Technical Report TR3002, NetApp, 2002
http://www.netapp.com/us/library/white-papers/wp_3002.html
(At WPI: http://www.wpi.edu/Academics/CCC/Help/Unix/snapshots.html)

Introduction: NFS Appliance
• In general, an appliance is a device designed to perform a specific function
• The trend in distributed systems has been to use appliances instead of general purpose computers. Examples:
  – routers from Cisco and Avici
  – network terminals
  – network printers
• For files: not just another computer with your files, but a new type of network appliance → a Network File System (NFS) file server
• NFS file server appliances have different requirements than those of a general purpose file system:
  – NFS access patterns are different than local file access patterns
  – Large client-side caches result in fewer reads than writes
• Network Appliance Corporation uses the Write Anywhere File Layout (WAFL) file system

Introduction: WAFL
• WAFL has 4 requirements:
  – Fast NFS service
  – Support large file systems (10s of GB) that can grow (disks can be added later)
  – Provide high performance writes and support Redundant Arrays of Inexpensive Disks (RAID)
  – Restart quickly, even after an unclean shutdown
• NFS and RAID both strain write performance:
  – The NFS server must respond only after data is written
  – RAID must write parity bits as well

WPI File System
• CCC machines have a central Network File System (NFS)
  – Same home directory on cccwork2, cccwork3, …
  – /home has 10,113 directories!
• Previously, Network File System support came from NetApp WAFL
• Switched to EMC Celerra NS-120
  → similar features and protocol support
• Both provide the notion of a “snapshot” of the file system (next)

Outline
• Introduction (done)
• Snapshots: User Level (next)
• WAFL Implementation
• Snapshots: System Level
• Performance
• Conclusions

Introduction to Snapshots
• Snapshots are a copy of the file system at a given point in time
• WAFL creates and deletes snapshots automatically at preset times
  – Up to 255 snapshots stored at once
• Uses copy-on-write to avoid duplicating blocks in the active file system
• Snapshot uses:
  – Users can recover accidentally deleted files
  – Sys admins can create backups from the running system
  – System can restart quickly after an unclean shutdown (roll back to a previous snapshot)

User Access to Snapshots
• Example: suppose you accidentally removed a file named “todo”:

    CCCWORK3% ls -lut .snapshot/*/todo
    -rw-rw---- 1 claypool claypool 4319 Oct 24 18:42 .snapshot/2011_10_26_18.15.29/todo
    -rw-rw---- 1 claypool claypool 4319 Oct 24 18:42 .snapshot/2011_10_26_19.27.40/todo
    -rw-rw---- 1 claypool claypool 4319 Oct 24 18:42 .snapshot/2011_10_26_19.37.10/todo

• Can then recover the most recent version:

    CCCWORK3% cp .snapshot/2011_10_26_19.37.10/todo todo

• Note: snapshot directories (.snapshot) are hidden in that they don’t show up with ls (even ls -a) unless specifically requested

Snapshot Administration
• The WAFL server allows sys admins to create and delete snapshots, but they are usually automatic
• At WPI, snapshots of /home. The policy says:
  – 3am, 6am, 9am, noon, 3pm, 6pm, 9pm, midnight every day
  – Nightly snapshot at midnight
  – Weekly snapshot made on Saturday at midnight every week
→ But it looks like every 1 hour (fewer copies kept for older periods, and 1 week ago at most):

    claypool 168 CCCWORK3% cd .snapshot
    claypool 169 CCCWORK3% ls -1
    home-20160121-00:00/
    home-20160122-00:00/
    home-20160122-22:00/
    home-20160123-00:00/
    home-20160123-02:00/
    home-20160123-04:00/
    home-20160123-06:00/
    home-20160123-08:00/
    home-20160123-10:00/
    home-20160123-12:00/
    …
    home-20160127-16:00/
    home-20160127-17:00/
    home-20160127-18:00/
    home-20160127-19:00/
    home-20160127-20:00/
    home-latest/
Snapshots at WPI (Windows)
• Mount the UNIX space (\\storage.wpi.edu\home) and add \.snapshot to the end
• Can also right-click on a file and choose “restore previous version”
• Note: files in .snapshot do not count against quota

Outline
• Introduction (done)
• Snapshots: User Level (done)
• WAFL Implementation (next)
• Snapshots: System Level
• Performance
• Conclusions

WAFL File Descriptors
• Inode based system with 4 KB blocks
• Inode has 16 pointers, which vary in type depending upon file size (see the sketch below):
  – For files smaller than 64 KB: each pointer points to a data block
  – For files larger than 64 KB: each pointer points to an indirect block
  – For really large files: each pointer points to a doubly-indirect block
• For very small files (less than 64 bytes), data is kept in the inode itself, instead of using pointers to blocks
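As a concrete illustration, here is a minimal C sketch of an inode along these lines. The field names, the 4-byte block pointers, and the union layout are assumptions for illustration, not the actual WAFL on-disk format:

    #include <stdint.h>

    #define BLOCK_SIZE   4096  /* 4 KB blocks */
    #define NUM_POINTERS 16    /* 16 pointers per inode */
    #define INLINE_MAX   64    /* files under 64 bytes live inside the inode */

    /* Hypothetical WAFL-style inode. Assuming 4-byte pointers, a 4 KB
     * indirect block holds 1024 pointers, so the three levels cover:
     *   direct:          16 x 4 KB          = 64 KB
     *   indirect:        16 x 1024 x 4 KB   = 64 MB
     *   doubly-indirect: 16 x 1024^2 x 4 KB = 64 GB
     */
    struct inode {
        uint64_t size;   /* file size in bytes */
        uint8_t  level;  /* 0 = direct, 1 = indirect, 2 = doubly-indirect */
        union {
            uint32_t block[NUM_POINTERS];  /* block numbers; interpretation
                                              depends on level */
            uint8_t  data[INLINE_MAX];     /* tiny files stored in place */
        } u;
    };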
WAFL Meta-Data (Tree of Blocks)
• Meta-data is stored in files:
  – Inode file – stores inodes
  – Block-map file – stores free blocks
  – Inode-map file – identifies free inodes
• The root inode must be in a fixed location
• Other blocks can be written anywhere

Zoom of WAFL Meta-Data
• (figure only: zoomed view of the meta-data tree of blocks)

Snapshots (1 of 2)
• Copy the root inode only; use copy-on-write for changed data blocks (see the sketch below)
• Over time, an old snapshot references more and more data blocks that are no longer used
• The rate of file change determines how many snapshots can be stored on the system

Snapshots (2 of 2)
• When a disk block is modified, its meta-data (indirect pointers) must be modified as well
• Batch, to improve I/O performance
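A minimal, self-contained C sketch of the copy-on-write rule, using a toy in-memory disk; all names and the block-map bit layout here are illustrative assumptions, not WAFL internals:

    #include <stdint.h>
    #include <string.h>

    #define NBLOCKS 1024
    #define BLKSZ   4096

    static uint8_t  disk[NBLOCKS][BLKSZ];  /* toy "disk" */
    static uint32_t blockmap[NBLOCKS];     /* bit 0 = active FS, bits 1+ = snapshots */

    static int in_snapshot(uint32_t b) { return (blockmap[b] & ~1u) != 0; }

    static uint32_t alloc_block(void)      /* "write anywhere": any free block */
    {
        for (uint32_t b = 0; b < NBLOCKS; b++)
            if (blockmap[b] == 0) { blockmap[b] = 1u; return b; }
        return (uint32_t)-1;               /* out of space */
    }

    /* Never overwrite a block a snapshot still references: write the new
     * contents elsewhere and return the new block number so the caller can
     * repoint the parent (which is itself copied, up to the root inode). */
    uint32_t cow_write(uint32_t old, const void *data)
    {
        uint32_t target = old;
        if (in_snapshot(old)) {
            target = alloc_block();        /* snapshot keeps the old block */
            blockmap[old] &= ~1u;          /* active FS drops its reference */
        }
        memcpy(disk[target], data, BLKSZ);
        return target;
    }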
Consistency Points (1 of 2)
• In order to avoid consistency checks after an unclean shutdown, WAFL creates a special snapshot called a consistency point every few seconds
  – Not accessible via NFS
• Batched operations are written to disk at each consistency point
  – Like a journal
• In between consistency points, data is written only to RAM

Consistency Points (2 of 2)
• WAFL uses NVRAM (NV = Non-Volatile):
  – (NVRAM is DRAM with batteries to avoid losing data during unexpected power-off; some servers now use solid-state or hybrid memory instead)
  – NFS requests are logged to NVRAM
  – Upon unclean shutdown, re-apply the logged NFS requests to the last consistency point
  – Upon clean shutdown, create a consistency point and turn off NVRAM until needed (to save power/batteries)
• Note: a typical FS uses NVRAM as a metadata write cache instead of just for logs (see the sketch below)
  – Uses more NVRAM space (WAFL logs are smaller)
    • Ex: “rename” needs 32 KB; the WAFL log needs 150 bytes
    • Ex: an 8 KB write needs 3 blocks (data, inode, indirect pointer); WAFL needs 1 block (data) plus 120 bytes for the log
  – Slower response time for a typical FS than for WAFL (although WAFL may be a bit slower upon restart)
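A hedged C sketch of why request logging is compact and how replay works. The entry layout and replay interface are invented for illustration; only the idea (log the request, not the dirtied meta-data blocks) comes from the slides:

    #include <stdint.h>
    #include <stddef.h>

    /* Logging the *request* takes a few hundred bytes; a typical FS
     * caching the dirtied meta-data blocks would need tens of KB. */
    enum nfs_op { NFS_WRITE, NFS_RENAME, NFS_CREATE };

    struct log_entry {
        enum nfs_op op;
        uint64_t    inode_no;
        uint64_t    offset, length;  /* for NFS_WRITE */
        char        names[128];      /* path arguments for rename/create */
    };

    /* After an unclean shutdown, the last consistency point on disk is
     * already self-consistent, so no fsck-style scan is needed: just
     * re-apply the requests logged since that consistency point. */
    void recover(const struct log_entry *log, size_t n,
                 void (*replay)(const struct log_entry *))
    {
        for (size_t i = 0; i < n; i++)
            replay(&log[i]);
    }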
The Block-Map File
• A typical FS uses one bit for each block: 1 is allocated and 0 is free
  – Ineffective for WAFL, since other snapshots may still point to a block
• WAFL uses 32 bits for each block (see the sketch below)
  – For each block, copy the “active” bit over to the snapshot bit

Write Allocation
• Write times dominate NFS performance
  – Read caches at clients are large
  – Up to 5x as many write operations as read operations at the server
• WAFL batches write requests (e.g., at consistency points)
• WAFL allows “write anywhere”, enabling the inode to sit next to its data for better performance
  – A typical FS has inode information and free blocks at fixed locations
• WAFL allows writes in any order, since it uses consistency points
  – A typical FS writes in a fixed order to allow fsck to work after an unclean shutdown

Outline
• Introduction (done)
• Snapshots: User Level (done)
• WAFL Implementation (done)
• Snapshots: System Level (next)
• Performance
• Conclusions
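The per-block bookkeeping in “The Block-Map File” above can be sketched directly in C. Treating bit 0 as the active file system and bits 1–31 as snapshots is an assumed layout for illustration, not the on-disk format:

    #include <stdint.h>

    /* One 32-bit block-map entry per disk block.
     * Assumed layout: bit 0 = active file system, bits 1..31 = snapshots. */

    static int is_free(uint32_t entry)     /* free only if NO version uses it */
    {
        return entry == 0;
    }

    static uint32_t take_snapshot(uint32_t entry, int snap)
    {
        return entry | ((entry & 1u) << snap);  /* copy active bit to snapshot bit */
    }

    static uint32_t drop_snapshot(uint32_t entry, int snap)
    {
        return entry & ~(1u << snap);           /* deleting a snapshot clears its bit */
    }

A block deleted from the active file system stays allocated until every snapshot bit referencing it is cleared, which is why a one-bit free list cannot work here.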
Creating Snapshots
• Challenge: avoid locking out NFS requests while creating a snapshot
• Could suspend NFS, create the snapshot, and resume NFS
  – But that can take up to 1 second
• Instead, WAFL marks all dirty cache data as IN_SNAPSHOT. Then:
  – NFS requests can read all system data, and write data not marked IN_SNAPSHOT
  – Data not marked IN_SNAPSHOT is not flushed to disk
• Must flush IN_SNAPSHOT data as quickly as possible

Flushing IN_SNAPSHOT Data
(the steps are sketched in stub form below)
• Flush inode data first
  – WAFL keeps two caches for inode data, so it can copy the system cache to the inode data file, unblocking most NFS requests
  – Quick, since it requires no disk I/O (the inode file itself is flushed later)
• Update the block-map file
  – Copy the active bit to the snapshot bit
• Write all IN_SNAPSHOT data
  – Restart any blocked request as soon as its particular buffer is flushed (don’t wait for all data to be flushed)
• Duplicate the root inode and turn off the IN_SNAPSHOT bit
• All done in less than 1 second; the first step is done in 100s of milliseconds
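The four steps above, in order, as stub C code. Every function body is a placeholder comment, since the slides describe the sequencing rather than the mechanics; all names are hypothetical:

    /* Hypothetical stubs mirroring the slides' ordering. */
    static void mark_dirty_as_in_snapshot(void) { /* new writes go to fresh buffers */ }
    static void flush_inode_cache(void)   { /* copy system cache to the inode data
                                               file; no disk I/O, so most blocked
                                               NFS requests resume in 100s of ms */ }
    static void update_block_map(void)    { /* copy active bits to snapshot bits */ }
    static void write_snapshot_data(void) { /* flush IN_SNAPSHOT buffers; restart
                                               each blocked request as soon as its
                                               buffer is flushed */ }
    static void dup_root_inode(void)      { /* snapshot now exists on disk */ }

    void create_snapshot(void)            /* all steps: under 1 second total */
    {
        mark_dirty_as_in_snapshot();
        flush_inode_cache();      /* step 1 */
        update_block_map();       /* step 2 */
        write_snapshot_data();    /* step 3 */
        dup_root_inode();         /* step 4: then clear IN_SNAPSHOT marking */
    }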
Outline
• Introduction (done)
• Snapshots: User Level (done)
• WAFL Implementation (done)
• Snapshots: System Level (done)
• Performance (next)
• Conclusions