SLIDE 1

Scale and Performance in a Distributed File System (AFS)

Howard et al. CMU 1988, ACM TOCS

Presenter: Dhirendra Singh Kholia

SLIDE 2

Outline

  • What is AFS?
  • The Prototype implementation
  • Changes for Performance
  • Effect of Changes for Performance
  • Comparison with NFS
  • Conclusion
  • Q&A
SLIDE 3

AFS (Andrew File System)

  • AFS is a distributed filesystem that enables efficient sharing of storage resources across both local area and wide area networks.

  • Development started at CMU around 1983
  • Goal: 5,000-10,000 nodes (very high scalability!)
  • Scale, yet maintain performance and simple administration

SLIDE 4

AFS

  • Client-Server Architecture
  • Vice: Set of trusted servers
  • Clients run a user-level process called Venus
  • Venus caches files from Vice
  • Caching based on an upload/download (whole-file) transfer model
  • Venus contacts Vice only for open and close operations
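The whole-file transfer model above can be sketched in a few lines. All class and method names here are illustrative, not the real Venus/Vice interfaces:

```python
# Hypothetical sketch of AFS-style whole-file caching: Venus fetches the
# entire file on open, serves reads/writes from the local copy, and ships
# the file back to Vice on close.

class ViceServer:
    """Stands in for a trusted Vice file server."""
    def __init__(self):
        self.files = {}          # path -> file contents

    def fetch(self, path):
        return self.files[path]

    def store(self, path, data):
        self.files[path] = data

class Venus:
    """Stands in for the client-side Venus cache manager."""
    def __init__(self, server):
        self.server = server
        self.cache = {}          # path -> local whole-file copy

    def open(self, path):
        # Contact Vice only on open: download the whole file once.
        if path not in self.cache:
            self.cache[path] = bytearray(self.server.fetch(path))
        return self.cache[path]  # all reads/writes hit this local copy

    def close(self, path):
        # Contact Vice only on close: upload the whole (possibly modified) file.
        self.server.store(path, bytes(self.cache[path]))

server = ViceServer()
server.files["/vice/readme"] = b"hello"
venus = Venus(server)
buf = venus.open("/vice/readme")   # one fetch over the network
buf += b", afs"                    # purely local; no server traffic
```

All reads and writes between open and close are local operations, which is exactly why open/close are the only points of server contact.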
SLIDE 5

The Prototype Implementation

  • Spawned a dedicated process for every client
  • Each server contained a directory hierarchy mirroring the structure of the Vice files:
      – .admin directory: Vice file status information
      – Stub directories: location database embedded in the file tree
  • Pathname resolution done by Vice (servers)
  • Venus verifies a timestamp before using a cached file (open() and stat() force contact with Vice!)

  • Coarse-grained read-only replication
  • Dedicated lock-server process
SLIDE 6

Benchmark Details

  • The benchmark script operates on a collection of source code files
  • 70 files totaling 200 KiB
  • 5 phases: MakeDir, Copy, ScanDir, ReadAll, Make
SLIDE 7

Local FS performance

Benchmark took ~1000 seconds on a Sun2 workstation

SLIDE 8

Prototype Performance

  • 1 Load Unit = 5 Andrew users
  • 70% slower than local FS
  • Doesn’t scale well after 5-8 Load Units

SLIDE 9

Call Distribution

  • TestAuth (validates cache entries) and GetFileStat (gets status information for files not in cache): these 2 calls accounted for almost 90% of total calls!
  • open() and stat() force contact with Vice.
  • Caching works (> 80% hit ratio)
  • “Cache validation driven totally by Venus” is not a good idea

Source: http://dcslab.snu.ac.kr/courses/dip2009f/presentation_old/3.ppt

SLIDE 10

Prototype resource usage

  • 75% CPU utilization over a 5-minute period, 40% over an 8-hour period!
  • CPU is the performance bottleneck!
  • Causes: pathname resolution, excessive context switches
SLIDE 11

Problems with the Prototype

  • High virtual memory paging demands (fork model)
  • High CPU usage
  • Frequently exceeded critical resource limits (especially network-related resources)
  • High frequency of cache validation checks (too many stats)
  • Difficult to move directories around (and thus balance load)
  • Despite all these problems, the prototype was robust, simple, and it worked.

  • “… our users willingly suffered!” 
SLIDE 12

Changes for Performance

  • Cache Management
  • Name Resolution + Low-Level Storage Representation

  • Process structure

Target: Handle at least 50 clients per server.

SLIDE 13

Cache Management

  • Status cache (in virtual memory, for fast stat() performance)
  • Data cache (on local disk)
  • Now caches directory contents and symlinks too!
  • Venus now assumes that cache entries are valid unless otherwise notified by Vice
  • Callback: the server promises to notify the client before allowing a modification. This reduces cache validation traffic and server load.

  • Maintenance of callback state information.
  • There is a potential for inconsistency (how?)
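A minimal sketch of the callback idea, assuming illustrative names (the real Vice/Venus RPC interface differs): the server records a promise per cached file and breaks it before accepting a new version, so clients need not validate on every open.

```python
# Toy model of callback-based cache invalidation (not the real protocol).

class Server:
    def __init__(self):
        self.data = {}           # path -> contents
        self.callbacks = {}      # path -> set of clients holding a promise

    def fetch(self, path, client):
        # Handing out a copy establishes a callback promise for this client.
        self.callbacks.setdefault(path, set()).add(client)
        return self.data[path]

    def store(self, path, contents, writer):
        # Break callbacks first: notify every other client holding a promise.
        for c in self.callbacks.get(path, set()) - {writer}:
            c.break_callback(path)
        self.callbacks[path] = {writer}
        self.data[path] = contents

class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def open(self, path):
        # Cached copy is assumed valid while the callback promise stands:
        # no validation traffic on a cache hit.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(path, self)
        return self.cache[path]

    def break_callback(self, path):
        self.cache.pop(path, None)   # next open() refetches

server = Server()
server.data["/f"] = "v1"
a, b = Client(server), Client(server)
assert a.open("/f") == "v1" and b.open("/f") == "v1"
b.cache["/f"] = "v2"                          # b modifies its local copy...
server.store("/f", b.cache["/f"], writer=b)   # ...and stores it on close
```

The inconsistency window hinted at above shows up here too: if a client is unreachable when its callback is broken, it may keep using a stale copy until it next talks to the server.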
SLIDE 14

Name Resolution + Low-Level Storage Representation

  • Earlier, pathname resolution was done by Vice (a costly implicit namei operation caused server load)
  • Now, Venus maps Vice pathnames to Fids and passes a Fid to Vice
  • 96-bit Fid = 32-bit Volume Number + 32-bit Vnode Number + 32-bit Uniquifier
  • Key idea: eliminate pathname lookups (use Fids on servers and inodes on clients directly)
  • The Volume Number identifies a Volume; the location of a Volume is contained in the Volume Location Database.
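The Fid-based resolution can be illustrated as follows; the field names, cache contents, and server addresses are all hypothetical:

```python
# Sketch of the 96-bit Fid layout described above: clients resolve pathnames
# themselves and hand servers an opaque Fid, so servers never repeat a
# namei-style pathname walk.

from collections import namedtuple

Fid = namedtuple("Fid", ["volume", "vnode", "uniquifier"])  # 32 bits each

# Hypothetical client-side name cache: pathname -> Fid.
name_cache = {
    "/afs": Fid(1, 1, 1),
    "/afs/usr": Fid(1, 7, 1),
    "/afs/usr/paper.txt": Fid(1, 42, 3),
}

# Hypothetical Volume Location Database: volume number -> server address.
vldb = {1: "vice1.example.edu"}

def resolve(path):
    """Venus-style resolution: pathname -> (server, Fid), done on the client."""
    fid = name_cache[path]          # no server involvement in the walk
    return vldb[fid.volume], fid    # the volume number locates the server

server, fid = resolve("/afs/usr/paper.txt")
```

The server only ever sees the Fid, so it can map it straight to storage without touching the namespace.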

SLIDE 15

Process Structure

  • Use fixed number of LWPs within one process.
  • An LWP is bound to a particular client only for the duration of a single server operation

  • User space RPC implementation
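As a rough analogy (not the paper’s LWP package, which is a cooperative user-level threading library), the fixed-pool idea can be modeled with a bounded worker pool: a small, fixed number of workers inside one process, each bound to a request only for the duration of that one operation.

```python
# Approximation of the fixed-LWP design with a bounded worker pool.
# Unlike the prototype, no per-client process is ever spawned.

from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4                      # fixed, regardless of client count

def serve(request):
    # Stand-in for one server operation (fetch, store, validate, ...).
    client, op = request
    return f"{client}:{op}:done"

# 20 clients issue requests, but only 4 workers exist.
requests = [(f"client{i}", "fetch") for i in range(20)]

with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    results = list(pool.map(serve, requests))
```

The worker is free again as soon as the operation returns, which is what lets a small fixed pool service many clients.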
SLIDE 16

AFS Consistency Semantics

  • Visibility of writes to an open file by a process on a workstation is limited to that particular workstation
  • Commit on close (write-on-close): changes become visible to new opens; existing open instances do not see the changes
  • All other file operations are visible everywhere immediately
  • No implicit locking; multiple clients can perform the same operation on a file concurrently
  • Applications have to cooperate and manage synchronization
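A toy model of these close-to-open semantics (all names illustrative):

```python
# Write-on-close: writes commit at close() and are visible only to opens
# that happen afterwards; already-open instances keep their snapshot.

class ViceFile:
    """Server-side file: holds the committed (stored) contents."""
    def __init__(self):
        self.committed = ""

class OpenInstance:
    """Client-side open instance: a whole-file snapshot taken at open()."""
    def __init__(self, f):
        self.f = f
        self.snapshot = f.committed

    def write(self, text):
        self.snapshot += text              # local only; invisible elsewhere

    def close(self):
        self.f.committed = self.snapshot   # commit on close

f = ViceFile()
writer = OpenInstance(f)
writer.write("new data")
reader_before = OpenInstance(f)    # opened before the writer closes
writer.close()
reader_after = OpenInstance(f)     # opened after the close
```

Note that if two instances write concurrently, whichever closes last wins, which is exactly the consistency question raised in the Q&A slides later.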

SLIDE 17

Effect of Changes for Performance

SLIDE 18

Effect of Changes for Performance

  • Only 19% slower than a stand-alone workstation
  • ScanDir and ReadAll phases almost independent of load!
  • Scales well and the target of 50 clients is easily met!
SLIDE 19

Comparison with NFS (Remote Open)

  • File Data is not fetched in one go
  • Advantage of remote-open model: Low Latency
SLIDE 20

Comparison with NFS (Time)

SLIDE 21

Comparison with NFS (CPU)

SLIDE 22

Comparison with NFS (Disk)

SLIDE 23

Comparison Report

  • NFS failed to work properly at high loads!
  • For 1 LU, NFS generated ~3 times as many packets as AFS
  • NFS’s performance degrades rapidly with load
  • NFS saturated CPU and disk and still couldn’t keep up (despite the fact that it operates entirely in the kernel!)
  • NFS doesn’t scale well (actually, it doesn’t seem to scale at all) 

SLIDE 24

Changes for Operability

  • A Volume is a collection of files. Each user is assigned a Volume.
  • A Volume is like a mini-filesystem in itself. It can grow/shrink in size.
  • Volumes allow quotas, consistent backups, read-only replication, and painless live migration of data
  • Volumes keep the size of the Volume Location Database manageable
  • The Volume abstraction is indispensable!
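A hedged sketch of the Volume abstraction, with illustrative names: a volume groups files, carries a quota, and can migrate between servers by updating only its Volume Location Database entry.

```python
# Toy volume with a quota, plus a Volume Location Database mapping
# volume names to their current server. Migration flips one VLDB entry;
# clients keep addressing the same volume name.

class Volume:
    def __init__(self, quota):
        self.quota = quota        # size limit in KiB (illustrative units)
        self.files = {}           # name -> size in KiB

    def used(self):
        return sum(self.files.values())

    def add(self, name, size):
        if self.used() + size > self.quota:
            raise OSError("quota exceeded")
        self.files[name] = size

vldb = {"user.alice": "server1"}  # volume name -> current server

vol = Volume(quota=1000)
vol.add("thesis.tex", 800)

# "Live migration": copy the volume's data across, then flip one VLDB
# entry; no client-visible pathname changes.
vldb["user.alice"] = "server2"
```

This is why volumes make load balancing painless compared with the prototype, where location information was embedded in the server file tree itself.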
SLIDE 25

Conclusion

  • Only problems I see are:
      A) limit of 64K files per directory
      B) whole-file caching (making it slow for big files)
  • Overall, AFS is awesome 
SLIDE 26

Questions - Scaling

  • Do they ever reach their goal of 5,000 workstations? Or are distributed file systems fundamentally flawed, unable to scale indefinitely?
  • Yes, AFS should be able to manage that magical number. (http://www.openafs.org/success.html)

SLIDE 27

Questions - Locking

  • Isn't the lack of any form of synchronization amongst the files dangerous? 4.2BSD doesn’t lock files implicitly, and AFS conforms to those semantics. Yes, it seems dangerous, but even under modern *NIX, locks are advisory by default, which again requires applications to behave “correctly”.
  • Couldn't a single badly written program corrupt a whole lot of important data? Blame the program then 

SLIDE 28

Questions - Caching

  • Is the caching of the entire file a good idea, given the huge size of files these days? Latency is a big problem with the whole-file transfer model. Even for a 24KB file the latency was ~0.5 seconds (quite noticeable!). For huge files it would get quite a bit worse (linearly, though). However, whole-file transfer is the key to AFS scaling! Let’s discuss this.
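A back-of-envelope check of the linearity claim, assuming the ~0.5 s for 24KB figure quoted above holds as an effective transfer rate (these are 1980s-era numbers, used purely for illustration):

```python
# If fetching a 24 KiB file took ~0.5 s, the implied effective rate is
# ~48 KiB/s, and whole-file fetch latency grows linearly with file size.

rate_kib_per_s = 24 / 0.5                 # ~48 KiB/s effective

def fetch_latency_s(size_kib):
    return size_kib / rate_kib_per_s

small = fetch_latency_s(24)               # the ~0.5 s case from the slide
large = fetch_latency_s(10 * 1024)        # a hypothetical 10 MiB file
```

Linear is at least predictable, but a multi-minute open() for a 10 MiB file shows why latency is the weak point of the whole-file model.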

SLIDE 29

Questions - Caching

  • Do servers remove callbacks for expired cache items on clients? If so, how would a server know what the workstation has cached, and which items have expired? Will the workstation notify the server about expired cache items?
  • Yes, Venus executes RemoveCallBack (while flushing an item out of the cache), which tells the server which file to remove the callback from.

SLIDE 30

Questions - Locking

  • The authors state that user-level file locking was implemented by a dedicated lock-server process. How does this centralized locking mechanism affect scalability? Locking is not done implicitly, so only particular applications will actually use the lock mechanism.

SLIDE 31

Questions

  • Embedding of file location information in the file storage structure made movement of files across servers difficult, because it required "structural modifications to storage on the servers"... what structural modifications does that mean?
  • My guess: moving a part of the namespace would require a new partition on the new server (since only entire disk partitions could be mounted, and the existing partition could not serve as another mount point).

SLIDE 32

Questions – Cache Size

  • Diskless operation is possible but slow, and files that are larger than the local disk cache cannot be accessed at all. Why couldn't they be accessed using the same slow method as diskless operation? A file always has to fit in the cache (memory or disk)!

SLIDE 33

Questions - Consistency

  • This paper does not mention file conflicts (i.e. users modifying stale copies of files). Are file conflicts possible?
  • What happens when Client A and Client B open and begin modifying the same file? If Client A closes the file first and B closes second, are the changes done by Client A lost? Can the server refuse the close() for Client B because it knows that the callback for B is missing/broken?

SLIDE 34

Questions

  • It seems like the performance of AFS will be quite low for small updates to huge files. So how can we overcome this problem? Will the performance of the system be hampered if small updates to huge files happen very often? Conceptually, something similar to rsync could be used to handle this.

SLIDE 35

Questions – Threading Model

  • AFS uses a single process with a fixed number of LWPs to service clients. Will this design cause problems? For example, the process may be blocked at times. What about fault-tolerance capability?
  • Yes, the N:1 cooperative threading model described in the paper (LWP) is pretty limited (e.g. it can’t use multiple cores). It seems AFS can use both POSIX threads and LWP. On Linux it uses POSIX threads (NPTL), which results in a 1:1 threading model.

SLIDE 36

Questions

  • Why does today’s remote file system look more like NFS (remote open) than AFS? Huge file sizes?
  • Which model is “more” reasonable? (remote-open OR whole-file transfer)
  • Why do we need a DFS? What is the best case to use a DFS?