SLIDE 1
Backing Up AFS Using TSM
Xueshan Feng
Stanford University, March 24th, 2004

ABSTRACT

AFS is Stanford's enterprise file system. It stores 2.5 TB of data and serves roughly 40,000 individual users, 3,400 classes, 1,600 departments and groups, and thousands of other applications and systems on campus. A well-designed backup system should allow us to back up data and restore it should the original data be lost. This document presents the design and implementation of Stanford's AFS backup system, built on a vendor backup management product, TSM. The same product is also used campus-wide to back up administrative applications and desktops through a common infrastructure.

BACKUP REQUIREMENTS

Our requirements for backup are simple. We should be able to back up AFS data and restore the data back into AFS. We want to preserve AFS access control lists and have the flexibility to restore an entire volume as well as a single file. Backup and restore should be automated as much as possible. Restoring a file should not require handling tapes manually.

IMPLEMENTATION

In 1999 we started working on a backup project to replace the old AFS backup system built around Legato software. At that time our backups and restores were not reliable; the operation relied heavily on staff intervention; file restores could take days and many hours of staff time; and the system did not scale well as AFS usage increased over the years. We selected IBM's ADSM product as our AFS backup solution, in line with the campus backup systems used for the large administrative applications. ADSM was later folded into the Tivoli product line after IBM acquired Tivoli; the product is now called "Tivoli Storage Manager" (TSM). The Stanford AFS backup implementation using TSM is described below.
- Hardware
The hardware for the Stanford AFS backup system consists of:
- Two AIX RS/6000 H50 servers, each with mirrored system disks and 3 GB of memory.
- One IBM 3494 automated tape library, accessible from the network, with 60 TB of "in-shelf" capacity.
- About 200 GB of EMC CLARiiON disk used for the TSM database, event logs, and data staging spool. The disks are part of the campus SAN disk storage infrastructure and are accessible from the backup servers via Fibre Channel cards. Data is first backed up to the EMC disk backup spool and is then moved to the secondary tape pool when it is convenient. Data usually remains in the disk backup spool for 24 hours, which allows faster restores of recently backed-up data.
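The disk-then-tape staging described above can be illustrated with a small sketch. This is not TSM code or configuration; the class and method names (`BackupStore`, `migrate`, `restore_source`) are hypothetical, and the fixed 24-hour cutoff is a simplifying assumption based on the "usually remains for 24 hours" figure. It only models why a recently backed-up volume can be restored from disk while an older one must come from tape.

```python
from dataclasses import dataclass, field

# Assumption: a fixed residency period, taken from the "usually 24 hours" figure.
STAGING_HOURS = 24


@dataclass
class BackupStore:
    """Toy model of a disk staging spool in front of a tape pool (hypothetical)."""
    disk_spool: dict = field(default_factory=dict)  # volume name -> backup time (hours)
    tape_pool: dict = field(default_factory=dict)   # volume name -> backup time (hours)

    def backup(self, volume: str, now: float) -> None:
        # New backup data always lands on the disk spool first.
        self.disk_spool[volume] = now

    def migrate(self, now: float) -> list:
        # Move every backup that has been staged for at least
        # STAGING_HOURS from the disk spool to the tape pool.
        moved = []
        for volume, backed_up_at in list(self.disk_spool.items()):
            if now - backed_up_at >= STAGING_HOURS:
                self.tape_pool[volume] = self.disk_spool.pop(volume)
                moved.append(volume)
        return moved

    def restore_source(self, volume: str) -> str:
        # Prefer the disk spool: no tape mount is needed, so restores are faster.
        if volume in self.disk_spool:
            return "disk"
        if volume in self.tape_pool:
            return "tape"
        raise KeyError(f"no backup found for {volume}")


if __name__ == "__main__":
    store = BackupStore()
    store.backup("user.alice", now=0)
    store.backup("user.bob", now=10)
    store.migrate(now=24)
    for name in ("user.alice", "user.bob"):
        print(name, "restores from", store.restore_source(name))
```

In this model a restore requested within a day of the backup is served from disk, which matches the faster-restore behavior noted above; anything older requires the tape library.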
- Software