AFS at Intel
Travis Broughton
Agenda
Intel’s Engineering Environment
Things AFS Does Well
How Intel uses AFS
How not to use AFS
Management Tools
Intel’s Engineering Environment
Learned about AFS in 1991
First deployed AFS in Intel’s Israel design center in 1992
Grew to a peak of 30 cells in 2001
Briefly considered DCE/DFS migration in 1998 (the first time AFS was scheduled to go away…)
Intel’s Engineering Environment
~95% NFS, ~5% AFS
~20 AFS cells managed by ~10 regional organizations
AFS used for CAD and /usr/local applications, global data sharing for projects, and secure access to data
NFS used for everything else; gives higher performance in most cases
Wide range of client platforms, OSs, etc.
Cell Topology Considerations
Number of sites/campuses/buildings to support
Distance (latency) between sites
Max # of replicas needed for a volume
Trust
…
As a result, Intel has many cells
Things AFS Does Well
Security
- Uses Kerberos, doesn’t have to trust client
- Uses ACLs, better granularity
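As a concrete illustration of the ACL point (not from the original slides; the path and group names here are hypothetical), a directory can be opened to one group and closed to everyone else with standard fs commands:

    # Grant a project group read+lookup, then drop the default anyuser entry
    fs setacl -dir /afs/cell.example.com/proj/secret -acl proj:readers rl
    fs setacl -dir /afs/cell.example.com/proj/secret -acl system:anyuser none

This per-directory, per-group granularity is what classic NFS mode bits cannot express.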
Performance for frequently-used files
- e.g. /usr/local/bin/perl
High availability for RO data
Storage virtualization
Global, delegated namespace
AFS Usage at Intel: Global Data Sharing
Optimal use of compute resources
- Batch jobs launched from site x may land at site y, depending on demand
Optimal use of headcount resources
- A project based at site x may “borrow” idle headcount from site y without relocation
Optimal license sharing
- A project based at site x may borrow idle software licenses (assuming contract allows “WAN” licensing)
Efficient IP reuse
- A project based at site x may require access to the most recent version of another project being developed at site y
Storage virtualization and load balancing
- Many servers – can migrate data to balance load and do maintenance during working hours
AFS Usage at Intel: Other Applications
x-site tool consistency
- Before rsync was widely deployed and SSH-tunneled, used AFS namespace to keep tools in sync
@sys simplifies multiplatform support
- Environment variables and automounter macros are reasonable workarounds
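A minimal sketch of the @sys idiom (paths are hypothetical, not from the slides): the cache manager expands @sys to the client’s system type, so a single symlink serves every platform:

    fs sysname                                       # prints this client's @sys value
    ln -s '/afs/cell.example.com/tools/@sys/bin' /usr/local/bin

Each client then resolves /usr/local/bin to its own platform’s binaries.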
“@cell” link at top-level of AFS simplifies namespace
- In each cell, @cell points to the local cell
- Mirrored data in multiple cells can be accessed through the same path (fs wscell expansion would also work)
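A sketch of how such an @cell link might be created, assuming the conventional /afs/<cellname> layout; the sed expression parses the quoted cell name out of the fs wscell output:

    cell=$(fs wscell | sed -e "s/^.*'\(.*\)'.*$/\1/")
    ln -s "/afs/$cell" /afs/@cell    # create in the RW root.afs, then vos release root.afs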
/usr/local, CAD tool storage
- Cache manager outperforms NFS
- Replication provides many levels of fault-tolerance
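The replication referred to here is the standard vos workflow; a hedged example with made-up server and volume names:

    vos addsite fs1.example.com /vicepa tools.perl    # define RO replica sites
    vos addsite fs2.example.com /vicepa tools.perl
    vos release tools.perl                            # push RW contents to the RO clones

Clients prefer the RO copies, so any single fileserver can fail or be taken down without interrupting reads.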
Things AFS Doesn’t Do Well
Performance on seldom-used files
High availability for RW data
Scalability with SMP systems
Integration with OS
File/volume size limitations
When NOT to Use AFS
CVS repositories
- Remote $CVSROOT using SSH seems to work better
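For reference, the remote-$CVSROOT-over-SSH pattern being recommended looks like this (host and module names are examples):

    export CVS_RSH=ssh
    cvs -d :ext:user@cvshost.example.com:/repos/proj checkout proj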
rsync
Any other tool that would potentially thrash the cache…
Other Usage Notes
Client cache is better than nothing, but shared “edge” cache may be better
- Mirroring w/ rsync accomplishes this for RO data
- Client disk is very cheap, shared (fileserver) disk is fairly cheap, WAN bandwidth is still costly (and latency can rarely be reduced)
OpenAFS at Intel
Initially used contrib’d AFS 3.3 port for Linux
Adopted IBM/Transarc port when it became available
Migrated to OpenAFS when kernel churn became too frequent
Openafs-devel very responsive to bug submissions
- Number of bug submissions (from Intel) tapering off – client has become much more stable
Management Tools
Data age indicators
- Per-volume view only
- 11pm (local) nightly cron job to collect volume access statistics
idle++ if accesses==0, else idle=0
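A rough sketch of what that nightly job could look like, assuming one state file per volume for the idle-day counter (server name and paths are hypothetical):

    #!/bin/sh
    server=fs1.example.com
    statedir=/var/lib/afs-idle
    vos listvol "$server" | awk '$3 == "RW" {print $1}' |
    while read vol; do
        # vos examine reports "N accesses in the past day"
        acc=$(vos examine "$vol" | awk '/accesses in the past day/ {print $1}')
        idle=$(cat "$statedir/$vol" 2>/dev/null || echo 0)
        if [ "${acc:-0}" -eq 0 ]; then
            idle=$((idle + 1))          # idle++ if accesses == 0
        else
            idle=0                      # any access resets the counter
        fi
        echo "$idle" > "$statedir/$vol"
    done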
Mountpoint database
- /usr/afs/bin/salvager -showmounts on all fileservers
- Find root.afs volume, traverse mountpoints to build tree
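A client-side sketch of the traversal step (the salvager-based collection above is server-side); fs lsmount prints a matching line only for directories that are mount points, and the cell name is a hypothetical example:

    cell=example.com
    find "/afs/$cell" -type d 2>/dev/null |
    while read dir; do
        fs lsmount "$dir" 2>/dev/null | grep "is a mount point"
    done > /tmp/mpdb.txt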
MountpointDB audit
- Find any volume names not listed in the MpDB
- Find unused read-only replicas (mounted under RW)
Samba integration
- Smbklog
“Storage on Demand”
- Delegates volume creation (primarily for scratch space) to users, with automated reclaim
Management Tools
Recovery of PTS groups
- Cause – someone confuses “pts del” and “pts rem”
- Initial fix – create a new cell, restore the PTS DB, use pts exa to get a list of users
- Easier fix – wrap pts to log pts del, capture state of group before deleting
- Even better fix – do a nightly text dump of your PTS DB
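A sketch of such a nightly dump (the output path is an example): capture every PTS entry plus each group’s membership, so an accidental pts del can be reconstructed from last night’s text file:

    #!/bin/sh
    out=/var/backups/ptsdb-$(date +%Y%m%d).txt
    pts listentries -users -groups > "$out"
    # record the membership of every group (skip the header line)
    pts listentries -groups | awk 'NR > 1 {print $1}' |
    while read group; do
        echo "== $group"
        pts membership "$group"
    done >> "$out"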
Mass deletion of volumes
- Cause – someone does “rm -rf” equivalent in the wrong place (most recent case was a botched rsync)
- Initial fix – lots of vos dump .backup/.readonly | vos restore (sketched after this list)
Disks fill up, etc.
- Other fixes – watch size of volumes, and alert if some threshold change is exceeded
Throw fileserver into debug mode, capture IP address doing the damage and lock it down
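The recovery pipe and the size watch, sketched with hypothetical volume and server names; vos dump writes to stdout and vos restore reads stdin when -file is omitted:

    # re-create a deleted RW volume from its .backup clone
    vos dump proj.data.backup | vos restore fs1.example.com /vicepa proj.data

    # threshold alert: compare a volume's size against yesterday's sample
    old=$(cat /var/lib/afs-size/proj.data 2>/dev/null || echo 0)
    new=$(vos examine proj.data | awk 'NR == 1 {print $4}')    # used-K field
    [ $((old - new)) -gt 100000 ] &&
        echo "proj.data shrank by $((old - new)) K" | mail -s 'AFS volume alert' afs-admins@example.com
    echo "$new" > /var/lib/afs-size/proj.data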
Management Tools
Watch for ‘calls waiting for a thread’
Routing loops can trigger problems
True load-based meltdowns can be diagnosed:
- Send signal to fileserver to toggle debug mode
- Collect logs for some period of time (minutes)
- Analyze logs to locate most frequently used vnodes
- Convert vnum to inum (vnode number to inode number)
- Use find to locate busiest volume and files/directories being accessed
- Sometimes requires moving the busy volume elsewhere to complete diagnosis
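A sketch of that capture sequence; per the OpenAFS fileserver documentation, SIGTSTP raises the logging level and SIGHUP resets it to 0. The log-mining pattern is only illustrative, since FileLog formats vary by version:

    pid=$(pgrep -x fileserver)
    kill -TSTP "$pid"                    # raise the debug level
    sleep 300                            # let FileLog accumulate a few minutes of calls
    kill -HUP "$pid"                     # reset the debug level to 0
    # rank the most frequently referenced Fids
    grep -o 'Fid = [0-9.]*' /usr/afs/logs/FileLog | sort | uniq -c | sort -rn | head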
Management Tools
Keep fileserver machines identical if possible
- Easier maintenance
Keep a hot spare fileserver around and online
- Configure as a fileserver in local cell to host busy volumes
- Configure as a DB server in its own cell for DB recovery
“Splitting” a volume is somewhat tedious
- Best to plan directory/volume layout ahead of time, but it can be changed if necessary
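Since vos (at least in the OpenAFS of this talk’s era) has no single split command, the tedious manual procedure looks roughly like this; all names and paths are hypothetical, and /afs/.<cell> is the conventional RW mount of the cell:

    vos create fs1.example.com /vicepa proj.sub               # new volume for the subtree
    fs mkmount /afs/.example.com/proj/sub.new proj.sub
    cp -pR /afs/.example.com/proj/sub/. /afs/.example.com/proj/sub.new/
    # note: cp does not carry AFS ACLs; reapply them, e.g. with fs copyacl
    rm -rf /afs/.example.com/proj/sub
    fs rmmount /afs/.example.com/proj/sub.new
    fs mkmount /afs/.example.com/proj/sub proj.sub            # remount under the old name
    vos release proj                                          # if the parent volume is replicated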