AFS at Intel - Travis Broughton - PowerPoint PPT Presentation



SLIDE 1

AFS at Intel

Travis Broughton

SLIDE 2

Agenda

  • Intel's Engineering Environment
  • Things AFS Does Well
  • How Intel Uses AFS
  • How Not to Use AFS
  • Management Tools

SLIDE 3

Intel's Engineering Environment

  • Learned about AFS in 1991
  • First deployed AFS in Intel's Israel design center in 1992
  • Grew to a peak of 30 cells in 2001
  • Briefly considered DCE/DFS migration in 1998 (the first time AFS was scheduled to go away…)

SLIDE 4

Intel's Engineering Environment

  • ~95% NFS, ~5% AFS
  • ~20 AFS cells managed by ~10 regional organizations
  • AFS used for CAD and /usr/local applications, global data sharing for projects, and secure access to data
  • NFS used for everything else; gives higher performance in most cases
  • Wide range of client platforms, OSs, etc.

SLIDE 5

Cell Topology Considerations

  • Number of sites/campuses/buildings to support
  • Distance (latency) between sites
  • Max # of replicas needed for a volume
  • Trust
  • …

As a result, Intel has many cells

SLIDE 6

Things AFS Does Well

Security

  • Uses Kerberos; doesn't have to trust the client
  • Uses ACLs; better granularity

Performance for frequently-used files

  • e.g. /usr/local/bin/perl

High availability for RO data
Storage virtualization
Global, delegated namespace

SLIDE 7

AFS Usage at Intel: Global Data Sharing

Optimal use of compute resources

  • Batch jobs launched from site x may land at site y, depending on demand

Optimal use of headcount resources

  • A project based at site x may "borrow" idle headcount from site y without relocation

Optimal license sharing

  • A project based at site x may borrow idle software licenses (assuming the contract allows "WAN" licensing)

Efficient IP reuse

  • A project based at site x may require access to the most recent version of another project being developed at site y

Storage virtualization and load balancing

  • Many servers: can migrate data to balance load and do maintenance during working hours

SLIDE 8

AFS Usage at Intel: Other Applications

x-site tool consistency

  • Before rsync was widely deployed and SSH-tunneled, used the AFS namespace to keep tools in sync

@sys simplifies multiplatform support

  • Environment variables and automounter macros are reasonable workarounds

"@cell" link at the top level of AFS simplifies the namespace

  • In each cell, @cell points to the local cell
  • Mirrored data in multiple cells can be accessed through the same path (fs wscell expansion would also work)
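The effect of @sys and @cell can be sketched as a simple string substitution (a toy model of what the cache manager does, not its actual implementation; the cell name and sysname below are made up for illustration):

```python
def expand_afs_path(path, sysname="amd64_linux26", cell="example.intel.com"):
    """Toy model of AFS @sys/@cell expansion: the cache manager
    substitutes the client's sysname for @sys, and a top-level
    @cell symlink resolves to the local cell's name."""
    return path.replace("@cell", cell).replace("@sys", sysname)

# One path works on every platform and in every cell:
print(expand_afs_path("/afs/@cell/local/@sys/bin/perl"))
# -> /afs/example.intel.com/local/amd64_linux26/bin/perl
```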

/usr/local, CAD tool storage

  • Cache manager outperforms NFS
  • Replication provides many levels of fault tolerance

SLIDE 9

Things AFS Doesn't Do Well

  • Performance on seldom-used files
  • High availability for RW data
  • Scalability with SMP systems
  • Integration with OS
  • File/volume size limitations

SLIDE 10

When NOT to Use AFS

CVS repositories

  • A remote $CVSROOT using SSH seems to work better

rsync
Any other tool that would potentially thrash the cache…

SLIDE 11

Other Usage Notes

Client cache is better than nothing, but a shared "edge" cache may be better

  • Mirroring with rsync accomplishes this for RO data
  • Client disk is very cheap, shared (fileserver) disk is fairly cheap, WAN bandwidth is still costly (and latency can rarely be reduced)

SLIDE 12

OpenAFS at Intel

Initially used the contrib'd AFS 3.3 port for Linux
Adopted the IBM/Transarc port when it became available
Migrated to OpenAFS when kernel churn became too frequent
Openafs-devel very responsive to bug submissions

  • Number of bug submissions (from Intel) is tapering off; the client has become much more stable

SLIDE 13

Management Tools

Data age indicators

  • Per-volume view only
  • 11pm (local) nightly cron job to collect volume access statistics
  • idle++ if accesses==0, else idle=0
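The idle-counter rule on the slide amounts to a few lines of script (a hypothetical helper, not Intel's actual cron job; access counts would come from the nightly volume-statistics collection):

```python
def update_idle(idle_days, access_counts):
    """Nightly update of per-volume idle counters, following the
    slide's rule: idle++ if accesses == 0, else idle = 0.
    idle_days: {volume: days idle so far}
    access_counts: {volume: accesses since the last run}"""
    return {
        vol: (idle_days.get(vol, 0) + 1) if accesses == 0 else 0
        for vol, accesses in access_counts.items()
    }

ages = update_idle({"proj.foo": 3, "proj.bar": 0},
                   {"proj.foo": 0, "proj.bar": 17})
print(ages)  # -> {'proj.foo': 4, 'proj.bar': 0}
```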

Mountpoint database

  • /usr/afs/bin/salvager -showmounts on all fileservers
  • Find the root.afs volume, traverse mountpoints to build the tree

MountpointDB audit

  • Find any volume names not listed in the MpDB
  • Find unused read-only replicas (mounted under RW)
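Both audit checks reduce to set differences between what the fileservers hold and what the MountpointDB says is mounted (a sketch with hypothetical inputs; in practice the sets would be built from vos listvol output and the MpDB):

```python
def audit(server_volumes, mounted_names):
    """server_volumes: volume names present on fileservers.
    mounted_names: volume names the MountpointDB lists as mounted.
    Returns (unmounted, unused_ro):
      unmounted - volumes nothing in the tree points at
      unused_ro - .readonly replicas that sit idle because only
                  the RW parent is mounted."""
    unmounted = server_volumes - mounted_names
    unused_ro = {v for v in unmounted
                 if v.endswith(".readonly")
                 and v[: -len(".readonly")] in mounted_names}
    return unmounted, unused_ro
```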

Samba integration

  • smbklog

"Storage on Demand"

  • Delegates volume creation (primarily for scratch space) to users, with automated reclaim
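The automated-reclaim side of "Storage on Demand" could pair with the idle counters above; a minimal sketch of the selection step, under a hypothetical 30-day policy (the slide does not specify Intel's actual policy):

```python
def reclaim_candidates(volumes, max_idle_days=30):
    """Pick user-created scratch volumes eligible for automated
    reclaim: anything idle longer than max_idle_days.
    volumes: {volume name: consecutive days with zero accesses}"""
    return sorted(v for v, idle in volumes.items() if idle > max_idle_days)

print(reclaim_candidates({"scratch.u1": 45, "scratch.u2": 2}))
# -> ['scratch.u1']
```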

SLIDE 14

Management Tools

Recovery of PTS groups

  • Cause: someone confuses "pts del" and "pts rem"
  • Initial fix: create a new cell, restore the pts db, use pts exa to get a list of users
  • Easier fix: wrap pts to log pts del, capturing the state of the group before deleting
  • Even better fix: do a nightly text dump of your PTS DB
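The "wrap pts" fix might look something like the helpers below: before forwarding a destructive delete, snapshot the group's membership so it can be recreated with pts creategroup/adduser. This is a sketch, not Intel's wrapper; the parser assumes the usual pts membership output shape (a header line, then one indented name per line), and the log path is arbitrary:

```python
import subprocess, time

def parse_membership(output):
    """Parse `pts membership` output into member names. Assumes a
    header line followed by one indented name per line."""
    return [line.strip() for line in output.splitlines()[1:] if line.strip()]

def log_before_delete(group, logfile="/var/log/pts-del.log"):
    """Snapshot a group before `pts delete`, so a confused
    'pts del' can be undone instead of requiring a db restore."""
    out = subprocess.run(["pts", "membership", group],
                         capture_output=True, text=True).stdout
    with open(logfile, "a") as log:
        log.write(f"{time.ctime()} {group}: {' '.join(parse_membership(out))}\n")
```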

Mass deletion of volumes

  • Cause: someone does the "rm -rf" equivalent in the wrong place (most recent case was a botched rsync)
  • Initial fix: lots of vos dump .backup/.readonly | vos restore
  • Disks fill up, etc.
  • Other fixes: watch the size of volumes, and alert if some threshold change is exceeded
  • Throw the fileserver into debug mode, capture the IP address doing the damage, and lock it down
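The threshold-watch fix can be sketched as a diff between two nightly size snapshots (a hypothetical monitor; the 50% threshold is an illustrative choice, not a value from the slides):

```python
def size_alerts(previous, current, threshold=0.5):
    """Flag volumes whose size changed by more than `threshold`
    (as a fraction of the previous size) between two snapshots.
    A sudden large shrink is the signature of a mass deletion."""
    alerts = []
    for vol, old in previous.items():
        new = current.get(vol, 0)
        if old and abs(new - old) / old > threshold:
            alerts.append((vol, old, new))
    return alerts

print(size_alerts({"proj.a": 10000, "proj.b": 9000},
                  {"proj.a": 200, "proj.b": 9100}))
# -> [('proj.a', 10000, 200)]
```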

SLIDE 15

Management Tools

Watch for 'calls waiting for a thread'
Routing loops can trigger problems
True load-based meltdowns can be diagnosed

  • Send a signal to the fileserver to toggle debug mode
  • Collect logs for some period of time (minutes)
  • Analyze the logs to locate the most frequently used vnodes
  • Convert vnum to inum
  • Use find to locate the busiest volume and the files/directories being accessed
  • Sometimes requires moving the busy volume elsewhere to complete the diagnosis
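The log-analysis step amounts to counting (volume, vnode) references and ranking them (a sketch; the `fid 536870918.42.7` field format used here is illustrative, not the exact fileserver debug-log format):

```python
from collections import Counter
import re

def busiest_vnodes(log_lines, top=5):
    """Count vnode references in fileserver debug-log lines and
    return the most frequent (volume, vnode) pairs with counts."""
    hits = Counter()
    for line in log_lines:
        m = re.search(r"fid (\d+)\.(\d+)\.\d+", line)
        if m:
            hits[(int(m.group(1)), int(m.group(2)))] += 1
    return hits.most_common(top)
```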

SLIDE 16

Management Tools

Keep fileserver machines identical if possible

  • Easier maintenance

Keep a hot spare fileserver around and online

  • Configure as a fileserver in the local cell to host busy volumes
  • Configure as a DB server in its own cell for DB recovery

"Splitting" a volume is somewhat tedious

  • Best to plan directory/volume layout ahead of time, but it can be changed if necessary

SLIDE 17

Questions?