Deciding When to Forget in the Elephant File System



SLIDE 1

Deciding When to Forget in the Elephant File System

By Douglas S. Santry, Michael J. Feeley, Norman C. Hutchinson, Alistair C. Veitch, Ross W. Carton, and Jacob Ofir

Presented by Jon LeVitre

CS 533 Concepts of Operating Systems March 14, 2006 Slide 1

SLIDE 2

Overview

  • Background
  • The Elephant File System
    – Goals and Ideas
    – Design
    – Policies
    – Implementation
    – Performance
  • Summary
SLIDE 3

Background

  • Disk space is becoming cheaper and larger
  • Data is protected from most forms of failure (except user error)
  • It's a good time to consider ways to make file systems protect users from themselves

SLIDE 4

Previous Work

  • Cedar used copy-on-write to automatically retain recent versions
  • Plan 9, AFS, and WAFL use checkpointing

  • Applications maintain document history
  • Trashcan prevents accidental deletion
  • Users make their own copies
SLIDE 5

Goals of EFS

  • Give users the ability to undo recent changes (both writes and deletes)
  • To save storage space, the long-term history retains only important versions

SLIDE 6

Observations

  • The user's ability to recognize crucial file versions deteriorates over time
  • Any solution that relies solely on the user to identify landmark versions is problematic (though users must still be allowed to mark landmarks themselves)

SLIDE 7

Types of Files

  • Read-only
  • Derived
  • Cached
  • Temporary*
  • User-modified*

* Only these need protection

SLIDE 8

General Design

  • Logical file deletion
  • Copy-on-write (a version becomes official when the file is closed)
  • File versions are named by combining pathname, date, and time (not unique)
  • Retention policy specified per file (or group of files)
  • File system cleaner reclaims blocks
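
The time-based naming above can be illustrated with a minimal sketch. It assumes each version is identified by the time it became official and that a name with any later time, up to the next version, selects the same version (one reading of "not unique"). The function name `resolve_version` and the sample timestamps are hypothetical, not taken from the paper.

```python
from bisect import bisect_right
from datetime import datetime

def resolve_version(version_times, when):
    """Return the index of the version that was current at `when`, or None.

    `version_times` is a sorted list of the times at which each version
    became official (assumed here to be the file's close time).
    """
    i = bisect_right(version_times, when)
    return i - 1 if i else None

# Hypothetical history of one file: three versions.
times = [
    datetime(2006, 3, 1, 9, 0),
    datetime(2006, 3, 5, 14, 30),
    datetime(2006, 3, 10, 8, 15),
]

# Two different timestamps name the same version, so names are not unique.
same_a = resolve_version(times, datetime(2006, 3, 6))
same_b = resolve_version(times, datetime(2006, 3, 7))
```

A binary search keeps the lookup logarithmic in the number of versions; a real implementation would walk on-disk structures instead of a Python list.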
SLIDE 9

File Retention Policies

  • Keep One
  • Keep Safe
    – Recent changes only
  • Keep Landmarks
    – Heuristic: keep long-lived versions
    – Also let users specify landmarks
  • Keep All
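
The four policies can be sketched as a single selection function. This is an illustration under stated assumptions, not Elephant's actual algorithm: the undo window (7 days), the "long-lived" threshold (1 day), and all names (`Version`, `retained`, the policy strings) are hypothetical. A version is treated as "recent" if it was still current at some point inside the undo window.

```python
from dataclasses import dataclass

DAY = 24 * 3600  # seconds

@dataclass
class Version:
    created: float          # time the version became official (file closed)
    landmark: bool = False  # True if the user marked it as a landmark

def retained(versions, policy, now, undo_window=7 * DAY, long_lived=1 * DAY):
    """Return the versions to keep. `versions` must be sorted oldest first."""
    if policy == "keep_all":
        return list(versions)
    if policy == "keep_one":
        return list(versions[-1:])
    if policy not in ("keep_safe", "keep_landmarks"):
        raise ValueError(f"unknown policy: {policy}")
    keep = []
    for i, v in enumerate(versions):
        if i == len(versions) - 1:       # the current version always stays
            keep.append(v)
            continue
        superseded_at = versions[i + 1].created
        undoable = now - superseded_at <= undo_window
        lifetime = superseded_at - v.created
        is_landmark = v.landmark or lifetime >= long_lived
        if undoable or (policy == "keep_landmarks" and is_landmark):
            keep.append(v)
    return keep

# Hypothetical history: creation times in days, one user-marked landmark.
history = [Version(0 * DAY), Version(0.01 * DAY), Version(5 * DAY, landmark=True),
           Version(5.1 * DAY), Version(5.2 * DAY), Version(30 * DAY)]
now = 31 * DAY
```

With this history, Keep Safe retains only the current version and the one it replaced a day earlier, while Keep Landmarks additionally retains the long-lived second version and the user-marked third version.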
SLIDE 10

File Implementation

  • The inumber points to an imap
  • Temperature is a heuristic used by the cleaner
  • For non-versioned files, imap points to an inode
  • Otherwise, imap points to an inode log
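
The indirection described above might be modeled as follows. This is a rough in-memory sketch; the field names and the dictionary-based `imap` are assumptions for illustration, and the on-disk layout in Elephant is not shown on the slide.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Inode:
    created: float                      # when this version became official
    blocks: list = field(default_factory=list)

@dataclass
class ImapEntry:
    temperature: float = 0.0            # cleaner heuristic; not modeled further here
    inode: Optional[Inode] = None       # set for non-versioned files
    inode_log: Optional[list] = None    # set for versioned files, oldest first

    def current(self):
        """Return the inode describing the file's current contents."""
        if self.inode_log is not None:
            return self.inode_log[-1]
        return self.inode

# The imap is indexed by inumber (hypothetical values below).
imap = {
    12: ImapEntry(inode=Inode(created=1.0)),                  # non-versioned file
    13: ImapEntry(inode_log=[Inode(1.0), Inode(2.0)]),        # versioned file
}
```

Keeping a single inode for non-versioned files avoids paying the inode-log cost for files that never need history.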

SLIDE 11

Directory Implementation

  • Directories map names to inumbers
  • Directories store versioning information explicitly
  • Each directory entry stores the creation time and delete time (if any)
  • Entries for deleted files can be moved to a history inode
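
A name lookup "as of" a given time follows directly from storing creation and delete times in each entry. The sketch below assumes an entry is visible from its creation time up to, but not including, its delete time; `DirEntry`, `lookup`, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DirEntry:
    name: str
    inumber: int
    created: float
    deleted: Optional[float] = None   # None while the name still exists

def lookup(entries, name, when):
    """Return the inumber bound to `name` at time `when`, or None."""
    for e in entries:
        alive = e.created <= when and (e.deleted is None or when < e.deleted)
        if e.name == name and alive:
            return e.inumber
    return None

# Hypothetical directory: a file is deleted at t=20 and recreated at t=30.
entries = [
    DirEntry("notes.txt", inumber=7, created=10, deleted=20),
    DirEntry("notes.txt", inumber=9, created=30),
]
```

Because deletion only records a delete time, an older binding of the name remains reachable by looking it up at an earlier time.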

SLIDE 12

Microbenchmark Results

  • Files with the Keep One policy performed about the same as FFS (write had a bug)
  • Files with the Keep All policy
    – Slightly slower open, write, and close
    – Much slower creation
    – Much faster deletion

SLIDE 13

More Microbenchmarks

  • Andrew file system benchmark (creates a directory hierarchy, copies 70 source files totaling 200 KB, traverses the directories, and opens and reads each file)
    – EFS was ~5% slower (19 seconds vs. 18 seconds)
    – Much more file metadata:
      • FFS used 18 KB for inodes
      • EFS used 444 KB for inode logs
  • For a larger test, EFS was twice as fast as FFS
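
The relative costs implied by the Andrew benchmark figures can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones reported on this slide.

```python
# Elapsed time for the Andrew benchmark, in seconds (from the slide).
ffs_seconds, efs_seconds = 18, 19
slowdown = (efs_seconds - ffs_seconds) / ffs_seconds   # fraction slower than FFS

# Metadata footprint, in KB (from the slide).
ffs_meta_kb, efs_meta_kb = 18, 444
meta_ratio = efs_meta_kb / ffs_meta_kb                 # EFS metadata relative to FFS

print(f"EFS is {slowdown:.1%} slower and uses {meta_ratio:.1f}x the metadata")
```

The slowdown works out to roughly 5.6%, consistent with the "~5%" quoted, while the metadata footprint is close to 25 times larger.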
SLIDE 14

File Profiles

  • Keep One: 33.6% of files, 56.3% of bytes (98.7% of bytes written)
  • Keep Safe: 3.9% of files, 28.5% of bytes (only 0.6% of bytes written)
  • Keep Landmark: 62.4% of files, 15.2% of bytes (only 0.7% of bytes written)

SLIDE 15

Impact on Buffer-Cache?

  • Buffer caches reduce the number of writes to disk to improve performance
  • Elephant must write to disk when a file is closed, even if ...
    – There is a write shortly after close
    – The file is deleted right after close
  • This should be rare, so the impact should be minimal

SLIDE 16

Summary

  • Performance similar to FFS
  • Only a few files need versioning
  • Robustness verified using NFS shadowing
  • “... we believe that the extra storage and disk write overhead incurred by using a file system such as Elephant is of minimal cost compared to the convenience and time gains ... made possible”

  • More research was needed