
slide-1
SLIDE 1

File System Aging

Featuring slides modified from a talk by

Martín Farach-Colton Rutgers University

slide-3
SLIDE 3

This Class

Aging

  • Two papers
  • Smith and Seltzer
  • Conway et al.
  • How do people feel about the readings?

Outline

  • (Brief) I/O Models overview
  • Definitions of Fragmentation
  • Aging Problem
  • Simulation and measurement
  • Discussion
slide-4
SLIDE 4

How do we model performance?

slide-5
SLIDE 5

How do we account for disk I/O?

DAM model: How theorists think about external memory algorithms

  • Data is transferred in blocks between RAM and disk.
  • The number of block transfers dominates the running time.

Goal: Minimize # of I/Os

  • Performance bounds are parameterized by block size B, memory size M, data size N.

[Figure: disk and RAM exchanging blocks of size B; RAM has size M.]

[Aggarwal+Vitter ’88]

slide-6
SLIDE 6

Is the DAM Model any good?

Short answer: Yes (2-competitive)

Long answer: No (can’t tune parameters)

slide-7
SLIDE 7

Affine Model

Affine model:

  • Data is transferred in blocks between RAM and disk.
  • If k blocks are transferred, the cost is 1 + αk.
  • On hard disks, 1 is the normalized seek cost and α is the incremental bandwidth cost of subsequent blocks.
  • On SSDs, it’s more complicated, but affine still fits better than DAM costs.
  • (And PDAM fits even better…)

Goal: Minimize cost of I/Os

  • Performance bounds are parameterized by block size B, memory size M, data size N.

Takeaway: the affine model captures the size of I/Os as well as the speed of the device itself.
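
To make the difference between the two models concrete, here is a minimal sketch that charges the same 100 blocks under each. The value of α and the two access patterns are assumptions chosen only for illustration; they are not measurements from the talk.

```python
def dam_cost(io_sizes_in_blocks):
    """DAM: every block transferred costs 1, however the blocks are grouped."""
    return sum(io_sizes_in_blocks)


def affine_cost(io_sizes_in_blocks, alpha):
    """Affine: one I/O of k blocks costs 1 + alpha * k (1 = normalized seek)."""
    return sum(1 + alpha * k for k in io_sizes_in_blocks)


ALPHA = 0.01                    # assumed per-block cost, illustration only
one_big_read = [100]            # a single I/O of 100 contiguous blocks
many_small_reads = [1] * 100    # 100 separate one-block I/Os

# DAM cannot tell the two patterns apart.
print(dam_cost(one_big_read), dam_cost(many_small_reads))

# Affine charges one seek for the big read but 100 seeks for the small ones.
print(affine_cost(one_big_read, ALPHA), affine_cost(many_small_reads, ALPHA))
```

Under these assumed numbers the affine costs come out to roughly 2 versus roughly 101, while DAM reports 100 for both, which is the sense in which the affine model captures I/O size.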

slide-8
SLIDE 8

Now We Have a Model, What Next?

The goal of our model is to predict performance. We can verify “things” using a benchmark:

  • We compare two systems, A and B, by running the same well-specified workload on each system.
  • We use our model to predict the relative performance of A and B, and either:
  • Validate our hypothesis
  • Revise our model
  • Revise our theory because we learned something new about our system and are better able to present an input to our model

To be useful, we need to run representative benchmarks under representative conditions

slide-9
SLIDE 9

Representative State

What is the representative state of a file system?

  • How many files?
  • What is the organization of the files (directory hierarchy)?
  • What is the average size of a file? File size distribution?

Is the state of a file system a path or a point?

  • It is a path.
  • Creating files limits/influences the placement decisions for future operations
  • Deleting files creates “holes” in the LBA space
  • Moving (renaming) files alters the relationships between files
  • It isn’t enough to look at the contents of a file system in isolation; we need to know where we started and how we got there.

slide-10
SLIDE 10

Aging

Theory: many file systems will age.

  • Aging: the degradation of performance over time.
  • Our models predict this
  • heuristics lead to fragmentation
  • fragmentation leads to increased seeks on important workloads

Two open questions:

  • Is the representative state an aged file system?
  • If so, how do we create a representatively aged file system?

slide-11
SLIDE 11

Does aging happen on modern file systems?

slide-14
SLIDE 14

Do file systems age?

Chris Hoffman at howtogeek.com says:

“Linux’s ext2, ext3, and ext4 file systems… [are] designed to avoid fragmentation in normal use.”

“If you do have problems with fragmentation on Linux, you probably need a larger hard disk.”

“Modern Linux filesystems keep fragmentation at a minimum…Therefore it is not necessary to worry about fragmentation in a Linux system.”

slide-15
SLIDE 15

I guess not. Then was it ever a problem?

slide-16
SLIDE 16

Do file systems age?

So: as of 1997, file systems aged. Then file systems got better, and sys admins say they don’t age. What’s the actual story?

slide-17
SLIDE 17

Theory of Aging over the Ages

slide-20
SLIDE 20

Euclid’s view of hard disks

[Figure: a 1 × 1 square.]

Year: X + ~4 years

Density: doubles in each dimension every 4 years or so

α ∝ 1/D

slide-21
SLIDE 21

Hard disks gradually increase α

Measurements from one decade have a sell-by date…

… unless you solve the problem algorithmically

slide-22
SLIDE 22

Perspective

Assumption

  • Random seek is 100x slower than sequential
  • 1% of blocks are non-sequential in the file system

Conclusion

  • That’s enough to limit I/O throughput to about 50% of sequential bandwidth

So, for people who think that file systems don’t age, are you sure that modern file systems keep fragmentation to under 1%?
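
The arithmetic behind that conclusion can be sketched as follows. This assumes one reading of the slide: a sequential block costs 1 unit, and a non-sequential block costs 100 units because it pays for a seek. The function name is hypothetical.

```python
def relative_throughput(nonseq_fraction, seek_penalty):
    """Throughput relative to a fully sequential layout.

    Assumption: a sequential block costs 1 unit and a non-sequential
    block costs `seek_penalty` units.
    """
    sequential_cost = 1.0
    aged_cost = (1 - nonseq_fraction) * 1.0 + nonseq_fraction * seek_penalty
    return sequential_cost / aged_cost


# 1% non-sequential blocks at a 100x penalty: 0.99 + 1.00 = 1.99 units
# per block on average, so throughput is about half of sequential.
print(relative_throughput(0.01, 100))
```

Put differently, under these assumptions the 1% of blocks that need a seek cost about as much as the other 99% combined.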

slide-23
SLIDE 23

Which File Systems Age?

File system types and what theory predicts:

  • Logging (F2FS): should age
  • B-tree (Btrfs): should age
  • Heuristic-based update-in-place (FFS, ext4, …): should age
  • Bε-tree (BεtrFS): shouldn’t age

slide-24
SLIDE 24

Let’s test the hypothesis! How?

slide-25
SLIDE 25

Smith and Seltzer ’97

Keith Smith started grad school in ’92

  • He decided to take snapshots of a bunch of computers
  • Every day
  • For years

He and Seltzer found that:

  • If you replay the changes implied by the snapshots
  • File system performance degrades
  • On file systems available in ’97
slide-26
SLIDE 26

We are impatient

We’d like a history of file system changes

  • That we can replay on any system
  • We don’t have to wait for years
  • Years of history should be readily available

Let’s model a very simple case: Developers

slide-38
SLIDE 38

We are impatient

Developer: get coffee → git pull → make → get coffee → git pull → add awesome features → get coffee → git pull → fix bugs → …

Commands issued: git pull → make → git pull → make → …

We can simulate a developer by replaying Git histories

slide-39
SLIDE 39

Simulating a Developer

Loop: do 100 git pulls, measure performance, repeat

Use the Linux kernel repo from github.com
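
The loop on this slide can be sketched as a small driver. This is a hedged sketch, not the authors' harness: `age_and_measure`, `git_pull`, and the callables passed in are hypothetical names, and it assumes the working clone has been arranged so that each `git pull` brings in the next step of history.

```python
import subprocess


def git_pull(work_dir):
    """Run one `git pull` in an existing clone (assumed to be set up so
    that each pull advances the working tree by one step of history)."""
    subprocess.run(["git", "-C", work_dir, "pull"], check=True)


def age_and_measure(num_pulls, pull_once, measure, batch_size=100):
    """Call pull_once() num_pulls times; after every batch_size pulls,
    call measure() and record (pulls_done, measurement)."""
    results = []
    for done in range(1, num_pulls + 1):
        pull_once()
        if done % batch_size == 0:
            results.append((done, measure()))
    return results


# Hypothetical usage, where grep_cost is a measurement function you supply:
# results = age_and_measure(
#     10_000,
#     pull_once=lambda: git_pull("/path/to/fs/linux"),
#     measure=grep_cost,
# )
```

Passing the pull and measurement steps in as callables keeps the loop itself independent of how the Git history is served and of which benchmark is used.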

slide-40
SLIDE 40

How do we measure fragmentation?

time grep -r random_string /path/to/fs

  • Like timing a preorder traversal of a tree…
  • Should measure fragmentation
  • Why?

slide-51
SLIDE 51

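Measuring Aging

time grep -r random_string /path/to/filesystem

[Figure: a directory (dir) containing file1, file2, file3, and file4.]

The grep traversal is sensitive to three kinds of fragmentation:

  • Metadata fragmentation
  • Intrafile fragmentation
  • Interfile fragmentation

Then normalize per gigabyte read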
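
A minimal sketch of that measurement, assuming the bytes read can be approximated by the total size of regular files under the tree. The helper names are hypothetical, and the sketch does not clear the page cache, so a cold-cache run would need that done separately before timing.

```python
import os
import subprocess
import time

GIB = 1024 ** 3


def tree_bytes(root):
    """Total size in bytes of regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def timed_grep(root, needle="random_string"):
    """Wall-clock seconds for `grep -r needle root`.

    grep exits with status 1 when nothing matches, which is the expected
    case here, so the exit status is deliberately not checked.
    """
    start = time.perf_counter()
    subprocess.run(
        ["grep", "-r", needle, root],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def seconds_per_gib(elapsed_seconds, bytes_read):
    """Normalize an elapsed time by the amount of data read, in GiB."""
    return elapsed_seconds / (bytes_read / GIB)


# Hypothetical usage:
# cost = seconds_per_gib(timed_grep("/path/to/filesystem"),
#                        tree_bytes("/path/to/filesystem"))
```

Normalizing by data size is what lets measurements taken at different points in the history be compared even though the repository itself grows and shrinks.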

slide-52
SLIDE 52

Do modern file systems really age?

slide-53
SLIDE 53

Git workload on ext4 on HDD

[Figure: time in seconds per GiB (y-axis, ticks 200 to 800) versus Git pulls performed (x-axis); lower is better.]

Our Setup: cold cache, 3.4 GHz quad core, 4 GiB RAM, 20 GiB HDD partition, SATA 7200 RPM

  • Slowdown annotations on the curve: 2x, 4x, 14.3x
  • 15 minutes to grep 1.2 GiB

slide-56
SLIDE 56

Ruling out alternative explanations

slide-57
SLIDE 57

Is it a change in the file system?

Smaller files, shallower tree, …

slide-58
SLIDE 58

File System Rejuvenation

Idea: copy same logical state to new partition

  • After each 100 pulls
  • Compare grep cost
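
The rejuvenation idea can be sketched as a plain recursive copy into an empty destination. This is a sketch under assumptions: `fresh_root` is taken to be a not-yet-existing directory on a freshly created file system, and both function names are hypothetical.

```python
import os
import shutil


def rejuvenate(aged_root, fresh_root):
    """Copy the same logical state (files and directory structure) from
    the aged tree into fresh_root, which must not exist yet.

    Assumption: fresh_root lives on a newly created file system, so the
    allocator lays the copy out without the history of the aged one.
    """
    return shutil.copytree(aged_root, fresh_root, symlinks=True)


def slowdown(aged_seconds_per_gib, unaged_seconds_per_gib):
    """How many times slower the aged copy is than the rejuvenated one."""
    return aged_seconds_per_gib / unaged_seconds_per_gib


# Hypothetical usage after each batch of 100 pulls:
# rejuvenate("/mnt/aged/linux", "/mnt/fresh/linux")
```

Because both copies hold identical contents, any gap in grep cost between them can be attributed to layout rather than to a change in what is stored.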
slide-59
SLIDE 59

Aging ext4 with Git on HDD

[Figure: time in seconds per GiB (y-axis, ticks 200 to 800) versus Git pulls performed (x-axis), with aged and unaged curves; lower is better.]

  • Annotated gap between aged and unaged: 8.8x
  • Maybe it’s full disks? Nope: 20 GiB partition, 1.2 GiB data
  • Smaller average file size makes the unaged 60% slower

slide-61
SLIDE 61

Is it just ext4?

slide-62
SLIDE 62

Aging other file systems with Git on HDD

[Figure: four panels, one each for Btrfs, F2FS, ZFS, and XFS; lower is better.]

  • Slowdown annotations on the panels: 20.6x, 22.4x, 2.2x, 11.8x
  • Weird unaged behavior on XFS

slide-63
SLIDE 63

Will SSDs save us?

slide-64
SLIDE 64

Git Workload on XFS on SSD

[Figure: time in seconds per GiB (y-axis, ticks 10 to 30) versus Git pulls performed (x-axis), with aged and unaged curves; lower is better. Annotated slowdown: 1.9x.]

Other file systems give similar results (~2x slowdown)

slide-65
SLIDE 65

And now for BεtrFS

slide-66
SLIDE 66

Git on BεtrFS on HDD

[Figure: time in seconds per GiB (y-axis, ticks 200 to 800) versus Git pulls performed (x-axis), with aged and unaged curves for BetrFS, XFS, ext4, F2FS, ZFS, and Btrfs; lower is better.]

slide-67
SLIDE 67

Git on BεtrFS on HDD

[Figure: time in seconds per GiB (y-axis, ticks 20 to 80) versus Git pulls performed (x-axis), with aged and unaged curves for BetrFS, Btrfs, F2FS, ext4, and ZFS; lower is better.]

slide-68
SLIDE 68

And SSDs?

slide-69
SLIDE 69

Git on BεtrFS on SSD

[Figure: time in seconds per GiB (y-axis, ticks 10 to 30) versus Git pulls performed (x-axis), with aged and unaged curves for BetrFS, Btrfs, and ZFS; lower is better.]

slide-70
SLIDE 70

File Systems Fated for Senescence? Nonsense, Says Science!

Alex Conway🃠, Ainesh Bakshi🃠, Yizheng Jiao♢, Yang Zhan♢, Michael A. Bender♠, William Jannen♠, Rob Johnson♠, Bradley C. Kuszmaul♡, Donald E. Porter♢, Jun Yuan♣, Martín Farach-Colton🃠

🃠Rutgers University, ♢The University of North Carolina at Chapel Hill, ♠Stony Brook University, ♡Oracle Corporation and Massachusetts Institute of Technology, ♣Farmingdale State College of SUNY