EIO: Error-handling is Occasionally Correct



SLIDE 1

EIO: Error-handling is Occasionally Correct

Haryadi S. Gunawi, Cindy Rubio-González, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Ben Liblit

University of Wisconsin – Madison

FAST ’08 – February 28, 2008

SLIDE 2

Robustness of File Systems

Today’s file systems have robustness issues

Buggy implementation [FiSC-OSDI’04, EXPLODE-OSDI’06]

Unexpected behaviors in corner-case situations

Deficient fault-handling[IRONFS-SOSP’05]

Inconsistent policies: propagate, retry, stop, ignore

Prevalent ignorance

Ext3: Ignore write failures during checkpoint and journal replay

NFS: Sync-failure at the server is not propagated to client

What is the root cause?

SLIDE 3

Incorrect Error Code Propagation

void dosync() {
    fdatawrite();
    sync_file();
    fdatawait();
}

[Figure: call path from sync() on the NFS client to dosync on the NFS server, which calls fdatawrite, sync_file, and fdatawait.]

SLIDE 4

Incorrect Error Code Propagation

void dosync() {
    fdatawrite();
    sync_file();
    fdatawait();
}

[Figure: the same call path from sync() on the NFS client to dosync on the NFS server. fdatawrite, sync_file, and fdatawait can each "return EIO;", but dosync discards all three return values (each marked X).]

Unsaved error-codes
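
The unsaved-error pattern on this slide, and one way it might be repaired, can be sketched in self-contained C. This is only an illustration: the three callees are hypothetical stand-ins driven by status variables, dosync is given an int return type so the effect on the caller is visible (the slide declares it void), and "report the first failure" is an assumed policy rather than what any particular kernel version does.

```c
#include <errno.h>

/* Hypothetical stand-ins for the three callees named on the slide.
 * Each returns 0 on success or a negative errno value on failure;
 * the *_status variables let a caller simulate a failure. */
static int fdatawrite_status, sync_file_status, fdatawait_status;

static int fdatawrite(void) { return fdatawrite_status; }
static int sync_file(void)  { return sync_file_status; }
static int fdatawait(void)  { return fdatawait_status; }

/* The pattern from the slide: all three return values are discarded,
 * so the caller always sees success. */
static int dosync_unsaved(void)
{
    fdatawrite();
    sync_file();
    fdatawait();
    return 0;
}

/* One possible repair (an assumption, not the authors' fix): save every
 * return value and report the first failure, while still running the
 * later steps. */
static int dosync_saved(void)
{
    int ret = fdatawrite();
    int err = sync_file();
    if (ret == 0)
        ret = err;
    err = fdatawait();
    if (ret == 0)
        ret = err;
    return ret;
}
```

With fdatawait_status set to -EIO, dosync_unsaved still returns 0 while dosync_saved returns -EIO, which is the difference between the client seeing success and seeing the error.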

SLIDE 5

Implications

Misleading error-codes in distributed systems

NFS client receives SUCCEED instead of ERROR

Useless policies

Retry in NFS client is not invoked

Silent failures

Much harder debugging process

SLIDE 6

EDP: Error Detection and Propagation Analysis

Static analysis

Useful to show how error codes flow

Currently: 34 basic error codes (e.g. EIO, ENOMEM)

Target systems

51 file systems (all directories in linux/fs/*)

3 storage drivers (SCSI, IDE, Software-RAID)

SLIDE 7

Results

Number of violations

Error-codes flow through 9022 function calls

1153 (13%) calls do not save the returned error-codes

Analysis, a closer look

More complex file systems, more violations

Location distance affects error propagation correctness

Write errors are neglected more than read errors

Many violations are not corner-case bugs
− Error-codes are consistently ignored

SLIDE 8

Outline

Introduction

Methodology
− Challenges
− EDP tool

Results

Analysis

Discussion and Conclusion

SLIDE 9

Challenges in Static Analysis

File systems use many error codes

bufferstate[Uptodate] = 0

journalflags = ABORT

int err = -EIO; ... return err;

Error codes transform

Block I/O error becomes journal error

Journal error becomes generic error code

Error codes propagate through:

Function call path

Asynchronous path (e.g. interrupt, network messages)
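
To illustrate why error transformation is hard to follow by tracking integer codes alone, the sketch below chains the three fragments from this slide: a cleared buffer flag, a journal abort flag, and finally an integer error code. All struct and function names here are hypothetical simplifications, not the kernel's actual data structures.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified state; not the kernel's actual structures. */
struct buffer  { bool uptodate; };  /* cf. bufferstate[Uptodate] = 0 */
struct journal { bool aborted;  };  /* cf. journalflags = ABORT      */

/* Block I/O completion: a failure is recorded as a cleared flag,
 * not as an integer error code. */
static void block_io_done(struct buffer *b, bool io_ok)
{
    b->uptodate = io_ok;
}

/* Block I/O error becomes journal error. */
static void journal_check_buffer(struct journal *j, const struct buffer *b)
{
    if (!b->uptodate)
        j->aborted = true;
}

/* Journal error becomes generic error code. */
static int fs_operation(const struct journal *j)
{
    int err = 0;
    if (j->aborted)
        err = -EIO;
    return err;
}
```

In this sketch the failure never exists as an integer until the last step, so an analysis that only follows integer error codes through call paths would not connect the I/O failure to the final -EIO.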

SLIDE 10

EDP

State

Current State: Integer error-codes, function call path

Future: Error transformation, asynchronous path

Implementation

Utilize CIL: Infrastructure for C program analysis [Necula-CC’02]

EDP: ~4000 LOC in OCaml

3 components of EDP architecture

Specifying error-code information (e.g. EIO, ENOMEM)

Constructing error channels

Identifying violation points

SLIDE 11

Constructing Error Channels

Propagate function

Dataflow analysis

Connect function pointers

Generation endpoint

Generates error code

Example: return -EIO

[Figure: error channel for EIO along the call chain sys_fsync → do_fsync → filemap_fdatawrite → filemap_fdatawrt_rn → do_writepages → generic_writepages → mpage_writepages (VFS) → ext3_writepage (ext3). ext3_writepage is the generation endpoint, either through "if (...) return -EIO;" or, as ext3_writepage(int *err), through "*err = -EIO;".]
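
The ingredients of an error channel named on this slide (a generation endpoint that returns the code, one that writes it through a pointer argument, a propagate function, and a function-pointer hop) can be sketched in plain C. The function names are simplified, hypothetical stand-ins for the chain in the figure, and the io_failed parameter is an assumption used only to trigger the error.

```c
#include <errno.h>

/* Generation endpoint, return-value style: the code first appears here. */
static int writepage_ret(int io_failed)
{
    if (io_failed)
        return -EIO;
    return 0;
}

/* Generation endpoint, argument style: the code leaves through *err. */
static void writepage_arg(int io_failed, int *err)
{
    *err = io_failed ? -EIO : 0;
}

/* Propagate function: hands the callee's code on to its own caller. */
static int writepages(int io_failed)
{
    int err = writepage_ret(io_failed);
    return err;
}

/* Function-pointer hop: an analysis has to connect fn to its possible
 * targets before it can follow the code across this call. */
typedef int (*writepage_fn)(int io_failed);

static int do_writepages(writepage_fn fn, int io_failed)
{
    return fn(io_failed);
}
```

A call such as do_writepages(writepages, 1) carries -EIO from the generation endpoint back to the top of this small chain, which is the kind of path an error channel is meant to capture.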

SLIDE 12

Detecting Violations

Termination endpoint

Error code is no longer propagated

Two termination endpoints:
− error-complete (minimally checks)
− error-broken (unchecked, unsaved, overwritten)

Goal:

Find error-broken endpoints

Error-complete endpoint:
func() { err = func_call(); if (err) ... }

Unchecked:
func() { err = func_call(); }

Unsaved / Bad Call:
func() { func_call(); }

Overwritten:
func() { err = func_call(); err = func_call_2(); }
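
The four endpoint categories can also be read as a small decision procedure over facts about a call site. The sketch below is a hypothetical encoding for illustration only: the struct fields, enum names, and the order of the checks are assumptions, not EDP's internal representation.

```c
#include <stdbool.h>

enum endpoint_kind { ERROR_COMPLETE, UNCHECKED, UNSAVED, OVERWRITTEN };

/* Hypothetical facts about one call site whose callee can return an
 * error code and whose caller does not propagate it further. */
struct call_site {
    bool saved;       /* return value is assigned to a variable       */
    bool overwritten; /* that variable is reassigned before any check */
    bool checked;     /* that variable is tested, e.g. if (err)       */
};

static enum endpoint_kind classify_endpoint(struct call_site cs)
{
    if (!cs.saved)
        return UNSAVED;        /* func_call();                        */
    if (cs.overwritten)
        return OVERWRITTEN;    /* err = f(); err = g();               */
    if (!cs.checked)
        return UNCHECKED;      /* err = f(); and nothing else         */
    return ERROR_COMPLETE;     /* err = f(); if (err) ...             */
}

/* Everything except an error-complete endpoint counts as error-broken. */
static bool is_error_broken(enum endpoint_kind kind)
{
    return kind != ERROR_COMPLETE;
}
```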

SLIDE 13

Outline

Introduction

Methodology

Results (unsaved error-codes / bad calls)
− Graphical outputs
− Complete results

Analysis of Results

Discussion and Conclusion

SLIDE 14

HFS

[Figure: call graph of HFS with three highlighted regions labeled 1, 2, and 3. Legend: functions that generate/propagate error-codes; functions that make bad calls (do not save error-codes); good calls (calls that propagate error-codes); bad calls (calls that do not save error-codes).]

SLIDE 15

HFS (Example 1)

int find_init(find_data *fd) {
    …
    fd->search_key = kmalloc(…);
    if (!fd->search_key)
        return -ENOMEM;
    …
}

int file_lookup() {
    …
    find_init(fd);               /* Bad call! */
    fd->search_key->cat = …;     /* Null pointer dereference */
    …
}

Inconsistencies

Callee         Good Calls   Bad Calls
find_init      11           3
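
A version of the caller in Example 1 that saves and checks the error code might look like the sketch below. It is a user-space approximation under stated assumptions: malloc stands in for kmalloc, the struct layouts are invented for the example, and the simulate_oom parameter is a hypothetical switch used only to force the failure path.

```c
#include <errno.h>
#include <stdlib.h>

/* Invented layouts for illustration; not the HFS definitions. */
struct search_key { int cat; };
struct find_data  { struct search_key *search_key; };

/* Returns 0 on success or -ENOMEM if the key cannot be allocated. */
static int find_init(struct find_data *fd, int simulate_oom)
{
    fd->search_key = simulate_oom ? NULL : malloc(sizeof *fd->search_key);
    if (!fd->search_key)
        return -ENOMEM;
    return 0;
}

/* Good call: the error code is saved, checked, and propagated, so the
 * key is never dereferenced when the allocation failed. */
static int file_lookup(struct find_data *fd, int simulate_oom)
{
    int err = find_init(fd, simulate_oom);
    if (err)
        return err;
    fd->search_key->cat = 1;
    return 0;
}
```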

SLIDE 16

HFS (Example 2)

[Figure: HFS call graph, region 2.]

SLIDE 17

HFS (Example 2)

int __brec_find(key) {
    /* Finds a record in an HFS node that best matches the given key.
       Returns ENOENT if it fails. */
}

int brec_find(key) {
    …
    result = __brec_find(key);
    …
    return result;
}

Inconsistencies

Callee         Good Calls   Bad Calls
find_init      11           3
__brec_find    4            1
brec_find      18           (none shown)

SLIDE 18

HFS (Example 3)

[Figure: HFS call graph, region 3.]

SLIDE 19

HFS (Example 3)

int free_exts(…) {
    /* Traverses a list of extents and locates the extents to be freed.
       If not found, returns EIO.
       “panic?” is written before the return EIO statement. */
}

Inconsistencies

Callee         Good Calls   Bad Calls
find_init      11           3
__brec_find    4            1
brec_find      18           (none shown)
free_exts      1            3

SLIDE 20

HFS (Summary)

Not only in HFS

Almost all file systems and storage systems have major inconsistencies

Inconsistencies

Callee         Good Calls   Bad Calls
find_init      11           3
__brec_find    4            1
brec_find      18           (none shown)
free_exts      1            3

SLIDE 21

ext3

37 bad / 188 calls = 20%

SLIDE 22

ReiserFS

35 bad / 218 calls = 16%

SLIDE 23

IBM JFS

61 bad / 340 calls = 18%

SLIDE 24

NFS Client

54 bad / 446 calls = 12%

SLIDE 25

Coda

0 bad / 54 calls = 0% (internal)

0 bad / 95 calls = 0% (external)

SLIDE 26

Summary

Incorrect error propagation plagues almost all file systems and storage systems

                   Bad Calls   EC Calls   Fraction
File systems       914         7400       12%
Storage drivers    177         904        20%

SLIDE 27

Outline

Introduction

Methodology

Results

Analysis of Results

Discussion and Conclusion

SLIDE 28

Analysis of Results

Correlate robustness and complexity

Correlate file system size with number of violations
− More complex file systems, more violations (Corr = 0.82)

Correlate file system size with frequency of violations
− Small file systems make frequent violations (Corr = -0.20)

Location distance of calls affects correct error propagation

Inter-module > inter-file > intra-file bad calls

Read vs. Write failure-handling

Corner-case or consistent mistakes

SLIDE 29

Read vs. Write Failure-Handling

Filter read/write operations (string comparison)

Callee contains “write”, or “sync”, or “wait” → Write ops

Callee contains “read” → Read ops

Callee Type        Bad Calls   EC Calls   Fraction
Read               26*         603        4%
Sync+Wait+Write    177         904        20%

* mm/readahead.c: read prefetching in Memory Management

Lots of write failures are ignored!
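
The string-comparison filter described on this slide could be approximated with substring matching, as in the sketch below. The enum and function names are hypothetical, and testing the write-type substrings before "read" is an assumption, since the slide does not say how a name containing both would be handled.

```c
#include <string.h>

enum op_type { OP_OTHER, OP_READ, OP_WRITE };

/* Classifies a callee purely by substrings of its name. */
static enum op_type classify_callee(const char *name)
{
    if (strstr(name, "write") || strstr(name, "sync") || strstr(name, "wait"))
        return OP_WRITE;
    if (strstr(name, "read"))
        return OP_READ;
    return OP_OTHER;
}
```

A filter this simple also matches unrelated names that merely contain one of the substrings (for example, "thread" contains "read"), so the resulting counts are best read as approximate.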

SLIDE 30

Corner-Case or Consistent Mistakes?

Define bad call frequency = (# Bad calls to f()) / (# All calls to f())

Example: sync_blockdev, 15/21

Bad call frequency: 71%

Corner-case bugs

Bad call frequency < 20%

Consistent bugs

Bad call frequency > 50%
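
The definition and the two thresholds translate directly into code. In the sketch below the function and enum names are hypothetical, and the UNCLASSIFIED label for frequencies between 20% and 50% is an assumption, because the slide names only the two outer ranges.

```c
enum bug_class { CORNER_CASE, UNCLASSIFIED, CONSISTENT };

/* Bad call frequency of a callee f, in percent:
 * (# bad calls to f) / (# all calls to f) * 100. */
static double bad_call_frequency(unsigned bad_calls, unsigned all_calls)
{
    if (all_calls == 0)
        return 0.0; /* no calls at all: treat as 0% (an assumption) */
    return 100.0 * bad_calls / all_calls;
}

/* Thresholds from the slide: below 20% is a corner-case bug,
 * above 50% is a consistent bug. */
static enum bug_class classify_frequency(double percent)
{
    if (percent < 20.0)
        return CORNER_CASE;
    if (percent > 50.0)
        return CONSISTENT;
    return UNCLASSIFIED;
}
```

For the sync_blockdev example, bad_call_frequency(15, 21) is about 71.4, which classify_frequency places in the consistent range.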

SLIDE 31

Bad Call Frequency

[Figure: CDF of Bad Call Frequency. x-axis: bad call frequency (%); y-axes: Cumulative #Bad Calls and Cumulative Fraction.]

How a point is added: sync_blockdev has 15 bad calls / 21 EC calls, so Bad Call Freq: 71%. At x = 71, y += 15.

Fewer than 100 violations are corner-case bugs

850 bad calls fall above the 50% mark
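
The rule "at x = 71, y += 15" suggests how such a cumulative curve could be built from per-callee counts. The following is a minimal sketch under stated assumptions: the struct and function names are hypothetical, frequencies are truncated to whole percentages, and callees with no bad calls are skipped.

```c
#include <stddef.h>

/* Per-callee counts: bad calls and all error-code (EC) calls. */
struct callee_stats { unsigned bad; unsigned all; };

/* Fills cumulative[x] with the number of bad calls made to callees whose
 * bad call frequency is at most x percent, for x = 0..100. */
static void bad_call_cdf(const struct callee_stats *stats, size_t n,
                         unsigned cumulative[101])
{
    for (size_t x = 0; x <= 100; x++)
        cumulative[x] = 0;

    for (size_t i = 0; i < n; i++) {
        if (stats[i].all == 0 || stats[i].bad == 0)
            continue;
        /* Integer percentage, truncated: 15/21 gives 71. */
        unsigned x = 100u * stats[i].bad / stats[i].all;
        if (x > 100)
            x = 100; /* guard against inconsistent input */
        cumulative[x] += stats[i].bad; /* "at x = 71, y += 15" */
    }

    /* Turn the per-percentage totals into a running sum. */
    for (size_t x = 1; x <= 100; x++)
        cumulative[x] += cumulative[x - 1];
}
```

In this representation, cumulative[19] is the number of bad calls with a frequency below 20%, and cumulative[100] - cumulative[50] is the number above the 50% mark.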

SLIDE 32

What’s going on?

Not just bugs

But more fundamental design issues

Checkpoint failures are ignored

Why?

Maybe because of journaling flaw [IOShepherd-SOSP’07]

Cannot recover from checkpoint failures

Ex: A simple block remap could not result in a consistent state

Many write failures are ignored

Lack of recovery policies?

Hard to recover?

Many failures are ignored in the middle of operations

Hard to roll back?

SLIDE 33

Conclusion (developer comments)

ext3

“there's no way of reporting error to userspace. So ignore it”

XFS

“Just ignore errors at this point. There is nothing we can do except to try to keep going”

ReiserFS

“we can't do anything about an error here”

IBM JFS

“note: todo: log error handler”

CIFS

“should we pass any errors back?”

SCSI

“Todo: handle failure”

SLIDE 34

Thank you!

Questions?

ADvanced Systems Laboratory

www.cs.wisc.edu/adsl

SLIDE 35

Extra Slides