Nameless Writes Remzi H. Arpaci-Dusseau Professor @ University of - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Nameless Writes

Remzi H. Arpaci-Dusseau Professor @ University of Wisconsin-Madison (+visiting professor @ EPFL) Joint work with: Andrea C. Arpaci-Dusseau (UW, EPFL) Vijayan Prabhakaran (MSR Silicon Valley)

slide-2
SLIDE 2

Indirection

“All problems in computer science can be solved by another level of indirection”

  • usually attributed to Butler Lampson
slide-3
SLIDE 3

Problems?

  • Too big, too slow

Example: Virtual Memory

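[Figure: a virtual address space (code, heap, stack) mapped onto physical memory through a page table. Each entry holds a valid bit (V) and a page frame number (PFN). Of the 16 entries shown, 6 are valid: (1, 2), (1, 3), (1, 1), (1, 4), then ten invalid entries, then (1, 0), (1, 6).]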
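The page table on this slide can be read as a plain lookup array. The sketch below is only an illustration: the (V, PFN) pairs are taken from the slide, while the 4 KB page size and the `translate` helper are assumptions added here.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (not stated on the slide)

# (valid bit, page frame number) per virtual page, as listed on the slide.
PAGE_TABLE = [(1, 2), (1, 3), (1, 1), (1, 4)] + [(0, None)] * 10 + [(1, 0), (1, 6)]


def translate(vaddr):
    """Translate a virtual address to a physical address via the page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if not 0 <= vpn < len(PAGE_TABLE):
        raise LookupError(f"virtual page {vpn} is outside the address space")
    valid, pfn = PAGE_TABLE[vpn]
    if not valid:
        raise LookupError(f"virtual page {vpn} is not mapped")
    return pfn * PAGE_SIZE + offset


print(translate(0))              # virtual page 0 maps to frame 2
print(translate(PAGE_SIZE + 5))  # virtual page 1 maps to frame 3
```

Every access pays for one table lookup, and the table needs one entry per virtual page whether or not the page is mapped, which is the "too big, too slow" concern in miniature.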

slide-4
SLIDE 4

Another example: RAID

Early RAIDs: Simple indirection

  • Fixed mapping avoids need for indirection table

[Figure: logical blocks 1-7 laid out on mirrored and parity (P) RAID]

More sophisticated RAID, more sophisticated mappings

  • e.g., AutoRAID
slide-5
SLIDE 5

Too Much of a Good Thing?

slide-6
SLIDE 6

Virtual Machine Monitors

VMMs: Another layer, beneath OS

  • Consolidation, multi-platform support, many other reasons

But the cost of indirection grows

Example: Virtual Memory (again)

  • Double Indirection: Virtual to Physical to Machine

slide-7
SLIDE 7

Many Examples

  • VMMs and Memory
  • File System and RAID
  • File System and Disk (a little)
  • File System and RAID and Disk
  • File System and Flash FTL

slide-8
SLIDE 8

Today’s Focus: Flash

slide-9
SLIDE 9

Flash FTL

Flash Translation Layer (FTL)

  • Turns read-erase/program into read-write

  • Allows for wear leveling
slide-10
SLIDE 10

Background

Flash organized into blocks

Each block contains some pages

Problem:

  • To program a page, must erase block first
  • Even worse: Erase is costly (ms, not µs)

Implication: Simple mapping performs poorly

  • Would turn each write into erase/program

[Figure: one block containing eight pages]

slide-11
SLIDE 11

Solution: Use Indirection

Solution: Borrow log-structuring ideas

  • Organize flash into a log
  • Erase an “active” block
  • Direct all writes to active block
  • Record mapping in indirection table (i-table)
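The four steps above can be sketched as a toy FTL. This is a minimal illustration under stated assumptions, not how any real device works: the class name `LogFTL`, its fields, and the tiny geometry are hypothetical, and garbage collection of stale pages is left out entirely.

```python
class LogFTL:
    """Toy log-structured FTL: every write is appended to the active block."""

    def __init__(self, num_blocks, pages_per_block):
        self.pages_per_block = pages_per_block
        self.flash = [None] * (num_blocks * pages_per_block)
        self.itable = {}                      # logical page -> physical page
        self.free_blocks = list(range(num_blocks))
        self.erase_count = 0
        self.next_page = 0
        self.pages_left = 0                   # no active block yet

    def _open_active_block(self):
        if not self.free_blocks:
            # A real FTL would garbage-collect here; this sketch just gives up.
            raise RuntimeError("out of free blocks")
        block = self.free_blocks.pop(0)
        start = block * self.pages_per_block
        for page in range(start, start + self.pages_per_block):
            self.flash[page] = None           # erase the whole block at once
        self.erase_count += 1
        self.next_page = start
        self.pages_left = self.pages_per_block

    def write(self, logical_page, data):
        if self.pages_left == 0:
            self._open_active_block()
        physical_page = self.next_page
        self.flash[physical_page] = data      # program the next page in the log
        self.itable[logical_page] = physical_page  # old copy becomes stale
        self.next_page += 1
        self.pages_left -= 1
        return physical_page

    def read(self, logical_page):
        return self.flash[self.itable[logical_page]]


ftl = LogFTL(num_blocks=2, pages_per_block=4)
ftl.write(7, "a")
ftl.write(7, "b")   # overwrite goes to a new physical page, no extra erase
print(ftl.read(7), ftl.erase_count)
```

Overwriting logical page 7 costs one page program rather than an erase, which is the point of the log; the price is the `itable` dictionary that must be kept for every live logical page.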

slide-12
SLIDE 12

Useful for Wear Too

Wear-leveling problem

  • Too many erase-program cycles will render block unreadable (can’t differentiate ones from zeroes)

Indirection helps here too

  • Balance write load across blocks
  • Might have to migrate data out of a live but not-often-used block for leveling
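One simple way to balance the write load, shown only as an illustration, is to always open the free block with the fewest erases so far. The policy and the helper name `pick_active_block` are assumptions made here; the slides do not say which leveling policy a device uses.

```python
def pick_active_block(erase_counts, free_blocks):
    """Return the free block erased the fewest times (ties: lowest block number)."""
    return min(free_blocks, key=lambda block: (erase_counts[block], block))


# Simulate eight erase cycles over four blocks that are all free each time.
erase_counts = [0, 0, 0, 0]
for _ in range(8):
    block = pick_active_block(erase_counts, range(4))
    erase_counts[block] += 1

print(erase_counts)
```

This only works if every block regularly becomes free. A block full of live, rarely rewritten data never does, which is why the device may have to migrate that data elsewhere.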

slide-13
SLIDE 13

Problems

slide-14
SLIDE 14

Cost of Indirection

Too big

  • i-table (naive): one mapping per page
  • i-table (hybrid): one per page for some, one per block for most
  • Either way: MB (or GB) of memory, just for mapping information

Too slow

  • Could be a problem too (if i-table doesn’t fit in memory)
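A back-of-the-envelope calculation shows where the "MB (or GB)" figure can come from. All numbers below are assumptions chosen for illustration (a 1 TiB device, 4 KiB pages, 256 KiB blocks, 4-byte table entries); they are not taken from the talk.

```python
KIB, MIB, GIB, TIB = 2**10, 2**20, 2**30, 2**40


def itable_bytes(capacity_bytes, unit_bytes, entry_bytes=4):
    """Size of a flat mapping table with one entry per mapped unit."""
    return (capacity_bytes // unit_bytes) * entry_bytes


capacity = 1 * TIB                                # assumed device capacity
page_level = itable_bytes(capacity, 4 * KIB)      # naive: one entry per page
block_level = itable_bytes(capacity, 256 * KIB)   # one entry per block

print(page_level // MIB, "MiB for a page-level table")
print(block_level // MIB, "MiB for a block-level table")
```

Under these assumptions the page-level table needs 1 GiB and the block-level table 16 MiB; a hybrid scheme lands somewhere in between, depending on how many pages get their own entries.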

slide-15
SLIDE 15

So What Can We Do?

slide-16
SLIDE 16

Key Idea: Turn Double Indirection To Our Advantage

slide-17
SLIDE 17

Leverage: Double Indirection

Double indirection example

  • FS: virtual offset (in file) to logical block (on dev)
  • Flash: logical block to physical page

Can we remove one level of the indirection?

  • Generically called de-indirection

Before (double indirection):

  inode: 0: 100, 1: 101
  SSD:   100: 8000, 101: 9500

After (de-indirection):

  inode: 0: 8000, 1: 9500
  SSD:   no mapping info needed
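The numbers in the figure can be replayed directly. This sketch composes the two maps from the slide into one; the variable and function names are hypothetical.

```python
# Double indirection, using the values from the slide.
inode = {0: 100, 1: 101}             # file offset -> logical block
ssd_itable = {100: 8000, 101: 9500}  # logical block -> physical page


def physical_page(offset):
    """Two lookups: one in the file system, one inside the device."""
    return ssd_itable[inode[offset]]


# De-indirection: fold the device's map into the inode, leaving one lookup.
deindirected_inode = {
    offset: ssd_itable[logical] for offset, logical in inode.items()
}

print(deindirected_inode)
```

Once the inode holds physical pages directly, the device no longer needs `ssd_itable` for this file.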

slide-18
SLIDE 18

Our “Solution”: Nameless Writes

slide-19
SLIDE 19

Nameless Writes

Usual interface:

  • write(address, data): return OK/FAIL

Nameless interface

  • write(data): return address, OK/FAIL

Device chooses where to write block, and returns physical address to client (FS)
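The two interfaces can be contrasted in a few lines. This is a minimal sketch, assuming a device that simply appends to the next free page; the class names and the `"OK"`/`"FAIL"` strings are illustrative choices, not an actual device API.

```python
class NamedDevice:
    """Usual interface: the client picks the address."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages

    def write(self, address, data):
        if not 0 <= address < len(self.pages):
            return "FAIL"
        self.pages[address] = data
        return "OK"


class NamelessDevice:
    """Nameless interface: the device picks the address and returns it."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages
        self.next_free = 0

    def write(self, data):
        if self.next_free >= len(self.pages):
            return None, "FAIL"
        address = self.next_free          # device decides placement
        self.pages[address] = data
        self.next_free += 1
        return address, "OK"

    def read(self, address):
        return self.pages[address]


device = NamelessDevice(num_pages=2)
address, status = device.write("block D")
print(address, status, device.read(address))
```

The client has to remember the returned address, since the device keeps no table that could find the data again by name.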

slide-20
SLIDE 20

Simple Example

Structures dirtied: inode (I), data (D)

Usual approach

  • D is allocated to address A(D)
  • I is at fixed location [A(D) inside]
  • Write them out whenever (depending on FS)

Nameless approach

  • Nameless write of D, returns A(D)
  • Update inode I with A(D)
slide-21
SLIDE 21

What About Wear?

Problem: Wear-leveling

  • Wear-leveling algorithm still might need to move blocks

Solution: Renaming callback

  • Device upcalls into client, informs that device has moved block at address X to new location: address Y
  • Client (FS) must take action as needed
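The callback can be sketched as a function the client registers with the device. Everything here is an assumption for illustration: the names `Device`, `FileSystem`, `migrate`, and `handle_rename` are hypothetical, the device just appends to the next free page, and running out of space is not handled.

```python
class Device:
    """Toy nameless-write device that may move data and tell its client."""

    def __init__(self, num_pages, on_rename):
        self.pages = [None] * num_pages
        self.next_free = 0
        self.on_rename = on_rename        # upcall into the client

    def write(self, data):
        address = self.next_free          # device chooses the location
        self.pages[address] = data
        self.next_free += 1
        return address

    def read(self, address):
        return self.pages[address]

    def migrate(self, old_address):
        """Move a block (e.g., for wear leveling) and notify the client."""
        new_address = self.write(self.pages[old_address])
        self.pages[old_address] = None
        self.on_rename(old_address, new_address)
        return new_address


class FileSystem:
    """Toy client: one inode mapping file offsets to physical addresses."""

    def __init__(self, num_pages):
        self.device = Device(num_pages, self.handle_rename)
        self.inode = {}

    def write(self, offset, data):
        self.inode[offset] = self.device.write(data)  # record returned address

    def read(self, offset):
        return self.device.read(self.inode[offset])

    def handle_rename(self, old_address, new_address):
        # Only values change here, so updating during iteration is safe.
        for offset, address in self.inode.items():
            if address == old_address:
                self.inode[offset] = new_address


fs = FileSystem(num_pages=8)
fs.write(0, "D")
fs.device.migrate(fs.inode[0])
print(fs.inode, fs.read(0))
```

The linear scan in `handle_rename` is the simplest thing that works in a sketch; a real file system would need some way to find the owner of a physical address without scanning every inode.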
slide-22
SLIDE 22

Key Features

Removes FTL indirection

  • No more indirection table; assumed that client tracks locations

Device retains control

  • For performance, still log-structured
  • For reliability, still does wear leveling
slide-23
SLIDE 23

But, Lots of Problems

  • File system must delay allocation decision
  • File system must be able to write out blocks in certain order
  • File system must be able to handle callback
  • Sometimes need a “known location”
  • Device must be willing to expose its physical nature

(many more; your thoughts/complaints go here)

slide-24
SLIDE 24

Other Ways To Do This?

Could remove FTL (“file-system only”)

  • Buggy FS might do poor wear leveling
  • Device is better at managing its detailed performance characteristics

Could do it in device (“device only”)

  • Hard to do while device is mounted

Could consider alternate interfaces

  • e.g., inform device of pointers
slide-25
SLIDE 25

Conclusions

slide-26
SLIDE 26

Nameless Writes

Addresses overheads of FTL indirection

  • Enables little or no mapping info
  • Device controls low-level decisions

But, some pain points

  • Integrating into existing/new file systems
  • Will devices expose physical names?

General approach of de-indirection

  • Likely more widely applicable
slide-27
SLIDE 27

Indirection: Reprise

“All problems in computer science can be solved by another level of indirection”

  • usually attributed to Butler Lampson

Lampson attributes it to David Wheeler

And Wheeler usually added: “but that usually will create another problem”