SLIDE 1

ZZFS: A file system for spontaneous users

Michelle L. Mazurek¹, Eno Thereska², Dinan Gunawardena², Richard Harper², James Scott²

¹ Carnegie Mellon University

² Microsoft Research, Cambridge UK

SLIDE 2

In an ideal world…

  • Alice wants to check her financial spreadsheet
  • It’s available and up-to-date on her phone
  • Even if she updated it from her laptop
  • Even if her husband updated it from his phone

February 2012

[Figure: caption "I should check my budget" beside a budget spreadsheet with rows for food, mortgage, clothing, social, and car]

SLIDE 3

Personal doesn’t mean simple

  • Requires collaboration
      – Household management (e.g. financials)
      – Beyond the household
  • Requires coordinating devices
      – Non-homogenous, frequently changing
  • Requires handling writes
      – Example: Listening to music
  • Ideally, all of this should be seamless

SLIDE 4

One key problem: Unavailable devices

  • Read financials, but laptop is off
  • Switch from laptop to tablet
  • Never on at the same time; how to sync?
  • Update remote data
  • Could have been pre-fetched if device was on
  • Unavailable devices inhibit on-demand data

SLIDE 5

Human factors are important

  • Eventual consistency (e.g. Coda)
      – Users want data now
  • Planning and hoarding (Perspective, Anzere)
      – Users are spontaneous
  • In a public cloud
      – Users want trust, control
  • Best of each?
      – Availability and trust/control, without planning

SLIDE 6

ZZFS: More devices are available

  • Create more “always available” resources
      – Turn on devices that are off
      – Minimize likelihood a device is unreachable
  • Support users
      – Access data without pre-planning
      – Don’t force or second-guess their choices

Combine new hardware with best-practice storage system techniques

SLIDE 7

Outline

  • Introduction and motivation
  • Design
  • Implementation and evaluation
  • Conclusion

SLIDE 8

ZZFS: Design considerations

  • Human factors
      – User studies to examine storage and access behavior
  • Hardware
      – Low-power NIC to turn on devices (Agarwal ’09)
      – Additional temporary storage
  • Storage system best practices
      – Versioned histories for consistency
      – I/O offloading when needed for performance

SLIDE 9

Human factors: Sources

  • Goal: Understand how and why users store, organize and access content across devices
  • Inform design decisions
  • Sources:
      – Analysis of feedback from LiveMesh and Dropbox
      – Small-scale qualitative study
      – Large-scale qualitative study (Odom et al., CHI 2012)

SLIDE 10

Human factors: Findings

  • Don’t require hoarding
      – People are busy
      – People don’t always know what they’ll need
  • Placement is deliberate and reasoned
      – Desire to know and control where data is
      – Trust considerations for cloud storage (Odom 2012, Ion 2011)
  • Don’t second-guess users’ decisions

SLIDE 11

Hardware: Somniloquy NIC

  • Maintains access while PC is dormant
  • Wake up as needed (“always-on”)
  • Transparent to applications
  • 10-100x less power than idle PC
  • On-board flash for temporary storage

Agarwal et al., NSDI 2009

SLIDE 12

Storage system

  • Metadata service
      – How devices find the data
  • I/O director
      – Sets policy for where/how data is found
  • I/O offloading
  • Example use cases

SLIDE 13

Storage system: Metadata service

  • Flat, device-transparent namespace
  • Can reside anywhere
  • Could replicate content or service
  • Simple default for personal data
  • Single instance, cached on all devices
  • Metadata size << data size
  • ≤ 0.1% (our families)
  • Consistent with Eyo (Strauss 2011)
  • Easily storable, cacheable
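
As an illustration of the flat, device-transparent namespace, the following Python sketch maps object names to the devices that hold them. It is a toy under stated assumptions, not the ZZFS implementation: the class and method names (`MetadataService`, `register`, `locate`) and the record layout are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectRecord:
    """Hypothetical metadata entry: which devices hold one object."""
    primary: str                                   # device with the primary replica
    secondaries: list = field(default_factory=list)  # devices with extra replicas


class MetadataService:
    """Toy flat namespace: object name -> ObjectRecord (no per-device paths)."""

    def __init__(self):
        self._records = {}

    def register(self, name, primary, secondaries=()):
        # Record where a newly created object lives.
        self._records[name] = ObjectRecord(primary, list(secondaries))

    def locate(self, name):
        # Return replica locations, primary first.
        rec = self._records[name]
        return [rec.primary, *rec.secondaries]


mds = MetadataService()
mds.register("baby.jpg", primary="desktop")
mds.register("finances.xlsx", primary="desktop", secondaries=["laptop"])
print(mds.locate("finances.xlsx"))
```

Because each record is only a name and a short list of device identifiers, a structure like this stays small relative to the data it describes, which is what makes caching it on every device plausible.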

SLIDE 14

Storage system: I/O director

  • Sits above metadata service
  • Determines how data is stored, accessed
  • Optimize for energy, cost, latency, etc.
  • New options for placement, access
  • Wake device when dormant
  • Offload writes when unavailable

– To on-board flash, cloud, other devices
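
The placement and access options above can be sketched as a small decision function. This is a hedged illustration only: the three device states, the function names (`plan_read`, `plan_write`), and the returned action strings are assumptions made for the example, and the real I/O director also weighs energy, cost, and latency.

```python
from enum import Enum


class DeviceState(Enum):
    ON = "on"
    DORMANT = "dormant"          # asleep, but reachable through its low-power NIC
    UNREACHABLE = "unreachable"  # no way to contact the device right now


def plan_read(state):
    """Pick an action for a read aimed at a device in the given state."""
    if state is DeviceState.ON:
        return "read"
    if state is DeviceState.DORMANT:
        return "wake-then-read"  # the reader waits for the device to resume
    return "unavailable"         # nothing to wake; the data cannot be fetched


def plan_write(state):
    """Pick an action for a write aimed at a device in the given state."""
    if state is DeviceState.ON:
        return "write"
    if state is DeviceState.DORMANT:
        return "offload-and-wake"  # log the write elsewhere while the device resumes
    return "offload"               # log the write until the device can reclaim it


for s in DeviceState:
    print(s.value, plan_read(s), plan_write(s))
```

The asymmetry is the point of the sketch: a read of a dormant device has to wait for the wake-up, while a write can be acknowledged as soon as it is logged somewhere else.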

SLIDE 15

Storage system: I/O offloading

  • Builds on past work (Everest, Sierra)
  • Offload updates to spare resources as needed
  • All offloaded data eventually reclaimed
  • Allows online data migration
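
A minimal sketch of the offload-then-reclaim idea, assuming each write carries a per-object version number. The names (`OffloadLog`, `LoggedWrite`, `reclaim`) and the in-memory dictionary standing in for the device's own storage are hypothetical, and the sketch does not reproduce the Everest or Sierra designs.

```python
from dataclasses import dataclass


@dataclass
class LoggedWrite:
    name: str      # object being written
    version: int   # per-object version, higher means newer
    data: bytes


class OffloadLog:
    """Hypothetical spare resource that holds writes for an unavailable device."""

    def __init__(self):
        self.entries = []

    def append(self, name, version, data):
        self.entries.append(LoggedWrite(name, version, data))


def reclaim(log, store):
    """Apply logged writes to the device's store in version order, then empty the log.

    `store` maps object name -> (version, data). Returns the number of writes applied.
    """
    applied = 0
    for w in sorted(log.entries, key=lambda w: w.version):
        current_version = store.get(w.name, (-1, b""))[0]
        if w.version > current_version:   # skip anything the device already has
            store[w.name] = (w.version, w.data)
            applied += 1
    log.entries.clear()                   # all offloaded data has been reclaimed
    return applied


store = {"finances.xlsx": (0, b"old")}
log = OffloadLog()
log.append("finances.xlsx", 2, b"newest")
log.append("finances.xlsx", 1, b"newer")
applied = reclaim(log, store)
print(applied, store)
```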

SLIDE 16

Standard read

[Figure: components labeled NIC, MDS, and I/O Dir. Message flow: "Where is baby.jpg?" → "On the desktop" → "Read baby.jpg"]

SLIDE 17

Read while dormant: Wake up

[Figure: components labeled NIC, MDS, and I/O Dir. "Read baby.jpg" is sent to a device marked dormant]

SLIDE 18

Standard write

[Figure: components labeled NIC, MDS, and I/O Dir, with primary and secondary copies of a budget spreadsheet. Message flow: "Where is finances.xlsx?" → "On the desktop" → "Write finances.xlsx" (twice) → "ACK" (twice)]

SLIDE 19

Write with offload/reclaim

[Figure: components labeled NIC, MDS, I/O Dir, and Somni flash, with primary and secondary copies of a budget spreadsheet; one device is marked dormant. Message flow: "Write finances.xlsx" → "Write to log" → "ACK" (twice) → "I’m awake" → reclaim. Callout: "No waiting!"]

SLIDE 20

Other placement options

  • Write with no network connection:
      – Offload all remote writes to local log
      – Upon connection, implicated devices reclaim
      – Standard conflict resolution (e.g. Bayou)
  • Move primary before device sleeps
  • More in the paper
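
To make the disconnected case concrete, here is a small sketch of how a local log of remote writes might be split into one batch per implicated device once the network returns. The tuple layout and the function name `batch_for_reclaim` are assumptions for illustration, and conflict resolution is deliberately left out.

```python
from collections import defaultdict


def batch_for_reclaim(local_log):
    """Group logged remote writes by target device, preserving write order.

    `local_log` is a list of (target_device, object_name, data) tuples
    appended while the writer had no network connection.
    """
    batches = defaultdict(list)
    for target, name, data in local_log:
        batches[target].append((name, data))
    return dict(batches)


local_log = [
    ("desktop", "finances.xlsx", b"edit 1"),
    ("laptop", "notes.txt", b"draft"),
    ("desktop", "finances.xlsx", b"edit 2"),
]
batches = batch_for_reclaim(local_log)
print(batches)
```

Each device then reclaims only its own batch; if another user changed the same object in the meantime, a conflict-resolution step would have to run before the batch is applied.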

SLIDE 21

Other design considerations

  • Broadband at home, weak 3G while mobile
  • Somniloquy is a prototype
  • Application to mobile devices, tablets

SLIDE 22

Outline

  • Introduction and motivation
  • Design
  • Implementation and evaluation
  • Conclusion

SLIDE 23

Implementation

  • Simple application of always-on design
  • User level, in C, above NTFS or FAT
  • Run legacy applications via WebDAV detours
  • Per-object replication
  • Default placement: 1R, leave where created

SLIDE 24

Evaluation overview

  • Throughput: Low overhead
  • Read latency: Standby penalty
  • Write latency: Standby penalty
  • Access latency and placement policy
  • Moving files: Limited performance penalty
  • Sensitivity to parameters
  • Fraction of local vs. remote accesses
  • Fraction of reads vs. writes

SLIDE 25

Read latency: Music example

  • User listening to music
      – Songs split local and remote, shuffle mode
      – No replication
      – Read entire song in chunks; then small db write
  • Goal: Evaluate worst case
      – Read request arrives just as shutting down

SLIDE 26

Read latency: Music example

Worst case: Performance is OK

[Figure: latency plot with series labeled local reads, remote small writes, and remote reads; annotations mark a blocked request and the standby-and-resume interval]

SLIDE 27

Latency: Standby and resume

Device                      Standby (s)   Resume (s)
Lenovo x61 (Win7)           3.8           2.6
Dell T3500 (Win7)           8.7           7.2
HP Pavilion (XP)            4.9           10.3
Macbook Pro (OSX 10.6.8)    1.0           2.0
Ubuntu 11.10                11.0          4.5
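
As a rough worked example, the snippet below adds the two columns for each device. It rests on an assumption that the slides do not state outright: that a request arriving just as a device begins to shut down must wait for the full standby transition and then the full resume. The function name `worst_case_block` is hypothetical.

```python
# (standby_s, resume_s) pairs copied from the table above
LATENCIES = {
    "Lenovo x61 (Win7)": (3.8, 2.6),
    "Dell T3500 (Win7)": (8.7, 7.2),
    "HP Pavilion (XP)": (4.9, 10.3),
    "Macbook Pro (OSX 10.6.8)": (1.0, 2.0),
    "Ubuntu 11.10": (11.0, 4.5),
}


def worst_case_block(standby_s, resume_s):
    """Assumed worst case: the request waits out standby, then resume."""
    return standby_s + resume_s


totals = {
    device: round(worst_case_block(*times), 1)
    for device, times in LATENCIES.items()
}
for device, total in totals.items():
    print(f"{device}: {total} s")
```

Under that assumption the blocked time ranges from 3.0 s (Macbook Pro) to 15.9 s (Dell T3500).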

SLIDE 28

Write latency: Document example

  • Writes to an office document
  • 2 replicas: D1 (local) and D2 (remote)
  • Worst case: remote device goes into standby
  • Offload to D3 while it resumes
  • When it’s awake, reclaim
  • Unlike for read, offload masks switch-on

SLIDE 29

Write latency: Document example

Offloading masks switch-on cost

[Figure: write latency plot with annotations "D2 shuts down; start offload to D3", "D2 wakes up; start reclaim", and "Reclaim ends"]

SLIDE 30

Conclusion

  • ZZFS makes spontaneous access to distributed content work better
  • Low-power, always-on comm channel
  • Helps execute placement policies
  • Helps compensate for uncertainties in device availability, user behavior

  • Accounts for human factors
