Slide 1

Fast, Scalable Disk Imaging with Frisbee

University of Utah: Mike Hibler, Leigh Stoller, Jay Lepreau, Robert Ricci, Chad Barb

Slide 2

Key Points

- Frisbee clones whole disks from a server to many clients using multicast
- Fast
  - 34 seconds for a standard FreeBSD image to 1 machine
- Scalable
  - 34 seconds to 80 machines!
- Due to careful design and engineering
  - A straightforward implementation loaded a disk in 30 minutes

Slide 3

Disk Imaging Matters

- Data at disk or partition granularity, rather than file granularity
- Uses
  - OS installation
  - Catastrophe recovery
- Environments
  - Enterprise
  - Clusters
  - Utility computing
  - Research/education environments

Slide 4

Emulab

Slide 5

The Emulab Environment

- Network testbed for emulation
  - Cluster of 168 PCs on a 100 Mbps Ethernet LAN
- Users have full root access to nodes
- Configuration stored in a central database
  - Fast reloading encourages aggressive experiments
  - Swapping frees idle resources
- Custom disk images
- Frisbee in use 18 months, has loaded > 60,000 disks

Slide 6

Disk Imaging Unique Features

- General and versatile
  - Does not require knowledge of the filesystem
  - Can replace one filesystem type with another
- Robust
  - Old disk contents are irrelevant
- Fast

Slide 7

Disk Imaging Tasks

[Diagram: an image is created from a source, distributed by a server, and installed on many targets]

Slide 8

Key Design Aspects

- Domain-specific data compression
- Two-level data segmentation
- LAN-optimized custom multicast protocol
- High levels of concurrency in the client

Slide 9

Image Creation

- Segments images into self-describing “chunks”
- Compresses with zlib
- Can create “raw” images with opaque contents
- Optimizes some common filesystems
  - ext2, FFS, NTFS
  - Skips free blocks

Slide 10

Image Layout

- Each chunk is logically divided into 1024 blocks
- Medium-sized chunks are good for
  - Fast I/O
  - Compression
  - Pipelining
- Small blocks are good for
  - Retransmits
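The two-level segmentation can be sketched in code. This is a simplified model, not the real Frisbee image format: it assumes 1 KB blocks and 1024-block chunks (so a chunk covers 1 MB of disk), with each chunk compressed independently by zlib so it stays self-describing.

```python
import zlib

BLOCK_SIZE = 1024                            # small blocks: cheap to retransmit
BLOCKS_PER_CHUNK = 1024                      # 1024 blocks per chunk
CHUNK_SIZE = BLOCK_SIZE * BLOCKS_PER_CHUNK   # 1 MB: good for I/O, compression, pipelining

def make_chunks(disk_data):
    """Split raw disk contents into independently compressed chunks."""
    chunks = []
    for off in range(0, len(disk_data), CHUNK_SIZE):
        raw = disk_data[off:off + CHUNK_SIZE]
        chunks.append({
            "index": off // CHUNK_SIZE,   # self-describing: chunk knows its place
            "raw_len": len(raw),
            "payload": zlib.compress(raw),
        })
    return chunks

def restore_chunk(chunk):
    """Decompress one chunk; chunks can be restored in any order."""
    return zlib.decompress(chunk["payload"])
```

Because each chunk decompresses on its own, a client can start writing as soon as any chunk arrives, which is what makes unordered, pipelined installation possible.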

Slide 11

Image Distribution Environment

- LAN environment
  - Low latency, high bandwidth
  - IP multicast
  - Low packet loss
- Dedicated clients
  - Consuming all bandwidth and CPU is OK

Slide 12

Custom Multicast Protocol

- Receiver-driven
  - Server is stateless
  - Server consumes no bandwidth when idle
- Reliable, unordered delivery
- “Application-level framing”
- Requests name block ranges within a 1 MB chunk
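The receiver-driven design can be illustrated with a small sketch. The REQUEST/BLOCK message names come from the slides, but the wire encoding below (three 32-bit fields: chunk, first block, block count) is invented for illustration; the point is that the server decodes each REQUEST and answers it without keeping any per-client state.

```python
import struct

REQUEST_FMT = ">III"  # hypothetical encoding: chunk, first block, block count

def encode_request(chunk, first_block, count):
    """Client side: build a REQUEST naming a block range within one chunk."""
    return struct.pack(REQUEST_FMT, chunk, first_block, count)

def serve_request(packet, read_block):
    """Server side: stateless. Decode a REQUEST and yield the named BLOCKs.

    read_block(chunk, block) supplies the data; nothing is remembered
    between requests, so an idle server consumes no resources.
    """
    chunk, first, count = struct.unpack(REQUEST_FMT, packet)
    for block in range(first, first + count):
        yield (chunk, block, read_block(chunk, block))
```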

Slide 13

Client Operation

- Joins the multicast channel
  - One per image
- Asks the server for the image size
- Starts requesting blocks
  - Requests are multicast
- Client starts are not synchronized

Slide 14

Client Requests

[Diagram: a client multicasts a REQUEST for blocks]

Slide 15

Client Requests

[Diagram: the server multicasts the requested BLOCKs to all clients]

Slide 16

Tuning is Crucial

- Client side
  - Timeouts
  - Read-ahead amount
- Server side
  - Burst size
  - Inter-burst gap

Slide 17

Image Installation

[Diagram: chunks flow from distribution through decompression to the disk writer]

- Pipelined with distribution
  - Can install chunks in any order
  - Segmented data makes this possible
- Three threads for overlapping tasks
- Disk write speed is the bottleneck
- Can skip or zero free blocks
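A minimal sketch of the three-thread pipeline, with Python queues standing in for the network receive path and the raw disk: one thread receives compressed chunks, one decompresses, and one writes, so network, CPU, and disk work overlap.

```python
import queue
import threading
import zlib

def run_pipeline(compressed_chunks, write_chunk):
    """Overlap receiving, decompression, and disk writes with three threads."""
    to_decompress = queue.Queue(maxsize=4)   # bounded queues give back-pressure
    to_write = queue.Queue(maxsize=4)

    def receiver():
        # Stands in for the multicast receive path.
        for index, payload in compressed_chunks:
            to_decompress.put((index, payload))
        to_decompress.put(None)              # sentinel: no more chunks

    def decompressor():
        while (item := to_decompress.get()) is not None:
            index, payload = item
            to_write.put((index, zlib.decompress(payload)))
        to_write.put(None)

    def writer():
        # Chunks are self-describing, so they may be written in any order.
        while (item := to_write.get()) is not None:
            index, data = item
            write_chunk(index, data)

    threads = [threading.Thread(target=fn)
               for fn in (receiver, decompressor, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```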

Slide 18

Evaluation

Slide 19

Performance

- Disk image
  - FreeBSD installation used on Emulab
  - 3 GB filesystem, 642 MB of data
  - 80% free space
  - Compressed image size is 180 MB
- Client PCs
  - 850 MHz CPU, 100 MHz memory bus
  - UDMA 33 IDE disks, 21.4 MB/sec write speed
  - 100 Mbps Ethernet; server has Gigabit

Slide 20

Speed and Scaling

Slide 21

FS-Aware Compression

Slide 22

Packet Loss

Slide 23

Related Work

- Disk imagers without multicast
  - Partition Image [www.partimage.org]
- Disk imagers with multicast
  - PowerQuest Drive Image Pro
  - Symantec Ghost
- Differential update
  - rsync is 5x slower with secure checksums
- Reliable multicast
  - SRM [Floyd ’97]
  - RMTP [Lin ’96]

Slide 24

Comparison to Symantec Ghost

Slide 25

Ghost with Packet Loss

Slide 26

How Frisbee Changed our Lives (on Emulab, at least)

- Made disk loading between experiments practical
- Made large experiments possible
  - The unicast loader maxed out at 12 nodes
- Made swapping possible
  - Much more efficient resource usage

Slide 27

The Real Bottom Line

“I used to be able to go to lunch while I loaded a disk, now I can’t even go to the bathroom!”

  - Mike Hibler (first author)
Slide 28

Conclusion

- Frisbee is
  - Fast
  - Scalable
  - Proven
- Careful domain-specific design from top to bottom is key

Source available at www.emulab.net

Slide 29

Slide 30

Comparison to rsync

- Timestamps are not robust
- Checksums are slow
- Conclusion: bulk writes beat data comparison

[Graph: seconds to load with Frisbee (write), rsync (checksum), and rsync (timestamps)]

Slide 31

How to Synchronize Disks

- Differential update (rsync)
  - Operates through the filesystem
  - + Only transfers/writes changes
  - + Saves bandwidth
- Whole-disk imaging
  - Operates below the filesystem
  - + General
  - + Robust
  - + Versatile
- Whole-disk imaging is essential for our task

Slide 32

Image Distribution Performance: Skewed Starts

Slide 33

Future

- Server pacing
- Self-tuning

Slide 34

The Frisbee Protocol

[Flowchart: start; if no chunks are left, finished. Otherwise send a REQUEST, then wait for BLOCKs. On each BLOCK received, check whether the chunk is finished; if so, move to the next chunk. On a timeout, re-send the REQUEST if no requests are outstanding, otherwise keep waiting.]
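The flowchart's client loop can be sketched as straight-line code. The callbacks are hypothetical stand-ins: `request` multicasts a REQUEST for the still-missing blocks of a chunk, and `wait_for_blocks` returns whichever block numbers arrive before a timeout.

```python
def frisbee_client(chunks_needed, request, wait_for_blocks):
    """Per the flowchart: request missing blocks, wait for BLOCKs,
    and re-REQUEST on timeout until every chunk is finished."""
    for chunk, blocks in chunks_needed.items():   # More chunks left?
        missing = set(blocks)
        while missing:                            # Chunk finished?
            request(chunk, sorted(missing))       # Send REQUEST
            received = wait_for_blocks(chunk)     # Wait for BLOCKs / timeout
            missing -= received                   # BLOCK received
```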

Slide 35

The Evolution of Frisbee

- First disk imager: Feb 1999
- Started with NFS distribution
- Added compression
  - Naive
  - FS-aware
- Overlapping I/O
- Multicast

30 minutes down to 34 seconds!

[Graph: load time in seconds for each generation]