CERN, June 2008: large, reliable, and secure distributed online storage (PowerPoint PPT presentation)


SLIDE 1

CERN, June 2008

SLIDE 2

large, reliable, and secure distributed online storage: harness the idle resources of participating computers

SLIDE 3

  • an old dream of computer science
SLIDE 4

“The design of a world-wide, fully transparent distributed file system for simultaneous use by millions of mobile and frequently disconnected users is left as an exercise for the reader.”

  • A. Tanenbaum, Distributed Operating Systems, 1995
SLIDE 5

lots of research projects: OceanStore (UC Berkeley), PAST (Microsoft Research), CFS (MIT)

SLIDE 6

we were inspired by them and wanted to make it work; first step: closed alpha

SLIDE 7

upload any file of any size; access from anywhere; share with friends and groups; publish to the world

SLIDE 8

free and simple application (Win, Mac, Linux); start from the web, no installation required; start with 1 GB provided by us; if you want more, you can trade or buy storage

SLIDE 9

online storage with the “power of P2P”: fast downloads, no file size limit, no traffic limit

SLIDE 10

privacy: all files are encrypted on your computer; your password never leaves your computer, so no one, not even we, can see your files

SLIDE 11
SLIDE 12
SLIDE 13
SLIDE 14
SLIDE 15
SLIDE 16
SLIDE 17
SLIDE 18
SLIDE 19
SLIDE 20
SLIDE 21
SLIDE 22

how does it work?

SLIDE 23

data is stored in the p2p network, but users’ computers can be offline; how to ensure availability (persistent storage)?

SLIDE 24

two approaches

  • 1. make sure the data is always in the network: move the data when a computer goes offline (a bad idea for lots of data and a high churn rate)
  • 2. introduce redundancy
SLIDE 25

redundancy = replication? p = node availability, k = redundancy factor, p_rep = file availability

SLIDE 26

redundancy = replication? example: p = 0.25, k = 5 → p_rep = 0.763; not enough

SLIDE 27

redundancy = replication? example: p = 0.25, k = 24 → p_rep = 0.999; unrealistic

SLIDE 28

erasure codes: encode m fragments into n; any m out of n suffice to reconstruct

Reed-Solomon (optimal codes), as used in RAID storage systems (vs. low-density parity-check codes, which need (1 + ε)·m fragments, where ε is a fixed, small constant)
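The “any m out of n” property can be seen in the smallest possible case: one XOR parity fragment (m = 2, n = 3). This toy is not Reed-Solomon, just an illustration of the principle:

```python
# Toy (m=2, n=3) erasure code with a single XOR parity fragment.
# Any 2 of the 3 fragments suffice to reconstruct the data; real
# systems use Reed-Solomon codes to get arbitrary m and n.

def encode(a: bytes, b: bytes) -> list:
    """Encode m=2 data fragments into n=3 fragments (data + XOR parity)."""
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity]

def decode(frags: dict) -> tuple:
    """Reconstruct (a, b) from any 2 of the 3 fragments (keyed by index)."""
    if 0 in frags and 1 in frags:
        return frags[0], frags[1]
    if 0 in frags and 2 in frags:   # b = a XOR parity
        a = frags[0]
        return a, bytes(x ^ y for x, y in zip(a, frags[2]))
    if 1 in frags and 2 in frags:   # a = b XOR parity
        b = frags[1]
        return bytes(x ^ y for x, y in zip(b, frags[2])), b
    raise ValueError("need at least m=2 fragments")

a, b = b"road", b"trip"
frags = encode(a, b)
# lose any single fragment and the file is still recoverable:
print(decode({0: frags[0], 2: frags[2]}))  # (b'road', b'trip')
print(decode({1: frags[1], 2: frags[2]}))  # (b'road', b'trip')
```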

SLIDE 29

availability with erasure coding: p = 0.25, m = 100, n = 517, k = n/m = 5.17 → p_ec = 0.999

k = n/m = 5.17 vs. k = 24 using replication
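The slide’s numbers can be checked directly: with k full replicas the file is available if at least one replica is online, and with (m, n) erasure coding it is available if at least m of the n fragments are online (a binomial tail). A back-of-envelope check, not from the talk:

```python
from math import comb

def p_rep(p: float, k: int) -> float:
    """File availability with k full replicas: at least one replica online."""
    return 1 - (1 - p) ** k

def p_ec(p: float, m: int, n: int) -> float:
    """File availability with (m, n) erasure coding:
    at least m of the n fragments online (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(m, n + 1))

print(round(p_rep(0.25, 5), 3))        # 0.763  (slide 26)
print(round(p_rep(0.25, 24), 3))       # 0.999  (slide 27)
print(round(p_ec(0.25, 100, 517), 3))  # 0.999  (slide 29, at 5.17x redundancy)
```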

SLIDE 30

SLIDE 31 (figure: data points on an x/y plot)

SLIDE 32

SLIDE 33

SLIDE 34

alice stores a file

roadtrip.mpg

SLIDE 35

alice drags roadtrip.mpg into wuala

SLIDE 36
  • 1. encrypted on alice’s computer (128 bit AES)
SLIDE 37
  • 1. encrypted on alice’s computer (128 bit AES)
  • 2. encoded into redundant fragments
SLIDE 38
  • 1. encrypted on alice’s computer (128 bit AES)
  • 2. encoded into redundant fragments
  • 3. uploaded into the p2p network

p2p network

SLIDE 39
  • 1. encrypted on alice’s computer (128 bit AES)
  • 2. encoded into redundant fragments
  • 3. uploaded into the p2p network

p2p network

  • 4. m fragments uploaded onto our servers (bootstrap, backup)

SLIDE 40

alice shares the file with bob

alice and bob have a friendship key; alice encrypts the file key and exchanges it with bob; bob wants to download the file

SLIDE 41

p2p network

SLIDE 42
  • 1. download subset of fragments (m)

p2p network

SLIDE 43
  • 1. download subset of fragments (m)

p2p network

if necessary, get the remaining fragments from our servers
SLIDE 44
  • 2. decode the file
  • 1. download subset of fragments (m)

p2p network

SLIDE 45
  • 3. decrypt the file
  • 2. decode the file
  • 1. download subset of fragments (m)

p2p network

SLIDE 46

bob plays roadtrip.mpg

  • 2. decode the file
  • 1. download subset of fragments (m)

p2p network

SLIDE 47

p2p network

SLIDE 48

maintenance

p2p network

SLIDE 49

maintenance: alice’s computer checks and maintains her files

p2p network

SLIDE 50

maintenance: alice’s computer checks and maintains her files; if necessary, it constructs new fragments and uploads them

p2p network

SLIDE 51

SLIDE 52

SLIDE 53

p2p network

SLIDE 54

p2p network

put

SLIDE 55

p2p network

get put

SLIDE 56

distributed hash table (DHT)

p2p network

get put

SLIDE 57

super nodes

SLIDE 58

storage nodes

SLIDE 59

client nodes

SLIDE 60

get

SLIDE 61

SLIDE 62

SLIDE 63

SLIDE 64

SLIDE 65

download of fragments (in parallel)

SLIDE 66

routing
Napster: centralized :-(
Gnutella: flooding :-(
Chord, Tapestry: structured overlay networks, O(log n) hops :-)

n = # super nodes

vulnerable to attacks (partitioning) :-(

SLIDE 67

each super node is connected to its direct neighbors plus some random links; random links? piggyback routing information

SLIDE 68

the number of hops depends on the size of the network (n) and on the size of the routing table (R), which itself depends on the traffic; we have lots of traffic due to erasure coding

SLIDE 69

simulation results: n = 10^6; R = 1,000: < 3 hops; R = 100: ~5 hops

reasonable already with moderate traffic

SLIDE 70

small world effects

(see Milgram, Watts & Strogatz, Kleinberg) regular graph: high diameter :-( high clustering :-)

SLIDE 71

small world effects

(see Milgram, Watts & Strogatz, Kleinberg) regular graph: high diameter :-( high clustering :-) random graph: low diameter :-) low clustering :-(

SLIDE 72

small world effects

(see Milgram, Watts & Strogatz, Kleinberg) regular graph: high diameter :-( high clustering :-) random graph: low diameter :-) low clustering :-( mix: low diameter :-) high clustering :-)
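The small-world effect can be reproduced with a quick sketch: a pure-Python ring lattice (the “regular graph”) versus the same lattice with a few random shortcuts (the “mix”). The node and shortcut counts here are arbitrary small values for illustration; the local ring edges stay in place, so clustering remains high while the average path length drops:

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Ring where each node links to its k nearest neighbors on each side."""
    return {v: {(v + i) % n for i in range(-k, k + 1) if i != 0}
            for v in range(n)}

def add_shortcuts(g, num, seed=0):
    """Add `num` random long-range links (the 'mix' of regular + random)."""
    rnd = random.Random(seed)
    nodes = list(g)
    for _ in range(num):
        a, b = rnd.sample(nodes, 2)
        g[a].add(b)
        g[b].add(a)
    return g

def avg_path_length(g):
    """Mean shortest-path length, via BFS from every node."""
    total, count = 0, 0
    for src in g:
        dist = {src: 0}
        q = deque([src])
        while q:
            v = q.popleft()
            for w in g[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    q.append(w)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count

regular = ring_lattice(200, 2)
mixed = add_shortcuts(ring_lattice(200, 2), 40)
print(avg_path_length(regular))  # high: path length grows linearly with n
print(avg_path_length(mixed))    # much lower after a few random links
```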

SLIDE 73

routing table

n = 10^9, R = 10,000

SLIDE 74

incentives, fairness: prevent free-riding

  • local disk space
  • online time
  • upload bandwidth

SLIDE 75
  • online storage = local disk space * online time

example: 10 GB disk space, 70% online --> 7 GB; we have different mechanisms to measure and check these two variables
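The formula as a one-line sketch. The 17% minimum online time from the next slide is folded in as an assumption (the talk does not say whether nodes below the minimum earn zero), and the real measurement mechanisms are not modeled:

```python
MIN_ONLINE_FRACTION = 0.17   # must be online at least 17% of the time (slide 76)

def earned_online_storage(disk_gb: float, online_fraction: float) -> float:
    """online storage = local disk space * online time.
    Assumption: nodes below the minimum online time earn nothing."""
    if online_fraction < MIN_ONLINE_FRACTION:
        return 0.0
    return disk_gb * online_fraction

print(round(earned_online_storage(10, 0.70), 2))  # 7.0 (the slide's example)
print(earned_online_storage(10, 0.10))            # 0.0 (not online enough)
```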

SLIDE 76

trading storage

only if you want to (you start with 1 GB)

you must be online at least 17% of the time (~4 hours a day, running average); storage can be earned on multiple computers

SLIDE 77

upload bandwidth: the more upload bandwidth you provide, the more download bandwidth you get

SLIDE 78

“client” vs. storage node: asymmetric interest, so tit-for-tat doesn’t work :-( believe the software? it can be hacked (KaZaA Lite) :-(

SLIDE 79

distributed reputation system that is not susceptible to false reports and other forms of cheating (Havelaar, NetEcon 2006)

must scale well with the number of transactions; we have lots of small transactions due to erasure coding

SLIDE 80

Havelaar, NetEcon 2006

  • 1. lots of transactions yield “observations”

SLIDE 81

  • 2. every round (e.g., a week), send observations to pre-determined neighbors (hash code)

SLIDE 82

  • 3. discard ego-reports, take the median, etc.

SLIDE 83

  • 4. next round, aggregate

SLIDE 84

  • 5. update the reputation of storage nodes

SLIDE 85

rewarding: upload bandwidth proportional to reputation
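A rough sketch of the aggregation idea from the steps above (not the exact Havelaar protocol; the names and numbers are made up): each round, reporters submit the upload amounts they observed, ego-reports are discarded, and the median resists a few false reports:

```python
from statistics import median

def aggregate_round(reports: dict) -> dict:
    """One aggregation round: collect per-node observations from all
    reporters, discard ego-reports (a node vouching for itself), and
    take the median so a few false reports cannot skew the result."""
    per_node = {}
    for reporter, observed in reports.items():
        for node, amount in observed.items():
            if node == reporter:          # discard ego-reports
                continue
            per_node.setdefault(node, []).append(amount)
    return {node: median(vals) for node, vals in per_node.items()}

reports = {
    "alice": {"carol": 100, "dave": 50},
    "bob":   {"carol": 110, "dave": 55},
    "eve":   {"carol": 105, "eve": 9999},  # eve's self-report is ignored
}
print(aggregate_round(reports))  # {'carol': 105, 'dave': 52.5}
```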

SLIDE 86

Havelaar, NetEcon 2006

local approximation of contribution

SLIDE 87

“client” storage node

SLIDE 88

SLIDE 89

SLIDE 90

SLIDE 91

SLIDE 92

SLIDE 93

“client” storage node “flash crowd”

SLIDE 94

content distribution: similar to BitTorrent (tit-for-tat), with some differences due to erasure codes

SLIDE 95

encryption: 128-bit AES for encryption, 2048-bit RSA for authentication; all data is encrypted (files + metadata); all cryptographic operations are performed locally (i.e., on your computer)

SLIDE 96

access control: cryptographic tree structure; untrusted storage doesn’t reveal who has access; very efficient for typical operations (grant access, move, etc.)

Cryptree, SRDS 2006

SLIDE 97

Cryptree, SRDS 2006 (example folder tree: root → alice → videos → vacation → roadtrip.mpg, switzerland.mpg, europe.mpg)

SLIDE 98

Cryptree, SRDS 2006 (same folder tree); claire and bob both have access, but bob doesn’t see that claire also has access, and vice versa

SLIDE 99

Cryptree, SRDS 2006 (same folder tree); granting garfield access to a folder and all its subfolders takes just one operation: all subkeys can be derived from that parent key
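One way such top-down key derivation can work, as a minimal sketch with HMAC-SHA256. This is not Wuala’s actual Cryptree construction (which also supports moves and revocation); the key values and folder names are made up:

```python
import hmac
import hashlib

def child_key(parent_key: bytes, child_name: str) -> bytes:
    """Derive a subfolder/file key from its parent folder key.
    Knowing a folder's key lets you re-derive keys for the whole
    subtree, so granting a subtree takes just one operation."""
    return hmac.new(parent_key, child_name.encode(), hashlib.sha256).digest()

root = b"\x00" * 32                      # hypothetical root key
videos = child_key(child_key(root, "alice"), "videos")
vacation = child_key(videos, "vacation")
roadtrip = child_key(vacation, "roadtrip.mpg")

# handing garfield the `videos` key is enough: he can derive every subkey
assert child_key(child_key(videos, "vacation"), "roadtrip.mpg") == roadtrip
# but a child key reveals nothing about its parent (HMAC is one-way),
# so access never leaks upward in the tree
```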

SLIDE 100

demo

SLIDE 101

SLIDE 102

Invitation for the closed alpha

  • 1. http://download.wua.la
  • 2. Run the installer
  • 3. Enter your invitation code:

CERN