Cooperative Data Backup for Mobile Devices, Ludovic Courtès

slide-1
SLIDE 1

Cooperative Data Backup for Mobile Devices

Ludovic Courtès

Advisors: David Powell, Marc-Olivier Killijian

23 November 2007

slide-2
SLIDE 2

Context

Thesis at LAAS-CNRS, Dependable & Fault-Tolerant Computing Team

The MoSAIC Project

  • 3-year project started in Sept. 2004
  • French national program: IRISA, Eurecom and LAAS-CNRS

The Hidenets Project

  • 3-year EU IST project, FP6, started in Jan. 2006
  • resilience for mobility-aware services

⇒ Improve Mobile Device Data Availability

slide-3
SLIDE 3


  • Introduction
  • Redundancy Management
  • Storage Mechanisms
  • Secure Cooperation
  • Implementation
  • Conclusions
slide-4
SLIDE 4


Introduction > Redundancy Management > Storage Mechanisms > Secure Cooperation > Implementation > Conclusions

Problem Statement

Mobile Devices are Subject to Damage, Loss, etc.

Typical Data Backup Techniques…

  • “synchronization” between mobile device and desktop machine
  • requires access to desktop machine

… Are Constraining or Costly

  • only intermittent access to one’s desktop machine
  • potentially costly communications (e.g., GPRS, UMTS)

⇒ Backup opportunities are rare, data is at risk

slide-5
SLIDE 5


A Cooperative Approach to Data Backup

Key Ideas

  • leverage computing device ubiquity
  • opportunistic replication to neighboring devices
  • … using wireless ad hoc networking (Wi-Fi, Bluetooth)

Salient Points

  • adapted to intermittent connectivity scenarios
  • continuous backup & replication
  • protected against common-mode failures
slide-6
SLIDE 6


Cooperative Backup and Recovery Processes

slide-11
SLIDE 11

Cooperative Backup and Recovery Processes

[Figure: backup and recovery processes, showing the data owner, contributors, the Internet, and an Internet store]

slide-14
SLIDE 14


Storage Challenges

Unpredictable Connection Encounters & Lifetime

  ⇒ limited transfer size
  ⇒ data must be fragmented
  ⇒ data blocks are scattered

Limited Resources

  • minimize storage cost
  • optimize energy consumption
slide-15
SLIDE 15


Security Challenges

Trustworthy Data Storage

  • ensure data confidentiality, integrity, authenticity
  • provide appropriate backup redundancy

Secure Cooperation

  • participants have no a priori trust relationship
  • participants are mutually suspicious
  • protect against Denial-of-Service attacks
slide-16
SLIDE 16


Related Work

Peer-to-Peer Backup

  • Pastiche, Samsara, PeerStore, etc.
  • on the Internet ⇒ different connectivity assumptions

Personal Area Network Cooperative Backup

  • Flashback
  • devices are mutually trusted

Persistent Stores for Sensor Networks (e.g., tinyPEDS)

Delay-Tolerant Networks

  • different evaluation criteria (e.g., delay)
  • usually assumes that devices are well-behaved
slide-17
SLIDE 17


Major Contributions of the Thesis

  • Definition of Dependability Goals & Backup Framework
  • Identification of Distributed Storage Requirements
  • Design of Core Security Mechanisms
  • Prototype Implementation of a Cooperative Backup Service
slide-18
SLIDE 18


  • Introduction
  • Redundancy Management
  • Storage Mechanisms
  • Secure Cooperation
  • Implementation
  • Conclusions
slide-19
SLIDE 19


Distributed Storage & Redundancy Management

slide-20
SLIDE 20


Improving Data Availability

Problem Statement

  • contributors may fail
  • contributors are not trusted

⇒ Need for Data Replication

Data Replication Goals

  • maximize storage efficiency…
  • … and data availability

Methodology

  • devise replication strategies
  • evaluate the efficiency/availability tradeoff
slide-21
SLIDE 21


A Simple Replication Strategy

Algorithm

  1. send a total of n copies of each data item
  2. send 1 copy per contributor
  3. recover from any 1 contributor out of n

Dependability & Storage Cost Analysis

  • tolerate f contributor faults ⇒ storage cost f + 1 times the input data size

Example: n = 3, f = 2

slide-24
SLIDE 24


Using Erasure Codes for Replication

Erasure Codes at a Glance

  • k-block input → n coded blocks, n > k
  • m blocks suffice to recover input data, k ≤ m < n
  • storage cost: S = n/k
  • k = 1 ⇔ simple replication

Optimal Codes

  • when m = k
  • notation: (n,k) code
  • n and k are user-defined parameters

Example: k = 4 source blocks, n = 6 coded blocks
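
As a minimal illustration of these definitions, a single XOR parity block gives an optimal (3,2) code: k = 2 source blocks are encoded into n = 3 coded blocks, and any 2 of them suffice to recover the input. This is only a sketch: the names `encode_3_2` and `decode_3_2` are made up for the example, and the slides do not say which codes the thesis actually used.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


def encode_3_2(b0: bytes, b1: bytes) -> list[bytes]:
    """Encode k = 2 source blocks into n = 3 coded blocks (2 data + 1 parity)."""
    return [b0, b1, xor_bytes(b0, b1)]


def decode_3_2(coded: dict[int, bytes]) -> tuple[bytes, bytes]:
    """Recover both source blocks from any 2 coded blocks, keyed by index 0..2."""
    if len(coded) < 2:
        raise ValueError("need at least k = 2 coded blocks")
    if 0 in coded and 1 in coded:
        return coded[0], coded[1]
    if 0 in coded:  # we hold b0 and the parity block
        return coded[0], xor_bytes(coded[0], coded[2])
    return xor_bytes(coded[1], coded[2]), coded[1]  # we hold b1 and the parity block


blocks = encode_3_2(b"abcd", b"wxyz")
# Lose any one of the three coded blocks: the input is still recoverable.
recovered = [
    decode_3_2({i: blk for i, blk in enumerate(blocks) if i != lost})
    for lost in range(3)
]
```

The storage cost here is n/k = 1.5 times the input size, against 2 times for keeping two plain copies, while both tolerate the loss of one contributor.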

slide-25
SLIDE 25


A Replication Strategy Using Erasure Codes

Algorithm

  1. (n,k) erasure coding → n coded blocks
  2. send 1 coded block per contributor
  3. recover from any k contributors out of n

Dependability & Storage Cost Analysis

  • tolerate f contributor faults ⇒ storage cost (k + f)/k times the input data size!

Example: n = 5, k = 3, f = n − k = 2

slide-28
SLIDE 28


Example Erasure Code-Based Strategies

Examples

  • (3,1) code: f = 2 failures tolerated; storage cost: S = 3
  • (5,3) code: f = 2 failures tolerated; storage cost: S = 1.67

[Plot: stretch factor S = (k + 2)/k as a function of parameter k, for 2 failures tolerated (n = k + 2)]
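
The figures in these examples can be checked with a few lines of code. The sketch below assumes an optimal (n,k) code, so that S = n/k and f = n − k; the function names are illustrative only, and the (12,10) code is an extra data point not taken from the slides.

```python
from fractions import Fraction


def stretch_factor(n: int, k: int) -> Fraction:
    """Storage cost S = n/k of an optimal (n,k) erasure code."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return Fraction(n, k)


def failures_tolerated(n: int, k: int) -> int:
    """Number f = n - k of contributor failures an optimal (n,k) code tolerates."""
    return n - k


for n, k in [(3, 1), (5, 3), (12, 10)]:
    s = stretch_factor(n, k)
    print(f"({n},{k}) code: f = {failures_tolerated(n, k)}, S = {float(s):.2f}")
```

With f fixed at 2, S = (k + 2)/k approaches 1 as k grows, which is the trend the plot shows.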

slide-29
SLIDE 29


Is The Choice of a Replication Strategy That Simple?

No!

Issues

  • “contributor failures tolerated” is not the right metric
  • erasure codes increase data scattering
  • … which affects data availability

⇒ Calls for a dependability evaluation of the strategies

slide-30
SLIDE 30


Markov Chain Modeling the Backup Process

Modeling Parameters (stochastic processes with exponential distributions)

  • α: contributor encounter rate
  • β: Internet connection rate
  • λ: failure rate (crash faults)

Characteristics (simple replication)

  • data safe ⇒ 1 replica reached the Internet
  • data lost ⇒ contributors + owner failed

[Figure: Markov chain with states “data lost”, …, x − 1, x replicas available, x + 1, …, “data safe”; transition rates α, λ, xλ, xβ, (x + 1)β]

slide-31
SLIDE 31


Modeling the Full Replication Strategy

The replication process with n = 2 and k = 1…

[Figure: Markov chain whose states are labeled a/b (a: available copies, b: copies yet to be disseminated): 1/2, 2/1, 3/0, 1/1, 2/0, 1/0, plus the states OK (data on the Internet) and KO (data definitely lost); transitions are labeled with rates α, multiples of λ, and multiples of β]

slide-32
SLIDE 32


Modeling the Full Replication Strategy

The replication process with n = 3 and k = 2…

[Figure: state graph of the replication process for n = 3 and k = 2; states are labeled alive(i) a/b, alive/endangered, dead a/b, ok and ko; transitions are labeled A0, B, B0, L, L0 and their multiples]

slide-33
SLIDE 33


Modeling the Full Replication Strategy

The replication process with n = 5 and k = 3…

[Figure: state graph of the replication process for n = 5 and k = 3; states are labeled alive(i) a/b, alive/endangered, dead a/b, ok and ko; transitions are labeled A0, B, B0, L, L0 and their multiples]
slide-34
SLIDE 34


Modeling the Full Replication Strategy

Generalized Stochastic Petri Net (GSPN), for any n and k…

[Figure: GSPN with nodes labeled n, owner, owner meets a contributor, FC, OU, OD, MF, SF, DL and DS; rates α, λ0, β0, m(MF)β and m(MF)λ; conditions m(SF) + m(MF) < k and m(SF) ≥ k; L ≡ (m(DL) = 0) ∧ (m(DS) = 0)]

slide-35
SLIDE 35


Dependability Measurements

Measurements

  • PL: probability of data loss
  • LRF: data loss reduction factor ⇒ PL compared to non-cooperative backup

The Non-Cooperative Backup Scenario

  • only one device ⇔ α = 0
  • either fails or connects to the Internet
  • PLref = λ / (λ + β) ⇒ LRF = PLref / PL

[Figure: Markov chain with states “1 copy”, KO (reached at rate λ) and OK (reached at rate β)]
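
The reference value follows from a race between two exponential events: a device that fails at rate λ and connects to the Internet at rate β loses its data with probability λ/(λ + β). A minimal sketch of the two measurements, with hypothetical function names and made-up rates; the loss probability PL of a cooperative strategy has to come from the Markov or GSPN model and is simply passed in here.

```python
def pl_ref(lam: float, beta: float) -> float:
    """Probability of data loss without cooperation: the failure (rate lam)
    occurs before the Internet connection (rate beta)."""
    if lam < 0 or beta < 0 or lam + beta == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return lam / (lam + beta)


def loss_reduction_factor(pl: float, lam: float, beta: float) -> float:
    """LRF = PLref / PL, where pl is the loss probability of the evaluated strategy."""
    if pl <= 0:
        raise ValueError("pl must be positive")
    return pl_ref(lam, beta) / pl


# Made-up rates: Internet connections are 3 times more frequent than failures.
reference = pl_ref(lam=1.0, beta=3.0)
# Hypothetical strategy that achieves PL = 0.125 under the same rates.
lrf = loss_reduction_factor(0.125, lam=1.0, beta=3.0)
```

An LRF above 1 means the evaluated strategy loses data less often than the non-cooperative baseline.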

slide-36
SLIDE 36


Data Loss Reduction Factor vs. Environment Parameters

[3D plot: loss reduction factor (LRF) as a function of participant effectiveness (β/λ) and connectivity ratio (α/β), all on logarithmic scales from 1 to 100000, for n = 3, k = 2; α: contributor encounter rate, β: Internet connection rate, λ: device failure rate]

slide-37
SLIDE 37

Data Loss Reduction Factor vs. Environment Parameters

Cooperative backup approach useless when α/β < 1

slide-38
SLIDE 38

Data Loss Reduction Factor vs. Environment Parameters

Cooperative approach beneficial when: α/β > 10 and β/λ > 2

slide-39
SLIDE 39

Data Loss Reduction Factor vs. Environment Parameters

Data loss probability decreased by up to α/β

slide-40
SLIDE 40

Erasure-Code-Based Strategies vs. Simple Replication

[3D plot: loss reduction factor (LRF) as a function of participant effectiveness (β/λ) and connectivity ratio (α/β), comparing simple replication (2,1) with the (4,2), (6,3) and (8,4) erasure codes; α: contributor encounter rate, β: Internet connection rate, λ: device failure rate]

slide-41
SLIDE 41

Erasure-Code-Based Strategies vs. Simple Replication

Erasure Code Strategies are Rarely Beneficial!

slide-42
SLIDE 42


  • Introduction
  • Redundancy Management
  • Storage Mechanisms
  • Secure Cooperation
  • Implementation
  • Conclusions
slide-43
SLIDE 43


Storage Mechanisms

slide-44
SLIDE 44


Constraints Imposed on the Storage Layer

Lack of Trust Among Participants

  • replicate data fragments
  • enforce data confidentiality, verify integrity & authenticity

Scarce Resources (energy, storage, CPU)

  • maximize storage efficiency
  • but avoid CPU-intensive techniques (compression, encryption)

Short-lived and Unpredictable Encounters

  • fragment data into small blocks
  • disseminate them among contributors
slide-45
SLIDE 45


Providing Data Confidentiality, Integrity, and Authenticity

Enforcing Confidentiality

  • encrypt both data & meta-data
  • use energy-economic algorithms (e.g., symmetric encryption)

Allowing for Integrity Checks Against Accidental & Malicious Modifications

  • store cryptographic hashes of (meta-)data blocks (e.g., SHA1)
  • use hashes as a block naming scheme (content-based indexing)
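
Content-based indexing can be sketched as a block store in which each block is named by its hash, so that a retrieved block can be checked against its own name. The `BlockStore` class below is hypothetical and is not libchop's API. SHA-1 is used because the slide names it; collisions have since been demonstrated for SHA-1, so a newer hash such as SHA-256 would be the usual choice today.

```python
import hashlib


class BlockStore:
    """Toy content-addressed block store: each block is named by its SHA-1 hash."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, block: bytes) -> str:
        """Store a block and return its name (the hex digest of its content)."""
        name = hashlib.sha1(block).hexdigest()
        self._blocks[name] = block
        return name

    def get(self, name: str) -> bytes:
        """Fetch a block, verifying that its content still hashes to its name."""
        block = self._blocks[name]
        if hashlib.sha1(block).hexdigest() != name:
            raise ValueError(f"block {name} was modified")
        return block


store = BlockStore()
name = store.put(b"some (meta-)data block")
intact = store.get(name)

# Simulate a contributor tampering with the stored block.
store._blocks[name] = b"tampered"
try:
    store.get(name)
    tamper_detected = False
except ValueError:
    tamper_detected = True
```

Because the name is derived from the content, the data owner only needs to keep (or sign) the names to detect both accidental and malicious modifications.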

Allowing for Authenticity Checks

  • cryptographically sign (part of) the meta-data
slide-46
SLIDE 46


Maximizing Storage Efficiency

Techniques Rejected

  • differential compression: CPU- and memory-intensive, weakens data availability
  • lossy compression: too specific (image, sound, etc.)

Eligible Techniques

  • single-instance storage
  • generic lossless compression (e.g., gzip)
  • content-defined input chopping (U. Manber, 1994)

⇒ Storage efficiency? Computational cost?
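
Two of the eligible techniques combine naturally: single-instance storage keeps one copy of identical blocks, and each stored block is compressed losslessly. A minimal sketch, assuming blocks are keyed by their SHA-1 hash and compressed with zlib; `store_blocks` is an illustrative name, not part of the thesis software.

```python
import hashlib
import zlib


def store_blocks(blocks: list[bytes]) -> dict[str, bytes]:
    """Single-instance storage with per-block zlib compression:
    identical blocks are stored once, keyed by their SHA-1 hash."""
    stored: dict[str, bytes] = {}
    for block in blocks:
        key = hashlib.sha1(block).hexdigest()
        if key not in stored:  # a duplicate block costs nothing extra
            stored[key] = zlib.compress(block)
    return stored


blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096]  # first and last are identical
stored = store_blocks(blocks)

input_size = sum(len(b) for b in blocks)
stored_size = sum(len(c) for c in stored.values())
ratio = stored_size / input_size  # lower is better

# Every input block can be restored from the store through its hash.
restored = [zlib.decompress(stored[hashlib.sha1(b).hexdigest()]) for b in blocks]
```

The ratio measured this way is one possible definition of storage efficiency; the CPU time spent in `zlib.compress` is the computational cost it is traded against.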

slide-47
SLIDE 47


Chopping Data Into Small Blocks

Natural Solution: Fixed-Size Blocks

  • similar data streams might yield identical blocks

Finding More Similarities Using Content-Based Chopping

  • better identifies identical blocks among different data streams
  • can help improve storage efficiency
  • see U. Manber, Finding Similar Files in a Large File System, 1994

⇒ Which one improves storage efficiency? Under what circumstances?
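
The difference between the two chopping methods shows up when a byte is inserted at the start of a stream: every fixed-size block shifts, while content-defined boundaries move with the data. The sketch below is an assumption-laden toy: Adler-32 over a 16-byte window stands in for the fingerprinting scheme of the cited paper, the window is recomputed from scratch rather than updated incrementally, and `WINDOW` and `DIVISOR` are arbitrary values.

```python
import random
import zlib

WINDOW = 16   # bytes examined to decide on a boundary (assumed value)
DIVISOR = 64  # roughly the expected average block size in bytes (assumed value)


def fixed_size_chop(data: bytes, size: int = 64) -> list[bytes]:
    """Cut data every `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chop(data: bytes) -> list[bytes]:
    """Cut data wherever the fingerprint of the last WINDOW bytes hits a fixed
    value, so boundaries depend on content rather than on absolute offsets."""
    blocks, start = [], 0
    for end in range(WINDOW, len(data)):
        fingerprint = zlib.adler32(data[end - WINDOW:end])
        if fingerprint % DIVISOR == 0:
            blocks.append(data[start:end])
            start = end
    blocks.append(data[start:])
    return blocks


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20000))
edited = b"X" + original  # one byte inserted at the front

shared_fixed = set(fixed_size_chop(original)) & set(fixed_size_chop(edited))
shared_cdc = set(content_defined_chop(original)) & set(content_defined_chop(edited))
print(len(shared_fixed), "fixed-size blocks shared,", len(shared_cdc), "content-defined blocks shared")
```

With single-instance storage, the shared blocks are the ones that do not need to be stored or sent again, which is why content-based chopping can help when similar streams differ by insertions or deletions.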

slide-48
SLIDE 48


Storage Efficiency & Computational Cost Evaluation

Measurements

  • storage efficiency
  • computational cost (throughput)
  • … for different combinations of algorithms

File Sets

  • single mailbox file (low entropy)
  • source code of a program, several versions (low entropy, high redundancy)
  • compressed audio files (high entropy, hardly compressible)

Our Storage Layer Implementation: libchop

slide-49
SLIDE 49

Main Results

[Plot: compression ratio vs. throughput (B/tu) for the mailbox file, for the lossless compressors lzo, bzip2 and zlib and several combinations of storage techniques]

slide-50
SLIDE 50

Main Results

[Plot: compression ratio vs. throughput (B/tu) for the mailbox file, for the lossless compressors lzo, bzip2 and zlib; combinations of storage techniques shown: fixed-size chopping + input zipping, fixed-size chopping + block zipping, content-defined chopping with no zipping, content-defined chopping + block zipping]

slide-51
SLIDE 51

Main Results

[Same plot, highlighting two combinations: fixed-size chopping + input zipping, and content-defined chopping + block zipping]

slide-52
SLIDE 52


Main Results

Proposed a Storage Layer

  • compression + block indexing + meta-data + keyed block store
  • prototype implementation

Conducted an Experimental Evaluation

  • identified compression with best CPU/storage tradeoff
  • using single-instance storage + lossless compression + fixed-size chopping

Limitations of the Evaluation

  • results depend on our file sets
  • mapping to energy consumption?
slide-53
SLIDE 53

53

  • Introduction
  • Redundancy Management
  • Storage Mechanisms
  • Secure Cooperation
  • Implementation
  • Conclusions
slide-54
SLIDE 54


Secure Cooperation

slide-56
SLIDE 56


Security Threats

Threats to Data Availability, Confidentiality, Integrity, etc.

  ⇒ addressed at the storage layer
  ⇒ using encryption, cryptographic hashes, etc.

Threats to Service Availability

  ⇒ must be addressed elsewhere

slide-57
SLIDE 57


Security Threats: Denial-of-Service Attacks

Data Retention

  • contributor does not give data back

Flooding

  • contributor exhausts storage resources

Selfishness

  • device does not contribute in return
  • so-called “tragedy of the commons”

⇒ DoS Attacks Threaten Effective Cooperation

slide-58
SLIDE 58


Design Goals

Decentralization, Self-Organization

  • allow for open participation
  • no central identity management
  • no single point of trust/failure

Policy-Neutral

  • separate mechanisms from policy
  • provide a minimal set of mechanisms
  • no policy enforcement
  • users are free to choose when/how they cooperate
slide-59
SLIDE 59


Accountability

A Prerequisite for Cooperation

  • answers to: how much storage does Bob use? Does he do a good job?
  • participants must be held accountable for their actions
  • user policies can then take appropriate actions

⇒ Technical Requirements:

  • must be able to designate devices
  • actions taken must be bound to their responsible device
slide-60
SLIDE 60


Device Designation

Decentralized Designation

  • no central identity management
  • users can create their name/designator

Unique Device Designators

  • device names must be (statistically) unique

Verifiable Designators

  • authentication of name ownership
  • prevent “spoofing”
slide-61
SLIDE 61

Device Designation

[Cartoon dialogue: “Uh, really?!” / “I’m Friendly Charly!”]

slide-62
SLIDE 62

Sybil Attacks

[Cartoon dialogue: “What’s your name?” / “I’m Bob.”]

slide-63
SLIDE 63

Sybil Attacks

[Cartoon dialogue: “Bob the nasty guy! Go away!” / “...”]

slide-64
SLIDE 64

Sybil Attacks

[Cartoon dialogue: “Well, no, I’m Charly.” / “OK then, use my storage, Charly.”]

slide-65
SLIDE 65


Sybil Attacks

Description

  1. participant misbehaves
  2. then creates a new name
  3. … and is now considered a (harmless) stranger

Impact in our Context

  • few resources available in one’s physical neighborhood

Must Provide an Incentive to not Change Names

  • e.g., offer few resources to strangers

⇒ Calls for Higher-Level Cooperation Policies


slide-66
SLIDE 66


Cooperation Policies

Definition

  • set of rules defining when/how to cooperate
  • provide a disincentive for non-cooperation (e.g., “punishment”)

Examples

  • white list: only cooperate with devices listed
  • reputation-like system: based on records of past behavior

⇒ Built on Top of the Accounting Mechanisms
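
The two example policies can be sketched as objects that decide how much storage to grant a given device. Everything here is hypothetical: the class names, the quota values and the scoring rule are illustrative, and device names are plain strings where the slides' design would use verifiable designators.

```python
from dataclasses import dataclass, field


@dataclass
class WhiteListPolicy:
    """Only cooperate with explicitly listed devices."""
    allowed: set[str] = field(default_factory=set)

    def quota(self, device: str) -> int:
        """Bytes of storage granted to `device` (made-up figure)."""
        return 1_000_000 if device in self.allowed else 0


@dataclass
class ReputationPolicy:
    """Grant storage according to records of past behavior. Strangers get
    little, which reduces the benefit of switching to a fresh name."""
    stranger_quota: int = 10_000
    records: dict[str, int] = field(default_factory=dict)  # device -> score

    def record(self, device: str, delta: int) -> None:
        """Account for an observed action: positive for good, negative for bad."""
        self.records[device] = self.records.get(device, 0) + delta

    def quota(self, device: str) -> int:
        score = self.records.get(device, 0)
        if score < 0:
            return 0  # "punishment" for misbehavior
        return self.stranger_quota * (1 + score)


policy = ReputationPolicy()
policy.record("bob", -1)    # Bob did not give data back
policy.record("alice", +3)  # Alice served three restore requests
quotas = {name: policy.quota(name) for name in ("alice", "bob", "charly")}
```

Here "charly" is a stranger and receives only the small default quota, which is the kind of incentive against name changes mentioned for Sybil attacks.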

slide-67
SLIDE 67


Implementation Considerations

Device Designation: OpenPGP Public Key

Communications: Transport Layer Security (TLS, RFC 4346)

  • OpenPGP certificate-based authentication
  • supports message integrity/authenticity checks (HMACs)
  • binds messages to keys

Usage

  1. generate OpenPGP key pair
  2. TLS handshake with mutual authentication
  3. cooperate…

slide-68
SLIDE 68


Main Results

Policy-Neutral, Decentralized Core Mechanisms

  • support accountability through verifiable device designation
  • … and strong binding between a device’s name and actions
  • building block for various cooperation policies
  • users are free to choose their policy

Limitations

  • did not address the choice of a cooperation policy
  • did not evaluate actual cooperation policies
  • “openness” hinders formal reasoning about device interactions
slide-69
SLIDE 69


  • Introduction
  • Redundancy Management
  • Storage Mechanisms
  • Secure Cooperation
  • Implementation
  • Conclusions
slide-70
SLIDE 70


Implementation


slide-71
SLIDE 71


Overview

[Figure: client tools (browse versions, backup, restore) and the backup daemon]

slide-72
SLIDE 72


Backup Daemon Architecture

[Figure: backup daemon architecture. Client tools: browse versions, backup, restore. Backup daemon components: backup (directories, versioning), opportunistic replication, storage contribution, user-defined cooperation & replication policies, storage layer (libchop), service discovery (Avahi), authentication (GnuTLS). Underlying layers: file system, networking]

slide-73
SLIDE 73


Experimental Evaluation

Methodology

  • several daemons running on one machine ⇒ avoid effects of network variability
  • pre-defined scenarios
  • synthetic workload (file sets, etc.)
  • varying data block sizes

Experiments

  • measure data preparation + replication time
  • measure retrieval time
slide-74
SLIDE 74


Data Preparation & Replication Time

[Plot: time (seconds) spent in file indexing and in replication, and block count, as a function of block size (1024, 2048, 4096 and 8192 B)]

slide-75
SLIDE 75

Data Preparation & Replication Time

Replication time dominates the backup process

Execution time is inversely proportional to the block size

slide-76
SLIDE 76


Main Results

Practical Implementation of the Cooperative Backup

  • uses our storage + security framework
  • … adds versioning + file system meta-data handling
  • + opportunistic replication, data retrieval, storage contribution
  • ported to a mobile platform (Nokia N770/N800)

Limitations

  • lacks data forwarding (e.g., to Internet store)
  • performance bottlenecks
  • evaluation does not account for wireless networking variability
slide-77
SLIDE 77


  • Introduction
  • Redundancy Management
  • Storage Mechanisms
  • Secure Cooperation
  • Implementation
  • Conclusions
slide-78
SLIDE 78


Major Contributions of the Thesis

  • Definition of Dependability Goals & Backup Framework ⇒ analytical dependability evaluation of replication strategies
  • Identification of Distributed Storage Requirements ⇒ experimental evaluation of storage/compression techniques
  • Design of Core Security Mechanisms ⇒ qualitative protocol evaluation
  • Prototype Implementation of a Cooperative Backup Service ⇒ experimental performance evaluation

slide-79
SLIDE 79


Perspectives

Analytical Dependability Evaluation

  • evaluate rate-less codes
  • extend model with input chopping

Distributed Storage Mechanisms

  • experiment with erasure coding as a chopping technique

Secure Cooperation

  • analytically evaluate cooperation policies

Prototype Implementation

  • implement/evaluate more cooperation & replication policies
slide-80
SLIDE 80


Thank You!

Questions?

http://www.laas.fr/mosaic/
http://www.hidenets.aau.dk/