Hierarchical Codes: How to Make Erasure Codes Attractive for Peer-to-Peer Storage Systems


Hierarchical Codes: How to Make Erasure Codes Attractive for Peer-to-Peer Storage Systems Alessandro Duminuco and Ernst Biersack EURECOM Sophia Antipolis, France (Best paper award in P2P'08) Presented by: Amir H. Payberah amir@sics.se


SLIDE 1

Hierarchical Code, 11 Nov. 2008

Hierarchical Codes: How to Make Erasure Codes Attractive for Peer-to-Peer Storage Systems

Alessandro Duminuco and Ernst Biersack EURECOM Sophia Antipolis, France (Best paper award in P2P'08) Presented by: Amir H. Payberah

amir@sics.se

SLIDE 2

What's The Problem?

SLIDE 3

What Is The Problem?

Does file backup fit the P2P model?

SLIDE 4

Churn and Redundancy

  • The challenge in the P2P model is to provide storage reliability under churn.
  • The key solution is to add redundancy to the data.

SLIDE 5

The Basic Solution: Replication

  • With 4 replicas, even if 3 peers are offline we still have the file.
  • Every file consumes 4 times its size in storage.


SLIDE 6

A Better Solution: Coding

  • The file is encoded into k + h fragments such that any k of them are sufficient to reconstruct it.
  • We can sustain any h losses.
  • Every file consumes (k+h)/k times its size in storage.
  • If k = 6 and h = 3, (k+h)/k = 1.5, instead of 4.

[Figure: the file is split into k fragments, coded into k + h fragments, and disseminated.]

SLIDE 7

Repair Communication Cost

  • Replication
  • Coding: a repair means combining k fragments into a new one.
  • To create a single fragment we must transfer k fragments, i.e. the size-equivalent of the whole file.

SLIDE 8

Storage vs Repair Cost

  • If we want to sustain 10 losses:

Repair Cost makes coding unattractive.

SLIDE 9

Motivation

Can we mitigate the repair cost of coding while retaining storage efficiency?

SLIDE 10

Efficiency Metrics

  • Redundancy factor
  • β = |S| / |O|
  • |S|: size of the stored data.
  • |O|: size of the original data.
  • Repair degree
  • The amount of data read with respect to the amount of new redundant data created.
  • Denoted as d.

SLIDE 11

Efficiency Analysis

  • Replication
  • β = R
  • d = 1
  • Block replication
  • β = R
  • d = 1
  • Erasure codes
  • β = (k + h) / k
  • d = k
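
The two metrics above can be tabulated with a few lines of Python. This is only a minimal sketch: the function names are illustrative assumptions, and the parameter values are the ones quoted on the earlier slides (4 replicas; k = 6, h = 3).

```python
from fractions import Fraction

def replication(R):
    """Return (beta, d) for R-fold replication: beta = R, d = 1."""
    return Fraction(R), 1

def erasure_code(k, h):
    """Return (beta, d) for a (k,h) erasure code: beta = (k + h) / k, d = k."""
    return Fraction(k + h, k), k

beta_rep, d_rep = replication(4)
beta_ec, d_ec = erasure_code(6, 3)
print(f"replication R=4 : beta = {float(beta_rep):.2f}, d = {d_rep}")
print(f"coding k=6, h=3 : beta = {float(beta_ec):.2f}, d = {d_ec}")
```

The output makes the trade-off explicit: coding stores less (β = 1.5 instead of 4) but reads k fragments per repair instead of 1.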

SLIDE 12

Linear Codes

  • A specific implementation of erasure codes.
  • f_i: i-th original fragment
  • b_i: i-th stored fragment
  • c_{i,j}: coefficients
  • Any 4 of these 7 fragments can reconstruct the original file if the coefficients are linearly independent (more on this later).
  • Repair degree d = k

$$ b_i = \begin{cases} f_i & i \le k \\ \sum_{j=1}^{k} c_{i,j} f_j & k < i \le k + h \end{cases} $$

[Figure: (k,h)-code with k = 4 fragments f1..f4 encoded into k + h = 7 stored fragments b1..b7.]

SLIDE 13

Hierarchical Code

  • Additional fragments can be linear combinations of a subset of the original ones.
  • Not all the subsets of 4 fragments are sufficient to reconstruct the file.
  • The repair cost varies according to the particular fragments that are available (we can have d < k).

[Figure: fragments f1..f4 (k) encoded into b1..b7 (k + h).]

SLIDE 14

Comparison

[Figure: two code graphs side by side, each mapping fragments f1..f4 (k) to b1..b7 (k + h).]

SLIDE 15

Generalizing The Concept

  • If we take a 64+64 traditional linear code and apply the same idea hierarchically...
  • If we set the hierarchy differently, we obtain a different trade-off.

SLIDE 16

Experiments

SLIDE 17

Synthetic Data

  • An event-driven simulator.
  • They compared a 64+64 Reed-Solomon code (a linear code) with one instance of a 64+64 Hierarchical code.
  • They generated synthetic peer behavior with exponentially distributed uptimes, downtimes and lifetimes.
  • As a general rule, the smaller the up-ratio, the higher the number of repairs.
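
As a rough illustration of what such synthetic behavior could look like, the sketch below draws one peer's online sessions from exponential distributions. It is not the authors' simulator: the function name, the mean values (in hours) and the reading of "up-ratio" as the fraction of the lifetime spent online are all assumptions made for this example.

```python
import random

def peer_sessions(mean_up, mean_down, mean_life, rng):
    """Return (lifetime, sessions) for one synthetic peer.

    sessions is a list of (start, end) online intervals. Uptimes, downtimes
    and the lifetime are exponentially distributed; the peer leaves for good
    once its lifetime expires.
    """
    lifetime = rng.expovariate(1.0 / mean_life)
    t, sessions = 0.0, []
    while t < lifetime:
        end = min(t + rng.expovariate(1.0 / mean_up), lifetime)
        sessions.append((t, end))
        # The next session starts after an exponentially distributed downtime.
        t = end + rng.expovariate(1.0 / mean_down)
    return lifetime, sessions

rng = random.Random(42)  # fixed seed so the run is reproducible
lifetime, sessions = peer_sessions(mean_up=8.0, mean_down=16.0,
                                   mean_life=1000.0, rng=rng)
online = sum(end - start for start, end in sessions)
print(f"lifetime = {lifetime:.1f} h, sessions = {len(sessions)}, "
      f"up-ratio = {online / lifetime:.2f}")
```

Lowering `mean_up` relative to `mean_down` reduces the up-ratio, which is the regime where the slides expect more repairs.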

SLIDE 18

Synthetic Data Results

SLIDE 19

Real Data

  • PlanetLab traces consist of 669 nodes monitored for 500 days.
  • KAD traces consist of the availability of about 6500 peers in the KAD network for about 5 months.

SLIDE 20

Real Data Results

  • PlanetLab
  • KAD

SLIDE 21

Conclusion

SLIDE 22

Conclusion

  • They proposed a new class of erasure codes called Hierarchical Codes.
  • They aim at coupling the communication efficiency of replication with the storage efficiency of coding.
  • Experiments showed that Hierarchical Codes require more repairs, but those repairs are so cheap that the resulting communication cost is smaller.

SLIDE 23

More Detail About Coding

SLIDE 24

Linear Codes

  • f_i: i-th original fragment
  • b_i: i-th stored fragment
  • c_{i,j}: coefficients
  • Any 4 of these 7 fragments can reconstruct the original file if the coefficients are linearly independent.

$$ b_i = \begin{cases} f_i & i \le k \\ \sum_{j=1}^{k} c_{i,j} f_j & k < i \le k + h \end{cases} $$

[Figure: k = 4 fragments f1..f4 encoded into k + h = 7 stored fragments b1..b7.]

SLIDE 25

Linear Codes

  • If every sub-matrix S built from k rows of C' is invertible, then the original fragments can always be reconstructed as F = S^{-1} B_S.
  • B_S: the k-long subvector of B corresponding to the rows of coefficients chosen in S.
  • If this property is satisfied, the code obtained is a (k,h)-code.

$$ B = C' F, \qquad F = S^{-1} B_S $$
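
The invertibility property can be checked by brute force for small codes. The sketch below is an illustration under simplifying assumptions: it works over the rationals (via `Fraction`) rather than a Galois field, the helper names are hypothetical, and the `bad` matrix is an invented counter-example.

```python
from fractions import Fraction
from itertools import combinations

def is_invertible(rows):
    """Gaussian elimination over the rationals; True if the square matrix has full rank."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return False  # no pivot in this column: the matrix is singular
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return True

def is_kh_code(c_prime, k):
    """True if every sub-matrix S built from k rows of C' is invertible."""
    return all(is_invertible(s) for s in combinations(c_prime, k))

# C' = [I; C] for k = 2, h = 3 with redundancy rows (1,1), (1,2), (1,3).
good = [[1, 0], [0, 1], [1, 1], [1, 2], [1, 3]]
# Hypothetical bad choice: the last row depends on f1 only, like the first row.
bad = [[1, 0], [0, 1], [1, 1], [1, 2], [2, 0]]
print(is_kh_code(good, 2), is_kh_code(bad, 2))  # True False
```

The check enumerates all C(k+h, k) sub-matrices, so it is only practical for toy parameters.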

SLIDE 26

Coefficient Matrix

  • Reed-Solomon Codes
  • Random Linear Codes

SLIDE 27

Reed-Solomon

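  • I_{k,k}: identity matrix.
  • C_{h,k}: coefficient matrix.
  • If k = 2 and h = 3:

$$ B = C' F = \begin{pmatrix} I \\ C \end{pmatrix} F = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ c_{1,1} & c_{1,2} \\ c_{2,1} & c_{2,2} \\ c_{3,1} & c_{3,2} \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \\ c_{1,1} f_1 + c_{1,2} f_2 \\ c_{2,1} f_1 + c_{2,2} f_2 \\ c_{3,1} f_1 + c_{3,2} f_2 \end{pmatrix} $$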

SLIDE 28

Reed-Solomon Codes

  • Define the matrix C as an h × k Vandermonde matrix.
  • c_{i,j} = a_i^{j-1}

SLIDE 29

Reed-Solomon Codes

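$$ B = C' F = \begin{pmatrix} I \\ C \end{pmatrix} F = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \\ f_1 + f_2 \\ f_1 + 2 f_2 \\ f_1 + 3 f_2 \end{pmatrix} $$

  • k = 2
  • h = 3
  • c_{i,j} = i^{j-1}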

SLIDE 30

Reed-Solomon Codes

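$$ B = C' F = \begin{pmatrix} I \\ C \end{pmatrix} F = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \\ f_1 + f_2 \\ f_1 + 2 f_2 \\ f_1 + 3 f_2 \end{pmatrix} $$

Selecting the rows of b_1 and b_5:

$$ S = \begin{pmatrix} 1 & 0 \\ 1 & 3 \end{pmatrix}, \qquad S^{-1} = \begin{pmatrix} 1 & 0 \\ -1/3 & 1/3 \end{pmatrix}, \qquad B_S = \begin{pmatrix} f_1 \\ f_1 + 3 f_2 \end{pmatrix} $$

$$ S^{-1} B_S = \begin{pmatrix} 1 & 0 \\ -1/3 & 1/3 \end{pmatrix} \begin{pmatrix} f_1 \\ f_1 + 3 f_2 \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} = F $$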
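
The same worked example can be replayed numerically. The sketch below is a toy under stated assumptions: each fragment is a single rational number instead of a block of Galois-field symbols, the fragment values 5 and 7 are made up, and the helper names are hypothetical. The coefficient rows are (1,1), (1,2), (1,3), as in the matrix above.

```python
from fractions import Fraction

K, H = 2, 3

def coefficient_matrix(k, h):
    """C' = [I; C], where C is Vandermonde with c[i][j] = i**(j-1) (1-based i, j)."""
    identity = [[Fraction(int(r == c)) for c in range(k)] for r in range(k)]
    vandermonde = [[Fraction(i) ** (j - 1) for j in range(1, k + 1)]
                   for i in range(1, h + 1)]
    return identity + vandermonde

def encode(c_prime, fragments):
    """B = C' F."""
    return [sum(c * f for c, f in zip(row, fragments)) for row in c_prime]

def decode_2x2(s, b_s):
    """F = S^-1 B_S for a 2x2 sub-matrix S, using the closed-form inverse."""
    (a, b), (c, d) = s
    det = a * d - b * c
    if det == 0:
        raise ValueError("S is singular")
    inverse = [[d / det, -b / det], [-c / det, a / det]]
    return [sum(x * y for x, y in zip(row, b_s)) for row in inverse]

F = [Fraction(5), Fraction(7)]   # hypothetical fragment values
C_prime = coefficient_matrix(K, H)
B = encode(C_prime, F)           # f1, f2, f1+f2, f1+2*f2, f1+3*f2

# Suppose only b1 and b5 survive: S is made of rows 1 and 5 of C'.
rows = [0, 4]
S = [C_prime[r] for r in rows]
B_S = [B[r] for r in rows]
recovered = decode_2x2(S, B_S)
print(recovered == F)            # True
```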

SLIDE 31

Random Linear Code

  • A random k × k matrix S over GF(2^q) is invertible with a probability that depends essentially only on the field size and increases with it.
  • GF(2^q): Galois field whose elements can be expressed as q-bit words.
  • If q ≥ 16, the probability can be considered practically 1.
  • This means that any k × k sub-matrix of C' is invertible with high probability, so the (k,h)-code property holds.
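
To put numbers on this claim, one can use the standard counting result that a uniformly random k × k matrix over a field with Q elements is invertible with probability ∏_{i=1..k} (1 − Q^{-i}). The sketch below evaluates it for Q = 2^q; the function name and the choice k = 64 are assumptions for illustration.

```python
def invertible_probability(k, q):
    """P(a uniformly random k x k matrix over GF(2^q) is invertible)."""
    p = 1.0
    for i in range(1, k + 1):
        p *= 1.0 - 0.5 ** (q * i)   # 1 - (2^q)^(-i)
    return p

for q in (1, 4, 8, 16):
    print(f"q = {q:2d}: P(invertible) = {invertible_probability(64, q):.8f}")
```

The product is dominated by its first factor, 1 − 2^{-q}, which is why the result barely changes with k and approaches 1 as q grows.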

SLIDE 32

Information Flow Graph (Code Graph)

  • Represents the evolution of the stored data through time.

[Figure: code graph with source nodes f1, f2 (layer F) and storage nodes b1, b2, b3 repeated at time steps 1, ..., t-1, t (layers B_1, ..., B_{t-1}, B_t).]

SLIDE 33

Information Flow Graph (Code Graph)

  • Proposition 1: At any time t, every possible selection of k nodes B_t^k is sufficient to reconstruct the original fragments only if the disjoint-paths condition holds at time step t = 1 and the repair degree d ≥ k.
  • A random linear code provides this condition.
  • By design, any node in B_1 is connected to all the source nodes in F.

[Figure: code graph with source nodes f1, f2 (layer F) and storage nodes b1, b2, b3 repeated at time steps 1, ..., t-1, t (layers B_1, ..., B_{t-1}, B_t).]

SLIDE 34

Block Replication vs Linear Codes

  • k = 8, h = 16 and R = 3
  • Block replication: d = 1
  • Linear codes: d = k

SLIDE 35

Question?

Is there a design space between these two limits that can be explored to find a better trade-off between storage efficiency and repair degree?

SLIDE 36

Hierarchical Codes

SLIDE 37

Hierarchical Code Graph – Step 1

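  • Choose k_0 and h_0 and build a (k_0, h_0)-code:
  • k_0 = 2
  • h_0 = 1
  • The generated group is denoted G_{d_0,1}, where d_0 = k_0.

$$ b_i = \begin{cases} f_i & i \le k \\ \sum_{j=1}^{k} c_{i,j} f_j & k < i \le k + h \end{cases} $$

[Figure: hierarchical (2,1)-code; group G_{2,1} with b1, b2, b3 built from fragments f1, f2.]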

SLIDE 38

Hierarchical Code Graph – Step 2

  • Choose g_1 and h_1.
  • Replicate the structure of G_{d_0,1} g_1 times.
  • The g_1 groups are denoted G_{d_0,1}, ..., G_{d_0,g_1}.
  • Then add h_1 further redundant blocks, each combining all the existing g_1 k_0 original fragments F.
  • The new group is denoted G_{d_1,1}.
  • It is a hierarchical (d_1, H_1)-code, with
  • H_1 = g_1 h_0 + h_1
  • d_1 = g_1 k_0 = g_1 d_0
  • g_1 = 2
  • h_1 = 1

[Figure: hierarchical (4,3)-code with blocks b1..b7 built from fragments f1..f4, organized in groups G_{2,1}, G_{2,2} and G_{4,1}.]

SLIDE 39

Hierarchical Code Graph – Step 3

  • Repeat Step 2 several times.
  • H_s = g_s H_{s-1} + h_s
  • d_s = g_s d_{s-1}
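
The recursion is easy to unroll in code. The sketch below is illustrative: the function name is an assumption, and the second configuration is just one hypothetical hierarchy that happens to produce a 64+64 code, not necessarily the one used in the experiments.

```python
def hierarchical_parameters(k0, h0, levels):
    """Unroll d_s = g_s * d_{s-1} and H_s = g_s * H_{s-1} + h_s.

    levels is a list of (g_s, h_s) pairs; returns the (d_s, H_s) pair of every step,
    starting from (d_0, H_0) = (k0, h0).
    """
    d, H = k0, h0
    history = [(d, H)]
    for g, h in levels:
        d, H = g * d, g * H + h
        history.append((d, H))
    return history

# Steps 1 and 2 of the slides: k0 = 2, h0 = 1, then g1 = 2, h1 = 1.
print(hierarchical_parameters(2, 1, [(2, 1)]))          # [(2, 1), (4, 3)]

# Hypothetical deeper hierarchy ending in a (64, 64)-code.
print(hierarchical_parameters(2, 1, [(2, 1)] * 4 + [(2, 2)])[-1])   # (64, 64)
```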

SLIDE 40

Hierarchical Code – Reliability

  • Proposition 2: Consider B_k, a set of k blocks in the code graph of a hierarchical (k,h)-code.
  • If the nodes in B_k are chosen fulfilling the following condition:

$$ |G_{d,i} \cap B_k| \le d, \quad \forall\, G_{d,i} \text{ belonging to the code} $$

  • Then the nodes in B_k are sufficient to reconstruct the original fragments.

SLIDE 41

Hierarchical Code – Reliability

  • P(failure | l) = 0.23

[Figure: hierarchical (4,3)-code with blocks b1..b7 built from fragments f1..f4, organized in groups G_{2,1}, G_{2,2} and G_{4,1}.]
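
The failure probability can be reproduced by enumeration, under two assumptions that the extracted slide does not state explicitly: that the condition is evaluated for 3 lost blocks, and that the groups are G_{2,1} = {b1, b2, b5}, G_{2,2} = {b3, b4, b6} and G_{4,1} = all seven blocks. The sketch counts the loss patterns after which no 4 surviving blocks satisfy the group condition |G_{d,i} ∩ B_k| ≤ d.

```python
from fractions import Fraction
from itertools import combinations

# Assumed group membership: (degree d, members).
GROUPS = [
    (2, {"b1", "b2", "b5"}),                                # G_{2,1}
    (2, {"b3", "b4", "b6"}),                                # G_{2,2}
    (4, {"b1", "b2", "b3", "b4", "b5", "b6", "b7"}),        # G_{4,1}
]
BLOCKS = sorted(GROUPS[-1][1])
K = 4

def satisfies_condition(selection):
    """True if |G_{d,i} ∩ selection| <= d for every group of the code."""
    return all(len(members & selection) <= d for d, members in GROUPS)

def failure_fraction(losses):
    """Fraction of loss patterns that leave no K survivors satisfying the condition."""
    patterns = list(combinations(BLOCKS, losses))
    failures = 0
    for lost in patterns:
        alive = sorted(set(BLOCKS) - set(lost))
        ok = any(satisfies_condition(set(c)) for c in combinations(alive, K))
        failures += not ok
    return Fraction(failures, len(patterns))

p = failure_fraction(3)
print(f"{p} = {float(p):.2f}")   # 8/35 = 0.23
```

The 8 failing patterns are exactly those whose 4 survivors include all three blocks of one of the two small groups.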

SLIDE 42

Hierarchical Code – Repair Degree

  • Proposition 3: Consider a node b repaired at time step t. Denote as G(b) the hierarchy of groups that contains b and as R(b) the set of nodes in B_{t-1} that have been combined to repair b.
  • If, ∀ t and ∀ b, R(b) fulfills the following conditions:

$$ |G_{d,i} \cap R(b)| \le d, \quad \forall\, G_{d,i} \text{ belonging to the code} $$

$$ \exists\, G_{d,i} \in G(b): \; R(b) \subseteq G_{d,i}, \; |R(b)| = d $$

  • Then the code does not degrade, i.e. it preserves the properties of the code graph expressed in Proposition 2.
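
The two conditions can be turned into a small predicate. This is a sketch, not the authors' repair algorithm: the group membership of the (4,3) example (b5 in G_{2,1}, b6 in G_{2,2}, b7 only in G_{4,1}) is assumed, and the function name is hypothetical.

```python
# Assumed groups of the hierarchical (4,3)-code: name -> (degree d, members).
GROUPS = {
    "G2,1": (2, {"b1", "b2", "b5"}),
    "G2,2": (2, {"b3", "b4", "b6"}),
    "G4,1": (4, {"b1", "b2", "b3", "b4", "b5", "b6", "b7"}),
}

def is_non_degrading_repair(b, repair_set):
    """Check the two conditions of Proposition 3 for repairing block b from repair_set."""
    r = set(repair_set)
    if b in r:
        return False  # the lost block cannot take part in its own repair
    # Condition 1: no group contributes more than d nodes to R(b).
    bounded = all(len(members & r) <= d for d, members in GROUPS.values())
    # Condition 2: some group in G(b) contains all of R(b), and |R(b)| equals its d.
    hierarchy = [(d, members) for d, members in GROUPS.values() if b in members]
    local = any(r <= members and len(r) == d for d, members in hierarchy)
    return bounded and local

print(is_non_degrading_repair("b1", {"b2", "b5"}))              # True: d = 2 < k
print(is_non_degrading_repair("b1", {"b2", "b3", "b4", "b7"}))  # True: d = 4
print(is_non_degrading_repair("b1", {"b3", "b4", "b6", "b7"}))  # False
print(is_non_degrading_repair("b7", {"b1", "b2"}))              # False
```

Under these assumptions a block in a small group can be repaired by reading only 2 blocks when its group-mates are alive, whereas b7 always needs 4.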

SLIDE 43

Hierarchical Code

[Figure: hierarchical (4,3)-code with blocks b1..b7 built from fragments f1..f4, organized in groups G_{2,1}, G_{2,2} and G_{4,1}.]

SLIDE 44

Hierarchical Code

SLIDE 45

Question?