An Explicit, Coupled-Layer Construction of a High-Rate MSR Code




slide-1
SLIDE 1

An Explicit, Coupled-Layer Construction of a High-Rate MSR Code

Birenjith Sasidharan, Myna Vajha and P. Vijay Kumar

Indian Institute of Science, Bangalore

Dagstuhl Seminar: Coding Theory in the Time of Big Data August 7-12, 2016

slide-2
SLIDE 2

Regenerating Codes - Formal Definition

Parameters: ((n, k, d), (α, β), B, F_q)

[Figure: a data collector connects to any k of the n nodes, each of storage capacity α; failed node 1 is replaced by node 1′, which downloads β symbols from each of d helper nodes.]

◮ Data to be recovered by connecting to any k of n nodes
◮ Nodes to be repaired by connecting to any d nodes, downloading β symbols from each node (dβ << file size B)

◮ Focus here is on exact repair

slide-3
SLIDE 3

The Storage-Repair Bandwidth Tradeoff

The upper bound on file size:

\[ B \le \sum_{i=1}^{k} \min\{\alpha,\ (d - i + 1)\beta\} \]

(multiple (α, β) pairs can achieve the bound)

◮ Tradeoff curve drawn for fixed (k, d), B.
◮ Extreme points: MSR & MBR
◮ MSR = Minimum Storage Regenerating: α = (d − k + 1)β
◮ Focus here is on the MSR point

[Figure: storage-repair bandwidth tradeoff curve for (k, d) = (120, 129), B = 725360]
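
A minimal sketch of the file-size bound, evaluated at the MSR point. The function name `file_size_bound` is hypothetical; the numbers are the (k, d, α, β) = (4, 5, 8, 4) example used later in the deck. At the MSR point every term of the sum equals α, so the bound collapses to B ≤ kα.

```python
def file_size_bound(k: int, d: int, alpha: int, beta: int) -> int:
    """Right-hand side of the bound: sum over i = 1..k of min(alpha, (d - i + 1) * beta)."""
    return sum(min(alpha, (d - i + 1) * beta) for i in range(1, k + 1))


# Example parameters from the deck's illustration: k = 4, d = 5, alpha = 8, beta = 4.
k, d, alpha, beta = 4, 5, 8, 4
assert alpha == (d - k + 1) * beta                      # MSR relation
assert file_size_bound(k, d, alpha, beta) == k * alpha  # every term of the sum is alpha
```

Lowering β below α/(d − k + 1) makes the last terms of the sum drop below α, which is why the MSR point is the smallest β that still supports B = kα.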

slide-4
SLIDE 4

MSR Codes

MSR Codes with All-Symbol Node Repair

Code                n       k          d          α               β             Explicit
Product Matrix      n       k          ≥ 2k − 2   d − k + 1       1             Y
Hadamard            n       n − 2      n − 1      2^{k+1}         2^k           Y
Mod Zig-Zag         n       k          n − 1      (n − k)^{k+1}   (n − k)^k     N
Cadambe et al.      n       k          d          → ∞             → ∞           N
Sasidharan et al.   qt      q(t − 1)   n − 1      q^t             q^{t−1}       N
Rawat et al.        qt      q(t − 1)   d          q^t             q^{t−1}       N

MSR Codes with Systematic-Node Repair

Code                n        k                d          α                   β               Explicit
MISER               n        k                ≥ 2k − 1   d − k + 1           1               Y
Zig-Zag             n        k                n − 1      (n − k)^k           (n − k)^{k−1}   N
Poly MDS            O(k^2)   (2/3)n ≤ k ≤ n   n − 1      (k/(n − k))^{n−k}   α/(n − k)       Y
Goparaju et al.     n        k                d          (n − k)^k           α/(n − k)       N

  • 1. The Hadamard-design-based MSR code paper by Dimakis et al. also gives a non-explicit MSR code, with high probability, for (n, k, n − 1, (n − k)^{k+1}, (n − k)^k).
  • 2. Zig-Zag codes have an explicit coefficient assignment for n − k = 2 and n − k = 3.
slide-5
SLIDE 5

Recent Constructions

Code                       n     k              d       α                         β                            Sys/All   Explicit
Raviv et al.               n     n − 2, n − 3   n − 1   (n − k)^{k/r}             α/(n − k)                    Sys       Y
Ye & Barg                  n     k              n − 1   (n − k)^n                 (n − k)^{n−1}                All       Y
Ye & Barg                  qt    q(t − 1)       n − 1   q^t                       q^{t−1}                      All       Y
Coupled Layer              qt    q(t − 1)       n − 1   q^t                       q^{t−1}                      All       Y

Scalar MDS Code with efficient repair bandwidth

Code                       n     k              d       α                         β                            Sys/All   Explicit
Guruswami et al.           n     k              n − 1   1                         log2((n − 1)/(n − k)) bits   All       Y

Vector MDS Codes with efficient repair bandwidth

Code                       n     k              d       α                         β                            Sys/All   Explicit
Piggybacked RS (m ≥ 1)     n     k              d       2m, 4m, (2(n − k) − 3)m   β (non-uniform)              All       Y
Guruswami et al. (p ≥ 1)   qt    q(t − 1)       n − 1   (n − k)^p                 (1 + 1/p) α/(n − k)          All       Y

slide-6
SLIDE 6

Parameters of Construction: (n, k, d, (α, β))

We adopt the same parameters first introduced in [1]. For t ≥ 2:

Parameter      Value
n              tq
k              (t − 1)q
d              n − 1
r := n − k     q
α              q^t
β              q^{t−1}
Rate           (t − 1)/t ≥ 1/2

(field size was large to accommodate data collection)

[1] B. Sasidharan, G. K. Agarwal, and P. V. Kumar, “A High-Rate MSR Code With Polynomial Sub-Packetization Level,” ISIT 2015.

slide-7
SLIDE 7

(Present) Coupled-Layer Construction Code Parameters

Code Parameter     Value
Block length n     qt, q ≥ 2, t ≥ 2
r := n − k         q
k                  qt − q
d                  n − 1
α                  q^t
β                  q^{t−1}
Field size Q       Q ≥ n (caution: Q, not q, is the field size!)

α = r^{n/r}; lower bound: r^{k/r} = r^{n/r}/r
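
The parameter table can be turned into a small helper. This is only a sketch: the function name `cl_msr_parameters` and the dictionary keys are hypothetical, and the field-size entry assumes the requirement is Q ≥ n, as the example Q ≥ 6 for n = 6 suggests.

```python
from fractions import Fraction


def cl_msr_parameters(q: int, t: int) -> dict:
    """Parameters of the coupled-layer code for given q >= 2 and t >= 2."""
    if q < 2 or t < 2:
        raise ValueError("need q >= 2 and t >= 2")
    n, k = q * t, q * (t - 1)
    return {
        "n": n,
        "k": k,
        "d": n - 1,
        "r": n - k,                # equals q
        "alpha": q ** t,           # sub-packetization level
        "beta": q ** (t - 1),      # symbols downloaded per helper node
        "rate": Fraction(k, n),    # equals (t - 1) / t
        "min_field_size": n,       # assumption: Q >= n
    }


p = cl_msr_parameters(2, 3)
assert p["alpha"] == p["r"] ** (p["n"] // p["r"])   # alpha = r^(n/r)
assert p["beta"] == p["alpha"] // p["r"]            # r^(k/r) = r^(n/r) / r
print(p)
```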

slide-8
SLIDE 8

The Parity-Check Equations of [1]: Difficulty in Data Recovery

Row-Sum Parity-Checks: for $z \in \mathbb{Z}_q^t$,

\[ \sum_{y \in [t]} \sum_{x \in \mathbb{Z}_q} A(x, y; z) = 0. \]

Jump Parity-Checks (for each $1 \le \lambda \le q - 1$ and $z \in \mathbb{Z}_q^t$):

\[ \sum_{y \in [t]} \sum_{x \in \mathbb{Z}_q} \theta_{(x,y)}^{\lambda} A(x, y; z) + \sum_{y \in [t]} c\, A\bigl(y, z_y; \underbrace{(z - \lambda e_y)}_{\text{jump in $y$th position by $\lambda$}}\bigr) = 0. \]

◮ The coupling with other planes depended on λ.
◮ The construction was non-explicit; the coefficient c was shown to exist in a large enough field.

slide-9
SLIDE 9

The Parity-Check Equations of [1]: Resolving the Issue

◮ Question was to find an explicit assignment for c.
◮ We adopted a sequential decoding approach.
◮ This led to the need for proving the invertibility of data-recovery matrices of the form (but with a larger number of sub-blocks):

D, nonzero entries row by row (blank entries are zero):

    1        1        1
    θ_1      θ_2      θ_3      c
    θ_1^2    θ_2^2    θ_3^2
    1        1        1
    θ_1      θ_2      θ_3      c
    θ_1^2    θ_2^2    θ_3^2

This proved difficult, but it was resolved by altering the amount of coupling, leading instead to:

D′, nonzero entries row by row (blank entries are zero):

    1        1        1        u
    θ_1      θ_2      θ_3      uθ_2
    θ_1^2    θ_2^2    θ_3^2    uθ_2^2
    u        1        1        1
    uθ_1     θ_1      θ_2      θ_3
    uθ_1^2   θ_1^2    θ_2^2    θ_3^2

slide-10
SLIDE 10

Coupled-Layer MSR Code: Example Parameters Chosen for Illustration

q    t    n    k    d    α      β      Rate    Field size, Q
2    3    6    4    5    2^3    2^2    2/3     Q ≥ 6

slide-11
SLIDE 11

Shortening for Other Parameters

Can shorten the code to achieve other parameters:

((n, k, d), (α, β)) ⇒ ((n − δ, k − δ, d − δ), (α, β))
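
A small check of why the shortened parameters still sit at the MSR point: d − k is unchanged by shortening, so α = (d − k + 1)β continues to hold with the same (α, β). The helper `shorten` is a hypothetical name, and it only maps parameters; it does not perform the shortening of codewords.

```python
def shorten(n: int, k: int, d: int, alpha: int, beta: int, delta: int):
    """Map ((n, k, d), (alpha, beta)) to ((n - delta, k - delta, d - delta), (alpha, beta))."""
    if not 0 <= delta < k:
        raise ValueError("delta must satisfy 0 <= delta < k")
    return n - delta, k - delta, d - delta, alpha, beta


# Shorten the (6, 4, 5), (8, 4) example by one node.
n, k, d, alpha, beta = shorten(6, 4, 5, 8, 4, delta=1)
assert (n, k, d) == (5, 3, 4)
assert alpha == (d - k + 1) * beta   # MSR relation is preserved
```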

slide-12
SLIDE 12

The Data Cube (q = 2, t = 3)

◮ The data cube is a 3-D array A(x, y; z) of code symbols.
◮ (x, y) ∈ (Z_q × [t]) used to identify a node.
◮ z = (z_1, z_2, ..., z_t) used to index a plane.

[Figure: data cube for q = 2, t = 3, with axes x ∈ {0, 1}, y ∈ {1, 2, 3} and z. It has 6 nodes, each with 2^3 = 8 symbols. The plane z = (1, 0, 0) is identified by the placement of red dots.]
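
The indexing can be sketched as follows. The `dots` helper is hypothetical and encodes an assumption not spelled out in the text: that plane z is marked by one dot per column y, placed at row x = z_y.

```python
from itertools import product

q, t = 2, 3

# Nodes are pairs (x, y) with x in Z_q and y in [t] = {1, ..., t}.
nodes = [(x, y) for y in range(1, t + 1) for x in range(q)]

# Planes are vectors z = (z_1, ..., z_t) in Z_q^t.
planes = list(product(range(q), repeat=t))


def dots(z):
    """Assumed convention: plane z has its dot in column y at row x = z_y."""
    return [(z[y - 1], y) for y in range(1, len(z) + 1)]


print(len(nodes), "nodes,", len(planes), "symbols per node")
print("dots of plane (1, 0, 0):", dots((1, 0, 0)))
```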

slide-13
SLIDE 13

Parity-Check Equations and the Pairwise Coupling

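For $z \in \mathbb{Z}_q^t$, $0 \le \lambda \le q - 1$, we have that

\[ \sum_{y \in [t]} \sum_{x \in \mathbb{Z}_q} \theta_{(x,y)}^{\lambda} B(x, y; z) = 0, \]

and the code symbols A(x, y; z) are given from the B(x, y; z) by:

\[ \begin{bmatrix} A(x, y; z) \\ A(z_y, y; x, z_{\sim y}) \end{bmatrix} = \begin{bmatrix} 1 & u \\ u & 1 \end{bmatrix}^{-1} \begin{bmatrix} B(x, y; z) \\ B(z_y, y; x, z_{\sim y}) \end{bmatrix}. \]

(x, z∼y) ⇒ vector obtained by replacing the yth symbol of z by x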

slide-14
SLIDE 14

Parity-Check Equations and the Pairwise Coupling

Equivalently, the B(x, y; z) are obtained from the code symbols A(x, y; z) by:

\[ \begin{bmatrix} B(x, y; z) \\ B(z_y, y; x, z_{\sim y}) \end{bmatrix} = \begin{bmatrix} 1 & u \\ u & 1 \end{bmatrix} \begin{bmatrix} A(x, y; z) \\ A(z_y, y; x, z_{\sim y}) \end{bmatrix}, \]

where (x, z∼y) is the vector obtained by replacing the yth symbol of z by x.
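
The index arithmetic of the pairing can be sketched as below. The function name `companion` is hypothetical; it returns the index of the symbol that appears alongside A(x, y; z) in the 2 × 2 relation. When x = z_y the symbol is its own companion, so it has no distinct partner.

```python
def companion(x, y, z):
    """Index (x', y, z') paired with (x, y; z): x' = z_y and z' = (x, z~y).

    y is 1-based, z is a tuple of length t with entries in Z_q.
    """
    zy = z[y - 1]
    z_swapped = z[:y - 1] + (x,) + z[y:]   # replace the y-th symbol of z by x
    return zy, y, z_swapped


# A(0, 1; (1, 0, 0)) is paired with A(1, 1; (0, 0, 0)).
print(companion(0, 1, (1, 0, 0)))

# Applying the map twice returns the original index.
assert companion(*companion(0, 1, (1, 0, 0))) == (0, 1, (1, 0, 0))
```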
slide-15
SLIDE 15

An Example: q = 2, t = 3

(n = 6, k = 4, d = 5, α = 8, β = 4)

[Figure: data cube with axes x, y, z]

◮ For every plane z, there are linear parity-check equations binding all symbols on z, and symbols from certain other planes.

slide-16
SLIDE 16

Coupling and Decoupling of Symbols Across Planes

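[Figure: data cube with axes x, y, z, showing two coupled symbols A1 and A2]

Coupling of symbols: (A1, A2) are a coupled pair.

Coupling:

\[ \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} = \begin{bmatrix} 1 & u \\ u & 1 \end{bmatrix} \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} \]

Decoupling:

\[ \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} 1 & u \\ u & 1 \end{bmatrix}^{-1} \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} \]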
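
A minimal sketch of the pairwise transform over a prime field. The field size P = 7 and the coefficient U = 3 are assumptions chosen for illustration (the deck only requires Q ≥ 6 for the example); the one hard requirement is u² ≠ 1, since the matrix has determinant 1 − u².

```python
P = 7   # assumed prime field size
U = 3   # assumed coupling coefficient; needs U*U != 1 (mod P)


def couple(a1, a2, u=U, p=P):
    """(B1, B2) = [[1, u], [u, 1]] (A1, A2) over GF(p)."""
    return (a1 + u * a2) % p, (u * a1 + a2) % p


def decouple(b1, b2, u=U, p=P):
    """(A1, A2) = [[1, u], [u, 1]]^{-1} (B1, B2) over GF(p)."""
    det = (1 - u * u) % p
    if det == 0:
        raise ValueError("u*u must differ from 1 for the matrix to be invertible")
    inv_det = pow(det, -1, p)   # modular inverse (Python 3.8+)
    return (inv_det * (b1 - u * b2)) % p, (inv_det * (b2 - u * b1)) % p


b1, b2 = couple(1, 2)
assert decouple(b1, b2) == (1, 2)   # decoupling undoes coupling
```

Each output of `couple` costs one multiplication and one addition, matching the per-symbol cost quoted for the coupling step.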

slide-17
SLIDE 17

Encoding the Coupled-Layer MSR Code

q = 2, t = 3; n = 6, k = 4, α = 8

[Figure: encoder block diagram with inputs data0 ... data7, RS_ENCODER blocks, the b-code, a coupling engine, and outputs code0 ... code7. The data cube alongside carries the labels 1 to 12, each appearing twice to mark a coupled pair.]

◮ Encoding involves α = 8 parallel calls to a [6, 4] Reed-Solomon encoder.
◮ Number of pairs of symbols that are coupled = t(q − 1)α/2 = 12.
◮ Coupling involves 1 multiplication and 1 addition per code symbol.
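
The pair count can be checked by brute-force enumeration. This sketch assumes the pairing rule from the coupling slides, (x, y; z) with (z_y, y; (x, z∼y)), and treats symbols with x = z_y as uncoupled; `coupled_pairs` is a hypothetical name.

```python
from itertools import product


def coupled_pairs(q: int, t: int) -> set:
    """Enumerate unordered coupled pairs of symbol indices (x, y, z)."""
    pairs = set()
    for z in product(range(q), repeat=t):
        for y in range(1, t + 1):
            for x in range(q):
                if x == z[y - 1]:
                    continue   # symbol would be paired with itself; skip it
                z_comp = z[:y - 1] + (x,) + z[y:]
                pairs.add(frozenset({(x, y, z), (z[y - 1], y, z_comp)}))
    return pairs


q, t = 2, 3
alpha = q ** t
assert len(coupled_pairs(q, t)) == t * (q - 1) * alpha // 2
print(len(coupled_pairs(q, t)), "coupled pairs")
```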

slide-18
SLIDE 18

Repairing the Coupled-Layer MSR Code

[Figure: data cube with axes x, y, z; the planes that take part in the repair are shaded pink]

◮ The node (1, 1) on the extreme left has failed.
◮ Data from the pink planes are transmitted during repair.
◮ Repair can be done in q^{t−1} = 4 parallel operations, each involving a (q × q) = (2 × 2) matrix inversion.
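
A sketch of which planes take part in the repair. It rests on an assumption the text does not state explicitly: that the pink planes for a failed node (x0, y0) are exactly the planes z with z_{y0} = x0. Under that assumption there are q^{t−1} of them, which matches β = 4. The name `repair_planes` is hypothetical.

```python
from itertools import product


def repair_planes(q: int, t: int, failed):
    """Planes z with z_{y0} = x0 for the failed node (x0, y0); y0 is 1-based."""
    x0, y0 = failed
    return [z for z in product(range(q), repeat=t) if z[y0 - 1] == x0]


planes = repair_planes(2, 3, failed=(1, 1))
assert len(planes) == 2 ** (3 - 1)   # q^(t-1) = beta planes
print(planes)
```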

slide-19
SLIDE 19

Repairing the Coupled-Layer MSR Code

[Figure: repair of node 1, showing planes 0 to 7 across nodes 1 to 6 and parallel RS_DECODER blocks]

  • 1. Repair of node-1: Node-1 to be repaired
  • 2. Repair operation can be performed as β = 4 instances of RS decoding in parallel.

slide-20
SLIDE 20

Data Collection and Erasures

  • 1. The task of data collection is to recover the data from k nodes
  • 2. Equivalently, one must recover the data following n − k = q erasures

  • 3. q = 2 in our example

We assume a given erasure pattern E of q nodes.

slide-21
SLIDE 21

A Sequential Approach to Data Collection

Erasures are indicated by an unfilled circle. The intersection score σ of a plane for a given erasure pattern E is the number of dots in the plane that correspond to erased nodes. Example planes with σ = 0, 1, 2 respectively are shown below:

[Figure: three example planes, each drawn on a grid with x ∈ {0, 1} and y ∈ {1, 2, 3}]

  • 1. Decode erased symbols in a plane-by-plane manner.
  • 2. The planes are selected in the order of increasing intersection score σ
  • 3. Each plane is decoded using a scalar MDS (RS) code decoder
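
The intersection score can be sketched as below, assuming the dot of plane z in column y sits at row x = z_y (an assumption about the figure convention). The erasure pattern E used here is a hypothetical example with q = 2 erased nodes.

```python
def intersection_score(z, erased):
    """Number of dots of plane z that fall on erased nodes.

    z is a tuple (z_1, ..., z_t); erased is a set of nodes (x, y) with y 1-based.
    """
    return sum(1 for (x, y) in erased if z[y - 1] == x)


E = {(0, 2), (0, 3)}   # hypothetical erasure pattern
for z in [(0, 1, 1), (0, 0, 1), (0, 0, 0)]:
    print(z, "has score", intersection_score(z, E))
```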
slide-22
SLIDE 22

A Sequential Decoding Algorithm

  • 1. Input the set of erased nodes E.
  • 2. Label planes with intersection scores.
  • 3. Compute the maximum intersection score max.
  • 4. Set s = 0.
  • 5. Decode symbols (a mixture of A's and B's) from each plane z whose intersection score σ(z, E) = s, by invoking SC-MDS-DEC and using previously decoded symbols.
  • 6. Transform B symbols to A symbols.
  • 7. Set s = s + 1.
  • 8. If s ≤ max (YES), return to step 5; otherwise (NO), EXIT.

slide-23
SLIDE 23

Decoding the Coupled-Layer MSR Code

[Figure: decoding block diagram for nodes 1 to 6 and planes 0 to 7. The i-score based ordering arranges the planes as 3, 7 (i-score 0), then 1, 2, 5, 6 (i-score 1), then 0, 4 (i-score 2). Decoding proceeds in three stages, 2x RS_DEC, 4x RS_DEC and 2x RS_DEC, each followed by Couple/Decouple.]

◮ Label planes with intersection scores.
◮ Intersection score determines the order in which planes are decoded.
◮ Planes with the same intersection score can be decoded using parallel instances of RS decoding.
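
The ordering step can be sketched as a grouping of planes by score. Two assumptions are made here: planes are numbered 0 to 7 by reading z = (z_1, z_2, z_3) as a binary number, and the erased nodes are (0, 2) and (0, 3). With these choices the batches come out as 2, 4 and 2 planes, in the order listed in the figure. The name `decoding_schedule` is hypothetical, and the RS decoding and coupling steps themselves are not implemented.

```python
from collections import defaultdict
from itertools import product


def decoding_schedule(q: int, t: int, erased):
    """Group plane numbers by intersection score, in increasing score order."""
    batches = defaultdict(list)
    for number, z in enumerate(product(range(q), repeat=t)):
        score = sum(1 for (x, y) in erased if z[y - 1] == x)
        batches[score].append(number)
    return [batches[s] for s in sorted(batches)]


schedule = decoding_schedule(2, 3, erased={(0, 2), (0, 3)})
for score, batch in enumerate(schedule):
    # Planes in one batch are independent and could be decoded in parallel.
    print(f"i-score {score}: planes {batch}")
```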

slide-24
SLIDE 24

(Alternate) Systematic Encoding of the Coupled-Layer MSR Code

[Figure: data0 ... data7 fed to a CL-MSR Decoder block]

  • 1. Here we fill in the raw data in nodes 1, 2, 3, 4.
  • 2. Regard nodes 5, 6 as having been erased and recover them through decoding.

slide-25
SLIDE 25

Thanks!