Locally decodable codes:
from computational complexity to cloud computing
Sergey Yekhanin
Microsoft Research
Error-correcting codes: paradigm
[Diagram: message $x \in \mathbb{F}_2^k$ -> Encoder -> codeword $E(x) \in \mathbb{F}_2^n$ -> Channel -> $E(x)$ + noise -> Decoder -> $x$. Example: 0110001 -> 011000100101 -> 01*00*10010* -> 0110001.]
The paradigm dates back to the 1940s (Shannon / Hamming).
The channel corrupts up to $e$ coordinates.
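[Diagram: message $x \in \mathbb{F}_2^k$ -> Encoder -> codeword $E(x) \in \mathbb{F}_2^n$ -> Channel -> $E(x)$ + noise -> Local Decoder -> $x_i$. Example: 0110001 -> 011000100101 -> 01*00*10010* -> 1.]
The local decoder runs in time much smaller than the message length!
The channel corrupts up to $e$ coordinates. The local decoder reads up to $r$ coordinates.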
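Message: $X = (X_1, X_2, X_3)$. Codeword: $E(X) = (X_1,\ X_2,\ X_3,\ X_1 \oplus X_2,\ X_1 \oplus X_3,\ X_2 \oplus X_3,\ X_1 \oplus X_2 \oplus X_3)$.
Message length: $k = 3$. Codeword length: $n = 7$. Corrupted locations: $e = 3$. Locality: $r = 2$.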
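The small example above can be sketched in code. This is a minimal sketch, assuming the corrupted locations are erasures (as the * marks in the earlier diagram suggest); the names `encode` and `local_decode` and the choice to index codeword positions by subset masks are illustrative assumptions, not part of the slides. Each message bit has four disjoint recovery sets (one single position and three pairs), so with at most 3 erasures one set survives and the decoder uses at most 2 symbols.

```python
def encode(x):
    """Encode 3 bits into 7 bits: one XOR per nonempty subset of {X1, X2, X3}.

    Positions are indexed by a subset mask in 1..7 (an indexing chosen for this sketch).
    """
    word = {}
    for mask in range(1, 8):
        bit = 0
        for j in range(3):
            if (mask >> j) & 1:
                bit ^= x[j]
        word[mask] = bit
    return word


def local_decode(word, i):
    """Recover message bit i from a word whose erased positions hold None.

    Uses the identity X_i = E[s] xor E[s xor {i}], where the empty subset
    contributes 0, so the position {i} is a recovery set on its own.
    """
    for s in range(1, 8):
        if word[s] is None:
            continue
        t = s ^ (1 << i)
        if t == 0:
            return word[s]            # recovery set of size 1
        if word[t] is not None:
            return word[s] ^ word[t]  # recovery set of size 2
    return None  # every recovery set was hit (needs more than 3 erasures)
```

For instance, erasing the three single positions X1, X2, X3 still leaves pairs such as (X2, X1 xor X2)... more precisely, the pair of positions for X2 xor X3 and X1 xor X2 xor X3 recovers X1.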
Definition: A code $E: \mathbb{F}_q^k \to \mathbb{F}_q^n$ is $r$-locally decodable if for every message $x$, each $x_i$ can be recovered from reading some $r$ symbols of $E(x)$, even after up to $e$ coordinates of $E(x)$ are corrupted.
[Diagram: a k-symbol message is encoded into an n-symbol codeword; noise corrupts the codeword; the decoder reads only r symbols.]
Goal: Understand the true shape of the tradeoff between redundancy $n - k$ and locality $r$, for different settings of $e$ (e.g., $e = \delta n$, $n^{\epsilon}$, $O(1)$).
Taxonomy of known families of LDCs
[Chart placing the known families by corruption regime and locality: multiplicity codes, matching vector codes, Reed Muller codes, local reconstruction codes, projective geometry codes.]
Construction [STV]: $E: \mathbb{F}_2^k \to \mathbb{F}_2^n$ with $n = \mathrm{poly}(k)$, $r = (\log k)^c$, $e = n/10$.
$L$ is EXP-complete. $L'$ is in EXP.
Level $t$ is a string $x$ of length $2^t$.
[Diagram: a level of $L$ is the string $x$ (pictured as 1 1 0 1 0 1 1 1 1 1); the corresponding level of $L'$ is the encoding $E(x)$.]
Theorem: If there is an efficient algorithm that errs on <10% of $L'$, then EXP $\subseteq$ BPP.
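Proof: We obtain a BPP algorithm for $L$.
Let A be the efficient algorithm that errs on <10% of $L'$. A gives us access to the corrupted encoding $E(x)$.
Running the local decoder on this corrupted encoding recovers any bit of $x$ with few queries to A.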
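Reed Muller codes
Parameters: $q$, $m$, $d$.
A polynomial $f \in \mathbb{F}_q[z_1, \ldots, z_m]$, $\deg f \le d$, yields a codeword: $(f(w))_{w \in \mathbb{F}_q^m}$.
The restriction of a codeword to an affine line is an evaluation of a univariate polynomial of degree at most $d$.
To recover the value at $w$:
- Pick a random affine line through $w$.
- Do noisy polynomial interpolation.
$n = q^m$, $k = \binom{m+d}{d}$, $d = (1 - 4\delta) q$, $r = q - 1$, $e = \delta n$.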
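The two decoding steps above (random affine line, then noisy interpolation) can be sketched as follows. This is a toy illustration under stated assumptions, not the construction's actual decoder: the field is the prime field with $q = 13$, with $m = 2$ and $d = 3$; the polynomial `f` is a hypothetical message; and the noisy interpolation is done by brute force (interpolate every $(d+1)$-subset of the $q - 1$ line samples and keep the polynomial that agrees with the most samples) rather than by an efficient Reed Solomon decoder. With at most 4 corrupted samples on a line, the true polynomial agrees with at least 8 samples and any other degree-3 polynomial with at most 7, so the brute-force choice is unique.

```python
import itertools
import random

Q = 13  # prime field size, so integers mod Q form a field (toy value)
M = 2   # number of variables (toy value)
D = 3   # degree bound (toy value)


def f(z):
    """Hypothetical message polynomial of total degree <= D."""
    z1, z2 = z
    return (2 * z1**3 + 5 * z1 * z2 + z2**2 + 7) % Q


def encode():
    """Codeword: the evaluations of f at every point of F_Q^M."""
    return {w: f(w) for w in itertools.product(range(Q), repeat=M)}


def lagrange_eval(points, t):
    """Evaluate at t the polynomial of degree < len(points) through the (t_i, y_i) pairs."""
    total = 0
    for i, (ti, yi) in enumerate(points):
        num, den = 1, 1
        for j, (tj, _) in enumerate(points):
            if i != j:
                num = num * (t - tj) % Q
                den = den * (ti - tj) % Q
        total = (total + yi * num * pow(den, -1, Q)) % Q  # pow(..., -1, Q): Python 3.8+
    return total


def local_decode(word, w, rng):
    """Recover the value at w from Q - 1 reads on a random affine line through w."""
    v = (0,) * M
    while not any(v):  # pick a nonzero direction
        v = tuple(rng.randrange(Q) for _ in range(M))
    samples = []
    for t in range(1, Q):  # t = 0 would be w itself, which is never read
        point = tuple((wi + t * vi) % Q for wi, vi in zip(w, v))
        samples.append((t, word[point]))
    best_value, best_agree = None, -1
    for subset in itertools.combinations(samples, D + 1):
        agree = sum(lagrange_eval(subset, t) == y for t, y in samples)
        if agree > best_agree:
            best_agree, best_value = agree, lagrange_eval(subset, 0)
    return best_value
```

A call such as `local_decode(word, (0, 0), random.Random(0))` reads 12 positions of `word` and returns the value at `(0, 0)` even if that position itself was corrupted.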
Setting parameters: for constant locality $r = O(1)$, the codeword length is $n = \exp\left(k^{\frac{1}{r-1}}\right)$.
Better codes are known.
Reducing codeword length is a major open question.
Replication: lose one machine, access one.
Erasure coding ($k$ data chunks, $n-k$ parity chunks): lose one machine, access $k$.
Need: erasure codes with local decoding.
[Diagram: data chunks X1, X2, …, Xk followed by parity chunks P1, …, Pn-k.]
[Diagram: data symbols X1, …, Xk are split into local groups of r symbols, each with one light parity (L1, …, Lg); heavy parities H1, …, Hh follow. Labels: data symbols, local group, light parities, heavy parities.]
Number of parity symbols: $\frac{k}{r} + h$.
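Set k=8, r=4, and h=3.
[Diagram: local group 1 = X1, X2, X3, X4 with light parity L1; local group 2 = X5, X6, X7, X8 with light parity L2; heavy parities H1, H2, H3.]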
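To make the locality benefit of the k=8, r=4 layout concrete, here is a minimal sketch of repairing one lost data chunk from its local group. It assumes each light parity is the plain XOR of its group (the slides do not state the light-parity coefficients), it leaves out the heavy parities entirely, and the function names are illustrative. The point is that a single failure is repaired by reading r = 4 chunks instead of k = 8.

```python
from functools import reduce

K, R = 8, 4  # data chunks and local group size, as in the example


def xor_bytes(chunks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def light_parities(data):
    """One light parity per local group (assumed here to be a plain XOR)."""
    return [xor_bytes(data[g:g + R]) for g in range(0, K, R)]


def repair(data, parities, lost):
    """Rebuild data[lost] from the R - 1 surviving group members and the light parity."""
    group = lost // R
    survivors = [data[i] for i in range(group * R, (group + 1) * R) if i != lost]
    return xor_bytes(survivors + [parities[group]])  # reads exactly R chunks
```

For example, with `data` holding eight equal-length byte strings, `repair(data, light_parities(data), 5)` touches only chunks 4, 6, 7 and the second light parity.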
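Def: A regular failure pattern for an (r,h)-LRC is a pattern that can be obtained by failing one symbol in each local group and h extra symbols.
[Diagram: the k=8, r=4, h=3 layout, shown twice.]
Theorem: Every failure pattern that is not dominated by a regular failure pattern is not correctable by any LRC.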
Def: An (r,h)-LRC is maximally recoverable if it corrects all regular failure patterns.
Theorem: Maximally recoverable (r,h)-LRCs exist.
Proof sketch: Pick the coefficients in heavy parities at random from a large finite field.
The tradeoff: Larger fields allow for more reliable codes, up to maximal recoverability.
We want both: small field size (efficiency) and maximal recoverability.
Asymptotic setting: $h = O(1)$, $r = O(1)$, $k \to \infty$.
Random choice needs a field of size at least $\Omega\left(k^{h-1}\right)$.
Theorem [GHJY]: There exist maximally recoverable (r,h)-LRCs over a field of size $k^{(h-1)\left(1 - \frac{1}{2^r}\right)}$.
Comparison:
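We use dual constraints to specify the code.
[Parity check matrix with columns $X_1, X_2, \ldots, X_r, L_1, \ldots, X_{k-r+1}, \ldots, X_k, L_{k/r}, H_1, H_2, \ldots, H_h$: each local constraint is a row of 1's over one local group, and the $h$ heavy constraints place the entries $\alpha_{ij}, \alpha_{ij}^{2}, \ldots, \alpha_{ij}^{2^{h-1}}$ in each column.]
Element $\alpha_{ij}$ appears in the $j$-th column of the $i$-th group.
We consider a sequence of field extensions $\mathbb{F}_2 \subseteq \mathbb{F}_{2^a} \subseteq \mathbb{F}_{2^b}$.
$\{\xi_j\} \subseteq \mathbb{F}_{2^a}$ form a basis over $\mathbb{F}_2$.
$\{\lambda_i\} \subseteq \mathbb{F}_{2^b}$ are $h$-independent over $\mathbb{F}_{2^a}$.
$\alpha_{ij} = \xi_j \times \lambda_i$.
There are $\frac{k}{r} + 1$ local groups.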
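k=8, r=4, h=3.
[Parity check matrix over columns X1, X2, X3, X4, L1, X5, X6, X7, X8, L2, H1, H2, H3: two local rows of 1's and three heavy rows holding each column's coefficient, its square, and its fourth power.]
Restricting to the failed columns with coefficients $\alpha_{11}, \alpha_{12}, \alpha_{21}, \alpha_{22}, \alpha_{31}$:
$$\begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ \alpha_{11} & \alpha_{12} & \alpha_{21} & \alpha_{22} & \alpha_{31} \\ \alpha_{11}^{2} & \alpha_{12}^{2} & \alpha_{21}^{2} & \alpha_{22}^{2} & \alpha_{31}^{2} \\ \alpha_{11}^{4} & \alpha_{12}^{4} & \alpha_{21}^{4} & \alpha_{22}^{4} & \alpha_{31}^{4} \end{pmatrix}$$
Eliminating with the local rows leaves:
$$\begin{pmatrix} \alpha_{11}+\alpha_{12} & \alpha_{21}+\alpha_{22} & \alpha_{31} \\ \alpha_{11}^{2}+\alpha_{12}^{2} & \alpha_{21}^{2}+\alpha_{22}^{2} & \alpha_{31}^{2} \\ \alpha_{11}^{4}+\alpha_{12}^{4} & \alpha_{21}^{4}+\alpha_{22}^{4} & \alpha_{31}^{4} \end{pmatrix} = \begin{pmatrix} (\alpha_{11}+\alpha_{12}) & (\alpha_{21}+\alpha_{22}) & \alpha_{31} \\ (\alpha_{11}+\alpha_{12})^{2} & (\alpha_{21}+\alpha_{22})^{2} & \alpha_{31}^{2} \\ (\alpha_{11}+\alpha_{12})^{4} & (\alpha_{21}+\alpha_{22})^{4} & \alpha_{31}^{4} \end{pmatrix}$$
First-row entries: $(\alpha_{11}+\alpha_{12}) = (\xi_1 + \xi_2) \times \lambda_1$, $(\alpha_{21}+\alpha_{22}) = (\xi_1 + \xi_2) \times \lambda_2$, $\alpha_{31} = \xi_1 \times \lambda_3$.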
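The matrix of first, second and fourth powers above has the shape of a Moore matrix. A standard fact (not stated on the slides, so treat it as background) is that over a field of characteristic 2 such a matrix is invertible exactly when its first-row entries are linearly independent over $\mathbb{F}_2$. The sketch below checks this exhaustively in the toy field GF(16), built with the irreducible polynomial $x^4 + x + 1$, and also checks that squaring is additive. The field choice and all function names are assumptions made for illustration; the construction itself works over larger extension fields.

```python
import itertools


def gf16_mul(a, b):
    """Multiply two elements of GF(16), represented as 4-bit integers, modulo x^4 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:      # degree reached 4: reduce by x^4 + x + 1
            a ^= 0b10011
    return result


def moore_det(a, b, c):
    """Determinant of the 3x3 matrix with rows (v), (v^2), (v^4) for v in (a, b, c)."""
    rows = [[a, b, c]]
    for _ in range(2):
        rows.append([gf16_mul(v, v) for v in rows[-1]])  # square the previous row
    det = 0
    for perm in itertools.permutations(range(3)):
        term = 1
        for row, col in zip(rows, perm):
            term = gf16_mul(term, row[col])
        det ^= term  # signs vanish in characteristic 2
    return det


def independent_over_f2(*elems):
    """True if no nonempty subset of elems XORs to zero."""
    for size in range(1, len(elems) + 1):
        for subset in itertools.combinations(elems, size):
            acc = 0
            for v in subset:
                acc ^= v
            if acc == 0:
                return False
    return True
```

Checking `moore_det(a, b, c) != 0` against `independent_over_f2(a, b, c)` for every triple in GF(16) is a quick way to see why the coefficients are chosen so that the relevant sums stay linearly independent.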
The main challenge in LRC design is to obtain maximally recoverable codes over small finite fields, which are needed for computational efficiency.
Open questions: