Locally decodable codes: from computational complexity to cloud computing (PowerPoint presentation)

SLIDE 1

Locally decodable codes:

from computational complexity to cloud computing

Sergey Yekhanin

Microsoft Research

SLIDE 2

Error-correcting codes: paradigm

[Diagram: message x ∈ 𝔽₂^k (0110001) β†’ Encoder β†’ codeword E(x) ∈ 𝔽₂^n (011000100101) β†’ Channel, which corrupts up to e coordinates (01*00*10010*) β†’ Decoder β†’ x (0110001)]

The paradigm dates back to the 1940s (Shannon / Hamming).

SLIDE 3

Local decoding: paradigm

[Diagram: message 0110001 β†’ Encoder β†’ codeword 011000100101 β†’ Channel β†’ corrupted word 01*00*10010* β†’ Local Decoder β†’ the single bit 1]

  • First account: Reed’s decoder for Muller’s codes (1954)
  • Implicit use: (1950s-1990s)
  • Formal definition and systematic study (late 1990s) [Levin’95, STV’98, KT’00]
  • Original applications in computational complexity theory
  • Cryptography
  • Most recently used in practice to provide reliability in distributed storage

Local decoder runs in time much smaller than the message length!

[Diagram: message x ∈ 𝔽₂^k β†’ Encoder β†’ codeword E(x) ∈ 𝔽₂^n β†’ Channel, which corrupts up to e coordinates β†’ Local Decoder, which reads up to r coordinates and outputs xα΅’]

SLIDE 4

Local decoding: example

X = (X1, X2, X3) β†’ E(X) = (X1, X1βŠ•X2, X2, X3, X1βŠ•X3, X2βŠ•X3, X1βŠ•X2βŠ•X3)

Message length: k = 3. Codeword length: n = 7. Corrupted locations: e = 3. Locality: r = 2.
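The encoding above stores one parity per nonempty subset of the three message bits, so any single bit can be recovered from just two reads. A minimal Python sketch of this encoder and 2-query decoder (my reconstruction, noiseless case; names are illustrative):

```python
import random
from itertools import combinations

def encode(x):
    """Encode 3 message bits as 7 parities, one per nonempty subset of {0,1,2}."""
    subsets = [frozenset(s) for size in (1, 2, 3)
               for s in combinations(range(3), size)]
    return {S: sum(x[i] for i in S) % 2 for S in subsets}

def local_decode(cw, i):
    """Recover x[i] with two reads: positions S and S xor {i} for a random S."""
    S = frozenset(j for j in range(3) if random.random() < 0.5)
    T = S ^ {i}
    read = lambda U: cw[U] if U else 0   # empty subset carries the empty XOR, 0
    return read(S) ^ read(T)

x = [1, 0, 1]
cw = encode(x)                            # the 7 codeword symbols
assert all(local_decode(cw, i) == x[i] for i in range(3))
```

With e = 3 known erasures one can choose S so that both read positions survive; with random errors the two read positions are random, so the decoder is correct only with constant probability, which repetition and majority voting boost.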

SLIDE 5

Local decoding: example

(Identical to Slide 4; animation step.)

SLIDE 6

Locally decodable codes

Definition: A code E: 𝔽_q^k β†’ 𝔽_q^n is r-locally decodable if, for every message x, each xα΅’ can be recovered by reading some r symbols of E(x), even after up to e coordinates of E(x) are corrupted.

  • (Erasures.) Decoder is aware of erased locations. Output is always correct.
  • (Errors.) Decoder is randomized. Output is correct with probability 99%.

[Diagram: a k-symbol message 0 0 1 0 1 … 0 1 is encoded into an n-symbol codeword; noise hits the codeword; the decoder reads only r symbols.]

SLIDE 7

Locally decodable codes

Goal: Understand the true shape of the tradeoff between redundancy n βˆ’ k and locality r, for different settings of e (e.g., e = Ξ΅n, n^Ξ±, O(1)).

[Table: taxonomy of known families of LDCs, indexed by the number of corruptions e ∈ {O(1), n^Ξ±, Ξ΅n} and the locality r ∈ {O(1), k^Ξ΅, (log k)^c}: Reed-Muller codes, multiplicity codes, matching vector codes, local reconstruction codes, projective geometry codes.]

SLIDE 8

Plan

  • Part I: Computational complexity
    – Average case hardness
    – An avg. case hard language in EXP (unless EXP βŠ† BPP)
    – Construction of LDCs
    – Open questions
  • Part II: Distributed data storage
    – Erasure coding for data storage
    – LDCs for data storage
    – Constructions and limitations
    – Open questions
SLIDE 9

Part I: Computational complexity

SLIDE 10

Average case complexity

  • A problem is hard-on-average if every efficient algorithm errs on at least 10% of the inputs.
  • Establishing hardness-on-average for a problem in NP is a major open problem.
  • Below we establish hardness-on-average for a problem in EXP, assuming EXP ⊈ BPP.

Construction [STV]: E: 𝔽₂^k β†’ 𝔽₂^n with n = poly(k), r = (log k)^c, e = n/10.

L is EXP-complete; L′ is in EXP. Level k of L is a string x of length 2^k (the characteristic vector of L on inputs of length k); the corresponding level of L′ is the encoding E(x).

Theorem: If there is an efficient algorithm that errs on <10% of L′, then EXP βŠ† BPP.

SLIDE 11

Average case complexity

E: 𝔽₂^k β†’ 𝔽₂^n with n = poly(k), r = (log k)^c, e = n/10. L is EXP-complete; L′ is in EXP: level k of L is a string x of length 2^k, and the corresponding level of L′ is E(x).

Theorem: If there is an efficient algorithm that errs on <10% of L′, then EXP βŠ† BPP.

Proof: We obtain a BPP algorithm for L:

  • Let A be the algorithm that errs on <10% of L′; A gives us access to the corrupted encoding E(x).
  • To decide xα΅’, invoke the local decoder for E(x), answering its queries with A.
  • Time complexity is (log 2^k)^c Β· poly(k) = poly(k).
  • Output is correct with probability 99%.
SLIDE 12

Reed Muller codes

  • Parameters: π‘Ÿ, 𝑛, 𝑒 = 1 βˆ’ 4πœ€ π‘Ÿ.
  • Codewords: evaluations of degree 𝑒 polynomials in 𝑛 variables over 𝐺

π‘Ÿ.

  • Polynomial 𝑔 ∈ 𝐺

π‘Ÿ 𝑨1, … , 𝑨𝑛 , deg f < 𝑒 yields a codeword: 𝑔(𝑦 ) 𝑦 ∈𝐺

π‘Ÿ 𝑛

  • Parameters: π‘œ = π‘Ÿπ‘›, 𝑙 = 𝑛 + 𝑒

𝑛 , 𝑠 = π‘Ÿ βˆ’ 1, 𝑓 = πœ€π‘œ.

SLIDE 13

Reed Muller codes: local decoding

  • Key observation: the restriction of a codeword to an affine line yields the evaluation of a univariate polynomial g(t) of degree at most d.
  • To recover the value at a point y ∈ 𝔽_q^m:
    – Pick a random affine line through y.
    – Do noisy polynomial interpolation.
  • Locally decodable code: the decoder reads q βˆ’ 1 random locations.

[Diagram: a random affine line through the point y in 𝔽_q^m.]

SLIDE 14

Reed Muller codes: parameters

π‘œ = π‘Ÿπ‘›, 𝑙 = 𝑛 + 𝑒 𝑛 , 𝑒 = 1 βˆ’ 4πœ€ π‘Ÿ, 𝑠 = π‘Ÿ βˆ’ 1, 𝑓 = πœ€π‘œ.

Setting parameters:

  • q = 𝑃 1 , 𝑛 β†’ ∞: 𝑠 = 𝑃 1 , π‘œ = exp 𝑙

1 π‘ βˆ’1 .

  • q = 𝑛2 ∢ 𝑠 = (log 𝑙)2, π‘œ = π‘žπ‘π‘šπ‘§ 𝑙 .
  • q β†’ ∞, 𝑛 = 𝑃 1 : 𝑠 = π‘™πœ—, π‘œ = 𝑃 𝑙 .

Reducing codeword length is a major open question.

Better codes are known

SLIDE 15

Part II: Distributed storage

SLIDE 16

Data storage

  • Store data reliably
  • Keep it readily available for users
SLIDE 17

Data storage: Replication

  • Store data reliably
  • Keep it readily available for users
  • Very large overhead
  • Moderate reliability
  • Local recovery:

Lose one machine, access one

SLIDE 18

Data storage: Erasure coding

  • Store data reliably
  • Keep it readily available for users
  • Low overhead
  • High reliability
  • No local recovery:

Lose one machine, access k

[Diagram: k data chunks and nβˆ’k parity chunks distributed across machines.]

Need: Erasure codes with local decoding

SLIDE 19

Codes for data storage

  • Goals:
  • (Cost) minimize the number of parities.
  • (Reliability) tolerate any pattern of h+1 simultaneous failures.
  • (Availability) recover any data symbol by accessing at most r other symbols.
  • (Computational efficiency) use a small finite field to define parities.

[Diagram: data symbols X1 … Xk and parity symbols P1 … Pn-k.]

SLIDE 20

Local reconstruction codes

  • Def: An (r,h) – Local Reconstruction Code (LRC) encodes k symbols to n symbols, and
  • Corrects any pattern of h+1 simultaneous failures;
  • Recovers any single erased data symbol by accessing at most r other symbols.
SLIDE 21

Local reconstruction codes

  • Def: An (r,h) – Local Reconstruction Code (LRC) encodes k symbols to n symbols, and
  • Corrects any pattern of h+1 simultaneous failures;
  • Recovers any single erased data symbol by accessing at most r other symbols.
  • Theorem [GHSY]: In any (r,h)-LRC, the redundancy satisfies n βˆ’ k β‰₯ k/r + h.

SLIDE 22

Local reconstruction codes

  • Def: An (r,h) – Local Reconstruction Code (LRC) encodes k symbols to n symbols, and
  • Corrects any pattern of h+1 simultaneous failures;
  • Recovers any single erased data symbol by accessing at most r other symbols.
  • Theorem [GHSY]: In any (r,h)-LRC, the redundancy satisfies n βˆ’ k β‰₯ k/r + h.
  • Theorem [GHSY]: If r | k and h < r + 1, then any (r,h)-LRC has the following topology:

[Diagram: data symbols X1 … Xk partitioned into local groups of size r, each group with one light parity (L1 … Lg); heavy parities H1 … Hh cover all the data symbols.]

SLIDE 23

Local reconstruction codes

  • Def: An (r,h) – Local Reconstruction Code (LRC) encodes k symbols to n symbols, and
  • Corrects any pattern of h+1 simultaneous failures;
  • Recovers any single erased data symbol by accessing at most r other symbols.

[Diagram: data symbols in local groups of size r, with light parities L1 … Lg and heavy parities H1 … Hh.]

  • Theorem [GHSY]: In any (r,h)-LRC, the redundancy satisfies n βˆ’ k β‰₯ k/r + h.
  • Theorem [GHSY]: If r | k and h < r + 1, then any (r,h)-LRC has the topology above.
  • Fact: There exist (r,h)-LRCs with optimal redundancy over a field of size k + h.
SLIDE 24

Reliability

Set k=8, r=4, and h=3. The redundancy is n βˆ’ k = 5 parities (L1, L2, H1, H2, H3), matching the bound k/r + h = 8/4 + 3 = 5.

[Diagram: data symbols X1 … X8 in two local groups with light parities L1, L2 and heavy parities H1, H2, H3.]

SLIDE 25

Reliability

Set k=8, r=4, and h=3.

  • All 4-failure patterns are correctable.

SLIDE 26

Reliability

Set k=8, r=4, and h=3.

  • All 4-failure patterns are correctable.
  • Some 5-failure patterns are not correctable.

SLIDE 27

Reliability

Set k=8, r=4, and h=3.

  • All 4-failure patterns are correctable.
  • Some 5-failure patterns are not correctable.
  • Other 5-failure patterns might be correctable.
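These claims can be checked numerically. The sketch below is my own stand-in for the code (two group-XOR light parities and three heavy parities with random coefficients over a large prime field, rather than the actual small-field construction); it tests a failure pattern by asking whether the surviving columns of the generator matrix still span 𝔽_p^k:

```python
import random
from itertools import combinations

p = (1 << 61) - 1                    # large prime field for the heavy parities
k, r, h = 8, 4, 3                    # the slide's parameters; n = 13
random.seed(0)

unit = lambda i: [int(j == i) for j in range(k)]
light = [[1, 1, 1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1, 1, 1]]  # group sums
heavy = [[random.randrange(1, p) for _ in range(k)] for _ in range(h)]
columns = [unit(i) for i in range(k)] + light + heavy   # one vector per symbol

def rank_mod_p(rows):
    """Rank of a matrix over F_p by Gaussian elimination."""
    rows = [row[:] for row in rows]
    rank = 0
    for c in range(k):
        piv = next((i for i in range(rank, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][c], p - 2, p)               # modular inverse
        rows[rank] = [v * inv % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][c]:
                f = rows[i][c]
                rows[i] = [(v - f * w) % p for v, w in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def correctable(erased):
    # An erasure pattern is correctable iff the surviving columns span F_p^k.
    surviving = [columns[j] for j in range(k + 2 + h) if j not in erased]
    return rank_mod_p(surviving) == k

assert all(correctable(set(e)) for e in combinations(range(13), 4))
assert not correctable({0, 1, 2, 3, 8})   # a whole local group: 5 failures
```

Erasing an entire local group (four data symbols plus its light parity) leaves only three heavy parities to determine four unknowns, so no choice of coefficients can correct that 5-failure pattern.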

SLIDE 28

Reliability

(Identical to Slide 27; animation step.)

SLIDE 29

Combinatorics of correctable failure patterns

Def: A regular failure pattern for an (r,h)-LRC is a pattern that can be obtained by failing one symbol in each local group and h extra symbols.

[Diagram: two regular failure patterns for the k=8, r=4, h=3 code.]

Theorem:

  • Every failure pattern that is not dominated by a regular failure pattern is not correctable by any LRC.

  • There exist LRCs that correct all regular failure patterns.
SLIDE 30

Maximally recoverable codes

Def: An (r,h)-LRC is maximally recoverable if it corrects all regular failure patterns.

Theorem: Maximally recoverable (r,h)-LRCs exist.

Proof sketch: Pick the coefficients in the heavy parities at random from a large finite field.

The tradeoff: larger fields allow for more reliable codes, up to maximal recoverability. We want both: small field size (efficiency) and maximal recoverability.

Asymptotic setting: h = O(1), r = O(1), k β†’ ∞. A random choice needs a field of size at least Ξ©(k^{hβˆ’1}).

SLIDE 31

Explicit maximally recoverable codes

Theorem [GHJY]: There exist maximally recoverable (r,h)-LRCs over a field of size O(k^{(hβˆ’1)(1 βˆ’ 1/2^r)}).

Comparison:

  • Our alphabet grows as O(k^{hβˆ’1}) or slower.
  • Beats random codes for small h and large h.
  • Our only lower bound for the alphabet size thus far is k + 1, independent of h.
SLIDE 32

Code construction

We use dual constraints to specify the code.

π’šπŸ π’šπŸ‘ … π’šπ’” π‘΄πŸ

…

π’šπ’βˆ’π’” π’šπ’βˆ’π’”+𝟐 … π’šπ’ 𝑴𝒍/𝒔 π‘°πŸ π‘°πŸ‘ … π‘°π’Š 1 1 … 1 1 1 1 … 1 1 𝑙 𝑠 h

π›½π‘—π‘˜ π›½π‘—π‘˜

2

… π›½π‘—π‘˜

2β„Žβˆ’1

Element π›½π‘—π‘˜ appears in the j-th column of the i-th group.

We consider a sequence field extensions 𝐺

2 βŠ† 𝐺2𝑏 βŠ† 𝐺2𝑐.

{πœŠπ‘˜} βŠ† 𝐺2𝑏 form a basis over 𝐺

2.

{πœ‡π‘—} βŠ† 𝐺2𝑐 are β„Ž-independent over 𝐺2𝑏.

π›½π‘—π‘˜=πœŠπ‘˜ Γ— πœ‡π‘—.

𝑙 𝑠 + 1 local groups.

SLIDE 33

Erasure correction

k=8, r=4, h=2.

π’šπŸ π’šπŸ‘ π’šπŸ’ π’šπŸ“ π‘΄πŸ π’šπŸ” π’šπŸ• π’šπŸ– π’šπŸ— π‘΄πŸ‘ π‘°πŸ π‘°πŸ‘ π‘°πŸ’ 1 1 1 1 1 1 1 1 1 1 𝛽11 𝛽12 𝛽21 𝛽22 𝛽31 𝛽11

2 𝛽12 2

𝛽21

2 𝛽22 2

𝛽31

2

𝛽11

4 𝛽12 4

𝛽21

4 𝛽22 4

𝛽31

4

1 1 1 1 𝛽11 𝛽12 𝛽21 𝛽22 𝛽31 𝛽11

2 𝛽12 2 𝛽21 2 𝛽22 2 𝛽31 2

𝛽11

4 𝛽12 4 𝛽21 4 𝛽22 4 𝛽31 4

𝛽11+𝛽12 𝛽21+𝛽22 𝛽31 𝛽11

2 +𝛽12 2

𝛽21

2 +𝛽22 2

𝛽31

2

𝛽11

4 +𝛽12 4

𝛽21

4 +𝛽22 4

𝛽31

4

(𝛽11+𝛽12) (𝛽21+𝛽22) 𝛽31 (𝛽11+𝛽12) 2 (𝛽21+𝛽22 )2 𝛽31

2

(𝛽11+𝛽12) 4 (𝛽21+𝛽22 )4 𝛽31

4

(𝛽11+𝛽12) (𝛽21+𝛽22) 𝛽31 (𝜊1 + 𝜊2) Γ— πœ‡1 (𝜊1 + 𝜊2) Γ— πœ‡2 𝜊1 Γ— πœ‡3

SLIDE 34

Looking forward

The main challenge in LRC design is to obtain maximally recoverable codes over small finite fields. Empirical evidence suggests that there is a tradeoff between reliability and computational efficiency.

Open questions:

  • Study the tradeoff between redundancy and locality.
  • Develop tight bounds for redundancy when e is a constant larger than 1.