
Codes for Big Data: Erasure Coding for Distributed Storage

Codes for Big Data: Erasure Coding for Distributed Storage. P. Vijay Kumar, Professor, Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore. The 3rd Annual Storage Developer Conference, Bengaluru, May 25-26,


  1. Windows Azure Storage Coding Solution. [Figure: the Microsoft Azure code, with data symbols X1 to X7 and Y1 to Y7, parities PX and PY for the X code and Y code, and parities P1 and P2; and the RS code, with data symbols X1 to X6 and parities P1 to P3.] Comparison: In terms of reliability and number of helper nodes contacted for node repair, the two codes are comparable. The overheads, however, are quite different: 1.29 for the Azure code versus 1.5 for the RS code. This difference has reportedly saved Microsoft millions of dollars. Huang, Simitci, Xu, Ogus, Calder, Gopalan, Li, Yekhanin, “Erasure Coding in Windows Azure Storage,” USENIX, Boston, MA, 2012.
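The two overhead figures quoted on this slide can be reproduced as stored symbols divided by data symbols. The sketch below assumes the symbol counts readable from the figure labels (14 data and 4 parity symbols for the Azure code, 6 data and 3 parity symbols for the RS code); the function name `storage_overhead` is illustrative only.

```python
from fractions import Fraction

def storage_overhead(data_symbols: int, parity_symbols: int) -> Fraction:
    """Storage overhead n / k: total stored symbols per data symbol."""
    return Fraction(data_symbols + parity_symbols, data_symbols)

# Assumed from the figure labels: X1-X7 and Y1-Y7 are data, PX, PY, P1, P2 are parities.
azure = storage_overhead(14, 4)
# Assumed from the figure labels: X1-X6 are data, P1-P3 are parities.
rs = storage_overhead(6, 3)

print(round(float(azure), 2), float(rs))  # 1.29 1.5
```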

  2. Codes with Hierarchical Locality. [4, 3, 2] code ⇒ (3,1) code; [12, 8, 3] code ⇒ (8,4) code; [24, 14, 6] code ⇒ (14,10) code. Codes with hierarchical locality do exactly that by calling for help from an intermediate layer of codes when the local code fails. These codes may be regarded as the “middle codes”. B. Sasidharan, G. K. Agarwal, PVK, “Codes With Hierarchical Locality,” arXiv:1501.06683 [cs.IT].

  3. Codes with Local Regeneration

  4. Codes with Local Regeneration. Codes with Locality: minimize repair degree. Regenerating Codes: minimize repair BW. Codes with Local Regeneration: small repair BW and small repair degree. A single code that has both locality and regeneration properties and inherent double replication of data. G. M. Kamath, N. Prakash, V. Lalitha, PVK, “Codes With Local Regeneration and Erasure Correction,” T-IT, Aug. 2014.

  5. An Example Code with Local Regeneration. The construction can make use of an all-symbol local scalar code and is also optimal. [Figure: a Scalar All-Symbol Locality Code made up of Local Code 1, Local Code 2 and Local Code 3, with symbols (1, 2, ..., 9, P1), (1, 2, ..., 9, P2) and (1, 2, ..., 9, P3); above it, the node layouts of the three local codes, each node storing a subset of these symbols.]

  6. Codes with Availability (Recovery from Simultaneous Multiple Erasures)

  7. Recovery in Parallel. [Figure: 5 × 5 array of code symbols c11 through c55, with two erased symbols marked X.] Last column is a parity check on entries to the left in the same row. Last row is a parity check on entries above in the same column. Can recover locally from 2 erasures in parallel.
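A minimal sketch of the row-and-column parity array described on this slide, assuming single XOR parity checks and integer-valued symbols (the slide does not fix the alphabet). Two erasures in the same row are each rebuilt from their own column, so the two repairs read disjoint sets of surviving symbols and can run in parallel. All function names are illustrative.

```python
from functools import reduce
from operator import xor

def encode(data):
    """Extend a 4x4 data block to 5x5: parity column first, then parity row."""
    rows = [r + [reduce(xor, r)] for r in data]
    rows.append([reduce(xor, col) for col in zip(*rows)])
    return rows

def repair_from_col(c, i, j):
    """Rebuild c[i][j] from the other four entries of column j."""
    return reduce(xor, (c[t][j] for t in range(5) if t != i))

data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
code = encode(data)

# Erase two symbols that share a row (an assumed erasure pattern).
erased = [(1, 1), (1, 2)]
damaged = [row[:] for row in code]
for i, j in erased:
    damaged[i][j] = None

# Each repair uses only its own column, so neither waits for the other.
repaired = {(i, j): repair_from_col(damaged, i, j) for i, j in erased}
print(repaired)
```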

  8. Codes with Sequential Recovery (Recovery from Simultaneous Multiple Erasures)

  9. Sequential Recovery. [Figure: the same 5 × 5 array of code symbols c11 through c55, with three erased symbols marked X.] Same code as before. Can recover locally from 3 erasures in a sequential manner. Sequential recovery enables codes with larger storage efficiency.
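Sequential recovery can be sketched on the same 5 × 5 array, again assuming XOR parity and integer symbols. With the assumed erasure pattern below, the symbol at (0, 0) has an erasure in both its row and its column, so it cannot be repaired until one of the other two has been rebuilt. The function names are illustrative.

```python
from functools import reduce
from operator import xor

N = 5

def encode(data):
    """Extend a 4x4 data block to 5x5: parity column first, then parity row."""
    rows = [r + [reduce(xor, r)] for r in data]
    rows.append([reduce(xor, col) for col in zip(*rows)])
    return rows

def sequential_repair(c):
    """Repeatedly rebuild any erased symbol whose row or column is otherwise intact."""
    c = [row[:] for row in c]
    order = []
    progress = True
    while progress:
        progress = False
        for i in range(N):
            for j in range(N):
                if c[i][j] is not None:
                    continue
                row = [c[i][t] for t in range(N) if t != j]
                col = [c[t][j] for t in range(N) if t != i]
                for helpers in (row, col):
                    if None not in helpers:
                        c[i][j] = reduce(xor, helpers)
                        order.append((i, j))
                        progress = True
                        break
    return c, order

data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
code = encode(data)
damaged = [row[:] for row in code]
for i, j in [(0, 0), (0, 1), (1, 0)]:  # assumed erasure pattern
    damaged[i][j] = None

fixed, order = sequential_repair(damaged)
print(order)  # (0, 0) is repaired last
```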

  10. References - Codes for Multiple Erasures. [1] A. Wang and Z. Zhang, “Repair locality with multiple erasure tolerance,” IEEE Trans. Inf. Theory, Nov. 2014. [2] N. Prakash, V. Lalitha, and P. V. Kumar, “Codes with locality for two erasures,” in Proc. IEEE Int. Symp. Inform. Theory (ISIT) 2014. [3] W. Song and C. Yuen, “Binary locally repairable codes - sequential repair for multiple erasures,” in Proc. IEEE GLOBECOM, 2016.

  11. Functioning of an Example, Coupled-Layer MSR Code. Goal: To show that a larger sub-packetization level is not necessarily a problem for implementation.

  12. Example Coupled-Layer MSR Code. Our coupled-layer perspective on the Ye-Barg construction [1, 2]: ● a (4, 2) MSR code ● 6 nodes, sub-packetization level is ℓ = 8 ● 6 × 8 = 48 points ● in the example to follow, each point stores 2MB. [Figure: data cube with axes x, y, z; planes labeled from z = (0,0,0) to z = (1,1,1); one point marked 2MB.] [1] M. Ye and A. Barg, “Explicit constructions of optimal-access MDS codes with nearly optimal sub-packetization,” May 2016. [2] B. Sasidharan, M. Vajha, and PVK, “An Explicit, Coupled-Layer Construction of a High-Rate MSR Code with Low Sub-Packetization Level, Small Field Size and d < (n − 1),” to be presented at ISIT 2017.

  13. Consider a file of size 64MB • Will encode via a [k=4, m=2] MSR Code • Called the Coupled-Layer MSR Code

  14. Step 1: Break file into k = 4 data chunks, each of 16MB.

  15. Data cube representation of CL-MSR Code. The cube has: ● 6 columns, each associated to a distinct node ● 8 horizontal planes ● A column has 8 points ● Each point corresponds to 2MB of storage. [Figure: four 16MB chunks next to the data cube, with axes x, y, z and planes labeled from z = (0,0,0) to z = (1,1,1).]
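The sizes quoted on the last few slides follow from simple arithmetic, sketched below. The variable names are illustrative, and the ordering of the plane labels is an assumption that matches the order in which the planes are visited on the RS-encoding slides.

```python
K, M = 4, 2          # data nodes and parity nodes, from the [k=4, m=2] code
FILE_MB = 64         # file size in the example
PLANES = 8           # sub-packetization level: points per column

nodes = K + M                     # columns of the cube
chunk_mb = FILE_MB // K           # size of each data chunk
point_mb = chunk_mb // PLANES     # size of one point in a column
points = nodes * PLANES           # points in the whole cube

# Plane labels z = (z0, z1, z2), first coordinate varying fastest (assumed order).
planes = [(a, b, c) for c in (0, 1) for b in (0, 1) for a in (0, 1)]

print(nodes, chunk_mb, point_mb, points)  # 6 16 2 48
print(planes[0], planes[-1])              # (0, 0, 0) (1, 1, 1)
```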

  16. Place four 16MB chunks in four systematic nodes [Figure: data cube with the first chunk placed; three 16MB chunks remain.]

  17. Place four 16MB chunks in four systematic nodes [Figure: data cube with two chunks placed; two 16MB chunks remain.]

  18. Place four 16MB chunks in four systematic nodes [Figure: data cube with three chunks placed; one 16MB chunk remains.]

  19. Place four 16MB chunks in four systematic nodes [Figure: data cube with all four chunks placed.]

  20. We now have the systematic nodes

  21. We will now compute the parity nodes [Figure: Actual data cube A.]

  22. Will get there through an intermediate “Virtual data cube” [Figure: Actual data cube A and Virtual data cube B.]

  23. Start filling the virtual data cube on the right as follows

  24. Certain pairs of points in the cube are “coupled” [Figure: a coupled pair of points A1, A2.]

  25. The Coupling Transform is a 2x2 matrix transform [Figure: the coupled pair (A1, A2) is mapped by the Coupling Transform to the pair (B1, B2).]
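The slides do not give the entries of the 2x2 matrix, so the following is only a sketch under stated assumptions: symbols are bytes in GF(2^8) with reduction polynomial 0x11D, and the matrix has the hypothetical form [[1, g], [g, 1]] with g = 2, chosen so that it is invertible (g·g ≠ 1). The names `GAMMA`, `couple` and `decouple` are illustrative. In practice the transform would be applied byte by byte across the 2MB points.

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two bytes in GF(2^8) by shift-and-add, reducing by poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def gf_inv(a: int) -> int:
    """Inverse in GF(2^8) as a^254, valid for a != 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

GAMMA = 2  # hypothetical coupling coefficient, not taken from the slides
DET_INV = gf_inv(1 ^ gf_mul(GAMMA, GAMMA))  # 1 / (1 + g^2) in characteristic 2

def couple(a1: int, a2: int) -> tuple:
    """(B1, B2) = [[1, g], [g, 1]] applied to (A1, A2)."""
    return a1 ^ gf_mul(GAMMA, a2), gf_mul(GAMMA, a1) ^ a2

def decouple(b1: int, b2: int) -> tuple:
    """Inverse transform: the same matrix scaled by 1 / (1 + g^2)."""
    return (gf_mul(DET_INV, b1 ^ gf_mul(GAMMA, b2)),
            gf_mul(DET_INV, gf_mul(GAMMA, b1) ^ b2))

b1, b2 = couple(0x53, 0xCA)
print(decouple(b1, b2) == (0x53, 0xCA))  # True
```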

  26-35. Place the points obtained in the Virtual data cube [Animation: successive coupled pairs (A1, A2) are passed through the Coupling Transform and the resulting pairs (B1, B2) are placed in the Virtual data cube.]

  36-37. Red dotted points are not paired; they are simply carried over (copied) to the Virtual data cube.
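The slides do not state which points are coupled and which are left unpaired. The sketch below assumes the rule used in the published coupled-layer construction as commonly described: the 6 nodes form a 2 × 3 grid indexed by (x, y), each plane is indexed by z = (z0, z1, z2), a point (x, y, z) is unpaired when z[y] equals x, and otherwise its partner is the point in column (z[y], y) on the plane obtained from z by replacing z[y] with x. The grid shape and the function name `partner` are assumptions for illustration.

```python
from itertools import product

Q, T = 2, 3  # assumed node grid: x in {0, 1}, y in {0, 1, 2}

def partner(x, y, z):
    """Return the coupled partner of point (x, y, z), or None if it is unpaired."""
    if z[y] == x:
        return None
    z2 = list(z)
    z2[y] = x
    return (z[y], y, tuple(z2))

points = [(x, y, z)
          for z in product(range(Q), repeat=T)
          for y in range(T)
          for x in range(Q)]

unpaired = [p for p in points if partner(*p) is None]
pairs = {frozenset((p, partner(*p))) for p in points if partner(*p) is not None}

print(len(points), len(unpaired), len(pairs))  # 48 24 12
```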

  38. We now have the data part of the Virtual data cube [Figure: data cube with axes x, y, z and planes labeled from z = (0,0,0) to z = (1,1,1).]

  39-42. Each plane is Reed-Solomon coded to obtain parity points. Plane z = (0,0,0): RS Encode.

  43. Each plane is Reed-Solomon coded to obtain parity points. Plane z = (1,0,0): RS Encode.

  44. Each plane is Reed-Solomon coded to obtain parity points. Plane z = (0,1,0): RS Encode.

  45. Each plane is Reed-Solomon coded to obtain parity points. Plane z = (1,1,0): RS Encode.

  46. Each plane is Reed-Solomon coded to obtain parity points. Plane z = (0,0,1): RS Encode.

  47. Each plane is Reed-Solomon coded to obtain parity points. Plane z = (1,0,1): RS Encode.

  48. Each plane is Reed-Solomon coded to obtain parity points. Plane z = (0,1,1): RS Encode.

  49. Each plane is Reed-Solomon coded to obtain parity points. Plane z = (1,1,1): RS Encode.
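The per-plane encoding step can be sketched with a toy Reed-Solomon-style code. This is not the code used in the talk: for brevity it works over the prime field GF(257) rather than a byte-oriented field, treats the 4 data symbols of a plane as values of a polynomial of degree below 4, and takes the 2 parity symbols as its values at two further points. The names `lagrange_eval`, `rs_encode_plane` and `encode_cube` are illustrative. Requires Python 3.8 or later for `pow(x, -1, p)`.

```python
P = 257  # small prime field, a stand-in for the field used in practice

def lagrange_eval(xs, ys, x):
    """Evaluate at x the unique polynomial of degree < len(xs) through (xs, ys), mod P."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num = den = 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def rs_encode_plane(data, m=2):
    """Return m parity symbols for one plane of k data symbols."""
    k = len(data)
    return [lagrange_eval(list(range(k)), data, k + t) for t in range(m)]

def encode_cube(planes):
    """planes maps a plane label z to its 4 data symbols; returns z -> 6 symbols."""
    return {z: d + rs_encode_plane(d) for z, d in planes.items()}

cube = encode_cube({(0, 0, 0): [10, 20, 30, 40], (1, 0, 0): [1, 2, 3, 5]})
codeword = cube[(0, 0, 0)]

# Any 4 of the 6 symbols in a plane determine the rest, e.g. positions 1, 2, 4, 5.
keep = [1, 2, 4, 5]
recovered = [lagrange_eval(keep, [codeword[p] for p in keep], x) for x in (0, 3)]
print(recovered)  # [10, 40]
```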

  50. Now we have the complete Virtual data cube [Figure: Virtual data cube B.]

  51. Parity points of Actual data cube can now be computed [Figure: Virtual data cube B.]

  52-63. Perform decoupling [Animation: successive coupled pairs (B1, B2) in Virtual data cube B are passed through the Inverse Coupling Transform and the resulting pairs (A1, A2) are placed in the Actual data cube.]

  64-65. Red dotted points are simply carried over (copied) from Virtual data cube B.

  66. Actual and Virtual data cubes [Figure: Actual data cube A and Virtual data cube B, related by Coupling (A to B) and Decoupling (B to A).]

  67. The encoding is now completed!

  68-69. Problem of Node Repair: One node fails

  70. For this example, only half of the planes participate in repair ● Total helper data = 2MB × 4 × 5 = 40MB ● As opposed to the RS code: 16MB × 4 = 64MB ● Much larger savings are seen for m > 2
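The two totals on this slide can be checked with a short calculation. The sketch assumes that all 5 surviving nodes act as helpers and that each sends one point from each participating plane, which is how the product 2MB × 4 × 5 reads; the variable names are illustrative.

```python
K, M = 4, 2
POINT_MB, PLANES = 2, 8

helpers = K + M - 1              # the 5 surviving nodes
planes_used = PLANES // M        # for m = 2 this is half of the planes
msr_repair_mb = POINT_MB * planes_used * helpers

# RS repair reads k full columns, each PLANES points of POINT_MB.
rs_repair_mb = (POINT_MB * PLANES) * K

saving = 1 - msr_repair_mb / rs_repair_mb
print(msr_repair_mb, rs_repair_mb, saving)  # 40 64 0.375
```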

  71. Couple points [Figure: the Coupling step.]

  72. Run RS decoding on each of the selected planes [Figure: the Coupling step followed by RS decoding.]
