coding for loss tolerant systems

Coding for loss tolerant systems, Workshop APRETAF, 22 January 2009

Coding for loss tolerant systems. Workshop APRETAF, 22 January 2009. Mathieu Cunche, Vincent Roca, INRIA, Planète team, INRIA Rhône-Alpes. The erasure channel; Erasure codes; Reed-Solomon codes; LDPC


  1. Coding for loss tolerant systems. Workshop APRETAF, 22 January 2009. Mathieu Cunche, Vincent Roca, INRIA, Planète team, INRIA Rhône-Alpes

  2. The erasure channel; Erasure codes; Reed-Solomon codes; LDPC codes; Application to distributed storage

  3. The erasure channel
     - Definition: a symbol either arrives at the destination without any error, or is erased and never received.
     - [Figure: erasure channel diagram; inputs 0 and 1 map either to the same output value or to "Erased".]
     - This differs from the BSC (binary symmetric channel) and AWGN channels.
     - The integrity assumption is a strong hypothesis: a received symbol is 100% guaranteed error free.

  4. The erasure channel: where do we find erasure channels?
     - On the Internet
       - because of routing errors or congestion
       - because of a bad CRC/checksum
     - On wireless and satellite networks
       - intermittent connection due to obstacles
     - Distributed storage
       - disk failure in RAID systems
       - node failure in a data center
     - Distributed computation
       - fail-stop

  5. The erasure channel; Erasure codes; Reed-Solomon codes; LDPC codes; Application to distributed storage

  6. Erasure codes
     - k source symbols are encoded into n encoding symbols
     - Code rate = k / n (number of symbols before encoding over number of symbols after encoding)
       - close to 1: little redundancy
       - close to 0: high amount of redundancy
     - [Figure: source object -> k source symbols -> encoding -> k source symbols plus (n-k) repair symbols -> transmission with symbol erasures -> decoding -> decoded object]

  7. Erasure codes
     - Often used as AL-FEC codes ("Application Level - Forward Error Correction" codes)
     - AL-FEC codes differ from physical-layer FEC codes
       - PHY codes: correct bit errors and, if that is not possible, detect the errors; symbol = bit
       - AL-FEC codes: recover from symbol erasures; symbol = byte, IP datagram, file chunk

  8. Erasure codes: how can we define good erasure codes? Performance metrics for erasure codes:
     - Erasure recovery capabilities: the main metric, measured as the overhead ratio
       $\text{decoding\_overhead} = \frac{\text{number of symbols required for decoding}}{k} - 1$
       Decoding needs (1 + overhead) * k symbols to succeed, whereas ideal (MDS) codes need only k symbols.
     - Encoding and decoding speed, to appreciate the complexity
     - Required memory during encoding and decoding
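
The two ratios used so far (code rate and decoding overhead) can be computed as in the following minimal sketch. The function names and the numbers are illustrative assumptions, not values from the slides.

```python
def code_rate(k: int, n: int) -> float:
    """Code rate k/n: k source symbols encoded into n encoding symbols."""
    return k / n


def decoding_overhead(symbols_needed: int, k: int) -> float:
    """Overhead ratio: (number of symbols required for decoding) / k - 1."""
    return symbols_needed / k - 1


# Hypothetical figures, chosen only to illustrate the definitions.
rate = code_rate(k=1000, n=1500)                 # 2/3: moderate redundancy
mds_overhead = decoding_overhead(1000, k=1000)   # ideal (MDS) code: 0.0
ldpc_overhead = decoding_overhead(1022, k=1000)  # 22 extra symbols: about 2.2 %
```

An overhead of 0 means that any k received symbols are enough, which is the MDS case.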

  9. The erasure channel; Erasure codes; Reed-Solomon codes; LDPC codes; Application to distributed storage

  10. Reed Solomon codes: in short
      - Discovered by Reed & Solomon in 1959
      - Linear codes over GF(2^n)
        - sum: simple binary XOR
        - multiplication and division: use a logarithmic table
      - Based on polynomial interpolation
      - Practical implementation with a Vandermonde matrix
        - any k×k submatrix of the Vandermonde generator matrix is invertible
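
The field operations listed above can be sketched as follows for GF(2^8). The primitive polynomial 0x11D and the generator 2 are assumptions (a common choice); the slides do not say which polynomial the authors used, and the function names are illustrative.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed, not from the slides)

# EXP[i] = 2^i in GF(2^8); LOG is the inverse table for non-zero elements.
# EXP is doubled in length so that index sums never need a modulo.
EXP = [0] * 510
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:          # degree 8 reached: reduce by the primitive polynomial
        x ^= PRIM_POLY
for i in range(255, 510):
    EXP[i] = EXP[i - 255]


def gf_add(a: int, b: int) -> int:
    """Addition (and subtraction) is a plain XOR."""
    return a ^ b


def gf_mul(a: int, b: int) -> int:
    """Multiplication: add the logarithms, then look up the exponential."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_div(a: int, b: int) -> int:
    """Division: subtract the logarithms (offset by 255 to stay non-negative)."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    if a == 0:
        return 0
    return EXP[LOG[a] - LOG[b] + 255]
```

With 8-bit symbols both tables hold a few hundred entries, which is why they fit easily in cache.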

  11. Reed Solomon codes: encoding
      - Matrix-vector multiplication: X × G = Y
        - X: source vector (k source symbols)
        - G: generator matrix (k × n Vandermonde)
        - Y: encoded vector (n encoded symbols)
      - Complexity: O(k^2) operations

  12. Reed Solomon codes: decoding
      - Solve a linear system: X × G' = Y'
        - X: source vector (k source symbols)
        - G': k×k submatrix of G (invertible)
        - Y': received vector (k received symbols)
      - Good Vandermonde (VDM) property: any k×k submatrix is invertible
        - k encoding symbols are enough to decode
        - decoding overhead = 0; said differently, RS codes are MDS
      - Complexity: O(k^3)
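
The encoding (X × G = Y) and decoding (solve X × G' = Y') steps can be sketched as below, with one byte per symbol. This is a minimal, non-systematic illustration under stated assumptions: GF(2^8) with primitive polynomial 0x11D, evaluation points 1..n, and illustrative function names. It is not the authors' implementation.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial for GF(2^8)
EXP, LOG = [0] * 510, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= PRIM_POLY
for i in range(255, 510):
    EXP[i] = EXP[i - 255]


def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def inv(a):
    return EXP[255 - LOG[a]]


def vandermonde(k, n):
    """k x n matrix with G[i][j] = alpha_j^i, alpha_j = j + 1 (needs n <= 255)."""
    return [[EXP[(LOG[j + 1] * i) % 255] for j in range(n)] for i in range(k)]


def encode(source, G):
    """Y = X x G: encoded symbol j is the XOR-sum of source[i] * G[i][j]."""
    encoded = []
    for j in range(len(G[0])):
        acc = 0
        for i, s in enumerate(source):
            acc ^= mul(s, G[i][j])
        encoded.append(acc)
    return encoded


def decode(received, G, k):
    """Recover the k source symbols from any k received symbols.

    `received` maps an encoded-symbol index j to its value. Gauss-Jordan
    elimination over GF(2^8) solves the k x k system; a pivot always exists
    because any k columns of G form an invertible Vandermonde matrix.
    """
    cols = sorted(received)[:k]
    M = [[G[i][j] for i in range(k)] + [received[j]] for j in cols]
    for c in range(k):
        p = next(r for r in range(c, k) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        f = inv(M[c][c])
        M[c] = [mul(f, v) for v in M[c]]
        for r in range(k):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [v ^ mul(f, w) for v, w in zip(M[r], M[c])]
    return [M[i][k] for i in range(k)]


source = [10, 200, 33, 7]                 # k = 4 one-byte source symbols
G = vandermonde(4, 8)                     # n = 8 encoded symbols
encoded = encode(source, G)
survivors = {j: encoded[j] for j in (1, 3, 6, 7)}   # 4 of 8 symbols erased
recovered = decode(survivors, G, 4)
```

The cubic cost of the elimination is visible in the three nested loops over k.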

  13. Reed Solomon codes: summary
      - Perfect codes
        - decoding overhead = 0
        - decoding is possible as soon as k symbols are received
      - ... but limited scalability
        - n < 255: GF(2^8) is sufficient
          - fast operations over GF(2^8) (small logarithmic table)
          - decoding speed = a few tens of Mbps
        - n > 255: use GF(2^16) or more
          - the log table is too large and cannot fit in cache
          - decoding speed falls to a few Mbps

  14. The erasure channel; Erasure codes; Reed-Solomon codes; LDPC codes; Application to distributed storage

  15. LDPC codes: in short
      - "Low Density Parity Check" (LDPC)
        - linear block codes
        - sparse parity check matrix
      - Discovered by Gallager in the 60s, rediscovered in the mid-90s
      - In general, encoding requires solving a linear system: O(k^3)
        - but high-performance, lightweight variants exist
      - In the remainder we focus on binary LDPC codes
        - based on XOR operations

  16. LDPC codes: LDPC-staircase codes (RFC 5170)
      - A simple (trivial) parity check matrix structure (S = source symbols, P = parity symbols):

            S1 S2 S3 S4 S5 | P1 P2 P3 P4 P5
             0  0  1  1  0 |  1  0  0  0  0
             1  0  0  1  1 |  1  1  0  0  0
             1  1  1  0  0 |  0  1  1  0  0
             0  1  0  1  1 |  0  0  1  1  0
             1  1  1  0  1 |  0  0  0  1  1

        Each row is a constraint, e.g. row 2: S1 ⊕ S4 ⊕ S5 ⊕ P1 ⊕ P2 = 0
      - A.K.A. double diagonal or Repeat Accumulate codes
      - High encoding speed (encoding is trivial)
      - Recovery capabilities can be made close to ideal codes

  17. LDPC codes
      - Encoding: one constraint per row of the parity check matrix (columns S1 S2 S3 S4 S5 | P1 P2 P3 P4 P5)

            0 0 1 1 0 | 1 0 0 0 0    S3 ⊕ S4 ⊕ P1 = 0
            1 0 0 1 1 | 1 1 0 0 0    S1 ⊕ S4 ⊕ S5 ⊕ P1 ⊕ P2 = 0
            1 1 1 0 0 | 0 1 1 0 0    S1 ⊕ S2 ⊕ S3 ⊕ P2 ⊕ P3 = 0
            0 1 0 1 1 | 0 0 1 1 0    S2 ⊕ S4 ⊕ S5 ⊕ P3 ⊕ P4 = 0
            1 1 1 0 1 | 0 0 0 1 1    S1 ⊕ S2 ⊕ S3 ⊕ S5 ⊕ P4 ⊕ P5 = 0

        - linear complexity: O(k)
      - Decoding
        - solve a system of linear equations
        - several techniques are feasible...
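
Staircase encoding with the 5×10 example matrix can be sketched as follows: each parity symbol is the XOR of the source symbols selected by its row and of the previous parity symbol. Symbols are small integers here for readability, and the function name is an illustrative assumption.

```python
# Source-symbol part of the example parity check matrix (columns S1..S5).
H1 = [
    [0, 0, 1, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [1, 1, 1, 0, 1],
]


def staircase_encode(source, h1):
    """Compute P1..Pm so that every row constraint XORs to zero.

    Row i involves P(i-1) and P(i) (the double diagonal), so
    P(i) = P(i-1) XOR (source symbols selected by row i).
    """
    parity = []
    previous = 0                      # there is no P0: start from zero
    for row in h1:
        acc = previous
        for bit, symbol in zip(row, source):
            if bit:
                acc ^= symbol
        parity.append(acc)
        previous = acc
    return parity


source = [1, 2, 4, 8, 16]             # S1..S5 (illustrative values)
parity = staircase_encode(source, H1)
# parity == [12, 21, 18, 8, 31], i.e. P1..P5
```

One pass over the rows of a sparse matrix is what gives the linear encoding cost.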

  18. LDPC codes: Sol. 1: Iterative Decoding (ID)
      - If an equation has only one unknown variable, that variable is equal to the sum of the others. Reiterate...
      - Efficient thanks to the sparseness of the parity check matrix
      - Pros: low complexity (linear, O(k))
        - low CPU load and high sustainable bandwidth
      - Cons: suboptimal in terms of correction capabilities
        - some full rank systems cannot be solved
      - Overhead (k=1000, N1=3):

            code rate     | average overhead | overhead for a failure probability ≤ 10^-4
            2/3 (= 0.66)  |  9.99 %          | 13.93 %
            2/5 (= 0.4)   | 17.13 %          | 22.91 %
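
Iterative decoding can be sketched on the 5×10 example matrix as below. Erased symbols are marked `None`. This naive version rescans every row on each pass, so it is not linear-time as written; a production decoder would track per-row unknown counts. The names and symbol values are illustrative assumptions.

```python
# Example parity check matrix, columns S1..S5 then P1..P5.
H = [
    [0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0, 1, 1],
]


def iterative_decode(h, symbols):
    """Repeatedly solve equations that contain exactly one erased symbol."""
    symbols = list(symbols)
    progress = True
    while progress:
        progress = False
        for row in h:
            involved = [j for j, bit in enumerate(row) if bit]
            unknown = [j for j in involved if symbols[j] is None]
            if len(unknown) == 1:
                acc = 0
                for j in involved:
                    if symbols[j] is not None:
                        acc ^= symbols[j]
                symbols[unknown[0]] = acc     # the XOR of the known symbols
                progress = True
    return symbols


codeword = [1, 2, 4, 8, 16, 12, 21, 18, 8, 31]   # a valid codeword for H

# S1, S4 and P3 erased: each one is recovered in turn.
solved = iterative_decode(H, [None, 2, 4, None, 16, 12, 21, None, 8, 31])

# S1, S2 and S5 erased: every equation has 0 or at least 2 unknowns,
# so ID makes no progress although the system is full rank.
stuck = iterative_decode(H, [None, None, 4, 8, None, 12, 21, 18, 8, 31])
```

The second pattern is an instance of the "full rank system that cannot be solved" by ID alone.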

  19. LDPC codes: Sol. 2: Maximum Likelihood (ML) decoding
      - Solve a linear system (Gaussian elimination, LU decomposition, ...): xA = b
        - x: missing symbols
        - A: submatrix of the generator matrix
        - b: information of the received symbols
      - Excellent erasure correction capabilities
      - Overhead (k=1000, N1=5):

            code rate     | average overhead | overhead for a failure probability ≤ 10^-4
            2/3 (= 0.66)  | 0.63 %           | 2.21 %
            2/5 (= 0.4)   | 2.04 %           | 4.41 %

      - High complexity: O(k^3)
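
ML decoding over the erasure channel amounts to Gaussian elimination over GF(2). The sketch below works on the parity check equations of the 5×10 example matrix rather than on a generator matrix; that formulation, the names and the symbol values are assumptions made for illustration.

```python
# Example parity check matrix, columns S1..S5 then P1..P5.
H = [
    [0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0, 1, 1],
]


def ml_decode(h, symbols):
    """Recover erased symbols (None) by Gauss-Jordan elimination over GF(2)."""
    symbols = list(symbols)
    unknown = [j for j, s in enumerate(symbols) if s is None]
    # One equation per row: coefficients of the unknowns, and the XOR of
    # the known symbols of that row as right-hand side.
    rows = []
    for row in h:
        rhs = 0
        for j, bit in enumerate(row):
            if bit and symbols[j] is not None:
                rhs ^= symbols[j]
        rows.append(([row[j] for j in unknown], rhs))
    pivot_row = 0
    for c in range(len(unknown)):
        p = next((r for r in range(pivot_row, len(rows)) if rows[r][0][c]), None)
        if p is None:
            raise ValueError("system is not full rank: decoding fails")
        rows[pivot_row], rows[p] = rows[p], rows[pivot_row]
        pivot_coeffs, pivot_rhs = rows[pivot_row]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][0][c]:
                coeffs, rhs = rows[r]
                rows[r] = ([a ^ b for a, b in zip(coeffs, pivot_coeffs)],
                           rhs ^ pivot_rhs)
        pivot_row += 1
    # Row i now reads "unknown i = right-hand side".
    for i, j in enumerate(unknown):
        symbols[j] = rows[i][1]
    return symbols


codeword = [1, 2, 4, 8, 16, 12, 21, 18, 8, 31]
# S1, S2 and S5 erased: no equation has a single unknown, yet the system
# is full rank, so elimination recovers all three symbols.
recovered = ml_decode(H, [None, None, 4, 8, None, 12, 21, 18, 8, 31])
```

The elimination touches every row for every unknown, which is where the cubic worst-case cost comes from.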

  20. Some more details on LDPC codes considered: Sol. 3: Hybrid ID/ML scheme
      - Hybrid decoder
        - start decoding with ID (fast)
        - finish with ML if necessary (optimal)
      - Excellent erasure correction capabilities...
      - ... while remaining very fast
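
The hybrid scheme can be sketched by chaining the two stages: run ID first, and fall back to Gaussian elimination over GF(2) only for the symbols ID could not recover. This is a compact illustration on the 5×10 example matrix; the names, the returned stage label and the symbol values are assumptions.

```python
H = [
    [0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0, 1, 1],
]


def known_xor(row, symbols):
    """XOR of the known symbols selected by a row."""
    acc = 0
    for j, bit in enumerate(row):
        if bit and symbols[j] is not None:
            acc ^= symbols[j]
    return acc


def id_stage(h, symbols):
    """Fast stage: solve equations that have exactly one erased symbol."""
    progress = True
    while progress:
        progress = False
        for row in h:
            unknown = [j for j, bit in enumerate(row) if bit and symbols[j] is None]
            if len(unknown) == 1:
                symbols[unknown[0]] = known_xor(row, symbols)
                progress = True
    return symbols


def ml_stage(h, symbols):
    """Optimal stage: Gauss-Jordan elimination over GF(2) on what is left."""
    unknown = [j for j, s in enumerate(symbols) if s is None]
    rows = [([row[j] for j in unknown], known_xor(row, symbols)) for row in h]
    for c in range(len(unknown)):
        p = next((r for r in range(c, len(rows)) if rows[r][0][c]), None)
        if p is None:
            raise ValueError("system is not full rank: decoding fails")
        rows[c], rows[p] = rows[p], rows[c]
        for r in range(len(rows)):
            if r != c and rows[r][0][c]:
                rows[r] = ([a ^ b for a, b in zip(rows[r][0], rows[c][0])],
                           rows[r][1] ^ rows[c][1])
    for i, j in enumerate(unknown):
        symbols[j] = rows[i][1]
    return symbols


def hybrid_decode(h, symbols):
    """Return (decoded symbols, name of the last stage that was needed)."""
    symbols = id_stage(h, list(symbols))
    if None not in symbols:
        return symbols, "ID"
    return ml_stage(h, symbols), "ID+ML"


codeword = [1, 2, 4, 8, 16, 12, 21, 18, 8, 31]
easy = hybrid_decode(H, [None, 2, 4, None, 16, 12, 21, None, 8, 31])
hard = hybrid_decode(H, [None, None, 4, 8, None, 12, 21, 18, 8, 31])
```

Because ID shrinks the set of unknowns first, the costly elimination only runs on a smaller residual system, and often not at all.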

  21. LDPC codes: decoding speed of the hybrid decoder
      - LDPC-staircase (N1=5), code rate 2/3, k=1,000
      - Compared with Reed Solomon over GF(2^8)
      - [Figure: sustainable decoding speed (Mbps) versus loss probability (%). Annotations: where ID is sufficient, 32.4 times faster than RS (1.7 Gbps); where ML is needed more and more often, still 10.2 times faster (500 Mbps); with RS: 54 Mbps.]

  22. The erasure channel; Erasure codes; Reed-Solomon codes; LDPC codes; Application to distributed storage

  23. Application to distributed storage: using replication
      - A file is partitioned into 8 blocks
      - Each block is replicated 4 times
      - [Figure: Client_1 and Client_2 with the storage nodes holding the 32 block replicas]
      - Can tolerate up to 3 failures

  24. Application to distributed storage: using erasure codes
      - A file is encoded into 32 blocks: 8 source blocks (1 to 8) and 24 repair blocks (A to X)
      - [Figure: Client_1 and Client_2 with the storage nodes holding the 32 encoded blocks]
      - Can tolerate up to 6 failures, since 8 blocks are enough to decode
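
The two tolerance figures (3 failures with replication, 6 with erasure codes) can be checked by brute force. The sketch assumes 8 storage nodes holding 4 blocks each, a hypothetical placement (the exact layout of the figures is not recoverable from the transcript), and an MDS code for which any 8 distinct blocks are enough to decode.

```python
from itertools import combinations

NODES = 8            # assumed: 8 storage nodes, 4 blocks per node
SOURCE_BLOCKS = 8


def max_tolerated_failures(placement, can_recover):
    """Largest f such that the file survives every set of f failed nodes."""
    for f in range(1, NODES + 1):
        for failed in combinations(range(NODES), f):
            surviving = set()
            for node in range(NODES):
                if node not in failed:
                    surviving.update(placement[node])
            if not can_recover(surviving):
                return f - 1
    return NODES


# Replication (hypothetical placement): node i holds blocks i, i+1, i+2, i+3
# modulo 8, so every block has 4 replicas on 4 distinct nodes.
replication = [{(i + j) % SOURCE_BLOCKS for j in range(4)} for i in range(NODES)]
replication_ok = lambda blocks: blocks >= set(range(SOURCE_BLOCKS))

# Erasure coding: 32 distinct encoded blocks, 4 per node; with an MDS code
# any 8 of them are enough to decode.
erasure = [{4 * i + j for j in range(4)} for i in range(NODES)]
erasure_ok = lambda blocks: len(blocks) >= SOURCE_BLOCKS

replication_tolerance = max_tolerated_failures(replication, replication_ok)
erasure_tolerance = max_tolerated_failures(erasure, erasure_ok)
```

Both schemes store 32 blocks, so the gain in tolerated failures comes at the same storage cost.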

  25. Conclusion
      - Erasure codes
        - add redundancy to combat symbol erasures
      - Reed-Solomon codes
        - perfect codes (MDS), but inefficient for large objects
      - LDPC codes
        - can encode large objects
        - correction capabilities close to MDS
        - high encoding and decoding speed

  26. Questions?
