Using a Shared Storage Class Memory Device to Improve the Reliability of RAID Arrays - PowerPoint PPT Presentation

  1. Using a Shared Storage Class Memory Device to Improve the Reliability of RAID Arrays  S. Chaarawi, U. of Houston  J.-F. Pâris, U. of Houston  A. Amer, Santa Clara U.  T. J. E. Schwarz, U. Católica del Uruguay  D. D. E. Long, U. C. Santa Cruz

  2. The problem  Archival storage systems store  Huge amounts of data  Over long periods of time  Must ensure long-term survival of these data  Disk failure rates  Typically exceed 1% per year  Can exceed 9–10% per year

  3. Requirements  Archival storage systems should  Be more reliable than conventional storage architectures  Excludes RAID level 5  Be cost-effective  Excludes mirroring  Have lower power requirements than conventional storage architectures  Not addressed here

  4. Non-Requirements  Contrary to conventional storage systems  Update costs are much less important  Access times are less critical

  5. Traditional Solutions  Mirroring:  Maintains two copies of all data  Safe but costly  RAID level 5 arrays:  Use omission correction codes: parity  Can tolerate one disk failure  Cheaper but less safe than mirroring
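For reference, the XOR parity relation behind RAID level 5 (textbook material, not specific to this presentation): the parity block of a stripe is the XOR of its data blocks, so any single missing block can be rebuilt from the survivors.

```latex
P = D_1 \oplus D_2 \oplus \cdots \oplus D_n, \qquad
D_k = P \oplus \bigoplus_{i \neq k} D_i
```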

  6. More Recent Solutions (I)  RAID level 6 arrays:  Can tolerate two disk failures  Or a single disk failure and bad blocks on several disks  Slightly higher storage costs than RAID level 5 arrays  More complex update procedures  X-Code, EvenOdd, Row-Diagonal Parity

  7. More Recent Solutions (II)  Superparity:  Wildani et al., MASCOTS 2009  Partitions each disk into fixed-size “disklets” used to form conventional RAID stripes  Groups these stripes into “supergroups”  Adds to each supergroup one or more distinct “superparity” devices

  8. More Recent Solutions (III)  Shared Parity Disks  Pâris and Amer, IPCCC 2009  Does not use disklets  Starts with a few RAID level 5 arrays  Adds an extra parity disk to these arrays

  9. Example (I)  Start with two RAID arrays (in reality, parity blocks will be distributed among all disks):  Array 0: D00 D01 D02 D03 D04 D05 P0  Array 1: D10 D11 D12 D13 D14 D15 P1

  10. Example (II)  Add an extra parity disk shared by both arrays:  Array 0: D00 D01 D02 D03 D04 D05 P0  Array 1: D10 D11 D12 D13 D14 D15 P1  Shared parity disk: Q

  11. Example (III)  Single disk failures handled within each individual RAID array  Double disk failures handled by whole structure

  12. Example (IV)  We XOR the two parity disks to form a single virtual drive:  Array 0: D00 D01 D02 D03 D04 D05 P0  Array 1: D10 D11 D12 D13 D14 D15 P1  Shared parity disk: Q

  13. Example (V)  And obtain a single RAID level 6 array:  Data: D00 D01 D02 D03 D04 D05 D10 D11 D12 D13 D14 D15  Parity: P0 ⊕ P1 and Q
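A sketch of the parity relations implied by this construction, assuming P0 and P1 are the usual RAID level 5 row parities of their respective arrays (the exact encoding of the shared parity Q is defined in the paper, not here):

```latex
P_0 = \bigoplus_{i=0}^{5} D_{0i}, \qquad
P_1 = \bigoplus_{i=0}^{5} D_{1i}, \qquad
P_0 \oplus P_1 = \bigoplus_{j=0}^{1} \bigoplus_{i=0}^{5} D_{ji}
```

The virtual drive P0 ⊕ P1 thus covers all twelve data blocks at once, so the two small arrays plus Q behave like one larger array with two parity devices.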

  14. Example (VI)  Our array tolerates all double failures  Also tolerates most triple failures  Triple failures causing a data loss include failures of:  Three disks in same RAID array  Two disks in same RAID array plus shared parity disk Q

  15. Triple Failures Causing a Data Loss  [Figure: two failure patterns with failed devices marked X: (a) three devices failed in the same RAID array; (b) two disks failed in the same RAID array together with the shared parity disk Q]
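A back-of-the-envelope count (not from the slides; it assumes all triple-failure combinations of the 15 devices are equally likely and that the two patterns above are the only fatal ones):

```latex
\binom{15}{3} = 455, \qquad
2\binom{7}{3} + 2\binom{7}{2} = 70 + 42 = 112 \ \text{fatal patterns}, \qquad
1 - \frac{112}{455} \approx 0.75
```

so roughly three quarters of random triple failures would be survivable.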

  16. Our Idea  Replace the shared parity disk by a much more reliable device  A Storage Class Memory (SCM) device  Will reduce the risk of data loss

  17. Storage Class Memories  Solid-state storage  Non-volatile  Much faster than conventional disks  Numerous proposals:  Ferro-electric RAM (FRAM)  Magneto-resistive RAM (MRAM)  Phase-change memories (PCM)  We focus on PCMs as an exemplar of these technologies

  18. Phase-Change Memories  No moving parts  [Figures: a PCM data cell; crossbar organization]

  19. Phase-Change Memories  Cells contain a chalcogenide material that has two states:  Amorphous, with high electrical resistivity  Crystalline, with low electrical resistivity  Quickly cooling the material from above its fusion point leaves it in the amorphous state  Slowly cooling the material leaves it in the crystalline state

  20. Key Parameters of Future PCMs  Target date: 2012  Access time: 100 ns  Data rate: 200–1000 MB/s  Write endurance: 10^9 write cycles  Read endurance: no upper limit  Capacity: 16 GB  Capacity growth: > 40% per year  MTTF: 10–50 million hours  Cost: < $2/GB

  21. New Array Organization  Use the SCM device as the shared parity device:  Array 0: D00 D01 D02 D03 D04 D05 P0  Array 1: D10 D11 D12 D13 D14 D15 P1  Shared parity (SCM device): Q

  22. Reliability Analysis  Reliability R ( t ):  Probability that system will operate correctly over the time interval [0, t ] given that it operated correctly at time t = 0  Hard to estimate  Mean Time To Data Loss (MTTDL):  Single value  Much easier to compute
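For reference, the standard relationship between the two metrics (general reliability theory, not specific to this work): the MTTDL is the area under the reliability curve, which is why it can summarize R(t) with a single number.

```latex
\mathrm{MTTDL} = \int_0^{\infty} R(t)\, dt
```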

  23. Our Model  Device failures are mutually independent and follow a Poisson law  A reasonable approximation  Device repairs can be performed in parallel  Device repair times follow an exponential law  Not true but required to make the model tractable
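Concretely, these assumptions say that device lifetimes and repair times are exponentially distributed; a minimal statement, with λ the device failure rate and μ the repair rate:

```latex
P(\text{device has failed by time } t) = 1 - e^{-\lambda t}, \qquad
P(\text{repair has completed by time } t) = 1 - e^{-\mu t}
```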

  24. Scope of Investigation  We computed the MTTDL of:  A pair of RAID 5 arrays with 7 disks each plus a shared parity SCM device  A pair of RAID 5 arrays with 7 disks each plus a shared parity disk  and compared them with the MTTDLs of:  A pair of RAID 5 arrays with 7 disks each  A pair of RAID 6 arrays with 8 disks each

  25. System Parameters (I)  Disk mean time to failure (MTTF) was assumed to be 100,000 hours (11 years and 5 months)  Corresponds to a failure rate λ of 8 to 9% per year  High end of the failure rates observed by Schroeder and Gibson and by Pinheiro et al.  SCM device MTTF was assumed to be a multiple of the disk MTTF
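The conversion behind that failure rate is straightforward arithmetic (using roughly 8,766 hours per year):

```latex
\lambda = \frac{1}{100{,}000\ \text{h}} \approx \frac{8{,}766\ \text{h/year}}{100{,}000\ \text{h}} \approx 8.8\%\ \text{per year}
```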

  26. System Parameters (II)  Disk and SCM device repair times varied between 12 hours and one week  Corresponds to repair rates µ varying between 2 and 0.141 repairs/day

  27. State Diagram  [Figure: Markov state-transition diagram of the two arrays plus the shared parity device, running from the initial state with all devices operational to the Data Loss state; transition rates are multiples of the disk failure rate λ, the shared-device failure rate λ′, and the repair rate μ]  α is the fraction of triple disk failures that do not result in a data loss  β is the fraction of double disk failures that do not result in a data loss when the shared parity device is down
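A minimal sketch of how an MTTDL is extracted from such a state diagram (the standard absorbing-Markov-chain computation; the example rates below describe a single hypothetical n-disk RAID level 5 array, not the paper's full chain, purely to show the mechanics):

```python
import numpy as np

def mttdl(q_transient, start=0):
    """Mean time to absorption (data loss) of a continuous-time Markov chain.

    q_transient: generator matrix restricted to the transient states.
    Entry [i, j] (i != j) is the transition rate from state i to state j;
    the diagonal holds minus the total outflow rate of state i, so rates
    into the absorbing Data Loss state appear only implicitly.
    The expected times to absorption t solve q_transient @ t = -1.
    """
    t = np.linalg.solve(q_transient, -np.ones(q_transient.shape[0]))
    return t[start]

# Toy example: one RAID 5 array with n disks (states: 0 failed, 1 failed).
# lam = per-disk failure rate, mu = repair rate, both per hour (hypothetical).
n, lam, mu = 7, 1.0 / 100_000, 1.0 / 24
q = np.array([
    [-n * lam,              n * lam],      # state 0: any of the n disks may fail
    [mu, -(mu + (n - 1) * lam)],           # state 1: repair, or a 2nd failure = loss
])
hours = mttdl(q)
print(f"MTTDL ~ {hours:.3e} hours ~ {hours / 8766:.0f} years")
# Cross-check against the well-known closed form for this small chain:
print(((2 * n - 1) * lam + mu) / (n * (n - 1) * lam ** 2))
```

The MTTDLs reported on the next slides come from the larger chain shown above, which also tracks the shared parity device and the α and β fractions.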

  28. Impact of SCM Reliability  [Figure: MTTDL in years (log scale, roughly 10^4 to 10^7) versus mean repair time (0 to 8 days) for four configurations: shared SCM device never fails; shared SCM device fails 10 times less frequently; shared SCM device fails 5 times less frequently; all disks]

  29. Comparison with Other Solutions  [Figure: MTTDL in years (log scale, roughly 10^1 to 10^8) versus mean repair time (0 to 8 days) for six configurations: shared SCM device never fails; shared SCM device fails 10 times less frequently; shared SCM device fails 5 times less frequently; all disks; pair of RAID 6 arrays with 8 disks each; pair of RAID 5 arrays with 7 disks each]

  30. Main Conclusions  Replacing the shared parity disk with a shared SCM parity device increases the MTTDL of the array by 40 to 59 percent  Adding a shared parity device that is 10 times more reliable than a regular disk to a pair of RAID 5 arrays increases their MTTDL by at least 21,000 and up to 31,000 percent  Shared parity organizations always outperform RAID level 6 organizations

  31. Cost Considerations  SCM devices are still much more expensive than magnetic disks  Replacing the shared parity disk with a pair of mirrored disks would have achieved the same improvement at a much lower cost

  32. Additional Slides

  33. Relative MTTDLs of the organizations studied:
      Organization          Relative MTTDL
      Two RAID 5 arrays     0.00096
      All disks             1.0
      Two RAID 6 arrays     1.0012
      SCM 5× better         1.4274
      SCM 10× better        1.5080
      SCM 100× better       1.5887
      SCM never fails       1.5982

  34. Why We Selected MTTDLs  Much easier to compute than other reliability indices  Data survival rates computed from MTTDLs are a good approximation of actual data survival rates as long as disk MTTRs are at least one thousand times shorter than disk MTTFs:  J.-F. Pâris, T. J. E. Schwarz, D. D. E. Long and A. Amer, “When MTTDLs Are Not Good Enough: Providing Better Estimates of Disk Array Reliability,” Proc. 7th I2TS ’08 Symposium, Foz do Iguaçu, PR, Brazil, Dec. 2008.
