
SLIDE 1

Using a Shared Storage Class Memory Device to Improve the Reliability of RAID Arrays

  • S. Chaarawi, U. of Houston
  • J.-F. Pâris, U. of Houston
  • A. Amer, Santa Clara U.
  • T. J. E. Schwarz, U. Católica del Uruguay
  • D. D. E. Long, U. C. Santa Cruz
SLIDE 2

The Problem

  • Archival storage systems store
      • Huge amounts of data
      • Over long periods of time
  • Must ensure long-term survival of these data
  • Disk failure rates
      • Typically exceed 1% per year
      • Can exceed 9–10% per year
SLIDE 3

Requirements

  • Archival storage systems should
      • Be more reliable than conventional storage architectures
          • Excludes RAID level 5
      • Be cost-effective
          • Excludes mirroring
      • Have lower power requirements than conventional storage architectures
          • Not addressed here
SLIDE 4

Non-Requirements

  • In contrast to conventional storage systems
      • Update costs are much less important
      • Access times are less critical
SLIDE 5

Traditional Solutions

  • Mirroring:
      • Maintains two copies of all data
      • Safe but costly
  • RAID level 5 arrays:
      • Use an omission correction code: parity
      • Can tolerate one disk failure
      • Cheaper but less safe than mirroring
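The single-failure tolerance of RAID 5 comes from plain XOR parity. A minimal sketch with toy 4-byte blocks (illustrative only, not the authors' code):

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks into one block."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# A stripe of four toy 4-byte data blocks.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
        b"\x0a\x0b\x0c\x0d", b"\xff\x00\xff\x00"]

# The parity block is the XOR of every data block in the stripe.
parity = xor_blocks(data)

# If one disk fails, its block is the XOR of the survivors plus parity.
lost = 2
recovered = xor_blocks(data[:lost] + data[lost + 1:] + [parity])
assert recovered == data[lost]
```

Losing two blocks leaves this single XOR equation underdetermined, which is why RAID 5 cannot survive a double failure and why stronger schemes follow.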
SLIDE 6

More Recent Solutions (I)

  • RAID level 6 arrays:
      • Can tolerate two disk failures
      • Or a single disk failure and bad blocks on several disks
      • Slightly higher storage costs than RAID level 5 arrays
      • More complex update procedures
      • X-Code, EVENODD, Row-Diagonal Parity
SLIDE 7

More Recent Solutions (II)

  • Superparity:
      • Wildani et al., MASCOTS 2009
      • Partitions each disk into fixed-size “disklets” used to form conventional RAID stripes
      • Groups these stripes into “supergroups”
      • Adds to each supergroup one or more distinct “superparity” devices
SLIDE 8

More Recent Solutions (III)

  • Shared Parity Disks
      • Pâris and Amer, IPCCC 2009
      • Does not use disklets
      • Starts with a few RAID level 5 arrays
      • Adds an extra parity disk to these arrays
SLIDE 9

Example (I)

  • Start with two RAID arrays:
      • In reality, parity blocks will be distributed among all disks

[Diagram: two RAID 5 arrays — disks D00–D05 with parity P0, and disks D10–D15 with parity P1]

SLIDE 10

Example (II)

  • Add an extra parity disk

[Diagram: the two RAID 5 arrays from before, plus a shared extra parity disk Q]

SLIDE 11

Example (III)

  • Single disk failures are handled within each individual RAID array
  • Double disk failures are handled by the whole structure

SLIDE 12

Example (IV)

  • We XOR the two parity disks to form a single virtual drive

[Diagram: P0 and P1 grouped into one virtual parity drive, alongside Q and the data disks D00–D05 and D10–D15]

SLIDE 13

Example (V)

  • And obtain a single RAID level 6 array

[Diagram: a RAID 6 array with parity drives Q and P0⊕P1 over the data disks D00–D05 and D10–D15]
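The step above relies on a simple identity: XOR-ing the per-array parities P0 and P1 yields the row parity of all twelve data disks, which is why the merged structure can be read as one RAID 6 array. A small sketch of that identity with toy blocks (illustrative only):

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks into one block."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

array0 = [bytes([i] * 4) for i in range(1, 7)]    # D00..D05 (toy blocks)
array1 = [bytes([i] * 4) for i in range(11, 17)]  # D10..D15 (toy blocks)

p0 = xor_blocks(array0)  # parity disk of the first RAID 5 array
p1 = xor_blocks(array1)  # parity disk of the second RAID 5 array

# P0 xor P1 equals the XOR of all twelve data blocks, i.e. the row
# parity of the merged array, so the combined structure behaves as a
# RAID 6 array with P0 xor P1 as its first parity and Q as its second.
assert xor_blocks([p0, p1]) == xor_blocks(array0 + array1)
```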

SLIDE 14

Example (VI)

  • Our array tolerates all double failures
  • Also tolerates most triple failures
  • Triple failures causing a data loss include failures of:
      • Three disks in the same RAID array
      • Two disks in the same RAID array plus the shared parity disk Q
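The fatal cases listed above can be counted for the 15-device example (two 7-disk arrays plus Q). This back-of-the-envelope sketch is not from the paper and assumes all failure triples are equally likely; the α used later in the Markov model is defined per system state, so this is only illustrative:

```python
from math import comb

disks_per_array = 7   # 6 data disks + 1 parity disk per RAID 5 array
arrays = 2
devices = arrays * disks_per_array + 1   # plus the shared parity device Q

# Fatal triples, following the two cases enumerated above:
three_in_one_array = arrays * comb(disks_per_array, 3)  # 2 * 35 = 70
two_plus_q = arrays * comb(disks_per_array, 2)          # 2 * 21 = 42

fatal = three_in_one_array + two_plus_q   # 112
total = comb(devices, 3)                  # C(15, 3) = 455
print(f"{fatal}/{total} triples fatal = {fatal / total:.1%}")
# 112/455 triples fatal = 24.6%
```

So roughly three quarters of triple failures are survivable, consistent with the slide's claim that the array "tolerates most triple failures".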

SLIDE 15

Triple Failures Causing a Data Loss

[Diagram: the two fatal triple-failure patterns, failed devices marked X — (top) the shared parity disk Q plus two disks of the same RAID array; (bottom) three disks of the same RAID array]

SLIDE 16

Our Idea

  • Replace the shared parity disk by a much more reliable device
      • A Storage Class Memory (SCM) device
  • Will reduce the risk of data loss
SLIDE 17

Storage Class Memories

  • Solid-state storage
      • Non-volatile
      • Much faster than conventional disks
  • Numerous proposals:
      • Ferro-electric RAM (FRAM)
      • Magneto-resistive RAM (MRAM)
      • Phase-change memories (PCM)
  • We focus on PCMs as an exemplar of these technologies

SLIDE 18

Phase-Change Memories

  • No moving parts
  • Crossbar organization

[Diagram: a data cell]

SLIDE 19

Phase-Change Memories

  • Cells contain a chalcogenide material that has two states
      • Amorphous, with high electrical resistivity
      • Crystalline, with low electrical resistivity
  • Quickly cooling the material from above its fusion point leaves it in the amorphous state
  • Slowly cooling the material leaves it in the crystalline state

SLIDE 20

Key Parameters of Future PCMs

  • Target date: 2012
  • Access time: 100 ns
  • Data rate: 200–1000 MB/s
  • Write endurance: 10⁹ write cycles
  • Read endurance: no upper limit
  • Capacity: 16 GB
  • Capacity growth: > 40% per year
  • MTTF: 10–50 million hours
  • Cost: < $2/GB

SLIDE 21

New Array Organization

  • Use the SCM device as the shared parity device

[Diagram: the two RAID 5 arrays (P0, D00–D05 and P1, D10–D15) with an SCM device Q as the shared parity device]

SLIDE 22

Reliability Analysis

  • Reliability R(t):
      • Probability that the system will operate correctly over the time interval [0, t] given that it operated correctly at time t = 0
      • Hard to estimate
  • Mean Time To Data Loss (MTTDL):
      • Single value
      • Much easier to compute
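The two measures are related: with memoryless failures and repairs, long-term reliability is commonly approximated as R(t) ≈ exp(−t/MTTDL). This standard approximation is not stated on the slide; a sketch:

```python
from math import exp

def survival_from_mttdl(t_years, mttdl_years):
    """Exponential approximation: R(t) ~ exp(-t / MTTDL)."""
    return exp(-t_years / mttdl_years)

# Probability of no data loss over 10 years with an MTTDL of 1,000 years:
print(survival_from_mttdl(10, 1_000))  # about 0.990
```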
SLIDE 23

Our Model

  • Device failures are mutually independent and follow a Poisson law
      • A reasonable approximation
  • Device repairs can be performed in parallel
  • Device repair times follow an exponential law
      • Not true, but required to make the model tractable
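Under these assumptions the system is a continuous-time Markov chain, and the MTTDL is its expected time to absorption in the data-loss state. As a sketch of the method (the classic single RAID 5 array, not the paper's full shared-parity chain on the next slide):

```python
def raid5_mttdl(n, mttf_hours, mttr_hours):
    """MTTDL (hours) of one n-disk RAID 5 array under the model above.

    States: 0 failed disks, 1 failed disk, data loss (absorbing).
    With T_i the expected time to data loss from state i:
        T0 = 1/(n*lam) + T1
        T1 = 1/((n-1)*lam + mu) + mu/((n-1)*lam + mu) * T0
    Solving the pair yields the classic closed form returned below.
    """
    lam = 1.0 / mttf_hours  # per-disk failure rate
    mu = 1.0 / mttr_hours   # repair rate
    return ((2 * n - 1) * lam + mu) / (n * (n - 1) * lam ** 2)

# Talk parameters: disk MTTF 100,000 h, one-day repairs, 7-disk array.
hours_per_year = 8766
print(raid5_mttdl(7, 100_000, 24) / hours_per_year, "years")
```

The paper's model extends this chain with the shared parity device (rate λ′) and the α and β survival fractions shown on the state diagram.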

SLIDE 24

Scope of Investigation

  • We computed the MTTDL of
      • A pair of RAID 5 arrays with 7 disks each plus a shared parity SCM
      • A pair of RAID 5 arrays with 7 disks each plus a shared parity disk
  • and compared it with the MTTDLs of
      • A pair of RAID 5 arrays with 7 disks each
      • A pair of RAID 6 arrays with 8 disks each
SLIDE 25

System Parameters (I)

  • Disk mean time to fail was assumed to be 100,000 hours (11 years and 5 months)
      • Corresponds to a failure rate λ of 8 to 9% per year
      • High end of the failure rates observed by Schroeder and Gibson and by Pinheiro et al.
  • SCM device MTTF was assumed to be a multiple of the disk MTTF
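The MTTF-to-annual-failure-rate conversion can be checked directly (a trivial arithmetic sketch, not from the slides):

```python
mttf_hours = 100_000
hours_per_year = 8766  # average year, leap days included
annual_failure_rate = hours_per_year / mttf_hours
print(f"{annual_failure_rate:.1%} per year")  # 8.8% per year, i.e. "8 to 9%"
```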

SLIDE 26

System Parameters (II)

  • Disk and SCM device repair times varied between 12 hours and one week
      • Corresponds to repair rates µ varying between 2 and 0.141 repairs/day

SLIDE 27

State Diagram

[State transition diagram: states ⟨ij⟩ with i failed disks (0–3) and j failed shared parity devices (0–1); disk failure rates 14λ, 13λ, 12λ, 11λ, SCM failure rate λ′, repair rates μ, 2μ, 3μ, and an absorbing Data Loss state; ⟨00⟩ is the initial state]

α is the fraction of triple disk failures that do not result in a data loss; β is the fraction of double disk failures that do not result in a data loss when the shared parity device is down.

SLIDE 28

Impact of SCM Reliability

[Graph: MTTDL (10⁴ to 10⁷ years, log scale) vs. mean repair time (1–8 days) for a shared SCM device that never fails, one that fails 10 times less frequently than a disk, one that fails 5 times less frequently, and an all-disk organization]

SLIDE 29

Comparison with Other Solutions

[Graph: MTTDL (10¹ to 10⁸ years, log scale) vs. mean repair time (1–8 days) for the shared SCM configurations above, the all-disk shared parity organization, a pair of RAID 6 arrays with 8 disks each, and a pair of RAID 5 arrays with 7 disks each]

SLIDE 30

Main Conclusions

  • Replacing the shared parity disk by a shared parity SCM device increases the MTTDL of the array by 40 to 59 percent
  • Adding a shared parity device that is 10 times more reliable than a regular disk to a pair of RAID 5 arrays increases the MTTDL of the array by at least 21,000 and up to 31,000 percent
  • Shared parity organizations always outperform the RAID level 6 organization
SLIDE 31

Cost Considerations

  • SCM devices are still much more expensive than magnetic disks
  • Replacing the shared parity disk by a pair of mirrored disks would have achieved the same performance improvements at a much lower cost

SLIDE 32

Additional Slides

SLIDE 33

Organization           Relative MTTDL
Two RAID 5 arrays      0.00096
All disks              1.0
Two RAID 6 arrays      1.0012
SCM 5× better          1.4274
SCM 10× better         1.5080
SCM 100× better        1.5887
SCM never fails        1.5982

SLIDE 34

Why We Selected MTTDLs

  • Much easier to compute than other reliability indices
  • Data survival rates computed from the MTTDL are a good approximation of actual data survival rates as long as disk MTTRs are at least one thousand times shorter than disk MTTFs:
      • J.-F. Pâris, T. J. E. Schwarz, D. D. E. Long and A. Amer, “When MTTDLs Are Not Good Enough: Providing Better Estimates of Disk Array Reliability,” Proc. 7th I2TS ’08 Symposium, Foz do Iguaçu, PR, Brazil, Dec. 2008.