RAIDP: ReplicAtion with Intra-Disk Parity
Eitan Rosenfeld, Aviad Zuck, Nadav Amit, Michael Factor, Dan Tsafrir
Slide 1 of 41
Today's Datacenters
Image Source: http://www.google.com/about/datacenters/gallery/#/tech/14
Slide 2 of 41
Slide 3 of 41
Slide 4 of 41
Slide 5 of 41
Slide 6 of 41
Slide 7 of 41
Slide 8 of 41
Slide 9 of 41
Slide 10 of 41
Slide 11 of 41
Slide 12 of 41
Slide 13 of 41
[Figure: numbered chunks 1-6 placed across disks]
Slide 14 of 41
Slide 15 of 41
Slide 16 of 41
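[Figure: striped parity layout with stripes A through D, each made of three data blocks and one parity block (A1, A2, A3, APARITY; B1, B2, B3, BPARITY; C1, C2, C3, CPARITY; D1, D2, D3, DPARITY)]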
Slide 17 of 41
Slide 18 of 41
Slide 19 of 41
Slide 20 of 41
Slide 21 of 41
Slide 22 of 41
Slide 23 of 41
Slide 24 of 41
Slide 25 of 41
[Figure: superchunks 1-10 placed across disks, each superchunk stored twice]
Slide 26 of 41
Slide 27 of 41
– Fails separately from the associated disk
[Figure labels: Disk Drive, Add-on, SATA/SAS, Power]
Superchunks on the disk: 1, 2, 3, 4
Add-on: 1 ⊕ 2 ⊕ 3 ⊕ 4
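The add-on's content can be pictured as the bytewise XOR of the superchunks stored on its disk. The following is a minimal sketch under that reading, assuming superchunks are equal-length byte strings; `xor_blocks` and the toy 4-byte superchunks are hypothetical names and sizes chosen for illustration, not taken from the RAIDP implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a non-empty list of equal-length byte strings."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Toy stand-ins for the four superchunks on one disk (illustrative values only).
superchunks = {
    1: bytes([0x11] * 4),
    2: bytes([0x22] * 4),
    3: bytes([0x44] * 4),
    4: bytes([0x88] * 4),
}

# What the add-on would hold in this sketch: 1 XOR 2 XOR 3 XOR 4.
addon_parity = xor_blocks(list(superchunks.values()))
```

Because XOR is its own inverse, XORing this parity with any three of the superchunks yields the fourth.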
Slide 29 of 41
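Disk 1: superchunks 1, 2, 6, 8      Add-on 1: 1 ⊕ 2 ⊕ 6 ⊕ 8
Disk 2: superchunks 2, 3, 7, 9      Add-on 2: 2 ⊕ 3 ⊕ 7 ⊕ 9
Disk 3: superchunks 3, 4, 8, 10     Add-on 3: 3 ⊕ 4 ⊕ 8 ⊕ 10
Disk 4: superchunks 4, 5, 9, 6      Add-on 4: 4 ⊕ 5 ⊕ 9 ⊕ 6
Disk 5: superchunks 5, 1, 10, 7     Add-on 5: 5 ⊕ 1 ⊕ 10 ⊕ 7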
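The five parity expressions above imply a placement in which every superchunk lives on exactly two disks and any two disks have at most one superchunk in common. A short sketch can check this; the disk numbering 1-5 simply follows the order of the parity expressions and is an assumption, as are the variable names.

```python
from collections import Counter
from itertools import combinations

# Superchunks per disk, read off the add-on parity expressions
# (disk numbering is assumed to follow the order they are listed in).
layout = {
    1: {1, 2, 6, 8},
    2: {2, 3, 7, 9},
    3: {3, 4, 8, 10},
    4: {4, 5, 9, 6},
    5: {5, 1, 10, 7},
}

# How many disks hold each superchunk.
replica_count = Counter(sc for chunks in layout.values() for sc in chunks)

# Superchunks shared by each pair of disks.
shared = {
    (a, b): layout[a] & layout[b]
    for a, b in combinations(sorted(layout), 2)
}

all_replicated_twice = all(n == 2 for n in replica_count.values())
max_shared = max(len(common) for common in shared.values())
```

With this placement, losing any two disks removes both replicas of at most one superchunk, which is the case the add-on parity is there to cover.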
XOR Add-on 1 with the surviving superchunks from Disk 1:
(1 ⊕ 2 ⊕ 6 ⊕ 8) ⊕ 8 ⊕ 6 ⊕ 1 = 2
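The recovery step above can be sketched in a few lines. This is an illustration only: it assumes superchunk 2 is the one whose replicas were both lost, that superchunks 1, 6 and 8 can still be read from their replicas on other disks, and that Add-on 1 is intact. The byte values and the helper `xor_blocks` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Toy contents for the superchunks that were stored on Disk 1.
data = {
    1: b"\x10\x20",
    2: b"\x0a\x0b",
    6: b"\x33\x44",
    8: b"\xf0\x0f",
}

# Parity held by Add-on 1, computed before the failure: 1 XOR 2 XOR 6 XOR 8.
addon_1 = xor_blocks([data[1], data[2], data[6], data[8]])

# After the failure, superchunk 2 is gone; the other three are assumed to be
# read back from their surviving replicas.
survivors = [data[8], data[6], data[1]]

# (1 XOR 2 XOR 6 XOR 8) XOR 8 XOR 6 XOR 1 = 2
recovered = xor_blocks([addon_1] + survivors)
```

Each surviving superchunk cancels its own term in the parity, leaving only the lost one.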
Slide 30 of 41
Slide 31 of 41
Slide 32 of 41
Slide 33 of 41
Slide 34 of 41
[Figure: chart comparing RAIDP and HDFS; category labels: Updates in place, Lstors, Superchunks only]
For updates in place: RAIDP performs 4 I/Os for each write → both replicas are read before they are overwritten
RAIDP completes the workload 22% faster!
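One way to account for the four I/Os per in-place write: if each add-on's parity is kept current by XORing in the difference between the old and new data, the old data has to be read on both replicas before it is overwritten (two reads plus two writes). The sketch below assumes that mechanism; the slide states only that the reads happen. `ToyDisk`, `update_in_place`, and the choice not to count the add-on update as a disk I/O are all hypothetical.

```python
def xor(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class ToyDisk:
    """A disk with named blocks, an I/O counter, and an add-on parity."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.io_count = 0
        size = len(next(iter(self.blocks.values())))
        self.addon_parity = bytes(size)  # all zeros
        for block in self.blocks.values():
            self.addon_parity = xor(self.addon_parity, block)

    def read(self, key):
        self.io_count += 1
        return self.blocks[key]

    def write(self, key, new):
        self.io_count += 1
        self.blocks[key] = new


def update_in_place(disk, key, new):
    old = disk.read(key)                      # I/O: read the old data
    delta = xor(old, new)                     # what changed
    disk.addon_parity = xor(disk.addon_parity, delta)
    disk.write(key, new)                      # I/O: overwrite with new data


# Block "x" is replicated on two disks, each with its own add-on.
primary = ToyDisk({"x": b"\x01\x01", "y": b"\x02\x02"})
secondary = ToyDisk({"x": b"\x01\x01", "z": b"\x04\x04"})

for disk in (primary, secondary):
    update_in_place(disk, "x", b"\x0f\x0f")

total_ios = primary.io_count + secondary.io_count
```

In this toy model `total_ios` comes out to four, and each add-on parity matches the XOR of its disk's current blocks without rereading the other blocks.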
Slide 35 of 41
[Figure: chart comparing RAIDP and HDFS; category labels: Updates in place, Lstors, Superchunks only]
Slide 36 of 41
[Figure, two panels comparing RAIDP and HDFS-3: runtime of writing 100GB; network usage in GB when writing 100GB]
Slide 37 of 41
[Figure, two panels comparing RAIDP and HDFS-3: runtime of sorting 100GB; network usage in GB when sorting 100GB]
Slide 38 of 41
[Table: 16-node cluster with 6GB superchunk; columns: System, 1Gbps Network, 10Gbps Network]
Slide 39 of 41
Slide 40 of 41
Slide 41 of 41