FastScale: Accelerate RAID Scaling by Minimizing Data Migration
Weimin Zheng, Guangyan Zhang (gyzh@tsinghua.edu.cn)
Tsinghua University
Outline
- Motivation
- Minimizing data migration
- Optimizing data migration
- Evaluation
- Conclusions
2/16/2011 2
Why Scale a RAID
- A disk is a simple computer.
- A RAID volume can deliver high performance.
– Multiple disks serve an application concurrently.
- Applications often require larger capacity and higher performance
– as user data increase and computing power grows.
- One solution is to add new disks to a RAID volume.
– This disk addition is termed "RAID scaling".
- To regain a balanced load, some blocks need to be moved to new disks.
- Data migration needs to be performed online
– to supply non-stop services.
Limitation of Existing Approaches
- Existing approaches to RAID scaling preserve the round-robin order after adding disks.
– Pro: the addressing function is simple.
– Con: all the data need to be moved.
- Recent work has optimized data migration.
– One typical example is SLAS (ACM TOS 2007): it uses I/O aggregation and lazy checkpointing to improve efficiency.
– Due to migration of all the data, RAID scaling remains costly.
Can we reduce the total number of migrated data blocks?
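The cost of order-preserving round-robin scaling is easy to see with a toy model (an illustrative Python sketch, not code from the paper): preserving the round-robin order after adding disks changes the (disk, offset) location of essentially every block.

```python
def rr_location(block, num_disks):
    """Round-robin layout: block b lives on disk b % N at offset b // N."""
    return block % num_disks, block // num_disks

# How many of 10,000 blocks change location when scaling 4 -> 6 disks
# while preserving round-robin order?
B = 10000
moved = sum(rr_location(b, 4) != rr_location(b, 6) for b in range(B))
print(moved, "of", B, "blocks move")   # 9996 of 10000 blocks move
```

Only the first few blocks keep both the same disk and the same offset, so an order-preserving approach must migrate nearly the whole array.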
Minimizing Data Migration
- FastScale moves data blocks only from old disks to new disks, while not migrating data among old disks.
– Moving a fraction n/(m+n) of the data blocks is enough to preserve the uniformity of the data distribution.
[Figure: scaling from m old disks (D0 … Dm-1) by adding n new disks (Dm … Dm+n-1); only blocks destined for the new disks are migrated.]
- In this manner, FastScale minimizes data migration for RAID scaling.
- We design an elastic addressing function, through which
– the location of a block can be easily computed,
– without any lookup operation.
Optimizing Data Migration
- FastScale also exploits physical properties to optimize online data migration.
– First, it uses aggregate accesses to improve the efficiency of data migration.
– Second, it records data migration lazily to minimize the number of metadata updates while ensuring data consistency.
Results
- Implemented FastScale and SLAS in DiskSim 4.0.
– Compared with SLAS and round-robin RAID-0 scaling.
- Evaluation during RAID scaling:
– reduces redistribution time by up to 86.06%,
– with smaller maximum response time of user I/Os.
- Evaluation after 1 or 2 RAID scaling operations:
– performance is almost identical to that of the round-robin RAID-0.
Coverage of FastScale
- In this paper, we only describe our solution for RAID-0, i.e., striping without parity.
– FastScale can also work for RAID-10 and RAID-01.
– Some large storage systems slice disks into many segments; several segments are organized into a RAID.
- Although we do not handle RAID-4 and RAID-5, we believe that our method provides a good starting point for efficient scaling of RAID-4 and RAID-5 arrays.
Outline
- Motivation
- Minimizing data migration
- Optimizing data migration
- Evaluation
- Conclusions
Requirements for RAID Scaling
- Requirement 1 (Uniform data distribution):
– If there are B blocks stored on m disks, the expected number of blocks on each disk is approximately B/m so as to maintain an even load.
- Requirement 2 (Minimal data migration):
– During the addition of n disks to a RAID with m disks storing B blocks, the expected number of blocks to be moved is B·n/(m+n).
- Requirement 3 (Fast data addressing):
– In an m-disk RAID, the location of a block is computed by an algorithm with low space and time complexity.
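As a quick sanity check of Requirement 2, this small Python helper (illustrative only, not the authors' code) computes the minimal expected number of migrated blocks B·n/(m+n):

```python
def min_migrated_blocks(B, m, n):
    """Minimal expected number of blocks to move when adding n disks
    to an m-disk RAID storing B blocks (Requirement 2)."""
    return B * n / (m + n)

# Example: 4 disks holding 2 Mi blocks each, adding 1 disk.
B = 4 * 2 * 1024 * 1024            # total blocks before scaling
moved = min_migrated_blocks(B, m=4, n=1)
print(moved / B)                   # fraction moved: n/(m+n) = 0.2
```

Any approach that migrates fewer blocks than this would necessarily leave the new disks underloaded, violating Requirement 1.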
Semi-RR: the Most Intuitive Method
- Semi-RR is based on round-robin scaling.
– It moves a data block only if the block's resulting disk is one of the new disks.
– Otherwise, it does not move the data block.
- Good news: Semi-RR can reduce data migration significantly.
- Bad news: it does not guarantee uniform distribution of data blocks after multiple scaling operations.
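The semi-RR rule above can be reconstructed in a few lines of Python (a hedged illustration, not the authors' implementation): after each scaling, a block moves only when its round-robin target falls on a new disk, and the resulting skew across old disks becomes visible after a couple of scalings.

```python
from collections import Counter

def semi_rr_disk(block, disk_counts):
    """Disk holding `block` after a sequence of scalings under semi-RR.
    `disk_counts` lists the total disk count after each scaling,
    e.g. [4, 5, 6] for a 4-disk RAID scaled twice by one disk."""
    disk = block % disk_counts[0]          # initial round-robin layout
    for prev, cur in zip(disk_counts, disk_counts[1:]):
        target = block % cur               # round-robin target after scaling
        if target >= prev:                 # target is a new disk -> move
            disk = target
    return disk

# Distribution of 10,000 blocks after scaling 4 -> 5 -> 6 disks:
counts = Counter(semi_rr_disk(b, [4, 5, 6]) for b in range(10000))
print(sorted(counts.items()))              # the disks are unevenly loaded
```

The printed counts differ across disks, which is exactly the non-uniformity that semi-RR suffers after multiple scaling operations.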
FastScale: Min. Migration & Uniform Distribution
- Take RAID scaling from 3 disks to 5 as an example.
- One RAID scaling process can be divided into two stages logically:
– data migration and
– data filling.
- All the data blocks within a parallelogram will be moved.
– 2 data blocks are migrated from each old disk,
– while each block's physical block number is unchanged.
- An elastic function describes the data layout.
FastScale: Property Examination
- Does FastScale satisfy the three requirements?
– Compared with the round-robin and semi-RR algorithms.
- Starting from a 4-disk array, we add one disk repeatedly 10 times, using the three algorithms respectively.
- Each disk has a capacity of 128 GB, and the block size is 64 KB.
– In other words, each disk holds 2M blocks.
Comparison in Migration Fraction
[Figure: migration fraction vs. number of disk additions (1-11) for Round-Robin, Semi-RR, and FastScale.]
- Using the round-robin algorithm,
– the migration fraction is constantly 100%.
- Using semi-RR and FastScale,
– the migration fractions are identical.
– They are significantly smaller.
– Restricted by uniformity, they are also minimal.
Compared in migration fraction, Semi-RR and FastScale win!
Comparison in Uniformity of Distribution
- We use the coefficient of variation as a metric to evaluate the uniformity of data distribution across all the disks.
– The C.V. expresses the standard deviation as a percentage of the average.
[Figure: coefficient of variation (%) vs. number of disk additions (1-11) for Round-Robin, FastScale, and Semi-RR.]
- For the round-robin and FastScale algorithms,
– the C.V. remains 0 percent as the number of additions increases.
- The semi-RR algorithm
– causes excessive oscillation in the C.V.
– The maximum is even 13.06%.
Compared in uniformity of distribution, Semi-RR fails and FastScale wins again!
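The coefficient-of-variation metric used in this comparison is straightforward to reproduce; this is a small illustrative Python helper (not from the paper), applied to per-disk block counts:

```python
import statistics

def coefficient_of_variation(block_counts):
    """C.V. = standard deviation expressed as a percentage of the mean,
    applied here to per-disk block counts to measure uniformity."""
    mean = statistics.mean(block_counts)
    return 100.0 * statistics.pstdev(block_counts) / mean

print(coefficient_of_variation([1000, 1000, 1000, 1000]))            # 0.0: uniform
print(round(coefficient_of_variation([1200, 800, 1200, 800, 1000, 1000]), 2))
```

A C.V. of 0% means every disk holds exactly the average number of blocks; larger values indicate the kind of load imbalance semi-RR accumulates.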
Comparison in Calculation Overhead
- We run the different algorithms to calculate the physical addresses of all data blocks on a scaled RAID.
– The average addressing time for each block is calculated.
– Setup: Intel Dual Core T9400 2.53 GHz, 4 GB memory, Windows 7.
[Figure: average addressing time (µs) vs. number of disk additions (1-11) for Round-Robin, FastScale, and Semi-RR.]
- The round-robin algorithm has the lowest overhead,
– 0.014 µs or so.
- FastScale has the largest overhead.
– The largest time is 0.24 µs.
Compared to the milliseconds of disk I/O time, the calculation overhead is negligible.
Outline
- Motivation
- Minimizing data migration
- Optimizing data migration
- Evaluation
- Conclusions
I/O Aggregation
- Aggregate read:
– Multiple successive blocks on a disk are read via a single I/O.
- Aggregate write:
– Multiple successive blocks on a disk are written via a single I/O.
I/O aggregation converts small requests into fewer, larger requests; the seek cost is amortized over multiple blocks.
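The aggregation idea can be sketched as a coalescing pass over block numbers (an illustrative Python sketch, not the authors' code): runs of consecutive blocks are merged so each run can be served by one larger I/O.

```python
def aggregate(blocks):
    """Coalesce block numbers into (start, length) runs; each run
    can then be issued as a single larger I/O request."""
    runs = []
    for b in sorted(blocks):
        if runs and b == runs[-1][0] + runs[-1][1]:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)   # extend current run
        else:
            runs.append((b, 1))                         # start a new run
    return runs

print(aggregate([7, 3, 4, 5, 9, 8]))   # [(3, 3), (7, 3)]
```

Six single-block requests collapse into two three-block requests, paying two seeks instead of six.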
Why can Lazy Checkpointing Work?
- Each metadata update causes one long seek:
– metadata is usually stored at the beginning of member disks.
- After data copying, both the new replica and the original are valid.
– Block copying does not overwrite any valid data.
- When the system fails and reboots, the original replica will be used.
- As long as data has not been written since being copied, the data remain consistent.
– Only some I/Os are wasted.
[Figure: mapping metadata and migrated block copies across disks D0-D4.]
Not updating metadata immediately does not sacrifice data reliability. The only threat is a write to migrated data.
Lazy Checkpointing
- Data blocks are copied to new locations continuously,
– while the mapping metadata is not updated onto the disks until a threat to data consistency appears.
- In the figure:
– "C": migrated and checkpointed;
– "M": migrated but not checkpointed;
– "U": not migrated.
- Only when a user write request arrives in the area "M" is data migration checkpointed.
Lazy checkpointing minimizes the number of metadata writes without loss of data consistency.
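The checkpointing rule above can be modeled as a small state machine (a hedged Python illustration with assumed names, not the authors' implementation): migration advances a watermark, and only a user write into the migrated-but-unrecorded "M" region forces a metadata flush.

```python
class LazyCheckpointer:
    """Toy model of lazy checkpointing. Blocks below `checkpointed` are
    'C', blocks in [checkpointed, migrated) are 'M', the rest are 'U'."""
    def __init__(self):
        self.checkpointed = 0    # point reflected by on-disk metadata
        self.migrated = 0        # point copying has actually reached
        self.metadata_writes = 0

    def migrate_next(self):
        self.migrated += 1       # copy one more block; metadata untouched

    def user_write(self, block):
        # A write into the 'M' region threatens consistency:
        # flush metadata first, turning all 'M' blocks into 'C'.
        if self.checkpointed <= block < self.migrated:
            self.checkpointed = self.migrated
            self.metadata_writes += 1

cp = LazyCheckpointer()
for _ in range(100):
    cp.migrate_next()
cp.user_write(42)                # one flush covers all 100 migrated blocks
print(cp.metadata_writes)        # 1 instead of 100 eager updates
```

Writes to already-checkpointed ("C") or not-yet-migrated ("U") blocks never trigger a flush, which is how the metadata write count stays small.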
Outline
- Motivation
- Minimizing data migration
- Optimizing data migration
- Evaluation
- Conclusions
Evaluation
- Questions that we want to answer:
– Can FastScale accelerate RAID scaling?
– What is the effect on user workloads?
– How about the performance of a scaled RAID?
- We used detailed simulations to compare with SLAS,
– the simulator is implemented with DiskSim as a worker module,
– with several disk traces collected in real systems.
– The traces are TPC-C, the Financial trace from SPC, and the Web search engine trace from SPC.
Evaluation
- The simulator is made up of a workload generator and a disk array.
– The workload generator initiates an I/O request at the appropriate time.
- The disk array consists of
– an array controller and
– storage components.
- The array controller is logically divided into
– an I/O processor and
– a data mover.
- The simulator is implemented in SimPy and DiskSim.
Scaling under the Financial Workload
- Under the Financial workload, we conduct a scaling operation:
– adding 2 disks to a 4-disk RAID,
– each disk has a capacity of 4 GB,
– with a 32 KB stripe unit size.
- The figure plots local maximum latencies as time increases.
[Figure: latency (ms) vs. timeline (s); SLAS ends at 6,830 s, FastScale ends at 952 s.]
- FastScale accelerates RAID scaling significantly:
– 952 s vs. 6,830 s, an 86.06% improvement.
- Local maximum latencies are also smaller.
Scaling under the TPC-C Workload
- Under the TPC-C workload, we redo the scaling:
– adding 2 disks to a 4-disk RAID.
- The figure plots local maximum latencies as time increases.
[Figure: latency (ms) vs. timeline (s); SLAS ends at 6,820 s, FastScale ends at 964 s.]
- Once again, the result shows the efficiency in improving redistribution time:
– 964 s vs. 6,820 s, an 85.87% improvement.
- Local maximum latencies are also smaller.
FastScale improves the scaling efficiency of RAID significantly.
After One Scaling Operation
- We compared the performance of two RAIDs scaled using FastScale and SLAS:
– "4+1": adding 1 disk to a 4-disk RAID.
- We replayed the Web workload on the two RAIDs.
- The figure plots local average latencies as time increases.
[Figure: average latency (ms) vs. timeline (s) for the round-robin and FastScale RAIDs.]
- The performances of the two RAIDs are very close:
– For the round-robin RAID, the average latency is 11.36 ms.
– For the FastScale RAID, the average latency is 11.37 ms.
After Two Scaling Operations
- We compared the performance of two RAIDs scaled twice using FastScale and SLAS:
– "4+1+1": adding 1 disk to a 4-disk RAID twice.
- The figure plots local average latencies as time increases.
[Figure: average latency (ms) vs. timeline (s) for the round-robin and FastScale RAIDs.]
- It again reveals the approximate equality of the performances:
– For the round-robin RAID, the average latency is 11.21 ms.
– For the FastScale RAID, the average latency is 11.03 ms.
The performance of the FastScale RAID-0 is almost identical with that of the round-robin RAID-0.
Outline
- Motivation
- Minimizing data migration
- Optimizing data migration
- Evaluation
- Conclusions
Conclusions
- FastScale accelerates RAID-0 scaling significantly:
– it minimizes data migration without loss of uniformity of the data distribution;
– it optimizes data migration with I/O aggregation and lazy checkpointing.
- Compared with a round-robin scaling approach, FastScale can
– reduce redistribution time by up to 86.06%,
– with smaller maximum response time of user I/Os.
- The performance of the RAID scaled using FastScale is almost identical with that of the round-robin RAID.
Thank you! Questions?
Guangyan Zhang
http://storage.cs.tsinghua.edu.cn/~zgy
How is a Block Moved?
- A parallelogram is divided into three parts:
– a head triangle (unchanged shape),
– a body parallelogram,
– a tail triangle (unchanged shape).
- The body parallelogram:
– If m >= n, it is not a rectangle; change it into a rectangle.
– Otherwise, change the rectangle into a parallelogram.
[Figure: head/body/tail decomposition for (a) m >= n and (b) m < n.]
Comparison in Local Average Latencies
- Under the Financial workload, we conduct a scaling operation:
– adding 2 disks to a 4-disk RAID,
– each disk has a capacity of 4 GB,
– with a 32 KB stripe unit size.
- The figure plots local average latencies as time increases.
[Figure: latency (ms) vs. timeline (s); SLAS ends at 6,830 s, FastScale ends at 952 s.]
- Local average latencies are close:
– FastScale 8.01 ms,
– SLAS 7.53 ms.
- FastScale has a much shorter data redistribution time.