The SCADS Director: Scaling a Distributed Storage System Under Stringent Performance Requirements
Beth Trushkowsky, Peter Bodík, Armando Fox, Michael J. Franklin, Michael I. Jordan, David A. Patterson
FAST 2011
elasticity for interactive web apps
Interactivity Service-Level Objective: over any 1-minute interval, 99% of requests are satisfied in less than 100ms
Targeted systems features:
- horizontally scalable
- API for data movement
- backend for interactive apps
…
[Diagram: clients → web servers → storage backend]
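As a concrete reading of that SLO, here is a minimal sketch; the function name and sampling details are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch (not the paper's code): check the interactivity SLO
# "over any 1-minute interval, 99% of requests finish in under 100 ms"
# against a list of (timestamp_seconds, latency_ms) request records.
import numpy as np

SLO_WINDOW_S = 60       # evaluation interval: 1 minute
SLO_PERCENTILE = 99     # the SLO constrains the 99th percentile
SLO_THRESHOLD_MS = 100  # latency bound in milliseconds

def slo_satisfied(requests, window_start):
    """True if the SLO holds for the 1-minute window starting at window_start."""
    window = [lat for (t, lat) in requests
              if window_start <= t < window_start + SLO_WINDOW_S]
    if not window:
        return True  # no traffic in the window, nothing to violate
    return np.percentile(window, SLO_PERCENTILE) < SLO_THRESHOLD_MS
```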
wikipedia workload trace - June 2009
Michael Jackson dies
overprovisioning the storage system
(assuming data stored on ten servers)
Overprovision by 300% to handle the spike
contributions
Cloud computing is a mechanism for storage elasticity
- Scale up when needed
- Scale down to save money
We address the scaling policy
- Challenges of latency-based scaling
- Model-based approach for elasticity to deal with a stringent SLO
- Fine-grained workload monitoring aids in scaling up and down
- Show elasticity for both a hotspot and a diurnal workload pattern
SCADS key/value store
Features
- Partitioning (until some minimum data size)
- Replication
- Add/remove servers
Properties
- Range-based partitioning (sketched below)
- Data maintained in memory for performance
- Eventually consistent
(see SCADS: Scale-independent storage for social computing applications, CIDR’09)
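For intuition, a toy sketch of the range-based partitioning noted above; the class, split keys, and node names are illustrative, not SCADS's actual API:

```python
# Toy sketch of range-based partitioning (illustrative, not SCADS's API):
# each server owns a contiguous range of the sorted key space.
import bisect

class RangePartitioner:
    def __init__(self, split_keys, servers):
        # split_keys must be sorted; len(servers) == len(split_keys) + 1
        self.split_keys = split_keys
        self.servers = servers

    def server_for(self, key):
        # bisect_right finds which contiguous key range contains the key
        return self.servers[bisect.bisect_right(self.split_keys, key)]

p = RangePartitioner(["g", "n", "t"], ["node1", "node2", "node3", "node4"])
assert p.server_for("apple") == "node1"   # keys below "g" live on node1
assert p.server_for("zebra") == "node4"   # keys above "t" live on node4
```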
classical closed-loop control for elasticity?
[Diagram: closed control loop - SCADS cluster → sampled latency → upper %-tile latency → Controller → actions → Action Executor → config → SCADS cluster]
oscillations from a noisy signal
[Plot: 99th %-tile latency over time]
Noisy signal… Will smoothing help?
too much smoothing masks spike
[Plot: smoothed 99th %-tile latency over time]
variation for smoothing intervals
[Plot: standard deviation [ms] (log scale) of the smoothed signal vs. smoothing interval [min], for 99th %-tile latency and mean latency of SCADS running on Amazon EC2; raw 99th % and raw mean shown for comparison]
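The tradeoff shown in that plot can be reproduced with a simple moving average; the window sizes and sampling rate below are illustrative assumptions:

```python
# Sketch of the smoothing tradeoff: a moving average over the sampled
# 99th-percentile latency. Wider windows lower the standard deviation of the
# signal but delay and eventually mask short spikes. Illustrative only; the
# sampling rate (samples_per_minute) is an assumption, not from the paper.
import numpy as np

def smooth(latency_p99, window_minutes, samples_per_minute=3):
    """Moving average of the sampled 99th-percentile latency signal."""
    w = int(window_minutes * samples_per_minute)
    return np.convolve(latency_p99, np.ones(w) / w, mode="valid")
```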
model-predictive control (MPC)
MPC instead of classical closed-loop
- Upper %-tile latency is a noisy signal
- Use per-server workload as predictor of upper %-tile latency
- Therefore need a model that predicts SLO violations based on observed workload
Reacting with MPC
- Use model of the system to determine a sequence of actions to change state to meet constraint
- Execute first steps, then re-evaluate
[Diagram: Model - workload → SLO violation?]
model-predictive control loop
[Diagram: MPC loop - SCADS cluster → sampled workload → Workload Histogram → smoothed workload → Controller with Performance Models → actions → Action Executor → config; sampled latency / upper %-tile latency shown alongside]
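One iteration of the loop above might look like the following sketch; the helper names (predicts_violation, offload_hot_bins) are stand-ins, not the Director's actual interfaces:

```python
# Minimal sketch of one MPC iteration (illustrative; helper names are
# stand-ins for the Director's internals).
def control_step(raw_rates, smoothed_rates, config, model, executor, alpha=0.1):
    # 1. Smooth the per-bin workload histogram; workload, not latency, is the
    #    control signal because the upper-percentile latency is too noisy.
    for b, rate in raw_rates.items():
        smoothed_rates[b] = (1 - alpha) * smoothed_rates.get(b, rate) + alpha * rate

    # 2. Ask the performance model which servers are predicted to violate the
    #    SLO under their current bins and smoothed workload.
    overloaded = [s for s, bins in config.items()
                  if model.predicts_violation(sum(smoothed_rates[b] for b in bins))]

    # 3. Execute only the first corrective actions (replicate / partition /
    #    allocate), then re-evaluate on the next control period.
    for s in overloaded:
        executor.offload_hot_bins(s)
```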
building a performance model
Benchmark SCADS servers on Amazon's EC2
- Steady-state model
- Single server capacity
- Explore space of possible workloads
- Binary classifier: SLO violation or not
[Plot: single-server benchmark, put workload [req/sec] vs. get workload [req/sec] for 50/50, 80/20, 90/10, and 95/5 request mixes, each point labeled No violation / Violation]
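The steady-state model is essentially a binary classifier over (get rate, put rate) points like those in the benchmark plot. A hedged sketch, with made-up placeholder numbers and logistic regression standing in for whatever classifier is actually used:

```python
# Sketch of the steady-state performance model: a classifier trained on
# single-server benchmark points labeled with whether the SLO was violated.
# The numbers below are made-up placeholders, and logistic regression is an
# illustrative choice, not necessarily the paper's model.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[2000, 200], [4000, 400], [6000, 600], [8000, 800],
              [7000, 2000], [8000, 2500], [3000, 2500], [9000, 1000]])  # [get, put] req/s
y = np.array([0, 0, 0, 1, 1, 1, 1, 1])                                  # 1 = SLO violated

model = LogisticRegression(max_iter=1000).fit(X, y)

def predicts_violation(get_rate, put_rate):
    """True if one server at this workload is predicted to violate the SLO."""
    return bool(model.predict([[get_rate, put_rate]])[0])
```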
how much data to move?
[Plot: workload (requests/sec) over time]
finer-granularity workload monitoring
Need fine-grained workload monitoring
- Data movement especially impacts tail of latency distribution
- Only move enough data to alleviate performance issues
- Move data quickly
- Better for scaling down later
- Monitor workload on small units of data (bins)
- Move/copy bins between servers (see the sketch below)
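A minimal sketch of per-bin monitoring; the class and counter layout are illustrative assumptions, not the Director's implementation:

```python
# Illustrative sketch of fine-grained workload monitoring: requests are
# counted per bin (a small contiguous key range), so the controller can move
# or replicate only the hot bins instead of large chunks of a server's data.
from collections import Counter

class WorkloadMonitor:
    def __init__(self):
        self.counts = Counter()

    def record(self, bin_id):
        self.counts[bin_id] += 1          # one request hit this bin

    def per_bin_rates(self, interval_seconds):
        """Per-bin request rates over the last monitoring interval."""
        rates = {b: c / interval_seconds for b, c in self.counts.items()}
        self.counts.clear()               # reset for the next interval
        return rates
```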
summary of approach
Fine-grained monitoring and performance model
- Determine amount of data to move from overloaded server
- Estimate how much "extra room" an underloaded server has
- Know when safe to coalesce servers
Replication for predictability and robustness
See paper and/or tonight’s poster session
controller stages
[Diagram: controller stages - Stage 1: Replicate, Stage 2: Partition, Stage 3: Allocate servers; bins compared against a workload threshold and laid out across storage nodes N1-N7]
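Read very roughly, the three stages pictured above might look like this sketch; the greedy packing, threshold handling, and data structures are simplified stand-ins, not the Director's actual algorithm:

```python
# Illustrative sketch of the three controller stages (replicate, partition,
# allocate servers). Simplified stand-in, not the Director's code.
import math

def run_stages(bin_rates, workload_threshold, server_capacity):
    # Stage 1: Replicate -- a bin hotter than the threshold gets enough
    # replicas that each replica's share falls back under the threshold.
    replicas = {b: max(1, math.ceil(rate / workload_threshold))
                for b, rate in bin_rates.items()}

    # Stage 2: Partition -- greedily pack bin replicas onto servers without
    # exceeding each server's modeled capacity or co-locating two replicas
    # of the same bin.
    servers, load = [], []
    for b, rate in sorted(bin_rates.items(), key=lambda kv: -kv[1]):
        share = rate / replicas[b]
        for _ in range(replicas[b]):
            for i in range(len(servers)):
                if b not in servers[i] and load[i] + share <= server_capacity:
                    servers[i].append(b)
                    load[i] += share
                    break
            else:
                servers.append([b])
                load.append(share)

    # Stage 3: Allocate servers -- the cluster needs len(servers) storage
    # nodes; nodes are added or released to match that target.
    return servers
```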
experimental results
Experiment setup
- Up to 20 SCADS servers run on m1.small instances on Amazon EC2
- Server capacity: 800MB, due to in-memory restriction
- 5-10 data bins per server
- 100ms SLO on read latency
Workload profiles
- Hotspot: 100% workload increase in five minutes on a single data item; based on spike experienced by CNN.com on 9/11
- Diurnal: workload increases during the day, decreases at night; replayed trace at 12x speedup
extra workload directed to single data item
[Plots: aggregate request rate and per-bin request rate over time [min] during the hotspot; the hot bin vs. the other 199 bins]
replicating hot data
[Plots: per-bin request rate, 99th %-tile latency (ms), and number of servers over time [min] while the hot data is replicated]
scaling up and down
- Number of servers: two experiments close to "ideal"
- Over-provisioning tradeoff: amplify workload by 10%, 30%
- Savings: known peak: 16%; 30% headroom: 41%
[Plots: aggregate request rate (workload rate [req/s]) and number of servers over simulated time [min] for the diurnal workload; ideal vs. elastic 10% vs. elastic 30%]
cost-risk tradeoff
Over-provisioning
- Allows more time before violation occurs
- Cost-risk tradeoff
Comparing over-provisioning for diurnal experiment
- Recall SLO parameters: threshold, percentile, interval
- Over-provisioning factor of 30% vs 10%

Max percentile achieved:
Interval   30%    10%
5 min      99.5   99
1 min      99     95
20 sec     95     90
conclusion
- Elasticity for storage servers possible by leveraging cloud computing
- Upper percentile too noisy
- Model-based approach to build control framework for elasticity subject to stringent performance SLO
- Finer-grained workload monitoring
  - Minimize impact of data movement on performance
  - Quickly respond to workload fluctuations
- Evaluated on EC2 with hotspot and diurnal workloads
increasing replication
[Plot: CDF of 99th percentile latency [ms] with varying replication; 5 nodes/1 replica, 10 nodes/2 replicas, 15 nodes/3 replicas]