Cake: Enabling High-level SLOs on Shared Storage Systems
Andrew Wang, Shivaram Venkataraman, Sara Alspaugh, Randy Katz, Ion Stoica
University of California, Berkeley
SoCC 2012
2
Introduction
Problem and Challenge
Solutions
System Design
Implementation
Evaluation
Conclusion
Future Work
Rich web applications
A single slow storage request can dominate the overall response time of a page
High-percentile latency SLOs
Address the tail latency observed at the 95th or 99th percentile
4
Datacenter applications
Latency-sensitive and throughput-oriented workloads
Both access distributed storage systems
Today, applications don't share storage systems
Each has service-level objectives on throughput or latency
5
SLOs
Reflect the performance expectations of end users
Amazon, Google, and Microsoft have identified SLO violations as a major cause of user dissatisfaction
For example:
A web client might require a 99th percentile latency SLO of 100ms
A batch job might require a throughput SLO of 100 scan requests per second
6
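The percentile SLO above can be made concrete with a small sketch. This is illustrative code, not part of Cake: the nearest-rank percentile definition and the helper names are assumptions chosen for clarity.

```python
# Sketch: checking compliance with a high-percentile latency SLO.
# The 99th-percentile / 100 ms target mirrors the example above;
# helper names are illustrative, not from Cake.

def percentile(samples, p):
    """Return the p-th percentile of a list of latencies (nearest-rank)."""
    ordered = sorted(samples)
    # Nearest-rank definition: the ceil(p/100 * N)-th value, 1-indexed.
    rank = max(1, -(-len(ordered) * p // 100))  # ceiling division
    return ordered[rank - 1]

def meets_latency_slo(samples_ms, p=99, target_ms=100):
    return percentile(samples_ms, p) <= target_ms

latencies = [12, 15, 9, 180, 20, 11, 14, 16, 10, 13]
print(meets_latency_slo(latencies))  # False: one 180 ms outlier dominates the tail
```

Note how a single slow request drags the 99th percentile far above the median, which is exactly why tail-latency SLOs are harder to meet than average-latency ones.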
Physically separating storage systems
Each system must be provisioned for its individual peak load
Segregation of data leads to degraded user experience
Adds operational complexity
7
Focusing solely on controlling disk-level resources
High-level storage SLOs require consideration of resources beyond the disk
Disconnect between high-level SLOs and low-level performance parameters like MB/s
Requires tedious, manual translation by the programmer or system operator
8
9
Architecture
10
First-level schedulers at each resource
Provide mechanisms for differentiated scheduling
Split large requests into smaller chunks
Limit the number of outstanding device requests
11
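The two first-level mechanisms can be sketched as follows. This is a minimal illustration, not Cake's implementation: the chunk size, outstanding-request bound, and function names are assumptions.

```python
# Sketch of the two first-level mechanisms: splitting large requests
# into chunks, and bounding the number of requests outstanding at the
# device. Chunk size and limit are illustrative, not values from Cake.
import threading

CHUNK_BYTES = 64 * 1024        # hypothetical chunk size
MAX_OUTSTANDING = 4            # hypothetical device-queue bound
_outstanding = threading.BoundedSemaphore(MAX_OUTSTANDING)

def split_request(offset, length, chunk=CHUNK_BYTES):
    """Split one large read into (offset, length) chunks, so a
    latency-sensitive request can be interleaved between chunks."""
    chunks = []
    while length > 0:
        n = min(chunk, length)
        chunks.append((offset, n))
        offset += n
        length -= n
    return chunks

def issue_to_device(offset, length, read_fn):
    """Issue one chunk, blocking while too many requests are in flight."""
    with _outstanding:
        return read_fn(offset, length)

print(split_request(0, 200 * 1024))  # four chunks: 64K, 64K, 64K, 8K
```

Both mechanisms serve the same goal: keeping any single large batch request from monopolizing the device, so a high-priority request never waits behind an unbounded amount of queued work.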
Cake’s second-level scheduler
Continually adjusts resource allocation at each first-level scheduler
Maximizes SLO compliance of the system while attempting to increase utilization
12
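The second-level scheduler's role can be pictured as a simple feedback loop. The sketch below is an assumption-laden illustration, not Cake's algorithm: the fixed step size and the client dictionary structure are invented for clarity.

```python
# Sketch of a second-level control loop: each interval, compare every
# client's measured performance against its SLO and shift allocation
# toward clients missing their targets. Step size and data shapes are
# illustrative assumptions, not Cake's actual policy.

def adjust_shares(shares, slo_met, step=0.05):
    """shares: {client: fraction}; slo_met: {client: bool}.
    Move `step` of allocation from compliant to non-compliant clients."""
    missing = [c for c, ok in slo_met.items() if not ok]
    meeting = [c for c, ok in slo_met.items() if ok]
    if not missing or not meeting:
        return shares  # nothing to rebalance this interval
    new = dict(shares)
    taken = 0.0
    for c in meeting:
        give = min(step / len(meeting), new[c])
        new[c] -= give
        taken += give
    for c in missing:
        new[c] += taken / len(missing)
    return new

shares = {"front_end": 0.5, "batch": 0.5}
print(adjust_shares(shares, {"front_end": False, "batch": True}))
```

Run periodically, a loop like this drains allocation away from the batch client only while the front-end client is out of compliance, which is what lets the system chase both SLO compliance and utilization.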
Differentiated scheduling
13
Split large requests
Control number of outstanding requests
14
Multi-resource Request Lifecycle
Request processing in a storage system involves far more than just accessing the disk
This necessitates a coordinated, multi-resource approach to scheduling
15
Multi-resource Request Lifecycle
16
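The point of the lifecycle slide can be shown numerically: end-to-end latency accumulates queueing and service time at every resource a request crosses, not just at the disk. The stage names and timings below are illustrative assumptions.

```python
# Sketch of a multi-resource request lifecycle: a request crosses
# several scheduling points (RPC handlers, processing threads, disk),
# so an end-to-end latency SLO must account for queueing at each one.
# Stage names and millisecond values are illustrative, not measured.
from collections import namedtuple

Stage = namedtuple("Stage", "name queue_ms service_ms")

def end_to_end_latency(stages):
    """Total latency is queueing plus service summed over every
    resource, not just time spent at the disk."""
    return sum(s.queue_ms + s.service_ms for s in stages)

lifecycle = [
    Stage("rpc_handler", queue_ms=3,  service_ms=1),
    Stage("processing",  queue_ms=5,  service_ms=2),
    Stage("disk",        queue_ms=10, service_ms=8),
]
print(end_to_end_latency(lifecycle))  # 29 ms total; the disk explains only 18
```

A scheduler that controls only the disk stage leaves the other queueing terms unmanaged, which is the motivation for coordinating scheduling across layers.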
High-level SLO Enforcement
Cake’s second-level scheduler enforces the SLOs of front-end clients; batch clients receive the remaining capacity
Two phases of second-level scheduling decisions:
SLO compliance-based phase
Queue occupancy-based phase
17
The initial SLO compliance-based phase
Decides on disk allocations based on client performance
The queue occupancy-based phase
Balances allocations in the rest of the system to keep the disk utilized and improve overall performance
18
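The two phases can be sketched as a pair of small adjustment rules. This is a simplified illustration under stated assumptions (fixed step sizes, invented headroom and occupancy thresholds), not the controller from the paper.

```python
# Sketch of the two scheduling phases above. Phase 1 sets the disk
# allocation from SLO compliance; phase 2 sizes the upper layer's
# (e.g. HBase's) allocation from disk-queue occupancy so the disk
# stays utilized. All thresholds and steps are illustrative.

def slo_phase(disk_share, latency_p99_ms, target_ms, step=0.05):
    """Grow the front-end's disk share while its latency SLO is missed,
    shrink it (reclaiming capacity for batch) when there is headroom."""
    if latency_p99_ms > target_ms:
        return min(1.0, disk_share + step)
    if latency_p99_ms < 0.8 * target_ms:   # assumed headroom threshold
        return max(0.0, disk_share - step)
    return disk_share

def occupancy_phase(upper_share, disk_queue_occupancy,
                    low=0.2, high=0.8, step=0.05):
    """Keep the disk fed: if its queue runs empty, admit more work at
    the upper layer; if it backs up, admit less."""
    if disk_queue_occupancy < low:
        return min(1.0, upper_share + step)
    if disk_queue_occupancy > high:
        return max(0.0, upper_share - step)
    return upper_share

print(slo_phase(0.5, latency_p99_ms=120, target_ms=100))  # grows the share
print(occupancy_phase(0.5, disk_queue_occupancy=0.1))     # disk starved: admit more
```

The division of labor matters: the disk is the bottleneck resource, so SLO compliance drives its allocation directly, while the other layers are tuned only enough to keep the disk's queue from starving or overflowing.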
Chunking Large Requests
19
Number of Outstanding Requests
20
Cake Second-level Scheduler — SLO Compliance-based Scheduling
21
Cake Second-level Scheduler — Queue Occupancy-based Scheduling
22
Proportional Shares and Reservations
When the front-end client is sending low throughput, reservations are an effective way of reducing queue time at HDFS
23
Proportional Shares and Reservations
When the front-end is sending high throughput, proportional share is an effective mechanism for reducing latency
24
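How a reservation and a proportional share can combine is easy to sketch: each client first receives its reserved minimum, then the leftover capacity is split by share weight. The allocator below is an illustration with invented numbers, not Cake's scheduler.

```python
# Sketch: combining reservations with proportional shares. Each client
# first gets its reserved fraction of slots; the remainder is divided
# by share weight. Client names and numbers are illustrative.

def allocate(total_slots, clients):
    """clients: {name: (reserved_fraction, share_weight)} -> {name: slots}."""
    alloc = {}
    remaining = total_slots
    for name, (reserved, _weight) in clients.items():
        r = int(total_slots * reserved)   # guaranteed minimum
        alloc[name] = r
        remaining -= r
    total_weight = sum(w for _, w in clients.values()) or 1
    for name, (_reserved, weight) in clients.items():
        alloc[name] += remaining * weight // total_weight
    return alloc

clients = {"front_end": (0.2, 3), "batch": (0.0, 1)}
print(allocate(100, clients))  # front_end: 20 reserved + 60 share = 80; batch: 20
```

This mirrors the trade-off on the two slides above: the reservation term guarantees capacity when the front-end sends little traffic, while the proportional-share term governs the split under high load.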
Single vs Multi-resource Scheduling
CPU contention arises within HBase when running many concurrent threads without separate queues and differentiated scheduling
25
Single vs. Multi-resource Scheduling
Thread-per-request displays greatly increased latency with chunked request sizes
26
Convergence Time
Diurnal Workload
Spike Workload
Latency vs. Throughput Trade-off
Quantifying Benefits of Consolidation
27
Coordinating resource allocation across multiple software layers
Allowing application programmers to specify high-level SLOs directly to the storage system
Allowing consolidation of latency-sensitive and throughput-oriented workloads
28
Allowing users to flexibly move within the storage latency vs. throughput trade-off by choosing different high-level SLOs
Using Cake has concrete economic and business advantages
29
SLO admission control
Influence of DRAM and SSDs
Composable application-level SLOs
Automatic parameter tuning
Generalization to multiple SLOs
30