Incentivizing Self-Capping to Increase Cloud Utilization



SLIDE 1

Incentivizing Self-Capping to Increase Cloud Utilization

Mohammad Shahrad, Cristian Klein, Liang Zheng, Mung Chiang, Erik Elmroth, David Wentzlaff

September 25, 2017

SLIDE 2

Installment cost of a datacenter: ~$100M [1]

Google traces [2]:

  • 40% CPU utilization
  • 53% memory utilization

Energy efficiency

Provider competitiveness

[1] J. Koomey, A Simple Model for Determining True Total Cost of Ownership for Data Centers, Uptime Institute White Paper, Version 2, 2007.
[2] C. Reiss, A. Tumanov, G. R. Ganger, R. H. Katz, M. A. Kozuch, Towards understanding heterogeneous clouds at scale: Google trace analysis. Technical Report ISTC-CC-TR-12-101, Carnegie Mellon University, Pittsburgh, PA, USA, Apr. 2012.

SLIDE 3

Workload Matters


[1] Barroso, L. A., Clidaras, J., & Hölzle, U. (2013). The datacenter as a computer: An introduction to the design of warehouse-scale machines. Synthesis lectures on computer architecture, 8(3), 1-154.

Large continuous batch workloads

  • Jan. to Mar. 2013, 20,000-server clusters

Mix of workloads including online services

[Figure: two panels, each with the axis label "Utilization"]

SLIDE 4

Dealing with Low Utilization


1. More efficient resource provisioning
  • Better resource sharing/reclamation (e.g. Borg)
  • Antagonist co-location
  • Resource overbooking
2. Improve deployment models
  • Resource bidding (e.g. Spot instances)
  • Burstable instances
  • Long-term SLOs
  • Availability Knob
SLIDE 5

Managing Uncertainty is Fundamentally Challenging

  • QoS / SLOs
  • Demand fluctuations
  • Spare capacity

Cloud services have become more and more elastic.

Offloading some of the burden to tenants?

SLIDE 6

Motivating Tenants to Have Fewer Fluctuations

  • Provide a mechanism to control capacity demand fluctuations
  • Economic incentives to change behavior
  • Graceful degradation

SLIDE 7

Graceful Degradation (GD) Methodology


Dynamic Adaptive Streaming over HTTP (DASH)

Brownout self-adaptation

  • Maintain response time
  • Deactivating non-essential content

E.g. online recommendations

  • Make revenue
  • Compute-heavy
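
The brownout idea above (keep response time steady by switching off non-essential content such as recommendations) can be sketched as a small feedback controller. This is only an illustration: the class `BrownoutDimmer`, its `gain` parameter, and the proportional update rule are assumptions made for this sketch, not the controller used by the authors.

```python
import random


class BrownoutDimmer:
    """Hypothetical brownout controller (illustrative, not the authors' design).

    The dimmer is the probability of serving optional content. It is lowered
    when measured latency exceeds the target and raised when there is slack.
    """

    def __init__(self, target_latency, gain=0.5, dimmer=1.0):
        self.target_latency = target_latency
        self.gain = gain
        self.dimmer = dimmer

    def update(self, measured_latency):
        # Relative error: positive when the service is faster than the target.
        error = (self.target_latency - measured_latency) / self.target_latency
        # Proportional step, clamped to a valid probability.
        self.dimmer = min(1.0, max(0.0, self.dimmer + self.gain * error))
        return self.dimmer

    def serve_optional(self, rng=random):
        # Serve optional content with probability equal to the dimmer.
        return rng.random() < self.dimmer


def handle_request(dimmer, rng=random):
    """Build a response; the compute-heavy recommendations are optional."""
    page = ["product details"]
    if dimmer.serve_optional(rng):
        page.append("recommendations")
    return page
```

Calling `update` with each latency measurement and `handle_request` for each query lets the essential content always be served while the optional part degrades under load.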
SLIDE 8

Graceful Degradation Pricing Model

How to flatten capacity demand?


Cutting the peaks

Filling the valleys

SLIDE 9

Shaping The Capacity Demand

[Figure: Aggregate CPU Utilization (THz), from 0.0 to 3.5, over Days 1 to 7 (first week of Aug. 2013), with the levels Cmax, Cmin, Cb, and Cd marked]

  • Reserved capacity: always charged (price pb)
  • On-demand capacity: charged based on usage (price pd)
  • Capacity delivery limit: activate GD
  • GD helps shape the peaks.
  • pb < pd helps shape the valleys.

Globally dynamic price pair
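
A minimal sketch of how a tenant's bill could be computed under this pricing model. It assumes, beyond what the slide states, that demand is sampled at fixed intervals, that the reserved capacity is billed in every interval, that usage above the reserved capacity and up to the delivery limit is billed at pd, and that demand above the limit is simply capped (the point where GD would activate). The function name `bill` and its parameters are hypothetical.

```python
def bill(demand, reserved, limit, p_b, p_d):
    """Return (total_charge, capped_intervals) for a demand trace.

    Assumed model (not taken from the slides):
      - `reserved` units are charged at price p_b in every interval;
      - usage above `reserved`, up to `limit`, is charged at price p_d;
      - demand above `limit` is not delivered (GD is assumed to kick in).
    """
    if limit < reserved:
        raise ValueError("capacity limit must be at least the reserved capacity")
    total = 0.0
    capped = 0
    for d in demand:
        delivered = min(d, limit)          # cut the peak at the delivery limit
        if d > limit:
            capped += 1                    # interval in which GD would activate
        on_demand = max(0.0, delivered - reserved)
        total += p_b * reserved + p_d * on_demand
    return total, capped
```

For example, `bill([1, 2, 5], reserved=2, limit=4, p_b=1.0, p_d=1.5)` charges the reserved capacity three times and two on-demand units once, with one capped interval.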

SLIDE 10

System Overview

[Diagram: Capacity Controller, Price Controller, Hypervisor, GD-compliant Application, and Clients, divided between the Service Provider and the Infrastructure Provider; arrows labeled dynamic price, capacity request, capacities, capacity demand, and queries]

SLIDE 11

Tenants’ Profit Maximization

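Given a price pair, tenants can select the best capacity pair:

[Equation: the tenant's profit maximization over the capacity pair, with terms labeled Demand PDF*, Revenue Function, and Capacity Price, yielding the Optimal Reserved Capacity and the Optimal Capacity Limit]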

*PDF: Probability Density Function
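
The selection of a capacity pair can be illustrated numerically. The sketch below is not the closed-form result from the talk: it assumes a linear revenue function (`revenue_per_unit` per served unit of demand), uses a list of demand samples as a stand-in for the demand PDF, and searches a small grid by brute force. All names are hypothetical.

```python
from itertools import product


def expected_profit(samples, reserved, limit, p_b, p_d, revenue_per_unit):
    """Average profit over demand samples under an assumed linear revenue."""
    total = 0.0
    for d in samples:
        served = min(d, limit)                       # demand above the limit is degraded
        cost = p_b * reserved + p_d * max(0.0, served - reserved)
        total += revenue_per_unit * served - cost
    return total / len(samples)


def best_capacity_pair(samples, p_b, p_d, revenue_per_unit, grid):
    """Brute-force search for the (reserved, limit) pair with the highest profit.

    Returns a tuple (profit, reserved, limit); ties keep the first pair found.
    """
    best = None
    for reserved, limit in product(grid, repeat=2):
        if limit < reserved:
            continue                                  # limit cannot be below the reservation
        profit = expected_profit(samples, reserved, limit, p_b, p_d, revenue_per_unit)
        if best is None or profit > best[0]:
            best = (profit, reserved, limit)
    return best
```

With samples `[1, 2, 3, 4]`, `p_b=1`, `p_d=3`, and `revenue_per_unit=4`, the search reserves 3 units and sets the limit to 4. In this toy model, once `revenue_per_unit` falls below `p_d`, on-demand capacity never pays off, so no limit above the reservation improves the profit.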

SLIDE 12

Infrastructure Provider Controlling Utilization with Price

We prove that for all tenants:

Reserved capacity ~ Capacity limit ~ 1 / …

This empowers a robust feedback mechanism:

[Diagram: feedback loop over Infrastructure Utilization, Dynamic Price, and Subscription Renegotiation]
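
A toy sketch of such a feedback loop. It assumes, which the slide does not state explicitly, that each tenant's requested capacity limit is inversely proportional to the current price, and it uses a simple integral-style price update. The tenant tuples, the `gain`, and the function names are all hypothetical.

```python
def infrastructure_utilization(price, tenants):
    """Average effective utilization over tenants.

    Each tenant is (mean_usage, sensitivity); its requested limit is assumed
    to be sensitivity / price, so a higher price yields a tighter limit.
    """
    utils = [min(1.0, mean_usage * price / sensitivity)
             for mean_usage, sensitivity in tenants]
    return sum(utils) / len(utils)


def run_controller(tenants, target, price=1.0, gain=0.5, steps=50):
    """Adjust the price step by step until utilization approaches the target.

    Returns the list of (price, utilization) pairs observed at each step.
    """
    history = []
    for _ in range(steps):
        u = infrastructure_utilization(price, tenants)
        history.append((price, u))
        # Utilization below target: raise the price so tenants request less.
        price = max(1e-6, price + gain * (target - u))
    return history
```

The controller only observes aggregate utilization and never needs the tenants' individual demand models, which mirrors the point that the provider can steer utilization through price alone.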

SLIDE 13

Evaluation

  • Simulations on real-world traces
  • Implement and test a prototype

[Figure labels: Cutting Peaks, Filling Valleys, GD-compliant, GD-noncompliant]

Bitbrains and Materna traces:

  • Business-critical applications for enterprise customers

SLIDE 14


SLIDE 15

Even the Simplest Demand PDF Prediction Shows Major Gains

[Figure: four panels plotted against SLA Period (Days), 5 to 30, each comparing GD-compliant and GD-noncompliant tenants: Net Profit ($) with Simple Prediction, Net Profit ($) with Oracle (Perfect Prediction), Effective Utilization (ue) with Simple Prediction, and Effective Utilization (ue) with Oracle (Perfect Prediction)]

  • Effective utilization: from 41% to 73%
  • Profit: ~16%
  • Our simple prediction: using the PDF* of the previous period

*PDF: Probability Density Function
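
The "previous period" predictor can be sketched as follows: the demand observed during one SLA period is turned into an empirical distribution and used as the forecast for the next period. The binning scheme and the function names are assumptions for illustration, and the result is a probability mass per bin rather than a true density.

```python
def split_periods(trace, period_len):
    """Cut a demand trace into consecutive, full-length SLA periods."""
    return [trace[i:i + period_len]
            for i in range(0, len(trace) - period_len + 1, period_len)]


def empirical_pdf(samples, bin_width):
    """Map each bin's lower edge to the fraction of samples falling in it."""
    counts = {}
    for s in samples:
        b = int(s // bin_width)
        counts[b] = counts.get(b, 0) + 1
    n = len(samples)
    return {b * bin_width: c / n for b, c in sorted(counts.items())}


def predict_pdfs(trace, period_len, bin_width):
    """Forecast for period i+1 is the empirical distribution of period i.

    The first period has no forecast, so the result has one entry fewer
    than the number of periods.
    """
    periods = split_periods(trace, period_len)
    return [empirical_pdf(prev, bin_width) for prev in periods[:-1]]
```

A tenant would then feed each predicted distribution into its capacity selection at the start of the corresponding period.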

SLIDE 16

Improving Effective Utilization

Effective utilization (ue): the fraction of its requested capacity limit that a tenant has used

[Figure: Effective Utilization (ue), from 0.4 to 1, against a ratio over pd, from 0.8 to 1.8, for GD-compliant (k = k0 = 0.7), GD-compliant (k = 0.9 k0), GD-compliant (k = 1.1 k0), and GD-noncompliant tenants; regions annotated "Less sensitive to GD" and "More sensitive to GD"]

  • Capacity getting more expensive compared to revenue
  • Capacity getting cheaper compared to tenant’s revenue

Infrastructure utilization: average of tenants’ effective utilization
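
The two definitions above can be written out directly. This sketch assumes that usage is sampled at regular intervals, that usage is clipped at the requested limit, and that tenants are weighted equally in the average; those details are not given on the slide.

```python
def effective_utilization(usage, limit):
    """Fraction of the requested capacity limit a tenant used, averaged over time."""
    if limit <= 0:
        raise ValueError("capacity limit must be positive")
    used = [min(u, limit) for u in usage]   # usage cannot exceed the delivered limit
    return sum(used) / (limit * len(used))


def infrastructure_utilization(tenants):
    """Unweighted average of tenants' effective utilization.

    `tenants` is a list of (usage_trace, limit) pairs.
    """
    values = [effective_utilization(usage, limit) for usage, limit in tenants]
    return sum(values) / len(values)
```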

SLIDE 17

Multi-Tenant Scenario

[Figure annotations: Increased on-demand price, More degradation]

SLIDE 18

Prototype Evaluation

  • Used Xen hypervisor (CPU scaling capabilities)
  • GD-enabled RUBiS (eBay-like benchmark)
  • Scaled down the traces in two dimensions:
      • time-wise
      • magnitude-wise

[Figure annotation: Renegotiations]

https://github.com/cristiklein/gdinc-experiment
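
One plausible way to scale a trace down in both dimensions is sketched below. The slides do not say how the scaling was done, and this code is not taken from the linked repository: averaging over fixed windows for the time dimension and rescaling to a target peak for the magnitude dimension are assumptions, as are the function and parameter names.

```python
def scale_trace(trace, time_factor, target_peak):
    """Shrink a demand trace time-wise and magnitude-wise.

    Time-wise: every `time_factor` consecutive samples are averaged into one.
    Magnitude-wise: the result is rescaled so its maximum equals `target_peak`.
    """
    if time_factor < 1:
        raise ValueError("time_factor must be at least 1")
    compressed = []
    for i in range(0, len(trace), time_factor):
        window = trace[i:i + time_factor]
        compressed.append(sum(window) / len(window))
    peak = max(compressed)
    if peak == 0:
        return compressed                   # an all-zero trace stays unchanged
    return [x * target_peak / peak for x in compressed]
```

For instance, a week-long trace could be replayed in a short experiment by choosing a large `time_factor`, with `target_peak` set to what a single test machine can serve.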

SLIDE 19

Takeaways

  • Demand uncertainty is a fundamental challenge to increasing cloud utilization
  • One way to deal with it is to incentivize tenants to fluctuate less
  • The graceful degradation resilience methodology can be applied
  • A well-defined pricing model allows tenants to maximize profit using GD
  • Infrastructure providers (IPs) can control utilization without having full knowledge of tenants
SLIDE 20

Incentivizing Self-Capping to Increase Cloud Utilization


Mohammad Shahrad
 mshahrad@princeton.edu
