SLIDE 1

Cloud Index Tracking: Enabling Predictable Costs in Cloud Spot Markets

Supreeth Shastri and David Irwin

University of Massachusetts Amherst

SLIDE 5

Spot Servers are gaining significance in the cloud

Servers that may terminate anytime after an advance warning period

[Figure: cost (cheap to expensive) vs. availability (guaranteed, non-revocable; not guaranteed, non-revocable; not guaranteed, revocable) for reserved, on-demand, and spot servers]

  • Spot instances helped scale our clusters up by 4X during the discovery of the Higgs Boson
  • Researchers built the largest HPC cluster in the cloud with 1.1 million vCPUs on EC2 spot

SLIDE 8

Spot server pricing

While low on average, it is characterized by variability and deliberate revocations

[Figure: spot server price plotted against the bid level]

Predicting Spot Prices is an Active Area of Research

Ability to compare servers, plan IT budgets, and avoid disruptive revocations

  • 2015: Bid [SIGCOMM], SpotOn [SoCC], Cumulon [VLDB]
  • 2016: No-bid [HotCloud], Flint [EuroSys], BOSS [INFOCOM]
  • 2017: Prob-Guarantee [SC], Proteus [EuroSys], Exosphere [SIGMETRICS]
  • 2018: LSTM [HPDC], Tributary [ATC]

SLIDE 9

Predicting Spot Prices is Important

Prior work models individual spot server prices based on their historical spot price data

SLIDE 13

Predicting Spot Prices is Important, and Predicting Them Accurately is Difficult

Prior work models individual spot server prices based on their historical spot price data

Dimensions of the spot market space:

  Hardware config             68
  Time commitments            2
  OS types                    2
  Regions (country, state)    14
  Zones (datacenters)         2-5
  = 7600+ worldwide markets

  • One size fits all model is unlikely
  • Limited correlation with external variables
  • No visibility into market internals
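The "7600+" figure can be sanity-checked by multiplying the counts listed on the slide. The sketch below assumes, purely for illustration, that every hardware configuration is offered in every zone, and brackets the result using the smallest and largest zone counts (2 and 5). The variable names are hypothetical.

```python
# Counts as listed on the slide; zones per region range from 2 to 5.
hardware_configs = 68
time_commitments = 2
os_types = 2
regions = 14
min_zones_per_region = 2
max_zones_per_region = 5

# Assumption: every configuration is available in every zone of every region.
markets_per_zone = hardware_configs * time_commitments * os_types * regions

lower_bound = markets_per_zone * min_zones_per_region
upper_bound = markets_per_zone * max_zones_per_region

print(f"between {lower_bound} and {upper_bound} markets")
```

Under this assumption the lower bound alone already exceeds 7600, which is consistent with the slide's "7600+".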

SLIDE 17

Key Insight: A Market-based Index for CLOUD

Rather than focusing exclusively on predicting individual servers, cloud users should make decisions based on broader market indices

[Figure: two images contrasted ("vs."). Image credit: www.cnbc.com/mad-money/]

SLIDE 18

Cloud Index

  • intuition for our hypothesis
  • index construction methodology
  • validation on Amazon EC2

Index-tracking

  • techniques for predictability
  • design of index-tracking by server hopping
  • performance evaluation

SLIDE 24

Underlying Characteristics of Large Cloud Platforms

  • 1. Dependence of VMs

Spot markets originating from the same physical machine family are not free from mutual interference

Not all spot markets could be individually modeled and predicted

  • 2. Stability of Idle Capacity

Aggregate idle VM capacity in public cloud datacenters tends to be stable [SoCC 2014, SOSP 2017]

If idle capacity were priced like a commodity, its clearing price would be stable and predictable

We hypothesize that observing spot markets at aggregate levels (say, server family or datacenter levels) should lead to stable prices
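One way to make the hypothesis concrete is to compare the relative variability of individual price series with that of their aggregate. The sketch below uses two small, made-up price series (hypothetical numbers, not EC2 data) that move against each other, as interfering markets might, and measures each with the coefficient of variation. The function name and the choice of metric are assumptions for illustration only.

```python
from statistics import mean, pstdev


def coefficient_of_variation(prices):
    """Population standard deviation divided by the mean (unitless)."""
    return pstdev(prices) / mean(prices)


# Hypothetical normalized prices for two interfering markets (not real data).
market_a = [1.0, 3.0, 1.0, 3.0]
market_b = [3.0, 1.5, 2.5, 1.0]

# Aggregate view: the average of the two markets at each time step.
aggregate = [(a + b) / 2 for a, b in zip(market_a, market_b)]

for name, series in [("market_a", market_a),
                     ("market_b", market_b),
                     ("aggregate", aggregate)]:
    print(f"{name}: CV = {coefficient_of_variation(series):.3f}")
```

In this toy case the aggregate series varies far less than either market on its own; whether real spot markets behave this way is what the validation on Amazon EC2 examines.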

SLIDE 28

Constructing a Market Index for CLOUD

Characterizing an individual server i

Price = $P_i$, Memory = $M_i$ GB, Compute = $C_i$ ECUs

$$P_i^{norm} = \frac{P_i}{\sqrt{C_i \cdot M_i}}$$

Characterizing a group of servers

Average of normalized prices

$$\text{Index-level} = \frac{1}{N} \sum_{i=1}^{N} P_i^{norm}$$

Cloud index value represents the average price per unit of compute time for the selected group of servers
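The two formulas on this slide can be sketched in a few lines of Python. The function names and the sample servers below are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the prototype.

```python
from math import sqrt
from statistics import mean


def normalized_price(price, compute_ecus, memory_gb):
    """Price divided by the geometric mean of compute (ECUs) and memory (GB)."""
    return price / sqrt(compute_ecus * memory_gb)


def index_level(servers):
    """Average of normalized prices over a group of (price, ECUs, GB) tuples."""
    return mean([normalized_price(p, c, m) for p, c, m in servers])


# Hypothetical servers: (hourly price, compute in ECUs, memory in GB).
servers = [
    (0.10, 4.0, 16.0),   # 0.10 / sqrt(64)
    (0.40, 16.0, 64.0),  # 0.40 / sqrt(1024)
    (0.30, 8.0, 8.0),    # 0.30 / sqrt(64)
]

print(f"index level: {index_level(servers):.5f}")
```

Dividing by the square root of compute times memory puts differently sized servers on a common scale, so that a group average is meaningful.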

SLIDE 32

Price prediction is more accurate and stable at the datacenter and server family levels than at the individual server level

[Figure: three panels, each marking the bid level: Individual Server Level, Datacenter Level (US-West-1a), Server Family Level (US-West-1a)]

SLIDE 33

Cloud Index

  • intuition for our hypothesis
  • index construction methodology
  • validation on Amazon EC2

Index-tracking

  • techniques for predictability
  • design of index-tracking by server hopping
  • performance evaluation

SLIDE 36

Design elements

Index-tracking in financial markets

Investments that match the returns of an index. Construct a portfolio such that its constituent items are the same as those present in the index.

E.g., S&P 500, Vanguard ETFs

Server hopping in cloud markets

A container that automatically hops spot VMs as market conditions change [SoCC 2017]. Increases cost-efficiency and lowers revocations.

SLIDE 41

Index Tracking by Server Hopping

Achieving index-level cost-efficiency despite market volatility

  • 1. Determine a broad set of candidate markets, and then compute its market index
  • 2. Host the application on a server that meets the index-level cost-efficiency
  • 3. If market conditions violate the index invariant, then transparently hop to a better server

Server Choice

Select a server that shows the best balance between risk (price volatility) and reward (cost-efficiency)

$$\text{Sharpe ratio} = \frac{\mathbb{J} - \acute{P}_i}{\text{std-dev}(\mathbb{J} - \acute{P}_i)}$$

where $\mathbb{J}$ = Index-level, and $\acute{P}_i$ = spot server's normalized efficiency
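The server-choice rule and the hop trigger could be sketched roughly as follows. This is an illustration under stated assumptions, not the prototype's code: the numerator is taken as the mean gap between the index level and the server's normalized price over an observation window, the "index invariant" is assumed to mean that the server's normalized price stays at or below the index level, and all names and price series are hypothetical.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(index_levels, server_prices):
    """Mean of (index - server price) divided by its sample std-dev.

    Assumption: both series cover the same window, one value per time step.
    """
    gaps = [j - p for j, p in zip(index_levels, server_prices)]
    avg, spread = mean(gaps), stdev(gaps)
    if spread == 0:
        # A perfectly constant gap has no volatility; rank it by sign only.
        return math.copysign(math.inf, avg) if avg else 0.0
    return avg / spread


def pick_server(index_levels, candidates):
    """Return the candidate name with the highest Sharpe ratio."""
    return max(candidates,
               key=lambda name: sharpe_ratio(index_levels, candidates[name]))


def violates_invariant(index_level, server_price):
    """Assumed invariant: a hosted server should cost no more than the index."""
    return server_price > index_level


# Hypothetical normalized prices (arbitrary units) over four time steps.
index_levels = [20.0, 21.0, 19.0, 20.0]
candidates = {
    "steady": [15.0, 17.0, 14.0, 16.0],
    "cheap_but_volatile": [8.0, 22.0, 5.0, 21.0],
    "pricey": [21.0, 22.0, 20.0, 22.0],
}

current = pick_server(index_levels, candidates)
print("host on:", current)

# Step 3: hop when the latest observation breaks the assumed invariant.
if violates_invariant(index_levels[-1], candidates[current][-1]):
    print("invariant violated, hop to a better server")
```

With these made-up series, the volatile market has the larger average discount but loses to the steady one once the gap is divided by its standard deviation, which is the risk-versus-reward balance the slide describes.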

SLIDE 44

Evaluation

LXC based prototype for EC2 spot markets: https://umass-sustainablecomputinglab.github.io/cloudIndex/

  • Does index-tracking achieve predictable expenses?
  • How does cost-availability of index-tracking compare to others?

We compare three systems for running two classes of applications on EC2 spot markets

  • Spot server with index-tracking
  • Spot server with cost-based hopping (HotSpot)
  • Spot server with static prediction (SpotFleet)

SLIDE 50

Long-running Single-node App

E.g., IoT sinks, crypto miners, p2p file trackers

Bulk-synchronous Parallel Jobs

MapReduce type workload from Google traces

[Figure: four chart panels, each comparing SpotFleet, HotSpot, and Index-tracking]

Index-Tracking not only meets the predicted cost-efficiency but also achieves the best cost-availability tradeoff compared to other approaches.

SLIDE 54

Conclusion

Spot server markets enable inexpensive computing at scale but expose users to cost uncertainty

Cost Uncertainty

  • Affects app performance and user's budget planning
  • Prior work focuses on history-based prediction

Cloud Index Tracking

  • Propose market-based indices for EC2 spot servers
  • Design technique for index tracking by server hopping

Evaluations

  • Index-level cost-efficiency vs. other approaches
  • Achieves predictable costs with higher availability across applications