

Slide 1

Department of Informatics – s.e.a.l.

software evolution & architecture lab

2015-12-09

Bursting with Possibilities

An Empirical Study of Credit-Based Bursting Cloud Instance Types

  • Dr. Philipp Leitner, Joel Scheuner

leitner@ifi.uzh.ch, joel.scheuner@uzh.ch

Slide 2

A new Type of Cloud Instances

Icons from the Noun Project: Rabbit by Hayden Kerrisk, Stopwatch by Nørgaard Andersen, Snail by Jems Mayor

Credit-Based Bursting Instances

➔ Behave fundamentally differently from any other existing instance type

Slide 3

Context

  • Infrastructure-as-a-Service (IaaS)
  • Virtual Machines (VMs) on a pay-per-use basis
  • Different performance characteristics


Icons from the Noun Project: CPU by iconsmind.com, ram by Bryn Bodayle, cloud-storage by Matthew Hawdon

CPU Memory I/O

Slide 4

Credit-Based CPU Bursting

Peak Baseline

Icons from the Noun Project: Rabbit by Hayden Kerrisk, Snail by Jems Mayor

Slide 5

Bursting Instance Types in Industry

“The burstable model has proven to be extremely popular with our customers.”

AWS Official Blog Oct 2015

Announced a new instance type in the burstable T2 family. Google Compute Engine offers similar capabilities: “f1-micro machine types offer bursting capabilities that allow instances to use additional physical CPU for short periods of time.”

Slide 6

Related Work

Cloud Benchmarking

  • S. Ostermann, A. Iosup, N. Yigitbasi, R. Prodan, T. Fahringer, and D. Epema, “A Performance Analysis of EC2 Cloud Computing Services for Scientific Computing,” in Cloud Computing, ser. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, vol. 34. Springer, 2010, pp. 115–131.
  • K. R. Jackson, L. Ramakrishnan, K. Muriki, S. Canon, S. Cholia, J. Shalf, H. J. Wasserman, and N. J. Wright, “Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud,” in Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CLOUDCOM ’10), 2010, pp. 159–168.
  • A. Iosup, S. Ostermann, N. Yigitbasi, R. Prodan, T. Fahringer, and D. Epema, “Performance Analysis of Cloud Computing Services for Many-Tasks Scientific Computing,” IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 6, pp. 931–945, Jun. 2011.

Burstable Instances

  • J. Wen, L. Lu, G. Casale, and E. Smirni, “Less can be More: micro-Managing VMs in Amazon EC2,” in Proceedings of the 2015 IEEE International Conference on Cloud Computing (CLOUD ’15), 2015.

Slide 7

[Figure: CPU Credit Balance and Execution Time (s), annotated “10x”, over the experiment duration 18:10–20:30]

Credit-Based CPU Bursting – Explained (1)

1 CPU Credit ≙ full CPU core performance for 1 minute
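The credit mechanics described above can be sketched as a minute-by-minute simulation. This is a simplified illustrative model, not AWS's actual accounting algorithm; the parameters in the example (30 startup credits, 6 credits earned per hour, a 144-credit limit) roughly match a t2.micro but are assumptions here.

```python
def simulate_credit_balance(initial_credits, credits_per_hour, credit_limit, minutes):
    """Simplified minute-by-minute CPU-credit model for a fully busy instance.

    1 CPU credit = one full core for one minute (per the slides). While at
    least one credit is available the instance bursts (burns 1 credit per
    minute); otherwise it is throttled to its baseline share, which the
    replenishment rate just covers.
    """
    balance = initial_credits
    history = []
    for _ in range(minutes):
        # credits accrue continuously, capped at the credit limit
        balance = min(balance + credits_per_hour / 60.0, credit_limit)
        if balance >= 1.0:
            balance -= 1.0  # burst: one full core-minute consumed
        history.append(balance)
    return history

# Assumed t2.micro-like parameters: 30 startup credits, 6 credits/hour, cap 144.
trace = simulate_credit_balance(30, 6, 144, 60)
# Net drain while bursting is 1 - 6/60 = 0.9 credits/minute, so the startup
# credits last roughly 30 / 0.9 ≈ 33 minutes of full-core load, after which
# the instance oscillates around zero credits (i.e., baseline performance).
```

Once the balance hovers near zero, the model bursts only about 10% of the time, which is exactly the baseline share that keeps the credit account constant.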

Slide 8

[Figure: CPU Time (%) split into user, steal, and idle shares, and Execution Time (s), annotated “10x”, over the experiment duration 18:10–20:30]

Credit-Based CPU Bursting – Explained (2)


Slide 9

Research Questions

Icons from the Noun Project: CPU by iconsmind.com, cloud-storage by Matthew Hawdon, Dashboard by Björn Andersson, Coins by hunotika, History by Joe Mortell

  • 1. How do t2 bursting instance types perform in terms of CPU and IO speed in comparison to other instance types?
  • 2. When are t2 bursting instance types more cost-efficient than other instance types?
  • 3. How do t2 instance types perform in comparison to the previous-generation (t1) types?

Slide 10

Empirical Study Setup

[1] Scheuner, Leitner, Cito, Gall: Cloud WorkBench - Infrastructure-as-Code Based Cloud Benchmarking. CloudCom‘14

Region Ireland (eu-west-1)

Icons from the Noun Project: Month by Rohit Arun Rao, Shapes by Chananan, Tool Presets by Fabiano Coelho, Repeat by Dimitry Sunseifer, Gears by Rigo Peter

  • All T2 bursting instance types in May 2015 (t2.micro, t2.small, t2.medium)
  • Sysbench measures CPU and I/O performance
  • 50 data points for each configuration (~1,000 in total)
  • Automated execution with Cloud WorkBench (CWB) [1]
  • May 1–15, 2015
  • Benchmark definitions and data publicly available: https://github.com/sealuzh/bursting-cloud-instances

Slide 11

Results – T2 vs. Other Instance Types

[Figure: Medium-Instance Equivalents (1–4) for t2.micro (Peak/Base), m3.medium, m3.large, c4.large, and t1.micro (Peak)]

Slide 12

Results – T2 Bursting Instances

[Figure: Medium-Instance Equivalents (1–4) for t2.micro, t2.small, and t2.medium at Peak and Base performance]

Slide 13

Results – Performance-Cost Ratio (1)

[Figure: Performance/Cost Ratio (pcr) per instance type; values range from 13 to 147]

pcr ≙ medium-instance equivalents per USD and hour

Icons from the Noun Project: 24 hour by iconsmind.com, Ruler by Arthur Shlain, dollar by Simple Icons

Slide 14

Results – Performance-Cost Ratio (2)

[Figures: Performance/Cost Ratio (pcr) per instance type, and the corresponding Full-Utilization Equivalent pcr (values 13–32)]

Slide 15

Usage Scenarios – Low or Irregular Load

  • Identify the cutoff point for each T2 instance type
  • Where does higher average utilization (u) make them less cost-efficient?
  • Assumptions: the service is CPU-bound and always requires peak performance

[Figure: Utilization-Normalised PCR over Utilization (%) for c4.large, m3.large, m3.medium, and t2.micro/t2.small/t2.medium at Peak]

Slide 16

Usage Scenarios – Boosting Performance-Cost Ratio

Idea: Exploit the initial CPU credit balance granted on VM startup
Implementation: Systematically restart VM instances when they run out of CPU credits
Effect: Improved (utilization-normalized) performance-cost ratio, up to 4x
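The restart strategy can be sketched as a simple control loop. `get_credit_balance` and `restart_instance` are hypothetical caller-supplied hooks standing in for the provider's monitoring and management APIs (on AWS, the credit balance is exposed via the CloudWatch CPUCreditBalance metric); the threshold and polling interval are assumptions for illustration.

```python
import time

def run_restart_policy(get_credit_balance, restart_instance,
                       threshold=5.0, poll_seconds=60, max_polls=10):
    """Restart the instance whenever its CPU credit balance drops below
    `threshold`, so each fresh VM starts with its initial credit grant.

    `get_credit_balance` and `restart_instance` are hypothetical hooks
    wrapping provider monitoring/management APIs. Returns the number of
    restarts triggered.
    """
    restarts = 0
    for _ in range(max_polls):
        if get_credit_balance() < threshold:
            restart_instance()  # fresh VM -> fresh startup credits
            restarts += 1
        time.sleep(poll_seconds)
    return restarts
```

Note that whether this pays off depends on the startup credit grant versus the restart overhead; the slides report up to a 4x improvement in utilization-normalized pcr.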

Slide 17

Conclusions

[Figures: Medium-Instance Equivalents for t2.micro (Peak/Base) vs. t1.micro (Peak), and Utilization-Normalised PCR over Utilization (%) for c4.large, m3.large, m3.medium, and the T2 types at Peak]

This research has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 610802 (CloudWave).

T2 instance types perform highly predictably, unlike the previous T1 generation of bursting instances. T2 instance types provide a superior performance-cost ratio below 40% average utilization.

Icons from the Noun Project: Dice by chris dawson, Coins by hunotika

Slide 18

APPENDIX


Slide 19

Future Work

  • Limited to micro-benchmarks
→ Validate results using application benchmarks / actual applications

  • Limited to CPU credit bursting
→ Analyze the same bursting model for IOPS¹

1 http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html#IOcredit

Slide 20

Usage Scenarios – Non-Critical IO

  • Trend towards more homogeneous IO performance
  • No substantial IO performance degradation even at baseline performance
→ Use cost-efficient CPU instances for IO-bound applications

[Figure: Disk Read/Write Speed [MBit/s] (10–40) for t2.micro, t2.small, and t2.medium at Peak and Base performance]

Slide 21

Generations – T1 (previous) vs T2 (current)

[Figure: sysbench Benchmark Value (s) over Experiment Duration (min) for t2.micro vs. t1.micro]

Figure 5: Comparison of performance development of stressed t1 and t2 instances.

Slide 22

Formal Model – Concise

  • Performance-cost ratio pcr(t) (at a given time t)
  • Unit: medium-instance equivalents per USD and hour
  • m̄: arithmetic mean of all 50 benchmark observations [seconds]
  • c(t): hourly costs per started billing time unit [USD]
  • Utilization-normalized performance-cost ratio unpcr(t, u)
  • Intuitively: the costs of operating a cluster of bursting instances, so that one instance can always be operated at peak performance under the assumed utilization level (e.g., need 10x t2.micro for u = 100)
  • Utilization level u
  • Standard instance utilization ū (i.e., the utilization rate that keeps the CPU credit balance constant)


pcr(t) = m̄ / c(t)

with mean benchmark performance m̄ (expressed in medium-instance equivalents) and hourly costs c(t) ∈ R+, giving medium-instance equivalents per US dollar and hour.

unpcr(t, u) = pcr(t) / ⌈u / ū⌉

which remains stable indefinitely, for a standard instance utilization ū and an assumed utilization level u ∈ [0; 100].
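Under the definitions above, both metrics are straightforward to compute. The example numbers in the sketch (m̄ = 2.06 medium-instance equivalents at 0.014 $/h for a t2.micro at peak, and ū = 10, since 10 t2.micro instances are needed for u = 100) are taken from the slides' results.

```python
from math import ceil

def pcr(m_bar, cost_per_hour):
    """Performance-cost ratio: medium-instance equivalents per USD and hour."""
    return m_bar / cost_per_hour

def unpcr(m_bar, cost_per_hour, u, u_bar):
    """Utilization-normalized pcr: divide by the number of instances needed
    so that one can always run at peak under utilization level u."""
    return pcr(m_bar, cost_per_hour) / ceil(u / u_bar)

# t2.micro at peak, values from the slides: m̄ = 2.06, c = 0.014 $/h, ū = 10
print(round(pcr(2.06, 0.014)))             # 147, matching slide 13
print(round(unpcr(2.06, 0.014, 100, 10)))  # 15, the full-utilization pcr
```

The ceiling term captures the cluster-sizing effect: at u = 100 a t2.micro's cost advantage shrinks by a factor of 10, which is why its full-utilization equivalent pcr drops to roughly that of m3.medium.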

Slide 23

Formal Model – Basic Definitions (1)

  • Bursting instances: i ∈ I, the set of cloud instances
  • Credit-based bursting instance types: it ∈ T
  • Peak performance level: sp(t) ∈ R+ (a lower number represents better performance)
  • Baseline performance level: sb(t) ∈ R+
  • CPU credits available: ic ∈ R+
  • Replenishment rate per hour (when the CPU is idle): tr ∈ N+
  • Depletion rate per hour (when the CPU is non-idle): td ∈ N+, a constant
  • Startup credits (initial credits on VM startup): ts ∈ N+
  • Credit limit (maximum amount of credits): tm ∈ N+

Slide 24

Formal Model – Basic Definitions (2)

  • Standard instance utilization ū (i.e., the utilization rate that keeps the instance credit account constant)
  • Utilization level u ∈ [0; 100] (i.e., the percentage of time a user wants to operate a bursting instance at peak performance level)
  • Hourly costs c(t) ∈ R+ (US dollars per started billing time unit)
  • m̄: arithmetic mean of all 50 benchmark observations
  • mσ: relative standard deviation in percent
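The m̄ and mσ statistics can be computed directly from the raw observations of a configuration. The sketch below uses synthetic example data, not the study's actual measurements.

```python
from statistics import mean, stdev

def summarize(observations):
    """Return (m̄, mσ): the arithmetic mean and the relative standard
    deviation in percent of a list of benchmark observations."""
    m_bar = mean(observations)
    m_sigma = stdev(observations) / m_bar * 100.0
    return m_bar, m_sigma

# Synthetic example: 50 execution-time observations around 2.0 s
obs = [2.0 + 0.01 * (i % 5) for i in range(50)]
m_bar, m_sigma = summarize(obs)  # m̄ ≈ 2.02 s, mσ well below 1%
```

Reporting the deviation relative to the mean is what makes the predictability claim comparable across instance types with very different absolute execution times.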

Slide 25

Formal Model – Performance-Cost Metrics

  • Performance-cost ratio: pcr(t) = m̄ / c(t)
    Unit: medium-instance equivalents per US dollar and hour
  • Utilization-normalized performance-cost ratio: unpcr(t, u) = pcr(t) / ⌈u / ū⌉, which remains stable indefinitely
    Intuitively: the costs of operating a cluster of bursting instances, so that one instance can always be operated at peak performance under the assumed utilization level u
    ⌈u / ū⌉ is the number of instances required to operate at peak performance, given u (e.g., 10x t2.micro for u = 100)

Slide 26

Contributions

  • 1. Basic formal model for credit-based bursting behavior
  • 2. Empirical study of performance behavior
  • 3. Comparison with current/previous generation instances (performance/cost)
  • 4. Potential use cases for practitioners

Benchmark summary (CPU) per instance type and performance level:

              t2.micro          t2.small          t2.medium
              (0.014 $/h)       (0.028 $/h)       (0.056 $/h)
              sp       sb       sp       sb       sp       sb
m̄ (CPU)      2.06     0.21     1.98     0.41     3.99     0.87
mσ (CPU)      3%       8%       4%       6%       5%       6%