Empirical Evaluation of Workload Forecasting Techniques for Predictive Cloud Resource Scaling – PowerPoint PPT Presentation

SLIDE 1

Empirical Evaluation of Workload Forecasting Techniques for Predictive Cloud Resource Scaling

In Kee Kim, Wei Wang, Yanjun (Jane) Qi, and Marty Humphrey Computer Science @ University of Virginia

SLIDE 2

Motivation – Cloud Resource Scaling Approach

Reactive Auto Scaling

[AWS, Google, Azure, etc.] Autoscaling based on resource utilization: CPU, memory, network I/O, …

SLIDE 3–5

Motivation – Cloud Resource Scaling Approach

Reactive Auto Scaling

[AWS, Google, Azure, etc.] Autoscaling based on resource utilization: CPU, memory, network I/O, …

[Figure: resource demand and number of instances over time; reactive scaling lags behind demand, causing scaling delays]

SLIDE 6–7

Motivation – Cloud Resource Scaling Approach

Reactive Auto Scaling

[AWS, Google, Azure, etc.] Autoscaling based on resource utilization: CPU, memory, network I/O, …

Predictive Resource Scaling

Resource scaling based on forecasting:

  • 1. Future Resource Usage
  • 2. Workload Arrival Pattern

[Figure: resource demand and number of instances over time, with scaling delays annotated]

SLIDE 8

Predictive Resource Management Engine

Predictive Resource Scaling

[Diagram: Workload → Workload Predictor → Resource Scaling → Cloud Infrastructure]

  • 1. Workload Predictor
  • Detects the cloud workload pattern.
  • Predicts the job arrival pattern in the near future.
  • 2. Resource Scaling
  • Allocates/deallocates cloud resources based on the prediction.

SLIDE 9

Predictive Resource Management Engine

Predictive Resource Scaling

[Diagram: Workload → Workload Predictor (Regression? Machine Learning? Time Series?) → Resource Scaling → Cloud Infrastructure]

SLIDE 10

Research Questions

  • Question #1: Which workload predictor has the highest accuracy for job arrival time prediction?
  • Question #2: Which existing workload predictor has the best cost efficiency and performance benefits?
  • Question #3: Which style of predictive scaling achieves the best cost efficiency and performance benefits?

SLIDE 11

Research Big Picture

  • 24 WL patterns × 21 predictors × 4 policies × 2 configs ≈ 4K cases.
  • Evaluating 4K cases via actual deployment on IaaS clouds is very challenging.
  • Use PICS (Public IaaS Cloud Simulator) – KWH – CLOUD’15.

[Diagram: Realistic Workloads (24 WL patterns) → Collection of WL Predictors (Naive, Regression, Time Series, Non-temporal; 21 predictors) → Resource Manager (Resource Scaling, Job Scheduling, VM Control; 4 policies) → Public Clouds (2 configs)]

SLIDE 12

Experiment Design

  • Collection of Workload Predictors.
  • Simulation Workloads.
  • Design of Resource Management System.
  • Implementation and Performance Tuning.
SLIDE 13

Collection of (Existing) Workload Predictors

  • We collect all 21 workload predictors:
  • 1) Naïve Models: Mean-based, Recent-mean.
  • 2) Regression Models: Global Model (Linear, Quad, Cubic), Local Model (Linear, Quad, Cubic).
  • 3) Time Series Models: Smoothing (WMA, EMA, DES), Box-Jenkins (AR, ARMA, ARIMA).
  • 4) Non-Temporal (ML) Models: kNN, SVMs (Linear, Gaussian), Decision Tree, Ensemble (RF, GBM, Exts).

SLIDE 14

Simulation Workload Patterns

  • We generate 24 workload patterns based on four base shapes:
  • On and Off (Batch/Scientific)
  • Growing (Emerging Service)
  • Random/Unpredictable (Media)
  • Cyclic Bursting (E-Commerce)

[Figure: compute activity over time t for each pattern, separated by inactivity periods]

SLIDE 15

Design of Resource Management System

[Diagram: Workload → Job Portal → jobs (duration, deadline) → Cloud Resource Management System. The Predictive Scaling Module (Workload Repository, Predictor for Scaling-Out, Predictor for Scaling-In, Predictive Scaler) receives job arrival info and samples for prediction, and sends predictive scaling decisions to the Resource Management Module (job queue, job scheduling, VM scaling and management), which adds/removes VMs and assigns jobs on the cloud infrastructure (e.g., AWS, Azure)]

SLIDE 16

Implementations and Performance Tuning

  • Workload Predictor Implementation:
  • All predictors are written in Python.
  • numpy and pandas.
  • statsmodels for time-series model implementation.
  • scikit-learn machine learning lib for non-temporal models.
  • Predictor Performance Tuning:
  • (Training) Sample Size Decision:
  • a tradeoff between prediction performance and overhead.
  • Most predictors use the 50 -- 100 most recent job arrival samples.
  • Parameter Selection:
  • a grid search over model parameters, scored by prediction accuracy.
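The tuning steps above can be sketched as follows; the windowing helper, sample data, and parameter grid are illustrative assumptions, not the paper's actual settings:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

def make_windows(series, window=5):
    """Turn a 1-D series of job inter-arrival times into
    (last `window` values -> next value) training pairs."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return X, y

# Made-up "recent job arrival samples" (the deck uses 50 -- 100 of them).
samples = np.abs(np.random.default_rng(0).normal(10.0, 2.0, size=80))
X, y = make_windows(samples)

# Grid search over SVR parameters, scored by prediction accuracy
# (negative MAPE, so larger is better).
grid = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid={"C": [1, 10, 100], "gamma": [0.01, 0.1]},
    scoring="neg_mean_absolute_percentage_error",
    cv=3,
)
grid.fit(X, y)
next_inter_arrival = grid.best_estimator_.predict(X[-1:])[0]
```

Growing the window or the grid improves accuracy at the cost of prediction overhead, which is the tradeoff the slide names.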

SLIDE 17

Performance Evaluation

  • Experiment #1 – Statistical Predictor Performance.
  • Experiment #2 – Predictive Scaling Performance.
SLIDE 18–20

Experiment #1 – (Statistical) Predictor Performance

  • Purpose: Measuring Statistical Predictor Accuracy and Overhead.
  • Accuracy: MAPE – Mean Absolute Percentage Error.
  • Overhead: Sum of All Prediction Time.
  • Overall Results:

[Figures: overall prediction accuracy and overall prediction overhead (10K jobs)]

  • Accuracy: average MAPE 0.6360; SVMs 0.37 -- 0.40 (42% less than average).
  • Overhead: kNN 0.5 s for 10K jobs; ARMA 6,032 s.
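The MAPE accuracy metric used here can be computed as follows (a minimal sketch; the example inputs are hypothetical):

```python
import numpy as np

def mape(actual, predicted):
    """Mean Absolute Percentage Error; lower is better."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((actual - predicted) / actual)))
```

For example, `mape([10, 20], [11, 18])` gives 0.10, i.e. 10% average error; on this scale the reported average of 0.6360 means 63.6% error.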

SLIDE 21

Experiment #1 – (Statistical) Predictor Performance

  • Accuracy of Workload Predictor per Pattern:

Workload  Rank  Predictor    MAPE
Growing   1     Lin. SVM     0.28
          2     AR           0.29
          3     ARMA         0.30
          Avg.               0.51
On/Off    1     Gau. SVM     0.22
          2     ARMA         0.30
          3     Lin. SVM     0.44
          Avg.               0.69
Bursty    1     ARIMA        0.38
          2     Brown’s DES  0.41
          3     Lin. SVM     0.43
          Avg.               0.75
Random    1     Gau. SVM     0.45
          2     Lin. Reg.    0.46
          3     Lin. SVM     0.46
          Avg.               0.52

SLIDE 22

Experiment #2 – Predictive Scaling Performance

  • Purpose: How much benefit the resource manager can achieve by applying:
  • 1. a “Good Predictor”, and
  • 2. different styles of predictive scaling.
  • List of Predictors: 8 best predictors from Experiment #1:
  • Linear Regression, WMA, BRDES, AR, ARMA, ARIMA, Linear SVM, Gaussian SVM.
  • Four Different Styles of Resource Scaling:
  • RR (Reactive Scaling-Out + Reactive Scaling-In) -- Baseline
  • PR (Predictive Scaling-Out + Reactive Scaling-In)
  • RP (Reactive Scaling-Out + Predictive Scaling-In)
  • PP (Predictive Scaling-Out + Predictive Scaling-In)
  • Cloud Configurations: Two Pricing Models -- Hourly and Minutely.
  • Metrics: Cloud Cost and Job Deadline Miss Rate.
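A hypothetical sketch of how the four styles differ -- each letter selects a reactive or predictive trigger for scale-out and scale-in; the thresholds and inputs are illustrative assumptions, not the paper's policy:

```python
def scale_decision(style, cpu_util, forecast_jobs, capacity_jobs, idle_vms):
    """One control step for style in {"RR", "PR", "RP", "PP"}.
    Returns +1 (add a VM), -1 (remove a VM), or 0 (do nothing)."""
    out_predictive = style[0] == "P"  # scale-out trigger
    in_predictive = style[1] == "P"   # scale-in trigger

    # Scale-out: predictive compares forecast demand against capacity;
    # reactive fires only once current utilization is already high.
    if (forecast_jobs > capacity_jobs) if out_predictive else (cpu_util > 0.8):
        return +1

    # Scale-in: predictive releases idle VMs when forecast demand is low;
    # reactive waits for low measured utilization.
    if idle_vms > 0 and ((forecast_jobs < capacity_jobs) if in_predictive
                         else (cpu_util < 0.2)):
        return -1
    return 0
```

For instance, `scale_decision("PP", 0.5, 10, 5, 0)` scales out before utilization spikes, whereas `"RR"` with the same inputs does nothing -- the scaling-delay gap the motivation slides illustrate.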

SLIDE 23–25

Experiment #2 – Predictive Scaling Performance

  • Overall Results:

[Figures: cost and deadline (DL) miss rate of PR, RP, and PP relative to the RR baseline, under (a) the hourly pricing model and (b) the minutely pricing model]

  • Hourly pricing: cost improves by 37% -- 58%; DL miss rate improves by 67% -- 87%.
  • Minutely pricing: cost shows no improvement; DL miss rate improves by 60% -- 72%.

SLIDE 26–29

Experiment #2 – Predictive Scaling Performance

  • PP (Predictive Scaling-Out + Predictive Scaling-In) Details -- Deadline Miss Rate:

[Figures: per-pattern DL miss rate, best predictor vs. average. Hourly pricing model: Lin-SVM (Growing), Gau-SVM (On/Off), BRDES (Bursty), Lin-SVM (Random). Minutely pricing model: AR (Growing), ARMA (On/Off), BRDES (Bursty), Gau-SVM (Random)]

Workloads  Top 1         Top 2              Top 3
Growing    Linear SVM    AR                 ARMA
On/Off     Gaussian SVM  ARMA               Linear SVM
Bursty     ARIMA         Brown’s DES        Linear SVM
Random     Gaussian SVM  Linear Regression  Linear SVM

SLIDE 30

Summary -- Revisit 3 Research Questions

  • Q1: Which WL predictor has the highest accuracy?
  • No one predictor fits all workload patterns.
  • Q2: Which WL predictor provides the best performance benefits?
  • Similar to Q1 -- no universally best workload predictor exists.
  • Depends on workload patterns and cloud configurations (e.g., billing model).
  • In general, the best workload predictor (by cloud metrics) is one of the top three most (statistically) accurate predictors.
  • Q3: Which style of predictive scaling provides the best performance benefits?
  • PP (Predictive Scaling-Out/In) is the best style of predictive scaling.
  • “Predictive Scaling-Out” can improve cloud metrics.

SLIDE 31

Questions?

Thank you!

ik2sb@virginia.edu