Spock: Exploiting Serverless Functions for SLO and Cost Aware Resource Procurement in Public Cloud



slide-1
SLIDE 1

Hardware architectures for Modern Scientific Computing

Spock: Exploiting Serverless Functions for SLO and Cost Aware Resource Procurement in Public Cloud

Jashwant Gunasekaran, Prashanth Thinakaran, Mahmut Kandemir, Bhuvan Urgaonkar, George Kesidis, Chita Das

Computer Science and Engineering, The Pennsylvania State University

slide-2
SLIDE 2

The Last Supper of cloud clients

But I turned off my VM instances

Probably our private key is compromised

But.. I have the money to pay for it this time

slide-3
SLIDE 3

That Last AWS bill


slide-4
SLIDE 4


Spock: Cost Aware Resource Procurement in Public Clouds using Serverless

slide-5
SLIDE 5

Whose problem are we solving?


slide-6
SLIDE 6

Outline

  • Elastic Web Services
  • VM-based Resource Procurement
  • Serverless Functions
  • Cost of VMs vs Cloud Functions
  • Spock Hybrid Elastic Scaling
  • Implementation and Evaluation
  • Results


slide-7
SLIDE 7

Elastic Web Services

  • Short lived queries
  • Strict SLO
  • Varying resource demands
  • Stateless
  • Resources Required
  • acquired/released on demand
  • Average to Peak ratio is high

Typical example? ML based web services

slide-8
SLIDE 8

ML Inference Engine


slide-9
SLIDE 9

Outline

  • Elastic Web Services
  • VM-based Resource Procurement
  • Serverless Functions
  • Spock Hybrid Elastic Scaling
  • Implementation and Evaluation
  • Results


slide-10
SLIDE 10

VM-based Procurement


EC2 instances

slide-11
SLIDE 11

VM-based Procurement

  • Initial pool of active VMs
  • Procure more VMs on demand
  • Autoscaling during request surge

Figure: Resource Demand vs. Time (Sec). Legend: Arrival rate, VM, SLA. Annotations: SLA Violations, Scale Up, Scale Down.

slide-12
SLIDE 12

Disadvantages

  • Very long VM startup times (5s-50s)
  • Over-provisioning to meet strict SLOs
  • Under-provisioned during sudden surge
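
A back-of-the-envelope sketch of why slow VM startup hurts during a surge: while a new VM boots, any load above current capacity has nowhere to go. The function name and the example numbers are hypothetical, and the sketch assumes the excess load stays constant for the whole boot window.

```python
def unserved_during_startup(surge_rps, capacity_rps, startup_s):
    """Requests arriving above VM capacity while a new VM is still booting.

    Assumes a constant surge rate for the full startup window.
    """
    excess_rps = max(0.0, surge_rps - capacity_rps)
    return excess_rps * startup_s


# Hypothetical numbers: 150 req/s surge, 100 req/s capacity, and a 50 s boot
# (the upper end of the 5s-50s startup range on the slide).
backlog = unserved_during_startup(150, 100, 50)
```

With these illustrative inputs the backlog is the 50 req/s excess multiplied by the 50 s boot time; every one of those requests either waits or risks an SLO violation.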

Possible alternative/s?

slide-13
SLIDE 13

Outline

  • Elastic Web Services
  • VM-based Resource Procurement
  • Serverless Functions
  • Spock Hybrid Elastic Scaling
  • Implementation and Evaluation
  • Results


slide-14
SLIDE 14

Serverless Functions


slide-15
SLIDE 15

Serverless Functions

  • Pay per second
  • Cost efficient
  • Scale instantaneously
  • Intermittent SLA violations

Figure: Resource Demand vs. Time (Sec). Legend: Arrival rate, Lambda, SLA. Annotations: Scale Up, Scale Down, SLA Violations.

But, is serverless a panacea?

slide-16
SLIDE 16

Constant arrival rate

  • Constant arrival rate
  • Cost compared under iso-performance
  • All requests have similar SLA compliance
  • VMs are 100% utilized

Figure: Cost ($) vs. Requests per sec. Series: VM, Lambda.
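
The constant-rate comparison can be sketched as a simple cost model. This is a minimal illustration, not the authors' methodology: the function names, the per-VM capacity, and every price below are placeholder assumptions, not figures from the slides or from any provider's price list.

```python
import math


def vm_cost(rate_rps, hours, vm_capacity_rps, vm_price_per_hour):
    """Cost of serving a constant request rate with fully utilized VMs."""
    vms = math.ceil(rate_rps / vm_capacity_rps)
    return vms * vm_price_per_hour * hours


def lambda_cost(rate_rps, hours, duration_s, memory_gb,
                price_per_gb_s, price_per_request):
    """Cost of serving the same rate with serverless functions.

    Each invocation is billed for compute (duration x memory) plus a flat
    per-request fee.
    """
    requests = rate_rps * 3600 * hours
    per_request = duration_s * memory_gb * price_per_gb_s + price_per_request
    return requests * per_request


# Placeholder inputs, chosen only to exercise the formulas.
vm_total = vm_cost(rate_rps=100, hours=1, vm_capacity_rps=50,
                   vm_price_per_hour=0.10)
fn_total = lambda_cost(rate_rps=100, hours=1, duration_s=0.1, memory_gb=1.0,
                       price_per_gb_s=0.00002, price_per_request=0.0000002)
```

The VM cost grows in steps with the number of instances, while the function cost grows linearly with the number of requests, which is why the comparison depends on the request rate.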
slide-17
SLIDE 17

Varying arrival rate

  • Trace based arrival rate
  • Each request is an ML inference for caffenet-model
  • Cost compared under iso-performance
  • All requests have similar SLA compliance
  • VMs are provisioned for the peak request rate

Figure: Request Rate vs. Time (s). Markers: Avg-1, Avg-2.

Figure: Normalized Cost. Series: Lambda, Average-1, Average-2.

Cost-effective solution?

slide-18
SLIDE 18

SPOCK

  • Use serverless functions along with VMs
  • Reduce SLO violations during request surge
  • Reduce intermittent over-provisioning of VMs

Figure: Resource Demand vs. Time (Sec). Legend: Arrival rate, VM, Lambda, SLA. Annotations: SLA Violations, Scale Up, Scale Down.

slide-19
SLIDE 19

Key Motivation

  • It is non-trivial to predict the peak request rate at any given time period.
  • Provisioning VMs for the peak demand always leads to a higher cost of deployment, while under-provisioning VMs leads to severe SLO violations for queries.
  • Using serverless functions would overcome the SLO violation problem. However, it is not cost effective.

slide-20
SLIDE 20

Outline

  • Elastic Web Services
  • VM-based Resource Procurement
  • Serverless Functions
  • Spock Hybrid Elastic Scaling
  • Implementation and Evaluation
  • Results


slide-21
SLIDE 21

Spock Scheme

  • Schedule queries on VMs if available
  • If VMs are fully utilized, redirect queries to lambda functions
  • Spawn a new VM in the meantime
  • After spin-up, incoming requests are sent to the new VMs
  • Scale down VMs after three minutes of inactivity
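
The steps of the Spock scheme on this slide can be sketched as a small dispatcher. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class and method names, the per-VM concurrency limit, the single-VM-at-a-time boot policy, and the explicit `now` clock are all hypothetical. Only the three-minute idle timeout comes from the slide.

```python
from dataclasses import dataclass

IDLE_TIMEOUT_S = 180.0  # "three minutes of inactivity" from the slide


@dataclass
class VM:
    ready_at: float         # time at which the VM has finished booting
    capacity: int           # concurrent queries per VM (assumed parameter)
    busy: int = 0           # queries currently running on this VM
    last_used: float = 0.0  # last time a query started or finished here


class SpockDispatcher:
    def __init__(self, initial_vms, capacity, startup_s):
        self.capacity = capacity
        self.startup_s = startup_s
        self.vms = [VM(ready_at=0.0, capacity=capacity)
                    for _ in range(initial_vms)]

    def dispatch(self, now):
        """Return the VM that takes the query, or None if a function serves it."""
        for vm in self.vms:
            if vm.ready_at <= now and vm.busy < vm.capacity:
                vm.busy += 1
                vm.last_used = now
                return vm
        # Every ready VM is full: boot one VM unless one is already booting,
        # and let a serverless function absorb this query in the meantime.
        if not any(vm.ready_at > now for vm in self.vms):
            ready = now + self.startup_s
            self.vms.append(VM(ready_at=ready, capacity=self.capacity,
                               last_used=ready))
        return None

    def complete(self, vm, now):
        """Mark one query on `vm` as finished."""
        vm.busy -= 1
        vm.last_used = now

    def scale_down(self, now):
        """Release VMs idle for IDLE_TIMEOUT_S or longer; return how many."""
        keep = [vm for vm in self.vms
                if vm.busy > 0 or now - vm.last_used < IDLE_TIMEOUT_S]
        released = len(self.vms) - len(keep)
        self.vms = keep
        return released
```

A caller would invoke `dispatch(now)` per incoming query, send the query to a function when it returns `None`, and call `scale_down(now)` periodically. The sketch does not keep a minimum pool of VMs alive; whether Spock does is not stated on the slide.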

slide-22
SLIDE 22

Two Scaling Policies

  • Reactive
    • Spin up new VMs as and when a request surge occurs
    • No prediction of the request rates
  • Predictive
    • Using moving window linear regression, predict the request rate every minute
    • Spin up new VMs based on the prediction
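
The predictive policy can be illustrated with a least-squares line fitted over a sliding window of recent request rates and extrapolated one step ahead. This is a minimal sketch: the window size, the one-sample-per-minute spacing, the per-VM capacity, and the function names are assumptions, since the slide gives only "moving window linear regression" and the one-minute interval.

```python
import math


def predict_next(rates, window=5):
    """Fit y = a + b*t to the last `window` samples and extrapolate one step.

    `rates` holds one observed request rate per minute, oldest first.
    """
    recent = rates[-window:]
    n = len(recent)
    if n < 2:
        # Not enough points for a line: repeat the last value (or 0 if empty).
        return float(recent[-1]) if recent else 0.0
    ts = range(n)
    t_mean = sum(ts) / n
    y_mean = sum(recent) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, recent))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # A request rate cannot be negative, so clamp the extrapolation at zero.
    return max(0.0, intercept + slope * n)


def vms_needed(predicted_rps, vm_capacity_rps):
    """Number of VMs required to serve the predicted rate."""
    return math.ceil(predicted_rps / vm_capacity_rps)
```

The reactive policy needs no such predictor: it simply requests a VM once the existing ones are saturated.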

Let's see an example

slide-23
SLIDE 23

Spock resource procurement

Figure: Request rate per sec vs. Time (hundreds of sec). Regions labeled VM and Lambda. Annotations: Scale out, Scale in.

slide-24
SLIDE 24

Overall Design of Spock

Figure: architecture diagram. Components: User Applications, Load Balancer, Load Monitor, Scaling Policy (Reactive, Predictive), Resource Manager, a pool of lambda functions (λ) and VMs, Model 1 to Model 4. Edge labels: Queries, Query Assigned, Query Complete, Resource Status, Predicted Load, Resource Required, Instance Created.

slide-25
SLIDE 25

Outline

  • Elastic Web Services
  • VM-based Resource Procurement
  • Serverless Functions
  • Spock Hybrid Elastic Scaling
  • Implementation and Evaluation
  • Results


slide-26
SLIDE 26

Evaluation

  • Two traces used to generate ML inference workload

Figure panels: WITS, Berkeley.

slide-27
SLIDE 27

Evaluation

  • MXNet framework
  • AWS resources
  • Pretrained ML models on the ImageNet dataset

slide-28
SLIDE 28

Evaluation

  • Two scaling policies
    • Predictive
    • Reactive
  • Three resource procurement schemes
    • Autoscale
    • X-autoscale
    • Spock

slide-29
SLIDE 29

Outline

  • Elastic Web Services
  • VM-based Resource Procurement
  • Serverless Functions
  • Spock Hybrid Elastic Scaling
  • Implementation and Evaluation
  • Results


slide-30
SLIDE 30

Berkeley Trace Results

Figure (two panels). Axes: SLO violations (%), Normalized Cost. Series: autoscale, X-autoscale, Spock, SLO Violation. Groups: Mix-1, Mix-2.

slide-31
SLIDE 31

WITS Trace Results

Figure (two panels). Axes: SLO violations (%), Normalized Cost. Series: autoscale, X-autoscale, Spock, SLO Violation. Groups: Mix-1, Mix-2.

slide-32
SLIDE 32

Spock Prediction Accuracy


slide-33
SLIDE 33

Spock Resource Procurement

Figure: Request rate per sec vs. Time (hundreds of sec). Series: Request rate. Regions labeled VM. Annotations: Scale out, Scale in.

slide-34
SLIDE 34

Questions?
