Real-time Serverless: Enabling Application Performance Guarantee

SLIDE 1

Real-time Serverless: Enabling Application Performance Guarantee

Hai Duc Nguyen (1), Chaojie Zhang (1), Zhujun Xiao (1), and Andrew A. Chien (1,2)

(1) University of Chicago; (2) Argonne National Lab

SLIDE 2

Serverless Has Limitations

12/09/2019 5TH WORKSHOP ON SERVERLESS COMPUTING (WOSC’19)

https://serverless-benchmark.com

  • Function-as-a-Service (FaaS), aka Serverless, is the fastest-growing element of cloud workload

But

  • Best-effort invocations
  • Long-tail latency

SLIDE 3

Bursty, Real-time Applications

Computation demand surges when events of interest happen

  • A “wanted” person appears
  • A cyber attack

These events require a timely response

[Figure: demand vs. time]

SLIDE 4

Serverless vs. Bursty, Real-time Apps

Serverless invocations are best-effort

❌ No way to guarantee when an invocation will run

[Figure: resource vs. time, showing serverless allocation against a bursty load]

SLIDE 5

Real-time Serverless

Real-time Serverless (RTS) = Serverless + Guaranteed Invocation Rate

[Figure: resource vs. time under Real-time Serverless with a guaranteed invocation rate of 10 invocations per second: 10 invocations in 1 sec, 30 invocations in 3 sec]

Deploy function description:

  • Maximum runtime (timeout)
  • Handler
  • Guaranteed invocation rate: at least 1 invocation per period of (1 / guaranteed invocation rate) seconds

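The guarantee on this slide can be read as a lower bound on the number of invocations available in any time window. A minimal Python sketch of that arithmetic follows, assuming the rate is expressed in invocations per second. The function names `guaranteed_invocations` and `period_seconds` are hypothetical and not part of the RTS interface.

```python
import math


def guaranteed_invocations(rate_per_sec: float, window_sec: float) -> int:
    """Lower bound on invocations an RTS function can start within a window.

    Assumes the guarantee is "at least 1 invocation per period of
    1/rate seconds", so a window holds floor(rate * window) full periods.
    """
    if rate_per_sec < 0 or window_sec < 0:
        raise ValueError("rate and window must be non-negative")
    # Small epsilon guards against floating-point results such as 2.9999999.
    return math.floor(rate_per_sec * window_sec + 1e-9)


def period_seconds(rate_per_sec: float) -> float:
    """Length of the period in which at least one invocation is guaranteed."""
    if rate_per_sec <= 0:
        raise ValueError("a zero rate gives no guarantee (best-effort)")
    return 1.0 / rate_per_sec
```

With a rate of 10 invocations per second, this gives 10 invocations in a 1-second window and 30 in a 3-second window, which matches the numbers on the slide's figure.
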
SLIDE 6

Real-time Serverless

✓ Timely resource access

[Figure: resource vs. time for a bursty load, comparing Serverless with Real-time Serverless invocation]

SLIDE 7

Analytic Model: Video Monitoring

Value is represented as: $\mathrm{FrameValue} = \mathrm{MaxValue} \cdot e^{-\mathrm{ResponseDelay}}$
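The value model on this slide can be sketched in Python, assuming a frame's value decays exponentially with its response delay. The `decay_rate` parameter and the function name are assumptions added for illustration; the slide does not state the units or scale of the delay.

```python
import math


def frame_value(response_delay: float, max_value: float = 1.0,
                decay_rate: float = 1.0) -> float:
    """Value of a processed frame under an exponential-decay model.

    decay_rate is a hypothetical scaling factor; 1.0 gives
    max_value * e^(-response_delay).
    """
    if response_delay < 0:
        raise ValueError("response delay must be non-negative")
    return max_value * math.exp(-decay_rate * response_delay)
```

A frame answered immediately keeps its full value, and every additional unit of delay multiplies the remaining value by the same factor, which is why timely invocation matters most at the start of a burst.
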

[Figure: frame value vs. response delay; burst height and burst duration]

[Diagram labels: Camera, Real-time Serverless Framework, Video Monitoring, Guaranteed Invocation Rate, Burst arrival, Response]

SLIDE 8

RTS Guarantees Statistics for Frame Value

[Figure: percentage of burst frames and normalized frame value, one curve per guaranteed invocation rate (0.0, 0.1, 0.3, 0.9, and 1.0 invocation/ft); annotation: better value distribution. ft = frame-time = 1/30 sec]

✓ High guaranteed invocation rate → high value
✓ Guaranteed statistics for frame value

SLIDE 9

Rational Design for Value

Application can adjust guaranteed invocation rate to meet any value target

[Figure: guaranteed invocation rate (instances per frame-time) and percentage of burst frames, with curves for 50%, 70%, and 90% of max. value; higher is better; annotated values 0.86 and 0.92]

✓ Enables the application to engineer the value distribution

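Choosing a guaranteed invocation rate for a value target can be sketched as a search over a value curve. The Python sketch below is an illustration, not the authors' method: it assumes the application can estimate value as a non-decreasing function of the rate (`value_at`, a hypothetical callable supplied by the caller) and uses bisection to find the smallest rate that reaches the target.

```python
from typing import Callable


def min_rate_for_target(value_at: Callable[[float], float], target: float,
                        lo: float = 0.0, hi: float = 1.0,
                        tol: float = 1e-3) -> float:
    """Smallest rate in [lo, hi] whose estimated value reaches target.

    value_at must be non-decreasing in the rate (an assumption); it stands
    in for a measured or modeled value curve.
    """
    if value_at(hi) < target:
        raise ValueError("target not reachable within the given rate range")
    # Invariant: value_at(hi) >= target throughout the loop.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if value_at(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi
```

Bisection is used here only because it needs nothing beyond monotonicity; an application with a closed-form value model could solve for the rate directly.
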
SLIDE 10

RTS with Burst Interference

Duty factor ↔ burst interference

[Figure: probability of burst interference for duty factors of 1%, 10%, and 25%; annotated values 0.2% and 2.5%]

✓ For realistic bursty applications, the interference probability is low

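One way to see why interference is rare at low duty factors is a simple independence model. The Python sketch below is an assumption for illustration, not the analysis behind the slide's plot: it treats each of `n_other_apps` applications as bursting independently for a fraction `duty_factor` of the time and computes the chance that at least `k` of them are bursting at the same moment. All names are hypothetical.

```python
from math import comb


def interference_probability(duty_factor: float, n_other_apps: int,
                             k: int = 1) -> float:
    """P(at least k of n_other_apps are bursting) under independence.

    Each application is assumed active with probability duty_factor,
    so the count of active applications is binomial.
    """
    if not 0.0 <= duty_factor <= 1.0:
        raise ValueError("duty_factor must be in [0, 1]")
    if n_other_apps < 0 or k < 0:
        raise ValueError("n_other_apps and k must be non-negative")
    d, n = duty_factor, n_other_apps
    # Probability that fewer than k applications are bursting.
    p_fewer = sum(comb(n, i) * d ** i * (1 - d) ** (n - i)
                  for i in range(min(k, n + 1)))
    return 1.0 - p_fewer
```

Under this model, the probability that two other applications burst together scales with the square of the duty factor, so it falls off quickly as the duty factor shrinks.
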
SLIDE 11

RTS Can Support Multiple Bursts

[Figure: bursts can happen simultaneously; higher is better]

✓ Real-time Serverless can support multiple bursts
✓ The approach is simple: just increase the guaranteed invocation rate

SLIDE 12

Implementation

Real-time serverless interface

  • Compatible with serverless

<function name>
  lang: <Language of function body>
  handler: <Location of function body>
  image: <Docker image reference>
  realtime: <Guaranteed invocation rate>
  timeout: <Runtime limit>
  limits: <Maximum resource use>
  requests: <Minimum resource use>

Working prototype

  • Leverages OpenFaaS
  • Admission control at function registration

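Admission control at function registration can be sketched as a capacity check. The sizing rule in this Python sketch is an assumption, not the prototype's documented logic: a function guaranteed `realtime` invocations per second, each running for up to `timeout` seconds, may hold up to `realtime * timeout` instances at once, so a new function is admitted only if the total reservation stays within the pool capacity. All class and field names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class FunctionSpec:
    name: str
    realtime: float  # guaranteed invocation rate, invocations per second
    timeout: float   # maximum runtime per invocation, seconds

    def reserved_instances(self) -> int:
        # Worst case: every guaranteed invocation runs for the full timeout.
        # The epsilon avoids rounding 3.0000000000000004 up to 4.
        return math.ceil(self.realtime * self.timeout - 1e-9)


@dataclass
class AdmissionController:
    capacity: int  # total instances available in the pool
    admitted: dict = field(default_factory=dict)

    def reserved(self) -> int:
        """Instances already promised to admitted functions."""
        return sum(f.reserved_instances() for f in self.admitted.values())

    def register(self, spec: FunctionSpec) -> bool:
        """Admit the function only if its guarantee fits in the remaining pool."""
        if spec.name in self.admitted:
            return False
        if self.reserved() + spec.reserved_instances() > self.capacity:
            return False
        self.admitted[spec.name] = spec
        return True
```

Rejecting at registration time, rather than at invocation time, is what lets an admitted function rely on its rate: the reservation is checked once, before any burst arrives.
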
SLIDE 13

Case Study: Traffic Monitoring

  • Traces from real video over Glimpse
  • Low-level monitor for vehicle presence
  • Bursts arise when vehicles appear

[Figure: traffic trace with rush hours, daytime, and night time marked]

SLIDE 14

Simple Frame Value Model (Success/Fail)

Vary guaranteed invocation rate (large background load)

[Figure: successful frame rate (normalized) vs. hour of day, for application requests, Serverless/OpenFaaS (guaranteed invocation rate = 0), Real-time Serverless with guaranteed invocation rate = 0.3, and Real-time Serverless with guaranteed invocation rate = 1.0]

Serverless cannot respond to demand changes
✓ RTS's guaranteed invocation rate enables it to respond to application demand despite competition from background load
✓ A higher RTS invocation rate improves the success rate for multiple bursts

SLIDE 15

Related Work

  • Traditional Serverless with fast, dynamic invocation
      • Amazon Lambda, Google Cloud Functions, OpenFaaS, Knative, etc.
  • Minimizing FaaS invocation overhead
      • SAND (ATC’18), SOCK (ATC’18), Kim et al. (CLUSTER’18)
  • Extensions for improving FaaS performance
      • Jonas et al. (SoCC’17), Hellerstein et al. (CIDR), Jonas et al. (Berkeley, 2019)

None focuses on performance guarantees or real-time behavior.

SLIDE 16

Summary

  • The current serverless interface cannot support real-time, bursty applications.
  • Real-time serverless = Serverless + Guaranteed invocation rate.
      • Guarantees statistics for value.
      • Enables rational design.
  • A prototype shows timely response for a video monitoring application.
  • Future work
      • Efficient implementation of the RTS interface
      • Explore the benefits of the RTS interface for other application classes

SLIDE 17

Q&A

  • Acknowledgement. This work is supported by National Science Foundation Grants CNS-1405959, CMMI-1832230, and CNS-1901466. We gratefully acknowledge support from Intel, Google, and Samsung.

SLIDE 18

Backup Slides

SLIDE 19

A Big Picture

[Diagram labels: Applications (Real-time, Bursty: "Our focus!"); Computation Infrastructure (IT Server, Cloud Data Center, Edge Providers)]

SLIDE 20

Validate Analytical Results with Simulation

SLIDE 21

Supporting Multiple Applications

SLIDE 22

Case Study: Statistics

Glimpse Pipeline Architecture [1]

[Figure: burst statistics]

[1] Tiffany Yu-Han Chen et al., Glimpse: Continuous, Real-time Object Recognition on Mobile Devices, SenSys’15

SLIDE 23

RTS for Video Analysis

✓ Guaranteed invocation rate enables value guarantee

[Figures: realistic workload; synthetic workload]

SLIDE 24

RTS for Video Analysis

✓ Enables rational design for value guarantee

[Figures: realistic workload; synthetic workload]

SLIDE 25

Robust against Burst Shape

  • Fixed total demand per burst
  • Vary burst duration (and height)

[Figure: value for different burst shapes; higher is better]

✓ Any value is achievable at an appropriate guaranteed invocation rate
✓ Maximum value is achieved at a guaranteed invocation rate of 1, regardless of burst shape

SLIDE 26

Robust against Burst Variability

Change variability by varying burst duration standard deviation

[Figures: value under burst variability for duty factor = 1% and duty factor = 25%; higher is better; annotated values 27%, 0.25, 40%, and 0.35]

Variability causes a value drop
A higher duty factor creates more damage
✓ Increasing the guaranteed invocation rate cancels the variability effect
✓ RTS value can be maintained over a wide burst variance

SLIDE 27

Multiple Real-time, Bursty Apps

[Figure: resource (instances) vs. time (min) for 10 apps and 100 apps, showing total demand, UI allocation, RTS allocation, and RTS avg. allocation; lower is better]

✓ RTS resource cost scales with actual demand
✓ RTS resource consumption is 2.2x to 5x lower than UI
✓ RTS helps cloud providers save resources to serve more applications

SLIDE 28

Application Cost: UI vs. RTS

  • Resource cost for maximizing burst value with different duty factors
  • Vary RTS vs. UI cost ratio

[Figure: application cost, UI vs. RTS; lower is better]

✓ RTS resource value per unit cost is 16-24x higher than UI
✓ RTS enables low-cost solutions for real-time, bursty applications

SLIDE 29

UI Cost at Different Duty Factors

SLIDE 30

Resource Cost

✓ RTS is 2x to 8x cheaper than UI
✓ Actual resource requirement is 70x lower than the worst case

[Figures: resource cost; lower is better]

SLIDE 31

RTS Implementation Feasibility

RTS instances can be quickly reused after reaching the max. runtime
RTS pool capacity is bounded

[Equation garbled in source: pool capacity bound, a product involving the max. runtime]

[Figure: instances over time t; each instance runs for a fixed duration (the max. runtime) and is then reused; pool size marked]

SLIDE 32

RTS Interface (Lambda Extension)

  • Function description in YAML format

<function's name>
  lang: <prog. language>
  handler: <refers to a folder where function body can be found>
  image: <Docker image reference>
  realtime: <minimum invocation rate>    # <---- RTS Extension
  environment:
    exec_timeout: <Hard processing timeout>
  limits:      # <---- Maximum resources used by the function
    memory: <max. memory>
    cpu: <max. cpu>
  requests:    # <---- Resource requested by an instance
    memory: <req. memory>
    cpu: <req. cpu>

SLIDE 33

Introduction

  • Function-as-a-Service (FaaS), aka Serverless, is the fastest-growing element of cloud workload
  • Expected to be the driving force for future cloud computing

[Figure legend: Serverless, Virtualization, MapReduce]

✓ Easy development and deployment
✓ Dynamic resource scaling enables cost efficiency
✓ High resource management flexibility

SLIDE 34

Demonstration: Experiment Setup

[Diagram: experiment setup comparing Real-time Serverless with Serverless (best-effort); components: driver, streaming app, image viewer, background load generator]