Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider

slide-1
SLIDE 1

Mohammad Shahrad, Rodrigo Fonseca, Íñigo Goiri, Gohar Chaudhry, Paul Batum, Jason Cooke, Eduardo Laureano, Colby Tresness, Mark Russinovich, and Ricardo Bianchini

Serverless in the Wild:

Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider

July 15, 2020

slide-2
SLIDE 2

What is Serverless?

  • Very attractive abstraction:
      • Pay for use
      • Infinite elasticity from 0 (and back)
      • No worry about servers
          • Provisioning, reserving, configuring, patching, managing
  • Most popular offering: Function-as-a-Service (FaaS)
      • Bounded-time functions with no persistent state among invocations
      • Upload code, get an endpoint, and go

For the rest of this talk, Serverless = Serverless FaaS

slide-3
SLIDE 3

What is Serverless?

                  Bare Metal        VMs (IaaS)        Containers       Functions (FaaS)
Unit of Scale     Server            VM                Application/Pod  Function
Provisioning      Ops               DevOps            DevOps           Cloud Provider
Init Time         Days              ~1 min            Few seconds      Few seconds
Scaling           Buy new hardware  Allocate new VMs  1 to many, auto  0 to many, auto
Typical Lifetime  Years             Hours             Minutes          O(100ms)
Payment           Per allocation    Per allocation    Per allocation   Per use
State             Anywhere          Anywhere          Anywhere         Elsewhere

slide-4
SLIDE 4

Serverless

“…more than 20 percent of global enterprises will have deployed serverless computing technologies by 2020.” Gartner, Dec 2018

slide-5
SLIDE 5

Serverless

Source: CNCF Cloud Native Interactive Landscape https://landscape.cncf.io/format=serverless

slide-6
SLIDE 6

Serverless

“… we predict that (…) serverless computing will grow to dominate the future of cloud computing.” December 2019

slide-7
SLIDE 7

So what are people doing with FaaS?

  • Many simple things
      • ETL workloads
      • IoT data collection / processing
      • Stateless processing
      • Image / Video transcoding
      • Translation
      • Check processing
      • Serving APIs, Mobile/Web Backends
  • Interesting Explorations
      • MapReduce (pywren)
      • Linear Algebra (numpywren)
      • ExCamera
      • gg “burst-parallel” functions apps
      • ML training
  • Limitations
      • Communication
      • Latency
      • Lack of locality
      • State management
slide-8
SLIDE 8

What is Serverless?

  • Very attractive abstraction:
      • Pay for use
      • Infinite elasticity from 0 (and back)
      • No worry about servers
          • Provisioning, reserving, configuring, patching, managing
slide-9
SLIDE 9

If you are a cloud provider…

  • A big challenge
      • You do worry about servers!
          • Provisioning, scaling, allocating, securing, isolating
      • Illusion of infinite scalability
      • Optimize resource use
      • Fierce competition
  • A bigger opportunity
      • Fine-grained resource packing
      • Great space for innovating, and capturing new applications, new markets
slide-10
SLIDE 10

Cold Starts

  • Typically range from 0.2 to a few seconds [1,2]

[Figure panels: OpenWhisk, Azure Functions, AWS Lambda]

[1] https://levelup.gitconnected.com/1946d32a0244
[2] https://mikhail.io/serverless/coldstarts/big3/

slide-11
SLIDE 11

Cold Starts and Resource Wastage

Cold starts vs. wasted memory:

  • Removing the function instance from memory after invocation: cold starts.
  • Keeping functions in memory indefinitely: wasted memory.

?

slide-12
SLIDE 12

Stepping Back: Characterizing the Workload

  • How are functions accessed?
  • What resources do they use?
  • How long do functions take?

2 weeks of all invocations to Azure Functions in July 2019.
First characterization of the workload of a large serverless provider.
Subset of the traces available for research: https://github.com/Azure/AzurePublicDataset

slide-13
SLIDE 13

Invocations per Application*

This graph is from a representative subset of the workload. See paper for details.

slide-19
SLIDE 19

Invocations per Application

  • 18% of apps are invoked >1/min on average: 99.6% of invocations!
  • 82% of apps are invoked <1/min on average: 0.4% of invocations

This graph is from a representative subset of the workload. See paper for details.
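
The split reported on this slide can be reproduced from per-app invocation counts. The sketch below is only an illustration on made-up numbers: the function name `invocation_concentration`, its parameters, and the toy counts are assumptions, not part of the talk or the published traces.

```python
def invocation_concentration(invocations_per_app, window_minutes, threshold_per_min=1.0):
    """Return (fraction of apps, fraction of invocations) for rarely invoked apps.

    An app counts as "rarely invoked" when its average rate over the window
    is below threshold_per_min (hypothetical parameter, default 1/min).
    """
    total_apps = len(invocations_per_app)
    total_invocations = sum(invocations_per_app)
    rare = [n for n in invocations_per_app if n / window_minutes < threshold_per_min]
    return len(rare) / total_apps, sum(rare) / total_invocations


# Toy data: invocation counts of four apps over a 60-minute window.
counts = [6, 12, 30, 9952]
rare_apps, rare_invocations = invocation_concentration(counts, window_minutes=60)
print(f"{rare_apps:.0%} of apps account for {rare_invocations:.2%} of invocations")
```

On real traces the same computation would run over the full observation window rather than one hour.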

slide-20
SLIDE 20

Apps are highly heterogeneous

slide-21
SLIDE 21

What about memory?

If we wanted to keep all apps warm…

[Figure: cumulative fraction of total memory vs. fraction of least invoked apps; series: Allocated Memory, Physical Memory]

slide-22
SLIDE 22

What about memory?

If we wanted to keep all apps warm…

[Figure: cumulative fraction of total memory vs. fraction of least invoked apps; series: Allocated Memory, Physical Memory]

  • 82% of apps -> 0.4% of invocations -> 40% of all physical memory, 60% of virtual memory
  • 90% of apps -> 1.05% of invocations -> 50% of all physical memory

slide-23
SLIDE 23

Function Execution Duration

[Figure: CDF of function execution time, from 1ms to 1h; series: Minimum, Average, Maximum, LogNormal Fit]

  • Executions are short
      • For 50% of apps, the average execution time is <= 0.67s
      • For 75% of apps, the maximum execution time is <= 10s
  • Times at the same scale as cold start times [1,2]

[1] https://levelup.gitconnected.com/1946d32a0244
[2] https://mikhail.io/serverless/coldstarts/big3/
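
The figure on this slide includes a log-normal fit, but the slide does not say how it was computed. As a minimal sketch, assuming a maximum-likelihood fit on the logarithm of the durations, the helper names `fit_lognormal` and `lognormal_cdf` and the sample durations below are illustrative only.

```python
import math
import statistics


def fit_lognormal(durations_s):
    """Maximum-likelihood log-normal fit: mean and std of the log-durations."""
    logs = [math.log(d) for d in durations_s]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)  # population std is the MLE estimate
    return mu, sigma


def lognormal_cdf(x, mu, sigma):
    """Probability that a log-normal(mu, sigma) duration is <= x seconds."""
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


# Made-up average execution times (seconds) for three apps.
mu, sigma = fit_lognormal([0.1, 1.0, 10.0])
print("median of the fit (s):", math.exp(mu))
print("P(duration <= 1s):", lognormal_cdf(1.0, mu, sigma))
```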

slide-24
SLIDE 24

Key Takeaways

  • Highly concentrated accesses
      • 82% of the apps are accessed <1/min on average
      • They correspond to 0.4% of all accesses
      • But in aggregate they would take 40% of the service memory if kept warm
  • Arrival processes are highly variable
  • Execution times are short
      • Same order of magnitude as cold start times
slide-25
SLIDE 25

Cold Starts and Resource Wastage

Cold starts vs. wasted memory:

  • Removing the function instance from memory after invocation: cold starts.
  • Keeping functions in memory indefinitely: wasted memory.

?

[Figure: cumulative fraction of total memory vs. fraction of least invoked apps; series: Allocated Memory, Physical Memory]

[Figure: CDF of function execution time, from 1ms to 1h; series: Minimum, Average, Maximum, LogNormal Fit]

slide-26
SLIDE 26

What do serverless providers do?

  • AWS Lambda: fixed 10-minute keep-alive.
  • Azure Functions: fixed 20-minute keep-alive.

[Figures: cold start probability vs. time since last invocation (mins), one per provider]

Source: Mikhail Shilkov, Cold Starts in Serverless Functions, https://mikhail.io/serverless/coldstarts/

slide-27
SLIDE 27

Fixed Keep-Alive Policy

Results from simulation of the entire workload for a week.

[Figure annotation: longer keep-alive]

slide-28
SLIDE 28

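Fixed Keep-Alive Won’t Fit All

[Timelines under a 10-minute fixed keep-alive; legend: Cold Start, Warm Start]

  • Invocations 8 mins apart fall within the keep-alive window: warm starts.
  • Invocations 11 mins apart miss the keep-alive window: cold starts.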

slide-29
SLIDE 29

Fixed Keep-Alive Is Wasteful

Function image kept in memory but not used.

[Timeline: invocations 8 mins apart under a 10-minute fixed keep-alive; legend: Cold Start, Warm Start]
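
The trade-off in the two timeline slides can be checked with a small simulation. This is a simplified sketch, not the simulator used in the talk: it assumes zero execution time, a single app, and ignores the keep-alive period after the last invocation. The function name `simulate_fixed_keep_alive` is hypothetical.

```python
def simulate_fixed_keep_alive(arrival_minutes, keep_alive_minutes):
    """Return (cold_starts, warm_starts, idle_loaded_minutes) for one app.

    idle_loaded_minutes is the time the function image sits in memory
    between invocations without being used.
    """
    cold = warm = 0
    idle_loaded = 0.0
    previous = None
    for t in sorted(arrival_minutes):
        if previous is None:
            cold += 1  # nothing is loaded before the first invocation
        else:
            idle = t - previous
            if idle <= keep_alive_minutes:
                warm += 1
                idle_loaded += idle  # image stayed loaded for the whole gap
            else:
                cold += 1
                idle_loaded += keep_alive_minutes  # loaded until the timeout, for nothing
        previous = t
    return cold, warm, idle_loaded


# Invocations 8 minutes apart: warm after the first call, but the image idles 8 min each time.
print(simulate_fixed_keep_alive([0, 8, 16], keep_alive_minutes=10))
# Invocations 11 minutes apart: every call is cold, and 10 min of memory is wasted per gap.
print(simulate_fixed_keep_alive([0, 11, 22], keep_alive_minutes=10))
```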

slide-30
SLIDE 30

Hybrid Histogram Policy

  • Adapt to each application
  • Pre-warm in addition to keep-alive
  • Lightweight implementation

slide-31
SLIDE 31

A Histogram Policy To Learn Idle Times

[Figure: timeline of invocations 8 mins apart under a 10-minute fixed keep-alive (legend: Cold Start, Warm Start); histogram of idle time (IT) frequency with an entry at 8 mins]

slide-32
SLIDE 32

A Histogram Policy To Learn Idle Times

[Figure: histogram of idle time (IT) frequency with entries at 7, 8, and 9 mins; pre-warm and keep-alive windows marked]

slide-33
SLIDE 33

A Histogram Policy To Learn Idle Times

[Figure: histogram of idle time (IT) frequency; pre-warm window marked at the 5th percentile, keep-alive window at the 99th percentile]

  • Minute-long bins
  • Limited number of bins (e.g., 240 bins for 4 hours)
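
A minimal sketch of how such a histogram could yield the two windows, assuming the pre-warm window ends at the lower edge of the 5th-percentile bin and the keep-alive window runs from there to the upper edge of the 99th-percentile bin. The class name `IdleTimeHistogram`, its methods, and the rounding choices are assumptions for illustration, not the production implementation.

```python
class IdleTimeHistogram:
    """Per-app histogram of idle times (ITs) with minute-long bins."""

    def __init__(self, num_bins=240):  # 240 one-minute bins cover 4 hours
        self.bins = [0] * num_bins
        self.oob = 0  # idle times beyond the histogram range (out of bound)

    def record(self, idle_minutes):
        if idle_minutes < 0:
            raise ValueError("idle time cannot be negative")
        index = int(idle_minutes)  # bin i covers [i, i + 1) minutes
        if index < len(self.bins):
            self.bins[index] += 1
        else:
            self.oob += 1

    def percentile_bin(self, pct):
        """Index of the first bin where the cumulative count reaches pct percent."""
        total = sum(self.bins)
        if total == 0:
            return None
        target = total * pct / 100.0
        cumulative = 0
        for index, count in enumerate(self.bins):
            cumulative += count
            if cumulative >= target:
                return index
        return len(self.bins) - 1

    def windows(self, head_pct=5, tail_pct=99):
        """Return (pre_warm_minutes, keep_alive_minutes), or None without data."""
        head = self.percentile_bin(head_pct)
        tail = self.percentile_bin(tail_pct)
        if head is None:
            return None
        pre_warm = head                # stay unloaded this long after an execution
        keep_alive = (tail + 1) - head  # then stay loaded until the tail bin ends
        return pre_warm, keep_alive


hist = IdleTimeHistogram()
for idle in [7, 8, 8, 8, 9]:
    hist.record(idle)
print(hist.windows())  # unload after execution, reload at 7 min, keep for 3 min
```

With idle times clustered around 8 minutes, the image is loaded only from minute 7 to minute 10 after each execution, instead of for a full fixed keep-alive period.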

slide-34
SLIDE 34

The Hybrid Histogram Policy

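[Figure: histogram of idle time (IT) frequency; pre-warm window marked at the 5th percentile, keep-alive window at the 99th percentile; idle times beyond the histogram range are Out of Bound (OOB)]

  • Out of Bound (OOB) idle times: a histogram might be too wasteful.
  • We can afford to run complex predictors given the low arrival rate.
  • Time series forecast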

slide-35
SLIDE 35

The Hybrid Histogram Policy

  • New invocation
  • Update app’s IT distribution
  • Too many OOB ITs?
      • Yes: time-series forecast (ARIMA)
      • No: pattern significant?
          • Yes: use IT distribution (histogram)
          • No: be conservative (standard keep-alive)

ARIMA: Autoregressive Integrated Moving Average
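
The decision flow on this slide can be sketched as a small function. The slide does not state what counts as "too many" OOB idle times or how pattern significance is tested, so the `max_oob_fraction` threshold and the boolean `pattern_significant` input below are placeholders, and all names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    TIME_SERIES_FORECAST = "time-series forecast (ARIMA)"
    HISTOGRAM = "use IT distribution (histogram)"
    STANDARD_KEEP_ALIVE = "be conservative (standard keep-alive)"


@dataclass
class AppStats:
    in_range_its: int          # idle times that fell inside the histogram range
    oob_its: int               # out-of-bound idle times
    pattern_significant: bool  # result of some significance test (not specified here)


def choose_policy(stats, max_oob_fraction=0.5):
    """Pick a policy for one app after its IT distribution has been updated."""
    total = stats.in_range_its + stats.oob_its
    # Assumed threshold: the talk does not give the cutoff for "too many OOB ITs".
    if total > 0 and stats.oob_its / total > max_oob_fraction:
        return Policy.TIME_SERIES_FORECAST
    if stats.pattern_significant:
        return Policy.HISTOGRAM
    return Policy.STANDARD_KEEP_ALIVE


print(choose_policy(AppStats(in_range_its=10, oob_its=90, pattern_significant=True)).value)
print(choose_policy(AppStats(in_range_its=95, oob_its=5, pattern_significant=True)).value)
print(choose_policy(AppStats(in_range_its=95, oob_its=5, pattern_significant=False)).value)
```

Keeping the forecast branch separate means the more expensive predictor only runs for apps whose idle times rarely fit in the histogram.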

slide-36
SLIDE 36

More Optimal Pareto Frontier

slide-37
SLIDE 37

Implemented in OpenWhisk

[Architecture diagram: REST Interface, Controller, Load Balancer, Distributed Messaging, Distributed Database, and three Invokers, each running several Containers]

  • Open-source, industry-grade platform (IBM Cloud Functions)
  • Functions run in Docker containers
  • Uses 10-minute fixed keep-alive
  • Built a distributed setup with 19 VMs

slide-38
SLIDE 38

[Figure: CDF of app cold starts (%), Hybrid vs. Fixed (10-min); panels: Simulation, Experimental]

4-Hour Hybrid Histogram

  • Latency overhead: < 1ms (835.7µs)
  • Container memory reduction: 15.6%
  • Average exec time reduction: 32.5%
  • 99th-percentile exec time reduction: 82.4%

slide-39
SLIDE 39

Closing the loop

  • First serverless characterization from a provider’s point of view
  • Azure Functions traces available to download: https://github.com/Azure/AzurePublicDataset/blob/master/AzureFunctionsDataset2019.md
  • A dynamic policy to manage serverless workloads more efficiently (first elements now running in production)