Auto-sizing for Stream Processing Applications at LinkedIn



slide-1
SLIDE 1

Auto-sizing for Stream Processing Applications at LinkedIn

Rayman Preet Singh, Bharath Kumarasubramanian, Prateek Maheshwari, and Samarth Shetty

Stream Processing @ LinkedIn

slide-2
SLIDE 2

Stream Processing

Streaming input, nearline processing

[Diagram: App, with labels Skills, Jobs, and Top Skills]

slide-3
SLIDE 3

Stream Processing App Example

John Doe

Samza: Stateful Scalable Stream Processing at LinkedIn

  • Proc. VLDB '17

slide-4
SLIDE 4

Stream Processing App Example

Real-time distributed tracing for web performance and efficiency optimizations (LinkedIn Engineering Blog)

[Diagram: service call graph with Front-end, Feed Service, URN Resolution Service, Profile Service, Notification Service, Profile Service, DB Service, Mini-profile Service, Graph Service, …]

slide-5
SLIDE 5

Stream Processing App Example

LinkedIn Sales and EMEA Blog

slide-6
SLIDE 6

Stream Processing at LinkedIn

At LinkedIn, thousands of apps

Notification, monitoring, recommendation, fraud-detection, search, …

Millions of messages/s, 100s of GBs/s, …

[Diagram: App, with labels Skills, Jobs, and Top Skills]

slide-7
SLIDE 7

Stream Processing at LinkedIn

[Diagram: App-1, App-2, and App-3 on Stream Processing as a Service]

App developers, data scientists, …

  • APIs
  • Capacity provisioning
  • Security & privacy
  • Operational ease
  • Scalability
  • Fault-tolerance
  • Efficiency
  • Performance
  • …

slide-8
SLIDE 8

Problem

  • Throughput, latency
  • Parallelism: CPU cores, #threads, …
  • Memory: heap, native, …
  • Specialized hardware: GPUs, RDMA, …
  • Over-provisioning: 50% of users by approx. 50%, Google Autopilot [EuroSys '20], …
  • Under-provisioning: OOMs, stalls, failures, under-performing, …

slide-9
SLIDE 9

Solution

[Diagram: App, Controller, Sizing parameters]

  • Throughput, latency goals
  • Input load
  • App internals
  • Environmental conditions: dependency-service, network latencies, …
  • Hardware, software evolution
  • …

slide-10
SLIDE 10

Existing Solutions

Apps are DAGs of cataloged operators

  • SoCC '17, VLDB '17, ToN '17, OSDI '18, ICDE '15, ICDE '20, IC2E '16, …
  • Tune parallelism
  • Optimize throughput, latency, utilization, time-taken

Arrival rates, service-times follow specific distributions

  • ToN '17, ICDE '15, …
  • Poisson, exponential, …
  • Tune parallelism: queuing theory, hill-climb, …

[Diagram: example operator DAG with Map, Join, and Filter operators]
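To make the queuing-theory style of parallelism tuning concrete, here is a minimal sketch. It is not taken from any of the cited papers. It assumes Poisson arrivals and exponential service times (an M/M/c queue), treats each parallel instance as one server, and searches for the smallest parallelism whose Erlang C wait probability meets a target. The function names, the target, and the rates are hypothetical.

```python
import math


def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arriving message has to queue in an M/M/c system."""
    offered = arrival_rate / service_rate      # offered load, in Erlangs
    utilization = offered / servers
    if utilization >= 1.0:
        return 1.0                             # unstable: the queue grows without bound
    # Erlang B via its standard recurrence (numerically stable, no factorials).
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered * blocking / (k + offered * blocking)
    # Convert Erlang B to Erlang C.
    return blocking / (1.0 - utilization * (1.0 - blocking))


def min_parallelism(arrival_rate, service_rate, max_wait_prob, limit=10000):
    """Smallest number of parallel instances whose wait probability is acceptable."""
    start = max(1, math.ceil(arrival_rate / service_rate))
    for servers in range(start, limit + 1):
        if erlang_c(arrival_rate, service_rate, servers) <= max_wait_prob:
            return servers
    raise ValueError("no parallelism up to the limit meets the target")


# Hypothetical numbers: 900 msgs/s arriving, each instance serves 100 msgs/s.
instances = min_parallelism(arrival_rate=900.0, service_rate=100.0, max_wait_prob=0.2)
```

The result is only as good as the distributional assumptions, which is the limitation the following slides examine.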

slide-11
SLIDE 11

Apps use remote services

[Diagram: Op1, Op2, Op3, and a UDF, with remote services: Web Service, Blob Storage, KV Store, …]

[Figure: CDF of service time (ms) for App 1 to App 4; x-axis: service time (in ms), log scale from 1 to 1000]

  • Service time depends on remote services' latencies, error-rates & retries, network latencies, …
  • No specific distribution of service-times

slide-12
SLIDE 12

Apps use remote services

[Figure: time-series of input load (messages per sec) for sample apps]

Throughput depends on input load variation and remote services' throughput

[Diagram: Op1, Op2, Op3, and a UDF, with remote services: Web Service, Blob Storage, KV Store, …]

slide-13
SLIDE 13

No specific distributions of arrival-rates

[Figure: CDF of arrival-rate (messages/sec) for App 1 to App 4; x-axis: arrival rate (messages/sec), 10^5 to 10^6]
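When arrival rates and service times follow no specific distribution, one option is to work directly with the observed samples. The sketch below is an illustrative assumption rather than anything described in the talk: it builds an empirical CDF, like the ones plotted above, and a nearest-rank percentile without fitting a parametric model. The sample values are made up.

```python
import math
from bisect import bisect_right


def empirical_cdf(samples):
    """Return a function F(x) = fraction of samples that are <= x."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    count = len(ordered)

    def cdf(x):
        # bisect_right counts how many ordered values are <= x.
        return bisect_right(ordered, x) / count

    return cdf


def percentile(samples, p):
    """Nearest-rank percentile (0 < p <= 100) of the observed samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical per-message service times in milliseconds.
service_times_ms = [100, 200, 300, 400]
cdf = empirical_cdf(service_times_ms)
fraction_under_250ms = cdf(250)            # 0.5 for the samples above
p50_ms = percentile(service_times_ms, 50)  # 200 for the samples above
```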

slide-14
SLIDE 14

Apps go beyond a DAG of operators

Additional functionalities

  • External frameworks: TensorFlow, DL4j, …
  • Out-of-order processing
  • Input priorities
  • State
  • User-defined functions (UDFs)
  • Customized input checkpointing
  • …

[Diagram: Op1, Op2, Op3, UDF, Periodic UDF, Client Cache, and State, with Web Service, Blob Storage, KV Store, …, and External Frameworks]

slide-15
SLIDE 15

Apps go beyond a DAG of operators

  • Heterogeneous combinations of functionalities
  • DAG-only based models are insufficient

slide-16
SLIDE 16

Apps exhibit correlations in resource use

CPU bottleneck → Input buffering → Memory use → Lowered throughput → …

Java-based apps (Apache Flink, Samza, …)

Memory bottleneck → GC overhead → Low throughput, high latency & CPU utilization

slide-17
SLIDE 17

Long-tail distributions of app characteristics

[Figure: CDF of application p50 service time (ms); x-axis: application p50 service time (ms), log scale from 0.01 to 100000; y-axis: fraction of applications]

slide-18
SLIDE 18

Long-tail distributions of app characteristics

[Figure: CDF of number of input streams (per application); x-axis: number of input streams, log scale from 1 to 1000; y-axis: fraction of applications]

slide-19
SLIDE 19

Long-tail distributions of app characteristics

[Figure: CDF of application state size (MB); x-axis: application state (in MB), log scale from 0.1 to 10^7; y-axis: fraction of applications]

slide-20
SLIDE 20

Requirements

[Diagram: App, Controller, Sizing parameters]

  • Right size vs. optimal size
  • Operational ease: interpretable, safe-trajectory
  • Minimize time-taken
  • Scalable, fault-tolerant, efficient, …

slide-21
SLIDE 21

Approach

Black-box approaches

  • Azure-VMSS, AWS-EC2 autoscale, Dhalion (VLDB '17), …
  • Interpretable
  • Right sizing
  • Time-taken, oscillations [DS2 OSDI '18, Turbine ICDE '20]

Undo, redo, refine, …

slide-22
SLIDE 22

Approach

Optimization approaches

  • Bilal et al. SoCC '17, Gencer et al. Middleware '15, …
  • Training data (trial runs), parameter & criteria tuning, assumptions, …
  • Optimal sizing, minimize time-taken
  • Operability (interpretable actions), service dependencies, network, …

slide-23
SLIDE 23

Sage Design

  • Feedback control system
  • Policies encapsulate strategies for sizing a single resource
  • Policies are applied in priority order
  • Policies run periodically on all apps
  • A policy acts only if there is no inflight action on the app

slide-24
SLIDE 24

Sage Design

Policy priority order

  • P1: Memory scale-up
  • P2: CPU scale-up
  • P3: Parallelism tuning

  • Deterministic: interpretable, modifiable, …
  • Programmability for policies
  • Tailored to continuous-operator systems like Apache Samza, Flink, …
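The priority-ordered policy evaluation can be pictured with a small sketch. This is not Sage's actual code. It assumes each policy inspects an app's metrics and either proposes an action or declines, that policies are tried in the fixed order P1, P2, P3, and that an app with an inflight action is skipped. The class names, metric names, and thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class AppState:
    name: str
    metrics: Dict[str, float]
    inflight_action: Optional[str] = None   # set while a resize is being applied


@dataclass
class Policy:
    name: str
    decide: Callable[[Dict[str, float]], Optional[str]]  # action, or None to pass


# Hypothetical trigger conditions; real policies would be far more careful.
def memory_scale_up(metrics):
    return "increase memory" if metrics.get("heap_used_fraction", 0.0) > 0.9 else None


def cpu_scale_up(metrics):
    return "increase CPU" if metrics.get("cpu_utilization", 0.0) > 0.9 else None


def parallelism_tuning(metrics):
    return "increase parallelism" if metrics.get("backlog_growth", 0.0) > 0.0 else None


POLICIES: List[Policy] = [
    Policy("P1: memory scale-up", memory_scale_up),
    Policy("P2: CPU scale-up", cpu_scale_up),
    Policy("P3: parallelism tuning", parallelism_tuning),
]


def evaluate(app: AppState, policies: List[Policy] = POLICIES) -> Optional[Tuple[str, str]]:
    """Return (policy name, action) from the first policy that fires, else None."""
    if app.inflight_action is not None:
        return None                         # never stack actions on one app
    for policy in policies:                 # fixed order makes the outcome deterministic
        action = policy.decide(app.metrics)
        if action is not None:
            app.inflight_action = action
            return policy.name, action
    return None


def control_pass(apps: List[AppState]) -> Dict[str, Optional[Tuple[str, str]]]:
    """One periodic pass of the feedback loop over all apps."""
    return {app.name: evaluate(app) for app in apps}
```

Because the order is fixed, an app that is short on both memory and CPU always gets the memory action first, which keeps each decision easy to explain.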

slide-25
SLIDE 25

Sage Design

Straggling app

  • Increase memory? CPU? Parallelism?

Bounded buffers

  • Tuning memory before CPU

Tuning parallelism

  • Triggered by backlog increases (after P1, P2)
  • Correlation with remote service metrics? TLCC (time-lagged cross-correlation)

Policy priority order: P1: Memory scale-up, P2: CPU scale-up, P3: Parallelism tuning

[Diagram: Op1, Op2, Op3, UDF, Periodic UDF, Client Cache, and State, with Web Service, Blob Storage, KV Store, …, and External Frameworks]
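Time-lagged cross-correlation (TLCC) can be sketched as follows. This is a generic illustration, not Sage's implementation: it computes the Pearson correlation between a remote-service metric and the app's backlog at several lags, and reports the lag with the strongest correlation. The series and function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0                          # a constant series carries no signal
    return cov / math.sqrt(var_x * var_y)


def tlcc(leader, follower, max_lag):
    """Correlate leader[t] with follower[t + lag] for lag = 0..max_lag."""
    if len(leader) != len(follower):
        raise ValueError("series must have the same length")
    if max_lag > len(leader) - 2:
        raise ValueError("max_lag leaves too few overlapping points")
    scores = {}
    for lag in range(max_lag + 1):
        scores[lag] = pearson(leader[:len(leader) - lag], follower[lag:])
    return scores


def best_lag(leader, follower, max_lag):
    """Return (lag, correlation) for the lag with the highest correlation."""
    scores = tlcc(leader, follower, max_lag)
    return max(scores.items(), key=lambda item: item[1])


# Hypothetical series: backlog follows remote-service latency two samples later.
remote_latency = [1, 1, 5, 1, 1, 1, 6, 1, 1, 1]
backlog = [1, 1, 1, 1, 5, 1, 1, 1, 6, 1]
lag, score = best_lag(remote_latency, backlog, max_lag=4)
```

A strong correlation at a positive lag suggests that the backlog is driven by the remote service, in which case adding parallelism may not help.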

slide-26
SLIDE 26

Implementation

Work in progress

  • Implemented as a stream processing app
  • Used for a mix of hundreds of production apps
  • Avg. approx. 30 mins for new apps
  • 14% larger size vs. hand-tuned optimal (selected apps)
  • At most one scale-down for each resource

slide-27
SLIDE 27

Conclusion

  • Resource sizing is crucial for any service's performance, usability, operability, …
  • Streaming apps go beyond a DAG of operators
      • Use remote services
      • Customize functionalities, heterogeneous
      • Widely varying workloads
  • Multiple resource-use, performance, cost, operability trade-offs
  • Sage: a rule-based solution to navigate them in production

slide-28
SLIDE 28

Backup slides


slide-29
SLIDE 29

Long-tail distributions of app characteristics


slide-30
SLIDE 30

Apps use remote services

[Diagram: Op1, Op2, Op3, and a UDF, with remote services: Web Service, Blob Storage, KV Store, …]

[Figure: CDF of service time (ms) for App 1 to App 4; x-axis: service time (in ms), log scale from 1 to 1000]

  • Service time depends on remote services' latencies, error-rates & retries, network latencies, …
  • No specific distribution
  • Throughput depends on input load variation and remote services' throughput
  • No specific distribution

[Figure: CDF of arrival-rate (messages/sec) for App 1 to App 4; x-axis: arrival rate (messages/sec), 10^5 to 10^6]

slide-31
SLIDE 31

Apps go beyond a DAG of operators

Additional functionalities

  • External frameworks: TensorFlow, DL4j, …
  • Out-of-order processing
  • Input priorities
  • State
  • User-defined functions (UDFs)
  • Customized input checkpointing
  • …

Apps combine operators and functionalities in different ways: a heterogeneous mix

[Diagram: Op1, Op2, Op3, UDF, Periodic UDF, Client Cache, and State, with Web Service, Blob Storage, KV Store, …, and External Frameworks]

slide-32
SLIDE 32

Apps exhibit correlations in resource use

CPU bottleneck → Input buffering → Memory use → Lowered throughput → …

Java-based apps (Apache Flink, Samza, …)

Memory bottleneck → GC overhead → Low throughput, high latency & CPU utilization

slide-33
SLIDE 33

Sage Design

  • Feedback control system
  • Policies encapsulate strategies for sizing a single resource
  • Policies are applied in priority order
  • Policies run periodically on all apps
  • A policy acts only if there is no inflight action on the app

slide-34
SLIDE 34

Sage Design

Policy priority order

  • P1: Memory scale-up
  • P2: CPU scale-up
  • P3: Parallelism tuning

  • Deterministic: interpretable, modifiable, …
  • Programmability for policies
  • Tailored to continuous-operator systems like Apache Samza, Flink, …

slide-35
SLIDE 35

Work in Progress

  • Used for a mix of hundreds of production apps
  • Avg. approx. 30 mins for new apps
  • 14% larger size vs. hand-tuned optimal (selected apps)
  • At most one scale-down for each resource

slide-36
SLIDE 36

Summary

  • Resource sizing is crucial for any service's performance, usability, operability, …
  • Streaming apps go beyond a DAG of operators
      • Use remote services
      • Customize functionalities, heterogeneous
      • Widely varying workloads
  • Multiple resource-use, performance, cost, operability trade-offs
  • Sage: a rule-based solution to navigate them in production

slide-37
SLIDE 37

Long-tail distributions of app characteristics

[Figure: CDF of application p50 service time (ms); x-axis: application p50 service time (ms), log scale from 0.01 to 100000; y-axis: fraction of applications]

[Figure: CDF of number of input streams (per application); x-axis: number of input streams, log scale from 1 to 1000; y-axis: fraction of applications]

[Figure: CDF of application state size (MB); x-axis: application state (in MB), log scale from 0.1 to 10^7; y-axis: fraction of applications]