Auto-sizing for Stream Processing Applications at LinkedIn

Rayman Preet Singh, Bharath Kumarasubramanian, Prateek Maheshwari, and Samarth Shetty
Stream Processing @ LinkedIn

Stream Processing
[Diagram labels: Skills, Jobs, App, Top Skills]
Streaming input, nearline processing

Stream Processing App Example
[Figure label: John Doe]
Source: Samza: Stateful Scalable Stream Processing at LinkedIn, Proc. VLDB ’17

Stream Processing App Example
[Diagram: services calling one another, including front-end, feed, profile, mini-profile, graph, notification, URN resolution, and DB]
Source: Real-time distributed tracing for web performance and efficiency optimizations, LinkedIn Engineering Blog

Stream Processing App Example
Source: LinkedIn Sales and EMEA Blog

Stream Processing at LinkedIn
[Diagram labels: Skills, Jobs, App, Top Skills]
At LinkedIn, thousands of apps: notification, monitoring, recommendation, fraud-detection, search, …
Millions of messages/s, 100s of GBs/s, …

Stream Processing at LinkedIn
Stream Processing as a Service
[Diagram labels: App-1, App-2, App-3, APIs, app developers, data scientists]
Capacity provisioning, security & privacy, operational ease, scalability, fault-tolerance, efficiency, performance, …

Problem
Throughput, latency
Parallelism: CPU-cores, #threads, …
Memory: heap, native, …
Specialized hardware: GPUs, RDMA, …
Over-provisioning: 50% of users by approx. 50%, Google-Autopilot [EuroSys ’20], …
Under-provisioning: OOMs, stalls, failures, under-performing, …

Solution
[Diagram labels: Controller, sizing parameters, App]
Throughput, latency goals
Input load
App internals
Environmental conditions: dependency-service, network latencies, …
Hardware, software evolution
…

Existing Solutions
Apps are DAGs of cataloged operators (Filter, Map, Join): SoCC ’17, VLDB ’17, ToN ’17, OSDI ’18, ICDE ’15, ICDE ’20, IC2E ’16, …
Tune parallelism
Optimize throughput, latency, utilization, time-taken
Arrival rates, service-times follow specific distributions (Poisson, exponential, …): ToN ’17, ICDE ’15, …
Tune parallelism: queuing theory, hill-climb, …
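To make the queuing-theory style of parallelism tuning concrete, here is a minimal Python sketch. It assumes the textbook M/M/c model (Poisson arrivals, exponential service times, c identical workers) and the standard Erlang C formula. It is not taken from any of the cited systems, and the function names and the wait-time target are hypothetical.

```python
import math


def erlang_c(arrival_rate, service_rate, workers):
    """Probability that an arriving message has to queue in an M/M/c system."""
    offered_load = arrival_rate / service_rate
    utilization = offered_load / workers
    if utilization >= 1.0:
        return 1.0  # unstable: every message waits
    top = (offered_load ** workers / math.factorial(workers)) / (1.0 - utilization)
    bottom = sum(offered_load ** k / math.factorial(k) for k in range(workers)) + top
    return top / bottom


def mean_queue_wait(arrival_rate, service_rate, workers):
    """Mean time (seconds) a message spends queued before service starts."""
    if arrival_rate >= workers * service_rate:
        return math.inf
    return erlang_c(arrival_rate, service_rate, workers) / (
        workers * service_rate - arrival_rate
    )


def min_parallelism(arrival_rate, service_rate, max_wait_s, limit=1024):
    """Smallest worker count whose mean queueing delay meets the target."""
    for workers in range(1, limit + 1):
        if mean_queue_wait(arrival_rate, service_rate, workers) <= max_wait_s:
            return workers
    raise ValueError("no worker count up to the limit meets the target")


# Hypothetical app: 900 messages/s arriving, each worker serves 100 messages/s.
print(min_parallelism(arrival_rate=900, service_rate=100, max_wait_s=0.005))
```

The answer is only as good as the distributional assumptions behind it, which is the limitation the following slides raise for apps whose service times and arrival rates follow no specific distribution.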

Apps use remote services
[Diagram labels: Op1, Op2, Op3, UDF, Web Service, Blob Storage, KV Store]
[Figure: CDF of service time (ms) for App 1 to App 4; x-axis: service time (in ms), 1 to 1000]
Service time depends on remote services’ latencies, error-rates & retries, network latencies, …
No specific distribution of service-times

Apps use remote services
[Diagram labels: Op1, Op2, Op3, UDF, Web Service, Blob Storage, KV Store]
[Figure: time-series of input load (messages per sec) for sample apps]
Throughput depends on input load variation and remote services’ throughput

[Figure: CDF of arrival-rate (messages/sec) for App 1 to App 4; x-axis: arrival rate (messages/sec), 10^5 to 10^6]
No specific distributions of arrival-rates

Apps go beyond a DAG of operators
[Diagram labels: Op1, Op2, Op3, UDF, Periodic UDF, State, Client Cache, External Frameworks, Web Service, Blob Storage, KV Store]
Additional functionalities:
External frameworks: TensorFlow, DL4j, …
Out-of-order processing
Input priorities
User-defined functions (UDFs)
Customized input checkpointing
…

Apps go beyond a DAG of operators
Heterogeneous combinations of functionalities
DAG-only based models are insufficient

Apps exhibit correlations in resource use
CPU bottleneck → Input buffering → Memory use → Lowered throughput → …
Java-based apps (Apache Flink, Samza, …): Memory bottleneck → GC overhead → Low throughput, high latency & CPU utilization

Long-tail distributions of app characteristics
[Figure: CDF of application p50 service time (ms); x-axis: application p50 service time (ms), 0.01 to 100000; y-axis: fraction of applications]

Long-tail distributions of app characteristics
[Figure: CDF of number of input streams (per application); x-axis: number of input streams, 1 to 1000; y-axis: fraction of applications]

Long-tail distributions of app characteristics
[Figure: CDF of application state size (MB); x-axis: application state (in MB), 0.1 to 1x10^7; y-axis: fraction of applications]

Requirements
[Diagram labels: Controller, sizing parameters, App]
Right size vs. optimal size
Operational ease
Interpretable
Safe-trajectory
Minimize time-taken
Scalable, fault-tolerant, efficient, …

Approach
Black-box approaches: Azure-VMSS, AWS-EC2 autoscale, Dhalion VLDB ’17, …
Interpretable
Right sizing
Time-taken, oscillations [DS2 OSDI ’18, Turbine ICDE ’20]
Undo, redo, refine, …

Approach
Optimization approaches: Bilal et al. SoCC ’17, Gencer et al. Middleware ’15, …
Training data (trial runs), parameter & criteria tuning, assumptions, …
Optimal sizing, minimize time-taken
Operability (interpretable actions), service dependencies, network, …

Sage Design
Feedback control system
Policies encapsulate strategies for sizing a single resource
Policies are applied in priority order
Runs periodically on all apps
Acts on an app only if no action is in flight for it

Sage Design
Policy priority order
Deterministic: interpretable, modifiable, …
Programmability for policies
Tailored to continuous-operator systems like Apache Samza, Flink, …
P1: Memory scale-up
P2: CPU scale-up
P3: Parallelism tuning
…
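The control loop described on these two slides can be sketched as follows. This is an illustrative Python sketch, not Sage's implementation: the `App` and `Policy` types, the metric names, the thresholds, and the resize factors are all hypothetical. Only the policy order (P1, P2, P3), the one-resource-per-policy idea, and the "skip apps with an in-flight action" rule come from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class App:
    name: str
    metrics: Dict[str, float]   # latest observed metrics (hypothetical names)
    sizing: Dict[str, float]    # current sizing parameters
    inflight: bool = False      # True while a resize is still being applied


@dataclass
class Policy:
    name: str
    should_fire: Callable[[App], bool]        # does this resource need resizing?
    resize: Callable[[App], Dict[str, float]] # returns the new sizing


# Policies listed in priority order; thresholds and factors are made up.
POLICIES: List[Policy] = [
    Policy("P1: memory scale-up",
           lambda app: app.metrics["heap_used_fraction"] > 0.9,
           lambda app: {**app.sizing, "memory_mb": app.sizing["memory_mb"] * 1.5}),
    Policy("P2: CPU scale-up",
           lambda app: app.metrics["cpu_utilization"] > 0.85,
           lambda app: {**app.sizing, "cpu_cores": app.sizing["cpu_cores"] + 1}),
    Policy("P3: parallelism tuning",
           lambda app: app.metrics["backlog_growth_per_s"] > 0,
           lambda app: {**app.sizing, "containers": app.sizing["containers"] * 2}),
]


def control_step(apps, policies=POLICIES):
    """One periodic pass: at most one action per app, first matching policy wins."""
    issued = {}
    for app in apps:
        if app.inflight:
            continue  # never stack a new action on top of a pending one
        for policy in policies:
            if policy.should_fire(app):
                app.sizing = policy.resize(app)
                app.inflight = True
                issued[app.name] = policy.name
                break
    return issued
```

Because the first matching policy wins and the order is fixed, the same metrics always produce the same action, which is what makes each decision easy to explain to an operator.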

Sage Design
Straggling app: increase memory? CPU? Parallelism?
[Diagram labels: Op1, Op2, Op3, UDF, Periodic UDF, State, Client Cache, External Frameworks, Web Service, Blob Storage, KV Store]
P1: Memory scale-up
P2: CPU scale-up
P3: Parallelism tuning
…
Tuning memory before CPU: bounded buffers
Tuning parallelism: triggered by backlog increases (after P1, P2)
Correlation with remote service metrics? TLCC (time-lagged cross-correlation)
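Time-lagged cross-correlation (TLCC), named above as the check for whether a backlog increase tracks a remote service's metrics, can be sketched in plain Python as below. The sketch assumes two evenly sampled series of equal length and uses Pearson correlation at each lag. The function names, the sample data, and how the result would feed a sizing decision are assumptions for illustration, not details from the talk.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def tlcc(leader, follower, max_lag):
    """Map each lag in 0..max_lag to corr(leader[t], follower[t + lag])."""
    if len(leader) != len(follower):
        raise ValueError("series must have the same length")
    result = {}
    for lag in range(max_lag + 1):
        xs = leader[:len(leader) - lag]
        ys = follower[lag:]
        if len(xs) < 2:
            break  # too few overlapping samples to correlate
        result[lag] = pearson(xs, ys)
    return result


def best_lag(leader, follower, max_lag):
    """Lag with the highest correlation, returned with that correlation."""
    correlations = tlcc(leader, follower, max_lag)
    lag = max(correlations, key=correlations.get)
    return lag, correlations[lag]


# Hypothetical samples: a remote-service latency spike, then backlog two ticks later.
service_latency_ms = [10, 10, 10, 50, 50, 10, 10, 10, 10, 10, 10, 10]
app_backlog = [10, 10, 10, 10, 10, 50, 50, 10, 10, 10, 10, 10]
print(best_lag(service_latency_ms, app_backlog, max_lag=4))
```

A strong correlation at a small positive lag suggests the backlog is following the remote service, in which case adding parallelism may not help. That interpretation is one plausible use of the signal rather than something the slide states.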

Implementation
Work in progress
Implemented as a stream processing app
Used for a mix of hundreds of production apps
Avg. approx. 30 mins for new apps
14% larger size vs. hand-tuned optimal (selected apps)
At most one scale-down for each resource

Conclusion
Resource sizing is crucial for any service’s performance, usability, operability, …
Streaming apps go beyond DAG of operators
Use remote services
Customize functionalities, heterogeneous
Widely varying workloads
Multiple resource-use, performance, cost, operability trade-offs
Sage: a rule-based solution to navigate them in production

Backup slides

Long-tail distributions of app characteristics

Apps use remote services
[Diagram labels: Op1, Op2, Op3, UDF, Web Service, Blob Storage, KV Store]
[Figure: CDF of service time (ms) for App 1 to App 4; x-axis: service time (in ms), 1 to 1000]
[Figure: CDF of arrival-rate (messages/sec) for App 1 to App 4; x-axis: arrival rate (messages/sec), 10^5 to 10^6]
Service time depends on remote services’ latencies, error-rates & retries, network latencies, …
No specific distribution
Throughput depends on input load variation and remote services’ throughput
No specific distribution

Apps go beyond a DAG of operators
[Diagram labels: Op1, Op2, Op3, UDF, Periodic UDF, State, Client Cache, External Frameworks, Web Service, Blob Storage, KV Store]
Additional functionalities:
External frameworks: TensorFlow, DL4j, …
Out-of-order processing
Input priorities
User-defined functions (UDFs)
Customized input checkpointing
…
Apps combine operators and functionalities in different ways
Heterogeneous mix

Long-tail distributions of app characteristics
[Figure: CDF of application p50 service time (ms); x-axis: application p50 service time (ms), 0.01 to 100000; y-axis: fraction of applications]
[Figure: CDF of number of input streams (per application); x-axis: number of input streams, 1 to 1000; y-axis: fraction of applications]
[Figure: CDF of application state size (MB); x-axis: application state (in MB), 0.1 to 1x10^7; y-axis: fraction of applications]
