

SLIDE 1

DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI

Cyber Intelligence and Information Security

CIS Sapienza

THE LONG ROAD TOWARDS ELASTIC DISTRIBUTED STREAM PROCESSING

Leonardo Querzoni querzoni@diag.uniroma1.it Auto-DaSP - Turin, August 28th 2018

SLIDE 2

THE LONG ROAD TOWARDS ELASTIC DISTRIBUTED STREAM PROCESSING - AUTODASP 2018

ELASTIC COMPUTING

"[...] defines elasticity as the configurability and expandability of the solution [...] Centrally, it is the ability to scale up and scale down capacity based on subscriber workload."

OCDA. Master Usage Model: Compute Infrastructure as a Service. Tech. rep., Open Data Center Alliance (OCDA), 2012

”Rapid elasticity: Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.”


MELL, P., AND GRANCE, T. The NIST Definition of Cloud Computing. Tech. rep., U.S. National Institute of Standards and Technology (NIST), SP 800-145, 2011

”Elasticity is basically a ’rename’ of scalability [...]” and ”removes any manual labor needed to increase or reduce capacity”


SCHOUTEN, E. (IBM). Rapid Elasticity and the Cloud, September 2012

"the quantifiable ability to manage, measure, predict and adapt responsiveness of an application based on real time demands placed on an infrastructure using a combination of local and remote computing resources."


COHEN, R. Defining Elastic Computing, September 2009.

”Elasticity measures the ability of the cloud to map a single user request to different resources.”


WOLSKI, R. Cloud Computing and Open Source: Watching Hype meet Reality, May 2011



SLIDE 11

ELASTIC COMPUTING

[Chart: load over time. A static provisioning level set against a varying workload wastes resources in the troughs (overprovisioning) and can still be exceeded at the peaks (underprovisioning); an elastic provisioning curve tracks the workload instead]

SLIDE 12

ELASTIC COMPUTING

Elastic computing drove the success of cloud providers

▪ Virtually infinite resources
▪ On-demand provisioning
▪ Near-instant availability
▪ Automatic scale-out
▪ Pay-what-you-use

SLIDE 13

ELASTIC COMPUTING

Elastic processing of big-data is today a reality

SLIDE 14

DISTRIBUTED STREAM PROCESSING

Data Stream Processing Engine:

▪ continuously calculate results for persistent queries
▪ on (potentially) unbounded data streams
▪ using operators: algebraic (filters, join, aggregation) or user defined
▪ stateless/stateful

[Diagram: a directed graph of operators p1..p5 processing events/tuples from sources A and B, with access to a DB and a KB]
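As a language-neutral illustration (not tied to any particular engine), the operator types listed above can be sketched as Python generators: a stateless filter and a stateful, windowed counter over an unbounded stream.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator

# Illustrative minimal operators: a stateless algebraic filter and a
# stateful operator with a tumbling count window. Names are made up.

def filter_op(stream: Iterable[str], predicate: Callable[[str], bool]) -> Iterator[str]:
    """Stateless operator: forwards only the tuples matching the predicate."""
    for tup in stream:
        if predicate(tup):
            yield tup

def windowed_count(stream: Iterable[str], window: int) -> Iterator[Counter]:
    """Stateful operator: emits word counts at every tumbling window boundary."""
    state = Counter()
    for i, word in enumerate(stream, start=1):
        state[word] += 1          # per-key state
        if i % window == 0:       # window boundary: emit result, reset state
            yield state
            state = Counter()

words = ["a", "b", "a", "the", "a", "b"]
no_stopwords = filter_op(words, lambda w: w != "the")
for counts in windowed_count(no_stopwords, window=5):
    print(counts)                 # Counter({'a': 3, 'b': 2})
```

Because both operators are generators, they process tuples one at a time and never materialize the (potentially unbounded) stream.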

SLIDE 15

DISTRIBUTED STREAM PROCESSING

Data stream processing (DSP) was in the past considered a solution for very specific problems.

▪ Financial trading
▪ Logistics tracking
▪ Factory monitoring

Today the potential of DSP starts to be exploited in more general settings.

DSP : online processing = MR : batch processing

SLIDE 16

ELASTIC STREAM VS BATCH

Why is realizing elastic stream processing more difficult?

▪ Data in motion vs. data at rest
  ▪ variable data rates
  ▪ no obvious ways to characterize data content
▪ Latency-sensitive applications
  ▪ batch applications are typically throughput-oriented
▪ Long-term executions
  ▪ batch jobs are expected to be short-lived
  ▪ stream processing applications are designed to stay up and running for hours/days/weeks/months


SLIDE 21

HOW TO SCALE DSP

A few optimization strategies are known to deal with these issues:

▪ Fusion
▪ Fission
▪ Placement
▪ Load balancing

Hirzel et al. A Catalog of Stream Processing Optimizations. ACM CSUR, Vol. 46, No. 4, 2014

SLIDE 22

CURRENT SOLUTIONS

Most of the existing solutions apply a standard MAPE-K (Monitor, Analyze, Plan, Execute over shared Knowledge) model:

[Diagram: a controller running the Monitor, Analyze, Plan, Execute loop around a shared Knowledge base, acting on the DSP framework]
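A minimal sketch of such a MAPE-K loop; the metric names, thresholds, and one-step scaling policy are illustrative assumptions, not any framework's actual API.

```python
import statistics

# Shared knowledge base: monitored history plus analysis parameters.
KNOWLEDGE = {"cpu_history": [], "scale_out_thr": 0.8, "scale_in_thr": 0.3}

def monitor(cpu_sample):
    """MONITOR: collect a metric sample into the knowledge base."""
    KNOWLEDGE["cpu_history"].append(cpu_sample)

def analyze():
    """ANALYZE: detect sustained threshold violations on recent samples."""
    avg = statistics.mean(KNOWLEDGE["cpu_history"][-3:])
    if avg > KNOWLEDGE["scale_out_thr"]:
        return "scale-out"
    if avg < KNOWLEDGE["scale_in_thr"]:
        return "scale-in"
    return None

def plan(decision, parallelism):
    """PLAN: compute the new configuration (here: one-step scaling)."""
    return parallelism + 1 if decision == "scale-out" else max(1, parallelism - 1)

def execute(parallelism):
    """EXECUTE: a real controller would instruct the framework scheduler here."""
    print(f"redeploying with parallelism {parallelism}")

parallelism = 2
for cpu in (0.85, 0.90, 0.95):      # sustained overload observed
    monitor(cpu)
decision = analyze()
if decision:
    parallelism = plan(decision, parallelism)
    execute(parallelism)            # prints: redeploying with parallelism 3
```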

SLIDE 23

CURRENT SOLUTIONS

MONITOR
Performance data about the runtime execution of stream applications is gathered at several possible collection points:

▪ Host level
  ▪ memory/CPU utilization
  ▪ interprocess communications
▪ Network level
  ▪ communications among hosts in the cluster
  ▪ link congestion
▪ Application level
  ▪ metrics exposed by the framework (e.g. operator selectivity, buffer congestion, etc.)
  ▪ metrics exposed by software stacks (e.g. thread CPU utilization, heap size, etc.)

SLIDE 24

CURRENT SOLUTIONS

ANALYZE
Collected data is analyzed to take scale-in/scale-out decisions. Conditions are usually expressed as thresholds:

▪ Static: rely on domain knowledge or sysadmin expertise
▪ Dynamic: thresholds are automatically recomputed at runtime depending on monitored data

Thresholds can be checked (Heinze et al., 2014):

▪ Locally: they evaluate the current status of each single host
▪ Globally: they represent the system as a whole
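A sketch of the two threshold styles and the two checking scopes; the window of samples, the k factor, and the host names are illustrative assumptions.

```python
import statistics

STATIC_THR = 0.75   # static: fixed by domain knowledge / sysadmin expertise

def dynamic_threshold(samples, k=2.0):
    """Dynamic: recompute the threshold at runtime from monitored data,
    here as mean plus k standard deviations of recent load samples."""
    return statistics.mean(samples) + k * statistics.pstdev(samples)

def violated_locally(host_loads, thr):
    """Local check: evaluate the current status of each single host."""
    return [h for h, load in host_loads.items() if load > thr]

def violated_globally(host_loads, thr):
    """Global check: evaluate the system as a whole (average load)."""
    return statistics.mean(host_loads.values()) > thr

history = [0.42, 0.45, 0.40, 0.44, 0.43]
thr = dynamic_threshold(history)          # roughly 0.46 with these samples
loads = {"host-1": 0.41, "host-2": 0.92}
print(violated_locally(loads, thr))       # ['host-2']
print(violated_globally(loads, thr))      # True: average 0.665 exceeds thr
```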

SLIDE 25

CURRENT SOLUTIONS

ANALYZE
Collected data is analyzed to take scale-in/scale-out decisions. Decisions can be:

▪ Reactive: based only on conditions expressed on monitored data (i.e. on past events)
▪ Proactive: based on models that, from monitored data, predict future expected behaviors

Other fundamental factors to take into account:

▪ State migration
  ▪ partitioned vs. "monolithic" state
▪ Reconfiguration approach and its overhead
  ▪ Pause & Resume vs. Parallel Tracks

SLIDE 26

CURRENT SOLUTIONS

PLAN
A new configuration for the runtime is planned, aiming to maximize a metric or to satisfy some objective function:

▪ Heuristics (many different greedy approaches)
▪ Integer Linear Programming (Cardellini et al., 2017)
▪ Predictive performance modelling (Li et al., 2016)
▪ Game Theory (Mencagli, 2016)
▪ …

SLIDE 27

CURRENT SOLUTIONS

EXECUTE
This phase is usually framework-dependent:

▪ The framework scheduler must be instructed to deploy the plan output from the previous phase
▪ Depending on the framework you may incur extra latencies
  ▪ full re-scheduling vs. incremental deployment update

SLIDE 28

OPEN PROBLEMS

There are still a few open questions that require further research:

▪ Interplay between operator scaling and resource scaling [Lombardi et al., TPDS 2018]
▪ Interplay between operator parallelism and operator placement [Cardellini et al., CCP&E 2017]
▪ Interplay between operator parallelism and co-location (e.g. Spark/Flink)
▪ Interplay among applications sharing the same cluster
▪ Sensitivity to load imbalance [Gedik et al., 2014; Rivetti et al., 2015]
▪ Sensitivity to data distribution [Rivetti et al., 2015]
▪ Latency vs. throughput goals [Cardellini et al., CCP&E 2017; Luthra et al., DEBS 2018]

SLIDE 29

OPERATOR/RESOURCE SCALING

Current solutions typically address the problem of elastically scaling operators through fission:

▪ resources are considered static (i.e. over-provisioned), or…
▪ they assume that a new resource can be "magically" instantiated for each new operator instance (i.e. "joint scaling")

Why can't we consider operator instances and computing resources as distinct solutions to possibly different problems?

▪ Scale in/out operator instances through the DSP framework.
▪ Scale in/out computing resources through the cloud provider APIs.
▪ Use an autonomic controller to symbiotically manage both solutions.



SLIDE 36

OPERATOR/RESOURCE SCALING

ELYSIUM: Elastic Symbiotic Scaling of Operators and Resources in Stream Processing Systems [Lombardi et al., TPDS 2017]

▪ Learns input workload patterns using neural networks
▪ Learns how the application topology handles the workload
▪ Learns how each operator uses its assigned computing resources when the workload varies
▪ Learns how much overhead is produced by the DSP framework

The outputs from the profilers constitute the application description parameters.


SLIDE 40

OPERATOR/RESOURCE SCALING

ELYSIUM: Elastic Symbiotic Scaling of Operators and Resources in Stream Processing Systems [Lombardi et al., TPDS 2017]
(slide from Federico Lombardi's Ph.D. final defense, Monday, August 27, 2018)

Selectivity Profiler:
1. For each edge xy compute: α(xy) = out(xy) / in(x)
2. Store α in the Selectivity Table

Selectivity Table (Edge: Selectivity α): AB: 1.1, BC: 0.5, BD: 2.0, …

CPU Performance Profiler:
1. Build a CPU Performance Table per operator (Spout A, Bolt B, Bolt C), mapping input stream rate (tuple/sec) to CPU usage (MHz): 100: 200, 200: 400, …, 1000: 2000
2. Create and train an ANN for each operator: input stream → ANN → CPU
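A sketch of the two profilers under toy assumptions: the edge names and rates reproduce the example table above, and a piecewise-linear interpolation over the profiled table stands in for ELYSIUM's per-operator ANN.

```python
# Toy selectivity and CPU profilers. Edge names, rates, and the linear
# interpolation model are illustrative stand-ins.

def selectivity_table(edge_rates, operator_inputs):
    """For each edge xy compute alpha(xy) = out(xy) / in(x)."""
    return {edge: out_rate / operator_inputs[edge[0]]
            for edge, out_rate in edge_rates.items()}

# Observed tuple rates: in(x) per operator, out(xy) per edge.
operator_inputs = {"A": 100.0, "B": 110.0}
edge_rates = {"AB": 110.0, "BC": 55.0, "BD": 220.0}
alphas = selectivity_table(edge_rates, operator_inputs)
print(alphas)   # {'AB': 1.1, 'BC': 0.5, 'BD': 2.0}

# CPU performance table for one operator: input rate (tuple/s) -> CPU (MHz).
cpu_table = {100: 200, 200: 400, 1000: 2000}

def estimate_cpu(rate, table):
    """Piecewise-linear interpolation over the profiled table
    (a stand-in for the trained per-operator ANN)."""
    pts = sorted(table.items())
    for (r0, c0), (r1, c1) in zip(pts, pts[1:]):
        if r0 <= rate <= r1:
            return c0 + (c1 - c0) * (rate - r0) / (r1 - r0)
    return pts[-1][1]   # beyond the profiled range: clamp to the last value

print(estimate_cpu(150, cpu_table))   # 300.0
```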


SLIDE 43

OPERATOR/RESOURCE SCALING

ELYSIUM: Elastic Symbiotic Scaling of Operators and Resources in Stream Processing Systems [Lombardi et al., TPDS 2017]

▪ Estimates how many computing resources are needed by an operator instance in a given topology configuration
▪ Calculates a new scaling configuration and simulates its scheduling, allocating the minimum number of needed resources

SLIDE 44

OPERATOR/RESOURCE SCALING

ELYSIUM: Elastic Symbiotic Scaling of Operators and Resources in Stream Processing Systems [Lombardi et al., TPDS 2017]

1: function ComputeConfig(Estimator E, Scheduler S, List<Application> apps, List<int> input_loads)
2:   for all applications a_k in apps do
3:     for all operators o_i in a_k do
4:       ir_i ← E.getOperatorInputRate(input_loads_k, o_i)
5:       p_i ← 1
6:       while E.getOperatorInstanceCpuUsage(o_i, ir_i / p_i) > core_max_thr and p_i < max_parall(o_i) do
7:         p_i ← p_i + 1
8:       while E.getOperatorInstanceCpuUsage(o_i, ir_i / p_i) < core_min_thr and p_i > 1 do
9:         p_i ← p_i - 1
10:  worker_nodes ← 1
11:  while true do
12:    allocation ← S.allocate(apps, worker_nodes)
13:    cpu_usages ← E.getCpuUsages(allocation, input_loads)
14:    if for all x in cpu_usages: x ≤ cpu_max_thr then
15:      return worker_nodes, {p_i}
16:    worker_nodes ← worker_nodes + 1


For each application, the algorithm calculates the number of parallel instances of each operator needed to sustain the predicted input rate.


It then simulates the scheduling of the applications and increases the number of provisioned computing resources until no host is (predictably) overloaded.
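A runnable transcription of Algorithm 1. The CPU estimator and the round-robin scheduler below are toy stand-ins: ELYSIUM derives both from its learned application profiles. The thresholds and the example topology are illustrative.

```python
CORE_MAX_THR, CORE_MIN_THR, CPU_MAX_THR = 0.8, 0.3, 0.9

def instance_cpu(op, rate_per_instance):
    """Toy estimator: fraction of one core used by a single instance."""
    return rate_per_instance * op["cpu_per_tuple"]

def compute_config(apps, input_loads, cores_per_node=4):
    # Phase 1 (lines 2-9): per-operator parallelism for the predicted rate.
    parallelism = {}
    for app, load in zip(apps, input_loads):
        for op in app["operators"]:
            ir = load * op["selectivity"]          # operator input rate
            p = 1
            while instance_cpu(op, ir / p) > CORE_MAX_THR and p < op["max_parall"]:
                p += 1                             # scale out
            while instance_cpu(op, ir / p) < CORE_MIN_THR and p > 1:
                p -= 1                             # scale in
            parallelism[op["name"]] = p
    # Phase 2 (lines 10-16): grow the cluster until no simulated host
    # exceeds the CPU cap.
    worker_nodes = 1
    while True:
        node_load = [0.0] * worker_nodes
        i = 0
        for app, load in zip(apps, input_loads):
            for op in app["operators"]:
                ir = load * op["selectivity"]
                p = parallelism[op["name"]]
                for _ in range(p):                 # round-robin "scheduler"
                    node_load[i % worker_nodes] += instance_cpu(op, ir / p)
                    i += 1
        if all(u <= CPU_MAX_THR * cores_per_node for u in node_load):
            return worker_nodes, parallelism
        worker_nodes += 1

app = {"operators": [
    {"name": "A", "selectivity": 1.0, "cpu_per_tuple": 0.002, "max_parall": 8},
    {"name": "B", "selectivity": 1.1, "cpu_per_tuple": 0.004, "max_parall": 8},
]}
print(compute_config([app], input_loads=[1000]))   # (2, {'A': 3, 'B': 6})
```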


SLIDE 48

▪ [Figure] Comparison between real and estimated total CPU usage (in Hz) for all instances of the Counter and StopWordFilter operators, under sinusoidal input rates.

SLIDE 49

▪ [Figure] ELYSIUM vs. joint scaling on the Rolling-Top-K-Words topology, with synthetic and real input data.

SLIDE 50

▪ [Figure] ELYSIUM handling concurrent applications with different input rates.

SLIDE 51

▪ [Figure] ELYSIUM proactive vs. reactive behavior.

SLIDE 52

▪ [Figure] ELYSIUM resource saving.

SLIDE 53

SENSITIVITY TO LOAD IMBALANCE

Most solutions for elastic scaling assume that:

▪ input rate can vary in time
▪ input content is uniformly distributed

The second assumption is unrealistic.

▪ Think about the distribution of keys in obvious applications (e.g. rolling top-k words)

Non-uniform content distribution has a strong impact at runtime:

▪ Skewed memory footprint for partitioned-state operators
▪ Skewed load on parallelized operator instances
▪ Things get more complicated if computations with different complexities are performed for different tuples

SLIDE 54

SENSITIVITY TO LOAD IMBALANCE

Example:

▪ Implementation in Storm of the 3rd query of the DEBS 2013 Grand Challenge
▪ Running on an 8-core 2 GHz Intel Xeon (16 logical cores) with 32 GB of RAM

[Plot: throughput (tuples/s, roughly 1000 to 8000) as a function of the number of instances k (2 to 10), comparing Apache Storm's standard key grouping with a modulo mapping]


SLIDE 57

SENSITIVITY TO LOAD IMBALANCE

Solution: [Gedik et al., 2014]

▪ Ad-hoc one-to-one mapping of "heavy hitters" to operator instances; hashing for the rest.

[Figure: skewed key-value probability distribution (probabilities from 1.E-04 to 1.E-01 across roughly 1000 key values). A few key values may cause most of the unbalance: the 9 most frequent key values (heavy hitters HH1..HH9), which make up roughly 38% of the stream, are each assigned one-to-one to one of the four instances, while the non-frequent key values (sparse items) are hashed]
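The heavy-hitter mapping idea can be sketched as follows; the greedy least-loaded placement, the key names, and the frequencies are illustrative assumptions, not Gedik et al.'s exact construction.

```python
# Sketch: explicit one-to-one placement for detected heavy hitters,
# plain hashing for sparse items. All names/frequencies are made up.

N_INSTANCES = 4

def build_hh_mapping(heavy_hitters, n_instances):
    """Assign heavy hitters, heaviest first, to the least-loaded instance."""
    load = [0.0] * n_instances
    mapping = {}
    for key, freq in sorted(heavy_hitters.items(), key=lambda kv: -kv[1]):
        target = load.index(min(load))   # least-loaded instance so far
        mapping[key] = target
        load[target] += freq
    return mapping

heavy_hitters = {"HH1": 0.09, "HH2": 0.07, "HH3": 0.05, "HH4": 0.05,
                 "HH5": 0.04, "HH6": 0.03, "HH7": 0.02, "HH8": 0.02, "HH9": 0.01}
HH_MAP = build_hh_mapping(heavy_hitters, N_INSTANCES)

def route(key):
    """Heavy hitters use the explicit map; everything else is hashed."""
    if key in HH_MAP:
        return HH_MAP[key]
    return hash(key) % N_INSTANCES

print(route("HH1"))          # instance chosen by the explicit mapping
print(route("rare-key"))     # instance chosen by hashing
```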

SLIDE 58

SENSITIVITY TO LOAD IMBALANCE

Solution: [Gedik et al., 2014]

▪ Technical issues
  ▪ Large number of keys: how to keep track of all their frequencies in the stream?
  ▪ While achieving balance, the system should also maintain a low migration cost
▪ Solution: use lossy counting to keep track of key frequencies
  ▪ count elements in rounds
  ▪ remove less frequent elements at the end of each round
  ▪ Space Saving has tighter theoretical bounds on memory complexity
▪ Use several counters over tumbling windows to emulate a sliding window
  ▪ can keep track of load distributions evolving in time
  ▪ manages state migration
▪ Problem: is the impact of non-frequent keys (sparse items) really negligible?
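A compact sketch of lossy counting as described above (count in rounds, drop infrequent entries at each round boundary); the epsilon value and the example stream are illustrative.

```python
import math

class LossyCounter:
    """Lossy counting (Manku & Motwani): approximate key frequencies
    with memory bounded by the error parameter epsilon."""

    def __init__(self, epsilon=0.1):
        self.width = math.ceil(1 / epsilon)   # round (bucket) length
        self.counts = {}                      # key -> (count, max_error)
        self.n = 0                            # tuples seen so far
        self.bucket = 1                       # current round id

    def add(self, key):
        self.n += 1
        count, err = self.counts.get(key, (0, self.bucket - 1))
        self.counts[key] = (count + 1, err)
        if self.n % self.width == 0:          # end of round:
            self.counts = {k: (c, e) for k, (c, e) in self.counts.items()
                           if c + e > self.bucket}   # drop infrequent keys
            self.bucket += 1

    def frequent(self, support):
        """Keys whose true frequency may exceed support * n."""
        return {k for k, (c, _) in self.counts.items()
                if c >= (support - 1 / self.width) * self.n}

lc = LossyCounter(epsilon=0.1)
stream = ["a"] * 50 + ["b"] * 30 + [f"rare{i}" for i in range(20)]
for key in stream:
    lc.add(key)
print(lc.frequent(0.25))   # {'a', 'b'}: the rare keys were pruned
```

All 20 distinct `rare*` keys are discarded at round boundaries, so memory stays proportional to 1/epsilon rather than to the number of distinct keys.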


SLIDE 63

SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., DEBS 2015]

▪Ad-hoc mapping of heavy hitters AND groups of sparse items.

Figure: the resulting ad-hoc mapping. The nine heavy hitters HH1–HH9 are placed individually and the sparse items are collapsed into groups SI1–SI8; the four instances receive 124, 497, 374 and 5 key values respectively, compared with a worst-case partitioning into groups of 124 key values each.
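The two-step mapping described above (individual placement of heavy hitters, many-to-one grouping of sparse items) can be sketched as a greedy assignment. This is a minimal illustrative sketch, not the DKG implementation from the paper; `dkg_mapping` and its parameters are hypothetical names:

```python
import hashlib

def dkg_mapping(freqs, heavy_threshold, num_buckets, k):
    """Greedy sketch of a DKG-style mapping: heavy hitters are placed
    individually, sparse keys are first collapsed into buckets, then
    all items are greedily assigned to the least-loaded of k instances."""
    heavy = {key: f for key, f in freqs.items() if f >= heavy_threshold}
    # Many-to-one step: hash every sparse key into one of num_buckets groups.
    buckets = [0.0] * num_buckets
    for key, f in freqs.items():
        if key not in heavy:
            b = int(hashlib.md5(str(key).encode()).hexdigest(), 16) % num_buckets
            buckets[b] += f
    # Ad-hoc step: assign heavy hitters and sparse-item groups, heaviest
    # first, to whichever instance currently carries the least load.
    items = [(f, ("HH", key)) for key, f in heavy.items()]
    items += [(w, ("SI", b)) for b, w in enumerate(buckets)]
    loads = [0.0] * k
    mapping = {}
    for weight, item in sorted(items, reverse=True):
        target = min(range(k), key=loads.__getitem__)
        mapping[item] = target
        loads[target] += weight
    return mapping, loads
```

Because heavy hitters are assigned explicitly instead of being hashed, a single frequent key can no longer force the worst-case partitioning shown in the figure.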

slide-64
SLIDE 64


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., DEBS 2015]

▪Ad-hoc mapping of heavy hitters AND groups of sparse items.

Figure: DKG in three phases. LEARN: tuples from the data source feed a Space Saving sketch holding ⟨tuple, counter⟩ entries for heavy hitters, plus ⟨bucket, counter⟩ entries for sparse items. BUILD: a scheduler derives the ⟨SI, HH⟩ mapping (size μ × k ⌈1/ε⌉). DEPLOY: key grouping based on ⟨SI, HH⟩ routes tuples to the operator instances O0 … Ok−1.
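The LEARN phase uses the Space Saving algorithm to identify heavy hitters with bounded memory. A minimal sketch of its update rule (standard Space Saving, not DKG's actual code):

```python
def space_saving_update(counters, item, capacity):
    """One update of the Space Saving heavy-hitter sketch: keep at most
    `capacity` <tuple, counter> pairs; when full, the entry with the
    minimum counter is evicted and its count is inherited (+1) by the
    new item, bounding each estimate's overcount by that minimum."""
    if item in counters:
        counters[item] += 1
    elif len(counters) < capacity:
        counters[item] = 1
    else:
        victim = min(counters, key=counters.get)
        counters[item] = counters.pop(victim) + 1
    return counters
```

With capacity Θ(1/ε) this guarantees that every key with frequency above ε·n is retained, which is exactly what the BUILD phase needs to separate heavy hitters from sparse items.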

slide-65
SLIDE 65


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., DEBS 2015]

▪Mapping sparse items as well actually makes a difference

Figure: imbalance 𝜇 (%) as a function of the Zipfian exponent 𝛽 with k = 10 (𝜄 = 0.1 and 𝜈 = 2); series: DKG, Worst Case DKG, WOSIM, Worst Case WOSIM.

slide-66
SLIDE 66


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., DEBS 2015]

▪Mapping sparse items as well actually makes a difference

Figure: throughput (tuples/s) as a function of the number of instances k (2–10); series: Modulo, DKG, Apache Storm standard key grouping.

slide-67
SLIDE 67


SENSITIVITY TO LOAD IMBALANCE

With stateless operators the same happens if computation latency depends on the tuple content:

Figure: the same stream scheduled on two instances OP 𝑃1 and OP 𝑃2 with round-robin versus full knowledge of execution times. Tuples b1…b4 have a long execution time, tuples c1…c5 a short one; exploiting this knowledge yields a completion time gain. Online full knowledge is unfeasible → approximation?

slide-68
SLIDE 68


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., 2016]

▪Dynamically schedule incoming tuples

Figure: OSG (Online Shuffle Grouping). The source operator OP S runs a JSQ (Join Shortest Queue) scheduler that schedules tuples to the instances OP 𝑃1, OP 𝑃2, OP 𝑃3, using O((1/𝜁) log(1/𝜀) (log 𝑛 + log 𝑜)) memory.
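The JSQ policy over estimated execution times can be sketched as follows. This is an illustrative approximation of OSG's idea, not the paper's implementation; `estimate_cost` stands in for its sketch-based per-tuple execution-time estimator (hypothetical name):

```python
class JSQScheduler:
    """Sketch of OSG-style scheduling: each tuple is sent to the instance
    whose estimated pending work is smallest (Join Shortest Queue over
    estimated execution times rather than raw queue lengths)."""

    def __init__(self, num_instances, estimate_cost):
        self.pending = [0.0] * num_instances  # estimated outstanding work
        self.estimate_cost = estimate_cost

    def schedule(self, tup):
        # Pick the instance with the least estimated pending work.
        target = min(range(len(self.pending)), key=self.pending.__getitem__)
        self.pending[target] += self.estimate_cost(tup)
        return target

    def completed(self, instance, tup):
        # Release the tuple's estimated work once the instance finishes it.
        self.pending[instance] -= self.estimate_cost(tup)
```

Unlike round-robin, this keeps long tuples from piling up on one instance, approximating the full-knowledge schedule whenever the cost estimates are accurate.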

slide-69
SLIDE 69


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., 2016]

▪Dynamically schedule incoming tuples

Figure: average completion time (ms) over different frequency distributions (uniform, zipf-0.5 … zipf-3); series: Full Knowledge, OSG, Round-Robin.

slide-70
SLIDE 70

DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI

Cyber Intelligence and information Security

CIS Sapienza

THANKS!

Time for questions?