

SLIDE 1

DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI

Cyber Intelligence and Information Security

CIS Sapienza

THE LONG ROAD TOWARDS ELASTIC DISTRIBUTED STREAM PROCESSING

Leonardo Querzoni querzoni@diag.uniroma1.it Auto-DaSP - Turin, August 28th 2018

SLIDE 2

THE LONG ROAD TOWARDS ELASTIC DISTRIBUTED STREAM PROCESSING - AUTODASP 2018

ELASTIC COMPUTING

"[...] defines elasticity as the configurability and expandability of the solution [...] Centrally, it is the ability to scale up and scale down capacity based on subscriber workload."

OCDA. Master Usage Model: Compute Infrastructure as a Service. Tech. rep., Open Data Center Alliance (OCDA), 2012

”Rapid elasticity: Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.”


MELL, P., AND GRANCE, T. The NIST Definition of Cloud Computing. Tech. rep., U.S. National Institute of Standards and Technology (NIST), SP 800-145, 2011

”Elasticity is basically a ’rename’ of scalability [...]” and ”removes any manual labor needed to increase or reduce capacity”


SCHOUTEN, E. (IBM). Rapid Elasticity and the Cloud, September 2012

"the quantifiable ability to manage, measure, predict and adapt responsiveness of an application based on real time demands placed on an infrastructure using a combination of local and remote computing resources."


COHEN, R. Defining Elastic Computing, September 2009.

”Elasticity measures the ability of the cloud to map a single user request to different resources.”


WOLSKI, R. Cloud Computing and Open Source: Watching Hype meet Reality, May 2011



SLIDE 11

ELASTIC COMPUTING

[Chart: load over time. A static provisioning level set against a varying workload wastes resources in the troughs (overprovisioning) and can still be exceeded at the peaks (underprovisioning); an elastic provisioning curve tracks the workload instead]

SLIDE 12

ELASTIC COMPUTING

Elastic computing drove the success of cloud providers

▪ Virtually infinite resources
▪ On-demand provisioning
▪ Near-instant availability
▪ Automatic scale-out
▪ Pay-what-you-use

SLIDE 13

ELASTIC COMPUTING

Elastic processing of big-data is today a reality

SLIDE 14

DISTRIBUTED STREAM PROCESSING

Data Stream Processing Engine:

▪ continuously calculate results for persistent queries
▪ on (potentially) unbounded data streams
▪ using operators: algebraic (filters, join, aggregation) or user defined
▪ stateless/stateful

[Diagram: a directed graph of operators p1..p5 processing events/tuples from sources A and B, with access to a DB and a KB]
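As a language-neutral illustration (not tied to any particular engine), the operator types listed above can be sketched as Python generators: a stateless filter and a stateful, windowed counter over an unbounded stream.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator

# Illustrative minimal operators: a stateless algebraic filter and a
# stateful operator with a tumbling count window. Names are made up.

def filter_op(stream: Iterable[str], predicate: Callable[[str], bool]) -> Iterator[str]:
    """Stateless operator: forwards only the tuples matching the predicate."""
    for tup in stream:
        if predicate(tup):
            yield tup

def windowed_count(stream: Iterable[str], window: int) -> Iterator[Counter]:
    """Stateful operator: emits word counts at every tumbling window boundary."""
    state = Counter()
    for i, word in enumerate(stream, start=1):
        state[word] += 1          # per-key state
        if i % window == 0:       # window boundary: emit result, reset state
            yield state
            state = Counter()

words = ["a", "b", "a", "the", "a", "b"]
no_stopwords = filter_op(words, lambda w: w != "the")
for counts in windowed_count(no_stopwords, window=5):
    print(counts)                 # Counter({'a': 3, 'b': 2})
```

Because both operators are generators, they process tuples one at a time and never materialize the (potentially unbounded) stream.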

SLIDE 15

DISTRIBUTED STREAM PROCESSING

Data stream processing (DSP) was in the past considered a solution for very specific problems.

▪ Financial trading
▪ Logistics tracking
▪ Factory monitoring

Today the potential of DSP starts to be exploited in more general settings.

DSP : online processing = MR : batch processing

SLIDE 16

ELASTIC STREAM VS BATCH

Why is realizing elastic stream processing more difficult?

▪ Data in motion vs. data at rest
  ▪ variable data rates
  ▪ no obvious ways to characterize data content
▪ Latency-sensitive applications
  ▪ batch applications are typically throughput-oriented
▪ Long-term executions
  ▪ batch jobs are expected to be short-lived
  ▪ stream processing applications are designed to stay up and running for hours/days/weeks/months


SLIDE 21

HOW TO SCALE DSP

A few optimization strategies are known to deal with these issues:

▪ Fusion
▪ Fission
▪ Placement
▪ Load balancing

Hirzel et al. A Catalog of Stream Processing Optimizations. ACM CSUR, Vol. 46, No. 4, 2014

SLIDE 22

CURRENT SOLUTIONS

Most of the existing solutions apply a standard MAPE-K (Monitor, Analyze, Plan, Execute over shared Knowledge) model:

[Diagram: a controller running the Monitor, Analyze, Plan, Execute loop around a shared Knowledge base, acting on the DSP framework]
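A minimal sketch of such a MAPE-K loop; the metric names, thresholds, and one-step scaling policy are illustrative assumptions, not any framework's actual API.

```python
import statistics

# Shared knowledge base: monitored history plus analysis parameters.
KNOWLEDGE = {"cpu_history": [], "scale_out_thr": 0.8, "scale_in_thr": 0.3}

def monitor(cpu_sample):
    """MONITOR: collect a metric sample into the knowledge base."""
    KNOWLEDGE["cpu_history"].append(cpu_sample)

def analyze():
    """ANALYZE: detect sustained threshold violations on recent samples."""
    avg = statistics.mean(KNOWLEDGE["cpu_history"][-3:])
    if avg > KNOWLEDGE["scale_out_thr"]:
        return "scale-out"
    if avg < KNOWLEDGE["scale_in_thr"]:
        return "scale-in"
    return None

def plan(decision, parallelism):
    """PLAN: compute the new configuration (here: one-step scaling)."""
    return parallelism + 1 if decision == "scale-out" else max(1, parallelism - 1)

def execute(parallelism):
    """EXECUTE: a real controller would instruct the framework scheduler here."""
    print(f"redeploying with parallelism {parallelism}")

parallelism = 2
for cpu in (0.85, 0.90, 0.95):      # sustained overload observed
    monitor(cpu)
decision = analyze()
if decision:
    parallelism = plan(decision, parallelism)
    execute(parallelism)            # prints: redeploying with parallelism 3
```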

SLIDE 23

CURRENT SOLUTIONS

MONITOR
Performance data about the runtime execution of stream applications is gathered at several possible collection points:

▪ Host level
  ▪ memory/CPU utilization
  ▪ interprocess communications
▪ Network level
  ▪ communications among hosts in the cluster
  ▪ link congestion
▪ Application level
  ▪ metrics exposed by the framework (e.g. operator selectivity, buffer congestion, etc.)
  ▪ metrics exposed by software stacks (e.g. thread CPU utilization, heap size, etc.)

SLIDE 24

CURRENT SOLUTIONS

ANALYZE
Collected data is analyzed to take scale-in/scale-out decisions. Conditions are usually expressed as thresholds:

▪ Static: rely on domain knowledge or sysadmin expertise
▪ Dynamic: thresholds are automatically recomputed at runtime depending on monitored data

Thresholds can be checked (Heinze et al., 2014):

▪ Locally: they evaluate the current status of each single host
▪ Globally: they represent the system as a whole
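A sketch of the two threshold styles and the two checking scopes; the window of samples, the k factor, and the host names are illustrative assumptions.

```python
import statistics

STATIC_THR = 0.75   # static: fixed by domain knowledge / sysadmin expertise

def dynamic_threshold(samples, k=2.0):
    """Dynamic: recompute the threshold at runtime from monitored data,
    here as mean plus k standard deviations of recent load samples."""
    return statistics.mean(samples) + k * statistics.pstdev(samples)

def violated_locally(host_loads, thr):
    """Local check: evaluate the current status of each single host."""
    return [h for h, load in host_loads.items() if load > thr]

def violated_globally(host_loads, thr):
    """Global check: evaluate the system as a whole (average load)."""
    return statistics.mean(host_loads.values()) > thr

history = [0.42, 0.45, 0.40, 0.44, 0.43]
thr = dynamic_threshold(history)          # roughly 0.46 with these samples
loads = {"host-1": 0.41, "host-2": 0.92}
print(violated_locally(loads, thr))       # ['host-2']
print(violated_globally(loads, thr))      # True: average 0.665 exceeds thr
```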

SLIDE 25

CURRENT SOLUTIONS

ANALYZE
Collected data is analyzed to take scale-in/scale-out decisions. Decisions can be:

▪ Reactive: based only on conditions expressed on monitored data (i.e. on past events)
▪ Proactive: based on models that, from monitored data, predict future expected behaviors

Other fundamental factors to take into account:

▪ State migration
  ▪ partitioned vs. "monolithic" state
▪ Reconfiguration approach and its overhead
  ▪ Pause & Resume vs. Parallel Tracks

SLIDE 26

CURRENT SOLUTIONS

PLAN
A new configuration for the runtime is planned, aiming to maximize a metric or to satisfy some objective function:

▪ Heuristics (many different greedy approaches)
▪ Integer Linear Programming (Cardellini et al., 2017)
▪ Predictive performance modelling (Li et al., 2016)
▪ Game Theory (Mencagli, 2016)
▪ …

SLIDE 27

CURRENT SOLUTIONS

EXECUTE
This phase is usually framework-dependent:

▪ The framework scheduler must be instructed to deploy the plan output from the previous phase
▪ Depending on the framework you may incur extra latencies
  ▪ full re-scheduling vs. incremental deployment update

SLIDE 28

OPEN PROBLEMS

There are still a few open questions that require further research:

▪ Interplay between operator scaling and resource scaling [Lombardi et al., TPDS 2018]
▪ Interplay between operator parallelism and operator placement [Cardellini et al., CCP&E 2017]
▪ Interplay between operator parallelism and co-location (e.g. Spark/Flink)
▪ Interplay among applications sharing the same cluster
▪ Sensitivity to load imbalance [Gedik et al., 2014; Rivetti et al., 2015]
▪ Sensitivity to data distribution [Rivetti et al., 2015]
▪ Latency vs. throughput goals [Cardellini et al., CCP&E 2017; Luthra et al., DEBS 2018]

SLIDE 29

OPERATOR/RESOURCE SCALING

Current solutions typically address the problem of elastically scaling operators through fission:

▪ resources are considered static (i.e. over-provisioned), or…
▪ they assume that a new resource can be "magically" instantiated for each new operator instance (i.e. "joint scaling")

Why can't we consider operator instances and computing resources as distinct solutions to possibly different problems?

▪ Scale in/out operator instances through the DSP framework.
▪ Scale in/out computing resources through the cloud provider APIs.
▪ Use an autonomic controller to symbiotically manage both solutions.



SLIDE 36

OPERATOR/RESOURCE SCALING

ELYSIUM: Elastic Symbiotic Scaling of Operators and Resources in Stream Processing Systems [Lombardi et al., TPDS 2017]

▪ Learns input workload patterns using neural networks
▪ Learns how the application topology handles the workload
▪ Learns how each operator uses its assigned computing resources when the workload varies
▪ Learns how much overhead is produced by the DSP framework

The outputs from the profilers constitute the application description parameters.


SLIDE 40

OPERATOR/RESOURCE SCALING

ELYSIUM: Elastic Symbiotic Scaling of Operators and Resources in Stream Processing Systems [Lombardi et al., TPDS 2017]
(slide from Federico Lombardi's Ph.D. final defense, Monday, August 27, 2018)

Selectivity Profiler:
1. For each edge xy compute: α(xy) = out(xy) / in(x)
2. Store α in the Selectivity Table

Selectivity Table (Edge: Selectivity α): AB: 1.1, BC: 0.5, BD: 2.0, …

CPU Performance Profiler:
1. Build a CPU Performance Table per operator (Spout A, Bolt B, Bolt C), mapping input stream rate (tuple/sec) to CPU usage (MHz): 100: 200, 200: 400, …, 1000: 2000
2. Create and train an ANN for each operator: input stream → ANN → CPU
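A sketch of the two profilers under toy assumptions: the edge names and rates reproduce the example table above, and a piecewise-linear interpolation over the profiled table stands in for ELYSIUM's per-operator ANN.

```python
# Toy selectivity and CPU profilers. Edge names, rates, and the linear
# interpolation model are illustrative stand-ins.

def selectivity_table(edge_rates, operator_inputs):
    """For each edge xy compute alpha(xy) = out(xy) / in(x)."""
    return {edge: out_rate / operator_inputs[edge[0]]
            for edge, out_rate in edge_rates.items()}

# Observed tuple rates: in(x) per operator, out(xy) per edge.
operator_inputs = {"A": 100.0, "B": 110.0}
edge_rates = {"AB": 110.0, "BC": 55.0, "BD": 220.0}
alphas = selectivity_table(edge_rates, operator_inputs)
print(alphas)   # {'AB': 1.1, 'BC': 0.5, 'BD': 2.0}

# CPU performance table for one operator: input rate (tuple/s) -> CPU (MHz).
cpu_table = {100: 200, 200: 400, 1000: 2000}

def estimate_cpu(rate, table):
    """Piecewise-linear interpolation over the profiled table
    (a stand-in for the trained per-operator ANN)."""
    pts = sorted(table.items())
    for (r0, c0), (r1, c1) in zip(pts, pts[1:]):
        if r0 <= rate <= r1:
            return c0 + (c1 - c0) * (rate - r0) / (r1 - r0)
    return pts[-1][1]   # beyond the profiled range: clamp to the last value

print(estimate_cpu(150, cpu_table))   # 300.0
```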


SLIDE 43

OPERATOR/RESOURCE SCALING

ELYSIUM: Elastic Symbiotic Scaling of Operators and Resources in Stream Processing Systems [Lombardi et al., TPDS 2017]

▪ Estimates how many computing resources are needed by an operator instance in a given topology configuration
▪ Calculates a new scaling configuration and simulates its scheduling, allocating the minimum number of needed resources

SLIDE 44

OPERATOR/RESOURCE SCALING

ELYSIUM: Elastic Symbiotic Scaling of Operators and Resources in Stream Processing Systems [Lombardi et al., TPDS 2017]

1: function ComputeConfig(Estimator E, Scheduler S, List<Application> apps, List<int> input_loads)
2:   for all applications a_k in apps do
3:     for all operators o_i in a_k do
4:       ir_i ← E.getOperatorInputRate(input_loads_k, o_i)
5:       p_i ← 1
6:       while E.getOperatorInstanceCpuUsage(o_i, ir_i / p_i) > core_max_thr and p_i < max_parall(o_i) do
7:         p_i ← p_i + 1
8:       while E.getOperatorInstanceCpuUsage(o_i, ir_i / p_i) < core_min_thr and p_i > 1 do
9:         p_i ← p_i - 1
10:  worker_nodes ← 1
11:  while true do
12:    allocation ← S.allocate(apps, worker_nodes)
13:    cpu_usages ← E.getCpuUsages(allocation, input_loads)
14:    if for all x in cpu_usages: x ≤ cpu_max_thr then
15:      return worker_nodes, {p_i}
16:    worker_nodes ← worker_nodes + 1


For each application, the algorithm calculates the number of parallel instances of each operator needed to sustain the predicted input rate.


It then simulates the scheduling of the applications and increases the number of provisioned computing resources until no host is (predictably) overloaded.
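A runnable transcription of Algorithm 1. The CPU estimator and the round-robin scheduler below are toy stand-ins: ELYSIUM derives both from its learned application profiles. The thresholds and the example topology are illustrative.

```python
CORE_MAX_THR, CORE_MIN_THR, CPU_MAX_THR = 0.8, 0.3, 0.9

def instance_cpu(op, rate_per_instance):
    """Toy estimator: fraction of one core used by a single instance."""
    return rate_per_instance * op["cpu_per_tuple"]

def compute_config(apps, input_loads, cores_per_node=4):
    # Phase 1 (lines 2-9): per-operator parallelism for the predicted rate.
    parallelism = {}
    for app, load in zip(apps, input_loads):
        for op in app["operators"]:
            ir = load * op["selectivity"]          # operator input rate
            p = 1
            while instance_cpu(op, ir / p) > CORE_MAX_THR and p < op["max_parall"]:
                p += 1                             # scale out
            while instance_cpu(op, ir / p) < CORE_MIN_THR and p > 1:
                p -= 1                             # scale in
            parallelism[op["name"]] = p
    # Phase 2 (lines 10-16): grow the cluster until no simulated host
    # exceeds the CPU cap.
    worker_nodes = 1
    while True:
        node_load = [0.0] * worker_nodes
        i = 0
        for app, load in zip(apps, input_loads):
            for op in app["operators"]:
                ir = load * op["selectivity"]
                p = parallelism[op["name"]]
                for _ in range(p):                 # round-robin "scheduler"
                    node_load[i % worker_nodes] += instance_cpu(op, ir / p)
                    i += 1
        if all(u <= CPU_MAX_THR * cores_per_node for u in node_load):
            return worker_nodes, parallelism
        worker_nodes += 1

app = {"operators": [
    {"name": "A", "selectivity": 1.0, "cpu_per_tuple": 0.002, "max_parall": 8},
    {"name": "B", "selectivity": 1.1, "cpu_per_tuple": 0.004, "max_parall": 8},
]}
print(compute_config([app], input_loads=[1000]))   # (2, {'A': 3, 'B': 6})
```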


SLIDE 48

▪ [Figure] Comparison between real and estimated total CPU usage (in Hz) for all instances of the Counter and StopWordFilter operators, under sinusoidal input rates.

SLIDE 49

▪ [Figure] ELYSIUM vs. joint scaling on the Rolling-Top-K-Words topology, with synthetic and real input data.

SLIDE 50

▪ [Figure] ELYSIUM handling concurrent applications with different input rates.

SLIDE 51

▪ [Figure] ELYSIUM proactive vs. reactive behavior.

SLIDE 52

▪ [Figure] ELYSIUM resource saving.

SLIDE 53

SENSITIVITY TO LOAD IMBALANCE

Most solutions for elastic scaling assume that:

▪ input rate can vary in time
▪ input content is uniformly distributed

The second assumption is unrealistic.

▪ Think about the distribution of keys in obvious applications (e.g. rolling top-k words)

Non-uniform content distribution has a strong impact at runtime:

▪ Skewed memory footprint for partitioned-state operators
▪ Skewed load on parallelized operator instances
▪ Things get more complicated if computations with different complexities are performed for different tuples

SLIDE 54

SENSITIVITY TO LOAD IMBALANCE

Example:

▪ Implementation in Storm of the 3rd query of the DEBS 2013 Grand Challenge
▪ Running on an 8-core 2 GHz Intel Xeon (16 logical cores) with 32 GB of RAM

[Plot: throughput (tuples/s, roughly 1000 to 8000) as a function of the number of instances k (2 to 10), comparing Apache Storm's standard key grouping with a modulo mapping]


SLIDE 57

SENSITIVITY TO LOAD IMBALANCE

Solution: [Gedik et al., 2014]

▪ Ad-hoc one-to-one mapping of "heavy hitters" to operator instances; hashing for the rest.

[Figure: skewed key-value probability distribution (probabilities from 1.E-04 to 1.E-01 across roughly 1000 key values). A few key values may cause most of the unbalance: the 9 most frequent key values (heavy hitters HH1..HH9), which make up roughly 38% of the stream, are each assigned one-to-one to one of the four instances, while the non-frequent key values (sparse items) are hashed]
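The heavy-hitter mapping idea can be sketched as follows; the greedy least-loaded placement, the key names, and the frequencies are illustrative assumptions, not Gedik et al.'s exact construction.

```python
# Sketch: explicit one-to-one placement for detected heavy hitters,
# plain hashing for sparse items. All names/frequencies are made up.

N_INSTANCES = 4

def build_hh_mapping(heavy_hitters, n_instances):
    """Assign heavy hitters, heaviest first, to the least-loaded instance."""
    load = [0.0] * n_instances
    mapping = {}
    for key, freq in sorted(heavy_hitters.items(), key=lambda kv: -kv[1]):
        target = load.index(min(load))   # least-loaded instance so far
        mapping[key] = target
        load[target] += freq
    return mapping

heavy_hitters = {"HH1": 0.09, "HH2": 0.07, "HH3": 0.05, "HH4": 0.05,
                 "HH5": 0.04, "HH6": 0.03, "HH7": 0.02, "HH8": 0.02, "HH9": 0.01}
HH_MAP = build_hh_mapping(heavy_hitters, N_INSTANCES)

def route(key):
    """Heavy hitters use the explicit map; everything else is hashed."""
    if key in HH_MAP:
        return HH_MAP[key]
    return hash(key) % N_INSTANCES

print(route("HH1"))          # instance chosen by the explicit mapping
print(route("rare-key"))     # instance chosen by hashing
```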

SLIDE 58

SENSITIVITY TO LOAD IMBALANCE

Solution: [Gedik et al., 2014]

▪ Technical issues
  ▪ Large number of keys: how to keep track of all their frequencies in the stream?
  ▪ While achieving balance, the system should also maintain a low migration cost
▪ Solution: use lossy counting to keep track of key frequencies
  ▪ count elements in rounds
  ▪ remove less frequent elements at the end of each round
  ▪ Space Saving has tighter theoretical bounds on memory complexity
▪ Use several counters over tumbling windows to emulate a sliding window
  ▪ can keep track of load distributions evolving in time
  ▪ manages state migration
▪ Problem: is the impact of non-frequent keys (sparse items) really negligible?
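A compact sketch of lossy counting as described above (count in rounds, drop infrequent entries at each round boundary); the epsilon value and the example stream are illustrative.

```python
import math

class LossyCounter:
    """Lossy counting (Manku & Motwani): approximate key frequencies
    with memory bounded by the error parameter epsilon."""

    def __init__(self, epsilon=0.1):
        self.width = math.ceil(1 / epsilon)   # round (bucket) length
        self.counts = {}                      # key -> (count, max_error)
        self.n = 0                            # tuples seen so far
        self.bucket = 1                       # current round id

    def add(self, key):
        self.n += 1
        count, err = self.counts.get(key, (0, self.bucket - 1))
        self.counts[key] = (count + 1, err)
        if self.n % self.width == 0:          # end of round:
            self.counts = {k: (c, e) for k, (c, e) in self.counts.items()
                           if c + e > self.bucket}   # drop infrequent keys
            self.bucket += 1

    def frequent(self, support):
        """Keys whose true frequency may exceed support * n."""
        return {k for k, (c, _) in self.counts.items()
                if c >= (support - 1 / self.width) * self.n}

lc = LossyCounter(epsilon=0.1)
stream = ["a"] * 50 + ["b"] * 30 + [f"rare{i}" for i in range(20)]
for key in stream:
    lc.add(key)
print(lc.frequent(0.25))   # {'a', 'b'}: the rare keys were pruned
```

All 20 distinct `rare*` keys are discarded at round boundaries, so memory stays proportional to 1/epsilon rather than to the number of distinct keys.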


SLIDE 63

SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., DEBS 2015]

▪Ad-hoc mapping of heavy hitters AND groups of sparse items.

Figure: the resulting ad-hoc mapping. The nine heavy hitters HH1–HH9 are placed individually and the sparse items are collapsed into groups SI1–SI8; the four instances receive 124, 497, 374 and 5 key values respectively, compared with a worst-case partitioning into groups of 124 key values each.
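The two-step mapping described above (individual placement of heavy hitters, many-to-one grouping of sparse items) can be sketched as a greedy assignment. This is a minimal illustrative sketch, not the DKG implementation from the paper; `dkg_mapping` and its parameters are hypothetical names:

```python
import hashlib

def dkg_mapping(freqs, heavy_threshold, num_buckets, k):
    """Greedy sketch of a DKG-style mapping: heavy hitters are placed
    individually, sparse keys are first collapsed into buckets, then
    all items are greedily assigned to the least-loaded of k instances."""
    heavy = {key: f for key, f in freqs.items() if f >= heavy_threshold}
    # Many-to-one step: hash every sparse key into one of num_buckets groups.
    buckets = [0.0] * num_buckets
    for key, f in freqs.items():
        if key not in heavy:
            b = int(hashlib.md5(str(key).encode()).hexdigest(), 16) % num_buckets
            buckets[b] += f
    # Ad-hoc step: assign heavy hitters and sparse-item groups, heaviest
    # first, to whichever instance currently carries the least load.
    items = [(f, ("HH", key)) for key, f in heavy.items()]
    items += [(w, ("SI", b)) for b, w in enumerate(buckets)]
    loads = [0.0] * k
    mapping = {}
    for weight, item in sorted(items, reverse=True):
        target = min(range(k), key=loads.__getitem__)
        mapping[item] = target
        loads[target] += weight
    return mapping, loads
```

Because heavy hitters are assigned explicitly instead of being hashed, a single frequent key can no longer force the worst-case partitioning shown in the figure.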

slide-64
SLIDE 64


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., DEBS 2015]

▪Ad-hoc mapping of heavy hitters AND groups of sparse items.

Figure: DKG in three phases. LEARN: tuples from the data source feed a Space Saving sketch holding ⟨tuple, counter⟩ entries for heavy hitters, plus ⟨bucket, counter⟩ entries for sparse items. BUILD: a scheduler derives the ⟨SI, HH⟩ mapping (size μ × k ⌈1/ε⌉). DEPLOY: key grouping based on ⟨SI, HH⟩ routes tuples to the operator instances O0 … Ok−1.
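The LEARN phase uses the Space Saving algorithm to identify heavy hitters with bounded memory. A minimal sketch of its update rule (standard Space Saving, not DKG's actual code):

```python
def space_saving_update(counters, item, capacity):
    """One update of the Space Saving heavy-hitter sketch: keep at most
    `capacity` <tuple, counter> pairs; when full, the entry with the
    minimum counter is evicted and its count is inherited (+1) by the
    new item, bounding each estimate's overcount by that minimum."""
    if item in counters:
        counters[item] += 1
    elif len(counters) < capacity:
        counters[item] = 1
    else:
        victim = min(counters, key=counters.get)
        counters[item] = counters.pop(victim) + 1
    return counters
```

With capacity Θ(1/ε) this guarantees that every key with frequency above ε·n is retained, which is exactly what the BUILD phase needs to separate heavy hitters from sparse items.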

slide-65
SLIDE 65


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., DEBS 2015]

▪Mapping sparse items as well actually makes a difference

Figure: imbalance 𝜇 (%) as a function of the Zipfian exponent 𝛽 with k = 10 (𝜄 = 0.1 and 𝜈 = 2); series: DKG, Worst Case DKG, WOSIM, Worst Case WOSIM.

slide-66
SLIDE 66


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., DEBS 2015]

▪Mapping sparse items as well actually makes a difference

Figure: throughput (tuples/s) as a function of the number of instances k (2–10); series: Modulo, DKG, Apache Storm standard key grouping.

slide-67
SLIDE 67


SENSITIVITY TO LOAD IMBALANCE

With stateless operators the same happens if computation latency depends on the tuple content:

Figure: the same stream scheduled on two instances OP 𝑃1 and OP 𝑃2 with round-robin versus full knowledge of execution times. Tuples b1…b4 have a long execution time, tuples c1…c5 a short one; exploiting this knowledge yields a completion time gain. Online full knowledge is unfeasible → approximation?

slide-68
SLIDE 68


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., 2016]

▪Dynamically schedule incoming tuples

Figure: OSG (Online Shuffle Grouping). The source operator OP S runs a JSQ (Join Shortest Queue) scheduler that schedules tuples to the instances OP 𝑃1, OP 𝑃2, OP 𝑃3, using O((1/𝜁) log(1/𝜀) (log 𝑛 + log 𝑜)) memory.
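The JSQ policy over estimated execution times can be sketched as follows. This is an illustrative approximation of OSG's idea, not the paper's implementation; `estimate_cost` stands in for its sketch-based per-tuple execution-time estimator (hypothetical name):

```python
class JSQScheduler:
    """Sketch of OSG-style scheduling: each tuple is sent to the instance
    whose estimated pending work is smallest (Join Shortest Queue over
    estimated execution times rather than raw queue lengths)."""

    def __init__(self, num_instances, estimate_cost):
        self.pending = [0.0] * num_instances  # estimated outstanding work
        self.estimate_cost = estimate_cost

    def schedule(self, tup):
        # Pick the instance with the least estimated pending work.
        target = min(range(len(self.pending)), key=self.pending.__getitem__)
        self.pending[target] += self.estimate_cost(tup)
        return target

    def completed(self, instance, tup):
        # Release the tuple's estimated work once the instance finishes it.
        self.pending[instance] -= self.estimate_cost(tup)
```

Unlike round-robin, this keeps long tuples from piling up on one instance, approximating the full-knowledge schedule whenever the cost estimates are accurate.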

slide-69
SLIDE 69


SENSITIVITY TO LOAD IMBALANCE

Solution: [Rivetti et al., 2016]

▪Dynamically schedule incoming tuples

Figure: average completion time (ms) over different frequency distributions (uniform, zipf-0.5 … zipf-3); series: Full Knowledge, OSG, Round-Robin.

slide-70
SLIDE 70

DIPARTIMENTO DI INGEGNERIA INFORMATICA AUTOMATICA E GESTIONALE ANTONIO RUBERTI

Cyber Intelligence and information Security

CIS Sapienza

THANKS!

Time for questions?