An Experiment-Driven Performance Model of Stream Processing - PowerPoint PPT Presentation



SLIDE 1

An Experiment-Driven Performance Model of Stream Processing Operators in Fog Computing Environments

Hamidreza Arkian1, Guillaume Pierre1, Johan Tordsson2, Erik Elmroth2

1University of Rennes1/IRISA, France 2Elastisys AB, Sweden

SAC’20 - March 30-April 3, 2020 - Brno, Czech Republic

SLIDE 2

IoT-to-Cloud basic architecture

SLIDE 3

Cloud-based stream processing

Apache Flink

SLIDE 4

Challenges

Apache Flink

➢ Low throughput!
➢ Low bandwidth!
➢ Cost!
➢ Streams of data generated continuously at high rates
➢ Latency-sensitive applications

SLIDE 5

Fog-based stream processing

SLIDE 6

Stream processing in a Fog environment

[Figure: logical graph of a DSP application (Source, Operators 1–4, Sink) and its workflow execution model]

SLIDE 7

Stream processing in geo-distributed environments

[Figure: logical graph of a DSP application, its workflow execution model, and its deployment in a geo-distributed Fog environment, with Operator 2 running as three replicas (Op2 Replica1–3)]

SLIDE 8

Challenges

➢ Understanding the performance of a geo-distributed stream processing application is difficult.
➢ Any configuration decision can have a significant impact on performance.

SLIDE 9

Experimental setup

➢ Emulation of a real fog platform

  • 32-core server ≈ 16 fog nodes (2 cores/node)
  • Emulated network latencies
  • Apache Flink

➢ Test Application

  • Input stream of 100,000 Tuple2 records
  • The operator calls the Fibonacci function Fib(24) on every processed record

➢ Performance metric:

  • Processing Time (PT)
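A minimal sketch of the kind of synthetic operator workload described above, assuming a naive recursive Fibonacci as the fixed per-record CPU cost (the function and record shape are illustrative, not the paper's actual code):

```python
def fib(n: int) -> int:
    """Naive recursive Fibonacci; used purely as a fixed CPU cost per record."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def process_record(record):
    """Simulates the test operator: burn CPU with Fib(24), pass the record on."""
    key, value = record        # a (key, value) pair, analogous to Flink's Tuple2
    fib(24)                    # constant computational work for every record
    return (key, value)

out = process_record(("sensor-1", 42))
```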
SLIDE 10

Modeling operator replication

➢ n operator replicas should in principle process data n times faster than a single replica.
➢ α represents the computation capacity of a single node.
➢ We can determine the value of α from a single measurement.
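Reconstructed in hedged form (the exact notation may differ from the slide's figure), the basic replication model and its one-point calibration are:

```latex
PT(n) = \frac{\alpha}{n}
\qquad\Longrightarrow\qquad
\alpha = n \cdot PT(n) \quad \text{(from a single measurement)}
```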

[Figure: experiment measurements vs. model fit]

SLIDE 11

Considering heterogeneous network delays

➢ Network delays between data sources and operator replicas slow down the whole system.
➢ When network delays are heterogeneous, the dominant one is the largest (NDmax).
➢ γ represents the impact of network delays on overall performance.
➢ We can determine both α and γ from two measurements.
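A hedged reconstruction of the extended model (the slide's exact formula may differ): the largest network delay adds a term to the per-record processing time,

```latex
PT(n, ND_{\max}) = \frac{\alpha}{n} + \gamma \cdot ND_{\max}
```

Two measurements taken with different values of $n$ and $ND_{\max}$ yield two linear equations from which $\alpha$ and $\gamma$ can be solved.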

[Figure: experiment measurements vs. model fit]

SLIDE 12

Improving the model’s accuracy

➢ Operator replication incurs some amount of parallelization inefficiency.
➢ The speedup with n nodes is usually slightly less than n.
➢ β represents Flink's parallelization inefficiency.
➢ We can determine α, β, and γ from three or more measurements.
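A sketch of the three-parameter calibration. The model form `PT(n, ND_max) = α / n**β + γ · ND_max` is an assumption for illustration (the paper's exact formula may differ); since the model is linear in α and γ once β is fixed, a small grid search over β combined with linear least squares suffices:

```python
import numpy as np

def fit_model(ns, nd_maxes, pts):
    """Grid-search beta; for each candidate, solve alpha and gamma linearly."""
    best = None
    for beta in np.linspace(0.5, 1.5, 101):
        A = np.column_stack([ns ** (-beta), nd_maxes])   # basis: 1/n^beta, ND_max
        coef, *_ = np.linalg.lstsq(A, pts, rcond=None)
        err = float(np.sum((A @ coef - pts) ** 2))
        if best is None or err < best[0]:
            best = (err, coef[0], beta, coef[1])
    _, alpha, beta, gamma = best
    return alpha, beta, gamma

# Synthetic measurements generated from known parameters, then recovered:
true_alpha, true_beta, true_gamma = 100.0, 0.9, 0.5
ns = np.array([1.0, 2.0, 4.0, 8.0])                      # replica counts
nds = np.array([10.0, 10.0, 20.0, 20.0])                 # max network delays
pts = true_alpha / ns ** true_beta + true_gamma * nds    # measured PT values

alpha, beta, gamma = fit_model(ns, nds, pts)
```

With at least three (here four) measurement points the three parameters are identifiable, matching the slide's claim.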

[Figure: experiment measurements vs. model fit]

SLIDE 13

Prediction accuracy

Accuracy metric: NBQF

4 measurements, 2.0% accuracy

SLIDE 14

What about modeling an entire (simple) workflow?

➢ An entire workflow is bottlenecked by its slowest operator:

PT_Workflow = max(PT_Map+KeyBy, PT_Reduce)

[Figure: experiment measurements vs. model fit for the whole workflow]
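Under the same assumed per-operator model form, the workflow-level prediction can be sketched as follows (operator names and parameter values are illustrative, not the paper's measured figures):

```python
def predict_operator_pt(alpha, beta, gamma, n, nd_max):
    """Per-operator processing-time prediction (assumed model form)."""
    return alpha / n ** beta + gamma * nd_max

def predict_workflow_pt(per_operator_pts):
    """The slowest operator bounds the whole workflow."""
    return max(per_operator_pts)

# Hypothetical calibrated parameters for two pipeline stages:
pt_map_keyby = predict_operator_pt(100.0, 0.9, 0.5, n=4, nd_max=20.0)
pt_reduce = predict_operator_pt(60.0, 0.9, 0.5, n=4, nd_max=20.0)
pt_workflow = predict_workflow_pt([pt_map_keyby, pt_reduce])
```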

SLIDE 15

Can we reuse the parameters instead of multiple measurements?

➢ α cannot be reused because it is specific to the computational complexity of one operator.
➢ β and γ capture properties that are independent of the nature of the computation carried out by the operator.
➢ The β and γ values of one operator's model can therefore be reused in other operators' models.

Calibrated model for Operator 1: α1, β1, γ1
Uncalibrated model for Operator 2: α2 (to be measured), β1, γ1 (reused)

SLIDE 16

Conclusions

➢ Heterogeneous network characteristics make it difficult to understand the performance of stream processing engines in geo-distributed environments.
➢ We proposed a predictive performance model for Apache Flink operators, backed by experimental measurements and evaluations.
➢ The model's predictions are accurate within ±2% of the actual values.

Hamidreza Arkian hamidreza.arkian@irisa.fr

Acknowledgment

This work is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765452. The information and views set out in this publication are those of the author(s) and do not necessarily reflect the official opinion of the European Union. Neither the European Union institutions and bodies nor any person acting on their behalf may be held responsible for the use which may be made of the information contained therein.

Training the next generation of European Fog computing experts: http://www.fogguru.eu/