Autonomic and Latency-Aware Degree of Parallelism Management in SPar

SLIDE 1

Autonomic and Latency-Aware Degree of Parallelism Management in SPar

Adriano Vogel¹, Dalvan Griebler¹, Daniele De Sensi², Marco Danelutto² and Luiz Gustavo Fernandes¹

¹ Pontifical Catholic University of Rio Grande do Sul (PUCRS)

² Department of Computer Science, University of Pisa (UNIPI)

2018

SLIDE 2

Outline

  • Introduction & Related Work
  • SPar (overview)
  • Parallelism & Latency
  • Autonomic management of Parallelism Degree
  • Experimental results
  • Conclusions

SLIDE 3

Stream processing applications

SLIDE 4

The scenario

  • Challenges
      ○ Parallel programming complexities
      ○ Productivity
      ○ Programming and system architecture expertise
  • High-level parallel programming frameworks
      ○ Intel Threading Building Blocks (TBB)
      ○ FastFlow
      ○ StreamIt
  • DSL (Domain-Specific Language)
      ○ SPar

SLIDE 5

Related Work

Work                   Library/System  Environment  Objective
De Sensi et al. [4]    NORNIR          Multi-core   Manage throughput and power consumption
De Matteis et al. [5]  FastFlow        Multi-core   Latency and energy efficiency
Gedik et al. [2]       SPL             Multi-core   High throughput without wasting computational resources
Heinze et al. [6]      FUGU            Distributed  Latency and the system utilization
Selva et al. [8]       StreamIt        Multi-core   Throughput
This work              SPar            Multi-core   Parallelism abstraction for latency

SLIDE 6

SPar: the concepts

DSL for stream parallelism

  • Internal DSL
  • Fully C++ compliant (C++11 or higher)

Exploits C++ attributes to expose stream parallelism

  • In standard C++ (non-parallel) code (à la OpenMP, somehow)

Minimal set of attributes

  • To identify stream sources and stream processors

SLIDE 7

Overview of SPar

[[spar::ToStream]] while(true){
    item = read();
    [[spar::Stage,spar::Input(item),spar::Output(item),spar::Replicate(N)]]{
        item = filter(item);
    }
    [[spar::Stage,spar::Input(item)]]{
        write(item);
    }
}

Figure legend: ID, AUX
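Because the annotations are ordinary C++11 attributes, a compiler that does not know the `spar` namespace can still build the loop as sequential code, usually with an unknown-attribute warning. The following is a minimal sketch of that idea, not SPar output. The `Stream` helper, the doubling `filter`, the bounded loop (in place of `while(true)`) and the value of `N` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// GCC and Clang warn about attributes they do not recognise; silence that here.
#pragma GCC diagnostic ignored "-Wattributes"

// Hypothetical stand-in for the read() and write() calls on the slide.
struct Stream {
    std::vector<int> in;
    std::vector<int> out;
    std::size_t next = 0;

    explicit Stream(std::vector<int> data) : in(std::move(data)) {}
    bool has_next() const { return next < in.size(); }
    int read() { assert(has_next()); return in[next++]; }
    void write(int item) { out.push_back(item); }
};

// Placeholder business logic (an assumption, not from the slides).
inline int filter(int item) { return item * 2; }

// Same shape as the slide's loop, bounded so that it terminates.
// Without the SPar compiler the attributes are ignored and the loop runs sequentially.
inline std::vector<int> run_annotated(std::vector<int> input) {
    constexpr int N = 4;  // assumed replica count; has no effect in a sequential build
    (void)N;
    Stream s(std::move(input));
    [[spar::ToStream]] while (s.has_next()) {
        int item = s.read();
        [[spar::Stage, spar::Input(item), spar::Output(item), spar::Replicate(N)]] {
            item = filter(item);
        }
        [[spar::Stage, spar::Input(item)]] {
            s.write(item);
        }
    }
    return s.out;
}
```

Calling `run_annotated({1, 2, 3})` processes the items one by one in the calling thread. Only a SPar-aware toolchain would turn the annotated regions into parallel stages.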

SLIDE 8

Overview of SPar

[[spar::ToStream]] while(true){
    item = read();
    [[spar::Stage,spar::Input(item),spar::Output(item),spar::Replicate(N)]]{
        item = filter(item);
    }
    [[spar::Stage,spar::Input(item)]]{
        write(item);
    }
}

We want to avoid this…

SLIDE 9

SPar Runtime

FastFlow

  • Backend of SPar
  • Provides the necessary patterns and building blocks

Rules

  • Transform attributes into FastFlow building blocks
  • Reuse the business logic code from the original sequential C++

E.g.: [[spar::Stage, …, spar::Replicate(N)]] -> farm()
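To give a feel for what a replicated stage becomes once it is mapped to a farm, here is a minimal sketch written with `std::thread` only. It is not FastFlow's API and not the code SPar generates. The function name `farm`, its signature, and the on-demand scheduling through a shared atomic index are assumptions chosen to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Farm sketch: `replicas` worker threads apply `stage` to the items of `input`.
// A shared atomic index hands out items on demand (the emitter role), and each
// result is stored at its item's position (the collector role), so the output
// keeps the input order.
inline std::vector<int> farm(const std::vector<int>& input,
                             const std::function<int(int)>& stage,
                             unsigned replicas) {
    assert(replicas > 0);
    std::vector<int> output(input.size());
    std::atomic<std::size_t> next{0};

    auto worker = [&]() {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= input.size()) break;   // stream exhausted
            output[i] = stage(input[i]);    // reused sequential business logic
        }
    };

    std::vector<std::thread> pool;
    for (unsigned r = 0; r < replicas; ++r) pool.emplace_back(worker);
    for (auto& t : pool) t.join();
    return output;
}
```

In this sketch the `replicas` argument plays the role of `N` in `spar::Replicate(N)`: it is fixed when the farm starts, which is exactly the limitation that an adaptive runtime would remove.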

SLIDE 10

SPar Runtime

Figure labels: [[spar::ToStream]], [[spar::Stage,spar::Replicate(N)]], [[spar::Stage]]

SLIDE 11

SPar Runtime

Figure labels: [[spar::ToStream]], [[spar::Stage,spar::Replicate(N)]], [[spar::Stage]]

Adaptivity needed

SLIDE 12

The Impact of Parallelism on Latency

  • Lane Detection application running on a machine with 8 cores (16 SMT threads)

(from Griebler, D.; Hoffmann, R. B.; Danelutto, M.; Fernandes, L. G. “Higher-Level Parallelism Abstractions for Video Applications with SPar”. In: 3rd International Workshop on Reengineering for Parallelism in Heterogeneous Parallel Platforms, 2017)

SLIDE 13

The Impact of Parallelism on Latency

SLIDE 14

Targets

  • Goals:
      ○ Abstract definition of the parallelism degree in SPar
      ○ Latency monitoring
      ○ Adapt the number of replicas on-the-fly
  • Contributions:
      ○ An extension of the SPar DSL with a new parallelism abstraction for latency-sensitive applications
      ○ An experimental evaluation of the strategy’s effectiveness

SLIDE 15

Autonomous Degree of Parallelism Implementation

The solution

  • Monitor observes and reports what’s going on
  • Regulator applies parallelism degree regulation policies
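The monitor's role can be pictured as a small component that collects the latency of each stream item and reports a summary to the regulator. The sketch below is an illustration only: the class name `LatencyMonitor`, its methods, and the choice of a per-window mean are assumptions, since the slides do not say which statistic the monitor reports.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical monitor: accumulates per-item latencies and reports their mean.
class LatencyMonitor {
public:
    using Clock = std::chrono::steady_clock;

    // Record one item given the time it entered the stream and the time it left
    // the last stage.
    void record(Clock::time_point entered, Clock::time_point left) {
        record_ms(std::chrono::duration<double, std::milli>(left - entered).count());
    }

    // Record one item whose latency is already known, in milliseconds.
    void record_ms(double latency_ms) {
        sum_ms_ += latency_ms;
        ++count_;
    }

    bool has_samples() const { return count_ > 0; }

    // Mean latency (ms) of the current window; afterwards a new window starts.
    double report_and_reset() {
        assert(has_samples());
        const double mean = sum_ms_ / static_cast<double>(count_);
        sum_ms_ = 0.0;
        count_ = 0;
        return mean;
    }

private:
    double sum_ms_ = 0.0;
    std::size_t count_ = 0;
};
```

A regulator would call `report_and_reset()` periodically and compare the value with the latency constraint. Using `steady_clock` avoids jumps caused by wall-clock adjustments.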

SLIDE 16

Autonomous Degree of Parallelism Implementation

  • The regulator strategy
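The strategy itself is shown only as a figure, so the following is one plausible reading rather than the authors' algorithm. It uses the three parameters named on the results slides: a latency constraint, a threshold, and a scaling factor (SF). The decision rule is an assumption: replicas are suspended when latency exceeds the constraint and activated only when latency falls below the constraint by more than the threshold. The names `RegulatorConfig` and `regulate`, and the replica bounds, are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

struct RegulatorConfig {
    double latency_constraint_ms;  // e.g. 180 or 200, as on the results slides
    double threshold;              // e.g. 0.10 or 0.20
    int scaling_factor;            // SF: replicas changed per reconfiguration
    int min_replicas;              // assumed lower bound
    int max_replicas;              // assumed upper bound, e.g. hardware threads
};

// Returns the replica count to use next, given the measured latency.
// ASSUMPTION (not stated on the slides): more replicas raise item latency, so
// the regulator suspends replicas above the constraint and activates them only
// when latency is below the constraint by more than the threshold.
inline int regulate(const RegulatorConfig& cfg, int replicas, double latency_ms) {
    assert(cfg.min_replicas >= 1 && cfg.min_replicas <= cfg.max_replicas);
    assert(cfg.scaling_factor >= 1);
    const double lower_bound_ms = cfg.latency_constraint_ms * (1.0 - cfg.threshold);

    int next = replicas;
    if (latency_ms > cfg.latency_constraint_ms) {
        next = replicas - cfg.scaling_factor;  // suspend SF replicas
    } else if (latency_ms < lower_bound_ms) {
        next = replicas + cfg.scaling_factor;  // activate SF replicas
    }
    // Otherwise latency is inside the band and the degree is left unchanged.
    return std::min(cfg.max_replicas, std::max(cfg.min_replicas, next));
}
```

Under this reading, a 180 ms constraint with a 10% threshold gives a band of roughly 162 to 180 ms in which no reconfiguration happens. A wider threshold makes the regulator less eager to add replicas.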

SLIDE 17

Results: Threshold 10% and Latency constraint 180 ms

Figure panels:

  • Number of replicas used (degree of parallelism)
  • Throughput (frames per second) and latency of stream items (ms)

SF (scaling factor): replicas activated or suspended in reconfigurations.

SLIDE 18

Results: Threshold 20% and Latency constraint 180 ms

Figure panels:

  • Number of replicas used (degree of parallelism)
  • Throughput (frames per second) and latency of stream items (ms)

SF (scaling factor): replicas activated or suspended in reconfigurations.

SLIDE 19

Results: Threshold 10% and Latency constraint 200 ms

Figure panels:

  • Number of replicas used (degree of parallelism)
  • Throughput (frames per second) and latency of stream items (ms)

SF (scaling factor): replicas activated or suspended in reconfigurations.

SLIDE 20

Results: Threshold 20% and Latency constraint 200 ms

Figure panels:

  • Number of replicas used (degree of parallelism)
  • Throughput (frames per second) and latency of stream items (ms)

SF (scaling factor): replicas activated or suspended in reconfigurations.

SLIDE 21

Conclusion

  • Latency is important
  • SPar extended with a new parallelism abstraction
  • On-the-fly adaptation of the parallelism degree
  • Effectiveness demonstrated with a stream processing application

Future Work:

  • Consider applications with a more complex structure
  • Evaluate the approach in other latency-sensitive applications
  • Proactive approaches

SLIDE 22

References

[1] Andrade, H.; Gedik, B.; Turaga, D. “Fundamentals of Stream Processing: Application Design, Systems, and Analytics”. Cambridge University Press, 2014.

[2] Gedik, B.; Schneider, S.; Hirzel, M.; Wu, K.-L. “Elastic scaling for data stream processing”, IEEE Transactions on Parallel and Distributed Systems, vol. 25–6, 2014, pp. 1447–1463.

[3] Su, Y.; Shi, F.; Talpur, S.; Wang, Y.; Hu, S.; Wei, J. “Achieving self-aware parallelism in stream programs”, Cluster Computing, vol. 18–2, 2015, pp. 949–962.

[4] De Sensi, D.; Torquati, M.; Danelutto, M. “A reconfiguration algorithm for power-aware parallel applications”, ACM Transactions on Architecture and Code Optimization (TACO), vol. 13–4, 2016, pp. 43.

[5] De Matteis, T.; Mencagli, G. “Keep calm and react with foresight: strategies for low-latency and energy-efficient elastic data stream processing”. In: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016, pp. 13.

[6] Heinze, T.; Pappalardo, V.; Jerzak, Z.; Fetzer, C. “Auto-scaling techniques for elastic data stream processing”. In: Data Engineering Workshops (ICDEW), 2014 IEEE 30th International Conference on, 2014, pp. 296–302.

[7] Griebler, D. “Domain-Specific Language & Support Tool for High-Level Stream Parallelism”, Ph.D. Thesis, Faculdade de Informática - PPGCC - PUCRS, Porto Alegre, Brazil, 2016, 243p.

[8] Selva, M.; Morel, L.; Marquet, K.; Frenot, S. “A monitoring system for runtime adaptations of streaming applications”. In: Parallel, Distributed and Network Based Processing (PDP), 2015 23rd Euromicro International Conference on, 2015, pp. 27–34.

SLIDE 23

Thank you!

E-mail: adriano.vogel@acad.pucrs.br

Daniele De Sensi, Marco Danelutto, Dalvan Griebler, Adriano Vogel, Luiz Gustavo Fernandes