SLIDE 1

Conception and Development of a Pipe & Filter Framework for C++

Johannes Ohlemacher November 23, 2016

Johannes Ohlemacher Conception and Development of a Pipe & Filter Framework for C++ November 23, 2016 1 / 38

SLIDE 2

Outline

  1. Motivation
  2. Goals
  3. Foundations
  4. Design and Evaluation of a SPSC Queue
  5. Design and Evaluation of TeeTime for C++
  6. Conclusions and Future Work

SLIDE 4

Motivation

◮ TeeTime
  ◮ high-performance, low-overhead Pipe-and-Filter (P&F) framework
  ◮ flexible and extensible
  ◮ architecture designed to be implementable in other languages
◮ Why C++?
  ◮ How does the language influence the performance of TeeTime?
  ◮ Compare TeeTime with an established C++ solution (FastFlow)

SLIDE 5

Goals

◮ G1: Basic Pipe & Filter Setup
◮ G2: Ready-To-Use Stages
◮ G3: Support for Multiple Platforms
◮ G4: Comparison with FastFlow
◮ G5: Comparison with Java-based TeeTime

SLIDE 6

The Pipe-and-Filter Architectural Style

Foundations

◮ The Pipe-and-Filter Architectural Style (Christian Wulf, 2016)
◮ The Pipe-and-Filter Framework TeeTime (Wulf, Ehmke, and Hasselbring, 2014)
◮ The C++11 Programming Language (Meyers, 2014; Stroustrup, 2013)
◮ The FastFlow Programming Framework (M. Aldinucci et al., 2010)

SLIDE 7

The C++11 Programming Language

Foundations

◮ Native code
◮ Manual Memory Management
◮ Value Semantics
◮ Move Semantics
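
The difference between the value and move semantics listed above can be illustrated with a minimal C++11 sketch. The type `Sample` and both helper functions are hypothetical names chosen for this illustration; they do not come from the presentation.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical payload type, used only for this illustration.
struct Sample {
    std::vector<int> data;
};

// Value semantics: a copy owns its own buffer, independent of the source.
inline bool copy_is_independent() {
    Sample a{{1, 2, 3}};
    Sample b = a;      // copy construction duplicates the vector
    b.data[0] = 42;    // changing the copy leaves the original untouched
    return a.data[0] == 1 && b.data[0] == 42;
}

// Move semantics: the buffer is handed over instead of being duplicated.
inline bool move_transfers_buffer() {
    Sample a{{1, 2, 3}};
    const int* before = a.data.data();
    Sample b = std::move(a);   // move construction takes over the buffer
    return b.data.data() == before && b.data.size() == 3;
}
```

For a queue between stages this matters because moving an element into and out of a slot avoids a deep copy of its payload.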

SLIDE 8

The FastFlow Programming Framework

Foundations

◮ Parallel Programming Framework
◮ Ready-to-use skeletons: Pipe and Taskfarm
◮ All Stages are active
◮ No value semantics

SLIDE 9

The FastFlow Programming Framework

Foundations

SLIDE 10

The FastFlow Programming Framework

Foundations

SLIDE 12

Overview of existing SPSC queue implementations

Design and Evaluation of a SPSC Queue

◮ Lamport: (Lamport, 1983)
◮ FastForward: (Marco Aldinucci, Danelutto, et al., 2012; Giacomoni, Moseley, and Vachharajani, 2008)

Implementation   Queue type    type safe   value semantics   move semantics
folly [1]        Lamport       yes         yes               yes
boost [2]        Lamport       yes         yes               no
FastFlow         FastForward   no          no                no

[1] https://github.com/facebook/folly
[2] http://www.boost.org
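
One way to picture a Lamport-style queue with value and move semantics, as characterised in the table, is a bounded ring buffer whose head and tail indices are shared between the two threads. The following is a minimal sketch under that assumption. `LamportQueue`, `try_push`, `try_pop`, and `lamport_demo` are hypothetical names, and this is not the folly or boost implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical Lamport-style bounded queue for one producer and one consumer.
// T must be default-constructible and move-assignable.
template <typename T>
class LamportQueue {
public:
    explicit LamportQueue(std::size_t capacity)
        : slots_(capacity + 1), head_(0), tail_(0) {}  // one slot stays unused

    // Producer side only. Returns false (value untouched) when full.
    bool try_push(T&& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % slots_.size();
        if (next == head_.load(std::memory_order_acquire)) return false;
        slots_[tail] = std::move(value);   // move the element into the slot
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side only. Returns false when the queue is empty.
    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;
        out = std::move(slots_[head]);     // move the element out of the slot
        head_.store((head + 1) % slots_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> slots_;
    std::atomic<std::size_t> head_;  // next slot to read, written by the consumer
    std::atomic<std::size_t> tail_;  // next slot to write, written by the producer
};

// Sends the values 1..count through the queue and returns their sum.
inline long long lamport_demo(int count) {
    LamportQueue<std::vector<int>> queue(8);
    long long sum = 0;
    std::thread consumer([&] {
        std::vector<int> item;
        for (int received = 0; received < count; ++received) {
            while (!queue.try_pop(item)) std::this_thread::yield();
            sum += item[0];
        }
    });
    for (int i = 1; i <= count; ++i) {
        std::vector<int> item(1, i);
        while (!queue.try_push(std::move(item))) std::this_thread::yield();
    }
    consumer.join();
    return sum;
}
```

Both threads read both indices here, which is the main structural difference from the FastForward design.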

SLIDE 13

Pointer-based FastForward Queue

Design and Evaluation of a SPSC Queue

SLIDE 14

Value-based FastForward Queue

Design and Evaluation of a SPSC Queue
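
FastForward queues avoid shared head and tail indices: each side keeps its own index and inspects the slot itself to decide whether it is empty or full. In the pointer-based variant a null entry marks an empty slot. The slide text does not say how the value-based variant marks its slots, so the sketch below assumes a per-slot boolean flag. All names (`ValueFastForwardQueue`, `Slot`, `try_push`, `try_pop`, `fastforward_demo`) are hypothetical, and cache-line padding is omitted for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical value-based FastForward-style queue (capacity must be > 0).
template <typename T>
class ValueFastForwardQueue {
    // Assumption: a flag per slot replaces the null-pointer check.
    struct Slot {
        std::atomic<bool> full;
        T value;
        Slot() : full(false), value() {}
    };

public:
    explicit ValueFastForwardQueue(std::size_t capacity)
        : slots_(new Slot[capacity]), capacity_(capacity), head_(0), tail_(0) {}

    // Producer side only. Fails if the next slot is still occupied.
    bool try_push(T&& value) {
        Slot& slot = slots_[tail_];
        if (slot.full.load(std::memory_order_acquire)) return false;
        slot.value = std::move(value);
        slot.full.store(true, std::memory_order_release);
        tail_ = (tail_ + 1) % capacity_;
        return true;
    }

    // Consumer side only. Fails if the next slot has not been filled yet.
    bool try_pop(T& out) {
        Slot& slot = slots_[head_];
        if (!slot.full.load(std::memory_order_acquire)) return false;
        out = std::move(slot.value);
        slot.full.store(false, std::memory_order_release);
        head_ = (head_ + 1) % capacity_;
        return true;
    }

private:
    std::unique_ptr<Slot[]> slots_;
    std::size_t capacity_;
    std::size_t head_;  // touched by the consumer only
    std::size_t tail_;  // touched by the producer only
};

// Sends the values 1..count through the queue and returns their sum.
inline long long fastforward_demo(int count) {
    ValueFastForwardQueue<std::vector<int>> queue(8);
    long long sum = 0;
    std::thread consumer([&] {
        std::vector<int> item;
        for (int received = 0; received < count; ++received) {
            while (!queue.try_pop(item)) std::this_thread::yield();
            sum += item[0];
        }
    });
    for (int i = 1; i <= count; ++i) {
        std::vector<int> item(1, i);
        while (!queue.try_push(std::move(item))) std::this_thread::yield();
    }
    consumer.join();
    return sum;
}
```

Because the indices are private to each thread, the only data shared between producer and consumer is the slot being handed over.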

SLIDE 15

Evaluation of SPSC queues

Design and Evaluation of a SPSC Queue

◮ Test Environment
  1. 2x Intel Xeon E5-2650, 2.8GHz, 8 cores, 128GB, Debian, gcc 4.9.2
  2. 1x Intel i7 6700K, 4.2GHz, 4 cores, 16GB, Ubuntu, gcc 5.2.1
  3. 1x Intel i7 6700K, 4.2GHz, 4 cores, 16GB, Windows 10, MSVC 2015
◮ Test Scenarios
  ◮ Scenario 1: pointers
  ◮ Scenario 2: std::vector<int>
  ◮ Scenario 3: 4x4 double matrix

SLIDE 16

Evaluation of SPSC queues

Design and Evaluation of a SPSC Queue

SLIDE 17

Overview

Design and Evaluation of TeeTime for C++

Class diagram:

AbstractStage <<abstract>>
  +debugName
  +execute()
  +createRunnable()
  #addOutputPort(): OutputPort<T>
  #addInputPort(): InputPort<T>

AbstractOutputPort <<abstract>>
AbstractInputPort <<abstract>>

InputPort<T>
  +receive(): T

OutputPort<T>
  +send(value: T)

Pipe<T> <<abstract>> (association ends: source, target)

Configuration
  #declareStageActive(stage, cpuAffinity)
  #declareStageNonActive(stage)
  #connectPorts(outputport, inputport, capacity)
  #createStage<T>(args...): Stage
  #createStageFromFunctionPtr(FunctionPtr): Stage
  #createStageFromLambda(lambda): Stage
  +executeBlocking()

UnsynchedQueue<T>
SynchedPipe<T, TQueue>
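
To make the roles of stages, ports, and pipes concrete, here is a minimal single-threaded sketch. Only the names `AbstractStage`, `execute()`, `InputPort<T>::receive()`, and `OutputPort<T>::send()` are taken from the class diagram. Everything else (`Pipe::buffer`, `connect`, `hasValue`, `Counter`, `FunctionStage`, `Collector`, `run_pipeline`) is a hypothetical stand-in, and none of this is TeeTime's actual implementation.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical unsynchronized pipe: a FIFO buffer between two ports.
template <typename T>
struct Pipe {
    std::deque<T> buffer;
};

template <typename T>
class OutputPort {
public:
    OutputPort() : pipe_(nullptr) {}
    void connect(Pipe<T>& pipe) { pipe_ = &pipe; }
    void send(T value) { pipe_->buffer.push_back(std::move(value)); }
private:
    Pipe<T>* pipe_;
};

template <typename T>
class InputPort {
public:
    InputPort() : pipe_(nullptr) {}
    void connect(Pipe<T>& pipe) { pipe_ = &pipe; }
    bool hasValue() const { return !pipe_->buffer.empty(); }
    T receive() {
        T value = std::move(pipe_->buffer.front());
        pipe_->buffer.pop_front();
        return value;
    }
private:
    Pipe<T>* pipe_;
};

struct AbstractStage {
    virtual ~AbstractStage() {}
    virtual void execute() = 0;
};

// Producer stage: emits the numbers 1..n.
class Counter : public AbstractStage {
public:
    explicit Counter(int n) : n_(n) {}
    void execute() override { for (int i = 1; i <= n_; ++i) out.send(i); }
    OutputPort<int> out;
private:
    int n_;
};

// Filter stage built from any callable, for example a lambda.
template <typename In, typename Out>
class FunctionStage : public AbstractStage {
public:
    explicit FunctionStage(std::function<Out(In)> fn) : fn_(std::move(fn)) {}
    void execute() override {
        while (in.hasValue()) out.send(fn_(in.receive()));
    }
    InputPort<In> in;
    OutputPort<Out> out;
private:
    std::function<Out(In)> fn_;
};

// Sink stage: collects everything it receives.
class Collector : public AbstractStage {
public:
    void execute() override {
        while (in.hasValue()) results.push_back(in.receive());
    }
    InputPort<int> in;
    std::vector<int> results;
};

// Wires source -> square -> sink and runs the stages one after another.
inline std::vector<int> run_pipeline(int n) {
    Counter source(n);
    FunctionStage<int, int> square([](int v) { return v * v; });
    Collector sink;
    Pipe<int> first, second;
    source.out.connect(first);
    square.in.connect(first);
    square.out.connect(second);
    sink.in.connect(second);
    AbstractStage* order[] = {&source, &square, &sink};
    for (AbstractStage* stage : order) stage->execute();
    return sink.results;
}
```

In a concurrent setup, the plain `std::deque` would be replaced by a synchronized SPSC queue between stages that run on different threads.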

SLIDE 18

Configuration

Design and Evaluation of TeeTime for C++

SLIDE 19

Functions and Lambdas

Design and Evaluation of TeeTime for C++

SLIDE 20

Evaluation of TeeTime for C++

Design and Evaluation of TeeTime for C++

◮ Test Environment
  1. 2x Intel Xeon E5-2650, 2.8GHz, 8 cores, 128GB, Debian, gcc 4.9.2
◮ Test Scenarios
  ◮ Scenario 1: CPU intensive
  ◮ Scenario 2: IO intensive
  ◮ Scenario 3: CPU and IO intensive
◮ Test Configurations
  ◮ C1: TeeTime (no affinity)
  ◮ C2: TeeTime (prefer same CPU)
  ◮ C3: TeeTime (avoid same core)
  ◮ C4: FastFlow (multi alloc)
  ◮ C5: FastFlow (single alloc)
  ◮ C6: TeeTime (Java)

SLIDE 21

CPU intensive Scenario

Design and Evaluation of TeeTime for C++

SLIDE 22

IO intensive Scenario

Design and Evaluation of TeeTime for C++

SLIDE 23

CPU and IO intensive Scenario

Design and Evaluation of TeeTime for C++

SLIDE 24

CPU and IO intensive Scenario

Design and Evaluation of TeeTime for C++

SLIDE 25

Overview

Design and Evaluation of TeeTime for C++

[Plot: MD5 Benchmark, 20µs. Speedup over number of worker threads. Series: Ideal, C1: TeeTime (no affinity), C2: TeeTime (prefer same CPU), C3: TeeTime (avoid same core), C4: FastFlow (multi alloc), C5: FastFlow (single alloc), C6: TeeTime (Java)]

SLIDE 26

Overview

Design and Evaluation of TeeTime for C++

[Plot: MD5 Benchmark, 500ns. Speedup over number of worker threads. Series: Ideal, C1: TeeTime (no affinity), C2: TeeTime (prefer same CPU), C3: TeeTime (avoid same core), C4: FastFlow (multi alloc), C5: FastFlow (single alloc), C6: TeeTime (Java)]

SLIDE 27

Overview

Design and Evaluation of TeeTime for C++

[Plot: IO Benchmark, 20000 * 1MB. Speedup over number of worker threads. Series: Ideal, C1: TeeTime (no affinity), C2: TeeTime (prefer same CPU), C3: TeeTime (avoid same core), C4: FastFlow (multi alloc), C5: FastFlow (single alloc), C6: TeeTime (Java)]

SLIDE 28

Overview

Design and Evaluation of TeeTime for C++

[Plot: IO Benchmark, 20000 * 1MB. Time (milliseconds) over number of worker threads. Series: C1: TeeTime (no affinity), C2: TeeTime (prefer same CPU), C3: TeeTime (avoid same core), C4: FastFlow (multi alloc), C5: FastFlow (single alloc), C6: TeeTime (Java)]

SLIDE 29

Overview

Design and Evaluation of TeeTime for C++

[Plot: Mipmaps Benchmark, 100 * 512x512. Speedup over number of worker threads. Series: Ideal, C1: TeeTime (no affinity), C2: TeeTime (prefer same CPU), C3: TeeTime (avoid same core), C4: FastFlow (multi alloc), C5: FastFlow (single alloc)]

SLIDE 30

Conclusions

Conclusions and Future Work

◮ TeeTime successfully ported to C++
◮ TeeTime for C++ supports modern use of the C++ programming language
◮ Implemented a Value-based FastForward SPSC queue
◮ Speedup very similar to FastFlow
  ◮ in many cases even better due to less memory management
◮ TeeTime supports better modularity and stages are more reusable than with FastFlow
◮ TeeTime for C++ added features currently not available for the Java version (CPU affinity, Lambdas)

SLIDE 31

Future Work

Conclusions and Future Work

◮ Extend evaluation to more diverse platforms
◮ Adopt C++14 and C++17
◮ Adaptive Taskfarm Pattern (Wulf, Wiechmann, and Hasselbring, 2016)
◮ Targeting distributed systems (Marco Aldinucci, Campa, et al., 2013)

SLIDE 32

References I

Appendix

Aldinucci, Marco, Sonia Campa, et al. (2013). “Targeting Distributed Systems in Fastflow”. In: Proceedings of the 18th International Conference on Parallel Processing Workshops. Euro-Par’12. Rhodes Island, Greece: Springer-Verlag, pp. 47–56. ISBN: 978-3-642-36948-3.

Aldinucci, Marco, Marco Danelutto, et al. (2012). “An Efficient Synchronisation Mechanism for Multi-Core Systems”. In:

Aldinucci, M. et al. (2010). Efficient streaming applications on multi-core with FastFlow: The biosequence alignment test-bed. English. Vol. 19. Advances in Parallel Computing. Elsevier, pp. 273–280. DOI: 10.3233/978-1-60750-530-3-273.

Wulf, Christian and Wilhelm Hasselbring (2016). “The Pipe-and-Filter Architectural Style Revisited: From Basic Concepts towards Smart Framework Implementations”. In:

SLIDE 33

References II

Appendix

Giacomoni, John, Tipp Moseley, and Manish Vachharajani (2008). “FastForward for Efficient Pipeline Parallelism: A Cache-optimized Concurrent Lock-free Queue”. In: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. PPoPP ’08. Salt Lake City, UT, USA: ACM, pp. 43–52. ISBN: 978-1-59593-795-7.

Lamport, Leslie (1983). “Specifying Concurrent Program Modules”. In: ACM Trans. Program. Lang. Syst. 5.2, pp. 190–222. ISSN: 0164-0925.

Meyers, Scott (2014). Effective Modern C++: 42 Specific Ways to Improve Your Use of C++11 and C++14. 1st. O’Reilly Media, Inc. ISBN: 1491903996, 9781491903995.

Stroustrup, Bjarne (2013). The C++ Programming Language. 4th. Addison-Wesley Professional. ISBN: 0321563840, 9780321563842.

SLIDE 34

References III

Appendix

Wulf, Christian, Nils Christian Ehmke, and Wilhelm Hasselbring (2014). “Toward a Generic and Concurrency-Aware Pipes & Filters Framework”. In: Symposium on Software Performance 2014: Joint Descartes/Kieker/Palladio Days.

Wulf, Christian, Christian Claus Wiechmann, and Wilhelm Hasselbring (2016). “Increasing the Throughput of Pipe-and-Filter Architectures by Integrating the Task Farm Parallelization Pattern”. In: 2016 19th International ACM SIGSOFT Symposium on Component-Based Software Engineering (CBSE).

SLIDE 35

C++ Value Semantics

Appendix

SLIDE 36

C++ Value Semantics

Appendix

SLIDE 37

C++ Move Semantics

Appendix

SLIDE 38

The FastFlow Programming Framework

Appendix

SLIDE 39

value-based FastForward Queue

Appendix

SLIDE 40

FastForward Queue

Appendix