Towards a Domain-Specific Language for Patterns-Oriented Parallel Programming


SLIDE 1

Introduction Patterns-Oriented Parallel Programming (POPP) DSL-POPP Results Conclusions References

Towards a Domain-Specific Language for Patterns-Oriented Parallel Programming

Dalvan Griebler, Luiz Gustavo Fernandes

Pontifícia Universidade Católica do Rio Grande do Sul - PUCRS
Programa de Pós-Graduação em Ciência da Computação - PPGCC
Grupo de Modelagem de Aplicações Paralelas - GMAP

Brazilian Symposium on Programming Languages - SBLP

October 2013

SLIDE 2

Summary

1. Introduction
2. Patterns-Oriented Parallel Programming (POPP)
3. DSL-POPP
     • Compilation Process
     • Programming Interface and Implementation
     • Levels of parallelism
4. Results
     • Implementation Example of the DSL-POPP
     • Tests Scenario
     • Performance of DSL-POPP
5. Conclusions
6. References

SLIDE 6

Introduction

Skeletons/Patterns ([1], [2], [3])

Programming Interfaces (FastFlow [4], Muesli [5], SkeTo [6], Skandium [7], eSkel [8], P3L [9], Lithium [10], Muskel [11] and Skil [12])

Main goals of DSL-POPP [13]:
  • Reduce programming effort without compromising performance
  • Patterns-Oriented Parallel Programming
  • Abstract the details of pattern implementation
  • Offer different levels of parallelism

Paper contributions
  • We propose the POPP model
  • We introduce DSL-POPP
  • We present a case study based on an image processing algorithm

SLIDE 8

Patterns-Oriented Parallel Programming (POPP)

Figure: POPP model
  Diagram labels: a Main Routine with Code Block 1 ... Code Block n; Subroutine 1 ... Subroutine n, each with its own Code Block 1 ... Code Block n.

Figure: Master/Slave - Pipeline.
  Master/Slave pattern: main routine with code blocks M, S1 ... Sn; subroutine 1 ... subroutine n, each with code blocks m, s1 ... sn.
  Pipeline pattern: main routine with code blocks P1 ... Pn; subroutine 1 ... subroutine n, each with code blocks p1 ... pn.
  Legend: M, S: Master/Slave (main routine); m, s: master/slave (subroutine); P: Pipeline stage (main routine); p: pipeline stage (subroutine).

Figure: Combination of Patterns.
  Hybrid patterns: main routine (pipeline) with stages P1, P2 ... Pn; subroutine 1 (master/slave) and subroutine 2 (master/slave), each with m, s1 ... sn; subroutine n (pipeline) with p1 ... pn.

SLIDE 9

DSL-POPP: Compilation Process

Source Code (DSL-POPP):

    $PipelinePattern int main(...){
        @Pipeline{
            @Stage(...){ }
            @Stage(...){ }
        }
    }

Precompiler System: Syntactic/Semantic Analysis, Pattern Tree, Source-to-Source Transformation

Code after the transformation uses: include pthread.h, include smmpi.h, SMMPI_send(...), SMMPI_recv(...), pthread_create(...), pthread_join(...)

GCC Compiler, then Binary Code

Figure: Compilation process.
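The compilation slide lists pthread_create(...) and pthread_join(...) among the calls that appear after the source-to-source transformation. As a rough illustration of what hand-written C in that style could look like for a two-stage pipeline, the sketch below runs each stage in its own thread. It is not output of the DSL-POPP precompiler: channel_t, the chan_* helpers, stage1, stage2, and run_pipeline are hypothetical names, and the one-slot channel only stands in for the stage-to-stage communication that the slide attributes to SMMPI_send/SMMPI_recv, whose interface is not shown.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical one-slot channel between two stages.
 * This is NOT the SMMPI API named on the slide. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int value;
    int full;    /* 1 while a value is waiting to be received */
    int closed;  /* 1 once the producing stage has finished */
} channel_t;

static void chan_init(channel_t *c) {
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->changed, NULL);
    c->value = 0; c->full = 0; c->closed = 0;
}

static void chan_send(channel_t *c, int v) {
    pthread_mutex_lock(&c->lock);
    while (c->full)                      /* wait until the slot is free */
        pthread_cond_wait(&c->changed, &c->lock);
    c->value = v; c->full = 1;
    pthread_cond_broadcast(&c->changed);
    pthread_mutex_unlock(&c->lock);
}

static void chan_close(channel_t *c) {
    pthread_mutex_lock(&c->lock);
    c->closed = 1;
    pthread_cond_broadcast(&c->changed);
    pthread_mutex_unlock(&c->lock);
}

/* Returns 1 and stores a value in *out, or 0 once the channel is
 * both empty and closed. */
static int chan_recv(channel_t *c, int *out) {
    pthread_mutex_lock(&c->lock);
    while (!c->full && !c->closed)
        pthread_cond_wait(&c->changed, &c->lock);
    int ok = c->full;
    if (ok) {
        *out = c->value;
        c->full = 0;
        pthread_cond_broadcast(&c->changed);
    }
    pthread_mutex_unlock(&c->lock);
    return ok;
}

typedef struct { channel_t *out; int n_items; } stage1_arg_t;
typedef struct { channel_t *in; long sum; } stage2_arg_t;

/* Stage 1: produce the squares 1, 4, 9, ... and pass them on. */
static void *stage1(void *p) {
    stage1_arg_t *a = p;
    for (int i = 1; i <= a->n_items; i++)
        chan_send(a->out, i * i);
    chan_close(a->out);
    return NULL;
}

/* Stage 2: consume items until the channel is closed and add them up. */
static void *stage2(void *p) {
    stage2_arg_t *a = p;
    int v;
    while (chan_recv(a->in, &v))
        a->sum += v;
    return NULL;
}

/* Create one thread per stage, then join both. */
long run_pipeline(int n_items) {
    assert(n_items >= 0);
    channel_t ch;
    chan_init(&ch);
    stage1_arg_t a1 = { &ch, n_items };
    stage2_arg_t a2 = { &ch, 0 };
    pthread_t t1, t2;
    pthread_create(&t1, NULL, stage1, &a1);
    pthread_create(&t2, NULL, stage2, &a2);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_cond_destroy(&ch.changed);
    pthread_mutex_destroy(&ch.lock);
    return a2.sum;
}
```

Calling run_pipeline(4) sends 1, 4, 9, 16 through the channel and returns their sum. On most toolchains the file needs to be built with the -pthread option.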

SLIDE 11

DSL-POPP: Programming Interface and Implementation

(a) Pipeline

    $PipelinePattern void func_name(...){
        @Pipeline{
            @Stage(int num_th, void* buffer, int buf_size){ }
            @Stage(int num_th, void* buffer, int buf_size){ }
            @Stage(int num_th, void* buffer, int buf_size){ }
        }
    }

  Diagram labels: Pipeline Block, Stage Block; Create Threads; thread 0, thread 1, thread 2, each running Work 0 ...; Join Threads.

(b) Master/Slave

    $MasterSlavePattern void func_name(...){
        @Master{
            @Slave(int num_th, void* buffer, int buf_size, const POPP_LB_Policy){ }
        }
    }

  Diagram labels: Master Block, Slave Block; Create Threads; thread 0 ... thread n, with work items Work 0.0 ... Work 0.n through Work n.0 ... Work n.n; Join Threads.

Figure: Syntax and logical structure of the DSL-POPP

Policies for Load Balancing: POPP_LB_STATIC; POPP_LB_DYNAMIC; POPP_LB_COST.
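The slide names three load-balancing policies but does not define them. Assuming POPP_LB_STATIC means a partition fixed before any work starts, and POPP_LB_DYNAMIC means idle slaves claim the next unprocessed item, the two ideas can be sketched in plain C as follows. The functions static_chunk and dynamic_next are hypothetical and are not part of DSL-POPP. POPP_LB_COST is left out because nothing in the slides says how cost is measured.

```c
#include <assert.h>
#include <stdatomic.h>

/* Static distribution (assumed meaning of POPP_LB_STATIC):
 * worker w of n_workers owns the contiguous range [*begin, *end).
 * The first (total % n_workers) workers get one extra item. */
void static_chunk(int total, int n_workers, int w, int *begin, int *end) {
    assert(total >= 0 && n_workers > 0 && w >= 0 && w < n_workers);
    int base  = total / n_workers;
    int extra = total % n_workers;
    *begin = w * base + (w < extra ? w : extra);
    *end   = *begin + base + (w < extra ? 1 : 0);
}

/* Dynamic distribution (assumed meaning of POPP_LB_DYNAMIC):
 * each call atomically claims the next item index from a shared
 * counter; returns -1 when no items are left. */
int dynamic_next(atomic_int *next, int total) {
    int i = atomic_fetch_add(next, 1);
    return i < total ? i : -1;
}
```

With 10 items and 3 workers, static_chunk yields the ranges [0, 4), [4, 7), and [7, 10). The static form has no run-time coordination cost, while the dynamic form lets faster workers pick up more items when per-item cost varies.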

SLIDE 12

DSL-POPP: Levels of parallelism

Figure: Overview of thread graph in DSL-POPP.
  Panels: a) Master/Slave - Master/Slave; b) Master/Slave - Pipeline; c) Pipeline - Master/Slave; d) Pipeline - Pipeline.
  Legend: first level active threads; second level active threads; control threads (master).

SLIDE 13

Results: Implementation Example of the DSL-POPP

Figure: Overview of DSL-POPP Image Processing Algorithm Implementation.
  Diagram labels: a list of images with 3000x2550 resolution (IM1, IM2, IM3, IM4 ... IM39, IM40); each image (IM1 shown) goes through the Prewitt, Sobel, and Roberts filters.

SLIDE 14

Results: Implementation Example of the DSL-POPP

Figure: Overview of DSL-POPP Image Processing Algorithm Implementation.
  Diagram labels: a list of images with 3000x2550 resolution (IM1, IM2, IM3, IM4 ... IM39, IM40); each image (IM1 shown) goes through the Prewitt, Sobel, and Roberts filters; Split steps divide the work into parts 1, 2 ... n.
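Prewitt, Sobel, and Roberts are standard edge-detection operators, but the filter code itself does not appear in these slides. As a minimal sketch of one of them, the block below applies the usual 3x3 Sobel kernels to an 8-bit grayscale image and approximates the gradient magnitude as |Gx| + |Gy|. The names sobel_pixel and sobel_rows are hypothetical, and the row-range parameters assume that a "Split" divides an image into bands of rows, which the slides do not confirm.

```c
#include <assert.h>
#include <stdlib.h>

/* Standard 3x3 Sobel kernels for horizontal and vertical gradients. */
static const int SOBEL_X[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
static const int SOBEL_Y[3][3] = { {-1, -2, -1}, {0, 0, 0}, {1, 2, 1} };

/* Edge strength |Gx| + |Gy| at interior pixel (x, y) of a row-major
 * grayscale image, clamped to 255. */
unsigned char sobel_pixel(const unsigned char *img, int width, int height,
                          int x, int y) {
    assert(x > 0 && y > 0 && x < width - 1 && y < height - 1);
    int gx = 0, gy = 0;
    for (int dy = -1; dy <= 1; dy++) {
        for (int dx = -1; dx <= 1; dx++) {
            int p = img[(y + dy) * width + (x + dx)];
            gx += SOBEL_X[dy + 1][dx + 1] * p;
            gy += SOBEL_Y[dy + 1][dx + 1] * p;
        }
    }
    int mag = abs(gx) + abs(gy);
    return (unsigned char)(mag > 255 ? 255 : mag);
}

/* Filter rows [row_begin, row_end) only, so that separate bands can be
 * handed to separate workers. Border pixels are written as 0. */
void sobel_rows(const unsigned char *in, unsigned char *out,
                int width, int height, int row_begin, int row_end) {
    for (int y = row_begin; y < row_end; y++) {
        for (int x = 0; x < width; x++) {
            int border = (x == 0 || y == 0 ||
                          x == width - 1 || y == height - 1);
            out[y * width + x] =
                border ? 0 : sobel_pixel(in, width, height, x, y);
        }
    }
}
```

Because sobel_rows reads only from the input image and writes only to its own rows of the output, bands can be processed independently, which is what makes this kind of filter a natural fit for a master/slave split.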


SLIDE 17

Results: Tests Scenario

Test-1
  Diagram labels: list of images with 3000x2550 resolution (IM1, IM2, IM3, IM4 ... IM39, IM40); Prewitt, Sobel, and Roberts filters applied to IM1; three Split steps, each under a Master/Slave pattern, dividing IM1 into parts 1, 2 ... n.
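Test-1 places a Master/Slave pattern around each filter and splits the image into n parts. A minimal pthread sketch of that shape is given below, assuming the parts are bands of rows and using pixel inversion as a placeholder for the real filter. The names slave_arg_t, slave, master_invert, and MAX_SLAVES are hypothetical, and this is hand-written code, not what DSL-POPP generates.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_SLAVES 64  /* arbitrary cap chosen for this sketch */

typedef struct {
    unsigned char *pixels;   /* shared row-major image buffer */
    int width;
    int row_begin, row_end;  /* half-open band of rows owned by this slave */
} slave_arg_t;

/* Slave: placeholder per-pixel work (inversion) on its own band. */
static void *slave(void *p) {
    slave_arg_t *a = p;
    for (int y = a->row_begin; y < a->row_end; y++) {
        for (int x = 0; x < a->width; x++) {
            unsigned char *px = &a->pixels[y * a->width + x];
            *px = (unsigned char)(255 - *px);
        }
    }
    return NULL;
}

/* Master: split the rows into n_slaves nearly equal bands, create one
 * thread per band, then join them all before returning. */
void master_invert(unsigned char *pixels, int width, int height,
                   int n_slaves) {
    assert(n_slaves > 0 && n_slaves <= MAX_SLAVES);
    pthread_t tid[MAX_SLAVES];
    slave_arg_t arg[MAX_SLAVES];
    for (int s = 0; s < n_slaves; s++) {
        arg[s].pixels    = pixels;
        arg[s].width     = width;
        arg[s].row_begin = (int)((long)height * s / n_slaves);
        arg[s].row_end   = (int)((long)height * (s + 1) / n_slaves);
        pthread_create(&tid[s], NULL, slave, &arg[s]);
    }
    for (int s = 0; s < n_slaves; s++)
        pthread_join(tid[s], NULL);
}
```

The bands do not overlap, so the slaves need no locking. For a 3-row image and two slaves, the first slave gets row 0 and the second gets rows 1 and 2.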

SLIDE 18

Results: Tests Scenario

Test-2
  Diagram labels: Pipeline over the Prewitt, Sobel, and Roberts filters; images from the list (IM1, IM2, IM3, IM4 ... IM39, IM40; 3000x2550 resolution) pass through the filters.

SLIDE 19

Results: Tests Scenario

Test-3.1 and Test-3.2
  Diagram labels: list of images with 3000x2550 resolution (IM1, IM2, IM3, IM4 ... IM39, IM40); Prewitt, Sobel, and Roberts filters applied to IM1; four Split steps, each under a Master/Slave pattern, with parts 1, 2 ... n shown at two levels.

SLIDE 20

Results: Tests Scenario

Test-4
  Diagram labels: Pipeline over the Prewitt, Sobel, and Roberts filters, fed from the list of images with 3000x2550 resolution (IM1, IM2, IM3, IM4 ... IM39, IM40); three Split steps, each under a Master/Slave pattern, dividing the current image into parts 1, 2 ... n.

SLIDE 21

Results: Tests Scenario

Test-5
  Diagram labels: list of images with 3000x2550 resolution (IM1, IM2, IM3, IM4 ... IM39, IM40); Prewitt, Sobel, and Roberts filters applied to IM1; a single Split under one Master/Slave pattern, with parts 1, 2 ... n.

SLIDE 22

Results: Performance of DSL-POPP

Figure: six plots, one per test (Test-1, Test-2, Test-3.1, Test-3.2, Test-4, Test-5).
  Each plot shows Speedup (with an Ideal reference line) and Efficiency against the number of threads (1, 3, 6, 9, 12, 15, 18, 21, 24). Speedup axis: 3 to 24; Efficiency axis: 0.2 to 1.
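The plots report speedup and efficiency against the number of threads. Under the usual definitions, speedup is the sequential run time divided by the run time with p threads, efficiency is speedup divided by p, and the ideal line corresponds to a speedup equal to p. The helper names below are hypothetical, and the timings in the usage note are made-up values for illustration, not measurements from the paper.

```c
#include <assert.h>

/* Speedup S(p) = T(1) / T(p). */
double speedup(double t_seq, double t_par) {
    assert(t_seq > 0.0 && t_par > 0.0);
    return t_seq / t_par;
}

/* Efficiency E(p) = S(p) / p; a value of 1.0 matches the ideal line. */
double efficiency(double t_seq, double t_par, int n_threads) {
    assert(n_threads > 0);
    return speedup(t_seq, t_par) / n_threads;
}
```

For example, a hypothetical run that takes 120 s sequentially and 10 s on 24 threads has a speedup of 12 and an efficiency of 0.5.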

SLIDE 23

Conclusions

About this paper
  • DSL-POPP hides low-level parallel programming primitives
  • Patterns may be easily nested or combined
  • Good performance for the image processing application
  • Different parallel implementation tests were performed

Future work
  • Include other parallel patterns
  • Investigate optimized techniques for code generation
  • Effort evaluation

SLIDE 24

References I

[1] Mattson G. T., Sanders A. B., and Massingill L. B. Patterns for Parallel Programming. Addison-Wesley, Boston, USA, 2005.
[2] Intel and McCool D. M. Structured Parallel Programming with Deterministic Patterns. In HotPar-2nd USENIX Workshop on Hot Topics in Parallelism, pages 1–6, Berkeley, CA, June 2010.
[3] Catanzaro R. and Keutzer K. Parallel Computing with Patterns and Frameworks. XRDS: Crossroads, The ACM Magazine for Students, 17(1):22–27, 2010.
[4] Aldinucci M., Danelutto M., Kilpatrick P., and Torquati M. FastFlow: High-Level and Efficient Streaming on Multi-core. In Programming Multi-core and Many-core Computing Systems, Parallel and Distributed Computing, chapter 13. Wiley, Boston, USA, 2013.
[5] Ciechanowicz P. and Kuchen H. Enhancing Muesli’s Data Parallel Skeletons for Multi-core Computer Architectures. In High Performance Computing and Communications (HPCC), 2010 12th IEEE International Conference on, pages 108–113, Melbourne, Australia, September 2010.
[6] Karasawa Y. and Iwasaki H. A Parallel Skeleton Library for Multi-core Clusters. In Parallel Processing, 2009. ICPP ’09. International Conference on, pages 84–91, Vienna, Austria, September 2009.

SLIDE 25

References II

[7] Leyton M. and Piquer J.M. Skandium: Multi-core Programming with Algorithmic Skeletons. In Parallel, Distributed and Network-Based Processing (PDP), 2010 18th Euromicro International Conference on, pages 289–296, Pisa, Italy, February 2010.
[8] Benoit A., Cole M., Gilmore S., and Hillston J. Flexible Skeletal Programming with eSkel. In Proceedings of the 11th international Euro-Par conference on Parallel Processing, pages 761–770, Lisboa, Portugal, September 2005.
[9] Bacci B., Danelutto M., Orlando S., Pelagatti S., and Vanneschi M. P3L: A Structured High-Level Parallel Language, and its Structured Support. Concurrency: Practice and Experience, 7(3):225–255, 1995.
[10] Aldinucci M., Danelutto M., and Teti P. An Advanced Environment Supporting Structured Parallel Programming in Java. Future Gener. Comput. Syst., 19(5):611–626, 2003.
[11] Aldinucci M., Danelutto M., and Kilpatrick P. Skeletons for Multi/Many-core Systems. In Parallel Computing: From Multicores and GPU’s to Petascale (Proc. of PARCO 2009, Lyon, France), pages 265–272, Lyon, France, September 2009.
[12] Botorog G.H. and Kuchen H. Skil: An Imperative Language with Algorithmic Skeletons for Efficient Distributed Programming. In High Performance Distributed Computing, 1996., Proceedings of 5th IEEE International Symposium on, pages 243–252, Syracuse, NY, USA, August 1996.

SLIDE 26

References III

[13] Griebler D. J. Proposta de uma Linguagem Específica de Domínio de Programação Paralela Orientada a Padrões Paralelos: um Estudo de Caso Baseado no Padrão Mestre/Escravo para Arquiteturas Multi-Core [Proposal of a Domain-Specific Language for Parallel Programming Oriented to Parallel Patterns: a Case Study Based on the Master/Slave Pattern for Multi-Core Architectures]. Master’s thesis, PUCRS, 2012.