SLIDE 1

A proposal for User-Defined Reductions in OpenMP

A. Duran (1), R. Ferrer (1), M. Klemm (2), B. de Supinski (3), E. Ayguadé (1)

(1) BSC, (2) Intel, (3) LLNL

June 16th 2010

SLIDE 2

Outline

  1. Motivation
  2. UDR Design rationale
  3. Declaring UDRs
  4. Array reductions
  5. C++ specific extensions
  6. Conclusions

Duran et al. () UDRs in OpenMP June 16th 2010 2 / 28

SLIDE 5

Motivation

Reductions in OpenMP 3.0

Current OpenMP supports reductions over:
  • basic scalar types
  • simple arithmetic operators (+, -, *, &, ...)
  • array reductions (Fortran only)
  • min and max operators (Fortran only)

Users with other reductions must find their way manually:
  • using critical
  • using atomic (when possible)
  • adding complex code to implement the reduction
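As a minimal sketch of the first workaround (a critical section), assume a hypothetical two-field complex_t and a complex_mul helper; neither definition appears in the slides. Each thread accumulates a private partial product and merges it once under critical:

```c
#include <stddef.h>

/* Hypothetical complex type; the slides do not show its definition. */
typedef struct { double re, im; } complex_t;

/* Assumed combiner: ordinary complex multiplication. */
static complex_t complex_mul(complex_t a, complex_t b) {
    complex_t r = { a.re * b.re - a.im * b.im,
                    a.re * b.im + a.im * b.re };
    return r;
}

/* Product of all elements, combined by hand with a critical section. */
complex_t product_critical(const complex_t *array, size_t n) {
    complex_t prd = { 1.0, 0.0 };            /* identity of the product */

    #pragma omp parallel
    {
        complex_t local = { 1.0, 0.0 };      /* per-thread partial product */

        #pragma omp for
        for (size_t i = 0; i < n; i++)
            local = complex_mul(local, array[i]);

        /* One merge per thread; critical serializes the updates to prd. */
        #pragma omp critical
        prd = complex_mul(prd, local);
    }
    return prd;
}
```

The merge happens once per thread rather than once per iteration, which keeps contention on the critical section low, but the user still has to write and maintain this scaffolding.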

SLIDE 6

Motivation

Reductions by hand

    complex_t complex_mul(complex_t a, complex_t b);

    void example(complex_t *array, size_t N) {
        complex_t prd = {1.0, 0.0};

        /* Not valid in OpenMP 3.0: no reduction operator applies to complex_t. */
        #pragma omp parallel reduction(prd)
        {
            #pragma omp for
            for (size_t i = 0; i < N; i++)
                prd = complex_mul(prd, array[i]);
        }
    }

SLIDE 7

Motivation

Reductions by hand

    #include <stddef.h>
    #include <omp.h>

    complex_t complex_mul(complex_t a, complex_t b);

    void example(complex_t *array, size_t N) {
        complex_t prd = {1.0, 0.0};

        int nthreads = omp_get_max_threads();
        complex_t part_prd[nthreads];
        /* Fewer threads than the maximum may run, so start every slot at the identity. */
        for (int thr = 0; thr < nthreads; thr++)
            part_prd[thr] = prd;

        #pragma omp parallel shared(part_prd) private(prd)
        {
            /* C has no brace assignment; a compound literal resets the private copy. */
            prd = (complex_t){1.0, 0.0};
            #pragma omp for
            for (size_t i = 0; i < N; i++)
                prd = complex_mul(prd, array[i]);
            part_prd[omp_get_thread_num()] = prd;
        }

        /* Combine the per-thread partial products sequentially. */
        for (int thr = 0; thr < nthreads; thr++)
            prd = complex_mul(prd, part_prd[thr]);
    }

SLIDE 9

Motivation

Not good

Drawbacks

  • More complex user code
  • Error prone
  • Doesn't benefit from implementation improvements

Solution

Add user-defined reductions to OpenMP

SLIDE 11

Motivation

User-defined reductions

Allow the user to inform OpenMP about new reductions by providing:
  • a type
  • an operation over that type
  • the identity value for that operation and type

Presented to the OpenMP language committee (still under discussion)

SLIDE 13

UDR Design rationale

Driving design goals

  • Follow the OpenMP directive-based philosophy
  • Support all OpenMP base languages
      - but maintain a common syntax as much as possible
  • Follow a declaration/usage pattern
  • Allow code reuse
  • Allow efficient implementation
      - require associativity and commutativity
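Why associativity and commutativity help can be illustrated with a small serial sketch: a runtime is free to split the iterations into chunks and merge the partial results in any order, and the answer only stays the same when the operation has both properties. The helper names below are illustrative and not part of the proposal.

```c
#include <stddef.h>

typedef int (*combiner_t)(int, int);

static int add(int a, int b) { return a + b; }   /* associative and commutative */
static int sub(int a, int b) { return a - b; }   /* neither */

/* Fold left to right, as a sequential loop would. */
int fold_serial(combiner_t op, int identity, const int *v, size_t n) {
    int acc = identity;
    for (size_t i = 0; i < n; i++)
        acc = op(acc, v[i]);
    return acc;
}

/* Emulate two threads: fold each half separately, then merge the second
   partial result before the first, as a runtime would be allowed to do. */
int fold_two_chunks_reversed(combiner_t op, int identity, const int *v, size_t n) {
    size_t half = n / 2;
    int first  = fold_serial(op, identity, v, half);
    int second = fold_serial(op, identity, v + half, n - half);
    return op(op(identity, second), first);
}
```

With add, both folds agree for any input; with sub, the chunked fold generally differs from the serial one, which is why such an operator could not be parallelized safely.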

SLIDE 15

Declaring UDRs

Declaring a UDR

Syntax

C:

    #pragma omp declare reduction(operator-list : type) [clause]

C++:

    #pragma omp declare reduction([template<template-params>] operator-list : type-list) [clause]

Fortran:

    !$omp declare reduction(operator-list : typename) [clause]

where clause is:

    identity(expression | brace-initializer | constructor[(argument-list)])

The brace-initializer form is available in C/C++ only, and the constructor form in C++ only.

SLIDE 16

Declaring UDRs

Example

    complex_t complex_mul(complex_t a, complex_t b);

    #pragma omp declare reduction(complex_mul : complex_t) \
        identity({1.0, 0.0})

    void example(complex_t *array, size_t N) {
        complex_t prd = {1.0, 0.0};

        #pragma omp parallel for reduction(complex_mul : prd)
        for (size_t i = 0; i < N; i++)
            prd = complex_mul(prd, array[i]);
    }
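The directive in this example uses the syntax proposed in these slides, which is not what compilers accept today. For experimentation, the same reduction can be sketched with the declare reduction syntax that was later standardized in OpenMP 4.0, where the combiner is an expression over omp_out and omp_in and the identity is given by an initializer clause. The complex_t layout, the body of complex_mul, and the reduction name cmul are assumptions, since the slides only declare the type and function.

```c
#include <stddef.h>

/* Assumed layout; the slides only name the type. */
typedef struct { double re, im; } complex_t;

/* Assumed definition of the combiner declared on the slide. */
static complex_t complex_mul(complex_t a, complex_t b) {
    complex_t r = { a.re * b.re - a.im * b.im,
                    a.re * b.im + a.im * b.re };
    return r;
}

/* OpenMP 4.0 form: reduction-identifier : type : combiner, plus initializer.
   The identifier cmul is an illustrative name chosen for this sketch. */
#pragma omp declare reduction(cmul : complex_t : \
        omp_out = complex_mul(omp_out, omp_in)) \
    initializer(omp_priv = (complex_t){ 1.0, 0.0 })

complex_t product(const complex_t *array, size_t n) {
    complex_t prd = { 1.0, 0.0 };

    #pragma omp parallel for reduction(cmul : prd)
    for (size_t i = 0; i < n; i++)
        prd = complex_mul(prd, array[i]);

    return prd;
}
```

Compiled without OpenMP support, the pragmas are ignored and the loop runs serially with the same result, which makes the sketch easy to check.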

SLIDE 17

Declaring UDRs

Operator requisites

  • Binary operators whose arguments are compatible with the UDR type
      - Return value either by return or by parameter (*, &)
      - Return value priority: function return, left parameter, right parameter
      - const and references are allowed
  • (C++) Unary member functions
      - specified by prepending a dot to the operator name
  • (C++) Valid overloaded operators
  • Associative
  • Commutative
  • Available using base-language "symbol" lookup
      - both at declaration and usage

SLIDE 18

Declaring UDRs

Identity value

By default:
  • C/Fortran: zero initialization
  • C++: C++ rules for value-initialization

Can be overridden by the identity clause
  • The identity expression must always evaluate to the same value
  • An implementation can evaluate it one or more times
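The default matters because every private copy starts from the identity: zero initialization is neutral for a sum but not for a product. A small serial sketch of that check follows; is_identity_for is a hypothetical helper and not part of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>

typedef double (*binop_t)(double, double);

static double add(double a, double b) { return a + b; }
static double mul(double a, double b) { return a * b; }

/* Returns true when combining `candidate` with each sample, on either side,
   leaves the sample unchanged, which is what a reduction identity must do. */
bool is_identity_for(binop_t op, double candidate,
                     const double *samples, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (op(candidate, samples[i]) != samples[i] ||
            op(samples[i], candidate) != samples[i])
            return false;
    }
    return true;
}
```

Here 0.0 passes for add but fails for mul, so a product reduction would need an explicit identity clause supplying 1.0.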

SLIDE 20

Array reductions

Array UDRs

Declaration

  • Types can be suffixed with [] to indicate an array UDR
  • One [] per dimension
  • Dimension sizes are not fixed at declaration
      - Operators can take additional integer parameters to receive the actual sizes

Usage

  • Array UDRs can be applied to variables of:
      - array types
      - pointer-to-array types, which allows "recovering" arrays passed through function calls
  • Support for VLAs (and pointers to VLAs) is limited to C

SLIDE 21

Array reductions

Example

    const int N = 10;
    void vector_add(int *A, int *B, int n);

    #pragma omp declare reduction(vector_add : int[])

    void foo(int *a, int *b, int n)
    {
        int v1[N];
        int (*v2)[N] = (int (*)[N]) a;
        int (*v3)[n] = (int (*)[n]) b;  // VLA; only valid in C

        #pragma omp for reduction(vector_add : v1, v2, v3)
        for (...) { ... }
    }
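For comparison, the effect of an array reduction can be approximated by hand in C today. The sketch below gives each thread a private array initialized to the identity and merges it with a vector_add combiner under critical; the histogram workload, the BINS constant, and the body of vector_add are assumptions made for illustration.

```c
#include <stddef.h>

enum { BINS = 4 };   /* illustrative array size */

/* Assumed combiner body: element-wise A[i] += B[i]. */
static void vector_add(int *A, const int *B, int n) {
    for (int i = 0; i < n; i++)
        A[i] += B[i];
}

/* Count how many values fall into each bin (value modulo BINS). */
void histogram(const unsigned *values, size_t count, int *hist) {
    for (int b = 0; b < BINS; b++)
        hist[b] = 0;

    #pragma omp parallel
    {
        int local[BINS] = { 0 };             /* private copy, zero identity */

        #pragma omp for
        for (size_t i = 0; i < count; i++)
            local[values[i] % BINS]++;

        /* Merge this thread's partial array into the shared result. */
        #pragma omp critical
        vector_add(hist, local, BINS);
    }
}
```

An array UDR would let the runtime create the private copies and choose the merge strategy, instead of fixing both in user code as done here.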

SLIDE 24

C++ specific extensions

Constructed identities

  • The special constructor keyword can be used in the identity clause
  • Private copies will be initialized with a constructor instead of by assignment

    #pragma omp declare reduction(* : Complex) identity(constructor(1.0, 0.0))

SLIDE 26

C++ specific extensions

Inheritance support

Idea

Support the C++ philosophy of allowing methods defined on base classes to be used with derived classes

  • UDRs of base classes can be used for derived classes
  • Only if this does not cause object slicing
      - operator parameters must be pointers or references

SLIDE 28

C++ specific extensions

Template UDRs

Allows declaring a UDR for all (or some) instantiations of a template type

    #pragma omp declare \
        reduction(template<class T> std::list<T>::merge : std::list<T>)

    void foo()
    {
        std::list<int> li;
        std::list<float> lf;

        #pragma omp parallel for reduction(std::list<int>::merge : li) \
                                 reduction(std::list<float>::merge : lf)
        for (...) { ... }
    }

SLIDE 30

C++ specific extensions

Dot syntax

Simplify qualification in C++ by auto-qualifying the operator based on the UDR type

    /* #pragma omp declare \
           reduction(template<class T> .merge : std::list<T>) */

    #pragma omp declare \
        reduction(.merge : std::list<int>, std::list<float>)

    void foo()
    {
        std::list<int> li;
        std::list<float> lf;

        #pragma omp parallel for reduction(.merge : li, lf)
        for (...) { ... }
    }

SLIDE 32

Conclusions

Conclusions

  • The proposal allows users to extend OpenMP with their own reduction operations
      - aggregate types
      - arrays
      - template types
  • Still under discussion for OpenMP 3.1
  • We're looking for user codes

SLIDE 33

Conclusions

The End

Thanks for your attention!

SLIDE 34

Conclusions

Min/Max

    #pragma omp declare \
        reduction(template<typename T> std::min : T) \
        identity(std::numeric_limits<T>::max())
    // Note: for floating-point T, numeric_limits<T>::min() is the smallest
    // positive value, not the lowest one, so this identity suits integer types.
    #pragma omp declare \
        reduction(template<typename T> std::max : T) \
        identity(std::numeric_limits<T>::min())

SLIDE 35

Conclusions

Some example

    /* Keep the smaller coordinates in *a. */
    inline void pointMin(Point *a, Point *b)
    {
        if (b->x < a->x) a->x = b->x;
        if (b->y < a->y) a->y = b->y;
    }

    /* Keep the larger coordinates in *a. */
    inline void pointMax(Point *a, Point *b)
    {
        if (b->x > a->x) a->x = b->x;
        if (b->y > a->y) a->y = b->y;
    }

    #pragma omp declare reduction(pointMax : Point *)
    #pragma omp declare reduction(pointMin : Point *)

    void findMinMax(PointVector points, index_t n, Point *minPoint,
                    Point *maxPoint)
    {
        #pragma omp parallel for schedule(static) \
            reduction(pointMin : minPoint) reduction(pointMax : maxPoint)
        for (index_t i = 1; i < n; i++) {
            if (points[i].x < minPoint->x) minPoint->x = points[i].x;
            if (points[i].y < minPoint->y) minPoint->y = points[i].y;
            if (points[i].x > maxPoint->x) maxPoint->x = points[i].x;
            if (points[i].y > maxPoint->y) maxPoint->y = points[i].y;
        }
    }

SLIDE 36

Conclusions

Gromacs

    void array_rvec_add(rvec *a, rvec *b, int n)
    {
        for (int i = 0; i < n; i++) {
            a[i].xx += b[i].xx;
            a[i].yy += b[i].yy;
            a[i].zz += b[i].zz;
        }
    }

    /* Named to match the declare reduction directive below. */
    void array_real_add(real *a, real *b, int n)
    {
        for (int i = 0; i < n; i++) a[i] += b[i];
    }

    #pragma omp declare reduction(array_rvec_add : rvec[]) identity({0.0, 0.0, 0.0})
    #pragma omp declare reduction(array_real_add : real[]) identity(0.0)

    ...
    neg2 = mdatoms->nenergrp * mdatoms->nenergrp;

    rvec (*pf)[mdatoms->nr] = (rvec (*)[mdatoms->nr]) f;
    /* fshift is assumed here by analogy with f, egcoul and egnb;
       the slide initializes pfshift from itself. */
    rvec (*pfshift)[SHIFTS] = (rvec (*)[SHIFTS]) fshift;
    real (*pegcoul)[neg2] = (real (*)[neg2]) egcoul;
    real (*pegnb)[neg2] = (real (*)[neg2]) egnb;

    #pragma omp parallel for reduction(array_rvec_add : pf, pfshift) \
        reduction(array_real_add : pegcoul, pegnb)
    for (b = 0; b < nb; b++)
        function(pfshift, pf, pegcoul, pegnb);

    ...