OpenMP - Instructor PanteA Zardoshti, Department of Computer Engineering - PowerPoint PPT Presentation



SLIDE 1

OpenMP

Instructor: PanteA Zardoshti
Department of Computer Engineering, Sharif University of Technology
e-mail: azad@sharif.edu

SLIDE 2

What Is OpenMP?

  • OpenMP is an application programming interface that provides a parallel programming model for shared memory (and distributed shared memory) multiprocessors. It extends programming languages (C/C++ and Fortran) by
  • a set of compiler directives (called pragmas) to express shared memory parallelism.
  • runtime library routines and environment variables that are used to examine and modify execution parameters.

Computational Mathematics, OpenMP, Sharif University, Fall 2015

SLIDES 3-7

Execution Model

  • OpenMP is based on the fork-join execution model.
  • At the start of an OpenMP program, a single thread (the master thread) is executing.
  • Creating teams of threads
  • Sharing work among threads
  • Synchronizing the threads

[Figure: the master thread forks teams of worker threads in parallel regions, then joins back into a single thread.]

SLIDES 8-10

OpenMP Pragma Syntax

  • Most constructs in OpenMP are compiler directives or pragmas.
  • For C and C++, the pragmas take the form:

#pragma omp construct [clause [clause]…]

  • Example:

#pragma omp parallel num_threads(4)

  • Most OpenMP constructs apply to a "structured block".
  • Structured block: a block of one or more statements with one point of entry at the top and one point of exit at the bottom.

SLIDES 11-16

Parallel Regions

  • Defines a parallel region over a structured block of code:

#pragma omp parallel
{
    block
}

  • If the master thread arrives at a parallel directive, it spawns some new threads and forms a team of threads.
  • At the end of the parallel section the execution is joined again in the single master thread.
  • There is an implicit barrier at this point.

[Figure: Thread 1, Thread 2, and Thread 3 executing the parallel region.]

SLIDES 17-19

How Many Threads?

  • The number of OpenMP threads can be set using:
  • Environment variable OMP_NUM_THREADS
  • Runtime function omp_set_num_threads(n)
  • Other useful functions to get information about threads:
  • Runtime function omp_get_num_threads()
  • Returns the number of threads in the parallel region
  • Returns 1 if called outside a parallel region
  • Runtime function omp_get_thread_num()
  • Returns the id of the thread in the team
  • Value in [0, n-1] // where n = #threads
  • The master thread always has id 0

SLIDES 20-25

First OpenMP Example

#include <omp.h>      /* OpenMP include file */
#include <stdio.h>

int main() {
    #pragma omp parallel            /* parallel region with default number of threads */
    {
        int ID = omp_get_thread_num();   /* runtime library function returning the thread ID */
        printf(" hello(%d) ", ID);
        printf(" world(%d) \n", ID);
    }                               /* end of the parallel region */
}

Sample output (order varies between runs):

hello(1) hello(0) world(1) world(0) hello(3) hello(2) world(3) world(2)

SLIDE 26

A First OpenMP Example (cont.)

  • Each thread executes a copy of the code within the structured block.

#include <omp.h>
#include <stdio.h>

int main() {
    omp_set_num_threads(4);         /* runtime function to request a certain number of threads */
    #pragma omp parallel
    {
        int ID = omp_get_thread_num();
        printf(" hello(%d) ", ID);
        printf(" world(%d) \n", ID);
    }
}

SLIDES 27-28

Shared Memory Model

int ID = omp_get_thread_num();
THREADS = omp_get_num_threads();

  • All threads try to access the same variable (possibly at the same time). This can lead to a race condition. Different runs of the same program might give different results because of race conditions.

[Figure: Threads 1-4 accessing Var: ID and Var: THREADS in shared memory.]

SLIDES 29-32

Data Scoping Clauses

  • OpenMP provides a way to declare variables private or shared within an OpenMP block. This is done using the following OpenMP clauses:
  • SHARED(list)
  • All variables in list will be considered shared.
  • Every OpenMP thread has access to all these variables.
  • PRIVATE(list)
  • Every OpenMP thread will have its own private copy of the variables in list.
  • No other OpenMP thread has access to this private copy.

#pragma omp parallel private(a,b,c)

  • NOTE: By default most variables are considered shared in OpenMP. Exceptions include loop index variables (Fortran, C/C++) and variables declared inside the parallel region (C/C++).

SLIDES 33-36

Data Scoping Clauses (cont.)

  • FIRSTPRIVATE(list):
  • Same as PRIVATE, but every private copy of a variable 'x' will be initialized with the original value of 'x' (from before the OpenMP region started).
  • LASTPRIVATE(list):
  • Same as PRIVATE, but the private copies of the variables in list from the last work-sharing iteration will be copied to the shared version. To be used with the for directive.
  • DEFAULT(SHARED | PRIVATE | FIRSTPRIVATE | LASTPRIVATE):
  • Specifies the default scope for all variables in the OpenMP region.

SLIDES 37-42

Example

int a;
void foo() {
    int b, c;

    #pragma omp parallel private(b)
    {
        int d;

        #pragma omp task
        {
            int e;
            a = /* shared */
            b = /* firstprivate */
            c = /* shared */
            d = /* firstprivate */
            e = /* private */
        }
    }
}

SLIDE 43

Work Sharing

  • So far, all threads in a parallel region did the same work. Not very useful. What we want is to share work among all threads so we can solve our problems faster.

#pragma omp parallel
{
    for (i = 0; i < 100; i++)
        a[i] = a[i] + b;
}

  • Partition the iteration space manually: every thread computes N/num_threads iterations.

SLIDES 44-45

Work Sharing (cont.)

  • OpenMP takes care of partitioning the iteration space for you. The only thing needed is to add #pragma omp for:

#pragma omp parallel
{
    #pragma omp for
    for (i = 0; i < 100; i++) {
        a[i] = a[i] + b;
    }
}

  • Even more compact by combining the parallel and for directives:

#pragma omp parallel for

SLIDES 46-47

Example

omp_set_num_threads(4);

#pragma omp parallel for shared(n, a, b, c) \
                         private(i)
for (i = 0; i < n; i++)
    c[i] = a[i] + b[i];

SLIDE 48

Example Parallel Execution

[Figure: the n loop iterations divided among the four threads.]