Lecture 3: Writing Parallel Programs, Abhinav Bhatele, Department of Computer Science



SLIDE 1

Lecture 3: Writing Parallel Programs

Abhinav Bhatele, Department of Computer Science

Introduction to Parallel Computing (CMSC498X / CMSC818X)

SLIDE 2

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Announcements

  • Deepthought2 (dt2) accounts have been mailed to everyone
  • If you want to use your own account, read the Piazza post and follow instructions

SLIDE 3

Writing parallel programs

  • Decide the serial algorithm first
  • Data: how to distribute data among threads/processes?
  • Data locality: assignment of data to specific processes to minimize data movement
  • Computation: how to divide work among threads/processes?
  • Figure out how often communication will be needed

SLIDE 4

Two-dimensional stencil computation

  • Commonly found kernel in computational codes
  • Heat diffusion, Jacobi method, Gauss-Seidel method

$$A[i, j] = \frac{A[i, j] + A[i-1, j] + A[i+1, j] + A[i, j-1] + A[i, j+1]}{5}$$
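
As a serial reference point before parallelizing, the update above can be sketched in a few lines. This is a minimal sketch, not the lecture's code: it assumes the five-point sum is divided by 5, that new values are computed from the old grid (Jacobi style), and that boundary cells stay fixed. The name `stencil_step` is hypothetical.

```python
def stencil_step(grid):
    """Return a new grid after one five-point stencil sweep (Jacobi style)."""
    rows, cols = len(grid), len(grid[0])
    # Copy the grid so boundary cells carry over unchanged and the
    # input is never modified while we read from it.
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 5
    return new
```

A Gauss-Seidel variant would instead update the grid in place, so that later cells see already-updated neighbours.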

SLIDE 8

2D stencil iteration in parallel

  • 1D decomposition
  • Divide rows (or columns) among processes
  • 2D decomposition
  • Divide both rows and columns (2D blocks) among processes
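
One way to make the two decompositions concrete is to compute which rows and columns each process owns. The sketch below is illustrative only: the helper names (`block_range`, `rows_1d`, `block_2d`) and the row-major numbering of ranks on the process grid are assumptions, not something the slides specify.

```python
def block_range(n, parts, index):
    """Half-open range [start, stop) of block `index` when n items are
    split into `parts` near-equal contiguous blocks."""
    base, extra = divmod(n, parts)
    # The first `extra` blocks get one additional item each.
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def rows_1d(n_rows, n_procs, rank):
    """1D decomposition: the rows owned by `rank`."""
    return block_range(n_rows, n_procs, rank)


def block_2d(n_rows, n_cols, proc_rows, proc_cols, rank):
    """2D decomposition: (row range, column range) owned by `rank` on a
    proc_rows x proc_cols process grid with row-major rank numbering."""
    pr, pc = divmod(rank, proc_cols)
    return (block_range(n_rows, proc_rows, pr),
            block_range(n_cols, proc_cols, pc))
```

For example, `rows_1d(10, 4, 1)` gives rows 3 to 5, and `block_2d(8, 8, 2, 2, 3)` gives the lower-right 4 x 4 block. In either scheme a process would still need its neighbours' boundary cells in every iteration of the stencil.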

SLIDE 12

N-body problem

  • Simulating the movement of N bodies under gravitational forces
  • Naive algorithm: O(n^2)
  • Every body calculates forces pair-wise with every other body (particle)
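
The naive all-pairs algorithm can be sketched as follows. This is a minimal illustration under stated assumptions: two-dimensional positions, Newtonian gravity in SI units, no softening term, and no two bodies at the same position. The function name `pairwise_forces` is hypothetical.

```python
import math

G = 6.674e-11  # gravitational constant, SI units


def pairwise_forces(positions, masses):
    """Net 2D gravitational force on each body from all other bodies.

    Visits every unordered pair once, so the work grows as O(n^2).
    """
    n = len(positions)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            dist = math.hypot(dx, dy)
            magnitude = G * masses[i] * masses[j] / dist ** 2
            fx = magnitude * dx / dist
            fy = magnitude * dy / dist
            # Equal and opposite forces on the two bodies of the pair.
            forces[i][0] += fx
            forces[i][1] += fy
            forces[j][0] -= fx
            forces[j][1] -= fy
    return forces
```

Computing each pair once halves the arithmetic but does not change the quadratic growth, which is what motivates the tree-based data distributions discussed later.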

https://developer.nvidia.com/gpugems/gpugems3/part-v-physics-simulation/chapter-31-fast-n-body-simulation-cuda

SLIDE 16

Data distribution in N-body problems

  • Naive approach: assign n/k particles to each of the k processes
  • Other approaches?

http://datagenetics.com/blog/march22013/
https://en.wikipedia.org/wiki/Z-order_curve
http://charm.cs.uiuc.edu/workshops/charmWorkshop2011/slides/CharmWorkshop2011_apps_ChaNGa.pdf

Figures: space-filling curves; ORB (orthogonal recursive bisection)
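
To illustrate the space-filling-curve idea, the sketch below orders particles along a Z-order (Morton) curve and cuts the ordered list into contiguous chunks, so that particles close in space tend to land on the same process. It assumes particle positions have already been quantized to non-negative integer cell coordinates; the names `morton_index` and `assign_by_z_order` are hypothetical, and this is not taken from the lecture or from ChaNGa.

```python
def morton_index(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Z-order key
    (x occupies the even bit positions, y the odd ones)."""
    key = 0
    for b in range(bits):
        key |= ((x >> b) & 1) << (2 * b)
        key |= ((y >> b) & 1) << (2 * b + 1)
    return key


def assign_by_z_order(cells, n_procs):
    """Return owner[p] for each particle p: sort particles by Z-order key,
    then split the sorted sequence into n_procs near-equal chunks."""
    order = sorted(range(len(cells)), key=lambda p: morton_index(*cells[p]))
    owner = [0] * len(cells)
    for pos, p in enumerate(order):
        owner[p] = pos * n_procs // len(cells)
    return owner
```

Each process still receives roughly n/k particles, as in the naive approach, but the particles it owns are now mostly neighbours in space.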

SLIDE 19

Data distribution in N-body problems

  • Let us consider a two-dimensional space with bodies/particles in it

Quad-tree: not all nodes are shown
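
A quad-tree over such a two-dimensional space can be sketched as below: a square region is split into four equal quadrants whenever it holds more than a fixed number of particles. This is an illustrative sketch only; the class name `QuadNode`, the `capacity` parameter, and the `MIN_SIZE` guard against endless splitting of coincident points are assumptions, not details from the lecture.

```python
MIN_SIZE = 1e-9  # stop splitting below this edge length (guards duplicates)


class QuadNode:
    """Square region [x, x+size) x [y, y+size) of a quad-tree."""

    def __init__(self, x, y, size, capacity=1):
        self.x, self.y, self.size, self.capacity = x, y, size, capacity
        self.points = []
        self.children = None  # None while this node is a leaf

    def insert(self, px, py):
        if self.children is not None:
            self._child_for(px, py).insert(px, py)
            return
        self.points.append((px, py))
        if len(self.points) > self.capacity and self.size > MIN_SIZE:
            self._split()

    def _split(self):
        half = self.size / 2
        # Children are stored row by row: index = 2 * row + col.
        self.children = [
            QuadNode(self.x + dx * half, self.y + dy * half, half, self.capacity)
            for dy in (0, 1) for dx in (0, 1)
        ]
        pts, self.points = self.points, []
        for px, py in pts:
            self._child_for(px, py).insert(px, py)

    def _child_for(self, px, py):
        half = self.size / 2
        col = 1 if px >= self.x + half else 0
        row = 1 if py >= self.y + half else 0
        return self.children[2 * row + col]

    def leaves(self):
        if self.children is None:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]
```

Dense regions end up with deeper subtrees, so assigning whole subtrees or runs of leaves to processes is one possible way to distribute particles while keeping nearby ones together.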

SLIDE 20

Load balance and grain size

  • Load balance: try to balance the amount of work (computation) assigned to different threads/processes
  • Bring the ratio of maximum to average load as close to 1 as possible
  • Secondary consideration: also balance the amount of communication
  • Grain size: ratio of computation to communication
  • Coarse-grained (more computation) vs. fine-grained (more communication)
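
Both quantities are easy to compute for a small example. The sketch below is a rough model under stated assumptions: load is a single number per process, and the grain-size estimate is for an n x n five-point stencil on an interior process, with p dividing the grid evenly (and p a perfect square in the 2D case). The function names are hypothetical.

```python
import math


def load_imbalance(loads):
    """Ratio of maximum to average load; 1.0 means perfectly balanced."""
    return max(loads) / (sum(loads) / len(loads))


def stencil_grain_size(n, p, decomposition):
    """Cells computed per cell sent, per process and per iteration."""
    compute = n * n / p
    if decomposition == "1d":
        comm = 2 * n                  # one full row to each of two neighbours
    elif decomposition == "2d":
        comm = 4 * n / math.sqrt(p)   # one block edge to each of four neighbours
    else:
        raise ValueError("decomposition must be '1d' or '2d'")
    return compute / comm
```

With n = 1024 and p = 16, this model gives a ratio of 32 for the 1D decomposition and 64 for the 2D decomposition, so under these assumptions the 2D layout is the coarser-grained of the two.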

SLIDE 21

Abhinav Bhatele 5218 Brendan Iribe Center (IRB) / College Park, MD 20742 phone: 301.405.4507 / e-mail: bhatele@cs.umd.edu