Lecture 8: Designing Parallel Algorithms Abhinav Bhatele, Department - - PowerPoint PPT Presentation

lecture 8 designing parallel algorithms
SMART_READER_LITE
LIVE PREVIEW

Lecture 8: Designing Parallel Algorithms Abhinav Bhatele, Department - - PowerPoint PPT Presentation

Introduction to Parallel Computing (CMSC498X / CMSC818X) Lecture 8: Designing Parallel Algorithms Abhinav Bhatele, Department of Computer Science Announcements Assignment 1 has been released and due on October 5 Project descriptions are


slide-1
SLIDE 1

Lecture 8: Designing Parallel Algorithms

Abhinav Bhatele, Department of Computer Science

Introduction to Parallel Computing (CMSC498X / CMSC818X)

slide-2
SLIDE 2

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Announcements

  • Assignment 1 has been released and due on October 5
  • Project descriptions are due on September 28
  • Quiz on September 30, due on October 1

2

slide-3
SLIDE 3

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Writing parallel programs

  • Decide the serial algorithm first
  • Data: how to distribute data among threads/processes?
  • Data locality: assignment of data to specific processes to minimize data movement
  • Computation: how to divide work among threads/processes?
  • Figure out how often communication is needed

3

slide-4
SLIDE 4

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Prefix sum

  • Calculate partial sums of elements in array
  • Also called a “scan” sometimes

4

pSum[0] = A[0] for(i=1; i<N; i++) { pSum[i] = pSum[i-1] + A[i]

1 2 3 4 5 6 … 1 3 6 10 15 21 …

slide-5
SLIDE 5

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Parallel prefix sum

5

2 8 3 5 7 4 1 6

slide-6
SLIDE 6

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Parallel prefix sum

5

2 8 3 5 7 4 1 6 2 10 11 8 12 11 5 7

slide-7
SLIDE 7

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Parallel prefix sum

5

2 8 3 5 7 4 1 6 2 10 11 8 12 11 5 7 2 10 13 18 23 19 17 18

slide-8
SLIDE 8

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Parallel prefix sum

5

2 8 3 5 7 4 1 6 2 10 11 8 12 11 5 7 2 10 13 18 25 29 30 36 2 10 13 18 23 19 17 18

slide-9
SLIDE 9

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

In practice

  • You have N numbers and P processes, N >> P
  • Assign a N/P block to each process
  • Do calculation for the blocks on each process locally
  • Then do parallel algorithm with partial prefix sums

6

slide-10
SLIDE 10

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Parallel Sorting

  • Sorting is used in many HPC codes
  • For example, figuring out which particles/atoms are within a cutoff radius
  • Two broad categories of parallel sorting algorithms:
  • Merge-based
  • Splitter-based

7

slide-11
SLIDE 11

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Review QuickSort

  • Choose a pivot element from the unsorted list
  • Move all elements < pivot before the pivot and all elements > pivot after the pivot
  • Recursively apply this to the sublists before and after pivot

8

slide-12
SLIDE 12

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Sample Sort

  • Generalization of QuickSort
  • Instead of selecting one pivot, we select s-1 samples randomly
  • This provides us with s-1 “splitters”
  • Once sorted, these s-1 splitters create s buckets
  • Keys are then placed in the appropriate bucket
  • Call sample sort or quick sort recursively

9

slide-13
SLIDE 13

Abhinav Bhatele (CMSC498X/CMSC818X) LIVE RECORDING

Parallel Sample Sort

  • Assumption: keys are distributed across all processors in the beginning
  • Sample s keys randomly from each process
  • Bring all keys s * p keys to one process
  • select p-1 splitters from this sorted sample
  • Send all splitters to all processes
  • Processes exchange data based on buckets
  • Call some fast sorting algorithm locally

10

slide-14
SLIDE 14

Abhinav Bhatele 5218 Brendan Iribe Center (IRB) / College Park, MD 20742 phone: 301.405.4507 / e-mail: bhatele@cs.umd.edu