A Course-Based Usability Analysis of Cilk Plus and OpenMP



SLIDE 1

A Course-Based Usability Analysis of Cilk Plus and OpenMP


Michael Coblenz, Robert Seacord, Brad Myers, Joshua Sunshine, and Jonathan Aldrich

SLIDE 2

PROGRAMMING PARALLEL SYSTEMS

  • Parallel programming is notoriously hard
  • Must coordinate work of many different processing units

SLIDE 3

C LANGUAGE PARALLEL PROGRAMMING

  • C has only low-level parallel programming features
  • The CPLEX study group wants to fix this!
  • Can they decide on a human-centered basis?

SLIDE 4

SLIDE 5

THE DESIGN SPACE

  • Cilk Plus and OpenMP: both existing, popular approaches
  • Both use shared-memory, fork-join parallelism

SLIDE 6

FORK-JOIN PARALLELISM

[Diagram: a monolithic task is split into subtasks, whose partial results are merged into a combined result]

  • Tasks, e.g. loop iterations, are split across threads
  • Reducers combine partial results
  • Overall approach in Cilk Plus and OpenMP: split task into subtasks; assign subtasks to different threads
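
As a rough illustration of the fork-join pattern described on this slide, the sketch below sums squares with an OpenMP parallel loop and the built-in `+` reduction. The function name `sum_of_squares` is hypothetical and does not come from the study materials. If the compiler is run without OpenMP support, the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>

/* Hypothetical example: sum of i*i for i in [0, n). */
long sum_of_squares(int n) {
    assert(n >= 0);
    long total = 0;

    /* Fork: loop iterations are divided among the threads of the team.
     * Each thread accumulates into its own private copy of `total`,
     * which starts at 0.
     * Join: at the end of the loop the private copies are combined
     * with `+` into the original `total`. */
    #pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; i++) {
        total += (long)i * i;
    }
    return total;
}
```

Calling `sum_of_squares(10)` adds the squares of 0 through 9. The reduction clause plays the role of the reducer in the diagram: each subtask produces a partial sum, and the partial sums are merged at the join.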

SLIDE 7

OpenMP vs. Cilk Plus

  • OpenMP: very popular approach using compiler directives
      • Introduced in 1997
      • Managed by OpenMP ARB
      • Came from industry
      • Supports FORTRAN too
  • Cilk Plus: now owned by Intel, originated at MIT
      • Introduced in 1995
      • Keyword-based approach

SLIDE 8

EMPIRICAL COMPARISON

  • Does one approach lead to fewer bugs? Faster performance? Faster task completion times?
  • Preliminary study; didn’t plan for statistical significance

SLIDE 9

METHOD

  • Master’s level Secure Coding class assignment
  • 9 students; 8 submitted the assignment; 8 participated in experiment
  • Experienced programmers but no apparent experience with parallel programming
  • Gave lectures on OpenMP, Cilk Plus, parallel programming (including race conditions)

SLIDE 10

TASK

  • Parallelize provided serial anagram-finding code twice
  • Told to use reducers and get speedup ≥ 1.5
  • All students used both extensions; controlled for ordering effects
  • Students given VM with Eclipse + Fluorite

SLIDE 11

CILK PLUS CRASH COURSE (1)

    cilk_for (int i = 0; i < 10; i++) {
        printf("Hello, world!");
    }
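
To show `cilk_for` on a loop whose iterations are independent, here is a minimal sketch. The function `square_all` is hypothetical. The fallback macro is an assumption added so the code also builds with compilers that lack Cilk Plus: it relies on the `__cilk` macro being defined when Cilk Plus is enabled, and otherwise replaces `cilk_for` with a plain `for` (the serial equivalent of the program).

```c
#include <assert.h>

#ifdef __cilk
#include <cilk/cilk.h>   /* provides the cilk_for keyword macro */
#else
#define cilk_for for     /* assumed fallback: run the loop serially */
#endif

/* Hypothetical example: out[i] = in[i] squared, for i in [0, n).
 * Each iteration writes a distinct element, so iterations may run
 * in any order or in parallel without a data race. */
void square_all(const int *in, int *out, int n) {
    assert(n >= 0);
    cilk_for (int i = 0; i < n; i++) {
        out[i] = in[i] * in[i];
    }
}
```

Because every iteration touches only its own `out[i]`, no reducer is needed here. A reducer becomes necessary once iterations update shared state, as in the anagram task.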

SLIDE 12

CILK PLUS CRASH COURSE (2)

    CILK_C_DECLARE_REDUCER(results_t) results = CILK_C_INIT_REDUCER(
        results_t, reduce, identity, destroy
    );

    cilk_for (int i = 0; i < word_len - pos; i++) {
        find_anagrams(dict, permutations[i], results, word_len, pos + 1);
    }
    …
    results_append(&REDUCER_VIEW(*results), word);

SLIDE 13

OPENMP CRASH COURSE (1)

    #pragma omp parallel for
    for (int i = 0; i < 10; i++) {
        printf("Hello, world!");
    }

SLIDE 14

OPENMP CRASH COURSE (2)

    #pragma omp declare reduction (results_reduction : results_t : \
            results_reduce(&omp_out, &omp_in)) \
            initializer(results_init(&omp_priv))

    results_t results;
    …
    #pragma omp parallel for reduction(results_reduction: results)
    for (int i = 0; i < word_len; i++) {
        find_anagrams(dict, permutations[i], &results, word_len, 1);
    }
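
The slide's snippet omits the definitions of `results_t`, `results_reduce`, and `results_init`. The self-contained sketch below follows the same `declare reduction` pattern with a much simpler, hypothetical type: `stats_t`, `stats_init`, `stats_merge`, and `count_long_words` are illustrative stand-ins and are not the code used in the study. Without OpenMP support the pragmas are ignored and the loop runs serially.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduction value: how many words matched, and their
 * combined length. */
typedef struct {
    long matches;
    long total_len;
} stats_t;

/* Identity value: used for each thread's private copy. */
void stats_init(stats_t *s) {
    s->matches = 0;
    s->total_len = 0;
}

/* Combiner: folds one partial result into another. */
void stats_merge(stats_t *out, const stats_t *in) {
    out->matches += in->matches;
    out->total_len += in->total_len;
}

#pragma omp declare reduction(stats_reduction : stats_t : \
        stats_merge(&omp_out, &omp_in)) \
        initializer(stats_init(&omp_priv))

/* Counts the words with at least min_len characters. */
stats_t count_long_words(const char *const *words, int n, size_t min_len) {
    assert(n >= 0);
    stats_t stats;
    stats_init(&stats);

    /* Each thread updates a private stats_t; the private copies are
     * merged with stats_merge when the loop finishes. */
    #pragma omp parallel for reduction(stats_reduction : stats)
    for (int i = 0; i < n; i++) {
        size_t len = strlen(words[i]);
        if (len >= min_len) {
            stats.matches++;
            stats.total_len += (long)len;
        }
    }
    return stats;
}
```

The three pieces a programmer must supply are visible here: the type, the combiner referenced through `omp_out` and `omp_in`, and the initializer referenced through `omp_priv`.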

SLIDE 15

WHICH DO YOU PREFER?

OpenMP:

    #pragma omp declare reduction (results_reduction : results_t : \
            results_reduce(&omp_out, &omp_in)) \
            initializer(results_init(&omp_priv))

Cilk Plus:

    CILK_C_DECLARE_REDUCER(results_t) results = CILK_C_INIT_REDUCER(
        results_t, reduce, identity, destroy
    );

SLIDE 16

SUMMARY OF RESULTS

                                                       Cilk Plus   OpenMP
Number of correct programs                             5           3
Average speedup                                        1.5         1.2
Number of correct programs with speedup at least 1.5   4           2

SLIDE 17

CORRECTNESS

  • 4/8 OpenMP solutions attempted to use reducer
      • One tried but failed; three were successful
  • 4/8 OpenMP solutions didn’t use reducers at all (but two tried)
      • One tried to use #pragma omp critical but neglected {}
  • 8/8 Cilk Plus solutions attempted to use reducer
      • Two tried but failed (one declared reducer but didn’t use it; one called REDUCER_VIEW outside parallel region)
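
The missing-braces mistake with `#pragma omp critical` can be illustrated with a small sketch. The function `collect_evens` is hypothetical and unrelated to the students' submissions. It appends to a shared buffer, which takes two statements that must execute together under mutual exclusion.

```c
#include <assert.h>

/* Hypothetical example: copies the even numbers in [0, n) into `out`
 * (in unspecified order when run in parallel) and returns how many
 * were stored. `out` must have room for (n + 1) / 2 values. */
int collect_evens(int n, int *out) {
    assert(n >= 0);
    int count = 0;   /* shared by all threads */

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        if (i % 2 == 0) {
            /* The braces make both statements one critical section. */
            #pragma omp critical
            {
                out[count] = i;
                count++;
            }
        }
    }
    return count;
}
```

Without the braces, the pragma would apply only to the next statement, `out[count] = i;`, and `count++` would run unprotected, so two threads could write to the same slot or lose an increment.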

SLIDE 18

PERFORMANCE

[Chart: speedup (safe solutions only) for OpenMP and Cilk Plus solutions; axis ticks at 0.5, 1, 1.5, and 2]

SLIDE 19

ESTIMATED TIME ON TASK

                                 OpenMP average task time   Cilk Plus average task time   Total task time
OpenMP first, Cilk Plus second   11 (first task)            4 (second task)               15
OpenMP second, Cilk Plus first   3 (second task)            9 (first task)                12

Estimates (hours) based on Fluorite logs

SLIDE 20

HYPOTHESES

  • Parallel programming languages cannot be used safely (yet) by naïve programmers 😲
  • Distinct reducer, value types may reduce error rates
  • Reducer syntax in OpenMP impedes programmers

SLIDE 21

LIMITATIONS

  • Small sample size
  • Results affected by instruction and provided materials
  • One small task with learning effects
  • Short time frame
  • Small code size
  • Novices in these extensions specifically; students in general

SLIDE 22

CONCLUSIONS

  • Fork/join parallel programming cannot be used safely (yet) by naïve programmers 😲
  • Maybe we can design better languages, tools, or training
  • OpenMP seems to be harder for novices to use than Cilk Plus
  • Differences seem to affect productivity and correctness; study is worthwhile

SLIDE 23

THANKS!

Thanks to the DoD, NSA and NSF for support, and to our anonymous reviewers
Contact: Michael Coblenz (mcoblenz@cs.cmu.edu)
