EXCESS workshop, August 26, 2016 (From HLPP 2016, updated). Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems. August Ernstsson, Lu Li, Christoph Kessler, {firstname.lastname}@liu.se, Linköping University, Sweden


SLIDE 1

Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems

August Ernstsson, Lu Li, Christoph Kessler
{firstname.lastname}@liu.se
Linköping University, Sweden

EXCESS workshop, August 26, 2016
(From HLPP 2016, updated)

SLIDE 2

Overview

Background
  • Skeleton Programming
  • SkePU 1

Results
  • SkePU 2
  • Readability Survey
  • Performance Evaluation
  • Conclusions

August Ernstsson – 2016-08-26 SkePU 2: Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems

SLIDE 3

Skeleton Programming

SLIDE 4

Skeleton Programming :: Motivation

Programming parallel systems is hard!

  • Resource utilization
  • Synchronization
  • Communication
  • Memory consistency
  • Different hardware architectures
  • Heterogeneity

SLIDE 5

Skeleton Programming :: Introduction

Skeleton programming (algorithmic skeletons)

  • A high-level parallel programming concept
  • Inspired by functional programming
  • Generic computational patterns
  • Abstracts architecture-specific issues

SLIDE 6

Skeleton Programming :: Skeletons

Skeletons: Parametrizable higher-order constructs

  • Map
  • Reduce
  • MapReduce
  • Scan
  • and others

[Figure: diagrams of the Map and MapReduce skeletons]

SLIDE 7

Skeleton Programming :: User Functions

User functions: User-defined operators

[Figure: example user functions Mult, Add, Abs, Pow]

SLIDE 8

Skeleton Programming :: Example

Skeleton parametrization example: Dot product operation

[Figure: MapReduce skeleton parametrized with the user functions Mult and Add]
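
To make the parametrization concrete, the dot product can be sketched in plain C++ without SkePU. The helper `map_reduce` below is a hypothetical, sequential stand-in written for this illustration and is not part of the SkePU API; only the user function names Mult and Add come from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// User functions, named after the operators shown on the slide.
int mult(int a, int b) { return a * b; }
int add(int a, int b) { return a + b; }

// Hypothetical helper (not SkePU API): applies `map_fn` to each pair of
// elements and folds the mapped values with `reduce_fn`, starting at `init`.
template <typename T, typename MapFn, typename ReduceFn>
T map_reduce(const std::vector<T>& a, const std::vector<T>& b,
             MapFn map_fn, ReduceFn reduce_fn, T init)
{
    assert(a.size() == b.size());
    T acc = init;
    for (std::size_t i = 0; i < a.size(); ++i)
        acc = reduce_fn(acc, map_fn(a[i], b[i]));
    return acc;
}

// Dot product: MapReduce parametrized with mult (map) and add (reduce).
int dot_product(const std::vector<int>& a, const std::vector<int>& b)
{
    return map_reduce(a, b, mult, add, 0);
}
```

A skeleton library would run the map and reduce steps in parallel on the selected backend; the loop here only shows which user function plays which role.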

SLIDE 9

SkePU

SLIDE 10

SkePU :: Features

  ✓ Efficient parallel algorithms
  ✓ Memory management and data movement
  ✓ Automatic backend selection and tuning

SLIDE 11

SkePU :: Architecture

[Figure: layered architecture, top to bottom]

  • C++ interface (macros, skeletons, smart containers, …)
  • C++ · OpenMP · OpenCL · CUDA · …
  • CPU · Multi-core CPU · Accelerator · GPU · …

Additional research ongoing

SLIDE 12

SkePU :: Syntax

    BINARY_FUNC(add, int, a, b,
        return a + b;
    )

    Map<add> vec_sum(new add);

    vec_sum(v1, v2, result);

[Figure: Map skeleton applying the user function (labelled Mult in the diagram) to each element]

SLIDE 13

SkePU :: Syntax

    BINARY_FUNC_CONSTANT(add, int, int, a, b, m,
        return (a + b) % m;
    )

    Map<add> vec_sum(new add);

    vec_sum.setConstant(5);
    vec_sum(v1, v2, result);

[Figure: Map skeleton applying the user function (labelled Mult in the diagram) to each element]

SLIDE 14

SkePU :: Limitations

  × Type-unsafe user functions
  × Constrained skeleton signatures
  × Non-intuitive macro syntax

SLIDE 15

[Section title slide: SkePU 2]

SLIDE 16

SkePU 2 :: Introduction

  • Builds on the SkePU 1 runtime and algorithms
  • New, more native-looking interface (API)
  • Extra source-to-source translation step
  • Based on Clang compiler front-end libraries

SLIDE 17

SkePU 2 :: Syntax

    int add(int a, int b)
    {
        return a + b;
    }

    auto vec_sum = Map<2>(add);

    vec_sum(result, v1, v2);

[Figure: Map skeleton applying the user function (labelled Mult in the diagram) to each element]

SLIDE 18

SkePU 2 :: Syntax

    int add(int a, int b, int m)
    {
        return (a + b) % m;
    }

    auto vec_sum = Map<2>(add);

    vec_sum(result, v1, v2, 5);

[Figure: Map skeleton applying the user function (labelled Mult in the diagram) to each element]
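
The call `vec_sum(result, v1, v2, 5)` passes the output container, two element-wise containers, and one uniform value. A minimal sketch of that calling convention in standard C++ follows, assuming a hypothetical `Map2` wrapper and `make_map2` factory. Neither name is SkePU 2 API, and the loop is sequential; the sketch only shows how a variadic call operator can forward trailing uniform arguments to an ordinary C++ user function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// User function from the slide: two element-wise operands, one uniform operand.
int add(int a, int b, int m) { return (a + b) % m; }

// Hypothetical stand-in for a Map instance of arity 2 (not the SkePU 2 API).
template <typename Fn>
struct Map2 {
    Fn fn;

    // result, v1 and v2 are traversed element-wise; any trailing arguments
    // are passed unchanged ("uniform") to every user function call.
    template <typename T, typename... Uniform>
    void operator()(std::vector<T>& result, const std::vector<T>& v1,
                    const std::vector<T>& v2, Uniform... uniform) const
    {
        assert(v1.size() == v2.size() && result.size() == v1.size());
        for (std::size_t i = 0; i < v1.size(); ++i)
            result[i] = fn(v1[i], v2[i], uniform...);
    }
};

// Hypothetical factory so that the user function type is deduced.
template <typename Fn>
Map2<Fn> make_map2(Fn fn) { return Map2<Fn>{fn}; }
```

With this sketch, `auto vec_sum = make_map2(add); vec_sum(result, v1, v2, 5);` mirrors the slide. A mismatch between the number of uniform arguments and the user function signature is reported by the compiler at the call to `fn`.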

SLIDE 19

SkePU 2 :: Architecture

[Figure: SkePU 2 build flow]

  • Sequential: SkePU program source → C++11 compiler (with sequential runtime library) → Sequential executable
  • Parallel: SkePU program source → source-to-source translator → C++11 compiler (with parallel backend runtime) → Parallel executable

SLIDE 20

SkePU 2 :: Flexibility

  • Variable arity on Map and MapReduce skeletons
  • Index argument (of current Map'd container element)
  • Uniform arguments
  • Smart container arguments accessible freely inside user function
  • Read-only / write-only / read-write copy modes
  • User function templates

SLIDE 21

SkePU 2 :: Advanced Example

    template<typename T>
    T abs(T input)
    {
        return input < 0 ? -input : input;
    }

    template<typename T>
    T mvmult(Index1D row, const Matrix<T> m, const Vector<T> v)
    {
        T res = 0;
        for (size_t i = 0; i < v.size; ++i)
            res += m[row.i * m.cols + i] * v[i];
        return abs(res);
    }

Annotations on the slide:

  • Templates
  • Read-only, no copyback (the const container arguments)
  • Chained user functions (the call to abs)
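
The same row-wise computation can be sketched in standard C++ to show what each user function invocation does. This is an illustration under assumptions: `std::vector` in row-major order replaces the SkePU smart containers, the names `abs_value`, `mvmult_row` and the explicit `rows`/`cols` parameters are hypothetical, and the outer loop is a sequential stand-in for the skeleton.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Chained helper, renamed to avoid clashing with the C library's abs.
template <typename T>
T abs_value(T input) { return input < 0 ? -input : input; }

// Per-row user function: dot product of row `row` of the row-major matrix
// `m` (with `cols` columns) and the vector `v`, followed by abs_value.
template <typename T>
T mvmult_row(std::size_t row, const std::vector<T>& m, std::size_t cols,
             const std::vector<T>& v)
{
    T res = 0;
    for (std::size_t i = 0; i < cols; ++i)
        res += m[row * cols + i] * v[i];
    return abs_value(res);
}

// Sequential stand-in for the skeleton: one user function call per output row.
template <typename T>
std::vector<T> mvmult(const std::vector<T>& m, std::size_t rows,
                      std::size_t cols, const std::vector<T>& v)
{
    assert(m.size() == rows * cols && v.size() == cols);
    std::vector<T> result(rows);
    for (std::size_t row = 0; row < rows; ++row)
        result[row] = mvmult_row(row, m, cols, v);
    return result;
}
```

Each output element depends only on its own row index plus read-only inputs, which is what allows the calls to run independently.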

SLIDE 22

SkePU 2 :: Type Safety

Type safety test-case: Reduce skeleton with unary user function

SkePU 2, at compile time

    error: no matching function for call to 'Reduce'
        auto globalSum = Reduce(plus_f);
                         ^~~~~~~~~~~~~~
    note: candidate template ignored: failed template argument deduction
        Reduce(T(*red)(T, T))

SkePU 1, at run time

    [SKEPU_ERROR] Wrong operator type! Reduce operation require binary user function.
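
The compile-time diagnostic follows from ordinary template argument deduction: a parameter of type `T(*)(T, T)` cannot be matched against a pointer to a unary function. The sketch below reproduces that mechanism in standard C++11. The `Reducer` struct, the `accepted_by_reduce` trait and the functions `add_f` and `negate_f` are hypothetical names for this illustration; only the parameter shape `T(*red)(T, T)` is taken from the note in the error message.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a Reduce instance: holds a binary user function.
template <typename T>
struct Reducer {
    T (*red)(T, T);
    T operator()(T a, T b) const
    {
        assert(red != nullptr);
        return red(a, b);
    }
};

// Factory with the parameter shape quoted in the compiler note. Deduction
// succeeds only for functions taking two T operands and returning T.
template <typename T>
Reducer<T> Reduce(T (*red)(T, T)) { return Reducer<T>{red}; }

int add_f(int a, int b) { return a + b; }   // binary: accepted
int negate_f(int a) { return -a; }          // unary: rejected

// Detection idiom: true when `Reduce(f)` is a well-formed expression.
template <typename F, typename = void>
struct accepted_by_reduce : std::false_type {};

template <typename F>
struct accepted_by_reduce<F, decltype(void(Reduce(std::declval<F>())))>
    : std::true_type {};

static_assert(accepted_by_reduce<decltype(&add_f)>::value,
              "binary user function is accepted");
static_assert(!accepted_by_reduce<decltype(&negate_f)>::value,
              "unary user function fails template argument deduction");
```

Writing `Reduce(negate_f)` directly would stop compilation with a "no matching function" error of the kind shown on the slide; the trait is used here only so that the rejection can be checked without breaking the build.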

SLIDE 23

SkePU 2 :: Experimental Features

  • User function specialization for backends
  • Extends SkePU for multi-variant components
  • "Call" skeleton
  • Custom types
  • Chained user functions
  • In-line lambda syntax for user functions

SLIDE 24

SkePU 2 :: Lambda Syntax

    int add(int a, int b)
    {
        return a + b;
    }

    auto vec_sum = Map<2>(add);

    // ...

    vec_sum(result, v1, v2);

SLIDE 25

SkePU 2 :: Lambda Syntax

    auto vec_sum = Map<2>([](int a, int b)
    {
        return a + b;
    });

    // ...

    vec_sum(result, v1, v2);

SLIDE 26

Readability Survey

SLIDE 27

Readability Survey

  • Survey was made on a development version of SkePU 2 with a slightly different syntax
  • Main difference: Used C++11 attributes
      • [[skepu::userfunction]] on user functions
      • [[skepu::instance]] on skeleton instances
  • Reason: Guide the source-to-source translator and generate better error messages

SLIDE 28

Readability :: Simple Example

SkePU 1:

    BINARY_FUNC(sum, int, a, b,
        return a + b;
    )

    Vector<float> vector_sum(Vector<float> &v1, Vector<float> &v2)
    {
        Map<sum> vsum(new sum);
        Vector<float> result(v1.size());
        vsum(v1, v2, result);
        return result;
    }

SkePU 2:

    [[skepu::userfunction]]
    float sum(float a, float b)
    {
        return a + b;
    }

    Vector<float> vector_sum(Vector<float> &v1, Vector<float> &v2)
    {
        auto vsum [[skepu::instance]] = Map<2>(sum);
        Vector<float> result(v1.size());
        vsum(result, v1, v2);
        return result;
    }

SLIDE 29

Readability :: Complex Example

SkePU 2:

    [[skepu::userfunction]]
    float kth_term(skepu2::Index1D index, float x)
    {
        int k = index.i + 1;
        float temp_x = pow(x, k);
        int sign = (k % 2 == 0) ? -1 : 1;
        return sign * temp_x / k;
    }

    [[skepu::userfunction]]
    float plus(float a, float b)
    {
        return a + b;
    }

    float taylor_approx(float x, size_t N)
    {
        auto taylor [[skepu::instance]] = skepu2::MapReduce<0>(kth_term, plus);
        taylor.setDefaultSize(N);
        return taylor(x);
    }

SkePU 1:

    UNARY_FUNC_CONSTANT(kth_term, float, float, k, x,
        float temp_x = pow(x, k);
        int sign = ((int)k % 2 == 0) ? -1 : 1;
        return sign * temp_x / k;
    )

    BINARY_FUNC(plus, float, a, b,
        return a + b;
    )

    GENERATE_FUNC(init, float, float, index, seed,
        return index + 1;
    )

    float taylor_approx(float x, size_t N)
    {
        skepu::MapReduce<kth_term, plus> taylor(new kth_term, new plus);
        skepu::Generate<init> vec_init(new init);
        taylor.setConstant(x);
        skepu::Vector<float> terms(N);
        vec_init(N, terms);
        return taylor(terms);
    }
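
Both versions sum the terms (-1)^(k+1) x^k / k for k = 1..N, which is the truncated Maclaurin series of ln(1 + x). A sequential sketch in standard C++ is given below so the arithmetic can be checked independently of either SkePU version. It assumes a plain `std::size_t` index in place of `skepu2::Index1D`, and the reduction operator is renamed to the hypothetical `sum_terms`; the loop stands in for the MapReduce skeleton.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// k-th term of x - x^2/2 + x^3/3 - ..., with k = index + 1.
float kth_term(std::size_t index, float x)
{
    int k = static_cast<int>(index) + 1;
    float temp_x = std::pow(x, static_cast<float>(k));
    int sign = (k % 2 == 0) ? -1 : 1;
    return sign * temp_x / k;
}

// Reduction operator (called plus on the slide).
float sum_terms(float a, float b) { return a + b; }

// Sequential stand-in for the MapReduce instance: map kth_term over the
// indices 0..N-1 and reduce the mapped values with sum_terms.
float taylor_approx(float x, std::size_t N)
{
    assert(N > 0);
    float sum = 0.0f;
    for (std::size_t i = 0; i < N; ++i)
        sum = sum_terms(sum, kth_term(i, x));
    return sum;
}
```

For inputs with |x| < 1 the result should approach `std::log(1 + x)` as N grows, which gives a simple correctness check for any backend.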

SLIDE 30

Readability :: Results

[Figure: survey response counts for the first example (simple) and the second example (complex), each on the scale SkePU 1 ∙ Neutral ∙ SkePU 2, grouped by whether respondents saw SkePU 2 first or SkePU 1 first]

SLIDE 31

Performance Evaluation

SLIDE 32

Performance :: Compile Time

[Figure: compile time in seconds for SkePU 1 and SkePU 2, with the average for each version, across the benchmarks Mandelbrot, MVmult, CMA, PPMCC, PSNR, Taylor, Coulombic, Nbody, Median]

SLIDE 33

Performance :: Backends

SLIDE 34

Performance :: Versions

[Figure: comparison of SkePU 1.2 and SkePU 2]

SLIDE 35

Conclusions

SLIDE 36

Conclusions

  • SkePU 2 advancements
      + A native-looking and flexible interface
      + Better type safety
      + Possibility for more efficient algorithms
  • Current limitations
      − Needs more performance evaluation
      − Some SkePU 1 features are not available
      − C++11 attributes may be unfamiliar to users

SLIDE 37

Conclusions :: Availability

SkePU 2 will be distributed as open source soon. Check the website at: http://www.ida.liu.se/labs/pelab/skepu/