August Ernstsson Lu Li Christoph Kessler
Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems
EXCESS workshop
{firstname.lastname}@liu.se
Linköping University, Sweden August 26, 2016
(From HLPP 2016, updated)
Overview
August Ernstsson – 2016-08-26 SkePU 2: Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems
Background Results
Programming parallel systems is hard!
Skeleton Programming :: Motivation
Skeleton Programming :: Introduction
Skeleton programming (algorithmic skeletons)
Skeleton Programming :: Skeletons
Skeletons: parametrizable higher-order constructs, e.g. Map and MapReduce
Skeleton Programming :: User Functions
User functions: user-defined operators
Examples: Mult, Add, Abs, Pow, …
Skeleton Programming :: Example
Skeleton parametrization example: dot product operation
MapReduce parametrized with the user functions Mult (map step) and Add (reduce step)
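The dot product parametrization can be sketched in plain, sequential C++. This is only an illustration of the idea, not SkePU code: `map_reduce` and `dot` are hypothetical names introduced here, and a real skeleton would dispatch to a parallel backend instead of looping.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// User functions: ordinary functions that parametrize the skeleton.
float mult(float a, float b) { return a * b; }
float add(float a, float b) { return a + b; }

// Hypothetical sequential stand-in for a MapReduce skeleton (not the SkePU API):
// applies map_f to each element pair and folds the results with red_f.
template <typename T>
T map_reduce(T (*map_f)(T, T), T (*red_f)(T, T),
             const std::vector<T>& a, const std::vector<T>& b)
{
    assert(a.size() == b.size() && !a.empty());
    T acc = map_f(a[0], b[0]);
    for (std::size_t i = 1; i < a.size(); ++i)
        acc = red_f(acc, map_f(a[i], b[i]));
    return acc;
}

// Dot product = MapReduce parametrized with mult and add.
float dot(const std::vector<float>& a, const std::vector<float>& b)
{
    return map_reduce(mult, add, a, b);
}
```

Swapping in other user functions (for example a squared difference and `add`) would yield a different computation from the same skeleton, which is the point of the parametrization.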
SkePU :: Features
SkePU :: Architecture
C++ interface (macros, skeletons, smart containers, …)
Backends: C++, OpenMP, OpenCL, CUDA, …
Targets: CPU, multi-core CPU, accelerator, GPU, …
Additional research ongoing
SkePU :: Syntax
BINARY_FUNC(add, int, a, b,
    return a + b;
)

Map<add> vec_sum(new add);
vec_sum(v1, v2, result);

[Diagram: Map skeleton with one user-function instance per element]
SkePU :: Syntax
BINARY_FUNC_CONSTANT(add, int, int, a, b, m,
    return (a + b) % m;
)

Map<add> vec_sum(new add);
vec_sum.setConstant(5);
vec_sum(v1, v2, result);

[Diagram: Map skeleton with one user-function instance per element]
SkePU :: Limitations
SkePU 2 :: Introduction
SkePU 2 :: Syntax
int add(int a, int b)
{
    return a + b;
}

auto vec_sum = Map<2>(add);
vec_sum(result, v1, v2);

[Diagram: Map skeleton with one user-function instance per element]
SkePU 2 :: Syntax
int add(int a, int b, int m)
{
    return (a + b) % m;
}

auto vec_sum = Map<2>(add);
vec_sum(result, v1, v2, 5);

[Diagram: Map skeleton with one user-function instance per element]
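One way to picture how an extra argument such as `m` can travel through the call is a variadic template that separates the two element-wise parameters from any trailing uniform parameters. The sketch below is a sequential illustration under that assumption; `Map2` and `make_map2` are hypothetical names, and this is not how SkePU 2 is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// User function: two element-wise parameters, one uniform parameter.
int add(int a, int b, int m) { return (a + b) % m; }

// Hypothetical stand-in for a Map<2> instance (not SkePU's implementation).
// The first two parameters of the user function are fed element-wise;
// the remaining ones (Uniform...) are passed unchanged to every call.
template <typename Ret, typename A, typename B, typename... Uniform>
struct Map2 {
    Ret (*func)(A, B, Uniform...);

    void operator()(std::vector<Ret>& result,
                    const std::vector<A>& v1, const std::vector<B>& v2,
                    Uniform... uniform) const
    {
        assert(v1.size() == v2.size() && result.size() == v1.size());
        for (std::size_t i = 0; i < v1.size(); ++i)
            result[i] = func(v1[i], v2[i], uniform...);
    }
};

// Factory that deduces all types from the user function's signature.
template <typename Ret, typename A, typename B, typename... Uniform>
Map2<Ret, A, B, Uniform...> make_map2(Ret (*func)(A, B, Uniform...))
{
    return Map2<Ret, A, B, Uniform...>{func};
}
```

With this sketch, `auto vec_sum = make_map2(add); vec_sum(result, v1, v2, 5);` mirrors the call shape on the slide, and passing the wrong number or type of trailing arguments is rejected by the compiler.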
SkePU 2 :: Architecture
Sequential path: SkePU program source → C++11 compiler (with sequential runtime library) → sequential executable
Parallel path: SkePU program source → source-to-source translator → C++11 compiler (with parallel backend runtimes) → parallel executable
function
SkePU 2 :: Flexibility
SkePU 2 :: Advanced Example
template<typename T>
T abs(T input)
{
    return input < 0 ? -input : input;
}

template<typename T>
T mvmult(Index1D row, const Matrix<T> m, const Vector<T> v)
{
    T res = 0;
    for (size_t i = 0; i < v.size; ++i)
        res += m[row.i * m.cols + i] * v[i];
    return abs(res);
}

Annotations: Templates; Chained user functions; Read-only, no copy back
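For reference, the computation performed by this user function can be written as a sequential loop in standard C++. The sketch assumes a row-major matrix stored in a flat `std::vector`; `abs_val`, `mvmult_row`, and `mvmult_all` are hypothetical names chosen to avoid clashing with the slide's identifiers, and the outer loop stands in for a Map over row indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
T abs_val(T input) { return input < 0 ? -input : input; }

// One output element: absolute value of (row `row` of m) dot v.
// m is assumed to be row-major with `cols` columns.
template <typename T>
T mvmult_row(std::size_t row, const std::vector<T>& m, std::size_t cols,
             const std::vector<T>& v)
{
    assert(v.size() == cols);
    T res = 0;
    for (std::size_t i = 0; i < cols; ++i)
        res += m[row * cols + i] * v[i];
    return abs_val(res);
}

// Sequential stand-in for mapping mvmult_row over all row indices.
template <typename T>
std::vector<T> mvmult_all(const std::vector<T>& m, std::size_t rows,
                          std::size_t cols, const std::vector<T>& v)
{
    assert(m.size() == rows * cols);
    std::vector<T> out(rows);
    for (std::size_t r = 0; r < rows; ++r)
        out[r] = mvmult_row(r, m, cols, v);
    return out;
}
```

Each output element depends only on its own row index plus the read-only matrix and vector, which is what makes the row loop a candidate for a Map skeleton.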
SkePU 2 :: Type Safety
Type safety test case: Reduce skeleton with a unary user function

SkePU 2, at compile time:
error: no matching function for call to 'Reduce'
    auto globalSum = Reduce(plus_f);
                     ^~~~~~~~~~~~~~
note: candidate template ignored: failed template argument deduction
    Reduce(T(*red)(T, T))

SkePU 1, at run time:
[SKEPU_ERROR] Wrong operator type! Reduce operation require binary user function.
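The compile-time rejection follows from the `Reduce(T(*red)(T, T))` signature shown in the compiler note: a unary function pointer cannot match it, so template argument deduction fails. A minimal sketch of that mechanism, where the `Reducer` type and the two `plus_*` functions are hypothetical and only the factory signature is taken from the slide:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduction object; only the factory signature below
// is taken from the compiler note on the slide.
template <typename T>
struct Reducer {
    T (*red)(T, T);

    T operator()(const std::vector<T>& v) const
    {
        assert(!v.empty());
        T acc = v[0];
        for (std::size_t i = 1; i < v.size(); ++i)
            acc = red(acc, v[i]);
        return acc;
    }
};

// Accepts only binary functions of type T(T, T).
template <typename T>
Reducer<T> Reduce(T (*red)(T, T))
{
    return Reducer<T>{red};
}

int plus_binary(int a, int b) { return a + b; }
int plus_unary(int a) { return a + 1; }

// auto bad = Reduce(plus_unary);  // rejected at compile time:
//                                 // no matching function for call to 'Reduce'
```

Uncommenting the last line should produce a diagnostic of the same kind as the one on the slide, while `Reduce(plus_binary)` compiles and can be applied to a vector.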
SkePU 2 :: Experimental Features
SkePU 2 :: Lambda Syntax
int add(int a, int b)
{
    return a + b;
}

auto vec_sum = Map<2>(add);
// ...
vec_sum(result, v1, v2);
SkePU 2 :: Lambda Syntax
auto vec_sum = Map<2>([](int a, int b)
{
    return a + b;
});
// ...
vec_sum(result, v1, v2);
with a slightly different syntax
generate better error messages
Readability Survey
Readability :: Simple Example
SkePU 1:
BINARY_FUNC(sum, float, a, b,
    return a + b;
)

Vector<float> vector_sum(Vector<float> &v1, Vector<float> &v2)
{
    Map<sum> vsum(new sum);
    Vector<float> result(v1.size());
    vsum(v1, v2, result);
    return result;
}

SkePU 2:
[[skepu::userfunction]]
float sum(float a, float b)
{
    return a + b;
}

Vector<float> vector_sum(Vector<float> &v1, Vector<float> &v2)
{
    auto vsum [[skepu::instance]] = Map<2>(sum);
    Vector<float> result(v1.size());
    vsum(result, v1, v2);
    return result;
}
Readability :: Complex Example
SkePU 2:
[[skepu::userfunction]]
float kth_term(skepu2::Index1D index, float x)
{
    int k = index.i + 1;
    float temp_x = pow(x, k);
    int sign = (k % 2 == 0) ? -1 : 1;
    return sign * temp_x / k;
}

[[skepu::userfunction]]
float plus(float a, float b)
{
    return a + b;
}

float taylor_approx(float x, size_t N)
{
    auto taylor [[skepu::instance]] = skepu2::MapReduce<0>(kth_term, plus);
    taylor.setDefaultSize(N);
    return taylor(x);
}

SkePU 1:
UNARY_FUNC_CONSTANT(kth_term, float, float, k, x,
    float temp_x = pow(x, k);
    int sign = ((int)k % 2 == 0) ? -1 : 1;
    return sign * temp_x / k;
)

BINARY_FUNC(plus, float, a, b,
    return a + b;
)

GENERATE_FUNC(init, float, float, index, seed,
    return index + 1;
)

float taylor_approx(float x, size_t N)
{
    skepu::MapReduce<kth_term, plus> taylor(new kth_term, new plus);
    skepu::Generate<init> vec_init(new init);
    taylor.setConstant(x);
    skepu::Vector<float> terms(N);
    vec_init(N, terms);
    return taylor(terms);
}
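Both versions sum the terms (-1)^(k+1) x^k / k for k = 1..N, which is the Taylor series of ln(1 + x) (convergent for -1 < x ≤ 1). A sequential reference in standard C++ can serve as a sanity check when comparing the two. It is a sketch only: it replaces `skepu2::Index1D` with a plain `std::size_t` index and the MapReduce skeleton with a loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Same term as the slide's kth_term, with a plain index
// standing in for skepu2::Index1D (assumption for this sketch).
float kth_term(std::size_t index, float x)
{
    int k = static_cast<int>(index) + 1;
    float temp_x = std::pow(x, static_cast<float>(k));
    int sign = (k % 2 == 0) ? -1 : 1;
    return sign * temp_x / k;
}

// Sequential stand-in for MapReduce<0>(kth_term, plus) over N indices.
float taylor_approx(float x, std::size_t N)
{
    assert(N > 0);
    float sum = 0.0f;
    for (std::size_t i = 0; i < N; ++i)
        sum += kth_term(i, x);
    return sum;
}
```

For small positive x and a few dozen terms, the result should agree closely with `std::log1p(x)`, which gives a simple correctness check for either SkePU version.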
Readability :: Results
[Figure: readability preference on the scale SkePU 1 ∙ Neutral ∙ SkePU 2; series: SkePU 1 first, SkePU 2 first; panels: First example (simple), Second example (complex)]
Performance :: Compile Time
[Figure: compile time [s] for SkePU 1 and SkePU 2, with per-version averages; benchmarks: Mandelbrot, MVmult, CMA, PPMCC, PSNR, Taylor, Coulombic, Nbody, Median]
Performance :: Backends
Performance :: Versions
[Figure: performance comparison of SkePU 1.2 and SkePU 2]
Conclusions
+ A native-looking and flexible interface
+ Better type safety
+ Possibility for more efficient algorithms
− Needs more performance evaluation
− Some SkePU 1 features are not available
− C++11 attributes may be unfamiliar to users
Conclusions :: Availability
SkePU 2 will be distributed as open source soon. Check the website at: http://www.ida.liu.se/labs/pelab/skepu/