Automatic Code Generation for Library Method Inclusion in Domain Specific Languages


  1. Automatic Code Generation for Library Method Inclusion in Domain Specific Languages
     Communicating Process Architectures 2017, University of Malta
     Mads Ohm Larsen, Niels Bohr Institute, University of Copenhagen, Denmark
     Faculty of Science
     21 August 2017 (Slide 1/21)

  2. Why use libraries? (Introduction)
     Somebody else has already written a faster method than you could ever write yourself. One example is fast matrix-matrix multiplication, provided by the BLAS library call *gemm. If possible, we always want to use this faster method.

  3. cBLAS, Accelerate, clBLAS, LAPACK (Why can’t we?)
     So why not just use one of these specialized libraries all the time? Because many different libraries exist, for many different purposes, architectures, and operating systems.

  4. cBLAS, Accelerate, clBLAS, LAPACK (Use the best?)
     We have cBLAS, Accelerate, clBLAS, LAPACK, and many more. “Best” is hard to define: no single one of these is “best”; each is “best” in its own way.

  5. Coding BLAS (Code)
     cBLAS code:

         #include <cblas.h>

         ... // Set up m, n, k, A_data, B_data, and C_data

         // Calculates
         //   C := alpha * op(A) * op(B) + beta * C
         // where op(X) is either X or X^T
         cblas_sgemm(
             CblasRowMajor,  // Memory layout: row-major storage
             CblasNoTrans,   // Transpose A?
             CblasNoTrans,   // Transpose B?
             m,              // Number of rows of op(A) and of C
             n,              // Number of columns of op(B) and of C
             k,              // Number of columns of op(A) and rows of op(B)
             1.0,            // Alpha argument
             A_data,         // Array of size m*k
             k,              // Leading dimension (row stride) of A
             B_data,         // Array of size k*n
             n,              // Leading dimension (row stride) of B
             0.0,            // Beta argument
             C_data,         // Array of size m*n
             n               // Leading dimension (row stride) of C
         );
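To make the argument list of the call above concrete, here is a minimal pure-Python sketch of what a row-major, non-transposed GEMM computes. The function `gemm` and its parameter names are illustrative assumptions that mirror the slide's arguments; this is a reference for the semantics, not how a BLAS library implements it.

```python
def gemm(m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """Reference row-major GEMM without transposes: C := alpha * A * B + beta * C.

    a, b and c are flat lists; lda, ldb and ldc are the row strides
    (leading dimensions) of A, B and C.
    """
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i * lda + p] * b[p * ldb + j]
            c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j]
    return c


# A is 2x3 and B is 3x2, so m = 2, n = 2, k = 3.
A_data = [1.0, 2.0, 3.0,
          4.0, 5.0, 6.0]
B_data = [7.0, 8.0,
          9.0, 10.0,
          11.0, 12.0]
C_data = [0.0] * 4

gemm(2, 2, 3, 1.0, A_data, 3, B_data, 2, 0.0, C_data, 2)
print(C_data)  # [58.0, 64.0, 139.0, 154.0]
```

With row-major storage and no transposes, the stride of each matrix equals its number of columns, which is why the slide passes k, n, and n.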

  6. Coding BLAS (Code)
     Python code:

         import numpy as np

         ...  # Set up a and b
         c = np.matmul(a, b)

  7. Python/NumPy (NumPy)
     NumPy already uses BLAS for calls like matmul. Problem solved? No. Python/NumPy is “slow” (single threaded) and cannot utilize GPGPUs or other accelerators out of the box.

  8. Python/NumPy vs. Bohrium (Bohrium)
     Bohrium can use GPGPUs, but does not support BLAS. Let us make it support library methods such as those in BLAS.

  9. Code generation for Bohrium (Compile)
     When you compile or install Bohrium, CMake can look for libraries present on the system to link with. NumPy does the same when you compile or install it. If we find BLAS, we want to link with it. However, if we also find clBLAS, we want to link with that as well.
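The library discovery described on this slide happens in CMake at build time. As a rough run-time analogy only, the following Python sketch asks the system which candidate libraries it can locate, using the standard `ctypes.util.find_library`. The candidate names are assumptions chosen for illustration, and this is not Bohrium's actual mechanism.

```python
from ctypes.util import find_library

# Candidate library names are assumptions chosen for illustration.
CANDIDATES = ["blas", "openblas", "clBLAS"]


def probe_libraries(names):
    """Map each library name to what the system reports for it, or None."""
    return {name: find_library(name) for name in names}


found = probe_libraries(CANDIDATES)
available = [name for name, path in found.items() if path is not None]
print("available:", available)
```

The result depends entirely on the machine it runs on; a system with none of these libraries installed simply yields an empty list.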

  10. Code generation for Bohrium (Choose)
      With automatic code inclusion, we can choose which library to use at both compile time and run time! We want Bohrium to link to both BLAS libraries and choose the correct one later.
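Choosing a library "later" can be pictured as a small dispatch table consulted at run time. The sketch below is a hypothetical illustration: the registry `BACKENDS`, the availability flags, and the preference order are all assumptions, and every backend reuses the same pure-Python fallback so that the example runs anywhere.

```python
def matmul_reference(a, b):
    """Pure-Python fallback: multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Hypothetical registry: backend name -> (is_available, implementation).
# Real entries would wrap BLAS or clBLAS calls; here they share one fallback.
BACKENDS = {
    "clblas": (False, matmul_reference),  # pretend no OpenCL library was found
    "blas": (True, matmul_reference),
    "fallback": (True, matmul_reference),
}


def choose_backend(preferred):
    """Return the first available backend name from an ordered preference list."""
    for name in preferred:
        is_available, _ = BACKENDS.get(name, (False, None))
        if is_available:
            return name
    raise LookupError("no usable backend")


def matmul(a, b, preferred=("clblas", "blas", "fallback")):
    """Multiply a and b with the first available backend; return (name, result)."""
    name = choose_backend(preferred)
    return name, BACKENDS[name][1](a, b)


backend, product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(backend, product)  # blas [[19, 22], [43, 50]]
```

Keeping the preference order as a parameter lets the same binary favor a GPU library on one machine and a CPU library on another.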

  11. Code generation for Bohrium (Implementing)
      We can implement all the BLAS calls ourselves. Tedious. Let’s generate them instead!

  12. JSON, template, generate! (JSON)
      All of the BLAS methods follow a similar pattern. Let’s use that to our advantage.

  13. JSON, template, generate! (JSON)

          {
            "methods": [
              {
                "name": "gemm",
                "types": [ "s", "d", "c", "z" ],
                "options": [
                  "layout", "notransA", "notransB",
                  "m", "n", "k",
                  "A", "B", "C"
                ]
              },
              ...
            ]
          }
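One way to read the JSON description above is as input to a simple template expansion: each method name is combined with each type prefix to produce one wrapper. The sketch below assumes a made-up template and a hypothetical `bh_` prefix, and includes only the single `gemm` entry shown on the slide; the talk's actual templates are not reproduced here.

```python
import json
from string import Template

# Only the entry visible on the slide; the elided methods are left out.
SPEC = json.loads("""
{
  "methods": [
    {
      "name": "gemm",
      "types": ["s", "d", "c", "z"],
      "options": ["layout", "notransA", "notransB",
                  "m", "n", "k", "A", "B", "C"]
    }
  ]
}
""")

# Hypothetical template: emits a C-like wrapper stub per (type, method) pair.
WRAPPER = Template(
    "void bh_${t}${name}(${params}) {\n"
    "    /* forward to cblas_${t}${name} here */\n"
    "}\n"
)


def generate(spec):
    """Expand every method for every type prefix into one wrapper string."""
    wrappers = []
    for method in spec["methods"]:
        params = ", ".join(method["options"])
        for t in method["types"]:
            wrappers.append(
                WRAPPER.substitute(t=t, name=method["name"], params=params)
            )
    return wrappers


wrappers = generate(SPEC)
print(wrappers[0])
```

Four type prefixes times one method gives four stubs (sgemm, dgemm, cgemm, zgemm), which is the repetition that makes writing them by hand tedious.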
