generic programming techniques
play

Generic Programming Techniques by the example of tensor contraction - PowerPoint PPT Presentation

Generic Programming Techniques by the example of tensor contraction Patrick Seewald International Fortran Conference, University of Zurich, July 3, 2020 Department of Chemistry, University of Zurich 1 / 16 Background


  1. Generic Programming Techniques by the example of tensor contraction Patrick Seewald International Fortran Conference, University of Zurich, July 3, 2020 Department of Chemistry, University of Zurich 1 / 16

  2. Background https://github.com/aradi/fypp generalization of DBCSR sparse matrix library to tensors • My work: algorithms based on sparse tensor contractions, https://github.com/Fortran-FOSS-Programmers/ford • FORD : Documentation generator https://github.com/pseewald/fprettify • fprettify : Auto formatter (whitespace / indentation) • Fypp : Preprocessor (Python-based templates / macros) • PhD student in theoretical chemistry • Fortran tools used in CP2K / DBCSR: https://github.com/cp2k/dbcsr sparse matrix/tensor library DBCSR • CP2K is strong in algorithms based on sparse linear algebra using the https://github.com/cp2k/cp2k • CP2K project: quantum chemistry & solid state physics, Fortran 2008 2 / 16

  3. Generic Programming in General • application of the same algorithm to multiple data types • solve a class of related problems instead of tackling each specifjc problem on its own 3 / 16 → e.g. sort implemenation for arbitrary data types → general algorithm that can be applied to many difgerent problems

  4. Generic Programming with Fortran implement a basic preprocessor (cpp/fpp) restricted to including macro language • Non-standard preprocessors : extend Fortran syntax with an external • Code generators : delegate generic programming to another language • Code duplication : limited and problematic (bug fjxes, refactoring) Typical workarounds for missing Fortran macros/templates: common code snippet. 4 / 16 Important generic programming ingredients: types • Templates and macros (compile-time): generate code for difgerent types (e.g. a common shape class for rectangles and triangles) • Polymorphism (run-time): generic type representing multiple specifjc □ ✓ Fortran 2003 □ Fortran standard does not include templates/macros. Compilers

  5. Example: tensor contractions / Einstein summation ! allocatable : : a , b , c integer , dimension ( : , : , : , : ) , allocatable : : data_a integer , dimension ( : , : , : ) , allocatable : : data_b . . . [ 1 ] ) a l l o c a t e and assign data_a , data_b . . . a = tensor ( data_a ) b = tensor (data_b ) c = tensor_einsum ( & a , [1 ,2 ,3 ,4] , b , [2 ,1 ,5] , [5 ,3 ,4]) Fortran API as simple as commonly used Python libraries (Numpy, PyTorch) Implementation in 500 lines of Fortran: https://github.com/pseewald/fortran-einsum-example c l a s s ( tensor ) , [3 ,2] , 5 / 16 real , k Generic Fortran API: c l a s s ( tensor ) , allocatable : : a , b , c real , dimension ( : , : , : ) , allocatable : : [1 ,2 ,3] , b , data_a dimension ( : , : ) , data_b a , c = tensor_einsum ( & b = tensor (data_b ) a = tensor ( data_a ) allocatable . . . and assign data_a , a l l o c a t e . . . ! data_b : : ∑ ∑ A ijk B kj = C i A ijkl B jim = C mkl i , j

  6. Example: tensor contractions / Einstein summation DO i =1, SIZE (A,1) C(m, k , l ) = C(m, k , l ) + & A( i , j , k , l )∗B( j , i ,m) ENDDO ENDDO ENDDO ENDDO ENDDO DO j =1, SIZE (A,2) DO l =1, SIZE (A,4) DO k=1, SIZE (A,3) C = C + & A( i , j , k)∗B( j , i , k) ENDDO ENDDO ENDDO Generated code also needs to be optimized (loop unrolling, blocking, DO m=1, SIZE (B,3) DO k=1, SIZE (A,3) DO j =1, SIZE (A,2) C( i ) = C( i ) + & k DO i =1, SIZE (A,1) DO j =1, SIZE (A,2) DO k=1, SIZE (A,3) Naive / direct implementation implementation: A( i , j , k)∗B(k , j ) ENDDO ENDDO ENDDO DO i =1, SIZE (A,1) 6 / 16 ∑ ∑ ∑ A ijk B kj = C i A ijkl B jim = C mkl A ijk B jik = C i , j i , j , k parallelization) → not a good starting point

  7. Generic einsum implementation: strategy • all tensor contractions can be mapped to products of matrices and specifjc: implementations for all specifjc types (code generation) 3. ( select type construct) dynamic: mapping generic algorithms to specifjc implementations 2. generic: generic algorithms working on generic types (abstract classes) 1. • generic implementation based on following hierarchy: • difgerent tensor ranks (0–7) • difgerent data types (integer/real/complex in 4-byte/8-byte precision) • tensor type should incorporate vector-vector outer product vector-vector inner product matrix-matrix product matrix-vector product for all fmoating point operations: 7 / 16 vectors → reuse of existing libraries (e.g. Fortran intrinsics or BLAS) • ∑ k A ijk B kj = C i • ∑ i , j A ijkl B jim = C mkl • ∑ i , j , k A ijk B jik = C • A ij B k = C ijk → 6 · 8 = 48 difgerent base data types, need a code generator or preprocessor

  8. Generic code design real , allocatable :: t2 (:, :) type is (tensor_2d) t2%data = specifjc_2d_func(t1%data) ! ... end select end function function specifjc_2d_func(t1) RESULT(t2) real , intent(in) :: t1 (:, :) ! ... type is (tensor_1d) end function function generic_func(t1) RESULT(t2) class(tensor) :: t1 class(tensor), allocatable :: t2 t2 = dynamic_func(t1) ! ... end function t2%data = specifjc_1d_func(t1%data) select type (t1) tensor_einsum tensor_from_2d tensor_to_matrix matrix_to_tensor tensor_to_0d tensor_to_1d tensor_to_2d tensor_from_0d tensor_from_1d reshape_data ! ... reshape matrix_product dot_product outer_product matmul function dynamic_func(t1) RESULT(t2) class(tensor) :: t1 class(tensor), allocatable :: t2 8 / 16

  9. Tensor einsum: types type , d ( : , : ) : : allocatable integer ( int32 ) , data_2d_i4 : : extends ( tensor_data ) . . . Automated type generation using ! end type d ( : ) : : allocatable integer ( int32 ) , data_1d_i4 : : end type Fypp preprocessor: type , data_${ rank }$d_${ name }$ #:endfor #:endfor end type d${ shape ( rank )}$ : : allocatable ${ type }$ , : : & #: for extends ( tensor_data ) type , in data_params type #: for name , ranks in rank extends ( tensor_data ) . . . ! data integer ( int32 ) , data_0d_i4 : : extends ( tensor_data ) type , instances ) types (48 concrete : : ! end type tensor_data : : abstract type , data type abstract allocatable d ! type , end type d : : allocatable real ( real32 ) , data_0d_r4 : : extends ( tensor_data ) end type end type d : : allocatable integer ( int64 ) , data_0d_i8 : : extends ( tensor_data ) type , 9 / 16

  10. Fypp – Python powered Fortran metaprogramming @:ASSERT( size (myArray) > 0) l i n e ${_LINE_}$” e r r o r stop end i f #: endif #:enddef ASSERT • Insertion of arbitrary Python expressions https://github.com/aradi/fypp character (∗) , parameter : : comp_date = ”${time . s t r f t i m e (’%Y % m % d ’)} $” ${_FILE_}$ , f i l e in f a i l e d • Iterated output to simulate templates interface myfunc #: for dtype in [ ’ r e a l ’ , ’ dreal ’ , ’ complex ’ , ’ dcomplex ’ ] module procedure myfunc_${dtype}$ #:endfor end interface myfunc • Macros #:def ASSERT( cond ) #: i f DEBUG > 0 i f ( . not . ${cond}$) then print ∗ , ” Assert 10 / 16 − −

  11. Tensor einsum: macros #:endfor #: for name , type in data_params type , extends ( tensor_data ) : : data_${ rank }$d_${ name }$ ${ type }$ , allocatable : : d${ shape ( rank )}$ end type #:endfor tensor_lib .fpp ! . . . type , extends ( tensor_data ) : : data_3d_i4 integer ( int32 ) , allocatable : : d ( : , : , : ) end type ! . . . ranks in rank . . . ] #:set ranks = range (0 , RANK +1) #:set data_name = [ ’ i4 ’ , ’ i8 ’ , ’ r4 ’ , ’ r8 ’ , ’ c4 ’ , ’ c8 ’ ] #:set data_type = [ ’ integer ( int32 ) ’ , ’ integer ( int64 ) ’ , ’ r e a l ( real32 ) ’ , #:set data_params = l i s t ( zip (data_name , #: for data_type )) #:def shape (n) $ : ’ ’ i f n == 0 else #:enddef ! concrete data types generated using Fypp preprocessor 11 / 16 ’ ( ’ + ’ : ’ + ’ , : ’ ∗ (n − 1) + ’ ) ’ fypp − DRANK=7 tensor_lib.fpp > tensor_lib.f90

  12. Tensor einsum: types and constructor data${ shape ( rank )}$ allocatable c l a s s ( tensor ) , t_${ rank }$d : : & allocatable type ( tensor_${ rank }$d ) , sh : : dimension (${ rank }$) integer , : : t intent ( in ) ${ type }$ , r e s u l t ( t ) tensor_${ rank }$d_${ name }$ ( data ) & function in data_params type #: for name , ranks in rank : : type ( data_${ rank }$d_${ name }$ ) , & implementation ( t_data ) #:endfor #:endfor end function t ) move_alloc (t_${ rank }$d , c a l l move_alloc ( t_data , t_${ rank }$d% data ) c a l l source= data ) ( t_data%d , allocate allocate allocatable source=sh ) (t_${ rank }$d% shape (${ rank }$ ) , & allocate (t_${ rank }$d) allocate #: endif sh = shape ( data ) rank > 0 #: i f t_data : : #: for constructor ! c l a s s ( tensor_data ) , rank #: for types tensor concrete ! end type data : : allocatable shape ranks : : allocatable dimension ( : ) , integer , tensor : : abstract type , type tensor abstract in type , ! interface end interface #:endfor #:endfor tensor_${ rank }$d_${ name }$ module procedure #: for name in data_name ranks in rank #: for tensor constructor extends ( tensor ) ! #:endfor end type #: endif vector_type = row_vec : : integer rank == 1 #: i f tensor_${ rank }$d : : 12 / 16

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend