On the Variety of Static Control Parts in Real-World Applications: from Affine via Multi-dimensional to Polynomial and Just-in-Time

  1. On the Variety of Static Control Parts in Real-World Applications: from Affine via Multi-dimensional to Polynomial and Just-in-Time. Andreas Simbürger, Armin Größlinger. 4th International Workshop on Polyhedral Compilation Techniques.

  5. Defining the Real World
     ◮ LLVM (llvm.org)
     ◮ Polly (polly.llvm.org)
     ◮ PolyJIT (www.infosun.fim.uni-passau.de/cl/PolyJIT)

  6. Automatic Detection of SCoPs in LLVM
     [Figure: pipeline from LLVM IR to Polly IR; stages shown: loop detection, loop normalization, scalar evolution, SCoP detection]

  10. Effectiveness of Automatic Polyhedral Optimization
      [Figure: pipeline with the boxes LLVM IR, SCoP detection, Polly optimizer, and optimized LLVM IR]
      ◮ Detection: applicability and potential of valid loops
      ◮ Optimizer: exploitation of parallelism (transformations)
      The detection process lacks thorough empirical evaluation!

  11. PolyJIT: pprof
      ◮ Set of 50 programs commonly used in various domains.
      ◮ 8 domains (Multimedia, Scientific, Simulation, Encryption, Compilation, Compression, Databases, Verification).
      ◮ Extract run-time and compile-time statistics.

  12. Measuring a SCoP’s fraction of the total run time
      What fraction of a program’s total run time is spent inside SCoPs?

          for (int i = 0; i <= n; ++i)
            for (int j = i; j <= n; ++j)
              if (i >= n - j) {
                S: A[i+n][j+i] = B[n+2*i-1][j];
                T: B[i+n][j-i] = A[n-2*i+1][j];
              }

      Definition (Execution SCoP coverage):
      $\mathrm{ExecCov} = \dfrac{\text{Time spent inside SCoPs}}{\text{Total program run time}}$
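
As a worked illustration of the coverage definition, here is a minimal C sketch. The name `exec_cov` and the choice to return 0 for a non-positive total are assumptions made for this example; they are not taken from pprof.

```c
/* Hypothetical helper (not part of pprof): execution SCoP coverage,
 * i.e. time spent inside SCoPs divided by total program run time.
 * Both arguments are expected in the same unit, e.g. seconds. */
double exec_cov(double scop_seconds, double total_seconds) {
    /* Assumption: report 0 coverage when no total run time was measured. */
    if (total_seconds <= 0.0)
        return 0.0;
    return scop_seconds / total_seconds;
}
```

For instance, a program that runs for 8 s and spends 2 s inside SCoPs has a coverage of 0.25.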

  18. Static Control Parts: Class Static
      Detection at compile time

          for (int i = 0; i <= n; ++i)
            for (int j = i; j <= n; ++j)
              if (i >= n - j) {
                S: A[i+n][j+i] = B[n+2*i-1][j];
                T: B[i+n][j-i] = A[n-2*i+1][j];
              }

      1. Affine expressions in
         ◮ Loop bounds
         ◮ Conditions
         ◮ Memory accesses
      2. Static control flow
      3. Function calls with known side effects

      What can we do if it is not a static (affine) SCoP?
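
Because the bounds and the guard of the example nest are affine in `i`, `j`, and the parameter `n`, its iteration domain can be written as a set of integer points bounded by affine inequalities. The sketch below shows this under stated assumptions: `in_domain` and `count_instances` are hypothetical helper names, and the array accesses are left out so that only the control structure is exercised.

```c
#include <stdbool.h>

/* Hypothetical helper: membership test for the iteration domain
 * { (i, j) : 0 <= i <= n, i <= j <= n, i >= n - j } of the example nest.
 * Every constraint is affine in i, j, and the parameter n. */
bool in_domain(int i, int j, int n) {
    return 0 <= i && i <= n && i <= j && j <= n && i >= n - j;
}

/* Hypothetical helper: walks the same control structure as the slide's
 * loop nest and counts how often the guarded body (S and T) would run. */
long count_instances(int n) {
    long count = 0;
    for (int i = 0; i <= n; ++i)
        for (int j = i; j <= n; ++j)
            if (i >= n - j)
                ++count;
    return count;
}
```

The count depends only on `n`, which is what makes the control flow static: no data read from `A` or `B` influences which statement instances execute.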

  20. Problem 1: Multi-dimensional array accesses (contiguous)
      Source access: A[i][j]
      LLVM IR at clang -O0:

          %0 = mul nsw i32 %i, %n
          %idx = getelementptr float* %A, i32 %0
          %idx1 = getelementptr float* %idx, i32 %j

  21. Problem 1: Multi-dimensional array accesses (contiguous)
      LLVM IR at clang -O1:

          %0 = mul nsw i32 %i, %n
          %idx.s = add i32 %0, %j
          %idx1 = getelementptr float* %A, i32 %idx.s

      Effective access: A[n*i+j]
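
The two IR variants compute the same address. A small C sketch of that equivalence follows, assuming a row-major array with `n` columns; `linear_index` and `load_flat` are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper: offset of element (i, j) in a row-major array
 * with n columns. This mirrors the mul/add pair in the -O1 IR. */
size_t linear_index(size_t i, size_t j, size_t n) {
    return n * i + j;
}

/* Hypothetical helper: reads element (i, j) through a flat pointer,
 * i.e. the A[n*i+j] form that remains after optimization. */
float load_flat(const float *A, size_t i, size_t j, size_t n) {
    return A[linear_index(i, j, n)];
}
```

The product `n * i` multiplies a parameter by a loop variable, so the flat index is no longer affine in `i` and `j` alone. That is the problem this slide names.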

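  25. Delinearization of array accesses
      Linearized access: A[n*i+i+j], where $n \cdot i + i + j = (n + 1) \cdot i + j$.
      ◮ A[i][i+j] with rows of length n
      ◮ A[i][j] with rows of length n+1
      [Figure: two i/j grids showing the same access over row lengths n and n+1]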
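
A numeric illustration of the two readings, as a sketch only: a compiler delinearizes symbolically, whereas the hypothetical helper `delinearize` below just splits a concrete offset by division and remainder for an assumed row length.

```c
#include <stddef.h>

/* Hypothetical helper: splits a linear offset into (row, col) for a
 * given row length. This is a numeric stand-in for the symbolic
 * delinearization a compiler would perform. */
void delinearize(size_t offset, size_t row_len, size_t *row, size_t *col) {
    *row = offset / row_len;
    *col = offset % row_len;
}
```

With n = 4, i = 2, j = 3 the offset n*i+i+j is 13. Splitting by row length n+1 = 5 gives (2, 3), which matches A[i][j]. Splitting by row length n = 4 gives (3, 1) rather than (i, i+j) = (2, 5), because the column i+j = 5 does not fit into a row of length 4.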

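  37. Static Control Parts: Class Multi
      Let’s allow delinearizable accesses!
      Accesses: A[(n+2+m)*i] and A[i'+2*m+2*n]
      Do both touch the same element, i.e. is $a = a'$?
      $ni + 2i + mi = i' + 2m + 2n$
      $a - a' = 0$, that is $ni + 2i + mi - i' - 2m - 2n = 0$
      ◮ Split into terms: $ni,\ 2i,\ mi,\ -i',\ -2m,\ -2n$
      ◮ Group by parameters: $n(i - 2) + m(i - 2) + 1 \cdot (2i - i')$
      ◮ Factor out common expressions: $(n + m)(i - 2) + (1)(2i - i')$
      ◮ Bounds check: $|1 \cdot (2i - i')| \le |n + m| - 1$
      [Figure: number line marking $-n-m$, $0$, $i$, and $n+m$]
      If the bounds check holds, the second summand is too small to cancel a nonzero multiple of $n + m$, so both parts must vanish separately:
      $a - a' = 0 \Leftrightarrow i - 2 = 0 \land 2i - i' = 0$, i.e. $i = 2$ and $i' = 4$.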
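
The result of the derivation can be cross-checked by brute force for concrete parameter values. The sketch below is an illustration under stated assumptions: `count_solutions` is a hypothetical name, and the search is limited to a finite window, so it confirms the claim only within that window.

```c
#include <stdlib.h>

/* Hypothetical helper: counts pairs (i, ip) in [-window, window]^2 with
 *   (n + 2 + m) * i == ip + 2*m + 2*n        (a == a')
 * that also satisfy the slide's bounds check
 *   |2*i - ip| <= |n + m| - 1.
 * The last pair found is stored in *i_out and *ip_out. */
int count_solutions(int n, int m, int window, int *i_out, int *ip_out) {
    int found = 0;
    int bound = abs(n + m) - 1;
    for (int i = -window; i <= window; ++i) {
        for (int ip = -window; ip <= window; ++ip) {
            int a  = (n + 2 + m) * i;        /* index of the first access  */
            int ap = ip + 2 * m + 2 * n;     /* index of the second access */
            if (a == ap && abs(2 * i - ip) <= bound) {
                ++found;
                *i_out = i;
                *ip_out = ip;
            }
        }
    }
    return found;
}
```

For n = 3 and m = 2 with a window of 50, the only pair that passes is i = 2, i' = 4, in line with the derivation.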
