
Static Analysis of OpenMP Data Mapping for Target Offloading

Prithayan Barua, Vivek Sarkar


  1. Static Analysis of OpenMP Data Mapping for Target Offloading. Prithayan Barua, Vivek Sarkar. Georgia Institute of Technology.

  2. Acknowledgements: Shirako Jun, Tsang Whitney, Paudel Jeeva, Chen Wang. OMPSan: Static Verification of OpenMP's Data Mapping Constructs. IWOMP 2019.

  3. Outline
     1. Introduction: OpenMP Target Offloading
     2. Our Solution: Basic Idea; Analysis; Interpret OpenMP Clauses
     3. Evaluation: Example Analysis; Experiment Results
     4. Conclusion

  5. OpenMP Target Offloading: Programming heterogeneous systems using the OpenMP programming model
     - The host can offload computations to target devices.
     - Each target device has a corresponding data environment.
     - The host can update the data between host and devices using data mapping clauses.
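As a minimal sketch of the three points above, the function below offloads one loop and uses `map` clauses to move its data. The function name `scale` and its parameters are hypothetical and are not taken from the slides. If the compiler ignores OpenMP pragmas, or no device is available, the loop simply runs on the host and produces the same result.

```c
/* Hypothetical example: compute y[i] = a * x[i] in a target region. */
void scale(const double *x, double *y, int n, double a) {
    /* map(to:)   copies x[0:n] into the device data environment on entry.
       map(from:) copies y[0:n] back to the host on exit.
       Scalars such as n and a are passed to the region implicitly. */
    #pragma omp target map(to: x[0:n]) map(from: y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i];
}
```

Here each array travels in one direction only, which is the kind of choice the data mapping clauses leave to the programmer.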

  6. OpenMP Target Offloading: Using OpenMP for target offloading. Example 1, how to offload computations:

          #define N 10
     L2:  int A[N], sum=0;
     L4:  #pragma omp target data map(tofrom:A[0:N])
          {
     L7:    #pragma omp target
     L8:    for (int i=0; i<N; i++) {
     L9:      A[i]=i;
            }
     L11:   #pragma omp target reduction(+:sum)
     L12:   for (int i=0; i<N; i++) {
     L13:     sum += A[i];
            }
     L14: }

  7. OpenMP Target Offloading: Semantics of target data map. Example 1, at L2:

          #define N 10
     L2:  int A[N], sum=0;
     L4:  #pragma omp target data map(tofrom:A[0:N])
          {
     L7:    #pragma omp target
     L8:    for (int i=0; i<N; i++) {
     L9:      A[i]=i;
            }
     L11:   #pragma omp target reduction(+:sum)
     L12:   for (int i=0; i<N; i++) {
     L13:     sum += A[i];
            }
     L14: }

  8. OpenMP Target Offloading: Semantics of target. Example 1, at L4:

          #define N 10
     L2:  int A[N], sum=0;
     L4:  #pragma omp target data map(tofrom:A[0:N])
          {                                  // Copy 'A[0:N]' to device.
     L7:    #pragma omp target
     L8:    for (int i=0; i<N; i++) {
     L9:      A[i]=i;
            }
     L11:   #pragma omp target reduction(+:sum)
     L12:   for (int i=0; i<N; i++) {
     L13:     sum += A[i];
            }
     L14: }                                  // Copy 'A[0:N]' from device to host.

  9. OpenMP Target Offloading: Semantics of target. Example 1, at L8:

          #define N 10
     L2:  int A[N], sum=0;
     L4:  #pragma omp target data map(tofrom:A[0:N])
          {                                  // Copy 'A[0:N]' to device.
     L7:    #pragma omp target
     L8:    for (int i=0; i<N; i++) {        // Execute on device.
     L9:      A[i]=i;
            }                                // Leave 'A[0:N]' on device.
     L11:   #pragma omp target reduction(+:sum)
     L12:   for (int i=0; i<N; i++) {        // Execute on device.
     L13:     sum += A[i];
            }                                // Leave 'A[0:N]' and 'sum' on device.
     L14: }                                  // Copy 'A[0:N]' from device to host.

  10. OpenMP Target Offloading: Semantics of target. Example 1, at L12:

          #define N 10
     L2:  int A[N], sum=0;
     L4:  #pragma omp target data map(tofrom:A[0:N])
          {                                  // Copy 'A[0:N]' to device.
     L7:    #pragma omp target
     L8:    for (int i=0; i<N; i++) {        // Execute on device.
     L9:      A[i]=i;
            }                                // Leave 'A[0:N]' on device.
     L11:   #pragma omp target reduction(+:sum)
     L12:   for (int i=0; i<N; i++) {        // Execute on device.
     L13:     sum += A[i];
            }                                // Leave 'A[0:N]' and 'sum' on device.
     L14: }                                  // Copy 'A[0:N]' from device to host.

  11. OpenMP Target Offloading: Semantics of target data map. Example 1, at L14:

          #define N 10
     L2:  int A[N], sum=0;
     L4:  #pragma omp target data map(tofrom:A[0:N])
          {                                  // Copy 'A[0:N]' to device.
     L7:    #pragma omp target
     L8:    for (int i=0; i<N; i++) {        // Execute on device.
     L9:      A[i]=i;
            }                                // Leave 'A[0:N]' on device.
     L11:   #pragma omp target reduction(+:sum)
     L12:   for (int i=0; i<N; i++) {        // Execute on device.
     L13:     sum += A[i];
            }                                // Leave 'A[0:N]' and 'sum' on device.
     L14: }                                  // Copy 'A[0:N]' from device to host.

  12. OpenMP Target Offloading: Example 2, execute the L11 loop on the host:

          #define N 10
     L2:  int A[N], sum=0;
     L4:  #pragma omp target data map(tofrom:A[0:N])
          {
     L7:    #pragma omp target
     L8:    for (int i=0; i<N; i++) {
     L9:      A[i]=i;
            }
     L11:   // #pragma omp target reduction(+:sum)
     L12:   for (int i=0; i<N; i++) {
     L13:     sum += A[i];
            }
     L14: }

  13. OpenMP Target Offloading: Disaster!! Wrong output. Example 2, at L12:

          #define N 10
     L2:  int A[N], sum=0;
     L4:  #pragma omp target data map(tofrom:A[0:N])
          {                                  // Allocate 'A[0:N]' on device.
     L7:    #pragma omp target
     L8:    for (int i=0; i<N; i++) {        // Execute on device.
     L9:      A[i]=i;
            }                                // Leave 'A[0:N]' on device.
     L11:
     L12:   for (int i=0; i<N; i++) {        // Execute on host.
     L13:     sum += A[i];                   // Access host copy of stale 'A'!
            }
     L14: }                                  // Copy 'A[0:N]' from device to host.

  14. OpenMP Target Offloading: But why? Default solution: the OpenMP specifications.

  15. OpenMP Target Offloading: Understanding the data map usage. [Figures: the data map specification; our flowchart to explain the specification.]
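The flowchart itself did not survive extraction. The sketch below is not that flowchart. It is a simplified host-only model of the reference-count rule in the OpenMP specification: data is copied to the device only when a mapping raises the count from zero, and copied back only when a mapping drops it to zero. All names here (`mapped_array`, `map_enter`, `map_exit`, `update_from`, `replay`) are hypothetical, the `always` and `delete` modifiers are ignored, and the host array is zero-initialized so the stale result is deterministic.

```c
#include <string.h>

#define N 10

typedef enum { MAP_ALLOC, MAP_TO, MAP_FROM, MAP_TOFROM } map_type;

/* Hypothetical model of one mapped array: host copy, device copy,
   and the reference count kept by the device data environment. */
typedef struct {
    int host[N];
    int dev[N];
    int refcount;
} mapped_array;

/* Entering a target or target data region that maps the array. */
static void map_enter(mapped_array *a, map_type t) {
    if (a->refcount == 0 && (t == MAP_TO || t == MAP_TOFROM))
        memcpy(a->dev, a->host, sizeof a->dev);   /* host -> device */
    a->refcount++;
}

/* Leaving that region. */
static void map_exit(mapped_array *a, map_type t) {
    a->refcount--;
    if (a->refcount == 0 && (t == MAP_FROM || t == MAP_TOFROM))
        memcpy(a->host, a->dev, sizeof a->host);  /* device -> host */
}

/* Models '#pragma omp target update from(...)': copies unconditionally. */
static void update_from(mapped_array *a) {
    memcpy(a->host, a->dev, sizeof a->host);
}

/* Replays Example 2 (force_update == 0) or Example 3 (force_update != 0)
   and returns the sum computed by the host loop at L12-L13. */
int replay(int force_update) {
    mapped_array A = {0};
    int sum = 0;
    map_type inner = force_update ? MAP_FROM : MAP_TOFROM;

    map_enter(&A, MAP_TOFROM);      /* L4: count 0 -> 1, copy to device  */
    map_enter(&A, inner);           /* L7: count 1 -> 2, no copy         */
    for (int i = 0; i < N; i++)     /* L8-L9: runs on the device copy    */
        A.dev[i] = i;
    map_exit(&A, inner);            /* count 2 -> 1, no copy back        */
    if (force_update)
        update_from(&A);            /* Example 3, L11                    */
    for (int i = 0; i < N; i++)     /* L12-L13: runs on the host copy    */
        sum += A.host[i];
    map_exit(&A, MAP_TOFROM);       /* L14: count 1 -> 0, copy to host   */
    return sum;
}
```

In this model `replay(0)` returns 0 because the host loop reads the stale host copy, while `replay(1)` returns 45. The inner `map(from:A[0:N])` of Example 3 copies nothing by itself, since the enclosing `target data` region still holds a reference.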

  16. OpenMP Target Offloading: One possible fix. Example 3:

          #define N 10
     L2:  int A[N], sum=0;
     L4:  #pragma omp target data map(tofrom:A[0:N])
          {
     L7:    #pragma omp target map(from:A[0:N])
     L8:    for (int i=0; i<N; i++) {
     L9:      A[i]=i;
            }
            // Force copy 'A[0:N]' to host.
     L11:   #pragma omp target update from(A[0:N])
     L12:   for (int i=0; i<N; i++) {
     L13:     sum += A[i];
            }
     L14: }
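Example 3 can be wrapped into a compilable function to check the result. This is a sketch: the wrapper name `sum_after_update` is hypothetical, the line labels are dropped, and `A` is zero-initialized, which the slide does not do. Without an offloading-capable compiler the pragmas are ignored or fall back to the host, and the function still returns the same value.

```c
#define N 10

/* Hypothetical wrapper around Example 3. */
int sum_after_update(void) {
    int A[N] = {0}, sum = 0;

    #pragma omp target data map(tofrom: A[0:N])
    {
        /* Runs on the device. A is already present, so map(from:)
           does not trigger a copy when this region ends. */
        #pragma omp target map(from: A[0:N])
        for (int i = 0; i < N; i++)
            A[i] = i;

        /* Force the device values of A back to the host copy. */
        #pragma omp target update from(A[0:N])

        /* Runs on the host and now reads up-to-date values. */
        for (int i = 0; i < N; i++)
            sum += A[i];
    }
    return sum;
}
```

The function returns 45, the sum of 0 through 9.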

  17. OpenMP Target Offloading: Memory optimization. Naive Jacobi:

     while ( error > tol && iter < iter_max ) {
       error = 0.0;
       #pragma omp target map(tofrom:Anew) map(tofrom:A) map(tofrom:error)
       for ( int j = 1; j < n-1; j++)
         for ( int i = 1; i < m-1; i++ ) {
           Anew[j][i] = 0.25 * ( A[j][i+1] + A[j][i-1]
                               + A[j-1][i] + A[j+1][i]);
           error = fmax( error, fabs(Anew[j][i] - A[j][i]));
         }
       #pragma omp target map(tofrom:Anew) map(tofrom:A)
       for ( int j = 1; j < n-1; j++)
         for ( int i = 1; i < m-1; i++ )
           A[j][i] = Anew[j][i];
       iter++;
     }

  18. OpenMP Target Offloading: Memory optimization. Remove redundant memory copies:

     #pragma omp target data map(to:Anew) map(tofrom:A)
     while ( error > tol && iter < iter_max ) {
       error = 0.0;
       #pragma omp target map(tofrom:error)
       for ( int j = 1; j < n-1; j++)
         for ( int i = 1; i < m-1; i++ ) {
           Anew[j][i] = 0.25 * ( A[j][i+1] + A[j][i-1]
                               + A[j-1][i] + A[j+1][i]);
           error = fmax( error, fabs(Anew[j][i] - A[j][i]));
         }
       #pragma omp target
       for ( int j = 1; j < n-1; j++)
         for ( int i = 1; i < m-1; i++ )
           A[j][i] = Anew[j][i];
       iter++;
     }
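For readers who want to run the optimized loop, here is a self-contained sketch. The grid size, the boundary values, and the function name `jacobi` are assumptions made for illustration, and `fmax`/`fabs` are replaced by plain comparisons so the example needs no math library. The mapping clauses follow the slide: the arrays are mapped once around the `while` loop, and only `error` moves on every sweep.

```c
#define NN 8   /* rows: hypothetical small grid */
#define MM 8   /* columns */

/* Runs Jacobi sweeps until the largest update is at most tol or
   iter_max sweeps are done. Returns the number of sweeps and stores
   the last measured error in *final_error. */
int jacobi(double tol, int iter_max, double *final_error) {
    double A[NN][MM], Anew[NN][MM];
    int n = NN, m = MM, iter = 0;
    double error = 1.0;

    /* Assumed boundary condition: top row 1.0, everything else 0.0. */
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++) {
            A[j][i] = (j == 0) ? 1.0 : 0.0;
            Anew[j][i] = A[j][i];
        }

    /* Map both arrays once for the whole iteration. */
    #pragma omp target data map(to: Anew) map(tofrom: A)
    while (error > tol && iter < iter_max) {
        error = 0.0;

        /* Only the scalar error is transferred on each sweep. */
        #pragma omp target map(tofrom: error)
        for (int j = 1; j < n - 1; j++)
            for (int i = 1; i < m - 1; i++) {
                Anew[j][i] = 0.25 * (A[j][i + 1] + A[j][i - 1]
                                   + A[j - 1][i] + A[j + 1][i]);
                double diff = Anew[j][i] - A[j][i];
                if (diff < 0.0) diff = -diff;
                if (diff > error) error = diff;
            }

        /* A and Anew are already present on the device: no copies. */
        #pragma omp target
        for (int j = 1; j < n - 1; j++)
            for (int i = 1; i < m - 1; i++)
                A[j][i] = Anew[j][i];

        iter++;
    }

    *final_error = error;
    return iter;
}
```

A call such as `jacobi(1e-4, 1000, &err)` stops once the largest per-sweep change falls to the tolerance. Compared with the naive version, each sweep no longer copies both arrays in and out of the device twice.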
