Outline
1.) N-Body Methods
2.) Dynamic Programming
3.) Sparse Linear Algebra
4.) Unstructured Grids
5.) Conclusion

N-Body Methods Overview
Particle simulation
Large number of simple entities
Source: [3]
Image Source: [2]
Memory bandwidth
Vector register utilization
Computation bandwidth
Memory latency
Image Source: [1]
Source: [1]
Source: [1]
Matrices from the UFL Sparse Matrix Collection (/20)
Matrix 8 has many more non-zero entries than the rest.
Performance in GFlop/s, based on data from [2]
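The slides do not show the kernel behind these numbers. As a minimal sketch of sparse matrix-vector multiplication, assuming CSR (compressed sparse row) storage, the code below shows why the number of non-zero entries drives the work: the inner loop runs once per stored entry. The function name `spmv_csr`, the helper `flop_count`, and the small 3x3 matrix are hypothetical and chosen only for illustration; the cited papers evaluate several storage formats.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (illustrative sketch)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Entries of this row live in values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is read indirectly through col_idx, an irregular access pattern.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def flop_count(values):
    """One multiply and one add per stored non-zero entry."""
    return 2 * len(values)


# Hypothetical example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
```

Dividing `flop_count(values)` by the measured run time gives a rate in floating-point operations per second, which is the kind of figure reported in GFlop/s.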
Image Source: [4]
Image Source: [5]
Image Source: [6]
Data races cannot be determined
"loop over edges and accessing data on
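The quoted statement is cut off in the slides, so the following is only an illustration of the general issue, not the authors' code. It assumes an edge-based loop that updates values stored on nodes through an index array. Which edges write to the same node depends on the mesh connectivity, which is data read at run time. The toy mesh and the names `accumulate_over_edges` and `conflicting_nodes` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy mesh: each edge is a pair of node indices.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
node_value = [1.0, 2.0, 4.0, 8.0]


def accumulate_over_edges(edges, node_value):
    """Serial reference: each edge adds a contribution to both of its end nodes."""
    result = [0.0] * len(node_value)
    for a, b in edges:
        flux = node_value[b] - node_value[a]
        # Indirect writes: the target indices come from the edge list.
        result[a] += flux
        result[b] -= flux
    return result


def conflicting_nodes(edges):
    """Map each node written by more than one edge to the list of those edges."""
    writers = defaultdict(list)
    for edge_id, (a, b) in enumerate(edges):
        writers[a].append(edge_id)
        writers[b].append(edge_id)
    return {node: ids for node, ids in writers.items() if len(ids) > 1}


result = accumulate_over_edges(edges, node_value)
conflicts = conflicting_nodes(edges)
```

In this toy mesh every node is written by two different edges, so running the edge loop in parallel without further measures could lose updates. The conflict set only becomes known once the edge list has been read.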
Image Source: [7]
[Figure: comparison of 2 × Xeon E5-2680, Xeon Phi 5110P, and Tesla K40 by price ($), number of cores, memory bandwidth, double-precision GFlops, and last-level cache]
At the highest level:
At the lower level:
Image Source: [8]
Image Source: [8]
Benchmark probably not fair (price)
Do manual vectorization:
Use MPI +
Architecture introduction: slide 18 and 1 to 8
Presentation team x2: slide 1 and 16 to 27
Presentation team x2: slide 2 to 15
[1] Xing Liu et al.: Efficient Sparse Matrix-Vector Multiplication on x86-Based Many-Core Processors
[2] Erik Saule et al.: Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi
[3] Konstantinos Krommydas et al.: On the Characterization of OpenCL Dwarfs on Fixed and Reconfigurable Platforms
[4] http://en.wikipedia.org/wiki/Unstructured_grid
[5] http://view.eecs.berkeley.edu/wiki/Unstructured_Grids
[6] https://cfd.gmu.edu/~jcebral/gallery/vis04/index.html
[7] http://en.wikipedia.org/wiki/File:Airfoil_with_flow.png
[8] I. Z. Reguly et al.: Vectorizing Unstructured Mesh Computations for Many-core Architecture