SLIDE 44
ROW-LENGTH “IMPERVIOUSNESS”:
The correlation between throughput (GFLOPs) and row-length variation (closer to 0.0 is better)
- 0.16 MKL CsrMV
- 0.07 Merge-based CsrMV
- 0.06 CSB SpMV [1]
- 0.03 pOSKI SpMV [2]
- 0.24 cuSPARSE CsrMV
- 0.01 Merge-based CsrMV
- 0.07 HYB SpMV [3]
- 0.04 yaSpMV [4]
Chart axis: Correlation of GFLOPs to row-variation (ticks 0.1 to 0.5)
[1] A. Buluç, J. T. Fineman, M. Frigo, J. R. Gilbert, and C. E. Leiserson, “Parallel Sparse Matrix-Vector and Matrix-Transpose-Vector Multiplication Using Compressed Sparse Blocks,” in Proc. SPAA, Calgary, Canada, 2009.
[2] A. Jain, “pOSKI: An Extensible Autotuning Framework to Perform Optimized SpMVs on Multicore Architectures,” Master’s Thesis, University of California at Berkeley, 2008.
[3] N. Bell and M. Garland, “Implementing sparse matrix-vector multiplication on throughput-oriented processors,” in Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, New York, NY, USA, 2009, pp. 18:1–18:11.
[4] S. Yan, C. Li, Y. Zhang, and H. Zhou, “yaSpMV: Yet Another SpMV Framework on GPUs,” in Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, New York, NY, USA, 2014, pp. 107–118.
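The metric named on this slide can be sketched in a few lines. The slide does not define "row-length variation" or the type of correlation, so this sketch assumes the coefficient of variation (standard deviation divided by mean) of the CSR row lengths and a Pearson correlation across a set of matrices. The function names, the three example matrices, and their GFLOPs figures are hypothetical and are not measurements from the talk.

```python
import math

def row_length_variation(row_offsets):
    """Coefficient of variation (std / mean) of CSR row lengths.

    Assumption: this is one plausible reading of "row-length variation".
    row_offsets is the usual CSR offsets array of length num_rows + 1,
    and the matrix is assumed to have at least one nonzero.
    """
    lengths = [b - a for a, b in zip(row_offsets, row_offsets[1:])]
    mean = sum(lengths) / len(lengths)
    variance = sum((x - mean) ** 2 for x in lengths) / len(lengths)
    return math.sqrt(variance) / mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical inputs: CSR row offsets for three small matrices and
# made-up throughput numbers for one SpMV implementation.
matrices = [
    [0, 2, 4, 6, 8],   # uniform rows
    [0, 1, 4, 5, 8],   # mildly irregular rows
    [0, 7, 8, 8, 8],   # one long row, two empty rows
]
gflops = [20.0, 19.5, 12.0]  # illustrative values only

variation = [row_length_variation(m) for m in matrices]
r = pearson(variation, gflops)
print(f"correlation of GFLOPs to row-length variation: {r:.2f}")
```

Under these assumptions, a value of `r` near 0.0 means an implementation's throughput is largely insensitive to how unevenly nonzeros are spread across rows.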