

SLIDE 1

Space-filling curves in SpMV multiplication

Albert-Jan Yzelman (ExaScience Lab / KU Leuven)
Dirk Roose (KU Leuven)
September 2013

© 2013, ExaScience Lab - A. N. Yzelman, D. Roose

SLIDE 2

Introduction

Given a sparse m × n matrix A and an n × 1 input vector x, we consider both the sequential and the parallel computation of

    Ax = y.

We utilise space-filling curves to offset inefficient cache use.


SLIDE 3

Introduction

Curves have always been used in sparse computations:

Compressed Row Storage (CRS)

A row-major ordering of the matrix nonzeroes is imposed by this curve, which sweeps the matrix row by row. This yields linear access of the output vector y, but irregular access of the input vector x.
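As a concrete illustration, a minimal CRS SpMV kernel might look as follows (a sketch; the array names V, J, I match the CRS example later in this deck, and y is assumed zero-initialised):

    /* Minimal CRS SpMV kernel: note the single linear pass over y,
     * versus the irregular, J-driven accesses of x. */
    void spmv_crs(const int m, const double *V, const int *J, const int *I,
                  const double *x, double *y) {
        for (int i = 0; i < m; i++)             /* linear pass over rows of y */
            for (int k = I[i]; k < I[i+1]; k++) /* nonzeroes of row i         */
                y[i] += V[k] * x[J[k]];         /* x accessed irregularly     */
    }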



SLIDE 5

Introduction

Ideas for improvement:

Zig-zag CRS

An alternating ascending/descending row-major ordering. It retains linear access of the output vector y, and imposes a bit more locality (O(m)) on accesses of the input vector x.
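A sketch of how a zig-zag order can be obtained from an existing CRS structure, assuming the convention that odd rows store their nonzeroes in descending column order (under that convention the plain CRS kernel above is reused unchanged; only the storage order of V and J differs):

    /* Convert row-major CRS into zig-zag CRS by reversing the nonzero
     * order of every odd row; V, J, I as in the CRS example later on. */
    void make_zigzag(const int m, double *V, int *J, const int *I) {
        for (int i = 1; i < m; i += 2) {                   /* odd rows only */
            for (int l = I[i], r = I[i+1] - 1; l < r; l++, r--) {
                double tv = V[l]; V[l] = V[r]; V[r] = tv;  /* reverse V     */
                int    tj = J[l]; J[l] = J[r]; J[r] = tj;  /* reverse J     */
            }
        }
    }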

Ref.: A. N. Yzelman and Rob H. Bisseling, “Cache-oblivious sparse matrix-vector multiplication by using sparse matrix partitioning methods”, SIAM Journal on Scientific Computing 31(4), pp. 3128-3154 (2009).


SLIDE 7

Introduction

Ideas for improvement: why not space-filling curves?

Fractal storage using the coordinate format (COO)

Nonzeroes are ordered according to the Hilbert curve. Access of the output vector y is no longer linear, but accesses of both x and y now have temporal locality.
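For concreteness, the standard iterative bit-manipulation routine below computes the 1D Hilbert index of a coordinate; sorting the nonzeroes (i, j) of A by this key yields a Hilbert ordering. (This is a generic textbook routine in one fixed orientation of the curve, not necessarily the exact scheme used by Haase et al.)

    /* Map a coordinate (x, y) inside an n x n grid (n a power of two)
     * to its 1D position along the Hilbert curve. */
    unsigned long hilbert_d(unsigned long n, unsigned long x, unsigned long y) {
        unsigned long d = 0;
        for (unsigned long s = n / 2; s > 0; s /= 2) {
            const unsigned long rx = (x & s) ? 1 : 0;
            const unsigned long ry = (y & s) ? 1 : 0;
            d += s * s * ((3 * rx) ^ ry);
            if (ry == 0) {                 /* rotate the quadrant */
                if (rx == 1) {
                    x = n - 1 - x;
                    y = n - 1 - y;
                }
                const unsigned long t = x; x = y; y = t;
            }
        }
        return d;
    }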

Ref.: Haase, Liebmann and Plank, “A Hilbert-Order Multiplication Scheme for Unstructured Sparse Matrices”, International Journal of Parallel, Emergent and Distributed Systems 22(4), pp. 213-220 (2007).

SLIDE 8

Sequential SpMV

Space-filling curves avoid inefficient cache use, but that is not the only problem:

[Roofline plot: attainable GFLOP/s versus arithmetic intensity (FLOP/byte), bounded by peak memory bandwidth and by peak floating-point performance with vectorization. Image courtesy of Prof. Wim Vanroose, UA.]

SpMV has low arithmetic intensity: bandwidth issues arise. Compression is mandatory!
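As a rough worked example (assuming 8-byte values and 4-byte column indices in CRS, and ignoring vector and row-pointer traffic): each nonzero costs 2 flops (one multiply, one add) against at least 12 bytes of matrix data, giving an arithmetic intensity of at most 2/12 ≈ 0.17 flop/byte, deep inside the bandwidth-bound region of the roofline. This is why index compression pays off directly.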



SLIDE 9

Sequential SpMV

Assuming a row-major order of the nonzeroes:

    A = [ 4  1  3  .  ]
        [ .  .  2  3  ]
        [ 1  .  .  2  ]
        [ 7  .  1  1  ]

CRS representation:

    V = [ 4  1  3  2  3  1  2  7  1  1 ]
    J = [ 0  1  2  2  3  0  3  0  2  3 ]
    Î = [ 0  3  5  7  10 ]   (row pointers)

Storage requirements: Θ(2·nz + m + 1), where nz is the number of nonzeroes in A.
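These arrays can be fed directly to a kernel such as the spmv_crs sketch above:

    /* The running 4 x 4 example in CRS form. */
    const double V[10] = { 4, 1, 3,  2, 3,  1, 2,  7, 1, 1 };
    const int    J[10] = { 0, 1, 2,  2, 3,  0, 3,  0, 2, 3 };
    const int    I[5]  = { 0, 3, 5, 7, 10 };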


SLIDE 10

Sequential SpMV

Assuming a Hilbert order of the nonzeroes:

    A = [ 4  1  3  .  ]
        [ .  .  2  3  ]
        [ 1  .  .  2  ]
        [ 7  .  1  1  ]

COO representation:

    V = [ 7  1  4  1  2  3  3  2  1  1 ]
    J = [ 0  0  0  1  2  2  3  3  3  2 ]
    I = [ 3  2  0  0  1  0  1  2  3  3 ]

Storage requirements: Θ(3·nz). This extra data movement is prohibitive.
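The Hilbert order requires a kernel that is independent of the nonzero order; a minimal COO SpMV sketch (y assumed zero-initialised):

    /* Minimal COO SpMV kernel: works for any nonzero order, including
     * the Hilbert order above; both x and y are accessed indirectly. */
    void spmv_coo(const int nz, const double *V, const int *J, const int *I,
                  const double *x, double *y) {
        for (int k = 0; k < nz; k++)
            y[I[k]] += V[k] * x[J[k]];
    }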


SLIDE 11

Sequential SpMV

The same Hilbert-ordered nonzeroes in bidirectional incremental CRS (BICRS), which stores column increments ΔJ, an overflow (ΔJ ≥ n) signalling a row jump whose size is the next entry of ΔI (with ΔI[0] the starting row):

    A = [ 4  1  3  .  ]
        [ .  .  2  3  ]
        [ 1  .  .  2  ]
        [ 7  .  1  1  ]

BICRS representation:

    V  = [ 7  1  4  1  2  3  3  2  1  1 ]
    ΔJ = [ 0  4  4  1  5  4  5  4  3  1 ]
    ΔI = [ 3 -1 -2  1 -1  1  1  1 ]

Storage requirements: Θ(2·nz + row jumps + 1).
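A sketch of the corresponding traversal under this decoding convention (y assumed zero-initialised):

    /* BICRS SpMV sketch: dJ entries >= n encode row jumps, whose sizes
     * are consumed one by one from dI; dI[0] gives the starting row. */
    void spmv_bicrs(const int nz, const int n, const double *V,
                    const int *dJ, const int *dI,
                    const double *x, double *y) {
        int i = dI[0], j = dJ[0]; /* starting position          */
        int p = 1;                /* next row jump to consume    */
        y[i] += V[0] * x[j];
        for (int k = 1; k < nz; k++) {
            j += dJ[k];
            if (j >= n) {         /* overflow: jump to a new row */
                j -= n;
                i += dI[p++];
            }
            y[i] += V[k] * x[j];
        }
    }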

Ref.: Yzelman and Bisseling, “A cache-oblivious sparse matrix-vector multiplication scheme based on the Hilbert curve”, Progress in Industrial Mathematics at ECMI 2010, pp. 627-634 (2012).

SLIDE 12

Sequential SpMV

Is cache-obliviousness at the level of individual nonzeroes required? Sparse blocking may have advantages: the vector elements corresponding to a block fit into cache, and low-level optimisations may be applied within blocks.
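A minimal sketch of the blocking idea (all type and function names here are hypothetical, not taken from the cited works): blocks are visited one after another, and each block touches only the slices of x and y it overlaps, which stay cached; inside a block any format may be used, here a block-local COO:

    /* Hypothetical blocked SpMV sketch: nonzeroes carry small
     * block-local indices; the x and y ranges a block touches are
     * bounded by the block size. */
    typedef struct {
        int row0, col0;   /* top-left coordinate of the block   */
        int nz;           /* number of nonzeroes in the block   */
        const double *V;  /* nonzero values                     */
        const int *I, *J; /* block-local row and column offsets */
    } block_t;

    void spmv_blocked(const int nblocks, const block_t *B,
                      const double *x, double *y) {
        for (int b = 0; b < nblocks; b++)
            for (int k = 0; k < B[b].nz; k++)
                y[B[b].row0 + B[b].I[k]] += B[b].V[k] * x[B[b].col0 + B[b].J[k]];
    }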



SLIDE 15

Sequential SpMV

Space-filling curves on top, full cache-obliviousness: (Using compressed BICRS, CBICRS)

Ref.: Martone, Filippone, Tucci, Paprzycki, and Ganzha, “Utilizing recursive storage in sparse matrix-vector multiplication - preliminary considerations”, Proc. ISCA 25th International Conference on Computers and Their Applications (CATA), pp. 300-305 (2010).


SLIDE 17

Sequential SpMV

Space-filling curves on top, full cache-obliviousness: (Using the Z-curve and dense BLAS)

Ref.: Lorton and Wise, “Analyzing block locality in Morton-order and Morton-hybrid matrices”, SIGARCH Computer Architecture News 35(4), pp. 6-12 (2007).

SLIDE 18

Sequential SpMV

Space-filling curves on top, full cache-obliviousness: (Using the Z-curve, a quad-tree, and CRS within blocks)
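For reference, a Z-curve (Morton) block index is obtained by interleaving the bits of a block's row and column coordinates; sorting blocks by this key yields the Z-order traversal (a generic sketch, independent of the cited implementations):

    /* Morton (Z-curve) key: interleave the bits of row and column;
     * 32-bit coordinates produce a 64-bit key. */
    #include <stdint.h>

    uint64_t morton(uint32_t row, uint32_t col) {
        uint64_t key = 0;
        for (int b = 0; b < 32; b++) {
            key |= ((uint64_t)((row >> b) & 1)) << (2 * b + 1);
            key |= ((uint64_t)((col >> b) & 1)) << (2 * b);
        }
        return key;
    }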

Ref.: Martone, Filippone, Tucci, Paprzycki, and Ganzha, “Utilizing recursive storage in sparse matrix-vector multiplication - preliminary considerations”, Proc. ISCA 25th International Conference on Computers and Their Applications (CATA), pp. 300-305 (2010).

SLIDE 19

Sequential SpMV

Space-filling curves on top, full cache-obliviousness: But how much storage does CRS within blocks require?

Ref.: Martone, Filippone, Tucci, Paprzycki, and Ganzha, “Utilizing recursive storage in sparse matrix-vector multiplication - preliminary considerations”, Proc. ISCA 25th International Conference on Computers and Their Applications (CATA), pp. 300-305 (2010).

SLIDE 20

Sequential SpMV

Space-filling curves within blocks can be stored efficiently: (stored using Compressed Sparse Blocks, CSB)
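The key to CSB's compactness is that indices are stored relative to the enclosing block: with β × β blocks and β ≤ 2^16, each offset fits in 16 bits. A sketch of a per-nonzero record under that assumption (the actual CSB layout packs the two offsets into a single word and differs in detail):

    /* Block-local nonzero record, a sketch: 16-bit offsets suffice for
     * blocks of up to 65536 x 65536, roughly halving the index storage
     * of a plain COO with full 32-bit coordinates. */
    #include <stdint.h>

    typedef struct {
        double   v;    /* nonzero value                      */
        uint16_t r, c; /* row/column offset inside the block */
    } csb_nonzero_t;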

Ref.: Buluç, Williams, Oliker, and Demmel, “Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication”, Proc. IEEE International Parallel & Distributed Processing Symposium (IPDPS), pp. 721-733 (2011).


SLIDE 23

Sequential SpMV

Efficient storage with full cache-obliviousness: (Using compressed BICRS, CBICRS)

Ref.: Yzelman and Roose, “High-Level Strategies for Parallel Shared-Memory Sparse Matrix–Vector Multiplication”, IEEE Transactions on Parallel and Distributed Systems, doi: 10.1109/TPDS.2013.31, in press (2013).


SLIDE 26

Sequential SpMV

Overview of sparse matrix data structures:

    Name          Additional storage compared to flat CRS
    blocked CRS   O(m²)
    COO           O(nz − m)
    BICRS         O(row jumps − m)
    CSB (1)       O(n + m − nz)
    CBICRS (2)    O(n + m − nz)

(1: for appropriately chosen blocking size)

Ref.: Yzelman and Roose, “High-Level Strategies for Parallel Shared-Memory Sparse Matrix–Vector Multiplication”, IEEE Trans. Parallel and Distributed Systems, doi: 10.1109/TPDS.2013.31, in press (2013).
Ref.: Buluç, Williams, Oliker, and Demmel, “Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication”, Proc. IEEE International Parallel & Distributed Processing Symposium (IPDPS), pp. 721-733 (2011).

SLIDE 27

Parallel SpMV

Non-uniform memory access becomes a problem:

[Diagram: two quad-core processors; each core has a private 32 kB L1 cache, each processor a shared 4 MB L2 cache, connected through a system interface.]

We had cache-oblivious SpMVs; do core-oblivious methods exist?


SLIDE 28

Parallel SpMV

‘Core-oblivious’, for parallelisation (Cilk CSB):

Ref.: Buluç, Williams, Oliker, and Demmel, “Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication”, Proc. IEEE International Parallel & Distributed Processing Symposium (IPDPS), pp. 721-733 (2011).

SLIDE 29

Parallel SpMV

‘Core-oblivious’, for cache-efficiency (PThreads 0D):

Ref.: Yzelman and Roose, “High-Level Strategies for Parallel Shared-Memory Sparse Matrix–Vector Multiplication”, IEEE Trans. Parallel and Distributed Systems, doi: 10.1109/TPDS.2013.31, in press (2013).

SLIDE 30

Parallel SpMV

‘Core-oblivious’, good caching and parallelisation (PThr. 1D):

Ref.: Yzelman and Roose, “High-Level Strategies for Parallel Shared-Memory Sparse Matrix–Vector Multiplication”, IEEE Trans. Parallel and Distributed Systems, doi: 10.1109/TPDS.2013.31, in press (2013).

SLIDE 31

Parallel SpMV

Using partitioning and reordering (BSP 2D): (in the sequential case, this is fully cache-oblivious)

Ref.: Yzelman and Bisseling, “Two-dimensional cache-oblivious sparse matrix-vector multiplication”, Parallel Computing 37(12), pp. 806-819 (2011).


SLIDE 35

Parallel SpMV

Using partitioning, reordering, and ‘space-avoiding’ curves: (in the sequential case, this is fully cache-oblivious)

Ref.: Yzelman and Bisseling, “Two-dimensional cache-oblivious sparse matrix-vector multiplication”, Parallel Computing 37(12), pp. 806-819 (2011).

SLIDE 36

Parallel SpMV

Leveraging the partitioning for coarse-grained parallelism: (unlike the previous methods, this one scales fully)

Ref.: Yzelman and Roose, “High-Level Strategies for Parallel Shared-Memory Sparse Matrix–Vector Multiplication”, IEEE Trans. Parallel and Distributed Systems, doi: 10.1109/TPDS.2013.31, in press (2013).

SLIDE 37

Parallel SpMV

Using bulk synchronous parallel (BSP), this 2D SpMV strategy is easily implemented. Each processor executes:

    for each local a_ij ≠ 0 with non-local x_j do
        bsp_get x_j from the remote process
    bsp_sync
    multiply y = Ax using local nonzeroes only
    for each local a_ij ≠ 0 with non-local y_i do
        bsp_send (i, y_i) to the remote process
    bsp_sync
    while there are messages in the queue (bsp_qsize > 0) do
        bsp_move (i, α) from the queue and add α to the local y_i
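A compact C sketch of these supersteps against a classic BSPlib-style interface (MulticoreBSP for C offers the same primitives with slightly different type signatures; the ghost_t/contrib_t bookkeeping arrays are hypothetical and would be derived from the 2D partitioning):

    /* Assumes x and y were registered with bsp_push_reg, the tag size
     * was set to sizeof(int) with bsp_set_tagsize during setup, and x
     * is a globally indexed buffer of which only the locally needed
     * entries are valid. */
    #include <bsp.h>

    typedef struct { int j, owner; } ghost_t;    /* non-local x_j       */
    typedef struct { int i, owner; } contrib_t;  /* non-locally owned y_i */

    void spmv_bsp2d(int nz, const double *V, const int *I, const int *J,
                    double *x, double *y,
                    int nghosts, const ghost_t *G,
                    int ncontribs, const contrib_t *C) {
        /* superstep 1: get the non-local input-vector elements        */
        for (int g = 0; g < nghosts; g++)
            bsp_get(G[g].owner, x, G[g].j * sizeof(double),
                    &x[G[g].j], sizeof(double));
        bsp_sync();

        /* multiply y = Ax using local nonzeroes only                  */
        for (int k = 0; k < nz; k++)
            y[I[k]] += V[k] * x[J[k]];

        /* superstep 2: send partial results for rows owned elsewhere  */
        for (int c = 0; c < ncontribs; c++)
            bsp_send(C[c].owner, &C[c].i, &y[C[c].i], sizeof(double));
        bsp_sync();

        /* fan-in: add received (i, alpha) contributions to local y    */
        int npackets, nbytes;
        bsp_qsize(&npackets, &nbytes);
        while (npackets-- > 0) {
            int i, status;
            double alpha;
            bsp_get_tag(&status, &i);          /* tag carries row i    */
            bsp_move(&alpha, sizeof(double));  /* payload carries alpha */
            y[i] += alpha;
        }
    }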

Ref.: Yzelman, Bisseling, Roose, and Meerbergen, “MulticoreBSP for C: a high-performance library for shared-memory parallel programming”, International Journal of Parallel Programming, doi: 10.1007/s10766-013-0262-9, in press (2013).

SLIDE 38

Experiments

Results on one matrix (ldoor) on a quad-core processor:

[Bar chart: speedups of OpenMP CRS, CSB, PThr. 0D, PThr. 1D, and BSP 2D on ldoor, on a scale from 0.5 to 2.]

(Speedups are relative to sequential CRS.)


SLIDE 39

Experiments

We use a small subset of matrices to compare methods:

    Matrix      Size         Nonzeroes    Origin
    Freescale1   3 428 755   17 052 626   Semiconductor industry
    adaptive     6 815 744   27 248 640   Numerical simulation
    ldoor          952 203   42 493 817   Structural engineering
    wiki2007     3 566 907   45 030 389   Link matrix
    road_usa    23 947 347   57 708 624   Road network

We run each method 900 times and report the average execution time, on both a 40-core and a 64-core machine.

Ref.: Yzelman and Roose, “High-Level Strategies for Parallel Shared-Memory Sparse Matrix–Vector Multiplication”, IEEE Trans. Parallel and Distributed Systems, doi: 10.1109/TPDS.2013.31, in press (2013).
Ref.: Yzelman, Bisseling, Roose, and Meerbergen, “MulticoreBSP for C: a high-performance library for shared-memory parallel programming”, International Journal of Parallel Programming, doi: 10.1007/s10766-013-0262-9, in press (2013).

SLIDE 40

Experiments

We report average speedups relative to sequential CRS:

    Method       4 × 10   8 × 8
    OpenMP CRS      8.8     7.2
    PThread 0D      3.5     3.2
    PThread 1D     13.6    20.0
    Cilk CSB       22.9    26.9
    BSP 2D         21.3    30.8

4 × 10: HP DL-580, 4 sockets, 10-core Intel Xeon E7-4870.
8 × 8: HP DL-980, 8 sockets, 8-core Intel Xeon E7-2830.

Ref.: Yzelman and Roose, “High-Level Strategies for Parallel Shared-Memory Sparse Matrix–Vector Multiplication”, IEEE Trans. Parallel and Distributed Systems, doi: 10.1109/TPDS.2013.31, in press (2013).
Ref.: Yzelman, Bisseling, Roose, and Meerbergen, “MulticoreBSP for C: a high-performance library for shared-memory parallel programming”, International Journal of Parallel Programming, doi: 10.1007/s10766-013-0262-9, in press (2013).

SLIDE 41

Conclusion

We demonstrated the use of space-filling curves within sparse matrix computations.

Thank you for your attention!

For software, see:
  • The Sparse Library (sparse matrix data structures): http://albert-jan.yzelman.net/software/#SL
  • The MulticoreBSP library: http://www.multicorebsp.com
  • The BSP 2D code: http://albert-jan.yzelman.net/software/#HPBSP


SLIDE 42

Sequential SpMV

Overview of flat data structures (all values are in bits); here, m̃ is the number of nonempty rows in A:

    Name    Extra storage requirement relative to CRS
    COO     ⌈log2 n⌉ · (nz − m)         ≫ 0, typically
    BICRS   ⌈log2 n⌉ · (row jumps − m)  ≤ COO
    BICRS   ⌈log2 n⌉ · (m̃ − m)          ≤ 0  (note 1)

(In practice, ⌈log2 n⌉ is rounded up to 8, 16, 32, or 64 bits.)

1: if the nonzeroes are in row-major order.

Ref.: Yzelman and Bisseling, “Two-dimensional cache-oblivious sparse matrix-vector multiplication”, Parallel Computing 37(12), pp. 806-819 (2011).

SLIDE 43

Sequential SpMV

Overview of blocked data structures (values are in bits):

    Name    Extra storage requirement relative to flat CRS
    CRS     ⌈log2 n⌉ · m²/√n = O(m²)     (note 1, 2, 3)
    COO     ⌈log2 n⌉ · (nz − m)          ≫ 0, typically
    BICRS   ⌈log2 n⌉ · (row jumps − m)   ≤ COO (can be < 0)
    CSB     ½ ⌈log2 n⌉ · (n + m̃ − nz)    < 0, typically (note 1, 2)
    CBICRS  ½ ⌈log2 n⌉ · (n + m̃ − nz)    < 0, typically (note 1, 3)

(In practice, ⌈log2 n⌉ is rounded up to a multiple of 8.)

1: if the block size is √n. 2: if the blocks are in row-major order. 3: if the nonzeroes within blocks are in row-major order.

Ref.: Yzelman and Roose, “High-Level Strategies for Parallel Shared-Memory Sparse Matrix–Vector Multiplication”, IEEE Trans. Parallel and Distributed Systems, doi: 10.1109/TPDS.2013.31, in press (2013).