
Investigating hypergraph-partitioning-based sparse matrix partitioning methods - PowerPoint PPT Presentation



  1. Investigating hypergraph-partitioning-based sparse matrix partitioning methods
     Bora Uçar, ro:ma, Lyon, France, 22 October 2009.
     Jointly with Ümit V. Çatalyürek (VMWIP).

  2. Outline
     1. Hypergraphs
     2. Parallel SpMxV
     3. Scalability analysis of partitioning methods
     4. Concluding remarks

  3. Hypergraphs: Definitions
     A hypergraph H = (V, N) is a set of vertices V and a set of hyperedges (nets) N.
     A hyperedge h ∈ N is a subset of vertices.
     A cost c(h) is associated with each hyperedge h.
     A weight w(v) is associated with each vertex v.
     An undirected graph can be seen as a hypergraph where each hyperedge contains exactly two vertices.
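
These definitions map directly onto a small data structure. A minimal Python sketch (the class and field names are illustrative, not from the talk):

```python
# A minimal hypergraph representation matching the definitions above.
# Names (Hypergraph, nets, ...) are illustrative, not from the talk.

class Hypergraph:
    def __init__(self, num_vertices, nets, net_costs=None, vertex_weights=None):
        self.num_vertices = num_vertices
        # Each net is a set of vertex ids (a hyperedge is a subset of V).
        self.nets = [set(h) for h in nets]
        # Unit costs and weights by default.
        self.net_costs = net_costs or [1] * len(nets)
        self.vertex_weights = vertex_weights or [1] * num_vertices

# An undirected graph is the special case where every net has two vertices:
graph_as_hypergraph = Hypergraph(4, nets=[{0, 1}, {1, 2}, {2, 3}])
```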

  4. Hypergraphs: Partitioning
     Π = {V_1, V_2, ..., V_K} is a K-way vertex partition if the parts are nonempty (V_k ≠ ∅), mutually exclusive (V_k ∩ V_ℓ = ∅), and collectively exhaustive (V = ∪ V_k).
     The connectivity λ(h) of a hyperedge h is equal to the number of parts in which h has vertices.
     Objective: minimize cutsize(Π) = Σ_h c(h) (λ(h) − 1).
     Constraint: balanced part weights, Σ_{v ∈ V_k} w(v) ≤ (1 + ε) Σ_{v ∈ V} w(v) / K for each part V_k.
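
Both the objective and the constraint are straightforward to evaluate for a candidate partition. A sketch, reusing the Hypergraph class from the previous snippet (part_of, a map from vertex to part id, is an assumed representation):

```python
def connectivity(net, part_of):
    """lambda(h): the number of parts in which net h has vertices."""
    return len({part_of[v] for v in net})

def cutsize(hg, part_of):
    """cutsize(Pi) = sum over nets h of c(h) * (lambda(h) - 1)."""
    return sum(c * (connectivity(h, part_of) - 1)
               for h, c in zip(hg.nets, hg.net_costs))

def is_balanced(hg, part_of, K, eps):
    """Check sum_{v in V_k} w(v) <= (1 + eps) * W / K for every part,
    where W is the total vertex weight."""
    part_weight = [0] * K
    for v in range(hg.num_vertices):
        part_weight[part_of[v]] += hg.vertex_weights[v]
    bound = (1 + eps) * sum(hg.vertex_weights) / K
    return all(w <= bound for w in part_weight)
```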

  5. Hypergraph partitioning: Example
     [Figure: a hypergraph with 10 vertices and 4 nets, partitioned into 4 parts: {4, 5}, {7, 10}, {3, 8, 9}, and {1, 2, 6}.]
     λ(n_1) = 2, λ(n_2) = 3, λ(n_3) = 3, λ(n_4) = 2.
     cutsize(Π) = c(n_1) + 2 c(n_2) + 2 c(n_3) + c(n_4) = 6 with unit costs.

  6. Hypergraphs: Partitioning tools and applications
     Tools:
       hMETIS (Karypis and Kumar, Univ. Minnesota),
       MLPart (Caldwell, Kahng, and Markov, UCLA/UMich),
       Mondriaan (Bisseling and Meesen, Utrecht Univ.),
       Parkway (Trifunovic and Knottenbelt, Imperial Coll. London),
       PaToH (Çatalyürek and Aykanat, Bilkent Univ.),
       Zoltan-PHG (Devine, Boman, Heaphy, Bisseling, and Çatalyürek, Sandia National Labs.).
     Applications:
       VLSI: circuit partitioning;
       scientific computing: matrix partitioning, ordering;
       cryptology;
       parallel/distributed computing: volume rendering, data aggregation, declustering/clustering, scheduling;
       software engineering;
       information retrieval;
       processing spatial join queries; etc.

  7. Parallel sparse matrix-vector multiplies
     Row-column-parallel multiplies. [Figure: an 8 × 8 matrix A and the vector x = (x_1, ..., x_8), with rows, columns, and nonzeros distributed among processors P_1 to P_4.]
     To compute y ← Ax:
     Expand: P_1 sends x_5 to P_2 and P_3.
     Scalar multiply and add: P_2 computes a partial result y′_6 = a_65 x_5 + a_66 x_6 + a_68 x_8.
     Fold: P_2 sends its partial result y′_6 to P_4.
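
The three phases are easy to simulate in plain Python, with dictionaries standing in for processors and messages; the sketch below also counts the units of communication discussed on the next slides. The function name, the nonzero map, and the owner maps are assumptions, not code from the talk; a real implementation would exchange MPI messages instead.

```python
def row_column_parallel_spmv(nonzeros, x, x_owner, y_owner, nz_owner):
    """Simulate y <- A x; nonzeros: dict (i, j) -> a_ij;
    x_owner[j], y_owner[i], nz_owner[(i, j)]: processor ids."""
    volume = 0

    # Expand: the owner of x_j sends x_j to every other processor
    # that holds a nonzero in column j.
    local_x = {}  # (processor, j) -> x_j
    for (i, j) in nonzeros:
        p = nz_owner[(i, j)]
        if (p, j) not in local_x:
            local_x[(p, j)] = x[j]
            if p != x_owner[j]:
                volume += 1  # one unit: x_owner[j] -> p

    # Scalar multiply and add: each processor accumulates a partial
    # result y'_i over its own nonzeros.
    partial = {}  # (processor, i) -> partial sum
    for (i, j), a in nonzeros.items():
        p = nz_owner[(i, j)]
        partial[(p, i)] = partial.get((p, i), 0.0) + a * local_x[(p, j)]

    # Fold: partial results are sent to the owner of y_i and summed.
    y = {}
    for (p, i), val in partial.items():
        if p != y_owner[i]:
            volume += 1  # one unit: p -> y_owner[i]
        y[i] = y.get(i, 0.0) + val
    return y, volume
```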

  8. Parallelization objectives
     Achieve load balance. Load of a processor: number of nonzeros ⇒ assign an almost equal number of nonzeros to each processor.
     Minimize communication cost. Communication cost is a complex function (it depends on the machine architecture and the problem size):
       total volume of messages,
       total number of messages,
       max. volume of messages per processor (sends or receives, or both?),
       max. number of messages per processor (sends or receives, or both?).
     The common metric in different works: total volume of communication. All of these metrics can be computed from the messages a partition induces, as in the sketch below.
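
A small sketch of that computation, assuming messages are given as (sender, receiver, size) triples (the format and function name are assumptions; the maxima below are taken over the send side):

```python
from collections import defaultdict

def communication_metrics(messages):
    """messages: list of (sender, receiver, size) triples."""
    send_volume = defaultdict(int)
    send_count = defaultdict(int)
    for sender, _, size in messages:
        send_volume[sender] += size
        send_count[sender] += 1
    return {
        "total volume": sum(size for _, _, size in messages),
        "total number of messages": len(messages),
        "max send volume per processor": max(send_volume.values(), default=0),
        "max send count per processor": max(send_count.values(), default=0),
    }
```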

  9. Parallelization problem
     Problem definition: partition the matrix so that the processors have an equal number of nonzeros, and minimize the total volume of messages.
     Volume of messages. [Figure: the 8 × 8 example matrix.] Consider x_5. If assigned to processor P_1: 2 units of communication; P_2: 2 units; P_3: 2 units; P_4: 3 units (and does not make sense). Consider y_6: similar.

  10. Parallelization problem (cont'd)
     Volume of messages: if the nonzeros in column c_j are in s_c(j) processors, the volume of communication is s_c(j) − 1; if the nonzeros in row r_i are in s_r(i) processors, the volume of communication is s_r(i) − 1.
     Total volume of communication: Σ_j (s_c(j) − 1) + Σ_i (s_r(i) − 1).
     Balance the number of nonzeros per processor and minimize the number of processors sharing a column/row. Equivalent to the hypergraph partitioning problem; it is NP-complete.
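
The formula translates into a few lines of code: count the distinct processors touching each row and each column. A sketch, assuming the nonzero-to-processor assignment is given as a dict (an assumed format):

```python
from collections import defaultdict

def total_volume(nz_owner):
    """Total communication volume, sum_j (s_c(j) - 1) + sum_i (s_r(i) - 1),
    where s_c(j) / s_r(i) count the processors holding a nonzero in
    column j / row i. nz_owner: dict (i, j) -> processor id."""
    row_procs = defaultdict(set)
    col_procs = defaultdict(set)
    for (i, j), p in nz_owner.items():
        row_procs[i].add(p)
        col_procs[j].add(p)
    return (sum(len(s) - 1 for s in col_procs.values())
            + sum(len(s) - 1 for s in row_procs.values()))
```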

  11. Three main models for matrix partitioning
     Column-net model: used for rowwise partitioning. Each column j is a net n_j^(c) and each row i is a vertex v_i^(r); net n_j^(c) contains the vertices of the rows with a nonzero in column j.
     Row-net model: used for columnwise partitioning. Each row i is a net n_i^(r) and each column is a vertex v_j^(c).
     Fine-grain model: used for nonzero-based partitioning. Each row is a net, each column is a net, and each nonzero is a vertex.
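
Building these hypergraphs from the sparsity pattern is mechanical. A sketch of the column-net model, reusing the Hypergraph class from the earlier sketch (the row-net model is the same construction on the transpose, and the fine-grain model creates one vertex per nonzero):

```python
def column_net_hypergraph(nonzeros, num_rows, num_cols):
    """Column-net model for rowwise partitioning: one vertex per row,
    one net per column, containing the rows with a nonzero in it.
    nonzeros: iterable of (i, j) index pairs (an assumed format)."""
    nets = [set() for _ in range(num_cols)]
    for i, j in nonzeros:
        nets[j].add(i)  # row-vertex i belongs to column-net j
    return Hypergraph(num_rows, nets)  # class from the earlier sketch
```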

  12. Parallel sparse matrix-vector multiplies
     Row-column-parallel multiplies. [Figure: the 8 × 8 matrix A and vector x distributed among P_1 to P_4, as on slide 7.]
     To compute y ← Ax: expand (P_1 sends x_5 to P_2 and P_3), scalar multiply and add (P_2 computes a partial result y′_6 = a_65 x_5 + a_66 x_6 + a_68 x_8), and fold (P_2 sends its partial result y′_6 to P_4).

  13. Taxonomy of sparse matrix partitioning methods and models
     [Figure: a taxonomy tree for parallel y ← Ax computation.]
     Parallel algorithm: row-parallel, column-parallel, row-column-parallel.
     Partitioning scheme: 1D RW, 1D CW, 2D JL, 2D ORB, 2D CH, 2D ML2D, 2D FG (the 2D schemes split into those obtained via orthogonal partitioning and those that are nonzero based).
     Hypergraph model: column-net, row-net, column-row-net, multi-constraint.

  14. Example: Jagged-like partitioning
     [Figure: a 16 × 16 matrix with nnz = 47, split rowwise into two parts R1 and R2.]
     nnz = 47, vol = 3, imbal = [−2.1%, 2.1%].

  15. Example: Jagged-like partitioning
     [Figure: the two rowwise parts are then split columnwise, yielding the four parts P1 to P4.]
     nnz = 47, vol = 8, imbal = [−6.4%, 2.1%].

  16. Too many alternatives?
     Ideally, for partitioning a given matrix, we would like to choose the best method. The best is not well-defined. Given that the main objective of the hypergraph-partitioning-based methods is the minimization of the total communication volume, we may content ourselves with the “least total communication volume”.
     Can we know which method will give the best result, without applying them all? The landscape is not too complicated; the fine-grain method usually obtains the least total communication volume. But it also has the highest run time and, even worse, the highest total number of messages.

  17. A recipe for matrix partitioning
     [Figure: a decision tree that selects a partitioning method. It first tests whether the matrix is square (M = N?) and whether its shape is pathological (e.g., M < 0.35 N or N < 0.35 M); for square matrices it tests the symmetry score (sym(A) > 0.95) and statistics of the row and column degrees d_r and d_c (medians, third quartiles Q_3, maxima). The leaves choose among FGS/FGU, FGU, CWU, RWU, JLS/JLU, JLU, and JLU^T.]
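
The exact branches and thresholds of the tree are only partially readable from the slide, so the following Python sketch illustrates the shape of such a recipe rather than reproducing it; every branch beyond the square, pathological-shape, symmetry, and median tests is an assumption (the slide's tree also involves the nonzero count Z and the number of parts K, omitted here):

```python
# Shape of a partitioning recipe; NOT the talk's exact decision tree.
import statistics

def partitioning_recipe(M, N, sym, d_r, d_c):
    """M, N: matrix dimensions; sym: symmetry score in [0, 1];
    d_r, d_c: sequences of row and column degrees (nonzero counts)."""
    if M != N:
        # Rectangular: very skewed shapes are treated as pathological.
        if M < 0.35 * N or N < 0.35 * M:
            return "FGS/FGU"
        return "CWU"  # assumed branch; the slide has further tests here
    if sym > 0.95:
        # Nearly symmetric square matrix: degree statistics decide
        # between rowwise and fine-grain methods (assumed wiring).
        q3, med = statistics.quantiles(d_r, n=4)[2], statistics.median(d_r)
        if q3 - med < (max(d_r) - q3) / 2:
            return "RWU"
        return "FGU"
    # Unsymmetric square matrix: jagged-like methods; the slide compares
    # med(d_r) and med(d_c) to pick the orientation.
    if statistics.median(d_r) <= statistics.median(d_c):
        return "JLU"
    return "JLU^T"
```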
