Partitioning Spatially Located Load with Rectangles


  1. Partitioning Spatially Located Load with Rectangles
     Erik Saule (1), Erdeniz Ö. Baş (1,2), Ümit V. Çatalyürek (1,3)
     {esaule,erdeniz,umit}@bmi.osu.edu
     (1) Department of Biomedical Informatics, (2) Department of Computer Science and Engineering, (3) Department of Electrical and Computer Engineering, The Ohio State University
     IPDPS 2011
     Ohio State University, Biomedical Informatics, HPC Lab: http://bmi.osu.edu/hpc

  2. A load distribution problem
     Load matrix: In parallel computing, the load can be spatially located. The computation should be distributed accordingly.
     Applications: Particle-in-Cell (stencil), sparse matrices, direct volume rendering.
     Metrics: load balance, communication, stability.

  3. Different kinds of partition
     Figure panels: Uniform; Rectilinear; P × Q-way jagged (th); m-way jagged; hierarchical; spiral.
     Panel annotations, in slide order: (def, heur, th, opt), (heur, opt), (heur, opt).

  4. Different load balance on 2304 processors
     Particles (2050x2050): Uniform (17.5%), Rectilinear (15.1%), P × Q-way jagged (2.3%), m-way jagged (2.0%), hierarchical (2.7%).

  5. This talk is about how to generate such partitions, either optimally or heuristically, and the type of guarantee we can obtain.

  6. Outline
     1. Introduction
     2. Preliminaries: Notation; In One Dimension; Simulation Setting
     3. Rectilinear Partitioning: Nicol's Algorithm
     4. Jagged Partitioning: P × Q-way Jagged; m-way Jagged
     5. Hierarchical Bisection: Recursive Bisection; Dynamic Programming
     6. Final thoughts: Summing up; Conclusion and Perspective

  7. The Rectangular Partitioning Problem
     Definition: Let $A$ be an $n_1 \times n_2$ matrix of non-negative values. The problem is to partition the $[1,1] \times [n_1,n_2]$ rectangle into a set $S$ of $m$ rectangles. The load of rectangle $r = [x,y] \times [x',y']$ is $L(r) = \sum_{x \le i \le x',\ y \le j \le y'} A[i][j]$. The problem is to minimize $L_{max} = \max_{r \in S} L(r)$.
     Prefix sum: Algorithms are rarely interested in the value of a particular element, but rather in the load of a rectangle. The matrix is given as a 2D prefix sum array $Pr$ such that $Pr[i][j] = \sum_{i' \le i,\ j' \le j} A[i'][j']$. By convention, $Pr[0][j] = Pr[i][0] = 0$. We can now compute the load of rectangle $r = [x,y] \times [x',y']$ as $L(r) = Pr[x'][y'] - Pr[x-1][y'] - Pr[x'][y-1] + Pr[x-1][y-1]$.
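
The prefix-sum identity on this slide can be illustrated with a minimal Python sketch. The function names `prefix_sums` and `load` are hypothetical (they do not come from the slides), and the code keeps the slide's 1-based, inclusive rectangle convention by padding `Pr` with a zero row and a zero column.

```python
def prefix_sums(A):
    """Return Pr with Pr[i][j] = sum of A over rows <= i and columns <= j (1-based)."""
    n1, n2 = len(A), len(A[0])
    # Row 0 and column 0 stay at zero, matching Pr[0][j] = Pr[i][0] = 0.
    Pr = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            Pr[i][j] = (A[i - 1][j - 1] + Pr[i - 1][j]
                        + Pr[i][j - 1] - Pr[i - 1][j - 1])
    return Pr


def load(Pr, x, y, x2, y2):
    """Load of the rectangle with corners (x, y) and (x2, y2), inclusive, 1-based."""
    return Pr[x2][y2] - Pr[x - 1][y2] - Pr[x2][y - 1] + Pr[x - 1][y - 1]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
Pr = prefix_sums(A)
print(load(Pr, 2, 2, 3, 3))  # rows 2..3, columns 2..3: 5 + 6 + 8 + 9
```

After the one-time pass that builds `Pr`, each rectangle load costs four array reads, which is why the algorithms in the talk can query loads freely.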

  8. In One Dimension
     Optimal: Nicol's algorithm [Nic94] (improved by [PA04]). Based on parametric search. Complexity: $O((m \log \frac{n}{m})^2)$.
     Heuristic: Direct Cut [MP97]. Greedy algorithm. Complexity: $O(m \log \frac{n}{m})$.
     Guarantee: $L_{max}(DC) \le \frac{\sum_{i'} A[i']}{m} + \max_i A[i]$.
     (More details in Section 2.2)
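
A greedy cut of this kind can be sketched as follows. This is one plausible reading of Direct Cut, not necessarily the exact variant of [MP97]: the k-th cut is placed at the last position whose prefix sum does not exceed k/m of the total load. The name `direct_cut` is hypothetical, and the sketch uses a plain binary search per cut, so it runs in O(m log n) after the prefix sums are built rather than the O(m log(n/m)) quoted on the slide.

```python
from bisect import bisect_right
from itertools import accumulate


def direct_cut(loads, m):
    """Greedy 1D partition of non-negative loads into m consecutive intervals.

    Returns (cuts, prefix): interval k covers loads[cuts[k]:cuts[k + 1]].
    """
    prefix = [0] + list(accumulate(loads))
    total = prefix[-1]
    cuts = [0]
    for k in range(1, m):
        # Largest index j such that prefix[j] <= k * total / m.
        cuts.append(bisect_right(prefix, k * total / m) - 1)
    cuts.append(len(loads))
    return cuts, prefix


loads = [3, 1, 4, 1, 5, 9, 2, 6]
cuts, prefix = direct_cut(loads, 3)
interval_loads = [prefix[b] - prefix[a] for a, b in zip(cuts, cuts[1:])]
print(cuts, interval_loads)
```

With this placement, every interval ends at or before its ideal share and starts less than one element short of the previous ideal share, which is where the bound of average load plus largest element comes from.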

  9. Simulation Setting
     Classes: some inspired by [MS96].
     Processors: Simulations are performed with different numbers of processors: most square numbers up to 10,000.
     Metric: Load imbalance is the presented metric: $\frac{L_{max}}{\sum_{i,j} A[i][j] / m} - 1$.
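
The load imbalance metric is simply the maximum load divided by the average load per processor, minus one. A tiny sketch, with the hypothetical helper name `load_imbalance` and made-up numbers chosen only for illustration:

```python
def load_imbalance(l_max, total_load, m):
    """Maximum load relative to the average load per processor, minus 1."""
    return l_max / (total_load / m) - 1


# Illustrative values only: 10 processors, total load 100, heaviest part 12.
print(load_imbalance(12, 100, 10))  # about 0.2, i.e. 20% above the average
print(load_imbalance(10, 100, 10))  # 0.0 for a perfectly balanced partition
```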

  11. Rectilinear Partitioning

  12. Nicol's Algorithm [Nic94]: RECT-NICOL
      The algorithm RECT-NICOL is an iterative heuristic. At each iteration, the partition in one dimension is refined by using a 1D algorithm.
      Complexity: $O(n_1 n_2)$ iterations (around 10 in practice). One iteration: $O(Q (P \log \frac{n_1}{P})^2 + P (Q \log \frac{n_2}{Q})^2)$.
      Other algorithms: The problem of finding the optimal rectilinear partitioning is NP-complete. Other algorithms therefore focus mainly on theoretical properties. Their guarantees are unsuitable. They are computationally expensive ($n_1^{10}$) and difficult to implement (they rely on linear programming or exhibit numerical instability).
      (See Section 3.1 for more details)
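
The alternating refinement idea can be sketched in Python under several assumptions. This is not the authors' implementation: the 1D step here is a bisection on the bottleneck value with a greedy feasibility probe, which is exact for non-negative integer loads but does not match the complexity of Nicol's 1D algorithm. All names (`prefix_sums`, `rect_load`, `refine`, `rect_refine`) are hypothetical, and indices are 0-based and half-open.

```python
def prefix_sums(A):
    """2D prefix sums with a zero border: Pr[i][j] covers rows < i and columns < j."""
    n1, n2 = len(A), len(A[0])
    Pr = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(n1):
        for j in range(n2):
            Pr[i + 1][j + 1] = A[i][j] + Pr[i][j + 1] + Pr[i + 1][j] - Pr[i][j]
    return Pr


def rect_load(Pr, r0, r1, c0, c1):
    """Load of rows r0..r1-1 and columns c0..c1-1."""
    return Pr[r1][c1] - Pr[r0][c1] - Pr[r1][c0] + Pr[r0][c0]


def refine(cost, n, parts, upper):
    """Best cuts of range(n) into `parts` intervals for a monotone integer cost."""
    def probe(bound):
        cuts, start = [0], 0
        for _ in range(parts):
            end = start
            while end < n and cost(start, end + 1) <= bound:
                end += 1
            cuts.append(end)
            start = end
        return cuts if start == n else None

    lo, hi = 0, upper
    while lo < hi:  # smallest bound for which the greedy probe succeeds
        mid = (lo + hi) // 2
        if probe(mid) is None:
            lo = mid + 1
        else:
            hi = mid
    return probe(lo), lo


def rect_refine(A, P, Q, max_iter=20):
    """Alternate column and row refinement until the maximum load stops improving."""
    n1, n2 = len(A), len(A[0])
    Pr = prefix_sums(A)
    total = Pr[n1][n2]
    rows = [i * n1 // P for i in range(P + 1)]  # start from uniform cuts
    cols = [j * n2 // Q for j in range(Q + 1)]
    best = max(rect_load(Pr, r0, r1, c0, c1)
               for r0, r1 in zip(rows, rows[1:])
               for c0, c1 in zip(cols, cols[1:]))
    for _ in range(max_iter):
        def col_cost(a, b):  # heaviest row stripe restricted to columns a..b-1
            return max(rect_load(Pr, r0, r1, a, b) for r0, r1 in zip(rows, rows[1:]))
        new_cols, _ = refine(col_cost, n2, Q, total)

        def row_cost(a, b):  # heaviest column stripe restricted to rows a..b-1
            return max(rect_load(Pr, a, b, c0, c1)
                       for c0, c1 in zip(new_cols, new_cols[1:]))
        new_rows, l_max = refine(row_cost, n1, P, total)
        if l_max >= best:
            break
        rows, cols, best = new_rows, new_cols, l_max
    return rows, cols, best


A = [[1, 1, 1, 1],
     [1, 1, 1, 1],
     [1, 1, 10, 10],
     [1, 1, 10, 10]]
print(rect_refine(A, 2, 2))
```

On this small matrix the uniform 2 × 2 grid has a maximum load of 40, and the refinement settles on cuts that bring it down to 18.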

  14. P × Q-way Jagged Partitioning

  17. A P × Q-way Jagged Heuristic: JAG-PQ-HEUR
      P × Q jagged partitioning: Sum on columns to generate a 1D problem. Partition it in P parts. For the first stripe, sum on rows. Partition it in Q parts. Treat all stripes.
      Complexity: $O((P \log \frac{n_1}{P})^2 + P \times (Q \log \frac{n_2}{Q})^2)$.
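
The steps above can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: the matrix is projected onto its first dimension, cut into P stripes, and each stripe is then cut independently into Q parts. The helper `partition_1d` is a hypothetical stand-in for the 1D algorithm (bisection on the bottleneck plus a greedy probe, exact for non-negative integer loads).

```python
def partition_1d(loads, parts):
    """Cuts of `loads` into `parts` consecutive intervals minimizing the largest sum."""
    n = len(loads)

    def probe(bound):
        cuts, i = [0], 0
        for _ in range(parts):
            acc = 0
            while i < n and acc + loads[i] <= bound:
                acc += loads[i]
                i += 1
            cuts.append(i)
        return cuts if i == n else None

    lo, hi = 0, sum(loads)
    while lo < hi:  # smallest bottleneck for which the greedy probe succeeds
        mid = (lo + hi) // 2
        if probe(mid) is None:
            lo = mid + 1
        else:
            hi = mid
    return probe(lo)


def jag_pq_heur(A, P, Q):
    """Return a list of (row_start, row_end, column_cuts), one entry per stripe."""
    n2 = len(A[0])
    row_loads = [sum(row) for row in A]          # collapse to a 1D problem
    row_cuts = partition_1d(row_loads, P)        # P stripes
    stripes = []
    for r0, r1 in zip(row_cuts, row_cuts[1:]):
        col_loads = [sum(A[i][j] for i in range(r0, r1)) for j in range(n2)]
        stripes.append((r0, r1, partition_1d(col_loads, Q)))  # Q parts per stripe
    return stripes


A = [[4, 1, 1, 1, 1],
     [1, 1, 1, 1, 4]]
print(jag_pq_heur(A, 2, 2))
```

In this example each stripe gets its own column cuts (after column 1 in the first stripe, after column 4 in the second), which is exactly what distinguishes a jagged partition from a rectilinear one.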

  18. How good is that?
      Theorem (Theorem 1 in Section 3.2.1): If there are no zeros in the array, JAG-PQ-HEUR is a $(1 + \Delta \frac{P}{n_1})(1 + \Delta \frac{Q}{n_2})$-approximation algorithm, where $\Delta = \frac{\max A}{\min A}$, $P < n_1$, $Q < n_2$.
      Proof: Based on the guarantee of 1D heuristics.
      Theorem (Theorem 2 in Section 3.2.1): The approximation ratio is minimized by $P = \sqrt{m \frac{n_1}{n_2}}$.
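
The value of P in Theorem 2 can be recovered with a short calculation. The sketch below assumes that all processors are used, so that $PQ = m$, and treats $P$ as a continuous variable; the function $f$ is introduced here only for illustration.

```latex
\begin{align*}
f(P) &= \left(1 + \Delta \frac{P}{n_1}\right)\left(1 + \Delta \frac{m}{P\, n_2}\right)
      = 1 + \Delta \frac{P}{n_1} + \Delta \frac{m}{P\, n_2} + \Delta^2 \frac{m}{n_1 n_2}, \\
f'(P) &= \frac{\Delta}{n_1} - \frac{\Delta\, m}{P^2 n_2} = 0
      \quad\Longrightarrow\quad P^2 = m \frac{n_1}{n_2}
      \quad\Longrightarrow\quad P = \sqrt{m \frac{n_1}{n_2}}, \\
f''(P) &= \frac{2 \Delta\, m}{P^3 n_2} > 0 .
\end{align*}
```

Since the second derivative is positive for $P > 0$, the stationary point is a minimum. For a square matrix ($n_1 = n_2$) this gives $P = Q = \sqrt{m}$.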

  19. An optimal P × Q-way jagged partitioning: JAG-PQ-OPT
      A dynamic programming formulation:
      $L_{max}(n_1, P) = \min_{1 \le k < n_1} \max(L_{max}(k-1, P-1),\ 1D(k, n_1, Q))$
      $L_{max}(0, P) = 0$
      $L_{max}(n_1, 0) = +\infty,\ \forall n_1 \ge 1$
      $O(n_1 P)$ $L_{max}$ functions to evaluate (each is $O(k)$). $O(n_1^2)$ 1D functions to evaluate (each is $O((Q \log \frac{n_2}{Q})^2)$). Some significant implementation optimizations apply.
      For a 512x512 matrix and 1000 processors, that is 512,000 + 262,144 values. With 64-bit values, that is 6 MB.
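
The recurrence can be sketched as a memoized recursion. This is a minimal illustration, not the optimized implementation the slide alludes to: it uses 0-based, half-open stripe indices (the last stripe is rows k..i-1, so the index range differs slightly from the slide), and `one_d` is a hypothetical exact 1D solver for non-negative integer loads based on bisection and a greedy probe.

```python
from functools import lru_cache


def feasible(loads, parts, bound):
    """Can `loads` be split into `parts` consecutive intervals of sum <= bound?"""
    i, n = 0, len(loads)
    for _ in range(parts):
        acc = 0
        while i < n and acc + loads[i] <= bound:
            acc += loads[i]
            i += 1
    return i == n


def jag_pq_opt(A, P, Q):
    """Optimal maximum load of a P x Q-way jagged partition (integer loads)."""
    n1, n2 = len(A), len(A[0])
    # col_prefix[i][j] = sum of column j over rows 0..i-1.
    col_prefix = [[0] * n2]
    for row in A:
        col_prefix.append([c + v for c, v in zip(col_prefix[-1], row)])

    @lru_cache(maxsize=None)
    def one_d(k, i):
        """Best bottleneck when the stripe of rows k..i-1 is cut into Q parts."""
        loads = [col_prefix[i][j] - col_prefix[k][j] for j in range(n2)]
        lo, hi = 0, sum(loads)
        while lo < hi:
            mid = (lo + hi) // 2
            if feasible(loads, Q, mid):
                hi = mid
            else:
                lo = mid + 1
        return lo

    @lru_cache(maxsize=None)
    def l_max(i, p):
        """Best bottleneck for the first i rows using p stripes."""
        if i == 0:
            return 0                 # L_max(0, P) = 0
        if p == 0:
            return float("inf")      # L_max(n1, 0) = +infinity for n1 >= 1
        return min(max(l_max(k, p - 1), one_d(k, i)) for k in range(i))

    return l_max(n1, P)


A = [[4, 1, 1, 1, 1],
     [1, 1, 1, 1, 4]]
print(jag_pq_opt(A, 2, 2), jag_pq_opt(A, 1, 2))
```

The two memoization tables correspond to the two counts on the slide: one entry per (rows, stripes) pair and one entry per candidate stripe.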

  20. Performance of P × Q-way jagged (PIC-MAG it=30000)
      [Plot: load imbalance (log scale, 0.001 to 1) versus number of processors (log scale, 10 to 10000) for RECT-NICOL, JAG-PQ-HEUR, and JAG-PQ-OPT.]

  21. m-way Jagged Partitioning
