slide-1
SLIDE 1

Multi-Criteria Partitioning of Multi-Block Structured Grids

Hengjie Wang Aparna Chandramowlishwaran

HPC Forge University of California, Irvine

Jun. 27, 2019

H. Wang, A. Chandramowlishwaran (UCI), Partitioner, ICS '19, 06/28/2019

slide-2
SLIDE 2

Outline

◮ Background
◮ Algorithms
◮ Tests and Results
◮ Conclusion

slide-3
SLIDE 3

Background

Outline

◮ Background
◮ Algorithms
◮ Tests and Results
◮ Conclusion

slide-4
SLIDE 4

Background

Structured Grid

◮ Structured grid: regular connectivity between grid cells, e.g. cell (i, j) neighbors (i-1, j), (i+1, j), (i, j-1), and (i, j+1).
◮ Block: a grid unit equivalent to a single rectangle.

[Figure: airfoil grid; blocks connected through Block2Block faces]

slide-5
SLIDE 5

Background

Structured Grid

◮ Multi-Block Structured Grids

[Figure: Bump3D grid, 5 blocks]

slide-6
SLIDE 6

Background

Halo Exchange

Split a block into 2 partitions and assign each partition to a node:

[Figure: the two partitions exchange halo data via Block2Block communication between nodes]

slide-7
SLIDE 7

Background

Hybrid Programming Model

Hybrid programming model:

◮ 1 MPI process per node; spawn threads within the node.
◮ Assume shared memory copies take no time.

Partition 4 blocks onto 2 nodes (block workloads 50, 50, 60, 50; the two 40-byte messages cross nodes, the two 50-byte messages stay within a node):

Average Workload W: 105
Imbalance: 5/105
Edge Cuts: 2
Communication Volume: 80 bytes
Shared Memory Copy: 100 bytes
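The metrics above follow from simple arithmetic; a minimal sketch (the block workloads and the two-node assignment are reconstructed from the figure labels, so treat them as assumptions):

```python
# Four blocks partitioned onto two nodes: node 0 gets blocks of 50 and 60
# cells, node 1 gets two blocks of 50 cells.
partitions = [[50, 60], [50, 50]]

total = sum(sum(p) for p in partitions)
avg = total / len(partitions)                       # average workload W = 105
imbalance = max(sum(p) for p in partitions) - avg   # 110 - 105 = 5

print(avg)               # 105.0
print(imbalance / avg)   # ~0.0476, i.e. the slide's 5/105
```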

slide-11
SLIDE 11

Background

Objectives

Given the number of partitions np and the workload per partition W, the partitioner should:

◮ Achieve load balance
  • Trade off load balance for communication cost
◮ Minimize communication cost
  • Reduce inter-node communication
  • Convert Block2Block communication to shared memory copy

slide-14
SLIDE 14

Algorithms

Outline

◮ Background
◮ Algorithms
◮ Tests and Results
◮ Conclusion

slide-15
SLIDE 15

Algorithms

State-of-the-art Methods

The state-of-the-art methods can be divided into two strategies:

◮ Top-down strategy:
  • Cut large blocks and assign sub-blocks to partitions.
  • Group small blocks to fill partitions.
  • Examples: Greedy [Ytterström 97], Recursive Edge Bisection (REB) [Berger 87], Integer Factorization (IF)
◮ Bottom-up strategy:
  • Transform the problem to graph partitioning and use a graph partitioner.
  • Examples: Metis [Karypis 94], Scotch [Roman 96], Chaco [Leland 95]

slide-16
SLIDE 16

Algorithms

Greedy Algorithm

Greedy Algorithm:

◮ Assign (part of) the largest block to the most underloaded partition.
◮ Cut at the longest edge of a block.

[Worked example: with target W = 300, the partition's workload Wp grows 0 → 200 → 300, cutting a block of 15 into 5 + 10 to reach exactly W.]

Drawbacks:

  • Ignores the connectivity between blocks.
  • Creates excessively small blocks when cutting a large block.
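For concreteness, a minimal 1-D sketch of the greedy loop (the workloads and partition count are illustrative; the real partitioner cuts at the longest edge of a 3-D block, which this sketch reduces to splitting a scalar workload):

```python
import heapq

def greedy_partition(block_sizes, n_parts):
    # Greedy: give (part of) the largest remaining block to the most
    # underloaded partition; cut the block when it would overshoot the
    # average workload W.
    target = sum(block_sizes) / n_parts
    loads = [(0.0, p) for p in range(n_parts)]   # min-heap of (load, partition)
    heapq.heapify(loads)
    parts = [[] for _ in range(n_parts)]
    remaining = sorted(block_sizes, reverse=True)
    while remaining:
        blk = remaining.pop(0)                   # largest remaining block
        load, p = heapq.heappop(loads)           # most underloaded partition
        room = target - load
        if 0 < room < blk:
            parts[p].append(room)                # cut: keep only what fits
            heapq.heappush(loads, (load + room, p))
            remaining.append(blk - room)         # leftover re-enters the pool
            remaining.sort(reverse=True)
        else:
            parts[p].append(blk)
            heapq.heappush(loads, (load + blk, p))
    return parts
```

On workloads 60, 50, 50, 50 and two partitions this reaches a perfectly balanced W = 105 per partition, at the price of an extra cut, and it never looks at block connectivity, which is exactly the drawback the slide points out.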

slide-20
SLIDE 20

Algorithms

Greedy Algorithm

Bump3D grid: 5 blocks; the largest block is 27 times larger than the rest.

[Figures: Bump3D blocks; the Greedy result with 16 partitions]

slide-22
SLIDE 22

Algorithms

Bottom-up Strategy

Bottom-up: convert structured grid partitioning to general graph partitioning. For a graph partitioner to work well, it needs a large number of vertices per partition.

1. Over-decompose the blocks and construct a graph with blocks as vertices.
2. Apply a graph partitioner: Metis, Scotch, Chaco, etc.
3. Merge the blocks within each partition.

slide-23
SLIDE 23

Algorithms

Bottom-up Strategy

Use Metis as the graph partitioner to generate 16 partitions with different over-decomposition methods:

[Figures: over-decompose to elementary blocks vs. over-decompose with IF]

slide-24
SLIDE 24

Algorithms

Limitations of State-of-the-art Methods

The above methods share these limitations:

◮ Flat MPI: shared memory is ignored at the algorithm level.
◮ The communication model does not distinguish shared memory copies from inter-node data transfer.
◮ They primarily focus on reducing communication volume and ignore the effect of the network's latency.

slide-27
SLIDE 27

Algorithms

Our Partition Algorithms

Our contributions:

◮ Use the α-β model to measure communication cost, incorporating communication volume, edge cuts, and network properties.
◮ Propose new partition algorithms following the top-down strategy:
  • Modify Recursive Edge Bisection (REB) and Integer Factorization (IF) for cutting large blocks (workload > W).
  • Propose Cut-Combine-Greedy (CCG) and Graph-Grow-Sweep (GGS) for grouping small blocks.

slide-31
SLIDE 31

Algorithms

Our Partition Algorithms

Cut large blocks, then group small blocks:

◮ Divide the blocks into large blocks (workload > W) and small blocks (workload < W).
◮ Cut each large block B of workload WB into Bl with workload W · ⌊WB/W⌋ and Bs with the remainder WB − W · ⌊WB/W⌋.
◮ Partition Bl with REB or IF.
◮ Group the small blocks (including each Bs) with CCG or GGS to fill partitions.

slide-32
SLIDE 32

Algorithms

Measure of Communication Cost

α-β model: α is the latency (s) and β the bandwidth (bytes/s).

Cost(s) = α + sizeof(message)/β

For a Block2Block message:

tb2b = α + #Halo · FaceArea · sizeof(cell)/β

Summing over all Block2Block messages:

Σ tb2b = α · (Edge Cuts) + (Communication Volume)/β
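The cost function can be written down directly; a small sketch (the message sizes below are made-up numbers, while the α, β values are the Mira measurements quoted later in the talk):

```python
def message_cost(alpha, beta, n_bytes):
    # One alpha-beta message: fixed latency plus bytes over bandwidth.
    return alpha + n_bytes / beta

def total_b2b_cost(alpha, beta, message_sizes):
    # Summed over all Block2Block messages this collapses to
    #   alpha * (edge cuts) + (communication volume) / beta.
    return alpha * len(message_sizes) + sum(message_sizes) / beta

alpha, beta = 1.73e-5, 1.77e9   # latency (s), bandwidth (bytes/s)
sizes = [32768, 65536]          # two halo messages (illustrative sizes)
print(total_b2b_cost(alpha, beta, sizes))
```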

slide-33
SLIDE 33

Algorithms

Cut Large Block: find a cut

Input: block B, workload Wcut, tolerance ε, partition P (optional)
Output: a cut such that (1) the cut-off sub-block's workload fits in [Wcut(1 − ε), Wcut(1 + ε)] and (2) the cut introduces the minimum communication cost δt.

Find a cut:
1: for i = x, y, z do
2:   get the area Ai of the face normal to i
3:   posFloor = ⌊Wcut(1 − ε)/Ai⌋
4:   posCeil = ⌈Wcut(1 + ε)/Ai⌉
5:   for pos ∈ [posFloor, posCeil] do
6:     δtcut = tb2b(Ai) + Σ over Block2Block faces split by the cut of α − Σ over Bj ∈ P of tb2b(cut, Bj)
7:     if δtcut < δtmin then
8:       δtmin = δtcut
9:       cut.pos = pos

[Figure: candidate cut positions between the floor f and ceiling c; faces labeled Wall and B2B]
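A sketch of the position search, assuming a box of dimensions (nx, ny, nz) with unit-workload cells; the credit term for messages converted by placing the sub-block next to partition P is omitted to keep the sketch short:

```python
def find_min_cut(dims, w_cut, eps, alpha=1.73e-5, beta=1.77e9, cell_bytes=8):
    # For each axis, the cut-off slab's workload is pos * (face area);
    # scan positions whose workload lands in [w_cut*(1-eps), w_cut*(1+eps)]
    # and keep the cut whose new Block2Block face costs least under the
    # alpha-beta model.
    best = None
    for axis in range(3):
        face = 1
        for a in range(3):
            if a != axis:
                face *= dims[a]                  # area of the face normal to axis
        lo = max(1, int(w_cut * (1 - eps)) // face)
        hi = min(dims[axis] - 1, -(-int(w_cut * (1 + eps)) // face))
        for pos in range(lo, hi + 1):
            if not (w_cut * (1 - eps) <= pos * face <= w_cut * (1 + eps)):
                continue
            dt = alpha + face * cell_bytes / beta   # cost of the new face
            if best is None or dt < best[0]:
                best = (dt, axis, pos)
    return best   # (cost, axis, cells along axis in the cut-off sub-block)
```

Because the new face's cost grows with its area, the search naturally prefers cutting perpendicular to the longest edge, as the slides illustrate.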

slide-38
SLIDE 38

Algorithms

Cut Large Block: REB

Recursive Edge Bisection (REB):

Algorithm: REB
1: function reb_block(B, np)   ⊲ block B fits in np partitions
2:   if np == 1 then
3:     return
4:   W = B's workload
5:   Wl = W · ⌊np/2⌋/np, Wr = W − Wl
6:   find_min_cut(B, Wl, ε, cut)
7:   cut B into Bl of workload Wl and Br of workload Wr
8:   reb_block(Bl, ⌊np/2⌋)
9:   reb_block(Br, ⌈np/2⌉)

[Figure: recursion for np = 7, splitting 7 → 3 + 4 and so on down to single partitions]
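A compact sketch of the recursion, with the min-communication cut search replaced by a simple longest-edge cut (an assumption to keep the sketch self-contained):

```python
def reb_block(dims, n_parts, cuts=None):
    # Recursive Edge Bisection: split the workload in the ratio
    # floor(np/2) : ceil(np/2) and recurse on both halves. The talk picks
    # the cut via find_min_cut; here we simply cut the longest edge.
    if cuts is None:
        cuts = []
    if n_parts == 1:
        return cuts
    axis = max(range(len(dims)), key=lambda a: dims[a])   # longest edge
    left = dims[axis] * (n_parts // 2) // n_parts         # cells on the floor side
    d_l, d_r = list(dims), list(dims)
    d_l[axis], d_r[axis] = left, dims[axis] - left
    cuts.append((axis, left))
    reb_block(d_l, n_parts // 2, cuts)
    reb_block(d_r, n_parts - n_parts // 2, cuts)
    return cuts
```

Partitioning a block for np = 7 records the six cuts of the slide's recursion tree (np partitions always need np − 1 cuts).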

slide-42
SLIDE 42

Algorithms

Cut Large Block: REB

Using REB to split Bump3D into 16 partitions with different α, β values:

[Figures: partitions for α = 10^-5, β = 10^9 and for α = 10^-4, β = 10^9]

slide-43
SLIDE 43

Algorithms

Cut Large Block: IF

Integer Factorization (IF): choose np = nx · ny · nz with nx/lx ≈ ny/ly ≈ nz/lz, where lx, ly, lz are the block's edge lengths. If np is prime, cut off one partition and factorize the rest.

[Figures: lx = 7, ly = 4 with np = 6 and with np = 7]

slide-44
SLIDE 44

Algorithms

Cut Large Block: IF

Generalize IF using the α-β cost function:

◮ Compare the factorizations np = nx·ny·nz and np = 1 + nx·ny·nz for every case.
◮ Choose the factorization whose max or sum α-β cost is minimum.

[Diagram: factorization → partitions → max or sum cost → min]
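A sketch of the factorization search; scoring by the spread of the cuts-per-length ratios stands in for the talk's max/sum α-β cost comparison (an assumption), and the prime-np branch that factorizes np − 1 is omitted:

```python
def factorizations(n):
    # All ordered triples (nx, ny, nz) with nx * ny * nz == n.
    out = []
    for nx in range(1, n + 1):
        if n % nx:
            continue
        m = n // nx
        for ny in range(1, m + 1):
            if m % ny == 0:
                out.append((nx, ny, m // ny))
    return out

def if_partition(dims, n_parts):
    # Pick (nx, ny, nz) so cuts-per-edge track edge lengths,
    # i.e. nx/lx ≈ ny/ly ≈ nz/lz.
    lx, ly, lz = dims

    def score(f):
        r = (f[0] / lx, f[1] / ly, f[2] / lz)
        return max(r) - min(r)    # spread of the ratios; 0 is perfect

    return min(factorizations(n_parts), key=score)
```

On the slide's 2-D example (lx = 7, ly = 4, np = 6, modeled with lz = 1) this picks a 3 × 2 × 1 grid of partitions.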

slide-45
SLIDE 45

Algorithms

Cut Large Block: Compare REB and IF

Compare REB and IF on Bump3D with α = 10^-5, β = 10^9, np = 16. In general, IF is better at reducing edge cuts and REB at reducing communication volume.

[Figures: REB, edge cuts 66, communication volume 1.57 × 10^6; IF, edge cuts 66, communication volume 1.61 × 10^6]

slide-46
SLIDE 46

Algorithms

Group Small Blocks: CCG

Cut-Combine-Greedy (CCG): cut and combine small blocks in a greedy fashion.

◮ Include (part of) the block that most reduces the communication cost of the partition.
◮ Convert Block2Block communication to shared memory copy.

[Worked example: blocks A: 40, B: 80, C: 60, D: 40 with target W = 160. Starting from A (Wp = 40), CCG adds B (cost −6); adding C whole would overload, so CCG cuts C into C1: 40 and C2: 20 and compares C1 (cost −4 − 4 + 2 = −6) with D (cost −4); adding C1 reaches Wp = 160. Overall, communication cost 14 is converted to shared memory copy while new cost 2 is introduced.]
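The greedy inclusion step can be sketched on a block graph (the block sizes, edge costs, and seed below are illustrative; the "cut" part of CCG is omitted, so oversized neighbours are simply skipped instead of being split):

```python
def ccg_fill(sizes, edges, seed, target):
    # Grow one partition: repeatedly include the block whose edges into the
    # partition convert the most Block2Block communication into
    # shared-memory copies, without exceeding the target workload.
    part, load = {seed}, sizes[seed]

    def gain(b):
        # Total edge cost between candidate b and the current partition.
        return sum(w for (x, y), w in edges.items()
                   if (x == b and y in part) or (y == b and x in part))

    while True:
        cands = [b for b in sizes
                 if b not in part and load + sizes[b] <= target and gain(b) > 0]
        if not cands:
            return part, load
        best = max(cands, key=gain)
        part.add(best)
        load += sizes[best]
```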

slide-51
SLIDE 51

Algorithms

Group Small Blocks: GGS

Graph-Growth-Sweep (GGS): repeatedly use graph growing to group small blocks.

◮ Convert Block2Block communication to shared memory copy.
◮ Avoid cutting blocks.

[Worked example: blocks A-E, each of workload 40, with edge costs 4, 6, 3, 5, 4 and target W = 120. Growth fills partition 1 with A, B, C (W1 = 120), then partition 2 with D and E (W2 = 80); the sweep then moves C to partition 2 (cost −3 − 5 + 6 = −2), ending with W1 = 80 and W2 = 120.]
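A sketch of the growth phase (the block graph is illustrative, and the follow-up sweep that revisits assignments is omitted):

```python
def ggs_grow(sizes, edges, n_parts):
    # Grow each partition from an unassigned seed, always absorbing the
    # unassigned neighbour with the strongest connection to the partition;
    # blocks are never cut.
    target = sum(sizes.values()) / n_parts
    assigned = {}
    for p in range(n_parts):
        free = [b for b in sizes if b not in assigned]
        if not free:
            break
        seed = free[0]
        assigned[seed] = p
        load = sizes[seed]
        while load < target:
            def link(b):
                # Edge cost between candidate b and partition p so far.
                return sum(w for (x, y), w in edges.items()
                           if (x == b and assigned.get(y) == p)
                           or (y == b and assigned.get(x) == p))
            nbrs = [b for b in sizes if b not in assigned and link(b) > 0]
            if not nbrs:
                break
            best = max(nbrs, key=link)
            assigned[best] = p
            load += sizes[best]
    return assigned
```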

slide-59
SLIDE 59

Algorithms

Group Small Blocks: Compare CCG and GGS

To compare CCG and GGS when the number of blocks is large:

◮ CCG converts more communication to shared memory copy, but creates more cuts and new Block2Block communications.
◮ GGS converts less communication to shared memory copy, but avoids cutting blocks and introduces fewer new Block2Block communications.

slide-60
SLIDE 60

Tests and Results

Outline

◮ Background
◮ Algorithms
◮ Tests and Results
◮ Conclusion

slide-61
SLIDE 61

Tests and Results

Hardware and Network Specs

◮ Our experiments are performed on Mira, an IBM BlueGene/Q cluster with 16 cores per node.
◮ The latency α = 1.73 × 10^-5 s and bandwidth β = 1.77 × 10^9 bytes/s are measured by a ping-pong test.

[Plot: measured time vs. message size against the α-β model]

slide-62
SLIDE 62

Tests and Results

Numerical Experiment

Implement a Jacobi-type solver with MPI and OpenMP.

◮ Assign each MPI process to a node and spawn one OpenMP thread per core.
◮ The master thread calls MPI non-blocking routines, which overlap with the shared memory copies.

Experimental Jacobi Solver:
1: for i = 1 → NSTEP do
2:   Copy halo data to the sending buffer   ⊲ #pragma omp for, then barrier
3:   Update halos using non-blocking p2p communication   ⊲ #pragma omp master
4:   Copy halo data via shared memory within the node   ⊲ #pragma omp for, then barrier
5:   Copy data from the receiving buffer to the halo region   ⊲ #pragma omp for, then barrier
6:   Computation, with blocks split evenly among threads   ⊲ #pragma omp barrier
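The loop structure can be mimicked in a serial sketch (a pure-Python stand-in: plain copies replace both the MPI messages and the intra-node shared-memory copies, and the 1-D blocks on a periodic chain are assumptions):

```python
def jacobi_step(blocks):
    # One solver step, mirroring the slide: pack halo data, exchange it,
    # then relax. Each block is a list of cell values with one halo cell
    # at each end; blocks form a periodic 1-D chain.
    send = [(b[1], b[-2]) for b in blocks]    # pack boundary cells
    n = len(blocks)
    for i, b in enumerate(blocks):            # "exchange" halos
        b[0] = send[(i - 1) % n][1]           # left halo <- left nbr's right cell
        b[-1] = send[(i + 1) % n][0]          # right halo <- right nbr's left cell
    for b in blocks:                          # compute: 1-D Jacobi relaxation
        b[1:-1] = [(b[j - 1] + b[j + 1]) / 2 for j in range(1, len(b) - 1)]
    return blocks
```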

slide-63
SLIDE 63

Tests and Results

Bump3D Metrics: Communication Cost

Refine Bump3D 4 times in each direction, 8.3 × 10^7 cells in total. Beyond 512 partitions: Greedy > Metis > REB ≳ IF.

[Plot: communication cost vs. np (64 to 4096) for Greedy, Metis, REB, IF]

slide-64
SLIDE 64

Tests and Results

Bump3D Running Time

[Bar chart: Bump3D running time at np = 1024, 2048, 4096, broken into communication, computation, and others, for Greedy, Metis, REB+CCG, IF+CCG, REB+GGS, IF+GGS]

◮ Consistent with the α-β cost.
◮ Latency has more effect.
◮ At 4096 partitions, IF achieves 5.80x over Greedy and 2.56x over Metis in communication.

slide-67
SLIDE 67

Tests and Results

SpaceX's Falcon-Heavy Grid

The SpaceX Falcon-Heavy grid and the distribution of its blocks:

[Figure: Falcon-Heavy grid; histogram of # blocks vs. workload (10^6 cells)]

slide-68
SLIDE 68

Tests and Results

Falcon-Heavy Metrics: # Sub-Blocks

Choose REB+GGS and IF+CCG to represent GGS and CCG respectively.

[Plot: # sub-blocks per partition vs. np (64 to 4096) for Greedy, Metis, CCG, GGS]

# Sub-blocks: Metis > CCG > Greedy ≥ GGS.

Dominating pattern:

◮ 64-256 partitions: grouping small blocks
◮ 1024-4096 partitions: both cutting and grouping

slide-69
SLIDE 69

Tests and Results

Falcon-Heavy Metrics: Volume and Edge Cuts

◮ Greedy produces the maximum communication volume and edge cuts for 64-256 partitions.
◮ Metis produces the minimum communication volume and edge cuts for 64-256 partitions, with CCG second.
◮ Greedy, CCG, and GGS produce close results for 1024-4096 partitions, with GGS the minimum.

[Plots: communication volume and edge cuts vs. np (64 to 4096) for Greedy, Metis, CCG, GGS]

slide-72
SLIDE 72

Tests and Results

Falcon-Heavy Metrics: Communication Cost

[Plot: communication cost vs. np (64 to 4096) for Greedy, Metis, CCG, GGS]

The pattern is consistent with the metrics:

◮ 64-256 partitions: Metis best, CCG second
◮ 1024-4096 partitions: Metis worst, GGS best

slide-73
SLIDE 73

Tests and Results

Falcon-Heavy Running Time

[Bar chart: Falcon-Heavy running time at np = 1024, 2048, 4096, broken into communication, computation, and others, for Greedy, Metis, REB+CCG, IF+CCG, REB+GGS, IF+GGS]

◮ Metis produces a good result at 4096 partitions.
◮ Greedy produces good results at 1024 and 2048 partitions.
◮ At 4096 partitions, REB+GGS achieves 2.11x over Greedy and 1.54x over Metis in communication.

slide-76
SLIDE 76

Conclusion

Outline

◮ Background
◮ Algorithms
◮ Tests and Results
◮ Conclusion

slide-77
SLIDE 77

Conclusion

Conclusion

◮ Use the α-β model to construct a cost function incorporating edge cuts, communication volume, and network specifics.
◮ Propose modified REB and IF for cutting large blocks, and CCG and GGS for grouping small blocks.
◮ Test our partitioner with a hybrid MPI+OpenMP Jacobi solver on up to 4096 nodes.
◮ Achieve significant speedups in communication on both Bump3D and Falcon-Heavy.