

  1. The 49th International Conference on Parallel Processing (ICPP'20). GraBi: Communication-Efficient and Workload-Balanced Partitioning for Bipartite Graphs. Feng Sheng¹, Qiang Cao¹, Hong Jiang², and Jie Yao¹. ¹Huazhong University of Science and Technology, ²University of Texas at Arlington. 17-20 August 2020, Edmonton, AB, Canada.

  2. Outline
  • Background
  • Motivation
  • Design of GraBi
    ➢ Vertical Partitioning: Vertex-vector Chunking
    ➢ Horizontal Partitioning: Vertex-chunk Assignment
  • Evaluation
  • Conclusion

  3-5. Graph Partitioning
  Graph partitioning distributes vertices and edges over computing nodes.
  [Figure: (a) Edge-cut and (b) Vertex-cut partitioning of a four-vertex graph across Nodes 1-3, showing vertex masters and vertex replicas.]
  ➢ Edge-cut equally distributes vertices among nodes.
  ➢ Vertex-cut equally distributes edges among nodes.
  ➢ Replication factor (μ): the average number of replicas per vertex.
  Background · Motivation · Design · Evaluation · Conclusion
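As a concrete illustration of the replication factor μ defined above, the sketch below (illustrative Python, not code from the paper) computes μ for a vertex-cut partition: each vertex gets one replica on every node that holds at least one of its edges.

```python
from collections import defaultdict

def replication_factor(edge_assignment, num_vertices):
    """Replication factor (mu) of a vertex-cut partition.

    edge_assignment: list of ((u, v), node) pairs -- each edge is placed
    on exactly one computing node. A vertex is replicated on every node
    that holds at least one of its incident edges.
    """
    nodes_of_vertex = defaultdict(set)
    for (u, v), node in edge_assignment:
        nodes_of_vertex[u].add(node)
        nodes_of_vertex[v].add(node)
    total_replicas = sum(len(ns) for ns in nodes_of_vertex.values())
    return total_replicas / num_vertices

# Toy example: a 4-vertex cycle whose 4 edges are split over 2 nodes.
edges = [((1, 2), 0), ((2, 3), 0), ((3, 4), 1), ((4, 1), 1)]
print(replication_factor(edges, 4))  # → 1.5 (vertices 1 and 3 span both nodes)
```

A lower μ means fewer replicas to synchronize, hence less inter-node communication per iteration.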

  6-8. Bipartite graphs & MLDM algorithms
  • Bipartite graphs
    ➢ Vertices are separated into two disjoint subsets.
    ➢ Every edge connects one vertex each from the two subsets.
  • Machine Learning and Data Mining (MLDM) algorithms
    ➢ Bipartite graphs have been widely used in MLDM applications.
  [Figure: (a) Matrix view — an X-user by Y-item rating matrix R(u,v) factorized into low-rank factors P and Q with latent dimension D; (b) Graph view — the same ratings as a bipartite graph whose edges R(u,v) connect user vertices u to item vertices v, each vertex carrying a D-element vector.]
  Background · Motivation · Design · Evaluation · Conclusion
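The two views in the figure can be made concrete with a toy example (illustrative Python; the factor values P and Q are made up for illustration, not from the paper): each observed matrix entry becomes one bipartite edge, and each vertex carries a D-element latent vector as in R ≈ P × Qᵀ.

```python
D = 2  # latent dimension: the length of each per-vertex vector

# Matrix view: rows = users, columns = items, 0 = no rating observed.
R = [[5, 0, 3],
     [0, 4, 0]]

# Graph view: one (user, item, rating) edge per observed matrix entry.
edges = [(u, v, r) for u, row in enumerate(R) for v, r in enumerate(row) if r]

# Per-vertex D-element vectors: P for user vertices, Q for item vertices.
P = [[1.0, 2.0], [0.5, 1.5]]
Q = [[2.0, 1.5], [1.0, 2.0], [0.5, 1.0]]

def predict(u, v):
    """Model's reconstruction of R(u, v): the dot product of the two vectors."""
    return sum(pi * qi for pi, qi in zip(P[u], Q[v]))

print(edges)           # → [(0, 0, 5), (0, 2, 3), (1, 1, 4)]
print(predict(0, 0))   # → 5.0, matching the observed rating R[0][0]
```

An MLDM algorithm such as matrix factorization then iterates over the edges, comparing predict(u, v) with the observed rating and updating both endpoint vectors, which is why the per-vertex vectors dominate communication when the graph is distributed.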

  9-12. Observations
  • Observation 1: The vertex value in MLDM algorithms is a multi-element vector.
    ➢ The authors of CUBE [1] associate each vertex with a vector of up to 128 elements.
    ➢ The users of PowerGraph [2] can configure each vertex value as a vector of thousands of elements.
  • Observation 2: The sizes of the two vertex-subsets in a bipartite graph can be highly lopsided.
    ➢ In Netflix [3], the number of users is about 27x that of movies.
    ➢ In English Wikipedia [4], the number of articles is about 98x that of words.
  [1] M. Zhang, Y. Wu, K. Chen, et al. Exploring the Hidden Dimension in Graph Processing. In OSDI 2016.
  [2] J. E. Gonzalez, Y. Low, H. Gu, et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. In OSDI 2012.
  [3] http://www.netflixprize.com/community/viewtopic.php?pid=9857
  [4] https://dumps.wikimedia.org/
  Background · Motivation · Design · Evaluation · Conclusion

  13-14. Observations
  • Observation 3: Within a vertex-subset, the vertices usually exhibit a power-law degree distribution.
    ➢ Both vertex-subsets in DBLP [1] exhibit power-law degree distributions.
  [Figure: (a) Author degree distribution and (b) publication degree distribution in DBLP, both following a power law.]
  [1] https://dblp.org/
  Background · Motivation · Design · Evaluation · Conclusion
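Observation 3 can be checked on any bipartite edge list; the snippet below (illustrative, not from the paper) tallies the per-side degree distributions like those plotted for DBLP's authors and publications.

```python
from collections import Counter

def degree_distributions(edges):
    """Degree distribution of each side of a bipartite edge list.

    edges: iterable of (u, v) pairs, u from the left subset, v from the right.
    Returns two Counters mapping degree -> number of vertices with that degree.
    """
    left_deg = Counter(u for u, _ in edges)
    right_deg = Counter(v for _, v in edges)
    return Counter(left_deg.values()), Counter(right_deg.values())

# Toy bipartite graph: left vertex 0 is a "hub" touching every right vertex.
edges = [(0, 0), (0, 1), (0, 2), (1, 0), (2, 1)]
left, right = degree_distributions(edges)
print(left)   # → Counter({1: 2, 3: 1}): two degree-1 vertices, one degree-3 hub
print(right)  # → Counter({2: 2, 1: 1})
```

In a power-law graph the low-degree buckets hold most vertices while a few hubs sit in the tail, which is why hubs and low-degree vertices should be placed by different strategies.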

  15-17. Opportunities
  • Observation 1: The vertex value in MLDM algorithms is a multi-element vector.
    ➢ Each vertex vector can be divided into multiple sub-vectors.
  • Observation 2: The sizes of the two vertex-subsets in a bipartite graph can be highly lopsided.
    ➢ The two vertex-subsets can be processed with different priorities.
  • Observation 3: Within a vertex-subset, the vertices usually exhibit a power-law degree distribution.
    ➢ The vertices of different degrees should be distinguished.
  Background · Motivation · Design · Evaluation · Conclusion

  18-20. Overview of GraBi
  ➢ GraBi is a communication-efficient and workload-balanced partitioning framework for bipartite graphs.
  ➢ GraBi comprehensively exploits the above three observations about bipartite graphs and MLDM algorithms.
  ➢ GraBi partitions a bipartite graph first vertically, and then horizontally, to realize high-quality partitioning.
  Background · Motivation · Design · Evaluation · Conclusion

  21-23. Vertical Partitioning: Vertex-vector Chunking
  [Figure: Horizontal partitioning of three vertices across Nodes 1-3; each node holds one vertex master and replicas of the others, with inter-vertex communication between nodes and intra-vertex communication within a node.]
  • The whole vector of a vertex is assigned to a computing node.
  • Inter-vertex communication happens between computing nodes.
  • Intra-vertex communication happens within a computing node.
  Background · Motivation · Design · Evaluation · Conclusion
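The vertical step named in the slide title — splitting each vertex vector into sub-vectors (chunks) — can be sketched as follows. This is illustrative Python under assumed details (contiguous chunks, a chunk count k); it is not the paper's exact scheme.

```python
def chunk_vertex_vectors(vectors, k):
    """Split each vertex's value vector into k contiguous sub-vectors.

    vectors: dict mapping vertex id -> list of D elements.
    Returns a list of k dicts; dict i maps vertex id -> its i-th sub-vector.
    Each chunk can then be assigned to computing nodes independently
    by the subsequent horizontal (vertex-chunk assignment) step.
    """
    chunks = [dict() for _ in range(k)]
    for vid, vec in vectors.items():
        size = (len(vec) + k - 1) // k  # ceil(D / k): sizes differ by at most 1
        for i in range(k):
            chunks[i][vid] = vec[i * size:(i + 1) * size]
    return chunks

# Toy example: two vertices with D = 4 element vectors, split into k = 2 chunks.
vectors = {1: [0.1, 0.2, 0.3, 0.4], 2: [1.0, 2.0, 3.0, 4.0]}
c0, c1 = chunk_vertex_vectors(vectors, 2)
print(c0)  # → {1: [0.1, 0.2], 2: [1.0, 2.0]}
print(c1)  # → {1: [0.3, 0.4], 2: [3.0, 4.0]}
```

Because a replica only needs to synchronize the chunks it actually holds, chunking turns part of the inter-vertex communication between nodes into cheap intra-vertex communication within a node.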
