The More the Merrier: Efficient Multi-Source Graph Traversal


  1. The More the Merrier: Efficient Multi-Source Graph Traversal
     Manuel Then*, Moritz Kaufmann*, Fernando Chirigati†, Tuan-Anh Hoang-Vu†, Kien Pham†, Huy T. Vo†, Alfons Kemper*, Thomas Neumann*
     * Technische Universität München, † New York University

  2. Outline
     • Motivation
     • Challenges
     • Goals
     • Multi-Source BFS
     • Evaluation
     • Summary
     Slide date: 2015-09-01

  6. Motivation
     • Graph traversal is a vital part of graph analytics
       - BFS, DFS, neighbor traversals, random walks, ...
     • Multiple BFS traversals are often necessary to compute results
       - Closeness centrality, shortest paths, ...
     • Real-world graphs are often small-world networks
       - Social networks, web graphs, communication networks
     • Subject of this talk: efficiently running multiple BFSs on real-world graphs

  9. Challenges
     • Random data access is intrinsic to graph traversal algorithms
       - Bad cache behavior, frequent CPU stalls
     • Single-bit accesses waste memory bandwidth
       - e.g. for BFS seen bitmaps
     • Independent BFS runs redundantly visit vertices multiple times

  15. Challenge - Redundant visits
      Example: BFSs in a simple graph
      [Figure: BFS 1 and BFS 2, each shown at Initial, Iteration 1, and Iteration 2]

  16. Challenge - Redundant visits (cont.)
      [Figure: Redundant vertex visits for 512 BFSs on the LDBC 1M social network graph]
      • After a few iterations, many visits are redundant in small-world networks
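One way to make "redundant visits" concrete is to run the BFSs independently and count, per level, how often several of them reach the same vertex at the same level. The Python sketch below does this on a toy graph. The function names and the exact definition of "redundant" (every visit beyond the first to a given vertex at a given level) are assumptions made for illustration; the slides do not spell out how the figure was computed.

```python
from collections import Counter, deque


def bfs_levels(adj, source):
    """Return a dict mapping each reachable vertex to its BFS level."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in level:
                level[u] = level[v] + 1
                queue.append(u)
    return level


def redundant_visits_per_level(adj, sources):
    """Map level -> (total visits, visits beyond the first per vertex)."""
    per_level = {}
    for s in sources:
        for v, lvl in bfs_levels(adj, s).items():
            per_level.setdefault(lvl, Counter())[v] += 1
    return {
        lvl: (sum(c.values()), sum(c.values()) - len(c))
        for lvl, c in sorted(per_level.items())
    }


# Toy star graph: vertex 0 is the hub, vertices 1..4 are leaves.
star = [[1, 2, 3, 4], [0], [0], [0], [0]]
print(redundant_visits_per_level(star, [1, 2, 3]))
# → {0: (3, 0), 1: (3, 2), 2: (9, 5)}
```

Even in this tiny graph, all three BFSs meet at the hub after one step, and from then on most of their visits overlap.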

  19. Goals
      • Leverage the knowledge that multiple BFS traversals are run
      • Optimize data access patterns
        - Embrace memory accesses instead of trying to hide them
        - CPUs always fetch full cache lines: use all of them
      • Avoid redundant computation and vertex visits
        - Touch vertex information as rarely as possible
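The cache-line goal can be illustrated with a little arithmetic. The sketch below assumes a 64-byte cache line, which is typical for current x86-64 CPUs but is not stated on the slides; the function name is hypothetical.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86-64 cache line size


def cache_line_utilization(bits_used, line_bytes=CACHE_LINE_BYTES):
    """Fraction of a fetched cache line that carries useful bits."""
    return bits_used / (line_bytes * 8)


# One "seen" bit read for a single BFS.
print(cache_line_utilization(1))
# One "seen" bit read for each of 512 concurrent BFSs.
print(cache_line_utilization(512))
```

Under that assumption, a single-bit access uses 1/512 of the fetched line, while a 512-bit bitset per vertex uses all of it.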

  23. Multi-Source BFS
      • Concurrently run many independent BFS traversals on the same graph
        - 100s of BFSs on a single CPU core
      • Store the state of the concurrent BFSs as 3 bitsets per vertex: visit, seen, next
      • Represent BFS traversal as SIMD bit operations on these bitsets
      • Fully utilize the cache line-sized memory accesses of modern CPUs
      • Efficiently share traversals whenever possible
        - Neighbors traversed only once for all concurrent BFSs
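The three-bitset scheme can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: arbitrary-precision Python integers stand in for the fixed-width SIMD registers, bit i of each bitset belongs to the i-th BFS, and the names `ms_bfs`, `visit_next`, and `dist` are assumptions introduced here.

```python
def ms_bfs(adj, sources):
    """Run one BFS per source concurrently; return one distance dict per source."""
    n = len(adj)
    seen = [0] * n   # bit i set: BFS i has already discovered this vertex
    visit = [0] * n  # bit i set: BFS i expands this vertex in the current level
    dist = [{s: 0} for s in sources]
    for i, s in enumerate(sources):
        seen[s] |= 1 << i
        visit[s] |= 1 << i

    level = 0
    while any(visit):
        level += 1
        visit_next = [0] * n
        for v in range(n):
            if visit[v] == 0:
                continue
            # The neighbor list of v is scanned once for all BFSs active at v.
            for u in adj[v]:
                new_bits = visit[v] & ~seen[u]  # BFSs that reach u for the first time
                if new_bits:
                    visit_next[u] |= new_bits
                    seen[u] |= new_bits
                    # Per-BFS work: record the distance for every newly set bit.
                    while new_bits:
                        low = new_bits & -new_bits
                        dist[low.bit_length() - 1][u] = level
                        new_bits ^= low
        visit = visit_next
    return dist


# Path graph 0 - 1 - 2 - 3 with BFSs starting at both ends.
path = [[1], [0, 2], [1, 3], [2]]
print(ms_bfs(path, [0, 3]))
```

The key step is `visit[v] & ~seen[u]`: a single bit operation decides, for every concurrent BFS at once, whether `u` is newly discovered.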

  30. Multi-Source BFS - Example
      [Figure: example traversal shown at Initial, Iteration 1, and Iteration 2]

  31. Multi-Source BFS - Further Improvements
      • Aggregated neighbor processing
        - Reduce number of random writes
      • Batching heuristics for maximum sharing
      • Direction-optimizing
      • Prefetching
      ... see paper
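Aggregated neighbor processing can plausibly be read as splitting each level into two phases: first OR the visit bits into every neighbor without consulting `seen`, then filter each touched vertex against `seen` exactly once. The Python sketch below follows that reading. The slides give no details, so treat the two-phase structure and all names here as assumptions rather than the authors' exact method.

```python
def ms_bfs_aggregated(adj, sources):
    """Two-phase multi-source BFS; returns one distance dict per source."""
    n = len(adj)
    seen = [0] * n
    visit = [0] * n
    dist = [{s: 0} for s in sources]
    for i, s in enumerate(sources):
        seen[s] |= 1 << i
        visit[s] |= 1 << i

    level = 0
    while any(visit):
        level += 1
        visit_next = [0] * n
        # Phase 1: push visit bits to all neighbors; only visit_next is written.
        for v in range(n):
            if visit[v]:
                for u in adj[v]:
                    visit_next[u] |= visit[v]
        # Phase 2: one sequential pass that filters against seen and updates it.
        for v in range(n):
            if visit_next[v]:
                visit_next[v] &= ~seen[v]
                seen[v] |= visit_next[v]
                new_bits = visit_next[v]
                while new_bits:
                    low = new_bits & -new_bits
                    dist[low.bit_length() - 1][v] = level
                    new_bits ^= low
        visit = visit_next
    return dist


# Path graph 0 - 1 - 2 - 3 with BFSs starting at both ends.
path = [[1], [0, 2], [1, 3], [2]]
print(ms_bfs_aggregated(path, [0, 3]))
```

In this form the `seen` bitsets are read and written once per vertex and level in vertex order, instead of once per edge in random order.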

  32. Evaluation - The More the Merrier

  33. Evaluation
      • MS-BFS-based closeness centrality. 4x Intel Xeon E7-4870v2, 1TB

  35. Summary
      • Making graph traversals aware of each other can lead to a substantial performance increase
      • Multi-Source BFS (MS-BFS) runs multiple independent BFSs ...
        - ... on the same graph ...
        - ... concurrently on a single CPU ...
        - ... and shares their traversals.
      • MS-BFS shows a 10-100x speedup over existing single-source BFSs

  36. Backup 1

  37. Backup 2
