
Mosaic: Processing a Trillion-Edge Graph on a Single Machine

Steffen Maass, Changwoo Min, Sanidhya Kashyap, Woonhak Kang, Mohan Kumar, Taesoo Kim. Georgia Institute of Technology. Best Student Paper. April 26, 2017.


  1. Mosaic: Processing a Trillion-Edge Graph on a Single Machine. Steffen Maass, Changwoo Min, Sanidhya Kashyap, Woonhak Kang, Mohan Kumar, Taesoo Kim. Georgia Institute of Technology. Best Student Paper. April 26, 2017. (Slide footer: Steffen Maass, Mosaic: Trillion Edges on a Single Machine, April 26, 2017, 1 / 21)

  2-4. Large-scale graph processing is ubiquitous. Social networks; genome analysis; graphs enable machine learning.

  5-8. Powerful, heterogeneous machines. Terabytes of RAM on multiple sockets; powerful many-core coprocessors; fast, large-capacity non-volatile memory. ⇒ Take advantage of heterogeneous machines to process tera-scale graphs.

  9. Table of contents. (1) Graph Processing: Sample Application. (2) Design: Mosaic Architecture; Graph Encoding; API. (3) Evaluation.

  10. Graph Processing: Applications. Community detection; find common friends; find shortest paths; estimate impact of vertices (webpages, users, ...); ...

  11-13. Mosaic: Design space. Graph processing has many faces. Single machine: out-of-core ⇒ cheap, but potentially slow; in-memory ⇒ fast, but limited graph size. Cluster: out-of-core ⇒ large graphs, but expensive & slow; in-memory ⇒ large graphs & fast, but very expensive. ⇒ Single machine, out-of-core is most cost-effective. ⇒ Goal: good performance and large graphs!

  14. Mosaic: Design goals. Goal: run algorithms on very large graphs on a single machine using coprocessors. Enabled by: a common, familiar API (vertex/edge-centric); encoding with lossless compression, cache locality, and processing on isolated subgraphs.

  15. Architecture of Mosaic. Usage of Xeon Phi & NVMe; involvement of host. [Figure labels: global vertex state (current state, next state); host processors (Xeon): fetch, receive; per Xeon Phi (×4): edge processing (×61 cores); NVMe (×6): tiles T1, T2, ... and meta I1, I2, ...; transfer over PCIe.]

  16. Graph encoding: Idea. Compression: split graph into subgraphs, use local (short) identifiers. Cache locality: inside subgraphs, sort by access order; between subgraphs, overlap vertex sets.

  17. Background: Column first. Locality for write; multiple sequential reads. ⇒ Problem: no locality when switching column. [Figure: global adjacency matrix, source vertices 1 to 12 (rows) by target vertices 1 to 12 (columns), split into partitions P11 to P44 with S = 3.]

  18. Background: Row first. Locality for read; multiple sequential writes. ⇒ Problem: no locality when switching row. [Figure: global adjacency matrix, source vertices 1 to 12 (rows) by target vertices 1 to 12 (columns), split into partitions P11 to P44 with S = 3.]
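
The partitioning shown on the two background slides can be sketched in a few lines. This is a minimal illustration, not code from the talk: the helper names (`partition_of`, `row_first`, `column_first`, `jumps`) are hypothetical, and `jumps` is only a rough proxy for the "no locality when switching row/column" problem. It counts consecutive partitions that share neither a source block nor a target block.

```python
S = 3  # partition side length, as in the slides' 12-vertex example


def partition_of(src, tgt, s=S):
    """Return the 1-based (row, column) of the partition holding edge src -> tgt.

    Rows are source blocks and columns are target blocks, as in the figure.
    """
    return ((src - 1) // s + 1, (tgt - 1) // s + 1)


def row_first(n):
    """Visit partitions row by row: one source block, all target blocks."""
    return [(r, c) for r in range(1, n + 1) for c in range(1, n + 1)]


def column_first(n):
    """Visit partitions column by column: one target block, all source blocks."""
    return [(r, c) for c in range(1, n + 1) for r in range(1, n + 1)]


def jumps(order):
    """Count consecutive partitions sharing neither a row nor a column."""
    return sum(
        1
        for (r1, c1), (r2, c2) in zip(order, order[1:])
        if r1 != r2 and c1 != c2
    )


print(partition_of(5, 11))                           # (2, 4)
print(jumps(row_first(4)), jumps(column_first(4)))   # 3 3
```

With a 4 × 4 grid of partitions, both orders make three such jumps, one at each row or column switch.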

  19. Background: Hilbert order. Space-filling curve; provides locality between adjacent data points. [Figure: global adjacency matrix, source vertices 1 to 12 (rows) by target vertices 1 to 12 (columns), split into partitions P11 to P44 with S = 3.]
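
To make the Hilbert order concrete, here is a minimal sketch that sorts the cells of an n × n partition grid along a Hilbert curve, using the well-known iterative coordinate-to-index conversion. It assumes n is a power of two and uses 0-based (row, column) cells. The function names are hypothetical, and the orientation of the curve is an arbitrary choice that may differ from the one drawn on the slide.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n x n grid (n a power of two) to its curve position."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_order(n):
    """Return all cells of the n x n grid sorted by their Hilbert index."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    return sorted(cells, key=lambda rc: hilbert_index(n, rc[0], rc[1]))


print(hilbert_order(2))  # [(0, 0), (0, 1), (1, 1), (1, 0)]

# Consecutive cells on the curve are always direct neighbours, so each step
# keeps either the row (source block) or the column (target block) unchanged.
order = hilbert_order(4)
assert all(
    abs(r1 - r2) + abs(c1 - c2) == 1
    for (r1, c1), (r2, c2) in zip(order, order[1:])
)
```

The final assertion is the property the slide refers to: unlike a row-first or column-first sweep, the curve never moves to a partition that differs in both its source block and its target block.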

  20-23. From global to local: Tiles. Convert graph to set of tiles: 1) Start with adjacency matrix. 2) Use first edge in tile T1. 3) Consume as many edges as possible. 4) Next edges do not fit in T1, construct T2. [Figure: the partitioned global adjacency matrix (source by target vertex, S = 3) with edges marked by their local store order (➊, ➋, ...); Tile-1 (T1) and Tile-2 (T2) hold the edges with local vertex ids, and their meta blocks I1 and I2 map local ids to global ids (I1 lists global ids 1, 2, 4 and 5; I2 lists global ids 3, 4, 5 and 6). Legend: local vertex id; local → global id; local edge store order.]
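
The four steps above can be sketched as a greedy packing loop. This is an illustration under stated assumptions, not Mosaic's actual encoder: the edge list is assumed to arrive already in traversal order, each tile keeps a single local id space for sources and targets, and the cap `max_local_vertices` as well as the names `build_tiles` and `decode` are hypothetical.

```python
def build_tiles(edges, max_local_vertices):
    """Greedily pack edges, in the given order, into tiles.

    Each tile is a dict with
      "meta":  list mapping local id -> global id (the slides' I_k)
      "edges": list of (local_src, local_tgt) pairs (the slides' T_k)
    """
    if max_local_vertices < 2:
        raise ValueError("a tile must be able to hold both ends of an edge")
    tiles = []
    meta, local_of, local_edges = [], {}, []
    for src, tgt in edges:
        new_vertices = {v for v in (src, tgt) if v not in local_of}
        # The edge does not fit any more: close this tile and start the next.
        if len(meta) + len(new_vertices) > max_local_vertices:
            tiles.append({"meta": meta, "edges": local_edges})
            meta, local_of, local_edges = [], {}, []
        for v in (src, tgt):
            if v not in local_of:
                local_of[v] = len(meta)  # next free local id
                meta.append(v)
        local_edges.append((local_of[src], local_of[tgt]))
    if local_edges:
        tiles.append({"meta": meta, "edges": local_edges})
    return tiles


def decode(tile):
    """Translate a tile's local edges back to global vertex ids."""
    return [(tile["meta"][s], tile["meta"][t]) for s, t in tile["edges"]]


# Hypothetical edge list (not the one from the slides), at most 4 vertices per tile.
edges = [(1, 2), (2, 1), (1, 5), (4, 2), (4, 5), (5, 6), (6, 3)]
tiles = build_tiles(edges, max_local_vertices=4)
print(tiles[0]["meta"])  # [1, 2, 5, 4]
print(tiles[1]["meta"])  # [5, 6, 3]
assert [e for t in tiles for e in decode(t)] == edges  # encoding is lossless
```

In this toy run global vertex 5 appears in both tiles, which mirrors the overlapping vertex sets between subgraphs mentioned earlier. Because local ids only need to distinguish the vertices of a single tile, they can be stored in fewer bits than global ids.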

  24. Locality with Hilbert-ordered tiles. Overlapping sets of sources and targets. ⇒ Better locality than row-first or column-first. [Figure: the partitioned global adjacency matrix (source by target vertex, S = 3) with edges numbered ➊ to ➒.]
