
Evaluation of Parallel Graph Loading Techniques



  1. Evaluation of Parallel Graph Loading Techniques
     Manuel Then, Moritz Kaufmann, Alfons Kemper, Thomas Neumann
     Technical University of Munich, Chair of Database Systems

  2. Manuel Then (TUM) | Evaluation of Parallel Graph Loading Techniques 3

  4. General Graph Loading Pipeline
     Goal: Efficiently load a given graph dataset for explorative analytics
     Read
     • Parse edges and create relabeling
     • Write edges to worker-local buffer
     Sync
     • Find unique vertices
     • Count neighbors
     Write
     • Create final graph data structure
     • Apply final relabeling
     Analytics
     • The actual analytics work
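The Read, Sync, and Write phases can be sketched in C++ roughly as follows. This is a minimal illustration under several assumptions that are not on the slide: the names `RawEdge`, `CsrGraph`, and `loadGraph` are hypothetical, the target structure is a CSR-style adjacency array, the relabeling is derived during Sync by sorting the identifiers (the slide creates a relabeling already while reading), and the worker loop runs sequentially where a parallel loader would use one thread per worker.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not taken from the slides.
using RawEdge = std::pair<std::uint64_t, std::uint64_t>;  // (source, target), original ids

struct CsrGraph {
    std::vector<std::uint64_t> originalId;  // dense id -> original id (the relabeling)
    std::vector<std::size_t> offsets;       // offsets[v]..offsets[v+1] index into targets
    std::vector<std::uint32_t> targets;     // relabeled neighbor ids
};

CsrGraph loadGraph(const std::vector<RawEdge>& input, std::size_t workers) {
    assert(workers > 0);

    // Read: every worker copies its slice of the input into a worker-local buffer.
    // The loop is sequential here; a parallel loader would run each iteration
    // on its own thread.
    std::vector<std::vector<RawEdge>> local(workers);
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = input.size() * w / workers;
        const std::size_t end = input.size() * (w + 1) / workers;
        local[w].assign(input.begin() + begin, input.begin() + end);
    }

    // Sync, part 1: find the unique vertices. Sorted position = dense id.
    CsrGraph g;
    for (const auto& buffer : local) {
        for (const auto& edge : buffer) {
            g.originalId.push_back(edge.first);
            g.originalId.push_back(edge.second);
        }
    }
    std::sort(g.originalId.begin(), g.originalId.end());
    g.originalId.erase(std::unique(g.originalId.begin(), g.originalId.end()),
                       g.originalId.end());
    const auto dense = [&g](std::uint64_t id) {
        const auto it = std::lower_bound(g.originalId.begin(), g.originalId.end(), id);
        return static_cast<std::uint32_t>(it - g.originalId.begin());
    };

    // Sync, part 2: count the neighbors of every vertex, then prefix-sum the counts.
    g.offsets.assign(g.originalId.size() + 1, 0);
    for (const auto& buffer : local)
        for (const auto& edge : buffer) ++g.offsets[dense(edge.first) + 1];
    for (std::size_t v = 1; v < g.offsets.size(); ++v) g.offsets[v] += g.offsets[v - 1];

    // Write: create the final structure and apply the relabeling to every edge.
    std::vector<std::size_t> cursor(g.offsets.begin(), g.offsets.end() - 1);
    g.targets.resize(g.offsets.back());
    for (const auto& buffer : local)
        for (const auto& edge : buffer)
            g.targets[cursor[dense(edge.first)]++] = dense(edge.second);
    return g;
}
```

Counting neighbors before writing lets the Write phase allocate the target array once and place every edge directly at its final position.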

  5. Scenario-specific Graph Loading
     Problem: The optimal way of loading the graph depends on various factors:
     • Format of the graph data
     • Source of the data
     • Properties of the input data
     • Target graph data structure
     • Execution machine
     The graph loading pipeline must be adapted to the scenario at hand

  6. General Graph Loading Pipeline, annotated with scenario-specific questions (pipeline steps as on slide 4)
     • Identifier data type: binary, decimal, or string?
     • Can the input data be read multiple times?
     • Is random access possible?
     • Is an explicit vertex list available?
     • Which data structure to generate?
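The identifier-type question affects how the relabeling can be created. As one hedged illustration, when identifiers are strings and the input can only be read once, dense ids can be handed out on first sight with a hash map. The class `Relabeler` and its member names are hypothetical and not taken from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: assigns dense ids in order of first appearance.
class Relabeler {
public:
    // Returns the dense id of `original`, creating a new one if it is unseen.
    std::uint32_t idFor(const std::string& original) {
        const auto next = static_cast<std::uint32_t>(names_.size());
        const auto inserted = toDense_.emplace(original, next);
        if (inserted.second) names_.push_back(original);  // first occurrence
        return inserted.first->second;
    }

    // Maps a dense id back to the original identifier.
    const std::string& originalOf(std::uint32_t dense) const { return names_.at(dense); }

    std::size_t vertexCount() const { return names_.size(); }

private:
    std::unordered_map<std::string, std::uint32_t> toDense_;
    std::vector<std::string> names_;
};
```

If an explicit vertex list is available, the same map could be filled from that list up front, so that edge parsing only performs lookups.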

  15. Parsers
      Binary reader
      • No parsing necessary => directly copy vertex identifiers
      • Every edge has the same size => work splitting is trivial
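A binary reader along these lines might look like the sketch below. The record layout (two 32-bit identifiers per edge in native byte order) and the names `Edge` and `readBinarySlice` are assumptions made for illustration; the slide does not specify a file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed layout (not specified on the slide): two 32-bit ids per edge,
// stored back to back in the machine's native byte order.
struct Edge {
    std::uint32_t source;
    std::uint32_t target;
};

// Copies the edges of one worker's slice straight out of the raw bytes.
std::vector<Edge> readBinarySlice(const unsigned char* data, std::size_t byteCount,
                                  std::size_t worker, std::size_t workers) {
    assert(workers > 0 && worker < workers);
    const std::size_t edgeCount = byteCount / sizeof(Edge);
    // Fixed-size records: slice boundaries are plain index arithmetic.
    const std::size_t first = edgeCount * worker / workers;
    const std::size_t last = edgeCount * (worker + 1) / workers;
    std::vector<Edge> edges(last - first);
    if (!edges.empty()) {
        std::memcpy(edges.data(), data + first * sizeof(Edge),
                    edges.size() * sizeof(Edge));
    }
    return edges;
}
```

Because every record has the same size, each worker can compute its slice without looking at the data.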

  16. Parsers
      Library-provided decimal parsing
      • Readily available for many languages
      • We evaluated C++'s stream operator and strtol
      • Varying edge length => work splitting is more complex
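With text input, a worker cannot find its first edge by index arithmetic, because lines differ in length. One common workaround, sketched here, is to move each nominal split position forward to the next line start. The sketch assumes well-formed input with one "source target" pair per line, and it uses `std::strtoull`, the unsigned 64-bit sibling of the `strtol` mentioned on the slide. The names `alignToLineStart` and `parseChunk` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

using Edge = std::pair<std::uint64_t, std::uint64_t>;  // (source, target)

// Moves a nominal split position forward to the start of the next line.
std::size_t alignToLineStart(const std::string& text, std::size_t pos) {
    if (pos == 0) return 0;
    if (pos >= text.size()) return text.size();
    const std::size_t newline = text.find('\n', pos - 1);
    return newline == std::string::npos ? text.size() : newline + 1;
}

// Parses all lines that start inside this worker's chunk.
std::vector<Edge> parseChunk(const std::string& text, std::size_t worker,
                             std::size_t workers) {
    assert(workers > 0 && worker < workers);
    const std::size_t begin = alignToLineStart(text, text.size() * worker / workers);
    const std::size_t end = alignToLineStart(text, text.size() * (worker + 1) / workers);
    std::vector<Edge> edges;
    const char* cursor = text.c_str() + begin;
    const char* const stop = text.c_str() + end;
    while (cursor < stop) {
        char* next = nullptr;
        const std::uint64_t source = std::strtoull(cursor, &next, 10);
        if (next == cursor) break;  // no digits found
        cursor = next;
        const std::uint64_t target = std::strtoull(cursor, &next, 10);
        if (next == cursor) break;  // line had no second number
        cursor = next;
        edges.emplace_back(source, target);
        // strtoull skips leading whitespace, so consume the line ending here;
        // otherwise the next call could read into the following chunk.
        while (cursor < stop && (*cursor == '\n' || *cursor == '\r' ||
                                 *cursor == ' ' || *cursor == '\t')) {
            ++cursor;
        }
    }
    return edges;
}
```

Since both chunk boundaries are aligned by the same rule, every line is parsed by exactly one worker, although chunks may differ in size or even be empty.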

  17. Parsers
      [Chart; scale labels: 2x, 20x, 200x]

  18. Parsers
      Iterative decimal parsing
      • Multiply the running value by ten and add the digit value of the next character
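The iterative approach fits in a few lines of C++. The function name `parseDecimal` and its cursor-based interface are assumptions for illustration, and the sketch deliberately omits sign and overflow handling.

```cpp
#include <cassert>
#include <cstdint>

// Parses an unsigned decimal number starting at `cursor` and advances
// `cursor` past its last digit. No sign or overflow handling.
std::uint64_t parseDecimal(const char*& cursor, const char* end) {
    std::uint64_t value = 0;
    while (cursor != end && *cursor >= '0' && *cursor <= '9') {
        // Shift the digits seen so far one decimal place and append the new one.
        value = value * 10 + static_cast<std::uint64_t>(*cursor - '0');
        ++cursor;
    }
    return value;
}
```

A caller would skip the separator between the two identifiers of an edge and call the function again for the second identifier.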

  21. Parsers
      Vectorized decimal parsing
      • Leverage wide vector units for identifier parsing
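The slide does not say which vector instructions are used. As a portable stand-in, the sketch below applies a well-known SWAR trick (SIMD within a register) that converts eight ASCII digits at once using a single 64-bit word. It is not the authors' implementation. It assumes a little-endian machine and exactly eight digit characters, and it performs no validation; the name `parseEightDigits` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Converts exactly eight ASCII digits to their numeric value.
// Assumes a little-endian machine and valid input; performs no checks.
std::uint32_t parseEightDigits(const char* chars) {
    std::uint64_t val;
    std::memcpy(&val, chars, sizeof(val));  // first character lands in the lowest byte
    // Strip the ASCII offset, then merge neighboring digits into 2-digit values.
    val = ((val & 0x0F0F0F0F0F0F0F0FULL) * 2561ULL) >> 8;            // 2561 = 10 * 2^8 + 1
    // Merge neighboring 2-digit values into 4-digit values.
    val = ((val & 0x00FF00FF00FF00FFULL) * 6553601ULL) >> 16;        // 100 * 2^16 + 1
    // Merge the two 4-digit values into the final 8-digit number.
    val = ((val & 0x0000FFFF0000FFFFULL) * 42949672960001ULL) >> 32; // 10000 * 2^32 + 1
    return static_cast<std::uint32_t>(val);
}
```

Each multiplication combines all neighboring lanes in one step, so eight digits need three multiply-shift rounds instead of eight loop iterations. A real loader would still need to handle identifiers of other lengths, for example by padding or by falling back to the iterative parser.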
