CSE 6350 File and Storage System Infrastructure in Data Centers Supporting Internet-wide Services
PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
Presenter: Mengxiao Wang
Problem: Existing distributed graph computation
runtime of vertices varies widely with graph-parallel execution.
results in poor locality (only a small fraction of the machines will hold most of the edge cuts).
communication with other vertices, resulting in bottlenecks such as network traffic congestion and too many duplicate messages.
a single machine.
but the abstractions do not parallelize within them, so high-degree vertices perform more computation than other vertices.
neighbors and in-edges.
mirrors on other machines.
neighbors and activate them to start GAS vertex programs.
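
The gather, apply, and scatter steps above can be sketched as a small single-process loop. This is only an illustration, not PowerGraph's API: PageRank is chosen here as an example vertex program, the function name `run_gas` and its parameters are hypothetical, and mirrors on other machines are not modeled.

```python
from collections import defaultdict

def run_gas(edges, damping=0.85, tol=1e-6, max_steps=1000):
    """Synchronous gather-apply-scatter loop running PageRank (illustrative)."""
    in_nbrs, out_nbrs = defaultdict(list), defaultdict(list)
    vertices = set()
    for u, v in edges:
        out_nbrs[u].append(v)
        in_nbrs[v].append(u)
        vertices.update((u, v))

    rank = {v: 1.0 for v in vertices}
    active = set(vertices)          # every vertex starts active
    steps = 0
    while active and steps < max_steps:
        updates = {}
        for v in active:
            # Gather: accumulate contributions over in-edges.
            acc = sum(rank[u] / len(out_nbrs[u]) for u in in_nbrs[v])
            # Apply: compute the new vertex value from the accumulator.
            updates[v] = (1 - damping) + damping * acc
        next_active = set()
        for v, new_value in updates.items():
            changed = abs(new_value - rank[v]) > tol
            rank[v] = new_value
            if changed:
                # Scatter: activate out-neighbors so they rerun the program.
                next_active.update(out_nbrs[v])
        active = next_active
        steps += 1
    return rank
```

For example, `run_gas([(0, 1), (1, 2), (2, 0)])` runs the program on a three-vertex cycle. Only vertices whose neighbors changed are re-activated, which is the point of the scatter step.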
machines
by each vertex
loaded
approaches
Placement
Placement
For greedy vertex-cuts:
minimizes the expected number of machines spanned
– Requires coordination to place each edge
– Slower: higher quality cuts

– Approximates the greedy objective without coordination
– Faster: lower quality cuts
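
A minimal sketch of greedy edge placement, assuming the usual four-case rule for a greedy vertex-cut: prefer a machine that already holds both endpoints, then a machine of the endpoint with more unplaced edges, then a machine of the only placed endpoint, and otherwise the least loaded machine. The names `greedy_place_edges` and `replication_factor` are hypothetical, and breaking ties by machine load is an assumption of this sketch. A single shared table stands in for the coordinated variant. A variant without coordination would keep such a table locally on each machine.

```python
from collections import defaultdict

def greedy_place_edges(edges, num_machines):
    """Place each edge on one machine; return (machines, spans)."""
    spans = defaultdict(set)        # vertex -> machines holding a replica
    load = [0] * num_machines       # number of edges on each machine
    remaining = defaultdict(int)    # vertex -> edges not yet placed
    for u, v in edges:
        remaining[u] += 1
        remaining[v] += 1

    machines = []
    for u, v in edges:
        a_u, a_v = spans[u], spans[v]
        if a_u & a_v:
            # Case 1: some machine already holds both endpoints.
            candidates = a_u & a_v
        elif a_u and a_v:
            # Case 2: both placed, disjoint; follow the endpoint
            # that still has more unplaced edges.
            candidates = a_u if remaining[u] >= remaining[v] else a_v
        elif a_u or a_v:
            # Case 3: only one endpoint has been placed.
            candidates = a_u or a_v
        else:
            # Case 4: neither endpoint placed; any machine is allowed.
            candidates = range(num_machines)
        # Assumption: break ties by lowest load, then lowest machine id.
        m = min(candidates, key=lambda i: (load[i], i))
        machines.append(m)
        load[m] += 1
        spans[u].add(m)
        spans[v].add(m)
        remaining[u] -= 1
        remaining[v] -= 1
    return machines, spans

def replication_factor(spans):
    """Average number of machines spanned per vertex."""
    return sum(len(s) for s in spans.values()) / len(spans)
```

Calling `replication_factor` on the returned `spans` gives the average number of machines spanned per vertex, the quantity the greedy objective tries to keep small.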
Must synchronize edges
Must synchronize vertices