SLIDE 1

Parallel Triangle Counting in MPI

Jason Li and David Wise

SLIDE 2

Background

  • A triangle in an undirected graph is a collection of 3 vertices such that all 3 pairs of vertices are connected by an edge.

  • “Triangle counting has emerged as an important building block in the study of social networks, identifying thematic structures of networks, spam and fraud detection, link classification and recommendation, and more” [1]

SLIDE 3

Background

  • A triangle in an undirected graph is a collection of 3 vertices such that all 3 pairs of vertices are connected by an edge.

  • “Triangle counting has emerged as an important building block in the study of social networks, identifying thematic structures of networks, spam and fraud detection, link classification and recommendation, and more” [1]

(Figure: this graph has 2 triangles.)

SLIDE 4

The Underlying Algorithm

  • Initialize the counter to 0.
  • Sort the vertices in order of increasing degree, breaking ties arbitrarily. Similarly, sort the adjacency lists according to the same ordering.
  • For each edge (v, w) with v < w:
    • Let u_v and u_w be the first vertices in the adjacency lists of v and w respectively.
    • While u_v exists and u_v < v, and u_w exists and u_w < v:
      • If u_v < u_w, then set u_v to the next neighbor of v.
      • Else if u_w < u_v, then set u_w to the next neighbor of w.
      • Else increment the counter and set u_v and u_w to their next neighbors.
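
Below is a minimal sequential sketch of the steps above; the small example graph, variable names, and output format are illustrative assumptions, not the authors' code.

```cpp
// Sequential triangle counting following the slide's pseudocode (a sketch).
// The 5-edge example graph below is an assumption; it contains 2 triangles.
#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

int main() {
    int n = 4;
    std::vector<std::pair<int, int>> edges = {{0,1}, {1,2}, {0,2}, {1,3}, {2,3}};

    // Degree of each vertex.
    std::vector<int> deg(n, 0);
    for (auto [a, b] : edges) { deg[a]++; deg[b]++; }

    // Rank vertices by increasing degree, breaking ties by vertex id.
    std::vector<int> order(n), rank(n);
    for (int v = 0; v < n; v++) order[v] = v;
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return deg[a] != deg[b] ? deg[a] < deg[b] : a < b;
    });
    for (int i = 0; i < n; i++) rank[order[i]] = i;

    // Adjacency lists keyed by rank, each sorted in the same rank order.
    std::vector<std::vector<int>> adj(n);
    for (auto [a, b] : edges) {
        adj[rank[a]].push_back(rank[b]);
        adj[rank[b]].push_back(rank[a]);
    }
    for (auto& list : adj) std::sort(list.begin(), list.end());

    // For each edge (v, w) with v < w, merge the two sorted lists over
    // common neighbors u with u < v; each such u closes one triangle.
    long long count = 0;
    for (auto [a, b] : edges) {
        int v = std::min(rank[a], rank[b]);
        int w = std::max(rank[a], rank[b]);
        std::size_t iv = 0, iw = 0;
        while (iv < adj[v].size() && adj[v][iv] < v &&
               iw < adj[w].size() && adj[w][iw] < v) {
            if (adj[v][iv] < adj[w][iw])      iv++;
            else if (adj[w][iw] < adj[v][iv]) iw++;
            else { count++; iv++; iw++; }     // common neighbor found
        }
    }
    std::printf("triangles: %lld\n", count);  // prints 2 for this graph
    return 0;
}
```

Each triangle is counted exactly once, namely when the edge between its two highest-ranked vertices is processed.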
SLIDE 5

Complexity of the Algorithm

  • The space complexity is just O(m), since we only store the graph.
  • Because the vertices are sorted by degree and each edge is assigned to its smaller neighbor, it can be shown that the sequential time complexity is O(m^(3/2)).
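
The slides do not spell the argument out; as a hedged aside, the standard counting fact behind the bound (no vertex is assigned too many edges) can be stated as follows.

```latex
% A sketch of the standard lemma (my wording, not from the slides): a vertex has
% at most sqrt(2m) neighbors of degree at least its own, so at most sqrt(2m)
% edges are assigned to any one vertex.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
Suppose a vertex $v$ has $k$ neighbors of degree at least $\deg(v)$. Since $v$ has at
least those $k$ neighbors, $\deg(v) \ge k$, and summing degrees over the whole graph,
\[
  2m \;=\; \sum_{u} \deg(u) \;\ge\; k \cdot \deg(v) \;\ge\; k^2,
  \qquad\text{hence}\qquad k \le \sqrt{2m}.
\]
So each edge's ``smaller'' endpoint owns at most $\sqrt{2m}$ edges, which is the key
ingredient in the $O(m^{3/2})$ running-time bound.
\end{document}
```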

SLIDE 6

Parallelizing the Algorithm

  • The focus of our project was efficiently parallelizing this algorithm.
  • Naive idea: each edge is a task and can be arbitrarily assigned to a processor.
  • The catch is that to process an edge, the processor needs to know the neighbors of each vertex on the edge.
  • If the edges are arbitrarily assigned, each processor needs a copy of the whole graph.

SLIDE 7

Reducing Communication

  • We want the edges assigned to each processor to touch as few vertices as possible.
  • We can approach the problem by grouping the vertices.
  • We partition the vertices into r = √P groups v_1, …, v_r and assign each processor a pair (v_i, v_j).
  • The processor assigned pair (v_i, v_j) is responsible for all edges going from a vertex in v_i to a vertex in v_j; a sketch of this assignment follows below.
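
Below is a minimal MPI sketch of this vertex-grouping scheme, not the authors' implementation: it assumes P is a perfect square, assigns vertices to groups by a simple id-mod-r rule (an assumption; the project orders vertices by degree), and uses a hard-coded toy edge list. The counting on each rank's edges is omitted.

```cpp
// Sketch: map each MPI rank to a pair of vertex groups (gi, gj) and keep only
// the edges that rank is responsible for. Illustrative assumptions throughout.
#include <mpi.h>
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <utility>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int P, pid;
    MPI_Comm_size(MPI_COMM_WORLD, &P);
    MPI_Comm_rank(MPI_COMM_WORLD, &pid);

    int r = (int)std::lround(std::sqrt((double)P));  // r = sqrt(P) vertex groups
    int gi = pid / r, gj = pid % r;                  // this rank owns the ordered pair (gi, gj)

    // Toy edge list; a real run would read the graph from a file.
    std::vector<std::pair<int, int>> edges = {{0,1}, {1,2}, {0,2}, {1,3}, {2,3}};

    auto group = [&](int v) { return v % r; };       // assumed group assignment

    // An undirected edge is owned by exactly one rank: the pair formed by the
    // group of its smaller endpoint and the group of its larger endpoint.
    std::vector<std::pair<int, int>> mine;
    for (auto [v, w] : edges) {
        int lo = std::min(v, w), hi = std::max(v, w);
        if (group(lo) == gi && group(hi) == gj) mine.push_back({lo, hi});
    }
    std::printf("rank %d owns group pair (%d,%d) and %zu edges\n",
                pid, gi, gj, mine.size());

    // Next step (omitted): gather the adjacency lists of groups gi and gj and
    // run the sequential merge-based count over this rank's edges.
    MPI_Finalize();
    return 0;
}
```

With this convention, each rank only ever needs the adjacency lists of its two vertex groups, which is what the cost analysis on the next slide charges for.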

SLIDE 8

Cost Analysis

  • In the average case, each processor gets a 1/P fraction of the edges (m/P edges), so we expect near perfect speedup.
  • Each processor gets the adjacency lists of 2n/√P vertices, which on average have total size 2m/√P.

(Figure: 4 processors and 2 groups of vertices, P1 through P4; the thick arrows represent groups of edges.)
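
As a hedged aside, the 2m/√P figure is consistent with each edge being stored only in the adjacency list of its smaller endpoint, so that total adjacency storage is m; under that assumption the per-processor numbers work out as follows.

```latex
% Sketch of the expected per-processor load (my derivation under the stated
% storage assumption, not taken from the slides).
\documentclass{article}
\usepackage{amsmath}
\begin{document}
With $r=\sqrt{P}$ groups of $n/\sqrt{P}$ vertices each, the processor owning the pair
$(v_i, v_j)$ needs the adjacency lists of the $2n/\sqrt{P}$ vertices in its two groups.
If each of the $m$ edges is stored once, the average stored list length is $m/n$, so the
expected adjacency data per processor is
\[
  \frac{2n}{\sqrt{P}} \cdot \frac{m}{n} \;=\; \frac{2m}{\sqrt{P}},
\]
while the expected number of edges (tasks) per processor is $m/P$.
\end{document}
```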

SLIDE 9

Actual Speedups Matched Expectations

(Figure: running time in seconds vs. number of processors, for 1, 4, 9, 16, 25, 36, 49, and 64 processors.)

SLIDE 10

More Results

Graph      gplus    k5000    live    skitter
Speedup    17.23    12.72    1.95    3.98

SLIDE 11

Thank you!

Questions?

SLIDE 12

References

  • 1. A. Pavan, Kanat Tangwongsan, Srikanta Tirthapura, Kun-Lung Wu, “Counting and Sampling Triangles from a Graph Stream,” Proceedings of the VLDB Endowment, Volume 6, Issue 14, September 2013, pages 1870-1881.

  • 2. Shu-Hao Yu, YiCheng Qin, “15-418 Final Report,” http://www.cs.cmu.edu/afs/cs/user/shuhaoy/www/Final_Project.pdf