Exploring Scalable Implementations of Triangle Enumeration in Graphs

SLIDE 1

Exploring Scalable Implementations of Triangle Enumeration in Graphs of Diverse Densities: Apache-Spark vs. GPU

Travis Johnston, Stephen Herbein, and Michela Taufer Global Computing Laboratory University of Delaware

Travis Johnston, Stephen Herbein, and Michela Taufer Triangle Enumeration: Spark v. GPU 1

SLIDE 2

Introduction

Graphs are powerful tools for modeling.

Model social interaction:
  • Friendship graphs
  • Social networks
  • Collaboration/co-authorship graphs
  • Phone call graphs

Model computer networks:
  • WWW (pages linking to other pages)
  • WWW (hardware linking to other hardware)

Model data moving through a network:
  • Moving data from servers to users (WWW-hardware network)
  • Infectious disease moving through a social network

SLIDE 3

Introduction

What information can the structure of a graph convey?

Identify the most influential nodes:
  • Personalities with many Twitter followers, e.g. Katy Perry, Justin Bieber, Taylor Swift, and Barack Obama
  • Prolific authors/collaborators, e.g. Paul Erdős, with ≥ 500 collaborators and ≥ 1525 papers
  • Important web pages, e.g. get.adobe.com/reader/, cnn.com, and google.com

Identify communities:
  • Friends with similar interests
  • Websites with similar topics
  • Criminal networks

SLIDE 4

Introduction

Why triangle enumeration?
  • Used to calculate local clustering coefficient
  • Used to compute transitivity ratio
  • Directly applicable in spam detection and web link recommendation

Finding triangles in graphs is a classic theoretical problem with numerous practical applications. The recent explosion of work on social networks has led to a great interest in fast algorithms to find triangles in graphs. The social sciences and physics communities often study triangles in real networks and use them to reason about underlying social processes. ... Triangle enumeration is also a fundamental subroutine for other more complex algorithmic tasks. [1]

[1] http://www.cs.princeton.edu/~csesha/pubs/conf-triangle-enum.pdf
[2] http://people.seas.harvard.edu/~babis/int-math-triangles.pdf
[3] http://arxiv.org/abs/0904.3761

SLIDE 6

Goal and Contributions

Our goal: Study the efficiency of highly parallel algorithms for triangle enumeration on two parallel architectures

Our contributions:

Present two algorithmic implementations of Triangle Enumeration:
  • Triangle Enumeration via matrix multiplication on GPU
  • Triangle Enumeration via MapReduce using Apache-Spark

Critically compare the performance on two graph models:
  • Erdős-Rényi (ER) random graph model
  • Preferential attachment model

SLIDE 9

What is a Graph?

Definition

A graph G = (V, E) contains a set of vertices V and a set of edges E. Each edge e ∈ E is a set of two (distinct) vertices, e = {i, j}.

[Figure: a graph with its vertices and edges labeled, including an edge e joining vertices i and j.]

SLIDE 10

What is a Triangle?

Definition

Three vertices form a triangle if each pair of vertices shares an edge.

[Figure: example graph on vertices 1 to 6.]

This graph contains 2 triangles (blue).
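The definition can be checked directly by testing every vertex triple. The sketch below is a minimal brute-force illustration, not either of the implementations compared in this talk; the function name `enumerate_triangles` and the edge list are hypothetical (the edges of the graph drawn on the slide are not recoverable, so a different 6-vertex graph with 2 triangles is used).

```python
from itertools import combinations

def enumerate_triangles(edges):
    """Return all triangles as sorted vertex triples (brute force over triples)."""
    # Store each undirected edge {i, j} as a frozenset for order-free lookup.
    edge_set = {frozenset(e) for e in edges}
    vertices = sorted({v for e in edges for v in e})
    return [
        (a, b, c)
        for a, b, c in combinations(vertices, 3)
        # A triple is a triangle only if all three of its pairs are edges.
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edge_set
    ]

# Hypothetical 6-vertex graph with 7 edges (an assumption, not the slide's graph).
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
triangles = enumerate_triangles(edges)
print(triangles)
```

Checking all triples costs on the order of n^3 lookups, which is why the rest of the talk looks at formulations that parallelize well.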

SLIDE 19

Triangle Enumeration via Matrix Multiplication (GPU)

Definition

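The Adjacency Matrix of a graph is a matrix $A_{n,n} = [a_{ij}]$ where $a_{ij} = 1$ if vertex i is adjacent to vertex j, and $a_{ij} = 0$ otherwise.

[Figure: the example graph on vertices 1 to 6 and its 6 × 6 adjacency matrix A, filled in edge by edge; each edge {i, j} contributes the two entries $a_{ij} = a_{ji} = 1$.]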
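As a small illustration of the definition, the following sketch builds an adjacency matrix as a list of rows in plain Python. The helper name `adjacency_matrix` and the 6-vertex edge list are assumptions made for this example; the entries of the matrix shown on the slide are not recoverable.

```python
def adjacency_matrix(n, edges):
    """Build the n x n adjacency matrix of a simple graph on vertices 1..n."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        # Each undirected edge {i, j} sets two symmetric entries.
        A[i - 1][j - 1] = 1
        A[j - 1][i - 1] = 1
    return A

# Hypothetical 6-vertex graph with 7 edges (an assumption for illustration).
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
A = adjacency_matrix(6, edges)
for row in A:
    print(row)
```

For a simple graph the matrix is symmetric with a zero diagonal, and the number of 1 entries is twice the number of edges.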

SLIDE 21

Triangle Enumeration via Matrix Multiplication (GPU)

Theorem

If A is the adjacency matrix of a simple graph G, then the $(i, j)$th entry of $A^k$ is the number of walks on k edges beginning at vertex i and ending at vertex j.

Corollary

If A is the adjacency matrix of a simple graph G and $A^3 = [a_{ij}]$, then the number of triangles in G is $\frac{1}{6}\sum_{i} a_{ii} = \frac{\operatorname{tr}(A^3)}{6}$.
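The factor of 6 arises because every triangle produces six closed walks of length 3: three choices of starting vertex times two directions. A minimal pure-Python sketch of the corollary follows; `matmul` and `count_triangles` are hypothetical helper names, and the 4-vertex matrix is an assumed example (a triangle {1, 2, 3} with a pendant vertex 4 attached to vertex 3).

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def count_triangles(A):
    """Count triangles of a simple graph as tr(A^3) / 6."""
    A3 = matmul(matmul(A, A), A)
    closed_walks = sum(A3[i][i] for i in range(len(A)))
    # Each triangle is counted 6 times: 3 start vertices x 2 directions.
    return closed_walks // 6

# Assumed example: triangle {1, 2, 3} plus the edge {3, 4}.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
print(count_triangles(A))
```

Note that the trace yields the number of triangles, not the triangles themselves; the diagonal entry for vertex i counts the closed walks through i, so half of it is the number of triangles containing i.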

SLIDE 24

Triangle Enumeration via Matrix Multiplication (GPU)

cuBLAS is a CUDA implementation of the BLAS library for GPUs.

Algorithm:
  • Construct the adjacency matrix A
  • Copy A to the device (data movement)
  • Compute $A^3$ using matrix multiplication (gemm)
  • Sum the diagonal entries of $A^3$ (divide by 6)

Advantages:
  • Easy to use (library function call, twice)
  • A single GPU can execute many parallel threads (≥ 1000s)

Disadvantages:
  • Data movement from host to device can be expensive
  • Shared memory per thread is relatively small on GPU
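The memory cost of the dense approach can be estimated with simple arithmetic. The sketch below rests on assumptions that are not stated on the slides: entries are stored in single precision (4 bytes), three n × n buffers (A, A^2 and A^3) are resident on the device at once, and the device has 6 × 10^9 bytes available. The names `device_bytes` and `max_vertices` are hypothetical.

```python
import math

BYTES_PER_ENTRY = 4  # assumption: single-precision (float32) storage
BUFFERS = 3          # assumption: A, A^2 and A^3 all resident on the device

def device_bytes(n):
    """Bytes needed to hold the dense n x n buffers on the device."""
    return BUFFERS * BYTES_PER_ENTRY * n * n

def max_vertices(device_memory_bytes):
    """Largest n whose buffers fit in the given amount of device memory."""
    return math.isqrt(device_memory_bytes // (BUFFERS * BYTES_PER_ENTRY))

print(device_bytes(16000) / 1e9)   # footprint in GB for n = 16000
print(max_vertices(6 * 10**9))     # largest n under the 6 GB assumption
```

Under these assumptions the footprint depends only on the number of vertices, not on the number of edges, so a sparse graph costs as much as a dense one of the same size.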

SLIDE 27

Triangle Enumeration via MapReduce (Spark)

Spark is a framework designed for in-memory, fault-tolerant computing.

Algorithm:
  • (map) Each vertex v emits edges containing v
  • (map) Each vertex v emits angles (potential triangles)
  • (reduce) Combine edges and angles to form triangles

Advantages:
  • Can take advantage of all memory available to node
  • Agnostic to the number of nodes/cores

Disadvantages:
  • Between map and reduce there can be an expensive data movement
  • Processing done on CPU limits parallelism

SLIDE 28

Triangle Enumeration via MapReduce (Spark)

[Figure: example graph on vertices 1 to 5.]

Phase I: Emit Edges (map)
If vertices i and j share an edge, then vertex i emits a KV pair ((i, j), #) following RULE 1.
RULE 1: If (i, j) is emitted as an edge then either deg(i) < deg(j), or deg(i) = deg(j) and i < j.

  1 → ((1, 2), #), ((1, 5), #)
  2 → ((2, 5), #)
  3 → ((3, 2), #), ((3, 4), #)
  4 → ((4, 5), #)
  5 →

SLIDE 29

Triangle Enumeration via MapReduce (Spark)

Phase II: Emit Angles (map)
Each vertex k emits a KV pair ((i, j), k) if k shares an edge with both i and j, following RULE 2.
RULE 2: If ((i, j), k) is emitted as an angle then (i, j) follows RULE 1 and either deg(k) < deg(i), or deg(k) = deg(i) and k < i.

  1 → ((2, 5), 1)
  2 →
  3 → ((2, 4), 3)
  4 →
  5 →

SLIDE 30

Triangle Enumeration via MapReduce (Spark)

Phase III: Combine by key (reduce)
Transform the two sets of KV pairs into a set of key/multi-value pairs. The results are pairs of the form ((i, j), L = [a, b, ...]).
  • If # ∈ L then the edge (i, j) completes the triangles {a, i, j}, {b, i, j}, ...
  • If # ∉ L then (i, j) is not an edge of the graph and so {a, i, j}, ... do not form triangles.
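The three phases can be simulated sequentially in plain Python to see how the keys line up. This is only a sketch of the logic, not the authors' Spark code: the function name `mapreduce_triangles`, the `rank` helper, and the use of in-memory dictionaries in place of Spark's shuffle are all assumptions. The edge list is the 5-vertex example from the Phase I slide. Applied literally, RULE 1 orders the key of the angle at vertex 3 as (4, 2), since deg(4) < deg(2), where the slide writes (2, 4); the result is unaffected because vertices 2 and 4 are not adjacent.

```python
from collections import defaultdict
from itertools import combinations

EDGE = "#"  # marker value carried by edge records

def mapreduce_triangles(edges):
    """Simulate the emit-edges / emit-angles / combine-by-key phases."""
    adj = defaultdict(set)
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    def rank(v):
        # Ordering used by RULE 1 and RULE 2: lower degree first, ties by id.
        return (len(adj[v]), v)

    pairs = []
    for v in adj:
        # Neighbors that come after v in the ordering, sorted by that ordering.
        higher = sorted((u for u in adj[v] if rank(v) < rank(u)), key=rank)
        # Phase I (map): v emits each edge (v, u) in which v precedes u.
        pairs += [((v, u), EDGE) for u in higher]
        # Phase II (map): v emits angles ((i, j), v) with v before i before j.
        pairs += [((i, j), v) for i, j in combinations(higher, 2)]

    # Phase III (reduce): group all values that share a key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)

    triangles = []
    for (i, j), values in grouped.items():
        if EDGE in values:  # the edge (i, j) exists, so each angle closes
            triangles += [tuple(sorted((k, i, j))) for k in values if k != EDGE]
    return sorted(triangles)

# Edges of the 5-vertex example graph from the slides.
edges = [(1, 2), (1, 5), (2, 5), (3, 2), (3, 4), (4, 5)]
print(mapreduce_triangles(edges))
```

Because each angle is emitted only by the lowest-ranked of its three vertices, every triangle is reported once, and a vertex of degree d emits at most d(d − 1)/2 angles.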

SLIDE 33

Triangle Enumeration via MapReduce (Spark)

Phase I:
  1 → ((1, 2), #), ((1, 5), #)
  2 → ((2, 5), #)
  3 → ((3, 2), #), ((3, 4), #)
  4 → ((4, 5), #)
  5 →

Phase II:
  1 → ((2, 5), 1)
  2 →
  3 → ((2, 4), 3)
  4 →
  5 →

Phase III:
  ((1, 2), [#])
  ((1, 5), [#])
  ((2, 5), [#, 1])
  ((2, 4), [3])
  ((3, 2), [#])
  ((3, 4), [#])
  ((4, 5), [#])

SLIDE 34

Triangle Enumeration via MapReduce (Spark)

Phase III: the pair ((2, 5), [#, 1]) contains #, so (2, 5) completes the {1, 2, 5} triangle.

SLIDE 35

Triangle Enumeration via MapReduce (Spark)

Phase III: the pair ((2, 4), [3]) does not contain #. (2, 4) would complete the {2, 3, 4} triangle, but (2, 4) is not an edge in the graph.

SLIDE 36

Graph Model 1: Erdős-Rényi (ER) Random Graphs

Two parameters: n (number of vertices) and p ∈ [0, 1] (a probability)

[Figure: a new vertex joining the n − 1 existing vertices.]

Each edge appears independently with probability p
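A graph from this model can be sampled with one coin flip per vertex pair. The sketch below is a minimal illustration using Python's standard `random` module; the function name `erdos_renyi` and the `seed` parameter are assumptions added for reproducibility, and this is not necessarily how the graphs in the experiments were generated.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Sample G(n, p): each possible edge appears independently with probability p."""
    rng = random.Random(seed)
    # Visit every unordered pair {i, j} once and keep it with probability p.
    return [(i, j) for i, j in combinations(range(1, n + 1), 2)
            if rng.random() < p]

edges = erdos_renyi(1000, 0.01, seed=0)
# The expected number of edges is p * C(n, 2) = 4995; a sample varies around it.
print(len(edges))
```

Flipping a coin for all n(n − 1)/2 pairs is quadratic in n, which is acceptable at the graph sizes used here.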

SLIDE 37

Graph Model 2: Preferential Attachment

Two parameters: n (number of vertices) and d (number of connections)

[Figure: a new vertex joining the n − 1 existing vertices.]

Choose d edges randomly, preferentially attaching to vertices with higher degree
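One common way to realize degree-proportional attachment is to draw uniformly from a list in which every vertex appears once per incident edge. The sketch below uses that technique; the function name `preferential_attachment`, the choice of a clique on d + 1 vertices as the starting graph, and the rule that a new vertex picks d distinct targets are all assumptions, since the slides do not specify these details.

```python
import random

def preferential_attachment(n, d, seed=None):
    """Grow a graph to n vertices; each new vertex attaches to d existing ones."""
    rng = random.Random(seed)
    # Assumption: the seed graph is a clique on vertices 1..d+1.
    edges = [(i, j) for i in range(1, d + 2) for j in range(i + 1, d + 2)]
    # Every vertex appears here once per incident edge, so a uniform draw
    # from this list is a draw proportional to degree.
    endpoints = [v for e in edges for v in e]
    for new in range(d + 2, n + 1):
        targets = set()
        while len(targets) < d:  # redraw until d distinct targets are chosen
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((t, new))
            endpoints += [t, new]
    return edges

edges = preferential_attachment(100, 4, seed=0)
print(len(edges))
```

With this construction the edge count is d(d + 1)/2 for the seed clique plus d per added vertex, which grows linearly in n.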

SLIDE 38

Graph Model Comparison

Erdős-Rényi Random Graph:
  • Vertex degree: $p(n - 1)$
  • Edges: $p\binom{n}{2} \sim n^2$ (dense)
  • Triangles: $p^3\binom{n}{3} \sim n^3$

Preferential Attachment:
  • Vertex degree: $2d$
  • Edges: $nd \sim n$ (sparse)
  • Triangles: $O(n^{1.5})$
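Plugging in the largest parameters from the experiments shows how far apart the two models are. The sketch below evaluates the expected Erdős-Rényi counts and the preferential-attachment edge estimate; the function names are hypothetical, and the Erdős-Rényi formulas are the standard expectations for G(n, p).

```python
from math import comb

def er_expected_counts(n, p):
    """Expected degree, edge count and triangle count in G(n, p)."""
    return {
        "degree": p * (n - 1),
        "edges": p * comb(n, 2),
        "triangles": p**3 * comb(n, 3),
    }

def pa_edge_estimate(n, d):
    """Approximate edge count of a preferential attachment graph (n * d)."""
    return n * d

print(er_expected_counts(16000, 0.16))
print(pa_edge_estimate(16000, 8))
```

At n = 16000 the densest Erdős-Rényi setting (p = .16) is expected to have roughly 20.5 million edges, against 128,000 for the densest preferential attachment setting (d = 8).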

SLIDE 39

Experimental Setup

Single Node:
  • 2x Intel Xeon E5520 processors
      • 4 physical cores, 8 logical cores each
      • 2.26 GHz
  • 24 GB RAM
  • 1x K20c GPU
      • 2688 CUDA cores
      • 6 GB RAM

Random Graphs: 5 random graphs with every pair of parameters:
  • n ∈ {1000, 1250, 1500, 1750, 2000, ..., 16000}
  • p ∈ {.01, .02, .04, .08, .16} (Erdős-Rényi)
  • d ∈ {2, 4, 6, 8} (Preferential attachment)

SLIDE 40

Results: (Dense) Erdős-Rényi Random Graph Model

[Figure: cuBLAS+GPU]

SLIDE 41

Results: (Dense) Erdős-Rényi Random Graph Model

[Figure: Spark]

SLIDE 42

Results: (Dense) Erdős-Rényi Random Graph Model

[Figure: cuBLAS+GPU and Spark]

SLIDE 44

Results: (Sparse) Preferential Attachment Model

[Figure: cuBLAS+GPU]

SLIDE 45

Results: (Sparse) Preferential Attachment Model

[Figure: Spark]

SLIDE 46

Results: (Sparse) Preferential Attachment Model

[Figure: Comparison of cuBLAS+GPU v. Spark]

SLIDE 49

Conclusions

We explored the performance of two triangle enumeration algorithms.

Our algorithm using the cuBLAS library + GPU:
  • Performance depended only on graph size (vertices)
  • Graph size bound* by memory on GPU
  • Larger graphs require the same data be moved between host and device more than once

A MapReduce algorithm tailored to Apache-Spark:
  • Performance depends primarily on the vertex degrees
  • Automatically shares memory from all nodes (and disk)

SLIDE 50

Acknowledgements

Global Computing Laboratory, circa 2015
