Why use GPUs for graph processing? (FOSDEM 2020)



SLIDE 1

Gunrock: High-Performance Graph Analytics for the GPU

Muhammad Osama — University of California, Davis

SLIDE 2

Why use GPUs for graph processing?

2 FOSDEM 2020

SLIDE 3

GPUs and Graphs

Graphs

  • Found everywhere

○ Road & social networks, web, etc.

  • Require fast processing

○ Memory bandwidth, computing power and GOOD software

  • Becoming very large

○ Billions of edges

  • Irregular data access patterns and control flow

○ Limits performance and scalability

GPUs

  • Found everywhere

○ Data center, desktops, mobiles, etc.

  • Very powerful

○ High memory bandwidth (900 GB/s) and computing power (15.7 TFLOPS)

  • Limited memory size

○ 32 GB per NVIDIA V100

  • Difficult to program

○ Harder to optimize
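
To make the tension between "billions of edges" and "32 GB per NVIDIA V100" concrete, here is a back-of-the-envelope sketch that estimates the size of a weighted graph stored in compressed sparse row (CSR) form. The 8-byte row offsets, 4-byte vertex IDs and 4-byte edge weights are assumptions chosen for illustration, not figures from the talk, and `csr_bytes` and `fits_on_device` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Estimated bytes for a weighted CSR graph (illustrative assumptions):
//   row offsets:    (num_vertices + 1) entries of 8 bytes
//   column indices: num_edges entries of 4 bytes
//   edge weights:   num_edges entries of 4 bytes
std::uint64_t csr_bytes(std::uint64_t num_vertices, std::uint64_t num_edges) {
  assert(num_vertices > 0);
  const std::uint64_t offsets = (num_vertices + 1) * 8;
  const std::uint64_t columns = num_edges * 4;
  const std::uint64_t weights = num_edges * 4;
  return offsets + columns + weights;
}

// True if the estimate fits in a device with `capacity_bytes` of memory.
bool fits_on_device(std::uint64_t num_vertices, std::uint64_t num_edges,
                    std::uint64_t capacity_bytes) {
  return csr_bytes(num_vertices, num_edges) <= capacity_bytes;
}
```

Under these assumptions, a graph with 5 billion edges needs about 40 GB for its edge arrays alone, which is more than a single 32 GB GPU can hold before any algorithm state is allocated.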

SLIDE 4

What is Gunrock?

SLIDE 5

A CUDA-based graph processing library that aims for performance

State-of-the-art graph processing library

SLIDE 6

A CUDA-based graph processing library that aims for generality

Covers a broad range of graph algorithms

SLIDE 7

A CUDA-based graph processing library that aims for programmability

Makes it easy to implement graph algorithms and extend them from a single GPU to multiple GPUs

SLIDE 8

A CUDA-based graph processing library that aims for scalability

Fits in (very) limited GPU memory space; performance scales when using many GPUs

SLIDE 9

Where can you find Gunrock?

SLIDE 10

gunrock.github.io

SLIDE 11

Project’s Workflow

  • Release (master) branch
  • Development (dev) branch
  • Git Forking Workflow
  • Contribute: GitHub Issues & Pull Requests
  • License: Apache 2.0
  • Code coverage: codecov.io
  • Continuous integration: jenkins.io
  • Documentation: slate & doxygen

SLIDE 12

Project’s Workflow (cont.)

Gunrock's Roadmap

SLIDE 13

Some Stats and Stuff! (as of 01/30/2020)

  • 32 Contributors (over 2500 commits)
  • ~600 stars, 148 forks
  • NVIDIA CUDA-X: GPU Accelerated Library
  • RAPIDS

SLIDE 14

How does Gunrock work?

SLIDE 15

Programming Model

  • Data-centric abstraction
  • Bulk-synchronous programming

SLIDE 16

Frontier

A frontier: a group of vertices or edges

Figure: an example graph G

Figure: a frontier of vertices of graph G
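
As a rough illustration of the idea, a frontier can be modeled as a plain list of active vertex or edge IDs. The sketch below is a CPU-side assumption for illustration only; `CsrGraph`, `Frontier` and `outgoing_edges` are hypothetical names and do not reflect Gunrock's actual data structures.

```cpp
#include <cassert>
#include <vector>

// A tiny directed graph in CSR form: the outgoing edges of vertex v are the
// edge IDs row_offsets[v] .. row_offsets[v + 1] - 1, and column_indices[e]
// is the destination of edge e.
struct CsrGraph {
  std::vector<int> row_offsets;
  std::vector<int> column_indices;
};

enum class FrontierKind { Vertices, Edges };

// A frontier is a list of active vertex IDs or active edge IDs.
struct Frontier {
  FrontierKind kind;
  std::vector<int> ids;
};

// Turn a vertex frontier into the edge frontier of its outgoing edges.
Frontier outgoing_edges(const CsrGraph& g, const Frontier& vertices) {
  assert(vertices.kind == FrontierKind::Vertices);
  Frontier edges{FrontierKind::Edges, {}};
  for (int v : vertices.ids) {
    for (int e = g.row_offsets[v]; e < g.row_offsets[v + 1]; ++e) {
      edges.ids.push_back(e);
    }
  }
  return edges;
}
```

For a graph with edges 0→1, 0→2, 1→3 and 2→3, the vertex frontier {0} maps to the edge frontier {0, 1}, the IDs of the two edges leaving vertex 0.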

SLIDE 17

Parallel Operators

A parallel operator is a manipulation of frontiers:

  • Advance
  • Filter
  • For
  • Intersection
  • Neighbor-Reduce
  • … and more.

Advance generates a new frontier by visiting the neighbors of the current frontier.

Figure: illustration of the advance operator
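
The two operators used later in the talk, advance and filter, can be sketched sequentially on the CPU as follows. This is a minimal illustration of the concept under assumed names (`CsrGraph`, `advance`, `filter`); it is not Gunrock's implementation, which runs these steps in parallel on the GPU.

```cpp
#include <cassert>
#include <vector>

// CSR graph: edges of vertex v are row_offsets[v] .. row_offsets[v + 1] - 1.
struct CsrGraph {
  std::vector<int> row_offsets;
  std::vector<int> column_indices;
};

// Advance (vertex to vertex): visit every outgoing edge of the input frontier
// and keep the destination when op(source, destination, edge) returns true.
template <typename Op>
std::vector<int> advance(const CsrGraph& g, const std::vector<int>& frontier,
                         Op op) {
  std::vector<int> output;
  for (int src : frontier) {
    assert(src >= 0 && src + 1 < static_cast<int>(g.row_offsets.size()));
    for (int e = g.row_offsets[src]; e < g.row_offsets[src + 1]; ++e) {
      int dst = g.column_indices[e];
      if (op(src, dst, e)) output.push_back(dst);
    }
  }
  return output;
}

// Filter: keep only the frontier elements for which pred returns true.
template <typename Pred>
std::vector<int> filter(const std::vector<int>& frontier, Pred pred) {
  std::vector<int> output;
  for (int v : frontier) {
    if (pred(v)) output.push_back(v);
  }
  return output;
}
```

Advance may emit the same vertex more than once when several frontier vertices share a neighbor; a filter with a "seen" predicate is one way to remove such duplicates before the next step.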

SLIDE 18

Bulk-Synchronous Programming

Serial loop until convergence

Series of parallel operations separated by global barriers

Parallel advance operator
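
The loop structure can be sketched with a breadth-first search written in bulk-synchronous style. The code below is a sequential CPU stand-in, assuming a CSR graph and a hypothetical `bfs_depths` function: each pass over the current frontier represents one parallel operator, and swapping the two frontier buffers plays the role of the global barrier.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// CSR graph: edges of vertex v are row_offsets[v] .. row_offsets[v + 1] - 1.
struct CsrGraph {
  std::vector<int> row_offsets;
  std::vector<int> column_indices;
};

// Returns the BFS depth of every vertex from `source` (-1 if unreachable).
std::vector<int> bfs_depths(const CsrGraph& g, int source) {
  const int n = static_cast<int>(g.row_offsets.size()) - 1;
  assert(source >= 0 && source < n);
  std::vector<int> depth(n, -1);  // -1 marks an unvisited vertex
  std::vector<int> current{source}, next;
  depth[source] = 0;
  int iteration = 0;
  while (!current.empty()) {  // serial loop until convergence
    ++iteration;
    next.clear();
    for (int src : current) {  // this pass would run in parallel on a GPU
      for (int e = g.row_offsets[src]; e < g.row_offsets[src + 1]; ++e) {
        int dst = g.column_indices[e];
        if (depth[dst] == -1) {
          depth[dst] = iteration;
          next.push_back(dst);
        }
      }
    }
    std::swap(current, next);  // "barrier": the next step sees finished work only
  }
  return depth;
}
```

The loop converges when an iteration produces an empty frontier, meaning that no operator discovered new work.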

SLIDE 19

Gunrock Algorithms

  • A* Search
  • Betweenness Centrality
  • Breadth-First Search
  • Connected Components
  • Graph Coloring
  • Geolocation
  • RMAT Graph Generator
  • Graph Trend Filtering
  • Graph Projections
  • Random Walk
  • Hyperlink-Induced Topic Search
  • K-Nearest Neighbors
  • Louvain Modularity
  • Label Propagation
  • MaxFlow
  • Minimum Spanning Tree
  • PageRank
  • Local Graph Clustering
  • GraphSAGE
  • Stochastic Approach for Link-Structure Analysis
  • Subgraph Matching
  • Shared Nearest Neighbors
  • Scan Statistics
  • Single Source Shortest Path
  • Triangle Counting
  • Top K
  • Vertex Nomination
  • Who To Follow

SLIDE 20

Example application in Gunrock.

SLIDE 21

Single-Source Shortest Path

Implement the advance and filter C++ lambdas for SSSP

{complete code}

// Advance: relax one edge and keep the neighbor only if its distance improved.
auto advance_op = [distances, weights] __host__ __device__ (...) -> bool {
  auto distance = distances[vertex_id] + weights[edge_id];
  // atomicMin stores the smaller value and returns the previous one.
  auto old_distance = atomicMin(distances + neighbor_id, distance);
  if (distance < old_distance) return true;
  return false;
};

// Filter: drop invalid entries from the new frontier.
auto filter_op = [labels, iteration] __host__ __device__ (...) -> bool {
  if (!util::isValid(neighbor_id)) return false;
  return true;
};
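
The advance lambda relies on `atomicMin` returning the value that was stored before the update. For readers without a GPU at hand, the following is a CPU analogue built on `std::atomic`; the helper names `atomic_min` and `relax` are hypothetical, and the sketch only mirrors the logic of the slide rather than reproducing Gunrock code.

```cpp
#include <atomic>
#include <cassert>

// CPU analogue of CUDA's atomicMin: store min(target, value) and return the
// value that was there before, so the caller can tell whether it improved.
int atomic_min(std::atomic<int>& target, int value) {
  int old = target.load();
  // Retry while our value is still smaller and the exchange did not succeed;
  // compare_exchange_weak reloads `old` with the current value on failure.
  while (value < old && !target.compare_exchange_weak(old, value)) {
  }
  return old;
}

// Relaxation step in the spirit of the advance lambda: returns true when the
// path through the source vertex shortens the neighbor's distance.
bool relax(std::atomic<int>& neighbor_distance, int source_distance,
           int weight) {
  assert(weight >= 0);
  int distance = source_distance + weight;
  int old_distance = atomic_min(neighbor_distance, distance);
  return distance < old_distance;
}
```

Returning the old value is what lets many threads relax edges into the same neighbor concurrently: exactly the threads that lowered the distance report `true`, so the neighbor enters the next frontier only when its distance actually changed.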

SLIDE 22

Single-Source Shortest Path

Launch the lambdas within the operator call

{complete code}

while (!frontier.isEmpty()) {
  oprtr::Advance<oprtr::OprtrType_V2V>(
      graph.csr(), frontier, oprtr_parameters, advance_op, filter_op);
}
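
Putting the pieces together, the whole algorithm can be sketched as a sequential frontier loop on the CPU. `WeightedCsr` and `sssp` are hypothetical names, the weights are assumed to be non-negative integers, and the inner loops stand in for the parallel advance call above; this is an illustration of the control flow, not Gunrock's implementation.

```cpp
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Weighted CSR graph: weights[e] belongs to edge e.
struct WeightedCsr {
  std::vector<int> row_offsets;
  std::vector<int> column_indices;
  std::vector<int> weights;
};

// Frontier-based single-source shortest path. Unreachable vertices keep the
// maximum int value as their distance.
std::vector<int> sssp(const WeightedCsr& g, int source) {
  const int n = static_cast<int>(g.row_offsets.size()) - 1;
  assert(source >= 0 && source < n);
  std::vector<int> distances(n, std::numeric_limits<int>::max());
  distances[source] = 0;
  std::vector<int> frontier{source}, next;
  while (!frontier.empty()) {  // loop until no distance changes
    next.clear();
    for (int v : frontier) {   // advance: relax every outgoing edge
      for (int e = g.row_offsets[v]; e < g.row_offsets[v + 1]; ++e) {
        int neighbor = g.column_indices[e];
        int distance = distances[v] + g.weights[e];
        if (distance < distances[neighbor]) {
          distances[neighbor] = distance;
          next.push_back(neighbor);  // improved vertices form the new frontier
        }
      }
    }
    std::swap(frontier, next);
  }
  return distances;
}
```

A vertex can enter the frontier several times if its distance improves more than once, which is why the loop runs until the frontier is empty rather than for a fixed number of iterations.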

SLIDE 23

Acknowledgements & Thanks!

  • NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics.
  • Department of Defense Advanced Research Projects Agency (DARPA). SYMPHONY: Orchestrating Sparse and Dense Data for Efficient Computation. Award HR0011-18-3-0007.
  • Department of Defense Advanced Research Projects Agency (DARPA). A Commodity Performance Baseline for HIVE Graph Applications. Award FA8650-18-2-7835.
  • Adobe Data Science Research Award. Scalability and Mutability for Large Streaming Graph Problems on the GPU.
  • National Science Foundation (Award OAC-1740333). SI2-SSE: Gunrock: High-Performance GPU Graph Analytics.
  • National Science Foundation (Award CCF-1637442). Theory and implementation of dynamic data structures for the GPU. Program: AitF (Algorithms in the Field).
  • National Science Foundation (Award CCF-1629657). PARAGRAPH: Parallel, Scalable Graph Analytics. Program: XPS (Exploiting Parallelism & Scalability).
  • Department of Defense Advanced Research Projects Agency (DARPA) SBIR SB152-004. Many-Core Acceleration of Common Graph Programming Frameworks. Phase II: award W911NF-16-C-0020.
  • Department of Defense Advanced Research Projects Agency (DARPA) STTR ST13B-004 (“Data-Parallel Analytics on Graphics Processing Units (GPUs)”). A High-Level Operator Abstraction for GPU Graph Analytics. Awards D14PC00023 and D15PC00010.
  • Department of Defense (XDATA program). An XDATA Architecture for Federated Graph Models and Multi-Tier Asymmetric Computing. Oct. 2012 to Sept. 2017. Prime contractor: Sotera Defense Solutions, Inc., US Army award W911QX-12-C-0059.
