Latest Development of Gunrock: a Graph Processing Library on GPUs
Yuechao Pan, with the Gunrock team
ychpan@ucdavis.edu
GTC 2018, 28 March 2018, San Jose, California, U.S.A.
https://gunrock.github.io

What is the Gunrock Library?
A CUDA-based graph processing library that aims for:
● Generality: covers a broad range of graph algorithms
● Performance: maintains good performance
● Programmability: makes it easy to implement graph algorithms; extends to multi-GPUs as simply as possible
● Scalability: fits in (very) limited GPU memory space; performance scales when using more GPUs
Gunrock Updates @ GTC 2018, 28 March 2018 | 2

Programming Model
A generic graph algorithm:
  A group of V or E
  -> Do something -> Resulting group of V or E
  -> Do something -> Another resulting group of V or E
  (loop until convergence)
Data-centric abstraction
- Operations are defined on a group of vertices or edges ≝ a frontier
=> Operations = manipulations of frontiers
Bulk-synchronous programming
- Operations are done one by one, in order
- Within a single operation, computing on multiple elements can be done in parallel, without order
Yangzihao Wang, Yuechao Pan, Andrew Davidson, Yuduo Wu, Carl Yang, Leyuan Wang, Muhammad Osama, Chenshan Yuan, Weitang Liu, Andy T. Riffel, and John D. Owens. “Gunrock: GPU Graph Analytics”, TOPC 4(1), 2017
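The frontier loop described above can be sketched on the CPU as a breadth-first search. This is only an illustration of the data flow, not Gunrock code: `CsrGraph` and `bfs_depths` are hypothetical names, and the sequential inner loops stand in for work that a GPU would do in parallel within one bulk-synchronous step.

```cpp
#include <vector>

// Hypothetical CSR graph: the neighbors of v are
// col_indices[row_offsets[v]] .. col_indices[row_offsets[v + 1] - 1].
struct CsrGraph {
  std::vector<int> row_offsets;
  std::vector<int> col_indices;
  int num_vertices() const { return static_cast<int>(row_offsets.size()) - 1; }
};

// Frontier-based BFS: every pass of the outer loop is one operation that
// turns the current group of vertices into the resulting group.
std::vector<int> bfs_depths(const CsrGraph &g, int source) {
  std::vector<int> depth(g.num_vertices(), -1);
  std::vector<int> frontier{source};
  depth[source] = 0;
  for (int level = 1; !frontier.empty(); ++level) {
    std::vector<int> next;
    // Elements of the frontier are independent; order does not matter.
    for (int v : frontier) {
      for (int e = g.row_offsets[v]; e < g.row_offsets[v + 1]; ++e) {
        int u = g.col_indices[e];
        if (depth[u] == -1) {
          depth[u] = level;
          next.push_back(u);
        }
      }
    }
    frontier.swap(next);  // the resulting group becomes the next input
  }
  return depth;
}
```

The loop ends when the frontier is empty, which plays the role of "loop until convergence" on the slide.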

How to Write a Graph Primitive with Gunrock?
=> Section S8586, Writing Graph Primitives with Gunrock
Key items for a graph primitive / app:
● Problem: data used by the algorithm
● Enactor: operations on the data
● App: higher level routines
● Test: CPU reference and result verification

New APIs
Problem = App. specific data
Graph loader / generator
Test Driver:
  1 Run = 1 set of parameters + 1 Enact()
  1 Experiment = multiple runs with different parameters
External interfaces:
  app.: callable from external
  graph: external data structures (e.g. GoAI)
  operator: calls external operators (e.g. GraphBLAS)
[Diagram layers: Application, Iteration-loop, Iteration, Operator, Frontier, Graph]

New APIs - oprtr::Advance
Advance: visit neighbor lists

    cudaError_t gunrock::oprtr::Advance<FLAG>(  // type (V2V, V2E, etc.) and
                                                // option (Idempotence, Mark_Preds, ...)
        graph,             // graph representation
        input_frontier,    // input set of elements
        output_frontier,   // output set of elements
        oprtr_parameters,  // operator parameters (stream, etc.)
        advance_op,        // per-element advance lambda
        filter_op)         // per-element filter lambda (optional)

● Only 7 parameters, down from 20+
● Interface independent of graph representations
  => App. implementation isolated from graph representations
  => Operator will select a suitable implementation based on the given graph representation(s)
● Advance and filter operators share the same interface
● Lambda operator signatures are fixed for advance and filter
● Merged Cond. and Apply functors in older API

New APIs - oprtr::Advance
Example: SSSP advance

    auto advance_op = [distances, weights, preds] __host__ __device__ (
        const VertexT &src, VertexT &dest, const SizeT &edge_id,
        const VertexT &input_item, const SizeT &input_pos,
        SizeT &output_pos) -> bool
    {
        ValueT src_distance = Load<cub::LOAD_CG>(distances + src);
        ValueT edge_weight = Load<cub::LOAD_CS>(weights + edge_id);
        ValueT new_distance = src_distance + edge_weight;
        // atomicMin returns the old value; no improvement means no output
        if (new_distance >= atomicMin(distances + dest, new_distance))
            return false;
        Store(preds + dest, src);
        return true;
    };

    // Call the advance operator, using the advance operation
    oprtr::Advance<oprtr::OprtrType_V2V>(
        graph.csr(), frontier.V_Q(), frontier.Next_V_Q(),
        oprtr_parameters, advance_op, filter_op);
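To make the role of the advance lambda concrete, here is a host-only sketch of a vertex-to-vertex advance over CSR, driven by the same relaxation logic as the SSSP example. It is an approximation under stated assumptions: `CsrGraph`, `advance`, and `sssp` are hypothetical names, the lambda takes only three of the six arguments shown on the slide, and a plain comparison replaces `atomicMin` because the loop is sequential.

```cpp
#include <limits>
#include <vector>

// Hypothetical weighted CSR graph for this sketch.
struct CsrGraph {
  std::vector<int> row_offsets;
  std::vector<int> col_indices;
  std::vector<float> edge_weights;
  int num_vertices() const { return static_cast<int>(row_offsets.size()) - 1; }
};

// Visit the neighbor list of every input vertex; keep a destination in the
// output frontier only when advance_op returns true for that edge.
template <typename AdvanceOp>
std::vector<int> advance(const CsrGraph &g,
                         const std::vector<int> &input_frontier,
                         AdvanceOp advance_op) {
  std::vector<int> output_frontier;
  for (int src : input_frontier) {
    for (int e = g.row_offsets[src]; e < g.row_offsets[src + 1]; ++e) {
      int dest = g.col_indices[e];
      if (advance_op(src, dest, e)) output_frontier.push_back(dest);
    }
  }
  return output_frontier;
}

// SSSP driven by repeated advances; assumes no negative-weight cycles.
std::vector<float> sssp(const CsrGraph &g, int source, std::vector<int> &preds) {
  std::vector<float> distances(g.num_vertices(),
                               std::numeric_limits<float>::infinity());
  preds.assign(g.num_vertices(), -1);
  distances[source] = 0.0f;
  auto relax = [&](int src, int dest, int edge_id) {
    float new_distance = distances[src] + g.edge_weights[edge_id];
    if (new_distance >= distances[dest]) return false;  // no improvement
    distances[dest] = new_distance;
    preds[dest] = src;
    return true;
  };
  std::vector<int> frontier{source};
  while (!frontier.empty()) frontier = advance(g, frontier, relax);
  return distances;
}
```

Because the application only supplies `relax`, the traversal code in `advance` could be swapped for another graph representation without touching the SSSP logic, which is the isolation the slides describe.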

New APIs - oprtr::Filter
Filter: select and reorganize

    cudaError_t gunrock::oprtr::Filter<FLAG>(  // type (V2V, V2E, etc.) and
                                               // option (Idempotence, Mark_Preds, ...)
        graph,             // graph representation
        input_frontier,    // input set of elements
        output_frontier,   // output set of elements
        oprtr_parameters,  // operator parameters (stream, etc.)
        advance_op,        // per-element advance lambda (optional)
        filter_op)         // per-element filter lambda

Example: SSSP filter

    auto filter_op = [labels, iteration] __host__ __device__ (
        const VertexT &src, VertexT &dest, const SizeT &edge_id,
        const VertexT &input_item, const SizeT &input_pos,
        SizeT &output_pos) -> bool
    {
        if (!util::isValid(dest)) return false;
        if (labels[dest] == iteration) return false;
        labels[dest] = iteration;
        return true;
    };

    // Call the filter operator, using the filter operation
    oprtr::Filter<oprtr::OprtrType_V2V>(
        graph.csr(), frontier.V_Q(), frontier.Next_V_Q(),
        oprtr_parameters, filter_op);
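The filter step can be pictured as a stream compaction with a per-element predicate. The following host-only sketch mirrors the SSSP filter above; `filter` and `unique_valid` are hypothetical names, and a negative vertex id stands in for whatever `util::isValid` rejects, which is an assumption made for this example.

```cpp
#include <vector>

// Keep only the elements for which filter_op returns true, preserving order.
template <typename FilterOp>
std::vector<int> filter(const std::vector<int> &input_frontier,
                        FilterOp filter_op) {
  std::vector<int> output_frontier;
  for (int v : input_frontier) {
    if (filter_op(v)) output_frontier.push_back(v);
  }
  return output_frontier;
}

// Drop invalid entries and duplicates: a vertex already labeled with the
// current iteration has been seen in this pass and is removed.
std::vector<int> unique_valid(const std::vector<int> &input_frontier,
                              std::vector<int> &labels, int iteration) {
  return filter(input_frontier, [&labels, iteration](int dest) {
    if (dest < 0) return false;                   // assumed "invalid" marker
    if (labels[dest] == iteration) return false;  // duplicate in this pass
    labels[dest] = iteration;
    return true;
  });
}
```

On a GPU two threads can race on the same label, so a parallel filter may let an occasional duplicate through; the sequential sketch does not model that.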

New APIs - Compute Operators
Compute: per-element operation

    cudaError_t gunrock::util::Array1D<...>::ForEach(
        compute_op,    // per-element computation lambda (w/o pos)
        num_elements,  // number of elements
        target,        // where to perform the computation, CPU or GPU
        stream)        // cudaStream

    cudaError_t gunrock::util::Array1D<...>::ForAll(
        compute_op,    // per-element computation lambda (with pos)
        num_elements,  // number of elements
        target,        // where to perform the computation
        stream)        // cudaStream

● No need to write a kernel for simple operations
● Target independent => same code works on either CPU or GPU

Example: PR reset

    rank_next.ForEach([normalize, delta] __host__ __device__ (ValueT &rank)
    {
        rank = normalize ? (ValueT)0.0 : (ValueT)(1.0 - delta);
    }, graph.nodes, target, stream);

    degrees.ForAll([graph] __host__ __device__ (SizeT *degrees, const SizeT &pos)
    {
        degrees[pos] = graph.GetNeighborListLength(pos);
    }, graph.nodes, target, stream);
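The difference between the two compute operators is what the lambda receives: one element by reference, or the whole array plus a position. A minimal host-only sketch follows, assuming a hypothetical `Array1D` wrapper around `std::vector`; the `target` and `stream` arguments of the real interface are omitted here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-only stand-in for an array with compute operators.
template <typename T>
struct Array1D {
  std::vector<T> data;

  // ForEach: the lambda sees one element at a time, without its position.
  template <typename Op>
  void ForEach(Op op, std::size_t num_elements) {
    for (std::size_t i = 0; i < num_elements; ++i) op(data[i]);
  }

  // ForAll: the lambda sees the raw array and the position to work on.
  template <typename Op>
  void ForAll(Op op, std::size_t num_elements) {
    for (std::size_t i = 0; i < num_elements; ++i) op(data.data(), i);
  }
};

// PageRank-style reset: every rank gets the same initial value.
void reset_ranks(Array1D<double> &rank_next, bool normalize, double delta) {
  rank_next.ForEach(
      [normalize, delta](double &rank) { rank = normalize ? 0.0 : 1.0 - delta; },
      rank_next.data.size());
}

// Degree computation needs the position to index into the CSR offsets.
void fill_degrees(Array1D<int> &degrees, const std::vector<int> &row_offsets) {
  degrees.ForAll(
      [&row_offsets](int *deg, std::size_t pos) {
        deg[pos] = row_offsets[pos + 1] - row_offsets[pos];
      },
      degrees.data.size());
}
```

`ForEach` suits uniform initialization, while `ForAll` is needed whenever the result depends on the index, as in the degree example.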

New APIs - Graph Primitives / Apps

    template <typename GraphT, … , ProblemFlag FLAG>
    struct Problem : ProblemBase<GraphT, FLAG>
    {
        Problem(util::Parameters &parameters, ProblemFlag flag = Problem_None);
        cudaError_t Init(GraphT &graph, util::Location target);
        cudaError_t Reset(src, target);
        cudaError_t Extract(distances, preds, target);
        cudaError_t Release(target);
    };

    template <typename Problem, … >
    struct Enactor : public EnactorBase< … >
    {
        Enactor();
        cudaError_t Init(Problem &problem, target);
        cudaError_t Reset(src, target);
        cudaError_t Enact(src, target);
        cudaError_t Release(target);
    };

Template: data types & option switches
util::Parameters: running parameters
src, distances, preds: algorithm specific inputs
target: where to do the action
Init: initialization, only do once
Reset: data / status reset, do for each run
Enact: invoke the algorithm implementation
Extract: get back the results
Release: clean-up
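The call order implied by these roles (Init once, Reset and Enact per run, Extract per run, Release at the end) can be sketched with a toy host-only SSSP. Everything here is hypothetical and simplified: the structs are not the Gunrock classes, there is no `target` argument or error code, and `Enact` takes no source because `Reset` already stored it in the frontier.

```cpp
#include <limits>
#include <vector>

// Hypothetical weighted CSR graph for this sketch.
struct Graph {
  std::vector<int> row_offsets;
  std::vector<int> col_indices;
  std::vector<float> edge_weights;
  int num_vertices() const { return static_cast<int>(row_offsets.size()) - 1; }
};

// Problem: owns the data used by the algorithm.
struct Problem {
  const Graph *graph = nullptr;
  std::vector<float> distances;
  void Init(const Graph &g) {  // once: bind the graph, allocate
    graph = &g;
    distances.resize(g.num_vertices());
  }
  void Reset(int src) {  // per run: restore the initial state
    distances.assign(distances.size(), std::numeric_limits<float>::infinity());
    distances[src] = 0.0f;
  }
  void Extract(std::vector<float> &out) const { out = distances; }
  void Release() { distances.clear(); graph = nullptr; }
};

// Enactor: owns the operations performed on the Problem data.
struct Enactor {
  Problem *problem = nullptr;
  std::vector<int> frontier;
  void Init(Problem &p) { problem = &p; }
  void Reset(int src) { frontier.assign(1, src); }
  void Enact() {  // relax edges until the frontier is empty
    const Graph &g = *problem->graph;
    std::vector<float> &dist = problem->distances;
    while (!frontier.empty()) {
      std::vector<int> next;
      for (int v : frontier)
        for (int e = g.row_offsets[v]; e < g.row_offsets[v + 1]; ++e) {
          int u = g.col_indices[e];
          float d = dist[v] + g.edge_weights[e];
          if (d < dist[u]) { dist[u] = d; next.push_back(u); }
        }
      frontier.swap(next);
    }
  }
  void Release() { frontier.clear(); problem = nullptr; }
};

// One experiment = several runs that share a single Init and Release.
std::vector<std::vector<float>> run_experiment(const Graph &g,
                                               const std::vector<int> &sources) {
  Problem problem;
  Enactor enactor;
  problem.Init(g);
  enactor.Init(problem);
  std::vector<std::vector<float>> results;
  for (int src : sources) {
    problem.Reset(src);
    enactor.Reset(src);
    enactor.Enact();
    std::vector<float> distances;
    problem.Extract(distances);
    results.push_back(distances);
  }
  enactor.Release();
  problem.Release();
  return results;
}
```

Separating Init from Reset means allocation cost is paid once per experiment rather than once per run.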

New APIs - External Interfaces

Using raw data pointers:

    template <
        typename VertexT = int,
        typename SizeT = int,
        typename GValueT = unsigned int,
        typename SSSPValueT = GValueT>
    double sssp(
        const SizeT num_nodes,
        const SizeT num_edges,
        const SizeT *row_offsets,
        const VertexT *col_indices,
        const GValueT *edge_values,
        const int num_runs,
        VertexT *sources,
        const bool mark_pred,
        SSSPValueT **distances,
        VertexT **preds = NULL)
    {
        gunrock::util::Parameters parameters;
        GraphT graph;
        // prepare parameters & graph
        return gunrock_sssp(parameters, graph, distances, preds);
    }

Using gunrock data types:

    // @brief Entry of gunrock_sssp function
    // @tparam GraphT Type of the graph
    // @tparam ValueT Type of the distances
    // @param[in] parameters Execution parameters
    // @param[in] graph Input graph
    // @param[out] distances Shortest distances from source
    // @param[out] preds Predecessors of each vertex
    // \return double Accumulated elapsed times
    template <
        typename GraphT,
        typename ValueT = typename GraphT::ValueT>
    double gunrock_sssp(
        gunrock::util::Parameters &parameters,
        GraphT &graph,
        ValueT **distances,
        typename GraphT::VertexT **preds = NULL)
    {...}

Able to take in graphs in GPU / CPU memory
Able to take in different graph representations
=> GoAI and other libraries

New Features - Graph Representations
● Graph representation is isolated from most parts of Gunrock
  ○ Only operator implementations, graph generators & converters need to know the representation
  ○ Application level implementations do NOT need to know
  => External graph inputs (e.g. GoAI)
  => New graph representations (e.g. mutable graphs)
● Current status:
  ○ 3 basic representations: CSR, CSC, COO
  ○ SSSP on CSR, PR on COO
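As a reminder of how two of the listed representations relate, here is a small host-only converter between CSR and COO. `Csr`, `Coo`, `csr_to_coo`, and `coo_to_csr` are hypothetical names for this sketch and say nothing about how Gunrock's own converters are written.

```cpp
#include <cstddef>
#include <vector>

struct Csr {  // neighbors of v: col_indices[row_offsets[v] .. row_offsets[v+1])
  std::vector<int> row_offsets;
  std::vector<int> col_indices;
};

struct Coo {  // edge i goes from src[i] to dst[i]
  std::vector<int> src;
  std::vector<int> dst;
};

// CSR -> COO: expand each row offset range into explicit source ids.
Coo csr_to_coo(const Csr &csr) {
  Coo coo;
  int n = static_cast<int>(csr.row_offsets.size()) - 1;
  for (int v = 0; v < n; ++v) {
    for (int e = csr.row_offsets[v]; e < csr.row_offsets[v + 1]; ++e) {
      coo.src.push_back(v);
      coo.dst.push_back(csr.col_indices[e]);
    }
  }
  return coo;
}

// COO -> CSR: counting sort by source vertex (edges may arrive unsorted).
Csr coo_to_csr(const Coo &coo, int num_vertices) {
  Csr csr;
  csr.row_offsets.assign(num_vertices + 1, 0);
  csr.col_indices.resize(coo.src.size());
  for (int s : coo.src) ++csr.row_offsets[s + 1];  // out-degree counts
  for (int v = 0; v < num_vertices; ++v)           // prefix sum
    csr.row_offsets[v + 1] += csr.row_offsets[v];
  std::vector<int> cursor(csr.row_offsets.begin(), csr.row_offsets.end() - 1);
  for (std::size_t i = 0; i < coo.src.size(); ++i)
    csr.col_indices[cursor[coo.src[i]]++] = coo.dst[i];
  return csr;
}
```

An application written against an operator interface would not call either function directly; only the converters and operator implementations need to know these layouts.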

New Primitives - Random Walks
Find x random paths of given length y
● Algorithm

    Q0 <- {x randomly selected source vertices}
    Do y iterations:
        Q1 <- {}
        For each vertex v in Q0:
            Randomly select a neighbor u of v
            Put u in Q1
        Q0 <- Q1

[Figure: Running time of GPU random walk]
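The pseudocode above translates almost line by line into a sequential sketch. The names `CsrGraph` and `random_walks` are hypothetical, the graph is assumed to have at least one vertex, and one detail is an assumption not stated on the slide: a walker on a vertex with no neighbors stays where it is.

```cpp
#include <random>
#include <vector>

// Hypothetical CSR graph for this sketch.
struct CsrGraph {
  std::vector<int> row_offsets;
  std::vector<int> col_indices;
  int num_vertices() const { return static_cast<int>(row_offsets.size()) - 1; }
};

// Returns num_walks paths, each with walk_length + 1 vertices.
std::vector<std::vector<int>> random_walks(const CsrGraph &g, int num_walks,
                                           int walk_length, unsigned seed) {
  std::mt19937 rng(seed);
  std::uniform_int_distribution<int> pick_vertex(0, g.num_vertices() - 1);

  // Q0 <- {x randomly selected source vertices}
  std::vector<int> q0(num_walks);
  std::vector<std::vector<int>> paths(num_walks);
  for (int i = 0; i < num_walks; ++i) {
    q0[i] = pick_vertex(rng);
    paths[i].push_back(q0[i]);
  }

  for (int step = 0; step < walk_length; ++step) {
    std::vector<int> q1(num_walks);
    for (int i = 0; i < num_walks; ++i) {
      int v = q0[i];
      int degree = g.row_offsets[v + 1] - g.row_offsets[v];
      if (degree == 0) {
        q1[i] = v;  // assumption: dead ends keep the walker in place
      } else {
        std::uniform_int_distribution<int> pick_neighbor(0, degree - 1);
        q1[i] = g.col_indices[g.row_offsets[v] + pick_neighbor(rng)];
      }
      paths[i].push_back(q1[i]);
    }
    q0.swap(q1);  // Q0 <- Q1
  }
  return paths;
}
```

Each walker is independent of the others within an iteration, which is what makes the inner loop a natural fit for one GPU thread per walker.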
