Implementing a Parallel Graph Clustering Algorithm with Sparse Matrix Computation
Jun Chen, Peigang Zou
High Performance Computing Center, Institute of Applied Physics and Computational Mathematics
chenjun@iapcm.ac.cn
OUTLINE
- Graph clustering
  - Peer pressure clustering (PPCL)
- Challenges
- A solution: based on large graph platforms/libraries
  - Combinatorial BLAS
- Related works
- Parallel PPCL algorithm with matrix computation
- Numerical results
- Discussions
- Conclusion
- A wide class of algorithms to classify vertices.
  - Traditional clustering classifies points by the distances between them; graph clustering classifies vertices by the relationships between them.
- Application areas: machine learning, pattern recognition, bioinformatics, image analysis, ...
- Graph clustering methods: random walks, peer pressure clustering (PPCL), minimum cuts, multi-way partition, genetic algorithms, ...
- E.g., the number of Facebook users exceeds 1 billion: a big graph!
- Parallel graph clustering implementation is difficult. It requires:
  - a well-suited description of the natural sparse locality;
  - storing graphs effectively;
  - high performance computing.
- High performance challenges:
  - Scalability: running time should be <= O(n), and memory consumption should be < O(n x n).
  - The parallel patterns used for solving PDEs in typical scientific computing are based on dense computations; they are not suitable for the sparse characteristics of large graph computation.
  - The MapReduce pattern for big data problems has low efficiency.
A solution: based on a large graph platform/library
- ScaleGraph
  - TITECH, Japan.
  - Implements the Pregel model proposed by Google, and optimizes its collective communication and memory management methods.
  - Goal: to analyze graphs containing 10 billion nodes and edges.
- DistBelief
  - A parallel framework for deep learning.
- PBGL (Parallel Boost Graph Library)
  - Indiana University, USA.
  - A C++ graph library.
- GAPDT (Graph Algorithm and Pattern Discovery Toolbox)
  - UCSB, USA.
  - Provides interactive graph operations, and can run in parallel on Star-P, a parallel version of MATLAB.
  - Uses distributed sparse arrays to describe the parallel operations.
- GraphBLAS
  - Defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments.
- etc.
Table 1. Key basic linear algebra operations in CombBLAS.

CombBLAS:
- A representative implementation of GraphBLAS.
- Multi-core parallelism based on MPI.
- A collective set of basic linear algebra operations.
[Buluç A, Gilbert J R. The Combinatorial BLAS: design, implementation, and applications.]
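To make the "graph operations as matrix operations" idea concrete, the following minimal C++ sketch (an illustrative assumption, not the CombBLAS API) stores a small graph as a CSR sparse matrix and performs one BFS-style frontier expansion as a sparse matrix-vector product:

// Minimal illustration: adjacency matrix in CSR form, and one frontier
// expansion y = A^T x -- the style of operation GraphBLAS libraries build on.
#include <cstdio>
#include <vector>

struct CsrMatrix {                 // adjacency matrix in CSR form
    int n;                         // number of vertices
    std::vector<int> rowptr;       // size n + 1
    std::vector<int> colidx;       // column index of each nonzero
};

// next[j] = 1 if any frontier vertex i has an edge (i, j)
std::vector<int> Expand(const CsrMatrix& A, const std::vector<int>& frontier) {
    std::vector<int> next(A.n, 0);
    for (int i = 0; i < A.n; ++i) {
        if (!frontier[i]) continue;
        for (int k = A.rowptr[i]; k < A.rowptr[i + 1]; ++k)
            next[A.colidx[k]] = 1;
    }
    return next;
}

int main() {
    // 4-vertex example graph: 0->1, 0->2, 1->3, 2->3
    CsrMatrix A{4, {0, 2, 3, 4, 4}, {1, 2, 3, 3}};
    std::vector<int> frontier = {1, 0, 0, 0};       // start from vertex 0
    std::vector<int> next = Expand(A, frontier);
    for (int v = 0; v < A.n; ++v)
        if (next[v]) std::printf("reached vertex %d\n", v);  // prints 1 and 2
}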
- Introduction of the random walks method.
  - The cluster assignment of a vertex tends to be the same as that of most of its neighbors.
- Current representative parallel PPCL methods:
  - Parallel PPCL in SPARQL
  - Parallel PPCL in STAR-P
PPCL in STAR-P
Results:
- Maximum processor number: 128
- R-MAT graph: 2,097,152 vertices, 18,305,177 edges (scale 21)
- Low performance.
- STAR-P is a parallel implementation of MATLAB.

PPCL in SPARQL
Results:
- Maximum processor number: 64
- RDF graph: 10,000 vertices, 232,000 edges
- 200 seconds
- SPARQL is an SQL-like query tool for RDF graphs.
RDF graph: stores metadata for web resources. A vertex identifies a resource; an arc describes the resource's attributes.
Algorithm 1 (vertex-based PPCL):

PeerPressure(G = (V, E), C_i)
1  for (u, v, w) ∈ E
2      do T(v)(C_i(u)) ← T(v)(C_i(u)) + w
3  for n ∈ V
4      do C_f(n) := j : T(n)(i) ≤ T(n)(j), ∀ i ∈ V
5  if C_i == C_f
6      then return C_f
7  else return PeerPressure(G = (V, E), C_f)

Iteration flow: given an approximation G', initialize, tally the votes, and form a new cluster approximation G''; if G' != G'', set G' = G'' and repeat; if G' == G'', stop.
It starts with an initial cluster assignment, e.g., each vertex being in its own cluster. Each iteration performs an election at each vertex to select its cluster number; the votes are the cluster assignments of its neighbors. Ties are settled by selecting the lowest cluster ID to keep the algorithm deterministic. The algorithm converges when two consecutive iterations have only a tiny difference between them.
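As a concrete illustration of this election step, here is a minimal sequential C++ sketch of one voting sweep over an adjacency-list graph; it is a simplified, assumed helper for exposition, not the parallel implementation presented in this talk.

// One peer-pressure voting sweep over an adjacency-list graph.
#include <map>
#include <vector>

// cluster[v] is the current cluster ID of vertex v; adj[v] lists v's neighbors.
std::vector<int> VoteOnce(const std::vector<std::vector<int>>& adj,
                          const std::vector<int>& cluster) {
    std::vector<int> next(cluster.size());
    for (size_t v = 0; v < adj.size(); ++v) {
        std::map<int, int> votes;                  // cluster ID -> vote count
        for (int u : adj[v]) ++votes[cluster[u]];  // neighbors vote with their clusters
        int best = cluster[v], bestCount = 0;
        for (const auto& [id, count] : votes)      // std::map iterates in ascending ID,
            if (count > bestCount) {               // so ties keep the lowest cluster ID
                best = id;
                bestCount = count;
            }
        next[v] = best;
    }
    return next;  // the caller repeats until next == cluster (convergence)
}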
Algorithm 2 (matrix-based PPCL):

PeerPressure(A : R^(N×N), C_i : B^(N×N))
1  T : R^(N×N), m : R^(N)
2  T = C_i A
3  m = T.max
4  C_f = (m .== T)
5  if C_i == C_f
6      then return C_f
7  else return PeerPressure(A, C_f)

Iteration flow: given an approximation G', initialize, tally the votes, and form a new cluster approximation G''; if G' != G'', set G' = G'' and repeat; if G' == G'', stop.
Annotations on Algorithm 2: in the initial assignment C_i, each vertex is its own cluster; the adjacency matrix A is normalized so that each vertex gives equal votes to its neighbors; the ties-settling step decides what to do if two clusters tie for the maximum number of votes for a vertex.
Fig. 1. The procedure of applying Algorithm 2 to an object graph G: (a) the object graph G; (b) the adjacency matrix A of G after initialization (rows normalized, with entries such as 0.25, 0.2, and 0.33); (c) temporary results of matrix C; (d) temporary matrix C after the first ties-settling.
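To make the data flow of Algorithm 2 easier to follow, the following dense, single-process C++ sketch mirrors one iteration (T = C * A, per-column maximum, lowest-index tie settling); it is only an illustration of the formulation above, not the distributed implementation, and its sizes and values are not taken from Fig. 1.

// Dense, single-process mirror of one Algorithm 2 iteration.
#include <vector>
using Matrix = std::vector<std::vector<double>>;   // row-major dense matrix

Matrix PeerPressureStep(const Matrix& C, const Matrix& A) {
    size_t N = A.size();
    Matrix T(N, std::vector<double>(N, 0.0));
    for (size_t c = 0; c < N; ++c)                  // T = C * A (vote tallying)
        for (size_t k = 0; k < N; ++k)
            for (size_t v = 0; v < N; ++v)
                T[c][v] += C[c][k] * A[k][v];

    Matrix Cf(N, std::vector<double>(N, 0.0));
    for (size_t v = 0; v < N; ++v) {                // per vertex (column of T)
        size_t best = 0;
        for (size_t c = 1; c < N; ++c)
            if (T[c][v] > T[best][v]) best = c;     // strict '>' keeps the lowest
        Cf[best][v] = 1.0;                          //   cluster index on a tie
    }
    return Cf;                                      // iterate until Cf == C
}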
Algorithm 3 (parallel PPCL with CombBLAS):

Input: a matrix-based targeted graph A : R^(N×N) and a matrix-based initial approximation graph C : B^(N×N).
Output: a matrix-based clustering result.

procedure PeerPressure(A : R^(N×N), C : B^(N×N))
1  SpParMat<unsigned, double, SpDCCols<unsigned, double>> A, C;
2  DenseParVec<unsigned, double> rowsums = A.Reduce(Row, plus<double>());   /* reduce to Row: columns are collapsed to single entries */
3  rowsums.Apply(multinv<double>());                                        /* multinv<double> is a user-defined function for double */
4  A.DimScale(Row, rowsums);                                                /* normalize A: scale each column with the given vector */
5  while (C != T) do
6      SpParMat<unsigned, double, SpDCCols<unsigned, double>> T = SpGEMM(C, A);   /* vote */
7      Renormalize(T);                                                      /* renormalize T */
8      settling_ties(T);                                                    /* settling ties */

Phases: initialization, vote, normalization, settling ties.
- Data distribution and storage
  - DCSC storage structure
- Algorithm expansion & MPI implementation
  - Parallel voting
  - Renormalization
  - Parallel ties-settling
- Distribute the sparse matrices on a 2D Pr x Pc processor grid
  - Processor P(i, j) stores the sub-matrix Aij of A.
- HyperSparseGEMM operates on O(nnz) data.
- CSC or CSR structure vs. DCSC structure
  - DCSC total storage: O(nnz)
  - DCSC is more efficient than CSR or CSC for storing large-scale sparse matrices.
Fig. 1.1. Triple description of matrix A; Fig. 1.2. CSC structure of matrix A; Fig. 1.3. DCSC structure of matrix A.
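The two storage layouts being compared can be sketched as C++ structs (field names are illustrative, not CombBLAS internals): CSC keeps a pointer for every column, so its overhead grows with the matrix dimension n, while DCSC keeps entries only for the columns that actually contain nonzeros, so its size depends on nnz alone. This matters in the hypersparse case that arises when a large matrix is split across many processors.

#include <vector>

struct CscMatrix {
    int n;                        // number of columns
    std::vector<int> colptr;      // size n + 1          -> O(n) even if nnz << n
    std::vector<int> rowidx;      // size nnz
    std::vector<double> val;      // size nnz
};

struct DcscMatrix {
    std::vector<int> jc;          // indices of the nonempty columns (size nzc)
    std::vector<int> cp;          // size nzc + 1; cp[k]..cp[k+1] indexes column jc[k]
    std::vector<int> rowidx;      // size nnz
    std::vector<double> val;      // size nnz            -> total O(nnz)
};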
(1) Voting: the sparse SUMMA algorithm, our MPI implementation. (2) Normalization: Reduce() and DimApply() in CombBLAS, our MPI implementation. (3) Settling ties: both MPI and hybrid MPI-OpenMP implementations (90% of the time in the full code).
- SpGEMM algorithm
- Leverage two primitives in CombBLAS to ensure that the elements in matrix T remain 1 or 0.

Algorithm 4. Renormalize the T matrix
Renormalize(SpParMat &T)
1  DenseParVec colmax = T.Reduce(Column, max);
2  T.DimApply(Column, colmax, equal_to<double>());
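As a sanity check on what these two primitives achieve together, here is a dense, single-process C++ mirror (illustrative only): the column-wise reduction finds each vertex's maximum vote, and the comparison keeps a 1 exactly where an entry equals that maximum, zeroing everything else.

#include <algorithm>
#include <vector>

void Renormalize(std::vector<std::vector<double>>& T) {
    size_t rows = T.size(), cols = T.empty() ? 0 : T[0].size();
    for (size_t v = 0; v < cols; ++v) {
        double colmax = 0.0;
        for (size_t c = 0; c < rows; ++c) colmax = std::max(colmax, T[c][v]);   // column max
        for (size_t c = 0; c < rows; ++c)
            T[c][v] = (T[c][v] == colmax && colmax > 0.0) ? 1.0 : 0.0;          // keep only winners
    }
}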
Algorithm 5. Settling ties
Settling_ties(SpParMat<unsigned, double, SpDCCols<unsigned, double>> &T)
1  for all processors P(i, j) in parallel do
2      vector<int> v = T.CreatVec();        /* create a vector tallying the processor number */
3      MPI_Allreduce(v, min_v, k, MPI_INT, MPI_MIN, ColWorld());   /* do MPI_Allreduce on every processor column */
4      T.PruneMat(min_v);                   /* generate a new clustering result matrix */
Function: selecting the lowest-numbered cluster with the highest number of votes. Phases: recording, communication, selection.
The "for ... in parallel do" construct indicates that all of the do-blocks are executed in parallel by all the processors. In line 2, we construct a vector in which every element corresponds to one column of the T matrix: if some column contains a 1, its processor ID is tallied in the corresponding element of the vector; otherwise the maximal integer is tallied.
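The communication pattern can be sketched in MPI as follows, assuming each processor in a grid column has already recorded, for every local matrix column, either its own rank (if it holds a candidate 1) or INT_MAX; the MPI_Allreduce with MPI_MIN then selects the lowest-ranked owner, mirroring the lowest-cluster-ID tie-breaking rule. The names here (comm_col, SettleTies) are illustrative, not taken from the actual code.

#include <climits>
#include <vector>
#include <mpi.h>

void SettleTies(std::vector<int>& tally, MPI_Comm comm_col) {
    std::vector<int> winner(tally.size());
    // element-wise minimum across all processors in this grid column
    MPI_Allreduce(tally.data(), winner.data(), static_cast<int>(tally.size()),
                  MPI_INT, MPI_MIN, comm_col);
    int my_rank = 0;
    MPI_Comm_rank(comm_col, &my_rank);
    for (size_t j = 0; j < tally.size(); ++j)
        if (winner[j] != my_rank)
            tally[j] = INT_MAX;   // a lower-ranked processor owns this column's winner
}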
- Platform and input graph
- Testing CombBLAS performance on Dawning
- Testing our parallel implementation

Dawning supercomputer:
- Intel Omni-Path network, 100 Gbps two-sided connection;
- 172 nodes, each with 24 Intel E5-2680 2.5 GHz processor cores and 64 GB memory;
- MVAPICH2.
Input graphs:
1. A permuted R-MAT graph of scale 16 with self-loops added:
   - 65,536 vertices, 490,563 edges.
2. A permuted R-MAT graph of scale 21 with self-loops added:
   - 2,097,152 vertices, 18,305,177 edges.
R-MAT graph: the R-MAT model is one of the graph generation models. Its algorithm is simple and the resulting graph fits the power-law distribution.
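A compact sketch of the R-MAT generation idea: each edge is placed by recursively choosing one of the four quadrants of the adjacency matrix with probabilities a, b, c, d (a + b + c + d = 1), which yields a power-law-like degree distribution. The parameter values below are common defaults, assumed here rather than taken from this talk.

#include <cstdio>
#include <random>
#include <utility>

std::pair<int, int> RmatEdge(int scale, std::mt19937& gen) {
    const double a = 0.57, b = 0.19, c = 0.19;           // d = 1 - a - b - c
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    int row = 0, col = 0;
    for (int level = 0; level < scale; ++level) {        // descend one quadrant per bit
        double r = uni(gen);
        row <<= 1;
        col <<= 1;
        if (r < a)              { /* top-left quadrant: no bits set */ }
        else if (r < a + b)     { col |= 1; }             // top-right
        else if (r < a + b + c) { row |= 1; }             // bottom-left
        else                    { row |= 1; col |= 1; }   // bottom-right
    }
    return {row, col};
}

int main() {
    std::mt19937 gen(42);
    for (int i = 0; i < 5; ++i) {
        auto [u, v] = RmatEdge(16, gen);                  // scale 16: 2^16 vertices
        std::printf("edge %d -> %d\n", u, v);
    }
}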
- Using the BFS code included in CombBLAS.
- Input:
  - R-MAT graph of scale 17 with self-loops added.
  - Corresponds to a sparse matrix of size 131,072 x 131,072.
- Performance:
  - Speedup: 15 on 64 processors.
  - MTEPS (Mega Traversed Edges Per Second).
Table 1. MTEPS of the BFS code of CombBLAS running on Dawning

# of processor cores    MTEPS
                   1       74
                   4      198
                  16      406
                  64      945

The results did not achieve as good a performance as predicted. The reason may be that the CombBLAS version we used is not optimized to match the Dawning supercomputer.
- Case 1: R-MAT graph of scale 16
- Case 2: R-MAT graph of scale 21

Fig. 2 and Fig. 3. Time and speedup vs. number of processors (Case 1).
Parallel efficiency on 4 processors: 92.5%; ...; parallel efficiency on 64 processors: 62.7%.
Fig. 4. Time vs. number of processors; Fig. 5. Speedup vs. number of processors (Case 2).
Parallel efficiency E on 4 processors: 87.5%; ...; E on 1024 processors: 75%.
E on 4 processors: 87.5%; ...; E on 1024 processors: 86.4%.
Fig. 8. MPI implementation: (left) parallel time; (right) speedup.
- The sparse degree of the input graph
  - Both input graphs have a sparse degree above 99% (= (# of zeros) / (# of total elements)); for example, for the scale-21 graph this is 1 - 18,305,177 / 2,097,152^2, about 99.9996%.
  - They are also highly irregular.
- Scalability of our algorithm
  - Under the same conditions apart from the graph size and sparse degree, the speedups of the small and large graphs we test do not differ much. This also demonstrates the scalability of the algorithm.
- Good scalability of our parallel algorithm.
- Limit on the number of cores: the processor grid must be m x m
  - # of grid columns = # of grid rows.
  - The current version uses an even distribution of the matrix between processors.
- The settling-ties step consumes most of the time.
  - Reason: all-to-all communication between processors.
- The BFS testing on Dawning indicates that optimizing the CombBLAS building blocks to match the Dawning architecture is needed.
- There are not yet many research results about parallel PPCL implementations.
- Our implementation vs. direct parallelization
  - It is possible to parallelize the PPCL algorithm directly.
  - But the tedious treatment of the irregular data structures of the graph cannot be ignored; it causes large time costs and memory consumption.
  - We translated the graph computations into a series of matrix operations, so the computations on the irregular data structures of the graph are transformed into a structured representation based on sparse matrices. Thus high performance can be achieved.
- We design and implement a parallel PPCL algorithm based on CombBLAS, which represents the PPCL algorithm as sparse matrix computations.
- When the input is a permuted R-MAT graph of scale 21 with self-loops, the MPI implementation achieves up to 809.6x speedup.
- Next work includes:
  - Optimizing the performance of the parallel PPCL implementation and testing larger graphs for higher scalability.
  - Optimizing the performance of the related matrix algebra building blocks in CombBLAS to match the underlying computer architecture.
  - Studying more parallel irregular algorithms represented by sparse matrices.