Multiplex graph analysis with GraphBLAS (Gábor Szárnyas)

SLIDE 1

Budapest University of Technology and Economics Department of Measurement and Information Systems

Fault Tolerant Systems Research Group MTA-BME Lendület Cyber-Physical Systems Research Group

Multiplex graph analysis with GraphBLAS

Gábor Szárnyas

With contributions from Petra Várhegyi and Lehel Boér

SLIDE 2

PARADISE PAPERS

SLIDE 3

Paradise papers

  • Data set leaked to investigative journalists
  • Similar to Panama Papers, but larger
  • Mid-sized data set: 2M nodes, 3M edges

SLIDE 4

Paradise papers schema

[Figure: schema of the Paradise papers data set, showing node labels and edge types]

SLIDE 5

GRAPH DATA MODELS

SLIDE 6

Example

[Figure: example graph with nodes Bob, Dan, Eve, Carol, and Fred]

Simple graph

  • “Textbook graph”
  • Untyped graph
  • Homogeneous network
  • Monoplex graph

SLIDE 7

Example with edge types

[Figure: the example graph (Eve, Bob, Dan, Carol, Fred) with two edge types]

  • Friends
  • Business partners

Multiplex graph

  • Edge-labelled or edge-typed graph
  • Heterogeneous or multidimensional network

Expressive power: between untyped and property graphs.

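A minimal Python sketch of the two data models, assuming one edge list per edge type. The edges are invented for illustration (the figure's actual edges cannot be recovered from the slide text), and the names `typed_edges`, `neighbours_by_type`, and `flatten` are hypothetical.

```python
from collections import defaultdict

# Hypothetical multiplex graph: one undirected edge list per edge type.
typed_edges = {
    "friends": [("Bob", "Carol"), ("Carol", "Fred"), ("Dan", "Eve")],
    "business": [("Bob", "Dan"), ("Dan", "Eve"), ("Eve", "Fred")],
}


def neighbours_by_type(typed_edges):
    """Build {edge_type: {node: set(neighbours)}} for an undirected multiplex graph."""
    layers = {}
    for edge_type, edges in typed_edges.items():
        layer = defaultdict(set)
        for u, v in edges:
            layer[u].add(v)
            layer[v].add(u)
        layers[edge_type] = dict(layer)
    return layers


def flatten(typed_edges):
    """Drop the edge types: the untyped (monoplex) projection of the graph."""
    return {frozenset(edge) for edges in typed_edges.values() for edge in edges}


layers = neighbours_by_type(typed_edges)
simple = flatten(typed_edges)
```

Dan and Eve are connected in both layers here, so the flattened graph has five edges while the multiplex graph has six typed edges. That lost distinction is what the typed metrics rely on.
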
SLIDE 8

MULTIPLEX GRAPH ANALYSIS

SLIDE 9

Local clustering coefficient metric

[Figure: a wedge centred on node v, and the triangle it forms when closed]

$$\mathrm{LCC}(v) = \frac{\text{No. of triangles in } v}{\text{No. of wedges in } v}$$

A wedge closes into a triangle.

  • K. Faust, S. Wasserman (1994): Social Network Analysis

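The ratio can be computed directly from neighbour sets. The sketch below assumes a wheel-shaped graph on the nodes B, C, D, E, F with D at the centre. These edges are an assumption, not read off the slide, and returning 0.0 for nodes without wedges is a convention chosen here.

```python
from itertools import combinations

# Assumed example: a wheel-shaped graph with D at the centre.
EDGES = [("D", "B"), ("D", "C"), ("D", "E"), ("D", "F"),
         ("B", "C"), ("C", "F"), ("F", "E"), ("E", "B")]


def build_adjacency(edges):
    """Map each node to the set of its neighbours (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def lcc(adj, v):
    """Closed wedges (triangles) at v divided by all wedges centred on v."""
    wedges = list(combinations(sorted(adj[v]), 2))
    if not wedges:
        return 0.0  # convention for nodes with fewer than two neighbours
    triangles = sum(1 for a, b in wedges if b in adj[a])
    return triangles / len(wedges)


adj = build_adjacency(EDGES)
coefficients = {v: lcc(adj, v) for v in sorted(adj)}
```

With these assumed edges every node has LCC = 2/3: the centre closes 4 of its 6 wedges and each rim node closes 2 of its 3.
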
SLIDE 10

Typed clustering coefficient metric

$$\mathrm{TCC}(v) = \frac{\text{No. of typed triangles in } v}{\text{No. of typed wedges in } v}$$

  • Typed wedge: two edges of the same type meeting at v
  • Typed triangle: a typed wedge closed by an edge with a different type

  • F. Battiston et al. @ Physical Review E 2014: Structural measures for multiplex networks

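A minimal sketch of the typed ratio, assuming a hypothetical two-type graph in which the four edges around D are "friends" edges and the outer cycle consists of "business" edges. Dividing by the number of other edge types, so that the value stays in [0, 1] when there are more than two types, is an assumption of this sketch.

```python
from itertools import combinations

# Hypothetical typing: spokes around D are "friends", the outer cycle is "business".
TYPED_EDGES = {
    "friends": [("D", "B"), ("D", "C"), ("D", "E"), ("D", "F")],
    "business": [("B", "C"), ("C", "F"), ("F", "E"), ("E", "B")],
}


def build_layers(typed_edges):
    """Build {edge_type: {node: set(neighbours)}}."""
    layers = {}
    for edge_type, edges in typed_edges.items():
        adj = layers.setdefault(edge_type, {})
        for u, v in edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return layers


def tcc(layers, v):
    """Typed wedges at v that are closed by an edge of a different type."""
    wedges = 0
    triangles = 0
    for wedge_type, adj in layers.items():
        for a, b in combinations(sorted(adj.get(v, ())), 2):
            wedges += 1  # two edges of the same type meeting at v
            for closing_type, other in layers.items():
                if closing_type != wedge_type and b in other.get(a, ()):
                    triangles += 1  # closed by an edge of another type
    other_types = len(layers) - 1
    if wedges == 0 or other_types == 0:
        return 0.0
    return triangles / (other_types * wedges)


layers = build_layers(TYPED_EDGES)
```

Here `tcc(layers, "D")` is 4/6: D has six "friends" wedges, four of which are closed by a "business" edge. `tcc(layers, "B")` is 0, because B's only typed wedge (two "business" edges) is not closed by a "friends" edge.
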
SLIDE 11

Typed clusteredness example

[Figure: the example graph (nodes B, C, D, E, F) drawn twice, annotated with LCC and TCC values. LCC is 2/3 for every node; the TCC values shown are 2/3 and 1/2.]

SLIDE 12

Multiplex analysis on Paradise papers

  • Previous research
    • Characterization of HW/SW/building models
    • 100k–1M nodes/edges
    • Naïve Java implementation using edge lists
  • Ran on the Paradise papers data set
    • Clustering coefficient metrics did not complete in days
  • What’s going on?
    • Implementation and algorithmic aspects need to be tuned

  • G. Szárnyas et al. @ MODELS 2016: Towards the characterization of realistic models: evaluation of multidisciplinary graph metrics

SLIDE 13

GRAPH PROCESSING WORKLOADS

SLIDE 14

Graph processing landscape

[Figure: graph workloads placed by expected execution time and amount of data accessed, with regions labelled OLTP queries, OLAP queries, and graph analytics]

  • Graph queries (structure, types, and props): Cypher and SPARQL engines (Neo4j, Virtuoso, Stardog…)
  • Graph analysis (structure only): Giraph, Spark GraphX, Flink Gelly, Neo4j Graph algorithms lib
  • Multiplex graph analysis (structure and types): No off-the-shelf solutions?

  • LDBC: Linked Data Benchmark Council

SLIDE 15

Graph analysis frameworks

Many Apache frameworks can be adapted

  • Hama Graph
  • Giraph on Hadoop
  • Spark GraphX
  • Flink Gelly

But most seem abandoned.

$ git rev-list --count --all --no-merges

                              --since="Feb 2 2015"     --since="Feb 2 2017"
                              --before="Feb 2 2017"
hama/graph                    63
giraph                        154                      67
spark/graphx                  362                      120
flink/flink-libraries/gelly   139                      112

SLIDE 16

Arabesque

  • Open-source graph analytical framework over Hadoop
  • Frequent subgraph mining: find subgraphs with a minimum number of matches
  • Optimized for distributed execution

arabesque$ git rev-list... 125 91

Qatar-Computing-Research-Institute/Arabesque

  • G. Siganos @ FOSDEM 2016: Arabesque: A distributed graph mining platform
  • C.H.C. Teixeira et al. @ SOSP 2015: Arabesque: A system for distributed graph mining

SLIDE 17

The complexity of graph computations

  • Graph computation is difficult
    • The “curse of connectedness”
    • Computer architectures are suited to hierarchical data
  • Graph analytics: vertex-centric programming model
    • “Think like a vertex”
    • Pregel, scatter-gather, gather-apply-scatter
    • The majority of distributed graph engines use this
  • Going distributed for a 2M node graph?

  • V. Kalavri et al. @ TKDE 2018: High-level programming abstractions for distributed graph processing

SLIDE 18

LINEAR ALGEBRA-BASED CLUSTERING COEFFICIENTS

SLIDE 19

Adjacency matrices

[Figure: the example graph on nodes B, C, D, E, F and its 5×5 adjacency matrices A, A_business, and A_friends]

Key idea: Multiplication of adjacency matrices = 2-hop, 3-hop, etc. paths

Optimization #1

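The key idea can be checked on a small dense matrix. This sketch assumes a wheel-shaped graph on B, C, D, E, F with D at the centre. The edge list is an assumption, and the counts are strictly walks, which may revisit nodes.

```python
NODES = ["B", "C", "D", "E", "F"]
# Assumed edges (a wheel with D at the centre); not taken from the slide.
EDGES = [("D", "B"), ("D", "C"), ("D", "E"), ("D", "F"),
         ("B", "C"), ("C", "F"), ("F", "E"), ("E", "B")]


def adjacency_matrix(nodes, edges):
    """Dense symmetric 0/1 adjacency matrix of an undirected graph."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return matrix


def matmul(X, Y):
    """Plain matrix product over the integers."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


A = adjacency_matrix(NODES, EDGES)
A2 = matmul(A, A)    # A2[i][j]: number of 2-hop walks from node i to node j
A3 = matmul(A2, A)   # A3[i][j]: number of 3-hop walks from node i to node j
```

For instance, `A2[0]` is `[3, 1, 2, 1, 3]`: B reaches itself via each of its three neighbours, and reaches F via C, D, and E.
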
SLIDE 20

Triangles – untyped

[Figure: the matrices A, A ⋅ A (2-hop paths), and A ⋅ A ⋅ A (3-hop paths) for the example graph]

diag⁻¹(A ⋅ A ⋅ A) = (4, 4, 8, 4, 4): number of triangles for each node

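A short sketch of the diagonal computation, assuming the same wheel-shaped graph (D at the centre, node order B, C, D, E, F). The diagonal of A ⋅ A ⋅ A counts closed 3-hop walks, so each triangle at a node is counted twice, once per direction; halving gives the triangle count.

```python
# Assumed adjacency matrix of a wheel graph, node order B, C, D, E, F.
A = [
    [0, 1, 1, 1, 0],  # B: C, D, E
    [1, 0, 1, 0, 1],  # C: B, D, F
    [1, 1, 0, 1, 1],  # D: B, C, E, F
    [1, 0, 1, 0, 1],  # E: B, D, F
    [0, 1, 1, 1, 0],  # F: C, D, E
]


def matmul(X, Y):
    """Plain matrix product over the integers."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def diagonal(X):
    """Extract the diagonal of a square matrix as a list."""
    return [X[i][i] for i in range(len(X))]


closed_walks = diagonal(matmul(matmul(A, A), A))
triangles = [count // 2 for count in closed_walks]
```
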
SLIDE 21

Triangles – typed

[Figure: the per-type adjacency matrices and their product A_i ⋅ A_j ⋅ A_i for the example graph]

diag⁻¹(A_i ⋅ A_j ⋅ A_i)

Matrices get more dense

SLIDE 22

Optimization: element-wise multiplication

  • A ⋅ A ⋅ A enumerates all 3-hop paths
  • Matrices get more dense, but only the diagonal is used
  • Idea: use element-wise multiplication

$$\mathrm{LCC}(v) = \operatorname{diag}^{-1}(A \cdot A \cdot A) = ((A \cdot A) \odot A) \cdot \mathbf{1}$$

  • Typed clustering: O(t²) matrix multiplications

$$\mathrm{TCC}(v) = \frac{\sum_{i \neq j} ((A_i \cdot A_j) \odot A_i) \cdot \mathbf{1}}{(n-1) \cdot \sum_{i} (A_i \cdot \mathbf{1}) \odot (A_i \cdot \mathbf{1} - \mathbf{1})}$$

O(t²) matrix multiplications for each node

Embarrassingly parallel –> opt. #3

Optimization #2

  • P. Várhegyi – Master’s thesis (2018): Multidimensional graph analytics

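The following sketch checks both forms on dense lists, under stated assumptions: the graph is undirected (symmetric matrices, which the equality relies on), the typing of the edges is hypothetical, and the normalising factor in the typed coefficient is taken to be the number of edge types minus one. All helper names are illustrative.

```python
NODES = ["B", "C", "D", "E", "F"]
# Hypothetical typing: spokes around D are "friends", the outer cycle is "business".
FRIENDS = [("D", "B"), ("D", "C"), ("D", "E"), ("D", "F")]
BUSINESS = [("B", "C"), ("C", "F"), ("F", "E"), ("E", "B")]


def adjacency_matrix(edges):
    index = {name: i for i, name in enumerate(NODES)}
    matrix = [[0] * len(NODES) for _ in NODES]
    for u, v in edges:
        matrix[index[u]][index[v]] = matrix[index[v]][index[u]] = 1
    return matrix


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def hadamard(X, Y):
    """Element-wise product of two matrices of the same shape."""
    return [[x * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def row_sums(X):
    """Multiplying by the all-ones vector is the same as summing each row."""
    return [sum(row) for row in X]


def closed_walks_naive(A):
    return [row[i] for i, row in enumerate(matmul(matmul(A, A), A))]


def closed_walks_elementwise(A):
    return row_sums(hadamard(matmul(A, A), A))


def typed_clustering(layers):
    """Vector of typed clustering coefficients, one entry per node."""
    n, t = len(layers[0]), len(layers)
    numerator = [0] * n
    for i, Ai in enumerate(layers):
        for j, Aj in enumerate(layers):
            if i != j:
                for v, s in enumerate(row_sums(hadamard(matmul(Ai, Aj), Ai))):
                    numerator[v] += s
    denominator = [0] * n
    for Ai in layers:
        for v, degree in enumerate(row_sums(Ai)):
            denominator[v] += degree * (degree - 1)
    return [numerator[v] / ((t - 1) * denominator[v]) if denominator[v] else 0.0
            for v in range(n)]


A_friends = adjacency_matrix(FRIENDS)
A_business = adjacency_matrix(BUSINESS)
A = adjacency_matrix(FRIENDS + BUSINESS)
tcc = typed_clustering([A_friends, A_business])
```

The element-wise form never builds A ⋅ A ⋅ A: it only needs the entries of A ⋅ A at positions where A is nonzero, which is why the intermediate result stays sparse.
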
SLIDE 23

The example using the optimization

[Figure: the typed triangle computation on the example graph (nodes B, C, D, E, F) using element-wise multiplication; the intermediate matrix is more sparse]

SLIDE 24

IMPLEMENTATION 1: Java

SLIDE 25

Question on SoftwareRecs

Efficient Java Matrix Library (2014–ongoing)

SLIDE 26

TCC on Paradise papers subsets

[Figure: TCC runtime on subsets of the Paradise papers data set. Only EJML scales for the whole graph: less than 1 min.]

SLIDE 27

Graph analyzer library

  • Java implementation
    • Linear algebra-based implementations
    • Uses EJML library with CSC compression
    • Single-threaded
  • Runs on top of Neo4j/EMF/CSV graphs
  • Number of graph metrics:
    • For nodes, types, node pairs, node-type pairs, …
    • TCC variants, typed degree distribution, degree entropy
    • Pairwise multiplexity, multiplex participation coefficient

ftsrg/graph-analyzer

  • P. Várhegyi – Master’s thesis (2018): Multidimensional graph analytics

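For readers unfamiliar with CSC (compressed sparse column) storage, here is a generic Python sketch of the layout. It illustrates the format only and is not EJML's API; the function names `to_csc` and `column` are hypothetical.

```python
def to_csc(dense):
    """Convert a dense matrix (list of rows) to CSC arrays.

    Returns (values, row_idx, col_ptr): the nonzeros of column j are
    values[col_ptr[j]:col_ptr[j + 1]], located in rows row_idx[...].
    """
    n_rows, n_cols = len(dense), len(dense[0])
    values, row_idx, col_ptr = [], [], [0]
    for j in range(n_cols):
        for i in range(n_rows):
            if dense[i][j] != 0:
                values.append(dense[i][j])
                row_idx.append(i)
        col_ptr.append(len(values))
    return values, row_idx, col_ptr


def column(csc, j):
    """List the (row, value) pairs stored for column j."""
    values, row_idx, col_ptr = csc
    start, end = col_ptr[j], col_ptr[j + 1]
    return list(zip(row_idx[start:end], values[start:end]))


# Adjacency matrix of a three-node path graph 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
csc = to_csc(path)
```

Storage is proportional to the number of nonzeros plus the number of columns, which is what makes a 2M-node adjacency matrix manageable on a single machine.
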
SLIDE 28

IMPLEMENTATION 2: Julia

SLIDE 29

Julia language

  • A high-performance, high-level dynamic language
  • v1.0 last August, v1.1 just out

SLIDE 30

Julia implementation

  • Preliminary TCC implementation: ~25 lines
  • Written in a few days (incl. learning the language)

((A ⋅ A) ⊙ A) ⋅ 1

A * A .* A * ones(n)

Performance on Paradise papers:

  • similar to the best Java implementation
  • but easier to write and extend

SLIDE 31

IMPLEMENTATION 3: GraphBLAS

SLIDE 32

Approach for defining graph algorithms

  • GraphBLAS is an effort to define standard building blocks for graph algorithms in the language of linear algebra
  • Idea: BLAS (Basic Linear Algebra Subprograms), since 1979

[Figure: two layered stacks. Numerical applications build on LINPACK/LAPACK, which build on BLAS, which runs on the hardware architecture. Graph analytic applications build on graph algorithms, which build on GraphBLAS, which runs on the hardware architecture. In both stacks the library layer provides a separation of concerns.]

  • S. McMillan @ SEI Research Review (CMU): Graph algorithms on future architectures

SLIDE 33

Graph operations with semirings

Many graph algorithms can be captured over arbitrary semirings –> requires overloading of ⨁.⨂

Examples:

  • real numbers: +.⋅ (clustering)
  • tropical semiring: min.+ (shortest path)
  • boolean semiring: ⋁.⋀ (traversal)

Old idea, but very few libraries support this.

  • Aho, Hopcroft, Ullman (1974): The Design and Analysis of Computer Algorithms
  • Cormen, Leiserson, Rivest (1990): Introduction to Algorithms [only in the 1st edition]

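The three semirings can be illustrated with one generic product function. This is a plain-Python sketch, not the GraphBLAS API; `semiring_matmul` and the small directed graph are hypothetical.

```python
import math
import operator
from functools import reduce


def semiring_matmul(X, Y, add, mul, zero):
    """Matrix product where `add` replaces + and `mul` replaces *.

    `zero` must be the identity element of `add`.
    """
    return [[reduce(add, (mul(X[i][k], Y[k][j]) for k in range(len(Y))), zero)
             for j in range(len(Y[0]))] for i in range(len(X))]


INF = math.inf
# Hypothetical directed graph: 0 -> 1 (weight 2), 1 -> 2 (weight 3), 0 -> 2 (weight 10).
counts = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
weights = [[0, 2, 10], [INF, 0, 3], [INF, INF, 0]]  # 0 on the diagonal, INF = no edge
reachable = [[False, True, True], [False, False, True], [False, False, False]]

# Real numbers with + and *: number of 2-hop walks.
two_hop_counts = semiring_matmul(counts, counts, operator.add, operator.mul, 0)
# Tropical semiring with min and +: cheapest route using at most two edges.
cheapest_two_hops = semiring_matmul(weights, weights, min, operator.add, INF)
# Boolean semiring with or and and: is there a walk of exactly two hops?
two_hop_reach = semiring_matmul(reachable, reachable, operator.or_, operator.and_, False)
```

The loop structure is identical in all three cases; only the pair of operators and the identity element change, which is the overloading the slide refers to.
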
SLIDE 34

Graph operations with semirings

  • SuiteSparse:GraphBLAS
    • C API and single-threaded implementation
    • Steep learning curve
    • Good performance even with a single thread
  • Benchmark work in progress for LCC/TCC

sergiud/SuiteSparse/

  • J. Kepner, J. Gilbert: Graph algorithms in the language of linear algebra. SIAM, 2011
  • H. Jananthan, J. Kepner: Mathematics of Big Data. MIT Press, 2018
  • R. Lipman, T. Davis @ redisconf18: Graph algebra: graph operations in the language of linear algebra

SLIDE 35

RESULTS OF THE ANALYSIS

SLIDE 36

Results on the Paradise papers

[Figure: two variants of the typed clustering coefficient on the Paradise papers data set, with outliers marked]

  • Note. Further analysis needs domain-specific expertise.

SLIDE 37

ALGORITHMIC OPTIMIZATION

SLIDE 38

Skewed data distribution: joins

[Figure: a triangle query over the relations R, S, and T]

Enumerate all triangles: R ⋈ S ⋈ T

Any solution using binary joins requires O(n²) time, but the theoretical lower bound is O(n^1.5).

  • H.Q. Ngo et al. @ SIGMOD Record 2013: Skew strikes back: new developments in the theory of join algorithms.

SLIDE 39

Skewed data distribution: matrices

[Figure: the matrices R, S, and T with the products R ⋅ S and R ⋅ S ⋅ T]

  • R ⋅ S: 19 wedges, O(n²)
  • R ⋅ S ⋅ T: 10 triangles, O(n)
  • Opt. #4: use T as mask for multiplication

  • H.Q. Ngo et al. @ SIGMOD Record 2013: Skew strikes back: new developments in the theory of join algorithms.

SLIDE 40

Skewed data distribution

  • Worst-case optimal join algorithms can compute this example with multiway joins ⋈(R, S, T) in O(n^1.5).
  • Does this occur in practice? To some degree, due to the power-law distribution of scale-free networks.
  • The problem itself is known in graph analytics, e.g. the GraphBLAS API offers masked matrix multiplication.
  • But only supported by libs tailored for graph processing.

  • H.Q. Ngo @ Journal of the ACM 2018: Worst-case optimal join algorithms
  • T.M. Low, S. McMillan et al. @ HPEC 2017: First look: linear algebra-based triangle counting without matrix multiplication

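The effect of a mask can be sketched with neighbour sets: entries of A ⋅ A are computed only at positions where the mask (here A itself) is nonzero, so wedges that cannot close into a triangle are never materialised. This is an illustration of the idea, not the GraphBLAS API; the wheel-shaped graph and the function names are assumptions.

```python
# Assumed example: a wheel-shaped graph with D at the centre.
EDGES = [("D", "B"), ("D", "C"), ("D", "E"), ("D", "F"),
         ("B", "C"), ("C", "F"), ("F", "E"), ("E", "B")]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def unmasked_entries(adj):
    """Number of nonzero entries an unmasked A * A would produce."""
    pairs = set()
    for neighbours in adj.values():
        for u in neighbours:
            for w in neighbours:
                pairs.add((u, w))  # u and w share this middle node
    return len(pairs)


def masked_row_sums(adj):
    """Row sums of (A * A) masked by A: one set intersection per existing edge."""
    return {u: sum(len(adj[u] & adj[v]) for v in adj[u]) for u in sorted(adj)}


adj = build_adjacency(EDGES)
mask_entries = sum(len(neighbours) for neighbours in adj.values())
closed_walks = masked_row_sums(adj)
```

On this small graph the unmasked product has 25 nonzero entries while the mask restricts the work to 16. On skewed graphs with high-degree hubs the gap grows, because a hub of degree d alone contributes on the order of d² entries to the unmasked product.
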
SLIDE 41

Benchmark

  • LDBC Graphalytics
    • Synthetic social graph
    • 4M nodes/300M edges
    • Single machine

Bottom line:

  • LCC performance is orders of magnitude worse than for PageRank and single-source shortest paths
  • Slower systems time out

[Figure: runtime in seconds per system; timeout for Giraph and GraphX]

  • A. Iosup et al. @ VLDB 2016: LDBC Graphalytics

SLIDE 42

TAKEAWAYS

SLIDE 43

Summary of requirements for graph proc.

  • Graph representation
    • sparse matrices of integers/floats/booleans
    • edge types
  • Operation
    • semirings with arbitrary ⨁ and ⨂ operators
    • parallelization
    • handle skewed distribution
  • No high-level off-the-shelf solutions

SLIDE 44

Building blocks for implementations

  • C/C++
    • SuiteSparse:GraphBLAS
    • CombBLAS
  • Java
    • Graphulo (GraphBLAS for Apache Accumulo)
    • EJML (Efficient Java Matrix Library)
  • Julia
    • SparseArrays with overrides for custom semirings
  • Python
    • PyGB (Python wrapper for GraphBLAS)

SLIDE 45

Papers on linear algebra for graphs

Works on linear algebra-based graph analysis

  • V.B. Shah et al. @ HPEC 2013: Novel algebras for advanced analytics in Julia.
  • J. Chamberlin @ GABB 2018: PyGB: GraphBLAS DSL in Python
  • T. Davis – preprint (2018): Algorithm 9xx: SuiteSparse:GraphBLAS
  • Y. Ahmad et al. @ VLDB 2018: LA3: A scalable link- and locality-aware linear algebra-based graph analytics system (distributed and LA-based)

SLIDE 46

Implementations and libraries

  • Difficult to find an ideal platform (Hadoop? Spark?)
  • EJML is the best Java lib for sparse matrices
  • GraphBLAS is great but takes some time to learn
  • Julia is promising but has no graph support yet
  • Some Java libs could have worked for Paradise papers
    • Arabesque
    • Flink Gelly
    • GPS

    None were included in this benchmark

SLIDE 47

FUTURE WORK

SLIDE 48

Future work: typed HO clustering

  • A “counterexample” for LCCs
    • bipartite graph
    • no triangles
    • but interesting 4-hop cycles
  • Use more sophisticated metrics:
    • Higher order (HO) clustering is a generalization of LCC
    • Meta-paths are paths on certain node/edge types
    • Gets very complex very soon

  • Fronczak et al. @ Physica A 2002: Higher order clustering coefficients in Barabási–Albert networks
  • Shi et al. @ TKDE 2017: A survey of heterogeneous information network analysis

SLIDE 49

Future work: analysis of huge graphs

Biological RDF data set: 290M nodes, 800M edges

  • M. Saleem, G. Szárnyas et al. @ WWW 2019: How representative is a SPARQL benchmark? An analysis of RDF triplestore benchmarks.

SLIDE 50

Acknowledgements

This research was partially supported by the MTA-BME Lendület Cyber-Physical Systems Research Group and the ÚNKP-18-3 New National Excellence Program of the Ministry of Human Capacities.
