SLIDE 1

Computing Graph Centrality

Erik Saule, in collaboration with: Ahmet Erdem Sarıyüce (Sandia), Kamer Kaya (Sabancı University), Ümit V. Çatalyürek (Georgia Tech)

University of North Carolina at Charlotte (CS)

SIAM CSE 2017

Erik Saule (UNCC) Computing Graph Centrality SIAM CSE 2017 1 / 17

SLIDE 2

Outline

1. Closeness and Betweenness Centrality
2. Algorithmic Optimization
3. HPC Techniques
4. Incremental
5. Conclusion

SLIDE 3

Centralities - Concept

Answer questions such as

  • Who controls the flow in a network?
  • Who is more important?
  • Who has more influence?
  • Whose contribution is significant for connections?

Different kinds of graphs

  • road networks
  • social networks
  • power grids
  • mechanical meshes

Applications

  • Covert networks (e.g., terrorist identification)
  • Contingency analysis (e.g., weakness/robustness of networks)
  • Viral marketing (e.g., who will spread the word best)
  • Traffic analysis
  • Store locations

SLIDE 4

Centralities - Definition

Let $G = (V, E)$ be a graph with vertex set $V$ and edge set $E$.

closeness centrality: $cc[v] = \frac{1}{far[v]}$, where the farness is defined as $far[v] = \sum_{u \in comp(v)} d(u, v)$, and $d(u, v)$ is the shortest path length between $u$ and $v$.

betweenness centrality: $bc(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$, where $\sigma_{st}$ is the number of shortest paths between $s$ and $t$, and $\sigma_{st}(v)$ is the number of them passing through $v$.

Both metrics depend on the structure of the shortest path graph. Brandes's algorithm computes the shortest path graph rooted in each vertex of the graph: O(|E|) per source, O(|V||E|) in total.

Believed to be asymptotically optimal [Kintali08].
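The closeness definition can be sketched with one BFS per vertex. This is a minimal illustration, assuming an unweighted, undirected graph stored as adjacency lists; the names `bfs_distances` and `closeness` are hypothetical, and returning 0 for an isolated vertex is a convention chosen here, not one stated on the slide.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source; -1 marks vertices outside comp(source)."""
    dist = [-1] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if dist[w] < 0:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adj):
    """cc[v] = 1 / far[v], with far[v] summed over the component of v."""
    cc = []
    for v in range(len(adj)):
        far = sum(d for d in bfs_distances(adj, v) if d > 0)
        # Assumption: an isolated vertex (far = 0) gets closeness 0.
        cc.append(1.0 / far if far > 0 else 0.0)
    return cc
```

On the path 0-1-2 the farness values are 3, 2, 3, so the middle vertex has the highest closeness.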

SLIDE 5

Brandes’s Algorithm for Betweenness [Brandes01]

The algorithm is executed for each source s.

Phase 1

Use BFS from s to compute the number of shortest paths to all vertices.

    d[v] ← −1, σ[v] ← 0, P[v] ← ∅, ∀v ∈ V
    Q.push(s), σ[s] ← 1, d[s] ← 0
    while Q is not empty do
        v ← Q.pop(), S.push(v)
        for all w ∈ Γ(v) do
            if d[w] < 0 then
                Q.push(w)
                d[w] ← d[v] + 1
            if d[w] = d[v] + 1 then
                σ[w] ← σ[w] + σ[v]
                P[w].push(v)

Phase 2

Back-propagate to compute the ratios of paths.

    δ[v] ← 1 / σ[v], ∀v ∈ V
    while S is not empty do
        w ← S.pop()
        for v ∈ P[w] do
            δ[v] ← δ[v] + δ[w]
        if w ≠ s then
            bc[w] ← bc[w] + (δ[w] × σ[w] − 1)
    return bc

O(|E|) per source. O(|V||E|) in total. Believed to be asymptotically optimal [Kintali08].
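The two phases can be sketched as runnable code. This is a minimal illustration, assuming an unweighted, undirected graph given as adjacency lists; the function name `brandes_betweenness` is hypothetical. It follows the slide's variant, where δ starts at 1/σ, and it counts each ordered pair (s, t), so scores on an undirected graph are twice the unordered-pair value.

```python
from collections import deque


def brandes_betweenness(adj):
    """Betweenness of every vertex, running both phases once per source."""
    n = len(adj)
    bc = [0.0] * n
    for s in range(n):
        # Phase 1: BFS from s, counting shortest paths (sigma).
        d = [-1] * n
        sigma = [0] * n
        preds = [[] for _ in range(n)]
        order = []  # plays the role of the stack S
        d[s] = 0
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if d[w] < 0:
                    d[w] = d[v] + 1
                    queue.append(w)
                if d[w] == d[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: back-propagate in reverse BFS order.
        # Unreached vertices (sigma = 0) never appear in `order`.
        delta = [1.0 / x if x else 0.0 for x in sigma]
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += delta[w]
            if w != s:
                bc[w] += delta[w] * sigma[w] - 1.0
    return bc
```

On a 4-cycle, every vertex lies on one of the two shortest paths between its two neighbors, so each vertex scores 0.5 per ordered pair and 1.0 in total.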

SLIDE 6

Outline

1. Closeness and Betweenness Centrality
2. Algorithmic Optimization
3. HPC Techniques
4. Incremental
5. Conclusion

SLIDE 7

Sampling

Idea

Run the per-source computation only for a random subset of the sources, and derive statistical bounds on the actual centrality. Changing the probability of picking a source can help correct for sampling bias.
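A minimal sketch of the sampling idea, applied to farness. It assumes a connected, unweighted, undirected graph, uniform sampling without replacement, and a simple n/k rescaling; the name `estimate_farness` and the `seed` parameter are hypothetical, and no statistical bound is computed here.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source; -1 marks unreachable vertices."""
    dist = [-1] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if dist[w] < 0:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def estimate_farness(adj, k, seed=0):
    """Estimate far[v] for all v from k uniformly sampled BFS sources."""
    n = len(adj)
    rng = random.Random(seed)
    sources = rng.sample(range(n), k)
    total = [0] * n
    for s in sources:
        # In an undirected graph d(s, v) = d(v, s), so one BFS from s
        # contributes one term to the farness of every vertex v.
        for v, d in enumerate(bfs_distances(adj, s)):
            if d > 0:
                total[v] += d
    # Rescale the partial sums to the full set of n sources.
    return [n * t / k for t in total]
```

With k equal to the number of vertices the estimate equals the exact farness; smaller k trades accuracy for fewer BFS runs.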

Refs

  • D. Bader, S. Kintali, K. Madduri, and M. Mihail. Approximating Betweenness Centrality. WAW 2007.
  • R. Geisberger, P. Sanders, and D. Schultes. Better Approximation of Betweenness Centrality. ALENEX 2008.
  • U. Brandes and C. Pich. Centrality Estimation in Large Networks. Int. J. Bifurcation and Chaos 17(7), 2007.
  • D. Eppstein and J. Wang. Fast approximation of centrality. SODA 2001, 228–229.
  • K. Okamoto, W. Chen, and X.-Y. Li. Ranking of closeness centrality for large-scale social networks. FAW 2008.

SLIDE 8

Graph Decomposition

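Identical vertices: [figure: identical vertices collapsed into a single vertex, annotated "x2"]

Side vertices: [figure: graph before and after the side-vertex transformation]

Articulation point: [figure: parts A and B split at the articulation point, with the two copies annotated "+|B|" and "+|A|"]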
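A minimal sketch of how candidates for the identical-vertices compression might be found. It assumes "identical" means that two vertices have exactly the same neighbor set in an undirected graph given as adjacency lists; the function name `identical_vertex_groups` is hypothetical, and the centrality bookkeeping after merging is not shown.

```python
from collections import defaultdict


def identical_vertex_groups(adj):
    """Group vertices that share exactly the same set of neighbors."""
    groups = defaultdict(list)
    for v, neighbors in enumerate(adj):
        # frozenset ignores neighbor order and is hashable.
        groups[frozenset(neighbors)].append(v)
    # Only groups with at least two vertices can be compressed.
    return [g for g in groups.values() if len(g) > 1]
```

On a 4-cycle 0-1-2-3, vertices 0 and 2 share the neighbors {1, 3} and vertices 1 and 3 share the neighbors {0, 2}, so both pairs are reported.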

Refs

  • M. Baglioni, F. Geraci, M. Pellegrini, and E. Lastres. Fast Exact Computation of Betweenness Centrality in Social Networks. IEEE/ACM ASONAM 2012.
  • R. Puzis, P. Zilberman, Y. Elovici, S. Dolev, and U. Brandes. Heuristics for Speeding up Betweenness Centrality Computation. SocialCom 2012.
  • M. Lee, J. Lee, J. Park, R. Choi, C. Chung. QUBE: a quick algorithm for updating betweenness centrality. WWW 2012.
  • A. Sariyuce, E. Saule, K. Kaya, and U. V. Catalyurek. Shattering and compressing networks for betweenness centrality. SDM 2013.
  • L. Wang, F. Yang, L. Zhuang, H. Cui, F. Lv, X. Feng. Articulation Point Guided Redundancy Elimination for Betweenness Centrality. PPoPP 2016.
  • A. Sariyuce, K. Kaya, E. Saule, and U. V. Catalyurek. Graph manipulations for fast centrality computation. ACM TKDD 2017 (to appear).

SLIDE 9

Outline

1. Closeness and Betweenness Centrality
2. Algorithmic Optimization
3. HPC Techniques
     BFS techniques
     Multi-source techniques
     Distributed memory
4. Incremental
5. Conclusion

SLIDE 10

Standard BFS techniques

Optimizing for the hardware platform

“If I change the storage format this way, I can squeeze x% more edges per cycle”

Too many to even bother listing.

Direction Optimization

On a small-diameter graph, the first iterations are faster top-down and the last ones are faster bottom-up, so switch in the middle.
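A minimal sketch of the switch between the two traversal directions. It assumes an unweighted, undirected graph as adjacency lists. The switching rule (go bottom-up once the frontier exceeds a fraction of the vertices) and the `switch_fraction` parameter are simplifications made for this illustration, not the heuristic of the cited papers.

```python
def direction_optimizing_bfs(adj, source, switch_fraction=0.05):
    """BFS distances, choosing top-down or bottom-up at each level."""
    n = len(adj)
    dist = [-1] * n
    dist[source] = 0
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        next_frontier = []
        if len(frontier) > switch_fraction * n:
            # Bottom-up: each unvisited vertex looks for a parent
            # in the current frontier and stops at the first hit.
            in_frontier = [False] * n
            for v in frontier:
                in_frontier[v] = True
            for w in range(n):
                if dist[w] < 0:
                    for v in adj[w]:
                        if in_frontier[v]:
                            dist[w] = level
                            next_frontier.append(w)
                            break
        else:
            # Top-down: each frontier vertex pushes to its neighbors.
            for v in frontier:
                for w in adj[v]:
                    if dist[w] < 0:
                        dist[w] = level
                        next_frontier.append(w)
        frontier = next_frontier
    return dist
```

Both directions produce the same distances; only the amount of edge scanning differs, which is why the choice can be made independently at every level.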

  • S. Beamer, K. Asanović, D. Patterson. Direction-optimizing breadth-first search. SC 2012.
  • H. Liu, H. Huang. Enterprise: Breadth-First Graph Traversal on GPUs. SC 2015

Good as a reference, but does not always directly port to centrality.

SLIDE 11

Multiple sources

  • Do BFS SpMV style (bottom-up)
  • Each source maps to a SIMD lane
  • Reduce the number of graph reads
  • Works better on low-diameter graphs
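A minimal sketch of the multi-source idea, using the bits of a Python integer in place of SIMD lanes. It assumes an unweighted, undirected graph as adjacency lists; the function name `multi_source_bfs` is hypothetical. Each vertex's adjacency list is read once per level for all sources together, instead of once per source.

```python
def multi_source_bfs(adj, sources):
    """Distances from several sources at once; bit i stands for sources[i]."""
    n = len(adj)
    dist = [[-1] * n for _ in sources]
    visited = [0] * n   # bit i set: sources[i] has already reached the vertex
    frontier = [0] * n  # bit i set: the vertex is on the frontier of sources[i]
    for i, s in enumerate(sources):
        visited[s] |= 1 << i
        frontier[s] |= 1 << i
        dist[i][s] = 0
    level = 0
    while any(frontier):
        level += 1
        next_frontier = [0] * n
        for w in range(n):
            # Bottom-up gather: OR the frontier masks of all neighbors.
            gathered = 0
            for v in adj[w]:
                gathered |= frontier[v]
            new = gathered & ~visited[w]
            if new:
                next_frontier[w] = new
                visited[w] |= new
                i = 0
                while new:
                    if new & 1:
                        dist[i][w] = level
                    new >>= 1
                    i += 1
        frontier = next_frontier
    return dist
```

For closeness, the per-source distance rows can be summed directly into farness values, so the full distance matrix does not have to be kept.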

Refs

  • A. Buluç and J. Gilbert. The Combinatorial BLAS: Design, implementation, and applications. IJHPCA 2011.
  • A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Hardware/software vectorization for closeness centrality on multi-/many-core architectures. MTAAP 2014.
  • M. Then, M. Kaufmann, F. Chirigati, T. Hoang-Vu, K. Pham, A. Kemper, T. Neumann, H. Vo. The More the Merrier: Efficient Multi-Source Graph Traversal. VLDB 2014.
  • A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Regularizing graph centrality computations. JPDC, 76:106–119, February 2015.
  • H. Liu, H. Huang, Y. Hu. iBFS: Concurrent Breadth-First Search on GPUs. SIGMOD 2016.

SLIDE 12

Distributed Memory

Graph Replication

If the graph fits entirely on one node, the work can be distributed coarsely by distributing the sources. Scales linearly.

  • R. Lichtenwalter and N. Chawla. DisNet: A Framework for Distributed Graph Computation. ASONAM 2011.
  • And a lot more.
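A minimal sketch of the work decomposition behind graph replication. Every worker sees the whole graph and handles its own subset of the sources. Threads stand in for compute nodes purely to show the structure; a real deployment would replicate the graph on separate processes or nodes. The names `farness_for_sources` and `replicated_farness` are hypothetical.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def farness_for_sources(adj, sources):
    """Farness of each given source, from one BFS per source."""
    result = {}
    for s in sources:
        dist = [-1] * len(adj)
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[s] = sum(d for d in dist if d > 0)
    return result


def replicated_farness(adj, num_workers=4):
    """Split the sources across workers that all share the full graph."""
    n = len(adj)
    # Round-robin assignment: worker w gets sources w, w + num_workers, ...
    chunks = [range(w, n, num_workers) for w in range(num_workers)]
    far = [0] * n
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = pool.map(lambda c: farness_for_sources(adj, c), chunks)
        for partial in partials:
            for s, f in partial.items():
                far[s] = f
    return far
```

The sources are independent, so the only communication is the final gather of results, which is what makes this scheme scale with the number of workers.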

Graph Partitioning

Partition the graph across multiple nodes (1D, 2D, fine-grain decomposition). Run BFS on it. Usually does not scale well. (Scalability has a COST.)

  • F. McSherry, M. Isard, D. Murray. Scalability! But at what COST? HotOS 2015.
  • A. Buluç and J. Gilbert. The Combinatorial BLAS: Design, implementation, and applications. IJHPCA 2011.
  • M. Bernaschi, G. Carbone, F. Vella. Betweenness centrality on Multi-GPU systems. MTAAP 2015.
  • And a ton more.

SLIDE 13

Outline

1. Closeness and Betweenness Centrality
2. Algorithmic Optimization
3. HPC Techniques
4. Incremental
     Algorithm
     HPC implementation
5. Conclusion

SLIDE 14

Incremental Principle

Common Subgraphs

Case 1: No consequential difference
Case 2: One more edge in the BFS graph
Case 3: Potentially completely different graph
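One way to read the three cases, sketched for the insertion of an edge (u, v) into an unweighted, undirected graph. The mapping from cases to level differences is an interpretation made for this illustration: for a source s, equal levels of u and v give case 1, levels differing by one give case 2, and a larger gap gives case 3. The name `classify_sources` and the treatment of unreachable vertices are assumptions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source; -1 marks unreachable vertices."""
    dist = [-1] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if dist[w] < 0:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def classify_sources(adj, u, v):
    """Case (1, 2 or 3) of every source s when edge (u, v) is inserted.

    adj is the graph BEFORE the insertion. Distances are symmetric in an
    undirected graph, so two BFS runs give d(s, u) and d(s, v) for all s.
    """
    du = bfs_distances(adj, u)
    dv = bfs_distances(adj, v)
    cases = []
    for s in range(len(adj)):
        a, b = du[s], dv[s]
        if a < 0 and b < 0:
            cases.append(1)  # s reaches neither endpoint: nothing changes
        elif a < 0 or b < 0:
            cases.append(3)  # the new edge connects s to a new component
        elif a == b:
            cases.append(1)  # same level: the edge is on no shortest path
        elif abs(a - b) == 1:
            cases.append(2)  # adjacent levels: one more edge, same distances
        else:
            cases.append(3)  # distances from s shrink: recompute from s
    return cases
```

Only the sources in case 3 need a fresh BFS, and for closeness the sources in cases 1 and 2 keep their farness unchanged.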

Refs

  • O. Green, R. McColl, and D. Bader. A fast algorithm for streaming betweenness centrality. SocialCom 2012.
  • A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Streamer: a distributed framework for incremental closeness centrality computation. IEEE Cluster 2013.
  • A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Incremental closeness centrality in distributed memory. Parallel Computing, 47:3–18, August 2015.

SLIDE 15

Leveraging Pipelining

Refs

  • A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Streamer: a distributed framework for incremental closeness centrality computation. IEEE Cluster 2013.
  • A. Sariyuce, E. Saule, K. Kaya, and U. Catalyurek. Incremental closeness centrality in distributed memory. Parallel Computing, 47:3–18, August 2015.

SLIDE 16

Outline

1. Closeness and Betweenness Centrality
2. Algorithmic Optimization
3. HPC Techniques
4. Incremental
5. Conclusion

SLIDE 17

Conclusion

Lots of work out there.

What is unclear (to me) or not done (that I know of)?

  • Is sampling really a closed problem?
  • Reconcile graph decomposition and sampling.
  • Reconcile graph decomposition and regularization.
  • Better SpMM-based algorithms.
  • A good implementation that uses all these techniques.
  • Why do we keep on reinventing the same techniques?
  • Does distributed memory (besides replication) ever make sense for centrality?
