SLIDE 1

Large Scale Graph Analysis

Erik Saule

HPC Lab Biomedical Informatics The Ohio State University

March 11, 2013 UMass Boston

Erik Saule Ohio State University, Biomedical Informatics HPC Lab http://bmi.osu.edu/hpc Large Scale Graph Analysis :: 1 / 43

SLIDE 2

Outline

1. Introduction
2. theadvisor: Citation Analysis for Document Recommendation; A High Performance Computing Problem; Result Diversification
3. Centrality: Compression and Shattering; Storage Format for GPU Acceleration; Incremental Algorithms
4. Data Management: Middleware for Data Analysis; Out-of-Core Computing
5. Conclusion


SLIDE 7

Data in the Modern Days

Facebook

1B active users a month. Each day: 2.5B content items shared, 2.7B Likes, 300M photos, 500TB of data.

Twitter

500M users. 340M tweets/day (2,200/sec). 24.1M Super Bowl tweets.

Academic networks

1.5M papers/year (4,000/day). 100,000 papers/year in CS.

Transportation

10M trips/day on Paris public transportation. 2.5M registered vehicles in LA, 1.2M of them used for commuting each day.

Compositing

Problems can also combine multiple sources, e.g., identifying coauthors on Facebook.


SLIDE 10

Are these problems new?

"CERN report 1959," about a 1H experiment on the synchrocyclotron:

"The use of the computer in this sort of measurement is important, not only because of the large amounts of data which must be handled, but because with a modern high speed computer one can search quickly for various systematic errors."

But also...

Intrusion detection in computer security. Search engines. Stock market predictions. Weather forecasting.

Not so new!


SLIDE 12

So why is it important now?

Ubiquitous

Scientists (LHC, metagenomics). Big companies (data companies, operational marketing). Small companies (website logs: who buys what, and where?). Individuals (personal analytics).

In brief, everybody has Big Data problems now!

None of this data can be analyzed manually. Automatic analysis is mandatory.


SLIDE 15

The Three Attributes of Big Data

Variety

Unstructured data: graphs, hypergraphs, conceptual data.

Velocity

Flowing in the system: streaming data, temporal data, a flow of queries.

Volume

In high volume: millions, billions, trillions of vertices and edges.

Problems

Storing and transporting such data. Extracting the important data and building a graph (or another structure). Analyzing the graph: static analysis, recurrent analysis, temporal analysis.


SLIDE 21

My Goal

Study Big Data problems and design solutions for them.

Applications (Source)

Facebook, theadvisor, twitter, CiteULike, traffic cameras, transportation systems

Algorithms (Analysis)

PageRank, random walks, traversals, centrality, community detection, outlier detection, visualization

Middleware

MPI, Hadoop, Pegasus, GraphLab, DOoC+LAF, DataCutter, SQL, SPARQL

Hardware

Clusters, Cray XMT, Intel Xeon Phi, FPGAs, SSDs, NVRAM, InfiniBand, cloud computing, GPUs

What to use? When to use them? What is missing?

SLIDE 22

Outline

1. Introduction
2. theadvisor: Citation Analysis for Document Recommendation; A High Performance Computing Problem; Result Diversification
3. Centrality: Compression and Shattering; Storage Format for GPU Acceleration; Incremental Algorithms
4. Data Management: Middleware for Data Analysis; Out-of-Core Computing
5. Conclusion


SLIDE 24

A Use Case

http://theadvisor.osu.edu/

Using the Citation Graph

Hypothesis: if two papers are related or treat the same subject, then they are close to each other in the citation graph (and reciprocally).

SLIDE 25

Global Approaches: PageRank

Let G = (V, E) be the citation graph.

Personalized PageRank [Haveliwala02]

πi(u) = d·p*(u) + (1 − d) · Σ_{v ∈ N(u)} π_{i−1}(v) / δ(v), with Σ_u p*(u) = 1.

source: wikipedia
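The personalized PageRank recurrence above can be sketched as a plain power iteration. This is a minimal illustrative sketch, not the theadvisor implementation; the names (in_neighbors, out_degree, p_star) and the toy 3-cycle graph are invented for the example.

```python
def personalized_pagerank(in_neighbors, out_degree, p_star, d=0.15, iters=100):
    """pi_i(u) = d*p_star(u) + (1 - d) * sum over v in N(u) of pi_{i-1}(v)/delta(v).

    in_neighbors[u]: vertices linking to u; out_degree[v] = delta(v);
    p_star: restart distribution personalized to the query (sums to 1).
    """
    pi = dict(p_star)  # start the iteration from the restart distribution
    for _ in range(iters):
        pi = {u: d * p_star[u]
              + (1 - d) * sum(pi[v] / out_degree[v] for v in in_neighbors[u])
              for u in in_neighbors}
    return pi

# Tiny example: a directed 3-cycle a -> b -> c -> a with a uniform restart vector.
scores = personalized_pagerank(
    in_neighbors={"a": ["c"], "b": ["a"], "c": ["b"]},
    out_degree={"a": 1, "b": 1, "c": 1},
    p_star={"a": 1 / 3, "b": 1 / 3, "c": 1 / 3})
```

On the symmetric cycle the uniform vector is a fixed point, so the scores stay at 1/3 each, which is a quick sanity check that the iteration conserves mass.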

SLIDE 26

Direction Awareness [DBRank12]

Time exploration

What if we are interested in searching papers by year? Recent papers? Traditional papers? Let Q be a set of known relevant papers.

Direction-Aware Random Walk with Restart

πi(u) = d·p*(u) + (1 − d) · ( κ · Σ_{v ∈ N+(u)} π_{i−1}(v) / δ−(v) + (1 − κ) · Σ_{v ∈ N−(u)} π_{i−1}(v) / δ+(v) )

where d ∈ (0, 1) is the damping factor, κ ∈ (0, 1), and p*(u) = 1/|Q| if u ∈ Q, p*(u) = 0 otherwise.

[Diagram: restart edges, reference edges, and back-reference (citation) edges, with their transition weights.]
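The DaRWR recurrence transcribes directly into code. The sketch below follows the slide's formula literally (out-neighbors weighted by κ/δ−, in-neighbors by (1 − κ)/δ+); the names (n_plus, d_minus, ...) and the two-paper toy graph are invented for illustration.

```python
def darwr(n_plus, n_minus, d_plus, d_minus, Q, d=0.15, kappa=0.5, iters=60):
    """Direction-aware random walk with restart, as in the slide's recurrence.

    n_plus[u]/n_minus[u]: out-/in-neighbors of u in the citation graph;
    d_plus[v]/d_minus[v]: out-/in-degree of v; Q: set of known relevant papers.
    """
    nodes = list(n_plus)
    # Restart vector: uniform over the known relevant papers Q.
    p_star = {u: (1.0 / len(Q) if u in Q else 0.0) for u in nodes}
    pi = dict(p_star)
    for _ in range(iters):
        pi = {u: d * p_star[u] + (1 - d) * (
                  kappa * sum(pi[v] / d_minus[v] for v in n_plus[u])
                  + (1 - kappa) * sum(pi[v] / d_plus[v] for v in n_minus[u]))
              for u in nodes}
    return pi

# Two papers citing each other, both known to be relevant.
scores = darwr(n_plus={"a": ["b"], "b": ["a"]}, n_minus={"a": ["b"], "b": ["a"]},
               d_plus={"a": 1, "b": 1}, d_minus={"a": 1, "b": 1}, Q={"a", "b"})
```

By symmetry both papers should end at 0.5 regardless of κ, which exercises both the forward and backward terms of the recurrence.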

SLIDE 27

Analysis: Time and Accuracy

[Contour plot: average publication year of recommendations (1980-2010) as a function of d and κ.]

Accuracy (mean and interval) when hiding papers:

Method | hide random | hide recent | hide earlier
DaRWR  | 48.00 [46.80, 49.20] | 42.22 [40.95, 43.50] | 60.64 [59.48, 61.80]
P.R.   | 56.56 [55.31, 57.80] | 38.75 [37.50, 40.00] | 58.93 [57.76, 60.10]
Katzβ  | 46.33 [45.16, 47.50] | 34.56 [33.42, 35.70] | 44.19 [42.97, 45.40]
Cocit  | 44.60 [43.39, 45.80] | 14.22 [13.25, 15.20] | 55.97 [54.64, 57.30]
Cocoup | 17.28 [16.36, 18.20] | 17.56 [16.61, 18.50] |  2.93 [2.57, 3.30]
CCIDF  | 18.05 [17.11, 19.00] | 18.97 [17.94, 20.00] |  3.55 [3.10, 4.00]

SLIDE 28

A Sparse Matrix-Vector Multiplication (SpMV)

Rewriting DaRWR

πi(u) = d·p*(u) + (1 − d) · [ κ · Σ_{v ∈ N+(u)} π_{i−1}(v) / δ−(v) + (1 − κ) · Σ_{v ∈ N−(u)} π_{i−1}(v) / δ+(v) ]

πi(u) = d·p*(u) + Σ_{v ∈ N+(u)} ((1 − d)·κ / δ−(v)) · π_{i−1}(v) + Σ_{v ∈ N−(u)} ((1 − d)·(1 − κ) / δ+(v)) · π_{i−1}(v)

πi = d·p* + A−·π_{i−1} + A+·π_{i−1}

πi = d·p* + A·π_{i−1}  (CRS Full)

πi = d·p* + B−·((1 − d)·κ / δ−)·π_{i−1} + B+·((1 − d)·(1 − κ) / δ+)·π_{i−1}  (CRS Half)
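In the "CRS Full" form the iteration is just πi = d·p* + A·π_{i−1}, one SpMV per step. A minimal sketch of that kernel, assuming the (1 − d)·κ/δ edge weights are already folded into the matrix values; the array names and the 2x2 example are illustrative, not the paper's code.

```python
def csr_spmv(row_ptr, col_idx, vals, x):
    """y = A*x with A in CRS: row i's nonzeros live in [row_ptr[i], row_ptr[i+1])."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y

def darwr_step(p_star, row_ptr, col_idx, vals, x, d=0.15):
    """One 'CRS Full' iteration: pi = d*p_star + A*pi_prev."""
    y = csr_spmv(row_ptr, col_idx, vals, x)
    return [d * p + yi for p, yi in zip(p_star, y)]

# 2x2 example: A = [[0, 0.85], [0.85, 0]] stored in CRS form.
pi = darwr_step(p_star=[0.5, 0.5], row_ptr=[0, 1, 2], col_idx=[1, 0],
                vals=[0.85, 0.85], x=[0.5, 0.5])
```

The "CRS Half" variant stores only one triangle (B− and B+) and applies the scalar factors on the fly, halving storage at the cost of extra arithmetic per nonzero.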

SLIDE 29

Partitioning and Ordering

SLIDE 30

Runtimes - AMD Opteron 2378 [ASONAM12]

[Plots: execution time (s) vs. number of partitions (1-64) for CRS-Full, CRS-Half, COO-Half, and Hybrid formats, each with natural, RCM, AMD, and SB orderings.]


SLIDE 33

Diversification: Principle

[Figure: a result set that is only relevant vs. one that is both relevant and diverse.]


SLIDE 35

Metrics and Algorithms [CSTA13]

[Plots: rel, diff, average year, dens2, σ2, and running time (sec) vs. k ∈ {5, 10, 20, 50, 100} for DaRWR (top-k), GrassHopper, PDivRank, CDivRank, Dragon, LM, and k-RLM.]

k-RLM is good.

SLIDE 36

Results

[Diagram: the theadvisor pipeline takes references and returns the top-100 recommendations, built on GPU and multicore generic SpMV, eigensolvers, partitioning, compression, and graph mining.]


SLIDE 39

A Modeling Problem [WWW13]

[Scatter plots: dens2 vs. rel for the known algorithms; up and to the right is better.]

Here is the distribution of known algorithms. Would such an algorithm be of interest? That algorithm is query-oblivious!

Expanded Relevance

Sum the relevance of all documents within distance ℓ of a recommendation.
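Expanded relevance can be sketched as a multi-source BFS of radius ℓ from the recommended set, summing relevance over the union of the balls so each covered document counts once. A hedged sketch; the adjacency-dict representation and the path example are invented.

```python
from collections import deque

def expanded_relevance(adj, rel, results, ell):
    """Total relevance of every document within distance ell of some recommendation."""
    seen = set(results)
    frontier = deque((r, 0) for r in results)
    while frontier:
        u, dist = frontier.popleft()
        if dist == ell:          # ball boundary reached: stop expanding here
            continue
        for v in adj[u]:
            if v not in seen:    # union of balls: count each document once
                seen.add(v)
                frontier.append((v, dist + 1))
    return sum(rel[u] for u in seen)

# Path a - b - c - d, every document equally relevant, one recommendation at a.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
covered = expanded_relevance(adj, {u: 1.0 for u in adj}, {"a"}, ell=2)
```

With ℓ = 2 the recommendation at a covers a, b, and c, so the expanded relevance is 3; a query-oblivious set spread far from relevant documents would cover little relevance, which is exactly what the metric penalizes.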

SLIDE 40

Outline

1. Introduction
2. theadvisor: Citation Analysis for Document Recommendation; A High Performance Computing Problem; Result Diversification
3. Centrality: Compression and Shattering; Storage Format for GPU Acceleration; Incremental Algorithms
4. Data Management: Middleware for Data Analysis; Out-of-Core Computing
5. Conclusion

SLIDE 41

Centralities - Concept

Answers questions such as: Who controls the flow in a network? Who is more important? Who has more influence? Whose contribution is significant for connections?

Applications

Covert networks (e.g., terrorist identification). Contingency analysis (e.g., weakness/robustness of networks). Viral marketing (e.g., who will spread the word best). Traffic analysis. Store locations.


SLIDE 43

Centralities - Definition

Let G = (V, E) be a graph with vertex set V and edge set E.

Closeness centrality: cc[v] = 1 / far[v], where the farness is defined as far[v] = Σ_{u ∈ comp(v)} d(u, v), and d(u, v) is the shortest path length between u and v.

Betweenness centrality: bc(v) = Σ_{s ≠ v ≠ t ∈ V} σst(v) / σst, where σst is the number of shortest paths between s and t, and σst(v) is the number of them passing through v.

Both metrics care about the structure of the shortest path graph. Brandes' algorithm computes the shortest path graph rooted at each vertex of the graph: O(|E|) per source, O(|V||E|) in total. Believed to be asymptotically optimal [Kintali08].
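For unweighted graphs the closeness definition maps directly onto one BFS per source, giving the O(|V||E|) total cost mentioned above. A minimal sketch, assuming an adjacency-dict graph representation; isolated vertices are given cc = 0 by convention here.

```python
from collections import deque

def closeness(adj):
    """cc[v] = 1/far[v], with far[v] = sum of BFS distances to v's component."""
    cc = {}
    for s in adj:                     # one BFS per source: O(|E|) each
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        far = sum(dist.values())      # farness restricted to comp(v)
        cc[s] = 1.0 / far if far else 0.0
    return cc

# Path a - b - c: the middle vertex is the most central.
cc = closeness({"a": ["b"], "b": ["a", "c"], "c": ["b"]})
```

On the path, far[b] = 2 and far[a] = far[c] = 3, so b correctly gets the highest score. Betweenness needs the heavier Brandes dependency accumulation and is omitted from this sketch.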

SLIDE 44

Compression and Shattering

[Diagram: compressing and shattering a graph around vertex sets A and B, with correction terms +|A| and +|B| for the removed parts and a x2 multiplicity for identical components.]

SLIDE 45

BADIOS [SDM2013]

[Bar chart: relative running time (preprocessing, Phase 1, Phase 2) on Epinions, Gowalla, bcsstk32, NotreDame, RoadPA, Amazon0601, Google, and WikiTalk, with absolute times ranging from 41s to 5d5h.]

SLIDE 46

Matrix Representations for GPUs

CRS [Shi11]

One thread per vertex: bad load balance.

COO [Jia11]

One thread per edge: too many atomics.

Virtual-vertex

Balances load and limits atomics.

Stride

Enables coalesced memory accesses.
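The virtual-vertex idea can be illustrated host-side: split each vertex's edge list into chunks of bounded size, assign one thread per chunk (bounded work, so good load balance), and combine the per-chunk partial results for the same real vertex afterwards, which is where the few remaining atomics go on the GPU. This is an illustrative sketch with invented names; the CUDA kernel itself is omitted.

```python
def virtualize(adj, chunk):
    """Split each vertex's edge list into virtual vertices of at most `chunk` edges."""
    virtual = []  # list of (real_vertex, edge_chunk) pairs, one per thread
    for u, nbrs in adj.items():
        # max(len, 1) keeps degree-0 vertices as a single empty chunk
        for i in range(0, max(len(nbrs), 1), chunk):
            virtual.append((u, nbrs[i:i + chunk]))
    return virtual

# A degree-5 hub next to a degree-1 vertex, chunked at 2 edges per thread.
vv = virtualize({"hub": [0, 1, 2, 3, 4], "leaf": [0]}, chunk=2)
```

The hub becomes three virtual vertices and the leaf one, so no thread handles more than two edges; the stride variant additionally reorders the chunked edges so consecutive threads touch consecutive memory.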

SLIDE 47

NVIDIA C2050 performance [GPGPU2013]

[Bar chart: speedup over a single CPU thread (0-11x) for the GPU vertex, edge, virtual, and stride variants.]


SLIDE 51

Edge Insertion for Closeness Centrality: Three Cases

If d(u, s) = d(v, s): the shortest path graph does not differ, so the farness of s is correct.

If d(u, s) + 1 = d(v, s): the shortest path graph differs by exactly one edge, but the levels stay the same, so the farness of s is still correct.

If d(u, s) + 1 < d(v, s): the shortest path graph differs by at least one edge and the level of v changes (and potentially more), so the farness of s is incorrect.

Algorithm

Upon insertion of (u, v): compute BFS from u and from v (before the edge insertion); for all s ≠ u, v, if |d(u, s) − d(v, s)| > 1, flag s; add (u, v) to the graph; recompute cc[s] for all flagged s.
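The filtering algorithm above can be sketched end to end. This is a self-contained illustration, not the paper's optimized code: the per-source recomputation is a fresh BFS, and u and v themselves are recomputed unconditionally for simplicity.

```python
from collections import deque

def bfs_dist(adj, s):
    """Single-source BFS distances from s."""
    dist = {s: 0}
    q = deque([s])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                q.append(y)
    return dist

def closeness_of(adj, s):
    far = sum(bfs_dist(adj, s).values())
    return 1.0 / far if far else 0.0

def insert_edge(adj, cc, u, v):
    """BFS from u and v before inserting (u, v); flag sources s with
    |d(u,s) - d(v,s)| > 1 (the slide's third case); recompute only those."""
    du, dv = bfs_dist(adj, u), bfs_dist(adj, v)
    inf = float("inf")
    flagged = [s for s in adj if s not in (u, v)
               and abs(du.get(s, inf) - dv.get(s, inf)) > 1]
    adj[u].append(v)
    adj[v].append(u)
    for s in flagged + [u, v]:
        cc[s] = closeness_of(adj, s)
    return cc

# Close a path a - b - c - d into a cycle and update incrementally.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
cc = {s: closeness_of(adj, s) for s in adj}
insert_edge(adj, cc, "a", "d")
```

On this example b and c satisfy |d(u, s) − d(v, s)| = 1 and are correctly skipped: their farness is unchanged by the new edge, so the incrementally maintained values match a full recomputation.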

SLIDE 52

Results: Speedup

Graph | CC-B | CC-BL
soc-sign-epinions | 3.0 | 37.8
loc-gowalla_edges | 1.8 | 17.1
bcsstk32 | 1.0 | 5,493.0
web-NotreDame | 4.9 | 23.9
roadNet-PA | 1.6 | 3.0
amazon0601 | 1.2 | 27.6
web-Google | 3.0 | 26.6
wiki-Talk | 6.8 | 69.8
Geometric mean | 2.39 | 43.58

SLIDE 53

Outline

1. Introduction
2. theadvisor: Citation Analysis for Document Recommendation; A High Performance Computing Problem; Result Diversification
3. Centrality: Compression and Shattering; Storage Format for GPU Acceleration; Incremental Algorithms
4. Data Management: Middleware for Data Analysis; Out-of-Core Computing
5. Conclusion

SLIDE 54

Data Flow Middleware

Applications

Image processing, video surveillance, MRI analysis, satellite data processing.

Complex Hardware

Accelerators (GPU, Xeon Phi), clusters, grids, heterogeneous systems.

Versatile [CSUR13]

Supports pipeline, task, replicated, and data parallelism. Optimizes throughput, latency, energy, and reliability.

SLIDE 55

DataCutter [Beynon01]

[Diagram: filters A-E laid out as a filter-stream graph and placed across Node 0-2, CPUs, and a GPU, with some filters replicated.]

Filter Stream

A programming model that specifies the layout of an application: a set of filters that transform the data streamed through them.

Placement

A given layout can be executed in different ways by the programming framework; filters can potentially be replicated.

SLIDE 56

APC+

How much to cut the work?

Small chunks pay overhead; big chunks pay imbalance. Adapt to the network and to the computing units.

[Diagram: each node runs an APC+ agent that pulls work chunks (A, B, C, D) and executes them locally.]

Performance Model

Predicts computation time for every chunk size and processing unit.

Workload Partitioner [HiPC10]

Predicts the end of the computation to balance the load.

Distributed Work Stealing

Balances the load across multiple nodes.

Storage Layer

Schedules data transfers to optimize network performance.

SLIDE 57

SAR imaging - Weak Scaling - CPU/GPU [Parco12]

[Bar chart: execution time (seconds) on 1-32 nodes for DC-APC+, DC-DD, KAAPI, and MR-MPI, broken down into TCP, BOOT, OVER, L-IMB, and PROC components.]


SLIDE 59

A Peta-Scale Nuclear Physics Problem

Extract the lowest eigenpairs of a large Hamiltonian matrix, whose size grows with the number of particles and with the truncation parameter Nmax.

For Boron-10, with Nmax = 8 and 2-body interactions (a toy case): 160 million rows, 124 billion nonzero elements.

[Plots: M-scheme basis space dimension vs. Nmax for 4He, 6Li, 8Be, 10B, 12C, 16O, 19F, 23Na, and 27Al; and number of nonzero matrix elements vs. Nmax for 16O with 2-, 3-, 4-, and A-body interactions.]

Two options: use a really large machine, or use out-of-core computing (SSD).

SLIDE 60

DOoC+LAF

[Architecture diagram: end-user LOBPCG code (e.g., SymSpMM(H, psi), dot(phiT, phi)) passes through primitive conversion into a global task graph managed by a global scheduler; each compute node runs a local scheduler and a storage service holding data chunks, executing SpMM and dot tasks and requesting data as needed.]

SLIDE 61

Linear Algebra Frontend

Provides data types and operations for linear algebra.

Lanczos

Lanczos(v_in, M, a_in, b_in, v_out, a_out, b_out) {
  Vector w(v_out.meta());
  Vector wprime(v_out.meta());
  Vector wsecond(v_out.meta());
  symSpMV(w, M, v_in);
  axpyV(wprime, w, v_in, 1, -b_in);
  dot(a_out, wprime, v_in);
  axpyV(wsecond, wprime, v_in, 1, -a_out);
  dot(b_out, wsecond, wsecond);
  vector_scale(v_out, wsecond, 1/b_out);
}

Supported Operations

Primitives that create a Matrix: MM, (Sym)SpMM: C = AB; addM: C = A + B; axpyM: C = aA + bB; randomM: C = random().
Primitives that create a Vector: MV, (Sym)SpMV: y = Ax; addV: y = x + w; axpyV: y = ax + bw.
Primitives that create a scalar: dot: a = <x, y>.
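What the frontend's Lanczos expresses is the classic three-term recurrence. A plain-Python sketch of one step, hedged: the names are illustrative, the textbook normalization β = ||w|| is used (the slide's DSL version is a simplification), and the diagonal toy matrix is made up for the example.

```python
from math import sqrt

def lanczos_step(matvec, v_prev, v_cur, beta):
    """One Lanczos iteration: w = M v - beta*v_prev; alpha = <w, v>;
    w -= alpha*v; beta_next = ||w||; v_next = w / beta_next."""
    w = [wi - beta * pi for wi, pi in zip(matvec(v_cur), v_prev)]
    alpha = sum(wi * vi for wi, vi in zip(w, v_cur))       # diagonal entry of T
    w = [wi - alpha * vi for wi, vi in zip(w, v_cur)]      # orthogonalize
    beta_next = sqrt(sum(wi * wi for wi in w))             # off-diagonal entry
    v_next = [wi / beta_next for wi in w] if beta_next else w
    return alpha, beta_next, v_next

# M = diag(1, 2): the step produces the tridiagonal entries alpha = 1.5, beta = 0.5,
# and the resulting 2x2 tridiagonal matrix has the exact spectrum {1, 2}.
mv = lambda x: [1 * x[0], 2 * x[1]]
s = 1 / sqrt(2)
a1, b1, v2 = lanczos_step(mv, [0.0, 0.0], [s, s], 0.0)
```

In the DOoC setting each line of this step maps onto a frontend primitive (symSpMV, dot, axpyV), which is exactly what lets the scheduler run the pieces out of core.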

SLIDE 62

5 Lanczos Iterations at NERSC [Cluster12]

SLIDE 63

Outline

1. Introduction
2. theadvisor: Citation Analysis for Document Recommendation; A High Performance Computing Problem; Result Diversification
3. Centrality: Compression and Shattering; Storage Format for GPU Acceleration; Incremental Algorithms
4. Data Management: Middleware for Data Analysis; Out-of-Core Computing
5. Conclusion


slide-66
SLIDE 66

Other things I do

Scheduling, Mapping, Partitioning

Areas: application scheduling, cluster scheduling, pipelined scheduling, spatial workload partitioning. Multi-objective: makespan, throughput, fairness, latency, reliability. Techniques: optimal algorithms, approximation algorithms, heuristics.

Parallel Graph Algorithms

Scalable distributed-memory local search for graph coloring. Communication reduction and compression. Hybrid MPI/OpenMP.

Cutting Edge Architecture

Investigated graph algorithms and sparse linear algebra operations on pre-release Intel Xeon Phi.



slide-71
SLIDE 71

Conclusions - My Philosophy

Applications

Analyze data sources What are we trying to do? What is important?

Algorithms

Design Re-engineer Approximate Incremental

Middleware

Makes the software: Easier to write Reusable Efficient

Hardware

What is suitable? How to use it? How to improve it?

Which is important? All of it!


slide-72
SLIDE 72

What’s Next?

Applications

Multi-graph (author, venue, paper). Personal analytics. Cross-social-network applications.

Algorithms

Streaming Community detection Temporal analysis

Middleware

High-level queries. Cluster-with-accelerator support. Graph middleware: the MATLAB of graphs.

Hardware

Cluster with computational accelerators (GPU, Xeon Phi). Cluster with storage accelerators (SSD). Both! (Beacon project)


slide-73
SLIDE 73

Active Coauthors

The Ohio State University: Ümit V. Çatalyürek, Kamer Kaya, Onur Küçüktunç, Ahmet Erdem Sarıyüce. Grenoble University, France: Denis Trystram, Grégory Mounié, Pierre-François Dutot, Jean-François Méhaut, Guillaume Huard. INRIA, France: Yves Robert, Anne Benoit, Emmanuel Jeannot, Alain Girault. Lawrence Berkeley National Lab: Esmond G. Ng, Chao Yang, Hasan Metin Aktulga. University of Tennessee: Jack Dongarra. University of Luxembourg: Johnatan Pecero-Sanchez. Vanderbilt University: Zhiao Shi. Iowa State University: James Vary, Pieter Maris.


slide-74
SLIDE 74

Thank you

More information

Contact: esaule@bmi.osu.edu. Visit: http://bmi.osu.edu/~esaule and http://bmi.osu.edu/hpc/

Research at HPC lab is supported by
