SLIDE 1

Parallel Algorithms for Generating Random Networks with Given Degree Sequences

Maksudul Alam (1, 2) and Maleq Khan (2)

(1) Department of Computer Science, Virginia Tech
(2) Network Dynamics and Simulation Science Laboratory, Virginia Bioinformatics Institute

September 17, 2015

Maksudul Alam (Virginia Tech) Generating Rand. Net. with Given Deg. Seq. September 17, 2015 1 / 34

SLIDE 2

Outline

1. Emergence of Massive Real-World Networks
   - Massive Random Network
   - Real-World Networks
2. Generating Networks with a Given Expected Degree Sequence
   - Chung–Lu Model
   - Generative Algorithm for the Chung–Lu Model
3. Parallel Generation of Networks using Chung–Lu Model
   - Challenges
   - Parallel Algorithm
   - Load Balancing and Partitioning
   - Results

SLIDE 3

Emergence of Massive Real-World Networks Massive Random Network

Emergence of Massive Networks

- Random graphs are used to model complex real-world systems
- We need massive networks for realistic analysis
  - Many patterns emerge only in massive datasets
  - A smaller network may not exhibit the same behavior as a larger network [Leskovec 2008]

SLIDE 4

Emergence of Massive Real-World Networks Real-World Networks

Real-World Networks

- Real-world networks are massive in size
- Many networks exhibit a power-law degree distribution
- Many networks have no well-defined degree distribution
- How can we generate networks with these diverse distributions?

Examples: Twitter, 316 M active users; Facebook, 1.49 B active users

[Figure: degree distribution]

SLIDE 5

Emergence of Massive Real-World Networks Real-World Networks

Diverse Degree Distributions

[Figure: degree distributions of six networks: Twitter, Friendster, Blackout, Miami, Earthquake, Fires]

SLIDE 6

Generating Networks with a Given Expected Degree Sequence Chung–Lu Model

Chung–Lu Model

- The Chung–Lu model generates networks from any expected degree sequence
- Given $n$ nodes $V = \{0, 1, 2, \dots, n-1\}$ with corresponding weights $w = (w_0, w_1, \dots, w_{n-1})$
- An edge between nodes $i$ and $j$ is added with probability
  \[ p_{i,j} = \frac{w_i w_j}{\sum_k w_k} \]
- If no self-loop is allowed, the expected degree of node $i$ is
  \[ E[d_i] = \sum_{j \neq i} \frac{w_i w_j}{\sum_k w_k} = w_i \, \frac{\sum_j w_j - w_i}{\sum_k w_k} = w_i - \frac{w_i^2}{\sum_k w_k} \]
- In a large graph, typically $E[d_i] \approx w_i$, as $\frac{w_i^2}{\sum_k w_k} \to 0$ [Miller 2011]
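The expected-degree formula can be checked numerically. The following is a minimal sketch, assuming a small hypothetical weight list; the function names are illustrative and do not come from the slides. It evaluates the closed form w_i − w_i²/S and compares it with the sum of the edge probabilities over all j ≠ i.

```python
def expected_degrees(weights):
    """Expected degree of each node in the Chung-Lu model without self-loops."""
    total = sum(weights)  # S = sum_k w_k
    # Closed form: E[d_i] = w_i - w_i**2 / S
    return [w - w * w / total for w in weights]


def expected_degrees_by_definition(weights):
    """Same quantity, summing p_ij over all j != i (quadratic, for checking only)."""
    total = sum(weights)
    n = len(weights)
    return [
        sum(weights[i] * weights[j] / total for j in range(n) if j != i)
        for i in range(n)
    ]


weights = [4.0, 3.0, 2.0, 1.0]  # hypothetical example weights
print(expected_degrees(weights))
print(expected_degrees_by_definition(weights))
```

Both lists agree up to floating-point rounding; the gap between E[d_i] and w_i shrinks as S grows relative to w_i².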

SLIDE 7

Generating Networks with a Given Expected Degree Sequence Generative Algorithm for the Chung–Lu Model

Naïve Sequential Algorithm for the CL Model

Algorithm 1.1: Sequential Algorithm

procedure Serial–Chung–Lu(w, V)
    S ← ∑_{k∈V} w_k
    E ← Create–Edges(w, S, V)

procedure Create–Edges(w, S, V)
    E ← ∅
    for all i ∈ [0, n − 2] do
        for all j ∈ [i + 1, n − 1] do
            Add edge (i, j) to E with probability w_i w_j / S
    return E

- For $n$ nodes there are $\binom{n}{2} = \frac{n(n-1)}{2} = O(n^2)$ possible edges
- Therefore the runtime complexity of the naïve algorithm is $O(n^2)$
- How to improve it?
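Algorithm 1.1 translates almost line by line into Python. This is an illustrative sketch rather than the authors' implementation: the function name, the `seed` parameter, and the cap of the probability at 1 are assumptions added here.

```python
import random


def naive_chung_lu(weights, seed=None):
    """Naive O(n^2) Chung-Lu generator: test every pair (i, j) with i < j."""
    rng = random.Random(seed)
    n = len(weights)
    total = sum(weights)  # S
    edges = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Cap at 1 so the value is a valid probability (an assumption;
            # Algorithm 1.1 uses w_i * w_j / S directly).
            p = min(weights[i] * weights[j] / total, 1.0)
            if rng.random() < p:
                edges.append((i, j))
    return edges


print(naive_chung_lu([3.0, 2.0, 2.0, 1.0], seed=1))
```

Every one of the n(n − 1)/2 pairs costs one random draw, which is where the quadratic running time comes from.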

SLIDE 9

Generating Networks with a Given Expected Degree Sequence Generative Algorithm for the Chung–Lu Model

Recall Erdos–Renyi Model

- A set of $n$ nodes $V = \{0, 1, 2, \dots, n-1\}$
- The probability of an edge between nodes $i$ and $j$ is $p$ (constant)
- For every possible node pair, add an edge with probability $p$

[Figure: generating an ER network with 5 nodes. The graph is shown beside the list of all possible edges: e1 = (0,1), e2 = (0,2), e3 = (0,3), e4 = (0,4), e5 = (1,2), e6 = (1,3), e7 = (1,4), e8 = (2,3), e9 = (2,4), e10 = (3,4)]

- Runtime complexity of $O(n^2)$
- Batagelj and Brandes proposed an efficient $O(n + m)$ algorithm

SLIDE 13

Generating Networks with a Given Expected Degree Sequence Generative Algorithm for the Chung–Lu Model

Discarded Potential Edges

- A sequence of potential edges is discarded before an edge is created
- What if we could count the number of discarded edges beforehand and add the next potential edge to the graph?
- It would take $O(m)$ time (instead of $O(n^2)$), where $m$ is the number of edges

[Figure: the graph and the list of all possible edges e1, ..., e10, with one run of 2 discarded edges and one run of 4 discarded edges marked]

- The only challenge is how to determine the number of discarded edges while still maintaining the same edge creation probability $p$

SLIDE 14

Generating Networks with a Given Expected Degree Sequence Generative Algorithm for the Chung–Lu Model

Efficient Algorithm for the ER Model

- Creating an edge is a Bernoulli trial with success probability $p$
- Let $X$ be the random variable that denotes the number of trials needed to generate an edge, counting the successful trial
- Note that $X$ is a geometric random variable
- Let $f(\delta)$ be the probability that an edge is generated after exactly $\delta$ trials ($f$ is the probability mass function of $X$):
  \[ f(\delta) = \Pr(X = \delta) = (1-p)^{\delta-1} p = q^{\delta-1} p, \quad \text{where } q = 1 - p \]
- The cumulative distribution function $F(\delta)$ of $X$ is given by:
  \[ F(\delta) = \Pr(X \le \delta) = \sum_{i=1}^{\delta} q^{i-1} p = \sum_{j=0}^{\delta-1} q^{j} p = p \, \frac{1-q^{\delta}}{1-q} = 1 - q^{\delta} \]
- Note that $\delta - 1$ edges are discarded; $\delta - 1$ is also called the skip length

SLIDE 15

Generating Networks with a Given Expected Degree Sequence Generative Algorithm for the Chung–Lu Model

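Efficient Algorithm for the ER Model

- How do we determine $\delta$ while maintaining the probabilities?
- $\delta$ can be determined from a uniform random number [Batagelj 2005]
- Let $r \in [0, 1)$ be a number chosen uniformly at random

[Figure: the interval $[0, 1)$ divided into consecutive segments of lengths $f(1) = p$, $f(2) = qp$, $f(3) = q^2 p$, $f(4)$, ..., with segment boundaries $F(1), F(2), F(3), F(4), \dots$; the random number $r$ falls into one of the segments]

- The probability that $r \in [F(\delta - 1), F(\delta))$ is $F(\delta) - F(\delta - 1) = f(\delta)$
- Therefore we have:
  \[ r < F(\delta) = 1 - q^{\delta} \iff \delta > \frac{\log(1-r)}{\log q} \]
- Taking the smallest such integer gives $\delta$, so the skip length is
  \[ \delta - 1 = \left\lfloor \frac{\log(1-r)}{\log(1-p)} \right\rfloor \]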
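Drawing δ from a single uniform random number can be sketched as follows. The function name is hypothetical, and δ here counts the successful trial, so the skip length is δ − 1 = ⌊log(1 − r)/log(1 − p)⌋. `math.log1p(-p)` is used in place of `log(1 - p)` only to stay accurate for very small p.

```python
import math
import random


def sample_trials(p, rng):
    """Draw delta >= 1, the number of Bernoulli(p) trials up to the first success.

    Assumes 0 < p <= 1.
    """
    if p >= 1.0:
        return 1  # every trial succeeds
    r = rng.random()  # uniform in [0, 1)
    # Smallest integer delta with r < 1 - (1 - p)**delta.
    return 1 + math.floor(math.log(1.0 - r) / math.log1p(-p))


rng = random.Random(0)
samples = [sample_trials(0.25, rng) for _ in range(20000)]
print(sum(samples) / len(samples))  # should be close to 1 / p = 4
```

The sample mean approaches 1/p, the mean of a geometric random variable, which is a quick sanity check that the transformation preserves the distribution.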

SLIDE 16

Generating Networks with a Given Expected Degree Sequence Generative Algorithm for the Chung–Lu Model

Efficient Algorithm for the ER Model

Algorithm 1.2: Efficient Sequential ER Algorithm

1: procedure ER(V, p)
2:     E ← ∅, e ← 0
3:     while e < M do                ▹ M: number of potential edges, presumably n(n − 1)/2
4:         r ← uniform random number in [0, 1)
5:         δ ← 1 + ⌊log(1 − r) / log(1 − p)⌋
6:         if e + δ ≤ M, skip δ − 1 edges and add edge (e + δ) to E
7:         e ← e + δ
8:     return E

- Generate δ (Line 5)
- Skip δ − 1 edges and add the next edge to the graph (Line 6)
- Repeat until there are no more potential edges left
- Runtime $O(m + n)$ [Batagelj 2005]
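The skipping loop can be sketched in Python as follows. This is not the authors' code: the function name `fast_er`, the 1-based linear edge index, and the row-by-row mapping from that index to a pair (i, j) are assumptions made for illustration. The skip uses δ = 1 + ⌊log(1 − r)/log(1 − p)⌋, so that δ − 1 potential edges are passed over before each selected edge.

```python
import math
import random


def fast_er(n, p, seed=None):
    """Edge-skipping G(n, p) sketch; returns edges (i, j) with i < j."""
    rng = random.Random(seed)
    edges = []
    if p <= 0.0 or n < 2:
        return edges
    total_pairs = n * (n - 1) // 2  # number of potential edges
    log_q = math.log1p(-p) if p < 1.0 else None
    e = 0  # 1-based index of the last potential edge examined
    i, row_start = 0, 0  # row i holds pairs (i, i+1 .. n-1); row_start pairs precede it
    while True:
        if log_q is None:
            delta = 1  # p == 1: no edge is ever skipped
        else:
            delta = 1 + math.floor(math.log(1.0 - rng.random()) / log_q)
        e += delta
        if e > total_pairs:
            break
        # Map the linear index e to the pair (i, j); i only moves forward,
        # so this inner loop costs O(n) over the whole run.
        while e > row_start + (n - 1 - i):
            row_start += n - 1 - i
            i += 1
        edges.append((i, i + (e - row_start)))
    return edges


print(len(fast_er(200, 0.1, seed=3)))  # roughly p * n * (n - 1) / 2 edges
```

With p = 1 the sketch returns all ten pairs of the 5-node example in the order e1, ..., e10.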

SLIDE 17

Generating Networks with a Given Expected Degree Sequence Generative Algorithm for the Chung–Lu Model

Efficient Algorithm for the CL Model

- The same edge-skipping idea can be used for the CL model [Miller 2011]
- But the probability of an edge between a pair of nodes is not constant
- Solution [Miller 2011]:

[Figure: nodes $i, i+1, \dots, k, \dots, n-1$ shown with their weights $w_i, w_{i+1}, \dots, w_k, \dots, w_{n-1}$, sorted in descending order starting from $w_0$]

1. Sort the list of weights $w$ in descending order ($w_i \ge w_{i+1}$)
2. Each node $i \in V$ connects to nodes in the range $[i+1, n-1]$
3. Let $T_i$ be the task of creating edges from $i$, with starting probability $p = \frac{w_i w_{i+1}}{S}$
   - There are $n$ such tasks $\{T_u : u \in V\}$
4. Compute the skip length $\delta - 1$ using probability $p$
5. Let $k = i + \delta$ be the selected node
   - Note that $k$ is selected with probability $\frac{w_i w_{i+1}}{S}$ but should be selected with probability $\frac{w_i w_k}{S}$
   - Add edge $(i, k)$ to the graph with probability $\frac{q}{p}$, where $q = \frac{w_i w_k}{S}$
   - Set $p = q$ and repeat Step 4 until task $T_i$ is finished
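The acceptance step can be checked with a one-line calculation. It relies on the weights being sorted in descending order (Step 1), so that $q \le p$ and $q/p$ is a valid probability; here $p$ denotes the probability in use for skipping at the moment node $k$ is reached.

```latex
\[
\Pr[(i,k) \in E]
  = \underbrace{p}_{\text{$k$ is selected by skipping}}
    \cdot
    \underbrace{\frac{q}{p}}_{\text{$(i,k)$ is accepted}}
  = q = \frac{w_i w_k}{S}
\]
```

The first factor holds because skip lengths are geometric: each node passed while the skipping probability is $p$ is selected independently with probability $p$.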

SLIDE 25

Generating Networks with a Given Expected Degree Sequence Generative Algorithm for the Chung–Lu Model

Efficient Sequential Algorithm for the CL Model

Algorithm 2.1: Sequential Chung–Lu Algorithm

 1: procedure Serial–CL(w)
 2:     S ← ∑_k w_k
 3:     E ← Create–Edges(w, S, V)
 4: procedure Create–Edges(w, S, V)
 5:     E ← ∅
 6:     for all i ∈ V do
 7:         j ← i + 1, p ← min(w_i w_j / S, 1)
 8:         while j < n and p > 0 do
 9:             if p ≠ 1 then
10:                 choose a random r ∈ (0, 1)
11:                 δ ← ⌊log(r) / log(1 − p)⌋
12:             else
13:                 δ ← 0
14:             v ← j + δ, j ← v + 1        ▹ skip δ edges
15:             if v < n then
16:                 q ← min(w_i w_v / S, 1)
17:                 choose a random r ∈ (0, 1)
18:                 if r < q/p then
19:                     E ← E ∪ {(i, v)}
20:                 p ← q
21:     return E

- For each i ∈ V, Lines 6 to 20 create the edges from node i; this is called task T_i
- Create–Edges executes the n tasks {T_i : i ∈ V}
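Algorithm 2.1 can be sketched in Python as below. The function name and `seed` parameter are hypothetical, and two details are choices made for this sketch rather than taken from the slide: r is drawn from (0, 1] to avoid log(0), and j is advanced immediately after v is computed so the loop always makes progress, including when v ≥ n.

```python
import math
import random


def fast_chung_lu(weights, seed=None):
    """Edge-skipping Chung-Lu sketch; weights must be sorted in descending order."""
    rng = random.Random(seed)
    n = len(weights)
    total = sum(weights)  # S
    edges = []
    if total <= 0:
        return edges
    for i in range(n - 1):  # task T_i; the last node has no later neighbours
        j = i + 1
        p = min(weights[i] * weights[j] / total, 1.0)
        while j < n and p > 0.0:
            if p < 1.0:
                r = 1.0 - rng.random()  # uniform in (0, 1]
                delta = math.floor(math.log(r) / math.log1p(-p))
            else:
                delta = 0
            v = j + delta  # skip delta potential edges
            j = v + 1
            if v < n:
                q = min(weights[i] * weights[v] / total, 1.0)
                if rng.random() < q / p:  # accept with probability q / p
                    edges.append((i, v))
                p = q
    return edges


weights = [10.0] * 50 + [5.0] * 100 + [1.0] * 150  # hypothetical, already sorted
print(len(fast_chung_lu(weights, seed=7)))
```

For these weights the expected number of edges is (S² − Σ w_u²) / (2S), about 572, so repeated runs should produce edge counts scattered around that value.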

SLIDE 26

Parallel Generation of Networks using Chung–Lu Model Challenges

Parallel Generation of Graphs using the CL Model

- To parallelize the algorithm, the workload must be distributed among $P$ processors
- Efficient parallelization of Algorithm 2.1 requires:
  - Computing the sum $S = \sum_k w_k$ in parallel
    - The sum can be computed in $O\left(\frac{n}{P} + \log P\right)$ time
  - Dividing the task of executing Create–Edges into independent subtasks
    - Distribute the $n$ tasks $\{T_u : u \in V\}$ among $P$ processors
    - Note that executing $T_u$ does not depend on executing $T_v$ for $u \neq v$
  - Accurately estimating the computational cost of each task
    - What is the best estimate of the computational cost?
    - The computational costs must also be computed in parallel
  - Balancing the computational load among the processors
    - How should the tasks be distributed to achieve the best load balancing?
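The O(n/P + log P) bound for the sum comes from two phases: each processor first adds up its own block of about n/P weights, and the P partial sums are then combined in a tree. The sketch below only simulates this sequentially to make the round count visible; the function name and the block layout are assumptions, and no actual message passing takes place.

```python
def simulated_parallel_sum(weights, num_procs):
    """Simulate the two phases of a parallel sum on num_procs processors."""
    n = len(weights)
    # Phase 1: processor k sums its own block of about n / P weights.
    bounds = [k * n // num_procs for k in range(num_procs + 1)]
    partial = [sum(weights[bounds[k]:bounds[k + 1]]) for k in range(num_procs)]
    # Phase 2: tree reduction; each round halves the number of active
    # processors, so it finishes in ceil(log2(P)) rounds.
    rounds = 0
    stride = 1
    while stride < num_procs:
        for k in range(0, num_procs - stride, 2 * stride):
            partial[k] += partial[k + stride]  # k receives from k + stride
        stride *= 2
        rounds += 1
    return partial[0], rounds


print(simulated_parallel_sum(list(range(100)), 8))  # (4950, 3)
```

In a real run the additions inside each round happen concurrently, so the elapsed time is governed by the block size plus the number of rounds.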

SLIDE 31

Parallel Generation of Networks using Chung–Lu Model Parallel Algorithm

Core Parallel Algorithm

Parallel Algorithm

- Compute the sum $S = \sum_k w_k$
- Partition the nodes into $P$ disjoint sets $V_0, V_1, \dots, V_{P-1}$ such that $V_i \cap V_j = \emptyset$ for any $i \neq j$ and $\bigcup_{i=0}^{P-1} V_i = V$
- Processor $P_i$ executes the tasks $\{T_u : u \in V_i\}$

SLIDE 32

Parallel Generation of Networks using Chung–Lu Model Load Balancing and Partitioning

Partitioning

- Node partitioning plays an important role in performance
- Partitioning the nodes into $P$ partitions such that the computational loads are well balanced is the most challenging part

[Figure: two workload distributions over processors P0 to P3, one labeled "Poor Runtime" and one labeled "Good Runtime"]

- Sometimes trivial partitioning leads to poor performance
- We use two types of partitioning:
  1. Uniform Cost Partitioning (UCP)
     - We need to define an appropriate cost function
  2. Round Robin Partitioning (RRP)

SLIDE 33

Parallel Generation of Networks using Chung–Lu Model Load Balancing and Partitioning

Uniform Cost Partitioning: Computational Cost

- Let task $T_u$ create $e_u$ expected edges, where
  \[ e_u = \sum_{v=u+1}^{n-1} \frac{w_u w_v}{S} = \frac{w_u}{S} \sum_{v=u+1}^{n-1} w_v \]
- For simplicity, we assign one unit of time for creating an edge and one unit of time for processing a node
- The computational cost of $T_u$ is given by $c_u = e_u + 1$
- The computational cost of partition $V_i$ is given by
  \[ c(V_i) = \sum_{u \in V_i} c_u = m_i + |V_i| \]
  where $m_i$ is the expected number of edges produced by partition $V_i$
- Total cost for all processors:
  \[ \sum_{i=0}^{P-1} c(V_i) = m + n \]
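The cost function can be evaluated directly from its definition. The sketch below uses a small hypothetical weight list and an illustrative function name; it computes c_u = e_u + 1 for every task and confirms that the costs add up to m + n, where m is the expected number of edges.

```python
def task_costs(weights):
    """Cost c_u = e_u + 1 for each task T_u, straight from the definition."""
    n = len(weights)
    total = sum(weights)  # S
    costs = []
    for u in range(n):
        expected_edges = weights[u] / total * sum(weights[u + 1:])  # e_u
        costs.append(expected_edges + 1.0)
    return costs


weights = [4.0, 3.0, 2.0, 1.0]  # hypothetical weights, sorted in descending order
costs = task_costs(weights)
m = sum(c - 1.0 for c in costs)  # expected number of edges
print(costs, m + len(weights))
```

For these weights m = 3.5 and n = 4, so the total cost is 7.5 up to rounding. Note that this direct evaluation recomputes a suffix sum for every u and therefore takes quadratic time.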

SLIDE 34

Parallel Generation of Networks using Chung–Lu Model Load Balancing and Partitioning

Uniform Cost Partitioning: Consecutive Partitions

- Nodes are partitioned into $P$ partitions in a consecutive fashion
- Partition $V_i$ contains nodes $\{n_i, n_i + 1, \dots, n_{i+1} - 1\}$
  - $n_i$ is the first node of partition $V_i$
  - Number of nodes: $|V_i| = n_{i+1} - n_i$
- How many nodes per partition?
- In the most naïve scheme, assign $\frac{n}{P}$ nodes per partition, such that partition $V_i$ has the nodes $\left[\frac{in}{P}, \frac{(i+1)n}{P} - 1\right]$

Lemma 2. Let $c(V_i)$ be the computational cost for partition $V_i$. In the naïve partitioning scheme, we have
\[ c(V_i) - c(V_{i+1}) \ge \frac{n^2}{S P^2} W_i W_{i+1}, \]
where $W_i = \frac{1}{|V_i|} \sum_{u \in V_i} w_u$ is the average weight of the nodes in $V_i$.

- Lemma 2 shows that the cost difference between two consecutive partitions is significant and leads to an imbalanced workload
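The imbalance predicted by Lemma 2 is easy to observe on a small example. The sketch below uses hypothetical weights sorted in descending order and assumes, for simplicity, that P divides n; the function names are illustrative. It computes the cost of each block under the naïve scheme and compares each gap c(V_i) − c(V_{i+1}) with the lower bound from the lemma.

```python
def naive_partition_costs(weights, num_parts):
    """Cost c(V_i) of each block when every block gets n / P consecutive nodes."""
    n = len(weights)
    total = sum(weights)  # S
    size = n // num_parts  # assumes P divides n, for simplicity
    # suffix[u] = w_u + w_{u+1} + ... + w_{n-1}
    suffix = [0.0] * (n + 1)
    for u in range(n - 1, -1, -1):
        suffix[u] = suffix[u + 1] + weights[u]
    costs = []
    for i in range(num_parts):
        block = range(i * size, (i + 1) * size)
        costs.append(sum(weights[u] / total * suffix[u + 1] + 1.0 for u in block))
    return costs


def lemma2_gaps(weights, num_parts):
    """Pairs (c(V_i) - c(V_{i+1}), lower bound from Lemma 2) for each i."""
    n = len(weights)
    total = sum(weights)
    size = n // num_parts
    costs = naive_partition_costs(weights, num_parts)
    avg = [sum(weights[i * size:(i + 1) * size]) / size for i in range(num_parts)]
    return [
        (costs[i] - costs[i + 1], n * n / (total * num_parts ** 2) * avg[i] * avg[i + 1])
        for i in range(num_parts - 1)
    ]


weights = sorted([1.0 + (k % 7) for k in range(40)], reverse=True)
print(naive_partition_costs(weights, 4))
print(lemma2_gaps(weights, 4))
```

The block costs decrease with the partition index, and each observed gap is at least as large as its bound, so the first processor carries much more work than the last.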

SLIDE 35

Parallel Generation of Networks using Chung–Lu Model Load Balancing and Partitioning

Naïve Consecutive Partitioning

[Figure: computational cost and runtime for the naive scheme for power-law (PL) and ER networks. Panel (a) Computational Cost: cost versus processor rank. Panel (b) Runtime: time (seconds) versus processor rank.]

Naive Consecutive Scheme

- The figure shows that in the naive scheme, the computational cost is very unbalanced across the processors
- The figure also shows a direct correlation between estimated cost and actual runtime, which validates our choice of cost function

SLIDE 36

Parallel Generation of Networks using Chung–Lu Model Load Balancing and Partitioning

Uniform Consecutive Partitioning

- Total cost: $m + n$
- Optimum cost per partition: $\frac{m+n}{P}$
- Calculate the cost $c_u$ for each task $T_u$
- Partition the tasks so that the expected cost of each partition is about $\frac{m+n}{P}$
- Difficulties:
  - Calculating the costs $c_u$ in the original form $c_u = \frac{w_u}{S} \sum_{v=u+1}^{n-1} w_v + 1$ for all $u \in V$ takes $O(n^2)$ time sequentially
  - No parallel algorithm exists to find the boundaries
  - Finding the partition boundaries requires $\Omega(n + P \log n)$ time in the best known sequential algorithms
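As a sequential baseline, the partitioning goal can be sketched with running sums. This is an illustration of the target, not the parallel method of the talk: the function name is hypothetical, the costs are obtained in one backward pass over suffix sums rather than in the original quadratic form, and each boundary is located by binary search over the cumulative costs.

```python
from bisect import bisect_left
from itertools import accumulate


def uniform_cost_boundaries(weights, num_parts):
    """Start index n_i of each consecutive partition, targeting equal cost."""
    n = len(weights)
    total = sum(weights)  # S
    # One backward pass gives every suffix sum, so all c_u cost O(n) in total.
    costs = [0.0] * n
    suffix = 0.0  # sum of w_v for v > u
    for u in range(n - 1, -1, -1):
        costs[u] = weights[u] / total * suffix + 1.0
        suffix += weights[u]
    prefix = list(accumulate(costs))  # prefix[u] = c_0 + ... + c_u
    target = prefix[-1] / num_parts  # (m + n) / P
    # Partition i starts at the first node whose cumulative cost reaches i * target.
    starts = [0]
    for i in range(1, num_parts):
        starts.append(bisect_left(prefix, i * target))
    return starts, costs


weights = [float(w) for w in range(100, 0, -1)]  # hypothetical, sorted descending
starts, costs = uniform_cost_boundaries(weights, 4)
print(starts)
```

Each resulting partition has a cost within one task cost of (m + n)/P. The early partitions contain far fewer nodes than the later ones, because tasks of high-weight nodes are more expensive.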

slide-42
SLIDE 42

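Computing Costs Efficiently in Parallel

From the definition of $e_u$ we have:

$$e_u = \frac{w_u}{S} \sum_{v=u+1}^{n-1} w_v = \frac{w_u}{S} \left( \sum_{v=0}^{n-1} w_v - \sum_{v=0}^{u} w_v \right) = \frac{w_u}{S} \left( \sum_{v=0}^{n-1} w_v - \sum_{v=0}^{u-1} w_v - w_u \right)$$

$$c_u = e_u + 1 = \frac{w_u}{S} \left( S - \sigma_u - w_u \right) + 1, \quad \text{where } \sigma_u = \sum_{v=0}^{u-1} w_v \tag{1}$$

This equation can be evaluated independently on each processor, which requires $O\left(\frac{n}{P} + \log P\right) = O\left(\frac{n}{P}\right)$ time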
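The rewrite in Equation (1) can be checked numerically. The sketch below is sequential and only illustrative: it compares the direct $O(n^2)$ evaluation of $c_u$ with the prefix-sum form, assuming $S$ is the sum of all weights. The function names and the example weights are hypothetical.

```python
from itertools import accumulate


def costs_naive(w):
    """O(n^2): evaluate c_u = (w_u / S) * sum_{v>u} w_v + 1 directly."""
    S = sum(w)
    return [w[u] / S * sum(w[u + 1:]) + 1 for u in range(len(w))]


def costs_prefix(w):
    """O(n): use sigma_u = sum_{v<u} w_v, so sum_{v>u} w_v = S - sigma_u - w_u."""
    S = sum(w)
    sigma = [0] + list(accumulate(w))[:-1]  # exclusive prefix sums
    return [wu / S * (S - su - wu) + 1 for wu, su in zip(w, sigma)]


weights = [5, 4, 3, 2, 2]  # hypothetical weights
print(costs_naive(weights))
print(costs_prefix(weights))
```

Both functions return the same list; only the second avoids re-summing the tail of the weight array for every node.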

slide-43
SLIDE 43

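Computing Costs Efficiently in Parallel

Figure: Steps for determining cumulative cost in UCP. Processor $P_i$ owns nodes $\frac{in}{P}, \dots, \frac{(i+1)n}{P} - 1$ and performs the following steps:

Step 1: Compute the local weight sum $s_i = \sum_{v=in/P}^{(i+1)n/P-1} w_v$, that is, $s_0 = \sum_{v=0}^{n/P-1} w_v$, $s_1 = \sum_{v=n/P}^{2n/P-1} w_v$, ..., $s_{P-1} = \sum_{v=n-n/P}^{n-1} w_v$
Step 2: Exclusive prefix sum on $s_i$ (here $S_i$ denotes a prefix sum, not the total weight $S$): $S_0 = 0$, $S_1 = \sum_{i=0}^{0} s_i$, ..., $S_{P-1} = \sum_{i=0}^{P-2} s_i$
Step 3: Set $u \leftarrow \frac{in}{P}$, $\sigma_u \leftarrow S_i$, and $C_u \leftarrow e_u + 1 = \frac{w_u (S - \sigma_u - w_u)}{S} + 1$. Then, for $v = u + 1$ to $\frac{(i+1)n}{P} - 1$: $\sigma_v \leftarrow \sigma_{v-1} + w_{v-1}$, $e_v \leftarrow \frac{w_v (S - \sigma_v - w_v)}{S}$, $C_v \leftarrow C_{v-1} + e_v + 1$. Finally, set $z_i = C_{(i+1)n/P - 1}$
Step 4: Exclusive prefix sum on $z_i$: $Z_0 = 0$, $Z_1 = \sum_{i=0}^{0} z_i$, ..., $Z_{P-1} = \sum_{i=0}^{P-2} z_i$
Step 5: For $v = u$ to $\frac{(i+1)n}{P} - 1$: $C_v \leftarrow C_v + Z_i$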

Algorithm 3.1: UCP Scheme

procedure UCP(V, w, S)
    Calc–Cost(w, V, S)
    Make–Partition(w, V, S)

procedure Calc–Cost(w, V, S)
    i ← processor id
    s_i ← Σ_{u = in/P}^{(i+1)n/P − 1} w_u
    In Parallel: S_i ← Σ_{j=0}^{i−1} s_j
    u ← in/P
    σ_u ← S_i
    C_u ← e_u + 1 = (w_u / S)(S − σ_u − w_u) + 1
    for u = in/P + 1 to (i+1)n/P − 1 do
        σ_u ← σ_{u−1} + w_{u−1}
        e_u ← (w_u / S)(S − σ_u − w_u)
        C_u ← C_{u−1} + e_u + 1
    z_i ← C_{(i+1)n/P − 1}
    In Parallel: Z_i ← Σ_{j=0}^{i−1} z_j
    for u = in/P to (i+1)n/P − 1 do
        C_u ← C_u + Z_i
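The Calc–Cost procedure can be simulated in a single process to see how the two exclusive prefix sums fit together. This is a sketch under stated assumptions, not the authors' implementation: the P "processors" are plain loop iterations, block boundaries use integer division so that n need not be a multiple of P, and the names `exclusive_prefix` and `calc_cost` are hypothetical. In a real MPI program each exclusive prefix sum would be a collective operation.

```python
from itertools import accumulate


def exclusive_prefix(xs):
    """Return [0, xs[0], xs[0] + xs[1], ...], dropping the final total."""
    return [0] + list(accumulate(xs))[:-1]


def calc_cost(w, P):
    """Simulate Calc-Cost for P processors; return cumulative costs C[0..n-1]."""
    n, S = len(w), sum(w)
    blocks = [(i * n // P, (i + 1) * n // P) for i in range(P)]
    s = [sum(w[lo:hi]) for lo, hi in blocks]   # local weight sums s_i
    S_pre = exclusive_prefix(s)                # sigma at the start of each block
    C, z = [0.0] * n, []
    for i, (lo, hi) in enumerate(blocks):      # local cumulative costs
        sigma, running = S_pre[i], 0.0
        for u in range(lo, hi):
            e_u = w[u] / S * (S - sigma - w[u])
            running += e_u + 1
            C[u] = running
            sigma += w[u]                      # becomes sigma_{u+1}
        z.append(running)                      # z_i: total cost of block i
    Z = exclusive_prefix(z)                    # cost of all earlier blocks
    for i, (lo, hi) in enumerate(blocks):      # shift local sums to global ones
        for u in range(lo, hi):
            C[u] += Z[i]
    return C


weights = [9, 7, 5, 4, 3, 2, 2, 1]  # hypothetical weights
print(calc_cost(weights, 4))
```

The last entry of the returned list is the total cost m + n, which the partitioning step then divides by P.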

slide-44
SLIDE 44

Finding Partition Boundaries

We have $n$ costs $c_0, c_1, \dots, c_{n-1}$ for $n$ tasks $T_0, T_1, \dots, T_{n-1}$
We need to distribute the costs to $P$ partitions
Compute the cumulative costs $C_i = \sum_{j=0}^{i} c_j$

Figure: The cumulative costs $C_0, \dots, C_{n-1}$ laid out along a line from $0$ to $m + n$, with marks at $\frac{m+n}{P}, \frac{2(m+n)}{P}, \dots, \frac{(P-1)(m+n)}{P}$. Partition $i$ covers nodes $n_i, \dots, n_{i+1} - 1$, and its lower boundary satisfies $C_{n_i - 1} < \frac{i(m+n)}{P} \le C_{n_i}$.

If, for two consecutive nodes $\langle k-1, k \rangle$, the inequality $C_{k-1} < \frac{i(m+n)}{P} \le C_k$ holds, then $k$ is the lower boundary of partition $V_i$
Therefore, the lower boundary $n_i$ of partition $V_i$ is determined by:

$$n_i = \arg\min_u \left( C_u \ge i \, \frac{m+n}{P} \right)$$
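Because every cost is at least 1, the cumulative costs are strictly increasing, so the arg-min above can be found with a binary search. The following minimal sequential sketch uses Python's standard `bisect` module. It is a swapped-in technique for illustration rather than the parallel method of the slides, and the function names and example values are hypothetical.

```python
from bisect import bisect_left


def lower_boundaries(C, P):
    """n_i = smallest u with C[u] >= i * (m + n) / P, for i = 0..P-1."""
    total = C[-1]  # C[n-1] equals the total cost m + n
    return [bisect_left(C, i * total / P) for i in range(P)]


def partitions(C, P):
    """Return (first node, last node) for each partition V_i."""
    lows = lower_boundaries(C, P)
    highs = lows[1:] + [len(C)]
    return [(lo, hi - 1) for lo, hi in zip(lows, highs)]


C = [3, 6, 8, 10, 11, 12]  # hypothetical cumulative costs, total 12
print(lower_boundaries(C, 3))
print(partitions(C, 3))
```

With thresholds at 4 and 8, node 1 is the first whose cumulative cost reaches 4 and node 2 is the first to reach 8, so the lower boundaries are 0, 1 and 2.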

slide-45
SLIDE 45

Finding Partition Boundaries

Algorithm 3.1: Uniform Consecutive Partition

procedure Make–Partition(w, V, S)
    In Parallel: Z ← Σ_{i=0}^{P−1} z_i
    Z ← Z/P
    Find–Boundaries(in/P, (i+1)n/P − 1, C, Z)
    for all n_k ∈ B_i do
        Send n_k to P_{k−1} and P_k
    Receive boundaries n_i and n_{i+1}
    return V_i = [n_i, n_{i+1} − 1]

procedure Find–Boundaries(s, e, C, Z)
    if ⌊C_s / Z⌋ = ⌊C_e / Z⌋ then return
    m ← ⌊(e + s) / 2⌋
    if ⌊C_m / Z⌋ ≠ ⌊C_{m+1} / Z⌋ then
        n_{⌊C_{m+1} / Z⌋} ← m + 1
    Find–Boundaries(s, m, C, Z)
    Find–Boundaries(m + 1, e, C, Z)

The task of finding boundaries is divided equally among the processors
Processor $P_i$ is responsible for finding the boundaries in the range $\left[ \frac{in}{P}, \frac{(i+1)n}{P} - 1 \right]$
$P_i$ finds the boundaries in this range (the set $B_i$) using Find–Boundaries
$P_i$ sends each local boundary to the processors that need it
Takes $O\left(\frac{n}{P} + P\right)$ time in the worst case
Can be used in other partitioning problems
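The recursion in Find–Boundaries can be sketched directly in Python. This sketch makes several assumptions: it runs over the whole node range in one process, so boundaries that fall between two processors' ranges are not a concern; it stores boundaries in a dictionary; and it discards the entry for index P, which is produced when the last node reaches the total cost. The names `find_boundaries` and `partition_boundaries` are hypothetical.

```python
import math


def find_boundaries(s, e, C, Z, out):
    """Record out[i] = k for each k in (s, e] where floor(C/Z) steps up to i."""
    if math.floor(C[s] / Z) == math.floor(C[e] / Z):
        return  # no threshold is crossed anywhere inside [s, e]
    m = (s + e) // 2
    if math.floor(C[m] / Z) != math.floor(C[m + 1] / Z):
        out[math.floor(C[m + 1] / Z)] = m + 1
    find_boundaries(s, m, C, Z, out)
    find_boundaries(m + 1, e, C, Z, out)


def partition_boundaries(C, P):
    """Lower boundaries n_1..n_{P-1} as a dict (n_0 = 0 is implicit)."""
    Z = C[-1] / P  # average cost per partition, (m + n) / P
    found = {}
    find_boundaries(0, len(C) - 1, C, Z, found)
    # Assumption: keep only indices of real partitions, dropping index P.
    return {i: k for i, k in found.items() if 0 < i < P}


C = [3, 6, 8, 10, 11, 12]  # hypothetical cumulative costs, total 12
print(partition_boundaries(C, 3))
```

Subranges whose two endpoints fall in the same cost bucket are pruned immediately, which is what keeps the search cheap when a range contains few boundaries.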

slide-46
SLIDE 46

Runtime Complexity of Uniform Cost Partitioning

Theorem 1: The parallel algorithm for determining the partition boundaries of the UCP scheme runs in $O\left(\frac{n}{P} + P\right)$ time, where $n$ and $P$ are the number of nodes and processors, respectively.

Theorem 2: The computational cost in each processor is $O\left(\frac{m+n}{P}\right)$ with high probability (w.h.p.).

Theorem 3: Our parallel algorithm with the UCP scheme for generating random networks with the CL model runs in $O\left(\frac{m+n}{P} + P\right)$ time w.h.p.

slide-47
SLIDE 47

Round Robin Partitioning

Nodes are distributed in round-robin fashion among the processors (illustrated for $P_0, P_1, P_2, P_3$)
Partition $V_i = \{i, i + P, i + 2P, \dots, i + kP\}$, where $k$ is the largest integer such that $i + kP \le n - 1$
Each partition has an almost equal number of nodes ($\lceil n/P \rceil$ or $\lfloor n/P \rfloor$)

Lemma 3: In the RRP scheme, for any $i < j$, we have $c(V_i) - c(V_j) \le w_i$.

Slightly imbalanced loads among processors
Performs well in practice
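Round-robin assignment and the bound in Lemma 3 can be tried on a small input. The sketch below assumes the weights are sorted from high to low and reuses the cost estimate c_u = (w_u / S)(S − σ_u − w_u) + 1. The function names and example weights are hypothetical, and the check covers only this one input rather than proving the lemma.

```python
def round_robin_partitions(n, P):
    """V_i = {i, i + P, i + 2P, ...} restricted to nodes 0..n-1."""
    return [list(range(i, n, P)) for i in range(P)]


def partition_costs(w, parts):
    """c(V_i): sum of the per-node cost estimates c_u over u in V_i."""
    S = sum(w)
    sigma, cost = 0, []
    for wu in w:
        cost.append(wu / S * (S - sigma - wu) + 1)
        sigma += wu  # running sum of the weights seen so far
    return [sum(cost[u] for u in part) for part in parts]


weights = [9, 7, 5, 4, 3, 2, 2, 1]  # hypothetical, sorted high to low
parts = round_robin_partitions(len(weights), 4)
costs = partition_costs(weights, parts)
print(parts)
print(costs)
# Check c(V_i) - c(V_j) <= w_i for every pair i < j on this input.
print(all(costs[i] - costs[j] <= weights[i]
          for i in range(4) for j in range(i + 1, 4)))
```

Note that each partition touches nodes spread across the whole weight array, which is consistent with the locality penalty reported for RRP in the results.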

slide-48
SLIDE 48

Parallel Generation of Networks using Chung–Lu Model Results

Experimental Dataset

Table: Networks used in the experiments

Network    | Type                      | Nodes  | Edges
PL         | Power Law Network         | 1B     | 249B
ER         | Erdős–Rényi Network       | 1M     | 200M
Miami      | Contact Network           | 2.1M   | 51.4B
Twitter    | Real–World Social Network | 41.65M | 1.37B
Friendster | Real–World Social Network | 65.61M | 1.81B

slide-49
SLIDE 49

Comparison of Partitioning Schemes

Figure: Cost and Runtime. Panels show the per-processor cost and the per-processor time for the Naive, UCP, and RRP schemes on the ER, PL, and Twitter networks.

UCP and RRP have similar workloads in practice
RRP is slower than UCP due to a lack of locality in its memory references

slide-50
SLIDE 50

Degree Distribution of Generated Networks

Figure: Input and output degree distributions (number of nodes vs. degree) for five networks: Erdős–Rényi, Power–Law, Miami Contact Network, Twitter, and Friendster.

slide-51
SLIDE 51

Strong and Weak Scaling

Figure: Strong and Weak Scaling for the UCP, RRP, and Naive schemes. Panels: (a) Strong Scaling (PL), speedup vs. number of processors; (b) Strong Scaling (Twitter), speedup vs. number of processors; (c) Weak Scaling (PL), runtime in seconds vs. number of processors.

Very good linear speedup with a large number of processors
Very good weak scaling

slide-52
SLIDE 52

Generating Large Networks

The primary goal of this work is to generate massive networks very quickly
A power-law network with 1B nodes and 249B edges took about a minute to generate using 1024 processors

Used 64 computing nodes, each with 16 processors
Each node is powered by two octa-core SandyBridge E5–2670 2.60GHz (3.3GHz Turbo) processors with 64 GB memory

A speedup of about 800 was achieved using UCP
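As a rough worked figure, assuming the speedup of about 800 was measured on all 1024 processors (the slide does not state this explicitly), the corresponding parallel efficiency $E$ would be approximately:

```latex
\[
E = \frac{\text{speedup}}{P} \approx \frac{800}{1024} \approx 0.78
\]
```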

slide-53
SLIDE 53

Summary

Conclusion

Designed and implemented parallel algorithms to generate massive scale-free networks using a preferential attachment algorithm, as well as arbitrary networks with a given expected degree sequence/distribution
The algorithms scale to a large number of processors
Very good load balancing
Can generate in-memory networks with billions of edges quickly

slide-54
SLIDE 54

Appendix For Further Reading

References

Leskovec, J.: Dynamics of Large Networks. Ph.D. thesis, CMU (2008)
Batagelj, V., Brandes, U.: Efficient Generation of Large Random Networks. Phys. Rev. E (2005)
Miller, J., Hagberg, A.: Efficient Generation of Networks with Given Expected Degrees. In: Proc. of Algorithms and Models for the Web-Graph (2011)
Global Social Networks Ranked by Number of Users. Statista.com
