Uncovering disassortativity in large scale-free networks

Nelly Litvak, University of Twente, Stochastic Operations Research group. Joint work with Remco van der Hofstad. Supported by EC FET Open project NADINE. Trento, Italy, 23-07-2012.


slide-1
SLIDE 1

Uncovering disassortativity in large scale-free networks

Nelly Litvak
University of Twente, Stochastic Operations Research group
Joint work with Remco van der Hofstad
Supported by EC FET Open project NADINE
Trento, Italy, 23-07-2012

slide-7
SLIDE 7

Power laws

◮ Degree of a node = number of links; [fraction of nodes with degree $k$] = $p_k$.
◮ Power law: $p_k \approx \text{const} \cdot k^{-\alpha}$, $\alpha > 1$.
◮ Power laws appear in the Internet, the WWW, social networks, biological networks, etc.
◮ Model for high variability: scale-free graph.
◮ Signature log-log plot: $\log p_k = \log(\text{const}) - \alpha \log k$.
◮ Faloutsos, Faloutsos and Faloutsos (1999): power laws in the Internet.

[ N. Litvak, SOR group ] 2/30
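The log-log signature can be checked numerically. The following is a minimal sketch, not part of the talk: the helper names `degree_distribution` and `loglog_slope` are hypothetical, and a plain least-squares fit on the log-log plot is used only because the slide refers to that plot (it is generally a rough estimator of $\alpha$).

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Return {k: p_k}, the fraction of nodes that have degree k."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in counts.items()}


def loglog_slope(p):
    """Least-squares slope of log p_k against log k (an estimate of -alpha)."""
    ks = [k for k in p if k > 0]
    xs = [math.log(k) for k in ks]
    ys = [math.log(p[k]) for k in ks]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Toy degree sequence whose counts are exactly proportional to k^(-2):
# 3600 nodes of degree 1, 900 of degree 2, ..., 100 of degree 6.
toy_degrees = [k for k in range(1, 7) for _ in range(3600 // (k * k))]
alpha_hat = -loglog_slope(degree_distribution(toy_degrees))
print(alpha_hat)
```

On this artificial sequence the fitted slope recovers the exponent 2 up to rounding; real degree data are noisier, especially in the tail.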

slide-9
SLIDE 9

But Power Law is not everything!

Example: Robustness of the Internet.

◮ Albert, Jeong and Barabási (2000): Achilles' heel of the Internet: the Internet is sensitive to targeted attack.
◮ Doyle et al. (2005): robust yet fragile nature of the Internet: the Internet is not a random graph, it is designed to be robust.

[ N. Litvak, SOR group ] 3/30

slide-13
SLIDE 13

But Power Law is not everything! (cont.)

Example: Spread of infections

◮ Classical epidemiology, e.g. Anderson and May (1991): an epidemic occurs only if the infection rate exceeds a critical value.
◮ Vespignani et al. (2001): power law networks have a zero critical infection rate!
◮ Eguiluz et al. (2002): a specially wired, highly clustered network is resistant up to a certain critical infection rate.

Example: Technological versus economical networks

[ N. Litvak, SOR group ] 4/30

slide-17
SLIDE 17

Degree-degree correlations

◮ It is clearly important how the network is wired.
◮ To start with: do hubs connect to each other? YES for banks, NO for the Internet.
◮ Assortative networks: nodes with similar degrees connect to each other.
◮ Disassortative networks: nodes with large degrees tend to connect to nodes with small degrees.

[ N. Litvak, SOR group ] 5/30

slide-21
SLIDE 21

Assortativity coefficient

◮ $G = (V, E)$: undirected graph of $n$ nodes.
◮ $d_i$: degree of node $i = 1, 2, \ldots, n$.
◮ We are interested in correlations between the degrees of neighboring nodes.
◮ Newman (2002): assortativity measure $\rho_n$

$$
\rho_n = \frac{\frac{1}{|E|}\sum_{ij\in E} d_i d_j - \left(\frac{1}{|E|}\sum_{ij\in E}\frac{1}{2}(d_i + d_j)\right)^2}{\frac{1}{|E|}\sum_{ij\in E}\frac{1}{2}(d_i^2 + d_j^2) - \left(\frac{1}{|E|}\sum_{ij\in E}\frac{1}{2}(d_i + d_j)\right)^2}
$$

◮ Statistical estimation of the correlation coefficient between the degrees on the two ends of a random edge.
◮ Very popular measure of assortativity!

[ N. Litvak, SOR group ] 6/30
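Newman's formula can be evaluated directly from an edge list. The sketch below is an illustration, not the authors' code: the function name `assortativity` and the edge-list input format are assumptions. Because every term is symmetric in $i$ and $j$, averaging over each undirected edge once gives the same value as averaging over both orientations.

```python
def assortativity(edges):
    """Assortativity coefficient rho_n of an undirected simple graph.

    `edges` is an iterable of (i, j) pairs, one pair per undirected edge.
    """
    edges = list(edges)
    degree = {}
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1

    m = len(edges)
    # Averages over edges of d_i*d_j, (d_i+d_j)/2 and (d_i^2+d_j^2)/2.
    cross = sum(degree[i] * degree[j] for i, j in edges) / m
    mean = sum(0.5 * (degree[i] + degree[j]) for i, j in edges) / m
    second = sum(0.5 * (degree[i] ** 2 + degree[j] ** 2) for i, j in edges) / m

    variance = second - mean ** 2
    if variance == 0:
        raise ValueError("all edge-end degrees are equal; rho_n is undefined")
    return (cross - mean ** 2) / variance


star = [(0, 1), (0, 2), (0, 3)]   # one hub joined to three leaves
path = [(0, 1), (1, 2), (2, 3)]   # path on four nodes
print(assortativity(star), assortativity(path))
```

The star is the extreme disassortative case, so its coefficient is at the lower end of the range; the four-node path sits in between.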

slide-22
SLIDE 22

Is there something wrong with ρn?

◮ The Preferential Attachment graph appears to be assortatively neutral (Newman 2003; Dorogovtsev et al. 2010).
◮ Recent criticism: $\rho_n$ depends on the size of the network (Raschke et al. 2010; Dorogovtsev et al. 2010).

[ N. Litvak, SOR group ] 7/30

slide-27
SLIDE 27

What IS assortativity measure?

◮ $\rho_n$ is a statistical estimate of the correlation coefficient

$$
\rho = \frac{E(XY) - [E(X)]^2}{\mathrm{Var}(X)},
$$

where $X$ and $Y$ are the degrees of the nodes on the two ends of a randomly chosen edge.
◮ Problems? YES!!!
◮ $X$ and $Y$ are power law r.v.'s with exponent $\alpha - 1$:

$$
P(X = k) = \frac{k\,p_k}{E(\text{degree})}.
$$

◮ In real networks (WWW) we often have $2 < \alpha < 3$, so

$$
E(X) = \sum_k k \cdot \frac{k\,p_k}{E(\text{degree})} = \infty.
$$

◮ $\rho$ is not defined in the power law model! Then: what are we measuring?

[ N. Litvak, SOR group ] 8/30
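The divergence can be read off from the tail of the sums. The following short derivation is a sketch under the slide's power law assumption $p_k \approx \text{const}\cdot k^{-\alpha}$; it only uses the standard fact that $\sum_k k^{-s}$ converges exactly when $s > 1$.

```latex
E(X)   = \frac{1}{E(\text{degree})}\sum_k k^2 p_k
       \approx \frac{\text{const}}{E(\text{degree})}\sum_k k^{2-\alpha}
       = \infty \quad\text{if } \alpha \le 3,
\qquad
E(X^2) = \frac{1}{E(\text{degree})}\sum_k k^3 p_k
       \approx \frac{\text{const}}{E(\text{degree})}\sum_k k^{3-\alpha}
       = \infty \quad\text{if } \alpha \le 4.
```

So the mean of $X$ is already infinite for $2 < \alpha < 3$, and the second moment, hence $\mathrm{Var}(X)$, is infinite up to $\alpha = 4$.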

slide-30
SLIDE 30

Assortative and disassortative graphs

◮ Newman (2003):
◮ Technological and biological networks are disassortative, $\rho_n < 0$.
◮ Social networks are assortative, $\rho_n > 0$.
◮ Note: large networks are never strongly disassortative...

[ N. Litvak, SOR group ] 9/30

slide-32
SLIDE 32

ρn in terms of moments of the degrees

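◮ Write

$$
\sum_{ij\in E}\tfrac{1}{2}(d_i + d_j) = \sum_{i\in V} d_i^2, \qquad \sum_{ij\in E}\tfrac{1}{2}(d_i^2 + d_j^2) = \sum_{i\in V} d_i^3.
$$

◮ Then

$$
\rho_n = \frac{\sum_{ij\in E} d_i d_j - \frac{1}{|E|}\left(\sum_{i\in V} d_i^2\right)^2}{\sum_{i\in V} d_i^3 - \frac{1}{|E|}\left(\sum_{i\in V} d_i^2\right)^2}.
$$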

[ N. Litvak, SOR group ] 10/30
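The moment form can be checked on small graphs. This sketch assumes that the sums over $ij \in E$ run over both orientations of every undirected edge, so that $|E|$ equals the sum of the degrees; that reading is what makes the two identities above hold exactly. The function names are hypothetical.

```python
def degree_sums(edges):
    """Return (|E|, sum d_i d_j, sum (d_i+d_j)/2, sum d_i^2, sum d_i^3).

    Edge sums run over both orientations of each undirected edge.
    """
    edges = list(edges)
    degree = {}
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1

    directed = edges + [(j, i) for i, j in edges]
    size_e = len(directed)
    cross = sum(degree[i] * degree[j] for i, j in directed)
    edge_end_sum = sum(0.5 * (degree[i] + degree[j]) for i, j in directed)
    s2 = sum(d ** 2 for d in degree.values())
    s3 = sum(d ** 3 for d in degree.values())
    return size_e, cross, edge_end_sum, s2, s3


def rho_from_moments(edges):
    """rho_n written in terms of the second and third degree moments."""
    size_e, cross, _, s2, s3 = degree_sums(edges)
    correction = s2 ** 2 / size_e
    denominator = s3 - correction
    if denominator == 0:
        raise ValueError("regular graph; rho_n is undefined")
    return (cross - correction) / denominator


path = [(0, 1), (1, 2), (2, 3)]
print(degree_sums(path), rho_from_moments(path))
```

For the four-node path the edge-end sum and the sum of squared degrees coincide, as the first identity states.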

slide-33
SLIDE 33

Extreme value theory

Theorem (Extreme value theory)

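$D_1, D_2, \ldots, D_n$ are i.i.d. with $1 - F(x) = P(D > x) = Cx^{-\alpha+1}$. Then

$$
\lim_{n\to\infty} P\left(\frac{\max\{D_1, D_2, \ldots, D_n\} - b_n}{a_n} \le x\right) = \exp\left(-(1 + \delta x)^{-1/\delta}\right),
$$

with $\delta = 1/(\alpha - 1)$, $a_n = \delta C^{\delta} n^{\delta}$, $b_n = C^{\delta} n^{\delta}$.

(Therefore, the maximum is ‘of the order’ $n^{1/(\alpha-1)}$.)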

[ N. Litvak, SOR group ] 11/30

slide-34
SLIDE 34

CLT for heavy tails

Theorem (CLT for heavy tails)

$D_1, D_2, \ldots, D_n$ are i.i.d. with $1 - F(x) = P(D > x) = Cx^{-\alpha+1}$. If $p > \alpha - 1$ then

$$
\frac{1}{a_n}\sum_{i=1}^{n} D_i^p \xrightarrow{d} Z,
$$

where $a_n = [1 - F]^{-1}(1/n^p) = C^{1/(\alpha-1)} n^{p/(\alpha-1)}$ and $Z$ has a stable distribution with parameter $(\alpha - 1)/p$.

(Therefore, the sum is ‘of the order’ $n^{p/(\alpha-1)}$.)

[ N. Litvak, SOR group ] 12/30

SLIDE 39

In the empirical setting

◮ $P(d_1 > x) \approx Cx^{-\alpha+1}$
◮ $\max\{d_1, d_2, \ldots, d_n\} = O(n^{1/(\alpha-1)})$
◮ Alternative interpretation for the maximum: $P(d > x) = 1/n \Rightarrow x = O(n^{1/(\alpha-1)})$
◮ $P(d_i = k) = p_k = \text{const} \cdot k^{-\alpha}$, usually $\alpha \in (2, 4)$
◮ If $p > \alpha - 1$ then $E(D^p) = \infty$
◮ CLT: for $p > \alpha - 1$,

$$
\frac{1}{n}\sum_{i\in V} d_i^p \sim c_p\, n^{p/(\alpha-1)-1}.
$$

◮ But we get the same result just by adding up $k^p p_k$ from $k = 1$ to $k = n^{1/(\alpha-1)}$.

[ N. Litvak, SOR group ] 13/30
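The last remark can be spelled out as a one-line calculation. This is a heuristic sketch that replaces the truncated sum by an integral and keeps only the leading power of $n$; it assumes $p > \alpha - 1$, so that the exponent $p - \alpha + 1$ is positive and the upper limit dominates.

```latex
\sum_{k=1}^{n^{1/(\alpha-1)}} k^p \cdot \text{const}\cdot k^{-\alpha}
  \approx \text{const}\int_{1}^{n^{1/(\alpha-1)}} x^{p-\alpha}\,dx
  \approx \frac{\text{const}}{p-\alpha+1}\; n^{(p-\alpha+1)/(\alpha-1)}
  = \frac{\text{const}}{p-\alpha+1}\; n^{p/(\alpha-1)-1}.
```

The exponent agrees with the CLT scaling because $(p - \alpha + 1)/(\alpha - 1) = p/(\alpha - 1) - 1$.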

slide-41
SLIDE 41

Assumptions

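$$
\begin{aligned}
cn &\le |E| \le Cn, \quad \text{(SLLN)}\\
cn^{1/(\alpha-1)} &\le \max_{i\in[n]} d_i \le Cn^{1/(\alpha-1)},\\
cn^{\max\{p/(\alpha-1),\,1\}} &\le \sum_{i\in[n]} d_i^p \le Cn^{\max\{p/(\alpha-1),\,1\}}, \quad p = 2, 3,
\end{aligned}
$$

where $C, c > 0$.

Very natural and non-restrictive assumptions for power law graphs.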

[ N. Litvak, SOR group ] 14/30

slide-44
SLIDE 44

Back to ρn

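$$
\rho_n = \frac{\text{crossproducts} - \text{expectation}^2}{\text{variance}} \ge -\frac{\text{expectation}^2}{\text{variance}} = \rho_n^-,
$$

$$
\rho_n^- = -\frac{\frac{1}{|E|}\left(\sum_{i\in V} d_i^2\right)^2}{\sum_{i\in V} d_i^3 - \frac{1}{|E|}\left(\sum_{i\in V} d_i^2\right)^2}.
$$

◮ We have $\sum_{i\in V} d_i^3 \ge cn^{3/(\alpha-1)}$.
◮ But also $\frac{1}{|E|}\left(\sum_{i\in V} d_i^2\right)^2 \le (C^2/c)\, n^{\max\{4/(\alpha-1)-1,\,1\}}$.
◮ When $\alpha \in (2, 4)$ we have $\max\{4/(\alpha-1) - 1, 1\} < 3/(\alpha-1)$, so that the denominator of $\rho_n^-$ outweighs its numerator.

[ N. Litvak, SOR group ] 15/30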
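The lower bound $\rho_n^-$ depends on the degree sequence only, so it can be computed without knowing the wiring. A minimal sketch, with a hypothetical function name, assuming as before that $|E|$ counts each undirected edge in both directions and therefore equals the sum of the degrees:

```python
def rho_minus(degrees):
    """Lower bound rho_n^- computed from a degree sequence alone."""
    size_e = sum(degrees)              # both orientations of every edge
    s2 = sum(d ** 2 for d in degrees)
    s3 = sum(d ** 3 for d in degrees)
    correction = s2 ** 2 / size_e
    denominator = s3 - correction
    if denominator == 0:
        raise ValueError("regular degree sequence; rho_n^- is undefined")
    return -correction / denominator


star_degrees = [3, 1, 1, 1]   # hub with three leaves
path_degrees = [1, 2, 2, 1]   # path on four nodes
print(rho_minus(star_degrees), rho_minus(path_degrees))
```

Note that $\rho_n^-$ is only a bound and may fall below $-1$ on small graphs; the slide's point is that for large power law degree sequences it moves towards zero.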

slide-50
SLIDE 50

No disassortative scale-free random graphs

$$
\rho_n \ge \rho_n^- = -\frac{\frac{1}{|E|}\left(\sum_{i\in V} d_i^2\right)^2}{\sum_{i\in V} d_i^3 - \frac{1}{|E|}\left(\sum_{i\in V} d_i^2\right)^2}.
$$

◮ Take e.g. $\alpha = 2.5$.
◮ $4/(\alpha - 1) - 1 - 3/(\alpha - 1) = -1/3$
◮ $\rho_n^- = O(n^{-1/3})$
◮ $\rho_n^-$ converges to zero as $n \to \infty$ in ANY power law graph.
◮ Large scale-free graphs are never disassortative!
◮ Reason: high variability in values ⇒ dependence on $n$.

[ N. Litvak, SOR group ] 16/30
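The exponent arithmetic for other values of $\alpha$ follows the same pattern. The helper below is a hypothetical illustration: it subtracts the growth exponent of $\sum_i d_i^3$ from that of $\frac{1}{|E|}(\sum_i d_i^2)^2$, using the bounds $n^{3/(\alpha-1)}$ and $n^{\max\{4/(\alpha-1)-1,\,1\}}$ stated in the assumptions.

```python
def rho_minus_exponent(alpha):
    """Exponent e such that |rho_n^-| is at most of order n**e."""
    numerator_exponent = max(4 / (alpha - 1) - 1, 1)
    denominator_exponent = 3 / (alpha - 1)
    return numerator_exponent - denominator_exponent


for alpha in (2.2, 2.5, 3.0, 3.5):
    print(alpha, rho_minus_exponent(alpha))
```

The exponent is negative throughout $\alpha \in (2, 4)$, which is the range where the bound forces $\rho_n^-$ to zero.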

slide-55
SLIDE 55

Alternative: rank correlations

◮ $((X_i, Y_i))_{i=1}^{n}$: random variables.
◮ $r_i^X$ and $r_i^Y$: the ranks of $X_i$ and $Y_i$, respectively.
◮ Spearman’s rho:

$$
\rho_n^{\text{rank}} = \frac{\sum_{i=1}^{n}\left(r_i^X - (n+1)/2\right)\left(r_i^Y - (n+1)/2\right)}{\sqrt{\sum_{i=1}^{n}\left(r_i^X - (n+1)/2\right)^2 \sum_{i=1}^{n}\left(r_i^Y - (n+1)/2\right)^2}}
$$

◮ Correlation coefficient for $r_i^X$ and $r_i^Y$.
◮ $r_i^X$ and $r_i^Y$ are from a uniform distribution: $n \cdot \text{Uniform}(0, 1)$.
◮ Factor $n$ cancels, no influence of high dispersion.

[ N. Litvak, SOR group ] 17/30
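Spearman's rho on the two ends of the edges can be sketched in a few lines. Two choices here are assumptions made for illustration and are not stated on the slides: tied degrees receive their average rank (midranks), and each undirected edge contributes both orientations to the sample. All function names are hypothetical.

```python
import math


def midranks(values):
    """1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda t: values[t])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1   # mean of ranks pos+1 .. end+1
        for t in order[pos:end + 1]:
            ranks[t] = average
        pos = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Correlation coefficient of the ranks of xs and ys."""
    n = len(xs)
    rx, ry = midranks(xs), midranks(ys)
    mean = (n + 1) / 2
    numerator = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    denominator = math.sqrt(sum((a - mean) ** 2 for a in rx)
                            * sum((b - mean) ** 2 for b in ry))
    if denominator == 0:
        raise ValueError("constant ranks; Spearman's rho is undefined")
    return numerator / denominator


def degree_rank_correlation(edges):
    """Spearman's rho between the degrees at the two ends of the edges."""
    edges = list(edges)
    degree = {}
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    xs = [degree[i] for i, j in edges] + [degree[j] for i, j in edges]
    ys = [degree[j] for i, j in edges] + [degree[i] for i, j in edges]
    return spearman_rho(xs, ys)


star = [(0, 1), (0, 2), (0, 3)]
print(degree_rank_correlation(star))
```

Since only ranks enter, one extremely large degree changes the result no more than any other largest value would, which is the robustness the slide refers to.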

slide-56
SLIDE 56

Classical approach!

  • H. Hotelling and M.R. Pabst (1936):

‘Certainly where there is complete absence of knowledge of the form of the bivariate distribution, and especially if it is believed not to be normal, the rank correlation coefficient is to be strongly recommended as a means of testing the existence of relationship.’

[ N. Litvak, SOR group ] 18/30

slide-58
SLIDE 58

Configuration model (CM)

◮ Nodes are created with an i.i.d. power law distributed number of half-edges.
◮ The half-edges are connected to each other in a random fashion. Self-loops and double edges are removed.
◮ $\rho_n$ (blue), $\rho_n^{\text{rank}}$ (red), and mean $\rho_n^-$ (black) in 20 simulations for different $n$.

[Figure (a): $\rho_n$ and $\rho_n^{\text{rank}}$ plotted against $n$, for $n$ from $10^2$ to $10^5$.]

[ N. Litvak, SOR group ] 19/30
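The construction described on this slide can be sketched as follows. Several details are assumptions made for the illustration rather than the setup used in the talk: degrees are drawn by inverse-transform sampling from a Pareto tail $P(D > x) = x^{-(\alpha-1)}$ and rounded down, and one half-edge is dropped when the total is odd. The function names are hypothetical.

```python
import random


def power_law_degrees(n, alpha, rng):
    """Draw n i.i.d. integer degrees >= 1 with tail exponent alpha - 1."""
    # 1 - rng.random() lies in (0, 1], so the power is finite and >= 1.
    return [int((1.0 - rng.random()) ** (-1.0 / (alpha - 1))) for _ in range(n)]


def configuration_model(degrees, rng):
    """Pair half-edges uniformly at random; drop self-loops and double edges."""
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    if len(stubs) % 2 == 1:
        stubs.pop()                    # assumption: discard one half-edge
    rng.shuffle(stubs)
    edges = set()
    for a, b in zip(stubs[0::2], stubs[1::2]):
        if a != b:                     # skip self-loops
            edges.add((min(a, b), max(a, b)))   # the set merges double edges
    return sorted(edges)


rng = random.Random(0)
degrees = power_law_degrees(1000, 2.5, rng)
edges = configuration_model(degrees, rng)
print(len(edges), max(degrees))
```

The resulting edge list can then be fed to any routine that computes $\rho_n$ or a rank correlation from edges.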

slide-63
SLIDE 63

Configuration model with intermediate edge (CMIE)

◮ Nodes are connected randomly. Then each edge is broken in two by adding one intermediate node. Strong negative correlation: all original nodes are connected to nodes of degree 2.
◮ Clearly a strongly disassortative graph.
◮ The $d_i$’s are the original degrees in the CM, $\ell_n = \sum_i d_i$. In CMIE we obtain:

$$
\rho_n = \frac{2\sum_{i\in V} 2d_i^2 - \frac{1}{2\ell_n}\left(\sum_{i\in V} d_i^2 + 2\ell_n\right)^2}{\sum_{i\in V} d_i^3 + 4\ell_n - \frac{1}{2\ell_n}\left(\sum_{i\in V} d_i^2 + 2\ell_n\right)^2}.
$$

◮ One can see that $\rho_n^- \to 0$.

[ N. Litvak, SOR group ] 20/30
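The CMIE expression can be evaluated from the original degree sequence alone. This sketch assumes the cross-product term is $2\sum_i 2d_i^2$: each original node $i$ has $d_i$ neighbours of degree 2, and every edge is counted in both directions. It also assumes that no self-loops or double edges were removed. The function name is hypothetical.

```python
def cmie_rho(original_degrees):
    """rho_n after inserting a degree-2 node in the middle of every edge."""
    ell = sum(original_degrees)                       # number of half-edges
    s2 = sum(d ** 2 for d in original_degrees)
    s3 = sum(d ** 3 for d in original_degrees)
    cross = 2 * sum(2 * d ** 2 for d in original_degrees)
    correction = (s2 + 2 * ell) ** 2 / (2 * ell)      # |E| = 2 * ell here
    denominator = s3 + 4 * ell - correction
    if denominator == 0:
        raise ValueError("all degrees equal 2 after subdivision; undefined")
    return (cross - correction) / denominator


print(cmie_rho([1, 1]))        # a single edge becomes a three-node path
print(cmie_rho([3, 1, 1, 1]))  # a star whose three edges are subdivided
```

The second example already shows the effect discussed in the talk: although every hub neighbour has degree 2, the coefficient is far from $-1$.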

slide-64
SLIDE 64

Configuration model with intermediate edge: results

◮ Nodes are connected randomly. Then each edge is broken in two by adding one intermediate node. Strong negative correlation: all original nodes are connected to nodes of degree 2.
◮ $\rho_n$ (blue), $\rho_n^{\text{rank}}$ (red), and mean $\rho_n^-$ (black) in 20 simulations for different $n$.

[Figure (b): $\rho_n$, $\rho_n^{\text{rank}}$ and $\rho_n^-$ plotted against $n$, for $n$ from $10^2$ to $10^5$.]

[ N. Litvak, SOR group ] 21/30

slide-67
SLIDE 67

Preferential Attachment (PA) graph

◮ Albert and Barabási (1999), simplest version with one outgoing edge per node.
◮ Nodes arrive one at a time. A new node connects to a node $i$ with probability proportional to the current degree of $i$.
◮ $\rho_n \to 0$ (Newman 2003; Dorogovtsev et al. 2010). Assortatively neutral?

[Figure (c): $\rho_n$, $\rho_n^{\text{rank}}$ and $\rho_n^-$ plotted against $n$, for $n$ from $10^2$ to $10^5$.]

[ N. Litvak, SOR group ] 22/30
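The attachment rule with one outgoing edge per node can be sketched as below. The starting configuration (two nodes joined by one edge) is an assumption, since the slide does not specify it, and the function name is hypothetical. Sampling a uniform entry from the list of all edge endpoints picks node $i$ with probability proportional to its current degree.

```python
import random


def preferential_attachment_tree(n, rng):
    """Grow a PA graph on n >= 2 nodes with one outgoing edge per new node."""
    edges = [(0, 1)]        # assumed initial graph: a single edge
    endpoints = [0, 1]      # node v appears here exactly deg(v) times
    for new in range(2, n):
        target = rng.choice(endpoints)   # probability proportional to degree
        edges.append((new, target))
        endpoints.extend((new, target))
    return edges


pa_edges = preferential_attachment_tree(500, random.Random(0))
print(len(pa_edges))
```

Each new node adds exactly one edge, so the graph is a tree and the endpoint list always has twice as many entries as there are edges.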

slide-68
SLIDE 68

Assortative networks

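$$
\rho_n = \frac{\sum_{ij\in E} d_i d_j - \frac{1}{|E|}\left(\sum_{i\in V} d_i^2\right)^2}{\sum_{i\in V} d_i^3 - \frac{1}{|E|}\left(\sum_{i\in V} d_i^2\right)^2}.
$$

Two possible scenarios:

◮ Denominator outweighs numerator, $\rho_n \to 0$.
◮ Denominator and numerator are of the same order of magnitude. Limit?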

[ N. Litvak, SOR group ] 23/30

slide-69
SLIDE 69

Collection of bipartite graphs

◮ $((X_i, Y_i))_{i=1}^{n}$ i.i.d., with $X = bU_1 + bU_2$, $Y = bU_1 + aU_2$, $b > 0$, $a > 1$, where $U_1, U_2$ are i.i.d. random variables with power law tail, exponent $\alpha$.
◮ For $i = 1, \ldots, n$, we create a complete bipartite graph of $X_i$ and $Y_i$ vertices, respectively.
◮ These $n$ complete bipartite graphs are not connected to one another.
◮ Extreme scenario of a network consisting of highly connected clusters of different size. Such networks can serve as models for physical human contacts and are used in epidemic modelling (Eubank et al. 2004).
◮ Disassortative for $n = 1$, but positive dependence between $X$ and $Y$ prevails for larger $n$.

[ N. Litvak, SOR group ] 24/30

slide-72
SLIDE 72

Collection of bipartite graphs: analysis

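◮ $|V| = \sum_{i=1}^{n}(X_i + Y_i)$, $|E| = 2\sum_{i=1}^{n} X_i Y_i$,

$$
\sum_{i\in V} d_i^p = \sum_{i=1}^{n}\left(X_i^p Y_i + Y_i^p X_i\right), \qquad \sum_{ij\in E} d_i d_j = 2\sum_{i=1}^{n}(X_i Y_i)^2.
$$

◮ Take $P(U_j > x) = c_0 x^{-\alpha+1}$, where $c_0 > 0$, $x \ge x_0$, and $\alpha \in (4, 5)$, so that $E[U^3] < \infty$, but $E[U^4] = \infty$.
◮ Then $|E|/n \xrightarrow{p} 2E[XY] < \infty$ and $\frac{1}{n}\sum_{i\in V} d_i^2 \xrightarrow{p} E[XY(X + Y)] < \infty$.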

[ N. Litvak, SOR group ] 25/30
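With these closed-form sums, $\rho_n$ for a collection of complete bipartite graphs needs only the pairs $(X_i, Y_i)$. The sketch below plugs them into the moment form of $\rho_n$; the function name and the small integer inputs are illustrative and are not the heavy-tailed $(X_i, Y_i)$ of the talk.

```python
def bipartite_collection_rho(pairs):
    """rho_n for disjoint complete bipartite graphs K_{X_i, Y_i}."""
    pairs = list(pairs)
    size_e = 2 * sum(x * y for x, y in pairs)              # |E|
    s2 = sum(x ** 2 * y + y ** 2 * x for x, y in pairs)    # sum of d_i^2
    s3 = sum(x ** 3 * y + y ** 3 * x for x, y in pairs)    # sum of d_i^3
    cross = 2 * sum((x * y) ** 2 for x, y in pairs)        # sum of d_i d_j
    correction = s2 ** 2 / size_e
    denominator = s3 - correction
    if denominator == 0:
        raise ValueError("all degrees are equal; rho_n is undefined")
    return (cross - correction) / denominator


print(bipartite_collection_rho([(2, 3)]))           # one component
print(bipartite_collection_rho([(2, 3), (4, 6)]))   # two components
```

A single component is perfectly disassortative, while adding a second, larger component already makes the coefficient positive, in line with the last bullet of the model description.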

slide-74
SLIDE 74

Collection of bipartite graphs: analysis

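Theorem (L. & van der Hofstad, 2012)

$$
n^{-4/(\alpha-1)}\, b^{-4} \sum_{i=1}^{n}\left(X_i^3 Y_i + Y_i^3 X_i\right) \xrightarrow{d} (a^3 + a)Z_1 + 2Z_2,
$$

$$
n^{-4/(\alpha-1)}\, b^{-4} \sum_{i=1}^{n}(X_i Y_i)^2 \xrightarrow{d} a^2 Z_1 + Z_2,
$$

where $Z_1$ and $Z_2$ are two independent random variables with a stable distribution with parameter $(\alpha - 1)/4$.

Result:

$$
\rho_n \xrightarrow{d} \frac{2a^2 Z_1 + 2Z_2}{(a + a^3)Z_1 + 2Z_2}, \quad \text{as } n \to \infty,
$$

which is a random variable taking values in $(2a/(1 + a^2), 1)$, $a > 1$.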

[ N. Litvak, SOR group ] 26/30

slide-75
SLIDE 75

Collection of bipartite graphs: results

$\rho_n$ (blue), $\rho_n^{\text{rank}}$ (red), and mean $\rho_n^-$ (black) in 20 simulations for different $n$.

[Figure (d): $\rho_n$ and $\rho_n^{\text{rank}}$ plotted against $n$, for $n$ from $10^2$ to $10^5$.]

[ N. Litvak, SOR group ] 27/30

slide-79
SLIDE 79

Web and social networks

Dataset          Description      # nodes      max d      ρn        ρn^rank    ρn^−
stanford-cs      web domain       9,914        340        −0.1656   −0.1627    −0.4648
eu-2005          .eu web crawl    862,664      68,963     −0.0562   −0.2525    −0.0670
uk@100,000       .uk web crawl    100,000      55,252     −0.6536   −0.5676    −1.117
uk@1,000,000     .uk web crawl    1,000,000    403,441    −0.0831   −0.5620    −0.0854
enron            e-mailing        69,244       1,634      −0.1599   −0.6827    −0.1932
dblp-2010        co-authorship    326,186      238         0.3018    0.2604    −0.7736
dblp-2011        co-authorship    986,324      979         0.0842    0.1351    −0.2963
hollywood-2009   co-starring      1,139,905    11,468      0.3446    0.4689    −0.6737

◮ Data from the Laboratory for Web Algorithmics (LAW) at the Università degli Studi di Milano.
◮ All graphs are made undirected.
◮ Spearman’s rho is able to reveal strong negative correlations in large networks.
◮ ‘Infinite variance’ is not a formality, it affects the results.

[ N. Litvak, SOR group ] 28/30

slide-86
SLIDE 86

Conclusions and discussion

◮ The assortativity coefficient $\rho_n$ is not suitable for measuring dependencies in power law data with $\alpha < 4$.
◮ $\rho_n$ depends on $n$.
◮ For disassortative networks, $\rho_n$ goes to zero as $n$ grows.
◮ For assortative networks, $\rho_n$ converges either to zero or to a random variable.
◮ Assortativity can be used in the network analysis ONLY if $\alpha > 4$.
◮ Spearman’s rho is a good alternative.
    ◮ Resolving ties (Mesfioui and Tajar 2005; Nešlehová 2007).
    ◮ Consistency: proved for i.i.d. continuous $(X_i, Y_i)$, variance $O(1/n)$ (Borkowf 2002).
◮ In a graph the degrees on the ends of random edges are in general dependent. Can we analyse Spearman’s rho? Work in progress.

[ N. Litvak, SOR group ] 29/30

slide-87
SLIDE 87

Thank you!

[ N. Litvak, SOR group ] 30/30