Detecting community structure in networks: M.E.J. Newman's results

slide-1
SLIDE 1

Detecting community structure in networks

M.E.J. Newman’s results [1, 2] (presented by Botond Szabo)

[1] Detecting community structure in networks (2004)
[2] Finding community structure in networks using eigenvectors of matrices (2006)

Statistics for Structures Seminar Amsterdam, 01. 04. 2015.

slide-2
SLIDE 2

Outline

  • Introduction
  • Bisection algorithms
      • Spectral algorithm (Laplacian)
      • The Kernighan-Lin algorithm (greedy)
      • Modularity algorithm
  • Multisection algorithms
      • Girvan and Newman algorithm
      • Generalized modularity algorithm
  • Conclusion
slide-3
SLIDE 3

Model

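Model: Graph $G = (V, E)$ with unweighted vertices $V$ and undirected, unweighted edges $E$.

Goal: Find communities, i.e. groups of vertices that are densely connected internally and only sparsely connected to each other.

Examples: Social networks, biochemical networks, information networks (parallel computing).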

slide-4
SLIDE 4

Spectral algorithm I.

Definition: Laplacian $L = D - A$, where $D$ is the diagonal matrix of vertex degrees and $A$ is the adjacency matrix.

Properties:

  • Since $D_{i,i} = \sum_j A_{i,j}$, the vector $v_1 = (1, 1, \dots, 1)$ is an eigenvector of $L$ with eigenvalue $\lambda_1 = 0$.
  • All eigenvalues $\lambda_i$ are non-negative.
  • The number of zero eigenvalues equals the number of connected components.
  • In symmetric matrices the eigenvectors corresponding to different eigenvalues are orthogonal.
  • In connected graphs every eigenvector except $v_1$ contains both positive and negative components.
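The properties listed above can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical toy graph (two triangles joined by a single edge) that is not part of the original slides.

```python
import numpy as np

# Hypothetical toy graph: triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0            # undirected, unweighted

D = np.diag(A.sum(axis=1))             # D_ii = sum_j A_ij
L = D - A

# eigh is meant for symmetric matrices; eigenvalues are returned in ascending order.
lam, V = np.linalg.eigh(L)

ones = np.ones(n)
num_zero = int(np.sum(np.abs(lam) < 1e-9))

print("L @ (1,...,1) =", L @ ones)                 # zero vector
print("eigenvalues   =", np.round(lam, 4))         # all non-negative
print("zero eigenvalues (= components):", num_zero)
# Every eigenvector except the first one mixes positive and negative entries.
print([bool(V[:, k].min() < 0 < V[:, k].max()) for k in range(1, n)])
```

Because the toy graph is connected, exactly one eigenvalue is zero; adding a second, disconnected piece to `edges` would add another zero eigenvalue.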

slide-6
SLIDE 6

Spectral algorithm II.

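Application: Consider the problem of finding two communities in a connected graph.

Goal: Minimize the cut size

$$R = \frac{1}{2} \sum_{\substack{i,j \text{ in different} \\ \text{groups}}} A_{i,j} = \frac{1}{4} s^T L s = \frac{1}{4} \sum_{i=1}^{n} a_i^2 \lambda_i,$$

where $s_i = \pm 1$ (group indicator) and $s = \sum_{i=1}^{n} a_i v_i$ is the expansion of $s$ in the normalized eigenvectors $v_i$ of $L$, so that $a_i = v_i^T s$.

Problem: $R$ attains its minimum $R = 0$ in the trivial case $s = (1, 1, \dots, 1)$, where all vertices are in the same group.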
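As a sanity check of the cut-size identities, the sketch below compares a direct edge count with the quadratic form and with the spectral sum. It assumes NumPy, orthonormal eigenvectors (as returned by `numpy.linalg.eigh`), and the same hypothetical two-triangle graph; with orthonormal eigenvectors the factor 1/4 carries through to the spectral sum.

```python
import numpy as np

# Hypothetical toy graph: triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

def cut_size(s):
    """Count the edges whose endpoints lie in different groups."""
    return sum(1 for i, j in edges if s[i] != s[j])

s = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])   # group indicator, s_i = +1 or -1

lam, V = np.linalg.eigh(L)    # columns of V are orthonormal eigenvectors v_i
a = V.T @ s                   # a_i = v_i^T s

R_direct = cut_size(s)
R_quadratic = 0.25 * s @ L @ s
R_spectral = 0.25 * np.sum(a**2 * lam)
print(R_direct, R_quadratic, R_spectral)

# The trivial division (everything in one group) has cut size zero.
trivial = np.ones(n)
print(0.25 * trivial @ L @ trivial)
```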

slide-7
SLIDE 7

Spectral algorithm III.

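Solution:

  • Fix the sizes of the two groups ($n_1$, $n_2$). Then, for the normalized eigenvector $v_1 = (1, \dots, 1)/\sqrt{n}$,

$$a_1^2 = (v_1^T s)^2 = \frac{(n_1 - n_2)^2}{n}.$$

  • Ideally $s$ would be proportional to $v_2$, but $s_i \in \{-1, 1\}$.
  • Choose $s$ as close to proportional to $v_2$ as possible:

$$s_i = \begin{cases} +1 & \text{if } v^{(2)}_i \ge 0, \\ -1 & \text{if } v^{(2)}_i < 0. \end{cases} \qquad (1)$$

  • If $\#\{i : v^{(2)}_i \ge 0\} > n_1$, then move the vertices with the smallest such $v^{(2)}_i$ to the other group until the group sizes are $n_1$ and $n_2$.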
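One way to realize the size-constrained rule is to sort the vertices by their component in $v_2$ and give the $n_1$ largest to the first group. The sketch below assumes NumPy and a hypothetical toy graph; since the sign of an eigenvector is arbitrary, it tries both $v_2$ and $-v_2$ and keeps the smaller cut, which is an implementation choice rather than something stated on the slide.

```python
import numpy as np

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

def cut_size(A, s):
    """Cut size R = (1/4) s^T L s for a +1/-1 indicator vector s."""
    return 0.25 * s @ laplacian(A) @ s

def spectral_bisection(A, n1):
    """Split the vertices into groups of sizes n1 and n - n1 using v_2."""
    n = len(A)
    _, V = np.linalg.eigh(laplacian(A))
    v2 = V[:, 1]                          # eigenvector of the second-smallest eigenvalue
    best = None
    for vec in (v2, -v2):                 # the overall sign of v2 is arbitrary
        s = -np.ones(n)
        s[np.argsort(-vec)[:n1]] = 1.0    # the n1 largest components form group +1
        if best is None or cut_size(A, s) < cut_size(A, best):
            best = s
    return best

# Hypothetical toy graph: triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

s = spectral_bisection(A, n1=3)
print(s, cut_size(A, s))
```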

slide-9
SLIDE 9

Alternative spectral algorithm

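Approximate algorithm: No size control on communities, using ideas from above:

$$s_i = \begin{cases} +1 & \text{if } v^{(2)}_i \ge 0, \\ -1 & \text{if } v^{(2)}_i < 0. \end{cases} \qquad (2)$$

Example: The karate club

Runtime: $O(n^3)$ in general; for a sparse Laplacian the runtime scales approximately as $m/(\lambda_3 - \lambda_2)$.

Alternatively: Minimize the ratio cut $R/(n_1 n_2)$ instead of $R$.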
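The sign rule (2) and the ratio cut can be illustrated on a small graph with unequal groups. This is a sketch under stated assumptions: NumPy, and a hypothetical graph made of a 4-clique and a triangle joined by one edge (not the karate club data, which is not reproduced here).

```python
import numpy as np

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

def fiedler_sign_split(A):
    """Rule (2): s_i = +1 if the i-th component of v_2 is >= 0, else -1."""
    _, V = np.linalg.eigh(laplacian(A))
    return np.where(V[:, 1] >= 0, 1, -1)

def ratio_cut(A, s):
    """R / (n1 * n2), with R = (1/4) s^T L s."""
    R = 0.25 * s @ laplacian(A) @ s
    n1 = int(np.sum(s == 1))
    n2 = len(s) - n1
    return R / (n1 * n2)

# Hypothetical graph: clique on {0,1,2,3}, triangle on {4,5,6}, bridge 3-4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (5, 6), (3, 4)]
A = np.zeros((7, 7))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

s = fiedler_sign_split(A)
print(s, ratio_cut(A, s))
```

No group sizes are supplied here; the sizes 4 and 3 emerge from the signs of $v_2$ alone.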

slide-10
SLIDE 10

Discussion of Spectral algorithms

Problem: Spectral algorithms are satisfactory when the network does not divide up easily into groups but one still has to find the best possible division. However, they do not reflect our intuitive concept of network communities.

slide-11
SLIDE 11

Kernighan-Lin algorithm

Algorithm:

  • Assume that we know the community sizes $|G_1|$, $|G_2|$.
  • Assign a benefit function to every division: $Q$ = (# edges within the two groups) − (# edges between the two groups).
  • Stage 1: Choose the pair $i \in G_1$, $j \in G_2$ whose swap maximizes $\Delta Q$.
  • Then swap these two vertices and repeat, never moving a vertex twice, until all vertices of one group have been swapped.
  • Stage 2: Among the divisions visited in this sequence, choose the one with the maximum $Q$.

Runtime: worst case $O(n^2)$.

Example: Perfect match with the known split in the karate club network.
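The two stages can be sketched in plain Python. This is a deliberately naive version that recomputes $Q$ from scratch for every candidate swap, so it does not reach the quoted runtime; the function names and the toy graph are hypothetical, and only a single pass is shown.

```python
from itertools import product

def benefit(edges, group):
    """Q = (# edges within groups) - (# edges between groups)."""
    within = sum(1 for i, j in edges if group[i] == group[j])
    return within - (len(edges) - within)

def kernighan_lin_pass(edges, group):
    """One pass: swap pairs greedily (Stage 1), then keep the best division (Stage 2)."""
    group = dict(group)
    free1 = {v for v, g in group.items() if g == 1}   # vertices not yet swapped
    free2 = {v for v, g in group.items() if g == 2}
    best_q, best_group = benefit(edges, group), dict(group)

    def q_after(pair):
        i, j = pair
        trial = dict(group)
        trial[i], trial[j] = 2, 1
        return benefit(edges, trial)

    while free1 and free2:
        # Stage 1: take the swap with the largest resulting Q, even if Q decreases.
        i, j = max(product(sorted(free1), sorted(free2)), key=q_after)
        group[i], group[j] = 2, 1
        free1.remove(i)
        free2.remove(j)
        q = benefit(edges, group)
        if q > best_q:
            best_q, best_group = q, dict(group)
    # Stage 2: the best division seen anywhere in the swap sequence.
    return best_group, best_q

# Hypothetical toy graph: triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
start = {0: 1, 1: 1, 3: 1, 2: 2, 4: 2, 5: 2}          # vertices 2 and 3 misplaced
best_group, best_q = kernighan_lin_pass(edges, start)
print(best_group, best_q)
```

Accepting swaps that temporarily lower $Q$ is what lets the pass move through worse divisions on the way to a better one; Stage 2 then discards everything after the best point.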

slide-14
SLIDE 14

Modularity

Problem:

  • We usually don’t know the sizes of the communities.
  • The number of edges between communities is smaller than expected.

Definition: modularity. Benefit function (different from, but related to, the previous one):

Q = (# edges within communities) − (expected # of such edges).

The second term is rather vague. What do we mean by it?

Null model: $n$ vertices, with $P_{i,j}$ the probability of an edge between $i$ and $j$. Then

$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{i,j} - P_{i,j} \right] \delta(g_i, g_j),$$

where $m$ is the number of edges and $g_i$ denotes the community to which vertex $i$ belongs.

slide-16
SLIDE 16

Choice of $P_{i,j}$

Condition 1:

$$\sum_{i,j} P_{i,j} = \sum_{i,j} A_{i,j} = 2m.$$

Example: The Bernoulli model $P_{i,j} = p$, which has a binomial degree distribution, not a right-skewed one like most real-world networks.

Condition 2:

$$\sum_{j} P_{i,j} = \sum_{j} A_{i,j} =: k_i,$$

which for entirely random edges leads to $P_{i,j} = \frac{k_i k_j}{2m}$. This is closely related to the configuration model (preferential attachment to high-degree vertices).
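To make the null model concrete, the sketch below builds $P_{i,j} = k_i k_j / 2m$, checks Conditions 1 and 2, and evaluates the modularity $Q = \frac{1}{2m}\sum_{i,j}[A_{i,j} - P_{i,j}]\,\delta(g_i, g_j)$ for a given division. NumPy, the function names, and the two-triangle toy graph are assumptions made for illustration.

```python
import numpy as np

def null_model(A):
    """P_ij = k_i k_j / (2m): expected edges when degrees are kept but edges are random."""
    k = A.sum(axis=1)
    return np.outer(k, k) / k.sum()

def modularity(A, g):
    """Q = (1/2m) * sum_ij (A_ij - P_ij) * delta(g_i, g_j)."""
    two_m = A.sum()
    same = g[:, None] == g[None, :]        # delta(g_i, g_j) as a boolean matrix
    return float(np.sum((A - null_model(A)) * same) / two_m)

# Hypothetical toy graph: triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

P = null_model(A)
print(np.isclose(P.sum(), A.sum()))                    # Condition 1
print(np.allclose(P.sum(axis=1), A.sum(axis=1)))       # Condition 2

g_two = np.array([0, 0, 0, 1, 1, 1])                   # the two triangles
g_one = np.zeros(6, dtype=int)                         # everything in one community
print(modularity(A, g_two), modularity(A, g_one))
```

For the two-triangle division the value works out to $5/14$, and putting every vertex in a single community gives $Q = 0$, as Condition 1 implies.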

slide-17
SLIDE 17

Spectral optimization of modularity

Assumption: we have two communities, but no fixed sizes.

Definition: Modularity matrix $B = A - P$.

  • Rewrite the modularity function:

$$Q = \frac{1}{4m} s^T B s = \frac{1}{4m} \sum_{i} a_i^2 \beta_i,$$

where $s = \sum_{i=1}^{n} a_i u_i$ and $\beta_i$ is the eigenvalue corresponding to the (normalized) eigenvector $u_i$ of $B$.

  • There exists an $i$ such that $\beta_i = 0$ and $u_i \propto (1, 1, \dots, 1)$.
  • But there could be (and in practice are) both positive and negative eigenvalues.

slide-18
SLIDE 18

Spectral optimization of modularity II

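Solution: similarly to the spectral algorithm

  • Best would be to have $s$ proportional to $u_1$ (the eigenvector with the largest eigenvalue $\beta_1$).
  • But $s_i = \pm 1$.
  • Therefore take

$$s_i = \begin{cases} +1 & \text{if } u^{(1)}_i \ge 0, \\ -1 & \text{if } u^{(1)}_i < 0. \end{cases} \qquad (3)$$

Runtime: $O(n^2)$ (by using the Lanczos method or its variants).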
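Rule (3) can be sketched with a dense eigen-decomposition. The code assumes NumPy, the null model $P_{i,j} = k_i k_j / 2m$, and a hypothetical toy graph; it uses `numpy.linalg.eigh` for brevity rather than a Lanczos-type solver, so it does not reflect the quoted runtime.

```python
import numpy as np

def modularity_matrix(A):
    k = A.sum(axis=1)
    return A - np.outer(k, k) / k.sum()    # B = A - P with P_ij = k_i k_j / (2m)

def leading_eigenvector_split(A):
    """Rule (3): split by the signs of the eigenvector with the largest eigenvalue."""
    B = modularity_matrix(A)
    beta, U = np.linalg.eigh(B)            # ascending order, so the last column is u_1
    s = np.where(U[:, -1] >= 0, 1.0, -1.0)
    four_m = 2.0 * A.sum()                 # A.sum() equals 2m
    Q = float(s @ B @ s / four_m)          # Q = (1/4m) s^T B s
    return s, Q, beta

# Hypothetical toy graph: triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

s, Q, beta = leading_eigenvector_split(A)
print(s, Q)
print(np.round(beta, 4))                   # positive and negative eigenvalues, one zero
print(np.allclose(modularity_matrix(A) @ np.ones(6), 0))   # (1,...,1) has eigenvalue 0
```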

slide-19
SLIDE 19

Example: Modularity

slide-22
SLIDE 22

Negative Eigenvalues

Question: What information is stored in the negative eigenvalues?

Answer: “Anti-community structure”, i.e. the numbers of edges within groups are smaller than expected.

Procedure:

  • Minimize modularity: take $s$ almost parallel to $u_n$ (the eigenvector corresponding to the most negative eigenvalue $\beta_n$):

$$s_i = \begin{cases} +1 & \text{if } u^{(n)}_i \ge 0, \\ -1 & \text{if } u^{(n)}_i < 0. \end{cases} \qquad (4)$$

  • Refinement step: move single vertices between groups to minimize modularity.

Other uses:

  • Network correlation: Adjacent vertices have similar properties.
  • Community centrality: How central vertices are in their community.
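Rule (4) can be tried on a graph with obvious anti-community structure. The sketch below assumes NumPy, the null model $P_{i,j} = k_i k_j / 2m$, and a hypothetical complete bipartite graph $K_{3,3}$, where every edge runs between the two sides; the refinement step is not implemented.

```python
import numpy as np

# Hypothetical example: complete bipartite graph with sides {0,1,2} and {3,4,5}.
n = 6
A = np.zeros((n, n))
for i in range(3):
    for j in range(3, 6):
        A[i, j] = A[j, i] = 1.0

k = A.sum(axis=1)
two_m = k.sum()
B = A - np.outer(k, k) / two_m             # modularity matrix

beta, U = np.linalg.eigh(B)                # ascending: column 0 belongs to beta_n
u_n = U[:, 0]
s = np.where(u_n >= 0, 1.0, -1.0)          # rule (4)
Q_min = float(s @ B @ s / (2.0 * two_m))   # Q = (1/4m) s^T B s

print(np.round(beta, 4))
print(s, Q_min)
```

The resulting groups are the two sides of the bipartite graph: no edge falls inside either group, so the modularity is strongly negative.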

slide-23
SLIDE 23

Example: Anti-community structure

slide-24
SLIDE 24

Example: Community centrality

slide-26
SLIDE 26

Multiple communities

Problem: In many real-world examples we don’t know the number of communities.

Approach: Repeated division into two is possible, but not ideal.

slide-28
SLIDE 28

Girvan and Newman algorithm

Idea: Iteratively remove the edges with a high “betweenness score” from the network.

Motivation: The few edges between communities are bottlenecks: traffic between communities has to travel through them.

Algorithm:

  • Edge betweenness: the number of geodesic (shortest) paths between vertex pairs that contain the edge.
  • Remove the edge with the highest betweenness, recalculate the betweennesses, and repeat until no edges remain.
  • The progress is represented in a dendrogram.
slide-29
SLIDE 29

Example: Girvan and Newman algorithm

slide-32
SLIDE 32

Girvan and Newman algorithm II.

Problem: No guidance on how many communities to choose.

Solution:

  • Introduce modularity again: Q = (fraction of edges within communities) − (expected value of the same quantity).
  • If Q = 0, the community structure is no stronger than expected by random chance.
  • Local peaks of Q during the algorithm indicate good divisions.

Runtime: Slow, $O(m^2 n)$, or $O(n^3)$ on sparse graphs.

Extensions:

  • Monte Carlo estimate of betweenness (Tyler et al.)
  • Local measure of betweenness (short loops), $O(m^4/n^2)$ (Radicchi et al.)
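The removal loop together with the modularity criterion can be sketched in plain Python. Assumptions to note: the betweenness routine is a Brandes-style accumulation in which equal-length shortest paths share their weight, the modularity uses the degrees of the original graph, and the function names and toy graph are hypothetical. This is an illustrative version, not an optimized one.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path edge betweenness (equal-length paths share their weight)."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma = {s: 0}, {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds, order, queue = {v: [] for v in adj}, [], deque([s])
        while queue:                                  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:            # v precedes w on a geodesic
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):                     # accumulate from the far end back
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += share
                delta[v] += share
    return {e: b / 2.0 for e, b in bet.items()}       # each pair was counted twice

def components(adj):
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def modularity(edges, comps):
    """Q = fraction of edges within communities minus its expected value."""
    m, deg = len(edges), {}
    for i, j in edges:
        deg[i] = deg.get(i, 0) + 1
        deg[j] = deg.get(j, 0) + 1
    q = 0.0
    for c in comps:
        inside = sum(1 for i, j in edges if i in c and j in c)
        q += inside / m - (sum(deg.get(v, 0) for v in c) / (2 * m)) ** 2
    return q

def girvan_newman(n, edges):
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    best = components(adj)
    best_q = modularity(edges, best)
    for _ in range(len(edges)):
        bet = edge_betweenness(adj)                   # recomputed after every removal
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
        comps = components(adj)
        q = modularity(edges, comps)
        if q > best_q:                                # remember the peak of Q
            best, best_q = comps, q
    return best, best_q

# Hypothetical toy graph: triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
best, best_q = girvan_newman(6, edges)
print(sorted(sorted(c) for c in best), round(best_q, 4))
```

On this graph the bridge 2-3 carries all nine shortest paths between the triangles, so it is removed first, and $Q$ peaks at the resulting two-triangle division.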

slide-35
SLIDE 35

Modularity: multiple communities

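Shortcomings: two communities, using only the leading eigenvector.

Goal: Generalize to $c$ communities.

$$S_{i,j} = \begin{cases} 1 & \text{if vertex } i \text{ belongs to community } j, \\ 0 & \text{otherwise.} \end{cases} \qquad (5)$$

Then the modularity (up to the constant factor $1/(2m)$) is

$$Q = \mathrm{Tr}(S^T B S) = \sum_{j=1}^{n} \sum_{k=1}^{c} \beta_j (u_j^T s_k)^2,$$

where $B = U D U^T$ ($D$ is diagonal with $D_{i,i} = \beta_i$) and $s_k$ is the $k$-th column of $S$.

Optimally: Choose mutually orthogonal $s_1, \dots, s_{c-1}$ proportional to the leading eigenvectors with positive eigenvalues.

Problem: The entries of $s_k$ are restricted to $\{0, 1\}$, so it may not be possible to find as many index vectors that make a positive contribution. The number of positive eigenvalues (plus one) therefore gives only an upper bound on the number of communities.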
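The trace form and its eigenvector expansion can be compared numerically. This sketch assumes NumPy, the null model $P_{i,j} = k_i k_j / 2m$, and a hypothetical graph of three triangles joined in a chain; dividing the trace by $2m$ restores the usual normalization of the modularity.

```python
import numpy as np

# Hypothetical graph: triangles (0,1,2), (3,4,5), (6,7,8) chained by edges 2-3 and 5-6.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5),
         (6, 7), (6, 8), (7, 8), (2, 3), (5, 6)]
n, c = 9, 3
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

k = A.sum(axis=1)
two_m = k.sum()
B = A - np.outer(k, k) / two_m

g = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])      # community label of each vertex
S = np.eye(c)[g]                               # S_ij = 1 if vertex i is in community j

trace_form = np.trace(S.T @ B @ S)

beta, U = np.linalg.eigh(B)                    # B = U D U^T
proj = U.T @ S                                 # entry (j, k) is u_j^T s_k
spectral_form = np.sum(beta[:, None] * proj**2)

Q = trace_form / two_m
print(trace_form, spectral_form, Q)
print("positive eigenvalues:", int(np.sum(beta > 1e-9)))
```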

slide-36
SLIDE 36

Modularity: multiple communities II

Generalization:

  • Rewrite the modularity (for a possibly negative $\alpha$):

$$Q = n\alpha + \mathrm{Tr}\left[ S^T U (D - \alpha I) U^T S \right].$$

  • Then define the $p \le n$ dimensional vertex vector $r_i$:

$$[r_i]_j = \sqrt{\beta_j - \alpha}\; U_{i,j}.$$

  • Keeping the leading $p$ eigenvalues:

$$Q \approx n\alpha + \sum_{k=1}^{c} \sum_{j=1}^{p} \Big[ \sum_{i \in G_k} [r_i]_j \Big]^2 =: n\alpha + \sum_{k=1}^{c} |R_k|^2.$$

Goal: Maximize the magnitudes of the vectors $R_k = \sum_{i \in G_k} r_i$ by dividing the vertices into groups.

Connections: This is also called the “principal component analysis of networks”.

slide-37
SLIDE 37

Modularity: multiple communities II

Maximizing the magnitude of $R_k$:

  • From the orthogonality of the eigenvectors (each retained eigenvector is orthogonal to $(1, \dots, 1)$) we have

$$\sum_{k=1}^{c} R_k = \sum_{i=1}^{n} r_i = 0.$$

  • For $c = 2$: $R_1$ and $R_2$ have equal magnitudes and opposite directions.
  • Removing vertex $i$ from community $k$ where $R_k^T r_i < 0$:

$$|R_k - r_i|^2 - |R_k|^2 = |r_i|^2 - 2 R_k^T r_i > 0.$$
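The vertex vectors and the removal rule can be illustrated as follows. The sketch assumes NumPy, keeps only the eigenvectors with positive eigenvalues, sets $\alpha = 0$ (both are choices made here for illustration), and uses a hypothetical two-triangle graph with one deliberately misplaced vertex.

```python
import numpy as np

# Hypothetical toy graph: triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
k = A.sum(axis=1)
B = A - np.outer(k, k) / k.sum()

beta, U = np.linalg.eigh(B)
keep = beta > 1e-9                             # leading p eigenvalues (the positive ones)
alpha = 0.0                                    # assumed choice of alpha
r = U[:, keep] * np.sqrt(beta[keep] - alpha)   # row i is the vertex vector r_i

def community_vectors(g, c):
    """Row k is R_k, the sum of r_i over the vertices i in community k."""
    return np.eye(c)[g].T @ r

# Correct division: R_1 and R_2 are equal in magnitude and opposite in direction.
R = community_vectors(np.array([0, 0, 0, 1, 1, 1]), 2)
print(np.allclose(r.sum(axis=0), 0), np.allclose(R[0], -R[1]))

# Misplaced vertex 2: it sits in community 1 although it belongs to the first triangle.
R_bad = community_vectors(np.array([0, 0, 1, 1, 1, 1]), 2)
i, kk = 2, 1
overlap = float(R_bad[kk] @ r[i])              # negative: r_i points away from R_k
gain = float(np.sum((R_bad[kk] - r[i])**2) - np.sum(R_bad[kk]**2))
rhs = float(np.sum(r[i]**2) - 2.0 * overlap)
print(overlap, gain, rhs)
```

A negative overlap $R_k^T r_i$ flags a vertex whose removal from community $k$ increases $|R_k|^2$, which is the situation of the misplaced vertex here.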

slide-38
SLIDE 38

Conclusion

  • Algorithms for graph bisection with known community sizes (Laplacian spectral algorithm, the Kernighan-Lin algorithm)
  • In a lot of real-world examples the community sizes are unknown (modularity algorithm)
  • The modularity matrix contains various information (anti-community structure, network correlation, community centrality)
  • Furthermore, in real-world examples there are usually more than two communities (Girvan and Newman algorithm, generalized modularity algorithm)
  • Modularity algorithm: connection with the configuration model, PCA