Walk Modularity: Graph partitioning based on a generalization of modularity



SLIDE 1

Background Walk Modularity Example Graphs Benchmark Test Conclusions

Walk Modularity: Graph partitioning based on a generalization of modularity

David Mehrle [1], Carnegie Mellon University, dmehrle@cmu.edu
Amy Strosser [1], Mount St. Mary's University, amstrosser@email.msmary.edu

[1] This research was supported by a National Science Foundation Research Experiences for Undergraduates Grant (Award #1062128) hosted by the Rochester Institute of Technology, with co-funding from the Department of Defense.

SLIDE 2

Graph Theory Background

Consider an undirected graph $G$ with $n$ vertices and $m$ edges.

The adjacency matrix is the $n \times n$ symmetric matrix $A$ with

$$A_{ij} = \begin{cases} 1 & \text{nodes } i \text{ and } j \text{ are connected by an edge} \\ 0 & \text{otherwise} \end{cases}$$
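As a small illustration of the definition above, the following sketch builds the adjacency matrix of an undirected graph with NumPy. The helper name `adjacency_matrix` and the six-vertex example graph (two triangles joined by one edge) are assumptions made for illustration and do not come from the slides.

```python
import numpy as np

def adjacency_matrix(n, edges):
    """Return the n x n symmetric 0/1 adjacency matrix of an undirected graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = 1
        A[j, i] = 1  # undirected graph, so A is symmetric
    return A

# Hypothetical example graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = adjacency_matrix(6, edges)

degrees = A.sum(axis=1)  # k_i, the degree of vertex i
m = A.sum() // 2         # every edge appears twice in A
```

The row sums of $A$ give the vertex degrees, and the total of all entries is $2m$.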

SLIDE 3

Modularity

Communities should have more edges within them than the number of edges you would expect based on random chance.

SLIDE 4

Modularity

Definition: Modularity (Newman, 2004)

$$Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - P_{ij} \right) \delta(c_i, c_j)$$

Compares actual vs. expected number of edges within clusters

  • $A_{ij}$ edges actually fall between vertices $i$ and $j$
  • Expect $P_{ij} = \frac{k_i k_j}{2m}$ edges between vertices $i$ and $j$
  • $k_i$ is the degree of vertex $i$
  • $c_i$ is the group to which vertex $i$ belongs
  • $\delta(c_i, c_j) = 1$ if $c_i = c_j$, and $0$ otherwise
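The definition can be evaluated directly for a small graph. This is a minimal sketch, assuming a dense NumPy adjacency matrix; the function name `modularity` and the two-triangle example graph are illustrative choices, not part of the original slides.

```python
import numpy as np

def modularity(A, labels):
    """Q = (1/2m) * sum_ij (A_ij - k_i k_j / 2m) * delta(c_i, c_j)."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                          # degrees k_i
    two_m = k.sum()                            # 2m
    P = np.outer(k, k) / two_m                 # expected edges P_ij
    same = labels[:, None] == labels[None, :]  # delta(c_i, c_j)
    return ((A - P) * same).sum() / two_m

# Hypothetical example: two triangles joined by the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one community
```

For this graph, putting each triangle in its own community gives $Q = 5/14$, while a single community gives $Q = 0$.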

SLIDE 5

Walk Modularity

Definition: Walk Modularity

$$Q_\ell = \frac{1}{2 m_\ell} \sum_{i,j} \left[ (A^\ell)_{ij} - (P^\ell)_{ij} \right] \delta(c_i, c_j)$$

Compares actual vs. expected number of walks of length $\ell$

  • $(A^\ell)_{ij}$ is the number of walks of length $\ell$ between $i$ and $j$
  • $(P^\ell)_{ij}$ is the expected number of walks of length $\ell$ between $i$ and $j$
  • $m_\ell$ is the number of walks of length $\ell$ in the graph
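A hedged sketch of how $Q_\ell$ might be evaluated numerically. Two points are assumptions not fixed by the slide: the expected-walk matrix is taken to be the $\ell$-th matrix power of $P = k k^T / 2m$, and the normalization $2 m_\ell$ is taken to be the sum of all entries of $A^\ell$, so that $\ell = 1$ reduces to ordinary modularity. The function name `walk_modularity` is hypothetical.

```python
import numpy as np

def walk_modularity(A, labels, ell):
    """Walk modularity Q_ell under the assumptions stated in the text above."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)
    P = np.outer(k, k) / k.sum()
    A_ell = np.linalg.matrix_power(A, ell)  # (A^ell)_ij counts walks of length ell
    P_ell = np.linalg.matrix_power(P, ell)  # ASSUMPTION: expected walks = P^ell
    two_m_ell = A_ell.sum()                 # ASSUMPTION: 2*m_ell = total of A^ell
    same = labels[:, None] == labels[None, :]
    return ((A_ell - P_ell) * same).sum() / two_m_ell

# Hypothetical example: two triangles joined by the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1

labels = [0, 0, 0, 1, 1, 1]
q1 = walk_modularity(A, labels, 1)  # ordinary modularity
q3 = walk_modularity(A, labels, 3)  # walks of length 3
```

With $\ell = 1$ the value agrees with ordinary modularity for the same partition.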

SLIDE 6

Walk Partitioning

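Partition the graph into two communities by maximizing $Q_\ell$

Define the partition vector $s$ by

$$s_i = \begin{cases} +1 & \text{vertex } i \text{ in cluster 1} \\ -1 & \text{vertex } i \text{ in cluster 2} \end{cases}$$

Let $B_\ell = A^\ell - P^\ell$. Note $\delta(c_i, c_j) = \frac{1}{2}(1 + s_i s_j)$

Dropping the positive factor $\frac{1}{4 m_\ell}$, which does not change the maximizer:

$$Q_\ell \propto \sum_{i,j} \left[ (A^\ell)_{ij} - (P^\ell)_{ij} \right] (1 + s_i s_j) = \underbrace{\sum_{i,j} (B_\ell)_{ij}}_{\text{constant}} + \underbrace{s^T B_\ell\, s}_{\text{maximize}}$$

There are $2^n$ possible choices for $s$, so brute force is not practical

We can find an approximately optimal solution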
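To make the $2^n$ search space concrete, here is a brute-force sketch that enumerates every sign vector for a tiny graph and keeps one that maximizes $s^T B_\ell\, s$. It is only feasible for very small $n$; the function name and the example graph are illustrative assumptions, and $\ell = 1$ is used so that $B_1 = A - P$.

```python
import itertools
import numpy as np

def best_bipartition_brute_force(B):
    """Try all 2^n sign vectors s and return one maximizing s^T B s."""
    n = B.shape[0]
    best_s, best_val = None, -np.inf
    for signs in itertools.product([1, -1], repeat=n):
        s = np.array(signs, dtype=float)
        val = s @ B @ s
        if val > best_val:
            best_s, best_val = s, val
    return best_s, best_val

# Hypothetical example: two triangles joined by the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1

k = A.sum(axis=1)
B = A - np.outer(k, k) / k.sum()  # B_1 = A - P
s, val = best_bipartition_brute_force(B)
```

For this six-vertex graph the search visits 64 vectors; at $n = 62$ (the dolphin network) it would already need about $4.6 \times 10^{18}$.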

SLIDE 7

Maximizing Walk-Modularity

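Expand $s$ in terms of the orthonormal eigenvectors $u_i$ of $B_\ell$, with eigenvalues $\beta_i$:

$$s = \sum_{i=1}^{n} a_i u_i, \qquad a_i = u_i^T s$$

To maximize $Q_\ell$, concentrate as much weight as possible on the largest eigenvalue. Up to an additive constant and a positive scale factor, $Q_\ell$ is

$$s^T B_\ell\, s = \Big( \sum_i a_i u_i^T \Big) B_\ell \Big( \sum_j a_j u_j \Big) = \sum_{i=1}^{n} (u_i^T s)^2 \beta_i$$

If $\beta$ is the largest eigenvalue of $A^\ell - P^\ell$, with eigenvector $u$, choose $s$ by the signs of the components $u_i$ of $u$:

$$s_i = \begin{cases} +1 & u_i \ge 0 \\ -1 & u_i < 0 \end{cases}$$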
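The sign rule above can be sketched with a dense symmetric eigensolver. This is an illustrative version, not the authors' implementation: it forms $B_\ell$ explicitly, which is only reasonable for small graphs, and it assumes $P^\ell$ means the $\ell$-th matrix power of $P = k k^T / 2m$. The function names are hypothetical.

```python
import numpy as np

def walk_modularity_matrix(A, ell):
    """B_ell = A^ell - P^ell, with P = k k^T / 2m (matrix powers assumed)."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    P = np.outer(k, k) / k.sum()
    return np.linalg.matrix_power(A, ell) - np.linalg.matrix_power(P, ell)

def spectral_bisection(A, ell=1):
    """Return s with s_i = +1 if u_i >= 0 else -1, u = leading eigenvector of B_ell."""
    B = walk_modularity_matrix(A, ell)
    eigvals, eigvecs = np.linalg.eigh(B)  # symmetric input, eigenvalues ascending
    u = eigvecs[:, -1]                    # eigenvector of the largest eigenvalue
    return np.where(u >= 0, 1, -1)

# Hypothetical example: two triangles joined by the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1

s = spectral_bisection(A, ell=1)
```

The overall sign of an eigenvector is arbitrary, so which triangle receives $+1$ may differ between runs or platforms; only the grouping is meaningful.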

SLIDE 8

Embedded K20

Erdős-Rényi random graph on 500 nodes with embedded $K_{20}$

Probability of an edge between 2 nodes in the random graph is 10%

Probability of an edge between a node in the random graph and a node in $K_{20}$ is 5%
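A test graph of this kind could be generated roughly as follows. The slide does not say whether the 20 clique vertices are counted among the 500 nodes; this sketch assumes they are added on top, giving 520 vertices. The function name, parameter names, and seed are illustrative.

```python
import numpy as np

def embedded_clique_graph(n_random=500, clique_size=20,
                          p_random=0.10, p_cross=0.05, seed=0):
    """Random graph with an embedded clique; the clique occupies the last rows."""
    rng = np.random.default_rng(seed)
    n = n_random + clique_size  # ASSUMPTION: clique nodes are extra nodes
    # Edge probability for every vertex pair.
    prob = np.full((n, n), p_cross)        # random node <-> clique node
    prob[:n_random, :n_random] = p_random  # random node <-> random node
    prob[n_random:, n_random:] = 1.0       # clique node <-> clique node
    # Sample each pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    return (upper | upper.T).astype(int)

A = embedded_clique_graph()
```

Sampling only the strict upper triangle keeps the matrix symmetric and the diagonal zero.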

SLIDE 9

Embedded K20

Partitioned using ℓ = 1, regular modularity

SLIDE 10

Embedded K20

Partitioned using ℓ = 2, walks of length 2

SLIDE 11

Embedded K20

Partitioned using ℓ = 3, walks of length 3

SLIDE 12

Embedded K20

Partitioned using ℓ = 4, walks of length 4

Rule of thumb for choosing ℓ: ℓ ≈ diameter of G

SLIDE 13

Dolphin Network (Lusseau 2003)

A group of 62 dolphins was tracked over ten years

The group split in two after one of the dolphins departed

A standard test used in the literature for graph partitioning algorithms

SLIDE 14

Dolphin Network

Modularity partition, ℓ = 1

±1 indicates the observed partitioning of the dolphin network

Red nodes are incorrectly placed relative to the observed partition

SLIDE 15

Dolphin Network

$Q_8$ walk-modularity partition, walks of length 8

±1 indicates the observed partitioning of the dolphin network

Red nodes are incorrectly placed relative to the observed partition

SLIDE 16

Dolphin Network

$Q_{10}$ walk-modularity partition, walks of length 10

±1 indicates the observed partitioning of the dolphin network

Red nodes are incorrectly placed relative to the observed partition

SLIDE 17

Multiple Communities

Recursively divide each community with spectral methods

For each subdivision, consider the change in walk-modularity

$$\Delta Q_\ell = \underbrace{Q_\ell^{\text{final}}}_{\text{after subdivide}} - \underbrace{Q_\ell^{\text{initial}}}_{\text{before subdivide}}$$

If splitting up a community gives $\Delta Q_\ell < 0$, don't subdivide

If all nodes are in a single community, don't subdivide
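The recursive procedure might look like the sketch below. Several details are assumptions, since the slides do not specify them: each community is split using the leading eigenvector of $B_\ell$ restricted to that community's rows and columns, $P^\ell$ is the $\ell$-th matrix power of $P$, $2 m_\ell$ is the sum of the entries of $A^\ell$, and a split with $\Delta Q_\ell = 0$ is rejected along with negative ones. All function names are hypothetical.

```python
import numpy as np

def walk_matrix(A, ell):
    """Return B_ell = A^ell - P^ell and the assumed normalization 2*m_ell."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    P = np.outer(k, k) / k.sum()
    A_ell = np.linalg.matrix_power(A, ell)
    return A_ell - np.linalg.matrix_power(P, ell), A_ell.sum()

def q_value(B, two_m_ell, labels):
    """Walk modularity of a labelling, given B_ell and 2*m_ell."""
    same = labels[:, None] == labels[None, :]
    return (B * same).sum() / two_m_ell

def recursive_partition(A, ell=1):
    B, two_m_ell = walk_matrix(A, ell)
    labels = np.zeros(B.shape[0], dtype=int)
    pending = [0]  # community labels still to be examined
    while pending:
        c = pending.pop()
        idx = np.flatnonzero(labels == c)
        # ASSUMPTION: split on the leading eigenvector of B restricted to idx.
        _, vecs = np.linalg.eigh(B[np.ix_(idx, idx)])
        positive = vecs[:, -1] >= 0
        if positive.all() or not positive.any():
            continue  # all nodes land in a single community: don't subdivide
        trial = labels.copy()
        new_label = labels.max() + 1
        trial[idx[~positive]] = new_label
        delta_q = q_value(B, two_m_ell, trial) - q_value(B, two_m_ell, labels)
        if delta_q > 0:  # keep only splits that increase Q_ell
            labels = trial
            pending += [c, new_label]
    return labels

# Hypothetical example: two triangles joined by the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1

labels = recursive_partition(A, ell=1)
```

On this example the first split separates the two triangles, and every attempt to subdivide a triangle lowers $Q_1$, so the recursion stops at two communities.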

SLIDE 18

Benchmark Tests (Lancichinetti et al. 2008)

Benchmark test for community detection algorithms designed by Lancichinetti et al. (2008)

Joins communities based on a mixing parameter, $\mu$

  • Moves edges between communities with probability $\mu$

The following slides show a benchmark graph generated with $n = 500$, $\mu = 0.15$, $\bar{k} = 25$

Each vertex is placed within a single well-defined community

SLIDE 19

Benchmark Test

The communities as defined by the test generator

SLIDE 20

Benchmark Test (Modularity)

The communities as found by edge-modularity (ℓ = 1)

SLIDE 21

Benchmark Test (ℓ = 8)

The communities as found by walk-modularity (ℓ = 8)

SLIDE 22

Computational Complexity

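Same asymptotic complexity as modularity, $O(n^2)$

Power method to find the leading eigenvector of $B_\ell$:

$$x_{t+1} = \frac{B_\ell\, x_t}{\lVert B_\ell\, x_t \rVert}, \qquad x_1 \in \mathbb{R}^n \text{ random}$$

Repeated multiplication against a vector avoids computing matrix powers:

$$(A^\ell - P^\ell)\, x = \underbrace{A \cdot A \cdot A \cdots A}_{\ell \text{ times},\ O(n^2)}\, x \; - \; \underbrace{P \cdot P \cdot P \cdots P}_{\ell \text{ times},\ O(n^2)}\, x$$

Comparably fast in practice as well: the above tests ran in under 1 s for most $\ell$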
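A sketch of the power iteration using only matrix-vector products. Two things here go beyond the slide and are assumptions: a diagonal shift is added so that the algebraically largest eigenvalue is also the largest in magnitude (plain power iteration can otherwise lock onto a large negative eigenvalue or fail to converge), and $P x$ is computed through the rank-one form $k (k^T x) / 2m$ rather than a dense product. Function and parameter names are hypothetical.

```python
import numpy as np

def leading_eigenvector(A, ell=1, iters=1000, seed=0):
    """Shifted power iteration for the leading eigenvector of B_ell = A^ell - P^ell."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    two_m = k.sum()
    # ASSUMPTION (not on the slide): shift by an upper bound on |eigenvalues of B_ell|,
    # using ||A|| <= max degree and ||P|| = k.k / 2m, so B_ell + shift*I is PSD.
    shift = k.max() ** ell + (k @ k / two_m) ** ell

    def apply_B(x):
        ax, px = x, x
        for _ in range(ell):
            ax = A @ ax                # one dense matvec, O(n^2)
            px = k * (k @ px) / two_m  # P @ px with P = k k^T / 2m, O(n)
        return ax - px                 # (A^ell - P^ell) x without forming powers

    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = apply_B(x) + shift * x
        x = y / np.linalg.norm(y)
    return x

# Hypothetical example: two triangles joined by the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1

x = leading_eigenvector(A, ell=1)
s = np.where(x >= 0, 1, -1)
```

A fixed iteration count keeps the sketch short; a practical version would stop once successive iterates agree to a tolerance.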

SLIDE 23

Conclusions

In most of our real-world and benchmark tests so far, walk-modularity performs significantly better than edge-modularity

Comparable speed both asymptotically and practically

Very similar to modularity, which is often used in practice

SLIDE 24

Acknowledgements

Thank you to . . .

  • Dr. Anthony Harkin, for mentoring and suggestions
  • Dr. Darren Narayan, for organizing the REU
  • Rochester Institute of Technology
  • the National Science Foundation, for grant #1062128
  • the Department of Defense, for co-funding
  • the AMS and MAA, for organizing the JMM