Graph Sampling and Sparsification, Lecture 19, CSCI 4974/6971, 7 Nov 2016


slide-1
SLIDE 1

Graph Sampling and Sparsification

Lecture 19 CSCI 4974/6971 7 Nov 2016

1 / 10

slide-2
SLIDE 2

Today’s Biz

  • 1. Reminders
  • 2. Review
  • 3. Graph Sampling/Sparsification

2 / 10

slide-3
SLIDE 3

Reminders

◮ Assignment 4: due date November 10th

◮ Setting up and running on CCI clusters

◮ Assignment 5: due date TBD (before Thanksgiving break, probably the 22nd)

◮ Assignment 6: due date TBD (early December)

◮ Tentative: No class November 14 and/or 17

◮ Final Project Presentation: December 8th

◮ Project Report: December 11th

◮ Office hours: Tuesday & Wednesday 14:00-16:00, Lally 317

◮ Or email me for other availability

3 / 10

slide-4
SLIDE 4

Today’s Biz

  • 1. Reminders
  • 2. Review
  • 3. Graph Sampling/Sparsification

4 / 10

slide-5
SLIDE 5

Quick Review

Graph Compression:

5 / 10

slide-6
SLIDE 6

Today’s Biz

  • 1. Reminders
  • 2. Review
  • 3. Graph Sampling/Sparsification

6 / 10

slide-7
SLIDE 7

Sampling and Summarization for Social Networks
Shou-De Lin, Mi-Yen Yeh, and Cheng-Te Li, National Taiwan University

7 / 10

slide-8
SLIDE 8

Sampling and Summarization for Social Networks

PAKDD 2013 Tutorial Shou‐De Lin*, Mi‐Yen Yeh#, and Cheng‐Te Li*

* Computer Science and Information Engineering, National Taiwan University

# Institute of Information Science, Academia Sinica

sdlin@csie.ntu.edu.tw, miyen@iis.sinica.edu.tw, d98944005@csie.ntu.edu.tw

Tutorial slides can be downloaded here: http://mslab.csie.ntu.edu.tw/tut‐pakdd13/

slide-9
SLIDE 9

About This Tutorial

  • It is a two‐hour tutorial for PAKDD 2013 on social network sampling and summarization

– We do not expect to cover everything relevant to this topic.
– We will highlight the trends, categorize the different types of strategies, and describe some of our ongoing work.

  • Agenda

– Introduction + Sampling + Q/A (45+10 min)
– Summarization + Conclusion + Q/A (45+10 min)

13/05/02 Lin et al., Sampling and Summarization for Social Networks, PAKDD 2013 tutorial 2

slide-10
SLIDE 10

Big Social Network: billions of different types of nodes and links

[Figure: network visualization by Paul Butler]

What can be mined from this picture?

slide-11
SLIDE 11

Motivation

[Figure: user counts of large social networks: >1 billion, >500 million, >200 million]

  • Sometimes the full network is not completely observed in advance
  • Even if it is, loading everything into memory for further analysis might not be feasible
  • Even if that is feasible, generating some simple statistics (e.g. average path length, diameter) can take a long time, not to mention more complicated ones (e.g. counting the occurrences of a certain pattern)

slide-12
SLIDE 12
An Example on Facebook

  • 1+ billion users
  • Average of 130 friends per node

Simply storing the raw graph data (without attributes, labels, or content) costs >1 TB of memory.

This can cause problems for information extraction, processing, and analysis.

Two possible solutions: Sampling and Summarization
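The >1 TB figure can be checked with a back-of-the-envelope calculation. The sketch below assumes each adjacency entry is stored as a 64-bit node id (the 8-byte size is an assumption, not stated on the slide) and that every friendship is stored once per endpoint.

```python
# Rough storage estimate for the raw Facebook adjacency data.
USERS = 1_000_000_000        # "1+ billion users" (from the slide)
AVG_FRIENDS = 130            # average friends per node (from the slide)
BYTES_PER_NEIGHBOR_ID = 8    # assumption: one 64-bit id per adjacency entry


def adjacency_bytes(users, avg_friends, bytes_per_id):
    """Bytes needed for all adjacency-list entries (each friendship stored at both endpoints)."""
    return users * avg_friends * bytes_per_id


total = adjacency_bytes(USERS, AVG_FRIENDS, BYTES_PER_NEIGHBOR_ID)
print(f"{total / 1e12:.2f} TB")  # prints "1.04 TB"
```

With 4-byte ids the estimate would halve, so the slide's figure appears to presume 8-byte ids or some per-entry overhead.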

slide-13
SLIDE 13

Sampling Versus Summarization

  • Sampling

– Assume that information about nodes/links becomes known only after they are sampled
– Requires a sampling strategy to explore/expand the network gradually
– Goal: gradually identify a small set of representative nodes and links of a social network, usually given little prior information about the network

  • Summarization

– The entire social network is known a priori
– Goal: condense the social network as much as possible without losing too much information

slide-14
SLIDE 14

Homogeneous VS Heterogeneous Social Networks

  • Homogeneous: Single‐Relational Network

– Single object type & link type

  • Heterogeneous: Multi‐Relational Network

– Multiple object types & link types

  • Example

– Homogeneous: link types = {Friend}
– Heterogeneous: link types = {Friend, Family, Love}

slide-15
SLIDE 15

Sampling for Social Networks

slide-16
SLIDE 16

Sampling Social Networks

  • Assume that the detailed information of a node can only be seen after it is sampled

– The entire social network is not known in advance

  • Goal

– Sample (i.e. gradually observe nodes and links) a sub‐network that represents the whole network
    • To preserve certain properties of the original network

slide-17
SLIDE 17

Evaluating the Sampling Quality

  • How do we measure the quality of a sampling algorithm?
  • A sampling algorithm is effective if

– The sampled social network preserves certain network properties
– Performing an end task (e.g. centrality analysis, link prediction) on the sampled network produces results similar to those obtained on the fully observed network
– The sampled sub‐network is small

slide-18
SLIDE 18

Properties Preserved (1/3)

  • Homogeneous Static Social Network

– In/Out Degree Distribution
– Path Length Distribution
– Clustering Coefficient Distribution
– Eigenvalues
– Weakly/Strongly Connected Component Size Distribution
– Community Structure
– Etc.

slide-19
SLIDE 19

Properties Preserved (2/3)

  • Homogeneous Dynamic Social Networks (graphs are time‐evolving)

– Densification Power Law
    • Number of edges vs. number of nodes over time
– Shrinking diameter
    • The diameter is observed to shrink and then stabilize over time
– Average clustering coefficient over time
– Largest singular value of the graph adjacency matrix over time
– Etc.

slide-20
SLIDE 20

Properties Preserved (3/3)

  • Heterogeneous Social Network

– Node type distribution
– Intra‐link and inter‐link type distribution
– Higher‐order type connections

slide-21
SLIDE 21

Evaluation Metrics

  • Whether certain properties are preserved

– For single‐value properties (e.g. clustering coefficient, average path length), one can measure whether the value is preserved
– For distributional properties (e.g. degree distribution, component size distribution), one can compute the distance between the two distributions (e.g. KL divergence)

  • Whether a certain end task can be performed similarly

– Perform the task using the sampled network, and check whether the results are similar to those obtained when the full network is used
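As an illustration of the distributional check, here is a minimal sketch that compares the degree distribution of a sample with that of the full graph using KL divergence. The toy graphs, the function names, and the `eps` smoothing (used to avoid log(0) when a degree value is missing from one distribution) are assumptions made for this example.

```python
import math
from collections import Counter


def degree_distribution(adj):
    """Map degree -> fraction of nodes with that degree; adj is {node: set(neighbors)}."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in counts.items()}


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over the union of supports, with eps smoothing (an assumption)."""
    support = set(p) | set(q)
    p_s = {k: p.get(k, 0.0) + eps for k in support}
    q_s = {k: q.get(k, 0.0) + eps for k in support}
    zp, zq = sum(p_s.values()), sum(q_s.values())
    return sum((p_s[k] / zp) * math.log((p_s[k] / zp) / (q_s[k] / zq)) for k in support)


# Hypothetical toy data: a 4-node graph and the subgraph induced by nodes 1, 2, 3.
full = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
sample = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
d_full = degree_distribution(full)
d_sample = degree_distribution(sample)
print(kl_divergence(d_sample, d_full))
```

A value near zero indicates that the sample's degree distribution is close to that of the full graph; KL divergence is asymmetric, so the argument order matters.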

slide-22
SLIDE 22

Sampling for Homogeneous Social Networks

slide-23
SLIDE 23

Three Main Strategies

  • Node Selection
  • Edge Selection
  • Sampling by Exploration

– Random Walk
– Graph Search
– Chain‐Referral Sampling

[Figure label: Seeds (i.e., ego)]

slide-24
SLIDE 24

Node Selection

  • Random Node Sampling

– Uniformly select a set of nodes

  • Degree‐based Sampling [Adamic’01]

– the probability of a node being selected is proportional to its degree (assuming known)

  • PageRank‐based Sampling [Leskovec’06]

– the probability of a node being selected is proportional to its PageRank value (assuming known)
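The first two node-selection strategies can be sketched in a few lines. This is a minimal illustration, assuming an undirected graph stored as `{node: set(neighbors)}` and sampling without replacement (the slide does not say whether nodes are drawn with or without replacement); the function names and toy graph are hypothetical.

```python
import random


def random_node_sampling(adj, k, rng):
    """Uniformly select k distinct nodes."""
    return rng.sample(sorted(adj), k)


def degree_based_sampling(adj, k, rng):
    """Select up to k distinct nodes; each draw is proportional to degree (degrees assumed known)."""
    remaining = {v: len(nbrs) for v, nbrs in adj.items() if nbrs}
    chosen = []
    while remaining and len(chosen) < k:
        nodes = list(remaining)
        v = rng.choices(nodes, weights=[remaining[u] for u in nodes], k=1)[0]
        chosen.append(v)
        del remaining[v]  # without replacement (an assumption)
    return chosen


# Hypothetical toy graph: a star centered at 0 plus an isolated node 5.
ADJ = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: set()}
uniform_nodes = random_node_sampling(ADJ, 3, random.Random(0))
degree_nodes = degree_based_sampling(ADJ, 5, random.Random(0))
```

PageRank-based sampling would follow the same pattern as `degree_based_sampling`, with PageRank values as the weights.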


slide-25
SLIDE 25

Edge Selection

  • Random Edge (RE) Sampling

– Uniformly select edges at random, and then include the associated nodes

  • Random Node‐Edge (RNE) Sampling

– Uniformly select a node, then uniformly select an edge incident to it

  • Hybrid Sampling [Leskovec’06]

– With probability p perform RE sampling, with probability 1‐p perform RNE sampling
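The three edge-selection schemes above can be sketched as follows. This is an illustrative sketch, not the authors' code: the graph is an undirected edge list, sampling continues until a requested number of distinct edges is collected, and the helper names and the value of `p` are hypothetical.

```python
import random


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def random_edge(edges, rng):
    """RE: pick an edge uniformly at random."""
    return rng.choice(edges)


def random_node_edge(adj, rng):
    """RNE: pick a node uniformly, then one of its incident edges uniformly."""
    u = rng.choice(sorted(adj))
    v = rng.choice(sorted(adj[u]))
    return (u, v)


def hybrid_sampling(edges, n_edges, p, rng):
    """Hybrid: each step uses RE with probability p and RNE with probability 1 - p."""
    adj = build_adjacency(edges)
    n_edges = min(n_edges, len({tuple(sorted(e)) for e in edges}))
    sampled = set()
    while len(sampled) < n_edges:
        e = random_edge(edges, rng) if rng.random() < p else random_node_edge(adj, rng)
        sampled.add(tuple(sorted(e)))  # canonical order for undirected edges
    nodes = {x for e in sampled for x in e}
    return nodes, sampled


# Hypothetical toy graph and mixing probability.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
nodes, sampled = hybrid_sampling(EDGES, 3, 0.8, random.Random(7))
```

RE tends to favor high-degree nodes (they touch more edges), while RNE starts from a uniform node, so the mixing probability p trades off between the two biases.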


slide-26
SLIDE 26

Edge Selection (cont.)

  • Induced Edge Sampling [Ahmed’12]

– Step 1: Uniformly select edges (and consequently nodes) for several rounds
– Step 2: Add edges that exist between sampled nodes

  • Frontier Sampling [Ribeiro’10]

– Step 0: Randomly select a set of nodes L as seeds
– Step 1: Select a seed u from L using degree‐based sampling
– Step 2: Select an edge of u, (u, v), uniformly
– Step 3: Replace u by v in L and add (u, v) to the sequence of sampled edges
– Repeat Steps 1 to 3
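The frontier sampling steps can be sketched as below. This is a minimal sketch assuming an undirected graph stored as `{node: set(neighbors)}`, seeds drawn from non-isolated nodes, and a fixed step budget as the stopping rule; these choices and the names are assumptions for illustration.

```python
import random


def frontier_sampling(adj, n_seeds, n_steps, rng):
    """Return the sequence of sampled edges produced by frontier sampling."""
    candidates = sorted(v for v in adj if adj[v])
    frontier = rng.sample(candidates, n_seeds)  # Step 0: seed list L
    sampled_edges = []
    for _ in range(n_steps):
        weights = [len(adj[u]) for u in frontier]
        i = rng.choices(range(len(frontier)), weights=weights, k=1)[0]  # Step 1: degree-based
        u = frontier[i]
        v = rng.choice(sorted(adj[u]))  # Step 2: uniform edge of u
        frontier[i] = v                 # Step 3: replace u by v in L
        sampled_edges.append((u, v))
    return sampled_edges


# Hypothetical toy graph: a 6-cycle with one chord.
ADJ = {0: {1, 5, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4, 0}, 4: {3, 5}, 5: {4, 0}}
edges_seen = frontier_sampling(ADJ, n_seeds=2, n_steps=10, rng=random.Random(1))
```

Because the graph is undirected, the node v that replaces u always has at least one neighbor, so the frontier never gets stuck.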

slide-27
SLIDE 27

Sampling by Exploration

  • Random Walk [Gjoka’10]

– The next‐hop node is chosen uniformly among the neighbors of the current node

  • Random Walk with Restart [Leskovec’06]

– Uniformly select a random node and perform a random walk with restarts

  • Random Jump [Ribeiro’10]

– Same as random walk but with a probability p we jump to any node in the network

  • Forest Fire [Leskovec’06]

– Choose a node u uniformly
– Generate a random number z and select z out‐links of u that are not yet visited
– Apply this step recursively to all newly added nodes
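A sketch of Forest Fire sampling under stated assumptions: the slide only says that z is "a random number", so here z is drawn from a geometric distribution controlled by a hypothetical forward-burning probability `p_forward` (which must be below 1), and the fire restarts from a fresh uniform node whenever it dies out before the target size is reached.

```python
import random
from collections import deque


def forest_fire(adj, target_size, p_forward, rng):
    """Forest-fire sample of nodes; adj is {node: set(out-neighbors)}."""
    nodes = sorted(adj)
    target_size = min(target_size, len(nodes))
    visited = set()
    while len(visited) < target_size:
        # Choose a (not yet visited) node u uniformly and start a fire there.
        seed = rng.choice([v for v in nodes if v not in visited])
        visited.add(seed)
        queue = deque([seed])
        while queue and len(visited) < target_size:
            u = queue.popleft()
            z = 0  # assumption: geometric z (successes before the first failure)
            while rng.random() < p_forward:
                z += 1
            unvisited = [w for w in sorted(adj[u]) if w not in visited]
            for w in rng.sample(unvisited, min(z, len(unvisited))):
                if len(visited) >= target_size:
                    break
                visited.add(w)
                queue.append(w)  # recurse on newly added nodes
    return visited


# Hypothetical toy graph.
ADJ = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 4}, 3: {1, 5}, 4: {2, 5}, 5: {3, 4}}
burned = forest_fire(ADJ, target_size=4, p_forward=0.7, rng=random.Random(1))
```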

slide-28
SLIDE 28

Sampling by Exploration (cont.)

  • Ego‐Centric Exploration (ECE) Sampling

– Similar to random walk, but each neighbor is selected with probability p
– Multiple ECE (starting with multiple seeds)

  • Depth‐First / Breadth‐First Search [Krishnamurthy’05]

– Keep visiting neighbors of the most recently / earliest visited nodes

  • Sample Edge Count [Maiya’11]

– Move to the neighbor with the highest degree, and keep going

  • Expansion Sampling [Maiya’11]

– Construct a sample with maximal expansion: select the neighbor v ∈ N(S) that maximizes |N({v}) − (N(S) ∪ S)|, where S is the set of sampled nodes and N(S) is the 1st neighbor set of S

slide-29
SLIDE 29

Example: Expansion Sampling

[Figure: example graph on nodes A, B, C, D, E, F, G, H, with sampled set S = {A}]

|N({A})| = 4
|N({E}) − (N({A}) ∪ {A})| = |{F, G, H}| = 3
|N({D}) − (N({A}) ∪ {A})| = |{F}| = 1
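The expansion computation in this example can be reproduced with a small sketch. The edge list below is hypothetical: the figure itself is not recoverable, so the edges were chosen only to be consistent with the three quantities shown on the slide.

```python
def neighborhood(adj, nodes):
    """N(S): nodes adjacent to S that are not themselves in S."""
    return set().union(*(adj[v] for v in nodes)) - set(nodes)


def expansion_gain(adj, sampled, v):
    """|N({v}) - (N(S) ∪ S)|: how many brand-new nodes v would expose."""
    return len(adj[v] - (neighborhood(adj, sampled) | set(sampled)))


def next_node(adj, sampled):
    """Pick the frontier node with the largest expansion gain."""
    frontier = neighborhood(adj, sampled)
    return max(sorted(frontier), key=lambda v: expansion_gain(adj, sampled, v))


# Hypothetical edges consistent with the slide's numbers.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
         ("E", "F"), ("E", "G"), ("E", "H"), ("D", "F")]
ADJ = {}
for u, v in EDGES:
    ADJ.setdefault(u, set()).add(v)
    ADJ.setdefault(v, set()).add(u)

S = {"A"}
print(len(neighborhood(ADJ, S)))   # 4
print(expansion_gain(ADJ, S, "E")) # 3
print(expansion_gain(ADJ, S, "D")) # 1
print(next_node(ADJ, S))           # E
```

Under these assumed edges, E is selected next because it exposes three unseen nodes while D exposes only one.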

slide-30
SLIDE 30

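Drawback of Random Walk: Degree Bias!

[Figure: q_k, the sampled node degree distribution, vs. p_k, the real node degree distribution]

  • Real average node degree ~ 94, sampled average node degree ~ 338
  • Solution: modify the transition probability:

$$
P_{v,w} =
\begin{cases}
\dfrac{1}{k_v} \min\left(1, \dfrac{k_v}{k_w}\right) & \text{if } w \text{ is a neighbor of } v \\
1 - \sum_{y \neq v} P_{v,y} & \text{if } w = v \\
0 & \text{otherwise}
\end{cases}
$$

where k_v denotes the degree of node v.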
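The modified transition rule can be sketched as a Metropolis-Hastings style walk: propose a uniform neighbor w of v and accept the move with probability min(1, k_v / k_w), otherwise stay at v. The sketch assumes an undirected graph without self-loops or isolated nodes; the function names and the toy graph are hypothetical.

```python
import random


def mh_transition(adj, v, w):
    """Transition probability P(v, w) of the degree-corrected walk."""
    kv = len(adj[v])
    if w in adj[v]:
        return (1.0 / kv) * min(1.0, kv / len(adj[w]))
    if w == v:
        return 1.0 - sum(mh_transition(adj, v, y) for y in adj[v])
    return 0.0


def mh_random_walk(adj, start, n_steps, rng):
    """Simulate the walk: uniform proposal, then accept with min(1, k_v / k_w)."""
    walk = [start]
    v = start
    for _ in range(n_steps):
        w = rng.choice(sorted(adj[v]))
        if rng.random() < min(1.0, len(adj[v]) / len(adj[w])):
            v = w  # accept the move; otherwise the walk stays at v
        walk.append(v)
    return walk


# Hypothetical toy graph: a star with hub 0.
ADJ = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
walk = mh_random_walk(ADJ, 0, 50, random.Random(3))
```

For neighbors v and w the rule gives P(v, w) = P(w, v) = 1 / max(k_v, k_w), so the transition matrix is symmetric and its stationary distribution is uniform over nodes, which is what removes the bias toward high-degree nodes.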
slide-31
SLIDE 31

Metropolis Graph Sampling

  • Step 1: Initially pick one subgraph sample S with n’ nodes at random
  • Step 2: Iterate the following steps until convergence

2.1: Remove one node from S
2.2: Randomly add a new node to S, giving S’
2.3: Compute the likelihood ratio $\alpha = \rho^*(S') / \rho^*(S)$

    if $\alpha \geq 1$: S := S’
    if $\alpha < 1$: S := S’ with probability $\alpha$, S := S with probability $1 - \alpha$

– $\rho^*(S)$ measures the similarity of a certain property between the sample S and the original network G
    • It can be derived approximately using simulated annealing [Hubler’08]
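The accept/reject loop can be sketched as follows. The score function here is purely hypothetical (it rewards samples whose induced average degree is close to the full graph's average degree) and is not the similarity measure from [Hubler’08]; a fixed iteration budget also stands in for the convergence test.

```python
import math
import random


def avg_induced_degree(adj, nodes):
    """Average degree of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    return sum(len(adj[v] & nodes) for v in nodes) / len(nodes)


def make_score(adj, temperature=0.1):
    """Hypothetical score: exp(-|average-degree difference| / T)."""
    target = avg_induced_degree(adj, adj)
    return lambda nodes: math.exp(-abs(avg_induced_degree(adj, nodes) - target) / temperature)


def metropolis_sample(adj, n_sample, n_iters, score, rng):
    """Metropolis search over node subsets of fixed size n_sample (< number of nodes)."""
    all_nodes = sorted(adj)
    current = set(rng.sample(all_nodes, n_sample))  # Step 1
    for _ in range(n_iters):                         # Step 2 (fixed budget, an assumption)
        removed = rng.choice(sorted(current))                            # 2.1
        added = rng.choice([v for v in all_nodes if v not in current])   # 2.2
        proposal = (current - {removed}) | {added}
        alpha = score(proposal) / score(current)                         # 2.3
        if alpha >= 1 or rng.random() < alpha:
            current = proposal
    return current


# Hypothetical toy graph.
ADJ = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 4}, 3: {1, 5}, 4: {2, 5}, 5: {3, 4}}
sample = metropolis_sample(ADJ, 3, 200, make_score(ADJ), random.Random(0))
```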

slide-32
SLIDE 32

Sampling for Heterogeneous Social Networks

slide-33
SLIDE 33

Sampling on Heterogeneous Social Networks

  • Heterogeneous Social Networks (HSN)

– A graph G = <V, E> has n nodes (v1, v2, …, vn), m directed edges (e1, …, em), and k different types
– Each node/edge belongs to a type
    • Given a finite set L = {L1, ..., Lk} denoting the k types

  • Sampling methods for HSN

– Multi‐graph sampling [Gjoka’10]
– Type‐distribution preserving sampling [Li’11]
– Relational‐profile preserving sampling [Yang’13]

slide-34
SLIDE 34

Multigraph Sampling

  • Perform random‐walk sampling on the union multigraph (the union of multiple graphs), so that the walk does not stop inside a disconnected graph.

slide-35
SLIDE 35

Sampling Heterogeneous Social Networks

  • Sampling methods for HSN

– Multi‐graph sampling [Gjoka’10]
– Type‐distribution preserving sampling [Li’11]
– Relational‐profile preserving sampling [Yang’13]

slide-36
SLIDE 36

Node Type Distribution Preserving Sampling

  • Given a graph G and a sampled subgraph Gs
  • The node type distribution of Gs is expected to be the same as that of G, i.e., d(Dist(Gs), Dist(G)) = 0

– d() denotes the difference between two distributions

[Figure: sampled network and original network with equal node type ratios, (9:6) = (3:2)]

slide-37
SLIDE 37

Connection‐type Preserving Sampling

  • Heterogeneous Connection

– For an edge E[vi, vj]
– Intra‐connection edge: Type(vi) = Type(vj)
– Inter‐connection edge: Type(vi) != Type(vj)

  • Intra‐Relationship preserving

– The ratio of intra‐connection edges should be preserved, that is: d(IR(Gs), IR(G)) = 0
– If the intra‐relationship is preserved, the inter‐relationship is also preserved
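Both preservation criteria reduce to comparing simple statistics of the sample and the full graph. The sketch below assumes an L1 distance for d(), which the slides leave unspecified, and uses hypothetical toy data mirroring the 9:6 vs. 3:2 ratio.

```python
from collections import Counter


def type_distribution(node_type, nodes):
    """Fraction of nodes of each type."""
    counts = Counter(node_type[v] for v in nodes)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def intra_ratio(node_type, edges):
    """IR: fraction of edges whose two endpoints have the same type."""
    intra = sum(1 for u, v in edges if node_type[u] == node_type[v])
    return intra / len(edges)


def l1_distance(p, q):
    """One possible choice for d(); an assumption."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# Hypothetical data: 9 nodes of type "a" and 6 of type "b" in G; 3 and 2 in the sample.
NODE_TYPE = {i: "a" for i in range(9)}
NODE_TYPE.update({i: "b" for i in range(9, 15)})
SAMPLE_NODES = [0, 1, 2, 9, 10]
d = l1_distance(type_distribution(NODE_TYPE, NODE_TYPE),
                type_distribution(NODE_TYPE, SAMPLE_NODES))

EDGES = [(0, 1), (1, 2), (2, 9), (9, 10)]
ir = intra_ratio(NODE_TYPE, EDGES)  # 3 of the 4 edges are intra-type
```

Since every edge is either intra- or inter-type, the inter-connection ratio is 1 − IR, which is why preserving one preserves the other.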

slide-38
SLIDE 38

Respondent‐driven Sampling

  • First proposed in social science [Heck’99] to address the hidden‐population problem in surveying.
  • Two main phases: snowball sampling, then finding the steady state of the recruitment matrix

[Figure: respondents in G recruit further respondents, each with a limited number of coupons c; the recruitment counts form a transition matrix (S11 … S33), and the N‐step transition yields the steady‐state vector (P1, P2, P3)]

slide-39
SLIDE 39
Comparing Different Sampling Algorithms

[Figure: two plots, similarity of node type distribution and similarity of intra‐link distribution]

  • Respondent‐driven sampling does a good job at small sample sizes, but its quality saturates at a mediocre level afterwards
  • Random node sampling performs poorly in the beginning, but reaches the best results once a sufficient number of nodes has been sampled.

slide-40
SLIDE 40

Heterogeneous Social Networks

  • Sampling methods for HSN

– Multi‐graph sampling [Gjoka’10]
– Type‐distribution preserving sampling [Li’11]
– Relational‐profile preserving sampling [Yang’13]

slide-41
SLIDE 41

Relational Profile Preserving Sampling

  • Node‐type/intra‐type preservation considers the semantics of nodes, but not the structure of the network
  • The Relational Profile is proposed to consider semantics and structure together

– Captures the dependency between each Node Type (NT) and Edge Type (ET) of a directed heterogeneous network
– Consists of 4 relational matrices
    • Conditional probabilities P(Tj|Ti) (e.g. P(ET=cites|NT=paper))
    • Node to node, node to edge, edge to node, edge to edge

         NT                  ET
NT   Transition Matrix   Transition Matrix
ET   Transition Matrix   Transition Matrix

[Figure labels: paper, cites, journal_of, authored, author]
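Two of the four relational matrices can be sketched as row-normalized co-occurrence counts. The pairing used here (source node type with outgoing edge type, and source node type with target node type) is an assumption about how the profile is built; the slides do not spell it out, and the toy network is hypothetical.

```python
from collections import Counter, defaultdict


def conditional_matrix(pairs):
    """Row-normalize co-occurrence counts: returns {Ti: {Tj: P(Tj | Ti)}}."""
    counts = defaultdict(Counter)
    for ti, tj in pairs:
        counts[ti][tj] += 1
    return {ti: {tj: c / sum(row.values()) for tj, c in row.items()}
            for ti, row in counts.items()}


def node_to_edge_profile(node_type, typed_edges):
    """P(ET | NT): source node type vs. type of its outgoing edge (an assumed pairing)."""
    return conditional_matrix((node_type[u], et) for u, v, et in typed_edges)


def node_to_node_profile(node_type, typed_edges):
    """P(NT | NT): source node type vs. target node type (an assumed pairing)."""
    return conditional_matrix((node_type[u], node_type[v]) for u, v, et in typed_edges)


# Hypothetical toy bibliographic network: directed edges (source, target, edge type).
NODE_TYPE = {"p1": "paper", "p2": "paper", "a1": "author", "j1": "journal"}
TYPED_EDGES = [("p1", "p2", "cites"), ("p1", "j1", "journal_of"),
               ("a1", "p1", "authored"), ("a1", "p2", "authored")]
nt_et = node_to_edge_profile(NODE_TYPE, TYPED_EDGES)
nt_nt = node_to_node_profile(NODE_TYPE, TYPED_EDGES)
print(nt_et["paper"]["cites"])  # 0.5
```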

slide-42
SLIDE 42

Example of Relational Profile (RP)

Two example relational profiles over the types P, A, C, J, c, p, a. Only the non‐empty entries of each row are listed.

    P A C J c p a
P   0.44 0.22 0.22 0.11 0.44 0.33 0.22
A   1 1
C   1 1
J   1 1
c   1 0.22 0.44 0.33
p   0.5 0.33 0.17 0.66 0.33
a   0.5 0.5 0.6 0.4

    P A C J c p a
P   0.182 0.364 0.091 0.273 0.182 0.364 0.364
A   1 1
C   1 1
J   1 1
c   1 0.5 0.5
p   0.5 0.125 0.375 0.17 0.5 0.33
a   0.5 0.5 0.22 0.33 0.44

slide-43
SLIDE 43

Challenge: How to approximate RP when the true RP is unknown

  • We propose Exploration by Expectation Sampling
  • Aim: preserve the unknown relational profile while adding new sample nodes
  • 1. Randomly choose a starting node and its corresponding edges
  • 2. Based on the current RP, select the next node from all 1‐hop neighbors
  • 3. Add the new node and all its edges
  • 4. Update the RP of the sub‐sampled graph
  • 5. Repeat steps 2, 3 & 4 until the RP converges
  • Which node should be selected?

– Select the node whose inclusion can potentially lead to the largest change in the existing RP
    • Use the partially observed RP to generate the ‘expected amount of change’ of each node as its score
    • Weighted sampling based on the score

slide-44
SLIDE 44

Relational Profile Sampling (RPS)

Idea: sample to increase diversity

Goal: maximize the expected change in the property (the Relational Profile distribution)

D(v, Gs) = estimated change of the RP given sampling v on the current graph Gs = E[ΔP(Gs, Gs+v) | Gs], where ΔP = RMSE_RP

Exploiting the existing RP, P(type(v)=t | Gs) can be obtained using the observed types of v’s neighbors; each P(type | type) term can be obtained from the existing RP

[Figure: candidate node v next to the sampled graph Gs, with links labeled RP(type | type)]

slide-45
SLIDE 45

Evaluation

  • Datasets: 3 real‐life large‐scale social networks
  • Baselines:

– Random Walk Sampling (RW)
– Degree‐based Sampling (HDS)

  • Evaluation I (Property Preservation): see how well the sampled network approximates two properties of the full network
  • Evaluation II (Prediction): train a prediction model on the sampled network to infer the status of the out‐of‐sample network:

– Node Type Prediction: predict the type of unseen nodes in the network using a sub‐sampled network
– Missing Relations Prediction: recover/predict the missing links
– Features:
    • fdeg = (in/out deg; avg in/out deg of neighbors)
    • ftopo = (Common Neighbors; Jaccard’s Coefficient; etc.)
    • fnt = P(type(v)|Gs)=
    • fRPnode=
    • fRPpath=
slide-46
SLIDE 46

Experiments (Property Preservation)

  • RP (RMSE): type dependency preservation
  • Weighted PageRank (Kendall‐Tau): preserving relative node weights propagated throughout the entire network

[Figure: plots for the Hep, Aca, and Movie datasets comparing RW, HDS, and RPS; x‐axis: # nodes sampled (in 10s)]

slide-47
SLIDE 47

Experiments (Prediction)

  • We show the Academic Network for brevity.

[Figure: two plots, Node Type Prediction and Missing Relation Prediction; x‐axis: number of sampled nodes; y‐axis: accuracy; methods: highDeg, RandWalk, RPS]

slide-48
SLIDE 48

Task‐driven Network Sampling

  • Sampling Community Structure [Maiya’10][Satuluri’11]
  • Sampling Network Backbone for Influence Maximization [Mathioudakis’11]
  • Sampling High Centrality Individuals [Maiya’10]
  • Sampling Personalized PageRank Values [Vattani’11]
  • Sampling Network for Link/Label Prediction [Ahmed’12]

slide-49
SLIDE 49

Short Summary

Sampling strategies and references (covering homogeneous and heterogeneous SNs):

Node and Edge Selection: [Leskovec’06] [Adamic’01] [Ahmed’12] [Ribeiro’10] [Kurant’12]

Sampling by Exploration: [Krishnamurthy’05] [Leskovec’06] [Hubler’08] [Gjoka’10] [Ribeiro’10] [Maiya’11] [Kurant’11] [Gjoka’11] [Li’11] [Kurant’12] [Yang’13]

Task‐driven Sampling: [Maiya’10] [Satuluri’11] [Mathioudakis’11] [Vattani’11] [Ahmed’12]

  • Why sample a social network?
    • The full network (e.g. Facebook) cannot be fully observed
    • Crawling can be costly in terms of resources and time (therefore a smart sampling strategy is needed)

slide-50
SLIDE 50

Detecting Community Structures in Social Networks by Graph Sparsification Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder, Heritage Institute of Technology, Kolkata, India

8 / 10

slide-51
SLIDE 51

Building Community Preserving Sparsified Network Fast Detection of Communities from the Sparsified Network

Detecting Community Structures in Social Networks by Graph Sparsification

Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder

Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, India

September 5, 2016

Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder Detecting Community Structures in Social Networks by Graph Sparsification

slide-52
SLIDE 52

Figure: The tendency of people to live in racially homogeneous neighborhoods [1]. In yellow and orange blocks, the % of Afro-Americans is ≤ 25; in brown and black blocks, it is ≥ 75.

slide-60
SLIDE 60

Definition of a Community

For a given graph G(V, E), find a cover C = {C1, C2, ..., Ck} such that $\bigcup_i C_i = V$.

For disjoint communities, $\forall i \neq j$ we have $C_i \cap C_j = \emptyset$.

For overlapping communities, $\exists i \neq j$ where $C_i \cap C_j \neq \emptyset$.

Figure: Zachary’s Karate Club Network

C = {C1, C2, C3}, with C1 = yellow nodes, C2 = green nodes, C3 = blue nodes, is a disjoint cover.

However, $\bar{C} = \{\bar{C}_1, \bar{C}_2\}$, with $\bar{C}_1$ = yellow & green nodes and $\bar{C}_2$ = blue & green nodes, is an overlapping cover.

For our problem, we concentrate on disjoint community detection.
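The cover definitions translate directly into set operations. The sketch below uses a hypothetical six-node coloring (not the actual Karate Club assignment) to illustrate a disjoint cover and an overlapping cover.

```python
from itertools import combinations


def is_cover(communities, nodes):
    """True if the union of all communities equals the node set V."""
    return set().union(*communities) == set(nodes)


def is_disjoint(communities):
    """True if every pair of distinct communities has an empty intersection."""
    return all(not (a & b) for a, b in combinations(communities, 2))


# Hypothetical node colors for illustration.
yellow, green, blue = {1, 2}, {3, 4}, {5, 6}
V = yellow | green | blue

C = [yellow, green, blue]                 # disjoint cover
C_bar = [yellow | green, blue | green]    # overlapping cover: green nodes are shared

print(is_cover(C, V), is_disjoint(C))          # True True
print(is_cover(C_bar, V), is_disjoint(C_bar))  # True False
```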

slide-63
SLIDE 63

A Little Background: Edge Betweenness Centrality

$$
c_B(e) = \sum_{\substack{s,t \in V \\ s \neq t}} \frac{\sigma(s, t \mid e)}{\sigma(s, t)}
$$

where σ(s, t) is the number of shortest paths from s to t, and σ(s, t | e) is the number of those paths that pass through edge e.

Top 6 edges
Edge        cB(e)     Type
(10, 13)    0.3       inter
(3, 5)      0.23333   inter
(7, 15)     0.2079    inter
(1, 8)      0.1873    inter
(13, 15)    0.1746    intra
(5, 7)      0.1476    intra

Bottom 6 edges
Edge        cB(e)     Type
(8, 11)     0.022     intra
(1, 2)      0.0269    intra
(9, 11)     0.031     intra
(8, 9)      0.0412    intra
(12, 15)    0.052     intra
(3, 4)      0.060     intra

slide-70
SLIDE 70

The Girvan-Newman Algorithm

Proposed by Michelle Girvan and Mark Newman [2] in 2002

The Key Ideas

  • Based on reachability of nodes: shortest paths
  • Edges are selected on the basis of the edge betweenness centrality

The algorithm

1. Compute the centrality of all edges
2. Remove the edge with the largest centrality; ties can be broken randomly
3. Recalculate the centralities on the running graph
4. Iterate from step 2; stop when you get clusters of desirable quality
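The four steps can be sketched for small graphs as below. This is an illustrative, brute-force sketch rather than an efficient implementation: edge betweenness is summed over unordered node pairs (half of the ordered-pair sum on an undirected graph), ties are broken deterministically instead of randomly, and a target number of communities stands in for "clusters of desirable quality". All names and the toy graph are assumptions.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, s):
    """Distances and shortest-path counts from s."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def edge_betweenness(adj):
    """c_B(e) for every undirected edge, summed over unordered pairs {s, t}."""
    info = {s: bfs_counts(adj, s) for s in adj}
    edges = {(u, v) for u in adj for v in adj[u] if u < v}
    score = dict.fromkeys(edges, 0.0)
    for s, t in combinations(sorted(adj), 2):
        (dist_s, sigma_s), (dist_t, sigma_t) = info[s], info[t]
        if t not in dist_s:
            continue  # s and t are in different components
        for u, v in edges:
            for a, b in ((u, v), (v, u)):  # the edge may be crossed in either direction
                if a in dist_s and b in dist_t and dist_s[a] + 1 + dist_t[b] == dist_s[t]:
                    score[(u, v)] += sigma_s[a] * sigma_t[b] / sigma_s[t]
    return score


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in sorted(adj):
        if s not in seen:
            comp = set(bfs_counts(adj, s)[0])
            seen |= comp
            comps.append(comp)
    return comps


def girvan_newman(adj, n_communities):
    adj = {u: set(nbrs) for u, nbrs in adj.items()}  # work on a copy
    while len(components(adj)) < n_communities and any(adj.values()):
        score = edge_betweenness(adj)                 # steps 1 and 3
        u, v = max(sorted(score), key=score.get)      # step 2 (deterministic tie-break)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)                            # step 4: stop at the target count


# Hypothetical toy graph: two triangles joined by the bridge (3, 4).
ADJ = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(girvan_newman(ADJ, 2))  # [{1, 2, 3}, {4, 5, 6}]
```

The bridge carries all nine shortest paths between the two triangles, so it has the highest betweenness and is removed first.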

slide-71
SLIDE 71

[Figure: successive edge removals by the Girvan-Newman algorithm, panels (a) to (f)]

(a) Best edge: (10, 13)
(b) Best edge: (3, 5)
(c) Best edge: (7, 15)
(d) Best edge: (1, 8)
(e) Best edge: (2, 11)
(f) Final graph

slide-76
SLIDE 76

Louvain Method: A Greedy Approach

◮ Proposed by Blondel et al. [3] in 2008
◮ Greedily maximizes modularity
◮ Very fast in practice and, as of this talk, the state of the art in disjoint community detection
◮ Builds a hierarchy of partitions, stopping when modularity cannot be improved any further
◮ Contracts the graph in each iteration, thereby speeding up the process
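To make "greedy modularity maximization" concrete, here is a small sketch of the first (local moving) phase only. It is an illustration under stated assumptions, not the implementation of Blondel et al.: the names `modularity` and `local_moving` are hypothetical, the graph is unweighted, modularity is recomputed from scratch for every candidate move (real implementations use an incremental gain formula), and the graph contraction phase is omitted.

```python
from collections import defaultdict

def modularity(adj, community):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2   # number of edges
    internal = defaultdict(int)   # twice the number of edges inside each community
    degree = defaultdict(int)     # total degree of each community
    for u, nbrs in adj.items():
        degree[community[u]] += len(nbrs)
        for v in nbrs:
            if community[u] == community[v]:
                internal[community[u]] += 1
    return sum(internal[c] / (2 * m) - (degree[c] / (2 * m)) ** 2 for c in degree)

def local_moving(adj):
    """Move nodes one at a time to the neighboring community that most increases Q."""
    community = {u: u for u in adj}          # start with every node on its own
    improved = True
    while improved:
        improved = False
        for u in sorted(adj):
            best_c, best_q = community[u], modularity(adj, community)
            for c in sorted({community[v] for v in adj[u]}):
                old = community[u]
                community[u] = c             # tentatively move u
                q = modularity(adj, community)
                community[u] = old           # undo the tentative move
                if q > best_q + 1e-12:       # accept only strict improvements
                    best_c, best_q = c, q
            if best_c != community[u]:
                community[u] = best_c
                improved = True
    return community

# Two triangles joined by the bridge (2, 3).
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
labels = local_moving(graph)
print(labels, modularity(graph, labels))
```

Because every accepted move strictly increases modularity, the loop terminates once no single-node move helps, which corresponds to the stopping condition on the slide.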

slide-77
SLIDE 77

Louvain Method in Action

slide-78
SLIDE 78

Outline for Part I

1. Building Community Preserving Sparsified Network
   ◮ Assigning Meaningful Weights to Edges
   ◮ Sparsification using t-spanner
2. Fast Detection of Communities from the Sparsified Network
   ◮ Methodology and Visualizations
   ◮ Experimental Results

slide-83
SLIDE 83

Our Method

Input: An unweighted network G(V, E)
Output: A disjoint cover C

1. Use the Jaccard coefficient to turn G into a weighted network G(V, E, W)
2. Construct a t-spanner G_S of G(V, E, W). Take the complement of G_S and call it G_comm
3. Use LINCOM to break G_comm into small but pure fragments
4. Use the second phase of the Louvain method to piece the small fragments together to get C

slide-85
SLIDE 85

Jaccard Intro

Definition

$$w_J(e(v_i, v_j)) = \frac{|\Gamma(v_i) \cap \Gamma(v_j)|}{|\Gamma(v_i) \cup \Gamma(v_j)|}$$

where $\Gamma(v_i)$ is the neighborhood of the node $v_i$. Therefore $w_J \in [0, 1]$.

◮ Jaccard works well in domains where local influence is important [4][5][6]
◮ The computation takes O(m) time
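The definition above translates directly into set operations. The sketch below is illustrative only: the function name `jaccard_weights` is hypothetical, and it assumes Γ(v) is the open neighborhood (v itself is not included), which the slide does not specify.

```python
def jaccard_weights(adj):
    """Return {(u, v): w_J} for every undirected edge, with u < v.

    adj maps each node to the set of its neighbors (open neighborhood, an assumption).
    """
    weights = {}
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                common = len(adj[u] & adj[v])
                union = len(adj[u] | adj[v])
                weights[(u, v)] = common / union
    return weights

# A triangle (0, 1, 2) with a pendant node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
for edge, w in sorted(jaccard_weights(graph).items()):
    print(edge, round(w, 3))
```

Edges inside the triangle share a common neighbor and get a positive weight, while the pendant edge (2, 3) gets weight 0, which matches the intuition that high Jaccard weight marks intra-community edges.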

slide-86
SLIDE 86

Jaccard Example

slide-90
SLIDE 90

Table: Jaccard weight statistics for top 10% edges in terms of w_J.

                                       Top 10% edges in terms of w_J
Network    |E|         Intra-cluster   Total edges   Intra-edge   Fraction
                       edge count
Karate     78          21              7             7            1
Dolphin    159         39              15            15           1
Football   613         179             61            61           1
Les-Mis    254         56              25            25           1
Enron      180,811     48,498          18,383        18,220       0.99113
Epinions   405,739     146,417         40,573        36,589       0.90180
Amazon     925,872     54,403          92,587        92,584       0.99996
DBLP       1,049,866   164,268         104,986       104,986      1

slide-92
SLIDE 92

Spanner

An (α, β)-spanner of a graph G = (V, E, W) is a subgraph G_S = (V, E_S, W_S) such that

$$\delta_S(u, v) \le \alpha \cdot \delta(u, v) + \beta \quad \forall\, u, v \in V$$

A t-spanner is the special case of an (α, β)-spanner where α = t and β = 0.

Authors                        Size               Running Time
Althöfer et al. [1993] [7]     O(n^(1+1/k))       O(m(n^(1+1/k) + n log n))
Althöfer et al. [1993] [7]     (1/2) n^(1+1/k)    O(m n^(1+1/k))
Roditty et al. [2004] [8]      (1/2) n^(1+1/k)    O(k n^(2+1/k))
Roditty et al. [2005] [9]      O(k n^(1+1/k))     O(km) (det.)
Baswana and Sen [2007] [10]    O(k n^(1+1/k))     O(km) (rand.)
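The spanner inequality can be checked mechanically with shortest-path distances. The following is a rough sketch rather than anything from the cited papers: `dijkstra` and `is_spanner` are hypothetical helper names, graphs are assumed to be dictionaries of the form `{u: {v: weight}}`, and weights are treated as non-negative edge lengths.

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path distances from source in a graph given as {u: {v: weight}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def is_spanner(graph, sub, alpha, beta=0.0):
    """True if delta_S(u, v) <= alpha * delta(u, v) + beta for all connected pairs."""
    for u in graph:
        d_g = dijkstra(graph, u)   # distances in the original graph
        d_s = dijkstra(sub, u)     # distances in the candidate spanner
        for v, d in d_g.items():
            if d_s.get(v, float("inf")) > alpha * d + beta + 1e-12:
                return False
    return True

# Unit-weight triangle; the candidate spanner drops the edge (0, 2).
g = {0: {1: 1.0, 2: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {0: 1.0, 1: 1.0}}
s = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}}
print(is_spanner(g, s, alpha=2))        # t-spanner check with t = 2
print(is_spanner(g, s, alpha=1, beta=1))  # additive slack instead of a multiplicative one
```

Running all-pairs Dijkstra is only practical for small graphs; it is meant to illustrate the definition, not the construction algorithms in the table.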

slide-95
SLIDE 95

Figure: Original network. n = 11, m = 18, δ(1, 5) = 5
Figure: A 3-spanner of the network. n = 11, m = 11, δ_S(1, 5) = 12

Since δ_S(1, 5) = 12 < t · δ(1, 5) = 3 · 5 = 15, the edge (1, 5) is discarded. The other edges are discarded in a similar fashion.
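The discard rule in this example matches the classic greedy spanner construction: scan edges in order of increasing weight and keep an edge only if the spanner built so far does not already connect its endpoints within t times its weight. The sketch below is an assumption about how that rule could be coded, not necessarily the variant used in the talk; `shortest_distance` and `greedy_t_spanner` are hypothetical names, and weights are treated as edge lengths.

```python
import heapq

def shortest_distance(adj, source, target):
    """Dijkstra distance from source to target in {u: {v: weight}}; inf if unreachable."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def greedy_t_spanner(edges, t):
    """Greedy t-spanner of a weighted undirected graph given as (u, v, weight) triples."""
    spanner = {}   # adjacency of the spanner built so far
    kept = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        # Keep the edge only if the current spanner path is longer than t * w;
        # otherwise the edge is redundant for the stretch bound and is discarded.
        if shortest_distance(spanner, u, v) > t * w:
            spanner.setdefault(u, {})[v] = w
            spanner.setdefault(v, {})[u] = w
            kept.append((u, v, w))
    return kept

edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.5)]
print(greedy_t_spanner(edges, t=1))  # all three edges are needed for stretch 1
print(greedy_t_spanner(edges, t=2))  # edge (0, 2) is discarded: 2.0 <= 2 * 1.5
```

Larger values of t discard more edges, which is the trend shown in the spanner figures for the Dolphin network.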

slide-96
SLIDE 96

Figure: Dolphin network. n = 62, m = 159

slide-97
SLIDE 97

Figure: 3-spanner. n = 62, m = 150

slide-98
SLIDE 98

Figure: 5-spanner. n = 62, m = 148

slide-99
SLIDE 99

Figure: 7-spanner. n = 62, m = 144

slide-100
SLIDE 100

Figure: 9-spanner. n = 62, m = 138

slide-102
SLIDE 102

Name       n     Spanner    #intra-community   #inter-community
Karate     34    Original   59                 19
                 3          57                 19
                 5          53                 19
                 7          51                 18
                 9          48                 19
Dolphin    59    Original   120                39
                 3          117                38
                 5          102                38
                 7          100                38
                 9          90                 38
Football   115   Original   447                163
                 3          385                166
                 5          376                166
                 7          293                166
                 9          286                165

slide-103
SLIDE 103

Figure: Original US Football network
Figure: Sparsified network G_comm
Figure: Final network with communities marked as separate components

slide-104
SLIDE 104

Today: In class work

◮ Implement node and edge sampling methods
◮ Compare their efficacy on various networks

9 / 10
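As a possible starting point for the in-class work, here is a minimal sketch of uniform random node sampling and uniform random edge sampling. It is not the blank code from the course website: the function names, the `fraction` and `seed` parameters, and the edge-list input format are all assumptions made for illustration.

```python
import random

def random_node_sample(edges, fraction, seed=None):
    """Pick a fraction of the nodes uniformly at random and return the induced subgraph."""
    rng = random.Random(seed)
    nodes = sorted({x for edge in edges for x in edge})
    k = max(1, round(fraction * len(nodes)))
    kept = set(rng.sample(nodes, k))
    induced = [(u, v) for u, v in edges if u in kept and v in kept]
    return induced, kept

def random_edge_sample(edges, fraction, seed=None):
    """Pick a fraction of the edges uniformly at random; nodes are the sampled endpoints."""
    rng = random.Random(seed)
    k = max(1, round(fraction * len(edges)))
    sampled = rng.sample(list(edges), k)
    nodes = {x for edge in sampled for x in edge}
    return sampled, nodes

# A ring of 10 nodes as a toy network.
ring = [(i, (i + 1) % 10) for i in range(10)]
sub_edges, sub_nodes = random_node_sample(ring, 0.5, seed=1)
print(len(sub_nodes), len(sub_edges))
samp_edges, samp_nodes = random_edge_sample(ring, 0.3, seed=1)
print(len(samp_edges), len(samp_nodes))
```

One way to compare the two methods is to measure how well properties such as the degree distribution or clustering of the sample match those of the full network.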

slide-105
SLIDE 105

Graph Sparsification and Sampling

Blank code and data available on website (Lecture 19)
www.cs.rpi.edu/~slotag/classes/FA16/index.html

10 / 10