 
Sampling by Exploration (cont.)
• Ego-Centric Exploration (ECE) Sampling
  – Similar to a random walk, but each neighbor is selected with probability p
  – Multiple ECE (starting from multiple seeds)
• Depth-First / Breadth-First Search [Krishnamurthy’05]
  – Keep visiting neighbors of the most recently (depth-first) / earliest (breadth-first) visited nodes
• Sample Edge Count [Maiya’11]
  – Move to the neighbor with the highest degree, and keep going
• Expansion Sampling [Maiya’11]
  – Construct a sample with the maximal expansion. Select the neighbor v based on
    $\arg\max_{v \in N(S)} \left| N(\{v\}) - (N(S) \cup S) \right|$
  – S: the set of sampled nodes, N(S): the first-hop neighbor set of S
Lin et al., Sampling and Summarization for Social Networks, PAKDD 2013 tutorial, 13/05/02
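Example: Expansion Sampling
[Figure: example graph on nodes A to H, with current sample S = {A}]
• |N({A})| = 4
• Candidate E: |N({E}) − (N({A}) ∪ {A})| = |{F, G, H}| = 3
• Candidate D: |N({D}) − (N({A}) ∪ {A})| = |{F}| = 1
• E contributes more new neighbors than D, so E is preferred over D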
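The selection rule in this example can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the edge list below is a hypothetical graph chosen only so that it reproduces the counts on the slide (|N({A})| = 4, gain 3 for E, gain 1 for D), and ties are broken by node name, which the slides do not specify.

```python
def neighbors_of_set(adj, nodes):
    """N(S): nodes adjacent to some member of S, excluding S itself."""
    out = set()
    for u in nodes:
        out |= adj[u]
    return out - set(nodes)

def expansion_sample(adj, seed, size):
    """Greedy expansion sampling: grow S from `seed` up to `size` nodes."""
    sample = [seed]
    while len(sample) < size:
        s = set(sample)
        frontier = neighbors_of_set(adj, s)
        if not frontier:
            break
        covered = frontier | s
        # Pick v in N(S) maximizing |N({v}) - (N(S) ∪ S)|; ties go to the smallest name.
        best = max(sorted(frontier), key=lambda v: len(adj[v] - covered))
        sample.append(best)
    return sample

# Hypothetical edge list (an assumption), built to match the slide's numbers.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
         ("E", "F"), ("E", "G"), ("E", "H"), ("D", "F")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(expansion_sample(adj, "A", 2))
```

Starting from A, the first node added is E, because it exposes three nodes that are neither sampled nor already adjacent to the sample.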
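Drawback of Random Walk: Degree Bias!
[Figure: q_k, the sampled node degree distribution, versus p_k, the real node degree distribution]
• Real average node degree ~ 94, sampled average node degree ~ 338
• Solution: modify the transition probability:
  $$P_{v,w} = \begin{cases} \frac{1}{k_v} \cdot \min\left(1, \frac{k_v}{k_w}\right) & \text{if } w \text{ is a neighbor of } v \\ 1 - \sum_{y \neq v} P_{v,y} & \text{if } w = v \\ 0 & \text{otherwise} \end{cases}$$
  where k_v denotes the degree of node v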
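The modified transition probability can be checked numerically. The sketch below assumes an undirected simple graph with no isolated nodes, stored as a dictionary of neighbor sets; the four-node graph is a made-up example. Because each off-diagonal entry equals 1/max(k_v, k_w), the matrix is symmetric, which is what makes the stationary distribution uniform over nodes and removes the degree bias.

```python
def mhrw_transition(adj):
    """Transition probabilities P[v][w] of the degree-corrected random walk."""
    P = {}
    for v, nbrs in adj.items():
        k_v = len(nbrs)  # assumes k_v > 0 (no isolated nodes)
        row = {w: (1.0 / k_v) * min(1.0, k_v / len(adj[w])) for w in nbrs}
        row[v] = 1.0 - sum(row.values())  # self-loop absorbs the rejected moves
        P[v] = row
    return P

# Hypothetical toy graph (an assumption, not from the slides).
adj = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a", "d"}, "d": {"a", "c"}}
P = mhrw_transition(adj)

for v, row in P.items():
    print(v, {w: round(p, 3) for w, p in sorted(row.items())})
```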
Metropolis Graph Sampling [Hubler’08]
• Step 1: Initially pick one subgraph sample S with n’ nodes at random
• Step 2: Iterate the following steps until convergence
  2.1: Remove one node from S
  2.2: Randomly add a new node to S → S’
  2.3: Compute the likelihood ratio $\alpha = \rho^*(S') / \rho^*(S)$
       if α ≥ 1: accept the transition: S := S’
       if α < 1: accept the transition: S := S’ with probability α
                 reject the transition: S := S with probability 1 − α
  – ρ*(S) measures the similarity of a certain property between the sample S and the original network G
• The solution can be derived approximately using Simulated Annealing
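A compact sketch of the accept/reject loop follows. Several pieces are assumptions made only for illustration: the preserved property is the average degree of the induced subgraph, the density is taken as ρ*(S) = exp(−p · distance(S)) with a hypothetical sharpness parameter `p`, a fixed number of steps stands in for "until convergence", and the best sample seen so far is returned.

```python
import math
import random

def induced_avg_degree(adj, nodes):
    """Average degree of the subgraph of `adj` induced by `nodes`."""
    nodes = set(nodes)
    return sum(len(adj[u] & nodes) for u in nodes) / len(nodes)

def metropolis_sample(adj, n_sample, steps=500, p=10.0, seed=0):
    """Metropolis subgraph sampling; requires 0 < n_sample < len(adj)."""
    rng = random.Random(seed)
    all_nodes = sorted(adj)
    target = induced_avg_degree(adj, all_nodes)  # property of the full graph G

    def distance(nodes):
        return abs(induced_avg_degree(adj, nodes) - target)

    current = set(rng.sample(all_nodes, n_sample))          # Step 1
    best = set(current)
    for _ in range(steps):                                  # Step 2
        removed = rng.choice(sorted(current))               # 2.1
        added = rng.choice([v for v in all_nodes if v not in current])  # 2.2
        proposal = (current - {removed}) | {added}
        delta = distance(proposal) - distance(current)
        # 2.3: alpha = rho*(S') / rho*(S) = exp(-p * delta); alpha >= 1 iff delta <= 0
        if delta <= 0 or rng.random() < math.exp(-p * delta):
            current = proposal
            if distance(current) < distance(best):
                best = set(current)
    return best

# Hypothetical toy graph: a 10-node ring with two chords.
ring = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
adj = {i: set() for i in range(10)}
for u, v in ring:
    adj[u].add(v)
    adj[v].add(u)

sample = metropolis_sample(adj, n_sample=4)
print(sorted(sample))
```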
Sampling for Heterogeneous Social Networks
Sampling on Heterogeneous Social Networks
• Heterogeneous Social Networks (HSN)
  – A graph G = <V, E> has n nodes (v_1, v_2, …, v_n), m directed edges (e_1, …, e_m) and k different types
  – Each node/edge belongs to a type
    • Given a finite set L = {L_1, ..., L_k} denoting the k types
• Sampling methods for HSN
  – Multi-graph sampling [Gjoka’10]
  – Type-distribution preserving sampling [Li’11]
  – Relational-profile preserving sampling [Yang’13]
Multigraph Sampling
• Random walk sampling on the union multigraph of several relation graphs, so that the walk does not get stuck when an individual graph is disconnected.
Node Type Distribution Preserving Sampling
• Given a graph G and a sampled subgraph G_S
• The node type distribution of G_S is expected to be the same as that of G, i.e., d(Dist(G_S), Dist(G)) = 0
  – d() denotes the difference between two distributions
[Figure: sampled network and original network with matching node type ratios, (9:6) = (3:2)]
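As a small illustration of the criterion d(Dist(G_S), Dist(G)) = 0, the sketch below computes node type distributions and compares them. The slides leave d() unspecified, so total variation distance is used here purely as an assumption; the type names "user" and "group" are hypothetical and only reproduce the 9:6 versus 3:2 ratio from the figure.

```python
from collections import Counter

def type_distribution(node_types):
    """Map each type to its share of the nodes; `node_types` is {node: type}."""
    counts = Counter(node_types.values())
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def distribution_distance(p, q):
    """Total variation distance, one possible choice for d()."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Original network with a 9:6 type ratio, sample with a 3:2 ratio.
original = {f"u{i}": "user" for i in range(9)}
original.update({f"g{i}": "group" for i in range(6)})
sample = {"u0": "user", "u1": "user", "u2": "user", "g0": "group", "g1": "group"}

d = distribution_distance(type_distribution(sample), type_distribution(original))
print(d)
```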
Connection-type Preserving Sampling
• Heterogeneous Connection
  – For an edge E[v_i, v_j]
  – Intra-connection edge: Type(v_i) = Type(v_j)
  – Inter-connection edge: Type(v_i) != Type(v_j)
• Intra-Relationship preserving
  – The ratio of intra-connection edges should be preserved, that is: d(IR(G_S), IR(G)) = 0
  – If the intra-relationship is preserved, the inter-relationship is also preserved, since every edge is either an intra- or an inter-connection edge and the two ratios sum to 1
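The intra-connection ratio IR can be computed directly from an edge list and a type map. The snippet is a minimal sketch with a made-up four-node network; the inter-connection ratio is simply 1 − IR, which is why preserving one preserves the other.

```python
def intra_ratio(edges, node_type):
    """IR(G): fraction of edges whose two endpoints have the same type."""
    edges = list(edges)
    intra = sum(1 for u, v in edges if node_type[u] == node_type[v])
    return intra / len(edges)

# Hypothetical example: two types, four nodes.
node_type = {1: "a", 2: "a", 3: "b", 4: "b"}
full_edges = [(1, 2), (3, 4), (1, 3), (2, 4)]
sample_edges = [(1, 2), (1, 3)]

ir_full = intra_ratio(full_edges, node_type)
ir_sample = intra_ratio(sample_edges, node_type)
print(ir_full, ir_sample, 1 - ir_full)
```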
Respondent-driven Sampling
• First proposed in social science [Heck’99] to reach hidden populations in surveys.
• Two main phases: snowball sampling → finding the steady state of the recruitment matrix
[Figure: respondents in G recruit further respondents with a limited number of coupons c each; the recruitments form a transition matrix with entries S_11 … S_33, whose N-step transition converges to the steady-state vector (P_1, P_2, P_3)]
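The second phase, finding the steady state of the recruitment matrix, can be illustrated with plain power iteration. This is a sketch under assumptions: the matrix is taken to be row-stochastic (row i gives the probabilities that a recruiter of type i recruits each type), and the 2×2 numbers are invented for the example.

```python
def steady_state(T, tol=1e-12, max_iter=10_000):
    """Steady-state vector of a row-stochastic matrix T by repeated transition."""
    n = len(T)
    p = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(p, nxt)) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical recruitment matrix between two respondent types.
S = [[0.9, 0.1],
     [0.5, 0.5]]
pi = steady_state(S)
print([round(x, 4) for x in pi])
```

For this matrix the balance condition 0.1 · π_1 = 0.5 · π_2 gives π = (5/6, 1/6), which the iteration approaches.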
Comparing Different Sampling Algorithms
[Figure: two plots, similarity of node type distribution and similarity of intra-link distribution]
• Respondent-driven sampling does a good job when few nodes are sampled, but its quality saturates at a mediocre level afterwards
• Random node sampling performs poorly in the beginning, but reaches the best results once a sufficient number of nodes has been sampled.
Relational Profile Preserving Sampling
• Node-type/intra-type preservation considers the semantics of nodes, but not the structure of networks
• Propose the Relational Profile to consider semantics and structure together
  – Captures the dependency between each Node Type (NT) and Edge Type (ET) of a directed heterogeneous network
  – Consists of 4 relational matrices
    • Conditional probabilities P(T_j | T_i) (e.g. P(ET=cites | NT=paper))
    • Node to node, node to edge, edge to node, edge to edge
[Figure: a 2×2 grid of transition matrices indexed by NT and ET, with example labels author, paper, cites, authored, journal_of]
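To make the conditional probabilities concrete, the sketch below computes two of the four relational matrices (node to node and node to edge) from a typed, directed edge list. It is one plausible reading of the definition rather than the authors' exact procedure: P(T_j | T_i) is estimated as a row-normalized count over edges, conditioning on the type of the source node. The edge type `written_by` and all node names are hypothetical.

```python
from collections import Counter, defaultdict

def conditional(pairs):
    """Row-normalize co-occurrence counts of (a, b) pairs into P(b | a)."""
    counts = defaultdict(Counter)
    for a, b in pairs:
        counts[a][b] += 1
    return {a: {b: c / sum(row.values()) for b, c in row.items()}
            for a, row in counts.items()}

def relational_profile(edges, node_type):
    """edges: (source, edge_type, target) triples; returns two of the four matrices."""
    edges = list(edges)
    nt_nt = conditional((node_type[u], node_type[v]) for u, _, v in edges)
    nt_et = conditional((node_type[u], et) for u, et, _ in edges)
    return {"NT->NT": nt_nt, "NT->ET": nt_et}

# Hypothetical toy network.
node_type = {"p1": "paper", "p2": "paper", "a1": "author"}
edges = [("p1", "cites", "p2"), ("p2", "cites", "p1"),
         ("p1", "written_by", "a1"), ("a1", "authored", "p1")]

rp = relational_profile(edges, node_type)
print(rp["NT->ET"]["paper"])
```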
Example of Relational Profile (RP)
[Figure: two example relational profiles, each a matrix whose rows and columns are the node types P, A, C, J and the edge types c, p, a. Non-empty cells, listed row by row:]
Profile 1
  P: 0.44 0.22 0.22 0.11 0.44 0.33 0.22
  A: 1 1
  C: 1 1
  J: 1 1
  c: 1 0.22 0.44 0.33
  p: 0.5 0.33 0.17 0.66 0.33
  a: 0.5 0.5 0.6 0.4
Profile 2
  P: 0.182 0.364 0.091 0.273 0.182 0.364 0.364
  A: 1 1
  C: 1 1
  J: 1 1
  c: 1 0.5 0.5
  p: 0.5 0.125 0.375 0.17 0.5 0.33
  a: 0.5 0.5 0.22 0.33 0.44
Challenge: How to approximate RP when the true RP is unknown
• We propose Exploration by Expectation Sampling
• Aim to preserve the unknown relational profile while adding new sample nodes
  1. Randomly choose a starting node and its corresponding edges
  2. Based on the current RP, select the next node from all first-degree neighbors
  3. Add the new node and all its edges
  4. Update the RP of the sub-sampled graph
  5. Repeat steps 2, 3 and 4 until the RP converges
• Which node should be selected?
  – Select the node whose inclusion can potentially lead to the largest change to the existing RP
    • Use the partially observed RP to generate the ‘expected amount of change’ of each node as its score
    • Weighted sampling based on the score
Relational Profile Sampling (RPS)
Idea: sample to increase the diversity
• D(v, G_s) = estimated change of RP given sampling v on the current graph G_s
  = E[ΔP(G_s, G_s + v) | G_s], where ΔP = RMSE between the two RPs
• Exploiting the existing RP, P(type(v) = t | G_s) can be obtained using the observed types of v’s neighbors
• P(type | type) can be obtained from the existing RP
Goal: maximize the expected change of the property (the Relational Profile distribution)
[Figure: candidate node v attached to the current sample G_s, with RP(type | type) entries on the connecting edges]
Evaluation
• Datasets: 3 real-life large-scale social networks
• Baselines:
  – Random Walk Sampling (RW)
  – Degree-based Sampling (HDS)
• Evaluation I (Property Preservation): see how well the sampled network approximates two properties of the full network
• Evaluation II (Prediction): train a prediction model on the sampled network to infer the status of the out-of-sample network:
  – Node Type Prediction: predict the type of unseen nodes in the network using a sub-sampled network
  – Missing Relations Prediction: recover/predict the missing links
  – Features:
    • f_deg = (in/out degree; average in/out degree of neighbors)
    • f_topo = (Common Neighbors; Jaccard’s Coefficient; etc.)
    • f_nt = P(type(v) | G_s)
    • f_RPnode
    • f_RPpath
Experiments (Property Preservation)
• RP (RMSE): type dependency preservation
• Weighted PageRank (Kendall-Tau): preserving relative node weights, propagated throughout the entire network
[Figure: curves for RW, HDS and RPS on the Hep, Aca and Movie datasets; x-axis: # nodes sampled (in 10s)]
Experiments (Prediction)
• We show the Academic Network for brevity.
[Figure: two panels, Node Type Prediction and Missing Relation Prediction; y-axis: accuracy; x-axis: number of sampled nodes; curves: highDeg, RandWalk, RPS]
Task-driven Network Sampling
• Sampling Community Structure [Maiya’10][Satuluri’11]
• Sampling Network Backbone for Influence Maximization [Mathioudakis’11]
• Sampling High Centrality Individuals [Maiya’10]
• Sampling Personalized PageRank Values [Vattani’11]
• Sampling Network for Link/Label Prediction [Ahmed’12]
Short Summary
• Why sample a social network?
  – The full network (e.g. Facebook) cannot be fully observed
  – Crawling can be costly in terms of resource and time consumption (therefore a smart sampling strategy is needed)
• Methods covered, across homogeneous and heterogeneous social networks:
  – Node and Edge Selection: [Leskovec’06] [Adamic’01] [Krishnamurthy’05] [Kurant’12] [Ahmed’12] [Ribeiro’10]
  – Sampling by Exploration: [Leskovec’06] [Hubler’08] [Gjoka’10] [Ribeiro’10] [Maiya’11] [Kurant’11] [Gjoka’11] [Li’11] [Kurant’12] [Yang’13]
  – Task-driven Sampling: [Maiya’10] [Satuluri’11] [Mathioudakis’11] [Vattani’11] [Ahmed’12]
Detecting Community Structures in Social Networks by Graph Sparsification
Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder, Heritage Institute of Technology, Kolkata, India
Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, India
September 5, 2016
Figure: The tendency of people to live in racially homogeneous neighborhoods [1]. In yellow and orange blocks the share of Afro-Americans is ≤ 25%; in brown and black blocks it is ≥ 75%.
Definition of a Community
For a given graph G(V, E), find a cover C = {C_1, C_2, ..., C_k} such that $\bigcup_i C_i = V$.
For disjoint communities, $\forall i \neq j$ we have $C_i \cap C_j = \emptyset$
For overlapping communities, $\exists i \neq j$ where $C_i \cap C_j \neq \emptyset$
Figure: Zachary’s Karate Club Network
C = {C_1, C_2, C_3}, with C_1 = yellow nodes, C_2 = green, C_3 = blue, is a disjoint cover
However, $\bar{C} = \{\bar{C}_1, \bar{C}_2\}$, with $\bar{C}_1$ = yellow & green nodes and $\bar{C}_2$ = blue & green nodes, is an overlapping cover
For our problem, we concentrate on disjoint community detection
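The two conditions in the definition translate directly into set operations. The sketch below uses a made-up six-node example in place of the karate club coloring: `C` plays the role of the yellow/green/blue disjoint cover and `C_bar` the role of the overlapping one.

```python
from itertools import combinations

def is_cover(communities, nodes):
    """True if the union of all communities equals the node set V."""
    return set().union(*communities) == set(nodes)

def is_disjoint(communities):
    """True if every pair of distinct communities has an empty intersection."""
    return all(not (a & b) for a, b in combinations(communities, 2))

# Hypothetical stand-in for the colored karate club example.
V = {1, 2, 3, 4, 5, 6}
C = [{1, 2}, {3, 4}, {5, 6}]            # yellow, green, blue
C_bar = [{1, 2, 3, 4}, {3, 4, 5, 6}]    # yellow & green, blue & green

print(is_cover(C, V), is_disjoint(C))
print(is_cover(C_bar, V), is_disjoint(C_bar))
```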
A Little Background: Edge Betweenness Centrality
$$c_B(e) = \sum_{s,t \in V,\ s \neq t} \frac{\sigma(s, t \mid e)}{\sigma(s, t)}$$
where σ(s, t) is the number of shortest paths between s and t, and σ(s, t | e) is the number of those paths that pass through the edge e

Top 6 edges
Edge        c_B(e)     Type
(10, 13)    0.3        inter
(3, 5)      0.23333    inter
(7, 15)     0.2079     inter
(1, 8)      0.1873     inter
(13, 15)    0.1746     intra
(5, 7)      0.1476     intra

Bottom 6 edges
Edge        c_B(e)     Type
(8, 11)     0.022      intra
(1, 2)      0.0269     intra
(9, 11)     0.031      intra
(8, 9)      0.0412     intra
(12, 15)    0.052      intra
(3, 4)      0.060      intra
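For an unweighted, undirected graph the sum above can be evaluated with one breadth-first search per source node followed by a backward accumulation (Brandes-style). The sketch below is an illustration on a hypothetical graph of two triangles joined by a bridge, not the network from the slides. The `normalized` flag divides by the number of node pairs n(n−1)/2; whether the slide's values use that normalization is an assumption.

```python
from collections import deque

def edge_betweenness(adj, normalized=False):
    """Edge betweenness for an undirected, unweighted graph {node: set(neighbors)}."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma = {s: 0}, {v: 0 for v in adj}
        sigma[s] = 1
        preds, order, queue = {v: [] for v in adj}, [], deque([s])
        while queue:                       # BFS: count shortest paths from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):          # push path fractions back toward s
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    n = len(adj)
    scale = 2.0                            # each pair {s, t} was counted from both ends
    if normalized and n > 1:
        scale *= n * (n - 1) / 2
    return {e: b / scale for e, b in bet.items()}

# Hypothetical graph: triangles {1,2,3} and {4,5,6} joined by the bridge (3, 4).
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
bet = edge_betweenness(adj)
print(max(bet, key=bet.get), max(bet.values()))
```

The bridge carries all 3 × 3 shortest paths between the two triangles, so it gets the highest score, matching the slide's observation that inter-community edges dominate the top of the ranking.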
The Girvan-Newman Algorithm
Proposed by Michelle Girvan and Mark Newman [2] in 2002
The Key Ideas
Based on reachability of nodes: shortest paths
Edges are selected on the basis of the edge betweenness centrality
The algorithm
1 Compute centrality for all edges
2 Remove the edge with the largest centrality; ties can be broken randomly
3 Recalculate the centralities on the running graph
4 Iterate from step 2; stop when you get clusters of desirable quality
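The four steps can be sketched as follows. Two simplifications are assumptions of this example: the stopping rule "clusters of desirable quality" is replaced by a target number of components `k`, and ties are broken by taking the first maximum instead of a random one. The helper recomputes edge betweenness from scratch after each removal, as step 3 requires.

```python
from collections import deque

def edge_betweenness(adj):
    """Unnormalized edge betweenness, undirected unweighted graph."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma = {s: 0}, {v: 0 for v in adj}
        sigma[s] = 1
        preds, order, queue = {v: [] for v in adj}, [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    return {e: b / 2.0 for e, b in bet.items()}

def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman(adj, k):
    """Remove highest-betweenness edges until at least k components remain."""
    adj = {v: set(nb) for v, nb in adj.items()}   # work on a copy
    while len(components(adj)) < k and any(adj.values()):
        bet = edge_betweenness(adj)               # steps 1 and 3
        u, v = max(bet, key=bet.get)              # step 2
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical graph: two triangles joined by the bridge (3, 4).
two_triangles = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
                 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(girvan_newman(two_triangles, k=2))
```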
[Figure: Girvan-Newman on the example network. (a) Best edge: (10, 13) (b) Best edge: (3, 5) (c) Best edge: (7, 15) (d) Best edge: (1, 8) (e) Best edge: (2, 11) (f) Final graph]
Louvain Method: A Greedy Approach
Proposed by Blondel et al. [3] in 2008
Takes a greedy modularity maximization approach
Very fast in practice; presented here as the state of the art in disjoint community detection
Performs hierarchical partitioning, stopping when there cannot be any further improvement in modularity
Contracts the graph in each iteration, thereby speeding up the process
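The slides refer to modularity without defining it. The sketch below computes the standard Newman modularity of a partition, Q = Σ_c [L_c / m − (d_c / 2m)²], where L_c is the number of edges inside community c, d_c is the total degree of its nodes and m is the number of edges. It only evaluates the objective that Louvain greedily increases; it is not an implementation of the Louvain method. The example graph is hypothetical.

```python
def modularity(adj, communities):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = sum(len(nb) for nb in adj.values()) / 2   # number of edges
    q = 0.0
    for c in communities:
        inside = sum(len(adj[u] & c) for u in c) / 2   # edges within c
        degree = sum(len(adj[u]) for u in c)           # total degree of c
        q += inside / m - (degree / (2 * m)) ** 2
    return q

# Hypothetical graph: two triangles joined by one bridge (7 edges).
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}

print(modularity(adj, [{1, 2, 3}, {4, 5, 6}]))
print(modularity(adj, [{1, 2, 3, 4, 5, 6}]))
```

Splitting the graph into its two triangles scores higher than keeping everything in one community, which is the kind of improvement each greedy move looks for.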
Louvain Method in Action
Outline for Part I
1 Building Community Preserving Sparsified Network
  Assigning Meaningful Weights to Edges
  Sparsification using t-spanner
2 Fast Detection of Communities from the Sparsified Network
  Methodology and Visualizations
  Experimental Results
Our Method
Input: An unweighted network G(V, E)
Output: A disjoint cover C
1 Use the Jaccard coefficient to turn G into a weighted network G(V, E, W)
2 Construct a t-spanner G_S of G(V, E, W). Take the complement of G_S, call it G_comm
3 Use LINCOM to break G_comm into small but pure fragments
4 Use the second phase of the Louvain Method to piece all the small fragments together to get C
Jaccard Intro
Definition
$$w_J(e(v_i, v_j)) = \frac{|\Gamma(v_i) \cap \Gamma(v_j)|}{|\Gamma(v_i) \cup \Gamma(v_j)|}$$
where Γ(v_i) is the neighborhood of the node v_i
∴ w_J ∈ [0, 1]
Jaccard works well in domains where local influence is important [4][5][6]
The computation takes O(m) time
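A short sketch of the weighting step: it assigns each edge the Jaccard coefficient of its endpoints' neighborhoods. The slides do not say whether Γ(v) includes v itself; open neighborhoods (excluding v) are assumed here, so an edge whose endpoints share no neighbor gets weight 0. The toy graph is hypothetical.

```python
def jaccard_weights(adj):
    """Map each undirected edge (as a frozenset) to its Jaccard weight w_J."""
    weights = {}
    for u in adj:
        for v in adj[u]:
            e = frozenset((u, v))
            if e not in weights:
                common = adj[u] & adj[v]
                union = adj[u] | adj[v]   # non-empty: it contains both u and v
                weights[e] = len(common) / len(union)
    return weights

# Hypothetical graph: triangle 1-2-3 with a pendant node 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
w = jaccard_weights(adj)
for e in sorted(w, key=sorted):
    print(sorted(e), round(w[e], 3))
```

Edges inside the triangle receive positive weight, while the pendant edge (3, 4) receives 0, in line with the idea that high Jaccard weight signals an intra-community edge.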
Jaccard Example
Table: Jaccard weight statistics for the top 10% edges in terms of w_J.
                                          Top 10% edges in terms of w_J
Network     |E|          Intra-cluster    Total edges    Intra-edge    Fraction
                         edge count
Karate      78           21               7              7             1
Dolphin     159          39               15             15            1
Football    613          179              61             61            1
Les-Mis     254          56               25             25            1
Enron       180,811      48,498           18,383         18,220        0.99113
Epinions    405,739      146,417          40,573         36,589        0.90180
Amazon      925,872      54,403           92,587         92,584        0.99996
DBLP        1,049,866    164,268          104,986        104,986       1
Spanner
An (α, β)-spanner of a graph G = (V, E, W) is a subgraph G_S = (V, E_S, W_S) such that
$$\delta_S(u, v) \le \alpha \cdot \delta(u, v) + \beta \quad \forall u, v \in V$$
A t-spanner is the special case of an (α, β)-spanner where α = t and β = 0

Authors                        Size                  Running Time
Althöfer et al. [1993] [7]     O(n^{1+1/k})          O(m(n^{1+1/k} + n log n))
Althöfer et al. [1993] [7]     (1/2) n^{1+1/k}       O(m n^{1+1/k})
Roditty et al. [2004] [8]      (1/2) n^{1+1/k}       O(k n^{2+1/k})
Roditty et al. [2005] [9]      O(k n^{1+1/k})        O(km) (det.)
Baswana and Sen [2007] [10]    O(k n^{1+1/k})        O(km) (rand.)
Figure: Original network, n = 11, m = 18
Figure: A 3-spanner of the network, n = 11, m = 11
δ(1, 5) = 5
δ_S(1, 5) = 12
Since δ_S(1, 5) < t · δ(1, 5), that is 12 < 3 · 5 = 15, the edge (1, 5) is discarded
The other edges are discarded in a similar fashion.
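The discard rule in this example matches the classic greedy spanner construction: scan edges by increasing weight and keep an edge only if the spanner built so far cannot already connect its endpoints within t times its weight. The sketch below is a minimal version of that greedy scheme, offered as an illustration; the slides do not state which of the listed spanner algorithms the authors actually use. The four-edge graph is hypothetical and chosen to reproduce δ(1, 5) = 5 and δ_S(1, 5) = 12.

```python
import heapq

def bounded_distance(adj, src, dst, cutoff):
    """Dijkstra distance from src to dst; paths longer than `cutoff` are not expanded."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")) or d > cutoff:
            continue
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def greedy_spanner(edges, t):
    """edges: iterable of (u, v, weight) with comparable node labels; returns kept edges."""
    spanner, kept = {}, []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        # Keep the edge only if the current spanner distance exceeds t * w.
        if bounded_distance(spanner, u, v, t * w) > t * w:
            spanner.setdefault(u, {})[v] = w
            spanner.setdefault(v, {})[u] = w
            kept.append((u, v, w))
    return kept

# Hypothetical graph: a path 1-2-3-5 of weight 4 + 4 + 4 = 12 and a direct edge (1, 5) of weight 5.
edges = [(1, 2, 4), (2, 3, 4), (3, 5, 4), (1, 5, 5)]
print(greedy_spanner(edges, t=3))
print(greedy_spanner(edges, t=2))
```

With t = 3 the direct edge (1, 5) is dropped because 12 ≤ 15; with t = 2 it is kept because 12 > 10.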
Figure: Dolphin network. n = 62, m = 159
Figure: 3-spanner. n = 62, m = 150
Figure: 5-spanner. n = 62, m = 148
Figure: 7-spanner. n = 62, m = 144
Figure: 9-spanner. n = 62, m = 138