Graph Sampling and Sparsification Lecture 19 CSCI 4974/6971 7 Nov

Sampling by Exploration (cont.)
• Ego-Centric Exploration (ECE) Sampling
  – Similar to random walk, but each neighbor is selected with probability p
  – Multiple ECE (starting with multiple seeds)
• Depth-First / Breadth-First Search [Krishnamurthy’05]
  – Keep visiting neighbors of the most recently / earliest visited nodes
• Sample Edge Count [Maiya’11]
  – Move to the neighbor with the highest degree, and keep going
• Expansion Sampling [Maiya’11]
  – Construct a sample with the maximal expansion. Select the neighbor v based on $\arg\max_{v \in N(S)} |N(\{v\}) - (N(S) \cup S)|$
  – S: the set of sampled nodes, N(S): the 1st neighbor set of S
Lin et al., Sampling and Summarization for Social Networks, PAKDD 2013 tutorial, 13/05/02

Example: Expansion Sampling
• $|N(\{A\})| = 4$
• $|N(\{E\}) - (N(\{A\}) \cup \{A\})| = |\{F, G, H\}| = 3$
• $|N(\{D\}) - (N(\{A\}) \cup \{A\})| = |\{F\}| = 1$
[Figure: example graph on nodes A, B, C, D, E, F, G, H]
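The selection rule in this example can be sketched in a few lines of Python. The edge list below is hypothetical: the slide's figure is not recoverable, so the edges are chosen only so that the three counts above come out as stated. The function name `expansion_sample` and the alphabetical tie-breaking are likewise assumptions made for illustration.

```python
def expansion_sample(adj, seed, k):
    """Greedy expansion sampling: grow S from `seed` until it holds k nodes."""
    sample = {seed}
    frontier = set(adj[seed]) - sample              # N(S)
    while len(sample) < k and frontier:
        covered = sample | frontier                 # N(S) ∪ S
        # Pick the frontier node that contributes the most unseen neighbors;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(frontier), key=lambda v: len(set(adj[v]) - covered))
        sample.add(best)
        frontier = (frontier | set(adj[best])) - sample
    return sample

# Hypothetical edges, chosen to reproduce the counts in the example.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
         ("D", "F"), ("E", "F"), ("E", "G"), ("E", "H")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def gain(v):
    """Expansion gain of v when S = {A}."""
    return len(adj[v] - (adj["A"] | {"A"}))

print(len(adj["A"]), gain("E"), gain("D"))
print(sorted(expansion_sample(adj, "A", 2)))
```

With S = {A}, node E has the larger gain, so it is the node added next.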

Drawback of Random Walk: Degree Bias!
• $q_k$: sampled node degree distribution; $p_k$: real node degree distribution
• Real average node degree ~ 94, sampled average node degree ~ 338
• Solution: modify the transition probability:
$$P_{v,w} = \begin{cases} \frac{1}{k_v} \min\left(1, \frac{k_v}{k_w}\right) & \text{if } w \text{ is a neighbor of } v \\ 1 - \sum_{y \neq v} P_{v,y} & \text{if } w = v \\ 0 & \text{otherwise} \end{cases}$$
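A minimal sketch of the modified transition rule, assuming it is the usual Metropolis-Hastings random walk in which $k_v$ is the degree of node v. The star graph and the function names are hypothetical.

```python
import random

def mh_transition(adj, v):
    """Transition probabilities out of v under the degree-corrected walk."""
    k_v = len(adj[v])
    probs = {w: (1.0 / k_v) * min(1.0, k_v / len(adj[w])) for w in adj[v]}
    # The self-loop absorbs whatever probability mass the neighbors rejected.
    probs[v] = max(0.0, 1.0 - sum(probs.values()))
    return probs

def mh_random_walk(adj, start, steps, rng):
    """Return the list of nodes visited over `steps` transitions."""
    node, visited = start, []
    for _ in range(steps):
        probs = mh_transition(adj, node)
        node = rng.choices(list(probs), weights=list(probs.values()))[0]
        visited.append(node)
    return visited

# Hypothetical star graph: hub 0 connected to three leaves.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
P = {v: mh_transition(adj, v) for v in adj}
walk = mh_random_walk(adj, 0, 1000, random.Random(0))
print(P[0], P[1])
```

On this graph the transition probabilities are symmetric (hub to leaf and leaf to hub are both 1/3), which is what makes the stationary distribution uniform over nodes instead of proportional to degree.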

Metropolis Graph Sampling [Hubler’08]
• Step 1: Initially pick one subgraph sample S with n’ nodes randomly
• Step 2: Iterate the following steps until convergence
  2.1: Remove one node from S
  2.2: Randomly add a new node to S → S’
  2.3: Compute the likelihood ratio $\alpha = \rho^*(S') / \rho^*(S)$
    – if $\alpha \geq 1$: accept, $S := S'$
    – if $\alpha < 1$: accept, $S := S'$ with probability $\alpha$; reject, $S := S$ with probability $1 - \alpha$
  – $\rho^*(S)$ measures the similarity of a certain property between the sample S and the original network G
• Can be derived approximately using Simulated Annealing
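The accept/reject loop can be sketched as follows. The slide leaves the similarity measure open, so the score used here (closeness of the induced subgraph's average degree to a hypothetical target value) is purely an assumption, as are the graph, the function names, and the fixed iteration count standing in for "until convergence".

```python
import math
import random

def induced_avg_degree(adj, nodes):
    """Average degree of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    return sum(len(adj[v] & nodes) for v in nodes) / len(nodes)

def metropolis_sample(adj, n_prime, score, iters, rng):
    all_nodes = sorted(adj)
    current = set(rng.sample(all_nodes, n_prime))            # Step 1
    for _ in range(iters):                                   # Step 2
        out_node = rng.choice(sorted(current))               # 2.1 remove a node
        in_node = rng.choice(sorted(set(all_nodes) - current))  # 2.2 add a node
        proposal = (current - {out_node}) | {in_node}
        alpha = score(proposal) / score(current)             # 2.3 likelihood ratio
        if alpha >= 1 or rng.random() < alpha:
            current = proposal                               # accept S := S'
        # otherwise reject and keep S unchanged
    return current

# Hypothetical undirected graph as a dict of neighbor sets.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2, 4},
       4: {3, 5}, 5: {4, 6, 7}, 6: {5, 7}, 7: {5, 6}}
target = 2.0  # hypothetical property value the sample should match

def score(nodes):
    """Assumed similarity score; always positive so the ratio is defined."""
    return math.exp(-2.0 * abs(induced_avg_degree(adj, nodes) - target))

sample = metropolis_sample(adj, 4, score, 200, random.Random(1))
print(sorted(sample))
```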

Sampling for Heterogeneous Social Networks

Sampling on Heterogeneous Social Networks
• Heterogeneous Social Networks (HSN)
  – A graph G = <V, E> has n nodes ($v_1, v_2, \ldots, v_n$), m directed edges ($e_1, \ldots, e_m$) and k different types
  – Each node/edge belongs to a type
    • Given a finite set $L = \{L_1, \ldots, L_k\}$ denoting k types
• Sampling methods for HSN
  – Multi-graph sampling [Gjoka’10]
  – Type-distribution preserving sampling [Li’11]
  – Relational-profile preserving sampling [Yang’13]

Multigraph Sampling
• Run random-walk sampling on the union multigraph of several graphs, so that the walk does not get stuck in a disconnected component of any single graph.

Node Type Distribution Preserving Sampling
• Given a graph G and a sampled subgraph $G_S$
• The node type distribution of $G_S$ is expected to be the same as that of G, i.e., $d(Dist(G_S), Dist(G)) = 0$
  – d() denotes the difference between two distributions
[Figure: Sampled Network and Original Network with equal type ratios, (9:6) = (3:2)]

Connection-type Preserving Sampling
• Heterogeneous Connection
  – For an edge $E[v_i, v_j]$
  – Intra-connection edge: $Type(v_i) = Type(v_j)$
  – Inter-connection edge: $Type(v_i) \neq Type(v_j)$
• Intra-Relationship preserving
  – The ratio of intra-connections should be preserved, that is: $d(IR(G_S), IR(G)) = 0$
  – If the intra-relationship is preserved, the inter-relationship is also preserved
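As a small illustration, the intra-connection ratio IR can be computed directly from an edge list and a node-type map. The sketch below assumes IR is the fraction of edges whose endpoints share a type and takes d() to be an absolute difference; the slide does not fix either choice, and the toy data is hypothetical.

```python
def intra_ratio(edges, node_type):
    """Fraction of edges whose two endpoints have the same type."""
    intra = sum(1 for u, v in edges if node_type[u] == node_type[v])
    return intra / len(edges)

# Hypothetical typed graph and a sampled subgraph of it.
node_type = {1: "a", 2: "a", 3: "b", 4: "b", 5: "a", 6: "b"}
graph_edges = [(1, 2), (3, 4), (1, 3), (2, 4), (5, 6), (1, 5)]
sample_edges = [(1, 2), (1, 3)]

ir_graph = intra_ratio(graph_edges, node_type)
ir_sample = intra_ratio(sample_edges, node_type)
difference = abs(ir_sample - ir_graph)   # assumed form of d(IR(G_S), IR(G))
print(ir_graph, ir_sample, difference)
```

Because every edge is either intra or inter, the inter-connection ratio is 1 minus the intra ratio, which is why preserving one preserves the other.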

Respondent-driven Sampling
• First proposed in social science [Heck’99] to address the hidden-population problem in surveying.
• Two main phases: snowball sampling → finding the steady state of the recruitment matrix
[Figure: respondents in G recruit further respondents, each with a limited number of coupons c; the recruitment (transition) matrix with entries S11 ... S33 is taken through an N-step transition to a steady-state vector (P1, P2, P3)]
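The second phase amounts to finding the stationary vector of a row-stochastic recruitment matrix. A minimal sketch using repeated N-step transitions (power iteration) is shown below; the 3×3 matrix values are hypothetical and only illustrate the mechanics.

```python
def steady_state(S, steps=200):
    """Iterate p <- p S from a uniform start and return the resulting vector."""
    n = len(S)
    p = [1.0 / n] * n
    for _ in range(steps):
        p = [sum(p[i] * S[i][j] for i in range(n)) for j in range(n)]
    return p

# Hypothetical recruitment matrix: S[i][j] is the probability that a
# respondent of group i recruits a respondent of group j (rows sum to 1).
S = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]

p = steady_state(S)
# One more transition should leave the vector unchanged.
p_next = [sum(p[i] * S[i][j] for i in range(3)) for j in range(3)]
print(p)
```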

Comparing Different Sampling Algorithms
[Figure: two panels, similarity of node type-distribution and similarity of intra-link distribution]
• Respondent-driven Sampling does a good job at small sample sizes, but then saturates at a mediocre level
• Random node sampling performs poorly in the beginning, but reaches the best results once a sufficient number of nodes has been sampled.

Relational Profile Preserving Sampling
• Node-type/intra-type preservation considers the semantics of nodes, but not the structure of networks
• Propose the Relational Profile to consider semantics and structure together
  – Captures the dependency between each Node Type (NT) and Edge Type (ET) of a directed heterogeneous network
  – Consists of 4 relational matrices
    • Conditional probabilities $P(T_j \mid T_i)$ (e.g. P(ET=cites | NT=paper))
    • Node to node, node to edge, edge to node, edge to edge
[Figure: 2×2 grid of transition matrices indexed by NT and ET; example node types author, paper; example edge types cites, authored, journal_of]

Example of Relational Profile (RP)
Rows and columns are indexed by P, A, C, J, c, p, a; only the non-empty cells of each row are listed, in left-to-right order.
First matrix:
  P: 0.44 0.22 0.22 0.11 0.44 0.33 0.22
  A: 1 1
  C: 1 1
  J: 1 1
  c: 1 0.22 0.44 0.33
  p: 0.5 0.33 0.17 0.66 0.33
  a: 0.5 0.5 0.6 0.4
Second matrix:
  P: 0.182 0.364 0.091 0.273 0.182 0.364 0.364
  A: 1 1
  C: 1 1
  J: 1 1
  c: 1 0.5 0.5
  p: 0.5 0.125 0.375 0.17 0.5 0.33
  a: 0.5 0.5 0.22 0.33 0.44

Challenge: How to approximate RP when the true RP is unknown
• We propose Exploration by Expectation Sampling
• Aim to preserve the unknown relational profile while adding new sample nodes
  1. Randomly choose a starting node and the corresponding edges
  2. Based on the current RP, select the next node from all 1-hop neighbors
  3. Add the new node and all its edges
  4. Update the RP of the sub-sampled graph
  5. Repeat steps 2, 3 and 4 until the RP converges
• Which node should be selected?
  – Select the node whose inclusion can potentially lead to the largest change to the existing RP
    • Use the partially observed RP to generate the ‘expected amount of change’ of each node as its score
    • Weighted sampling based on the score

Relational Profile Sampling (RPS)
• Idea: sample to increase the diversity
• $D(v, G_s)$ = estimated change of RP given sampling v on the current graph $G_s$ = $E[\Delta P(G_s, G_s + v) \mid G_s]$, where $\Delta P$ = RMSE
• Exploiting the existing RP, $P(type(v) = t \mid G_s)$ can be obtained using the observed types of v’s neighbors
• P(type|type) can be obtained from the existing RP
• Goal: maximize expected property (Relational Profile distribution) change
[Figure: candidate node v attached to $G_s$, with edges labeled RP(type|type)]

Evaluation
• Datasets: 3 real-life large-scale social networks
• Baselines:
  – Random Walk Sampling (RW)
  – Degree-based sampling (HDS)
• Evaluation I (Property Preservation): see how well the sampled network approximates two properties of the full network
• Evaluation II (Prediction): train a prediction model using the sampled network to infer out-of-sample network status:
  – Node Type Prediction: predict the type of unseen nodes in the network using a sub-sampled network
  – Missing Relations Prediction: recover/predict the missing links
  – Features:
    • $f_{deg}$ = (in/out deg; avg in/out deg of neighbors)
    • $f_{topo}$ = (Common Neighbors; Jaccard’s Coefficient; etc.)
    • $f_{nt}$ = $P(type(v) \mid G_s)$
    • $f_{RPnode}$
    • $f_{RPpath}$

Experiments (Property Preservation)
• Type dependency preservation: RP (RMSE)
• Preserving relative node weights: Weighted PageRank propagated throughout entire network (Kendall-Tau)
[Figure: RW, HDS and RPS compared on the Hep, Aca and Movie networks; x-axis: # nodes sampled (in 10s)]

Experiments (Prediction)
• We show Academic Network for brevity.
[Figure: two panels, Node Type Prediction and Missing Relation Prediction; y-axis: accuracy; x-axis: number of sampled nodes; methods: highDeg, RandWalk, RPS]

Task-driven Network Sampling
• Sampling Community Structure [Maiya’10][Satuluri’11]
• Sampling Network Backbone for Influence Maximization [Mathioudakis’11]
• Sampling High Centrality Individuals [Maiya’10]
• Sampling Personalized PageRank Values [Vattani’11]
• Sampling Network for Link/Label Prediction [Ahmed’12]

Short Summary
• Why sample a social network?
  – The full network (e.g. Facebook) cannot be fully observed
  – Crawling can be costly in terms of resource and time consumption (therefore a smart sampling strategy is needed)
• Methods covered, for homogeneous and heterogeneous SN:
  – Node and Edge Selection: [Leskovec’06] [Adamic’01] [Kurant’12] [Ahmed’12] [Ribeiro’10] [Krishnamurthy’05]
  – Sampling by Exploration: [Leskovec’06] [Hubler’08] [Gjoka’11] [Li’11] [Kurant’12] [Gjoka’10] [Ribeiro’10] [Yang’13] [Maiya’11] [Kurant’11]
  – Task-driven Sampling: [Maiya’10] [Satuluri’11] [Mathioudakis’11] [Vattani’11] [Ahmed’12]

Detecting Community Structures in Social Networks by Graph Sparsification
Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder, Heritage Institute of Technology, Kolkata, India

Detecting Community Structures in Social Networks by Graph Sparsification
Partha Basuchowdhuri, Satyaki Sikdar, Sonu Shreshtha, Subhasis Majumder
Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, India
September 5, 2016
Sections: Building Community Preserving Sparsified Network; Fast Detection of Communities from the Sparsified Network

Figure: The tendency of people to live in racially homogeneous neighborhoods [1]. In yellow and orange blocks % of Afro-Americans ≤ 25, in brown and black boxes % ≥ 75.

Definition of a Community
• For a given graph G(V, E), find a cover $C = \{C_1, C_2, \ldots, C_k\}$ such that $\bigcup_i C_i = V$.
• For disjoint communities, $\forall i \neq j$ we have $C_i \cap C_j = \emptyset$
• For overlapping communities, $\exists i \neq j$ where $C_i \cap C_j \neq \emptyset$
• $C = \{C_1, C_2, C_3\}$, with $C_1$ = yellow nodes, $C_2$ = green, $C_3$ = blue, is a disjoint cover
• However, $\bar{C} = \{\bar{C}_1, \bar{C}_2\}$, with $\bar{C}_1$ = yellow & green nodes and $\bar{C}_2$ = blue & green nodes, is an overlapping cover
• For our problem, we concentrate on disjoint community detection
Figure: Zachary’s Karate Club Network
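The two conditions in the definition translate directly into set operations. The sketch below uses a hypothetical nine-node graph whose node labels stand in for the yellow, green and blue groups of the figure; the function names are illustrative.

```python
from itertools import combinations

def is_cover(communities, nodes):
    """True if the union of all communities equals the node set V."""
    return set().union(*communities) == set(nodes)

def is_disjoint(communities):
    """True if no two distinct communities share a node."""
    return all(not (a & b) for a, b in combinations(communities, 2))

# Hypothetical node groups standing in for the colored nodes.
yellow, green, blue = {1, 2, 3}, {4, 5, 6}, {7, 8, 9}
V = yellow | green | blue

C = [yellow, green, blue]                 # disjoint cover
C_bar = [yellow | green, blue | green]    # overlapping cover

print(is_cover(C, V), is_disjoint(C))
print(is_cover(C_bar, V), is_disjoint(C_bar))
```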

A Little Background: Edge Betweenness Centrality
$$c_B(e) = \sum_{s,t \in V,\ s \neq t} \frac{\sigma(s, t \mid e)}{\sigma(s, t)}$$

Top 6 edges
Edge       c_B(e)    Type
(10, 13)   0.3       inter
(3, 5)     0.23333   inter
(7, 15)    0.2079    inter
(1, 8)     0.1873    inter
(13, 15)   0.1746    intra
(5, 7)     0.1476    intra

Bottom 6 edges
Edge       c_B(e)    Type
(8, 11)    0.022     intra
(1, 2)     0.0269    intra
(9, 11)    0.031     intra
(8, 9)     0.0412    intra
(12, 15)   0.052     intra
(3, 4)     0.060     intra
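Here σ(s, t) is the number of shortest paths between s and t, and σ(s, t | e) the number of those that pass through edge e. A minimal sketch of the computation for an unweighted, undirected graph follows, using Brandes-style BFS accumulation. It returns unnormalised counts over unordered pairs, whereas the values in the tables above appear to be normalised; the two-triangle graph is hypothetical and is not the network from the slide.

```python
from collections import deque

def edge_betweenness(adj):
    """Unnormalised edge betweenness for an undirected, unweighted graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist, sigma, preds, order = {s: 0}, {s: 1}, {s: []}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies from the farthest nodes toward s.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Every unordered pair {s, t} was counted once from each endpoint.
    return {e: val / 2.0 for e, val in score.items()}

# Hypothetical graph: two triangles joined by the bridge (3, 4).
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}
scores = edge_betweenness(adj)
print(max(scores, key=scores.get), scores[frozenset((3, 4))])
```

The bridge carries all nine shortest paths between the two triangles, so it has the highest score, matching the observation that inter-community edges dominate the top of the ranking.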

The Girvan-Newman Algorithm
Proposed by Michelle Girvan and Mark Newman [2] in 2002
The Key Ideas
• Based on reachability of nodes: shortest paths
• Edges are selected on the basis of the edge betweenness centrality
The algorithm
1 Compute centrality for all edges
2 Remove the edge with the largest centrality; ties can be broken randomly
3 Recalculate the centralities on the running graph
4 Iterate from step 2, stop when you get clusters of desirable quality
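The four steps can be sketched as below. Two simplifications are assumptions of this sketch and not part of the slide: the stopping rule is a target number of connected components (the slide only says "clusters of desirable quality"), and ties are broken by dictionary order instead of randomly. The example graph is hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Unnormalised edge betweenness via BFS from every node."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1}, {s: []}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    return {e: val / 2.0 for e, val in score.items()}

def components(adj):
    """Connected components, in order of first appearance."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    queue.append(w)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman(adj, target):
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    removed = []
    while len(components(adj)) < target:
        scores = edge_betweenness(adj)            # steps 1 and 3
        if not scores:
            break                                 # no edges left to remove
        best = max(scores, key=scores.get)        # step 2
        u, v = best
        adj[u].discard(v)
        adj[v].discard(u)
        removed.append(tuple(sorted(best)))
    return components(adj), removed               # step 4 ends the loop

# Hypothetical graph: two triangles joined by the bridge (3, 4).
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}
communities, removed = girvan_newman(adj, 2)
print(communities, removed)
```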

[Figure panels: (a) Best edge: (10, 13); (b) Best edge: (3, 5); (c) Best edge: (7, 15); (d) Best edge: (1, 8); (e) Best edge: (2, 11); (f) Final graph]

Louvain Method: A Greedy Approach
• Proposed by Blondel et al. [3] in 2008
• Takes the greedy maximization approach
• Very fast in practice; it is the current state of the art in disjoint community detection
• Performs hierarchical partitioning, stopping when there cannot be any further improvement in modularity
• Contracts the graph in each iteration, thereby speeding up the process
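The quantity being greedily maximized is modularity, which the slide does not spell out. The sketch below assumes the standard Newman-Girvan definition for an unweighted, undirected graph, $Q = \sum_c \left[ \frac{L_c}{m} - \left( \frac{d_c}{2m} \right)^2 \right]$, where $L_c$ is the number of edges inside community c, $d_c$ the sum of degrees in c, and m the total number of edges. It only evaluates Q for a given partition and is not an implementation of the Louvain method itself; the graph is hypothetical.

```python
def modularity(adj, communities):
    """Newman-Girvan modularity of a partition of an undirected graph."""
    m = sum(len(ns) for ns in adj.values()) / 2      # total number of edges
    q = 0.0
    for c in communities:
        # Each internal edge is seen from both endpoints, hence the / 2.
        internal = sum(1 for v in c for w in adj[v] if w in c) / 2
        degree_sum = sum(len(adj[v]) for v in c)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q

# Hypothetical graph: two triangles joined by the bridge (3, 4).
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}

q_split = modularity(adj, [{1, 2, 3}, {4, 5, 6}])
q_single = modularity(adj, [{1, 2, 3, 4, 5, 6}])
print(q_split, q_single)
```

Splitting the graph into its two triangles gives a higher modularity than keeping everything in one community, which is the kind of improvement each greedy move looks for.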

Louvain Method in Action
[Figure]

Outline for Part I
1 Building Community Preserving Sparsified Network
  – Assigning Meaningful Weights to Edges
  – Sparsification using t-spanner
2 Fast Detection of Communities from the Sparsified Network
  – Methodology and Visualizations
  – Experimental Results

Our Method
Input: An unweighted network G(V, E)
Output: A disjoint cover C
1 Use the Jaccard coefficient to turn G into a weighted network G(V, E, W)
2 Construct a t-spanner $G_S$ of G(V, E, W). Take the complement of $G_S$, call it $G_{comm}$
3 Use LINCOM to break $G_{comm}$ into small but pure fragments
4 Use the second phase of the Louvain Method to piece all the small bits and pieces together to get C

Jaccard Intro
Definition: $w_J(e(v_i, v_j)) = \frac{|\Gamma(v_i) \cap \Gamma(v_j)|}{|\Gamma(v_i) \cup \Gamma(v_j)|}$, where $\Gamma(v_i)$ is the neighborhood of the node $v_i$; $\therefore w_J \in [0, 1]$
• Jaccard works well in domains where local influence is important [4][5][6]
• The computation takes O(m) time
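A minimal sketch of the weighting step follows. It assumes Γ(v) is the open neighborhood (v itself excluded), which the slide does not state; with closed neighborhoods the values would differ. The graph and function name are hypothetical.

```python
def jaccard_weights(adj):
    """Jaccard weight for every edge of an undirected graph (dict of sets)."""
    weights = {}
    for u in adj:
        for v in adj[u]:
            key = (u, v) if u < v else (v, u)
            if key in weights:
                continue                      # edge already handled from v's side
            common = adj[u] & adj[v]          # shared neighbors
            union = adj[u] | adj[v]           # non-empty: contains u and v
            weights[key] = len(common) / len(union)
    return weights

# Hypothetical graph: two triangles joined by the bridge (3, 4).
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
w = jaccard_weights(adj)
print(w[(1, 2)], w[(1, 3)], w[(3, 4)])
```

The bridge between the two triangles gets weight 0 because its endpoints share no neighbors, while edges inside a triangle get positive weights.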

Jaccard Example
[Figure]

Table: Jaccard weight statistics for top 10% edges in terms of $w_J$.

                                                 Top 10% edges in terms of w_J
Network    |E|         Intra-cluster edge count  Total edges   Intra-edge   Fraction
Karate     78          21                        7             7            1
Dolphin    159         39                        15            15           1
Football   613         179                       61            61           1
Les-Mis    254         56                        25            25           1
Enron      180,811     48,498                    18,383        18,220       0.99113
Epinions   405,739     146,417                   40,573        36,589       0.90180
Amazon     925,872     54,403                    92,587        92,584       0.99996
DBLP       1,049,866   164,268                   104,986       104,986      1

Spanner
• An (α, β)-spanner of a graph G = (V, E, W) is a subgraph $G_S = (V, E_S, W_S)$ such that $\delta_S(u, v) \leq \alpha \cdot \delta(u, v) + \beta \quad \forall u, v \in V$
• A t-spanner is a special case of an (α, β)-spanner where α = t and β = 0

Authors                          Size                       Running Time
Althöfer et al. [1993] [7]       $O(n^{1+1/k})$             $O(m(n^{1+1/k} + n \log n))$
Althöfer et al. [1993] [7]       $\frac{1}{2} n^{1+1/k}$    $O(m n^{1+1/k})$
Roditty et al. [2004] [8]        $\frac{1}{2} n^{1+1/k}$    $O(k n^{2+1/k})$
Roditty et al. [2005] [9]        $O(k n^{1+1/k})$           $O(km)$ (det.)
Baswana and Sen [2007] [10]      $O(k n^{1+1/k})$           $O(km)$ (rand.)

Figure: Original network, n = 11, m = 18
Figure: A 3-spanner of the network, n = 11, m = 11
• δ(1, 5) = 5, $\delta_s(1, 5) = 12$
• Since $\delta_s(1, 5) < t \cdot \delta(1, 5)$, the edge (1, 5) is discarded
• The other edges are discarded in a similar fashion.
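The discard test in this example matches the classical greedy spanner construction: scan edges in order of increasing weight and keep an edge only if the spanner built so far does not already connect its endpoints within t times the edge weight. The sketch below assumes that construction; the slides do not say which spanner algorithm the authors actually run. The four-edge graph is hypothetical and only mirrors the numbers δ(1, 5) = 5 and δ_s(1, 5) = 12.

```python
import heapq

def spanner_distance(spanner, source, target):
    """Dijkstra distance between source and target in the current spanner."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if v == target:
            return d
        if d > dist.get(v, float("inf")):
            continue                                   # stale heap entry
        for w, weight in spanner.get(v, {}).items():
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return float("inf")                                # not connected yet

def greedy_spanner(edges, t):
    """Return the edges kept by the greedy t-spanner construction."""
    spanner, kept = {}, []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        # Keep the edge only if the existing detour is longer than t * w.
        if spanner_distance(spanner, u, v) > t * w:
            spanner.setdefault(u, {})[v] = w
            spanner.setdefault(v, {})[u] = w
            kept.append((u, v, w))
    return kept

# Hypothetical weighted edges: the path 1-2-3-5 has length 12, edge (1, 5) has weight 5.
edges = [(1, 2, 4), (2, 3, 4), (3, 5, 4), (1, 5, 5)]
kept_t3 = greedy_spanner(edges, 3)
kept_t2 = greedy_spanner(edges, 2)
print(kept_t3)
print(kept_t2)
```

With t = 3 the detour of length 12 is within 3 · 5 = 15, so edge (1, 5) is dropped; with t = 2 the bound is 10 and the edge is kept.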

Figure: Dolphin network. n = 62, m = 159

Figure: 3-spanner. n = 62, m = 150

Graph Sampling and Sparsification Lecture 19 CSCI 4974/6971 7 Nov 2016
Today’s Biz
1. Reminders
2. Review
3. Graph Sampling/Sparsification
Reminders
• Assignment 4: due date November 10th
• Setting up and running on CCI
