Weighted Graphs and Disconnected Components: Patterns and a Generator


  1. Weighted Graphs and Disconnected Components: Patterns and a Generator
     Mary McGlohon, Leman Akoglu, Christos Faloutsos
     Carnegie Mellon University, School of Computer Science

  2. 2 McGlohon, Akoglu, Faloutsos KDD08

  3. “Disconnected” components
     ● In graphs, a largest connected component emerges.
     ● What about the smaller components?
     ● How do they emerge, and how do they join with the large one?

  4. Weighted edges
     ● Graphs have heavy-tailed degree distributions.
     ● What else can we say about the edges themselves?
     ● How are they repeated, or otherwise weighted?

  5. Our goals
     ● Observe “next-largest connected components” (NLCC’s)
       Q1: How does the giant connected component (GCC) emerge?
       Q2: How do NLCC’s emerge and join with the GCC?
     ● Find properties that govern edge weights
       Q3: How does the total weight of the graph relate to the number of edges?
       Q4: How do the weights of nodes relate to degree?
       Q5: Does this relation change with the graph?
     ● Q6: Can we produce an emergent, generative model?

  6. Outline
     ● Motivation
     ● Related work
     ● Preliminaries
     ● Data
     ● Observations 1 to 5
     ● Model
     ● Summary

  7. Properties of networks
     ● Small diameter (“small world” phenomenon) – [Milgram 67] [Leskovec, Horvitz 07]
     ● Heavy-tailed degree distribution – [Barabasi, Albert 99] [Faloutsos, Faloutsos, Faloutsos 99]
     ● Densification – [Leskovec, Kleinberg, Faloutsos 05]
     ● “Middle region” components as well as GCC and singletons – [Kumar, Novak, Tomkins 06]

  8. Generative Models
     ● Erdos-Renyi model [Erdos, Renyi 60]
     ● Preferential Attachment [Barabasi, Albert 99]
     ● Forest Fire model [Leskovec, Kleinberg, Faloutsos 05]
     ● Kronecker multiplication [Leskovec, Chakrabarti, Kleinberg, Faloutsos 07]
     ● Edge Copying model [Kumar, Raghavan, Rajagopalan, Sivakumar, Tomkins, Upfal 00]
     ● “Winners don’t take all” [Pennock, Flake, Lawrence, Glover, Giles 02]

  12. Diameter
      ● The diameter of a graph is its “longest shortest path”.
      ● The effective diameter is the distance within which 90% of connected node pairs can reach each other.
      [Figure: example graph on nodes n1 to n7, with diameter = 3]
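
The two definitions on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: it assumes an undirected, unweighted graph stored as an adjacency dict, applies the 90% threshold to connected ordered node pairs, and returns a whole number of hops with no interpolation. The function names and the four-node path graph are hypothetical.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pairwise_distances(adj):
    """Sorted shortest-path lengths over all connected ordered pairs (u != v)."""
    lengths = []
    for s in adj:
        lengths.extend(d for t, d in bfs_distances(adj, s).items() if t != s)
    return sorted(lengths)


def diameter(adj):
    """Longest shortest path among connected pairs (0 if there are none)."""
    lengths = pairwise_distances(adj)
    return lengths[-1] if lengths else 0


def effective_diameter(adj, fraction=0.9):
    """Smallest hop count d such that at least `fraction` of connected
    pairs are within d hops of each other (assumed reading of the slide)."""
    lengths = pairwise_distances(adj)
    if not lengths:
        return 0
    k = math.ceil(fraction * len(lengths))
    return lengths[k - 1]


# Hypothetical example: the path a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(diameter(path))                 # 3
print(effective_diameter(path))       # 3
print(effective_diameter(path, 0.5))  # 1
```

Running a BFS from every node costs O(N·(N+E)), which is fine for a toy graph; for graphs of the sizes listed in this deck, a sampled or approximate method would be needed.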

  14. Unipartite Networks
      ● Postnet: Posts in blogs, with hyperlinks between them
      ● Blognet: Aggregated Postnet, repeated edges
      ● Patent: Patent citations
      ● NIPS: Academic citations
      ● Arxiv: Academic citations
      ● NetTraffic: Packets, repeated edges
      ● Autonomous Systems (AS): Packets, repeated edges
      [Figure: example unipartite graph on nodes n1 to n7]

  17. Unipartite Networks
      ● (Nodes, Edges, Timestamps)
      ● Postnet: 250K, 218K, 80 days
      ● Blognet: 60K, 125K, 80 days
      ● Patent: 4M, 8M, 17 yrs
      ● NIPS: 2K, 3K, 13 yrs
      ● Arxiv: 30K, 60K, 13 yrs
      ● NetTraffic: 21K, 3M, 52 mo
      ● AS: 12K, 38K, 6 mo

  18. Bipartite Networks
      ● IMDB: Actor-movie network
      ● Netflix: User-movie ratings
      ● DBLP: repeated edges
        – Author-Keyword
        – Keyword-Conference
        – Author-Conference
      ● US Election Donations: $ weights, repeated edges
        – Orgs-Candidates
        – Individuals-Orgs
      [Figure: example bipartite graph between nodes n1 to n4 and m1 to m3]

  21. Bipartite Networks
      ● IMDB: 757K, 2M, 114 yr
      ● Netflix: 125K, 14M, 72 mo
      ● DBLP: 25 yr
        – Author-Keyword: 27K, 189K
        – Keyword-Conference: 10K, 23K
        – Author-Conference: 17K, 22K
      ● US Election Donations: 22 yr
        – Orgs-Candidates: 23K, 877K
        – Individuals-Orgs: 6M, 10M

  23. Observation 1: Gelling Point
      Q1: How does the GCC emerge?

  24. Observation 1: Gelling Point
      ● Most real graphs display a gelling point, or burning-off period.
      ● After the gelling point, they exhibit typical behavior. The gelling point itself is marked by a spike in diameter.
      [Figure: IMDB, diameter over time, annotated at t=1914]

  25. Observation 2: NLCC behavior
      Q2: How do NLCC’s emerge and join with the GCC? Do they continue to grow in size? Do they shrink? Stabilize?

  26. Observation 2: NLCC behavior
      ● After the gelling point, the GCC takes off, but NLCC’s remain constant or oscillate.
      [Figure: IMDB, connected component size over time]
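
One way to reproduce a component-size-over-time curve like the one on this slide is to feed timestamped edges into a union-find structure and record the largest component sizes after each timestamp. The sketch below is only an illustration under stated assumptions: edges are `(timestamp, u, v)` tuples, the graph is treated as undirected, and the class and function names are hypothetical rather than taken from the authors' work.

```python
class UnionFind:
    """Disjoint sets with union by size; tracks the size of each component."""

    def __init__(self):
        self.parent = {}
        self.size = {}  # only roots have an entry here

    def find(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        while self.parent[x] != x:
            # Path halving keeps later lookups short.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size.pop(rb)

    def component_sizes(self):
        """Component sizes, largest first."""
        return sorted(self.size.values(), reverse=True)


def track_components(timed_edges, top=3):
    """Return [(t, sizes)], where `sizes` lists the `top` largest component
    sizes once every edge with timestamp <= t has been added."""
    uf = UnionFind()
    history = []
    ordered = sorted(timed_edges, key=lambda e: e[0])
    for i, (t, u, v) in enumerate(ordered):
        uf.union(u, v)
        is_last_at_t = i + 1 == len(ordered) or ordered[i + 1][0] != t
        if is_last_at_t:
            history.append((t, uf.component_sizes()[:top]))
    return history


# Made-up edge stream, for illustration only.
edges = [(1, "a", "b"), (1, "c", "d"), (2, "e", "f"), (3, "b", "c"), (4, "g", "h")]
for t, sizes in track_components(edges):
    print(t, sizes)
# 1 [2, 2]
# 2 [2, 2, 2]
# 3 [4, 2]
# 4 [4, 2, 2]
```

In each row the first entry plays the role of the GCC and the remaining entries the NLCC’s; in this toy stream the largest component grows at t=3 while the next-largest stay at size 2.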

  28. Observation 3
      Q3: How does the total weight of the graph relate to the number of edges?

  29. Observation 3: Fortification Effect
      ● $ = # checks?
      [Figure: Orgs-Candidates, |$| against |Checks|, points labeled 1980 and 2004]

  30. Observation 3: Fortification Effect
      ● Weight additions follow a power law with respect to the number of edges: W(t) ∝ E(t)^w
        – W(t): total weight of graph at t
        – E(t): total edges of graph at t
        – w is the PL exponent
        – 1.01 < w < 1.5 = super-linear!
        – (more checks, even more $)
      [Figure: Orgs-Candidates, |$| against |Checks|, points labeled 1980 and 2004]
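
A power law between W(t) and E(t) shows up as a straight line on log-log axes, so the exponent w can be estimated as the slope of log W(t) against log E(t). The sketch below assumes edge records of the form `(timestamp, source, target, weight)` and counts every record as one edge (one check); both choices are assumptions made for illustration, and the function names are hypothetical. Ordinary least squares on logs is used here as a simple stand-in, not necessarily the fitting procedure behind the slide.

```python
import math


def cumulative_totals(records):
    """Return [(t, E(t), W(t))]: number of edge records and total weight
    accumulated up to and including each timestamp t."""
    totals = []
    edge_count, weight_sum = 0, 0.0
    ordered = sorted(records, key=lambda r: r[0])
    for i, (t, _src, _dst, weight) in enumerate(ordered):
        edge_count += 1
        weight_sum += weight
        is_last_at_t = i + 1 == len(ordered) or ordered[i + 1][0] != t
        if is_last_at_t:
            totals.append((t, edge_count, weight_sum))
    return totals


def fit_exponent(xs, ys):
    """Least-squares slope of log(y) on log(x): the exponent b in y ~ c * x**b."""
    logs = [(math.log(x), math.log(y)) for x, y in zip(xs, ys) if x > 0 and y > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive points")
    n = len(logs)
    mean_x = sum(lx for lx, _ in logs) / n
    mean_y = sum(ly for _, ly in logs) / n
    sxx = sum((lx - mean_x) ** 2 for lx, _ in logs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in logs)
    return sxy / sxx


def weight_exponent(records):
    """Estimate w from the (E(t), W(t)) pairs of an edge stream."""
    totals = cumulative_totals(records)
    return fit_exponent([e for _, e, _ in totals], [w for _, _, w in totals])
```

Calling `weight_exponent` on a donation stream returns a single number; a value above 1 would correspond to the super-linear behavior the slide describes.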

  31. Observations 4 and 5
      Q4: How do the weights of nodes relate to degree?
      Q5: Does this relation change over time?

  32. Observation 4: Snapshot Power Law
      ● At any time, the total incoming weight of a node follows a power law in its in-degree, with PL exponent iw.
        1.01 < iw < 1.26, super-linear
      ● More donors, even more $
      ● e.g. John Kerry: $10M received from 1K donors
      [Figure: Orgs-Candidates, in-weights ($) against edges (# donors)]

  33. Observation 5: Snapshot Power Law
      ● For a given graph, this exponent is constant over time.
      [Figure: Orgs-Candidates, exponent over time]
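
To check the snapshot power law on a single snapshot, one can aggregate in-degree and total in-weight per node and fit the slope on log-log axes; repeating this for successive snapshots gives the exponent-over-time curve. The following is a rough sketch under assumptions that are not in the slides: edges are `(source, target, weight)` tuples, in-degree counts distinct sources (donors), the fit is plain least squares on logs, and all names are hypothetical.

```python
import math
from collections import defaultdict


def in_degree_and_weight(edges):
    """Map each target node to (in-degree, total in-weight).
    In-degree counts distinct sources, so repeated edges add weight only."""
    sources = defaultdict(set)
    weight = defaultdict(float)
    for src, dst, w in edges:
        sources[dst].add(src)
        weight[dst] += w
    return {node: (len(sources[node]), weight[node]) for node in sources}


def loglog_slope(points):
    """Least-squares slope of log(y) on log(x) over (x, y) pairs."""
    logs = [(math.log(x), math.log(y)) for x, y in points if x > 0 and y > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive points")
    n = len(logs)
    mean_x = sum(lx for lx, _ in logs) / n
    mean_y = sum(ly for _, ly in logs) / n
    sxx = sum((lx - mean_x) ** 2 for lx, _ in logs)
    if sxx == 0:
        raise ValueError("all in-degrees are identical")
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in logs)
    return sxy / sxx


def snapshot_exponent(edges):
    """Estimate the exponent iw for one snapshot of the graph."""
    return loglog_slope(in_degree_and_weight(edges).values())
```

Applying `snapshot_exponent` to the edges present at each point in time and comparing the results is one way to test whether the exponent stays roughly constant, as Observation 5 states.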

  34. Outline
      ● Motivation
      ● Related work
      ● Preliminaries
      ● Data
      ● Observations
      ● Q6: Is there a generative, “emergent” model?
      ● Summary

  36. Goals of model
      ● a) Emergent, intuitive behavior
      ● b) Shrinking diameter
      ● c) Constant NLCC’s
      ● d) Densification power law
      ● e) Power-law degree distribution
      = “Butterfly” Model

  37. Butterfly model in action
      ● A node joins a network, with its own parameter p_step (“Curiosity”).
      [Figure: example graph on nodes n1 to n8]

  38. Butterfly model in action
      ● A node joins a network, with its own parameter.
      ● With (global) probability p_host (“Cross-disciplinarity”), it chooses a random host.
      [Figure: example graph on nodes n1 to n8]
