


  1. Scalable Diffusion-Aware Optimization of Network Topology Elias Boutros Khalil, Bistra Dilkina, Le Song Georgia Institute of Technology

  2. Problem • Given: • a graph G(V,E) • a set of source nodes X (infected nodes) • the Linear Threshold model • Find a set of k edges to: • remove, such that the spread of a certain substance is minimized, or • add, such that the spread of a certain substance is maximized

  3. Review: Diffusion Models • Linear Threshold Model • Each edge (u,v) has a weight Wuv • Each node v chooses a threshold uniformly at random in [0,1] • Node v becomes infected once the total weight of its edges from infected neighbors reaches its threshold • Independent Cascade Model • Each edge (u,v) has a propagation probability Puv • Each infected node u has exactly one chance to infect its neighbor v, succeeding with probability Puv
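The Linear Threshold rule can be sketched in a few lines of Python. This is only an illustration under assumptions that are not on the slide: the graph is given as a dictionary `in_weights` of incoming edge weights, the function name `simulate_linear_threshold` is made up for this example, and the activation rule used is the standard one (total incoming weight from infected neighbors reaches the node's threshold).

```python
import random


def simulate_linear_threshold(in_weights, sources, thresholds=None, seed=0):
    """Run one Linear Threshold cascade and return the set of infected nodes.

    in_weights[v] maps each in-neighbor u of v to the edge weight W_uv.
    thresholds may be passed explicitly; otherwise each node draws one
    uniformly at random in [0, 1] (assumes node labels are sortable).
    """
    nodes = set(in_weights) | set(sources)
    for neighbors in in_weights.values():
        nodes.update(neighbors)
    if thresholds is None:
        rng = random.Random(seed)
        thresholds = {v: rng.random() for v in sorted(nodes)}
    infected = set(sources)
    changed = True
    while changed:
        changed = False
        # Check every node that is still healthy in this pass.
        for v in nodes - infected:
            incoming = sum(
                w for u, w in in_weights.get(v, {}).items() if u in infected
            )
            if incoming >= thresholds[v]:
                infected.add(v)
                changed = True
    return infected


# Hypothetical toy graph: a -> b (0.6), a -> c (0.2), b -> c (0.3).
toy_weights = {"b": {"a": 0.6}, "c": {"a": 0.2, "b": 0.3}}
print(simulate_linear_threshold(toy_weights, {"a"}))
```

Averaging the size of the returned set over many random threshold draws gives a Monte Carlo estimate of the spread from the source set.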

  4. Review: Influence Maximization • Given • G(V,E) • the LT model or the IC model • Find k nodes to activate so as to maximize the spread of a certain substance • Greedy algorithm • The objective function is submodular • (1-1/e)-approximation
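The greedy algorithm on this slide can be sketched as follows. To keep the example deterministic, the expected spread is replaced by a stand-in objective (the number of nodes reachable from the seed set), which is also monotone and submodular; the names `greedy_max` and `reachable_count` and the toy graph are assumptions made for this illustration only.

```python
def greedy_max(candidates, k, objective):
    """Pick up to k candidates, each time adding the one with the largest marginal gain."""
    chosen = []
    for _ in range(k):
        base = objective(chosen)
        best, best_gain = None, 0.0
        for c in candidates:
            if c in chosen:
                continue
            gain = objective(chosen + [c]) - base
            if best is None or gain > best_gain:
                best, best_gain = c, gain
        if best is None:  # no candidates left
            break
        chosen.append(best)
    return chosen


def reachable_count(graph, seeds):
    """Stand-in for expected spread: number of nodes reachable from the seeds."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        u = stack.pop()
        for v in graph.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen)


# Hypothetical toy graph given as adjacency lists.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["e"], "e": []}
seeds = greedy_max(list(toy_graph), 2, lambda s: reachable_count(toy_graph, s))
print(seeds)
```

In a real setting the objective would be a Monte Carlo estimate of the spread under the LT or IC model, which is where most of the running time goes.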

  5. Edge Deletion Problem • Given G and a source set A • Find k edges to delete • The objective is supermodular • The greedy algorithm provides a (1-1/e)-approximation • Scaling-up tricks

  6. Edge Addition Problem • Given G and a source set A • Find k edges to add • Still supermodular (equivalent to constrained submodular minimization) • Algorithm: maximize the lower bound

  7. Edge Addition Problem • The marginal gain is bounded • Apply an approach for constrained submodular minimization with approximation guarantees [R. Iyer, S. Jegelka, and J. Bilmes. Fast semidifferential-based submodular function optimization. In ICML, 2013.]

  8. Experiments • Datasets • Synthetic datasets: generated by the Kronecker graph model • (1) CorePeriphery, (2) ErdosRenyi and (3) Hierarchical • Real datasets

  9. Experiments • Competing heuristics • Random • Weights: the k edges with the highest weights • Betweenness • Eigen: the k edges that maximize the drop in the leading eigenvalue (eigendrop) • Degree: the k edges whose destination nodes have the highest out-degrees [8]

  10. Experiments • [Figures: results for edge deletion and for edge addition]

  11. Core Decomposition of Uncertain Graphs Francesco Bonchi, Francesco Gullo, Andreas Kaltenbrunner, Yana Volkovich Yahoo Labs, Spain

  12. Core decomposition • k-core of a graph • a maximal subgraph in which every vertex is connected to at least k other vertices within that subgraph • Core decomposition • The set of all k-cores of a graph G forms the core decomposition of G
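The definition above can be illustrated with the usual peeling procedure: repeatedly remove a vertex of minimum remaining degree. The sketch below is a simple quadratic-time version written for readability (the linear-time variant uses bucketed degrees); the function names and the toy graph are assumptions for this example.

```python
def core_numbers(adj):
    """Return the core number of every vertex of an undirected graph.

    adj maps each vertex to the set of its neighbors. A vertex with core
    number c belongs to the k-core for every k <= c.
    """
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree.
        v = min(remaining, key=lambda x: degree[x])
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def k_core(adj, k):
    """Vertices of the k-core: all vertices whose core number is at least k."""
    return {v for v, c in core_numbers(adj).items() if c >= k}


# Hypothetical toy graph: a triangle a-b-c with a pendant vertex d attached to a.
toy_adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
print(core_numbers(toy_adj))
```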

  13. k-core in uncertain graphs • (k, η)-core: a maximal subgraph in which every vertex has at least k neighbours within that subgraph with probability no less than η

  14. Example

  15. Motivation • Core decomposition can be computed efficiently in deterministic graphs • It is computed in linear time • However, this does not guarantee efficiency in uncertain graphs • Even the simplest graph operations may become computationally intensive • Uncertain graph • Edges are assigned a probability of existence • E.g., protein interactions, the influence of one person on another

  16. Applications • Influence maximization • Idea: reduce the input graph G by keeping only the inner-most η-shells • The higher the core index, the more likely the vertex is an influential spreader [17] • Task-driven team formation • Nodes: individuals; edges: a probabilistic topic model • Given a pair <T,Q>, where T is a set of terms and Q is a set of nodes • Goal: find a set of nodes A with Q ⊆ A that is a good team to perform the task in T • Solution: find a connected component of a (k, η)-core that contains Q

  17. Algorithm framework • Follows the deterministic case • Uses, for each vertex v, the maximum degree such that the probability for v to have at least that degree is no less than η • This quantity is non-trivial to compute
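One way to compute the quantity described on this slide is a small dynamic program over the incident edges. The sketch below assumes that edges exist independently of each other, and the names `degree_tail_probabilities` and `eta_degree` are chosen for this illustration; it is not claimed to be the authors' implementation.

```python
def degree_tail_probabilities(edge_probs):
    """Return a list tail where tail[k] = Pr[deg(v) >= k].

    edge_probs holds the existence probability of each edge incident to v,
    with edges assumed independent.
    """
    dist = [1.0]  # dist[i] = Pr[exactly i of the edges seen so far exist]
    for p in edge_probs:
        new = [0.0] * (len(dist) + 1)
        for i, q in enumerate(dist):
            new[i] += q * (1.0 - p)   # edge absent
            new[i + 1] += q * p       # edge present
        dist = new
    tail = [0.0] * len(dist)
    running = 0.0
    for i in range(len(dist) - 1, -1, -1):
        running += dist[i]
        tail[i] = running
    tail[0] = 1.0  # a degree of at least 0 is certain; guards against rounding
    return tail


def eta_degree(edge_probs, eta):
    """Largest k such that Pr[deg(v) >= k] >= eta."""
    tail = degree_tail_probabilities(edge_probs)
    return max(k for k, t in enumerate(tail) if t >= eta)


# Hypothetical vertex with two incident edges, each present with probability 0.5.
print(degree_tail_probabilities([0.5, 0.5]))
```

The dynamic program costs time quadratic in the number of incident edges per vertex, which hints at why the slide calls this step non-trivial compared with reading off a plain degree.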

  18. Experiments • [Figures: results for influence maximization and for task-driven team formation]

  19. Fast Influence-based Coarsening for Large Networks Manish Purohit^, B. Aditya Prakash*, Chanhyun Kang^, Yao Zhang*, V S Subrahmanian^ (*Virginia Tech, ^University of Maryland) KDD, New York City, August 26, 2014

  20. Networks are getting huge! • Flickr (friendship network): 87 million users and 8 billion photos until 2013 • Amazon (friendship network): 237 million accounts until 2013 • Facebook (friendship network): 829 million daily active users on average in June 2014 • Twitter (follower network): 271 million monthly active users Purohit, Prakash, Kang, Zhang, Subrahmanian 2014

  21. Need for fast analysis • Ever-growing list of applications of network effects • Viral Marketing • Immunization • Information Diffusion • … • However, scaling traditional algorithms up to millions of nodes is hard

  22. How to handle large-scale networks • Approaches • Use faster / simpler algorithms • Perform analysis locally • i.e., divide the large network into smaller subgraphs • Zoom out of the network to obtain a smaller representation of it (this paper)

  23. Bird’s eye view of a network

  24. Bird’s eye view of a network • “Zoom out” of the graph to get a quick picture • [Figure: a big graph is zoomed out into a small representation of the network] • Called “coarsen” in this paper

  25. Outline • Motivation • Challenges • Problem Definition • Our Proposed Method • Experiments • Applications • Conclusion

  26. Challenges • C1: How do we maintain diffusive characteristics when coarsening networks? • C2: How do we merge nodes to get the coarse network? • C3: How do we find the best nodes to merge quickly?

  27. C1: Information Diffusion • Cascading behavior in networks • [Figure: a blog network (blogs, posts, links) and the resulting information cascade. Source: McGlohon et al., SDM 2007] • A diffusion is a graph induced by a time-ordered propagation of information (edges)

  28. C1: Model information diffusion • Information spreads over networks • e.g., a rumor or meme spreads over the Twitter follower network • Independent cascade model (IC) [Kempe+, KDD03] • Weights p_ij: propagation probability from i to j • Each node has only one chance to infect its neighbors • [Figure: meme spreading]
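A single Independent Cascade run can be sketched as a breadth-first traversal in which every edge is tried at most once. The dictionary layout `prob[i][j] = p_ij`, the function names, and the Monte Carlo estimator are assumptions made for this illustration.

```python
import random
from collections import deque


def simulate_independent_cascade(prob, sources, rng):
    """Run one IC cascade; prob[i][j] is the propagation probability from i to j."""
    infected = set(sources)
    frontier = deque(sources)
    while frontier:
        i = frontier.popleft()
        # Node i gets exactly one chance to infect each healthy neighbor j.
        for j, p in prob.get(i, {}).items():
            if j not in infected and rng.random() < p:
                infected.add(j)
                frontier.append(j)
    return infected


def estimate_spread(prob, sources, runs=1000, seed=0):
    """Average cascade size over repeated simulations (Monte Carlo estimate)."""
    rng = random.Random(seed)
    total = sum(
        len(simulate_independent_cascade(prob, sources, rng)) for _ in range(runs)
    )
    return total / runs


# Hypothetical toy network: a always infects b, and b never infects c.
toy_prob = {"a": {"b": 1.0}, "b": {"c": 0.0}}
print(estimate_spread(toy_prob, ["a"]))
```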

  29. C1: Diffusive characteristics • The first eigenvalue λ1 (of the adjacency matrix) is enough for most diffusion models (Prakash et al. [ICDM’12]) • λ1 determines the epidemic threshold • [Figure: three example networks labeled “Safe”, “Vulnerable” and “Deadly”: increasing λ1, increasing vulnerability]
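For small examples, λ1 can be estimated with power iteration. The sketch below assumes an undirected graph stored as a dense, symmetric, non-negative adjacency matrix (a list of lists), and it iterates on A + I so that bipartite graphs such as stars do not oscillate; the function name and the toy graphs are assumptions for this illustration.

```python
def leading_eigenvalue(adj, iterations=200):
    """Estimate the largest eigenvalue of a symmetric non-negative matrix."""
    n = len(adj)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        # One power-iteration step with the shifted matrix A + I.
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
        # Rayleigh quotient x^T A x (x has unit length) as the current estimate.
        ax = [sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = sum(x[i] * ax[i] for i in range(n))
    return lam


# Hypothetical toy graphs: a triangle and a star with three leaves.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(leading_eigenvalue(triangle), leading_eigenvalue(star))
```

On large sparse graphs one would use a sparse eigensolver instead of dense lists, but the quantity being tracked is the same.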

  30. C1: Maintain diffusive characteristics • Goal: maintain the diffusive characteristics of the original network in the coarsened network • Make the coarsened network have the least change in the first eigenvalue • [Figure: the original network is coarsened into the coarsened network]

  31. C2: How to merge nodes • Goal: merge nodes of graph G to get the coarsened graph that “approximates” G with respect to diffusion • [Figure: original network] • Merging b and a can get the least change of λ1 • Influence from d to b: 0.5 • Influence from d to a: 0.25 • Average: 0.375! • Is this correct?

  32. Details • C2: How to merge nodes • In general: merging a, b

  33. C3: Which nodes to merge • Goal: • Find the best nodes to merge • Fast, scalable to large networks • (Discussed later) • [Figure: the original network is coarsened into the coarsened network]

  34. Outline • Motivation • Challenges • Problem Definition • Our Proposed Method • Experiments • Applications • Conclusion

  35. Problem Definition • Graph Coarsening Problem (GCP) • Given: a large graph G(V, E) and a reduction factor α • Find: the best set of edges to merge • Such that: |λ_G - λ_H| is minimized • (i.e., H is the coarsened graph with the least change in the first eigenvalue)

  36. Naive Greedy Heuristic • Steps: • Score every edge by the change in eigenvalue • Greedily choose the edge (a,b) with the least score, and merge (a,b) • Re-evaluate the scores of every edge and repeat • Too slow! O(m^2) time to score all edges • Loses the time benefit of analyzing the smaller graph

  37. Outline • Motivation • Problem Definition • Challenges • Our Proposed Method • CoarseNet • Experiments • Applications • Conclusion

  38. CoarseNet: idea • Can we approximate the edge scores faster? • Yes! • Use matrix perturbation arguments to estimate (up to first-order terms) the score of an edge in constant time! • Score all edges in O(m) time • Naive heuristic: O(m^2) time
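The kind of first-order estimate referred to here can be illustrated with the standard perturbation result for a simple eigenvalue. The symbols below are assumptions for this note: A is the adjacency matrix, ΔA is the change caused by a merge, and u and v are the left and right eigenvectors of A for λ1. The slide does not show how the paper turns this into a constant-time score per edge, so this is only the generic starting point.

```latex
\lambda_1(A + \Delta A) \;\approx\; \lambda_1(A) + \frac{u^{\top}\,\Delta A\,v}{u^{\top} v}
```

Since u and v are computed once for the original graph, the estimated change for a candidate edge depends only on the entries of ΔA, which avoids recomputing an eigenvalue for every candidate merge.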
