Efficient Densest Subgraph Computation in Evolving Graphs Alessandro Epasto Joint work with Silvio Lattanzi (Google Research, NY) and Mauro Sozio (Télécom ParisTech)

Social Networks are Constantly Evolving
[Figure, four animation frames. Nodes shown: Brutus, Julius; then Julius, Cleopatra, Brutus; then Cleopatra, Brutus; then Cleopatra, Brutus, Mark Anthony.]

Events in Social Media Streams
• WWW2015 conference will be held in Florence.
• Hofmann confirmed keynote at WWW2015 in Florence.
• WWW2015 opens May 20 in Florence.
Dense subgraphs represent events!

Event Detection

Dynamic Community Detection Algorithms
Most algorithms assume a single static graph as input.
Naive solution: rerun the algorithm after each update.
GOAL: efficiently keep track of the communities as the graph evolves.

Densest Subgraph
[Figure: a graph with a highlighted subgraph H. Density of H = 3/4. A second frame highlights the densest subgraph H.]

Densest Subgraph in Static Graphs
• A notion of community used in social networks, the Web and biology.
• Polynomial-time exact algorithm (Goldberg, 1984).
• (2+eps)-approximation MapReduce algorithm (Bahmani et al., 2012).
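A minimal sketch of the quantity being optimized, assuming the usual definition of density as induced edges divided by nodes (this matches the 3/4 example above, but the slides do not spell the formula out). The function names and the toy graph are hypothetical, and the exhaustive search is only meant for tiny inputs; it is not Goldberg's algorithm.

```python
from itertools import combinations

def density(nodes, edges):
    """Density of the subgraph induced by `nodes`: induced edges / nodes."""
    nodes = set(nodes)
    if not nodes:
        return 0.0
    induced = sum(1 for u, v in edges if u in nodes and v in nodes)
    return induced / len(nodes)

def densest_subgraph_bruteforce(edges):
    """Try every non-empty node subset (exponential time, tiny graphs only)."""
    all_nodes = sorted({x for e in edges for x in e})
    best_set, best_density = set(), 0.0
    for size in range(1, len(all_nodes) + 1):
        for subset in combinations(all_nodes, size):
            d = density(subset, edges)
            if d > best_density:  # strict: keep the first (smallest) maximizer
                best_set, best_density = set(subset), d
    return best_set, best_density

# Hypothetical toy graph: a triangle a-b-c plus a pendant node d.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
PATH = [("a", "b"), ("b", "c"), ("c", "d")]  # 3 edges on 4 nodes
```

On a 4-node path the density is 3/4, and on the toy graph the search returns the triangle with density 1.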

Densest Subgraph in Dynamic Graphs
No results known* in dynamic graphs with sublinear update time (before our publication).
Naive approach: O(m + n) time per update!
* Bhattacharya et al. - to appear in STOC 2015. Strong guarantees in the streaming model.

Our Problem
Goal: preserve a 2+eps approximation with average time O(poly-log(n+m)) per update.
Notice: much better than O(n+m) per update, and it includes the output time!

Our Dynamic Graph Model
Start from an empty graph.
An arbitrarily long sequence of edge updates arrives: (A, B), (B, C), (A, B), …
This also models node additions and removals implicitly.

Incremental and Fully-Dynamic
INCREMENTAL: arbitrary stream of edge additions only. Example: (A, B), (B, C).
FULLY-DYNAMIC: stream of arbitrary edge additions and random edge deletions. Example: (A, B), (B, C), (A, B).

Our Goal
Design a data structure with two operations:
1) AddEdge(u,v)
2) RemoveEdge(u,v)
Both operations can output a new densest subgraph S or nothing.
Invariant: the last subgraph in output is a 2+eps approx. for the current graph.
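To make the interface concrete, here is a sketch of the naive baseline mentioned earlier: it reruns a static peeling pass after every update, so each operation costs time linear in the graph size. The class name `NaiveDensest`, the helper `peel`, and the method names `add_edge` / `remove_edge` are hypothetical stand-ins for AddEdge / RemoveEdge; this is not the authors' data structure.

```python
def peel(adj, eps):
    """Static peeling: repeatedly drop nodes with degree < (1+eps) * avg degree,
    and remember the densest intermediate node set."""
    alive = set(adj)
    best, best_density = set(), 0.0
    while alive:
        deg = {u: len(adj[u] & alive) for u in alive}
        avg_deg = sum(deg.values()) / len(alive)
        if avg_deg / 2 > best_density:  # density = edges / nodes = avg_deg / 2
            best, best_density = set(alive), avg_deg / 2
        threshold = (1 + eps) * avg_deg
        removed = {u for u in alive if deg[u] < threshold}
        if not removed:  # only isolated nodes are left (or eps == 0)
            break
        alive -= removed
    return best, best_density


class NaiveDensest:
    """Recompute from scratch on every update (the O(m + n) baseline)."""

    def __init__(self, eps=0.1):
        self.eps = eps
        self.adj = {}         # node -> set of neighbours
        self.current = set()  # last subgraph that was output

    def _refresh(self):
        best, _ = peel(self.adj, self.eps)
        if best != self.current:
            self.current = best
            return best       # output a new subgraph
        return None           # output nothing

    def add_edge(self, u, v):
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)
        return self._refresh()

    def remove_edge(self, u, v):
        self.adj.get(u, set()).discard(v)
        self.adj.get(v, set()).discard(u)
        for x in (u, v):      # nodes disappear implicitly with their last edge
            if x in self.adj and not self.adj[x]:
                del self.adj[x]
        return self._refresh()
```

Each call returns either a new node set or `None`, mirroring "a new densest subgraph S or nothing".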

Result for edge additions (incremental)
Theorem: We maintain a 2+eps approx. in O(log^2(n) / eps^2) average time and linear space.
Significant improvement over the naive approach: O(m+n) average time.

Result for edge additions and deletions (fully dynamic)
Theorem: We maintain a 2+eps approx. in O(log^4(n) / eps^4) average time and linear space.
Very fast also in practice!

Roadmap
• Review Bahmani et al. for static graphs.
• A new static graph algorithm.
• Incremental algorithm.
• Randomized fully-dynamic algorithm.

Static Case - Bahmani et al. Algorithm
Input: graph G0. Let eps > 0.
Iteration 1:
1) Compute Avg. Deg = K
2) Let T = K (1+eps). In the example, T = 2.3.
3) Remove nodes with degree < T

Static Case - Bahmani et al. Algorithm
Iteration 2, on the graph G1 left after iteration 1 (where T = 2.3):
1) Compute Avg. Deg = K
2) Let T = K (1+eps). In the example, T = 3.2.
3) Remove nodes with degree < T

Static Case - Bahmani et al. Algorithm
Iterate until all nodes are removed. Output the densest subgraph Gi.
[Figure: graphs G0, G1 and G2, with thresholds T = 2.3 and T = 3.2.]
Theorem (Bahmani et al.): 2+eps approx. in log(n) steps.
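The three steps above can be sketched as follows. This is an in-memory illustration of the peeling loop as the slides describe it (threshold T = K (1+eps), remove nodes with degree < T), not the MapReduce implementation; the function name `bahmani_peeling` and the example graph are hypothetical.

```python
def bahmani_peeling(edges, eps):
    """Peel nodes with degree < (1+eps) * average degree until none are left.

    Returns (best_nodes, best_density, thresholds): the densest graph Gi seen,
    its density (edges / nodes), and the threshold T used in each iteration.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    alive = set(adj)                      # node set of the current graph Gi
    best_nodes, best_density = set(), 0.0
    thresholds = []
    while alive:
        deg = {u: len(adj[u] & alive) for u in alive}
        avg_deg = sum(deg.values()) / len(alive)   # step 1: K
        if avg_deg / 2 > best_density:             # density = K / 2
            best_nodes, best_density = set(alive), avg_deg / 2
        t = (1 + eps) * avg_deg                    # step 2: T = K (1+eps)
        thresholds.append(t)
        removed = {u for u in alive if deg[u] < t} # step 3
        if not removed:   # happens only if no edges remain (or eps == 0)
            break
        alive -= removed
    return best_nodes, best_density, thresholds

# Hypothetical example: a 4-clique on {1, 2, 3, 4} with a tail 4-5-6.
EXAMPLE = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
nodes, dens, ts = bahmani_peeling(EXAMPLE, eps=0.1)
```

On this example the first iteration removes the tail (nodes 5 and 6), the second removes the clique, and the clique, with density 6/4 = 1.5, is the densest Gi.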

Towards a Dynamic Algorithm
• Idea: store the graphs Gi.
• When an edge is added, update the Gi's. This ensures a 2+eps approximation!
[Figure, three animation frames over graphs G0 and G1 with nodes u and v. Frame 1: thresholds T = 2.3 and T = 3.2. Frame 2: "Deg > 2.3". Frame 3: "Chain effect!", thresholds T = 2.6 and T = 4.0.]

Idea: fix Threshold T for all iterations
• Use the same threshold T at each iteration.
• Easier to analyze and maintain.
For the correct threshold T: same approximation as Bahmani et al.'s algorithm.
[Figure callout: "You'd better use T = 3.1"]
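A sketch of peeling with one fixed threshold, under the assumption that a node survives an iteration when its degree in the current graph is at least T. The function name `peel_fixed_threshold`, the `max_rounds` parameter, and the example graph are hypothetical. One fact that makes a fixed T easy to reason about: if the process stabilizes on a non-empty node set, every surviving node has degree at least T, so that set has density at least T/2.

```python
def peel_fixed_threshold(edges, threshold, max_rounds):
    """Peel with the same threshold in every iteration.

    Returns (levels, emptied): the nested node sets G0, G1, ... and whether
    the last one is empty.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    levels = [set(adj)]
    for _ in range(max_rounds):
        alive = levels[-1]
        if not alive:
            break
        survivors = {u for u in alive if len(adj[u] & alive) >= threshold}
        if survivors == alive:   # stable: min degree >= threshold
            break
        levels.append(survivors)
    return levels, not levels[-1]

# Hypothetical example: a 4-clique on {1, 2, 3, 4} with a tail 4-5-6.
EXAMPLE = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
levels_3, emptied_3 = peel_fixed_threshold(EXAMPLE, threshold=3, max_rounds=10)
levels_4, emptied_4 = peel_fixed_threshold(EXAMPLE, threshold=4, max_rounds=10)
```

With T = 3 the peeling stops at the 4-clique (density 1.5, which is at least 3/2); with T = 4 the graph empties after two rounds.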

Moving Threshold (Only Additions)
1) Set T = 1 to compute densest subgraph H and output it.
2) Maintain the Gi's using threshold T as long as all nodes are removed in O(log(n)) steps.
3) Repeat from 1) with higher threshold T = T * 2.
This provides a 2+eps approx. in O(poly-log(n)) average time.
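The control flow of the three steps can be illustrated as below. This is a deliberately simplified sketch: it re-peels the whole graph on every addition instead of maintaining the Gi's incrementally, so it makes no claim to the poly-log update time or to the exact approximation guarantee stated above. The class name `MovingThreshold`, the helper `static_peel`, and the round budget of ceil(log base (1+eps) of n) are assumptions.

```python
import math

def static_peel(adj, eps):
    """Static peeling with T = (1+eps) * average degree; returns the densest Gi."""
    alive, best, best_density = set(adj), set(), 0.0
    while alive:
        deg = {u: len(adj[u] & alive) for u in alive}
        avg_deg = sum(deg.values()) / len(alive)
        if avg_deg / 2 > best_density:
            best, best_density = set(alive), avg_deg / 2
        removed = {u for u in alive if deg[u] < (1 + eps) * avg_deg}
        if not removed:
            break
        alive -= removed
    return best


class MovingThreshold:
    def __init__(self, eps=0.1):
        self.eps = eps
        self.adj = {}
        self.threshold = 1.0   # step 1: start with T = 1
        self.output = set()    # last subgraph H that was output

    def _empties(self):
        """Does fixed-threshold peeling remove all nodes within the budget?"""
        n = len(self.adj)
        budget = math.ceil(math.log(n, 1 + self.eps)) if n > 1 else 1
        alive = set(self.adj)
        for _ in range(budget):
            alive = {u for u in alive
                     if len(self.adj[u] & alive) >= self.threshold}
            if not alive:
                return True
        return False

    def add_edge(self, u, v):
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)
        overflow = False
        while not self._empties():   # step 2 fails: the threshold is too low
            self.threshold *= 2      # step 3: T = T * 2
            overflow = True
        if overflow:                 # back to step 1: recompute H, output it
            self.output = static_peel(self.adj, self.eps)
            return self.output
        return None
```

Feeding the edges (1, 2), (2, 3), (1, 3), (3, 4) in this order outputs {1, 2}, nothing, {1, 2, 3}, nothing, and leaves the threshold at 4.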

Fully-Dynamic Case
The analysis is significantly harder:
• The density can increase/decrease in complex patterns…
• …but the densest subgraph is stable under random removals.
• We exploit this stability to recompute the subgraph only a few times.

Experimental Evaluation - Datasets
• DBLP & Patent: co-authorship graphs.
• LastFM: songs co-listened.
• Yahoo! Answers: more than 1 billion edges. There is an edge if two users answer the same question.

Evolution Densest Subgraph
[Figure: density and size of the densest subgraph over time, 1970 to 2010. DBLP, sliding window of 5 years.]

Evolution Densest Subgraph
[Figure: density and size of the densest subgraph over time, 1975 to 1995. Patent Citations, sliding window of 5 years.]

Evolution Densest Subgraph
[Figure: density and size of the densest subgraph over time, 0 to 2.5e+09. Yahoo Answers, sliding window of 100M edges.]
Efficient in highly dynamic datasets with billions of updates.

Update Time vs Epsilon
[Figure: Avg. Time per Update vs Epsilon, in microseconds, for epsilon = 0.5, 0.3, 0.1, 0.05. Datasets: dblp, patent-coaut, patent-cit, lastfm, yahoo.]
Scales much better with epsilon than the worst case.

Comparison with Static Algorithm
[Figure: Avg. Time per Update vs K, in microseconds (log scale). Series: Our Algorithm, K=100000, K=10000, K=1000. Datasets: dblp, patent-coaut, patent-cit, lastfm.]

Comparison With Static Algorithm
[Figure: Max Relative Error of the Static Algorithm vs K (log scale). Series: 100000, 10000, 1000. Datasets: dblp, patent-coaut, patent-cit, lastfm.]

Conclusions and Future Work
• It is possible to maintain the densest subgraph efficiently in dynamic graphs.
• Future work: can recent techniques (Bhattacharya et al.) be used to obtain a 2+eps approximation with adversarial removals?
• Top-k Densest Subgraph in Dynamic Graphs.

Thank you for your attention

Recent Results - STOC
Concurrently with our work, Bhattacharya et al., STOC 2015, introduced a novel streaming algorithm for densest subgraph with strong guarantees.
• Different model: update vs query time.
• Strong space constraints (cannot store the entire graph).
• Adversarial additions and deletions.
• 4+eps approx. with O(n poly log) space, O(poly log) update time, O(n) query time.
• 2+eps approx. with O(n poly log) space, higher time complexity.

Incremental Case: Only Additions

Density vs Epsilon
[Figure: Maximum Density vs Epsilon, for epsilon = 0.5, 0.3, 0.1, 0.05. Datasets: dblp, patent-coaut, patent-cit, lastfm, yahoo (LastFm and Yahoo on a separate density scale).]
Max density is stable with different epsilons.

Analysis of the Algorithm
We divide the edge additions into rounds: Round 1, Round 2, …, Round i.
[Figure: timeline. Each round consists of a run of the static algorithm, which outputs H, followed by a sequence of Add operations that ends with an overflow. At each overflow, T <- T(1+eps).]

Densest Subgraph - LP Primal

Definitions
We say that an algorithm is an a-approximation for the densest subgraph problem, for a > 1, if it outputs a graph with density at least OPT / a.
We say that an operation has amortized time T if, for any sequence of k update operations, the total time is O(k T).
