Spreading Rumours without the Network

Spreading Rumours without the Network Alessandro Epasto, P. Brach*, A. Panconesi°, P. Sankowski*. *U. of Warsaw °Sapienza U. Rome

Rumour Spreading Diffusive processes on graphs are an important paradigm in several fields: • Systems: How to spread information on a network? • Social Networks: Why do posts become viral? • Sociology: What makes innovations/products accepted? • Epidemiology: How do diseases spread? We consider various models of information diffusion: Push, Pull and SIR.

Background Most known results are asymptotic bounds on the completion time: • At most O(n log n) (Feige et al., 1990) • Fast in Erdős–Rényi and Preferential Attachment graphs (Elsässer et al. 2006, Chierichetti et al. 2009) • Fast in high-conductance graphs (Chierichetti et al. 2010, Giakkoupis et al. 2011)

Our Goal Goal #1: Beyond asymptotics. We are interested in the expected number of informed nodes at each time step of the process. [Plot: number of informed nodes at each time step] Notice: this is known only for very simple graphs (e.g. the clique, Pittel ’87).

Our Goal Goal #2: Prediction with limited information. Motivation: real networks are often unavailable. [Plot: number of informed nodes at each time step] Caveat: this is clearly an ill-posed question… but, surprisingly, it is possible for real social networks.

How Can We Achieve This? A simpler problem: model the unknown graph by a known random graph generation process. [Plot: the prediction obtained from the random graph model, shown against the curve for the real graph]

Which Graph Model? We use the configuration model as the random graph model. SIR on the configuration model matches real post diffusions on Twitter (Goel et al., 2013): • Distribution of popularity of posts. • Virality of the diffusion.

Our Contribution A predictor algorithm for the configuration model for the Push, Pull and SIR processes: • Space efficient: very large graphs can fit in memory. • Provably exact on random graphs. The algorithm accurately predicts both the popularity and the virality on real social networks.

Outline of the Talk • The diffusion processes; • Our algorithm(s); • Experimental evaluation; • Conclusions.

The Push-Pull Process

Push-Pull Protocol PUSH

Push-Pull Protocol PULL

SIR Process SIR

Our Algorithm

Naive Solution Simulate two random processes: the network generation and the rumour spreading. Naive algorithm: • Generate a random network G. • Simulate rumour spreading on G. • Run several times in parallel and average. Space bottleneck: Real networks are too large to fit in main memory!
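The naive algorithm above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes an undirected configuration model built by uniform stub matching (self-loops and multi-edges allowed), a synchronous Push process started from a uniformly random source, and sequential rather than parallel runs. The function names `configuration_model`, `simulate_push` and `naive_predictor` are hypothetical.

```python
import random


def configuration_model(degrees, rng):
    """Pair up degree stubs uniformly at random (self-loops and multi-edges allowed)."""
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    adj = [[] for _ in degrees]
    # Consecutive stubs in the shuffled list form an edge; an odd leftover stub is dropped.
    for i in range(0, len(stubs) - 1, 2):
        u, v = stubs[i], stubs[i + 1]
        adj[u].append(v)
        adj[v].append(u)
    return adj


def simulate_push(adj, source, rounds, rng):
    """Synchronous Push: every informed node tells one uniformly random neighbour per round."""
    informed = {source}
    counts = [1]
    for _ in range(rounds):
        newly = set()
        for u in informed:
            if adj[u]:
                newly.add(rng.choice(adj[u]))
        informed |= newly
        counts.append(len(informed))
    return counts


def naive_predictor(degrees, rounds, runs, seed=0):
    """Average the number of informed nodes per round over several graph/process samples."""
    rng = random.Random(seed)
    totals = [0] * (rounds + 1)
    for _ in range(runs):
        adj = configuration_model(degrees, rng)  # stores all m edges: the space bottleneck
        source = rng.randrange(len(degrees))
        for t, c in enumerate(simulate_push(adj, source, rounds, rng)):
            totals[t] += c
    return [x / runs for x in totals]
```

Note that `adj` holds every edge of the sampled graph, which is exactly the memory cost the slide identifies as the bottleneck.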

Our Approach We can reduce the space to O(n), versus O(n+m), in directed graphs and even to o(n) in undirected ones. This is a significant reduction in practice, not only asymptotically! Deferred decision principle: the topology is discovered as nodes become involved in the rumour spreading process and is immediately forgotten.

Intuition Only the local neighbourhood determines the evolution of the process. [Figure: a node v labelled with its number of informed out-neighbours and its number of informed in-neighbours] We do not store the edges of the graph.
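One way to picture the deferred decision principle is the sketch below, written for Push on a directed configuration model. It is an illustration under stated assumptions, not the algorithm from the talk: out-stubs are matched to in-stubs only when an informed node first uses them, and the only state kept is two counters per node (O(n) integers, no edge list). The function `push_deferred` and its variable names are hypothetical, and the linear-time weighted draw is chosen for brevity, not speed.

```python
import random


def push_deferred(in_deg, out_deg, source, rounds, seed=0):
    """Simulate Push on a directed configuration model without storing any edge."""
    assert sum(in_deg) == sum(out_deg), "in- and out-stubs must match in number"
    rng = random.Random(seed)
    n = len(in_deg)
    free_in = list(in_deg)   # in-stubs of each node not yet matched
    revealed_out = [0] * n   # out-stubs of each node already matched
    informed = [False] * n
    informed[source] = True
    informed_list = [source]
    counts = [1]
    for _ in range(rounds):
        newly = []
        for u in informed_list:
            if out_deg[u] == 0:
                continue
            # u picks one of its out-stubs uniformly at random.
            if rng.randrange(out_deg[u]) < revealed_out[u]:
                # Stub matched in an earlier round: its target is already informed,
                # so the identity of the target does not need to be remembered.
                continue
            # Fresh stub: match it to a uniformly random free in-stub, i.e. pick
            # a node with probability proportional to its free in-stubs.
            w = rng.choices(range(n), weights=free_in)[0]
            free_in[w] -= 1
            revealed_out[u] += 1
            if not informed[w]:
                informed[w] = True
                newly.append(w)
        informed_list.extend(newly)
        counts.append(len(informed_list))
    return counts
```

Averaging `push_deferred` over several seeds gives a prediction curve comparable to the naive approach, while the memory used depends only on the number of nodes.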

Undirected Graphs We use an efficient matrix representation. Low-degree nodes are stored in a K x K matrix; high-degree nodes are stored individually.

Undirected Graphs
Graph        Nodes   Matrix size   Saving in space
Livejournal  5M      176           98%
Facebook     720M    <5000         >97% (estimates)
For power-law graphs of exponent α the cost is $n^{2/(1+\alpha)}$. In practice the entire Facebook graph could fit in a few gigabytes.

Results on Random Graphs

Results on Random Graphs [Plot: number of privy nodes against time, actual process vs prediction] The model prediction is perfect. This can be proved formally.

Results on Real Graphs

Social Networks - Push [Plot: Slashdot, number of privy nodes against time, actual process vs prediction] The model is qualitatively accurate for the social networks we tested.

More Social Networks - Push [Plot: Livejournal, number of privy nodes against time, actual process vs prediction]

More Social Networks - Push [Plot: DBLP, number of privy nodes against time, actual process vs prediction]

Non-Social Networks - Push [Plot: Web Stanford] For non-social networks the prediction is not accurate.

Results Prediction performance strongly depends on the network class: • Very good for social networks: friendship graphs, trust networks, collaboration networks. • Poor for non-social networks: web graphs, road networks, etc. This dichotomy has been observed in other contexts: degree correlations, graph compressibility, etc. What is the reason for this phenomenon?

Neighbourhood Function The neighbourhood function F(t) of a graph measures how many pairs of nodes are at distance <= t. This measure has been shown to tell apart social and non-social graphs.
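As a small illustration of the definition, the sketch below computes F(t) exactly with one breadth-first search per node. It assumes an adjacency-list graph and counts ordered pairs of distinct nodes; the slide does not state which counting convention the authors use, and the function name `neighbourhood_function` is hypothetical. For graphs of the sizes discussed in the talk an exact all-pairs computation like this would be too slow, so it is meant for small examples only.

```python
from collections import deque


def neighbourhood_function(adj, max_t):
    """Return [F(0), ..., F(max_t)], where F(t) counts ordered pairs (u, v),
    u != v, whose shortest-path distance is at most t."""
    at_distance = [0] * (max_t + 1)  # at_distance[d] = pairs at distance exactly d
    for s in range(len(adj)):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            if dist[u] == max_t:
                continue  # do not explore beyond the requested radius
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    at_distance[dist[v]] += 1
                    queue.append(v)
    # Turn the exact-distance counts into cumulative counts.
    result, total = [], 0
    for c in at_distance:
        total += c
        result.append(total)
    return result


# Path graph 0 - 1 - 2: four ordered pairs at distance 1, two more at distance 2.
print(neighbourhood_function([[1], [0, 2], [1]], 2))  # [0, 4, 6]
```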

Neighbourhood F. vs Prediction Quality [Panels: Slashdot neighbourhood function; Slashdot prediction, SIR] Social graphs have a neighbourhood function close to that of the configuration model.

Neighbourhood F. vs Prediction Quality [Panels: Web graph neighbourhood function, number of nodes against distance, actual graph vs configuration model; Web graph prediction (SIR), number of infected nodes against time, actual process vs prediction] Non-social graphs have a neighbourhood function far from that of the configuration model.

Neighbourhood F. vs Prediction Quality [Plot: correlation between neighbourhood function distance (L2/n norm) and prediction error (MAPE), with linear fits for SIR and PUSH] The correlation is strong and statistically significant.
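The two quantities on the axes of that plot could be computed along the following lines. This is a sketch under assumptions: the slide names the measures (MAPE and an L2/n norm) but not their exact definitions, so the code uses the standard mean absolute percentage error and a Euclidean distance between the two neighbourhood-function vectors divided by the number of nodes n. The function names `mape` and `l2_over_n` are hypothetical.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error between two curves, skipping zero actual values."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def l2_over_n(f_real, f_model, n):
    """Euclidean distance between two neighbourhood functions, divided by n (assumed normalisation)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_real, f_model))) / n


# Prediction off by 10% at both time steps.
print(mape([100, 200], [110, 180]))
# Neighbourhood functions differing only at the last distance, on a 3-node graph.
print(l2_over_n([0, 4, 6], [0, 4, 3], 3))
```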

Conclusion • Rumour spreading processes can be predicted accurately in social graphs based on very limited information about the graph. • Our predictor is provably correct and space efficient. • We characterise the class of graphs that can be predicted, based on the Neighbourhood Function. • We would like to extend our model to more nuanced diffusion processes.

Thank you for your attention!
