Scalable Influence Maximization for Prevalent Viral Marketing in Large-Scale Social Networks
Wei Chen, Microsoft Research Asia
In collaboration with Chi Wang (University of Illinois at Urbana-Champaign) and Yajun Wang (Microsoft Research Asia)
KDD'10, July 27, 2010

Outline
• Background and problem definition
• Maximum Influence Arborescence (MIA) heuristic
• Experimental evaluations
• Related work and future directions

Ubiquitous Social Networks

A Hypothetical Example of Viral Marketing
[Figure: the message "Avatar is great" repeated across the people in a social network]

Effectiveness of Viral Marketing
[Chart: level of trust in different types of ads, annotated "very effective". Source: Forrester Research and Intelliseek.]

The Problem of Influence Maximization
• Social influence graph
  • vertices are individuals
  • links are social relationships
  • the number p(u,v) on a directed link from u to v is the probability that v is activated by u after u is activated
• Independent cascade model
  • initially, some seed nodes are activated
  • at each step, each newly activated node u activates its neighbor v with probability p(u,v)
  • influence spread: the expected number of nodes activated
• Influence maximization: find k seeds that generate the largest influence spread
[Figure: example graph with link probabilities 0.3 and 0.1]
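The independent cascade model defined above can be illustrated with a small Monte Carlo simulation. This is only a sketch under stated assumptions: the graph representation (a dict of dicts), the function names `simulate_ic` and `influence_spread`, and the toy probabilities are hypothetical and do not come from the talk.

```python
import random


def simulate_ic(graph, seeds, rng):
    """Run one independent cascade and return the set of activated nodes.

    graph: dict mapping u -> {v: p(u, v)} for each directed link u -> v
           (an assumed representation, chosen for illustration).
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                # A newly activated u gets exactly one chance to activate v.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def influence_spread(graph, seeds, runs=10000, seed=0):
    """Estimate the expected number of activated nodes by averaging many runs."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs


# Hypothetical toy graph: u -> v (0.3), u -> w (0.1), v -> w (0.5).
toy = {"u": {"v": 0.3, "w": 0.1}, "v": {"w": 0.5}}
estimate = influence_spread(toy, {"u"}, runs=20000, seed=1)
```

For this toy graph the exact spread from seed u is 1 + 0.3 + (1 − 0.9 · 0.85) = 1.535, so the estimate should land close to that value; the result is random, so the match is only approximate.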

Research Background
• Influence maximization as a discrete optimization problem was proposed by Kempe, Kleinberg, and Tardos in KDD'2003
• Finding the optimal solution is provably hard (NP-hard)
• Greedy approximation algorithm: 63% approximation of the optimal solution
  • repeat k rounds: in the i-th round, select a node v that provides the largest marginal increase in influence spread
  • requires evaluating the influence spread of a given seed set, which is hard and slow
• Several subsequent studies improved the running time
• Serious drawback: very slow, not scalable: > 3 hrs on a 30k node graph for 50 seeds
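The greedy procedure described above can be sketched as follows, with Monte Carlo simulation standing in for the influence spread evaluation. It is a minimal illustration, not the implementation evaluated in the talk: the names `spread` and `greedy_seeds`, the dict-of-dicts graph, and the number of simulation runs are all assumptions.

```python
import random


def spread(graph, seeds, runs, rng):
    """Monte Carlo estimate of influence spread under independent cascade."""
    total = 0
    for _ in range(runs):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            u = frontier.pop()
            for v, p in graph.get(u, {}).items():
                # Each link is tried at most once, when u is first activated.
                if v not in active and rng.random() < p:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / runs


def greedy_seeds(graph, k, runs=2000, seed=0):
    """Pick k seeds, each round adding the node with the largest estimated gain."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    seeds = []
    for _ in range(k):
        base = spread(graph, seeds, runs, rng)
        best, best_gain = None, float("-inf")
        for v in sorted(nodes - set(seeds)):
            gain = spread(graph, seeds + [v], runs, rng) - base
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # fewer than k nodes in the graph
            break
        seeds.append(best)
    return seeds


# Hypothetical deterministic graph: a hub reaching three nodes, plus a pair.
demo = {"hub": {"a": 1.0, "b": 1.0, "c": 1.0}, "x": {"y": 1.0}}
chosen = greedy_seeds(demo, k=2, runs=20)
```

Every round re-estimates the spread for every candidate node, which is why this approach becomes slow on graphs with tens of thousands of nodes.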

Our Work
• Design new heuristics
  • MIA (maximum influence arborescence) heuristic for the general independent cascade model
  • 10^3 speedup: from hours to seconds (or days to minutes)
  • influence spread close to that of the greedy algorithm of [KKT'03]
• We also show that computing the exact influence spread given a seed set is #P-hard (counting hardness)
  • resolves an open problem in [KKT'03]
  • indicates the intrinsic difficulty of computing influence spread

Maximum Influence Arborescence (MIA) Heuristic I: Maximum Influence Paths (MIPs)
• For any pair of nodes u and v, find the maximum influence path (MIP) from u to v
• Ignore MIPs with too small probabilities (< parameter θ)
[Figure: example graph with nodes u and v and link probabilities 0.3 and 0.1]

MIA Heuristic II: Maximum Influence in- (out-) Arborescences
• Local influence regions
  • for every node v, all MIPs to v form its maximum influence in-arborescence (MIIA)
  • for every node u, all MIPs from u form its maximum influence out-arborescence (MIOA)
• These MIIAs and MIOAs can be computed efficiently using the Dijkstra shortest path algorithm
[Figure: example graph with nodes u and v and link probabilities 0.3 and 0.1]
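One way to see why Dijkstra applies: maximizing the product of link probabilities along a path is the same as minimizing the sum of −log p(u,v), so a maximum influence path is a shortest path under those weights. The sketch below builds an in-arborescence this way. The function name `miia`, the `theta` argument (the probability threshold for discarding weak paths), and the graph representation are assumptions for illustration.

```python
import heapq
import math


def miia(graph, v, theta):
    """Return {u: (next_hop, path_prob)} for every node u whose maximum
    influence path to v has probability >= theta (0 < theta <= 1).

    graph: dict u -> {w: p(u, w)}; node labels must be mutually comparable.
    next_hop is u's successor on its path toward v (None for v itself).
    """
    # Build reverse adjacency so the search can run backward from v.
    rev = {}
    for u, nbrs in graph.items():
        for w, p in nbrs.items():
            rev.setdefault(w, {})[u] = p

    limit = -math.log(theta)  # longest admissible distance in -log space
    dist, parent, done = {v: 0.0}, {v: None}, set()
    heap = [(0.0, v)]
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue
        done.add(x)
        for u, p in rev.get(x, {}).items():
            if p <= 0.0:
                continue  # a zero-probability link can never carry influence
            nd = d - math.log(p)
            if nd <= limit + 1e-12 and nd < dist.get(u, math.inf):
                dist[u], parent[u] = nd, x
                heapq.heappush(heap, (nd, u))
    return {u: (parent[u], math.exp(-dist[u])) for u in done}


# Hypothetical example: a reaches v best through b (0.5 * 0.5 = 0.25 > 0.1).
g = {"a": {"b": 0.5, "v": 0.1}, "b": {"v": 0.5}, "c": {"a": 0.1}}
tree = miia(g, "v", theta=0.05)
```

Here node c is left out because its best path to v has probability 0.025, below the threshold of 0.05. An out-arborescence could be built the same way by searching forward from u instead of backward from v.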

MIA Heuristic III: Computing Influence through the MIA structure
• Recursive computation of the activation probability ap(u) of a node u in its in-arborescence, given a seed set S
• Can be used in the greedy algorithm for selecting k seeds, but not efficient enough
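A minimal sketch of such a recursive computation, assuming the natural recursion on a tree: ap(u) = 1 if u is a seed, and otherwise ap(u) = 1 − ∏ (1 − ap(w) · p(w,u)) over the in-neighbors w of u in the arborescence, which gives 0 for a non-seed leaf. The function name `activation_probabilities` and the `in_nbrs` / `prob` data layout are hypothetical.

```python
def activation_probabilities(in_nbrs, prob, root, seeds):
    """Compute ap(u) for every node of an in-arborescence rooted at `root`.

    in_nbrs: dict node -> list of its in-neighbors inside the arborescence
    prob:    dict (w, u) -> p(w, u) for each arborescence link w -> u
    seeds:   set of seed nodes S
    """
    ap = {}

    def visit(u):
        # Probability that none of u's in-neighbors activates u.
        not_activated = 1.0
        for w in in_nbrs.get(u, []):
            visit(w)
            not_activated *= 1.0 - ap[w] * prob[(w, u)]
        ap[u] = 1.0 if u in seeds else 1.0 - not_activated

    visit(root)
    return ap


# Hypothetical arborescence: a -> b -> v and c -> v.
in_nbrs = {"v": ["b", "c"], "b": ["a"]}
prob = {("a", "b"): 0.5, ("b", "v"): 0.5, ("c", "v"): 0.2}
ap = activation_probabilities(in_nbrs, prob, "v", seeds={"a", "c"})
```

With seeds a and c, this gives ap(b) = 0.5 and ap(v) = 1 − (1 − 0.25)(1 − 0.2) = 0.4. The product form is exact on a tree because the subtrees feeding into a node are disjoint, so their activation events are independent.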

MIA Heuristic IV: Efficient Updates on Activation Probabilities
• If v is the root of a MIIA, and u is a node in the MIIA, then their activation probabilities have a linear relationship: ap(v) = α(v,u) · ap(u) + β(v,u)
• All α(v,u)'s in a MIIA can be recursively computed
  • time reduced from quadratic to linear
• If u is selected as a seed, its marginal influence increase to v is α(v,u) · (1 − ap(u))
• Summing up the above marginal influence over all nodes v, we obtain the marginal influence of u
• Select the u with the largest marginal influence
• Update α(v,w) for all w's that are in the same MIIAs as u
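The linear relationship can be checked numerically on a small tree. The sketch below assumes the form ap(v) = α · ap(u) + β, writes the slope as `alpha[u]`, and derives it by the chain rule down the arborescence: the slope at the root is 1, and for u with successor w it is alpha[w] · p(u,w) · ∏ (1 − ap(x) · p(x,w)) over the other in-neighbors x of w, or 0 when w is already a seed. All names are hypothetical, and this simple version recomputes each product instead of achieving the linear-time bound stated on the slide.

```python
def activation_probabilities(in_nbrs, prob, root, seeds):
    """ap(u) on an in-arborescence: 1 for seeds, else 1 - prod(1 - ap(w) p(w,u))."""
    ap = {}

    def visit(u):
        miss = 1.0
        for w in in_nbrs.get(u, []):
            visit(w)
            miss *= 1.0 - ap[w] * prob[(w, u)]
        ap[u] = 1.0 if u in seeds else 1.0 - miss

    visit(root)
    return ap


def linear_coefficients(in_nbrs, prob, root, seeds, ap):
    """Slope alpha[u] of ap(root) as a linear function of ap(u), top-down."""
    alpha = {root: 1.0}
    stack = [root]
    while stack:
        w = stack.pop()
        for u in in_nbrs.get(w, []):
            if w in seeds:
                alpha[u] = 0.0  # w is active regardless of what u does
            else:
                others = 1.0
                for x in in_nbrs[w]:
                    if x != u:
                        others *= 1.0 - ap[x] * prob[(x, w)]
                alpha[u] = alpha[w] * prob[(u, w)] * others
            stack.append(u)
    return alpha


# Hypothetical arborescence: a -> b -> v and c -> v, with c already a seed.
in_nbrs = {"v": ["b", "c"], "b": ["a"]}
prob = {("a", "b"): 0.5, ("b", "v"): 0.5, ("c", "v"): 0.2}
seeds = {"c"}
ap = activation_probabilities(in_nbrs, prob, "v", seeds)
alpha = linear_coefficients(in_nbrs, prob, "v", seeds, ap)

# Predicted gain at the root if a becomes a seed, versus full recomputation.
predicted = alpha["a"] * (1.0 - ap["a"])
actual = activation_probabilities(in_nbrs, prob, "v", seeds | {"a"})["v"] - ap["v"]
```

In this example both `predicted` and `actual` come out to 0.2, which is the point of the linear coefficients: the gain from a candidate seed can be read off without recomputing the whole arborescence.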

MIA Heuristic IV: Summary
• Iterate the following two steps until k seeds are found
  • select the node u giving the largest marginal influence
  • update MIAs (linear coefficients) after selecting u as the seed
• Key features:
  • updates are local, and linear in the arborescence size
  • tunable with parameter θ: tradeoff between running time and influence spread

Experiment Results on MIA Heuristic: influence spread vs. seed set size
NetHEPT dataset:
• collaboration network from physics archive
• 15K nodes, 31K edges
Epinions dataset:
• who-trust-whom network of Epinions.com
• 76K nodes, 509K edges
Weighted cascade model:
• influence probability to a node v = 1 / (# of in-neighbors of v)
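The weighted cascade rule above is simple to state in code. This is a small sketch assuming the graph arrives as a list of directed (u, v) pairs; the function name `weighted_cascade` and the toy edge list are hypothetical.

```python
def weighted_cascade(edges):
    """Assign p(u, v) = 1 / (number of in-neighbors of v) to each link u -> v.

    edges: iterable of directed (u, v) pairs; repeated pairs are counted once.
    Returns a dict u -> {v: p(u, v)}.
    """
    unique = set(edges)
    in_degree = {}
    for _, v in unique:
        in_degree[v] = in_degree.get(v, 0) + 1
    graph = {}
    for u, v in unique:
        graph.setdefault(u, {})[v] = 1.0 / in_degree[v]
    return graph


# Hypothetical toy graph: c has two in-neighbors, b has one.
wc = weighted_cascade([("a", "c"), ("b", "c"), ("a", "b")])
```

In the toy graph, both links into c get probability 0.5 and the single link into b gets probability 1.0, so the incoming probabilities of every node sum to 1.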

Experiment Results on MIA Heuristic: running time
[Figure: running time comparison, annotated with speedups of 10^4 times and >10^3 times]
Running time is for selecting 50 seeds

Scalability of MIA Heuristic
• synthesized graphs of different sizes generated from power-law graph model
• weighted cascade model
• running time is for selecting 50 seeds

Related Work
• Greedy approximation algorithms
  • Original greedy algorithm [Kempe, Kleinberg, and Tardos, 2003]
  • Lazy-forward optimization [Leskovec, Krause, Guestrin, Faloutsos, VanBriesen, and Glance, 2007]
  • Edge sampling and reachable sets [Kimura, Saito and Nakano, 2007; C., Wang, and Yang, 2009]
  • reduced seed selection from days to hours (with 30K nodes), but still not scalable
• Heuristic algorithms
  • SPM/SP1M based on shortest paths [Kimura and Saito, 2006], not scalable
  • SPIN based on Shapley values [Narayanam and Narahari, 2008], not scalable
  • Degree discounts [C., Wang, and Yang, 2009], designed for the uniform IC model
  • CGA based on community partitions [Wang, Cong, Song, and Xie 2010]
    • complementary
    • our local MIAs naturally adapt to the community structure, including overlapping communities

Future Directions
• Theoretical problem: efficient approximation algorithms
  • How to efficiently approximate influence spread given a seed set?
• Practical problem: influence analysis from online social media
  • How to mine the influence graph?

Thanks! Questions?
