SLIDE 1
Maximizing the Spread of Influence through a Social Network
By David Kempe, Jon Kleinberg, Eva Tardos
Report by Joe Abrams
SLIDE 2
SLIDE 3
Infectious disease networks
SLIDE 4
Viral Marketing
SLIDE 5
Viral Marketing
- Example: Hotmail
- Included service’s URL in every email sent by users
- Grew from zero to 12 million users in 18 months with small advertising budget
SLIDE 6
Domingos and Richardson (2001, 2002)
- Introduction to maximization of influence over social networks
- Intrinsic Value vs. Network Value
- Expected Lift in Profit (ELP)
- Epinions, “web of trust”, 75,000 users and 500,000 edges
SLIDE 7
Domingos and Richardson (2001, 2002)
- Viral marketing (using greedy hill-climbing strategy) worked very well compared with direct marketing
- Robust (69% of total lift knowing only 5% of edges)
SLIDE 8
Diffusion Model: Linear Threshold Model
- Each node (consumer) influenced by set of neighbors; has threshold Θ from uniform distribution [0,1]
- When combined influence reaches threshold, node becomes “active”
- Active node now can influence its neighbors
- Weighted edges
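The bullets above can be sketched as a single simulation run. This is a minimal illustration, not the authors' implementation: the function name `linear_threshold` and the `in_weights` layout (each node mapped to its in-neighbors and their edge weights, summing to at most 1) are assumptions made for the example.

```python
import random

def linear_threshold(in_weights, seeds, rng=None):
    """Run one Linear Threshold diffusion and return the final active set.

    in_weights[v] maps each in-neighbor u of v to the weight of edge (u, v).
    This layout is an assumption of the sketch, not something the slides fix.
    """
    rng = rng or random.Random()
    nodes = set(in_weights) | set(seeds)
    for nbrs in in_weights.values():
        nodes.update(nbrs)
    # Each node draws its threshold uniformly at random from [0, 1).
    theta = {v: rng.random() for v in nodes}
    active = set(seeds)
    while True:
        # A node activates once the summed weight of its active
        # in-neighbors reaches its threshold.
        newly_active = {
            v for v in nodes - active
            if sum(w for u, w in in_weights.get(v, {}).items() if u in active) >= theta[v]
        }
        if not newly_active:
            return active
        active |= newly_active
```

With a chain a -> b -> c whose edge weights are all 1.0, seeding "a" activates every node, because each threshold is below 1.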
SLIDE 9
Diffusion Model: Linear Threshold Model
SLIDE 10
Diffusion Model: Independent Cascade Model
- Each active node has a probability p of activating a neighbor
- At time t+1, all newly activated nodes try to activate their neighbors
- Only one attempt per node on each target
- Akin to turn-based strategy game?
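A minimal sketch of one Independent Cascade run, assuming a single uniform probability p on every edge. The name `independent_cascade` and the adjacency-dict input `neighbors` are illustrative choices, not taken from the paper.

```python
import random

def independent_cascade(neighbors, seeds, p, rng=None):
    """Run one Independent Cascade diffusion and return the final active set.

    neighbors[u] lists the nodes that u can try to activate (assumed layout).
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(active)          # nodes activated in the previous step
    while frontier:
        newly_active = []
        for u in frontier:
            for v in neighbors.get(u, ()):
                # u appears in a frontier exactly once, so it gets a single
                # attempt on each still-inactive neighbor v.
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active
```

Setting p = 1.0 reduces the run to plain reachability from the seeds, and p = 0.0 leaves only the seeds active, which makes both cases handy sanity checks.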
SLIDE 11
Influence Maximization
- Using greedy hill-climbing strategy, can approximate optimum to within a factor of (1 - 1/e - ε), or ~63%
- Proven using theories of submodular functions (diminishing returns)
- Applies to both diffusion models
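One way to picture greedy hill-climbing: repeatedly add the node whose inclusion gives the largest estimated spread. The sketch below assumes the Independent Cascade model with a uniform p and estimates spread by averaging repeated simulations. The names `simulate_ic`, `estimate_spread`, and `greedy_seeds`, and the default of 200 runs, are hypothetical choices for this example.

```python
import random

def simulate_ic(neighbors, seeds, p, rng):
    """Return the number of nodes active at the end of one cascade."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in neighbors.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return len(active)

def estimate_spread(neighbors, seeds, p, runs, rng):
    """Monte Carlo estimate of the expected number of active nodes."""
    return sum(simulate_ic(neighbors, seeds, p, rng) for _ in range(runs)) / runs

def greedy_seeds(neighbors, k, p, runs=200, seed=0):
    """Pick k seeds, each time adding the node with the best estimated spread."""
    rng = random.Random(seed)
    nodes = set(neighbors) | {v for nbrs in neighbors.values() for v in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        candidates = sorted(nodes - set(chosen))   # sorted for reproducible ties
        best = max(
            candidates,
            key=lambda v: estimate_spread(neighbors, chosen + [v], p, runs, rng),
        )
        chosen.append(best)
    return chosen

# A hub with five followers plus one separate edge; p = 1.0 makes it deterministic.
example = {0: [1, 2, 3, 4, 5], 6: [7]}
print(greedy_seeds(example, k=2, p=1.0, runs=5))  # [0, 6]
```

Because the spread is only estimated by sampling, the guarantee carries an extra ε, which is consistent with the (1 - 1/e - ε) factor quoted above.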
SLIDE 12
Testing on network data
- Co-authorship network
- High-energy physics theory section of www.arxiv.org
- 10,748 nodes (authors) and ~53,000 edges
- Multiple co-authored papers listed as parallel edges (greater weight)
SLIDE 13
Testing on network data
- Linear Threshold: influence weighted by # of parallel edges, inversely weighted by degree of target node: w = c_{u,v} / d_v
- Independent Cascade: p set at 1% and 10%; total probability for u to activate v is 1 - (1 - p)^{c_{u,v}}
- Weighted Cascade: p = 1/d_v
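The three parameter choices above can be computed from a list of co-authorship pairs. This is a sketch under stated assumptions: `edge_parameters` is a hypothetical helper, each input pair stands for one parallel edge, and the degree d_v counts parallel edges separately (which makes the Linear Threshold weights into each node sum to 1).

```python
from collections import Counter

def edge_parameters(coauthor_pairs, p):
    """Return (lt_weight, ic_prob, wc_prob), each keyed by directed pair (u, v).

    coauthor_pairs holds one undirected (u, v) entry per co-authored paper,
    so repeated pairs are parallel edges. The layout is an assumption.
    """
    count = Counter()    # c_{u,v}: number of parallel edges between u and v
    degree = Counter()   # d_v: degree of v, counting parallel edges
    for u, v in coauthor_pairs:
        count[(u, v)] += 1
        count[(v, u)] += 1
        degree[u] += 1
        degree[v] += 1
    # Linear Threshold: w = c_{u,v} / d_v
    lt_weight = {(u, v): c / degree[v] for (u, v), c in count.items()}
    # Independent Cascade: 1 - (1 - p)^{c_{u,v}} over all parallel edges
    ic_prob = {(u, v): 1 - (1 - p) ** c for (u, v), c in count.items()}
    # Weighted Cascade: per-edge probability p = 1 / d_v
    wc_prob = {(u, v): 1 / degree[v] for (u, v) in count}
    return lt_weight, ic_prob, wc_prob
```

For example, with pairs a-b, a-b, b-c, node b has degree 3, so the Linear Threshold weight from a to b is 2/3 and from c to b is 1/3.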
SLIDE 14
Algorithms
- Greedy hill-climbing
- High degree: nodes with greatest number of edges
- Distance centrality: lowest average distance with other nodes
- Random
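The three baseline heuristics in the list are simple to sketch. The function names, the symmetric adjacency dict `adj`, and the choice to count an unreachable node at distance n are all assumptions of this example rather than details given on the slide.

```python
import random
from collections import deque

def high_degree(adj, k):
    """Return the k nodes with the most edges."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]

def average_distance(adj, source):
    """Average shortest-path distance from source to every other node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    n = len(adj)
    # Assumption: a node that cannot be reached counts as distance n.
    total = sum(dist.values()) + n * (n - len(dist))
    return total / (n - 1) if n > 1 else 0.0

def distance_central(adj, k):
    """Return the k nodes with the lowest average distance to the others."""
    return sorted(adj, key=lambda v: average_distance(adj, v))[:k]

def random_nodes(adj, k, rng=None):
    """Return k distinct nodes chosen uniformly at random."""
    rng = rng or random.Random()
    return rng.sample(sorted(adj), k)
```

Every node must appear as a key in `adj`, with an empty list if it has no edges.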
SLIDE 15
Algorithms
SLIDE 16
Results: Linear Threshold Model
Greedy: ~40% better than central, ~18% better than high degree
SLIDE 17
Results: Weighted Cascade Model
SLIDE 18
Results: Independent Cascade, p = 1%
SLIDE 19
Results: Independent Cascade, p = 10%
SLIDE 20
Advantages of Random Selection
SLIDE 21
Generalized models
- Generalized Linear Threshold: for node v, influence of neighbors not necessarily sum of individual influences
- Generalized Independent Cascade: for node v, probability p depends on set of v’s neighbors that have previously tried to activate v
- Models computationally equivalent, impossible to guarantee approximation
SLIDE 22
Non-Progressive Threshold Model
- Active nodes can become inactive
- Similar concept: at each time t, whether or not v becomes/stays active depends on if influence meets threshold
- Can “intervene” at different times; need not perform all interventions at t = 0
- Answer to non-progressive model with graph G equivalent to progressive model with layered graph G^τ
SLIDE 23
General Marketing Strategies
- Can divide up total budget κ into equal increments of size δ
- For greedy hill-climbing strategy, can guarantee performance within factor of 1 - e^{-(κ·γ)/(κ + δ·n)}
- As δ decreases relative to κ, result approaches 1 - e^{-1} ≈ 63%
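A quick numeric check of the bound as the increment size δ shrinks. The helper `greedy_guarantee` is hypothetical; γ and n are passed in as plain inputs because the slide does not define them, and γ = 1 is an assumed default chosen so that the limit matches the quoted 63%.

```python
import math

def greedy_guarantee(kappa, delta, n, gamma=1.0):
    """Evaluate 1 - e^{-(kappa * gamma) / (kappa + delta * n)}."""
    return 1 - math.exp(-(kappa * gamma) / (kappa + delta * n))

# Illustrative values only: budget 10, n = 100, shrinking increment size.
for delta in (1.0, 0.1, 0.01, 0.0):
    print(delta, round(greedy_guarantee(10, delta, 100), 4))
```

With δ = 0 the exponent is exactly -γ, so the factor equals 1 - 1/e when γ = 1.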
SLIDE 24
Strengths of paper
- Showed results in two complementary fashions: theoretical models and test results using real dataset
- Demonstrated that greedy hill-climbing strategy could guarantee results of at least ~63% of optimum
- Used specific and generalized versions of two different diffusion models
SLIDE 25
Weaknesses of paper
- Doesn’t fully explain methodology of greedy hill-climbing strategy
- Lots of work not shown; simply refers to work done in other papers
- Threshold value uniformly distributed?
- Influence inversely weighted by degree of target?
SLIDE 26