On the Submodularity of Influence in Social Networks


On the Submodularity of Influence in Social Networks. Elchanan Mossel & Sebastien Roch, STOC 2007. Speaker: Xinran He (Xinranhe1990@gmail.com).


  1. On the Submodularity of Influence in Social Networks
     Elchanan Mossel & Sebastien Roch, STOC 2007
     Speaker: Xinran He (Xinranhe1990@gmail.com)

  2. Social Network
     • Social network as a graph
       – Nodes represent individuals.
       – Edges are social relations with different strengths:
         • Neighbor and coworker relations in real life
         • Virtual friendship on Facebook
         • Follower-followee relations on Twitter

  3. Diffusion in Social Networks
     • The adoption of new products can propagate through the social network: this is diffusion in the social network.
     • The same holds for information, rumors, innovations, ...

  4. Influence Maximization
     • Influence maximization: find k people who generate the largest influence spread (i.e., the expected number of activated nodes) [KKT 2003].

  5. Linear Threshold Model
     • Given: a social network with edge weights $w_{uv}$ and a set $S$ of initially active individuals (the seeds).
     • Every individual $v$ independently chooses a threshold $\theta_v$ uniformly at random in $[0,1]$.
     • At any later step $t$, a still-inactive node $v$ becomes active if $\sum_{u \in N_v} w_{uv} \ge \theta_v$, where $N_v$ is the set of activated direct neighbors of $v$.
     • The diffusion ends when no more nodes are activated.
     • The influence spread $\sigma(S) = E[\,|P_{end}| \mid S\,]$ is the expected number of active nodes when the diffusion process ends.
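
The process on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the dictionary layout `weights[v][u] = w_uv`, the function names, and the Monte Carlo estimator for $\sigma(S)$ are all assumptions made for the example.

```python
import random

def simulate_linear_threshold(weights, seeds, thresholds=None, rng=None):
    """Run one linear threshold diffusion and return the final active set.

    weights[v] maps each in-neighbor u of v to the edge weight w_uv
    (hypothetical layout chosen for this sketch).
    """
    rng = rng or random.Random()
    nodes = set(weights) | {u for nbrs in weights.values() for u in nbrs} | set(seeds)
    if thresholds is None:
        # Each node draws its threshold uniformly in [0, 1].
        thresholds = {v: rng.random() for v in nodes}
    active = set(seeds)
    while True:
        # A still-inactive node activates once the total weight of its
        # active neighbors reaches its threshold.
        newly = {
            v for v in nodes - active
            if sum(w for u, w in weights.get(v, {}).items() if u in active) >= thresholds[v]
        }
        if not newly:
            return active  # diffusion ends: nobody new was activated
        active |= newly

def estimate_spread(weights, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of sigma(S): average final size over random thresholds."""
    rng = random.Random(seed)
    total = sum(len(simulate_linear_threshold(weights, seeds, rng=rng)) for _ in range(runs))
    return total / runs
```

Passing explicit `thresholds` makes a run deterministic, which is convenient when comparing several processes that share the same thresholds.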

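  6. Linear Threshold Example
     [Figure: a small weighted network shown at Step 0, Step 1, Step 2, and Step 3. The legend distinguishes inactive nodes, active nodes, node thresholds, and active neighbors. The diffusion stops ("Stop!") once no remaining node's active-neighbor weight reaches its threshold.]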

  7. Influence Maximization
     • Find a seed set $S$ with $|S| \le k$ such that $\sigma(S)$ is maximized.
     • The influence maximization problem is NP-hard under the linear threshold model [Kempe et al. 2003].
     • We have to solve it approximately.
     • Main tool for analysis. Theorem: The greedy algorithm is a $(1 - 1/e)$-approximation for maximizing monotone and submodular set functions [Nemhauser/Wolsey 1978].
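
The greedy algorithm named in the theorem repeatedly adds the element with the largest marginal gain. The sketch below is a generic version for any set function `f`; the names `greedy_max`, `toy_sets`, and `coverage` are invented for the example, and a toy coverage function stands in for $\sigma$, which in practice would have to be estimated (for instance by simulation).

```python
def greedy_max(f, ground_set, k):
    """Pick up to k elements, each time adding the one with the largest marginal gain."""
    chosen = set()
    for _ in range(k):
        candidates = [v for v in ground_set if v not in chosen]
        if not candidates:
            break
        # Marginal gain of v given the current selection: f(chosen + v) - f(chosen).
        best = max(candidates, key=lambda v: f(chosen | {v}) - f(chosen))
        chosen.add(best)
    return chosen

# Toy stand-in objective (hypothetical data): number of items covered.
toy_sets = {"x": {1, 2, 3}, "y": {3, 4}, "z": {4, 5, 6, 7}}

def coverage(chosen):
    return len(set().union(*(toy_sets[name] for name in chosen)))

picked = greedy_max(coverage, ["x", "y", "z"], 2)
```

For a monotone submodular `f`, the theorem guarantees that the value of the greedy selection is at least $(1 - 1/e)$ times the optimum over sets of size at most k.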

  8. Submodular & Monotone
     • A set function $f: 2^V \to \mathbb{R}$ is monotone if $f(S) \le f(T)$ for all $S \subseteq T \subseteq V$.
     • A set function $f: 2^V \to \mathbb{R}$ is submodular if $f(S) + f(T) \ge f(S \cap T) + f(S \cup T)$ for all $S, T \subseteq V$.

  9. Submodularity
     • A set function $f$ is submodular if $f(S) + f(T) \ge f(S \cap T) + f(S \cup T)$ for all $S, T \subseteq V$.
     • Or equivalently, $f(T \cup \{v\}) - f(T) \le f(S \cup \{v\}) - f(S)$ for all $S \subseteq T \subseteq V$ and $v \in V \setminus T$.
     • Submodularity can be viewed as a diminishing-returns property.
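
On a small ground set, both definitions can be checked by brute force. The following sketch enumerates all pairs of subsets, so it is only meant for toy examples; the helper names and the sample functions are assumptions made for illustration.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_monotone(f, ground):
    all_sets = list(subsets(ground))
    return all(f(s) <= f(t) for s in all_sets for t in all_sets if s <= t)

def is_submodular(f, ground):
    all_sets = list(subsets(ground))
    # f(S) + f(T) >= f(S ∩ T) + f(S ∪ T) for every pair S, T.
    return all(f(s) + f(t) >= f(s & t) + f(s | t) for s in all_sets for t in all_sets)

# Hypothetical sample data: a coverage function and a convex function of |S|.
sample_sets = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4, 5}}

def sample_coverage(chosen):
    return len(set().union(*(sample_sets[name] for name in chosen)))

def squared_size(chosen):
    return len(chosen) ** 2
```

Here `sample_coverage` passes both checks, while `squared_size` is monotone but not submodular: its marginal gains grow as the set grows.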

  10. Submodularity: Examples
      • Maximum coverage problem: given a collection of sets $\mathcal{S} = \{S_1, \dots, S_m\}$ and a number $k$, find $\mathcal{S}' \subseteq \mathcal{S}$ with $|\mathcal{S}'| \le k$ that maximizes $\sigma(\mathcal{S}') = \left| \bigcup_{S_i \in \mathcal{S}'} S_i \right|$. Here $\sigma$ is submodular.
      • The influence spread $\sigma$ under the linear threshold model is submodular [Kempe et al. 2003]. Hence the influence maximization problem under the linear threshold model can be solved approximately.

  11. General Threshold Model
      Linear Threshold Model: $v$ is activated if $\sum_{u \in N_v} w_{uv} \ge \theta_v$.
      General Threshold Model: $v$ is activated if $f_v(S) \ge \theta_v$.
      $f_v(S)$: activation function of node $v$ over $S$, where $S$ is the set of already activated nodes.
      • The general threshold model generalizes many diffusion models:
        – Linear Threshold Model [KKT 2003]: $f_v(S) = \sum_{u \in N_v} w_{uv}$
        – Independent Cascade Model [KKT 2003]: $f_v(S) = 1 - \prod_{u \in N_v} (1 - p_{uv})$
        – Decreasing Cascade Model [KKT 2005]: $f_v(S) = 1 - \prod_{i=1}^{r} \left(1 - p_v(\omega_i, S_{i-1})\right)$
        – ...
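
One way to read this slide is that the diffusion loop stays the same and only the activation function $f_v$ changes. The sketch below makes that explicit; the function names and the dictionary layouts `weights[v][u]` and `probs[v][u]` are assumptions chosen for the example.

```python
def linear_threshold_f(weights):
    """f_v(S) = sum of w_uv over the active neighbors u of v."""
    def f(v, active):
        return sum(w for u, w in weights.get(v, {}).items() if u in active)
    return f

def independent_cascade_f(probs):
    """f_v(S) = 1 - product of (1 - p_uv) over the active neighbors u of v."""
    def f(v, active):
        miss = 1.0
        for u, p in probs.get(v, {}).items():
            if u in active:
                miss *= 1.0 - p
        return 1.0 - miss
    return f

def simulate_general_threshold(nodes, f, seeds, thresholds):
    """General threshold diffusion: v activates once f(v, active) >= thresholds[v]."""
    active = set(seeds)
    while True:
        newly = {v for v in nodes - active if f(v, active) >= thresholds[v]}
        if not newly:
            return active
        active |= newly
```

Both activation functions above are monotone and submodular in the active set, which is the condition the conjecture and main result rely on.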

  12. General Threshold Model (2)
      For the linear threshold model, the influence spread $\sigma(S)$ is submodular [KKT 2003].
      Conjecture: Under the general threshold model with monotone and submodular $f_v$, $\sigma(S)$ is monotone and submodular [KKT 2003].

  13. Main Result
      Theorem: Under the general threshold model with monotone and submodular $f_v$, $\sigma(S)$ is monotone and submodular [Mossel/Roch 2007].
      Corollary: The greedy algorithm gives a $(1 - 1/e)$-approximation for the influence maximization problem under the general threshold model.

  14. Proof: General Idea (1)
      • Couple four diffusion processes:
        $A = \{A_0 = S, A_1, A_2, \dots, A_{end}\}$
        $B = \{B_0 = T, B_1, B_2, \dots, B_{end}\}$
        $C = \{C_0 = S \cap T, C_1, C_2, \dots, C_{end}\}$
        $D = \{D_0 = S \cup T, D_1, D_2, \dots, D_{end}\}$
      • such that $C_t \subseteq A_t \cap B_t$ and $D_t \subseteq A_t \cup B_t$.

  15. Proof: General Idea (2)
      If $C_t \subseteq A_t \cap B_t$ and $D_t \subseteq A_t \cup B_t$, then
        $|A_{end}| + |B_{end}| \ge |A_{end} \cap B_{end}| + |A_{end} \cup B_{end}| \ge |C_{end}| + |D_{end}|$
      (the first step in fact holds with equality, by inclusion-exclusion).
      Taking expectations, we have
        $\sigma(S) + \sigma(T) \ge \sigma(S \cap T) + \sigma(S \cup T)$.

  16. $C_t \subseteq A_t \cap B_t$
      • Couple the four processes with the same thresholds $\theta_v$.
      • Show $C_t \subseteq A_t$ and $C_t \subseteq B_t$ by induction (the argument for $B$ is identical to the one for $A$):
        – Base case: $C_0 = S \cap T \subseteq S = A_0$.
        – Assume $C_t \subseteq A_t$.
        – For a node $v$ still inactive at step $t$, we have $f_v(C_t) \le f_v(A_t)$ by monotonicity of $f_v$. Therefore, if $v$ is activated at step $t+1$ in $C$, it must also be activated in $A$.
        – Hence $C_{t+1} \subseteq A_{t+1}$.
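
The induction can be sanity-checked numerically. The sketch below runs the linear threshold model (one instance of the general model) from $S$, $T$, and $S \cap T$ with shared thresholds on small random instances and checks $C_t \subseteq A_t \cap B_t$ at every step. The instance generator, its parameters, and all function names are assumptions made for this test harness, not part of the talk.

```python
import random

def run(weights, nodes, seeds, thresholds):
    """Return the trace [P_0, P_1, ..., P_end] of active sets (linear threshold rule)."""
    trace = [set(seeds)]
    while True:
        active = trace[-1]
        newly = {
            v for v in nodes - active
            if sum(w for u, w in weights.get(v, {}).items() if u in active) >= thresholds[v]
        }
        if not newly:
            return trace
        trace.append(active | newly)

def state_at(trace, t):
    """Active set at step t; a finished process stays at its final set."""
    return trace[min(t, len(trace) - 1)]

def random_instance(n, rng):
    """Hypothetical generator: each node gets 3 in-neighbors with total weight below 1."""
    nodes = set(range(n))
    weights = {}
    for v in nodes:
        others = rng.sample(sorted(nodes - {v}), 3)
        raw = [rng.random() for _ in others]
        scale = rng.random() / sum(raw)
        weights[v] = {u: r * scale for u, r in zip(others, raw)}
    thresholds = {v: rng.random() for v in nodes}
    return nodes, weights, thresholds

def lower_coupling_holds(trials=200, n=8, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        nodes, weights, thresholds = random_instance(n, rng)
        S = {v for v in nodes if rng.random() < 0.3}
        T = {v for v in nodes if rng.random() < 0.3}
        A = run(weights, nodes, S, thresholds)
        B = run(weights, nodes, T, thresholds)
        C = run(weights, nodes, S & T, thresholds)
        for t in range(max(len(A), len(B), len(C))):
            if not state_at(C, t) <= (state_at(A, t) & state_at(B, t)):
                return False
    return True
```

A passing check is only evidence on the sampled instances; the induction on the slide is what establishes the inclusion in general.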

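  17. $D_t \subseteq A_t \cup B_t$: First Attempt
      • Let's try the same coupling method for $D_t \subseteq A_t \cup B_t$.
      [Figure: three copies of a 3-node network, labeled D, A, and B. In each copy, nodes 1 and 2 each have an edge of weight 0.3 into node 3, and $\theta_3 = 0.5$.]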
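
The numbers in this figure suggest why sharing thresholds fails for the upper inclusion. The seed sets did not survive in the transcript, so the sketch below assumes $S = \{1\}$ and $T = \{2\}$ (hence $D_0 = \{1, 2\}$); the thresholds of nodes 1 and 2 are set to 1.0 only so that they never activate on their own.

```python
def final_active(weights, nodes, seeds, thresholds):
    """Final active set of a linear threshold diffusion with fixed thresholds."""
    active = set(seeds)
    while True:
        newly = {
            v for v in nodes - active
            if sum(w for u, w in weights.get(v, {}).items() if u in active) >= thresholds[v]
        }
        if not newly:
            return active
        active |= newly

nodes = {1, 2, 3}
weights = {3: {1: 0.3, 2: 0.3}}          # edges 1 -> 3 and 2 -> 3, weight 0.3 each
thresholds = {1: 1.0, 2: 1.0, 3: 0.5}    # theta_3 = 0.5 as in the figure

A_end = final_active(weights, nodes, {1}, thresholds)     # assumed seed set S
B_end = final_active(weights, nodes, {2}, thresholds)     # assumed seed set T
D_end = final_active(weights, nodes, {1, 2}, thresholds)  # seed set S ∪ T
```

Under these assumptions node 3 activates in D (0.3 + 0.3 reaches 0.5) but in neither A nor B, so `D_end` is not contained in `A_end | B_end` when all processes use the same thresholds.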

  18. Antisense Coupling
      • Then how can we keep $D_t \subseteq A_t \cup B_t$?
      • Intuitively, using $\theta$ for the activation of $S$ and $1 - \theta$ for the activation of $T$ will maximize their union.

  19. Piecemeal Growth
      Define $P = P(S^{(1)}, \dots, S^{(k)})$ as the piecemeal growth diffusion process, where $S^{(1)}, \dots, S^{(k)}$ is a partition of the seed set $S$.
      [Diagram: add $S^{(1)}$ and grow until it ends; add $S^{(2)}$ and grow until it ends; ...; add $S^{(k)}$ and grow until it ends.]
      Lemma: The distributions over the activated node set at the end of the original process with seed set $S$ and at the end of the piecemeal growth process $P(S^{(1)}, \dots, S^{(k)})$ are identical.
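
A small sketch can make the piecemeal process concrete. It uses the linear threshold rule as one instance of the general model and fixes the thresholds, so that the original and piecemeal processes can be compared on the same randomness; the function names and the example graph are assumptions for illustration.

```python
def grow(weights, nodes, active, thresholds):
    """Grow the active set under the linear threshold rule until the diffusion ends."""
    active = set(active)
    while True:
        newly = {
            v for v in nodes - active
            if sum(w for u, w in weights.get(v, {}).items() if u in active) >= thresholds[v]
        }
        if not newly:
            return active
        active |= newly

def piecemeal(weights, nodes, parts, thresholds):
    """Add the seed parts one stage at a time, growing until it ends after each addition."""
    active = set()
    for part in parts:
        active = grow(weights, nodes, active | set(part), thresholds)
    return active

# Hypothetical example graph and thresholds.
nodes = {"a", "b", "c", "d", "e"}
weights = {"b": {"a": 0.5}, "c": {"b": 0.5}, "d": {"c": 0.5, "e": 0.5}}
thresholds = {"a": 1.0, "b": 0.4, "c": 0.4, "d": 0.9, "e": 1.0}

original = grow(weights, nodes, {"a", "e"}, thresholds)
staged = piecemeal(weights, nodes, [{"e"}, {"a"}], thresholds)
```

With shared thresholds the two final sets coincide in this example, whichever order the parts are added in, which matches the coupling used to prove the lemma.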

  20. Piecemeal Growth: Proof
      • Couple three piecemeal growth processes $T'$, $T$, $T''$ and the original process $S$ with the same $\theta$.
      [Diagram: three two-stage processes. One adds $S$ at stage 1 and nothing at stage 2 (grow $S$, then grow nothing). One adds $S^{(1)}$ at stage 1 and $S^{(2)}$ at stage 2 (grow $S^{(1)}$, then grow $S^{(2)}$). One adds nothing at stage 1 and $S$ at stage 2 (grow nothing, then grow $S$).]
      Then $T'_s \subseteq T_s \subseteq T''_s$ and $T'_{end} = T''_{end} = S_{end}$, so that $S_{end} = T_{end}$.

  21. Need-to-know Representation (1)
      • Consider the diffusion in a different way: the need-to-know representation.
      • Principle of deferred decisions: we don't decide all thresholds at the beginning; instead we reveal the value of a threshold only when it is needed.
      • For example, if node $v$ is inactive at step $t-1$, we only want to know whether it is activated at step $t$.
      [Figure: position of $\theta_v$ relative to $f_v(S_{t-2})$ and $f_v(S_{t-1})$.]

  22. Need-to-know Representation (2)
      Lemma: The following process is equivalent to the original one:
      1. Initialize $S_0 = S$.
      2. At step $1 \le t \le n-1$, we initialize $S_t = S_{t-1}$, and for each still-inactive node $v$:
         – With probability $\dfrac{f_v(S_{t-1}) - f_v(S_{t-2})}{1 - f_v(S_{t-2})}$, $v$ becomes activated and we pick $\theta_v$ uniformly in $[f_v(S_{t-2}), f_v(S_{t-1})]$.
         – Otherwise we do nothing.
      [Figure: the interval from $f_v(S_{t-2})$ to 1, of length $1 - f_v(S_{t-2})$, with the sub-interval from $f_v(S_{t-2})$ to $f_v(S_{t-1})$, of length $f_v(S_{t-1}) - f_v(S_{t-2})$.]
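
The lemma's process can be sketched as follows. The slide does not say what $S_{t-2}$ means at the first step; this sketch assumes it is the empty set with $f_v(\emptyset) = 0$, and it also assumes $f_v \le 1$. The function names and the linear threshold activation function used for the demonstration are likewise assumptions.

```python
import random

def lt_activation(weights):
    """Linear threshold activation: f_v(S) = sum of w_uv over active neighbors u."""
    def f(v, active):
        return sum(w for u, w in weights.get(v, {}).items() if u in active)
    return f

def need_to_know(nodes, f, seeds, rng):
    """Diffusion that reveals each threshold only when the node activates.

    Returns the final active set and the thresholds revealed along the way.
    """
    prev = set()           # S_{t-2}; assumed empty before the first step
    current = set(seeds)   # S_{t-1}
    revealed = {}
    while True:
        newly = set()
        for v in sorted(nodes - current):
            lo, hi = f(v, prev), f(v, current)
            if hi <= lo:
                continue   # no new influence on v, so it cannot activate now
            # v is known to satisfy theta_v > lo, so theta_v is uniform on (lo, 1].
            if rng.random() < (hi - lo) / (1.0 - lo):
                revealed[v] = rng.uniform(lo, hi)
                newly.add(v)
        if not newly:
            return current, revealed
        prev, current = current, current | newly
```

Stopping as soon as a step activates nobody is safe here: the next step would see $f_v(S_{t-1}) = f_v(S_{t-2})$ for every node, so every activation probability would be zero.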

  23. Antisense Coupling (1)
      Define the antisense diffusion $Q = Q(S^{(1)}, \dots, S^{(k)}; T)$, where $S^{(1)}, \dots, S^{(k)}$ is a partition of the seed set $S$.
      [Diagram: grow $S^{(1)}$ until it ends; ...; grow $S^{(k)}$ until it ends (the k-stage piecemeal growth); then add $T$ at time $\tau$, the beginning of stage $k+1$, and grow $T$ until it ends.]
      At any step $t$ in the final stage, activate nodes under the condition $f_v(Q_t) \ge f_v(Q_\tau) + 1 - \theta_v$.

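  24. Antisense Coupling (2)
      [Figure: two timelines, each with stages Grow $S^{(1)}$, ..., Grow $S^{(k)}$, Grow $T$, where the final stage starts at time $\tau$. In $P$, node $v$ is activated at step $t$ of the final stage if $f_v(P_t) \ge \theta_v$. In $Q$, node $v$ is activated if $f_v(Q_t) \ge \theta'_v$, where $\theta'_v = f_v(P_\tau) + 1 - \theta_v$.]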

  25. Antisense Coupling (3)
      Lemma: The distributions over the activated node set at the end of the piecemeal growth process $P(S^{(1)}, \dots, S^{(k)}; T)$ and the antisense diffusion process $Q(S^{(1)}, \dots, S^{(k)}; T)$ are identical.

  26. Antisense Coupling: Proof (1)
      • From the need-to-know representation point of view: for any node $v$ still inactive at time $t = \tau$, $\theta_v$ is uniformly distributed in $[f_v(P_\tau), 1] = [f_v(Q_\tau), 1]$.

  27. Antisense Coupling: Proof (2)
      • Then for any still-inactive node, we pick its $\theta_v$ uniformly in $[f_v(P_\tau), 1]$.
      • We define $\theta'_v = f_v(Q_\tau) + 1 - \theta_v$.
      • Since $\theta_v$ and $\theta'_v$ have the same distribution, the final stage (growing $T$) is identically distributed in $P$ and $Q$.
      • Therefore $P_{end}$ and $Q_{end}$ have the same distribution.
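
The key step is that reflecting a threshold inside the interval $[f_v(Q_\tau), 1]$ leaves its distribution unchanged. The sketch below illustrates this with a single number `f_tau` standing in for $f_v(P_\tau) = f_v(Q_\tau)$; the function names are assumptions made for the example.

```python
import random

def antisense_threshold(theta, f_tau):
    """Reflect theta within [f_tau, 1]: theta' = f_tau + 1 - theta."""
    return f_tau + 1.0 - theta

def sample_pair(f_tau, rng):
    """Draw theta uniformly in [f_tau, 1] and return it with its antisense partner."""
    theta = rng.uniform(f_tau, 1.0)
    return theta, antisense_threshold(theta, f_tau)
```

The map sends $f_\tau$ to 1 and 1 to $f_\tau$, so a small $\theta_v$ is paired with a large $\theta'_v$ and vice versa, while each of them alone is still uniform on the same interval.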

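  28. Coupling: Overview
      [Diagram: three coupled three-stage processes, each stage running until it ends, with stages listed as on the slide.
       A (seed set $S$): grow $S \cap T$, grow $S \setminus T$, grow nothing.
       B (seed set $T$): grow $S \cap T$, grow $T \setminus S$, grow nothing.
       D (seed set $S \cup T$): grow $S \cap T$, grow $S \setminus T$, grow $T \setminus S$.]
      $D_t \subseteq A_t \cup B_t$ for any step $t$ in all three stages.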
