Maximizing the Spread of Influence through a Social Network

  1. Maximizing the Spread of Influence through a Social Network. Han Wang, Department of Computer Science, ETH Zürich.

  2. Problem Example 1: Spread of Rumor. Rumor: "2012 = end!" [Figure: social network with nodes A to F]

  3. Problem Example 2: Viral Marketing. Message: "ezPad 1 beats iPad 3". [Figure: social network with nodes A to F]

  4. Problem Definition. $G$: a social network ($n$ nodes). Model: the spread process. $S$: the initially active subset ($k$ seeds). $\sigma(S)$: the number of finally active nodes (the achievement). Task: choose $S^*$. Goal: $\sigma(S^*) = \max_S \sigma(S)$, which is NP-hard. Realistic goal: approximate the maximum with a guarantee, i.e. choose $S$ such that $\sigma(S) \ge r \cdot \sigma(S^*)$.

  5. Contents in This Talk. The problem definition from the previous slide, annotated with the plan of the talk. Model: the spread process (two models). Prove: the goal $\sigma(S^*) = \max_S \sigma(S)$ is NP-hard. Prove: the maximum can be approximated with a guarantee, i.e. one can choose $S$ such that $\sigma(S) \ge r \cdot \sigma(S^*)$.

  6. Model 1: Independent Cascade Model

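  7. Model 1: Cascade Model. Each active node tries to activate its neighbors: an active node $u$ activates an inactive neighbor $v$ with probability $p_{u,v}$ and fails with probability $1 - p_{u,v}$. It gets only a single chance per neighbor. Example: $p_{C,D} = 0.2$, $p_{C,E} = 0.8$, $p_{C,F} = 0.6$. [Figure: node C with edges to D, E, F]

  8. Model 1: Cascade Model. [Figure: example graph on nodes A to F with edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]

  9. Model 1: Cascade Model. $S = \{A, C\}$, $\sigma(S) = 5$. [Figure: example graph on nodes A to F with edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]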
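The cascade process on slides 7 to 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `independent_cascade` and the graph `P` are hypothetical, and only the three probabilities leaving C are taken from slide 7 (the other edges are made up, since the figure's edge list is not recoverable from the transcript).

```python
import random

# Hypothetical toy graph: (u, v) -> activation probability p[u, v].
# Only C's three outgoing probabilities come from slide 7; the rest are assumed.
P = {
    ("C", "D"): 0.2, ("C", "E"): 0.8, ("C", "F"): 0.6,
    ("A", "B"): 0.7, ("A", "C"): 0.4, ("E", "F"): 0.3,
}

def independent_cascade(p, seeds, rng):
    """Run one cascade and return the set of nodes that end up active."""
    active = set(seeds)
    frontier = list(seeds)  # nodes that became active in the previous iteration
    while frontier:
        newly_active = []
        for u in frontier:
            for (src, v), prob in p.items():
                # u gets exactly one attempt on each still-inactive neighbor v.
                if src == u and v not in active and rng.random() < prob:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active

rng = random.Random(0)
final = independent_cascade(P, {"A", "C"}, rng)
print(sorted(final), len(final))
```

A single run gives one random outcome; the achievement $\sigma(S)$ is the expected size of the returned set over many runs.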

  10. Model 2: Linear Threshold Model

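  11. Model 2: Threshold Model. Each inactive node $v$ picks a random threshold $\theta_v \in [0, 1]$. Activation condition: $\sum_{u:\ \text{active neighbor of } v} b_{u,v} \ge \theta_v$. Example with $\theta_D = 0.3$, $b_{C,D} = 0.2$, $b_{E,D} = 0.7$: Iteration 2: 0.2 < 0.3. Iteration 4: E → active. Iteration 5: 0.2 + 0.7 > 0.3, D → active. [Figure: nodes C, D, E]

  12. Model 2: Threshold Model. Iteration: 1 2. [Figure: example graph on nodes A to F with edge weights 0.2, 0.7, 0.8, 0.4, 0.3, 0.6 and node thresholds $\theta = 0.3$, $0.6$, $0.5$, $0.9$]

  13. Model 2: Threshold Model. $S = \{A, C\}$, $\sigma(S) = 4$. [Figure: example graph on nodes A to F with edge weights 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]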
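A minimal Python sketch of the threshold process follows, assuming a hypothetical weight table `B` in which only $b_{C,D} = 0.2$ and $b_{E,D} = 0.7$ come from slide 11. The function name `linear_threshold` is likewise an illustration, not part of the talk.

```python
import random

# Hypothetical weights b[u, v]; each node's incoming weights sum to at most 1.
# Only b[C, D] and b[E, D] are taken from slide 11; the rest are assumed.
B = {
    ("C", "D"): 0.2, ("E", "D"): 0.7,
    ("A", "B"): 0.6, ("A", "C"): 0.4,
    ("C", "E"): 0.8, ("C", "F"): 0.6, ("E", "F"): 0.3,
}

def linear_threshold(b, seeds, thresholds):
    """Return the final active set for fixed thresholds theta_v."""
    nodes = {n for edge in b for n in edge}
    active = set(seeds)
    while True:
        # A node activates once the weight from its active in-neighbors reaches theta_v.
        newly_active = {
            v for v in nodes - active
            if sum(w for (u, t), w in b.items() if t == v and u in active) >= thresholds[v]
        }
        if not newly_active:
            return active
        active |= newly_active

rng = random.Random(0)
nodes = {n for edge in B for n in edge}
theta = {v: rng.random() for v in sorted(nodes)}  # thresholds drawn uniformly at random
print(sorted(linear_threshold(B, {"A", "C"}, theta)))
```

Once the thresholds are drawn, the process is deterministic; all randomness sits in the choice of `theta`.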

  14. How to Prove the Guarantee? Given a spread model, find $S$ such that $\sigma(S) \ge r \cdot \sigma(S^*)$. How? Nemhauser: if $f(S)$ is non-negative, monotone, and submodular, one can find $S$ such that $f(S) \ge (1 - 1/e) \cdot f(S^*)$.

  15. Submodularity. $U$: a finite ground set. $P(U)$: the power set of $U$. $f(\cdot): P(U) \to \mathbb{R}^{*}$. Submodularity: for every node $v$ and all $S \subseteq T$, $f(S \cup \{v\}) - f(S) \ge f(T \cup \{v\}) - f(T)$.

  16. Example: Submodularity. $f(S)$: the number of vertices reachable from the vertices in $S$. [Figure: two copies of a graph on vertices A, B, C, D with an added vertex v]
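The diminishing-returns inequality can be checked by brute force on a small graph. The sketch below does this for the reachability count of slide 16. The graph `GRAPH` and both function names are hypothetical, since the figure's actual edges are not recoverable from the transcript.

```python
from itertools import combinations

# Hypothetical directed graph as adjacency lists.
GRAPH = {"A": ["B"], "B": [], "C": ["B", "D"], "D": [], "v": ["C"]}

def reach_count(graph, seeds):
    """f(S): number of vertices reachable from vertices in S (S itself included)."""
    seen, stack = set(seeds), list(seeds)
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen)

def is_submodular(graph):
    """Check f(S | {v}) - f(S) >= f(T | {v}) - f(T) for all S <= T and all v."""
    nodes = sorted(graph)
    subsets = [set(c) for r in range(len(nodes) + 1) for c in combinations(nodes, r)]
    for big in subsets:
        for small in subsets:
            if not small <= big:
                continue
            for v in nodes:
                gain_small = reach_count(graph, small | {v}) - reach_count(graph, small)
                gain_big = reach_count(graph, big | {v}) - reach_count(graph, big)
                if gain_small < gain_big:
                    return False
    return True

print(is_submodular(GRAPH))
```

The check enumerates all pairs of subsets, so it is only practical for very small graphs.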

  17. How to Prove the Guarantee? Given a spread model, find $S$ such that $\sigma(S) \ge r \cdot \sigma(S^*)$. To get there, prove that $\sigma(S)$ is submodular. Nemhauser: if $f(S)$ is non-negative, monotone, and submodular, one can find $S$ such that $f(S) \ge (1 - 1/e) \cdot f(S^*)$.

  18. We Want to Prove… For each model (Independent Cascade, Linear Threshold): the problem is NP-hard, and $\sigma(S)$ is submodular.

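  19. Prove: Submodularity (Cascade Model)

  20. Submodularity (Cascade Model). Recall: each activation attempt is a coin flip. [Figure: example graph on nodes A to F with edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]

  21. Submodularity (Cascade Model). Why not flip all the coins at the beginning? [Figure: example graph on nodes A to F with edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]

  22. Submodularity (Cascade Model). Live edges → live paths. The remaining edges are blocked. [Figure: example graph on nodes A to F with edge probabilities 0.2, 0.7, 0.8, 0.4, 0.3, 0.6]

  23. Simplify Cascade Model. Node $v$ ends up active ⇔ there is a live path from some seed to $v$.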

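  24. Achievement (Simplified Model). $X$: a coin-flipping outcome, e.g. $X_1$, $X_2$. $R_X(v)$: the set of nodes reachable from $v$ along live paths under $X$. Example: $R_{X_1}(A) = \{A, B\}$ and $R_{X_1}(C) = \{C, D, E\}$. $\sigma_X(S) = \left| \bigcup_{v \in S} R_X(v) \right|$, so $\sigma_{X_1}(\{A, C\}) = |\{A, B, C, D, E\}| = 5$. [Figure: two live-edge outcomes of the example graph on nodes A to F]

  25. Submodularity (Cascade Model). For a fixed $X$, $\sigma_X(S)$ is submodular. A non-negative linear combination of submodular functions is still submodular: $\sigma(S) = \sum_X \mathrm{Prob}[X] \cdot \sigma_X(S)$.

  26. Summary of the proof. Active = has a live path ⇒ $\sigma_X(S)$ is submodular ⇒ $\sigma(S)$ is submodular.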
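The live-edge view also suggests a simple way to estimate $\sigma(S) = \sum_X \mathrm{Prob}[X] \cdot \sigma_X(S)$: sample outcomes $X$ by flipping all coins up front, then count the nodes reachable over live edges. The following is a rough sketch, assuming a hypothetical graph `P` (only the probabilities leaving C appear in the slides) and illustrative function names.

```python
import random

# Hypothetical toy graph; only C's outgoing probabilities come from slide 7.
P = {
    ("C", "D"): 0.2, ("C", "E"): 0.8, ("C", "F"): 0.6,
    ("A", "B"): 0.7, ("A", "C"): 0.4, ("E", "F"): 0.3,
}

def sample_live_edges(p, rng):
    """Flip every edge's coin once, up front; keep the edges that come up live."""
    return [edge for edge, prob in p.items() if rng.random() < prob]

def reachable(live_edges, seeds):
    """Nodes connected to some seed by a path of live edges."""
    seen, stack = set(seeds), list(seeds)
    while stack:
        u = stack.pop()
        for src, dst in live_edges:
            if src == u and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def estimate_spread(p, seeds, runs=2000, seed=0):
    """Monte Carlo estimate of sigma(S): average sigma_X(S) over sampled outcomes X."""
    rng = random.Random(seed)
    total = sum(len(reachable(sample_live_edges(p, rng), seeds)) for _ in range(runs))
    return total / runs

print(estimate_spread(P, {"A", "C"}))
```

Each sampled outcome contributes a coverage-style count, which is the per-outcome function $\sigma_X$ that the proof shows to be submodular.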

  27. Prove: NP-hard (Simplified Cascade Model)

  28. NP-Hard (Cascade Model). Set Cover Problem: do $k$ subsets cover all elements? $k = 1$: No. $k = 2$: No. $k = 3$: Yes. $k = 4$: …

  29. NP-Hard (Cascade Model). Influence maximization → solves Set Cover. Q: is there $S$ with $|S| = 2$ and $\sigma(S) \ge 2 + 5$? ⇔ Q: do 2 subsets cover all elements? [Figure: subsets S1, S2, S3 over elements A to E, and the corresponding graph]

  30. NP-Hard (Cascade Model). The Influence Maximization Problem is at least as difficult as the Set Cover Problem.
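The reduction can be illustrated in Python. The sketch assumes a standard construction that the transcript does not spell out: one node per subset, one node per element, and an edge with probability 1 from each subset node to every element it contains. The instance `SUBSETS` is hypothetical, and for brevity only subset nodes are tried as seeds.

```python
from itertools import combinations

# Hypothetical Set Cover instance over five elements (not the one in the slide figure).
UNIVERSE = {"A", "B", "C", "D", "E"}
SUBSETS = {"S1": {"A", "B", "C"}, "S2": {"B", "D"}, "S3": {"D", "E"}}

def spread_in_reduction(subsets, seeds):
    """sigma(S) when seeds are subset nodes and every subset -> element edge has probability 1."""
    covered = set().union(*(subsets[name] for name in seeds))
    return len(seeds) + len(covered)

def has_cover(subsets, universe, k):
    """Answer Set Cover by asking: is there S with |S| = k and sigma(S) >= k + n?"""
    return any(
        spread_in_reduction(subsets, seeds) >= k + len(universe)
        for seeds in combinations(sorted(subsets), k)
    )

print(has_cover(SUBSETS, UNIVERSE, 1), has_cover(SUBSETS, UNIVERSE, 2))
```

With $k = 2$ and $n = 5$ the threshold is $2 + 5$, matching the form of the question on slide 29.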

  31. Prove: Submodularity (Linear Threshold Model)

  32. Recall: Threshold Model. [Figure: example graph on nodes A to F with edge weights 0.2, 0.7, 0.8, 0.4, 0.3, 0.6 and node thresholds $\theta = 0.3$, $0.6$, $0.5$, $0.9$]

  33. Gamble: Roulette

  34. Gamble: Roulette. [Figure: node v with in-neighbors N1 to N6 and edge weights 0.2, 0.14, 0.15, 0.1, 0.07, 0.23; a roulette wheel with slots N1 to N6 and None; $\theta = 0.4$]

  35. Submodularity (Threshold Model). [Figure: example graph on nodes A to F with thresholds $\theta = 0.3$, $0.6$, $0.5$, $0.9$; each node's roulette selects one in-neighbor or None]

  36. Submodularity (Threshold Model). Live edges → live paths. [Figure: example graph on nodes A to F with edge weights 0.2, 0.7, 0.8, 0.4, 0.3, 0.6 and thresholds $\theta = 0.3$, $0.6$, $0.5$, $0.9$]
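The roulette of slides 33 to 36 can be sketched as follows: each node selects at most one incoming edge as live, choosing in-neighbor $u$ with probability $b_{u,v}$ and no edge with the leftover probability. The six weights below are the ones visible on slide 34, but their assignment to N1 to N6 is a guess, and the function name `pick_live_in_edge` is hypothetical.

```python
import random

# Weights from slide 34; which weight belongs to which neighbor is assumed.
WEIGHTS = {"N1": 0.2, "N2": 0.1, "N3": 0.07, "N4": 0.23, "N5": 0.15, "N6": 0.14}

def pick_live_in_edge(in_weights, rng):
    """Spin the roulette for one node v: return the chosen in-neighbor, or None.

    in_weights maps each in-neighbor u to b[u, v]. The weights sum to at most 1;
    the leftover probability means that no incoming edge of v is live.
    """
    spin = rng.random()
    cumulative = 0.0
    for u, weight in in_weights.items():
        cumulative += weight
        if spin < cumulative:
            return u
    return None

rng = random.Random(0)
print([pick_live_in_edge(WEIGHTS, rng) for _ in range(5)])
```

Spinning once per node yields a live-edge graph in which, as in the cascade case, a node ends up active exactly when a live path connects it to a seed.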

  37. Correctness of Simplification. For node $v$: $P(\text{active in iteration } t+1 \mid \text{inactive in iterations} \le t) = \dfrac{P(\text{active in iteration } t+1)}{P(\text{inactive in iterations} \le t)}$.

  38. Simplified Model. [Figure: roulette wheel for node v with in-neighbors N1 to N6 and edge weights 0.2, 0.14, 0.15, 0.1, 0.07, 0.23; slots are marked "active before iteration 5" and "becomes active in iteration 5", plus None]

  39. Simplified Model. $A_t$: the nodes becoming active in iteration $t$. $\dfrac{\sum_{u \in A_t} b_{u,v}}{1 - \sum_{u \in A_1 \cup A_2 \cup \cdots \cup A_{t-1}} b_{u,v}}$

  40. Original Model. [Figure: node v with in-neighbors N1 to N6 and edge weights 0.2, 0.14, 0.15, 0.1, 0.07, 0.23; slots in the order N2, N6, N4, N3, N1, N5, None]

  41. Original Model. $A_t$: the nodes becoming active in iteration $t$. $\dfrac{\sum_{u \in A_t} b_{u,v}}{1 - \sum_{u \in A_1 \cup A_2 \cup \cdots \cup A_{t-1}} b_{u,v}}$

  42. Simplify Threshold Model. Node $v$ ends up active ⇔ there is a live path from some seed to $v$.

  43. Similarly, we have… Active = has a live path ⇒ $\sigma_X(S)$ is submodular ⇒ $\sigma(S)$ is submodular.

  44. Prove: NP-hard (Linear Threshold Model)

  45. NP-Hard (Threshold Model). Vertex Cover Problem: find $k$ vertices ($S$) such that each edge is incident to at least one vertex in $S$.

  46. NP-Hard (Threshold Model). Influence maximization → solves Vertex Cover. Q: is there $S$ with $|S| = 3$ and $\sigma(S) = 6$? ⇔ Q: do 3 vertices cover all edges? [Figure: two copies of a graph on nodes A to F]

  47. Influence Maximization. Q: $|S| = 2$, $\sigma(S) = 6$? Q: $|S| = 3$, $\sigma(S) = 6$? [Figure: two copies of a graph on nodes A to F]

  48. NP-Hard (Threshold Model). The Influence Maximization Problem is at least as difficult as the Vertex Cover Problem.
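One way to see the connection to Vertex Cover is a small brute-force check. The sketch below assumes a weighting the transcript does not state, namely $b_{u,v} = 1/\deg(v)$ on every edge in both directions. Under that assumption a non-seed node is certain to activate, whatever its threshold, only once all of its neighbors are active. The graph `EDGES` and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical undirected graph as an edge list.
EDGES = [("A", "C"), ("B", "C"), ("C", "D"), ("C", "E"), ("E", "F")]

def guaranteed_spread(edges, seeds):
    """Nodes that become active for every threshold choice, assuming b[u, v] = 1 / deg(v)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    active = set(seeds)
    while True:
        # Worst case theta_v = 1: v needs the full weight of all its neighbors.
        newly_active = {v for v in neighbors if v not in active and neighbors[v] <= active}
        if not newly_active:
            return active
        active |= newly_active

def is_vertex_cover(edges, seeds):
    """True if every edge has at least one endpoint in seeds."""
    return all(u in seeds or v in seeds for u, v in edges)

nodes = sorted({n for edge in EDGES for n in edge})
for seeds in combinations(nodes, 2):
    full = len(guaranteed_spread(EDGES, set(seeds))) == len(nodes)
    print(seeds, is_vertex_cover(EDGES, set(seeds)), full)
```

In this toy instance the two printed flags agree for every seed pair: a seed set is certain to reach all nodes exactly when it is a vertex cover.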

  49. End of Proofs. Influence Maximization Problem: for each model (Independent Cascade, Linear Threshold), the problem is NP-hard and $\sigma(S)$ is submodular.

  50. Initial Problem. Given a spread model, find $S$ such that $\sigma(S) \ge (1 - 1/e - \varepsilon) \cdot \sigma(S^*)$. Prove: $\sigma(S)$ is submodular. For non-negative, monotone, submodular $f(S)$, greedy hill climbing (maximize the marginal gain, $\max_v f(S \cup \{v\}) - f(S)$) finds $S$ such that $f(S) \ge (1 - 1/e) \cdot f(S^*)$.
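Greedy hill climbing can be sketched in Python as below. The `spread` argument stands in for $\sigma$. Here it is a deterministic reachability count on a hypothetical graph, whereas for the two spread models $\sigma$ would typically have to be estimated by sampling. All names in the block are illustrative.

```python
# Hypothetical directed graph as adjacency lists.
GRAPH = {"A": ["B"], "B": [], "C": ["D", "E", "F"], "D": [], "E": ["F"], "F": []}

def reach_count(graph, seeds):
    """Stand-in for sigma(S): number of nodes reachable from the seed set."""
    seen, stack = set(seeds), list(seeds)
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen)

def greedy_seeds(nodes, k, spread):
    """Pick k seeds, each time adding the node with the largest marginal gain."""
    chosen = set()
    for _ in range(k):
        base = spread(chosen)
        candidates = sorted(set(nodes) - chosen)  # sorted for deterministic tie-breaking
        best = max(candidates, key=lambda v: spread(chosen | {v}) - base)
        chosen.add(best)
    return chosen

seeds = greedy_seeds(GRAPH, 2, lambda s: reach_count(GRAPH, s))
print(sorted(seeds), reach_count(GRAPH, seeds))
```

The function assumes `k` does not exceed the number of nodes, and it calls `spread` once per candidate per round, so the cost of evaluating `spread` dominates the running time.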

  51. Summary. Problem Description. Two Models: Independent Cascade Model and Linear Threshold Model. Submodular Functions. Proof of Approximation Guarantee. Proof of NP-Hardness.

  52. Q&A
