
Cascades and Contagion Prof. Srijan Kumar - PowerPoint PPT Presentation



  1. CSE 6240: Web Search and Text Mining. Spring 2020 Cascades and Contagion Prof. Srijan Kumar http://cc.gatech.edu/~srijan 1 Srijan Kumar, Georgia Tech, CSE6240 Spring 2020: Web Search and Text Mining

  2. Today’s Lecture • Introduction • Decision based models of diffusion – Single Adoption – Multiple Adoption • Probabilistic models of diffusion – SEIR model – Independent cascade model These slides are borrowed from Prof. Jure Leskovec’s CS224W class.

  3. Epidemics vs Cascade Spreading • In decision-based models, nodes make decisions based on the pay-off of adopting one strategy or the other. • In epidemic spreading: – Lack of decision making – Process of contagion is complex and unobservable • In some cases it involves (or can be modeled as) randomness

  4. Simple model: Branching Process • First wave: A person carrying a disease enters the population and transmits it to each person she meets with probability $q$. She meets $d$ people, a portion of whom will be infected. • Second wave: Each of the $d$ people goes and meets $d$ different people. So we have a second wave of $d \cdot d = d^2$ people, a portion of whom will be infected. • Subsequent waves: same process
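The branching process above can be sketched as a small simulation. This is a minimal illustration, not the lecture's code: it writes q for the transmission probability and d for the number of people each person meets, and the function names and the `seed` parameter are assumptions made for the example. Only infected people pass the disease on, so the expected number of infected people in wave n is (q·d)^n.

```python
import random


def simulate_branching(q, d, waves, seed=0):
    """Return the number of infected people in each wave.

    q: probability that a single contact transmits the disease
    d: number of new people each infected person meets
    waves: maximum number of waves to simulate
    seed: hypothetical parameter, only used to make runs reproducible
    """
    rng = random.Random(seed)
    infected = 1  # the first carrier entering the population
    counts = []
    for _ in range(waves):
        contacts = infected * d  # every infected person meets d new people
        infected = sum(1 for _ in range(contacts) if rng.random() < q)
        counts.append(infected)
        if infected == 0:
            break  # the disease has died out
    return counts


def expected_infected(q, d, wave):
    """Expected number of infected people in a given wave: (q * d) ** wave."""
    return (q * d) ** wave
```

With q = 1 every contact is infected, so the waves grow as d, d², d³; with q = 0 the process stops after the first wave.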

  5. Example with k=3

  6. Spreading Models of Viruses • Virus propagation has 2 parameters: • (Virus) birth rate β: probability that an infected neighbor attacks • (Virus) death rate δ: probability that an infected node heals [Figure: node N with neighbors N1, N2, N3; an infected neighbor attacks with prob. β, and an infected node becomes healthy with prob. δ]

  7. More Generally: S+E+I+R Models • General scheme for epidemic models: – Each node can go through phases: S (susceptible), E (exposed), I (infected), R (recovered), Z (immune) – Transition probabilities are governed by the model parameters

  8. SIR Model • SIR model: a node goes through the phases Susceptible → Infected → Recovered, with rate β for S → I and rate δ for I → R – Models chickenpox or plague: once you heal, you can never get infected again • Assuming perfect mixing (the network is a complete graph), the model dynamics are: $\frac{dS}{dt} = -\beta S I$, $\frac{dI}{dt} = \beta S I - \delta I$, $\frac{dR}{dt} = \delta I$ [Plot: number of nodes in S(t), I(t), and R(t) over time]
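The SIR dynamics under perfect mixing (dS/dt = −βSI, dI/dt = βSI − δI, dR/dt = δI) can be integrated numerically. The sketch below uses a plain forward-Euler step; the step size `dt`, the function name, and the choice to track S, I, R as population fractions are assumptions for illustration, not part of the slides.

```python
def simulate_sir(beta, delta, s0, i0, r0, dt=0.1, steps=1000):
    """Integrate the SIR equations with forward Euler.

    Returns a list of (S, I, R) tuples, one per time step.
    """
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # flow S -> I
        new_recoveries = delta * i * dt      # flow I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Example run: 1% of the population is initially infected.
sir_history = simulate_sir(beta=0.5, delta=0.1, s0=0.99, i0=0.01, r0=0.0)
```

Because every unit leaving one compartment enters another, S + I + R stays constant, S can only decrease, and R can only increase.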

  9. SIS Model • Susceptible-Infective-Susceptible (SIS) model • Cured nodes immediately become susceptible • Virus “strength”: $s = \beta / \delta$ • Node state transition diagram: Susceptible → Infective (infected by neighbor with prob. β); Infective → Susceptible (cured with prob. δ)

  10. SIS Model • Models flu: – A susceptible node becomes infected – The node then heals and becomes susceptible again • Assuming perfect mixing (a complete graph): $\frac{dS}{dt} = -\beta S I + \delta I$, $\frac{dI}{dt} = \beta S I - \delta I$ [Plot: number of nodes in S(t) and I(t) over time]
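For the perfectly mixed SIS model, setting dI/dt = βSI − δI to zero with S + I = 1 gives two equilibria: I = 0, or I = 1 − δ/β when β > δ. The sketch below checks this numerically with forward Euler; the function name, the step size, and the use of population fractions are assumptions made for the example.

```python
def simulate_sis(beta, delta, i0, dt=0.1, steps=1000):
    """Return the infected fraction I(t) at each step, with S = 1 - I."""
    i = float(i0)
    trajectory = [i]
    for _ in range(steps):
        s = 1.0 - i
        # dI/dt = beta * S * I - delta * I; healed nodes return to S
        i += (beta * s * i - delta * i) * dt
        trajectory.append(i)
    return trajectory


# beta > delta: the infection settles at the endemic level 1 - delta / beta
endemic = simulate_sis(beta=0.5, delta=0.1, i0=0.01)[-1]

# beta < delta: the infection dies out
extinct = simulate_sis(beta=0.1, delta=0.5, i0=0.5)[-1]
```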

  11. Question: Epidemic threshold τ • SIS model: the epidemic threshold of an arbitrary graph G is τ, such that: – If virus “strength” s = β/δ < τ, the epidemic cannot happen (it eventually dies out) • Given a graph, what is its epidemic threshold?

  12. [Wang et al. 2003] Epidemic Threshold in SIS Model • Fact: We have no epidemic if $\beta / \delta < \tau = 1 / \lambda_{1,A}$, where β is the (virus) birth rate, δ is the (virus) death rate, τ is the epidemic threshold, and $\lambda_{1,A}$ is the largest eigenvalue of the adjacency matrix A of G ► $\lambda_{1,A}$ alone captures the property of the graph!
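The threshold condition β/δ < 1/λ₁ can be checked for a small graph as follows. This is a sketch under stated assumptions: the graph is undirected, connected, has at least one edge, and is given as a dense adjacency matrix (list of lists); the largest eigenvalue is estimated with power iteration rather than a linear-algebra library, and all function names are hypothetical.

```python
def largest_eigenvalue(adj, iterations=200):
    """Estimate the largest eigenvalue of a symmetric 0/1 adjacency matrix.

    Power iteration is run on A + I (the shift avoids oscillation on
    bipartite graphs), and the shift is subtracted at the end.
    """
    n = len(adj)
    x = [1.0] * n
    shifted = 1.0
    for _ in range(iterations):
        # y = (A + I) x
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        shifted = max(y)  # x has maximum entry 1, so this tends to the eigenvalue
        x = [value / shifted for value in y]
    return shifted - 1.0


def epidemic_threshold(adj):
    """tau = 1 / (largest eigenvalue of the adjacency matrix)."""
    return 1.0 / largest_eigenvalue(adj)


def dies_out(beta, delta, adj):
    """True if the virus strength s = beta / delta is below the threshold."""
    return beta / delta < epidemic_threshold(adj)


# Complete graph on 4 nodes: the largest eigenvalue is 3, so tau = 1/3.
k4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
```

For large sparse graphs a sparse eigenvalue solver would be the usual choice; the dense loop here is only meant to keep the example self-contained.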

  13. [Wang et al. 2003] Experiments on a Small Graph • Oregon Autonomous Systems graph: 10,900 nodes and 31,180 edges • β = 0.001 [Plot: number of infected nodes (0 to 500) over time (0 to 1000) for δ = 0.05, 0.06, 0.07, with curves labeled s = β/δ > τ (above threshold), s = β/δ = τ (at the threshold), and s = β/δ < τ (below threshold)]

  14. Experiments • Does it matter how many people are initially infected?

  15. [Gomes et al., 2014] Modeling Ebola with SEIR [Gomes et al., Assessing the International Spreading Risk Associated with the 2014 West African Ebola Outbreak, PLOS Currents Outbreaks, ‘14]

  16. Example: Ebola • S: susceptible individuals, E: exposed individuals, I: infectious cases in the community, H: hospitalized cases, F: dead but not yet buried, R: individuals no longer transmitting the disease [Gomes et al., Assessing the International Spreading Risk Associated with the 2014 West African Ebola Outbreak, PLOS Currents Outbreaks, ‘14]

  17. Application: Rumor spread modeling using SEIZ model References: 1. Epidemiological Modeling of News and Rumors on Twitter. Jin et al., SNAKDD 2013 2. False Information on Web and Social Media: A Survey. Kumar et al., arXiv:1804.08559

  18. SEIZ model: Extension of SIS model

  19. Recap: SIS model

  20. Details of the SEIZ model Notation: – S = Susceptible – I = Infected – E = Exposed – Z = Skeptics

  21. Dataset • Tweets collected from eight stories: four rumors and four real events [Table with columns: Real Events, Rumors]

  22. Method: Fitting SEIZ model to data • The SEIZ model is fit to each cascade to minimize the difference $|I(t) - \mathrm{tweets}(t)|$: – $\mathrm{tweets}(t)$ = number of rumor tweets – $I(t)$ = the number of rumor tweets estimated by the model • Use grid search to find the parameters with minimum error
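The grid-search fitting step can be sketched as follows. The SEIZ equations are not reproduced in this transcript, so the example substitutes a forward-Euler SIS model as a stand-in; the function names, the parameter grids, and the synthetic "observed" series are all assumptions for illustration. The error is the sum over time of |I(t) − tweets(t)|.

```python
from itertools import product


def sis_infected(beta, delta, i0, n, steps, dt=1.0):
    """Infected counts over time from a forward-Euler SIS model.

    Stand-in for the SEIZ model; n is the total population size.
    """
    i = float(i0)
    series = [i]
    for _ in range(steps):
        s = n - i
        i += (beta * s * i - delta * i) * dt
        series.append(i)
    return series


def grid_search_fit(observed, i0, n, betas, deltas):
    """Return (error, beta, delta) minimizing sum_t |I(t) - observed(t)|."""
    best = None
    for beta, delta in product(betas, deltas):
        predicted = sis_infected(beta, delta, i0, n, len(observed) - 1)
        error = sum(abs(p - o) for p, o in zip(predicted, observed))
        if best is None or error < best[0]:
            best = (error, beta, delta)
    return best


# Synthetic "tweet counts" generated from known parameters, then re-fit.
observed_counts = sis_infected(0.0005, 0.1, i0=10, n=1000, steps=30)
best_fit = grid_search_fit(
    observed_counts, i0=10, n=1000,
    betas=[0.0001, 0.0003, 0.0005, 0.0007],
    deltas=[0.05, 0.1, 0.2],
)
```

Grid search is exhaustive, so its cost grows with the product of the grid sizes; that is manageable for a handful of parameters with coarse grids.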

  23. Fitting to “Boston Marathon Bombing” SEIZ model better models the real data, especially at initial points

  24. Fitting to “Pope resignation” data SEIZ model better models the real data, especially at initial points

  25. Rumor detection with SEIZ model • Notation: S = Susceptible, I = Infected, E = Exposed, Z = Skeptics • All parameters learned by model fitting to real data (from previous slides) • New metric: [equation not preserved in this transcript]

  26. Rumor detection by $R_{SI}$ • Parameters obtained by fitting the SEIZ model efficiently identify rumors vs. news [Figure label: Rumors]

  27. Today’s Lecture • Introduction • Decision based models of diffusion – Single Adoption – Multiple Adoption • Probabilistic models of diffusion – SEIZ model – Independent cascade model

  28. Linear Threshold Model • A decision-based model • A node v has a random threshold $\theta_v \sim U[0,1]$ • A node v is influenced by each neighbor w according to a weight $b_{v,w}$ such that $\sum_{w \text{ neighbor of } v} b_{v,w} \le 1$ • A node v becomes active when at least a (weighted) $\theta_v$ fraction of its neighbors are active: $\sum_{w \text{ active neighbor of } v} b_{v,w} \ge \theta_v$
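The linear threshold rule can be sketched as a fixed-point loop: keep activating any node whose active neighbors carry total weight at least its threshold, until nothing changes. The data layout (nested dicts), the function name, and the example graph are assumptions for illustration. Thresholds are passed in explicitly here so the run is deterministic; in the model they would be drawn uniformly from [0, 1], for example with `random.random()`.

```python
def linear_threshold(weights, thresholds, seeds):
    """Run the linear threshold model until no more nodes activate.

    weights[v][w]: influence weight of neighbor w on node v
                   (the weights into each v should sum to at most 1)
    thresholds[v]: activation threshold of node v
    seeds: initially active nodes
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in thresholds:
            if v in active:
                continue
            # total weight of v's currently active neighbors
            influence = sum(b for w, b in weights.get(v, {}).items() if w in active)
            if influence >= thresholds[v]:
                active.add(v)
                changed = True
    return active


# Hypothetical example graph: a activates b, then a and b together activate c.
example_weights = {"b": {"a": 0.5}, "c": {"a": 0.2, "b": 0.3}, "d": {"c": 0.1}}
example_thresholds = {"a": 0.9, "b": 0.4, "c": 0.45, "d": 0.3}
final_active = linear_threshold(example_weights, example_thresholds, {"a"})
```

Activation is monotone (active nodes never deactivate), so the order in which nodes are checked does not change the final active set.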

  29. Linear Threshold Model [Figure: step-by-step example on a small graph with edge weights and node thresholds; legend: inactive node, active node, threshold, active neighbors; the process stops when no further node can be activated]

  30. Probabilistic Contagion • Independent Cascade Model – Directed finite graph $G = (V, E)$ – Set $S$ starts out with new behavior • Say nodes with this behavior are “active” – Each edge $(v, w)$ has a probability $p_{vw}$ – If node $v$ is active, it gets one chance to make $w$ active, with probability $p_{vw}$ • Each edge fires at most once • Does scheduling matter? No • If $u$, $v$ are both active at the same time, it doesn’t matter which tries to activate $w$ first – But the time moves in discrete steps
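One run of the independent cascade model can be sketched as a breadth-first process in discrete steps. The edge representation (a dict from (v, w) pairs to probabilities), the function name, and the `seed` parameter are assumptions for illustration; node labels are assumed to be sortable so that runs are reproducible.

```python
import random


def independent_cascade(edges, seeds, seed=0):
    """Simulate one run of the independent cascade model.

    edges: dict mapping a directed edge (v, w) to its probability p_vw
    seeds: initially active nodes
    seed: hypothetical parameter, only used to make runs reproducible
    """
    rng = random.Random(seed)
    out_edges = {}
    for (v, w), p in edges.items():
        out_edges.setdefault(v, []).append((w, p))

    active = set(seeds)
    frontier = sorted(seeds)  # nodes activated in the previous step
    while frontier:
        next_frontier = []
        for v in frontier:
            # v gets exactly one chance to activate each inactive out-neighbor
            for w, p in out_edges.get(v, []):
                if w not in active and rng.random() < p:
                    active.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return active


# Hypothetical example with deterministic edges (probabilities 0 or 1).
example_edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 0.0, ("a", "d"): 0.0}
cascade = independent_cascade(example_edges, {"a"})
```

Each node enters the frontier only once, at the step after it becomes active, so every edge is tried at most once. Attempts on an already active target are skipped, which does not change the outcome.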
