The Dynamics of Dissemination on Graphs: Theory and Algorithms



slide-1
SLIDE 1

The Dynamics of Dissemination on Graphs: Theory and Algorithms

Hanghang Tong

City College, CUNY Hanghang.tong@gmail.com http://www-cs.ccny.cuny.edu/~tong/

slide-2
SLIDE 2

An Example: Virus Propagation/Dissemination

[Figure legend: Healthy, Sick, Contact]

slide-3
SLIDE 3

An Example: Virus Propagation/Dissemination

[Figure legend: Healthy, Sick, Contact]

1: Sneeze to neighbors
2: Some neighbors → Sick
3: Try to recover

slide-4
SLIDE 4

An Example: Virus Propagation/Dissemination

[Figure legend: Healthy, Sick, Contact]

1: Sneeze to neighbors
2: Some neighbors → Sick
3: Try to recover

Q: How to minimize infected population?

slide-5
SLIDE 5

An Example: Virus Propagation/Dissemination

[Figure legend: Healthy, Sick, Contact]

1: Sneeze to neighbors
2: Some neighbors → Sick
3: Try to recover

Q: How to minimize infected population?

  • Q1: Understand tipping point
  • Q2: Minimize the propagation
  • Q3: Maximize the propagation

slide-6
SLIDE 6

Why Do We Care? – Healthcare

[SDM’13b]

Critical Patient Transferring

  • Move patients → specialized care
  • → highly resistant micro-organism → infection controlling
  • → costly & limited

[Figure: US-Medicare Network]

SARS cost 700+ lives and $40+ bn; H1N1 cost Mexico $2.3 bn; Flu 2013: one of the worst in a decade, 105 children in US.

Q: How to allocate resources to minimize overall spreading?

slide-7
SLIDE 7

Why Do We Care? – Healthcare

[SDM’13b]

[Figure: Current Method vs. Our Method. Red: infected hospitals after 365 days]

slide-8
SLIDE 8

Why Do We Care? (More)

  • Rumor Propagation
  • Email Fwd in Organization
  • Malware Infection
  • Viral Marketing

slide-9
SLIDE 9

Roadmap

  • Motivations
  • Q1: Theory – Tipping Point
  • Q2: Minimize the propagation
  • Q3: Maximize the propagation
  • Conclusions


slide-10
SLIDE 10

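SIS Model (e.g., Flu) (Susceptible-Infected-Susceptible)

  • Each node has two statuses: Healthy or Sick
  • β: infection rate, Prob(a healthy node is infected by a sick neighbor)
  • δ: recovery rate, Prob(a sick node recovers and becomes healthy)

[Figure: example graph at t = 1, t = 2, t = 3]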
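The SIS process above can be sketched as a small simulation. This is only an illustration under stated assumptions: time is discrete, updates are synchronous, each sick neighbor infects independently with probability β, and the names `sis_step` and `simulate_sis` are hypothetical (they do not come from the slides).

```python
import random


def sis_step(neighbors, sick, beta, delta, rng):
    """One synchronous SIS time tick; returns the new set of sick nodes."""
    new_sick = set()
    for node in neighbors:
        if node in sick:
            # A sick node recovers with probability delta, otherwise stays sick.
            if rng.random() >= delta:
                new_sick.add(node)
        else:
            # Each sick neighbor independently infects with probability beta.
            for nb in neighbors[node]:
                if nb in sick and rng.random() < beta:
                    new_sick.add(node)
                    break
    return new_sick


def simulate_sis(neighbors, initially_sick, beta, delta, ticks, seed=0):
    """Return the infected ratio at every time tick (including t = 0)."""
    rng = random.Random(seed)
    sick = set(initially_sick)
    history = [len(sick) / len(neighbors)]
    for _ in range(ticks):
        sick = sis_step(neighbors, sick, beta, delta, rng)
        history.append(len(sick) / len(neighbors))
    return history


# Hypothetical example graph: a ring of 6 nodes, node 0 initially sick.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
ratios = simulate_sis(ring, {0}, beta=0.3, delta=0.4, ticks=20)
```

Plotting `ratios` against the tick index gives the kind of "infection ratio vs. time ticks" curve used later in the deck.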

slide-11
SLIDE 11

SIS Model as an NLDS (Non-Linear Dynamical System)

$p_{t+1} = g(p_t)$

  • $p_{t+1}$: prob. vector of nodes being sick at (t+1)
  • $p_t$: prob. vector of nodes being sick at t
  • $g$: non-linear function; depends on (1) graph structure and (2) virus parameters (β, δ)

slide-12
SLIDE 12

SIS Model (e.g., Flu)

$p_{t+1} = g(p_t)$

Theorem [Chakrabarti+ 2003, 2007]: If $\lambda \times (\beta/\delta) \le 1$, no epidemic (for any initial conditions of the graph).

  • λ: largest eigenvalue of the graph (~ connectivity of the graph)
  • β, δ: virus parameters (~ strength of the virus)

[Figure: Infection Ratio vs. Time Ticks]
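To make the threshold condition concrete, here is a minimal sketch that estimates λ by power iteration and evaluates λ × (β/δ). It assumes a dense 0/1 adjacency matrix given as nested lists; the helper names `largest_eigenvalue` and `below_threshold` are hypothetical, and the shift by the identity is a choice of this sketch to keep the iteration stable on bipartite graphs.

```python
def largest_eigenvalue(adj, iters=2000):
    """Estimate the largest eigenvalue of a non-negative matrix.

    Power iteration is run on (A + I); the shift avoids oscillation on
    bipartite graphs, and the eigenvalue of A is recovered by subtracting 1.
    """
    n = len(adj)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        lam = total - 1.0  # sum(x) == 1, so total estimates lambda + 1
        x = [val / total for val in y]
    return lam


def below_threshold(adj, beta, delta):
    """Return (lambda * beta / delta, True if the no-epidemic condition holds)."""
    strength = largest_eigenvalue(adj) * beta / delta
    return strength, strength <= 1.0


# Hypothetical example: a complete graph on 4 nodes.
k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
strength, safe = below_threshold(k4, beta=0.1, delta=0.5)
```

For real graphs a sparse eigensolver would be the usual choice; the pure-Python loop is only meant to show what quantity is being compared with 1.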

slide-13
SLIDE 13

Beyond Static Graphs: Alternating Behavior

[PKDD 2010, Networking 2011]

$A_1$: adjacency matrix (8 × 8), DAY (e.g., work, school)

slide-14
SLIDE 14

Beyond Static Graphs: Alternating Behavior

[PKDD 2010, Networking 2011]

$A_2$: adjacency matrix (8 × 8), NIGHT (e.g., home)

slide-15
SLIDE 15

Formal Model Description

[PKDD 2010, Networking 2011]

  • SIS model
      – recovery rate δ
      – infection rate β
  • Set of T arbitrary graphs: day (N × N), night (N × N), weekend, …

[Figure: node X with neighbors N1, N2, N3; legend: Infected, Healthy; edge labels: Prob. β, Prob. δ, Prob. δ]

slide-16
SLIDE 16

Epidemic Threshold for Alternating Behavior

[PKDD 2010, Networking 2011]

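Theorem [PKDD 2010, Networking 2011]: No epidemic if $\lambda(S) \le 1$.

System matrix: $S = \prod_i S_i$, where $S_i = (1-\delta) I + \beta A_i$

  • $A_i$: adjacency matrix (N × N) of the i-th graph (day, night, …)

[Figure: Log (Infection Ratio) vs. Time Ticks; curves: Below, At Threshold, Above]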
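The system matrix for alternating graphs can be sketched as follows. This is an illustration only: matrices are dense nested lists, the factors are multiplied in the order the graphs are listed, and `system_matrix`, `matmul`, and `spectral_radius` are hypothetical helper names rather than anything defined in the slides.

```python
def matmul(a, b):
    """Plain dense matrix product of two square nested-list matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def system_matrix(adjs, beta, delta):
    """S = product over i of S_i, with S_i = (1 - delta) * I + beta * A_i."""
    n = len(adjs[0])
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for adj in adjs:
        s_i = [[(1.0 - delta if i == j else 0.0) + beta * adj[i][j]
                for j in range(n)] for i in range(n)]
        s = matmul(s, s_i)
    return s


def spectral_radius(matrix, iters=2000):
    """Largest eigenvalue of a non-negative matrix via shifted power iteration."""
    n = len(matrix)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        y = [x[i] + sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        lam = total - 1.0  # iterating on (M + I), so subtract the shift
        x = [val / total for val in y]
    return lam


# Hypothetical example: everyone meets by day, nobody meets at night.
day = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
night = [[0] * 4 for _ in range(4)]
lam_s = spectral_radius(system_matrix([day, night], beta=0.1, delta=0.5))
no_epidemic = lam_s <= 1.0
```

With a single graph (T = 1) the condition λ(S) ≤ 1 reduces to 1 − δ + βλ ≤ 1, which is the static threshold λ × (β/δ) ≤ 1.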

slide-17
SLIDE 17

Why is λ So Important? Intuitions

  • λ → capacity of a graph: larger λ → better connected

slide-18
SLIDE 18

Why is λ So Important? Details

  • Key 1: Model dissemination as an NLDS: $p_{t+1} = g(p_t)$
      – $p_t$: prob. vector of nodes being sick at t
      – $g$: non-linear function (graph + virus parameters)
  • Key 2: Asymptotic stability of NLDS [PKDD 2010]: $p = p^* = 0$ is asymptotically stable if $|\lambda(J)| < 1$, where J is the Jacobian of g at $p^*$
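A sketch of how the stability condition connects to the threshold λ × (β/δ). The exact form of g is not shown on the slide, so the first line below is an assumption: it is one common discrete-time SIS update, in which a node stays healthy only if no sick neighbor infects it. Linearizing around p = 0 gives the Jacobian.

```latex
\begin{align*}
1 - p_{i,t+1} &= \bigl(1 - (1-\delta)\,p_{i,t}\bigr)\prod_{j}\bigl(1 - \beta A_{ij}\,p_{j,t}\bigr)
  && \text{(assumed SIS update)} \\
p_{i,t+1} &\approx (1-\delta)\,p_{i,t} + \beta \sum_{j} A_{ij}\,p_{j,t}
  && \text{(first order in } p_t\text{)} \\
J &= (1-\delta)\,I + \beta A,
  \qquad \lambda(J) = 1 - \delta + \beta\lambda \\
\lambda(J) < 1 &\iff \lambda \times (\beta/\delta) < 1
\end{align*}
```

Under this assumption, J has the same form as the per-graph factor $S_i = (1-\delta) I + \beta A_i$ used for alternating graphs.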

slide-19
SLIDE 19

Roadmap

  • Motivations
  • Q1: Theory – Tipping Point
  • Q2: Minimize the propagation
  • Q3: Maximize the propagation
  • Conclusions


slide-20
SLIDE 20

Minimizing Propagation: Edge Deletion

  • Given: a graph A, virus propagation model and budget k;
  • Find: delete k ‘best’ edges from A to minimize λ

[Figure: a ‘Bad’ vs. a ‘Good’ choice of edges to delete]

slide-21
SLIDE 21

Q: How to find k best edges to delete efficiently?

[CIKM12 a]

  • Left eigen-score of source
  • Right eigen-score of target
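One way to turn the two eigen-scores into an edge ranking is to score each existing edge (i, j) by u(i) · v(j) and delete the k highest-scoring edges. The sketch below does that under stated assumptions: the product form is borrowed from the first-order perturbation formula used for edge addition, `adj[i][j] = 1` means an edge from i to j, and `perron`, `best_edges_to_delete`, and `delete_edges` are hypothetical names, not the authors' implementation.

```python
def perron(matrix, iters=2000):
    """Largest eigenvalue and a matching eigenvector of a non-negative matrix."""
    n = len(matrix)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        # Power iteration on (M + I); subtract the shift to recover lambda.
        y = [x[i] + sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        lam = total - 1.0
        x = [val / total for val in y]
    return lam, x


def best_edges_to_delete(adj, k):
    """Rank existing edges (i, j) by u[i] * v[j] and return the top k."""
    n = len(adj)
    _, v = perron(adj)                                  # right: A v = lambda v
    _, u = perron([list(col) for col in zip(*adj)])     # left: u^T A = lambda u^T
    scored = [(u[i] * v[j], i, j)
              for i in range(n) for j in range(n) if adj[i][j]]
    scored.sort(reverse=True)
    return [(i, j) for _, i, j in scored[:k]]


def delete_edges(adj, edges):
    """Return a copy of adj with the given directed edges removed."""
    new_adj = [row[:] for row in adj]
    for i, j in edges:
        new_adj[i][j] = 0
    return new_adj


# Hypothetical example: a dense triangle {0, 1, 2} plus a sparse cycle 2 -> 3 -> 4 -> 2.
example = [[0] * 5 for _ in range(5)]
for i, j in [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1), (2, 3), (3, 4), (4, 2)]:
    example[i][j] = 1
picked = best_edges_to_delete(example, k=1)
lam_before, _ = perron(example)
lam_after, _ = perron(delete_edges(example, picked))
```

The eigenvectors are computed once and all k edges are picked from that single ranking; re-computing them after each deletion is a possible variation that this sketch does not explore.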

slide-22
SLIDE 22

Minimizing Propagation: Evaluations [CIKM12 a]

[Figure: Log (Infected Ratio) vs. Time Ticks, with the ‘better’ direction and Our Method marked]

Data set: Oregon Autonomous System Graph (14K nodes, 61K edges)

slide-23
SLIDE 23

Discussions: Node Deletion vs. Edge Deletion

  • Observations:
      – Node or edge deletion → λ decreases
      – Nodes on A = edges on its line graph L(A)
  • Questions:
      – Edge deletion on A = node deletion on L(A)?
      – Which strategy is better (when both are feasible)?

[Figure: Original Graph A and its Line Graph L(A)]

slide-24
SLIDE 24

Discussions: Node Deletion vs. Edge Deletion

  • Q: Is Edge Deletion on A = Node Deletion on L(A)?
  • A: Yes!
  • But, Node Deletion itself is not easy:

Theorem (Hardness of Node Deletion): Finding the optimal k-node immunization is NP-hard.

Theorem (Line Graph Spectrum): Eigenvalue of A → Eigenvalue of L(A)
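The equivalence between edge deletion on A and node deletion on L(A) can be checked on a small example. The sketch assumes the directed line graph (a node of L(A) is an edge of A, and (i, j) points to (j, k) whenever both edges exist). The relation between the two spectra is truncated on the slide, so the code only compares the largest eigenvalues numerically on one graph; all function names are hypothetical.

```python
def line_graph(edges):
    """Directed line graph: (i, j) -> (j, k) whenever both edges are in A."""
    return {(e, f) for e in edges for f in edges if e[1] == f[0]}


def delete_node(graph_edges, node):
    """Remove a node, and every edge touching it, from an edge set."""
    return {(a, b) for a, b in graph_edges if a != node and b != node}


def largest_eigenvalue(nodes, graph_edges, iters=2000):
    """Shifted power iteration on the adjacency structure given as an edge set."""
    nodes = list(nodes)
    index = {node: pos for pos, node in enumerate(nodes)}
    x = [1.0 / len(nodes)] * len(nodes)
    lam = 0.0
    for _ in range(iters):
        y = x[:]                          # the "+ I" shift
        for a, b in graph_edges:
            y[index[a]] += x[index[b]]    # accumulates (A x)
        total = sum(y)
        lam = total - 1.0
        x = [val / total for val in y]
    return lam


# Hypothetical example graph A on nodes 0..4.
a_edges = {(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1), (2, 3), (3, 4), (4, 2)}
l_edges = line_graph(a_edges)

# Deleting edge (0, 2) from A gives the same graph as deleting node (0, 2) from L(A).
same_graph = line_graph(a_edges - {(0, 2)}) == delete_node(l_edges, (0, 2))

lam_a = largest_eigenvalue(range(5), a_edges)
lam_l = largest_eigenvalue(a_edges, l_edges)
```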

slide-25
SLIDE 25

Discussions: Node Deletion vs. Edge Deletion

  • Q: Which strategy is better (when both feasible)?
  • A: Edge Deletion > Node Deletion

[Figure legend, with the ‘better’ direction marked]

  • Green: Node Deletion [ICDM 2010] (e.g., shut down a Twitter account)
  • Red: Edge Deletion (e.g., un-friend two users)

slide-26
SLIDE 26

Roadmap

  • Motivations
  • Q1: Theory – Tipping Point
  • Q2: Minimize the propagation
  • Q3: Maximize the propagation
  • Conclusions


slide-27
SLIDE 27

Maximizing Dissemination: Edge Addition

  • Given: a graph A, virus propagation model and budget k;
  • Find: add k ‘best’ new edges into A.
  • By 1st-order perturbation, we have

$\lambda_S - \lambda \approx \mathrm{Gv}(S) = c \sum_{e \in S} u(i_e)\, v(j_e)$

      – $u(i_e)$: left eigen-score of the source of edge e
      – $v(j_e)$: right eigen-score of the target of edge e
  • So, we are done (?) But … it has $O(n^2 - m)$ complexity

[Figure: example edge sets with Low Gv vs. High Gv]

slide-28
SLIDE 28

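Maximizing Dissemination: Edge Addition

$\lambda_S - \lambda \approx \mathrm{Gv}(S) = c \sum_{e \in S} u(i_e)\, v(j_e)$

  • Q: How to find k new edges w/ highest Gv(S)?
  • A: Modified Fagin’s algorithm
      – #1: Sorting sources by u
      – #2: Sorting targets by v
      – #3: Search space: the (k+d) × (k+d) block of top sources and top targets, skipping existing edges

Time complexity: $O(m + nt + kt^2)$, t = max(k, d)

[Figure: k × k and (k+d) × (k+d) blocks of the sorted source/target grid, with existing edges marked]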
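The restricted search can be sketched as follows. This is not the authors' algorithm, only an illustration of the three steps under stated assumptions: u and v are given non-negative score lists, d is taken to be the maximum in-degree or out-degree (the slides do not define d), self-loops are not allowed, and one extra row and column is searched to compensate for skipping self-loops. `top_k_new_edges` is a hypothetical name.

```python
def top_k_new_edges(n, edges, u, v, k):
    """Pick k absent directed edges (i, j), i != j, with the largest u[i] * v[j].

    Only the block of top-ranked sources and targets is searched instead of
    all n * n pairs. Scores in u and v are assumed to be non-negative.
    """
    out_deg = [0] * n
    in_deg = [0] * n
    for i, j in edges:
        out_deg[i] += 1
        in_deg[j] += 1
    d = max(out_deg + in_deg) if edges else 0
    t = min(n, k + d + 1)  # "+ 1" is this sketch's allowance for self-loops

    sources = sorted(range(n), key=lambda i: -u[i])[:t]   # step 1: sort by u
    targets = sorted(range(n), key=lambda j: -v[j])[:t]   # step 2: sort by v

    # Step 3: score every non-existing pair inside the t x t block.
    candidates = [(u[i] * v[j], i, j)
                  for i in sources for j in targets
                  if i != j and (i, j) not in edges]
    candidates.sort(reverse=True)
    return [(i, j) for _, i, j in candidates[:k]]


# Hypothetical example: 8 nodes, made-up eigen-scores.
existing = {(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2), (0, 2)}
u_scores = [0.9, 0.5, 0.8, 0.3, 0.1, 0.05, 0.02, 0.01]
v_scores = [0.7, 0.6, 0.9, 0.2, 0.1, 0.04, 0.03, 0.01]
new_edges = top_k_new_edges(8, existing, u_scores, v_scores, k=2)
```

The block is large enough because each top source (target) can lose at most d partners to existing edges, so at least k valid pairs inside the block score no lower than any pair outside it. The sketch sorts all nodes for simplicity, so it does not reproduce the stated time complexity.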

slide-29
SLIDE 29

Maximizing Dissemination: Evaluation

[Figure: Log (Infected Ratio) vs. Time Ticks, with the ‘better’ direction marked]

slide-30
SLIDE 30

Conclusions

  • Goal: Guide dissemination by optimizing the graph G
  • Theory: Optimizing dissemination = optimizing λ
  • Algorithms:
      – NetMelt to minimize dissemination
      – NetGel to maximize dissemination
  • More on This Topic
      – Beyond Link Structure (content, attribute) [WWW11]
      – Beyond Full Immunity [SDM13b]
      – Node Deletion [ICDM2010]
      – Higher Order Variants [CIKM12a]
      – Immunization on Dynamic Graphs [PKDD10]

Acknowledgement: Lada A. Adamic, Albert-László Barabási, Tina Eliassi-Rad, Christos Faloutsos, Michalis Faloutsos, Theodore J. Iwashyna, B. Aditya Prakash, Chaoming Song, Spiros Papadimitriou, Dashun Wang.