Comparing Compartment and Agent-based Models (Shannon Gallagher, JSM)




slide-1
SLIDE 1

Comparing Compartment and Agent-based Models

Shannon Gallagher
JSM, Baltimore, MD
August 2, 2017

Thesis work with: William F. Eddy (Chair), Joel Greenhouse, Howard Seltman, Cosma Shalizi, Samuel L. Ventura

slide-2
SLIDE 2

Goal: Combine two good models into a better one


slide-3
SLIDE 3

Studying infectious disease is important


slide-4
SLIDE 4

Compartment vs. Agent-based Models

slide-6
SLIDE 6

Compartment models (CMs) describe how individuals evolve over time

Assumptions (Anderson and May 1992) :

  • 1. Homogeneity of individuals
  • 2. Law of mass action

I(t + 1) ∝ I(t)


slide-7
SLIDE 7

Agent-based models (AMs) simulate the spread of disease

Assumptions (Helbing 2002):

  • 1. Heterogeneity of agents
  • 2. Model adequately reflects reality

slide-9
SLIDE 9

CMs and AMs: a side-by-side comparison

CMs

∙ Equation-based
∙ Computationally fast
∙ Homogeneous individuals
∙ No individual properties

AMs

∙ Simulation-based
∙ Computationally slow
∙ Heterogeneous individuals
∙ Individual properties

slide-10
SLIDE 10

Combining the two together

Previous hybrid approaches (Bobashev 2007, Banos 2015, Wallentin 2017):

∙ ad hoc approaches
∙ perspective from non-statisticians

Goal: Create a statistically justified hybrid model

slide-12
SLIDE 12

Current Work

slide-13
SLIDE 13

There are two main avenues of improvement

  • 1. Quantifying how similar CMs and AMs are
  • 2. Speeding up AM run-time


slide-14
SLIDE 14

The SIR model: a detailed look

(Kermack and McKendrick 1927)

    

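$$\left\{ \begin{aligned} \frac{dS}{dt} &= -\frac{\beta S I}{N} \\ \frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I \\ \frac{dR}{dt} &= \gamma I \end{aligned} \right.$$

∙ β – rate of infection
∙ γ – rate of recovery
∙ N – total population size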

slide-15
SLIDE 15

The SIR model: a detailed look

(Kermack and McKendrick 1927)

    

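$$\left\{ \begin{aligned} \frac{\Delta S}{\Delta t} &= -\frac{\beta S I}{N} \\ \frac{\Delta I}{\Delta t} &= \frac{\beta S I}{N} - \gamma I \\ \frac{\Delta R}{\Delta t} &= \gamma I \end{aligned} \right.$$

∙ β – rate of infection
∙ γ – rate of recovery
∙ N – total population size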
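The difference-equation form above can be iterated directly. The following is a minimal sketch, assuming a unit time step (∆t = 1); the function name `discrete_sir` and the initial infected count of 10 are illustrative assumptions, while β = 0.10 and γ = 0.03 are the values quoted on the later figure slides.

```python
def discrete_sir(beta, gamma, n_total, i0, n_steps):
    """Iterate the difference-equation SIR model with a unit time step."""
    s, i, r = float(n_total - i0), float(i0), 0.0
    path = [(s, i, r)]
    for _ in range(n_steps):
        new_infections = beta * s * i / n_total  # equals -ΔS
        new_recoveries = gamma * i               # equals ΔR
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        path.append((s, i, r))
    return path


# i0 = 10 is an assumption made for illustration; the slides do not give it.
path = discrete_sir(beta=0.10, gamma=0.03, n_total=1000, i0=10, n_steps=100)
s_end, i_end, r_end = path[-1]
print(f"t=100: S={s_end:.1f}, I={i_end:.1f}, R={r_end:.1f}")
```

Because each step moves mass between compartments without creating or destroying it, S + I + R stays equal to N throughout.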

slide-16
SLIDE 16

Our stochastic CM approach

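$$\hat{S}(t+1) = \hat{S}(t) - s_{t+1}$$
$$\hat{R}(t+1) = \hat{R}(t) + r_{t+1}$$
$$\hat{I}(t+1) = N - \hat{S}(t+1) - \hat{R}(t+1),$$

with

$$s_{t+1} \sim \text{Binomial}\left(\hat{S}(t), \frac{\beta I(t)}{N}\right), \qquad r_{t+1} \sim \text{Binomial}\left(\hat{I}(t), \gamma\right).$$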
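A minimal sketch of one run of this stochastic CM, using only the Python standard library. It rests on two assumptions that the slide does not spell out: the un-hatted I(t) in the infection probability is read as the deterministic SIR curve, and the initial infected count is 10. The names `simulate_cm`, `binomial`, and `deterministic_infected` are hypothetical.

```python
import random


def binomial(rng, n, p):
    """Draw Binomial(n, p) as a sum of n Bernoulli(p) draws (fine for small n)."""
    return sum(rng.random() < p for _ in range(n))


def deterministic_infected(beta, gamma, n_total, i0, n_steps):
    """I(t) for t = 0..n_steps from the difference-equation SIR model."""
    s, i = float(n_total - i0), float(i0)
    curve = [i]
    for _ in range(n_steps):
        new_inf = beta * s * i / n_total
        s, i = s - new_inf, i + new_inf - gamma * i
        curve.append(i)
    return curve


def simulate_cm(beta, gamma, n_total, i0, n_steps, seed=None):
    """One run of the stochastic CM; returns (S_hat, I_hat, R_hat) per step."""
    rng = random.Random(seed)
    # Assumption: the infection probability uses the deterministic I(t).
    i_det = deterministic_infected(beta, gamma, n_total, i0, n_steps)
    s_hat, i_hat, r_hat = n_total - i0, i0, 0
    path = [(s_hat, i_hat, r_hat)]
    for t in range(n_steps):
        p_t = beta * i_det[t] / n_total
        s_new = binomial(rng, s_hat, p_t)    # s_{t+1}
        r_new = binomial(rng, i_hat, gamma)  # r_{t+1}
        s_hat -= s_new
        r_hat += r_new
        i_hat = n_total - s_hat - r_hat
        path.append((s_hat, i_hat, r_hat))
    return path


cm_path = simulate_cm(0.10, 0.03, n_total=1000, i0=10, n_steps=100, seed=1)
print(cm_path[-1])
```

Replacing `i_det[t]` with the simulated `i_hat` would give a model with feedback between agents, which is a different reading of the notation.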

slide-17
SLIDE 17

Our stochastic AM approach

For an agent $x_n(t)$, $n = 1, 2, \dots, N$, the forward operator for $t > 0$ is

$$x_n(t+1) = \begin{cases} x_n(t) + \text{Bernoulli}\left(\frac{\beta I(t)}{N}\right) & \text{if } x_n(t) = 1 \\ x_n(t) + \text{Bernoulli}(\gamma) & \text{if } x_n(t) = 2 \\ x_n(t) & \text{otherwise,} \end{cases}$$

where $x_n(t) = k$, $k \in \{1, 2, 3\}$, corresponds to state S, I, and R, respectively. Let the aggregate total in each compartment be

$$\hat{X}_k(t) = \sum_{n=1}^{N} \mathbb{I}\{x_n(t) = k\}.$$
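The forward operator can be sketched agent by agent as follows. As with any reading of the slide, this assumes that the un-hatted I(t) is the deterministic SIR curve and that 10 agents start infected; `simulate_am` and `deterministic_infected` are hypothetical names, not part of the original talk.

```python
import random

S, I, R = 1, 2, 3  # state codes k = 1, 2, 3


def deterministic_infected(beta, gamma, n_total, i0, n_steps):
    """I(t) for t = 0..n_steps from the difference-equation SIR model."""
    s, i = float(n_total - i0), float(i0)
    curve = [i]
    for _ in range(n_steps):
        new_inf = beta * s * i / n_total
        s, i = s - new_inf, i + new_inf - gamma * i
        curve.append(i)
    return curve


def simulate_am(beta, gamma, n_total, i0, n_steps, seed=None):
    """One AM run; returns the aggregate counts (X_S, X_I, X_R) per step."""
    rng = random.Random(seed)
    # Assumption: the infection probability uses the deterministic I(t).
    i_det = deterministic_infected(beta, gamma, n_total, i0, n_steps)
    agents = [I] * i0 + [S] * (n_total - i0)
    counts = [tuple(agents.count(k) for k in (S, I, R))]
    for t in range(n_steps):
        p_t = beta * i_det[t] / n_total
        for n, state in enumerate(agents):
            if state == S:
                agents[n] = state + (rng.random() < p_t)    # Bernoulli(p_t)
            elif state == I:
                agents[n] = state + (rng.random() < gamma)  # Bernoulli(gamma)
            # otherwise (state == R) the agent is left unchanged
        counts.append(tuple(agents.count(k) for k in (S, I, R)))
    return counts


am_counts = simulate_am(0.10, 0.03, n_total=1000, i0=10, n_steps=100, seed=1)
print(am_counts[-1])
```

Each agent is updated at most once per step, so an agent infected at step t cannot also recover at step t.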

slide-18
SLIDE 18

The means overlap

[Figure: Mean Proportion of Compartment Values. x-axis: Time; y-axis: % of Population; series: S, I, and R for the CM and for the AM. 1000 agents; 5000 runs; β = 0.10; γ = 0.03]

slide-19
SLIDE 19

The distributions look the same


slide-20
SLIDE 20

These approaches are equivalent

Theorem. Let the CM and AM be as previously described. Then for all $t \in \{1, 2, \dots, T\}$,

$$\hat{S}(t) \overset{d}{=} \hat{X}_S(t), \tag{1}$$
$$\hat{I}(t) \overset{d}{=} \hat{X}_I(t),$$
$$\hat{R}(t) \overset{d}{=} \hat{X}_R(t).$$
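One way to build intuition for the theorem (an illustration of a single step, not the author's proof) is that a sum of independent Bernoulli(p) draws over the susceptible agents has the same distribution as a single Binomial draw. The sketch below checks this numerically by convolution; the values of `n_susceptible` and `p_t` are illustrative assumptions.

```python
from math import comb


def bernoulli_sum_pmf(n, p):
    """PMF of the sum of n independent Bernoulli(p) draws, by convolution."""
    pmf = [1.0]
    for _ in range(n):
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)  # this agent stays susceptible
            nxt[k + 1] += mass * p    # this agent becomes infected
        pmf = nxt
    return pmf


def binomial_pmf(n, p):
    """PMF of Binomial(n, p) from the closed-form expression."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


# Illustrative values, not taken from the slides.
n_susceptible, p_t = 50, 0.10 * 10 / 1000
am_side = bernoulli_sum_pmf(n_susceptible, p_t)  # agent-level view
cm_side = binomial_pmf(n_susceptible, p_t)       # compartment-level view
max_gap = max(abs(a - c) for a, c in zip(am_side, cm_side))
print(f"largest pmf difference: {max_gap:.2e}")
```

The two probability mass functions agree up to floating-point rounding.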

slide-22
SLIDE 22

We can compare CM/AM pairs and AM/AM pairs by fitting the underlying model

[Figure: Fitted SIR parameters distribution. Axes: fitted β and fitted γ; grouped by simulation type (CM, AM). 1000 agents; 5000 runs; β = 0.10; γ = 0.03]
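The slides do not say how the underlying SIR model is fitted. As one possible sketch, assuming a least-squares fit through the origin on the one-step differences, β and γ can be recovered from a single simulated run. The stand-in simulator below uses the simulated infected count in the infection probability for brevity, and every name here (`simulate_run`, `fit_sir`) is hypothetical.

```python
import random


def simulate_run(beta, gamma, n_total, i0, n_steps, seed):
    """Stand-in stochastic SIR run returning (S, I, R) counts per step."""
    rng = random.Random(seed)
    s, i, r = n_total - i0, i0, 0
    path = [(s, i, r)]
    for _ in range(n_steps):
        p_inf = beta * i / n_total
        new_inf = sum(rng.random() < p_inf for _ in range(s))
        new_rec = sum(rng.random() < gamma for _ in range(i))
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        path.append((s, i, r))
    return path


def fit_sir(path, n_total):
    """Least squares through the origin for beta and gamma.

    Regresses -ΔS on S*I/N (slope beta) and ΔR on I (slope gamma).
    """
    num_b = den_b = num_g = den_g = 0.0
    for (s_a, i_a, r_a), (s_b, i_b, r_b) in zip(path, path[1:]):
        x = s_a * i_a / n_total
        num_b += (s_a - s_b) * x
        den_b += x * x
        num_g += (r_b - r_a) * i_a
        den_g += i_a * i_a
    return num_b / den_b, num_g / den_g


truth = dict(beta=0.10, gamma=0.03)
path = simulate_run(n_total=1000, i0=10, n_steps=100, seed=0, **truth)
beta_hat, gamma_hat = fit_sir(path, n_total=1000)
print(f"beta_hat={beta_hat:.3f}, gamma_hat={gamma_hat:.3f}")
```

Repeating the fit over many runs of each simulation type would give the kind of parameter distribution the figure describes.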

slide-23
SLIDE 23

AMs are appealing because they can be run multiple times

∙ Simulate an epidemic en masse!
∙ A run - same initial parameters, different random numbers
∙ Runs (L) are independent of one another ⟹ parallelization
∙ Roughly, the variance of compartments ↓ when N, L ↑

Goal: Improve computation time without sacrificing statistical details
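Since runs share parameters and differ only in their random numbers, they can be dispatched to a pool of workers. The sketch below is one way to do that with the standard library, giving each run its own seeded generator; `one_run`, the 100-agent population, and the 20 seeds are illustrative assumptions. It uses threads so that it runs anywhere, although in CPython threads do not speed up CPU-bound work; swapping in `ProcessPoolExecutor`, which offers the same `map` interface, is the usual route to a real speed-up.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def one_run(seed, beta=0.10, gamma=0.03, n_total=100, i0=5, n_steps=50):
    """One independent run; returns the final susceptible proportion."""
    rng = random.Random(seed)  # private generator, so runs do not interact
    s, i = n_total - i0, i0
    for _ in range(n_steps):
        p_inf = beta * i / n_total
        new_inf = sum(rng.random() < p_inf for _ in range(s))
        new_rec = sum(rng.random() < gamma for _ in range(i))
        s, i = s - new_inf, i + new_inf - new_rec
    return s / n_total


seeds = range(20)  # L = 20 runs, one seed per run
serial = [one_run(seed) for seed in seeds]
with ThreadPoolExecutor(max_workers=4) as pool:
    pooled = list(pool.map(one_run, seeds))  # map preserves input order

mean_s = sum(pooled) / len(pooled)
print(f"mean final susceptible proportion over {len(pooled)} runs: {mean_s:.3f}")
```

Because each run depends only on its seed, the pooled results match the serial ones exactly.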

slide-25
SLIDE 25

There is a tradeoff between the number of agents and number of runs

[Figure, three panels, each plotted against Time: Ratio of Variance of # Susceptibles, V(S1)/V(S2); Ratio of Variance of # Infected, V(I1)/V(I2); Ratio of Variance of # Recovered, V(R1)/V(R2). 5000 runs; β = 0.10; γ = 0.03; Model 1: 1000 agents; Model 2: 100 agents]

slide-29
SLIDE 29

The calculations show that the variance scales

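∙ Note that for a given β and γ, if $S_1(0)/N_1 = S_2(0)/N_2$, then $S_1(t)/N_1 = S_2(t)/N_2$
∙ $V[\hat{S}(t+1)] = S(t)(1 - p_t)p_t + (1 - p_t)^2 V[\hat{S}(t)]$
∙ $V[\hat{S}_2(t)] = \frac{N_2}{N_1} V[\hat{S}_1(t)]$

$$\frac{V\left[\frac{1}{L_1}\sum_{\text{runs } \ell} \frac{\hat{S}_1(t)}{N_1}\right]}{V\left[\frac{1}{L_2}\sum_{\text{runs } \ell} \frac{\hat{S}_2(t)}{N_2}\right]} = \frac{L_2 N_2^2}{L_1 N_1^2} \cdot \frac{V[\hat{S}_1(t)]}{V[\hat{S}_2(t)]} = \frac{L_2 N_2}{L_1 N_1}.$$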

We can replace agents with runs!
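The scaling can be checked numerically by iterating the variance recursion from the slide. This is a sketch under stated assumptions: $p_t$ is taken to be $\beta I(t)/N$ with the deterministic $I(t)$, the initial infected fraction is 1% in both models, and the run counts $L_1$ and $L_2$ are chosen for illustration. The function name `susceptible_variance` is hypothetical.

```python
def susceptible_variance(beta, gamma, n_total, i0, n_steps):
    """V[S_hat(t)] for t = 0..n_steps from the recursion on the slide."""
    s, i = float(n_total - i0), float(i0)  # deterministic S(t), I(t)
    var = 0.0                              # S_hat(0) is fixed, so V = 0
    out = [var]
    for _ in range(n_steps):
        p_t = beta * i / n_total           # assumed form of p_t
        var = s * (1 - p_t) * p_t + (1 - p_t) ** 2 * var
        s, i = s * (1 - p_t), i + s * p_t - gamma * i
        out.append(var)
    return out


n1, n2 = 1000, 100  # Model 1 and Model 2 population sizes from the figure
v1 = susceptible_variance(0.10, 0.03, n_total=n1, i0=10, n_steps=100)
v2 = susceptible_variance(0.10, 0.03, n_total=n2, i0=1, n_steps=100)

# Ratio of count variances: equals N2 / N1 up to rounding.
ratio_counts = v2[50] / v1[50]

# Ratio of variances of run-averaged proportions: L2*N2 / (L1*N1).
l1, l2 = 5000, 50000  # illustrative run counts, chosen so L1*N1 == L2*N2
ratio_means = (l2 * n2**2) / (l1 * n1**2) * v1[50] / v2[50]
print(ratio_counts, ratio_means)
```

With ten times fewer agents and ten times more runs, the averaged proportions have the same variance, which is the sense in which agents can be traded for runs.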

slide-31
SLIDE 31

Through parallelization, we can get a speed-up without losing statistical information

[Figure, three panels: Variance of S(t), Variance of I(t), Variance of R(t). x-axis: Time; y-axis: variance of $\sum_\ell \hat{S}(t)/(NL)$ (respectively $\hat{I}(t)$, $\hat{R}(t)$), the % susceptible (respectively infected, recovered) averaged over # of runs; series: 100 agents, 4 cores vs. 400 agents, 1 core]

Simulation 1 (100 agents, 4 cores, 100 times): 3:30 minutes
Simulation 2 (400 agents, 1 core, 100 times): 4:05 minutes
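As a rough arithmetic check on these timings (a sketch, assuming that the 4-core simulation produces four times as many runs per repetition as the 1-core simulation, $L_1 = 4L_2$, which the slide does not state explicitly), the variance ratio from the scaling result and the wall-clock ratio work out to:

$$\frac{L_2 N_2}{L_1 N_1} = \frac{L_2 \cdot 400}{4 L_2 \cdot 100} = 1, \qquad \frac{4{:}05}{3{:}30} = \frac{245\ \text{s}}{210\ \text{s}} = \frac{7}{6} \approx 1.17.$$

Under that assumption the two simulations carry the same variance for the averaged proportions, while the 4-core version finishes in about 14% less wall-clock time.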

slide-32
SLIDE 32

Future work

slide-33
SLIDE 33

There is more work to be done: short-term

∙ Implementation of current methods in FRED

∙ FRED - an open source, supported, flexible AM
∙ Incorporate different levels of homogeneity

  • 1. Independent agents
  • 2. Agents go to one other activity (school, work, neighborhood)
  • 3. Multiple activities

∙ Compare CM and AM parameters empirically

∙ Empirically determine when different regions can be combined

slide-34
SLIDE 34

Thank you!

Questions?
