[PPT] - Comparing Compartment and Agent-based Models Shannon Gallagher JSM PowerPoint Presentation

SLIDE 1

Comparing Compartment and Agent-based Models

Shannon Gallagher JSM Baltimore, MD August 2, 2017

Thesis work with: William F. Eddy (Chair), Joel Greenhouse, Howard Seltman, Cosma Shalizi, Samuel L. Ventura

SLIDE 2

Goal: Combine two good models into a better one

SLIDE 3

Studying infectious disease is important

SLIDE 4

Compartment vs. Agent-based Models

SLIDE 6

Compartment models (CMs) describe how individuals evolve over time

Assumptions (Anderson and May 1992):

1. Homogeneity of individuals
2. Law of mass action

I(t + 1) ∝ I(t)

SLIDE 7

Agent-based models (AMs) simulate the spread of disease

Assumptions (Helbing 2002):

1. Heterogeneity of agents
2. Model adequately reflects reality

SLIDE 9

CMs and AMs: a side-by-side comparison

CMs

∙ Equation-based
∙ Computationally fast
∙ Homogeneous individuals
∙ No individual properties

AMs

∙ Simulation-based
∙ Computationally slow
∙ Heterogeneous individuals
∙ Individual properties

SLIDE 10

Combining the two together

(Bobashev 2007, Banos 2015, Wallentin 2017)

∙ ad hoc approaches
∙ perspective from non-statisticians

Goal: Create a statistically justified hybrid model

SLIDE 12

Current Work

SLIDE 13

There are two main avenues of improvement

1. Quantifying how similar CMs and AMs are
2. Speeding up AM run-time

SLIDE 14

The SIR model: a detailed look

(Kermack and McKendrick 1927)

\[
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta S I}{N} \\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
\]

∙ β – rate of infection
∙ γ – rate of recovery
∙ N – total population size

SLIDE 15

The SIR model: a detailed look

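(Kermack and McKendrick 1927)

\[
\begin{aligned}
\frac{\Delta S}{\Delta t} &= -\frac{\beta S I}{N} \\
\frac{\Delta I}{\Delta t} &= \frac{\beta S I}{N} - \gamma I \\
\frac{\Delta R}{\Delta t} &= \gamma I
\end{aligned}
\]

∙ β – rate of infection
∙ γ – rate of recovery
∙ N – total population size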
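The difference-equation form of the SIR model can be iterated directly. The sketch below assumes a time step of Δt = 1 and hypothetical initial counts (950 susceptible, 50 infected, 0 recovered); neither is stated on the slide. β and γ are taken from the simulation captions later in the talk.

```python
def sir_difference(beta, gamma, s0, i0, r0, n_steps):
    """Iterate the SIR difference equations with a time step of 1 (assumed)."""
    n = s0 + i0 + r0                      # total population size N
    s, i, r = float(s0), float(i0), float(r0)
    path = [(s, i, r)]
    for _ in range(n_steps):
        new_infections = beta * s * i / n  # magnitude of Delta S
        new_recoveries = gamma * i         # Delta R
        s, i, r = (s - new_infections,
                   i + new_infections - new_recoveries,
                   r + new_recoveries)
        path.append((s, i, r))
    return path


# Hypothetical initial conditions; beta and gamma as in the later captions.
path = sir_difference(beta=0.10, gamma=0.03, s0=950, i0=50, r0=0, n_steps=100)
print(path[-1])
```

Each step moves βSI/N individuals from S to I and γI individuals from I to R, so the total S + I + R stays equal to N.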

SLIDE 16

Our stochastic CM approach

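\[
\begin{aligned}
\hat{S}(t+1) &= \hat{S}(t) - s_{t+1} \\
\hat{R}(t+1) &= \hat{R}(t) + r_{t+1} \\
\hat{I}(t+1) &= N - \hat{S}(t+1) - \hat{R}(t+1),
\end{aligned}
\]

with

\[
s_{t+1} \sim \mathrm{Binomial}\left(\hat{S}(t), \frac{\beta I(t)}{N}\right), \qquad
r_{t+1} \sim \mathrm{Binomial}\left(\hat{I}(t), \gamma\right).
\]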
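A minimal sketch of this stochastic CM, using only the Python standard library. Two assumptions are made here that the slide does not spell out: the unhatted I(t) in the infection probability is read as the deterministic SIR trajectory (advanced alongside the random counts), and the initial counts are hypothetical. The helper `binomial` is illustrative, not a library function.

```python
import random


def binomial(rng, n, p):
    """Draw Binomial(n, p) as a sum of n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


def stochastic_cm(beta, gamma, s0, i0, r0, n_steps, seed=None):
    rng = random.Random(seed)
    n = s0 + i0 + r0
    s_hat, i_hat, r_hat = s0, i0, r0
    # Assumption: I(t) in the infection probability is the deterministic path.
    s_det, i_det = float(s0), float(i0)
    path = [(s_hat, i_hat, r_hat)]
    for _ in range(n_steps):
        p = beta * i_det / n
        new_inf = binomial(rng, s_hat, p)      # leaves S
        new_rec = binomial(rng, i_hat, gamma)  # enters R
        s_hat -= new_inf
        r_hat += new_rec
        i_hat = n - s_hat - r_hat
        # Advance the deterministic trajectory by one difference-equation step.
        s_det, i_det = s_det - p * s_det, i_det + p * s_det - gamma * i_det
        path.append((s_hat, i_hat, r_hat))
    return path


cm_path = stochastic_cm(0.10, 0.03, s0=950, i0=50, r0=0, n_steps=100, seed=1)
print(cm_path[-1])
```

If I(t) is instead meant to be the simulated count Î(t), only the line computing `p` changes.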

SLIDE 17

Our stochastic AM approach

For an agent $x_n(t)$, $n = 1, 2, \dots, N$, the forward operator for $t > 0$ is

\[
x_n(t+1) =
\begin{cases}
x_n(t) + \mathrm{Bernoulli}\left(\frac{\beta I(t)}{N}\right) & \text{if } x_n(t) = 1 \\
x_n(t) + \mathrm{Bernoulli}(\gamma) & \text{if } x_n(t) = 2 \\
x_n(t) & \text{otherwise,}
\end{cases}
\]

where $x_n(t) = k$, $k \in \{1, 2, 3\}$, corresponds to state S, I, and R, respectively.

Let the aggregate total in each compartment be

\[
\hat{X}_k(t) = \sum_{n=1}^{N} \mathbb{I}\{x_n(t) = k\}.
\]
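The forward operator can be sketched agent by agent as follows. As in the CM case, this assumes I(t) in the infection probability is the deterministic SIR trajectory, and the initial counts are hypothetical; the function and variable names are illustrative.

```python
import random


def stochastic_am(beta, gamma, s0, i0, r0, n_steps, seed=None):
    rng = random.Random(seed)
    n = s0 + i0 + r0
    # Agent states: 1 = S, 2 = I, 3 = R.
    agents = [1] * s0 + [2] * i0 + [3] * r0
    s_det, i_det = float(s0), float(i0)  # assumed deterministic I(t)

    def counts():
        # Aggregate totals X_k(t): number of agents in each state k.
        return tuple(sum(1 for x in agents if x == k) for k in (1, 2, 3))

    path = [counts()]
    for _ in range(n_steps):
        p = beta * i_det / n
        for idx, x in enumerate(agents):
            if x == 1:
                agents[idx] = x + (rng.random() < p)      # S -> I w.p. p
            elif x == 2:
                agents[idx] = x + (rng.random() < gamma)  # I -> R w.p. gamma
            # Agents in state 3 (R) stay where they are.
        s_det, i_det = s_det - p * s_det, i_det + p * s_det - gamma * i_det
        path.append(counts())
    return path


am_path = stochastic_am(0.10, 0.03, s0=950, i0=50, r0=0, n_steps=100, seed=1)
print(am_path[-1])
```

Each agent is visited once per time step, so an agent that becomes infected at step t cannot also recover at step t.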

SLIDE 18

The means overlap

[Figure: Mean Proportion of Compartment Values. x-axis: Time; y-axis: % of Population; series: S, I, and R for the CM and for the AM. 1000 agents; 5000 runs; β = 0.10; γ = 0.03]

SLIDE 19

The distributions look the same

SLIDE 20

These approaches are equivalent

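Theorem. Let the CM and AM be as previously described. Then for all $t \in \{1, 2, \dots, T\}$,

\begin{align}
\hat{S}(t) &\overset{d}{=} \hat{X}_S(t) \tag{1} \\
\hat{I}(t) &\overset{d}{=} \hat{X}_I(t) \notag \\
\hat{R}(t) &\overset{d}{=} \hat{X}_R(t). \notag
\end{align}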
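One way to see why such a result is plausible is the sketch below. It is an illustration only, not the proof from the thesis. It assumes both models start from the same compartment counts and share the same transition probabilities; the shorthand $p_t = \beta I(t)/N$ is introduced here for convenience.

```latex
% Given the AM counts at time t, the number of agents leaving S is a sum of
% independent Bernoulli(p_t) draws, one for each susceptible agent:
\[
\sum_{n:\, x_n(t) = 1} \mathrm{Bernoulli}(p_t)
\;\sim\; \mathrm{Binomial}\bigl(\hat{X}_S(t),\, p_t\bigr),
\qquad p_t = \frac{\beta I(t)}{N}.
\]
% Likewise, the number of agents moving from I to R satisfies
\[
\sum_{n:\, x_n(t) = 2} \mathrm{Bernoulli}(\gamma)
\;\sim\; \mathrm{Binomial}\bigl(\hat{X}_I(t),\, \gamma\bigr).
\]
% These are the same conditional laws as the CM draws from
% Binomial(\hat{S}(t), p_t) and Binomial(\hat{I}(t), \gamma), so if the joint
% law of the three counts agrees at time t, it also agrees at time t + 1.
```

Induction on t, starting from identical initial counts, then gives equality in distribution at every time step.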

SLIDE 22

We can compare CM/AM pairs and AM/AM pairs by fitting the underlying model

[Figure: Fitted SIR parameters distribution. Axes: β and γ; Simulation Type: CM, AM. 1000 agents; 5000 runs; β = 0.10; γ = 0.03]
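The slide does not say how the SIR parameters are fitted. As one illustrative possibility (an assumption, not the author's procedure), β and γ can be estimated by least squares on the one-step difference equations. The function name `fit_sir` and the initial counts are hypothetical.

```python
def sir_path(beta, gamma, s0, i0, r0, n_steps):
    """Deterministic SIR difference equations, used here as example data."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    path = [(s, i, r)]
    for _ in range(n_steps):
        inf, rec = beta * s * i / n, gamma * i
        s, i, r = s - inf, i + inf - rec, r + rec
        path.append((s, i, r))
    return path


def fit_sir(path):
    """Least-squares estimates of (beta, gamma) from one-step changes.

    Uses Delta S = -beta * S * I / N and Delta R = gamma * I as
    regressions through the origin. An assumed estimator for illustration.
    """
    n = sum(path[0])
    num_b = den_b = num_g = den_g = 0.0
    for (s, i, r), (s_next, _, r_next) in zip(path, path[1:]):
        x = s * i / n
        num_b += -(s_next - s) * x
        den_b += x * x
        num_g += (r_next - r) * i
        den_g += i * i
    return num_b / den_b, num_g / den_g


beta_hat, gamma_hat = fit_sir(sir_path(0.10, 0.03, 950, 50, 0, 100))
print(beta_hat, gamma_hat)
```

Applying the same estimator to each simulated run of a CM or an AM would give one (β, γ) pair per run, which is the kind of distribution the figure compares.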

SLIDE 23

AMs are appealing because they can be run multiple times

∙ Simulate an epidemic en masse!
∙ A run: same initial parameters, different random numbers
∙ Runs (L) are independent of one another ⟹ parallelization
∙ Roughly, the variance of compartments ↓ when N, L ↑

Goal: Improve computation time without sacrificing statistical details
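Because runs share parameters and differ only in their random numbers, they can be farmed out to separate processes. The sketch below is one hypothetical way to organise this with the standard library. The run itself is a compact stochastic SIR simulation with assumed initial counts and an assumed deterministic I(t) in the infection probability. Distinct seeds stand in for independent random streams.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor


def one_run(args):
    """Simulate one run and return the final susceptible proportion."""
    seed, n, i0, beta, gamma, n_steps = args
    rng = random.Random(seed)  # same parameters, different random numbers
    s_hat, i_hat = n - i0, i0
    s_det, i_det = float(n - i0), float(i0)
    for _ in range(n_steps):
        p = beta * i_det / n
        new_inf = sum(rng.random() < p for _ in range(s_hat))
        new_rec = sum(rng.random() < gamma for _ in range(i_hat))
        s_hat, i_hat = s_hat - new_inf, i_hat + new_inf - new_rec
        s_det, i_det = s_det - p * s_det, i_det + p * s_det - gamma * i_det
    return s_hat / n


def simulate_runs(n_runs, workers=1, n=100, i0=5,
                  beta=0.10, gamma=0.03, n_steps=100):
    """Run n_runs simulations, serially or across worker processes."""
    jobs = [(seed, n, i0, beta, gamma, n_steps) for seed in range(n_runs)]
    if workers == 1:
        return [one_run(job) for job in jobs]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one_run, jobs))


results = simulate_runs(n_runs=20)
print(statistics.mean(results), statistics.variance(results))
```

To use several cores, call `simulate_runs(n_runs=..., workers=4)` from inside an `if __name__ == "__main__":` guard, which process-based pools require on some platforms.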

SLIDE 25

There is a tradeoff between the number of agents and number of runs

[Figure, three panels, each plotted against Time: Ratio of Variance of # Susceptibles, V(S1)/V(S2); Ratio of Variance of # Infected, V(I1)/V(I2); Ratio of Variance of # Recovered, V(R1)/V(R2). 5000 runs; β = 0.10; γ = 0.03; Model 1: 1000 agents; Model 2: 100 agents]

SLIDE 29

The calculations show that the variance scales

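∙ Note that for a given β and γ, if $S_1(0)/N_1 = S_2(0)/N_2$, then $S_1(t)/N_1 = S_2(t)/N_2$
∙ $V[\hat{S}(t+1)] = S(t)(1 - p_t)p_t + (1 - p_t)^2 V[\hat{S}(t)]$
∙ $V[\hat{S}_2(t)] = \frac{N_2}{N_1} V[\hat{S}_1(t)]$

\[
\frac{V\left[\frac{1}{L_1}\sum_{\text{runs } \ell} \frac{\hat{S}_1(t)}{N_1}\right]}
     {V\left[\frac{1}{L_2}\sum_{\text{runs } \ell} \frac{\hat{S}_2(t)}{N_2}\right]}
= \frac{L_2 N_2^2}{L_1 N_1^2} \cdot \frac{V[\hat{S}_1(t)]}{V[\hat{S}_2(t)]}
= \frac{L_2 N_2}{L_1 N_1}.
\]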

We can replace agents with runs!
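The scaling claims on this slide can be checked numerically by iterating the variance recursion for two population sizes. The sketch assumes $p_t = \beta I(t)/N$ with I(t) the deterministic trajectory, a fixed initial state (so the variance at t = 0 is zero), and a hypothetical 95% initially susceptible. The run counts L1 and L2 are also illustrative.

```python
def variance_of_s(beta, gamma, n, s0_frac, n_steps):
    """Iterate V[S_hat(t+1)] = S(t)(1-p_t)p_t + (1-p_t)^2 V[S_hat(t)]."""
    s, i = s0_frac * n, (1.0 - s0_frac) * n  # deterministic S(t), I(t)
    var = 0.0                                # fixed initial state (assumed)
    out = [var]
    for _ in range(n_steps):
        p = beta * i / n
        var = s * (1.0 - p) * p + (1.0 - p) ** 2 * var
        s, i = s - p * s, i + p * s - gamma * i
        out.append(var)
    return out


n1, n2 = 1000, 100
v1 = variance_of_s(0.10, 0.03, n1, 0.95, 100)
v2 = variance_of_s(0.10, 0.03, n2, 0.95, 100)

# V[S_hat_2(t)] / V[S_hat_1(t)] should equal N2 / N1 for every t >= 1.
count_ratios = [b / a for a, b in zip(v1[1:], v2[1:])]

# Variance of the run-averaged proportion: V[S_hat(t)] / (L * N^2).
l1, l2 = 1, 10  # hypothetical run counts chosen so that L1 * N1 == L2 * N2
avg_ratios = [(a / (l1 * n1 ** 2)) / (b / (l2 * n2 ** 2))
              for a, b in zip(v1[1:], v2[1:])]
print(count_ratios[0], avg_ratios[0])
```

With L1·N1 = L2·N2 the averaged proportions have the same variance, so ten runs of 100 agents carry the same precision as one run of 1000 agents.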

SLIDE 31

Through parallelization, we can get a speed-up without losing statistical information

[Figure, three panels: Variance of S(t), Variance of I(t), Variance of R(t), where S(t), I(t), and R(t) are the % susceptible, infected, and recovered averaged over # of runs. x-axis: Time; y-axis: variance of $\sum_\ell \hat{S}(t)/(NL)$ (likewise for $\hat{I}$ and $\hat{R}$). Series: 100 agents, 4 cores; 400 agents, 1 core]

Simulation 1 (100 agents, 4 cores, 100 times): 3:30 minutes
Simulation 2 (400 agents, 1 core, 100 times): 4:05 minutes

SLIDE 32

Future work

SLIDE 33

There is more work to be done: short-term

∙ Implementation of current methods in FRED
    ∙ FRED: an open source, supported, flexible AM
∙ Incorporate different levels of homogeneity
    1. Independent agents
    2. Agents go to one other activity (school, work, neighborhood)
    3. Multiple activities
∙ Compare CM and AM parameters empirically
∙ Empirically determine when different regions can be combined

SLIDE 34

Thank you!

Questions?
