SLIDE 1

Gaël Thébaud MIEP08 • Montpellier, France • 10-12/06/2008

Integrating genetic and epidemiological data to determine virus transmission pathways

1 UMR BGPI (INRA-Montpellier)
2 Division of Environmental and Evolutionary Biology (University of Glasgow)
3 Institute for Animal Health (Pirbright)
4 Animal Health Divisional Office (Perth)

Eleanor COTTAM 2,3, Gaël THÉBAUD 1,2, Jemma WADSWORTH 3, John GLOSTER 3, Leonard MANSLEY 4, David PATON 3, Donald KING 3, Dan HAYDON 2

SLIDE 2

Molecular epidemiology and directionality

Introduction

  • Direction of transmission:
    – reference (more or less implicit) to additional information
  • Genetic sequences:
    – phylogeny
    – clades / groups / types
  • Comparison between genetic similarity and
    – geographic proximity
    – ecological zone
    – host species
    – ...

SLIDE 3
Why is directionality interesting?

Introduction

  • Implications:
    – logical: source ≈ cause; “target” ≈ consequences
    – legal: source ≈ responsible; “target” ≈ victim

Accessible information (epidemiology, evolution)

  • type of source and target individuals
  • transmission distances
  • important or missing sources
  • likely transmission modes
  • evolution during 1 transmission cycle

In theory, complete description of the epidemic. In practice, data sets concerning few individuals.

Use of the information

  • parameterise epidemiological models (e.g., network models)
  • limit virus propagation
  • multiscale models

SLIDE 4

Questions on FMDV

Introduction

  • At which scale is there some viral genetic polymorphism?
    – animal, farm, disease focus?
  • Can we use the observed polymorphism to identify transmission chains? How?
  • What is the reliability of veterinary contact tracing?

SLIDE 5

Biological system

  • Foot-and-mouth disease virus outbreak (2001)
  • 20 complete genomes (~10 kb each)
    – 5 initial infections with a known history
    – 15 farms from the same focus (Durham County)
  • Positive-strand RNA virus:
    – High mutation rate (~10⁻⁴ errors/nucleotide/replication)
    – Limited recombination

[Map: the 15 farms of the focus (A to G, I to P); scale bar 10 km]
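As a rough order-of-magnitude check (an added illustration, not part of the slide), the genome length and the mutation rate quoted above can be multiplied together. The symbols μ (per-nucleotide error rate) and G (genome length) are labels introduced here for convenience:

```latex
\mu \times G \;\approx\; 10^{-4}\,\frac{\text{errors}}{\text{nucleotide}\cdot\text{replication}}
\;\times\; 10^{4}\,\text{nucleotides}
\;=\; 1\ \text{error per genome per replication}
```

Under these approximate figures, roughly one error per genome copy is expected, which is consistent with looking for polymorphism at fine scales such as the farm.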

SLIDE 6
Genetic data

  • Known root
  • 2 independent introductions
  • 4 groups

[Figure: genetic tree (TCS) of sequences 1 to 5 and A to P against day of outbreak (axis 10 to 110); labels: SAR/19/2000, 24 nucleotide substitutions; inset map of the farms, scale bar 10 km]

SLIDE 7

Genetic data

  • Known root
  • 2 independent introductions
  • 4 groups

[Figure: genetic tree (TCS) of sequences 1 to 5 and A to P against day of outbreak (axis 10 to 110); labels: SAR/19/2000, 24 nucleotide substitutions]

How to identify transmission history?

SLIDE 8
Genetic data

Which is the most likely transmission tree?

  • Known root
  • 1 known chain of transmissions
  • 3 obvious transmissions
  • What about the other ones?

Which is the most likely farm for each node?

Use of contact tracing data

[Figure: genetic tree of sequences 1 to 5 and A to P (axis 10 to 110); labels: SAR/19/2000, 24 nucleotide substitutions; question marks at the unresolved nodes]

SLIDE 9

Epidemiology

  • L: probability density for latency (Γ)
  • Ii: probability density for the infection date of farm i
  • Fi(t): probability for farm i to be infectious at date t

[Figure: timeline of the initial infections 1 to 5 and farms A to P on a weekly axis from 27 January 2001 to 09 June 2001; the animal movement ban is marked]
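The slide does not say how Fi(t) is obtained from Ii and L. One plausible reading, sketched below purely as an assumption, is that a farm is infectious on day t if it was infected on some earlier day τ and its latent period had ended by t. The function name, the daily discretisation of the densities and the `cull_day` cut-off are all hypothetical choices made for this illustration.

```python
def infectious_probability(infection_pmf, latency_pmf, cull_day):
    """Sketch of F[t] = P(farm is infectious on day t), under stated assumptions.

    infection_pmf[t] : P(farm was infected on day t)           (Ii on the slide)
    latency_pmf[d]   : P(latent period lasts exactly d days)   (L on the slide, discretised)
    cull_day         : assumed last day the farm can be infectious (hypothetical input)
    """
    # Cumulative latency distribution: P(latent period <= d days).
    latency_cdf = []
    running = 0.0
    for p in latency_pmf:
        running += p
        latency_cdf.append(running)

    def cdf(d):
        # Beyond the tabulated support, all the tabulated mass has accumulated.
        return latency_cdf[d] if d < len(latency_cdf) else latency_cdf[-1]

    F = []
    for t in range(len(infection_pmf)):
        if t > cull_day:
            F.append(0.0)  # the farm is assumed removed after cull_day
        else:
            # Infected on day tau, latent period finished by day t.
            F.append(sum(infection_pmf[tau] * cdf(t - tau) for tau in range(t + 1)))
    return F


# Toy input: infection certain on day 0, latency of exactly 2 days, cull on day 3.
toy_F = infectious_probability([1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0], cull_day=3)
```

With the toy input, the farm is infectious only on days 2 and 3. A Γ-distributed latency, as on the slide, would first have to be discretised into `latency_pmf`.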

SLIDE 10

Epidemiology

  • L: probability density for latency (Γ)
  • Ii: probability density for the infection date of farm i
  • Fi(t): probability for farm i to be infectious at date t
  • λij: likelihood that farm i was infected by farm j rather than by another observed farm

$$
\lambda_{ij} = \frac{\sum_{t=0}^{\min(C_i, C_j)} \left( I_i(t) \cdot F_j(t) \right)}{\sum_{k=1,\, k \neq i}^{n} \; \sum_{t=0}^{\min(C_i, C_k)} \left( I_i(t) \cdot F_k(t) \right)}
$$

[Figure: timeline of the initial infections 1 to 5 and farms A to P on a weekly axis from 27 January 2001 to 09 June 2001; the animal movement ban is marked]
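A minimal sketch of how λij might be evaluated from daily tables of Ii and Fj follows. It assumes λij is the overlap Σt Ii(t)·Fj(t), summed up to min(Ci, Cj) and normalised over every other observed farm k ≠ i. Ci is taken here to be the last day farm i is present (for example its cull date), which is an assumption because the slide does not define it. The function name and the dictionary layout are hypothetical.

```python
def transmission_likelihood(i, j, I, F, C):
    """Sketch of lambda_ij: farm i infected by farm j rather than by another observed farm.

    I[farm][t] : probability that `farm` was infected on day t
    F[farm][t] : probability that `farm` was infectious on day t
    C[farm]    : assumed last day on which `farm` is present (hypothetical meaning)
    """
    def overlap(source):
        # Sum I_i(t) * F_source(t) for t = 0 .. min(C_i, C_source), within table bounds.
        last = min(C[i], C[source], len(I[i]) - 1, len(F[source]) - 1)
        return sum(I[i][t] * F[source][t] for t in range(last + 1))

    denominator = sum(overlap(k) for k in F if k != i)
    return overlap(j) / denominator if denominator > 0 else 0.0


# Made-up toy tables for three placeholder farms (not outbreak data).
I = {"x": [0.0, 0.5, 0.5, 0.0]}
F = {"x": [0.0, 0.0, 0.0, 1.0], "y": [0.0, 1.0, 1.0, 1.0], "z": [0.0, 0.0, 1.0, 1.0]}
C = {"x": 3, "y": 3, "z": 3}
lam_xy = transmission_likelihood("x", "y", I, F, C)
lam_xz = transmission_likelihood("x", "z", I, F, C)
```

Because of the normalisation, the values for all candidate sources of a given farm sum to 1 (here 2/3 for y and 1/3 for z).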

SLIDE 11
Genetics + epidemiology

  • λij: likelihood that farm i was infected by farm j rather than by another observed farm
  • λij can be computed for each transmission
  • Thus, for a complete transmission tree (k), λk = Πλij
  • And λk can be computed for any tree… if all the possible trees can be enumerated

Algorithm defining the possible trees by recurrence from the leaves back to the root

[Figure: two example subtrees with leaves E, L and G, I, J, F; candidate farms for internal node 1: {L,E}; for node 2: {F,G}; for a further node: {1,2,K}]

1728 trees

All differing from contact tracing results

[Figure: histogram of the tree log-likelihoods (x: log-likelihood; y: frequency)]
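The product λk = Πλij over an enumerated set of trees can be sketched as follows. This is a simplification and not the recurrence algorithm mentioned on the slide: it assumes each infected farm comes with a precomputed set of candidate sources, and it treats every combination of choices as one tree. The function name, the input layout and the numbers are hypothetical.

```python
from itertools import product
from math import prod


def enumerate_trees(candidates):
    """Enumerate every combination of candidate sources and its likelihood.

    candidates: {infected farm: {candidate source: lambda_ij}}
    Returns a list of (tree, lambda_k), where tree maps each farm to its chosen source.
    """
    farms = sorted(candidates)
    source_options = [sorted(candidates[farm]) for farm in farms]
    trees = []
    for chosen in product(*source_options):
        tree = dict(zip(farms, chosen))
        # lambda_k is the product of lambda_ij over the transmissions in the tree.
        lam_k = prod(candidates[farm][src] for farm, src in tree.items())
        trees.append((tree, lam_k))
    return trees


# Made-up lambda_ij values for placeholder farms (not outbreak data).
toy_candidates = {
    "p": {"q": 0.6, "r": 0.4},
    "s": {"q": 0.7, "r": 0.3},
    "u": {"r": 1.0},
}
toy_trees = enumerate_trees(toy_candidates)
best_tree, best_lam = max(toy_trees, key=lambda item: item[1])
```

The number of trees is the product of the candidate-set sizes (2 × 2 × 1 = 4 in the toy case), which is why exhaustive enumeration is only feasible when the genetic data constrain the candidate sets tightly.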

SLIDE 12
Genetics + epidemiology

Which is the most likely group of trees?

  • Rescaled likelihood: λ’k = λk / Σλk
  • Which group of trees represents 95% of the rescaled likelihood?

[Figure: labels "4 trees" and "most likely tree"]
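The rescaling and the 95% group can be sketched as below, assuming the group is built greedily by adding trees in decreasing order of rescaled likelihood until the cumulative total reaches the threshold. The function name and the toy values are hypothetical.

```python
def most_likely_group(tree_likelihoods, mass=0.95):
    """Rescale likelihoods to sum to 1 and return the smallest top-ranked group
    of trees whose cumulative rescaled likelihood reaches `mass`.

    tree_likelihoods: {tree label: lambda_k}
    """
    total = sum(tree_likelihoods.values())
    rescaled = {label: lam / total for label, lam in tree_likelihoods.items()}

    # Rank trees from most to least likely.
    ranked = sorted(rescaled.items(), key=lambda item: item[1], reverse=True)

    group, cumulative = [], 0.0
    for label, weight in ranked:
        group.append(label)
        cumulative += weight
        if cumulative >= mass:
            break
    return rescaled, group


# Made-up lambda_k values (not outbreak data).
toy_rescaled, toy_group = most_likely_group({"t1": 6.0, "t2": 3.0, "t3": 0.6, "t4": 0.4})
```

In the toy case the rescaled likelihoods are 0.6, 0.3, 0.06 and 0.04, so the first three trees are needed to reach 95%.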

SLIDE 13

Genetics + epidemiology

Which is the most likely tree?

[Figure legend: ( # ) number of distinct sources among the 4 most likely trees; [ # ] likelihood of the most probable transmission]

SLIDE 14

Genetics + epidemiology

Which is the most likely tree?

[Figure: genetic tree of sequences 1 to 5 and A to P (axis 10 to 110); labels: SAR/19/2000, 24 nucleotide substitutions; additional labels: A, K, O, L, F]

SLIDE 15

Genetics + epidemiology

Spatial pattern

Short distance transmission

[Map: farms A to G and I to P; scale bar 10 km]

[Figure: histogram of MeanDistSim (x: MeanDistSim; y: frequency); annotations: mean distance 4850 m, 7472 m; P(1-sided) = 1.2 × 10⁻³]
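The slide does not describe how the simulated mean distances (MeanDistSim) were generated. The sketch below shows one way such a one-sided test could be set up. It assumes the null distribution comes from assigning each farm a random other farm as its source, which may differ from the procedure actually used. All names and coordinates are hypothetical.

```python
import math
import random


def mean_distance(tree, coords):
    """Mean straight-line distance between each farm and its source in `tree`."""
    distances = [math.dist(coords[farm], coords[src]) for farm, src in tree.items()]
    return sum(distances) / len(distances)


def simulate_mean_distances(farms, coords, n_sim, seed=0):
    """Null distribution (assumed): every farm gets a random other farm as source."""
    rng = random.Random(seed)
    simulated = []
    for _ in range(n_sim):
        random_tree = {f: rng.choice([s for s in farms if s != f]) for f in farms}
        simulated.append(mean_distance(random_tree, coords))
    return simulated


def one_sided_p(observed, simulated):
    """Fraction of simulated means that are at least as short as the observed mean."""
    return sum(1 for value in simulated if value <= observed) / len(simulated)


# Made-up coordinates in metres for placeholder farms (not outbreak data).
toy_coords = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (6.0, 8.0)}
toy_observed = mean_distance({"b": "a", "c": "b"}, toy_coords)
toy_simulated = simulate_mean_distances(["a", "b", "c"], toy_coords, n_sim=100)
toy_p = one_sided_p(toy_observed, toy_simulated)
```

A small one-sided P value then indicates that the inferred transmissions are shorter than expected under random source assignment.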

SLIDE 16

Conclusions

Summary

  • The whole set of possible transmission trees is identified based on genetic data
  • Their relative likelihood is evaluated based on epidemiological data
  • Interesting method for real-time forensic applications

Difficulties

  • Identifying the tree root
  • Dealing with censoring / sampling issues
  • Weighting different sources of information

Cottam E.M. et al. (2008) Integrating genetic and epidemiological data to determine transmission pathways of foot-and-mouth disease virus. Proc. R. Soc. B 275: 887-895.
