Phylogeny-based HIV transmission networks
Prabhav Kalaghatgi
Max Planck Institute for Informatics
AREVIR, May 8 2015
HIV epidemiology
Routes of infection
MSM: Men who have sex with men
HET: Heterosexual partners
IDU: Injection drug users
Motivation
A better understanding of the HIV epidemic through the analysis
Accumulation of genomic mutations via a series of transmissions
1 ATAGGTCCATAGCCAGATTGGCCAAATAGATCCACCAGATTGGCCACCATAC
2 ATAGGTCCATAGCCAGATTGCCCAAATAGATCCACCAGATTGGCCACCATAC
3 ATAGGTCCATAGCCAGATTGCCCAAATAGATCCACCAGACTGGCCACCATAC
4 ATAGGTCCATAGCCAGATTGCCCAAATAGAACCGCCAGATTGGCCACCATAC
5 ATAGGTCCATAGCCAGATTGCCCAAATGGATCCACCAGACTGGCCACCATAC
6 ATAGATCCATAGCCAGATTGCCCAAATAGAACCGCCAGATTGCCCACCATAC
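The six example sequences can be compared position by position. The following is a minimal Python sketch, assuming the sequences are already aligned to equal length; the names `SEQUENCES`, `hamming`, and `DISTANCES` are illustrative and do not come from the slides.

```python
from itertools import combinations

# The six aligned example sequences from the slide, keyed by individual.
SEQUENCES = {
    1: "ATAGGTCCATAGCCAGATTGGCCAAATAGATCCACCAGATTGGCCACCATAC",
    2: "ATAGGTCCATAGCCAGATTGCCCAAATAGATCCACCAGATTGGCCACCATAC",
    3: "ATAGGTCCATAGCCAGATTGCCCAAATAGATCCACCAGACTGGCCACCATAC",
    4: "ATAGGTCCATAGCCAGATTGCCCAAATAGAACCGCCAGATTGGCCACCATAC",
    5: "ATAGGTCCATAGCCAGATTGCCCAAATGGATCCACCAGACTGGCCACCATAC",
    6: "ATAGATCCATAGCCAGATTGCCCAAATAGAACCGCCAGATTGCCCACCATAC",
}


def hamming(a: str, b: str) -> int:
    """Number of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


# Pairwise distances for every unordered pair of individuals.
DISTANCES = {
    (i, j): hamming(SEQUENCES[i], SEQUENCES[j])
    for i, j in combinations(sorted(SEQUENCES), 2)
}

if __name__ == "__main__":
    for (i, j), d in DISTANCES.items():
        print(f"{i}-{j}: {d}")
```

Sequences 1 and 2 differ at one position, while sequences 1 and 6 differ at five, which is consistent with mutations accumulating along a chain of transmissions.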
Recover transmission network from sequences sampled from infected individuals
[Figure: sequences 1 to 6 as on the previous slide, with two candidate networks over individuals 1 to 6]
Fully resolved network: infeasible
Partially resolved network
Selection criteria
Subtype B
Country of origin in Europe
Oldest sequence per individual
Sampling times from 1999 to 2012
Data used
15,000 sequences
Transmission mode: MSM (25%), HET (22%), IDU (19%), Unknown (34%)
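The selection criteria amount to a filter followed by a per-individual reduction. The sketch below is only an illustration: the `Record` layout, its field names, and the `EUROPE` country set are hypothetical, and it assumes the sampling-time window is applied before the oldest sequence per individual is chosen, which the slide does not specify.

```python
from dataclasses import dataclass


# Hypothetical record layout; the field names are illustrative only.
@dataclass(frozen=True)
class Record:
    patient_id: str
    subtype: str
    country: str
    year: int  # sampling year


# Illustrative subset of European country codes, not a complete list.
EUROPE = {"DE", "IT", "BE", "SE", "ES", "PT"}


def select(records):
    """Apply the four selection criteria; return one record per individual."""
    oldest = {}
    for r in records:
        if r.subtype != "B" or r.country not in EUROPE:
            continue
        # Assumption: the time window is applied before picking the oldest.
        if not 1999 <= r.year <= 2012:
            continue
        kept = oldest.get(r.patient_id)
        if kept is None or r.year < kept.year:
            oldest[r.patient_id] = r
    return list(oldest.values())
```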
Threshold-based networks
Threshold-free networks using phylogenetic distances
Timed networks using molecular clock
Network constructed by thresholding distance between sequences (LogDet)
Low cross-country transmission
Small cluster sizes
25% of sequences are linked
Kalaghatgi et al. European Workshop on HIV & HCV 2013
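Thresholding can be sketched as follows: link every pair whose distance is at most a cutoff, then read off the connected components as clusters. This is a minimal illustration, not the authors' implementation; the function names, the toy distances, and the cutoff of 0.015 are all hypothetical, and any pairwise distance (LogDet on the slide) could be supplied.

```python
def threshold_clusters(distances, nodes, threshold):
    """Link pairs with distance <= threshold; return clusters of size >= 2."""
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return [g for g in groups.values() if len(g) > 1]


def fraction_linked(clusters, nodes):
    """Share of sequences that have at least one link."""
    return sum(len(c) for c in clusters) / len(nodes)


# Hypothetical toy distances between five sequences.
NODES = ["s1", "s2", "s3", "s4", "s5"]
TOY_DISTANCES = {
    ("s1", "s2"): 0.01,
    ("s1", "s3"): 0.08,
    ("s2", "s3"): 0.09,
    ("s4", "s5"): 0.20,
}
CLUSTERS = threshold_clusters(TOY_DISTANCES, NODES, threshold=0.015)
```

With these toy values only s1 and s2 are linked, so two of the five sequences belong to a cluster.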
LogDet, TN93, Hamming distance yield inaccurate estimates
Model-based estimates are more accurate
– Phylogenetic tree
– Substitution models: GTR, HKY
– Among-site rate variation model
Optimize models and phylogenetic tree
Phylogenetic tree
[Figure: phylogenetic tree with leaves a, b, c, d, e, f]
Tree-based distances
[Figure: pairwise distance matrix over a to f; extracted values: 3.1 4.1 5.9 2.6 5.3 3.4 9.8 6.4 7.5 8.5 4.2 5.8 8.6 6.5 6.1]
Objective: Among all undirected trees select one that minimizes the sum of edge weights (distances)
Optimal tree is the minimum spanning tree.
Transmission network
[Figure: transmission network over a, b, c, d, e, f]
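The two steps, tree-based distances and then a minimum spanning tree over the sampled sequences, can be sketched in Python. The toy tree below is hypothetical (it is not the tree on the slide, whose matrix layout did not survive extraction), and Prim's algorithm is used here as one standard way to obtain a minimum spanning tree; the slides do not say which algorithm was used.

```python
# Hypothetical toy tree: leaves a-d, internal nodes u and v, branch lengths.
TREE = {
    ("a", "u"): 0.5,
    ("b", "u"): 0.25,
    ("u", "v"): 1.0,
    ("c", "v"): 0.75,
    ("d", "v"): 1.25,
}
LEAVES = ["a", "b", "c", "d"]


def patristic_distances(tree, leaves):
    """Sum branch lengths along the tree path between every pair of leaves."""
    adjacency = {}
    for (x, y), length in tree.items():
        adjacency.setdefault(x, []).append((y, length))
        adjacency.setdefault(y, []).append((x, length))
    distances = {}
    for source in leaves:
        # Depth-first traversal; in a tree each node is reached by one path.
        reached = {source: 0.0}
        stack = [source]
        while stack:
            node = stack.pop()
            for neighbour, length in adjacency[node]:
                if neighbour not in reached:
                    reached[neighbour] = reached[node] + length
                    stack.append(neighbour)
        for target in leaves:
            if source < target:
                distances[(source, target)] = reached[target]
    return distances


def minimum_spanning_tree(distances, nodes):
    """Prim's algorithm on the complete graph weighted by `distances`."""
    def weight(x, y):
        return distances[(x, y)] if (x, y) in distances else distances[(y, x)]

    in_tree = {nodes[0]}
    edges = []
    while len(in_tree) < len(nodes):
        # Cheapest edge joining a node outside the tree to one inside it.
        x, y = min(
            ((i, o) for i in in_tree for o in nodes if o not in in_tree),
            key=lambda e: weight(*e),
        )
        edges.append((x, y, weight(x, y)))
        in_tree.add(y)
    return edges


DIST = patristic_distances(TREE, LEAVES)
MST = minimum_spanning_tree(DIST, LEAVES)
```

For this toy tree the spanning tree links a-b, b-c, and c-d. Internal nodes of the phylogeny do not appear in the result, since only sampled sequences are nodes of the network.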
[Figure: transmission network built using minimum spanning tree]
[Figure: transmission network with support legend, 1 to 0.1]
Edge support: proportion of networks that contain the edge
[Figure: edge support (y-axis, 0.0 to 1.0) against transmission network size (x-axis)]
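Edge support as defined above is a simple proportion. The sketch below assumes a collection of replicate networks is already available (for example, one per alternative tree); how those replicates are generated is not stated on the slides, and the `NETWORKS` data are hypothetical.

```python
from collections import Counter


def edge_support(networks):
    """Proportion of networks that contain each undirected edge."""
    counts = Counter()
    for network in networks:
        # frozenset treats (a, b) and (b, a) as the same edge;
        # the set comprehension counts each edge once per network.
        counts.update({frozenset(edge) for edge in network})
    return {edge: n / len(networks) for edge, n in counts.items()}


# Hypothetical replicate networks over individuals 1 to 4.
NETWORKS = [
    [(1, 2), (2, 3), (3, 4)],
    [(1, 2), (2, 3), (2, 4)],
    [(2, 1), (1, 3), (3, 4)],
    [(1, 2), (2, 3), (3, 4)],
]
SUPPORT = edge_support(NETWORKS)
```

Here the edge between 1 and 2 appears in all four replicates (support 1.0), while the edge between 2 and 4 appears in only one (support 0.25).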
Phylogenetic tree
Transmission network
Each subtree in the phylogenetic tree induces a connected subgraph in the transmission network
Low support in large networks due to difficulty in resolving distant divergence events
Kalaghatgi et al. Bioinformatics; under review
So far we constructed untimed networks
Sampling times allow an estimation
[Figure: virus lineages in individuals A and B along a time axis, marking divergence time, transmission time, sampling time A, and sampling time B]
Convert evolutionary distance to time using molecular clock
Calibrate molecular clock using sampling times
Construct transmission networks with edges labeled with transmission time
Kalaghatgi et al. European Workshop on HIV & HCV 2014
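The conversion from distance to time can be illustrated under a strict molecular clock, which is an assumption made here for simplicity. The sketch calibrates the rate by least-squares regression of root-to-tip distance on sampling time (one common calibration, not necessarily the one used in this work) and then dates the divergence of two sampled lineages. All function names and numbers are hypothetical. Note that this yields the divergence time, which the slide's diagram distinguishes from the transmission time.

```python
def calibrate_clock(sampling_times, root_to_tip):
    """Least-squares fit of root-to-tip distance against sampling time.

    Returns (rate, root_time): the slope in substitutions per site per
    year, and the time at which the fitted line reaches zero distance.
    """
    n = len(sampling_times)
    mean_t = sum(sampling_times) / n
    mean_d = sum(root_to_tip) / n
    sxx = sum((t - mean_t) ** 2 for t in sampling_times)
    sxy = sum(
        (t - mean_t) * (d - mean_d)
        for t, d in zip(sampling_times, root_to_tip)
    )
    rate = sxy / sxx
    root_time = mean_t - mean_d / rate
    return rate, root_time


def divergence_time(t_a, t_b, distance, rate):
    """Divergence time of two sampled lineages under a strict clock.

    distance = rate * ((t_a - t_div) + (t_b - t_div)), solved for t_div.
    """
    return (t_a + t_b - distance / rate) / 2.0


if __name__ == "__main__":
    # Hypothetical calibration data: sampling years and root-to-tip distances.
    rate, root_time = calibrate_clock([2000.0, 2005.0, 2010.0], [0.01, 0.02, 0.03])
    print(rate, root_time)
    print(divergence_time(2008.0, 2010.0, 0.02, rate))
```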
[Figure: transmission network; legend: time of transmission, 2003 to ≥ 2010]
[Figure: number of transmissions (y-axis, 10 to 60) by time of transmission (x-axis, 1995 to 2010); panels: IDU-IDU, MSM-MSM, HET-HET, HET-IDU, IDU-MSM, MSM-HET]
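Counts like those in the panels can be obtained by tallying timed edges per risk-group pair and year. The sketch below uses hypothetical edges and ignores edge direction, so a pair such as MSM and HET is keyed in alphabetical order as "HET-MSM".

```python
from collections import Counter

# Hypothetical timed edges: (risk group of A, risk group of B, year).
EDGES = [
    ("IDU", "IDU", 2001),
    ("IDU", "HET", 2003),
    ("HET", "IDU", 2003),
    ("MSM", "MSM", 2006),
    ("MSM", "MSM", 2006),
]


def tally(edges):
    """Count transmissions per (risk-group pair, year), ignoring direction."""
    counts = Counter()
    for group_a, group_b, year in edges:
        pair = "-".join(sorted((group_a, group_b)))
        counts[(pair, year)] += 1
    return counts


COUNTS = tally(EDGES)
```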
Genetic data was used to infer transmission networks
Threshold-based network suggests low cross-country transmission
Threshold-free approach highlights fragility of large networks
Incorporating temporal information suggests that IDU transmit to HET
Thomas Lengauer, Glenn Lawyer, Mathieu Flinders, Nico Pfeifer, Tomas Bastys, Rolf Kaiser, Valeria Ghisetti, Maurizio Zazzi, Francesca Incardona, Anne-Mieke Vandamme