Genotypic analysis of coreceptor usage
New developments and applications for geno2pheno[coreceptor]
Alexander Thielen
Department of Computational Biology and Applied Algorithmics
Max Planck Institute for Informatics
D-66123 Saarbrücken
Thielen, Alexander Genotypic analysis of coreceptor usage – New developments for geno2pheno[coreceptor] 4/3/2008 2
Outlook
- Genotypic prediction of coreceptor usage
- New developments
- geno2pheno[coreceptor] in different applications
HIV Entry Cell Assay (Trofile)
[Figure: assay workflow. Transfection with an HIV env expression vector and an HIV genomic luc vector; the resulting pseudovirus is capable of a single round of replication; infection of CD4+ CCR5+ and CD4+ CXCR4+ cells; CCR5 and CXCR4 antagonists are used to confirm tropism.]
Adapted from Petropoulos CJ, et al. Antimicrob Agents Chemother. 2000;44:920-8.
Why do we need genotypic approaches?
Phenotypic assays:
- very accurate
- used in clinical trials of entry inhibitors (Trofile)
But:
- very expensive
- slow turnaround (up to 5 weeks)
- not always available (samples for Trofile have to be shipped to South San Francisco)
- restrictions (e.g. Trofile: viral load >1000, sometimes dry ice needed)
Genotypic Monitoring of Coreceptor Usage
Matched genotype (V3 region)-phenotype pairs:
CTRPNNNTRRSISIGPGRAFYATGDIIGDIRQAHC   R5
CTRPNNNTRKGIHMGPGS-FYVTGEIIGDIRQAHC   R5
CSRPNNNTRKSVHIGPGQAFYATGDVIGDIRQAHC   X4
genotyping: X = (X1, …, Xi, …, Xn); phenotyping: Y
- multiple alignment (against fixed reference alignment)
- statistical learning
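To make the notation X = (X1, …, Xn) and Y concrete, here is a minimal sketch of one possible encoding. It assumes a one-hot (indicator) encoding per aligned position; the slides do not state which encoding geno2pheno[coreceptor] uses, and the names `ALPHABET` and `one_hot_encode` are illustrative only.

```python
# Assumed alphabet: the 20 standard amino acids plus the alignment gap.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def one_hot_encode(aligned_seq):
    """Turn an aligned sequence into a flat 0/1 feature vector."""
    features = []
    for residue in aligned_seq:
        # One indicator per alphabet symbol for this alignment column.
        features.extend(1 if residue == symbol else 0 for symbol in ALPHABET)
    return features


# The three genotype-phenotype pairs shown on the slide.
pairs = [
    ("CTRPNNNTRRSISIGPGRAFYATGDIIGDIRQAHC", "R5"),
    ("CTRPNNNTRKGIHMGPGS-FYVTGEIIGDIRQAHC", "R5"),
    ("CSRPNNNTRKSVHIGPGQAFYATGDVIGDIRQAHC", "X4"),
]

X = [one_hot_encode(seq) for seq, _ in pairs]
Y = [1 if label == "X4" else 0 for _, label in pairs]  # 1 = X4, 0 = R5
```

Because all sequences are aligned against the same reference, every vector has the same length, which is what a statistical learner such as an SVM needs.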
Performance comparison of different methods
- Dataset of 1110 clonal samples from the Los Alamos database
- 10 replicates of 10-fold cross-validation
- Support Vector Machines and PSSMs significantly better than other methods
- Best performance among all tested methods: Support Vector Machines
Specificity ~90%, Sensitivity ~80%
(Sing et al, 2004)
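The evaluation scheme "10 replicates of 10-fold cross-validation" can be sketched as follows. This is only an illustration of the splitting logic, assuming plain random (unstratified) folds; the function name `repeated_kfold` and the seed are hypothetical, and the original study may have split the data differently.

```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # fresh random partition for every replicate
        # Deal the shuffled indices round-robin into disjoint test folds.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield repeat, fold, train_idx, test_idx


# 1110 samples, as in the dataset described above.
splits = list(repeated_kfold(1110))
```

Each sample is tested exactly once per replicate, so averaging over the 100 train/test splits gives a less partition-dependent performance estimate than a single 10-fold run.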
WetCat:
- Charge Rule (11/25)
- Decision trees
- Support Vector Machines
- http://genomiac2.ucsd.edu:8080/wetcat/tropism.html
WebPSSM:
- Position specific scoring matrices
- http://ubik.microbiol.washington.edu/computing/pssm
Geno2pheno[coreceptor]:
- Support Vector Machines
- http://coreceptor.bioinf.mpi-inf.mpg.de
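Of the methods listed, the charge rule (11/25) is simple enough to sketch directly. The version below assumes the common formulation (predict X4 if V3 position 11 or 25 carries a positively charged residue, here K or R) and assumes positions are counted on the aligned 35-column V3 loop; neither detail is spelled out on the slide, and `rule_11_25` is an illustrative name.

```python
POSITIVE = {"K", "R"}  # assumption: histidine is not counted as positive


def rule_11_25(aligned_v3):
    """Return 'X4' if aligned V3 position 11 or 25 (1-based) is K or R."""
    pos11, pos25 = aligned_v3[10], aligned_v3[24]
    return "X4" if pos11 in POSITIVE or pos25 in POSITIVE else "R5"


example_sequences = [
    "CTRPNNNTRRSISIGPGRAFYATGDIIGDIRQAHC",
    "CTRPNNNTRKGIHMGPGS-FYVTGEIIGDIRQAHC",
    "CSRPNNNTRKSVHIGPGQAFYATGDVIGDIRQAHC",
]
calls = [rule_11_25(seq) for seq in example_sequences]
```

Under these assumptions the sketch calls all three example V3 sequences from the genotype-phenotype slide R5, including the one phenotyped as X4, which illustrates how such a rule can miss X4 variants.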
Clinical samples
How do known predictors compare on clinically derived data?
920 antiretroviral-naïve samples

Method             Sensitivity      Specificity
11/25 rule         30.5%            93.4%
SVM (genomiac)     21.8%            89.6%
PSSM SI/NSI        33.8% (43.7%)    95.3% (90%)
PSSM X4/R5         24.5% (43.7%)    96.9% (90%)
Neural Network     44.5%            87.5%
SVM (geno2pheno)   44.7%            90.6%
(Low et al, 2007)
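For reference, sensitivity and specificity as used in this comparison can be computed from paired phenotype calls and predictions as sketched below. X4 is treated as the positive class, which matches how the slides report X4 detection; the toy labels are invented for illustration and are not data from the study.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="X4"):
    """Return (sensitivity, specificity) with `positive` as the positive class."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1  # X4 correctly detected
            else:
                fn += 1  # X4 missed
        else:
            if predicted == positive:
                fp += 1  # R5 wrongly called X4
            else:
                tn += 1  # R5 correctly called R5
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 4 X4 samples and 6 R5 samples.
truth = ["X4"] * 4 + ["R5"] * 6
predicted = ["X4", "X4", "R5", "R5", "R5", "R5", "R5", "R5", "R5", "X4"]
sens, spec = sensitivity_specificity(truth, predicted)
```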
Problems with clinical samples
Solution: massively parallel sequencing?
=> see talk by M. Däumer
Prevalence of X4 Phenotype by Baseline CD4 Count
Clinical samples
- Predictions of population-based sequences generally worse
- Approach: incorporation of clinical markers into the prediction model
- Data: HOMER cohort, coreceptors determined with Trofile
- Results: significant improvements with clinical markers such as CD4 cell counts and viral loads
(Sing et al., 2007)
Geno2pheno[coreceptor] version 2.0
Regions involved in coreceptor binding
- V3 is not the only region involved in coreceptor binding
- Mutations beyond V3 affecting coreceptor usage reported, e.g. in:
  - Boyd et al., 1993: A single amino acid substitution in the V1 loop of human immunodeficiency virus type 1 gp120 alters cellular tropism.
  - Koito et al., 1995: Small amino acid sequence changes within the V2 domain can affect the function of a T-cell line-tropic human immunodeficiency virus type 1 envelope gp120.
  - Carrillo et al., 1996: Human immunodeficiency virus type 1 tropism for T-lymphoid cell lines: role of the V3 loop and C4 envelope determinants.
  - Cho et al., 1998: “…both the V1/V2 and V3 regions increased the efficiency of CXCR4 use”
- However: no analysis on a large dataset of experimentally determined genotype-phenotype pairs
(Kwong et al., 1998)
V2-V3 dataset
- 916 samples from 312 different patients
- Epidemiological bias reduced: at most one R5 and one X4 sequence per patient allowed (randomly selected)
- Experiments repeated 10 times
- Features / positions had to be significant in all 10 runs
- Position numbering according to Consensus B (Los Alamos, July 2007)
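The per-patient subsampling used to reduce epidemiological bias could look roughly like this. The record layout `(patient_id, phenotype, sequence)`, the function name and the placeholder sequences are assumptions made for the sketch; the slides only state the rule (at most one R5 and one X4 sequence per patient, chosen at random).

```python
import random


def one_per_phenotype(samples, seed=0):
    """Keep one randomly chosen sequence per (patient, phenotype) pair.

    samples: list of (patient_id, phenotype, sequence) tuples.
    """
    rng = random.Random(seed)
    groups = {}
    for patient_id, phenotype, sequence in samples:
        groups.setdefault((patient_id, phenotype), []).append(sequence)
    # Sort the keys so the result is reproducible for a given seed.
    return [
        (patient_id, phenotype, rng.choice(sequences))
        for (patient_id, phenotype), sequences in sorted(groups.items())
    ]


# Hypothetical records; sequence strings are placeholders.
samples = [
    ("p1", "R5", "SEQ_A"),
    ("p1", "R5", "SEQ_B"),
    ("p1", "X4", "SEQ_C"),
    ("p2", "R5", "SEQ_D"),
]
reduced = one_per_phenotype(samples)
```

Calling the function with ten different seeds would give ten resampled datasets, in the spirit of "experiments repeated 10 times".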
Data preparation
- region of extensive length polymorphism
- Sequences profile-aligned with ClustalW
Properties of the different regions
Region           Feature                      Mean in R5   Mean in X4   p-value
V2-stem          # pos. charged amino acids    5.73         5.72        0.9381
V2-stem          # neg. charged amino acids    3.10         2.95        0.1201
V2-stem          charge                        2.63         2.77        0.3926
V2-stem          # N-glycosylation sites       0.91         0.81        0.0247
V2-stem          length of region             32.95        32.86        0.2301
V2-polymorphic   # pos. charged amino acids    0.63         1.03        0.0037
V2-polymorphic   # neg. charged amino acids    1.85         1.48        0.0006
V2-polymorphic   charge                       -1.21        -0.45        < 0.0001
V2-polymorphic   # N-glycosylation sites       0.69         0.83        0.1981
V2-polymorphic   length of region              8.63         9.51        0.0495
V2 (full)        # pos. charged amino acids    6.37         6.75        0.0718
V2 (full)        # neg. charged amino acids    4.95         4.43        0.0004
V2 (full)        charge                        1.41         2.31        0.0001
V2 (full)        # N-glycosylation sites       1.61         1.65        0.6959
V2 (full)        length of region             41.58        42.38        0.0747
C2               # pos. charged amino acids   10.40        10.44        0.8100
C2               # neg. charged amino acids    6.71         6.80        0.5728
C2               charge                        3.68         3.63        0.8269
C2               # N-glycosylation sites       5.77         5.78        0.9309
C2               length of region             98.93        98.98        0.1789
V3               # pos. charged amino acids    6.08         7.86        < 0.0001
V3               # neg. charged amino acids    1.69         1.24        < 0.0001
V3               charge                        4.39         6.62        < 0.0001
V3               # N-glycosylation sites       0.98         0.59        < 0.0001
V3               length of region             34.90        35.08        0.1532
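The per-region features in this table can be computed from a sequence along the following lines. The sketch assumes that K and R count as positively charged, D and E as negatively charged, that charge is their difference (consistent with the table values), and that an N-glycosylation site is the usual sequon N-X-S/T with X not proline; the slides do not give these definitions explicitly.

```python
import re


def region_features(region_seq):
    """Compute charge, glycosylation and length features for one region."""
    seq = region_seq.replace("-", "")  # drop alignment gaps
    n_pos = sum(seq.count(aa) for aa in "KR")
    n_neg = sum(seq.count(aa) for aa in "DE")
    # Sequon N-X-[ST] with X != P; the lookahead lets overlapping sites count.
    n_glyco = len(re.findall(r"N(?=[^P][ST])", seq))
    return {
        "n_pos": n_pos,
        "n_neg": n_neg,
        "charge": n_pos - n_neg,
        "n_glyco": n_glyco,
        "length": len(seq),
    }


# Example: the first V3 sequence from the genotype-phenotype slide.
features = region_features("CTRPNNNTRRSISIGPGRAFYATGDIIGDIRQAHC")
```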
Mutations significantly associated with coreceptor usage
- Fisher’s exact test
- every occurring mutation at every position within V2-V3 tested
- Significant over all 10 replicates:
  - 105 mutations at 52 positions in total (V2: 17/12, C2: 12/10, V3: 76/30)
  - 64 mutations at 42 positions correlated with X4-phenotype (V2: 8/8, C2: 8/8, V3: 48/22)
  - 41 mutations at 40 positions correlated with R5-phenotype (V2: 9/9, C2: 4/4, V3: 28/27)
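For a single mutation, Fisher's exact test works on a 2x2 table (mutation present/absent versus X4/R5). Below is a small self-contained sketch of the two-sided test, using the common convention of summing all tables that are no more probable than the observed one. The slides do not say whether a one- or two-sided test was used, and the counts in the example are made up.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: mutation present / absent. Columns: X4 / R5.
    """
    row1, col1, col2 = a + b, a + c, b + d
    total = comb(col1 + col2, row1)

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(col1, x) * comb(col2, row1 - x)

    observed = weight(a)
    lowest, highest = max(0, row1 - col2), min(row1, col1)
    # Integer weights avoid floating-point ties when comparing tables.
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / total


# Hypothetical counts: mutation seen in 8 X4 and 2 R5 samples,
# absent in 1 X4 and 5 R5 samples.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

In the setting described above, such a test would be run for every observed mutation at every position, and a mutation kept only if it is significant in all 10 resampled datasets.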
Prediction results
- C2 slightly better than guessing
- V2 surprisingly “good”
- V2V3 significantly better than V3 alone (P = 0.0019)

Region   Ø AUC
C2       0.658
V2       0.730
V3       0.914
V2V3     0.933

[Figure annotations: 11/25-rule, Specificity: 96.1%, Sensitivity: 66.7%; 83.5%; 78.5%]
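The AUC values reported here summarize how well a score separates X4 from R5 samples. One way to compute it, sketched below, uses the rank interpretation: the AUC is the probability that a randomly chosen X4 sample gets a higher score than a randomly chosen R5 sample. The sketch assumes higher scores mean "more X4-like"; the scores are invented.

```python
def auc(scores_x4, scores_r5):
    """AUC as P(score of random X4 > score of random R5), ties count 1/2."""
    wins = 0.0
    for s_x4 in scores_x4:
        for s_r5 in scores_r5:
            if s_x4 > s_r5:
                wins += 1.0
            elif s_x4 == s_r5:
                wins += 0.5
    return wins / (len(scores_x4) * len(scores_r5))


# Hypothetical prediction scores.
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
```

An AUC of 0.5 corresponds to guessing and 1.0 to perfect separation, which is the scale on which C2 (0.658) and V2V3 (0.933) are compared.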
Evaluation on clinical isolates
- 268 samples from therapy-naïve patients with Trofile phenotype
- models trained on Los Alamos data
- results:
  - sensitivity at specificity of 90%: V3-alone: 54.2%, V2+V3: 62.8%
  - area under the ROC curve: V3-alone: 0.778, V2+V3: 0.841
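"Sensitivity at specificity of 90%" means fixing the decision cutoff so that at least 90% of R5 samples are called R5, then reporting the fraction of X4 samples detected. A minimal sketch, assuming a generic score where values above the cutoff are called X4 (geno2pheno's own output scale may be oriented differently); all scores are hypothetical.

```python
def sensitivity_at_specificity(scores_x4, scores_r5, target_specificity=0.90):
    """Return (cutoff, sensitivity, specificity) for the lowest cutoff
    that reaches the target specificity. Scores above the cutoff are X4."""
    for cutoff in sorted(set(scores_x4) | set(scores_r5)):
        true_negatives = sum(1 for s in scores_r5 if s <= cutoff)
        specificity = true_negatives / len(scores_r5)
        if specificity >= target_specificity:
            true_positives = sum(1 for s in scores_x4 if s > cutoff)
            return cutoff, true_positives / len(scores_x4), specificity
    raise ValueError("scores_r5 must not be empty")


# Hypothetical scores for 10 R5 and 4 X4 samples.
r5_scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.6]
x4_scores = [0.3, 0.5, 0.7, 0.9]
cutoff, sens, spec = sensitivity_at_specificity(x4_scores, r5_scores)
```

Comparing models at the same specificity, as done for V3-alone versus V2+V3, removes the effect of each model's default cutoff.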
Tropism reference labs
- Dr. Rolf Kaiser, Uni Köln
- Patrick Braun, PZB Aachen
- Dr. Berg, Berlin
- Labor Jajaprax, München
- Dr. Martin Stürmer, Uni Frankfurt
- Labor Lademannbogen, Hamburg
- Labor Thiele, Kaiserslautern
- Labor Schönian Harzer / Raunheim
- Labor Fenner, Hamburg
- NRZ, Dr. H. Walter, Uni Erlangen
Results on German “cohort”
- reference labs sequence V3 loop
- comparison between Trofile assay and geno2pheno
- dataset: 234 genotype-phenotype pairs in February 2008; 161 (68.8%) R5, 73 (31.2%) X4
- Geno2pheno results:
  - 10%-FPR: 64.4% sensitivity, 87.6% specificity
  - 20%-FPR: 75.3% sensitivity, 76.4% specificity
- large differences between different labs
=> see talk by M.Obermeier
How does geno2pheno[coreceptor] perform on other subtypes?
- dataset used for training mainly consists of subtype B samples
- only a small number of genotype-phenotype pairs of other subtypes
- idea: large-scale analysis of all V3 sequences (with and without phenotype) from the Los Alamos Sequence Database
- assumptions:
  - sequences in Los Alamos reflect the overall population to some extent
  - geno2pheno[coreceptor] prediction distribution resembles the real distribution of phenotypes
- if subtype B prediction scores are distributed similarly to those of another subtype, then geno2pheno should also work for this subtype
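One way to quantify "similarly distributed" prediction scores is a two-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical distribution functions of two score samples. This is a technique chosen here for illustration; the slides do not say how the distributions were actually compared, and the scores below are invented.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    for x in sorted(set(a) | set(b)):
        # Fraction of each sample with a value <= x.
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical prediction scores for subtype B and another subtype.
scores_subtype_b = [1, 2, 3, 4]
scores_other = [3, 4, 5, 6]
gap = ks_statistic(scores_subtype_b, scores_other)
```

A value near 0 means the two score distributions are close; a value near 1 means they barely overlap, which under the assumptions listed above would argue against transferring the model to that subtype unchanged.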
How does geno2pheno[coreceptor] perform on other subtypes?
[Figure; legend: R5, X4]
How does geno2pheno[coreceptor] perform on other subtypes?
- Kiwanuka et al., J Infect Dis, 2008: “The median time to AIDS onset was shorter for persons with subtype D (6.5 years), recombinant subtypes (5.6 years), or multiple subtypes (5.8 years), compared with persons with subtype A (8.0 years; P .022).”
- Nelson et al., AIDS 2007: “The survival among individuals in Thailand infected with HIV-1 subtype E appears to be similar to that reported among individuals in Africa infected with HIV-1 subtype D.” “The fact that both young military conscripts and blood donors and their wives in Thailand had similarly shortened survival compared to persons in the U.S. and Africa–except those infected with subtype D viruses–suggests that viral subtypes D and E may be more virulent than many other viral subtypes,”
How does geno2pheno[coreceptor] perform on other subtypes?
??? ???