Learning Drug Resistance from Therapeutic History
Alejandro Pironti
Computational Biology and Applied Algorithmics Max-Planck-Institut für Informatik May 12, 2015
Motivation
Genotypic drug-resistance determination; goals
Figure 1: The genotype codes for the phenotype.
Figure 2: Data-driven or rules-based? The strengths and weaknesses of each approach make them complementary.
Drug     DPRRT     DIN   TPRRT     TIN   HIVdb
ABC      7,862     256   1,661     136     363
AZT     20,923     332   3,796     214   1,190
d4T     15,172     209   2,705     124   1,101
ddC      4,928      40   1,189      56     339
ddI     13,836     176   2,552     119     817
FTC      4,699     261     788     152      83
3TC     23,063     406   3,954     242       0
TDF      9,873     349   1,636     200     272
DLV        142       8      90      23       0
EFV     10,311     221   1,922     155     454
ETR        272      49     130      59       2
NVP      9,094     167   1,635     132     570
APV      1,369      50     375      36     206
ATV      3,549     198     738     102      76
DRV        936     121     313      90       7
FPV      1,139      72     274      52      30
IDV     10,760     155   2,053      93     812
LPV      8,951     246   1,837     175     197
NFV      8,407     126   1,520      77     764
SQV      7,371     126   1,764      97     493
TPV        673      73     217      49      11
RAL        694     132     209      95       0
Naïve   37,408   2,188   2,453     184       0
Total   63,593   2,674   6,886     461   1,517

RPV (5, 2, 0) and EVG (10, 3, 0) are listed with only three values each; their columns are not marked.
Table: Numbers of sequences by dataset and drug exposure. DPRRT: training PRRT dataset; DIN: training IN dataset; TPRRT: test PRRT dataset; TIN: test integrase dataset.
        ------------ Train ------------  ------------ Test -------------
Drug        AV      PS  Resist.   Total      AV      PS  Resist.   Total   Total
3TC        912    1537     1623    2449     108     175      184     283    2732
ABC        851    1468      902    2319      96     171       96     267    2586
AZT        859    1555     1234    2414     103     177      137     280    2694
d4T        898    1562     1026    2460     101     179      104     280    2740
ddC        833     448      139    1281      93      49       15     142    1423
ddI        900    1563      167    2463     102     180       17     282    2745
TDF        648    1224      696    1872      72     142       75     214    2086
DLV       1036    1621     1055    2657     106     186      109     292    2949
EFV       1133    1636     1362    2769     114     187      135     301    3070
ETR        374     460      268     834      32      68       35     100     934
NVP       1194    1640     1477    2834     122     188      156     310    3144
RPV         93     173       93     266      12      24       15      36     302
ATV        773    1156      975    1929      86     109       99     195    2124
DRV        270     648      349     918      34      60       33      94    1012
FPV       1086    1705     1413    2791     112     183      138     295    3086
IDV       1144    1739     1409    2883     132     189      159     321    3204
LPV       1041    1485     1486    2526     112     155      150     267    2793
NFV       1178    1783     1646    2961     134     196      180     330    3291
SQV       1177    1743     1187    2920     133     193      134     326    3246
TPV        742     880      584    1622      80      80       55     160    1782
EVG        106     589      206     695       8      70       26      78     773
RAL        106     622      220     728       8      73       30      81     809
Table: Numbers of Antivirogram (AV) and PhenoSense (PS) genotype-phenotype pairs.
Schematic Representation of a Support Vector Machine
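To make the schematic concrete, the following Python sketch evaluates a linear decision function f(x) = w·x + b, the margin width 2/||w||, and the points that lie exactly on the margin (the support vectors). It performs no training. The weights, bias, and 2-D points are hypothetical toy values picked by hand; the slides do not state which kernel, features, or parameters were used.

```python
import math

# Hypothetical, hand-picked separating hyperplane: w . x + b = 0
W = (1.0, 1.0)
B = -3.0

# Hypothetical toy points with class labels +1 / -1
POINTS = [
    ((2.0, 2.0), +1),
    ((3.0, 3.0), +1),
    ((1.0, 1.0), -1),
    ((0.0, 0.0), -1),
]


def decision_value(x, w=W, b=B):
    """Signed score f(x) = w . x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(x, w=W, b=B):
    """Predicted class label (+1 or -1) for point x."""
    return 1 if decision_value(x, w, b) >= 0 else -1


def margin_width(w=W):
    """Distance between the two margin hyperplanes f(x) = +1 and f(x) = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def support_vectors(points, w=W, b=B, tol=1e-9):
    """Points lying exactly on the margin, i.e. y * f(x) == 1 within tol."""
    return [x for x, y in points if abs(y * decision_value(x, w, b) - 1.0) <= tol]


if __name__ == "__main__":
    for x, y in POINTS:
        print(x, "label", y, "f(x) =", decision_value(x), "predicted", predict(x))
    print("margin width:", margin_width())
    print("support vectors:", support_vectors(POINTS))
```

In a trained support vector machine, w and b would be chosen to maximize this margin on the training data instead of being fixed by hand.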
Drug          DES           DES After Cutoffs   HIVdb Rule Set
3TC/FTC       0.84          0.81                0.73
ABC           0.76          0.72                0.68
AZT           0.84          0.81                0.74
d4T           0.85          0.82                0.77
ddC           0.84          0.50                -
ddI           0.86          0.83                0.77
TDF           0.73          0.69                0.61
EFV           0.77          0.74                0.70
ETR           0.78          0.75                0.72
NVP           0.77          0.74                0.70
APV/FPV       0.80          0.73                0.74
ATV           0.61          0.58                0.56
DRV           0.65          0.62                0.62
IDV           0.79          0.76                0.70
LPV           0.70          0.67                0.65
NFV           0.79          0.50                0.74
SQV           0.81          0.78                0.72
TPV           0.83          0.79                0.80
RAL           0.75          0.69                -
Naïve PRRT    0.88          0.83                -
Naïve IN      0.65          0.64                -
Mean CD (SD)  0.77 (0.07)   0.73 (0.09)         0.70 (0.06)
Mean AM (SD)  0.78 (0.07)   0.71 (0.10)         -

Table 1: Drug-Exposure Prediction Performance (AUC) on EuResist test set.

Drug          DES           DES After Cutoffs   HIVdb Rule Set
3TC/FTC       0.73          0.66                0.76
ABC           0.70          0.66                0.66
AZT           0.62          0.60                0.67
d4T           0.65          0.62                0.65
ddI           0.73          0.69                0.68
TDF           0.57          0.54                0.55
EFV           0.83          0.79                0.79
ETR           0.58          0.63                0.64
ATV           0.61          0.57                0.54
DRV           0.88          0.89                0.89
IDV           0.76          0.73                0.73
LPV           0.67          0.64                0.64
NFV           0.76          0.50                0.71
SQV           0.76          0.73                0.74
TPV           0.79          0.74                0.78
Mean CD (SD)  0.72 (0.09)   0.67 (0.10)         -

Table 1: Drug-Exposure Prediction Performance (AUC) on HIVdb test set.

DES: drug-exposure scores; CD: Common drugs; AM: All Models; -: no value given.
Drug        Antivirogram log RF Correlation   PhenoSense log RF Correlation   Resistant vs. Naïve after cutoffs AUC
3TC/FTC     0.75                              0.76                            0.99
ABC         0.65                              0.73                            1.00
AZT         0.27                              0.50                            1.00
d4T         0.38                              0.55                            0.99
ddI         0.49                              0.45                            0.98
TDF         0.26                              0.24                            0.99
EFV         0.71                              0.74                            0.99
ETR         0.71                              0.65                            0.99
RPV         0.75                              0.70                            1.00
NVP         0.75                              0.60                            0.99
APV/FPV     0.85                              0.88                            1.00
ATV         0.84                              0.89                            0.99
DRV         0.72                              0.89                            1.00
IDV         0.82                              0.84                            1.00
LPV         0.88                              0.92                            1.00
NFV         0.79                              0.85                            1.00
SQV         0.78                              0.80                            1.00
TPV         0.48                              0.64                            0.99
RAL         0.62                              0.71                            0.96
EVG         0.71                              0.67                            0.99
Mean (SD)   0.66 (0.19)                       0.70 (0.17)                     0.99 (0.01)
Table 1: Correlation of drug-exposure scores with log resistance factors (log RF). Additionally, cutoffs were applied to the drug-exposure scores, and their ability to discriminate between resistant genotypes and therapy-naïve genotypes was assessed (AUC).
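The two evaluations named in this caption can be sketched in plain Python: a Pearson correlation between scores and log resistance factors, and an AUC for separating resistant from therapy-naïve genotypes before and after a cutoff. This is only an illustration under stated assumptions: the base-10 logarithm, the single cutoff of 0.5, and all toy numbers are hypothetical, and the slides do not say which correlation coefficient or how many cutoffs were used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (equivalent to the Mann-Whitney U statistic).
    """
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def apply_cutoff(scores, cutoff):
    """Discretize scores: 1 if score >= cutoff, else 0 (assumed single cutoff)."""
    return [1 if s >= cutoff else 0 for s in scores]


# Hypothetical toy data, not taken from the slides.
toy_scores = [0.1, 0.5, 0.9]                 # drug-exposure scores
toy_resistance_factors = [1.0, 10.0, 100.0]  # fold-change in resistance
toy_log_rf = [math.log10(rf) for rf in toy_resistance_factors]

resistant_scores = [0.9, 0.8, 0.4]  # scores of resistant genotypes
naive_scores = [0.1, 0.2, 0.6]      # scores of therapy-naive genotypes
HYPOTHETICAL_CUTOFF = 0.5

if __name__ == "__main__":
    print("correlation with log RF:", pearson(toy_scores, toy_log_rf))
    print("AUC on raw scores:", auc(resistant_scores, naive_scores))
    print(
        "AUC after cutoff:",
        auc(
            apply_cutoff(resistant_scores, HYPOTHETICAL_CUTOFF),
            apply_cutoff(naive_scores, HYPOTHETICAL_CUTOFF),
        ),
    )
```

On these toy values, discretizing the scores introduces ties and lowers the AUC, which is the kind of effect a before-and-after-cutoff comparison measures.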
                                     EuResist TCEs   HIVdb TCEs
Drug Exposure Scores After Cutoffs   0.68            0.63
HIVdb Rule Set                       0.67            0.66
Table 2: Therapy-success prediction performance (AUC).
Thomas Lengauer Nico Pfeifer Joachim Büch Prabhav Kalaghatgi
Björn Jensen
Rolf Kaiser Mark Oette Saleta Sierra Aragon Elena Knops Maria Neumann-Fraune Eugen Schülter Eva Heger Claudia Müller Nadine Lübcke
Hauke Walter Martin Obermeier
Martin Däumer Alexander Thielen Bernhard Thiele
Francesca Incardona Maurizio Zazzi Mattia Prosperi
Claudia Kücherer