RDI
HIV Resistance Response Database Initiative
Seville 2002 Presentation
Goal
To develop a relational database to correlate HIV drug resistance-associated genotype data with response to antiretroviral agents
Initial aim is to collect genotype, treatment & clinical outcome information (VL & CD4) from substantial numbers of patients
To organise data in an Oracle-based relational database
To analyse data using a number of approaches to relate resistance mutation patterns with clinical response
Provide wide access to interrogation of data via the internet
Core Team
Mission of the Core Team:
Contribute data & develop database
Ensure data meets appropriate QA standards
Develop initial data analysis plan
Review requests to analyse data from the database
Data Input
Inputs to the DATABASE:
Sequence
VL & CD4 (baseline & follow-up)
Therapies
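The three data inputs above map naturally onto a relational layout. What follows is a minimal sketch only: Python's built-in sqlite3 stands in for the Oracle database named in the slides, and every table name, column name and inserted value is a hypothetical illustration, not the RDI schema or RDI data.

```python
import sqlite3

# Hypothetical schema: all table and column names are assumptions made for
# illustration; SQLite stands in for the Oracle database named in the slides.
SCHEMA = """
CREATE TABLE patient (patient_id INTEGER PRIMARY KEY);
CREATE TABLE sequence (
    patient_id INTEGER REFERENCES patient(patient_id),
    sample_week INTEGER,                      -- 0 = baseline
    region TEXT CHECK (region IN ('RT', 'PR')),
    codon INTEGER                             -- mutated codon position
);
CREATE TABLE lab_result (
    patient_id INTEGER REFERENCES patient(patient_id),
    sample_week INTEGER,                      -- 0 = baseline, >0 = follow-up
    vl_log10 REAL,                            -- viral load (log10)
    cd4 INTEGER                               -- CD4 count
);
CREATE TABLE therapy (
    patient_id INTEGER REFERENCES patient(patient_id),
    drug TEXT,
    start_week INTEGER
);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up patient."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.execute("INSERT INTO patient VALUES (1)")
    con.executemany(
        "INSERT INTO sequence VALUES (1, 0, ?, ?)",
        [("RT", 41), ("RT", 215), ("PR", 82)],
    )
    con.executemany(
        "INSERT INTO lab_result VALUES (1, ?, ?, ?)",
        [(0, 5.0, 200), (8, 3.5, 260)],  # illustrative values only
    )
    con.execute("INSERT INTO therapy VALUES (1, 'ddI', 0)")
    return con


def vl_change(con, patient_id, week):
    """Follow-up viral load minus baseline viral load (log10), or None."""
    row = con.execute(
        """SELECT f.vl_log10 - b.vl_log10
           FROM lab_result b
           JOIN lab_result f ON f.patient_id = b.patient_id
           WHERE b.patient_id = ? AND b.sample_week = 0 AND f.sample_week = ?""",
        (patient_id, week),
    ).fetchone()
    return None if row is None else row[0]
```

Keeping sequence, laboratory and therapy records in separate tables keyed by patient lets a single join relate a baseline genotype to a later viral load change.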
Data QA
Specific QA standards applied to data submitted to the database
Developed for:
Clinical Data (Cohort, Clinical Trial)
Sequencing Data (RT, Protease regions)
Data Analysis
Data identified from about 3500 patients
– More when additional genotyping performed
Oracle database hardware & software in place (with dedicated support)
– DB architecture constructed
Power calculations have been performed to estimate approx. number of required data points (see DiRienzo & DeGruttola)
Initial NN models constructed (see Wang et al)
Neural Network Model
(Dechao Wang et al)
Inputs: Mutations, VL (baseline & follow-up), Therapies
→ Neural Network →
Output: predicted ∆VL
62-Parameter Neural Network model
(Dechao Wang et al)
Input variables
20 PI codon positions
29 RT codon positions
12 drugs (5 PIs, 5 NRTIs, 2 NNRTIs)
Duration of therapy (weeks)
Output variable
Viral load change at on-therapy time points
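The listed inputs add up to 20 + 29 + 12 + 1 = 62, which appears to be where the "62-parameter" label comes from. The sketch below shows one way such an input vector could be assembled. It assumes a simple 0/1 indicator per codon position and per drug, which the slides do not state, and the position and drug lists are placeholders because the slides do not name the actual ones.

```python
from typing import Iterable, List, Sequence

# Placeholders only: the slides give the counts (20, 29, 12) but not the
# actual codon positions or drug names used in the model.
PI_POSITIONS = list(range(1, 21))            # 20 hypothetical PI codon slots
RT_POSITIONS = list(range(1, 30))            # 29 hypothetical RT codon slots
DRUGS = [f"drug_{i}" for i in range(12)]     # 12 hypothetical drug labels


def encode_inputs(
    pi_mutations: Iterable[int],
    rt_mutations: Iterable[int],
    regimen: Iterable[str],
    weeks_on_therapy: float,
    pi_positions: Sequence[int] = PI_POSITIONS,
    rt_positions: Sequence[int] = RT_POSITIONS,
    drugs: Sequence[str] = DRUGS,
) -> List[float]:
    """Build the input vector: PI flags, RT flags, drug flags, then weeks.

    Assumes binary presence/absence encoding (an assumption, not taken
    from the slides).
    """
    pi_set, rt_set, drug_set = set(pi_mutations), set(rt_mutations), set(regimen)
    vec = [1.0 if p in pi_set else 0.0 for p in pi_positions]
    vec += [1.0 if p in rt_set else 0.0 for p in rt_positions]
    vec += [1.0 if d in drug_set else 0.0 for d in drugs]
    vec.append(float(weeks_on_therapy))
    return vec


example = encode_inputs([10], [1, 29], ["drug_0", "drug_11"], 8)
print(len(example))
```

With the placeholder lists above, the vector has 20 PI flags, 29 RT flags, 12 drug flags and one duration value.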
VGI Vigilance II Database
Viral load (598)
Genotype (781)
Regimen (715)
442 samples
Linear Regression Analysis: Training Set
Training set (n=639)
R² = 0.85
[Figure: scatter plot; axes: Actual viral load change, Predicted viral load change]
Linear Regression Analysis: Independent Set
Validation set (n=63)
R² = 0.55
[Figure: scatter plot; axes: Actual viral load change, Predicted viral load change]
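The slides report R² for predicted against actual viral load change but do not say how it was computed. As a sketch, assuming it is the squared Pearson correlation of a simple linear fit between the two series, it could be calculated as follows (the function name is illustrative):

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("R2 is undefined when either series is constant")
    return cov * cov / (var_a * var_p)
```

Applied separately to the training and validation sets, a gap such as 0.85 against 0.55 is the usual sign that a model fits its training data better than unseen data.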
Predicting VL Trajectory: Test Data Set
[Figure: Viral Load vs. Time]
NN Prediction: 75%±1.8% correct (~86/115)
“In silico” Response Prediction
Baseline genotype for a virtual patient
RT: 41, 67, 118, 210, 215
PI: 10, 46, 82, 90
Alternative therapy regimens
d4T, ddI, Kaletra
d4T, ddI, indinavir
AZT, 3TC, Kaletra
AZT, 3TC, indinavir
“In silico” Response Prediction
[Figure: Viral Load (log10) at Baseline, week 8 and week 16 for four regimens: d4T/ddI/IDV, d4T/ddI/Kal, AZT/3TC/Kal, AZT/3TC/IDV]
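The in silico comparison amounts to holding one baseline genotype fixed and asking a predictor for the viral load change under each candidate regimen. The sketch below shows only that outer loop. The `predict` callable is a hypothetical stand-in for a trained model, and no model or prediction values from the presentation are reproduced here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Genotype = Dict[str, Tuple[int, ...]]
Regimen = Tuple[str, ...]
# Hypothetical signature: (genotype, regimen, week) -> predicted change in
# viral load (log10). The real model interface is not given in the slides.
Predictor = Callable[[Genotype, Regimen, int], float]

# Virtual patient and candidate regimens as listed on the slides.
VIRTUAL_PATIENT: Genotype = {
    "RT": (41, 67, 118, 210, 215),
    "PI": (10, 46, 82, 90),
}
REGIMENS: List[Regimen] = [
    ("d4T", "ddI", "IDV"),
    ("d4T", "ddI", "Kal"),
    ("AZT", "3TC", "Kal"),
    ("AZT", "3TC", "IDV"),
]


def rank_regimens(
    genotype: Genotype,
    regimens: Sequence[Regimen],
    predict: Predictor,
    week: int,
) -> List[Tuple[Regimen, float]]:
    """Return (regimen, predicted change) pairs, largest predicted drop first."""
    scored = [(regimen, predict(genotype, regimen, week)) for regimen in regimens]
    return sorted(scored, key=lambda pair: pair[1])
```

Any callable with the signature shown can be passed as `predict`, for example a thin wrapper around a trained network that first encodes the genotype and regimen into its input vector.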
Summary
The RDI is focused on establishing relationships between baseline genotype & virological response via analysis of a large clinical dataset
Significant progress has been made:
– Sources of data identified
– Database architecture constructed
– Modeling work has begun
This initiative is open for groups to join & aims to provide open access to query the database
Utilization of large databases is likely to improve the accuracy of genotypic interpretation