www.hivarca.net
Jun 1995. The Clinical Virology Unit at the University Hospital of Siena starts low-cost HIV genotyping as a public service
Feb 2002. The clinical units are asked for their availability to integrate the sequence db with clinical data (“genotype-response”)
[Diagram: PHENOTYPE vs GENOTYPE. Labels: fold-resistance; cut-off (biological, clinical ??); prediction (VirtualPhenotype); >20 interpretation systems; commercial providers. Outcome: choice of the best regimen]
INPUT DATA (BASELINE VARIABLES) → OUTPUT DATA (RESPONSE)
PATIENT CASE (BASELINE) → PREDICTION (RESPONSE)
Data from CLINIC and LABORATORY: demographics, therapy, AIDS events, HBV/HCV status, HIV RNA, CD4, genotype
Jun 2002. Informa srl to provide project management and HW/SW support. GSK funding through an unrestricted educational grant
Jan 2004. ARCA web site launched. Other units enter the network (clinics: N = 70; laboratories: N = 30)
As of April 2008
PATIENTS (n = 11,523): PatientID, Gender, Year of birth, Country of origin, Transmission route, HCV status, HBV status
THERAPY (n = 42,596): PatientID, Treatment regimen, Date of start, Date of stop, Reason for change/stop
GENOTYPE (n = 18,695): PatientID, Date, Sequence, Method, Subtype, Resistance mutations, Other mutations
CD4 (n = 180,361): PatientID, Date, CD4/mm³, CD4%
HIV RNA (n = 145,925): PatientID, Date, Copies/ml, <LLD (undetectable), Method
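The five tables listed above, all linked through PatientID, could be laid out roughly as follows. This is only a sketch: the column names, column types and the use of SQLite are assumptions made for illustration, not the actual ARCA implementation.

```python
import sqlite3

# Hypothetical schema mirroring the five tables on the slide.
# Column names and types are illustrative assumptions.
SCHEMA = """
CREATE TABLE patients (
    patient_id INTEGER PRIMARY KEY,
    gender TEXT, year_of_birth INTEGER, country_of_origin TEXT,
    transmission_route TEXT, hcv_status TEXT, hbv_status TEXT
);
CREATE TABLE therapy (
    patient_id INTEGER REFERENCES patients(patient_id),
    regimen TEXT, date_start TEXT, date_stop TEXT, reason_for_stop TEXT
);
CREATE TABLE genotype (
    patient_id INTEGER REFERENCES patients(patient_id),
    date TEXT, sequence TEXT, method TEXT, subtype TEXT,
    resistance_mutations TEXT, other_mutations TEXT
);
CREATE TABLE cd4 (
    patient_id INTEGER REFERENCES patients(patient_id),
    date TEXT, cd4_per_mm3 INTEGER, cd4_percent REAL
);
CREATE TABLE hiv_rna (
    patient_id INTEGER REFERENCES patients(patient_id),
    date TEXT, copies_per_ml INTEGER, below_lld INTEGER, method TEXT
);
"""

def create_db(path=":memory:"):
    """Create an empty database with the sketched schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

conn = create_db()
conn.execute("INSERT INTO patients (patient_id, gender, year_of_birth) VALUES (1, 'F', 1970)")
# below_lld = 1 marks an undetectable result (below the lower limit of detection).
conn.execute("INSERT INTO hiv_rna (patient_id, date, copies_per_ml, below_lld) "
             "VALUES (1, '2008-04-01', NULL, 1)")
```

Keeping every measurement in its own table keyed by patient and date is what lets one patient carry many genotypes, CD4 counts and viral loads.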
Ratio naïve / pretreated increased (now ∼ 0.2)
509 patients with ≥ 5 genotypes
119 patients with ≥ 10 genotypes
[Chart: no. of patients by no. of follow-up genotypes]
[Chart: HIV-1 subtype distribution, including URFs]
Based on pol subtyping of the first available HIV-1 sequence from 10,778 patients (ARCA cohort)
Number of patients with treatment records with latest drugs
Feb 2005. ARCA opens to external research proposals
The requesting investigator fills in and posts the dedicated form
The scientific board approves the proposal and sends it to the ARCA centres, unless the proposal is unacceptable or there are insufficient data
Each ARCA centre either agrees or does not agree to share its data
The study is performed on the data subset from the centres willing to share their data
The authorship (up to 15 authors) is established proportionally to the valid cases. All the centres sharing their data but not included in the authorship are given credits for future use (see page ‘Info’ on the ARCA web site)
Study1 (four slots)
Unit     Cases   %    Credit
Unit A   n1      21
Unit C   n2      16
Unit F   n3      12
Unit H   n4      10
Unit B   n5      9    9
Unit X   n6      8    8
Unit Z   n7      7    7
…        …       …    …
In the authorship: Units A, C, F, H. Not in the authorship, credit gained: Units B, X, Z.

Study2 (three slots)
Unit     Cases   %    Past credit   Total   Credit
Unit A   n1      21                 21
Unit D   n2      16   1             17
Unit E   n3      12   2             14      14
Unit G   n4      10   3             13      13
Unit B   n5      9    9             18
Unit Y   n6      8    2             10      10
Unit W   n7      7    2             9       9
…        …       …    …             …       …
In the authorship: Units A, B, D. Not in the authorship, credit gained: Units E, G, Y, W.
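The Study1 and Study2 examples can be reproduced with a few lines of code. This is a minimal sketch, assuming the rule is "rank units by % of valid cases plus past credit, give the available slots to the top-ranked units, and carry the total forward as credit for the others". The function name is hypothetical, and what happens to the past credit of a unit that does enter the authorship is not stated on the slides; here it is simply not carried forward.

```python
def allocate_authorship(shares, past_credit, slots):
    """Return (authors, new_credit) for one study.

    shares: {unit: % of valid cases contributed to this study}
    past_credit: {unit: credit carried over from earlier studies}
    slots: number of authorship positions available
    """
    totals = {unit: pct + past_credit.get(unit, 0) for unit, pct in shares.items()}
    # sorted() is stable, so ties keep the order in which units were listed.
    ranked = sorted(totals, key=lambda unit: totals[unit], reverse=True)
    authors = ranked[:slots]
    # Units left out keep their total as credit for future studies (assumption).
    new_credit = {unit: totals[unit] for unit in ranked[slots:]}
    return authors, new_credit

study1 = {"A": 21, "C": 16, "F": 12, "H": 10, "B": 9, "X": 8, "Z": 7}
authors1, credit1 = allocate_authorship(study1, {}, slots=4)

study2 = {"A": 21, "D": 16, "E": 12, "G": 10, "B": 9, "Y": 8, "W": 7}
past = {"D": 1, "E": 2, "G": 3, "B": 9, "Y": 2, "W": 2}
authors2, credit2 = allocate_authorship(study2, past, slots=3)
```

In Study2, Unit B contributes only 9% of the cases but enters the authorship because of the 9 credits gained in Study1, which is the point of the credit system.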
Requests have so far been posted only by ARCA affiliates, but being an affiliate is not necessary!
None of the ARCA centres has denied use of its own data
MAIN DRAWBACK: most of the posters/presentations accepted at conferences have not been turned into papers
Targeted conferences: Eur Workshop HIV Drug Res, Intl Workshop HIV Drug Res, CROI
Jan 2006. ARCA cooperates with the major HIV resistance db’s in Germany and Sweden in the EU-funded STREP “EuResist”
Integration of viral genomics with clinical data to predict response to anti-HIV treatment (STREP)
Management: Informa srl, Rome, Italy
Data providers & virology: ARCA, Italy; AREVIR, Germany; Karolinska DB, Sweden
Data modeling: Max Planck Society for the Advancement of Science, Germany; Dept. Automation, Engineering, Rome III, Italy; IBM Labs, Haifa, Israel; RMKI, Budapest, Hungary
Supervision & peer reviewing: EFPIA
Not a trivial process!
Duplicate patients
› Identical CD4 count and % on the same day
› Identical sequence(s) detected via checksum
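Both duplicate checks amount to grouping records by a key and flagging keys shared by more than one patient ID. The sketch below assumes hypothetical record layouts and uses MD5 as the checksum; the slides do not say which checksum ARCA uses.

```python
import hashlib
from collections import defaultdict

def sequence_checksum(seq):
    """Checksum of a sequence after removing whitespace and case differences."""
    normalized = "".join(seq.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()

def find_duplicate_candidates(records, key):
    """Group records by key(record); return patient-ID groups sharing a key."""
    groups = defaultdict(set)
    for rec in records:
        groups[key(rec)].add(rec["patient_id"])
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]

# Hypothetical CD4 records: same count and % on the same day under two IDs.
cd4_records = [
    {"patient_id": "A1", "date": "2007-03-01", "count": 350, "percent": 21.0},
    {"patient_id": "B7", "date": "2007-03-01", "count": 350, "percent": 21.0},
    {"patient_id": "C3", "date": "2007-03-01", "count": 500, "percent": 30.0},
]
cd4_suspects = find_duplicate_candidates(
    cd4_records, key=lambda r: (r["date"], r["count"], r["percent"]))

# Hypothetical genotype records: identical sequence, different formatting.
genotype_records = [
    {"patient_id": "A1", "sequence": "cctcag atc"},
    {"patient_id": "B7", "sequence": "CCTCAGATC"},
]
seq_suspects = find_duplicate_candidates(
    genotype_records, key=lambda r: sequence_checksum(r["sequence"]))
```

A match only marks a candidate pair; whether two IDs really belong to the same patient still has to be confirmed with the contributing centre.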
Many checking routines implemented when inserting/modifying data in the original database
Most difficult when importing data from poorly designed electronic records
› Most frequent problem: overlapping therapies. Progressively reduced to ∼2%
› Most challenging quality check: sequences. Formats, base mixtures, stop codons, frameshifts. Conservative strategy: get rid of uncertain data!
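Two of the checks named above, overlapping therapies and basic sequence quality, can be sketched as follows. The function names, the 5% mixture threshold and the assumption that a sequence starts in frame at its first base are illustrative choices, not ARCA's actual rules.

```python
from datetime import date

def overlapping_therapies(therapies):
    """Return adjacent pairs of overlapping therapies for one patient.

    therapies: list of (start, stop) dates; stop is None if still ongoing.
    After sorting by start date, any overlap implies that at least one
    adjacent pair overlaps, so checking neighbours is enough to flag a patient.
    A therapy starting on the day the previous one stops is not flagged.
    """
    ordered = sorted(therapies, key=lambda t: t[0])
    flagged = []
    for prev, nxt in zip(ordered, ordered[1:]):
        prev_stop = prev[1] or date.max
        if nxt[0] < prev_stop:
            flagged.append((prev, nxt))
    return flagged

STOP_CODONS = {"TAA", "TAG", "TGA"}
UNAMBIGUOUS = set("ACGT")

def sequence_issues(seq, max_mixture_fraction=0.05):
    """List quality problems in a nucleotide sequence (assumed in frame)."""
    seq = seq.upper()
    issues = []
    if len(seq) % 3 != 0:
        issues.append("length not a multiple of 3 (possible frameshift)")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if any(codon in STOP_CODONS for codon in codons):
        issues.append("stop codon")
    mixtures = sum(1 for base in seq if base not in UNAMBIGUOUS)
    if seq and mixtures / len(seq) > max_mixture_fraction:
        issues.append("too many base mixtures")
    return issues

history = [
    (date(2003, 1, 10), date(2004, 6, 1)),
    (date(2004, 5, 1), date(2005, 2, 1)),   # starts before the previous stop
    (date(2005, 2, 1), None),
]
overlaps = overlapping_therapies(history)
```

Under the conservative strategy described on the slide, a sequence for which `sequence_issues` returns anything would be excluded rather than repaired.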
The ARCA server periodically generates and sends tables with “suspicious” data
The ARCA Monitor interacts with the original clinic/lab and tries to fix the issues
› Issue solved
› Issue still pending: flagged and included in future inquiries
› Issue declared not solvable: flagged and not included in future inquiries
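The monitoring loop can be pictured as a small status model. This is a sketch under the assumption that pending issues are the ones sent again in the next inquiry, while issues declared not solvable stay flagged but are no longer sent; the names are hypothetical.

```python
from enum import Enum

class IssueStatus(Enum):
    PENDING = "pending"
    SOLVED = "solved"
    NOT_SOLVABLE = "not solvable"

def record_outcome(issue, solved, solvable=True):
    """Update an issue after the Monitor has contacted the clinic/lab."""
    if solved:
        issue["status"] = IssueStatus.SOLVED
    elif not solvable:
        issue["status"] = IssueStatus.NOT_SOLVABLE
    else:
        issue["status"] = IssueStatus.PENDING
    return issue

def issues_for_next_inquiry(issues):
    """Only issues still pending are included in the next table sent out."""
    return [i for i in issues if i["status"] is IssueStatus.PENDING]

issues = [
    {"id": 1, "status": IssueStatus.PENDING},
    {"id": 2, "status": IssueStatus.PENDING},
    {"id": 3, "status": IssueStatus.PENDING},
]
record_outcome(issues[0], solved=True)
record_outcome(issues[1], solved=False, solvable=False)
record_outcome(issues[2], solved=False)
next_round = issues_for_next_inquiry(issues)
```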
Sep 2006. Support from the main antiretroviral drug companies (statistics section)
Financial support for:
- HW & SW maintenance
- Data entry and cleansing
- Opportunities for young investigators (coming soon): travel grants, ARCA fellowship/award
Over 40 graphs related to demographics, treatments, mutations, subtypes…
On the whole db (public area)
On your own data (private area)
On-line secure db for the lab/clinic (specific information and summary functions on a given patient and on the whole patient population)
Genotype interpretation through the updated version of AntiRetroScan
Automatically updated at every new release of the interpreter
Virtual Phenotype pdf link provided
Subtyping periodically checked via phylogenetic analysis
Export of your own genotypes
Statistical pages on your own data (drug & drug regimen use over time, mutation & HIV clades prevalence over time…)
Upcoming query designer (suggestions welcome!)
A web platform for scheduled on-line discussion of clinical cases. This project is specifically supported by Boehringer Ingelheim
ARCA – summary and perspectives
Data collected from 100 lab/clinical units
› Most keep on transferring data on a regular basis
› A few have dropped out (cultural issues, time constraints)
Useful features for contributors
› On-line database for everyday clinical practice
› Built-in genotype interpretation
› Individual patient summary and overall statistics
Transparent system
› Publication policy and publications (however too few!)
› Balance sheet available on the ARCA web site
Data sharing policy rather successful (in Italy!!)
› Ongoing challenging attempts to cooperate or merge with other Italian cohorts (ICoNa, MASTER)
International co-operations
› Part of the EuResist Integrated Database (in line with the original genotype-response model aim)
› Cooperation with Virolab
Oct 2005. Data hosting and cooperation with the START project
STudy of Antiretroviral Resistance in Treated patients with virological failure
It is reasonable and partly demonstrated that additional variables impact the response to treatment
Modeling techniques allow use of partially available data
Definition of “optional” variables and analysis of their impact
[Timeline: genotype and viral load (0 to 90 days) → treatment switch → viral load (short-term model: 4-12 weeks)]
Optional variables: pre-therapy HIV RNA, reason for change, CD4, patient demographics (age, gender, race, route of infection), past genotypes, past treatments, past AIDS diagnosis
Vector size
› IAS mutations
› All positions
Functions
› Activity scores (several methods)
› Genetic barrier (probability not to develop resistance; see Beerenwinkel et al., JID 2005)
› Genetic progression score (time to develop a resistant mutational pattern; see Rahnenfuhrer et al., Bioinformatics 2005)
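The "vector size" choice concerns how a genotype is turned into model input. A minimal sketch of the mutation-list option is shown below: one binary feature per mutation in a fixed list. The five mutations are a small illustrative subset chosen for the example, not the IAS list itself, and the `GENE:mutation` naming is an assumption.

```python
# Illustrative subset of resistance mutations (not the full IAS list).
FEATURES = ["RT:M41L", "RT:K65R", "RT:K103N", "RT:M184V", "PR:L90M"]

def encode_genotype(mutations, feature_list=FEATURES):
    """Binary vector: 1 if the mutation is present in the genotype, else 0."""
    present = set(mutations)
    return [1 if feature in present else 0 for feature in feature_list]

vector = encode_genotype(["RT:M184V", "RT:K103N", "RT:V118I"])
```

Here `RT:V118I` is silently dropped because it is not in the feature list. The "all positions" option would instead keep every observed substitution, at the cost of a much longer and sparser vector.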
ASSUMPTIONS
Any available genotype obtained while on therapy marks a failure of that therapy
Any achievement of an undetectable viral load marks a success of the ongoing treatment
Larger data set but reduced information
TCE definition   #
Classical        2,824
Alternative      7,952
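For comparison with the alternative definition, the classical TCE can be sketched as a window search around each treatment switch. The code below assumes that the 0 to 90 day window lies before the switch, that the follow-up viral load closest to week 8 is used, and that "undetectable" means below 50 copies/ml; none of these details is stated on the slides, and the function name is hypothetical.

```python
from datetime import date, timedelta

def build_tce(switch_date, genotypes, viral_loads, detection_limit=50):
    """Build one classical TCE, or return None if required data are missing.

    genotypes: list of (date, sequence); viral_loads: list of (date, copies/ml).
    """
    base_start = switch_date - timedelta(days=90)
    fu_start = switch_date + timedelta(weeks=4)
    fu_end = switch_date + timedelta(weeks=12)

    base_gt = [g for g in genotypes if base_start <= g[0] <= switch_date]
    base_vl = [v for v in viral_loads if base_start <= v[0] <= switch_date]
    follow_vl = [v for v in viral_loads if fu_start <= v[0] <= fu_end]
    if not (base_gt and base_vl and follow_vl):
        return None  # the strict requirements are why few classical TCEs exist

    def latest(items):
        return max(items, key=lambda item: item[0])

    target = switch_date + timedelta(weeks=8)
    followup = min(follow_vl, key=lambda v: abs((v[0] - target).days))
    return {
        "genotype": latest(base_gt)[1],
        "baseline_vl": latest(base_vl)[1],
        "followup_vl": followup[1],
        "success": followup[1] < detection_limit,
    }

switch = date(2006, 1, 10)
genotypes = [(date(2005, 12, 1), "CCTCAGATC")]
viral_loads = [(date(2005, 12, 1), 25000), (date(2006, 3, 7), 40)]
tce = build_tce(switch, genotypes, viral_loads)
no_followup = build_tce(switch, genotypes, viral_loads[:1])
```

The alternative definition relaxes these requirements by inferring failure from any on-therapy genotype and success from any undetectable viral load, which explains the larger count with less information per episode.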
Data funnel
17,078 patients, 59,982 therapies, 19,444 sequences, 214,516 viral loads
% correct prediction*
Feature set**                                                 Logistic regression   Decision trees   Logistic model trees   Random Forests
Genetic barrier                                               74.6                  68.2             74.3                   70.2
Genetic barrier, baseline VL                                  75.7                  69.7             75.5                   73.5
Genetic barrier, # treatments, drug class exp.                75.3                  69.6             74.7                   71.0
Genetic barrier, baseline VL, # treatments, drug class exp.   76.7                  72.5             76.8                   73.8
* Test set after training on > 2500 TCEs (67% responders = achieving undetectable VL)
** In addition to genotype and treatment (reference set)
Model                                       Training mean accuracy   Training SE   Validation accuracy
Logistic regression (MPI) 1                 0.744                    0.035         0.760
Logistic regression + naïve Bayes (IBM) 2   0.755                    0.038         0.748
1 Additional features: genetic barrier, baseline VL, # treatments, drug class exposure
2 Additional features: predicted phenotype, genotype history, drug history, baseline VL, age, naïve Bayes on drug history and on age
Preliminary results
Random Forest
10-fold cross-validation
R² = 0.41
Continuous prediction of viral load change far more challenging!
[Scatter plot: actual vs predicted log viral load change; outlying regions annotated “Poor adherence?” and “Unexpected activity”]
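The evaluation scheme behind this result can be sketched without any modelling library: split the cases into 10 folds, predict each fold with a model trained on the other nine, and compare predicted with actual log viral load change. The sketch assumes that R² is the coefficient of determination; the slides do not say whether it is that or a squared correlation. The fold assignment is not shuffled here, which a real analysis would normally do.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

splits = list(k_fold_indices(25, k=10))
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A model that always predicts the mean change scores 0, so under this definition R² = 0.41 means the predictions account for roughly 41% of the variance in viral load change.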
Remote users, Web server
OUTPUT: regimens ranked for P(undetectable viral load), P(viral load decrease > 1 log), viral load change, CD4 change, …
Input: HIV genotype, viral load, CD4, # treatment lines, drug class exposure, age, drugs to be filtered in/out, …
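The ranking step of such a service can be sketched as follows: candidate regimens are filtered by the drugs to be included or excluded, scored by a prediction function, and sorted by the predicted probability of an undetectable viral load. The function name and the `predict` callable are hypothetical, and the probabilities in the example are made-up numbers standing in for a trained model.

```python
def rank_regimens(candidates, predict, exclude=(), require=()):
    """Return (probability, regimen) pairs, best first.

    candidates: iterable of regimens, each a tuple of drug names
    predict: callable mapping a regimen to P(undetectable viral load)
    exclude / require: drugs to be filtered out / in
    """
    excluded, required = set(exclude), set(require)
    eligible = [r for r in candidates
                if not (set(r) & excluded) and required <= set(r)]
    scored = [(predict(r), r) for r in eligible]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

candidates = [
    ("AZT", "3TC", "EFV"),
    ("TDF", "FTC", "LPV/r"),
    ("TDF", "FTC", "EFV"),
]
# Made-up probabilities for illustration only.
toy_model = {candidates[0]: 0.55, candidates[1]: 0.80, candidates[2]: 0.70}

ranking = rank_regimens(candidates, toy_model.get)
without_efv = rank_regimens(candidates, toy_model.get, exclude=["EFV"])
```

In a real engine the `predict` callable would also take the patient's genotype, viral load, CD4 and treatment history listed as input.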