CPTR RDST Data Platform Concept, September 22, 2014

SLIDE 1

CPTR RDST Data Platform Concept

September 22, 2014

SLIDE 2

Outline

  • C-Path overview and examples of data projects
  • Knowledge sharing concept
  • RDST approach
  • Examples of RDST data types
  • Database architecture
  • Next steps – timeline

CPTR-RDST Data Platform 2014 Workshop Slides

SLIDE 3

C-Path Consortia

Coalition Against Major Diseases

UNDERSTANDING DISEASES OF THE BRAIN

Critical Path to TB Drug Regimens

TESTING DRUG COMBINATIONS

Multiple Sclerosis Outcome Assessments Consortium

DRUG EFFECTIVENESS IN MS

Polycystic Kidney Disease Consortium

NEW IMAGING BIOMARKERS

Patient-Reported Outcome Consortium

DRUG EFFECTIVENESS

Electronic Patient-Reported Outcome Consortium

DRUG EFFECTIVENESS

Predictive Safety Testing Consortium

DRUG SAFETY

Seven global consortia developing novel drug development tools

  • Biomarkers
  • Clinical Outcome Assessment Instruments
  • Clinical Trial Simulation Tools
  • In vitro tools
  • Data Standards

SLIDE 4

C-Path Online Data Repository

Current C-Path examples

  • CAMD – AD Clinical Trial Simulation Tool
  • PKD – Biomarker Qualification Project
  • MSOAC – New Outcome Assessment Instrument for MS


CDC TB study data now available

SLIDE 5

Datasets contributed to C-Path for consortia projects

Consortium                            Therapeutic Area                  # of Studies  Total Number of Subjects  Number of Data Contributors
Coalition Against Major Diseases      Alzheimer's disease               27            7340                      11
Coalition Against Major Diseases      Parkinson's disease               7             2597                      2
Critical Path to TB Drug Regimens     Tuberculosis                      10            2495                      5
MS Outcome Assessments Consortium     Multiple sclerosis                6             4700                      4
Polycystic Kidney Disease             Polycystic kidney disease         5             2941                      4
Predictive Safety Testing Consortium  Normal healthy volunteer-kidney   1             172                       1
Predictive Safety Testing Consortium  Skeletal-muscular (non-clinical)  38            1766                      6
Predictive Safety Testing Consortium  Hepato-toxicity (non-clinical)    43            2340                      7
Predictive Safety Testing Consortium  Nephro-toxicity (non-clinical)    14            941                       8

SLIDE 6

Value of data sharing, data standards & data pooling

  • Nine member companies agreed to share data from 24 Alzheimer’s disease (AD) trials
  • The data were not in a common format
  • The data were remapped to the CDISC AD standard and pooled
  • A new clinical trial simulation tool was created and has been the first model endorsed by the FDA and EMA
  • Researchers utilizing database to advance research

Start Point Result

24 studies, >6500 patients
SLIDE 7

Model endorsed by FDA and EMA

Access to AD data available to qualified researchers
SLIDE 8

Future TB model

SLIDE 9

Rapid DST TB Data Sharing Platform Architecture Concept

SLIDE 10

CPTR TB Drug Resistance DB

Data Platform to Inform Assay Development


How do we build this system?

Linking Global TB Sequence Researchers

  • TB sequence community inputs
  • Expert review to advance investigational biomarkers to validated status
  • Can use separate or consolidated DBs
  • FDA compliant CDISC architecture for regulatory submission DB

Approved members

  • Academic labs
  • Reference labs
  • Commercial companies
  • Others…

Validated DR biomarkers

Expert Panel Review

Approved biomarkers

Sequence repository

Anonymized sequence data
Clinical annotation
Phenotypic methods

User friendly cloud interface

Analysis files generated

Investigational DB

SLIDE 11

How do we accomplish this?

  • Clear objective

– Improved research resource to enable development of new rapid diagnostics for TB

  • Future objective

– With sustainability funding: resource for clinicians

  • Build on previous efforts for TB and for data sharing

– Apply technology product development discipline
– Design to handle wide range of data types
– Quality criteria and defined process for incoming data
– Lean, efficient and well managed implementation
– Expandable / adaptable / flexible
– Great usability

  • Strong alignment with anticipated analysis use cases

SLIDE 12

Related efforts: TBDReamDB


http://www.tbdreamdb.com/index.html

SLIDE 13

Example of future objective: Stanford HIV database


http://hivdb.stanford.edu/

SLIDE 14

Product development discipline

  • Detailed, documented requirements
  • Early prototyping
  • Design to requirements
  • Staged development with clear milestones
  • Extensive testing

– Verification of all features and function
– Usability
– Performance
– Scalability

  • Phased rollout

– RDST members
– Qualified external researchers

  • Ongoing support and enhancements based on user feedback

Requirements → Prototype → Design → Build → Test → Deploy → Support and enhance

SLIDE 15

RDST Data: multiple data types


Need to incorporate multiple types of data

  • sequence data
  • SNP reports
  • resistance test data
  • clinical trial/study/registry data
  • external information resources
  • any other data that may be necessary

Which need to be analyzed to find and validate correlations

SLIDE 16

RDST Data: genotypic data example

@M00347:61:000000000-A9B8J:1:1101:15324:1677 1:N:0:1
TCTTGATCGCGAGTTCGCGGCCCGGGGTGAGCACCCAGGTGAGCGGGAAATGCGTGGTGTCGTGGTAGCTGACGTCGACGATGCCGTGGCG
+
11>>1@BF1>>11AEF00000AA////A//A1AB/?/GAEC1GBE??///FFG/?E?EFHF/F?A?EG1BDBC/FCGGCC<FACHCCG/>CC-.<>10<<-<
@M00347:61:000000000-A9B8J:1:1101:15765:1689 1:N:0:1
TGCGATTGCAGCGCGTCGGCGTCGGTGGTGTAACCGGTCTTGGTCTTCTTGGTCTTCGGCATCTCCAGCTCGTCGAACAGAACGGCCTGCA
+
11111>A1D31B1A0EE00A/EA//E//EAFGHHFGCEEGHDBHHHHHHHBDHHHHFG/E?EHHHGFFGHHGEFG?FHGEEHHEGGGGGHGGH
@M00347:61:000000000-A9B8J:1:1101:15578:1705 1:N:0:1
CTCGACGTCGGCAAGGGTCAGGTCGTGGTGGTGCTCGGCCCCTCGGGCTCGGGCAAATCGACGTTGTGCCGCACGATCAATCGCCTCGAG
+
1>11>ADDA?1000000BFFFFHF0E?/AFEECGHH//AEEEGGEC?GGH?/@@GGFHHE0EEEFHGGHHGGGEGGHHGHFHGGGGGG/CHGG
@M00347:61:000000000-A9B8J:1:1101:13636:1714 1:N:0:1
CGATTCGACGGCCTGTTCATCGCCGACGTGCTCGGTACCTACGACGTGTACGGCGGCAGCGACGAGGCCGCGATCCGTCACGCCGCGCAG
+
111>AFABA1@AAEFGGGFGFFCFA?E/EFHGHGGGHGHHHHE?FGGGHHHHGGGGGEEECEGGGG?CGGGGGGGGHGHGGCFGCCGGG
@M00347:61:000000000-A9B8J:1:1101:15489:1729 1:N:0:1
TATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAACAAACCAATAAACAA


sequence data with quality info

TB FASTQ file – 650 MB uncompressed
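
A minimal sketch of how sequence data with quality info could be read, assuming the standard four-line FASTQ layout (header, sequence, "+", quality) and Phred+33 quality encoding. The slide does not state the encoding, and the names `FastqRecord`, `parse_fastq` and `phred_scores` are illustrative only, not part of any RDST tooling.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield one record per four-line FASTQ entry, validating each entry."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input
            return
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a quality string to per-base Phred scores (offset 33 assumed)."""
    return [ord(ch) - offset for ch in quality]


# Tiny made-up record, not taken from the slide.
example = io.StringIO("@read1 1:N:0:1\nACGT\n+\nII5!\n")
records = list(parse_fastq(example))
scores = phred_scores(records[0].quality)
```

Reading record by record keeps memory use flat, which matters for files of the size quoted on the slide. A truncated final record raises an error rather than being silently accepted.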

SLIDE 17

RDST Data: SNP report example


SNP reports

SLIDE 18

RDST Data: phenotypic data example


resistance test data


https://tbdr.org/cgi/tbdr

SLIDE 19

STUDYID  DOMAIN  USUBJID  AGE  SEX  RACE                       ARM
19       DM      10001    27   F    WHITE                      Ethambutol 5 Times Per Week
19       DM      10002    63   M    WHITE                      Moxifloxacin 3 Times Per Week
19       DM      10003    42   M    BLACK OR AFRICAN AMERICAN  Moxifloxacin 5 Times Per Week
19       DM      10004    30   F    ASIAN                      Moxifloxacin 5 Times Per Week
19       DM      10005    29   M    BLACK OR AFRICAN AMERICAN  Moxifloxacin 3 Times Per Week
19       DM      10006    35   M    BLACK OR AFRICAN AMERICAN  Ethambutol 3 Times Per Week
19       DM      10007    46   F    UNKNOWN                    Ethambutol 3 Times Per Week
19       DM      10008    34   F    BLACK OR AFRICAN AMERICAN  Moxifloxacin 5 Times Per Week
19       DM      10009    55   M    BLACK OR AFRICAN AMERICAN  Ethambutol 3 Times Per Week
19       DM      10010    42   M    ASIAN                      Moxifloxacin 5 Times Per Week
19       DM      10011    23   F    BLACK OR AFRICAN AMERICAN  Ethambutol 3 Times Per Week
19       DM      10012    47   F    WHITE                      Ethambutol 3 Times Per Week
19       DM      10013    25   F    BLACK OR AFRICAN AMERICAN  Moxifloxacin 5 Times Per Week
19       DM      10014    21   M    WHITE                      Ethambutol 3 Times Per Week
19       DM      10015    79   M    WHITE                      Moxifloxacin 3 Times Per Week
19       DM      10016    27   F    ASIAN                      Moxifloxacin 3 Times Per Week
19       DM      10017    37   M    BLACK OR AFRICAN AMERICAN  Ethambutol 3 Times Per Week
19       DM      10018    28   M    BLACK OR AFRICAN AMERICAN  Moxifloxacin 3 Times Per Week

RDST Data: clinical data example (hypothetical data)


STUDYID  DOMAIN  USUBJID  MBTESTCD  MBTEST                        MBORRES                              MBSPEC            VISIT
13       MB      10001    AFB       Acid Fast Bacilli             NEGATIVE                             SPONT SPUTUM      WEEK 8
13       MB      10001    ORGANISM  Organism Present              NEGATIVE FOR TUBERCULOSIS            SPONT SPUTUM      WEEK 4
13       MB      10001    MTBINH    M.tuberculosis INH Resistant  POSITIVE                             NON-OVERNIGHT SP  SCREENING
15       MB      10001    ORGANISM  Organism Present              NEGATIVE FOR TUBERCULOSIS            SPONT SPUTUM      WEEK 4
13       MB      10002    AFB       Acid Fast Bacilli             NEGATIVE                             SPONT SPUTUM      WEEK 8
13       MB      10002    ORGANISM  Organism Present              POSITIVE FOR M. TUBERCULOSIS COMPLEX SPONT SPUTUM      SCREENING
15       MB      10002    ORGANISM  Organism Present              POSITIVE FOR M. TUBERCULOSIS COMPLEX SPONT SPUTUM      SCREENING
13       MB      10003    ORGANISM  Organism Present              NEGATIVE FOR TUBERCULOSIS            INDUCED SPUTUM    WEEK 4
15       MB      10004    ORGANISM  Organism Present              NEGATIVE FOR TUBERCULOSIS            SPONT SPUTUM      WEEK 4

STUDYID  DOMAIN  USUBJID  MOTESTCD  MOTEST           MOORRES  MOSTRESC  MOLOC       VISIT      MODY
17       MO      10001    CAVIT     Cavitation       Y        Y         LUNG, LEFT  SCREENING  -4
17       MO      10002    CAVIT     Cavitation       Y        Y         LUNG, LEFT  SCREENING  -5
17       MO      10002    PLEURALD  Pleural Disease  N        N         LUNG, LEFT  SCREENING  1
17       MO      10004    PLEURALD  Pleural Disease  N        N         LUNG, LEFT  SCREENING  -8
17       MO      10005    CAVIT     Cavitation       N        N         LUNG, LEFT  SCREENING  -9
17       MO      10005    CAVIT     Cavitation       N        N         LUNG, LEFT  SCREENING  -15
17       MO      10006    CAVIT     Cavitation       Y        Y         LUNG, LEFT  SCREENING  1
17       MO      10006    PLEURALD  Pleural Disease  N        N         LUNG, LEFT  SCREENING  -7

clinical trial data
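
Because every SDTM-style row carries the same USUBJID key, records from a domain can be grouped or filtered per subject with very little code. The sketch below uses a few rows from the slide's hypothetical MB example (a subset of the columns). The helper names `results_by_subject` and `subjects_with_result` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Rows copied from the hypothetical MB (microbiology) example, subset of columns.
mb_rows = [
    {"STUDYID": "13", "USUBJID": "10001", "MBTESTCD": "AFB",
     "MBORRES": "NEGATIVE", "VISIT": "WEEK 8"},
    {"STUDYID": "13", "USUBJID": "10001", "MBTESTCD": "MTBINH",
     "MBORRES": "POSITIVE", "VISIT": "SCREENING"},
    {"STUDYID": "13", "USUBJID": "10002", "MBTESTCD": "AFB",
     "MBORRES": "NEGATIVE", "VISIT": "WEEK 8"},
    {"STUDYID": "13", "USUBJID": "10002", "MBTESTCD": "ORGANISM",
     "MBORRES": "POSITIVE FOR M. TUBERCULOSIS COMPLEX", "VISIT": "SCREENING"},
]


def results_by_subject(rows: List[dict]) -> Dict[str, List[Tuple[str, str, str]]]:
    """Group (test code, visit, result) tuples under each subject identifier."""
    grouped: Dict[str, List[Tuple[str, str, str]]] = defaultdict(list)
    for row in rows:
        grouped[row["USUBJID"]].append(
            (row["MBTESTCD"], row["VISIT"], row["MBORRES"])
        )
    return dict(grouped)


def subjects_with_result(rows: List[dict], testcd: str, result: str) -> List[str]:
    """Return sorted subject IDs that have the given result for the given test."""
    return sorted(
        {r["USUBJID"] for r in rows
         if r["MBTESTCD"] == testcd and r["MBORRES"] == result}
    )


inh_resistant = subjects_with_result(mb_rows, "MTBINH", "POSITIVE")
per_subject = results_by_subject(mb_rows)
```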

SLIDE 20

RDST Data: TB strain summary table

http://www.ncbi.nlm.nih.gov/genome/genomes/166


external resources

SLIDE 21

RDST data platform: design to handle multiple data types

[Figure: data sources feeding the Aggregated Research Database. Panels: Subject-Level Clinical Trial Data (variables VAR1 to VAR3 per subject s1, s2, ... over time points 1 to 7); strain sequence data (Strain 1, Strain 2, Strain 3); Surveillance Data over time (OBS1, VAR1, TEST1, with values missing at some time points). Labels: Clinical Trial Data, Surveillance data, Genotypic data, Phenotypic data, Apply CDISC Data Standards, Aggregated Research Database, Data Analysis.]

SLIDE 22

Key success factors for incoming data

  • Buy in for data contributions

– Survey and prioritize
– Proactive engagement
– Recognition and incentives for contributions

  • Clearly defined quality criteria

– Develop and vet during initial data survey & prioritization

  • Consistent process for incoming data processing

– Unified pipeline for incoming sequence data
– Ability to apply CDISC standards to create efficient database (vs large number of small data buckets)

  • Ongoing curation and quality control

SLIDE 23

Quality criteria and defined process for incoming sequence data

Incoming FASTQ plus associated SNP report

New SNP report generated with RDST unified pipeline

RDST unified pipeline for sequence data
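
One way such a quality check could work, assuming the contributor's SNP report is compared against the report regenerated by the unified pipeline (the slide does not specify the comparison). The `Snp` tuple layout, the function names and the example positions are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

# Hypothetical SNP representation: (genome position, reference base, observed base).
Snp = Tuple[int, str, str]


@dataclass(frozen=True)
class ReportComparison:
    shared: FrozenSet[Snp]
    only_incoming: FrozenSet[Snp]
    only_pipeline: FrozenSet[Snp]

    @property
    def concordance(self) -> float:
        """Fraction of all distinct calls that appear in both reports."""
        total = len(self.shared) + len(self.only_incoming) + len(self.only_pipeline)
        return len(self.shared) / total if total else 1.0


def compare_snp_reports(incoming: Iterable[Snp],
                        pipeline: Iterable[Snp]) -> ReportComparison:
    """Split SNP calls into shared, contributor-only and pipeline-only sets."""
    a, b = frozenset(incoming), frozenset(pipeline)
    return ReportComparison(a & b, a - b, b - a)


# Made-up calls for illustration only.
incoming_report = [(100, "C", "T"), (250, "G", "A"), (400, "A", "G")]
pipeline_report = [(100, "C", "T"), (250, "G", "A"), (900, "T", "C")]
result = compare_snp_reports(incoming_report, pipeline_report)
```

A low concordance value would flag a submission for manual review under whatever threshold the quality criteria define. That threshold is not given in the slides.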

SLIDE 24

TB clinical data mapping to CDISC: we do this today for CPTR

Map to CDISC domains

Data Element: Phase of TB treatment
Exposure (EX)
CDISC Variables: USUBJID EXTRT EXDOS EXDOSU

Data Element: TB Symptoms
Clinical Events (CE)
CDISC Variables: USUBJID CETERM CEPRESP CEOCCUR

Data Element: Tuberculin Skin Test Result
Definition: The number of millimeters in diameter of the induration, or raised hardening, at the tuberculin skin test site.
Permissible value set: mm of induration.
Skin Response (SR)

USUBJID  SRTESTCD  SRTEST               SRORRES  SRORRESU
12345    INDURDIA  Induration Diameter  16       mm

Controlled Terminology

  • Preserve, do not change the data content
  • A place for everything, everything in its place
  • Capture the smallest usable elements of data
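
The Tuberculin Skin Test mapping can be sketched as follows, assuming the contributed value arrives as free text such as "16 mm". The function name `map_tuberculin_skin_test` and the input format are assumptions. The SR variable names and the INDURDIA test code come from the slide.

```python
import re
from typing import Dict


def map_tuberculin_skin_test(usubjid: str, raw_result: str) -> Dict[str, str]:
    """Map a raw skin test reading to an SR-style record.

    The numeric value is kept as the original text (preserve the content)
    and the unit is split out (capture the smallest usable elements).
    """
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(mm)\s*", raw_result)
    if match is None:
        raise ValueError(f"unexpected skin test result: {raw_result!r}")
    value, unit = match.groups()
    return {
        "DOMAIN": "SR",
        "USUBJID": usubjid,
        "SRTESTCD": "INDURDIA",
        "SRTEST": "Induration Diameter",
        "SRORRES": value,
        "SRORRESU": unit,
    }


record = map_tuberculin_skin_test("12345", "16 mm")
```

Values that fall outside the permissible value set raise an error instead of being coerced, so they can be routed to curation.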

SLIDE 25

Rapid DST Data Platform

  • Can we apply CDISC standards to TB genotypic and phenotypic data?

  • What is the benefit of doing this?

SLIDE 26

Special Purpose: Demographics, Comments, Subject Elements, Subject Visits

Interventions: Exposure, Con Meds, Substance Use

Events: Adverse Events, Disposition, Medical History, Deviations, Clinical Events

Findings: ECG, Incl/Excl Exceptions, Labs, Physical Exam, Questionnaire, Subject Characteristics, Vital Signs, Drug Accountability, Microbiology Spec., Microbiology Suscept., PK Concentrations, PK Parameters

Findings About

Trial Design: Trial Elements, Trial Arms, Trial Visits, Trial Incl/Excl, Trial Summary


CDISC Study Data Tabulation Model (SDTM) domains for classification of data elements

SLIDE 27

Data Mapping: SNP report example


PFORREF – reference result (can apply to nucleotides or amino acids, depends on value in PFTEST)
PFORRES – experimental result (can apply to nucleotides or amino acids, depends on value in PFTEST)
PFRESCAT – category of result (is this a nonsense or missense mutation? frameshift? etc.)
PFGENTYP – type of feature we’re looking at (gene, sector, protein, etc.)
PFGENRI – region of interest (it is defined as the specific gene or locus being looked at)
PFSTRESC – standard result of the analysis. Usually uses HGVS nomenclature
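
A sketch of how one SNP could be expressed with these variables. The PF domain code is inferred from the variable prefix, and the PFTEST, PFGENTYP and PFRESCAT values are placeholders rather than confirmed controlled terminology. The example mutation is illustrative and not taken from the slides. The helper names are hypothetical.

```python
from typing import Dict


def hgvs_substitution(position: int, ref: str, alt: str, level: str = "c") -> str:
    """Format a simple substitution in HGVS style, for example c.1349C>T."""
    return f"{level}.{position}{ref}>{alt}"


def build_pf_record(usubjid: str, gene: str, position: int,
                    ref: str, alt: str, category: str) -> Dict[str, str]:
    """Assemble one PF-style row for a single nucleotide substitution."""
    return {
        "DOMAIN": "PF",            # assumed from the PF variable prefix
        "USUBJID": usubjid,
        "PFTEST": "Nucleotide",    # placeholder: PFORREF/PFORRES hold nucleotides
        "PFGENTYP": "GENE",        # placeholder feature type
        "PFGENRI": gene,           # region of interest
        "PFORREF": ref,            # reference result
        "PFORRES": alt,            # experimental result
        "PFRESCAT": category,      # category of result
        "PFSTRESC": hgvs_substitution(position, ref, alt),
    }


# Illustrative values only.
pf_record = build_pf_record("10001", "rpoB", 1349, "C", "T", "MISSENSE")
```

Keeping the reference and observed values in separate variables, with the HGVS string derived from them, means the standardized result can be regenerated if the nomenclature rules change.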

SLIDE 28

Rapid DST Data Platform


Three primary categories of data

  • Data as received from contributors
  • Quality checked, processed, standardized data

– Master copy
– Full complement of data for RDST consortium use
– Authorized subset for external researchers (as broad as possible within sharing terms and conditions imposed by each data contributor)

  • Analysis data extracts and reports to support research
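
The three data categories and the authorized external subset could be modeled roughly as below. This is a sketch under stated assumptions: the three roles follow the access levels named on the backup slide (data team, RDST, external), while the rule that only the data team sees data as received, the `external_sharing_allowed` flag and the dataset names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Role(Enum):
    DATA_TEAM = "data team"
    RDST = "RDST"
    EXTERNAL = "external"


@dataclass(frozen=True)
class Dataset:
    name: str
    stage: str                      # "received", "standardized" or "analysis"
    external_sharing_allowed: bool  # set from the contributor's sharing terms


def visible_datasets(role: Role, datasets: List[Dataset]) -> List[str]:
    """Return the dataset names a role may see under the assumed rules."""
    if role is Role.DATA_TEAM:
        allowed = list(datasets)
    elif role is Role.RDST:
        allowed = [d for d in datasets if d.stage != "received"]
    else:
        allowed = [d for d in datasets
                   if d.stage != "received" and d.external_sharing_allowed]
    return [d.name for d in allowed]


# Hypothetical catalog entries.
catalog = [
    Dataset("contributor_A_raw", "received", False),
    Dataset("contributor_A_std", "standardized", True),
    Dataset("contributor_B_std", "standardized", False),
    Dataset("pooled_extract", "analysis", True),
]
external_view = visible_datasets(Role.EXTERNAL, catalog)
```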

SLIDE 29

Rapid DST Data Platform

+ Lean, efficient and well managed implementation
+ Expandable / adaptable / flexible
+ Great usability

Strong alignment with anticipated analysis use cases

SLIDE 30

Rapid DST Data Platform


Next Steps

SLIDE 31

Next steps: the art of the start

RDST Data Sharing Platform Timeline v4

Timeline axis: months from September 2014 through October 2017

PHASE 1: Build and deploy
PHASE 2: Expanded access
PHASE 3: Sustainability

C-Path

  • 1.1 Governance Model
  • 1.2 Value proposition, DUA updates and Communication Plan
  • 1.3 Req’s, Arch and Design
  • 1.4 Request early data
  • 1.5 Dev Ph 1
  • 1.5 Test Ph 1
  • 1.6 Load early data
  • Dev Ph 2
  • 1.7 Test Ph 2
  • Dev Ph 3
  • 1.8 Test Ph 3
  • 1.9 Prep for production
  • 1.6/2.2/3.3 Prepare and load contributed data in Data Platform as it becomes available
  • 2.1/3.2 Monitor performance and usage
  • 2.3/3.4 Review and approve access requests as they are submitted
  • 2.4 Perform phase 1 program assessment
  • 2.5 Expand Capacity
  • 2.6 Enable for external researchers
  • 2.7 Beta Test
  • 2.8/3.1 Pursue sustainability funding
  • 3.5 Perform Phase 2 program assessment
  • 3.6/3.8 Release 2 Dev/Test
  • 3.7 Expand Capacity
  • 3.9 Pursue funding to support clinical use

FIND

  • 1.1.1 Inventory of available DBs
  • 1.1.2 Prepare data packages for inclusion in Data Platform
  • 1.1.3 Input to C-Path on design of Data Platform
  • 1.2.1 Form Expert Panel and support Data Platform development and use
  • 1.2.2 Develop criteria for determination of resistance mutations
  • 1.2.3 Develop algorithms for interpretation of genotypic data
  • 1.3.1 Develop guidelines/criteria for clinical validation of assays to detect/interpret resistance mutation
  • 1.4.1 Support for development of access models and tools for broad access
  • 1.5.1/1.5.2 Support for sustainable business model and review process
  • Assist with development of Value Proposition and Communications Plan

Milestones (legend: C-Path Milestone, FIND Milestone)

  • Data Platform Available for Consortium Members
  • Data Platform Available for external researchers
  • Sustainability Funding Secured
  • 1.1.2 Early Data Packages available for inclusion in Data Platform
  • 1.2.2 Defined Criteria for determination of resistance mutations
  • 1.2.3 Published algorithm for interpretation of genotypic data
  • 1.3.1 WHO report on guidelines/criteria for validation of assays to detect and interpret resistance mutations

SLIDE 32

Rapid DST Data Platform


  • Big job in front of us
  • We are not starting from scratch
  • We have lots of help

We can do this!

SLIDE 33


www.c-path.org

SLIDE 34

Rapid DST Data Platform


Backup

SLIDE 35

Rapid DST Data Platform


+ Lean, efficient and well managed implementation
+ Expandable / adaptable / flexible
+ Great usability

SLIDE 36

Rapid DST Data Platform


Investigational DB with user access levels (data team, RDST, external)

user friendly cloud interface

FASTQ data files

internal

Incoming data storage

external

SLIDE 37

Rapid DST Data Platform


Strong alignment with anticipated analysis use cases
