
Douglas Teodoro, Emilie Pasche, Julien Gobeill, Patrick Ruch - PowerPoint PPT Presentation



  1. Douglas Teodoro, Emilie Pasche, Julien Gobeill, Patrick Ruch, Christian Lovis, Rémy Choquet, Christel Daniel

  2. The project

  3. The problem
     • Data technical and semantic heterogeneity
       – Different languages: French, German, Greek, Swedish, etc.
       – Different types: RDBS, free text, XML files
     • Example source values (Drug: Quantity / Frequency)
       – CICLOSPORINE (SANDIMMUN, NEORAL): ;;;;;;;;;1;;;;;;;;;;;;;;;
       – MYCOPHENOLATE T 60 MINUTES ;;;;;;;;1;;;;;;;;;;;;;;;;
       – PROTOCOLE:TEST STIMULATION ;;;;;;;;1;;;;;;;;;;;;;;;; SYNACTHENE 1H 3E TUBE /3
       – co-trimoxazole: lundi - mercredi - vendredi (3x/sem)
       – ciprofloxacine: 1x/sem (dimanche)
       – vancomycine: 1x18h

  4. The problem
     • Data privacy
       – Very high concern
       – Patient identity and other confidential items cannot be revealed by any means to unauthorized people
     • Political barriers
       – External connection to ODBS
       – Security risks

  5. Our solution
     • Clinical Data Repository (CDR): a distributed storage system that provides transparent access to heterogeneous data sources while assuring patient privacy. It offers SQL and SPARQL query interfaces and returns result sets as SQL tuples or RDF.
     • Based on two visions:
       1. Pragmatic: uses database federation, a known technology, to provide faster data integration to the other project components
       2. Innovative: uses semantic web technology, a new approach that will be explored for the whole duration of the project

  6. CDR::Architecture
     • Database federation-based
     • Diagram: query entry; sites HUG, AVERBIS, INSERM, LiU

  7. CDR::Architecture
     • Semantic web-based
     • Diagram: query entry; sites INSERM, LiU, HUG, AVERBIS

  8. CDR::Information Model
     • Information Model: HL7-RIM based
       – Other candidates: OpenEHR, EAV/CR, customized model
     • The data stored in the CDR covers the following aspects:
       – Patient information
       – Pathogen-related information
       – Object-related information
       – Information on locations
       – Operational data

  9. CDR::Information Model
     • Diagram entities: Health care setting, Patient data, Diseases, Pathogens, Cultures, Antibiograms, Prescriptions, Adverse events

  10. CDR::Business Model
      • Agents
        – Responsible for the CDR
        – CIS interoperability
        – Data management within the CDR
        – Communication with other DebugIT components
      • Orders:
        – DataExtraction
        – DataNormalisation
        – DataMigration
        – DataDepersonalisation
        – OntologyUpdate
      • Order lifecycle (diagram): Wait order, Receive order, Try to execute order; on success, Mark success; on fail, Mark failed
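
The order lifecycle on this slide can be sketched as a small state machine. This is only an illustration under stated assumptions: the `OrderState` enum, the `run_order` function and the handler table are hypothetical names, and the slides do not describe how the DebugIT agents are actually implemented.

```python
from enum import Enum
from typing import Callable, Dict, List


class OrderState(Enum):
    WAITING = "wait order"
    RECEIVED = "receive order"
    EXECUTING = "try to execute order"
    SUCCESS = "mark success"
    FAILED = "mark failed"


def run_order(name: str, handlers: Dict[str, Callable[[], None]]) -> List[OrderState]:
    """Drive one order through the lifecycle and return the visited states."""
    trace = [OrderState.WAITING, OrderState.RECEIVED, OrderState.EXECUTING]
    try:
        handlers[name]()  # an unknown order name raises KeyError and is marked failed
        trace.append(OrderState.SUCCESS)
    except Exception:
        trace.append(OrderState.FAILED)
    return trace


def _failing_migration() -> None:
    # Simulated failure, for illustration only.
    raise RuntimeError("target database unreachable")


# Hypothetical handlers keyed by order names taken from the slide.
HANDLERS: Dict[str, Callable[[], None]] = {
    "DataExtraction": lambda: None,
    "DataMigration": _failing_migration,
}

ok_trace = run_order("DataExtraction", HANDLERS)
failed_trace = run_order("DataMigration", HANDLERS)
print([state.value for state in ok_trace])
print([state.value for state in failed_trace])
```

Recording the final state on the order itself, rather than raising, lets an agent return to waiting for the next order after a failure.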

  11. CDR::Business Model
      • Federated engine
        – Based on MySQL Federated Engine
        – Federate the distributed data sources
        – Receive, create plan and execute SQL requests
      • SPARQL Engine
        – Based on D2R
        – Transform the ER model into a semantic linked data model
        – Receive, create plan and execute SPARQL requests
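
To make the "ER model to linked data" step more concrete, here is a hand-written sketch of how one relational row could be turned into RDF-style triples, with the table as a class and each column as a property. It does not use D2R or its mapping language. The `BASE` namespace, the `row_to_triples` function and the sample row are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

BASE = "http://example.org/cdr/"  # hypothetical namespace, not from the slides


def row_to_triples(table: str, key: str, row: Dict[str, object]) -> List[Triple]:
    """Turn one relational row into triples: one subject per row, one property per column."""
    subject = f"<{BASE}{table}/{row[key]}>"
    triples: List[Triple] = [(subject, "rdf:type", f"<{BASE}vocab/{table}>")]
    for column, value in row.items():
        if column == key or value is None:
            continue  # the key is already encoded in the subject URI; NULLs yield no triple
        predicate = f"<{BASE}vocab/{table}_{column}>"
        triples.append((subject, predicate, f'"{value}"'))
    return triples


# Toy row shaped like the culture table used in the federation query.
culture = {"culture_id": 42, "result_date": "2006-03-14", "data_source": "hug"}
triples = row_to_triples("culture", "culture_id", culture)
for s, p, o in triples:
    print(s, p, o, ".")
```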

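  12. CDR::Data Privacy
      • Security
        – Sensitive data encrypted
        – Mapping table: original term → encrypted term
        – Original term kept only within the intranet
        – Encrypted term exposed on the internet
      • Mapping table (Artefact):
          +-------------+-------------------+----------------------------------+
          | Artefact ID | Original Artefact | Encrypted Artefact               |
          +-------------+-------------------+----------------------------------+
          | 1           | 10                | 001b98ab4335f1d3da23946bce9e4279 |
          | 2           | 59                | 0109cfbecd89a3aaeeb92fde6420f29b |
          | 3           | 39                | 010c1482764323fd479510ef6a8f5f48 |
          +-------------+-------------------+----------------------------------+
      • Exposed table (Patient):
          +----------------------------------+-------------+-------------+
          | Patient ID                       | Patient Age | Patient Sex |
          +----------------------------------+-------------+-------------+
          | 001b98ab4335f1d3da23946bce9e4279 | 58          | F           |
          | 0109cfbecd89a3aaeeb92fde6420f29b | 38          | F           |
          | 010c1482764323fd479510ef6a8f5f48 | 19          | M           |
          +----------------------------------+-------------+-------------+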
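
The mapping-table idea can be sketched as follows. The slides do not say which encryption or hashing scheme produces the 32-character hexadecimal terms, so the keyed hash below (HMAC-SHA-256 truncated to 32 hex characters) is purely an assumption. The key, the `encrypt_term` function and the record layout are likewise hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"intranet-only-secret"  # hypothetical key; would never leave the intranet


def encrypt_term(original: str, key: bytes = SECRET_KEY) -> str:
    """Return a 32-hex-character token for an original term (assumed keyed-hash scheme)."""
    digest = hmac.new(key, original.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:32]


# Intranet side: records that still carry the original identifiers.
patients = [
    {"patient_id": "10", "age": 58, "sex": "F"},
    {"patient_id": "59", "age": 38, "sex": "F"},
    {"patient_id": "39", "age": 19, "sex": "M"},
]

# Mapping table (original term -> encrypted term), kept only within the intranet.
mapping = {p["patient_id"]: encrypt_term(p["patient_id"]) for p in patients}

# Internet-facing view: the original identifier is replaced by the encrypted term.
exposed = [
    {"patient_id": mapping[p["patient_id"]], "age": p["age"], "sex": p["sex"]}
    for p in patients
]
print(exposed[0])
```

Because the token is deterministic for a given key, the same patient joins consistently across exposed tables, while reversing a token requires the intranet-only mapping table.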

  13. Our results
      • DebugIT CDR already has its first pilot
      • SQL endpoints ready at HUG and LiU
        – Data integration via database federation
        – Based on MySQL Federated Engine
        – SQL requests and SQL tuple result sets
      • SPARQL endpoints set up at 3 demonstration centers: HUG, INSERM and LiU
        – Data integration via ‘linked data’
        – Based on D2R
        – Transform the ER model into a semantically linked data model
        – SPARQL requests and RDF result sets

  14. Database federation
      • Diagram: CDR, LiU, HUG
      • Query:
          select cr.data_source data_source,
                 cr.antibiotic_tested_result sensibility,
                 count(cr.antibiotic_tested_result) value,
                 date_format(c.result_date, '%Y') result_date
          from culture_results cr
          join culture c on cr.culture_id = c.culture_id
          join bacteria b on b.bacterium_id = cr.identified_bacteria_name
          join drug d on d.drug_id = cr.antibiotic_tested
          where b.name = 'Escherichia coli'
            and d.name = 'sulfamethoxazole and trimethoprim'
          group by 1,2,4
      • Execution time: 1 min 14.68 sec
      • Result set:
          +-------------+---------------+-------+-------------+
          | data_source | sensibility   | value | result_date |
          +-------------+---------------+-------+-------------+
          | hug         | indeterminate |     2 | 2006        |
          | hug         | resistant     |    72 | 2004        |
          | hug         | resistant     |    71 | 2005        |
          | hug         | resistant     |   112 | 2006        |
          | hug         | resistant     |    94 | 2007        |
          | hug         | resistant     |     8 | 2008        |
          | hug         | susceptible   |   302 | 2004        |
          | hug         | susceptible   |   318 | 2005        |
          | hug         | susceptible   |   288 | 2006        |
          | hug         | susceptible   |   269 | 2007        |
          | hug         | susceptible   |     4 | 2008        |
          | liu         | indeterminate |     1 | 2007        |
          | liu         | resistant     |    10 | 2005        |
          | liu         | resistant     |    21 | 2006        |
          | liu         | resistant     |    30 | 2007        |
          | liu         | resistant     |    46 | 2008        |
          | liu         | susceptible   |   108 | 2005        |
          | liu         | susceptible   |    90 | 2006        |
          | liu         | susceptible   |   132 | 2007        |
          | liu         | susceptible   |   100 | 2008        |
          +-------------+---------------+-------+-------------+
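
For readers who want to try the shape of this query without a MySQL federation, the following sketch runs an equivalent aggregate against an in-memory SQLite database. It is a local stand-in, not the CDR setup. The table layout is inferred from the column names in the query, the handful of rows are invented toy data, and SQLite's `strftime` replaces MySQL's `date_format`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bacteria (bacterium_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE drug (drug_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE culture (culture_id INTEGER PRIMARY KEY, result_date TEXT);
CREATE TABLE culture_results (
    culture_id INTEGER,
    data_source TEXT,
    identified_bacteria_name INTEGER,
    antibiotic_tested INTEGER,
    antibiotic_tested_result TEXT
);
-- Toy rows, invented for illustration only.
INSERT INTO bacteria VALUES (1, 'Escherichia coli'), (2, 'Staphylococcus aureus');
INSERT INTO drug VALUES (1, 'sulfamethoxazole and trimethoprim');
INSERT INTO culture VALUES (1, '2006-02-01'), (2, '2006-07-15'), (3, '2007-01-20');
INSERT INTO culture_results VALUES
    (1, 'hug', 1, 1, 'resistant'),
    (2, 'hug', 1, 1, 'resistant'),
    (3, 'liu', 1, 1, 'susceptible'),
    (3, 'liu', 2, 1, 'resistant');
""")

QUERY = """
SELECT cr.data_source AS data_source,
       cr.antibiotic_tested_result AS sensibility,
       COUNT(cr.antibiotic_tested_result) AS value,
       strftime('%Y', c.result_date) AS result_date
FROM culture_results cr
JOIN culture c ON cr.culture_id = c.culture_id
JOIN bacteria b ON b.bacterium_id = cr.identified_bacteria_name
JOIN drug d ON d.drug_id = cr.antibiotic_tested
WHERE b.name = 'Escherichia coli'
  AND d.name = 'sulfamethoxazole and trimethoprim'
GROUP BY cr.data_source, cr.antibiotic_tested_result, strftime('%Y', c.result_date)
ORDER BY 1, 2, 4
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

In the federated setting described on the slide, the same statement would run against tables that live at LiU and HUG, which is the likely reason the reported execution time exceeds one minute.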

  15. Database federation
      • Demonstration of CDR: query distributed between LiU and HUG
      • Diagram: CDR, LiU, HUG

  16. SPARQL endpoints

  17. SPARQL data query service

  18. Next steps
      • Improve overall database performance
      • Scale to more sites
      • Tighter integration with the DebugIT Ontology
      • Finalise semantic web integration
      • Security access based on roles

  19. Data Normalisation
      • CDR content automatically normalised
      • Terminologies used: SNOMED, NEWT, WHO-ATC, etc.
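
As a rough illustration of what normalising drug names to WHO-ATC could look like, the sketch below maps a few free-text labels from the earlier problem slide to ATC codes. The slides do not describe the actual normalisation method. The lookup table, the `normalise_drug` function and the name-extraction rule are assumptions, and a real pipeline would need a full terminology service.

```python
import re
from typing import Optional

# Illustrative lookup only: local (French) labels -> WHO-ATC codes.
ATC_BY_LOCAL_NAME = {
    "ciprofloxacine": "J01MA02",   # ciprofloxacin
    "vancomycine": "J01XA01",      # vancomycin (systemic use)
    "co-trimoxazole": "J01EE01",   # sulfamethoxazole and trimethoprim
    "ciclosporine": "L04AD01",     # ciclosporin
}


def normalise_drug(raw: str) -> Optional[str]:
    """Map a free-text drug label to an ATC code, or None when it is not in the lookup."""
    # Keep only the leading name; brand names in parentheses and dose details are ignored.
    match = re.match(r"\s*([A-Za-z-]+)", raw)
    if match is None:
        return None
    return ATC_BY_LOCAL_NAME.get(match.group(1).lower())


for label in ["CICLOSPORINE (SANDIMMUN, NEORAL)", "co-trimoxazole", "MYCOPHENOLATE"]:
    print(label, "->", normalise_drug(label))
```

Labels missing from the lookup return `None` instead of a guessed code, so unmapped terms can be reviewed by hand.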

  20. Database federation
      • E. coli resistance pattern over time (monthly)
      • Diagram: CDR, LiU, HUG
