THE VODAN FAIR DATA POINT
FAIR DATA
THE UNDERLYING PROBLEM
FRAGMENTATION of…
- Data
- Sample collections
- Image collections
- Regulations
- Software tools
- Research initiatives
- Funding
- Expertise
- etc.
WE NEED ACTIONABLE DATA!!
MOST DATA DON’T ‘TALK’ TO EACH OTHER
Findable:
- F1. (meta)data are assigned a globally unique and persistent identifier;
- F2. data are described with rich metadata;
- F3. metadata clearly and explicitly include the identifier of the data it describes;
- F4. (meta)data are registered or indexed in a searchable resource;
Accessible:
- A1. (meta)data are retrievable by their identifier using a standardized communications protocol;
  - A1.1. the protocol is open, free, and universally implementable;
  - A1.2. the protocol allows for an authentication and authorization procedure, where necessary;
- A2. metadata are accessible, even when the data are no longer available;
Interoperable:
- I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation;
- I2. (meta)data use vocabularies that follow FAIR principles;
- I3. (meta)data include qualified references to other (meta)data;
Reusable:
- R1. (meta)data are richly described with a plurality of accurate and relevant attributes;
  - R1.1. (meta)data are released with a clear and accessible data usage license;
  - R1.2. (meta)data are associated with detailed provenance;
  - R1.3. (meta)data meet domain-relevant community standards;
https://www.nature.com/articles/sdata201618
FAIR DATA PRINCIPLES
- Sci. Data 3: 160018, doi: 10.1038/sdata.2016.18 (2016)
Accessible:
- A1. (meta)data are retrievable by their identifier using a standardized communications protocol;
  - A1.1. the protocol is open, free, and universally implementable;
  - A1.2. the protocol allows for an authentication and authorization procedure, where necessary;
- A2. metadata are accessible, even when the data are no longer available;
ACCESSIBLE UNDER WELL DEFINED CONDITIONS: NOT ALWAYS OPEN AND FREE!
STAY IN CONTROL OF YOUR DATA
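As a rough illustration of A1 and A1.2, the sketch below builds (but does not send) an HTTP request that asks a DOI resolver for machine-readable metadata about the FAIR principles paper cited above. HTTPS stands in for the "open, free, and universally implementable" protocol. The function name `build_metadata_request` and the bearer-token header are assumptions made for this example: the public DOI resolver needs no token, and the token only shows where an authorization step could sit for a controlled-access source.

```python
from urllib.request import Request

# Public DOI resolver; the identifier is resolved over plain HTTPS (A1, A1.1).
DOI_RESOLVER = "https://doi.org/"


def build_metadata_request(doi, token=None):
    """Build, without sending, a GET request for the metadata behind a DOI.

    `token` is a hypothetical credential: it illustrates A1.2 (access under
    well-defined conditions) and is not needed by doi.org itself.
    """
    # Content negotiation: ask for CSL JSON metadata instead of a web page.
    headers = {"Accept": "application/vnd.citationstyles.csl+json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return Request(DOI_RESOLVER + doi, headers=headers, method="GET")


request = build_metadata_request("10.1038/sdata.2016.18")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval; it is left out here so the sketch runs without network access.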
THE INTERNET OF FAIR DATA AND SERVICES
DATA STATION
Provides FAIR access to data and metadata
Allows the train to access and interact with data
TRAIN
Interacts with data (process, integrate, analyze, …)
DATA GATEWAY
Provides access and control to the data authority regardless of where the data is located/stored
TRACKS
The routing and transport infrastructure
IFDS MAIN ELEMENTS
[Figure: an algorithm visiting several FAIR Data Stations (FDS)]
THE FAIR DATA TRAIN: ALGORITHMS TO DATA - TRAINS TO STATIONS
DATA DO NOT LEAVE THE STATION!
DATA VISITING VERSUS DATA SHARING
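The "algorithms to data" idea can be sketched in a few lines. This is a toy model under stated assumptions, not the VODAN implementation: the class `DataStation`, the function `count_and_mean_age`, and the sample records are all invented for illustration. Each station keeps its records local, runs the visiting train (a function), and lets only the result leave.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class DataStation:
    """Toy station: records stay inside; only a train's result is returned."""
    name: str
    _records: list = field(default_factory=list, repr=False)

    def run(self, train):
        # The train (an algorithm) executes inside the station.
        return train(self._records)


def count_and_mean_age(records):
    """Example train: returns an aggregate, never the individual records."""
    ages = [record["age"] for record in records]
    return {"n": len(ages), "mean_age": mean(ages)}


# Invented sample data held by two separate stations.
stations = [
    DataStation("station-a", [{"age": 30}, {"age": 50}]),
    DataStation("station-b", [{"age": 40}]),
]

# The same train visits every station; results are combined afterwards.
results = {station.name: station.run(count_and_mean_age) for station in stations}
total = sum(result["n"] for result in results.values())
overall_mean = sum(r["n"] * r["mean_age"] for r in results.values()) / total
print(results)
print(overall_mean)
```

A real station would also have to check what a train is allowed to compute and return; this sketch trusts the train and shows only the direction of travel.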
THE INTERNET FOR MACHINES
- The Machine knows what I mean
- As open as possible, as closed as necessary
- As distributed as possible, as central as needed
- Global: FAIR aka Fully AI-Ready
FAIR DATA AND THE INTERNET OF FAIR DATA AND SERVICES
- Data are machine readable
- Principles are generic, so applicable in all domains
- Principles, not standards, so no replacement or threat
- Honoring what is already there, connecting at a higher level
- Local FAIR Data Points visited by algorithms
- Data do NOT leave the source or country
- Only use relevant data for exchange and analytics
- Controlled access
- FINDABLE
- ACCESSIBLE
- INTEROPERABLE
- REUSABLE
Without good metadata, NO effective interoperability
Metadata ‘describe’ the data source in detail:
- Structure and internal coherence
- Source reference and licenses
- Time-stamped changes to the data
- Quality
- Context
- Provenance and maintenance
META DATA ESSENTIAL FOR INTEROPERABILITY
META DATA
APPLIED INTELLIGENCE POSSIBILITIES
- FAIR data, well organized, as the basis for AI (Fully AI Ready)
- Underlying ontology provides relationships
- Allows for creating semantic triples
- Enables knowledge-graph-based in silico discovery
THE VODAN FDP
VIRUS OUTBREAK DATA NETWORK (VODAN) PROJECT
Uganda, China, Italy, Ireland, USA
THE WORLD HEALTH ORGANIZATION ELECTRONIC CASE RECORD FORM
- WHO provided form
- VODAN team created a semantic model
- VODAN project provides RDF
WHO eCRF as RDF
Based upon the semantic model, RDF is created so the input becomes available as machine-readable FAIR data
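To make "the input becomes available as RDF" concrete, the sketch below turns one form answer into three triples and prints them in N-Triples syntax using only the standard library. Everything under the `https://example.org/vodan/` namespace, including the property names `hasAnswer`, `toQuestion`, and `hasValue`, is a placeholder invented for this example; the actual VODAN semantic model defines its own terms.

```python
# Hypothetical namespace; not the identifiers used by the VODAN model.
EX = "https://example.org/vodan/"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslash, quote, and newlines."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# One invented form answer: which antiviral a (fictional) patient received.
answer = {"patient": "patient-001", "question": "antiviral", "value": "Ribavirin"}

triples = [
    (iri(EX + answer["patient"]), iri(EX + "hasAnswer"), iri(EX + "answer-1")),
    (iri(EX + "answer-1"), iri(EX + "toQuestion"), iri(EX + answer["question"])),
    (iri(EX + "answer-1"), iri(EX + "hasValue"), literal(answer["value"])),
]
print(to_ntriples(triples))
```

In practice an RDF library would handle serialization and validation; the hand-rolled version is used here only to show that each answer reduces to subject, predicate, object statements.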
FAIR DATA POINT
Current situation
What? Where? How?
VODAN data approach
Data: represents real-world phenomena
Metadata: describes the data
- Provenance
- License
- Access conditions
- Semantic description
- …
FAIR Data Point
- Provides access to structured and semantically rich metadata describing:
  - The data source itself;
  - Groups of datasets (catalogs);
  - Datasets;
  - The access method(s) of each dataset (distributions);
- Provides a common interface to access the metadata (REST API);
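The metadata layers listed above nest naturally: a data point holds catalogs, a catalog holds datasets, and a dataset holds distributions. The sketch below models that nesting with plain dataclasses. The class and field names, titles, and the `https://example.org/...` URL are assumptions chosen for readability, not the schema or API of any FAIR Data Point implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Distribution:
    """One way of accessing a dataset (for example, a downloadable file)."""
    title: str
    media_type: str
    access_url: str


@dataclass
class Dataset:
    title: str
    distributions: list = field(default_factory=list)


@dataclass
class Catalog:
    """A group of datasets."""
    title: str
    datasets: list = field(default_factory=list)


@dataclass
class FairDataPoint:
    """Top level: metadata about the data source itself."""
    title: str
    catalogs: list = field(default_factory=list)

    def walk(self):
        """Yield (catalog, dataset, distribution) titles for each distribution."""
        for catalog in self.catalogs:
            for dataset in catalog.datasets:
                for distribution in dataset.distributions:
                    yield catalog.title, dataset.title, distribution.title


# Invented example content.
fdp = FairDataPoint("Example FDP", [
    Catalog("COVID-19 catalog", [
        Dataset("CRF records", [
            Distribution("RDF export", "text/turtle", "https://example.org/fdp/crf.ttl"),
        ]),
    ]),
])
print(list(fdp.walk()))
```

A client of the REST API would typically traverse the same layers by following links from one metadata record to the next, which is what `walk` imitates in memory.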
VODAN data infrastructure
A network of FDPs to:
- Facilitate exposure/publication of metadata about COVID-19 data;
- Provide rich metadata;
- Provide semantic descriptions of both metadata and data;
- Improve machine-actionability on metadata and data;
WHO’s COVID-19 CRF semantic data model
- Based on WHO’s COVID-19 CRF, rapid version: https://www.who.int/docs/default-source/coronaviruse/who-ncov-crf.pdf?sfvrsn=84766e69_2 ;
- Provides machine-actionable semantic references for the form’s questions and answers;
- Open access: CC BY 4.0;
- Source file available at: https://github.com/FAIRDataTeam/WHO-COVID-CRF
- Documentation: https://vodan-ontology.github.io
- Published in BioPortal: http://bioportal.bioontology.org/ontologies/COVIDCRFRAPID
WHO’s COVID-19 CRF semantic data model
- Antiviral → SNOMEDCT:372701006
- Antiviral Agent → NCIT:C281
- Antiviral Agents → MESH:D000998
- …
- Ribavirin → SNOMEDCT:387188005
- Ribavirin → NCIT:C807
- Ribavirin → MESH:D012254
- …
- Lopinavir- and ritonavir-containing product → SNOMEDCT:387067003
- Lopinavir/Ritonavir → NCIT:C2096
- Lopinavir-ritonavir drug combination → MESH:D558899
- …
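The mappings above can be held in a small lookup table so that one form concept resolves to its code in each vocabulary. In the sketch below the labels and codes are copied from the slide, while the grouping keys (`antiviral`, `ribavirin`, `lopinavir_ritonavir`), the helper names, and the prefix-to-IRI expansions in `PREFIXES` are assumptions made for illustration; check the expansions against the published model before relying on them.

```python
# Labels and codes as shown above; the dictionary keys are invented.
MAPPINGS = {
    "antiviral": [
        ("Antiviral", "SNOMEDCT:372701006"),
        ("Antiviral Agent", "NCIT:C281"),
        ("Antiviral Agents", "MESH:D000998"),
    ],
    "ribavirin": [
        ("Ribavirin", "SNOMEDCT:387188005"),
        ("Ribavirin", "NCIT:C807"),
        ("Ribavirin", "MESH:D012254"),
    ],
    "lopinavir_ritonavir": [
        ("Lopinavir- and ritonavir-containing product", "SNOMEDCT:387067003"),
        ("Lopinavir/Ritonavir", "NCIT:C2096"),
        ("Lopinavir-ritonavir drug combination", "MESH:D558899"),
    ],
}

# Assumed IRI expansions for each prefix (not stated in the slides).
PREFIXES = {
    "SNOMEDCT": "http://purl.bioontology.org/ontology/SNOMEDCT/",
    "NCIT": "http://purl.obolibrary.org/obo/NCIT_",
    "MESH": "http://purl.bioontology.org/ontology/MESH/",
}


def expand(curie):
    """Expand a compact identifier such as 'NCIT:C807' into a full IRI."""
    prefix, local_id = curie.split(":", 1)
    return PREFIXES[prefix] + local_id


def iris_for(concept):
    """Return {vocabulary: IRI} for one form concept."""
    return {curie.split(":", 1)[0]: expand(curie) for _, curie in MAPPINGS[concept]}


print(iris_for("ribavirin"))
```

Keeping the label next to each code preserves the fact that the vocabularies name the same concept differently, which is the reason the cross-references are needed.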
Albert Mons
International Project Consultant, GO FAIR
CEO, FAIR Solutions
albert.mons@fairsolutions.com
THANK YOU AND Q&A
Luiz Bonino
Technical Advisor to GO FAIR
Associate Professor at University of Twente and Leiden University Medical Center
luiz.bonino@go-fair.org
GLOSSARY OF FAIR TERMS: CONTENT
- FAIR: Enabling both humans and machines to use data more efficiently by making data Findable, Accessible, Interoperable and Reusable
- GO FAIR: A global collaborative community implementing FAIR Data and Services using good practices
- Metadata: A set of data about a given object/resource that helps to describe it
- Semantic Interoperability: the ability of different systems to share meaning
- Conceptual Modelling: a discipline to support the formal description of conceptualizations
- Conceptual model: a formal representation of concepts to support the understanding of a given universe of discourse
- Ontology: a formal description of a shared conceptualization
- FAIR Implementation Profile (FIP): the collection of FAIR implementation choices made in a community of practice
- FAIR Data Stewardship: maximizing the re-use of data
GLOSSARY OF FAIR TERMS: EVENTS
- FAIR Awareness Event (FAE): A half-day high-level introduction to FAIR and the GO FAIR ecosystem
- FAIR Value Event (FVE): A three-day event demonstrating the value of FAIR with a real use case
- Bring Your Own Data Event (BYOD): A two-day event making data FAIR and answering research questions
- FAIR Data Stewardship Course (FDS): a five-day course on FAIR data and FAIR Data Stewardship
- Ontology Modelling Course (OM): a five-day course on Conceptual and Ontology Modelling
- FAIR Implementation Profile Workshop (FIP): A hands-on activity profiling the collection of FAIR implementation choices made in a community of practice
- Meta Data for Machines Workshop (M4M): A hands-on activity creating (or re-using) machine-actionable metadata templates and instances for deployment of FAIR Data and Services
GLOSSARY OF FAIR TERMS: TOOLS
- FAIR Data Stewardship Wizard (FDSW): tool to create FAIR Data Policy plans
- FAIRifier: an application to support data wrangling, semantic data modelling, metadata definition and FAIR publication
- FAIR Data Station: a server application to publish metadata and allow the interaction with data in a FAIR way
- FAIR Evaluator: an application to assess the FAIRness levels of different types of resources, namely, metadata, data, ontologies, applications, etc.
- FAIR Meta Data Search Engine: a server application enabling humans and machines to search for FAIR objects/resources based on the indexing of their metadata
- Ontology Modelling tool: an application to support the modelling/definition of ontologies
- Ontology Management System: a server application to support the management of the lifecycle of models (data models, ontologies, semantic data models, etc.)