Data management, storage and sharing Managing data at - - PowerPoint PPT Presentation


SLIDE 1

Data management, storage and sharing

SLIDE 2

Managing data at institute-level: an example

  • Platforms
  • Clinical
  • Large international databases (Human Connectome Project)

MRI neuroimaging

  • Anatomical
  • Diffusion
  • Functional
  • Quantitative

Optical Imaging

  • Two-photon microscopy
  • Confocal microscopy
  • Mesoscopic optical imaging
  • Spectroscopy
  • Laser Doppler flowmetry
  • Optical coherence tomography
  • Histology / tracing

Electrophysiology

  • EEG/MEG
  • Multi-electrode arrays
  • Single-cell recordings
  • Deep brain stimulation recordings

NeuroBioTools

  • Genomics
  • Transcriptomics

Species: Mouse, Rat, Marmoset, Macaque, Baboon, Chimpanzee, Human
Scales: Microscopic, Mesoscopic, Macroscopic
Conditions: In vivo, Post-mortem

  • Heterogeneous data
  • Multiple sources
  • Multiple scales
  • Large quantities (~150 TB)
  • Large data-processing needs

How should such an amount and variety of data be managed?

SLIDE 3

Where’s my data?

« On a portable hard drive. My PhD student has got it. I’ll email him. »

  • Non-secure and unreliable storage. No backup.
  • Major risk: complete data loss.
  • Other risks: loss of associated data and inability to reprocess.

« On a workstation in the experimental room. From time to time I make a copy of the hard drive. »

  • Non-secure storage. Irregular backup.
  • Risk: data loss.
  • Other risks: loss of associated data and inability to reprocess.

« On a (professional-level) storage server. »

  • Secure storage, guaranteed backup.
  • Can we find the data? Can we run new analyses on it?

SLIDE 4

Rationalizing data management: goals and motivations

  • To eliminate all possibility of data loss

SLIDE 5

Rationalizing data management: goals and motivations

  • To eliminate all possibility of data loss
  • To offer easy and reliable access to all data through specific queries (databasing, indexing)

SLIDE 6

Rationalizing data management: goals and motivations

  • To eliminate all possibility of data loss
  • To offer easy and reliable access to all data through specific queries
  • To ease or automate data processing (formatting / standardisation of storage)

SLIDE 7

Rationalizing data management: goals and motivations

  • To eliminate all possibility of data loss
  • To offer easy and reliable access to all data through specific queries
  • To ease or automate data processing
  • To reduce costs

SLIDE 8

Rationalizing data management: goals and motivations

  • To eliminate all possibility of data loss
  • To offer easy and reliable access to all data through specific queries
  • To facilitate data sharing between researchers and/or with journals requiring access to experimental data
  • To ease or automate data processing
  • To reduce costs

→ Universal formatting of data

SLIDE 9

Rationalizing data management: goals and motivations

  • To eliminate all possibility of data loss
  • To offer easy and reliable access to all data through specific queries
  • To facilitate data sharing between researchers and/or with journals requiring access to experimental data
  • To propose a Data Management Plan to researchers
  • To ease or automate data processing
  • To reduce costs

SLIDE 10

Rationalizing data management: goals and motivations

  • To eliminate all possibility of data loss
  • To offer easy and reliable access to all data through specific queries
  • To facilitate data sharing between researchers and/or with journals requiring access to experimental data
  • To propose a Data Management Plan to researchers
  • To promote and facilitate reproducible and open science
  • To ease or automate data processing
  • To reduce costs

SLIDE 11

Rationalizing data management: goals and motivations

  • To eliminate all possibility of data loss
  • To offer easy and reliable access to all data through specific queries
  • To facilitate data sharing between researchers and/or with journals requiring access to experimental data
  • To propose a Data Management Plan to researchers
  • To promote and facilitate reproducible and open science
  • To ease or automate data processing
  • To reduce costs

SLIDE 12

Rationalizing data management: goals and motivations

  • To eliminate all possibility of data loss
  • To offer easy and reliable access to all data through specific queries
  • To facilitate data sharing between researchers and/or with journals requiring access to experimental data
  • To propose a Data Management Plan to researchers
  • To promote and facilitate reproducible and open science
  • To ease or automate data processing
  • To facilitate scientific projects using heterogeneous multi-modal data, or to facilitate machine learning
  • To reduce costs

SLIDE 13

The 3 pillars of good data management

Storage

  • Must guarantee security and regular data backup
  • All data must be stored as automatically as possible on storage servers

→ No loss

Indexing

  • Ensures that the data is traceable and can be retrieved through specific queries based on descriptive metadata
  • This indexing is usually performed via a database engine

→ Access

Formatting

  • Standardised nomenclature defining the storage and organization of data and associated metadata
  • Ensures that data can be exchanged and analysed autonomously

→ Sharing, automatic processing
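
The indexing pillar can be illustrated with a minimal sketch, assuming a small SQLite table of descriptive metadata. The table layout, the column names (`species`, `modality`, `scale`) and the file paths are hypothetical and chosen only for illustration; the slides do not say which database engine or schema the institute uses for this purpose.

```python
import sqlite3

# Columns that may be used as query criteria (hypothetical schema).
ALLOWED_FIELDS = {"species", "modality", "scale"}


def build_index(records):
    """Create an in-memory SQLite index from (path, species, modality, scale) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE datasets ("
        "path TEXT PRIMARY KEY, species TEXT, modality TEXT, scale TEXT)"
    )
    conn.executemany("INSERT INTO datasets VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def find_datasets(conn, **criteria):
    """Return the sorted paths whose metadata match every column=value criterion."""
    unknown = set(criteria) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown metadata fields: {sorted(unknown)}")
    # Column names come from the whitelist above; values are bound as parameters.
    where = " AND ".join(f"{col} = ?" for col in criteria) or "1"
    rows = conn.execute(
        f"SELECT path FROM datasets WHERE {where} ORDER BY path",
        tuple(criteria.values()),
    )
    return [row[0] for row in rows]


# Hypothetical records, for illustration only.
RECORDS = [
    ("/store/mri/sub-01_T1w.nii.gz", "human", "anatomical MRI", "macroscopic"),
    ("/store/mri/sub-02_dwi.nii.gz", "macaque", "diffusion MRI", "macroscopic"),
    ("/store/optical/m-07_2p.tif", "mouse", "two-photon microscopy", "microscopic"),
]

conn = build_index(RECORDS)
print(find_datasets(conn, species="mouse"))
print(find_datasets(conn, scale="macroscopic"))
```

The data files themselves stay on the storage server; only their location and descriptive metadata go into the index, which is what makes the data findable by query.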

SLIDE 14

Some solutions exist – many need to be built

MR Neuroimaging

  • Storage server
  • BIDS formatting
  • XNAT database
  • (Partial) automation of processing
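
As a rough illustration of what BIDS formatting means in practice, the sketch below builds a BIDS-style relative path from a few entities. It is a simplified assumption-laden example: it covers only the `sub`, `ses` and `task` entities, the helper name `bids_path` is hypothetical, and it is no substitute for a real BIDS validator.

```python
from pathlib import PurePosixPath


def bids_path(subject, datatype, suffix, session=None, task=None, extension=".nii.gz"):
    """Build a simplified BIDS-style relative path (sub/ses/task entities only)."""
    # BIDS labels are alphanumeric; this check is a simplification.
    for label in (subject, session, task):
        if label is not None and not label.isalnum():
            raise ValueError(f"label must be alphanumeric: {label!r}")

    entities = [f"sub-{subject}"]
    folders = [f"sub-{subject}"]
    if session is not None:
        entities.append(f"ses-{session}")
        folders.append(f"ses-{session}")
    if task is not None:
        entities.append(f"task-{task}")
    folders.append(datatype)

    filename = "_".join(entities + [suffix]) + extension
    return PurePosixPath(*folders, filename)


print(bids_path("01", "anat", "T1w"))
print(bids_path("01", "func", "bold", session="pre", task="rest"))
```

Because every file follows the same naming scheme, processing tools can locate their inputs without per-study configuration, which is what makes partial automation feasible.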

Bio-informatics

  • Storage server
  • TranSMART database

Multi-electrode electrophysiology

  • NEO formatting, optimised for data transfer and sharing
  • Python API for automatic indexing

Example organization

Clinical and demographic data

  • Storage server
  • REDCap database

A tool to join all databases
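
One way to picture such a joining tool is a small sketch that merges per-database records on a shared subject identifier. This is purely illustrative: the function name `join_by_subject`, the record contents and the assumption that every database shares the same subject identifier are hypothetical, and real access to XNAT, TranSMART or REDCap would go through their own interfaces.

```python
def join_by_subject(**sources):
    """Merge per-source {subject_id: record} mappings into one view per subject."""
    merged = {}
    for source_name, records in sources.items():
        for subject_id, record in records.items():
            # Keep each source's record under its own key to avoid field clashes.
            merged.setdefault(subject_id, {})[source_name] = record
    return merged


# Hypothetical extracts from two databases, keyed by a shared subject identifier.
imaging = {"sub-01": {"modality": "T1w"}, "sub-02": {"modality": "dwi"}}
clinical = {"sub-01": {"age": 34}}

view = join_by_subject(imaging=imaging, clinical=clinical)
print(view["sub-01"])
print(view["sub-02"])
```

A subject missing from one database simply lacks that key in the merged view, so incomplete records remain visible rather than being silently dropped.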