Towards standard, accessible and reproducible Metabolomics

SLIDE 1

Towards standard, accessible and reproducible Metabolomics

Reza Salek PhD

Metabolism and Molecular Informatics The European Bioinformatics Institute (EMBL-EBI)

Email: Reza.salek@ebi.ac.uk

The 1st International Electronic Conference on Metabolomics

SLIDE 2

EBI Databases and services

  • Genomes: Ensembl, Ensembl Genomes, EGA
  • Nucleotide sequence: ENA
  • Functional genomics: ArrayExpress, Expression Atlas
  • Protein sequences: UniProt
  • Protein families, motifs and domains: InterPro
  • Macromolecular: PDBe
  • Protein activity: IntAct, PRIDE
  • Cheminformatics & Metabolism: MetaboLights, ChEBI
  • Pathways: Reactome
  • Systems: BioModels, BioSamples
  • Literature and ontologies: PubMC, GO
  • Chemogenomics: ChEMBL

SLIDE 3

Is data growth FAIR?

SLIDE 4

Metabolomics Standards Initiative (WG)

  • Lives at http://msi-workgroups.sourceforge.net
  • 5 Workgroups
  • Biological context metadata WG
  • Chemical analysis WG
  • Data processing WG
  • Ontology WG
  • Exchange format WG

Roy Goodacre, Metabolomics (2014) 10:5-7

SLIDE 5

Data sharing repositories

http://www.metabolomicsworkbench.org/

http://ebi.ac.uk/metabolights/

SLIDE 6

OmicsDI – Collection of omics

MX PX EGA TX

SLIDE 7

Leading to data discovery

SLIDE 8

OmicsDI

SLIDE 9

Capturing Metadata: ISA-Tab format

Developed a user-friendly way to capture standards-compliant metadata:

  • https://github.com/ISA-tools/ISAcreator
  • https://github.com/ISA-tools/ISAcreator/wiki/API
  • https://github.com/ISA-tools/ISATab-Viewer
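As a rough illustration of what ISA-Tab captures: it is a tab-delimited format built from an investigation file plus study and assay tables. The sketch below reads a tiny study-style table using only Python's standard `csv` module. The table contents, sample names and the helper `read_study_table` are made up for illustration; real submissions would normally be prepared with ISAcreator rather than by hand.

```python
import csv
import io

# Hypothetical study table in ISA-Tab style: one row per sample,
# tab-delimited, with a header row naming each metadata field.
STUDY_TABLE = (
    "Source Name\tCharacteristics[Organism]\tProtocol REF\tSample Name\n"
    "subject_1\tHomo sapiens\tSample collection\tsample_1\n"
    "subject_2\tHomo sapiens\tSample collection\tsample_2\n"
)


def read_study_table(text):
    """Parse a tab-delimited study table into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = read_study_table(STUDY_TABLE)
for row in rows:
    # Each row links a sample back to its source and its annotations.
    print(row["Sample Name"], "<-", row["Source Name"],
          row["Characteristics[Organism]"])
```

Because the format is plain tab-delimited text, the same table can be opened in a spreadsheet or processed by any scripting language.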

SLIDE 10

ISAcreator – Using Ontologies

SLIDE 11

Data Standards: What is XML?

  • XML stands for EXtensible Markup Language
  • XML is a markup language much like HTML
  • XML was designed to carry data, not to display data
  • XML is designed to be self-descriptive

NMR analysis

All spectra were recorded on a <Varian NMR Instrument>Varian VNMRS 600 NMR Spectrometer</Varian NMR Instrument> operating at a proton NMR frequency of <Irradiation frequency>599.83 <Megahertz>MHz</Megahertz></Irradiation frequency> using a <cryoprobe>5 mm inverse detection cryoprobe</cryoprobe>. <acquisition nucleus>1H</acquisition nucleus> NMR spectra were recorded […].
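The tagged sentence above is schematic: real XML element names cannot contain spaces. A minimal sketch of the same idea in well-formed XML, parsed with Python's standard `xml.etree.ElementTree`, might look like the following. The element and attribute names (`nmrAnalysis`, `irradiationFrequency`, `unit`, and so on) are assumptions chosen for illustration and do not come from any particular metabolomics schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names: XML names cannot contain spaces, so
# "Irradiation frequency" becomes irradiationFrequency, and so on.
NMR_XML = """
<nmrAnalysis>
  <instrument vendor="Varian">Varian VNMRS 600 NMR Spectrometer</instrument>
  <irradiationFrequency unit="MHz">599.83</irradiationFrequency>
  <probe>5 mm inverse detection cryoprobe</probe>
  <acquisitionNucleus>1H</acquisitionNucleus>
</nmrAnalysis>
"""

root = ET.fromstring(NMR_XML)
frequency = root.find("irradiationFrequency")

# Because the data is self-describing, each value can be pulled out by name.
metadata = {
    "instrument": root.findtext("instrument"),
    "frequency": float(frequency.text),
    "frequency_unit": frequency.get("unit"),
    "probe": root.findtext("probe"),
    "nucleus": root.findtext("acquisitionNucleus"),
}
print(metadata)
```

Once the values are tagged this way, software can read the instrument, frequency and nucleus directly instead of mining them from free-text methods sections.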

SLIDE 12

Generating ISA-Tab metadata files from metabolomics XML data
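One way to picture this conversion, as a sketch only: read instrument metadata from an XML document and write it out as a tab-delimited assay-style row. The XML layout and the function `xml_to_assay_table` below are hypothetical; only the general ISA-Tab column patterns (`Sample Name`, `Parameter Value[...]`, `Unit`) are taken from the format itself, and the actual converters used with MetaboLights may work quite differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical instrument metadata; element names are illustrative only.
RUN_XML = """<run sample="sample_1">
  <instrument>Varian VNMRS 600 NMR Spectrometer</instrument>
  <irradiationFrequency unit="MHz">599.83</irradiationFrequency>
  <acquisitionNucleus>1H</acquisitionNucleus>
</run>"""


def xml_to_assay_table(xml_text):
    """Turn one XML run description into a two-line tab-delimited table."""
    run = ET.fromstring(xml_text)
    freq = run.find("irradiationFrequency")
    # A "Unit" column follows the parameter value it qualifies.
    row = {
        "Sample Name": run.get("sample"),
        "Parameter Value[Instrument]": run.findtext("instrument"),
        "Parameter Value[Irradiation frequency]": freq.text,
        "Unit": freq.get("unit"),
        "Parameter Value[Acquisition nucleus]": run.findtext("acquisitionNucleus"),
    }
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(row), delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return out.getvalue()


assay_table = xml_to_assay_table(RUN_XML)
print(assay_table)
```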

SLIDE 13

MetaboLights – Study Validation Status

MetaboLights - an open-access general-purpose repository for metabolomics studies and associated meta-data. Nucl. Acids Res. (2012) doi:10.1093/nar/gks1004

SLIDE 14

MetaboLights – Study Validation details

SLIDE 15

Tools: the way forward!

SLIDE 16

Current way and ideal

SLIDE 17

Complex analysis pipelines

[Figure: DIMS data processing workflow. Samples are acquired as technical triplicates in a run order such as QC, C5, S3, S7, C1, C10, QC, S1, C3, S5, C7, S6, QC, ... Stage labels in the figure:]

  • DIMS Data Collection (instrument .RAW files, averaged transients)
  • Apodisation, Zero-filling and FFT (frequency spectra)
  • Mass Calibration and SIM-stitching, with a calibrant list (stitched peak lists)
  • Replicate Filtering (replicate filtered peak lists)
  • Blank Filtering
  • TIC Filtering
  • Sample Filtering (sample filtered peak matrix, SFPM)
  • Missing-value Filtering
  • PQN Normalisation
  • Batch Correction
  • Spectral Cleaning
  • Impute Missing Values using KNN
  • Glog Transformation
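To give a feel for one stage of such a pipeline, here is a minimal pure-Python sketch of probabilistic quotient normalisation (PQN). It assumes a complete peak matrix with strictly positive intensities (no missing values) and uses the per-peak median across all samples as the reference spectrum. The function name `pqn_normalise` and the toy data are illustrative, the optional initial total-intensity scaling is omitted, and this is not necessarily how the pipeline shown on the slide implements the step.

```python
from statistics import median


def pqn_normalise(matrix):
    """Scale each sample by the median quotient against a reference.

    matrix: list of samples, each a list of positive peak intensities.
    """
    n_peaks = len(matrix[0])
    # Reference spectrum: per-peak median across all samples.
    reference = [median(row[j] for row in matrix) for j in range(n_peaks)]
    normalised = []
    for row in matrix:
        # Quotient of each peak against the reference peak.
        quotients = [row[j] / reference[j] for j in range(n_peaks)]
        # The median quotient estimates the sample's dilution factor.
        factor = median(quotients)
        normalised.append([value / factor for value in row])
    return normalised


# Toy data: the same profile at three different dilutions.
peaks = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [4.0, 8.0, 12.0],
]
result = pqn_normalise(peaks)
print(result)
```

In the toy example all three samples collapse onto the same profile, which is the intended effect when samples differ only by dilution.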

SLIDE 18

PhenoMeNal - Goal

[Figure labels: PhenoMeNal VRE Portal; Data Producer; Data container; Packaged tool; Tool maker; Infrastructure provider; Compute Infrastructure]

SLIDE 19

Key objectives

  • Understand the computational needs of the Metabolomics Community.
  • Integrate and scale existing Open Source tools into a well-tested e-infrastructure.

SLIDE 20

Major revolution

SLIDE 21

Same in software

Cluster Cloud PI’s Collaborator’s Developer’s

SLIDE 22

VRE Portal

  • Three usability rounds.
  • 80% functionality running.
  • Public instance access.
  • App Library, hooked to EGI AppDB.
  • Documentation.

http://portal.phenomenal-h2020.eu/

SLIDE 23

MetaboLights – The team

Previous: Paula de Matos, Mark Rijnbeek, Tejasvi Mahendraker, Pablo Conesa

Kenneth Haug, Reza Salek, Mark Williams, Venkata Chandrasekhar, Keeva Cochrane, Jose Ramon Macias Gonzalez, Christoph Steinbeck, Jules Griffin (UC/MRC), Xuefei Li (MRC), Kalai Jayaseelan

SLIDE 24

EBI PhenoMeNal – The team

Kenneth Haug, Reza Salek, Namrata Kale, Christoph Steinbeck, Sijin He, Pablo Moreno

SLIDE 25

COSMOS consortium

SLIDE 26

PhenoMeNal consortium

SLIDE 27

Funding and Collaborators