A website for metabolomics data analysis and mining Sara Cardoso 1, - - PowerPoint PPT Presentation



slide-1
SLIDE 1

WebSpecmine: A website for metabolomics data analysis and mining

Sara Cardoso1,*, Telma Afonso1, Marcelo Maraschin2, and Miguel Rocha1,*

1 CEB - Centre of Biological Engineering, University of Minho, Campus of Gualtar, Braga, Portugal;

2 Plant Morphogenesis and Biochemistry Laboratory, Federal University of Santa Catarina, Florianópolis, SC, Brazil.

* Corresponding authors: saracardoso501@gmail.com; mrocha@di.uminho.pt

slide-2
SLIDE 2

WebSpecmine: A website for metabolomics data analysis and mining

slide-3
SLIDE 3

Introduction

Metabolomics

✓ Analysing metabolomics data correctly and efficiently is nowadays very important in biological and biomedical research.

However, most people who want to perform this analysis may not have the programming skills needed.

A website to perform metabolomics data analysis is an important asset.

slide-4
SLIDE 4

Introduction

Some of the Existing Websites

Covered Techniques

LC/GC-MS Raw Spectra LC/GC-MS Raw Spectra LC/GC-MS Peak Lists NMR Peak Lists Metabolites' Concentrations (Quantitative Data)

Univariate Analysis

T-Test; ANOVA; Fold Change T-Test; ANOVA; Non-Parametric Tests;

Multivariate Analysis

PCA; Clustering; Machine Learning (only PLS-DA); Feature Selection (only Random Forests and SVM) PCA; Clustering; Machine Learning (only LDA, PLS-DA and Random Forests)

Other Features

Correlation Analysis; Metabolite Identification (only for MS); Pathway Analysis Metabolite Identification; Pathway Analysis; User Account

slide-5
SLIDE 5

Introduction

What is missing in the existing websites?

  • A wide variety of techniques and data formats supported: spectral data (Raman, UV-Vis and IR) is missing
  • A wide variety of pre-processing methods: mostly just normalization, scaling and missing values treatment
  • A wide variety of analysis methods: there should be more model options for machine learning, for example
  • Flexible pipeline: most of the time, users have to follow a strict pipeline
  • User account: so that data and results can be stored and shared

slide-6
SLIDE 6

What was our main goal, then?

SOLUTION: ✓ Create an easy-to-use and freely available website that provides a wide variety of methods and data types for analysis, and ways to store and share metabolomics data and the results generated.

slide-7
SLIDE 7

WebSpecmine: overview

Metabolomics Data Supported
✓ NMR
✓ LC/GC-MS
✓ Infrared, UV-Visible, and Raman Spectra
✓ Concentrations Data (Quantitative Data)

Metabolomics Data Analysis Available
✓ Univariate Statistical Analysis
✓ Unsupervised Multivariate Statistical Analysis
✓ Supervised Multivariate Statistical Analysis
✓ Metabolite Identification
✓ Pathway Analysis

Data Pre-Processing

User Account
✓ Store data and results privately
✓ Share data across users

Tutorials and User Guide

slide-8
SLIDE 8

WebSpecmine: Supported data

LC/GC-MS

Raw Spectra Data Formats
✓ .mzData
✓ .mzXML
✓ .netCDF

Peak Lists Data Formats
✓ CSV
✓ TSV

slide-9
SLIDE 9

WebSpecmine: Supported data

NMR

Raw Spectra Data Formats
✓ BRUKER
✓ VARIAN

Peak Lists Data Formats
✓ CSV
✓ TSV

slide-10
SLIDE 10

WebSpecmine: Supported data

Spectral Data: Raman, IR and UV-Vis

Spectra Data Formats
✓ CSV
✓ (J)DX
✓ SPC
✓ MS EXCEL (.xlsx)

slide-11
SLIDE 11

WebSpecmine: Supported data

Concentrations (Quantitative) Data

✓ CSV/TSV File:

  • Metabolites' names or identifiers
  • Samples' names
  • Concentration values of each metabolite in each sample
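As a minimal sketch of the file contents listed above, the Python snippet below parses a small concentrations table with the standard `csv` module. The orientation (metabolites in rows, samples in columns), the column names and all values are illustrative assumptions, not WebSpecmine's documented layout.

```python
import csv
import io

# Hypothetical example file: the first column holds metabolite names and
# each remaining column holds one sample (assumed orientation).
EXAMPLE_CSV = """metabolite,sample_1,sample_2,sample_3
glucose,5.1,4.8,5.4
lactate,1.2,1.9,1.5
"""

def read_concentrations(text, delimiter=","):
    """Return (sample_names, {metabolite: [concentration, ...]})."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    sample_names = header[1:]
    table = {}
    for row in reader:
        if not row:  # skip blank lines
            continue
        table[row[0]] = [float(value) for value in row[1:]]
    return sample_names, table

samples, concentrations = read_concentrations(EXAMPLE_CSV)
print(samples)
print(concentrations["glucose"])
```

For a TSV file, the same function can be called with `delimiter="\t"`.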

slide-12
SLIDE 12

WebSpecmine: Supported data

Metadata

✓ CSV/TSV File:

  • Samples' names
  • Names of the metadata classes
  • Metadata values for each metadata class in each sample

✓ All types of data should have a metadata file associated
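Because every dataset needs an associated metadata file, a quick consistency check between the two can be useful before uploading. The sketch below reads a small TSV metadata table and lists sample names that appear in only one of the two files. The file layout (one row per sample), the class names and the helper names are hypothetical and only serve as an illustration.

```python
import csv
import io

# Hypothetical metadata file: one row per sample, one column per metadata class.
EXAMPLE_METADATA = """sample\ttreatment\ttimepoint
sample_1\tcontrol\t0h
sample_2\ttreated\t0h
sample_3\ttreated\t24h
"""

def read_metadata(text, delimiter="\t"):
    """Return {sample_name: {metadata_class: value}}."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    sample_column = reader.fieldnames[0]
    return {
        row[sample_column]: {k: v for k, v in row.items() if k != sample_column}
        for row in reader
    }

def unmatched_samples(data_samples, metadata):
    """Sample names present in only one of the two files."""
    return sorted(set(data_samples) ^ set(metadata))

metadata = read_metadata(EXAMPLE_METADATA)
print(metadata["sample_2"])
print(unmatched_samples(["sample_1", "sample_2", "sample_4"], metadata))
```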

slide-13
SLIDE 13

WebSpecmine: User Account

Why a User Account?

✓ Main website functionalities are accessible without a user account
✓ But you will have to create an account if you want to:

  • Save and share data and results
  • Leave an analysis in 'stand-by'
slide-14
SLIDE 14

WebSpecmine: User Account

Creation of a User Account

To get one, users have to send an email asking for an account; an email with the credentials will then be sent as soon as possible. Email: webspecmine@gmail.com

slide-15
SLIDE 15

WebSpecmine: User Account

Data Projects: What are they?

A project is a study, or group of studies, which contains the data and metadata for each study, as well as reports of the results obtained.

Projects can be:

  • Private
  • Public
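One possible way to picture the project structure described above is the following Python sketch. All class and field names are hypothetical, and the visibility rule (private projects visible only to their owner) is an assumption made for illustration; it does not describe how the website is implemented.

```python
from dataclasses import dataclass, field
from enum import Enum

class Visibility(Enum):
    PRIVATE = "private"
    PUBLIC = "public"

@dataclass
class Study:
    name: str
    data_files: list = field(default_factory=list)
    metadata_file: str = ""
    reports: list = field(default_factory=list)

@dataclass
class Project:
    name: str
    owner: str
    visibility: Visibility = Visibility.PRIVATE
    studies: list = field(default_factory=list)

    def visible_to(self, user):
        """Assumed rule: public projects for everyone, private ones for the owner."""
        return self.visibility is Visibility.PUBLIC or user == self.owner

project = Project(name="demo", owner="alice")
project.studies.append(
    Study(name="study_1", data_files=["peaks.csv"], metadata_file="metadata.csv")
)
print(project.visible_to("alice"), project.visible_to("bob"))
```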

slide-16
SLIDE 16

WebSpecmine: User Account

Your Projects

The projects stored in an account are accessible through the My Projects sidebar tab.

slide-17
SLIDE 17

WebSpecmine: User Account

Public Projects

Everyone who accesses the website can see all public projects in the Public Projects sidebar tab.

slide-18
SLIDE 18

WebSpecmine: User Account

Public Projects

Everyone who accesses the website can see all public projects in the Public Projects sidebar tab.

To analyse a public project, you have to copy it to your account and analyse it from there, so that the original project is not compromised.
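The copy-before-analysis rule can be illustrated with a deep copy: changes made to the copy never reach the original. The dictionary layout and the `copy_to_account` helper below are hypothetical and stand in for whatever the website does internally.

```python
import copy

# Hypothetical public project, represented as a plain dictionary.
public_project = {
    "name": "public_demo",
    "owner": "alice",
    "studies": [{"name": "study_1", "data": [[1.0, 2.0], [3.0, 4.0]]}],
}

def copy_to_account(project, new_owner):
    """Return an independent deep copy owned by `new_owner`."""
    private_copy = copy.deepcopy(project)
    private_copy["owner"] = new_owner
    return private_copy

my_copy = copy_to_account(public_project, "bob")
my_copy["studies"][0]["data"][0][0] = 99.0  # edit the copy only

# The original project still holds its initial value.
print(public_project["studies"][0]["data"][0][0])
```

A shallow copy would not be enough here, since the nested study data would still be shared between both projects.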

slide-19
SLIDE 19

WebSpecmine: User Account

Workspace: users can leave their analysis in 'stand-by' and continue later

Users can leave an analysis at any time by saving the workspace, and continue from there next time.

slide-20
SLIDE 20

WebSpecmine: Select Data for Analysis

For Logged In Users

1. Select the project, the data folder in that project that holds the data to analyse, and the metadata file in that project that corresponds to the selected data.

slide-21
SLIDE 21

WebSpecmine: Select Data for Analysis

For Logged In Users

2. Set the options required to correctly read the data and metadata files.

slide-22
SLIDE 22

WebSpecmine: Select Data for Analysis

For Logged In Users

3. After setting the data and metadata options, the user can submit the data for analysis.

slide-23
SLIDE 23

WebSpecmine: Select Data for Analysis

For Logged Out Users

✓ The procedure is similar, but the data and metadata files have to be submitted, as they are not stored on the website.
✓ The submitted data will only be stored temporarily, while the analysis is in progress.

slide-24
SLIDE 24

Once the user selects the data, the data analysis pages will be accessible

slide-25
SLIDE 25

WebSpecmine: Data Visualization

The website provides a way to visualize the data

slide-26
SLIDE 26

WebSpecmine: Data Visualization

Data Summary

slide-27
SLIDE 27

WebSpecmine: Data Visualization

Data and Metadata Tables

slide-28
SLIDE 28

WebSpecmine: Data Visualization

Samples' and Variables' Statistics

slide-29
SLIDE 29

WebSpecmine: Data Visualization

Boxplots of the Variables

slide-30
SLIDE 30

WebSpecmine: Data Visualization

Plot for Peaks Data

slide-31
SLIDE 31

WebSpecmine: Data Visualization

Plot for Spectra

slide-32
SLIDE 32

WebSpecmine: Pre-Processing

The website provides a wide variety of pre-processing methods, which can be applied in any desired order.
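The idea of applying pre-processing methods in a user-chosen order can be sketched as a list of functions applied one after another. The three steps below are simple stand-ins chosen for illustration (they are assumptions, not the website's actual methods or defaults), and the example shows that a different list of steps gives a different result.

```python
import math

def replace_missing_with_zero(sample):
    """Missing values (None) become 0.0."""
    return [0.0 if value is None else value for value in sample]

def log_transform(sample):
    """log2(x + 1), so zeros stay finite."""
    return [math.log2(value + 1.0) for value in sample]

def mean_center(sample):
    """Subtract the mean of the values."""
    mean = sum(sample) / len(sample)
    return [value - mean for value in sample]

def run_pipeline(sample, steps):
    """Apply each step in the order given by the user."""
    for step in steps:
        sample = step(sample)
    return sample

raw = [3.0, None, 1.0, 7.0]
result_a = run_pipeline(raw, [replace_missing_with_zero, log_transform, mean_center])
result_b = run_pipeline(raw, [replace_missing_with_zero, mean_center])
print(result_a)
print(result_b)
```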

slide-33
SLIDE 33

WebSpecmine: Pre-Processing

Methods that are available for all types of data

slide-34
SLIDE 34

WebSpecmine: Pre-Processing

Methods that are available for all types of data

slide-35
SLIDE 35

WebSpecmine: Pre-Processing

Methods that are available for all types of data

slide-36
SLIDE 36

WebSpecmine: Pre-Processing

Methods only for spectral data

slide-37
SLIDE 37

WebSpecmine: Pre-Processing

Method only for NMR spectra

slide-38
SLIDE 38

WebSpecmine: Pre-Processing

After processing the data, a name has to be given to the new dataset.

To perform an analysis on the new dataset, the user has to select it on the sidebar panel.
slide-39
SLIDE 39

WebSpecmine: Data Analysis

slide-40
SLIDE 40

WebSpecmine: Data Analysis

Univariate Analysis

Example for T-Test

Analysis options for a T-Test

slide-41
SLIDE 41

WebSpecmine: Data Analysis

Univariate Analysis

Example for T-Test

Types of results available for this type of analysis: numerical results
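To make the numerical results of a t-test more concrete, the sketch below computes a Welch t statistic, its degrees of freedom and a log2 fold change for one variable measured in two groups. Whether the website uses Welch's or Student's variant is not stated in the slides, so the choice here is an assumption, and the p-value is left out because the standard library has no t distribution.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(group_a), len(group_b)
    # Squared standard errors of the two group means.
    va, vb = variance(group_a) / na, variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

def log2_fold_change(group_a, group_b):
    """log2 of the ratio between the two group means."""
    return math.log2(mean(group_a) / mean(group_b))

# Made-up intensities of one metabolite in two groups of samples.
control = [4.0, 5.0, 6.0]
treated = [8.0, 10.0, 12.0]

t_stat, dof = welch_t(control, treated)
print(round(t_stat, 3), round(dof, 2), log2_fold_change(treated, control))
```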

slide-42
SLIDE 42

WebSpecmine: Data Analysis

Univariate Analysis

Example for T-Test

Types of results available for this type of analysis: plot

slide-43
SLIDE 43

WebSpecmine: Data Analysis

Univariate Analysis

Other Analyses

There are other univariate analysis methods available. The types of results available for each analysis are similar to those shown for the T-Test.

slide-44
SLIDE 44

WebSpecmine: Data Analysis

Principal Components Analysis (PCA)

Analysis options for both normal and robust PCAs

slide-45
SLIDE 45

WebSpecmine: Data Analysis

Principal Components Analysis (PCA)

Types of results available for this type of analysis: numerical results
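As a rough illustration of the kind of numbers a PCA reports (loadings and explained variance), the sketch below finds the first principal component of a tiny dataset by power iteration on the covariance matrix. This is a generic textbook approach written with the standard library only; it is not the routine the website uses, and the data are made up.

```python
import math

def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (samples x variables)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of `cov` by power iteration."""
    p = len(cov)
    vector = [1.0] * p  # assumes this start is not orthogonal to the answer
    for _ in range(iterations):
        product = [sum(cov[i][j] * vector[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    eigenvalue = sum(vector[i] * sum(cov[i][j] * vector[j] for j in range(p))
                     for i in range(p))
    return vector, eigenvalue

data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8]]
cov = covariance_matrix(data)
loadings, eigenvalue = first_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print(loadings, round(explained, 4))
```

The ratio `explained` is the share of total variance captured by the first component; further components would require deflation or a full eigendecomposition.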

slide-46
SLIDE 46

WebSpecmine: Data Analysis

Principal Components Analysis (PCA)

Types of results available for this type of analysis: plot results

slide-47
SLIDE 47

WebSpecmine: Data Analysis

Principal Components Analysis (PCA)

Types of results available for this type of analysis: plot results
slide-48
SLIDE 48

WebSpecmine: Data Analysis

Clustering Analysis

Hierarchical Clustering

Analysis options

slide-49
SLIDE 49

WebSpecmine: Data Analysis

Clustering Analysis

Hierarchical Clustering

Types of results available for this type of analysis

slide-50
SLIDE 50

WebSpecmine: Data Analysis

Clustering Analysis

K-Means Clustering

Analysis options

slide-51
SLIDE 51

WebSpecmine: Data Analysis

Clustering Analysis

K-Means Clustering

Types of results available for this type of analysis
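For readers unfamiliar with k-means, the sketch below shows the two alternating steps of Lloyd's algorithm (assign each sample to its nearest centroid, then move each centroid to the mean of its members). The starting centroids are supplied by the caller to keep the example deterministic; the points and parameter names are made up and do not reflect the website's options.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied starting centroids."""
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(point, centroids[k])))
            for point in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(list(old))  # keep an empty cluster in place
        centroids = new_centroids
    return labels, centroids

points = [[1.0, 1.0], [1.5, 2.0], [1.0, 0.5], [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]]
labels, centroids = kmeans(points, centroids=[[0.0, 0.0], [10.0, 10.0]])
print(labels)
print(centroids)
```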

slide-52
SLIDE 52

WebSpecmine: Data Analysis

Machine Learning

Train Models

Analysis options

slide-53
SLIDE 53

WebSpecmine: Data Analysis

Machine Learning

Train Models

Analysis options

Available models:

  • PLS
  • Decision Tree
  • Rule-Based Classifier
  • SVMs with Linear Kernel
  • Random Forests
  • Linear Discriminant Analysis
  • Neural Networks
slide-54
SLIDE 54

WebSpecmine: Data Analysis

Machine Learning

Train Models

Types of results available for this type of analysis

slide-55
SLIDE 55

WebSpecmine: Data Analysis

Machine Learning

Predict New Samples

Analysis options

slide-56
SLIDE 56

WebSpecmine: Data Analysis

Machine Learning

Predict New Samples

Types of results available for this type of analysis

slide-57
SLIDE 57

WebSpecmine: Data Analysis

Feature Selection

Analysis options

slide-58
SLIDE 58

WebSpecmine: Data Analysis

Feature Selection

Types of results available for this type of analysis

slide-59
SLIDE 59

WebSpecmine: Data Analysis

Regression Analysis

Regression Analysis

Analysis options

slide-60
SLIDE 60

WebSpecmine: Data Analysis

Regression Analysis

Regression Analysis

Types of results available for this type of analysis: numerical results

slide-61
SLIDE 61

WebSpecmine: Data Analysis

Regression Analysis

Regression Analysis

Types of results available for this type of analysis: plot results

slide-62
SLIDE 62

WebSpecmine: Data Analysis

Regression Analysis

Correlation Analysis

Analysis options

slide-63
SLIDE 63

WebSpecmine: Data Analysis

Regression Analysis

Correlation Analysis

Types of results available for this type of analysis: numerical results
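The numerical result of a correlation analysis is typically a matrix of pairwise coefficients. A minimal sketch, assuming Pearson correlation between variables (the slides do not say which coefficients the website offers) and using made-up values:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(variables):
    """Nested dictionary with the correlation of every pair of variables."""
    names = list(variables)
    return {a: {b: pearson(variables[a], variables[b]) for b in names}
            for a in names}

# Hypothetical variables measured in four samples.
variables = {
    "met_a": [1.0, 2.0, 3.0, 4.0],
    "met_b": [2.0, 4.0, 6.0, 8.0],
    "met_c": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(variables)
print(matrix["met_a"]["met_b"], matrix["met_a"]["met_c"])
```

A variable with zero variance would cause a division by zero here, so constant variables should be filtered out first.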

slide-64
SLIDE 64

WebSpecmine: Data Analysis

Regression Analysis

Correlation Analysis

Types of results available for this type of analysis: plot results

slide-65
SLIDE 65

WebSpecmine: Data Analysis

Metabolite Identification

LC-MS Data

Analysis options

slide-66
SLIDE 66

WebSpecmine: Data Analysis

Metabolite Identification

LC-MS Data

Results available for this type of analysis
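The slides do not describe how LC-MS peaks are matched to metabolites, so the sketch below only illustrates one common generic idea: comparing an observed m/z against theoretical [M+H]+ values within a ppm tolerance. The small reference table, the single adduct and the 10 ppm default are assumptions for this example, not the website's database or settings.

```python
PROTON_MASS = 1.007276  # Da, added to the neutral mass for the [M+H]+ adduct

# Hypothetical reference table: monoisotopic neutral masses in Da.
REFERENCE_MASSES = {
    "glucose": 180.06339,
    "alanine": 89.04768,
    "citric acid": 192.02700,
}

def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def annotate_peak(mz, tolerance_ppm=10.0):
    """Candidate metabolites whose [M+H]+ m/z lies within the tolerance."""
    hits = []
    for name, mass in REFERENCE_MASSES.items():
        error = ppm_error(mz, mass + PROTON_MASS)
        if abs(error) <= tolerance_ppm:
            hits.append((name, round(error, 2)))
    return hits

print(annotate_peak(181.0707))  # close to the [M+H]+ of glucose
print(annotate_peak(150.0))     # no candidate in this small table
```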

slide-67
SLIDE 67

WebSpecmine: Data Analysis

Metabolite Identification

NMR Data

Analysis options

slide-68
SLIDE 68

WebSpecmine: Data Analysis

Metabolite Identification

NMR Data

Results available for this type of analysis

slide-69
SLIDE 69

WebSpecmine: Data Analysis

Pathway Analysis

Analysis options

slide-70
SLIDE 70

WebSpecmine: Data Analysis

Pathway Analysis

Results available for this type of analysis

slide-71
SLIDE 71

WebSpecmine: Analysis of MetaboLights Studies

You can see information on some of the MetaboLights studies

slide-72
SLIDE 72

WebSpecmine: Analysis of MetaboLights Studies

You can see detailed information on the protocol and metadata of each assay

slide-73
SLIDE 73

WebSpecmine: Analysis of MetaboLights Studies

You can download the data and metadata of an assay into your private account and analyse it

slide-74
SLIDE 74

Conclusions

✓ We were able to create an easy-to-use and freely available website with many advantages:

▪ Wide variety of techniques and data formats supported
▪ Wide variety of pre-processing methods
▪ Wide variety of analysis methods
▪ User Account
▪ Flexible Pipeline

✓ However, more analyses could be added to give the data more biological meaning, such as:

▪ Enrichment Analysis
▪ Biomarker Analysis

slide-75
SLIDE 75

For More Detailed Information ...

Website Link: https://webspecmine.bio.di.uminho.pt/

We have Tutorials and a complete User Guide on the Help page.

We have a troubleshooting window, where users can report any problems and see problems that have already been reported but are still being solved.

slide-76
SLIDE 76

Acknowledgments

Christopher Costa, for being the main developer of the specmine R package