PSIchomics Shiny application for the integrated analysis of alternative splicing from large transcriptomic datasets
Nuno Morais laboratory Nuno Agostinho 6 Dec. 2016 EuroBioC 2016
Introduction Workflow Case Study Testing Conclusions
Alternative Splicing
[Figure: a gene with Exon 1, Exon 2 and Exon 3]
Studying alternative splicing changes may help identify prognostic factors and therapeutic targets (Oltean & Bates, 2014)
1. Extract RNA
2. Divide into fragments
3. Convert to DNA
4. Sequence DNA
5. Obtain reads
6. Map reads to the reference DNA
[Figure: reads mapped to the reference DNA (Exon 1, Exon 2, Exon 3) are either exonic reads or junction reads]
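The difference between exonic and junction reads in the last step can be sketched in a few lines. This is only an illustration under stated assumptions: each aligned read is represented as a list of (start, end) blocks on the reference, a hypothetical format chosen for this example, and the function names are not taken from any particular aligner or from PSIchomics.

```python
from collections import Counter

def classify_read(blocks):
    """A read aligned as one block is exonic; a read split over
    two or more blocks spans at least one exon-exon junction."""
    return "junction" if len(blocks) > 1 else "exonic"

def junction_counts(reads):
    """Count the reads supporting each junction, identified by
    (end of upstream block, start of downstream block)."""
    counts = Counter()
    for blocks in reads:
        for (_, upstream_end), (downstream_start, _) in zip(blocks, blocks[1:]):
            counts[(upstream_end, downstream_start)] += 1
    return counts

# Toy example (hypothetical coordinates): exons at 100-200, 300-400 and 500-600
reads = [
    [(120, 170)],               # exonic read inside exon 1
    [(180, 200), (300, 330)],   # junction read: exon 1 to exon 2
    [(185, 200), (300, 335)],   # junction read: exon 1 to exon 2
    [(380, 400), (500, 530)],   # junction read: exon 2 to exon 3
    [(190, 200), (500, 540)],   # junction read: exon 1 to exon 3 (exon 2 skipped)
]
counts = junction_counts(reads)
```

The resulting junction read counts are the kind of input that the quantification step works from.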
Alternative splicing annotation Junction read counts
Percent Spliced-In (PSI) = inclusion reads / (inclusion reads + exclusion reads)
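As a minimal sketch of the PSI ratio, the function below divides inclusion reads by the total of inclusion and exclusion reads. The `min_reads` threshold is an assumption added for this example (a guard against ratios based on very few reads), not a value stated in the slides.

```python
def psi(inclusion_reads, exclusion_reads, min_reads=10):
    """Percent Spliced-In: inclusion / (inclusion + exclusion).

    Returns None when the total coverage is below min_reads
    (hypothetical threshold), since the ratio is then unreliable.
    """
    total = inclusion_reads + exclusion_reads
    if total < min_reads:
        return None
    return inclusion_reads / total

mostly_included = psi(82, 18)   # most reads support inclusion
mostly_excluded = psi(17, 83)   # most reads support exclusion
low_coverage = psi(3, 2)        # too few reads to quantify
```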
[Figure: distribution of PSI values for ACTN1 (exon 19) in two groups. Medians: 0.82 (variance: 0.05) and 0.17 (variance: 0.06). Mann–Whitney U test's p-value (FDR): 2.28e-07]
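The comparison summarised above (a Mann–Whitney U test between two groups of PSI values, followed by FDR adjustment) can be sketched with the standard library alone. Several choices here are assumptions: the p-value uses the normal approximation without tie or continuity correction, the FDR adjustment is assumed to be Benjamini-Hochberg, and the PSI values are toy numbers, not the ACTN1 data.

```python
import math

def mann_whitney_u(x, y):
    """U statistic of x against y and a two-sided p-value from the
    normal approximation (no tie or continuity correction)."""
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))

def benjamini_hochberg(pvalues):
    """FDR-adjusted p-values (Benjamini-Hochberg), in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, keeping adjusted values monotone
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position                       # 1-based ascending rank
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy PSI values for one splicing event in two sample groups
group_a = [0.80, 0.90, 0.85]
group_b = [0.10, 0.20, 0.15]
u_stat, p_value = mann_whitney_u(group_a, group_b)

# One p-value per splicing event would be adjusted together
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

For real analyses with small groups or many ties, an exact or tie-corrected test from a statistics library would be the safer choice.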
splicing data
MISO, rMATS, VAST-TOOLS, SUPPA, jSplice, AltAnalyze, SpliceSeq, TIN, SGSeq, splicegear, JunctionSeq, DRIMSeq, SeqGSEA, spliceR, DEXSeq, Cufflinks, FineSplice, JuncBASE, Splicing Compass
No user-friendly interfaces in most tools
Time-consuming quantification of alternative splicing
No incorporation of clinical information
Over-simplistic analyses or focus on the quantification step
Quantify, analyse and visualise alternative splicing in cancer data
Incorporate clinical information
Modular architecture to easily modify and extend the program
Visual and command-line interfaces
computation
and JavaScript
reactivity)
Alternative splicing annotation
available
Data Retrieval, Quantification, Analyses and Visualisation
Data Retrieval: clinical data and junction read counts from The Cancer Genome Atlas (data from human tumours), through the Firebrowse web API
Quantification
Retrieve TCGA data (optional)
Quantify alternative splicing from junction read counts and annotation (provided or prepared by user)
Percent Spliced-In (PSI) = inclusion reads / (inclusion reads + exclusion reads)
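To illustrate how junction read counts and an annotation combine in this step, here is a sketch for a skipped exon event across several samples. The data layout, the junction IDs and the `min_reads` threshold are hypothetical, and averaging the two inclusion junctions is an assumption about how inclusion reads are derived, not something stated in the slides.

```python
def quantify_skipped_exon(event, junction_counts, min_reads=10):
    """PSI per sample for one skipped exon event.

    event: annotation entry naming the two inclusion junctions
           ('c1_a', 'a_c2') and the exclusion junction ('c1_c2').
    junction_counts: junction ID -> list of read counts, one per sample.
    """
    psi_values = []
    for c1_a, a_c2, c1_c2 in zip(junction_counts[event["c1_a"]],
                                 junction_counts[event["a_c2"]],
                                 junction_counts[event["c1_c2"]]):
        inclusion = (c1_a + a_c2) / 2     # assumption: mean of both inclusion junctions
        total = inclusion + c1_c2
        # Samples with too little coverage are left unquantified
        psi_values.append(inclusion / total if total >= min_reads else None)
    return psi_values

# Hypothetical annotation entry and junction read counts for three samples
event = {"c1_a": "chr1:200-300", "a_c2": "chr1:400-500", "c1_c2": "chr1:200-500"}
junction_counts = {
    "chr1:200-300": [40, 5, 2],
    "chr1:400-500": [44, 7, 0],
    "chr1:200-500": [8, 54, 1],
}
psi_per_sample = quantify_skipped_exon(event, junction_counts)
```

Here the first sample mostly includes the exon, the second mostly skips it, and the third has too few reads to be quantified.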
Analyses and Visualisation
Differential splicing analysis
Gene, RNA and protein information
Principal component analysis
Survival analysis
Survival curves Density plots
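Survival curves of this kind are commonly drawn with the Kaplan-Meier estimator. The sketch below is a plain-Python version offered as an assumption about the method (the slides do not name the estimator), with toy follow-up times rather than clinical data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability),
    one entry per time point with at least one observed event.

    events[i] is True if the event (e.g. death) was observed at times[i]
    and False if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Toy follow-up times (e.g. months) and event indicators
curve = kaplan_meier([5, 8, 8, 12, 15], [True, True, False, True, False])
```

Computing one curve per patient group, for instance groups separated by a PSI cut-off, gives curves that can then be compared.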
PCA: dimensionality reduction by selecting the main directions of variance
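As a rough sketch of what "selecting the main directions of variance" means, the code below finds the first principal component of a small samples-by-events PSI table using power iteration on the covariance matrix. It is a teaching example with toy values; power iteration is a swapped-in technique chosen to avoid dependencies, and a numerical library would normally be used instead.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, explained variance ratio, scores) for the first
    principal component. rows: one list of values per sample.
    Assumes the data have non-zero variance and that the starting vector
    is not orthogonal to the main direction."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0 / (j + 1) for j in range(d)]         # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v, relative to the total variance
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
    total_variance = sum(cov[a][a] for a in range(d))
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centred]
    return v, eigenvalue / total_variance, scores

# Toy PSI table: 4 samples (rows) by 2 splicing events (columns)
psi_table = [[0.1, 0.2], [0.2, 0.4], [0.3, 0.6], [0.4, 0.8]]
direction, explained, scores = first_principal_component(psi_table)
```

In this toy table the second event always changes twice as much as the first, so a single direction captures all of the variance.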
Performance Benchmarking, Continuous and Unit Testing, Usability Testing
[Figure: running time (with default settings) of three steps, load data, quantify AS (skipped exon) and differential analyses (Normal vs Tumour), for four cohorts: breast cancer (1093 patients), pan-kidney cohort (889 patients), glioma cohort (676 patients) and liver cancer (371 patients). Time axis from 30s to 6m 30s. Bar labels as extracted: 2m 20s, 2m 37s, 2m 34s, 2m 35s; 35s, 1m 20s, 2m 2s, 2m 39s; 16s, 33s, 22s, 47s]
MacBook Pro 2011: i7 (8 cores), HDD and 8GB RAM
6 minutes using processed data with the highest number of patients in TCGA
Quantify, analyse and visualise alternative splicing in cancer data
Incorporates clinical information
Modular architecture to easily modify and extend the program
Command-line and easy-to-use graphical interface
GitHub: code hosting
Bioconductor: biological R packages
(MIT license)
Nuno Morais Lab, André Falcão, Ana Rita Grosso