Loredana Martignetti, Laurence Calzone, Eric Bonnet, Emmanuel - PowerPoint PPT Presentation

  1. Loredana Martignetti, Laurence Calzone, Eric Bonnet, Emmanuel Barillot and Andrei Zinovyev. ROMA: representation and quantification of module activity from target expression data (Highlight talk). Institut Curie - INSERM U900 - PSL Research University / Mines ParisTech, Computational Systems Biology of Cancer. Genome Informatics Workshop (GIW 2016), Shanghai, 5 October 2016

  2. Outline. ROMA: Representation and Quantification of Module Activity • Why ROMA? • Principles of ROMA • Examples of application – Involvement of Notch and P53 pathways in colon cancer invasiveness – Integrating transcriptome data into a mathematical model of metastasis network – Analysing expression time series of oncogenesis in Ewing's sarcoma – Transcriptional signature of p53 in bladder carcinoma

  3. Quantifying Network Module Activity • Genome • Epigenome • Transcriptome • Proteome

  4. In cancer the same biological process can be affected by alterations in different individual genes

  5. Reasoning in terms of active/inactive gene-sets rather than single differentially expressed genes. Gene set = target genes co-regulated by the same TFs, or gene set = genes involved in a common signalling pathway. Question: which gene sets (modules) are playing a role in my set of samples?

  6. Quantification of gene-set activity by PCA (Principal Component Analysis). The uni-factor linear model of gene expression regulation: a factor F acts on the genes of a gene set (Gene 1, Gene 2, Gene 3, ..., Gene n) with response coefficients $\alpha_1, \alpha_2, \alpha_3, \dots, \alpha_n$, so that $\mathrm{Expr}(\text{gene } g, \text{sample } s) \sim \alpha_g^{(F)} A_s^{(F)}$, where $\alpha_g^{(F)}$ = response coefficient of gene g to factor F and $A_s^{(F)}$ = activity of factor F in sample s. The values $\alpha_g^{(F)}$ and $A_s^{(F)}$ are given by PC1 (the first principal component) of the gene set: $\min \sum_{g,s} \left( \mathrm{Expr}(\text{gene } g, \text{sample } s) - \alpha_g^{(F)} A_s^{(F)} \right)^2$
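
The least-squares fit on this slide can be sketched with a singular value decomposition, since the leading singular vectors give the best rank-one approximation of a matrix. This is a minimal sketch, not the ROMA implementation: the function name `rank_one_fit`, the toy data, and the choice to center each gene across samples before the fit are all assumptions made for illustration.

```python
import numpy as np

def rank_one_fit(expr):
    """Fit Expr(g, s) ~ alpha_g * A_s for a genes x samples matrix.

    Assumption: each gene is centered across samples before the fit.
    Returns (alpha, activity): one coefficient per gene, one activity per sample.
    """
    centered = expr - expr.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    alpha = u[:, 0]              # response coefficient of each gene (unit norm)
    activity = s[0] * vt[0]      # activity of the factor in each sample
    return alpha, activity

# Toy gene set: 4 genes driven by one hidden factor, plus a little noise.
rng = np.random.default_rng(0)
true_alpha = np.array([1.0, 0.8, -0.5, 1.2])
true_activity = rng.normal(size=20)
expr = np.outer(true_alpha, true_activity) + 0.05 * rng.normal(size=(4, 20))

alpha, activity = rank_one_fit(expr)
```

The sign of `alpha` and `activity` is arbitrary (flipping both leaves the product unchanged), and the scale is split between them by convention; here `alpha` has unit norm.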

  7. Identification of active/inactive gene-sets by PCA. L1 = amount of variance captured by PC1. Overdispersion (high L1) reflects activation (differential across the samples). ROMA tests whether the PC1 variance L1 of a gene-set significantly exceeds the genome-wide background expectation (overdispersion). Fan J et al, Nature Methods 2016; Tomfohr et al, BMC Bioinformatics 2005

  8. ROMA features: assessing the statistical significance of gene-set activity (overdispersion). The L1 distribution strongly depends on the size of the gene set (figure: PC1/PC2 plots for a small and a large gene set). Statistical significance of L1 is assessed by estimating the null distribution from random sets of genes having representative sizes.
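
The resampling idea on this slide can be sketched as follows: draw random gene sets of the same size as the set under test, compute L1 for each, and report how often the random value reaches the observed one. This is an illustrative sketch only. Treating L1 as the fraction of total variance captured by PC1, the number of random sets (200), the add-one p-value correction, and all function names are assumptions, not details taken from the slides.

```python
import numpy as np

def l1_fraction(expr):
    """Fraction of variance captured by PC1 for a genes x samples matrix."""
    centered = expr - expr.mean(axis=1, keepdims=True)
    var = np.linalg.svd(centered, compute_uv=False) ** 2
    return var[0] / var.sum()

def null_l1(expr_all, size, n_random, rng):
    """L1 values for random gene sets of a given size."""
    n_genes = expr_all.shape[0]
    return np.array([
        l1_fraction(expr_all[rng.choice(n_genes, size=size, replace=False)])
        for _ in range(n_random)
    ])

def l1_pvalue(expr_all, gene_rows, n_random=200, seed=0):
    """Empirical p-value of the observed L1 against same-size random sets."""
    rng = np.random.default_rng(seed)
    observed = l1_fraction(expr_all[gene_rows])
    null = null_l1(expr_all, len(gene_rows), n_random, rng)
    p = (1 + np.sum(null >= observed)) / (1 + n_random)
    return observed, p

# Toy data: 300 genes x 30 samples of noise; the first 15 genes share a factor.
rng = np.random.default_rng(1)
expr_all = rng.normal(size=(300, 30))
expr_all[:15] += 2.0 * rng.normal(size=30)
observed, p = l1_pvalue(expr_all, np.arange(15))
```

Matching the size of the random sets to the tested set matters because, in pure noise, a small set tends to give a larger PC1 share than a large one.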

  9. Identification of coordinated gene-sets by PCA. L1 = amount of variance captured by PC1; L2 = amount of variance captured by PC2. A high L1/L2 reflects coordination across the samples. Whether the spectral gap L1/L2 of a gene-set significantly exceeds the expectation (coordination) is assessed by estimating the null distribution from random sets of genes having representative sizes.

  10. Significances of overdispersion and coordination might differ. (Figure: 2×2 grid of PC1/PC2 plots, overdispersed vs not overdispersed by coordinated vs not coordinated, with L1 = amount of variance captured by PC1 and L2 = amount of variance captured by PC2.) E.g. E2F1 targets are usually overdispersed but not coordinated, contrary to E2F3.
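
To illustrate why L1 and the spectral gap L1/L2 can tell different stories, the sketch below compares a toy gene set driven by one factor with a toy set split between two independent factors. Everything here is an assumption for illustration (the data, the helper name `l1_and_gap`, L1 taken as a variance fraction); in practice both statistics would be judged against random gene sets of the same size, as the slides describe.

```python
import numpy as np

def l1_and_gap(expr):
    """Return (L1 as variance fraction of PC1, spectral gap L1/L2)."""
    centered = expr - expr.mean(axis=1, keepdims=True)
    var = np.linalg.svd(centered, compute_uv=False) ** 2
    return var[0] / var.sum(), var[0] / var[1]

rng = np.random.default_rng(3)
n_samples = 40
f1 = rng.normal(size=n_samples)   # first hidden factor
f2 = rng.normal(size=n_samples)   # second, independent hidden factor

# 20 genes all following f1: one dominant direction.
one_factor = np.outer(np.ones(20), f1) + 0.3 * rng.normal(size=(20, n_samples))

# 10 genes following f1 and 10 following f2: two comparable directions.
two_factor = np.vstack([
    np.outer(np.ones(10), f1),
    np.outer(np.ones(10), f2),
]) + 0.3 * rng.normal(size=(20, n_samples))

l1_one, gap_one = l1_and_gap(one_factor)
l1_two, gap_two = l1_and_gap(two_factor)
```

In the two-factor set PC1 still captures a sizeable share of the variance, but PC2 captures a comparable share, so the gap stays close to 1.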

  11. The ROMA algorithm. Input: expression data matrix; gene sets (gmt format). Steps: extract expression submatrices; compute PCA-based module activities; assess statistical significance of overdispersion and coordination. Output: module activity score in each condition; gene projections on PC1; overdispersion estimation (L1 + p-value); coordinated expression estimation (L1/L2 + p-value). https://github.com/sysbio-curie/Roma Martignetti et al, Front Genet. 2016
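
The first steps of this pipeline can be sketched in a few lines. ROMA itself is distributed as Java software, so the Python below is not its API: `parse_gmt`, `module_activities`, the `min_genes` cut-off and the toy data are hypothetical names and values. The only outside fact relied on is the usual GMT layout, one gene set per line as tab-separated `name`, `description`, then gene identifiers.

```python
import numpy as np

def parse_gmt(text):
    """Parse GMT text into {set_name: [gene, ...]}."""
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        gene_sets[fields[0]] = [g for g in fields[2:] if g]
    return gene_sets

def module_activities(expr, gene_names, gene_sets, min_genes=3):
    """Extract each gene set's submatrix and summarise it by PCA."""
    row_of = {g: i for i, g in enumerate(gene_names)}
    result = {}
    for name, genes in gene_sets.items():
        rows = [row_of[g] for g in genes if g in row_of]
        if len(rows) < min_genes:
            continue  # too few measured genes to summarise
        sub = expr[rows]
        centered = sub - sub.mean(axis=1, keepdims=True)
        u, s, vt = np.linalg.svd(centered, full_matrices=False)
        var = s ** 2
        result[name] = {
            "genes": [gene_names[i] for i in rows],
            "gene_projections": u[:, 0],   # one value per gene
            "activity": s[0] * vt[0],      # one score per sample
            "L1": var[0] / var.sum(),
            "L1_over_L2": var[0] / var[1],
        }
    return result

# Toy input: 6 genes x 8 samples and two small gene sets.
gmt_text = "SET_A\ttoy set\tG1\tG2\tG3\tGX\nSET_B\ttoy set\tG4\tG5\n"
gene_names = ["G1", "G2", "G3", "G4", "G5", "G6"]
expr = np.random.default_rng(4).normal(size=(6, 8))

gene_sets = parse_gmt(gmt_text)
modules = module_activities(expr, gene_names, gene_sets)
```

The significance step is left out here; it would compare `L1` and `L1_over_L2` with values from random gene sets of matching size.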

  12. ROMA features: different patterns of overdispersion. Symmetric overdispersion vs displacement with respect to the data center. An example of an active pathway in single-cell transcriptomics, computed using PCA with fixed center.

  13. Modification of PCA: PCA with fixed center. (Figure: PC1 of standard PCA compared with PC1 with fixed center.)
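
One way to read "PCA with fixed center" is that the first component is forced through a chosen point instead of through the mean of the point cloud, so a cloud displaced from that point registers as a strong first component. The sketch below follows that reading. It is an assumption-laden illustration: the slides do not say which axis of the expression matrix forms the points or which center is used, and the function name and toy cloud are hypothetical.

```python
import numpy as np

def pc1_share(points, center=None):
    """Share of squared deviation captured by PC1 for an n_points x n_dims array.

    center=None : standard PCA, deviations taken from the cloud mean.
    center=array: fixed-center variant, deviations taken from that point.
    """
    c = points.mean(axis=0) if center is None else np.asarray(center, dtype=float)
    shifted = points - c
    sq = np.linalg.svd(shifted, compute_uv=False) ** 2
    return sq[0] / sq.sum()

# Isotropic cloud of 50 points in 5 dimensions, displaced along the first axis.
rng = np.random.default_rng(2)
cloud = rng.normal(size=(50, 5)) + np.array([4.0, 0.0, 0.0, 0.0, 0.0])

standard = pc1_share(cloud)                     # ignores the displacement
fixed = pc1_share(cloud, center=np.zeros(5))    # sees the displacement
```

Standard PCA finds no preferred direction in this cloud, while the fixed-center variant picks up the displacement from the origin as its first component.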

  14. ROMA uses weighted genes to include a priori knowledge. In ROMA, weights $w_g$ per gene can be assigned by the user, e.g.: • Positive weights for “positively regulated genes” and negative for “inhibited genes” • Weights reflecting confidence in gene expression quantification (e.g. dropout problem in single-cell data, Fan et al 2016) • Weights reflecting transcription factor binding strength. Weighted PCA is then computed [equation not preserved in this transcript]. These weights are also used for orienting the principal component and defining activated/repressed samples and activating/repressing genes. (Figure legend: green, $w_g > 0$; red, $w_g < 0$; blue, unknown.)
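
Because the sign of a principal component is arbitrary, prior knowledge about gene direction is needed to say which samples are "activated". The sketch below shows one plausible orientation rule: flip the component if the gene coefficients disagree in sign with the user weights. This rule, the function name `orient_component` and the toy numbers are assumptions for illustration; the slide does not spell out the rule ROMA applies, and the weighted PCA formula itself is not reproduced here.

```python
import numpy as np

def orient_component(alpha, activity, weights):
    """Choose the sign of (alpha, activity) that agrees with prior weights.

    weights: > 0 for genes expected to be activated, < 0 for inhibited genes,
             0 where the direction is unknown.
    """
    alpha = np.asarray(alpha, dtype=float)
    activity = np.asarray(activity, dtype=float)
    weights = np.asarray(weights, dtype=float)
    agreement = np.sum(weights * alpha)
    if agreement < 0:
        # Flipping both keeps the product alpha_g * A_s unchanged.
        return -alpha, -activity
    return alpha, activity

alpha = np.array([-0.5, -0.5, 0.5, 0.1])
activity = np.array([1.0, -2.0, 0.5])
weights = np.array([1.0, 1.0, -1.0, 0.0])

oriented_alpha, oriented_activity = orient_component(alpha, activity, weights)
```

After orientation, a positive activity score means the genes with positive weights tend to be up in that sample.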

  15. ROMA computes robust PCA. Outlier genes abnormally affecting PC1 are identified by a “leave one out” procedure and removed from the gene-set (figure: the L1 estimate is affected by one single gene). The procedure: → compute L1 n times (n = gene set size), removing one gene from the gene set each time → outliers are identified as those genes that dramatically increase L1.
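
The leave-one-out procedure can be sketched as below: recompute L1 with each gene removed in turn and flag genes whose removal shifts L1 far from the other leave-one-out values. The z-score criterion and its threshold of 2 are assumptions chosen for this sketch, as is treating L1 as a variance fraction; the slide only says that outliers are the genes that dramatically change L1.

```python
import numpy as np

def l1_fraction(expr):
    """Fraction of variance captured by PC1 for a genes x samples matrix."""
    centered = expr - expr.mean(axis=1, keepdims=True)
    var = np.linalg.svd(centered, compute_uv=False) ** 2
    return var[0] / var.sum()

def leave_one_out_outliers(expr, z_threshold=2.0):
    """Return (indices of outlier genes, leave-one-out L1 values)."""
    n_genes = expr.shape[0]
    # L1 recomputed n times, each time without one gene.
    loo = np.array([
        l1_fraction(np.delete(expr, g, axis=0)) for g in range(n_genes)
    ])
    z = (loo - loo.mean()) / loo.std()
    return np.flatnonzero(np.abs(z) > z_threshold), loo

# Toy gene set: 12 noise genes, one of which has a hugely inflated variance.
rng = np.random.default_rng(5)
expr = rng.normal(size=(12, 30))
expr[7] *= 10.0

outliers, loo = leave_one_out_outliers(expr)
```

Here gene 7 alone drives PC1, so L1 collapses when it is left out and barely moves when any other gene is removed.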

  16. Application 1: Involvement of Notch and P53 pathways in colon cancer invasiveness

  17. Using ROMA to understand control of Epithelial-Mesenchymal Transition (EMT) • Network modelling of Notch, p53 and Wnt pathways and their control of EMT phenotype • Prediction: p53 inhibition + Notch intracellular domain activation are required for EMT and metastasis • Experimental validation in mouse • What about human tumors? Chanrion*, Kuperstein* et al, Nat. Comm 2014

  18. Single NOTCH signalling genes do not show differential expression between invasive and non-invasive tumours. - Two groups: metastatic patients (M1) / non-metastatic patients (M0) - Prediction: Notch activation is associated with EMT and tumour invasiveness. TCGA data (colorectal cancer)

  19. Module activity analysis on TCGA colon cancer data. Module activity matrix (rows: modules; columns: samples with their metastasis status):

                        S1 (M1)   S2 (M1)   S3 (M0)   ...
       Notch pathway    a11       a12       a13       ...
       Wnt pathway      a21       a22       a23       ...
       P53 pathway      a31       a32       a33       ...

     Differential module activity analysis gives statistically significant signals.
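
A differential analysis on one row of such a matrix can be sketched with a permutation test on the difference of group means between M1 and M0 samples. The slide does not name the test that was used, so the permutation test, the function name `permutation_pvalue`, the number of permutations and the simulated activities are all assumptions for illustration; no TCGA values are reproduced here.

```python
import numpy as np

def permutation_pvalue(activity, is_m1, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the M1 vs M0 difference in means."""
    rng = np.random.default_rng(seed)
    activity = np.asarray(activity, dtype=float)
    is_m1 = np.asarray(is_m1, dtype=bool)
    observed = abs(activity[is_m1].mean() - activity[~is_m1].mean())
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(is_m1)   # shuffle the group labels
        diff = abs(activity[perm].mean() - activity[~perm].mean())
        count += int(diff >= observed)
    return (1 + count) / (1 + n_perm)

# Simulated module activity: 20 M1 samples shifted upward, 20 M0 samples.
rng = np.random.default_rng(6)
is_m1 = np.array([True] * 20 + [False] * 20)
activity = rng.normal(size=40) + 2.0 * is_m1

p = permutation_pvalue(activity, is_m1)
```

When several modules are tested at once, the resulting p-values would normally be corrected for multiple testing.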

  20. Application 2: Consistency between transcriptome data and a mathematical model of metastasis network

  21. Master model of metastasis (boolean framework) Cohen D et al, PLoS Comp Bio 2015

  22. Pathways included in the module analysis. Module activity is a valuable input of network simulation for one sample.

  23. Application 3: EWS/FLI1 target time series expression in Ewing's sarcoma

  24. EWS/FLI1 transcription factor in Ewing's sarcoma. A malignant genomic translocation produces the chimeric gene EWS/FLI-1, whose activity leads to uncontrolled cell growth. (Figure: number of cells (×10^6) at days D10, D13, D17 and D21, with EWS/FLI1 expressed vs EWS/FLI1 silenced.)

  25. Putative mechanisms of EWS/FLI1 mediated gene regulation. Time series of the transcriptome 1 to 21 days after chimeric oncogene activation shows EWS-FLI1 target regulation: positively and negatively regulated targets.

  26. Activity score of EWS-FLI1 downstream targets

       MODULE                    L1     L1_pv   L1/L2   L1/L2_pv   NUMBER_OF_GENES
       EWS/FLI_Down_signature    0.57   0.005   3.02    0.142      280
       EWS/FLI_Up_signature      0.47   0.087   2.88    0.12       492
       EWS/FLI_All_signature     0.52   0.001   2.96    0.06       769

     One single test combining down- and up-regulated genes. Outperforms other approaches where down and up sets cannot be combined.

  27. Application 4: Transcriptional signature of p53 in bladder cancer

  28. P53 activity vs P53 mutation status in bladder cancer. PCA analysis of the p53 signature (49 genes) from MSigDB. (Figure: PC1/PC2 plot of P53-mutated, P53 non-mutated and unknown tumors; table of sample counts by P53 mutation and P53 activity (- / +) for low-grade and high-grade tumors, annotated "non-functional p53 (various mechanisms)" and "functional p53 (non-effective mutations)".) Data from F. Radvanyi et al, I. Curie (198 tumors). P53 activity predicts tumor progression much better than P53 mutation status.

  29. Take-home message • ROMA: sample-unsupervised method for quantifying gene module activity • Characterizes each module by overdispersion, coordination, activity/sample • Can quantify transcription factor activity, protein activity … • Uses transcriptome or proteome data • Application to single-cell data • Much more sensitive than usual single-gene approaches • Available as Java software at github.com/sysbio-curie/Roma • See also Martignetti et al, 2016: Loredana Martignetti, Laurence Calzone, Eric Bonnet, Emmanuel Barillot, Andrei Zinovyev (2016) ROMA: Representation and Quantification of Module Activity from Target Expression Data. Frontiers in Genetics
