Inferring Cancer Subnetwork Markers using Density-Constrained Biclustering

  1. Introduction Methods Experimental Results
     Inferring Cancer Subnetwork Markers using Density-Constrained Biclustering
     Phuong Dao ∗,1, Recep Colak ∗,3, Raheleh Salari 1, Flavia Moser 4, Elai Davicioni 5, Alexander Schönhuth †,2, Martin Ester †,1
     1 School of Computing Science, Simon Fraser University, Canada
     2 Centrum Wiskunde & Informatica, Amsterdam, Netherlands
     3 Department of Computing Science, University of Toronto, Canada
     4 Center for Disease Control, University of British Columbia
     5 GenomeDX Biosciences Inc.
     ∗: Joint first authors, †: Joint corresponding, last authors

  2. Introduction: Personalized Medicine
     • Determination of disease status based on patient genetics/genomics
     • Goal: Specific, individual choice of treatment
     • Necessary: Reliable disease markers

  3. Introduction: Personalized Medicine
     • Determination of disease status based on patient genetics/genomics
     • Goal: Specific, individual choice of treatment
     • Necessary: Reliable disease markers
       • Monogenic: Each marker is a single gene
       • Multigenic: Each marker is a set of genes

  4. Single Gene Markers
     [Figure: expression heatmaps of Genes 1 to 6 across Controls 1 to 3 and Cases 1 to 3, with the genes grouped into "Differentially Expressed" and "Non-Differentially Expressed".]
     Caveat: Single gene markers vary significantly across different studies

  5. Marker Selection: Multigenic Traits
     [Figure: gene expression profiles of Genes 1 to 4 across Controls 1 to 3 and Cases 1 to 3, shown next to an interaction/association network over G1 to G4 with edge weights between 0.75 and 0.95.]
     Solution: Differentially expressed genes participating in the same pathway [Chuang et al., 2007], [Chowdhury et al., 2010]

  6. Our Approach
     Each of our subnetwork markers:
     • is a densely connected subnetwork
       ☞ Disease-related genes have more PPI interactions than expected [Goh et al., PNAS (2007)]
     • contains genes which are differentially expressed in a subset of samples
       ☞ Cancer tumors vary greatly in phenotype, although belonging to the same (sub)type [Hampton et al., GR (2009)]

  7. Density-Constrained Biclusters
     Definition: G is called α-dense if $\frac{\sum_{e \in E} w_e}{\binom{|V|}{2}} \geq \alpha \geq 0.5$.
     [Figure: a PPI network over genes G1 to G7 with confidence-weighted edges, shown next to two binary gene-by-sample matrices (genes G1 to G4 and genes G4 to G7, each over samples S1 to S3).]
     Our markers are α-densely connected subnetworks of genes that are differentially expressed in a subset of patients of size at least k (here: k = 2).
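The sample-side condition on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that "differentially expressed in a subset of patients" means every gene of the subnetwork is flagged (1) in each of those samples. The function names `supporting_samples` and `satisfies_sample_constraint` are hypothetical, and the 0/1 matrix is a toy reading of the slide's figure.

```python
# Toy binary matrix: 1 = gene is differentially expressed in that sample.
# Values are an illustrative reading of the slide's figure (assumption).
SAMPLES = ["S1", "S2", "S3"]
DIFF_EXPRESSED = {
    "G1": [1, 1, 0],
    "G2": [1, 1, 0],
    "G3": [1, 1, 1],
    "G4": [1, 1, 1],
    "G5": [0, 1, 1],
    "G6": [0, 1, 1],
    "G7": [0, 1, 1],
}


def supporting_samples(genes, matrix, samples):
    """Return the samples in which every gene of `genes` is flagged."""
    return [
        sample
        for idx, sample in enumerate(samples)
        if all(matrix[gene][idx] == 1 for gene in genes)
    ]


def satisfies_sample_constraint(genes, matrix, samples, k):
    """True if the genes share at least k supporting samples."""
    return len(supporting_samples(genes, matrix, samples)) >= k


marker_a = ["G1", "G2", "G3", "G4"]
marker_b = ["G4", "G5", "G6", "G7"]
support_a = supporting_samples(marker_a, DIFF_EXPRESSED, SAMPLES)
support_b = supporting_samples(marker_b, DIFF_EXPRESSED, SAMPLES)
print(support_a, support_b)
```

With k = 2, both toy gene sets pass the check, while a set such as G1 and G5 (which share only one sample) does not. The density and connectivity conditions on the network side are checked separately.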

  8. Methods

  9. Density Constrained Biclustering: Search Strategy
     Theorem: Every α-densely connected network of size n contains an α-densely connected subnetwork of size n − 1.
     [Figure: search lattice over the subnetworks of a four-node network (A, B, C, D) with edge weights 0.4, 0.6, 0.9, and 0.8; subnetworks are labeled "Not Dense", "Not Connected", "wDCB", or "maximal wDCB".]
     Density: 0.45 = (0.8 + 0.9 + 0.6 + 0.4) / 6
     Search Strategy: Breadth-first search.
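The theorem suggests a level-wise search: since every α-densely connected network of size n contains one of size n − 1, all of them can be reached by growing smaller ones one node at a time. The sketch below illustrates that idea under stated assumptions and is not the authors' wDCB code. The edge-to-node assignment of the four weights is hypothetical (the slide's figure topology did not survive extraction), the function names are made up for illustration, and the sample constraint (at least k patients) is left out.

```python
from itertools import combinations


def density(nodes, weights):
    """Sum of edge weights inside `nodes` divided by the number of possible edges."""
    n = len(nodes)
    total = sum(weights.get(frozenset(pair), 0.0) for pair in combinations(nodes, 2))
    return total / (n * (n - 1) / 2)


def is_connected(nodes, weights):
    """Depth-first check that `nodes` induce a connected subnetwork."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in nodes - seen:
            if frozenset((u, v)) in weights:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def dense_connected_subnetworks(weights, alpha):
    """Enumerate alpha-densely connected subnetworks level by level (size 2, 3, ...)."""
    all_nodes = set().union(*weights) if weights else set()
    # Size-2 seeds: the density of a single edge equals its weight.
    level = {edge for edge, w in weights.items() if w >= alpha}
    found = []
    while level:
        found.extend(level)
        next_level = set()
        for sub in level:
            for v in all_nodes - sub:
                candidate = sub | {v}  # stays a frozenset
                if density(candidate, weights) >= alpha and is_connected(candidate, weights):
                    next_level.add(candidate)
        level = next_level
    return found


def maximal(subnetworks):
    """Keep only subnetworks that are not strictly contained in another one."""
    return [s for s in subnetworks if not any(s < t for t in subnetworks)]


# Hypothetical topology for the slide's four weights (assumption).
weights = {
    frozenset("AB"): 0.6,
    frozenset("AC"): 0.4,
    frozenset("BC"): 0.9,
    frozenset("BD"): 0.8,
}
found = dense_connected_subnetworks(weights, alpha=0.5)
largest = maximal(found)
print(sorted("".join(sorted(s)) for s in largest))
```

In this toy network the full four-node graph has density 0.45 and is rejected at α = 0.5, matching the "Not Dense" case on the slide, while the search stops at three-node subnetworks.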

  10. Classification
     1. Marker computation: feature space creation (marker = dimension)
     2. Construct classifier using training data
     3. Perform classification on test data
     Cross-platform study: Marker computation and test data from different platforms

  11. Experimental Results

  12. Network Data
     Confidence-scored PPI network [STRING, von Mering et al., NAR 2009]
     • Edges reflect physical protein-protein interactions
     • Confidence scores reflect the probability that the interaction is associated with a cellular phenomenon (and not an experimental artifact)
     • Scoring system based on KEGG pathways
     [Figure: example PPI network with edge confidence scores between 0.25 and 0.95.]

  13. Gene Expression Data
     Colon cancer
     • GSE8671, 32 patients / tissue pairs
     • GSE10950, 24 patients / tissue pairs
     • GSE6988, 123 samples across several cancer subtypes
     Breast cancer
     • GSE3494, 251 patients with different TP53 mutation status (wildtype vs. mutant)

  14. Colon Cancer Prediction: GSE8671 >> GSE6988
     [Figure: AUC (axis from 0.6 to 1) versus number of subnetworks/genes (0 to 50) for SGM, GMI, NETCOVER, and wDCB.]

  15. Colon Cancer Prognosis: GSE8671 >> GSE6988
     [Figure: AUC (axis from 0.4 to 1) versus number of subnetworks/genes (0 to 50) for SGM, GMI, NETCOVER, and wDCB.]

  16. Colon Cancer: Prognosis Accuracy

           8671 → 6988, Prognosis        10950 → 6988, Prognosis
     K     SGM    GMI    NC     wDCB     SGM    GMI    NC     wDCB
     1     0.57   0.57   0.51   0.56     0.57   0.68   N/A    0.47
     5     0.74   0.62   0.74   0.6      0.63   0.81   N/A    0.68
     10    0.76   0.77   0.74   0.88     0.57   0.77   N/A    0.74
     20    0.72   0.62   0.77   0.83     0.61   0.79   N/A    0.85
     30    0.65   0.74   0.83   0.88     0.63   0.81   N/A    0.85
     40    0.67   0.79   0.83   0.90     0.78   0.85   N/A    0.89
     50    0.74   0.77   0.81   0.92     0.76   0.85   N/A    0.91

     Legend (highlighting not preserved): top values of previous methods; top value of our method

  17. Breast Cancer: TP53 Wildtype vs. Mutant, GSE3494 (Miller et al.)
     [Figure: accuracy (axis from 0.7 to 0.9) versus number of subnetworks/genes (0 to 25) for SGM (mappable), GMI (mappable), wDCB (mappable), and SPM (not mappable).]

  18. Subnetwork Marker Statistics

            8671 Subnetworks               10950 Subnetworks
            # Subnetworks   Enrichment     # Subnetworks   Enrichment
     GMI    806             0.38           755             0.34
     NC     923             0.12           N/A             N/A
     wDCB   282             0.76           216             0.74

     GMI = Greedy Mutual Information (Chuang et al.)
     NC = NetCover (Chowdhury et al.)
     wDCB = weighted Density Constrained Biclustering
     # Subnetworks = total number of subnetworks computed
     Enrichment = enrichment rate of the top-50 markers

  19. Top Markers in GSE8671
     • Enriched with DNA replication initiation (p=6.39e-14), DNA metabolic process (p=6.15e-12)
     • TP53, BRCA1: tumor suppressor genes
     • Minichromosome maintenance (MCM) complex
     • MCM2, MCM5: early markers for colon cancer (Burger et al., 2008)

  20. Outlook / Acknowledgments
     Outlook:
     • Analyze subnetwork signatures
     • ncRNA-protein interaction data
     Acknowledgments:
     • Mehmet Koyutürk
     • David DesJardins, Google Inc.
     • Lab for Mathematical and Computational Biology, UC Berkeley

  21. Thanks for the attention!

  22. Densely Connected Subnetworks: Properties
     Let G = (V, E) be a network with edge weights w_e, e ∈ E.
     • The density θ(G) of G is
       $\theta(G) := \frac{\sum_{e \in E} w_e}{\binom{|V|}{2}} = \frac{2 \cdot \sum_{e \in E} w_e}{|V|\,(|V| - 1)}$
       where $\binom{|V|}{2}$ is the number of possible edges in G.
     • G is called α-dense if θ(G) ≥ α ≥ 0.5
     • An α-dense, connected network G is called α-densely connected.
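The definitions on this slide translate directly into code. The following is a minimal sketch and not taken from the presentation: edges are stored in a dictionary keyed by `frozenset` node pairs, the function names are hypothetical, and the toy network assigns the four weights from the search-strategy slide (0.4, 0.6, 0.9, 0.8) to an assumed topology.

```python
from itertools import combinations


def density(nodes, weights):
    """theta(G) = 2 * (sum of edge weights) / (|V| * (|V| - 1))."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = sum(weights.get(frozenset(pair), 0.0) for pair in combinations(nodes, 2))
    return 2.0 * total / (n * (n - 1))


def is_connected(nodes, weights):
    """True if every node can be reached from any other using edges inside `nodes`."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in nodes - seen:
            if frozenset((u, v)) in weights:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def is_alpha_densely_connected(nodes, weights, alpha):
    """alpha-dense and connected, following the slide's two conditions."""
    return density(nodes, weights) >= alpha and is_connected(nodes, weights)


# Hypothetical four-node network (the edge placement is an assumption).
toy_weights = {
    frozenset("AB"): 0.6,
    frozenset("AC"): 0.4,
    frozenset("BC"): 0.9,
    frozenset("BD"): 0.8,
}
whole = density("ABCD", toy_weights)
triangle_ok = is_alpha_densely_connected("ABC", toy_weights, 0.5)
print(round(whole, 2), triangle_ok)
```

The whole toy network has four edges out of six possible, so its density is 2.7 / 6 = 0.45, below the minimum α of 0.5; the triangle A, B, C has density 1.9 / 3 and is connected, so it qualifies at α = 0.5.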

  23. Classifier Construction
     1. Rank density constrained biclusters according to density significance
     2. Keep only high-ranked subnetworks with little overlap
     3. Feature space dimension = number of markers
     4. SVM classification
     [Figure: two subnetwork markers in a weighted network over G1 to G7. A gene expression profile (Gene 1: 1.25, Gene 2: 1.5, Gene 3: 1.0, Gene 4: 1.25, Gene 5: 0.5, Gene 6: 0.0, Gene 7: 0.25) is averaged per marker into an average gene expression profile (Marker 1: 1.25, Marker 2: 0.5).]
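The feature construction in step 3 can be sketched as follows: each marker contributes one feature, the average expression of its genes in a given sample. This is an illustration under assumptions, not the authors' code. The expression values are those legible in the slide's figure, while the gene membership of the two markers (G1 to G4 and G4 to G7) is inferred from the figure and the name `marker_features` is hypothetical.

```python
def marker_features(profile, markers):
    """Map one sample's gene expression profile to one averaged value per marker."""
    features = []
    for genes in markers:
        values = [profile[gene] for gene in genes]
        features.append(sum(values) / len(values))
    return features


# One sample's gene expression profile, as read from the slide's figure.
profile = {
    "G1": 1.25, "G2": 1.5, "G3": 1.0, "G4": 1.25,
    "G5": 0.5, "G6": 0.0, "G7": 0.25,
}

# Assumed marker membership (inferred from the figure, not stated in the text).
markers = [
    ["G1", "G2", "G3", "G4"],  # Marker 1
    ["G4", "G5", "G6", "G7"],  # Marker 2
]

features = marker_features(profile, markers)
print(features)
```

Under these assumptions the sample is reduced from seven gene values to two marker values, which agree with the Marker 1 and Marker 2 entries in the figure. Stacking such vectors over all samples would give the matrix that an SVM (step 4) is trained on.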

  24. Colon Cancer: Prediction Accuracy

           8671 → 6988                   10950 → 6988
     K     SGM    GMI    NC     wDCB     SGM    GMI    NC     wDCB
     1     0.56   0.84   0.72   0.84     0.63   0.37   N/A    0.77
     5     0.73   0.72   0.72   0.82     0.82   0.68   N/A    0.86
     10    0.76   0.76   0.83   0.85     0.82   0.81   N/A    0.88
     20    0.80   0.84   0.86   0.89     0.84   0.83   N/A    0.89
     30    0.80   0.83   0.84   0.91     0.83   0.85   N/A    0.85
     40    0.85   0.85   0.87   0.90     0.84   0.84   N/A    0.89
     50    0.85   0.84   0.85   0.93     0.81   0.82   N/A    0.89

     Legend (highlighting not preserved): top values of previous methods; top value of our method
