
DeepGene: An Advanced Cancer Type Classifier Based on Deep Learning

DeepGene: An Advanced Cancer Type Classifier Based on Deep Learning and Somatic Point Mutations (Shi, Yi) 2016.10.03 Center for Systems Biomedicine Shanghai Jiao Tong University Outline Motivation Methods Results & Discussion


  1. DeepGene: An Advanced Cancer Type Classifier Based on Deep Learning and Somatic Point Mutations 石毅 (Shi, Yi) 2016.10.03 Center for Systems Biomedicine Shanghai Jiao Tong University

  2. Outline • Motivation • Methods • Results & Discussion

  3. Motivation

  4. Motivation Traditional cancer diagnosis • Morphological appearance, imaging techniques (image from radiology.med.nyu.edu) • Gene expression (image from well.ox.ac.uk) • Protein profiling (image from sigmaaldrich.com)

  5. Motivation Internal drivers • Somatic point mutations • Insertions and deletions (INDELs) • Chromatin translocations • Copy number abnormalities

  6. Motivation • Neural network (1940s) • Support vector machine (1960s) • Deep neural network (1980s) • Supervised learning combined with unsupervised learning

  7. Motivation Applications of deep neural network (DNN) learning

  8. Methods

  9. Methods Three steps of DeepGene • Step1. Clustered gene filtering (CGF) • Step2. Indexed sparsity reduction (ISR) • Step3. Deep neural network (DNN) classifier

  10. Methods Step1. Clustered gene filtering (CGF) • Intuitive idea: Team A vs. Team B

  11. Methods Step1. Clustered gene filtering (CGF)

  12. Methods Step2. Indexed sparsity reduction (ISR) • Raw gene data (N × 1) → indexed gene data (n_NZ × 1) • If n_NZ ≥ n_ISR: truncate to the top n_ISR elements with the highest occurrence frequency • If n_NZ < n_ISR: add zeros to the tail • Result: gene data after ISR (n_ISR × 1)
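The ISR step on this slide can be sketched in Python as below. This is a minimal illustration, not the authors' implementation. The function names are hypothetical, and several details are assumptions the slide does not state: indexes are 1-based (so that 0 can serve as padding), "occurrence frequency" is taken to be the number of samples in which a gene is mutated, ties are broken by the smaller index, and the kept indexes are returned in ascending order.

```python
from collections import Counter
from typing import Dict, List, Sequence


def occurrence_frequency(samples: Sequence[Sequence[int]]) -> Dict[int, int]:
    """Count, per 1-based gene index, how many samples have a non-zero entry.

    Assumption: this is what the slide means by "occurrence frequency".
    """
    counts: Counter = Counter()
    for sample in samples:
        for index, value in enumerate(sample, start=1):
            if value != 0:
                counts[index] += 1
    return dict(counts)


def indexed_sparsity_reduction(
    sample: Sequence[int], n_isr: int, freq: Dict[int, int]
) -> List[int]:
    """Map a sparse N x 1 gene vector to a fixed-length n_ISR x 1 index vector."""
    # Indexed gene data: positions of the non-zero elements (length n_NZ).
    nonzero = [index for index, value in enumerate(sample, start=1) if value != 0]
    if len(nonzero) >= n_isr:
        # n_NZ >= n_ISR: keep the n_ISR indexes with the highest frequency.
        top = sorted(nonzero, key=lambda index: (-freq.get(index, 0), index))[:n_isr]
        return sorted(top)
    # n_NZ < n_ISR: add zeros to the tail.
    return nonzero + [0] * (n_isr - len(nonzero))


if __name__ == "__main__":
    toy = [[1, 0, 1, 0, 1], [0, 0, 1, 0, 1], [0, 1, 1, 0, 0]]
    freq = occurrence_frequency(toy)
    print(indexed_sparsity_reduction(toy[0], 2, freq))
    print(indexed_sparsity_reduction(toy[2], 4, freq))
```

Every sample ends up with the same length n_ISR regardless of how many mutated genes it carries, which is what lets the result feed a fixed-width classifier input.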

  13. Methods Step3. Deep neural network classifier

  14. Methods Overall flowchart of DeepGene • Raw gene data (N × 1) → clustered gene filtering (CGF) → clustered discriminatory gene data (n_1 × 1) • Raw gene data (N × 1) → indexed sparsity reducing (ISR) → indexes of non-zero elements (n_2 × 1) • Concatenation → input to the DNN classifier ((n_1 + n_2) × 1) • DNN classifier → classification result (1 × 1): classification label (cancer type), e.g. 6 → KIRP
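The concatenation and label-decoding ends of this flowchart can be illustrated with a short Python sketch. It assumes the CGF and ISR outputs are already available as plain lists; the function names are hypothetical. The 1-based label order is also an assumption: it follows the order in which the 12 cancer types are listed on the dataset slide, which is consistent with the flowchart's example label 6 for KIRP but is not confirmed by the slides.

```python
from typing import List, Sequence

# Assumed label order: the order of the dataset slide's cancer-type list.
CANCER_TYPES = [
    "ACC", "BLCA", "BRCA", "CESC", "HNSC", "KIRP",
    "LGG", "LUAD", "PAAD", "PRAD", "STAD", "UCS",
]


def build_dnn_input(
    cgf_features: Sequence[float], isr_indexes: Sequence[int]
) -> List[float]:
    """Concatenate the CGF output (n_1 x 1) and the ISR output (n_2 x 1)."""
    return [float(x) for x in cgf_features] + [float(i) for i in isr_indexes]


def label_to_cancer_type(label: int) -> str:
    """Decode a 1-based classification label into a cancer-type abbreviation."""
    if not 1 <= label <= len(CANCER_TYPES):
        raise ValueError(f"label must be in 1..{len(CANCER_TYPES)}, got {label}")
    return CANCER_TYPES[label - 1]


if __name__ == "__main__":
    x = build_dnn_input([2, 1, 1], [2, 4, 21, 39])
    print(len(x))                    # n_1 + n_2
    print(label_to_cancer_type(6))
```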

  15. Results & Discussion

  16. Results & Discussion Dataset • 12 tumor somatic point mutation datasets from TCGA (ACC, BLCA, BRCA, CESC, HNSC, KIRP, LGG, LUAD, PAAD, PRAD, STAD, UCS) • 22,834 genes from 3,122 samples in total. Note: ACC, adrenocortical carcinoma; BLCA, bladder urothelial carcinoma; BRCA, breast invasive carcinoma; CESC, cervical squamous cell carcinoma and endocervical adenocarcinoma; HNSC, head and neck squamous cell carcinoma; KIRP, kidney renal papillary cell carcinoma; LGG, brain lower grade glioma; LUAD, lung adenocarcinoma; PAAD, pancreatic adenocarcinoma; PRAD, prostate adenocarcinoma; STAD, stomach adenocarcinoma; UCS, uterine carcinosarcoma.

  17. Results & Discussion Parameters • (a) Parameter estimation for and , corresponding to Table 4 • (b) Parameter estimation for layer number and parameter number per layer for the DNN classifier, corresponding to Table 5 • (c) Parameter estimation for cost and gamma for SVM, corresponding to Table 6 • (d) Parameter estimation for Table 7

  18. Results & Discussion Do CGF and/or ISR help? • 10-fold cross-validation accuracy of DeepGene with different design options
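As a reminder of what a 10-fold cross-validation accuracy measures, here is a minimal standard-library sketch. It is not the authors' evaluation code. The slides do not say whether the folds were shuffled or stratified, so the plain shuffled split is an assumption, and `fit` and `predict` are hypothetical callables standing in for any classifier.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple


def k_fold_indices(
    n_samples: int, k: int = 10, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits; each sample is tested exactly once."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: plain shuffled split
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, folds[i]))
    return splits


def cross_val_accuracy(
    features: Sequence[Any],
    labels: Sequence[Any],
    fit: Callable[[List[Any], List[Any]], Any],
    predict: Callable[[Any, Any], Any],
    k: int = 10,
    seed: int = 0,
) -> float:
    """Average test accuracy over k folds."""
    accuracies = []
    for train, test in k_fold_indices(len(labels), k, seed):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)
```

With the 3,122 samples of the dataset slide, this split yields two test folds of 313 samples and eight of 312.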

  19. Results & Discussion Comparison with other widely adopted classifiers • Testing accuracy of DeepGene against three widely adopted classifiers

  20. Results & Discussion Further investigation • Integrating other heterogeneous mutation data, e.g. INDEL, CNV, translocation • Which feature (gene) combinations contribute to better prediction accuracy? Why? How can this help real diagnosis? • Applying to CTC or ctDNA for early diagnosis, subtyping, and localization.

  21. Questions & Comments?
