Machine learning for cancer genomics


Machine learning for cancer genomics. Jean-Philippe Vert (Jean-Philippe.Vert@mines.org), Mines ParisTech / Curie Institute / Inserm. "Informatics and mathematical sciences: interactions with biomedical sciences" workshop, Paris, June 17, 2011.


  1. Machine learning for cancer genomics. Jean-Philippe Vert (Jean-Philippe.Vert@mines.org), Mines ParisTech / Curie Institute / Inserm. "Informatics and mathematical sciences: interactions with biomedical sciences" workshop, Paris, June 17, 2011.

  2. Outline: 1. Introduction; 2. Cancer prognosis from DNA copy number variations; 3. Diagnosis and prognosis from gene expression data; 4. Conclusion.

  4. Chromosomal aberrations in cancer.

  5. Comparative Genomic Hybridization (CGH). Motivation: CGH data measure DNA copy number along the genome. This is very useful, in particular in cancer research, for systematically observing variants in DNA content. [Figure: genome-wide CGH profile, log-ratio (roughly -1 to 1) plotted along chromosomes 1 to 22 and X.]
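
For readers who want to experiment, a CGH profile is naturally represented as a vector of per-probe log-ratios ordered along the genome. A minimal sketch in Python, assuming only numpy; the probe count, breakpoint positions and noise level below are invented for illustration and are not taken from the talk:

    import numpy as np

    rng = np.random.default_rng(0)
    n_probes = 2500                              # probes ordered along the genome
    log_ratio = rng.normal(0.0, 0.1, n_probes)   # normal copy number -> log-ratio around 0
    log_ratio[800:1200] += 0.58                  # an illustrative gained segment (~ log2(3/2))
    x = log_ratio                                # one sample = one p-dimensional feature vector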

  6. Cancer prognosis: can we predict the future evolution? [Figure: panels of CGH profiles; aggressive (left) vs. non-aggressive (right) melanoma.]

  7. DNA → RNA → protein. CGH shows the (static) DNA; cancer cells also have abnormal (dynamic) gene expression (= transcription).

  8. Tissue profiling with DNA chips. Data: gene expression measures for more than 10,000 genes, typically measured on fewer than 100 samples from two (or more) different classes (e.g., different tumors).

  9. Can we identify the cancer subtype? (diagnosis)

  10. Can we predict the future evolution? (prognosis)

  11. Pattern recognition, aka supervised classification. [Figure: the panel of CGH profiles from slide 6, revisited as a classification problem.]

  15. Pattern recognition, aka supervised classification. Challenges: few samples, high dimension, structured data, heterogeneous data, prior knowledge, fast and scalable implementations, interpretable models.
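
To make the "few samples, high dimension" challenge concrete, here is a minimal sketch of cross-validating a regularized linear classifier in that regime, assuming scikit-learn and purely synthetic data; the shapes roughly mimic a typical expression study (about 100 samples, 10,000 genes) and nothing below comes from the talk itself:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n, p = 100, 10000                    # few samples, high dimension
    X = rng.normal(size=(n, p))          # e.g. expression or copy-number features
    y = rng.integers(0, 2, size=n)       # e.g. subtype or prognosis labels

    # Regularization (shrinkage) keeps the linear model from overfitting when p >> n,
    # and cross-validation gives an honest estimate of predictive performance.
    clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
    print(cross_val_score(clf, X, y, cv=5).mean())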

  16. Shrinkage estimators. 1. Define a large family of "candidate classifiers", e.g., linear predictors f_β(x) = β^⊤ x for x ∈ R^p. 2. For any candidate classifier f_β, quantify how "good" it is on the training set with some empirical risk, e.g., R(β) = (1/n) ∑_{i=1}^{n} ℓ(f_β(x_i), y_i). 3. Choose the β that achieves the minimum empirical risk, subject to some constraint: min_β R(β) subject to Ω(β) ≤ C.

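As one concrete reading of the three steps above, here is a minimal sketch of a shrinkage classifier, assuming numpy and written in the penalized form min_β R(β) + λΩ(β), which is equivalent to the constrained form for a suitable λ; the logistic loss, learning rate and iteration count are illustrative choices, not necessarily those used in the talk:

    import numpy as np

    def fit_shrinkage_classifier(X, y, lam=1.0, lr=0.1, n_iter=500):
        """Minimize (1/n) * sum_i log(1 + exp(-y_i * beta^T x_i)) + lam * ||beta||_2^2,
        for labels y_i in {-1, +1}, by plain gradient descent."""
        n, p = X.shape
        beta = np.zeros(p)
        for _ in range(n_iter):
            margins = y * (X @ beta)
            grad_risk = -(X.T @ (y / (1.0 + np.exp(margins)))) / n   # gradient of R(beta)
            beta -= lr * (grad_risk + 2.0 * lam * beta)              # plus gradient of the penalty
        return beta

Setting lam to 0 recovers plain empirical risk minimization; increasing it shrinks β toward 0, which is exactly the bias/variance trade-off discussed on the next slides.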

  19. Why shrinkage classifiers? min_β R(β) subject to Ω(β) ≤ C. [Figure: geometric illustration comparing the unconstrained estimate, the constrained estimate (within the set Ω(β) ≤ C) and the true parameter β*.]

  25. Why shrinkage classifiers? [Figure: the same illustration, annotated with the bias and variance of the constrained estimate.] "Increases bias and decreases variance." Common choices are Ω(β) = ∑_{i=1}^{p} β_i² (ridge regression, SVM, ...) and Ω(β) = ∑_{i=1}^{p} |β_i| (lasso, boosting, ...).
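
The two common penalties behave quite differently in practice: the ℓ1 (lasso-type) penalty tends to set many coefficients exactly to zero, which helps interpretation when each coefficient corresponds to a gene or probe, whereas the ℓ2 (ridge-type) penalty shrinks all coefficients without zeroing them. A minimal sketch with scikit-learn on synthetic data; sizes and regularization strength are illustrative only:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(80, 500))
    y = rng.integers(0, 2, size=80)

    ridge = LogisticRegression(penalty="l2", C=0.1, max_iter=2000).fit(X, y)
    lasso = LogisticRegression(penalty="l1", C=0.1, solver="liblinear").fit(X, y)

    print("nonzero weights, ridge:", int(np.sum(ridge.coef_ != 0)))   # typically all 500
    print("nonzero weights, lasso:", int(np.sum(lasso.coef_ != 0)))   # typically only a few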

  26. Including prior knowledge in the penalty? min_β R(β) subject to Ω(β) ≤ C. [Figure: the geometric illustration revisited, asking how the shape of the constraint set Ω(β) ≤ C could encode prior knowledge.]

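One possible way to encode prior knowledge in Ω, offered purely as an illustration and not necessarily the construction developed in the remainder of the talk: for CGH profiles, copy-number alterations hit contiguous probes, so the penalty can combine sparsity with a total-variation ("fused") term that favours weight vectors that are piecewise constant along the genome. A minimal sketch of such a penalty, assuming numpy:

    import numpy as np

    def fused_penalty(beta, lam_sparse=1.0, lam_fuse=1.0):
        """Omega(beta) = lam_sparse * sum_j |beta_j| + lam_fuse * sum_j |beta_{j+1} - beta_j|.
        The second term favours weights that are constant over contiguous probes."""
        return (lam_sparse * np.sum(np.abs(beta))
                + lam_fuse * np.sum(np.abs(np.diff(beta))))

Because this Ω is non-smooth, minimizing R(β) subject to Ω(β) ≤ C (or its penalized version) requires solvers such as proximal gradient methods rather than plain gradient descent.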
