

SLIDE 1

Including prior knowledge in machine learning for genomic data

Jean-Philippe Vert

Mines ParisTech / Curie Institute / Inserm

StatLearn workshop, Grenoble, March 17, 2011

SLIDE 2

Outline

1. Motivations
2. Finding multiple change-points in a single profile
3. Finding multiple change-points shared by many signals
4. Supervised classification of genomic profiles
5. Learning molecular classifiers with network information
6. Conclusion

SLIDE 4

Chromosomal aberrations in cancer

SLIDE 5

Comparative Genomic Hybridization (CGH)

SLIDE 6

Can we identify breakpoints and "smooth" each profile?

[Figure: a noisy CGH profile along the genome]

SLIDE 7

Can we detect frequent breakpoints?


A collection of bladder tumour copy number profiles.

SLIDE 8

Can we detect discriminative patterns?


Aggressive (left) vs non-aggressive (right) melanoma.

SLIDE 9

DNA → RNA → protein

CGH shows the (static) DNA. Cancer cells also have abnormal (dynamic) gene expression (= transcription).

SLIDE 10

Tissue profiling with DNA chips

Data

Gene expression measures for more than 10k genes, typically measured on fewer than 100 samples from two (or more) different classes (e.g., different tumors).

SLIDE 11

Can we identify the cancer subtype? (diagnosis)

SLIDE 12

Can we predict the future evolution? (prognosis)

SLIDE 13

Summary


Many problems... Data are high-dimensional, but "structured". Classification accuracy is not all, interpretation is necessary (pattern discovery).

A general strategy:

$$\min_{\beta} R(\beta) + \lambda \Omega(\beta)$$

SLIDE 14

Outline

1. Motivations
2. Finding multiple change-points in a single profile
3. Finding multiple change-points shared by many signals
4. Supervised classification of genomic profiles
5. Learning molecular classifiers with network information
6. Conclusion

SLIDE 15

The problem


Let $Y \in \mathbb{R}^p$ be the signal. We want to find a piecewise constant approximation $\hat{U} \in \mathbb{R}^p$ with at most $k$ change-points.

SLIDE 17

An optimal solution?

[Figure: a CGH profile and its piecewise constant approximation]

We can define an "optimal" piecewise constant approximation $\hat{U} \in \mathbb{R}^p$ as the solution of

$$\min_{U \in \mathbb{R}^p} \|Y - U\|^2 \quad \text{such that} \quad \sum_{i=1}^{p-1} 1\left(U_{i+1} \neq U_i\right) \leq k$$

This is an optimization problem over the $\binom{p}{k}$ partitions...

Dynamic programming finds the solution in $O(p^2 k)$ in time and $O(p^2)$ in memory.

But: it does not scale to $p = 10^6 \sim 10^9$...
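For concreteness, here is a minimal sketch of the $O(p^2 k)$ dynamic programme (my own illustration, not code from the talk):

```python
import numpy as np

def dp_segmentation(y, k):
    """Exact piecewise-constant fit of y with k change-points (k >= 1),
    by O(p^2 k) dynamic programming over segment boundaries."""
    p = len(y)
    s1 = np.concatenate([[0.0], np.cumsum(y)])
    s2 = np.concatenate([[0.0], np.cumsum(y * y)])

    def cost(a, b):
        # squared error of approximating y[a:b] by its mean, in O(1)
        return s2[b] - s2[a] - (s1[b] - s1[a]) ** 2 / (b - a)

    # best[j, i] = error of y[:i] split into j + 1 constant segments
    best = np.full((k + 1, p + 1), np.inf)
    argb = np.zeros((k + 1, p + 1), dtype=int)
    best[0, 1:] = [cost(0, i) for i in range(1, p + 1)]
    for j in range(1, k + 1):
        for i in range(j + 1, p + 1):
            cands = [best[j - 1, m] + cost(m, i) for m in range(j, i)]
            m = int(np.argmin(cands))
            best[j, i], argb[j, i] = cands[m], m + j
    bps, i = [], p                      # backtrack change-point positions
    for j in range(k, 0, -1):
        i = argb[j, i]
        bps.append(i)
    return sorted(bps), best[k, p]
```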

SLIDE 21

Promoting sparsity with the ℓ1 penalty

The ℓ1 penalty (Tibshirani, 1996; Chen et al., 1998)

If $R(\beta)$ is convex and "smooth", the solution of

$$\min_{\beta \in \mathbb{R}^p} R(\beta) + \lambda \sum_{i=1}^{p} |\beta_i|$$

is usually sparse.

SLIDE 22

Promoting piecewise constant profiles

The total variation / variable fusion penalty

If $R(\beta)$ is convex and "smooth", the solution of

$$\min_{\beta \in \mathbb{R}^p} R(\beta) + \lambda \sum_{i=1}^{p-1} |\beta_{i+1} - \beta_i|$$

is usually piecewise constant (Rudin et al., 1992; Land and Friedman, 1996).

Proof: make the change of variable $u_i = \beta_{i+1} - \beta_i$, $u_0 = \beta_1$. We obtain a Lasso problem in $u \in \mathbb{R}^{p-1}$; $u$ sparse means $\beta$ piecewise constant.
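The change of variable is easy to check numerically; the following sketch (my illustration, using scikit-learn's Lasso with an arbitrary regularization level) recovers the change-points of a toy signal:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Toy piecewise-constant signal plus noise (illustration only).
rng = np.random.default_rng(0)
p = 200
beta_true = np.concatenate([np.zeros(80), np.ones(60), -0.5 * np.ones(60)])
y = beta_true + 0.1 * rng.standard_normal(p)

# Change of variables: beta = cumulative sums of u, so the design matrix
# X has X[i, j] = 1 for j < i; centering absorbs the unpenalized offset u_0.
X = np.tril(np.ones((p, p - 1)), k=-1)
lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=100_000)
lasso.fit(X - X.mean(axis=0), y - y.mean())
u = lasso.coef_                              # sparse increments

beta_hat = np.concatenate([[0.0], np.cumsum(u)])
beta_hat += y.mean() - beta_hat.mean()       # restore the offset
print("change-points:", np.nonzero(u)[0] + 1)
```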

SLIDE 23

TV signal approximator

$$\min_{\beta \in \mathbb{R}^p} \|Y - \beta\|^2 \quad \text{such that} \quad \sum_{i=1}^{p-1} |\beta_{i+1} - \beta_i| \leq \mu$$

Adding additional constraints does not change the change-points:
$\sum_{i=1}^{p} |\beta_i| \leq \nu$ (Tibshirani et al., 2005; Tibshirani and Wang, 2008)
$\sum_{i=1}^{p} \beta_i^2 \leq \nu$ (Mairal et al., 2010)

SLIDE 24

Solving TV signal approximator

$$\min_{\beta \in \mathbb{R}^p} \|Y - \beta\|^2 \quad \text{such that} \quad \sum_{i=1}^{p-1} |\beta_{i+1} - \beta_i| \leq \mu$$

QP with sparse linear constraints in $O(p^2)$ -> 135 min for $p = 10^5$ (Tibshirani and Wang, 2008)
Coordinate descent-like method in $O(p)$? -> 3 s for $p = 10^5$ (Friedman et al., 2007)
For all $\mu$ with the LARS in $O(pK)$ (Harchaoui and Lévy-Leduc, 2008)
For all $\mu$ in $O(p \ln p)$ (Hoefling, 2009)
For the first $K$ change-points in $O(p \ln K)$ (Bleakley and V., 2010)

SLIDE 25

Speed trial: 2 s for K = 100, p = 10^7

[Figure: running time (seconds) vs. signal length for K = 1, 10, 10^2, 10^3, 10^4, 10^5]

SLIDE 26

Summary


A fast method for multiple change-point detection. An embedded method that boils down to a dichotomic wrapper method (very different from dynamic programming).

SLIDE 27

Outline

1. Motivations
2. Finding multiple change-points in a single profile
3. Finding multiple change-points shared by many signals
4. Supervised classification of genomic profiles
5. Learning molecular classifiers with network information
6. Conclusion

SLIDE 28

The problem


Let $Y \in \mathbb{R}^{p \times n}$ be the $n$ signals of length $p$. We want to find a piecewise constant approximation $\hat{U} \in \mathbb{R}^{p \times n}$ with at most $k$ change-points.

SLIDE 30

"Optimal" segmentation by dynamic programming

[Figure: three signals segmented jointly]

Define the "optimal" piecewise constant approximation $\hat{U} \in \mathbb{R}^{p \times n}$ of $Y$ as the solution of

$$\min_{U \in \mathbb{R}^{p \times n}} \|Y - U\|^2 \quad \text{such that} \quad \sum_{i=1}^{p-1} 1\left(U_{i+1,\bullet} \neq U_{i,\bullet}\right) \leq k$$

DP finds the solution in $O(p^2 k n)$ in time and $O(p^2)$ in memory.

But: it does not scale to $p = 10^6 \sim 10^9$...

SLIDE 31

Selecting pre-defined groups of variables

Group lasso (Yuan & Lin, 2006)

If groups of covariates are likely to be selected together, the $\ell_1/\ell_2$-norm induces sparse solutions at the group level:

$$\Omega_{group}(w) = \sum_{g} \|w_g\|_2$$

$$\Omega(w_1, w_2, w_3) = \|(w_1, w_2)\|_2 + \|w_3\|_2 = \sqrt{w_1^2 + w_2^2} + \sqrt{w_3^2}$$
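A two-line sketch of this penalty (the group indices are illustrative):

```python
import numpy as np

def group_norm(w, groups):
    # Omega_group(w) = sum over groups of the Euclidean norm of w_g
    return sum(np.linalg.norm(w[g]) for g in groups)

w = np.array([0.5, -0.3, 1.2])
print(group_norm(w, [[0, 1], [2]]))   # ||(w1, w2)||_2 + ||w3||_2
```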

SLIDE 32

TV approximator for many signals

Replace

$$\min_{U \in \mathbb{R}^{p \times n}} \|Y - U\|^2 \quad \text{such that} \quad \sum_{i=1}^{p-1} 1\left(U_{i+1,\bullet} \neq U_{i,\bullet}\right) \leq k$$

by

$$\min_{U \in \mathbb{R}^{p \times n}} \|Y - U\|^2 \quad \text{such that} \quad \sum_{i=1}^{p-1} w_i \left\|U_{i+1,\bullet} - U_{i,\bullet}\right\| \leq \mu$$

Questions

Practice: can we solve it efficiently? Theory: does it benefit from increasing $n$ (for $p$ fixed)?

SLIDE 33

TV approximator as a group Lasso problem

Make the change of variables:

$$\gamma = U_{1,\bullet}\,, \qquad \beta_{i,\bullet} = w_i \left(U_{i+1,\bullet} - U_{i,\bullet}\right) \quad \text{for } i = 1, \ldots, p-1\,.$$

The TV approximator is then equivalent to the following group Lasso problem (Yuan and Lin, 2006):

$$\min_{\beta \in \mathbb{R}^{(p-1) \times n}} \|\bar{Y} - \bar{X} \beta\|^2 + \lambda \sum_{i=1}^{p-1} \|\beta_{i,\bullet}\|\,,$$

where $\bar{Y}$ is the centered signal matrix and $\bar{X}$ is a particular $p \times (p-1)$ design matrix.

SLIDE 34

TV approximator implementation

$$\min_{\beta \in \mathbb{R}^{(p-1) \times n}} \|\bar{Y} - \bar{X} \beta\|^2 + \lambda \sum_{i=1}^{p-1} \|\beta_{i,\bullet}\|$$

Theorem

The TV approximator can be solved efficiently:
approximately, with the group LARS, in $O(npk)$ in time and $O(np)$ in memory;
exactly, with a block coordinate descent + active set method, in $O(np)$ in memory.
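As a sanity check of this reformulation, the group Lasso can be solved with scikit-learn's MultiTaskLasso, whose $\ell_1/\ell_2$ penalty on the rows of $\beta$ is exactly the norm above (a sketch on toy data; the weights $w_i$ are omitted and alpha is arbitrary):

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(1)
p, n = 300, 5
U = np.zeros((p, n))
U[100:, :] += rng.standard_normal(n)     # change-point shared at i = 100
U[220:, :] += rng.standard_normal(n)     # another one shared at i = 220
Y = U + 0.3 * rng.standard_normal((p, n))

X = np.tril(np.ones((p, p - 1)), k=-1)   # same cumulative-sum design
model = MultiTaskLasso(alpha=0.05, fit_intercept=False, max_iter=20_000)
model.fit(X - X.mean(axis=0), Y - Y.mean(axis=0))

beta = model.coef_.T                     # (p - 1) x n, one row per position
shared = np.nonzero(np.linalg.norm(beta, axis=1))[0] + 1
print("shared change-points:", shared)
```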

SLIDE 35

Proof: computational tricks...

Although $\bar{X}$ is $p \times (p-1)$:

For any $R \in \mathbb{R}^{p \times n}$, we can compute $C = \bar{X}^\top R$ in $O(np)$ operations and memory.

For any two subsets of indices $A = \left(a_1, \ldots, a_{|A|}\right)$ and $B = \left(b_1, \ldots, b_{|B|}\right)$ in $[1, p-1]$, we can compute $\bar{X}_{\bullet,A}^\top \bar{X}_{\bullet,B}$ in $O(|A||B|)$ in time and memory.

For any $A = \left(a_1, \ldots, a_{|A|}\right)$, a set of distinct indices with $1 \leq a_1 < \ldots < a_{|A|} \leq p-1$, and for any $|A| \times n$ matrix $R$, we can compute $C = \left(\bar{X}_{\bullet,A}^\top \bar{X}_{\bullet,A}\right)^{-1} R$ in $O(|A| n)$ in time and memory.

SLIDE 36

Consistency for a single change-point

Suppose a single change-point:
at position $u = \alpha p$,
with increments $(\beta_i)_{i=1,\ldots,n}$ such that $\bar{\beta}^2 = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \beta_i^2$,
corrupted by i.i.d. Gaussian noise of variance $\sigma^2$.

[Figure: three noisy signals sharing a single change-point]

Does the TV approximator correctly estimate the first change-point as $n$ increases?

SLIDE 37

Consistency of the unweighted TV approximator

$$\min_{U \in \mathbb{R}^{p \times n}} \|Y - U\|^2 \quad \text{such that} \quad \sum_{i=1}^{p-1} \left\|U_{i+1,\bullet} - U_{i,\bullet}\right\| \leq \mu$$

Theorem

The unweighted TV approximator finds the correct change-point with probability tending to 1 (resp. 0) as $n \to +\infty$ if $\sigma^2 < \tilde{\sigma}^2_\alpha$ (resp. $\sigma^2 > \tilde{\sigma}^2_\alpha$), where

$$\tilde{\sigma}^2_\alpha = \frac{p \bar{\beta}^2 (1-\alpha)^2 \left(\alpha - \frac{1}{2p}\right)}{\alpha - \frac{1}{2} - \frac{1}{2p}}\,.$$

Correct estimation on $[p\epsilon, p(1-\epsilon)]$ with $\epsilon = \sqrt{\frac{\sigma^2}{2 p \bar{\beta}^2}} + o\left(p^{-1/2}\right)$; wrong estimation near the boundaries.

SLIDE 38

Consistency of the weighted TV approximator

$$\min_{U \in \mathbb{R}^{p \times n}} \|Y - U\|^2 \quad \text{such that} \quad \sum_{i=1}^{p-1} w_i \left\|U_{i+1,\bullet} - U_{i,\bullet}\right\| \leq \mu$$

Theorem

The weighted TV approximator with weights

$$\forall i \in [1, p-1]\,, \quad w_i = \sqrt{\frac{i(p-i)}{p}}$$

correctly finds the first change-point with probability tending to 1 as $n \to +\infty$.

We see the benefit of increasing $n$, and the benefit of adding weights to the TV penalty.
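In the group-Lasso reformulation these weights simply rescale the columns of the design matrix (a sketch):

```python
import numpy as np

p = 1000
i = np.arange(1, p)                    # candidate positions 1 .. p-1
w = np.sqrt(i * (p - i) / p)           # weights from the theorem above
# beta_j = w_j * (U_{j+1} - U_j), so column j of the design is scaled by 1/w_j:
X = np.tril(np.ones((p, p - 1)), k=-1) / w
```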

SLIDE 39

Proof sketch

The first change-point $\hat{i}$ found by the TV approximator maximizes $F_i = \|\hat{c}_{i,\bullet}\|^2$, where

$$\hat{c} = \bar{X}^\top \bar{Y} = \bar{X}^\top \bar{X} \beta^* + \bar{X}^\top W\,.$$

$\hat{c}$ is Gaussian, and $F_i$ follows a non-central $\chi^2$ distribution with

$$G_i = \frac{\mathbb{E} F_i}{p} = \frac{i(p-i)}{p w_i^2} \sigma^2 + \frac{\bar{\beta}^2}{w_i^2 w_u^2 p^2} \times \begin{cases} i^2 (p-u)^2 & \text{if } i \leq u\,,\\ u^2 (p-i)^2 & \text{otherwise.} \end{cases}$$

We then just check when $G_u = \max_i G_i$.

SLIDE 40

Consistent estimation of more change-points?

[Figure: accuracy of ULARS, WLARS, ULasso and WLasso; p = 100, k = 10, β̄² = 1, σ² ∈ {0.05, 0.2, 1}]

SLIDE 41

Outline

1. Motivations
2. Finding multiple change-points in a single profile
3. Finding multiple change-points shared by many signals
4. Supervised classification of genomic profiles
5. Learning molecular classifiers with network information
6. Conclusion

SLIDE 42

The problem


Let $x_1, \ldots, x_n \in \mathbb{R}^p$ be the $n$ profiles of length $p$ and $y_1, \ldots, y_n \in \{-1, 1\}$ the labels. We want to learn a function $f : \mathbb{R}^p \to \{-1, 1\}$.

SLIDE 43

Prior knowledge

Sparsity: not all positions should be discriminative, and we want to identify the predictive region (presence of oncogenes or tumor suppressor genes?).
Piecewise constant: within a selected region, all probes should contribute equally.


SLIDE 44

Fused Lasso signal approximator (Tibshirani et al., 2005)

$$\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^{p} (y_i - \beta_i)^2 + \lambda_1 \sum_{i=1}^{p} |\beta_i| + \lambda_2 \sum_{i=1}^{p-1} |\beta_{i+1} - \beta_i|\,.$$

The first penalty leads to sparse solutions; the second penalty leads to piecewise constant solutions.
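This objective can be handed directly to a generic convex solver; a minimal CVXPY sketch (my illustration, with arbitrary λ values):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(2)
p = 200
y = np.concatenate([np.zeros(120), np.ones(80)]) + 0.2 * rng.standard_normal(p)

beta = cp.Variable(p)
lam1, lam2 = 0.1, 2.0
objective = (cp.sum_squares(y - beta)
             + lam1 * cp.norm1(beta)              # sparsity
             + lam2 * cp.norm1(cp.diff(beta)))    # piecewise constancy
cp.Problem(cp.Minimize(objective)).solve()
print(np.round(beta.value, 2))
```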

SLIDE 45

Fused lasso for supervised classification (Rapaport et al., 2008)

$$\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \ell\left(y_i, \beta^\top x_i\right) + \lambda_1 \sum_{i=1}^{p} |\beta_i| + \lambda_2 \sum_{i=1}^{p-1} |\beta_{i+1} - \beta_i|\,,$$

where $\ell$ is, e.g., the hinge loss $\ell(y, t) = \max(1 - yt, 0)$.

Implementation

When $\ell$ is the hinge loss (fused SVM), this is a linear program -> up to $p = 10^3 \sim 10^4$.
When $\ell$ is convex and smooth (logistic, quadratic), efficient implementation with proximal methods -> up to $p = 10^8 \sim 10^9$.
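The hinge-loss variant fits in the same few lines (a CVXPY sketch on synthetic profiles; the discriminative region and λ values are made up for illustration):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
n, p = 40, 100
X = rng.standard_normal((n, p))
y = np.sign(X[:, 30:40].sum(axis=1))     # labels driven by one region

beta = cp.Variable(p)
hinge = cp.sum(cp.pos(1 - cp.multiply(y, X @ beta)))
objective = hinge + 0.5 * cp.norm1(beta) + 2.0 * cp.norm1(cp.diff(beta))
cp.Problem(cp.Minimize(objective)).solve()
# beta.value should be sparse and flat over the discriminative region
```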

SLIDE 47

Example: predicting metastasis in melanoma

[Figures: classifier weights along the genome (weight per BAC)]

SLIDE 48

Outline

1. Motivations
2. Finding multiple change-points in a single profile
3. Finding multiple change-points shared by many signals
4. Supervised classification of genomic profiles
5. Learning molecular classifiers with network information
6. Conclusion

SLIDE 49

Molecular diagnosis / prognosis / theragnosis

SLIDE 50

Gene networks

[Figure: a gene network annotated with pathways: glycan biosynthesis; protein kinases; DNA and RNA polymerase subunits; glycolysis / gluconeogenesis; sulfur metabolism; porphyrin and chlorophyll metabolism; riboflavin metabolism; folate biosynthesis; biosynthesis of steroids, ergosterol metabolism; lysine biosynthesis; phenylalanine, tyrosine and tryptophan biosynthesis; purine metabolism; oxidative phosphorylation, TCA cycle; nitrogen, asparagine metabolism]
SLIDE 51

Gene networks and expression data

Motivation

Basic biological functions usually involve the coordinated action of several proteins:
formation of protein complexes;
activation of metabolic, signalling or regulatory pathways.

Many pathways and protein-protein interactions are already known.
Hypothesis: the weights of the classifier should be "coherent" with respect to this prior knowledge.

SLIDE 52

Graph-based penalty

$$\min_{\beta} R(\beta) + \lambda \Omega_G(\beta)$$

Hypothesis

We would like to design penalties $\Omega_G(\beta)$ that promote one of the following hypotheses:
Hypothesis 1: genes near each other on the graph should have similar weights (but we do not try to select only a few genes), i.e., the classifier should be smooth on the graph.
Hypothesis 2: genes selected in the signature should be connected to each other, or be in a few known functional groups, without necessarily having similar weights.

SLIDE 53

Graph-based penalty

Prior hypothesis

Genes near each other on the graph should have similar weights.

An idea (Rapaport et al., 2007)

$$\Omega_{spectral}(\beta) = \sum_{i \sim j} (\beta_i - \beta_j)^2\,, \qquad \min_{\beta \in \mathbb{R}^p} R(\beta) + \lambda \sum_{i \sim j} (\beta_i - \beta_j)^2\,.$$

SLIDE 55

Graph Laplacian

Definition

The Laplacian of the graph is the matrix $L = D - A$, where $D$ is the diagonal matrix of node degrees and $A$ the adjacency matrix. For the 5-node example with edges 1–3, 2–3, 3–4, 4–5:

$$L = D - A = \begin{pmatrix} 1 & 0 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 \\ -1 & -1 & 3 & -1 & 0 \\ 0 & 0 & -1 & 2 & -1 \\ 0 & 0 & 0 & -1 & 1 \end{pmatrix}$$
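The example can be reproduced in a few lines (the edge list is my reading of the matrix above):

```python
import numpy as np

A = np.zeros((5, 5))
for i, j in [(0, 2), (1, 2), (2, 3), (3, 4)]:   # edges, 0-indexed
    A[i, j] = A[j, i] = 1
D = np.diag(A.sum(axis=1))                      # degree matrix
L = D - A                                       # graph Laplacian
print(L)
```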

SLIDE 56

Spectral penalty as a kernel

Theorem

The function $f(x) = \beta^\top x$, where $\beta$ is the solution of

$$\min_{\beta \in \mathbb{R}^p} \frac{1}{n} \sum_{i=1}^{n} \ell\left(\beta^\top x_i, y_i\right) + \lambda \sum_{i \sim j} \left(\beta_i - \beta_j\right)^2\,,$$

is equal to $g(x) = \gamma^\top \Phi(x)$, where $\gamma$ is the solution of

$$\min_{\gamma \in \mathbb{R}^p} \frac{1}{n} \sum_{i=1}^{n} \ell\left(\gamma^\top \Phi(x_i), y_i\right) + \lambda \gamma^\top \gamma\,,$$

and where $\Phi(x)^\top \Phi(x') = x^\top K_G x'$ for $K_G = L^*$, the pseudo-inverse of the graph Laplacian.
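In practice the theorem says a graph-smooth linear classifier can be trained as a standard kernel machine; a sketch of the kernel computation (rebuilding L from the previous snippet so the block is self-contained):

```python
import numpy as np

A = np.zeros((5, 5))
for i, j in [(0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

K_G = np.linalg.pinv(L)        # K_G = L^+, pseudo-inverse of the Laplacian

def gram(Xd):
    """Gram matrix K[i, j] = x_i^T K_G x_j between profiles (rows of Xd);
    usable with, e.g., an SVM with kernel='precomputed'."""
    return Xd @ K_G @ Xd.T
```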

SLIDE 57

Classifiers

[Figure: the same annotated gene network as in Slide 50]

SLIDE 58

Classifier

[Figure: two classifiers, panels a) and b)]

SLIDE 59

Other penalties with kernels

$\Phi(x)^\top \Phi(x') = x^\top K_G x'$ with:

$K_G = (cI + L)^{-1}$ leads to

$$\Omega(\beta) = c \sum_{i=1}^{p} \beta_i^2 + \sum_{i \sim j} \left(\beta_i - \beta_j\right)^2\,.$$

The diffusion kernel $K_G = \exp_M(-2tL)$ penalizes high frequencies of $\beta$ in the Fourier domain.
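Both kernels are one-liners on top of the same Laplacian (c and t are arbitrary illustration values):

```python
import numpy as np
from scipy.linalg import expm

A = np.zeros((5, 5))
for i, j in [(0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

c, t = 1.0, 0.5
K_reg = np.linalg.inv(c * np.eye(5) + L)   # K_G = (cI + L)^{-1}
K_diff = expm(-2 * t * L)                  # diffusion kernel exp_M(-2tL)
```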

SLIDE 60

Other penalties without kernels

Gene selection + piecewise constant on the graph:

$$\Omega(\beta) = \sum_{i \sim j} |\beta_i - \beta_j| + \sum_{i=1}^{p} |\beta_i|$$

Gene selection + smooth on the graph:

$$\Omega(\beta) = \sum_{i \sim j} \left(\beta_i - \beta_j\right)^2 + \sum_{i=1}^{p} |\beta_i|$$

SLIDE 61

How to select jointly genes belonging to predefined pathways?

SLIDE 62

Selecting pre-defined groups of variables

Group lasso (Yuan & Lin, 2006)

If groups of covariates are likely to be selected together, the $\ell_1/\ell_2$-norm induces sparse solutions at the group level:

$$\Omega_{group}(w) = \sum_{g} \|w_g\|_2\,, \qquad \Omega(w_1, w_2, w_3) = \|(w_1, w_2)\|_2 + \|w_3\|_2$$
SLIDE 63

What if a gene belongs to several groups?

Issue of using the group-lasso

$\Omega_{group}(w) = \sum_g \|w_g\|_2$ sets groups to 0.

One variable is selected ⇔ all the groups to which it belongs are selected.

IGF selection ⇒ selection of unwanted groups.

$\|w_{g_1}\|_2 = \|w_{g_3}\|_2 = 0$: removal of any group containing a gene ⇒ the weight of the gene is 0.
SLIDE 64

Overlap norm (Jacob et al., 2009)

An idea

Introduce latent variables $v^g$:

$$\min_{w, v} L(w) + \lambda \sum_{g \in \mathcal{G}} \|v^g\|_2 \quad \text{such that} \quad w = \sum_{g \in \mathcal{G}} v^g\,, \quad \operatorname{supp}(v^g) \subseteq g\,.$$

Properties

The resulting support is a union of groups in $\mathcal{G}$.
It is possible to select one variable without selecting all the groups containing it.
Equivalent to the group lasso when there is no overlap.

SLIDE 65

A new norm

Overlap norm

$$\min_{w, v} L(w) + \lambda \sum_{g \in \mathcal{G}} \|v^g\|_2 \quad \text{such that} \quad w = \sum_{g \in \mathcal{G}} v^g\,, \quad \operatorname{supp}(v^g) \subseteq g$$

$$= \min_{w} L(w) + \lambda \Omega_{overlap}(w)\,,$$

with

$$\Omega_{overlap}(w) \triangleq \min_{v} \left\{ \sum_{g \in \mathcal{G}} \|v^g\|_2 \;:\; w = \sum_{g \in \mathcal{G}} v^g\,, \ \operatorname{supp}(v^g) \subseteq g \right\} \quad (*)$$

Property

$\Omega_{overlap}(w)$ is a norm of $w$. $\Omega_{overlap}(\cdot)$ associates to $w$ a specific (not necessarily unique) decomposition $(v^g)_{g \in \mathcal{G}}$, which is the argmin of $(*)$.
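The definition (*) is itself a small convex program, so the norm can be evaluated directly; a CVXPY sketch (my illustration) on two overlapping groups:

```python
import cvxpy as cp
import numpy as np

def overlap_norm(w, groups):
    # Solve (*): decompose w as a sum of latent vectors v^g supported on g.
    p = len(w)
    V = [cp.Variable(p) for _ in groups]
    constraints = [sum(V) == w]
    for v, g in zip(V, groups):
        for i in range(p):
            if i not in g:                 # supp(v^g) must lie inside g
                constraints.append(v[i] == 0)
    problem = cp.Problem(cp.Minimize(sum(cp.norm(v, 2) for v in V)),
                         constraints)
    problem.solve()
    return problem.value

w = np.array([1.0, 0.5, 0.0])
print(overlap_norm(w, [[0, 1], [1, 2]]))   # groups {1,2} and {2,3}
```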

SLIDE 66

Overlap and group unity balls

[Figure] Balls for $\Omega^{\mathcal{G}}_{group}(\cdot)$ (middle) and $\Omega^{\mathcal{G}}_{overlap}(\cdot)$ (right) for the groups $\mathcal{G} = \{\{1, 2\}, \{2, 3\}\}$, where $w_2$ is represented as the vertical coordinate. Left: group lasso ($\mathcal{G} = \{\{1, 2\}, \{3\}\}$), for comparison.

SLIDE 67

Theoretical results

Consistency in group support (Jacob et al., 2009)

Let $\bar{w}$ be the true parameter vector. Assume that there exists a unique decomposition $\bar{v}^g$ such that $\bar{w} = \sum_g \bar{v}^g$ and $\Omega^{\mathcal{G}}_{overlap}(\bar{w}) = \sum_g \|\bar{v}^g\|_2$. Consider the regularized empirical risk minimization problem $L(w) + \lambda \Omega^{\mathcal{G}}_{overlap}(w)$. Then, under appropriate mutual incoherence conditions on $X$, as $n \to \infty$, with very high probability, the optimal solution $\hat{w}$ admits a unique decomposition $(\hat{v}^g)_{g \in \mathcal{G}}$ such that

$$\left\{ g \in \mathcal{G} \,\middle|\, \hat{v}^g \neq 0 \right\} = \left\{ g \in \mathcal{G} \,\middle|\, \bar{v}^g \neq 0 \right\}\,.$$

SLIDE 69

Experiments

Synthetic data: overlapping groups

10 groups of 10 variables, with 2 variables of overlap between two successive groups: {1, . . . , 10}, {9, . . . , 18}, . . . , {73, . . . , 82}. Support: union of the 4th and 5th groups. Learn from 100 training points.

[Figure] Frequency of selection of each variable with the lasso (left) and $\Omega^{\mathcal{G}}_{overlap}(\cdot)$ (middle); comparison of the RMSE of both methods (right).

SLIDE 70

Graph lasso

Two solutions

$$\Omega_{intersection}(\beta) = \sum_{i \sim j} \sqrt{\beta_i^2 + \beta_j^2}\,,$$

$$\Omega_{union}(\beta) = \sup_{\alpha \in \mathbb{R}^p \,:\, \forall i \sim j,\ \alpha_i^2 + \alpha_j^2 \leq 1} \alpha^\top \beta\,.$$

SLIDE 71

Graph lasso vs kernel on graph

Graph lasso:

$$\Omega_{graph\ lasso}(w) = \sum_{i \sim j} \sqrt{w_i^2 + w_j^2}$$

constrains the sparsity, not the values.

Graph kernel:

$$\Omega_{graph\ kernel}(w) = \sum_{i \sim j} \left(w_i - w_j\right)^2$$

constrains the values (smoothness), not the sparsity.
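The contrast is easy to see numerically on a sparse, non-smooth vector (a sketch; the edge list is illustrative):

```python
import numpy as np

def omega_graph_lasso(w, edges):
    # sum over edges of sqrt(w_i^2 + w_j^2): encourages joint sparsity
    return sum(np.hypot(w[i], w[j]) for i, j in edges)

def omega_graph_kernel(w, edges):
    # sum over edges of (w_i - w_j)^2: encourages smoothness, not sparsity
    return sum((w[i] - w[j]) ** 2 for i, j in edges)

w = np.array([0.0, 0.0, 1.0, 1.0])
edges = [(0, 1), (1, 2), (2, 3)]
print(omega_graph_lasso(w, edges), omega_graph_kernel(w, edges))
```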

SLIDE 72

Preliminary results

Breast cancer data

Gene expression data for 8,141 genes in 295 breast cancer tumors. Canonical pathways from MSigDB containing 639 groups of genes, 637 of which involve genes from our study.

METHOD         ℓ1             Ω_overlap
ERROR          0.38 ± 0.04    0.36 ± 0.03
MEAN # PATH.   130            30

Graph on the genes:

METHOD         ℓ1             Ω_graph
ERROR          0.39 ± 0.04    0.36 ± 0.01
AV. SIZE C.C.  1.03           1.30

SLIDE 73

Lasso signature

SLIDE 74

Graph Lasso signature

SLIDE 75

Outline

1. Motivations
2. Finding multiple change-points in a single profile
3. Finding multiple change-points shared by many signals
4. Supervised classification of genomic profiles
5. Learning molecular classifiers with network information
6. Conclusion

SLIDE 76

Conclusions

Feature / pattern selection in high dimension is central for many applications.
Convex sparsity-inducing penalties or positive definite kernels are promising.
Success stories remain limited on real data...
Need to adjust the complexity of the model to the data available.
