Class discrimination for microarray studies
Vlad Popovici
Swiss Institute of Bioinformatics
February 5th, 2008
Vlad Popovici (SIB) Class discrimination for microarray studies February 5th, 2008 1 / 45
Introduction
◮ a well-fitted model does not guarantee good prediction (overfitting)
◮ too many features used in the model (curse of dimensionality)
◮ feature selection performed on the full dataset, test samples included (!)
◮ improper or insufficient validation
◮ batch effects not accounted for
◮ insufficient documentation
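The "feature selection on the full dataset" pitfall can be made concrete with a small simulation. The sketch below is not taken from the slides. It assumes pure-noise data (no real class signal), a mean-difference feature ranking, a nearest-centroid rule, and leave-one-out validation. The helper names `top_features`, `centroid_predict` and `loo_accuracy` are hypothetical.

```python
import random

def top_features(X, y, k):
    """Rank features by absolute difference of class means; return the top k indices."""
    p = len(X[0])
    scores = []
    for j in range(p):
        a = [x[j] for x, c in zip(X, y) if c == 0]
        b = [x[j] for x, c in zip(X, y) if c == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return sorted(range(p), key=lambda j: -scores[j])[:k]

def centroid_predict(X, y, x_new, feats):
    """Nearest-centroid rule restricted to the selected features."""
    dist = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X, y) if lab == c]
        d = 0.0
        for j in feats:
            m = sum(r[j] for r in rows) / len(rows)
            d += (x_new[j] - m) ** 2
        dist[c] = d
    return 0 if dist[0] <= dist[1] else 1

def loo_accuracy(X, y, k, select_inside):
    """Leave-one-out accuracy, with selection done inside or outside the loop."""
    n = len(X)
    feats_all = top_features(X, y, k)  # sees every sample, held-out ones included
    hits = 0
    for i in range(n):
        Xtr, ytr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = top_features(Xtr, ytr, k) if select_inside else feats_all
        hits += centroid_predict(Xtr, ytr, X[i], feats) == y[i]
    return hits / n

# Assumed toy setting: 40 samples, 500 noise features, labels unrelated to the data.
rng = random.Random(0)
n, p, k = 40, 500, 10
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [i % 2 for i in range(n)]

biased = loo_accuracy(X, y, k, select_inside=False)
proper = loo_accuracy(X, y, k, select_inside=True)
print("selection on full data:", biased)
print("selection inside each fold:", proper)
```

Because the labels carry no information, an honest estimate should stay near chance level. Selecting features once on all samples lets the held-out sample influence the model, which typically inflates the apparent accuracy.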
Discriminant analysis
[Figure: expression data matrix relating probesets (1007_s_at, 117_at, 1053_at, 211584_s_at, 211585_at; generic probeset i) to samples (Tumor 1, Tumor 2, Tumor k, Tumor n); the probesets are the p features.]
◮ assuming data is normal with
◮ alternative: estimate w0 by
◮ can be used to embed prior
[Figure labels: w, w0]
(from Duda, Hart & Stork, Pattern Classification)
Performance assessment
[Figure: ROC plot of true positive rate (SN) against false positive rate (1−SP), both axes running to 1; SN and SP are marked on the plot, and one panel labels the random classifier.]
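A ROC plot of this kind can be traced from classifier scores by sweeping the decision threshold. The sketch below is an illustration under assumptions: the scores and labels are made up, and the function names `roc_points` and `auc` are hypothetical. Each threshold yields one point (1−SP, SN), and the area under the curve is computed with the trapezoid rule.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by lowering the threshold step by step."""
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        s = pairs[i][0]
        # Consume all samples sharing this score so that ties give a single point.
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up scores (higher means "more likely positive") and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
print(points)
print("AUC =", area)  # 0.8125 for this toy input
```

A classifier that assigns scores at random lies, on average, on the diagonal from (0, 0) to (1, 1) and has an AUC of about 0.5.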
Estimating the performance parameters
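[Figure: performance estimation. The data set (X, Y) yields B sets (X1, Y1), (X2, Y2), ..., (XB, YB), which produce the estimates E1, E2, ..., EB; the stages of the diagram are numbered 1 to 3.]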
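The scheme in the figure, with B derived data sets each giving an estimate E1, ..., EB, can be sketched in code. The slide text does not say which resampling scheme is meant, so the example below assumes a bootstrap with out-of-bag testing. The one-dimensional threshold classifier, the toy data, and the names `fit`, `predict` and `bootstrap_errors` are all hypothetical.

```python
import random
import statistics

def fit(xs, ys):
    """Toy 1-D classifier: threshold halfway between the two class means."""
    m0 = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    m1 = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    return (m0 + m1) / 2, m1 >= m0

def predict(model, x):
    """Assign class 1 on the side of the threshold where class 1 had its mean."""
    thr, one_is_high = model
    return int((x > thr) == one_is_high)

def bootstrap_errors(xs, ys, B, rng):
    """Return B out-of-bag error rates E_1..E_B."""
    n = len(xs)
    errors = []
    while len(errors) < B:
        idx = [rng.randrange(n) for _ in range(n)]   # draw n samples with replacement
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]
        ytr = [ys[i] for i in idx]
        if not oob or len(set(ytr)) < 2:
            continue                                  # degenerate draw, try again
        model = fit([xs[i] for i in idx], ytr)
        wrong = sum(predict(model, xs[i]) != ys[i] for i in oob)
        errors.append(wrong / len(oob))
    return errors

# Assumed toy data: two classes with means 0 and 2, unit variance.
rng = random.Random(1)
ys = [i % 2 for i in range(60)]
xs = [rng.gauss(2.0 * y, 1.0) for y in ys]

errors = bootstrap_errors(xs, ys, B=200, rng=rng)
mean_err = statistics.mean(errors)
sd_err = statistics.stdev(errors)
print("mean error:", mean_err, "spread:", sd_err)
```

Reporting the spread of the E values alongside their mean gives a sense of how much the estimate depends on the particular sample.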
◮ what are the most stable features
◮ what are the points always misclassified
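Both questions can be answered from the same resampling runs used for performance estimation. The sketch below is one possible way to do it, not the procedure from the slides. It assumes bootstrap resampling, a mean-difference feature ranking, a nearest-centroid rule, and toy data in which only feature 0 separates the classes. All function and variable names are hypothetical.

```python
import random
from collections import Counter

def class_means(X, y, j):
    """Mean of feature j in class 0 and in class 1."""
    a = [x[j] for x, c in zip(X, y) if c == 0]
    b = [x[j] for x, c in zip(X, y) if c == 1]
    return sum(a) / len(a), sum(b) / len(b)

def top_features(X, y, k):
    """Indices of the k features with the largest gap between class means."""
    gap = []
    for j in range(len(X[0])):
        m0, m1 = class_means(X, y, j)
        gap.append(abs(m0 - m1))
    return sorted(range(len(gap)), key=lambda j: -gap[j])[:k]

def predict(X, y, feats, x_new):
    """Nearest-centroid rule on the selected features."""
    d0 = d1 = 0.0
    for j in feats:
        m0, m1 = class_means(X, y, j)
        d0 += (x_new[j] - m0) ** 2
        d1 += (x_new[j] - m1) ** 2
    return 0 if d0 <= d1 else 1

rng = random.Random(2)
n, p, k, B = 30, 50, 3, 100
y = [i % 2 for i in range(n)]
# Assumed toy data: feature 0 is shifted by 3 in class 1, all others are noise.
X = [[rng.gauss(3.0 * y[i] if j == 0 else 0.0, 1.0) for j in range(p)]
     for i in range(n)]

picked = Counter()   # number of runs in which each feature was selected
tested = Counter()   # number of runs in which each sample was out-of-bag
missed = Counter()   # number of those runs in which it was misclassified
runs = 0
while runs < B:
    idx = [rng.randrange(n) for _ in range(n)]
    Xtr, ytr = [X[i] for i in idx], [y[i] for i in idx]
    if len(set(ytr)) < 2:
        continue                      # a class is missing, draw again
    runs += 1
    feats = top_features(Xtr, ytr, k)
    picked.update(feats)
    for i in set(range(n)) - set(idx):
        tested[i] += 1
        missed[i] += predict(Xtr, ytr, feats, X[i]) != y[i]

stability = {j: picked[j] / B for j in range(p)}
miss_rate = {i: missed[i] / tested[i] for i in tested}
print("most stable features:", sorted(stability, key=stability.get, reverse=True)[:k])
print("most often misclassified:", sorted(miss_rate, key=miss_rate.get, reverse=True)[:3])
```

Features with a selection frequency close to 1 are stable across resamples, and samples with a miss rate close to 1 are candidates for closer inspection, for example for label errors or atypical profiles.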
Bibliography
Duda, Hart, Stork: Pattern Classification
Hastie, Tibshirani, Friedman: The Elements of Statistical Learning
T. Fawcett: ROC Graphs: Notes and Practical Considerations for Data Mining Researchers, HP Laboratories Tech. Rep. HPL-2003-4
A. Webb: Statistical Pattern Recognition
I. Shmulevich, E. Dougherty: Genomic Signal Processing