
T-61.3050 Machine Learning: Basic Principles - Dimensionality Reduction (PowerPoint presentation)



  1. Multivariate Methods and Dimensionality Reduction
  T-61.3050 Machine Learning: Basic Principles
  Kai Puolamäki
  Laboratory of Computer and Information Science (CIS)
  Department of Computer Science and Engineering
  Helsinki University of Technology (TKK)
  Autumn 2007

  2. Outline
  1. Multivariate Methods: Bayes Classifier; Discrete Variables; Multivariate Regression
  2. Dimensionality Reduction: Subset Selection; Principal Component Analysis (PCA); Linear Discriminant Analysis (LDA)

  3. Bayes Classifier
  Data are real vectors. Idea: the vectors come from class-specific multivariate normal distributions.
  Full model: the class covariance matrices have $O(Kd^2)$ parameters in total.
  [Graphical model: class prior $P(C)$, class $C$, and observation $\mathbf{x}$ with class-conditional parameters $\mu_i$, $\Sigma_i$, plate of size $N$. From Figure 5.3 of Alpaydin (2004).]

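As an illustration of the full model, a minimal Python/NumPy sketch (not from the slides; the function and array names are assumptions): it estimates a prior, mean, and full covariance matrix per class and classifies with the quadratic discriminant $g_i(\mathbf{x}) = -\frac{1}{2}\log|\mathbf{S}_i| - \frac{1}{2}(\mathbf{x}-\mathbf{m}_i)^T \mathbf{S}_i^{-1} (\mathbf{x}-\mathbf{m}_i) + \log \hat{P}(C_i)$.

    import numpy as np

    def fit_gaussian_bayes(X, y):
        """Estimate a prior, mean, and full covariance matrix per class."""
        params = {}
        for c in np.unique(y):
            Xc = X[y == c]
            params[c] = (len(Xc) / len(X),          # prior P(C_i)
                         Xc.mean(axis=0),           # mean m_i
                         np.cov(Xc, rowvar=False))  # full covariance S_i
        return params

    def predict_gaussian_bayes(X, params):
        """Choose the class that maximizes the quadratic discriminant g_i(x)."""
        labels, scores = [], []
        for c, (prior, m, S) in params.items():
            diff = X - m
            Sinv = np.linalg.inv(S)
            g = (-0.5 * np.linalg.slogdet(S)[1]
                 - 0.5 * np.einsum('ij,jk,ik->i', diff, Sinv, diff)
                 + np.log(prior))
            labels.append(c)
            scores.append(g)
        return np.array(labels)[np.argmax(scores, axis=0)]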

  4. Bayes Classifier: Common Covariance Matrix
  Idea: the means are class-specific, but the covariance matrix $\Sigma$ is shared by all classes: $O(d^2)$ parameters in the covariance matrix.
  [Figure 5.4: covariances may be arbitrary but are shared by both classes. From: E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004.]
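A short sketch of the shared-covariance estimate (an illustration under the same assumed names as above, not the slides' code): the per-class covariances are pooled into one weighted average, after which the quadratic term common to all classes drops out of the discriminant and decision boundaries become linear in $\mathbf{x}$.

    import numpy as np

    def pooled_covariance(X, y):
        """Shared covariance matrix: average of the per-class covariances,
        weighted by the class proportions."""
        classes, counts = np.unique(y, return_counts=True)
        return sum((n / len(X)) * np.cov(X[y == c], rowvar=False)
                   for c, n in zip(classes, counts))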

  5. Bayes Classifier: Common Diagonal Covariance Matrix
  Idea: the means are class-specific, and the covariance matrix $\Sigma$ is shared and diagonal (Naive Bayes): $d$ parameters in the covariance matrix.
  Discriminant: $g_i(\mathbf{x}) = -\frac{1}{2}\sum_{j=1}^{d}\left(\frac{x_j^t - m_{ij}}{s_j}\right)^2 + \log \hat{P}(C_i)$.
  [Figure 5.5: all classes have equal, diagonal covariance matrices, but the variances are not equal. From: E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004.]
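The diagonal case reduces to per-feature z-scores. A sketch (array names are illustrative), assuming shared per-feature standard deviations `s` estimated from the pooled training data:

    import numpy as np

    def naive_bayes_discriminant(X, means, s, priors):
        """g_i(x) = -1/2 sum_j ((x_j - m_ij)/s_j)^2 + log P(C_i).

        means: (K, d) class means; s: (d,) shared per-feature std devs;
        priors: (K,) class priors.  Returns (n, K) discriminant values.
        """
        z = (X[:, None, :] - means[None, :, :]) / s   # broadcasts to (n, K, d)
        return -0.5 * (z ** 2).sum(axis=2) + np.log(priors)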

  6. Bayes Classifier: Nearest Mean Classifier
  Idea: the means are class-specific, and the covariance matrix is shared and proportional to the unit matrix, $\Sigma = \sigma^2 \mathbf{1}$: one parameter in the covariance matrix.
  Discriminant: $g_i(\mathbf{x}) = -\lVert \mathbf{x} - \mathbf{m}_i \rVert^2$. This is the nearest mean classifier; each mean serves as a prototype.
  [Figure 5.6: all classes have equal, diagonal covariance matrices with equal variance in both dimensions. From: E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004.]
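The nearest mean rule needs only the class means. A minimal sketch (names assumed):

    import numpy as np

    def nearest_mean_predict(X, means, labels):
        """Assign each row of X to the class whose mean (prototype) is closest."""
        d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)  # squared distances
        return labels[np.argmin(d2, axis=1)]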

  7. Outline
  1. Multivariate Methods: Bayes Classifier; Discrete Variables; Multivariate Regression
  2. Dimensionality Reduction: Subset Selection; Principal Component Analysis (PCA); Linear Discriminant Analysis (LDA)

  8. Discrete Features
  Most straightforward with Naive Bayes (replace the Gaussian with a Bernoulli distribution).
  Binary features: $x_j \in \{0, 1\}$ with $p_{ij} \equiv p(x_j = 1 \mid C_i)$.
  If the $x_j$ are independent (Naive Bayes): $p(\mathbf{x} \mid C_i) = \prod_{j=1}^{d} p_{ij}^{x_j} (1 - p_{ij})^{1 - x_j}$.
  The discriminant is linear: $g_i(\mathbf{x}) = \sum_{j=1}^{d} \left[ x_j \log p_{ij} + (1 - x_j) \log(1 - p_{ij}) \right] + \log P(C_i)$.
  Estimated parameters: $\hat{p}_{ij} = \sum_t x_j^t r_i^t / \sum_t r_i^t$.
  [From: Lecture Notes for E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004.]
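A sketch of the Bernoulli version, assuming a binary data matrix `X`; the small clipping term `eps` is an added assumption (not on the slide) to avoid taking log 0:

    import numpy as np

    def fit_bernoulli_nb(X, y, eps=1e-9):
        """Estimate p_ij = p(x_j = 1 | C_i) and the class priors from binary data."""
        classes = np.unique(y)
        p = np.array([X[y == c].mean(axis=0) for c in classes]).clip(eps, 1 - eps)
        priors = np.array([(y == c).mean() for c in classes])
        return classes, p, priors

    def bernoulli_nb_discriminant(X, p, priors):
        """Linear discriminant: sum_j [x_j log p_ij + (1-x_j) log(1-p_ij)] + log P(C_i)."""
        return X @ np.log(p).T + (1 - X) @ np.log(1 - p).T + np.log(priors)

Predictions are then `classes[np.argmax(bernoulli_nb_discriminant(X, p, priors), axis=1)]`.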

  9. Outline
  1. Multivariate Methods: Bayes Classifier; Discrete Variables; Multivariate Regression
  2. Dimensionality Reduction: Subset Selection; Principal Component Analysis (PCA); Linear Discriminant Analysis (LDA)

  10. Multivariate Regression
  Multivariate linear model: $r^t = g(\mathbf{x}^t \mid w_0, w_1, \ldots, w_d) + \epsilon = w_0 + w_1 x_1^t + w_2 x_2^t + \cdots + w_d x_d^t + \epsilon$.
  Multivariate polynomial model: define new higher-order variables, e.g. $z_1 = x_1$, $z_2 = x_2$, $z_3 = x_1^2$, $z_4 = x_2^2$, $z_5 = x_1 x_2$, and use the linear model in this new $\mathbf{z}$ space (basis functions, kernel trick, SVM: Chapter 10).
  [From: Lecture Notes for E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004.]
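A sketch of the polynomial model fitted as a linear least-squares model in the derived z-space (a hypothetical two-feature example; the function name is an assumption):

    import numpy as np

    def polynomial_regression_2d(X, r):
        """Fit the quadratic model via the linear model in z-space:
        z = (1, x1, x2, x1^2, x2^2, x1*x2), solved by least squares."""
        x1, x2 = X[:, 0], X[:, 1]
        Z = np.column_stack([np.ones(len(X)), x1, x2, x1**2, x2**2, x1 * x2])
        w, *_ = np.linalg.lstsq(Z, r, rcond=None)  # least-squares weights
        return w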

  11. Outline
  1. Multivariate Methods: Bayes Classifier; Discrete Variables; Multivariate Regression
  2. Dimensionality Reduction: Subset Selection; Principal Component Analysis (PCA); Linear Discriminant Analysis (LDA)

  12. Why Reduce Dimensionality?
  1. Reduces time complexity: less computation.
  2. Reduces space complexity: fewer parameters.
  3. Saves the cost of observing the feature.
  4. Simpler models are more robust on small datasets.
  5. More interpretable; simpler explanation.
  6. Data visualization (structure, groups, outliers, etc.) if plotted in 2 or 3 dimensions.
  [From: Lecture Notes for E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004.]

  13. Feature Selection vs. Extraction
  Feature selection: choose the $k < d$ most important features, ignoring the remaining $d - k$; subset selection algorithms.
  Feature extraction: project the original $x_i$, $i = 1, \ldots, d$ dimensions to new $k < d$ dimensions $z_j$, $j = 1, \ldots, k$; principal component analysis (PCA), linear discriminant analysis (LDA), factor analysis (FA).
  [From: Lecture Notes for E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004.]
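As a concrete instance of feature extraction, a minimal PCA projection via the SVD (a sketch under the usual mean-centering convention; PCA itself is only named here and treated later in the course):

    import numpy as np

    def pca_project(X, k):
        """Project mean-centered data onto the k directions of largest variance."""
        Xc = X - X.mean(axis=0)
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        W = Vt[:k].T        # (d, k) projection matrix of principal directions
        return Xc @ W, W    # z = W^T (x - mean)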

  14. Subset Selection
  There are $2^d$ possible subsets of $d$ features.
  Forward search: add the best feature at each step.
  - The set of features $F$ is initially empty.
  - At each iteration, find the best new feature: $j = \arg\min_i E(F \cup x_i)$.
  - Add $x_j$ to $F$ if $E(F \cup x_j) < E(F)$.
  This is a hill-climbing $O(d^2)$ algorithm.
  Backward search: start with all features and remove one at a time, if possible.
  Floating search: add $k$, remove $l$.
  [From: Lecture Notes for E. Alpaydın, Introduction to Machine Learning, The MIT Press, 2004.]
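A sketch of the forward search; the `error` callable, which returns the validation error $E(F)$ for a candidate feature subset, is an assumed interface:

    import numpy as np

    def forward_selection(X, y, error):
        """Greedy forward search: at each step add the single feature that
        most reduces the error E(F) of the current subset F; stop when no
        candidate improves on the best subset found so far (hill climbing)."""
        d = X.shape[1]
        F, best_err = [], np.inf
        while len(F) < d:
            candidates = [j for j in range(d) if j not in F]
            errs = [error(X[:, F + [j]], y) for j in candidates]
            i = int(np.argmin(errs))
            if errs[i] >= best_err:
                break  # E(F + x_j) is not below E(F): stop
            F.append(candidates[i])
            best_err = errs[i]
        return F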

  15. Subset Selection Example
  The toy data set consists of 100 10-dimensional vectors from two classes (1 and 0).
  The first two dimensions $x_1^t$ and $x_2^t$ are drawn from a Gaussian with unit variance and mean 1 or -1 for classes 1 and 0, respectively.
  The remaining eight dimensions are drawn from a Gaussian with zero mean and unit variance; that is, they carry no information about the class.
  Optimal classifier: if $x_1 + x_2$ is positive, the class is 1; otherwise the class is 0.
  Use the nearest mean classifier. Split the data at random into a training set of 30+30 items and a validation set of 20+20 items.
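A sketch reproducing the toy setup; the random seed and the exact split mechanics are assumptions, since the slide specifies only the distributions and the 30+30 / 20+20 split:

    import numpy as np

    rng = np.random.default_rng(0)  # fixed seed is an assumption, not from the slide

    # 100 ten-dimensional points, 50 per class; only dims 0-1 carry class information.
    X = rng.normal(size=(100, 10))
    y = np.repeat([1, 0], 50)
    X[y == 1, :2] += 1.0
    X[y == 0, :2] -= 1.0

    # Random split per class: 30+30 training items, 20+20 validation items.
    perm1, perm0 = rng.permutation(50), 50 + rng.permutation(50)
    train = np.concatenate([perm1[:30], perm0[:30]])
    val = np.concatenate([perm1[30:], perm0[30:]])

    def nearest_mean_error(cols):
        """Validation error of the nearest mean classifier on the given columns."""
        m1 = X[train][y[train] == 1][:, cols].mean(axis=0)
        m0 = X[train][y[train] == 0][:, cols].mean(axis=0)
        Xv = X[val][:, cols]
        pred = np.where(((Xv - m1) ** 2).sum(1) < ((Xv - m0) ** 2).sum(1), 1, 0)
        return (pred != y[val]).mean()

    print("informative dimensions:", nearest_mean_error([0, 1]))
    print("all ten dimensions:    ", nearest_mean_error(list(range(10))))

Running the forward search sketched above with `nearest_mean_error` as the error function would be expected to pick out the first two dimensions.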
