Introduction to Big Data and Machine Learning
Dimensionality Reduction: Continuous Latent Variables
Dr. Mihail
October 8, 2019
(Dr. Mihail) Intro Big Data October 8, 2019 1 / 20
Data Dimensionality

Idea: Many datasets have the property that the data points lie close to a manifold of much lower dimensionality than that of the original data space.
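This idea can be illustrated with a small sketch (not from the slides; the data and names here are invented for illustration): 2-D points that lie close to a line have a covariance matrix whose largest eigenvalue carries nearly all of the variance, so a single direction suffices to describe them.

```python
import numpy as np

# Synthetic 2-D data close to a 1-D manifold (a line), plus small noise
rng = np.random.default_rng(0)
t = rng.normal(size=(500, 1))
X = np.hstack([t, 2 * t]) + 0.05 * rng.normal(size=(500, 2))

Xc = X - X.mean(axis=0)               # center the data
cov = Xc.T @ Xc / (Xc.shape[0] - 1)   # sample covariance matrix
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]

# Fraction of total variance captured by the dominant direction
top_fraction = eigvals[0] / eigvals.sum()
print(top_fraction)
```

Because the noise is small, `top_fraction` comes out very close to 1, confirming that one latent direction explains the data.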
# Compute covariance matrix
cov_mat = xzeromean.T.dot(xzeromean) / (xzeromean.shape[0] - 1)
# Compute eigenvalue decomposition
eigenvals, eigenvecs = np.linalg.eig(cov_mat)
# Arrange as (eigenvalue, eigenvector) pairs (tuples)
eig_pairs = [(eigenvals[i], eigenvecs[:, i]) for i in range(len(eigenvals))]
# Sort the (eigenvalue, eigenvector) tuples from high to low
eig_pairs.sort(key=lambda x: x[0], reverse=True)
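As a hedged sketch of how the sorted eigenpairs are typically used (the dataset below is synthetic; only the variable names follow the slide code), the cumulative sum of the sorted eigenvalues gives the fraction of variance retained by the top components, which is the usual criterion for choosing how many to keep.

```python
import numpy as np

# Synthetic stand-in for the slides' zero-mean data matrix `xzeromean`:
# five features with decreasing variance
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 5)) @ np.diag([3.0, 2.0, 1.0, 0.5, 0.1])
xzeromean = x - x.mean(axis=0)

cov_mat = xzeromean.T.dot(xzeromean) / (xzeromean.shape[0] - 1)
eigenvals, eigenvecs = np.linalg.eig(cov_mat)
eig_pairs = [(eigenvals[i], eigenvecs[:, i]) for i in range(len(eigenvals))]
eig_pairs.sort(key=lambda p: p[0], reverse=True)

# Cumulative explained-variance ratio of the sorted eigenvalues
vals = np.array([p[0] for p in eig_pairs]).real
cum = np.cumsum(vals) / vals.sum()
print(cum)
```

The first entry of `cum` is the variance fraction explained by one component; the last entry is always 1, since keeping every component reproduces the data exactly.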
fig, ax = plt.subplots(5, 9, figsize=(25, 15))
for digit in range(5):
    onethree = xzeromean[digit, :]
    ax[digit, 0].imshow(np.reshape(onethree + xbar, (28, 28)))
    ax[digit, 0].set_title('Original')
    for (basis_ix, basis) in enumerate([1, 2, 5, 10, 100, 200, 600, 28*28]):
        subspace = np.array([eig_pairs[i][1] for i in range(basis)]).T
        X_pca = np.dot(onethree, subspace)
        X_recon = np.dot(subspace, X_pca) + xbar
        ax[digit, basis_ix + 1].imshow(np.reshape(np.abs(X_recon), (28, 28)))
        ax[digit, basis_ix + 1].set_title(str(basis) + ' components')
        ax[digit, basis_ix + 1].tick_params(labelbottom=False, labelleft=False)