SLIDE 1

Probability and Statistics for Computer Science

Principal Component Analysis: Exploring the data in fewer dimensions

Hongye Liu, Teaching Assistant Professor, CS361, UIUC, 10/27/2020
Credit: Wikipedia

SLIDE 2

Last time

• Review of Bayesian inference
• Visualizing high-dimensional data & summarizing data
• The covariance matrix

SLIDE 3

Objectives

• Principal Component Analysis
• Two applications: (1) dimension reduction; (2) compression and reconstruction

(Handwritten note: PCA finds directions in the data; we then see the data in those directions.)

SLIDE 4

Examples: Immune Cell Data

• There are 38816 white blood immune cells from a mouse sample.
• Each immune cell has 40+ features/components.
• Four features are used as illustration.
• There are at least 3 cell types involved: T cells, B cells, natural killer cells.

(Handwritten note: the measurements form a d × N matrix with N = 38816; we choose a subset of d = 4 features.)

SLIDE 5

Scatter matrix of Immune Cells

• There are 38816 white blood immune cells from a mouse sample.
• Each immune cell has 40+ features/components.
• Four features are used for the illustration.
• There are at least 3 cell types involved.
• Dark red: T cells; brown: B cells; blue: NK cells; cyan: other small population.

SLIDE 6

PCA of Immune Cells

> res1
$values
[1] 4.7642829 2.1486896 1.3730662 0.4968255

$vectors
           [,1]        [,2]       [,3]       [,4]
[1,]  0.2476698  0.00801294 -0.6822740  0.6878210
[2,]  0.3389872 -0.72010997 -0.3691532 -0.4798492
[3,] -0.8298232  0.01550840 -0.5156117 -0.2128324
[4,]  0.3676152  0.69364033 -0.3638306 -0.5013477

Eigenvalues and eigenvectors of the data's covariance matrix.

(Handwritten note: the T-cell, NK-cell, and B-cell clusters of the data lie along the eigenvector directions.)

SLIDE 7

Properties of the Covariance matrix

Covmat({x}) is a 7 × 7 matrix in this illustration (d = 7).

• The covariance matrix is symmetric: cov({x}; j, k) = cov({x}; k, j).
• It is positive semi-definite, that is, all λi ≥ 0.
• The covariance matrix is diagonalizable.

SLIDE 8

Properties of the Covariance matrix

If we define Xc as the mean-centered matrix for dataset {x}, the covariance matrix is a d × d matrix (d = 7 in the illustration):

Covmat({x}) = Xc Xc^T / N

(Handwritten note: the diagonal entries are the variances σ1², ..., σ7²; the off-diagonal entries are covariances such as cov(1, 2).)
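A minimal sketch of this formula in R (R is assumed here and below, since the slides show R output; the small 2 × 4 matrix is made up for illustration):

# Covmat({x}) = Xc Xc^T / N, with Xc the mean-centered d x N data matrix
X  <- matrix(c(1, 2, 3, 4,
               2, 0, 2, 0), nrow = 2, byrow = TRUE)  # d = 2 features, N = 4 items
Xc <- X - rowMeans(X)      # center each feature; a length-d vector recycles row-wise
Xc %*% t(Xc) / ncol(X)     # the d x d covariance matrix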

SLIDE 9

What is the correlation between the 2 components for the data m?

Covmat(m) =
  [ 20 25 ]
  [ 25 40 ]

corr(feat 1, feat 2) = cov(1, 2) / (σ1 σ2) = 25 / √(20 × 40) ≈ 0.88
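A one-line R check of this arithmetic, using the covariance entries from the slide:

S <- matrix(c(20, 25,
              25, 40), nrow = 2, byrow = TRUE)  # Covmat(m)
S[1, 2] / sqrt(S[1, 1] * S[2, 2])               # 25 / sqrt(20 * 40) ~ 0.884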

SLIDE 10

Example: covariance matrix of a data set

(I) Mean centering:

A0 =
  [  5  4  3  2  1 ]
  [ −1  1  1 −1  0 ]
  ⇒ A1 = A0 − mean =
  [  2  1  0 −1 −2 ]
  [ −1  1  1 −1  0 ]

(II) Inner product of each pair of rows, A2 = A1 A1^T:

[1,1] = 10, [2,2] = 4, [1,2] = 0

(III) Divide the matrix by N, the number of data points:

Covmat({x}) = (1/N) A2 = (1/5) [ 10 0 ; 0 4 ] = [ 2 0 ; 0 0.8 ]

(Handwritten note: cov(1, 2) = 0, so corr(1, 2) = 0.)
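The three steps can be replayed in R; a sketch, assuming the trailing zeros in A0 and A1 (they are forced by the stated inner products):

A0 <- matrix(c( 5, 4, 3,  2, 1,
               -1, 1, 1, -1, 0), nrow = 2, byrow = TRUE)
A1 <- A0 - rowMeans(A0)    # (I)   mean centering
A2 <- A1 %*% t(A1)         # (II)  pairwise inner products: [10 0; 0 4]
A2 / ncol(A0)              # (III) divide by N = 5: [2 0; 0 0.8]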

SLIDE 11

What do the data look like when Covmat({x}) is diagonal?

Covmat({x}) = (1/N) A2 = (1/5) [ 10 0 ; 0 4 ] = [ 2 0 ; 0 0.8 ]

A0 =
  [  5  4  3  2  1 ]
  [ −1  1  1 −1  0 ]

(Scatter plot of X(1) vs X(2): with a diagonal covariance matrix the components are uncorrelated, and the spread is maximal along X(1), the component with the larger variance.)

SLIDE 12

Diagonalization

(Handwritten sketch: for each normalized eigenvector u of A, A u = λ u; collecting the eigenvectors as the columns of U and the eigenvalues into Λ gives A = U Λ U^T.)

SLIDE 13

Diagonalization of a symmetric matrix

• If A is an n × n symmetric square matrix, the eigenvalues are real.
• If the eigenvalues are also distinct, their eigenvectors are orthogonal.
• We can then scale the eigenvectors to unit length and place them into an orthogonal matrix U = [u1 u2 ... un].
• We can write the diagonal matrix Λ = U^T A U such that the diagonal entries of Λ are λ1, λ2, ..., λn in that order.

SLIDE 14

Diagonalization example

For A = [ 5 3 ; 3 5 ], what are the eigenvalues λi? What are the eigenvectors?

det(A − λI) = (5 − λ)² − 9 = 0 ⇒ λ = 8 or λ = 2

For λ = 2: A u = 2u, i.e., (A − 2I) u = 0 ⇒ u = (1/√2) [ 1 ; −1 ] after normalization.

With U = [u1 u2], Λ = U^T A U = ?
SLIDE 15

Diagonalization example (continued)

For λ1 = 8: A u1 = 8 u1, i.e., (A − 8I) u1 = 0 ⇒ u1 = (1/√2) [ 1 ; 1 ] after normalization.

With λ2 = 2 and u2 = (1/√2) [ 1 ; −1 ], we get U = [u1 u2] = (1/√2) [ 1 1 ; 1 −1 ] and

Λ = U^T A U = [ 8 0 ; 0 2 ]
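eigen() in R reproduces this example (a sketch; eigen() may return the eigenvectors with flipped signs, which is harmless):

A   <- matrix(c(5, 3,
                3, 5), nrow = 2, byrow = TRUE)
res <- eigen(A)
res$values          # 8 2, sorted in decreasing order
U   <- res$vectors  # columns: (1,1)/sqrt(2) and (1,-1)/sqrt(2), up to sign
t(U) %*% A %*% U    # Lambda = diag(8, 2), up to rounding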
SLIDE 16

Rotation matrix

Def: R is a rotation matrix if R^T = R^(−1).

We can prove U^T = U^(−1) if U is formed by the normalized eigenvectors. U and U^T are called orthonormal matrices; U and U^T are rotation matrices.
SLIDE 17

(Handwritten sketch: the normalized eigenvectors u1, u2, ... satisfy u_i · u_i = ||u_i||² = 1 and u_i · u_j = 0 for i ≠ j, i.e., the columns of U are orthonormal.)
SLIDE 18

(Handwritten sketch, 2D: multiplying by U does not change a vector's length, since (Ux)^T (Ux) = x^T (U^T U) x = x^T x, because U^(−1) = U^T implies U^T U = I.)
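A numeric sanity check in R (a sketch, reusing the symmetric matrix from the diagonalization example to build an orthonormal U):

U <- eigen(matrix(c(5, 3, 3, 5), nrow = 2))$vectors  # orthonormal columns
x <- c(1, 2)
c(sum(x^2), sum((U %*% x)^2))  # equal: multiplying by U preserves length
round(t(U) %*% U, 10)          # the identity matrix, so U^T = U^-1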
SLIDE 19

Q. Is this true? Transforming a matrix with an orthonormal matrix only rotates the data.

• A. Yes
• B. No

SLIDE 20

Dimension reduction from 2D to 1D

Credit: Prof. Forsyth

SLIDE 21

Step 1: subtract the mean

Credit: Prof. Forsyth

SLIDE 22

Step 2: Rotate to diagonalize the covariance

Credit: Prof. Forsyth

SLIDE 23

Step 3: Drop component(s)

Credit: Prof. Forsyth

SLIDE 24

Principal Components

The columns of U are the normalized eigenvectors of Covmat({x}) and are called the principal components of the data {x}.

SLIDE 25

Principal components analysis

We reduce the dimensionality of dataset {x}, represented by matrix D_{d×n}, from d to s (s < d).

Step 1. Define matrix m_{d×n} such that m = D − mean(D).

Step 2. Define matrix r_{d×n} such that r_i = U^T m_i, where U satisfies Λ = U^T Covmat({x}) U, Λ is the diagonalization of Covmat({x}) with the eigenvalues sorted in decreasing order, and U is the orthonormal eigenvectors' matrix.

Step 3. Define matrix p_{d×n} such that p_i is r_i with the last d − s components of r_i made zero.
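The three steps translate almost line-for-line into R. A minimal sketch, assuming D is d × n with data items as columns as in the slides; the function name pca_project is made up for illustration:

pca_project <- function(D, s) {
  m <- D - rowMeans(D)                        # Step 1: subtract the mean
  Cov <- m %*% t(m) / (ncol(D) - 1)           # sample covariance (see a later slide)
  U <- eigen(Cov)$vectors                     # eigenvalues sorted in decreasing order
  r <- t(U) %*% m                             # Step 2: rotate into the eigenbasis
  p <- r
  if (s < nrow(D)) p[(s + 1):nrow(D), ] <- 0  # Step 3: zero the last d - s components
  list(U = U, r = r, p = p)
}

For example, pca_project(D, 1) on the 2 × 6 matrix of the later example slides reproduces the r and p computed there (up to eigenvector signs).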

SLIDE 26

What happened to the mean?

Step 1. mean(m) = mean(D − mean(D)) = 0
Step 2. mean(r) = U^T mean(m) = U^T 0 = 0
Step 3. mean(p(i)) = mean(r(i)) = 0 while i ∈ 1 : s; mean(p(i)) = 0 while i ∈ s + 1 : d

SLIDE 27

What happened to the covariances?

Step 1. Covmat(m) = Covmat(D) = Covmat({x})
Step 2. Covmat(r) = U^T Covmat(m) U = Λ
Step 3. Covmat(p) is Λ with the last/smallest d − s diagonal terms turned to 0.

(Handwritten note: Step 2 uses the property Covmat({Ax}) = A Covmat({x}) A^T.)
SLIDE 28

Sample covariance matrix

In many statistical programs, the sample covariance matrix is defined to be

Covmat(m) = m m^T / (N − 1)

similar to what happens to the unbiased standard deviation.
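R's built-in cov() uses exactly this N − 1 denominator; it expects observations in rows, hence the transpose (the matrix below is the example data from the next slides, already mean-centered):

m <- matrix(c(3, -4, 7,  1, -4, -3,
              7, -6, 8, -1, -1, -7), nrow = 2, byrow = TRUE)
cov(t(m))                    # [20 25; 25 40]
m %*% t(m) / (ncol(m) - 1)   # the same matrix, computed directly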
SLIDE 29

PCA: an example

Step 1.

D =
  [ 3 −4  7  1 −4 −3 ]
  [ 7 −6  8 −1 −1 −7 ]
  ⇒ mean(D) = [ 0 ; 0 ], so m = D.

SLIDE 30

PCA: an example

Step 1.

D =
  [ 3 −4  7  1 −4 −3 ]
  [ 7 −6  8 −1 −1 −7 ]
  ⇒ mean(D) = [ 0 ; 0 ], so m = D.

Step 2.

Covmat(m) =
  [ 20 25 ]
  [ 25 40 ]
  ⇒ λ1 ≃ 57, λ2 ≃ 3

U^T =
  [  0.5606288  0.8280672 ]
  [ −0.8280672  0.5606288 ]
  ⇒ U =
  [ 0.5606288 −0.8280672 ]
  [ 0.8280672  0.5606288 ]

SLIDE 31

PCA: an example

Step 1.

D =
  [ 3 −4  7  1 −4 −3 ]
  [ 7 −6  8 −1 −1 −7 ]
  ⇒ mean(D) = [ 0 ; 0 ], so m = D.

Step 2.

Covmat(m) =
  [ 20 25 ]
  [ 25 40 ]
  ⇒ λ1 ≃ 57, λ2 ≃ 3

U^T =
  [  0.5606288  0.8280672 ]
  [ −0.8280672  0.5606288 ]

r = U^T m =
  [ 7.478 −7.211 10.549 −0.267 −3.071 −7.478 ]
  [ 1.440 −0.052 −1.311 −1.389  2.752 −1.440 ]

SLIDE 32

PCA: an example

Step 1. m = D, since mean(D) = [ 0 ; 0 ].

Step 2.

Covmat(m) =
  [ 20 25 ]
  [ 25 40 ]
  ⇒ λ1 ≃ 57, λ2 ≃ 3

r = U^T m =
  [ 7.478 −7.211 10.549 −0.267 −3.071 −7.478 ]
  [ 1.440 −0.052 −1.311 −1.389  2.752 −1.440 ]

Step 3.

p =
  [ 7.478 −7.211 10.549 −0.267 −3.071 −7.478 ]
  [ 0      0      0      0      0      0     ]
  → new coordinates along PC1
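The whole example takes a few lines of R (a sketch; eigen() may flip eigenvector signs, which flips the sign of the corresponding row of r and p):

D <- matrix(c(3, -4, 7,  1, -4, -3,
              7, -6, 8, -1, -1, -7), nrow = 2, byrow = TRUE)
m <- D - rowMeans(D)                    # mean(D) = 0, so m = D here
e <- eigen(m %*% t(m) / (ncol(D) - 1))
e$values                                # ~56.9 and ~3.1
r <- t(e$vectors) %*% m                 # Step 2: rotated coordinates
p <- rbind(r[1, ], 0)                   # Step 3: keep PC1, zero out PC2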

SLIDE 33

What is this matrix for the previous example?

U^T Covmat(m) U = ?

(Handwritten answer: it is Λ ≃ [ 57 0 ; 0 3 ].)

SLIDE 34

The mean square error of the projection

The mean square error is the sum of the smallest d − s eigenvalues in Λ:

(1/(N−1)) Σ_i ||r_i − p_i||² = (1/(N−1)) Σ_i Σ_{j=s+1}^{d} (r_i^(j))²
SLIDE 35

The mean square error of the projection

The mean square error is the sum of the smallest d − s eigenvalues in Λ:

(1/(N−1)) Σ_i ||r_i − p_i||² = (1/(N−1)) Σ_i Σ_{j=s+1}^{d} (r_i^(j))²

  = Σ_{j=s+1}^{d} [ (1/(N−1)) Σ_i (r_i^(j))² ]
SLIDE 36

The mean square error of the projection

The mean square error is the sum of the smallest d − s eigenvalues in Λ:

(1/(N−1)) Σ_i ||r_i − p_i||² = (1/(N−1)) Σ_i Σ_{j=s+1}^{d} (r_i^(j))²

  = Σ_{j=s+1}^{d} [ (1/(N−1)) Σ_i (r_i^(j))² ]

  = Σ_{j=s+1}^{d} var(r^(j))
SLIDE 37

The mean square error of the projection

The mean square error is the sum of the smallest d − s eigenvalues in Λ:

(1/(N−1)) Σ_i ||r_i − p_i||² = (1/(N−1)) Σ_i Σ_{j=s+1}^{d} (r_i^(j))²

  = Σ_{j=s+1}^{d} [ (1/(N−1)) Σ_i (r_i^(j))² ]

  = Σ_{j=s+1}^{d} var(r^(j))

  = Σ_{j=s+1}^{d} λ_j
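Continuing the 2D example in R, the projection error matches the dropped eigenvalue (a sketch; the data are from the example slides and are already mean-centered):

D <- matrix(c(3, -4, 7,  1, -4, -3,
              7, -6, 8, -1, -1, -7), nrow = 2, byrow = TRUE)
e <- eigen(D %*% t(D) / (ncol(D) - 1))
r <- t(e$vectors) %*% D
p <- rbind(r[1, ], 0)            # drop the second component (s = 1)
sum((r - p)^2) / (ncol(D) - 1)   # ~3.07
e$values[2]                      # the same: the smallest eigenvalue lambda_2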

SLIDE 38

PCA of Immune Cells

> res1
$values
[1] 4.7642829 2.1486896 1.3730662 0.4968255

$vectors
           [,1]        [,2]       [,3]       [,4]
[1,]  0.2476698  0.00801294 -0.6822740  0.6878210
[2,]  0.3389872 -0.72010997 -0.3691532 -0.4798492
[3,] -0.8298232  0.01550840 -0.5156117 -0.2128324
[4,]  0.3676152  0.69364033 -0.3638306 -0.5013477

Eigenvalues and eigenvectors of the data's covariance matrix.

SLIDE 39

What is the percentage of variance that PC1 covers?

Given the eigenvalues 4.7642829, 2.1486896, 1.3730662, 0.4968255, what is the percentage that PC1 covers?

• A. 54%
• B. 16%
• C. 25%

4.7643 / (4.7643 + 2.1487 + 1.3731 + 0.4968) ≈ 0.54 ⇒ A
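The arithmetic in R:

lambda <- c(4.7642829, 2.1486896, 1.3730662, 0.4968255)
lambda[1] / sum(lambda)   # ~0.5424, so PC1 covers about 54% (answer A)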

SLIDE 40

Notebook on PCA:
https://courses.engr.illinois.edu/cs361/sp2019/notebooks/L18.html

SLIDE 41

Reconstructing the data

• Given the projected data p_{d×n} and mean({x}), we can approximately reconstruct the original data:

  D̂_i = U p_i + mean({x})    (U rotates back, and adding the mean translates back)

• Each reconstructed data item D̂_i is a linear combination of the columns of U weighted by p_i.
• The columns of U are the normalized eigenvectors of Covmat({x}) and are called the principal components of the data {x}.
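A sketch of the reconstruction in R, continuing the 2D example (mean(D) happens to be zero here, so the translation step is shown but invisible):

D    <- matrix(c(3, -4, 7,  1, -4, -3,
                 7, -6, 8, -1, -1, -7), nrow = 2, byrow = TRUE)
mu   <- rowMeans(D)
m    <- D - mu
e    <- eigen(m %*% t(m) / (ncol(D) - 1))
p    <- rbind((t(e$vectors) %*% m)[1, ], 0)   # projected data, PC1 only
Dhat <- e$vectors %*% p + mu                  # rotate back, then add the mean back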

SLIDE 42

End-to-end mean square error

• Each x_i becomes r_i by translation and rotation.
• Each p_i becomes x̂_i by the opposite rotation and translation.
• Therefore the end-to-end mean square error is

  (1/(N−1)) Σ_i ||x_i − x̂_i||² = (1/(N−1)) Σ_i ||r_i − p_i||² = Σ_{j=s+1}^{d} λ_j

  where λ_{s+1}, ..., λ_d are the smallest d − s eigenvalues of Covmat({x}).
SLIDE 43

PCA: Human face data

• The dataset consists of N = 213 images.
• Each image is grayscale and has 64 by 64 resolution.
• We can treat each image as a vector with dimension d = 64 × 64 = 4096.

Credit: Prof. Forsyth

SLIDE 44

How quickly do the eigenvalues decrease?

Credit: Prof. Forsyth

(Handwritten note: the eigenvalue curve turns flat.)

SLIDE 45

What do the principal components of the images look like?

Mean image

The first 16 principal components arranged into images

Credit: Prof. Forsyth

SLIDE 46

Reconstruction of the image

(Figure labels: the original; mean; 1, 5, 10, 20, 50, 100.) The 1st row shows the reconstructions using some number of principal components; the 2nd row shows the corresponding errors.

Credit: Prof. Forsyth

SLIDE 47

Q. Which are true?

• A. PCA allows us to project data to the direction along which the data has the biggest variance.
• B. PCA allows us to compress data.
• C. PCA uses linear transformation to show patterns of data.
• D. PCA allows us to visualize data in lower dimensions.
• E. All of the above.
SLIDE 48

Assignments

• Read Chapter 10 of the textbook.
• Next time: Intro to classification.

SLIDE 49

(Handwritten note: argmax_w (w^T X X^T w) / (w^T w) is a Rayleigh quotient; with ||u|| = 1, its maximizer is the eigenvector u_1 with the largest eigenvalue, i.e., PC1.)

SLIDE 50

Additional References

• Robert V. Hogg, Elliot A. Tanis, and Dale L. Zimmerman, "Probability and Statistical Inference"
• Morris H. DeGroot and Mark J. Schervish, "Probability and Statistics"

SLIDE 51

See you next time!