Factor Analysis

Leibny Paola García Perera
Carnegie Mellon University
Tecnológico de Monterrey, Campus Monterrey, Mexico
Universidad de Zaragoza, Spain

Bhiksha Raj, Juan Arturo Nolazco Flores, Eduardo Lleida
Introduction"
Motivation:"
Dimension reduction"
Modeling: covariance matrix"
Factor Analysis (FA)"
Geometrical explanation"
Formulation (The Equations)"
EM algorithm"
Comparison with PCA and PPCA."
Example with numbers"
Applications"
Speaker Verification: Joint Factor Analysis (JFA)"
Some results"
References"
Problem: lots of data, N feature vectors with P dimensions each, P ≫ 1. Example:

Y = [ a11  a12  a13  …  a1P
      a21  a22  a23  …  a2P
      a31  a32  a33  …  a3P
       ⋮                 ⋮
      aN1  aN2  aN3  …  aNP ]

Can we reduce the number of dimensions, to reduce computing time and simplify the process? YES!
What can give us information about the data? (Just for this special case.)
- The covariance matrix.
- Get rid of unimportant information.
- Think of continuous factors that control the data.
What is Factor Analysis?"
Analysis of the covariance in observed variables (Y)." In terms of few (latent) common factors. " Plus a specific error"
"
ε1 ε2 ε3 ε4
y1 y2 y3 y4 λ1 λ2 λ3
x1 x2 x3
" "
µ λ1 λ2 x1 x2 x3 x ε y
" " " " " " " " Form " Assumptions "
y−µ = Λx +ε
y → P ×1 µ → P ×1 Λ → P × R x → R×1 ε → P ×1
data vector" mean vector" loading Matrix" factor vector" error vector"
y = Λx +ε
Ε(x) = Ε(ε) = 0 Ε(ΛΛ
T ) = Ι
Ε(εε
T ) =ψ =
ψ11 ψPP " # $ $ $ $ % & ' ' ' ' Ε(y, x) = Λ Σ = Ε(yy
T ) = ΛΛ T + Ψ Full rank!!"
Now that we have checked the matrix dimensions, the model. Quick notes:

p(x) = N(x | 0, I)
p(y | x, θ) = N(y | µ + Λx, Ψ)

and p(x, y), p(y), and p(x | y) are all Gaussians!
Now we can compute the marginal

p(y | θ) = ∫ p(x) p(y | x, θ) dx

and this marginal is… a Gaussian! Compute its expected value and covariance:

E(y) = E(µ + Λx + ε) = E(µ) + ΛE(x) + E(ε) = µ

Cov(y) = E[(y − µ)(y − µ)ᵀ]
       = E[(µ + Λx + ε − µ)(µ + Λx + ε − µ)ᵀ]
       = E[(Λx + ε)(Λx + ε)ᵀ]
       = ΛE(xxᵀ)Λᵀ + E(εεᵀ)
       = ΛΛᵀ + Ψ
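As a quick sanity check of this result, here is a minimal MATLAB sketch (the dimensions P = 5, R = 2 and all parameter values are invented for illustration): sample from the generative model and compare the sample covariance of y with ΛΛᵀ + Ψ.

P = 5; R = 2; N = 100000;               % toy sizes (assumptions)
mu     = randn(P, 1);
Lambda = randn(P, R);                   % loading matrix
Psi    = diag(0.1 + rand(P, 1));        % diagonal error covariance
X = randn(R, N);                        % x ~ N(0, I)
E = sqrt(diag(Psi)) .* randn(P, N);     % eps ~ N(0, Psi)
Y = mu + Lambda*X + E;                  % y = mu + Lambda*x + eps
Sigma_model  = Lambda*Lambda' + Psi;
Sigma_sample = cov(Y');                 % rows of Y' are observations
disp(max(abs(Sigma_model(:) - Sigma_sample(:))))   % small for large N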
So, factor analysis is a constrained-covariance Gaussian model! What is the covariance of p(y | θ)?

cov(y) = ΛΛᵀ + diag(ψ11, …, ψPP)
How can we compute the likelihood function?

ℓ(θ; D) = −(N/2) log|ΛΛᵀ + Ψ| − (1/2) Σₙ (yₙ − µ)ᵀ(ΛΛᵀ + Ψ)⁻¹(yₙ − µ)
ℓ(θ; D) = −(N/2) log|Σ| − (1/2) tr( Σ⁻¹ Σₙ (yₙ − µ)(yₙ − µ)ᵀ )
ℓ(θ; D) = −(N/2) log|Σ| − (N/2) tr(Σ⁻¹S)

where S = (1/N) Σₙ (yₙ − µ)(yₙ − µ)ᵀ is the sample data covariance matrix.

Conclusion: maximizing the likelihood drives the constrained model covariance Σ = ΛΛᵀ + Ψ close to the sample covariance!
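The last expression is easy to evaluate directly. A small sketch (the function name fa_loglik is my own choice, not from the slides), assuming Psi is the diagonal error covariance and S the sample covariance:

function ll = fa_loglik(Lambda, Psi, S, N)
    % ll(theta; D) = -(N/2) log|Sigma| - (N/2) tr(Sigma^-1 S)
    Sigma = Lambda*Lambda' + Psi;       % constrained model covariance
    ll = -(N/2)*log(det(Sigma)) - (N/2)*trace(Sigma \ S);
end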
So we need sufficient statistics…
- mean: Σₙ yₙ
- covariance: Σₙ (yₙ − µ)(yₙ − µ)ᵀ
How to estimate ?"
Just compute the mean of the data."
For the rest of the parameters ?"
Expectation Maximization" " " "
µ
Λ, Ψ
Advantages"
Focuses on maximizing the likelihood"
Disadvantages"
Need to know the distribution"
No analytical solution " " " "
Remember EM algorithm?"
E-step:" "
M-step" " " "
qn
t+1 = p xn yn,θ t
θ t+1 = argmax
θ
qn
t+1 x
n
xn yn
What do we need?"
E-step:" " "Conditional probability!!!" " "
M-step:" " "Log of the complete data for:" " " " "
qn
t+1 = p xn yn,θ t
Λt+1 = argmax
Λ
xn, yn
n
n
Ψt+1 = argmax
Ψ
xn, yn
n
n
What else is needed? Let's start with p(x | y). Remember that x and y are jointly Gaussian:

p( [x; y] ) = N( [x; y] | [0; µ], [ I   Λᵀ
                                    Λ   ΛΛᵀ + Ψ ] )

since cov(x, y) = E[(x − 0)(y − µ)ᵀ] = E[x(Λx + ε)ᵀ] = Λᵀ.
Now," " " " " Remember inversion lemma?" " " Inverting this matrix is much more efficient O(MP) instead of O(P2), thanks to the lemma." "
p x y
( ) = N(x m,V)
m = Λ ΛΛT + Ψ
−1 y −u
( )
V = I − ΛT ΛΛT + Ψ
−1 Λ
Σ−1 = ΛΛT + Ψ
−1 = Ψ−1 + Ψ−1Λ I + ΛTΨ−1Λ
−1 ΛTΨ−1
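A quick numeric check of the lemma (a sketch with invented toy dimensions P = 5, R = 2; note the minus sign in the identity):

P = 5; R = 2;
Lambda = randn(P, R);
Psi    = diag(0.1 + rand(P, 1));
iPsi   = diag(1 ./ diag(Psi));               % cheap: Psi is diagonal
Sigma_inv_direct   = inv(Lambda*Lambda' + Psi);   % direct P x P inverse
Sigma_inv_woodbury = iPsi - iPsi*Lambda / (eye(R) + Lambda'*iPsi*Lambda) * (Lambda'*iPsi);   % only an R x R inverse
disp(max(abs(Sigma_inv_direct(:) - Sigma_inv_woodbury(:))))   % ~1e-12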
Applying the Gaussian conditioning formulas, we finally obtain:

p(x | y) = N(x | m, V)
V = (I + ΛᵀΨ⁻¹Λ)⁻¹
m = VΛᵀΨ⁻¹(y − µ)
Some nice observations:
- m = VΛᵀΨ⁻¹(y − µ): the posterior mean is just a linear operation on y!
- V = (I + ΛᵀΨ⁻¹Λ)⁻¹: the posterior covariance does not depend on the observed data!
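Both forms of the posterior (via the P×P joint covariance, and via the efficient R×R matrix) should agree. A small verification sketch with invented toy values:

P = 5; R = 2;
Lambda = randn(P, R);  Psi = diag(0.1 + rand(P, 1));  mu = randn(P, 1);
y = randn(P, 1);                       % one observed vector
iPsi = diag(1 ./ diag(Psi));
% Form 1: via the P x P marginal covariance
S  = Lambda*Lambda' + Psi;
m1 = Lambda' * (S \ (y - mu));
V1 = eye(R) - Lambda' * (S \ Lambda);
% Form 2: via the R x R matrix only (the efficient one)
V2 = inv(eye(R) + Lambda'*iPsi*Lambda);
m2 = V2 * Lambda' * iPsi * (y - mu);
disp([max(abs(m1 - m2)), max(abs(V1(:) - V2(:)))])   % both ~0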
How does it look?" " " " " " " " " "
µ x1 x2 x3 y
Let's subtract the mean for our computation. The likelihood for the complete data is:

ℓ(Λ, Ψ) = Σₙ log p(xₙ, yₙ)
ℓ(Λ, Ψ) = Σₙ log p(xₙ) + Σₙ log p(yₙ | xₙ)
ℓ(Λ, Ψ) = −(N/2) log|Ψ| − (1/2) Σₙ xₙᵀxₙ − (1/2) Σₙ (yₙ − Λxₙ)ᵀΨ⁻¹(yₙ − Λxₙ) + const
ℓ(Λ, Ψ) = −(N/2) log|Ψ| − (N/2) tr(SΨ⁻¹) + const,   S = (1/N) Σₙ (yₙ − Λxₙ)(yₙ − Λxₙ)ᵀ

(The −(1/2) Σₙ xₙᵀxₙ term does not depend on Λ or Ψ, so it is absorbed into the constant.)
Now, let's compute the M-step! (Almost there!) We need the derivatives of the log likelihood,

∂ℓ(Λ, Ψ)/∂Λ = −Ψ⁻¹ Σₙ yₙxₙᵀ + Ψ⁻¹Λ Σₙ xₙxₙᵀ
∂ℓ(Λ, Ψ)/∂Ψ⁻¹ = (N/2)Ψ − (N/2)S

and their expectations with respect to qᵗ, using E[xₙ] = mₙ and E[xₙxₙᵀ] = Vₙ + mₙmₙᵀ:

E[∂ℓ/∂Λ] = −Ψ⁻¹ Σₙ yₙmₙᵀ + Ψ⁻¹Λ Σₙ (Vₙ + mₙmₙᵀ)
E[∂ℓ/∂Ψ⁻¹] = (N/2)Ψ − (N/2) E[S]
Finally, set the derivatives to zero and solve!

Λᵗ⁺¹ = ( Σₙ yₙmₙᵀ ) ( Σₙ (Vₙ + mₙmₙᵀ) )⁻¹
Ψᵗ⁺¹ = (1/N) diag( Σₙ yₙyₙᵀ − Λᵗ⁺¹ Σₙ mₙyₙᵀ )
What are the final equations?"
Sample mean (Subtract the mean from data)." E-step" M-step"
" " "
µ
Λt+1 = ynmnT
n
# $ % & ' ( V n
n
# $ % & ' (
−1
Ψt+1 = 1 N diag ynynT
n
+ Λt+1 mnynT
n
# $ % & ' (
V n = I − ΛTΨ−1Λ
−1
mn =V nΛTΨ−1 y −u
( )
qn
t+1 = p xn yn,θ t
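A minimal runnable MATLAB sketch of the loop (the function name fa_em, the random initialization, and the fixed iteration count are my own choices; the numeric example later in the slides initializes from PCA instead):

function [Lambda, Psi, mu] = fa_em(Y, R, n_iter)
    % Y: N x P data matrix (rows are observations); R: number of factors
    [N, P] = size(Y);
    mu = mean(Y, 1)';                        % sample mean
    Yc = Y' - mu;                            % P x N, centered data
    Lambda = randn(P, R);                    % random init (assumption; PCA init also common)
    Psi = diag(diag(cov(Y)));                % start error variances at data variances
    for it = 1:n_iter
        % E-step: posterior moments for every observation
        iPsi = diag(1 ./ diag(Psi));
        V = inv(eye(R) + Lambda'*iPsi*Lambda);    % shared posterior covariance V_n
        M = V * Lambda' * iPsi * Yc;              % R x N, columns are m_n
        % M-step
        Exx = N*V + M*M';                         % sum_n E[x_n x_n']
        Lambda = (Yc*M') / Exx;
        Psi = diag(diag(Yc*Yc' - Lambda*(M*Yc')) / N);
    end
end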
What does FA really look like?

[Diagram: the plane spanned by the loadings λ1, λ2 through the mean µ; a latent point x on the plane, and the observation y displaced from it by the error ε.]
What is PPCA? Just a quick intuition:

p(x) = N(x | 0, I)
p(y | x, θ) = N(y | µ + Λx, σ²I)

[Diagram: the principal subspace spanned by c1, c2 through µ.]

Nice, isn't it?
What about PCA? Just a quick intuition:

p(x) = N(x | 0, I)
p(y | x, θ) = N(y | µ + Λx, 0), i.e., the limit σ² → 0

[Diagram: as above, with observations lying exactly on the subspace.]

Nice!
Final notes:"
FA is invariant if we change the scale."
FA looks for correlation of the data."
PCA is invariant if we rotate the data."
PCA looks for direction of large variance." " " " "
" " " " PCA"
µ x2 x3 λ1 x ε y λ2 x1
"
More final notes:"
Remember our initial goal?"
Reduce dimensions" " We can decide the value"
a new set of features!! "
Produce a suitable model to explain the data, based on constrained covariance Gaussian. "
y → P ×1 Λ → P × R x → R×1 ε → P ×1
data vector" Loading Matrix" factor vector" error vector"
y = Λx +ε R
p y θ
"
Initialization"
Give statistics a start value"
While (stop criteria)"
Compute sufficient statistics and Expectation" " " "get" " " "get" "
Update the statistics (Maximization) " " " " "update" " " " "update" "
V n mn
Λ Ψ
Y"="[" " """"2.5225"""(1.6369"""(3.6994"""(5.7542"""(2.4632" """"3.8143""""3.9840""""3.3812"""(4.5673"""(1.9867" """"1.8606""""2.6580""""1.0446"""(9.2575"""(1.1736" """"0.6135""""2.5380"""(3.2632""""0.1344"""(1.4441" """"2.1523""""3.1987"""14.3550"""(8.3578"""(1.8787" """"1.3377""""2.6883"""(4.7846"""15.0349"""(2.5611"]" " ybar=mean(Y)" " ybar"=" " """"2.0501""""2.2383""""1.1723"""(2.1279"""(1.9179" " S=cov(Y)" S"=" " """"1.1907""""0.1033""""2.7165"""(4.1557"""(0.1477" """"0.1033""""3.8911""""6.2666""""1.8439""""0.4391" """"2.7165""""6.2666"""51.5143""(36.2420""""0.9312" """(4.1557""""1.8439""(36.2420"""81.6850"""(2.6748" """(0.1477""""0.4391""""0.9312"""(2.6748""""0.2992"
Initialization"
" " " " " " " " " " "
Psi=Psi0"%"PCA"obtained" V=V0"%"PCA"Obtained" " Psi$=$ $ $$$$0.9541$ $$$$2.1716$ $$$$0.1026$ $$$$0.0289$ $$$$0.2156$ $ $ V$=$ $ $$$10.4862$$$10.0082$ $$$10.1978$$$$1.2963$ $$$15.7194$$$$4.3244$ $$$$8.5523$$$$2.9180$ $$$10.2664$$$10.1121$
Sufficient Statistics" " " "
" " " " " " " "
mu="ybar;" Psi=Psi0"%"PCA"obtained" V=V0"%"PCA"Obtained" " Psi$=$ $ $$$$0.9541$ $$$$2.1716$ $$$$0.1026$ $$$$0.0289$ $$$$0.2156$ " " V$=$ $ $$$10.4862$$$10.0082$ $$$10.1978$$$$1.2963$ $$$15.7194$$$$4.3244$ $$$$8.5523$$$$2.9180$ $$$10.2664$$$10.1121$ "B=(V'*V)\V';%LSE"" """" " ""%expectation"Y" """"for"i=1:n" """"""""X(i,:)="B*((Y(i,:)(mu)')";" """"end" " X$=$ $ $$$10.0232$$$11.2666$ $$$10.3266$$$$0.1622$ $$$10.5691$$$10.7228$ $$$$0.4259$$$10.4231$ $$$11.2140$$$$1.3861$ $$$$1.7070$$$$0.8642$ $ Xbar=mean(X);" %conditional"covariance" L=I+V'*IPsi*V;" Covx=eye(m)/L" " Covx$=$ $ $$$$0.0215$$$10.0076$ $$$10.0076$$$$0.0290$ "
Deltas:" " " "
" " " " " " " "
mu="ybar;" Psi=Psi0"%"PCA"obtained" V=V0"%"PCA"Obtained" " Psi$=$ $ $$$$0.9541$ $$$$2.1716$ $$$$0.1026$ $$$$0.0289$ $$$$0.2156$ " " V$=$ $ $$$10.4862$$$10.0082$ $$$10.1978$$$$1.2963$ $$$15.7194$$$$4.3244$ $$$$8.5523$$$$2.9180$ $$$10.2664$$$10.1121$ "B=(V'*V)\V';%LSE"" """"" ""%expectation"X" """"for"i=1:n" """"""""X(i,:)="B*((Y(i,:)(mu)')";" """"end" " X$=$ $ $$$10.0232$$$11.2666$ $$$10.3266$$$$0.1622$ $$$10.5691$$$10.7228$ $$$$0.4259$$$10.4231$ $$$11.2140$$$$1.3861$ $$$$1.7070$$$$0.8642$ $ xbar=mean(x);" %conditional"covariance" L=I+V'*IPsi*V;" Covx=eye(m)/L" " Covx$=$ $ $$$$0.0215$$$10.0076$ $$$10.0076$$$$0.0290$ " Dy="Y("ones(n,1)*mu" Dx="X("repmat(ybar,n,1);" "
mu="ybar;" Psi=Psi0"%"PCA"obtained" V=V0"%"PCA"Obtained" " Psi$=$ $ $$$$0.9541$ $$$$2.1716$ $$$$0.1026$ $$$$0.0289$ $$$$0.2156$ " " V$=$ $ $$$10.4862$$$10.0082$ $$$10.1978$$$$1.2963$ $$$15.7194$$$$4.3244$ $$$$8.5523$$$$2.9180$ $$$10.2664$$$10.1121$ "B=(V'*V)\V';%LSE"" """"" ""%expectation"X" """"for"i=1:n" """"""""X(i,:)="B*((Y(i,:)(mu)')";" """"end" " X$=$ $ $$$10.0232$$$11.2666$ $$$10.3266$$$$0.1622$ $$$10.5691$$$10.7228$ $$$$0.4259$$$10.4231$ $$$11.2140$$$$1.3861$ $$$$1.7070$$$$0.8642$ $ xbar=mean(X);" %conditional"covariance" L=I+V'*IPsi*V;" Covx=eye(m)/L" " Covx$=$ $ $$$$0.0215$$$10.0076$ $$$10.0076$$$$0.0290$ " Dy="Y("ones(n,1)*mu" Dx="X("repmat(xbar,n,1);" " %maximize"V/update" """"V="(Dy'*Dx)/(Covx+(Dy'*Dx))" " V$=$ $ $$$$0.4858$$$10.0088$ $$$$0.1960$$$$1.2885$ $$$$5.7083$$$$4.2920$ $$$18.5476$$$$2.9118$ $$$$0.2663$$$10.1118$ " %update"mu" "mu=mean(Y("X*V');" " %"update"Psi."""" Psi=""(1/n)*"diag((Dy'*Dy)"("(Dy'*Dx)*V'")" $ Psi$=$ $ $$$$0.7951$ $$$$1.8106$ $$$$0.1025$ $$$$0.0290$ $$$$0.1797$
mu="ybar;" Psi=Psi0"%"PCA"obtained" V=V0"%"PCA"Obtained" " Psi$=$ $ $$$$0.9541$ $$$$2.1716$ $$$$0.1026$ $$$$0.0289$ $$$$0.2156$ " " V$=$ $ $$$$0.4858$$$10.0088$ $$$$0.1960$$$$1.2885$ $$$$5.7083$$$$4.2920$ $$$18.5476$$$$2.9118$ $$$$0.2663$$$10.1118$ "B=(V'*V)\V';%LSE"" """" " ""%expectation"X" """"for"i=1:n" """"""""X(i,:)="B*((Y(i,:)(mu)')";" """"end" " X$=$ $ $$$10.0232$$$11.2666$ $$$10.3266$$$$0.1622$ $$$10.5691$$$10.7228$ $$$$0.4259$$$10.4231$ $$$11.2140$$$$1.3861$ $$$$1.7070$$$$0.8642$ $ xbar=mean(X);" %conditional"covariance" L=I+V'*IPsi*V;" Covx=eye(m)/L" " Covx$=$ $ $$$$0.0215$$$10.0076$ $$$10.0076$$$$0.0290$ " Dy="Y("ones(n,1)*mu" Dx="X("repmat(xbar,n,1);" " %maximize"V/update" """"V="(Dy'*Dx)/(Covx+(Dx'*Dx))" " V$=$ $ $$$$0.4858$$$10.0088$ $$$$0.1960$$$$1.2885$ $$$$5.7083$$$$4.2920$ $$$18.5476$$$$2.9118$ $$$$0.2663$$$10.1118$ " %update"mu" "mu=mean(Y("X*V');" " %"update"Psi."""" Psi=""(1/n)*"diag((Dy'*Dy)"("(Dy'*Dx)*V'")" $ Psi$=$ $ $$$$0.7951$ $$$$1.8106$ $$$$0.1025$ $$$$0.0290$ $$$$0.1797$
mu="ybar;" Psi=Psi0"%"PCA"obtained" V=V0"%"PCA"Obtained" " Psi$=$ $ $$$$0.7951$ $$$$1.8106$ $$$$0.1025$ $$$$0.0290$ $$$$0.1797$ " " V$=$ $ $$$$0.4858$$$10.0088$ $$$$0.1960$$$$1.2885$ $$$$5.7083$$$$4.2920$ $$$18.5476$$$$2.9118$ $$$$0.2663$$$10.1118$ "B=(V'*V)\V';%LSE"" """" " ""%expectation"X" """"for"i=1:n" """"""""X(i,:)="B*((Y(i,:)( mu)')";" """"end" " X$=$ $ $$$10.0232$$$11.2666$ $$$10.3266$$$$0.1622$ $$$10.5691$$$10.7228$ $$$$0.4259$$$10.4231$ $$$11.2140$$$$1.3861$ $$$$1.7070$$$$0.8642$ $ xbar=mean(X);" %conditional"covariance" L=I+V'*IPsi*V;" Covx=eye(m)/L" " Covx$=$ $ $$$$0.0215$$$10.0076$ $$$10.0076$$$$0.0290$ " Dy="Y("ones(n,1)*mu" Dx="X("repmat(xbar,n,1);" " %maximize"V/update" """"V="(Dy'*Dx)/(Covx+(Dx'*Dx))" " V$=$ $ $$$$0.4858$$$10.0088$ $$$$0.1960$$$$1.2885$ $$$$5.7083$$$$4.2920$ $$$18.5476$$$$2.9118$ $$$$0.2663$$$10.1118$ " %update"mu" "mu=mean(Y("X*V');" " %"update"Psi."""" Psi=""(1/n)*"diag((Dy'*Dy)"("(Dy'*Dx)*V'")" $ Psi$=$ $ $$$$0.7951$ $$$$1.8106$ $$$$0.1025$ $$$$0.0290$ $$$$0.1797$
" " " " " " " " " " "
1 2 3 4 5 6 7 8 9 10 x 10
4
iterations log lilkelihood
Speaker Verification is a detection problem: accept or reject a user as legitimate based on his speech signal.

d = { accept,  if φ(X, i) > τᵢ
      reject,  otherwise }

- Each speaker has its own model λᵢ, known as the target model, and its antimodel λ̄ᵢ.
- The target model is the prototype of each speaker obtained in training.
- The antimodel is the impostors' prototype.
- When all the impostors share the same model, the final model is called the UBM: Universal Background Model.
Traditional systems are based on the estimation of probability density functions (GMMs in this case), with the UBM trained on data independent of the target speakers.

Problem: there is typically too little data to train each target model from scratch.
Solution: MAP (maximum a posteriori) adaptation. (A scoring sketch follows below.)
What is the real problem?"
Speaker data trained over different channels."
MAP doesn’t work. It does assume conventional conjugate priors." " What is the solution for non-ideal cases? " JFA!!!"
Provides priors for the parameters. "
Separates the speaker and the channel factors."
The channel factors don’t give information of the speaker so they can be marginalized out when computing score."
Is it possible to include a new latent variable? YES! The new model:

M = m + Vy + Ux + Dz

m → CF×1  supervector
V → low-rank matrix: eigenvoices
y → speaker factors
U → low-rank matrix: eigenchannels
x → channel factors
D → diagonal matrix
z → normally distributed random vector

[Diagram: the supervector M composed from the UBM mean m plus speaker and channel offsets.] (A toy sketch follows below.)
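A toy sketch of composing the JFA supervector (all dimensions below are invented for illustration; D is kept as the vector of its diagonal to avoid a large dense matrix):

C = 64; F = 13;            % Gaussians in the UBM, features per frame (assumptions)
CF = C*F;                  % supervector length
Ry = 50; Rx = 20;          % speaker / channel factor ranks (assumptions)
m = randn(CF, 1);          % UBM mean supervector
V = randn(CF, Ry);         % low-rank eigenvoice matrix
U = randn(CF, Rx);         % low-rank eigenchannel matrix
d = rand(CF, 1);           % diagonal of D
y = randn(Ry, 1);          % speaker factors
x = randn(Rx, 1);          % channel factors
z = randn(CF, 1);          % normally distributed residual
M = m + V*y + U*x + d.*z;  % speaker- and channel-dependent supervector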
We may use a change of variables in order to estimate the Vy, Ux, and Dz contributions with the factor-analysis estimation methods:
- Compute sufficient statistics.
- Compute V and y.
- Compute U and x.
- Compute D and z.
" What happened next?" Researchers discovered that the channel factors contained information of the speaker. " Go back to factor analysis!!! Now is called: I-vectors!!!" Important notes:"
JFA is actually used to build a model of the data " I-vectors are used as feature extractor: "
Obtains the important information of the speakers and transforms it into vectors. " "
" " " " "
References
- Factor Analysis for Automatic Speech Recognition.
- P. Kenny, "Joint factor analysis of speaker and session variability: Theory and algorithms," Technical Report CRIM-06/08-13, Montreal, CRIM, 2005.