+ The right answer to the wrong question The use of factor analysis - PowerPoint PPT Presentation

+ The right answer to the wrong question: The use of factor analysis and principal component analysis in the social sciences
Jonathan Rose, Research Fellow, University of Nottingham
Jonathan.Rose@Nottingham.ac.uk

+ Before we start
- A note of caution from the introductory section of a chapter on factor analysis and principal component analysis in The R Book:
  "These techniques are not recommended unless you know exactly what you are doing, and exactly why you are doing it. Beginners are sometimes attracted to multivariate techniques because of the complexity of the output they produce, making the classic mistake of confusing the opaque for the profound." (Crawley, 2007: 731)
- This may be somewhat overstating the case, but it is none the less a healthy reminder. In extremis, people's lives are being staked on incorrect models (more on this later).

+ A fundamental conception of latent variables
- Latent structure: the possibility that the variance in the observed variables (indicators) can be accounted for by a smaller number of latent variables, which are conceivably of a more fundamental nature.
- These variables are 'latent' in the sense that they are not observed, and may well be unobservable.
- Think, for instance, of intelligence, trust, confidence, happiness, etc.
- Almost everything we are really interested in measuring is a latent variable, even if we don't use latent variable models.

+ What you want
- A method to analyze the structure of data
- Either by testing for a specific structure (confirmatory models), or by attempting to discover a structure through various means (exploratory models)
- Understanding that structure tells you which of your indicators are 'like' the others, and which are 'different': basically, what we can lump together and what we can't.

+ Why do you need that?
- More reliable measurement
- Require fewer variables in an analysis
- Avoid multicollinearity
- Understand deep-seated processes that drive responses
- Help with conceptualizing the world
- Avoid spuriously high correlations caused by analyzing two halves of the same whole as if they were in a cause and effect relationship

+ So what might you do?
- For many people, the first response would be this:

+ Then poke around the options…
- Now we're really getting somewhere

+ This is going surprisingly well!
- Now, we just move the variable names over. That big friendly 'OK' button looks so inviting. I bet if I press that I'll get my factor analysis… The defaults will be fine. What's the worst that can happen?

+ Result
- We have findings.
- Yay! Science!
- Now: interpret the numbers!

+ But what did we actually get here?
- Remember the method of analysis we chose?
- And remember the title of the options box?
- Any guesses?

+ That's right!
- We got a Principal Components Analysis (PCA).
- If you look carefully, there are clues that this is what you're getting; but they don't make it anywhere near as explicit as it ought to be…

+ I've heard of PCA. Isn't it basically the same thing as a factor analysis?
- No, despite how they are usually treated.
- There are similarities, which we will discuss in a short while, but the take-home message of the presentation is that PCA and FA are fundamentally different things, even if the results can be similar in some circumstances.

+ Terminological confusion
- Factor analysis has one of the most confused and contradictory terminologies of any analytical method
- Confusion around principal components analysis and factor analysis
- Confusion between various kinds of factor analysis
- Confusion as to what you get out (e.g. factors, components, principal components…)
- And that is without dealing with extraction system, eigenvalues, factor retention criteria, loadings…

+ Perpetuating confusion
- One of the things that perpetuates confusion is the habit in introductory texts of deliberately conflating FA and PCA.
- For example, in SPSS Survival Manual (2007, 3rd ed.), Pallant says, in the chapter called 'Factor Analysis', "I have chosen to demonstrate principal components analysis in this chapter. If you would like to explore other approaches further, see Tabachnick and Fidell (2007)".
- Judging by sales, and the number of copies in the library at Nottingham, this book is clearly a popular way to learn about quantitative analysis using SPSS, but even in the FA chapter it doesn't discuss FA.

+ Perpetuating confusion
- You might have seen research papers saying things like: "we employed a principal components factor analysis (PCF) to aggregate groups of attitudinal questions that reflect a common cluster". Or "We performed a principal component factor analysis of all drug prescriptions during the entire course of the illness in a representative sample of naturalistically treated bipolar outpatients." Or countless other examples.
- 'Principal components factor analysis' basically doesn't exist; it is a conflation of PCA and FA, and it's difficult to know exactly what one gets when papers say that they did this.

+ But PCA and FA are similar, right?
- Somewhat. Indeed, sometimes people argue "either that there is almost no difference between principal components and factor analysis, or that PCA is preferable (Arrindell & van der Ende, 1985; Guadagnoli and Velicer, 1988; Schoenmann, 1990; Steiger, 1990; Velicer & Jackson, 1990)" (from Costello & Osborne, 2005, Best Practices in Exploratory Factor Analysis).
- However…

+ PCA vs Factor Analysis
- Whilst there are overlaps, and sometimes the solutions are similar, they are fundamentally different procedures. They are different:
  - Conceptually
  - Mathematically
  - Practically
- However, you should note that how different the analyses will be in practice cannot easily be specified beforehand

+ Conceptual matters
- A very general latent variable model
- Applies to all kinds of latent variable models
- Multiple causes of manifest items
- But with an important shared cause
- (Note that this is slightly different from how you might see such models elsewhere.)

+ The factor analytic conceptual model
- Conceptually much like other latent variable models
- Unique components are included in the 'error'; they are standardly lumped together because in reality you cannot separate them

+ The PCA conceptual model
- Notably different from the FA model, and from the conceptual model of latent variables

+ PCA and causality
- It is also more difficult to interpret PCA as a causal model, since PCA aims to give you a number of linear combinations of the variables that capture the variance in the set of items as a whole, rather than an analysis of shared variance (as in FA). This breaks (standard) conceptual models of causality.
- There is no need for the relationship to be causal, and so it's not such a big deal when people introduce items that are clearly not caused by an underlying factor.

+ Mathematics
- The equations underlying the procedures reflect this difference in approaches.
- For factor analysis, the model is:
- For PCA it is:
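The equations themselves do not appear in this transcript. As a hedged sketch in one common textbook notation (the symbols below are assumptions, not necessarily those used on the original slide), the factor model writes each observed indicator as a weighted sum of latent factors plus a unique/error term, whereas each principal component is an exact weighted sum of the observed variables with no error term:

```latex
% Common factor model (assumed notation): indicator x_i, m latent factors f_k,
% loadings \lambda_{ik}, and a unique/error term \varepsilon_i
x_i = \lambda_{i1} f_1 + \lambda_{i2} f_2 + \dots + \lambda_{im} f_m + \varepsilon_i

% PCA (assumed notation): component c_j is a linear combination of the
% p observed variables with weights w_{jk}; there is no error term
c_j = w_{j1} x_1 + w_{j2} x_2 + \dots + w_{jp} x_p
```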

+ The mathematical differences between FA and PCA
- It's easy to see that the equations are different. One includes error and unique variance, and the other does not. But this difference means that the analyses are not even conducted upon the same information.

+ The PCA matrix

+ The factor analysis matrix
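The two matrices can be sketched in Python. This is a minimal illustration, assuming NumPy and simulated one-factor data; the function names, the chosen loadings, and the use of squared multiple correlations as initial communality estimates are illustrative assumptions, not taken from the presentation. PCA works on the correlation matrix with 1s on the diagonal, while a principal-axis style factor analysis works on the same matrix with communality estimates on the diagonal.

```python
import numpy as np

def simulate_one_factor(n, loadings, seed=0):
    """Simulate unit-variance indicators driven by one latent factor."""
    rng = np.random.default_rng(seed)
    loadings = np.asarray(loadings, dtype=float)
    factor = rng.standard_normal(n)
    unique = rng.standard_normal((n, loadings.size))
    # Shared part (loading * factor) plus unique part scaled to keep variance 1.
    return factor[:, None] * loadings + unique * np.sqrt(1.0 - loadings**2)

def pca_matrix(data):
    """Matrix analysed by PCA: the correlation matrix, 1s on the diagonal."""
    return np.corrcoef(data, rowvar=False)

def fa_matrix(data):
    """Matrix analysed by principal-axis FA (assumed variant): correlations
    with squared multiple correlations (SMCs) replacing the diagonal."""
    r = np.corrcoef(data, rowvar=False)
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(r))
    reduced = r.copy()
    np.fill_diagonal(reduced, smc)
    return reduced

# Hypothetical loadings, chosen only for illustration.
data = simulate_one_factor(2000, [0.8, 0.7, 0.6, 0.5])
print(np.round(pca_matrix(data), 2))
print(np.round(fa_matrix(data), 2))
```

Only the diagonal differs between the two printed matrices: PCA treats all of each variable's variance as variance to be explained, while the reduced matrix keeps only an estimate of the shared part.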

+ Different matrices, different answers?
- So, we have seen that the mathematics are different, and that means that we use different matrices for our analysis. But does that mean that we are likely to see radically different results when we perform analyses?
- According to Dunteman (1989) in the Sage green book on PCA, "Both principal components analysis and factor analysis give similar results if the communalities of the variables are high and/or there are a large number of variables"
- That the communalities being high makes a difference is not surprising, since it makes the diagonal increasingly close to 1 (which is how it is in PCA).
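The role of the communalities can be checked with a small simulation. The sketch below is illustrative only: it assumes NumPy, simulated one-factor data with hypothetical loadings, and a single-pass principal-axis extraction (no iteration) using squared multiple correlations as communality estimates. It compares first-component PCA loadings with first-factor loadings when communalities are high and when they are low.

```python
import numpy as np

def simulate_one_factor(n, loadings, seed=0):
    """Simulate unit-variance indicators driven by one latent factor."""
    rng = np.random.default_rng(seed)
    loadings = np.asarray(loadings, dtype=float)
    factor = rng.standard_normal(n)
    unique = rng.standard_normal((n, loadings.size))
    return factor[:, None] * loadings + unique * np.sqrt(1.0 - loadings**2)

def first_loadings(matrix):
    """Loadings on the first dimension: top eigenvector times sqrt(eigenvalue)."""
    values, vectors = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    vector = vectors[:, -1]
    vector = vector * np.sign(vector.sum())   # fix the arbitrary sign
    return vector * np.sqrt(values[-1])

def pca_fa_gap(true_loadings, n=5000, seed=0):
    """Largest absolute difference between first-component PCA loadings and
    first-factor loadings from a single-pass principal-axis extraction."""
    data = simulate_one_factor(n, true_loadings, seed)
    full = np.corrcoef(data, rowvar=False)   # PCA: 1s on the diagonal
    reduced = full.copy()                    # FA: communality estimates on the diagonal
    np.fill_diagonal(reduced, 1.0 - 1.0 / np.diag(np.linalg.inv(full)))
    return float(np.max(np.abs(first_loadings(full) - first_loadings(reduced))))

gap_high = pca_fa_gap([0.9] * 6)  # high communalities (0.81 each)
gap_low = pca_fa_gap([0.4] * 6)   # low communalities (0.16 each)
print(f"high communalities: gap = {gap_high:.3f}")
print(f"low communalities:  gap = {gap_low:.3f}")
```

Under these assumptions the gap should be small when the communalities are high and noticeably larger when they are low, which is the pattern Dunteman describes. Real data with several factors and rotation will behave less tidily.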

+ Practical matters
- If there were no practical implications of the choice between FA and PCA, or only minor ones, there would be very little to worry about. Yes, one model might be formally inappropriate, but we use formally inappropriate models all the time: linear regression of dichotomous items, SEM of non-multivariate-normal data, etc.
- Unfortunately, FA and PCA are particularly susceptible to small deviations, not really because of any mathematical quirk, but because of you. FA and PCA, perhaps more than any other method of analysis, require a significant degree of interpretation and theoretical consideration. Coefficients never fully speak for themselves, but they do so even less in FA/PCA than we are used to.

+ A worked example
- Data on the psychological impact of Huntington's Disease
- 1803 cases
- Dealing with:
  - Depressed mood
  - Aggression
  - Irritability
  - Low self-esteem
  - Hallucinations
  - Suicidal thoughts
  - Delusions
  - Anxiety
  - Compulsions
  - Perseveration
  - Apathy
