
Analysis of sorting data using multiple correspondence analysis and a related method



  1. Analysis of sorting data using multiple correspondence analysis and a related method
     E.M. Qannari, Ph. Courcoux, V. Cariou
     ONIRIS, Nantes, F-44322, France

  2. Sorting data: Procedure
     n stimuli are evaluated by m subjects, each of whom receives the instruction: “Please sort the stimuli into as many groups as you consider necessary, with the understanding that stimuli in the same group are perceived as similar.”
     [Figure: example partitions produced by Subject 1, Subject 2, ..., Subject m, with groups labelled Acid, Salty, Fresh, Sweet, Bitter]

  3. General setting and notations
     The data consist of m categorical variables, each represented by its indicator variables.
     [Figure: the data table with n rows, X = [X_1, X_2, ..., X_j, ..., X_m], where block X_j holds the indicators of the K_j groups formed by subject j]

  4. Beer data
     Data from Abdi H., Chollet S., Valentin D. and Chréa C. (2007). Analysing assessors and products in sorting tasks: DISTATIS, theory and applications. Food Quality and Preference.

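  5. Data from Abdi et al. (2007)
     • The data relate to an experiment in which ten consumers were instructed to sort eight commercial beers.

     | # | Beer | Subj1 | Subj2 | Subj3 | Subj4 | Subj5 | Subj6 | Subj7 | Subj8 | Subj9 | Subj10 |
     |---|------|-------|-------|-------|-------|-------|-------|-------|-------|-------|--------|
     | 1 | Affligen | 1 | 4 | 3 | 4 | 1 | 1 | 2 | 2 | 1 | 3 |
     | 2 | Budweiser | 4 | 5 | 2 | 5 | 2 | 3 | 1 | 1 | 4 | 3 |
     | 3 | BucklerBlonde | 3 | 1 | 2 | 3 | 2 | 4 | 3 | 1 | 1 | 2 |
     | 4 | Killian | 4 | 2 | 3 | 3 | 1 | 1 | 1 | 2 | 1 | 4 |
     | 5 | StLandelin | 1 | 5 | 3 | 5 | 2 | 1 | 1 | 2 | 1 | 3 |
     | 6 | BucklerHighland | 2 | 3 | 1 | 1 | 3 | 5 | 4 | 4 | 3 | 1 |
     | 7 | FruitDefendu | 1 | 4 | 3 | 4 | 1 | 1 | 2 | 2 | 2 | 4 |
     | 8 | EKU28 | 5 | 2 | 4 | 2 | 4 | 2 | 5 | 3 | 4 | 5 |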
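As an illustration of the notation of slide 3, the sketch below turns the beer table into the concatenated indicator matrix X = [X_1, ..., X_m]. It is a minimal example in plain Python; the names `SORTS`, `indicator_block` and `indicator_matrix` are hypothetical and do not come from the slides.

```python
# Sorting data from the slide: one row per beer, one column per subject.
# Each entry is the label of the group in which the subject placed the beer.
BEERS = ["Affligen", "Budweiser", "BucklerBlonde", "Killian",
         "StLandelin", "BucklerHighland", "FruitDefendu", "EKU28"]
SORTS = [
    [1, 4, 3, 4, 1, 1, 2, 2, 1, 3],
    [4, 5, 2, 5, 2, 3, 1, 1, 4, 3],
    [3, 1, 2, 3, 2, 4, 3, 1, 1, 2],
    [4, 2, 3, 3, 1, 1, 1, 2, 1, 4],
    [1, 5, 3, 5, 2, 1, 1, 2, 1, 3],
    [2, 3, 1, 1, 3, 5, 4, 4, 3, 1],
    [1, 4, 3, 4, 1, 1, 2, 2, 2, 4],
    [5, 2, 4, 2, 4, 2, 5, 3, 4, 5],
]


def indicator_block(labels):
    """Return the n x K_j 0/1 indicator matrix X_j of one subject's partition."""
    groups = sorted(set(labels))
    return [[1 if lab == g else 0 for g in groups] for lab in labels]


def indicator_matrix(sorts):
    """Concatenate the blocks X_1, ..., X_m column-wise (not centered)."""
    n, m = len(sorts), len(sorts[0])
    blocks = [indicator_block([row[j] for row in sorts]) for j in range(m)]
    return [[v for j in range(m) for v in blocks[j][i]] for i in range(n)]


X = indicator_matrix(SORTS)
print(len(X), "rows,", len(X[0]), "indicator columns")
```

Every row of X sums to m, because each subject places each beer in exactly one group. The slides work with centered indicators; centering would simply subtract each column mean.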

  6. Discrimination indices and MCA
     • Given a quantitative variable z and a categorical variable $X_j$, let $\eta^2(z/j)$ denote the discrimination index: the between-groups to total variance ratio associated with z and $X_j$.
     • We seek z so as to maximize:
       $$I(z) = \sum_{j=1}^{m} \eta^2(z/j)$$
     • It is known that this problem leads to MCA.
     • Subsequent z variables (factors) are sought following the same strategy, under orthogonality constraints.
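The criterion on this slide can be made concrete with a small sketch: the discrimination index of a score vector z for one partition is the between-groups sum of squares divided by the total sum of squares, and I(z) adds these ratios over subjects. The function names `discrimination_index` and `mca_criterion` are hypothetical, and the toy data are invented for illustration only.

```python
def discrimination_index(z, labels):
    """Between-groups to total variance ratio of z for one partition."""
    n = len(z)
    mean = sum(z) / n
    total = sum((v - mean) ** 2 for v in z)
    groups = {}
    for v, lab in zip(z, labels):
        groups.setdefault(lab, []).append(v)
    between = sum(len(g) * (sum(g) / len(g) - mean) ** 2
                  for g in groups.values())
    return between / total


def mca_criterion(z, partitions):
    """I(z): sum of the discrimination indices over all partitions."""
    return sum(discrimination_index(z, labels) for labels in partitions)


# Toy example (not from the slides): four stimuli, two subjects.
z = [0.0, 0.0, 1.0, 1.0]
subject_a = [1, 1, 2, 2]   # groups match z exactly
subject_b = [1, 2, 1, 2]   # groups are unrelated to z

print(discrimination_index(z, subject_a))
print(discrimination_index(z, subject_b))
print(mca_criterion(z, [subject_a, subject_b]))
```

A partition that separates the values of z perfectly gives an index of 1, and one whose group means all coincide gives 0, so I(z) lies between 0 and m.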

  7. Standardized MCA
     • Alternatively:
       $$I(z) = \sum_{j=1}^{m} \frac{1}{K_j}\, \eta^2(z/j)$$

  8. MCA applied to beer data
     [Figure: two scatter plots of the eight beers, "Representation of the beers, axes 1&2" (axis 1 vs. axis 2) and "Representation of the beers, axes 3&4" (axis 3 vs. axis 4)]

  9. Alternative method: maximizing the between-groups variances
     • $X = [X_1, X_2, \ldots, X_m]$ (the indicator variables are assumed to be centered).
     • Let $z = Xu$ and denote by $B(z/j)$ the between-groups variance of z with respect to $X_j$.
     • We define the total between-groups variance as:
       $$B(z) = \sum_{j=1}^{m} B(z/j)$$

  10. An alternative method to MCA
      • We can show that the vector of loadings u is the eigenvector, associated with the largest eigenvalue, of the matrix:
        $$X^T \Big[ \sum_{j=1}^{m} X_j \big(X_j^T X_j\big)^{-1} X_j^T \Big] X = X^T P X \quad \text{with} \quad P = \sum_{j=1}^{m} X_j \big(X_j^T X_j\big)^{-1} X_j^T$$
      • Subsequent z variables can be sought following the same strategy, under orthogonality constraints.
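The following sketch approximates the leading loading vector for the beer data. It rests on several assumptions that are not stated on the slides: u is normalised to unit length, power iteration is used in place of a full eigendecomposition, and each term of P is applied as "replace every z_i by the mean of its group", which is what $X_j (X_j^T X_j)^{-1} X_j^T$ does for disjoint indicators acting on a centered z. All function names are hypothetical.

```python
import math

SORTS = [
    [1, 4, 3, 4, 1, 1, 2, 2, 1, 3],
    [4, 5, 2, 5, 2, 3, 1, 1, 4, 3],
    [3, 1, 2, 3, 2, 4, 3, 1, 1, 2],
    [4, 2, 3, 3, 1, 1, 1, 2, 1, 4],
    [1, 5, 3, 5, 2, 1, 1, 2, 1, 3],
    [2, 3, 1, 1, 3, 5, 4, 4, 3, 1],
    [1, 4, 3, 4, 1, 1, 2, 2, 2, 4],
    [5, 2, 4, 2, 4, 2, 5, 3, 4, 5],
]
# One partition (list of group labels) per subject.
LABELS = [[row[j] for row in SORTS] for j in range(len(SORTS[0]))]


def centered_columns(labels_by_subject):
    """Columns of X: centered 0/1 indicators of every group of every subject."""
    cols = []
    for labels in labels_by_subject:
        for g in sorted(set(labels)):
            col = [1.0 if lab == g else 0.0 for lab in labels]
            mean = sum(col) / len(col)
            cols.append([c - mean for c in col])
    return cols


def project(z, labels):
    """P_j z: replace each z_i by the mean of z over the group containing i."""
    groups = {}
    for v, lab in zip(z, labels):
        groups.setdefault(lab, []).append(v)
    return [sum(groups[lab]) / len(groups[lab]) for lab in labels]


def apply_A(cols, labels_by_subject, u):
    """Return (X^T P X) u without forming the matrix explicitly."""
    n = len(cols[0])
    z = [sum(col[i] * w for col, w in zip(cols, u)) for i in range(n)]  # z = X u
    pz = [0.0] * n
    for labels in labels_by_subject:
        for i, v in enumerate(project(z, labels)):
            pz[i] += v                                                  # P z
    return [sum(c * v for c, v in zip(col, pz)) for col in cols]        # X^T (P z)


def rayleigh(cols, labels_by_subject, u):
    """u^T (X^T P X) u / u^T u, i.e. n times B(z) for a unit-length u."""
    au = apply_A(cols, labels_by_subject, u)
    return sum(a * b for a, b in zip(u, au)) / sum(a * a for a in u)


def leading_loadings(cols, labels_by_subject, iters=500):
    """Power iteration from an arbitrary start; returns a unit-length u."""
    u = [1.0] + [0.0] * (len(cols) - 1)
    for _ in range(iters):
        w = apply_A(cols, labels_by_subject, u)
        norm = math.sqrt(sum(v * v for v in w))
        u = [v / norm for v in w]
    return u


COLS = centered_columns(LABELS)
u = leading_loadings(COLS, LABELS)
print("criterion value:", rayleigh(COLS, LABELS, u))
```

Power iteration is chosen here only because it needs no external library; any symmetric eigensolver applied to the same matrix would serve the same purpose.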

  11. The rationale behind the method of analysis
      • In addition to investigating the relationships between the categorical variables, we take account of the variances of the indicator variables.
      • Var(indicator) = p(1 - p)
      [Figure: "Variance of an indicator variable", p(1 - p) plotted against p from 0.0 to 1.0; the two ends of the p axis are annotated "Presence of rare categories"]
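A short numerical illustration of the variance formula, using Subject 1's partition from the beer table: the indicator of a group containing a single beer has a much smaller variance than that of a group containing three beers. The function name `indicator_variance` is hypothetical.

```python
def indicator_variance(labels, group):
    """Variance of the 0/1 indicator of `group`: p * (1 - p)."""
    p = sum(1 for lab in labels if lab == group) / len(labels)
    return p * (1 - p)


# Subject 1's partition of the eight beers (first column of the beer data).
subject1 = [1, 4, 3, 4, 1, 2, 1, 5]

for g in sorted(set(subject1)):
    size = subject1.count(g)
    var = indicator_variance(subject1, g)
    print(f"group {g}: {size} beer(s), variance {var:.4f}")
```

Under the alternative method these variances are kept in the analysis, so indicators of rare categories, whose variance is small, weigh less.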

  12. Alternative method applied to beer data
      [Figure: two scatter plots of the eight beers, "Representation of the beers, axes 1&2" (axis 1 vs. axis 2) and "Representation of the beers, axes 3&4" (axis 3 vs. axis 4)]

  13. A continuum approach
      • MCA: $z = Xu$ with u eigenvector of $(X^T X)^{-1} X^T P X$
      • Alternative method: $z = Xu$ with u eigenvector of $X^T P X$
      • Regularized MCA: $z = Xu$ with u eigenvector of $\big[(1-\lambda)\, X^T X + \lambda I\big]^{-1} X^T P X$

  14. Continuum approach and ridge regularization
      The eigenvectors of $\big[(1-\lambda)\, X^T X + \lambda I\big]^{-1} X^T P X$ are also eigenvectors of $(X^T X + kI)^{-1} X^T P X$ (ridge regularization), with $k = \lambda / (1 - \lambda)$.
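The link with ridge regularization follows from factoring out $(1-\lambda)$. The short derivation below assumes $0 \le \lambda < 1$ and the parametrization in which $\lambda = 0$ gives MCA and $\lambda \to 1$ approaches the alternative method.

```latex
\begin{align*}
(1-\lambda)\,X^{T}X + \lambda I
  &= (1-\lambda)\Big(X^{T}X + \tfrac{\lambda}{1-\lambda}\,I\Big)
   = (1-\lambda)\big(X^{T}X + kI\big),
   \qquad k = \frac{\lambda}{1-\lambda},\\[4pt]
\big[(1-\lambda)\,X^{T}X + \lambda I\big]^{-1} X^{T}PX
  &= \frac{1}{1-\lambda}\,\big(X^{T}X + kI\big)^{-1} X^{T}PX .
\end{align*}
```

The two matrices differ only by the scalar factor $1/(1-\lambda)$, so they share their eigenvectors and their eigenvalues differ by that factor.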

  15. RMCA ($\lambda = 0.95$)
      [Figure: two scatter plots of the eight beers, "Representation of the products, axes 1&2" (axis 1 vs. axis 2) and "Representation of the products, axes 3&4" (axis 3 vs. axis 4)]

  16. Property 1 illustrated on beer data
      The variance of z increases with $\lambda$.
      [Figure: variance of z plotted against $\lambda$, between the MCA and alternative-method end points]

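  17. Property 2 illustrated on beer data
      The between-groups variance of z increases with $\lambda$.
      [Figure: between-groups variance of z plotted against $\lambda$, between the MCA and alternative-method end points]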

  18. Property 3 illustrated on beer data
      The discrimination index (between-groups to total variance ratio) of z decreases with $\lambda$.
      [Figure: discrimination index of z plotted against $\lambda$ from 0.0 to 1.0, between the MCA and alternative-method end points]

  19. Conclusion
      • We propose an alternative method that handles the problem of rare categories.
      • Further research is needed to investigate this alternative method.
      • We propose a continuum approach whose end points are MCA and the alternative method.
      • This approach enjoys interesting properties and can easily be extended to the framework of Generalized Canonical Correlation Analysis.
      • It remains to be seen how it relates to Regularized MCA by Takane and Hwang.

  20. TRUGAREZ! (Thank you!)

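  21. Co-occurrence matrix

      | Beers | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
      |-------|---|---|---|---|---|---|---|---|
      | 1 | 10 | 1 | 1 | 5 | 6 | 0 | 8 | 0 |
      | 2 | 1 | 10 | 3 | 2 | 5 | 0 | 0 | 1 |
      | 3 | 1 | 3 | 10 | 2 | 2 | 0 | 0 | 0 |
      | 4 | 5 | 2 | 2 | 10 | 5 | 0 | 5 | 1 |
      | 5 | 6 | 5 | 2 | 5 | 10 | 0 | 4 | 0 |
      | 6 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 |
      | 7 | 8 | 0 | 0 | 5 | 4 | 0 | 10 | 0 |
      | 8 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 10 |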
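The co-occurrence matrix can be recomputed from the sorting data of slide 5: entry (i, k) counts the subjects who placed beers i and k in the same group. The sketch below is one straightforward way to do this; the function name `cooccurrence` is hypothetical.

```python
# Sorting data from slide 5: one row per beer, one column per subject.
SORTS = [
    [1, 4, 3, 4, 1, 1, 2, 2, 1, 3],
    [4, 5, 2, 5, 2, 3, 1, 1, 4, 3],
    [3, 1, 2, 3, 2, 4, 3, 1, 1, 2],
    [4, 2, 3, 3, 1, 1, 1, 2, 1, 4],
    [1, 5, 3, 5, 2, 1, 1, 2, 1, 3],
    [2, 3, 1, 1, 3, 5, 4, 4, 3, 1],
    [1, 4, 3, 4, 1, 1, 2, 2, 2, 4],
    [5, 2, 4, 2, 4, 2, 5, 3, 4, 5],
]


def cooccurrence(sorts):
    """Count, for each pair of stimuli, the subjects who grouped them together."""
    n = len(sorts)
    return [[sum(a == b for a, b in zip(sorts[i], sorts[k])) for k in range(n)]
            for i in range(n)]


C = cooccurrence(SORTS)
for row in C:
    print(" ".join(f"{v:2d}" for v in row))
```

The diagonal equals the number of subjects (10), and the matrix is symmetric. Beer 6 (BucklerHighland) is never grouped with any other beer.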
