
Regularized generalized CCA (RGCCA) Arthur Tenenhaus (SUPELEC)



  1. Regularized generalized CCA (RGCCA). Arthur Tenenhaus (SUPELEC), Michel Tenenhaus (HEC Paris)

  2. Regularized generalized CCA: a generalization of regularized canonical correlation analysis to more than two blocks

  3. References
     • Paper: Arthur & Michel Tenenhaus, Regularized Generalized CCA, Psychometrika (June 2011)
     • R code: new package RGCCA, initial version 1.0. Title: Regularized Generalized Canonical Correlation Analysis. Version: 1.0. Date: 2010-06-08. Author: Arthur Tenenhaus. Repository: CRAN. Date/Publication: 2010-10-15 14:58:02. More information about RGCCA at CRAN.

  4. Economic inequality and political instability. Data from Russett (1964), in GIFI.
     Economic inequality, agricultural inequality:
     • GINI: inequality of land distributions
     • FARM: % farmers that own half of the land (> 50)
     • RENT: % farmers that rent all their land
     Economic inequality, industrial development:
     • GNPR: gross national product per capita ($ 1955)
     • LABO: % of labor force employed in agriculture
     Political instability:
     • INST: instability of executive (45-61)
     • ECKS: number of violent internal war incidents (46-61)
     • DEAT: number of people killed as a result of civic group violence (50-62)
     • D-STAB: stable democracy
     • D-UNST: unstable democracy
     • DICT: dictatorship

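  5. Economic inequality and political instability (Data from Russett, 1964). Three data blocks: X1 = (Gini, Farm, Rent), X2 = (Gnpr, Labo), X3 = (Inst, Ecks, Deat, Demo).

     | Country | Gini | Farm | Rent | Gnpr | Labo | Inst | Ecks | Deat | Demo |
     | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
     | Argentina | 86.3 | 98.2 | 32.9 | 374 | 25 | 13.6 | 57 | 217 | 2 |
     | Australia | 92.9 | 99.6 | * | 1215 | 14 | 11.3 | 0 | 0 | 1 |
     | Austria | 74.0 | 97.4 | 10.7 | 532 | 32 | 12.8 | 4 | 0 | 2 |
     | … | | | | | | | | | |
     | France | 58.3 | 86.1 | 26.0 | 1046 | 26 | 16.3 | 46 | 1 | 2 |
     | … | | | | | | | | | |
     | Yugoslavia | 43.7 | 79.8 | 0.0 | 297 | 67 | 0.0 | 9 | 0 | 3 |

     Demo: 1 = stable democracy, 2 = unstable democracy, 3 = dictatorship.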

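  6. Block component

     $$y_1 = X_1 a_1 = a_{11}\,\mathrm{GINI} + a_{12}\,\mathrm{FARM} + a_{13}\,\mathrm{RENT}$$
     $$y_2 = X_2 a_2 = a_{21}\,\mathrm{GNPR} + a_{22}\,\mathrm{LABO}$$
     $$y_3 = X_3 a_3 = a_{31}\,\mathrm{INST} + a_{32}\,\mathrm{ECKS} + a_{33}\,\mathrm{DEATH} + a_{34}\,\text{D-STB} + a_{35}\,\text{D-UNST} + a_{36}\,\mathrm{DICT}$$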
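
As a small illustration of the third block component, the sketch below dummy-codes Demo into the three indicators and evaluates $y_3 = X_3 a_3$ on three rows of the excerpt from slide 5. The weights `a3` are hypothetical values chosen only to make the arithmetic visible, and the columns are left uncentered and unscaled, which a real analysis would presumably not do.

```python
# Rows copied from the Russett excerpt on slide 5: (Inst, Ecks, Deat, Demo).
rows = {
    "Argentina": (13.6, 57, 217, 2),
    "Australia": (11.3, 0, 0, 1),
    "Yugoslavia": (0.0, 9, 0, 3),
}


def dummy_code(demo):
    """Expand Demo (1, 2, 3) into the indicators (D-STB, D-UNST, DICT)."""
    return tuple(1.0 if demo == level else 0.0 for level in (1, 2, 3))


def block_component(X, a):
    """Return y = X a, one value per row of X."""
    return [sum(x * w for x, w in zip(row, a)) for row in X]


# X3 has six columns: INST, ECKS, DEAT, D-STB, D-UNST, DICT.
X3 = [(inst, ecks, deat) + dummy_code(demo)
      for inst, ecks, deat, demo in rows.values()]

# Hypothetical outer weights a31..a36 (an assumption, not fitted values).
a3 = [0.1, 0.01, 0.001, -1.0, 0.5, 1.0]
y3 = block_component(X3, a3)
```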

  7. RGCCA applied to the Russett data
     [Path diagram: Agricultural inequality (X1: GINI, FARM, RENT), Industrial development (X2: GNPR, LABO) and Political instability (X3: INST, ECKS, DEAT, D-STB, D-INS, DICT), with links $c_{13} = 1$, $c_{12} = 0$, $c_{23} = 1$.]

     $$\max_{a_1, a_2, a_3} \; g(\mathrm{Cov}(X_1 a_1, X_3 a_3)) + g(\mathrm{Cov}(X_2 a_2, X_3 a_3))$$

     subject to the constraints

     $$\tau_j \lVert a_j \rVert^2 + (1 - \tau_j)\,\mathrm{Var}(X_j a_j) = 1, \quad j = 1, 2, 3$$

     with $0 \le \tau_j \le 1$ and g = identity, square or absolute value.

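  8. The two-block case: Regularized CCA

     $$\max_{a_1, a_2} \; \mathrm{Cov}(X_1 a_1, X_2 a_2) \quad \text{subject to} \quad \tau_j \lVert a_j \rVert^2 + (1 - \tau_j)\,\mathrm{Var}(X_j a_j) = 1, \quad j = 1, 2$$

     Special cases:

     | Method | Criterion | Constraints |
     | --- | --- | --- |
     | PLS regression | Maximize $\mathrm{Cov}(X_1 a_1, X_2 a_2)$ | $\lVert a_1 \rVert = \lVert a_2 \rVert = 1$ |
     | Canonical Correlation Analysis | Maximize $\mathrm{Cor}(X_1 a_1, X_2 a_2)$ | $\mathrm{Var}(X_1 a_1) = \mathrm{Var}(X_2 a_2) = 1$ |
     | Redundancy analysis of $X_1$ with respect to $X_2$ | Maximize $\mathrm{Cor}(X_1 a_1, X_2 a_2)\,\mathrm{Var}(X_1 a_1)^{1/2}$ | $\lVert a_1 \rVert = 1$, $\mathrm{Var}(X_2 a_2) = 1$ |

     In redundancy analysis, the components $X_1 a_1$ and $X_2 a_2$ are well correlated, the 1st component is stable, and there is no stability condition for the 2nd component.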

  9. The two-block case: Regularized CCA

     $$\max_{a_1, a_2} \; \mathrm{Cov}(X_1 a_1, X_2 a_2) \quad \text{subject to} \quad \tau_j \lVert a_j \rVert^2 + (1 - \tau_j)\,\mathrm{Var}(X_j a_j) = 1, \quad j = 1, 2$$

     Special cases:

     | Method | Criterion | Comments |
     | --- | --- | --- |
     | PLS regression ($\lVert a_1 \rVert = \lVert a_2 \rVert = 1$) | Maximize $\mathrm{Cov}(X_1 a_1, X_2 a_2)$ | Favors stability too much relative to correlation |
     | Canonical Correlation Analysis | Maximize $\mathrm{Cor}(X_1 a_1, X_2 a_2)$ | Favors correlation too much relative to stability |

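  10. Choice of the shrinkage constant $\tau_j$

      $$\max_{a_1, a_2} \; \mathrm{Cov}(X_1 a_1, X_2 a_2) \quad \text{subject to} \quad \tau_j \lVert a_j \rVert^2 + (1 - \tau_j)\,\mathrm{Var}(X_j a_j) = 1, \quad j = 1, 2$$

      On the scale from $\tau_j = 0$ to $\tau_j = 1$, values near 0 favor correlation and values near 1 favor stability. Schäfer and Strimmer (2005) give a formula for an optimal choice of $\tau_j$.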
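
To make the role of the shrinkage constant concrete, here is a minimal sketch that rescales a weight vector so that $\tau \lVert a \rVert^2 + (1 - \tau)\,\mathrm{Var}(Xa) = 1$. It assumes the biased (1/n) variance; the block `X` and the starting vector `a` are hypothetical toy values, and `normalize` is an illustrative helper, not a function of the RGCCA package.

```python
def variance(values):
    """Biased (1/n) variance of a sequence."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def component(X, a):
    """Block component y = X a for X given as a list of rows."""
    return [sum(x * w for x, w in zip(row, a)) for row in X]


def normalize(X, a, tau):
    """Rescale a so that tau * ||a||^2 + (1 - tau) * Var(X a) = 1."""
    norm_sq = sum(w * w for w in a)
    scale = (tau * norm_sq + (1.0 - tau) * variance(component(X, a))) ** 0.5
    return [w / scale for w in a]


# Toy block (hypothetical numbers, not the Russett data).
X = [[1.0, 2.0], [2.0, 0.0], [3.0, 1.0], [4.0, 5.0]]
a = [1.0, 1.0]

a_pls = normalize(X, a, tau=1.0)  # tau = 1: unit-norm weights
a_cca = normalize(X, a, tau=0.0)  # tau = 0: unit-variance component
a_mid = normalize(X, a, tau=0.5)  # in between: a compromise of both
```

With `tau=1.0` only the norm of the weights is constrained, with `tau=0.0` only the variance of the component, which matches the two ends of the scale described above.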

  11. Regularized generalized CCA

      $$\max_{a_1, \ldots, a_J} \; \sum_{j, k = 1,\ j \neq k}^{J} c_{jk}\, g(\mathrm{Cov}(X_j a_j, X_k a_k))$$

      subject to the constraints

      $$\tau_j \lVert a_j \rVert^2 + (1 - \tau_j)\,\mathrm{Var}(X_j a_j) = 1, \quad j = 1, \ldots, J$$

      where:
      • $c_{jk} = 1$ if $X_j$ and $X_k$ are connected, 0 otherwise
      • g = identity (Horst scheme), square (Factorial scheme) or absolute value (Centroid scheme)
      • $\tau_j$ = shrinkage constant between 0 and 1

      A monotone convergent algorithm related to this optimization problem will be described.
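
The criterion can be evaluated directly once the block components are known. The sketch below assumes the biased (1/n) covariance and sums over ordered pairs $j \neq k$, so with a symmetric design matrix each link is counted twice (slide 7 writes each link once). The components `y1`, `y2`, `y3` are hypothetical toy vectors, and `rgcca_criterion` is an illustrative name.

```python
def covariance(u, v):
    """Biased (1/n) covariance between two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((x - mu) * (y - mv) for x, y in zip(u, v)) / n


SCHEMES = {
    "horst": lambda x: x,          # g = identity
    "factorial": lambda x: x * x,  # g = square
    "centroid": abs,               # g = absolute value
}


def rgcca_criterion(components, C, scheme="centroid"):
    """Sum of c_jk * g(Cov(y_j, y_k)) over all ordered pairs j != k."""
    g = SCHEMES[scheme]
    J = len(components)
    return sum(
        C[j][k] * g(covariance(components[j], components[k]))
        for j in range(J) for k in range(J) if j != k
    )


# Design matrix of the Russett example: c13 = c23 = 1, c12 = 0 (symmetric).
C = [[0, 0, 1],
     [0, 0, 1],
     [1, 1, 0]]

# Hypothetical block components (toy values, not fitted to any data).
y1 = [1.0, 2.0, 3.0, 4.0]
y2 = [4.0, 3.0, 2.0, 1.0]
y3 = [1.0, 3.0, 2.0, 4.0]

horst = rgcca_criterion([y1, y2, y3], C, "horst")
centroid = rgcca_criterion([y1, y2, y3], C, "centroid")
factorial = rgcca_criterion([y1, y2, y3], C, "factorial")
```

In this toy case the two covariances have opposite signs, so they cancel under the Horst scheme but not under the centroid or factorial schemes.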

  12. Construction of a monotone convergent algorithm for RGCCA
      • Construct the Lagrangian function related to the optimization problem.
      • Set the derivatives of the Lagrangian function with respect to each outer weight vector $a_j$ to zero.
      • Use a procedure similar to Wold's PLS approach to solve the stationary equations (Gauss-Seidel algorithm or MAXDIFF algorithm).
      • This procedure is monotonically convergent: the criterion increases at each step of the algorithm.
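
The first two steps can be sketched for the Horst scheme (g = identity). This is a worked outline under stated assumptions, not the authors' full derivation: the blocks are centered, $C$ is symmetric, covariances use the factor $1/n$, and the notation $M_j$ and $\lambda_j$ (Lagrange multipliers, assumed positive) is introduced here for convenience.

```latex
% Constraint matrix (notation introduced for this sketch)
M_j = \tau_j I + (1 - \tau_j)\,\tfrac{1}{n} X_j^t X_j ,
\qquad
a_j^t M_j a_j = \tau_j \lVert a_j \rVert^2 + (1 - \tau_j)\,\mathrm{Var}(X_j a_j)

% Lagrangian for g = identity
L(a, \lambda) = \sum_{j \neq k} c_{jk}\, \tfrac{1}{n}\, a_j^t X_j^t X_k a_k
              - \sum_{j} \lambda_j \left( a_j^t M_j a_j - 1 \right)

% Stationary equations, with z_j = \sum_{k \neq j} c_{jk} X_k a_k
\frac{\partial L}{\partial a_j} = \tfrac{2}{n} X_j^t z_j - 2 \lambda_j M_j a_j = 0
\quad \Longrightarrow \quad
a_j = \tfrac{1}{n \lambda_j}\, M_j^{-1} X_j^t z_j

% The constraint a_j^t M_j a_j = 1 fixes the scale
a_j = \frac{M_j^{-1} X_j^t z_j}{\sqrt{z_j^t X_j M_j^{-1} X_j^t z_j}}
```

Each stationary equation expresses $a_j$ in terms of the other weight vectors through $z_j$, which is what makes a block-by-block (Gauss-Seidel style) iteration natural.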

  13. The PLS algorithm for RGCCA
      • Initial step: choose $a_j$ such that $\tau_j \lVert a_j \rVert^2 + (1 - \tau_j)\,\mathrm{Var}(X_j a_j) = 1$.
      • Outer estimation (explains the block): $y_j = X_j a_j$.
      • Inner estimation (takes into account the relations between blocks): $z_j = \sum_{k \neq j} e_{jk}\, y_k$.
      • Update of the outer weights:

      $$a_j = \frac{\left[\tau_j I + (1 - \tau_j)\,\tfrac{1}{n} X_j^t X_j\right]^{-1} X_j^t z_j}{\sqrt{z_j^t X_j \left[\tau_j I + (1 - \tau_j)\,\tfrac{1}{n} X_j^t X_j\right]^{-1} X_j^t z_j}}$$

      • Choice of inner weights $e_{jk}$, where $c_{jk} = 1$ if blocks are linked, 0 otherwise:
        - Horst: $e_{jk} = c_{jk}$
        - Centroid: $e_{jk} = c_{jk}\,\mathrm{sign}(\mathrm{Cor}(y_k, y_j))$
        - Factorial: $e_{jk} = c_{jk}\,\mathrm{Cov}(y_k, y_j)$
      • Iterate until convergence of the criterion.
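
The loop above can be sketched in plain Python as follows. This is an illustrative reading of the slide, not the RGCCA package code: it assumes centered blocks and the biased (1/n) covariance, runs a fixed number of sweeps instead of testing convergence of the criterion, and uses hypothetical toy blocks. The helper names (`update`, `inner_weight`, `rgcca`) are made up for this example.

```python
import math


def center(X):
    """Subtract column means from a block stored as a list of rows."""
    means = [sum(col) / len(X) for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def matvec(X, a):
    """Outer estimation: y = X a."""
    return [sum(x * w for x, w in zip(row, a)) for row in X]


def cov(u, v):
    """Biased (1/n) covariance of two already centered sequences."""
    return sum(x * y for x, y in zip(u, v)) / len(u)


def solve(M, b):
    """Solve M w = b by Gaussian elimination with partial pivoting."""
    p = len(b)
    A = [list(row) + [bi] for row, bi in zip(M, b)]
    for i in range(p):
        pivot = max(range(i, p), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        for r in range(i + 1, p):
            f = A[r][i] / A[i][i]
            A[r] = [x - f * y for x, y in zip(A[r], A[i])]
    w = [0.0] * p
    for i in reversed(range(p)):
        w[i] = (A[i][p] - sum(A[i][c] * w[c] for c in range(i + 1, p))) / A[i][i]
    return w


def update(X, z, tau):
    """a = M^-1 X'z / sqrt(z'X M^-1 X'z), M = tau*I + (1 - tau) * X'X / n."""
    cols = list(zip(*X))
    M = [[tau * (r == c) + (1.0 - tau) * cov(cols[r], cols[c])
          for c in range(len(cols))] for r in range(len(cols))]
    v = [sum(x * zi for x, zi in zip(col, z)) for col in cols]  # X'z
    w = solve(M, v)
    scale = math.sqrt(sum(vi * wi for vi, wi in zip(v, w)))
    return [wi / scale for wi in w]


def inner_weight(scheme, c_jk, y_j, y_k):
    """Inner weight e_jk for the Horst, centroid or factorial scheme."""
    c = cov(y_k, y_j)
    if scheme == "horst":
        return c_jk
    if scheme == "centroid":
        return c_jk * ((c > 0) - (c < 0))  # sign of Cor equals sign of Cov
    return c_jk * c  # factorial


def rgcca(blocks, C, tau, scheme="centroid", n_iter=50):
    """Gauss-Seidel style sweeps; returns outer weights and components."""
    blocks = [center(X) for X in blocks]
    J = len(blocks)
    # Initial step: push an arbitrary component through the update so that
    # every a_j satisfies its normalization constraint from the start.
    a = [update(X, matvec(X, [1.0] * len(X[0])), t) for X, t in zip(blocks, tau)]
    y = [matvec(X, aj) for X, aj in zip(blocks, a)]
    for _ in range(n_iter):
        for j in range(J):
            e = [inner_weight(scheme, C[j][k], y[j], y[k]) for k in range(J)]
            z = [sum(e[k] * y[k][i] for k in range(J) if k != j)
                 for i in range(len(y[j]))]
            a[j] = update(blocks[j], z, tau[j])
            y[j] = matvec(blocks[j], a[j])
    return a, y


# Toy blocks (hypothetical numbers), linked as in the Russett example.
X1 = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 6], [6, 4]]
X2 = [[2, 0], [1, 1], [4, 2], [3, 5], [6, 3], [5, 4]]
X3 = [[1, 1], [3, 0], [2, 4], [5, 2], [4, 6], [6, 5]]
C = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
tau = [1.0, 0.5, 0.0]
a, y = rgcca([X1, X2, X3], C, tau)
```

The update solves a linear system rather than forming the inverse explicitly, and the denominator reuses the solved vector, since $z_j^t X_j M^{-1} X_j^t z_j$ is the inner product of $X_j^t z_j$ with $M^{-1} X_j^t z_j$.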

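  14. Special cases of Regularized generalized CCA: RGCCA and multi-block data analysis

      | Family | Method | Criterion | Constraint |
      | --- | --- | --- | --- |
      | Generalized canonical correlation analysis | SUMCOR (Horst, 1961) | Max $\sum_{j,k,\ j \neq k} \mathrm{Cor}(X_j a_j, X_k a_k)$ | $\mathrm{Var}(X_j a_j) = 1$ |
      | Generalized canonical correlation analysis | SSQCOR (Kettenring, 1971) | Max $\sum_{j,k,\ j \neq k} \mathrm{Cor}^2(X_j a_j, X_k a_k)$ | $\mathrm{Var}(X_j a_j) = 1$ |
      | Generalized canonical correlation analysis | SABSCOR (Mathes, 1993; Hanafi, 2004) | Max $\sum_{j,k,\ j \neq k} \lvert \mathrm{Cor}(X_j a_j, X_k a_k) \rvert$ | $\mathrm{Var}(X_j a_j) = 1$ |
      | Generalized PLS regression | MAXDIFF (Van de Geer, 1984) [SUMCOV] | Max $\sum_{j,k,\ j \neq k} \mathrm{Cov}(X_j a_j, X_k a_k)$ | $\lVert a_j \rVert = 1$ |
      | Generalized PLS regression | MAXDIFF B (Hanafi & Kiers, 2006) [SSQCOV] | Max $\sum_{j,k,\ j \neq k} \mathrm{Cov}^2(X_j a_j, X_k a_k)$ | $\lVert a_j \rVert = 1$ |
      | Generalized PLS regression | SABSCOV (Krämer, 2007) | Max $\sum_{j,k,\ j \neq k} \lvert \mathrm{Cov}(X_j a_j, X_k a_k) \rvert$ | $\lVert a_j \rVert = 1$ |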
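
How these six methods fall out of the general RGCCA problem can be spelled out in a few lines. The sketch assumes all blocks are connected ($c_{jk} = 1$ for $j \neq k$) and the same $\tau_j$ for every block; the pairing of schemes with method names follows the criteria listed above.

```latex
% tau_j = 0: the constraint reduces to unit variance, so Cov equals Cor
\tau_j = 0 \;\Rightarrow\; \mathrm{Var}(X_j a_j) = 1
\;\Rightarrow\;
\mathrm{Cov}(X_j a_j, X_k a_k)
  = \mathrm{Cor}(X_j a_j, X_k a_k)\,
    \sqrt{\mathrm{Var}(X_j a_j)\,\mathrm{Var}(X_k a_k)}
  = \mathrm{Cor}(X_j a_j, X_k a_k)

% tau_j = 1: the constraint reduces to unit norm, and the criterion stays a covariance
\tau_j = 1 \;\Rightarrow\; \lVert a_j \rVert = 1

% Resulting correspondence
\begin{array}{lll}
g & \tau_j = 0 & \tau_j = 1 \\
\text{identity (Horst)} & \text{SUMCOR} & \text{SUMCOV (MAXDIFF)} \\
\text{square (Factorial)} & \text{SSQCOR} & \text{SSQCOV (MAXDIFF B)} \\
\text{absolute value (Centroid)} & \text{SABSCOR} & \text{SABSCOV}
\end{array}
```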

  15. Special cases of Regularized generalized CCA: hierarchical models
      (a) One second order block
      (b) Several second order blocks
      Very often: $X_1, \ldots, X_{J_1}$ = predictors and $X_{J_1 + 1}, \ldots, X_J$ = responses.

  16. Special cases of Regularized generalized CCA. Hierarchical model: one 2nd order block. In every row the maximization is over $a_1, \ldots, a_{J+1}$, and g = identity, square or absolute value.

      | Method | Criterion | Constraints | Comment |
      | --- | --- | --- | --- |
      | Hierarchical PLS regression | Maximize $\sum_{j=1}^{J} g(\mathrm{Cov}(X_j a_j, X_{J+1} a_{J+1}))$ | $\lVert a_j \rVert = 1$, $j = 1, \ldots, J+1$ | |
      | Hierarchical Canonical Correlation Analysis | Maximize $\sum_{j=1}^{J} g(\mathrm{Cor}(X_j a_j, X_{J+1} a_{J+1}))$ | $\mathrm{Var}(X_j a_j) = 1$, $j = 1, \ldots, J+1$ | |
      | Hierarchical redundancy analysis of the $X_j$'s with respect to $X_{J+1}$ | Maximize $\sum_{j=1}^{J} g(\mathrm{Cor}(X_j a_j, X_{J+1} a_{J+1})\,\mathrm{Var}(X_j a_j)^{1/2})$ | $\lVert a_j \rVert = 1$, $j = 1, \ldots, J$; $\mathrm{Var}(X_{J+1} a_{J+1}) = 1$ | Stable predictors and good prediction |
      | Hierarchical redundancy analysis of $X_{J+1}$ with respect to the $X_j$'s | Maximize $\sum_{j=1}^{J} g(\mathrm{Cor}(X_j a_j, X_{J+1} a_{J+1})\,\mathrm{Var}(X_{J+1} a_{J+1})^{1/2})$ | $\mathrm{Var}(X_j a_j) = 1$, $j = 1, \ldots, J$; $\lVert a_{J+1} \rVert = 1$ | Good predictors and stable response |

  17. Special cases of Regularized generalized CCA. Hierarchical model: one 2nd order block. Factorial scheme: g = square function.
      Concordance analysis (Hanafi & Lafosse, 2001):

      $$\text{Maximize} \; \sum_{j=1}^{J} \mathrm{Cov}^2(X_j M_j b_j, X_{J+1} M_{J+1} b_{J+1}) \quad \text{subject to} \quad b_j^t M_j b_j = 1, \quad j = 1, \ldots, J+1$$

      The previous methods are recovered when the metrics $M_j$ are equal to the identity or to the Mahalanobis metric.
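
The link with the previous hierarchical methods can be checked with a change of variable. The sketch below assumes each $M_j$ is symmetric and invertible, the blocks are centered, and the Mahalanobis metric is taken as the inverse of the empirical covariance matrix with factor $1/n$; the substitution $a_j = M_j b_j$ is notation introduced here.

```latex
% Change of variable
a_j = M_j b_j
\;\Rightarrow\;
\mathrm{Cov}^2(X_j M_j b_j, X_{J+1} M_{J+1} b_{J+1})
  = \mathrm{Cov}^2(X_j a_j, X_{J+1} a_{J+1}),
\qquad
b_j^t M_j b_j = a_j^t M_j^{-1} a_j

% Identity metric: unit-norm weights (hierarchical PLS regression, g = square)
M_j = I
\;\Rightarrow\;
a_j^t a_j = \lVert a_j \rVert^2 = 1

% Mahalanobis metric: unit-variance components, so Cov = Cor
% (hierarchical CCA, g = square)
M_j = \left( \tfrac{1}{n} X_j^t X_j \right)^{-1}
\;\Rightarrow\;
a_j^t \left( \tfrac{1}{n} X_j^t X_j \right) a_j = \mathrm{Var}(X_j a_j) = 1
```

Mixing the two metrics across the first-order blocks and the second-order block gives constraints of the same form as the two redundancy analyses.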

  18. Special cases of Regularized generalized CCA. Hierarchical model: the 2nd order block is a super-block.
      [Path diagram: each block $X_j$ with component $y_j$, $j = 1, \ldots, J$, is linked to the super-block $X_{J+1} = [X_1 | X_2 | \ldots | X_J]$ with component $y_{J+1}$.]
      In every row the maximization is over $a_1, \ldots, a_{J+1}$.

      | Method | Criterion | Constraints |
      | --- | --- | --- |
      | SUMCOR (Horst, 1961) | Maximize $\sum_{j=1}^{J} \mathrm{Cor}(X_j a_j, X_{J+1} a_{J+1})$ or Maximize $\sum_{j=1}^{J} \lvert \mathrm{Cor}(X_j a_j, X_{J+1} a_{J+1}) \rvert$ | $\mathrm{Var}(X_j a_j) = 1$, $j = 1, \ldots, J+1$ |
      | Generalized CCA (Carroll, 1968a,b) | Maximize $\sum_{j=1}^{J_1} \mathrm{Cor}^2(X_j a_j, X_{J+1} a_{J+1}) + \sum_{j=J_1+1}^{J} \mathrm{Cov}^2(X_j a_j, X_{J+1} a_{J+1})$ | $\mathrm{Var}(X_j a_j) = 1$, $j = 1, \ldots, J_1, J+1$; $\lVert a_j \rVert = 1$, $j = J_1 + 1, \ldots, J$ |
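
Building the super-block and the matching design matrix is mechanical, as the sketch below shows. The blocks `X1` and `X2` are hypothetical toy data on the same three individuals, and `super_block` and `hierarchical_design` are illustrative helper names.

```python
def super_block(blocks):
    """Concatenate blocks column-wise: X_{J+1} = [X_1 | X_2 | ... | X_J]."""
    return [sum((list(row) for row in rows), []) for rows in zip(*blocks)]


def hierarchical_design(J):
    """Design matrix linking each first-order block only to the super-block."""
    C = [[0] * (J + 1) for _ in range(J + 1)]
    for j in range(J):
        C[j][J] = C[J][j] = 1
    return C


# Two hypothetical blocks observed on the same three individuals.
X1 = [[1, 2], [3, 4], [5, 6]]
X2 = [[7], [8], [9]]

X3 = super_block([X1, X2])   # [[1, 2, 7], [3, 4, 8], [5, 6, 9]]
C = hierarchical_design(2)   # [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
```

The first-order blocks are not linked to each other; they only communicate through the super-block, which is what makes the model hierarchical.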
