
SLIDE 1

Correspondence Analysis and Related Methods - CARME 2011

The Power STATIS-ACT method

Jacques Bénasséni, Mohammed Bennani Dosse
Université Rennes 2, UMR 6620
February 2011, Rennes, France

SLIDE 2

Data

◮ $K$ data matrices $X_1, \dots, X_K$.
◮ Each $X_k$ is an $n \times p_k$ matrix: measurements on the same $n$ units of $p_k$ different variables.
◮ $D = \operatorname{diag}(\pi_1, \dots, \pi_n)$: diagonal matrix of weights.
◮ $Q_k$: positive definite matrices (metrics).
◮ We assume that $D = I_n$, $Q_k = I_{p_k}$ and that each $X_k$ is centered.

SLIDE 4

STATIS-ACT method

◮ The STATIS-ACT method is a generalization of principal component analysis used to study several data tables measured on the same observation units (or variables).
◮ The goal of this method is to analyze the relationship between these data tables (interstructure step) and to combine them into a common structure, called a compromise.
◮ The principal components derived from the compromise solution are analyzed together with the original variables (intrastructure step).

SLIDE 5

STATIS-ACT method

◮ In STATIS, the individual association matrices $W_k = X_k X_k'$, $k = 1, \dots, K$, play a central role.
◮ $W_k$ contains all the information about the multidimensional structure in the data matrix $X_k$.
◮ The use of the association matrices $W_k$ instead of $X_k$ simplifies the computations, as it obviates the determination of rotations (GPA, Gower 1975).
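The point about rotations can be checked numerically. The following is a minimal sketch, assuming NumPy, toy random tables, and the slide's conventions $D = I_n$, $Q_k = I_{p_k}$; the function name `association_matrices` is hypothetical. It builds $W_k = X_k X_k'$ and shows that rotating the variables of a table leaves its association matrix unchanged.

```python
import numpy as np

def association_matrices(tables):
    """Return W_k = X_k X_k' for each table, after centering the columns."""
    mats = []
    for X in tables:
        Xc = X - X.mean(axis=0)   # center each variable (column)
        mats.append(Xc @ Xc.T)    # n x n, whatever the number of variables p_k
    return mats

rng = np.random.default_rng(0)    # toy data, not from the slides
tables = [rng.normal(size=(6, p)) for p in (3, 4, 2)]
W_list = association_matrices(tables)

# Rotate the variables of the first table with a random orthogonal matrix Q.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
W_rot = association_matrices([tables[0] @ Q])[0]
print(np.allclose(W_rot, W_list[0]))
```

Because $X Q Q' X' = X X'$ for any orthogonal $Q$, no rotation has to be estimated before the tables are compared.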

SLIDE 6

STATIS-ACT method

◮ Basic idea of STATIS: derive an optimal set of weights $\alpha_k$ for computing a compromise solution

$$W = \sum_{k=1}^{K} \alpha_k W_k$$

◮ where the $\alpha_k$, $k = 1, \dots, K$, maximize the criterion

$$\sum_{k=1}^{K} \left[\operatorname{trace}(W W_k)\right]^2$$

◮ subject to the constraints $\alpha_k \ge 0$ and $\sum_{k=1}^{K} \alpha_k^2 = 1$.

SLIDE 7

STATIS-ACT method

◮ Define the matrix $C = (C_{k\ell})$ where

$$C_{k\ell} = \operatorname{trace}(W_k W_\ell) = \sum_{i=1}^{p_k} \sum_{j=1}^{p_\ell} \left[\operatorname{cov}(x_{ki}, x_{\ell j})\right]^2$$

◮ Solution of STATIS: the vector $a = (\alpha_1, \dots, \alpha_K)'$ is the eigenvector of $C$ corresponding to the largest eigenvalue of this matrix.
◮ Since $C \ge 0$ (all entries nonnegative), the vector $a$ can be chosen with all its elements nonnegative (Perron-Frobenius theorem).
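The classical solution can be sketched in a few lines. This is an illustrative sketch only, assuming NumPy and toy random data; the helper names `trace_matrix` and `statis_weights` are hypothetical. With $D = I_n$ the "covariances" are plain cross-products of centered columns, without a $1/n$ factor.

```python
import numpy as np

def association_matrices(tables):
    out = []
    for X in tables:
        Xc = X - X.mean(axis=0)          # center the columns
        out.append(Xc @ Xc.T)
    return out

def trace_matrix(W_list):
    """C[k, l] = trace(W_k W_l)."""
    K = len(W_list)
    C = np.empty((K, K))
    for k in range(K):
        for l in range(K):
            # For symmetric matrices the trace of the product is the
            # sum of the elementwise product.
            C[k, l] = np.sum(W_list[k] * W_list[l])
    return C

def statis_weights(C):
    """Unit-norm dominant eigenvector of C, with the sign fixed to be nonnegative."""
    eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    a = eigvecs[:, -1]
    return a if a.sum() >= 0 else -a

rng = np.random.default_rng(1)            # toy data, not from the slides
tables = [rng.normal(size=(8, p)) for p in (3, 5, 4)]
W_list = association_matrices(tables)
C = trace_matrix(W_list)
a = statis_weights(C)
W = sum(alpha * Wk for alpha, Wk in zip(a, W_list))   # compromise
print(a)
```

`numpy.linalg.eigh` returns an eigenvector only up to sign, so the sign flip is what realizes the Perron-Frobenius choice of a nonnegative vector.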

SLIDE 8

An s-power criterion

◮ Since $\operatorname{trace}(W W_k) \ge 0$, there is no specific reason for considering in STATIS-ACT the criterion

$$\sum_{k=1}^{K} \left[\operatorname{trace}(W W_k)\right]^2$$

rather than any other criterion based on a power $s$ ($s \ge 1$):

$$\sum_{k=1}^{K} \left[\operatorname{trace}(W W_k)\right]^s$$

◮ We investigate the effect of varying the power $s$ on the optimal weights in the compromise solution.

SLIDE 9

The special case s = 1

◮ The simplest choice for $s$.
◮ Simple solution and straightforward interpretation.
◮ It is possible to find a solution to "the problem of the low rank compromise" when considering a criterion based on $s = 1$.
◮ Close analogy between the compromise obtained from the $s = 1$ criterion and the first principal component derived from a PCA.

SLIDE 10

The special case s = 1

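◮ Maximize the criterion

$$\sum_{k=1}^{K} \operatorname{trace}(W W_k)$$

subject to the constraints $\alpha_k \ge 0$ and $\sum_{k=1}^{K} \alpha_k^2 = 1$.
◮ The solution is given by

$$a = \frac{Ce}{\lVert Ce \rVert} \quad \text{where } e = (1, \dots, 1)'.$$

◮ Clearly $a \ge 0$.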
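The closed form follows from the Cauchy-Schwarz inequality: the criterion equals $e'Ca = (Ce)'a$, which is largest on the unit sphere when $a$ is proportional to $Ce$. A minimal sketch, assuming NumPy and a small hypothetical matrix $C$ chosen only for illustration:

```python
import numpy as np

def weights_s1(C):
    """Closed-form weights for s = 1: normalized row sums of C."""
    v = C @ np.ones(C.shape[0])   # Ce
    return v / np.linalg.norm(v)

# Hypothetical symmetric nonnegative C (not from the slides).
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])
a = weights_s1(C)
best = np.ones(3) @ C @ a         # criterion value e'Ca at the optimum

# Compare with random nonnegative unit-norm weight vectors.
rng = np.random.default_rng(0)
trials = np.abs(rng.normal(size=(1000, 3)))
trials /= np.linalg.norm(trials, axis=1, keepdims=True)
print(best, (trials @ (C @ np.ones(3))).max())
```

No eigendecomposition is needed here, only one matrix-vector product and a normalization.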

SLIDE 11

The special case s = 1

◮ What happens if the constraint on the $\alpha_k$ is changed to $\sum_{k=1}^{K} \alpha_k = 1$?
◮ Standard linear program where the constraint set is a polyhedron (simplex).
◮ The optimal compromise solution is one of the initial matrices $W_k$!
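The reason is that, for $s = 1$, the objective $(Ce)'a$ is linear in $a$, so its maximum over the simplex is reached at a vertex. A small sketch, assuming NumPy and the same hypothetical $C$ as an illustration (not data from the slides):

```python
import numpy as np

# Hypothetical symmetric nonnegative C (illustration only).
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])
row_sums = C @ np.ones(3)             # coefficients of the linear objective

k_star = int(np.argmax(row_sums))     # vertex of the simplex that wins
a = np.zeros(3)
a[k_star] = 1.0                       # compromise is W_{k_star} itself
best = row_sums @ a

# Random points of the simplex never do better than the best vertex.
rng = np.random.default_rng(0)
samples = rng.dirichlet(np.ones(3), size=1000)
print(k_star, best, (samples @ row_sums).max())
```

With this constraint the "compromise" degenerates to the single table with the largest row sum of $C$, which is why the unit-norm constraint is the useful one.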

SLIDE 12

The general case s > 1

◮ Motivation: one of the most interesting features of the classical STATIS-ACT method, based on the power $s = 2$, is that the corresponding weights $\alpha_k$ reflect the principal agreement between the given tables.
◮ A data table which is not in agreement with the others has a low weight.
◮ What happens if $s > 1$?

SLIDE 13

The general case s > 1

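◮ Maximizing

$$\sum_{k=1}^{K} \left[\operatorname{trace}(W W_k)\right]^s$$

◮ subject to $\sum_{k=1}^{K} \alpha_k^2 = 1$ and $\alpha_k \ge 0$
◮ is equivalent to maximizing

$$f(a) = \sum_{k=1}^{K} \left[(Ca)_k\right]^s$$

subject to $a'a = 1$.
◮ $f$ is a convex and differentiable function on $\mathbb{R}^K_+$.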
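The equivalence rests on one short identity, written out here as a worked step using only the definitions of $W$ and $C$ given earlier. The gradient on the last line is not stated on the slides; it follows from the chain rule and the symmetry of $C$.

```latex
\begin{align*}
\operatorname{trace}(W W_k)
  &= \operatorname{trace}\Big(\sum_{\ell=1}^{K} \alpha_\ell W_\ell W_k\Big)
   = \sum_{\ell=1}^{K} \alpha_\ell \operatorname{trace}(W_\ell W_k)
   = \sum_{\ell=1}^{K} C_{k\ell}\,\alpha_\ell
   = (Ca)_k, \\
f(a) &= \sum_{k=1}^{K} \big[(Ca)_k\big]^{s}, \qquad
\nabla f(a) = s\,C z, \quad z_k = \big[(Ca)_k\big]^{s-1}.
\end{align*}
```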

SLIDE 14

The general case s > 1

◮ Iterative solution.
◮ Algorithm:
  ◮ Choose $a^{(0)}$ (randomly, such that $\lVert a^{(0)} \rVert = 1$). Set $\nu = 0$.
  ◮ Repeat until convergence:
    ◮ Calculate $z = (z_1, \dots, z_K)'$ where $z_k = \left[(C a^{(\nu)})_k\right]^{s-1}$.
    ◮ Set $a^{(\nu+1)} = \dfrac{Cz}{\lVert Cz \rVert}$.
    ◮ $\nu := \nu + 1$.
  ◮ End.
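The iteration can be sketched as follows. This is a minimal sketch, assuming NumPy; the function names, the stopping rule, the nonnegative random start and the small matrix $C$ are illustrative assumptions, not details given on the slides.

```python
import numpy as np

def criterion(C, a, s):
    """f(a) = sum_k [(Ca)_k]^s."""
    return np.sum((C @ a) ** s)

def power_statis_weights(C, s, n_iter=500, tol=1e-12, seed=0):
    """Iterate a <- Cz / ||Cz|| with z_k = [(Ca)_k]^(s-1)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from a nonnegative vector so that Ca stays >= 0.
    a = np.abs(rng.normal(size=C.shape[0]))
    a /= np.linalg.norm(a)
    for _ in range(n_iter):
        z = (C @ a) ** (s - 1)
        a_new = C @ z
        a_new /= np.linalg.norm(a_new)
        converged = np.linalg.norm(a_new - a) < tol   # assumed stopping rule
        a = a_new
        if converged:
            break
    return a

# Hypothetical symmetric nonnegative C (illustration only).
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])
for s in (1, 2, 5):
    a = power_statis_weights(C, s)
    print(s, a, criterion(C, a, s))
```

For $s = 1$ the vector $z$ is constant, so a single step gives $Ce / \lVert Ce \rVert$. For $s = 2$ each step multiplies by $C^2$ and renormalizes, which is the power method applied to $C^2$. For other values of $s$ the result may depend on the starting point, so several seeds can be tried.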

SLIDE 15

The general case s > 1

◮ We prove monotone convergence.
◮ When $s = 2$, this algorithm is simply the power method used in the numerical calculation of the dominant eigenvector of $C$.
◮ Convergence to a global maximum is not necessarily guaranteed (multistart, ...).
◮ Algebraic solution when $s$ tends to infinity.

SLIDE 16

Low rank compromises

◮ The configuration of observations given by the compromise solution is derived from principal components, which are the eigenvectors of $W$.
◮ In practice, interest mainly focuses on graphical representations based on the first $R$ principal components (with $R = 2$ in most situations).
◮ However, if $W$ corresponds to the maximum of the criterion of interest (with general power $s$), this optimality no longer holds for the rank-$R$ approximation of $W$.
◮ For the $s = 1$ case, we can derive an algebraic solution to the low rank compromise of the form $\sum_{\ell=1}^{R} u_\ell u_\ell'$ with $u_\ell' u_j = 0$ for $\ell \ne j$.
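To make the low rank issue concrete, the sketch below shows ordinary rank-$R$ truncation of a compromise by eigendecomposition. It assumes NumPy, toy random tables and placeholder equal weights; it is not the authors' algebraic $s = 1$ solution, which the slides do not spell out.

```python
import numpy as np

def rank_r_compromise(W, R):
    """Coordinates on the first R principal axes of W and the rank-R approximation."""
    eigvals, eigvecs = np.linalg.eigh(W)       # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:R]        # indices of the R largest
    lam, U = eigvals[top], eigvecs[:, top]
    F = U * np.sqrt(lam)                       # n x R coordinates of the units
    return F, F @ F.T

rng = np.random.default_rng(2)                 # toy data, not from the slides
tables = [rng.normal(size=(7, p)) for p in (3, 4, 5)]
W_list = []
for X in tables:
    Xc = X - X.mean(axis=0)
    W_list.append(Xc @ Xc.T)

a = np.ones(3) / np.sqrt(3)                    # placeholder weights (assumption)
W = sum(alpha * Wk for alpha, Wk in zip(a, W_list))
F, W_R = rank_r_compromise(W, R=2)

full = sum(np.trace(W @ Wk) for Wk in W_list)
low = sum(np.trace(W_R @ Wk) for Wk in W_list)
print(full, low)                               # the s = 1 criterion drops after truncation
```

The truncated matrix is generally no longer a weighted sum of the $W_k$, so the weights that were optimal for $W$ carry no optimality guarantee for `W_R`.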

SLIDE 17

Applications

◮ Real data sets (from sensory analysis, ecology)
◮ Comparison of the weights

Data set   EN(1)   EN(5)   EN(∞)   IN(1)   IN(5)   IN(∞)
1          0.002   0.006   0.074   0.002   0.005   0.049
2          0.026   0.053   0.084   0.023   0.049   0.087
3          0.008   0.026   0.133   0.005   0.017   0.081
4          0.015   0.054   0.238   0.012   0.046   0.167
5          0.087   0.170   0.292   0.052   0.079   0.139
6          0.024   0.045   0.182   0.015   0.025   0.111
7          0.212   0.302   0.474   0.185   0.235   0.284

Table: Comparison of $a^{(2)}$ and $a^{(s)}$ for $s = 1, 5, \infty$,

where $EN(s) = \lVert a^{(2)} - a^{(s)} \rVert_2$ and $IN(s) = \lVert a^{(2)} - a^{(s)} \rVert_\infty$.
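The two distances can be computed as below. This is a small sketch assuming NumPy; the function name `en_in` and the two weight vectors are hypothetical and are not taken from the data sets in the table.

```python
import numpy as np

def en_in(a2, a_s):
    """Euclidean (EN) and max-norm (IN) distances between two weight vectors."""
    d = np.asarray(a2, dtype=float) - np.asarray(a_s, dtype=float)
    return np.linalg.norm(d, 2), np.linalg.norm(d, np.inf)

# Hypothetical unit-norm weight vectors (illustration only).
a2 = np.array([0.6, 0.8])
a_s = np.array([0.8, 0.6])
EN, IN = en_in(a2, a_s)
print(round(EN, 4), round(IN, 4))
```

EN measures the overall displacement of the weight vector, while IN reports the largest change in any single weight.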

SLIDE 18

Applications

◮ Monte-Carlo simulations
◮ Comparison of the weights

a(s)        s = 1     s = 2     s = 5     s = ∞
α_1^(s)     0.489     0.386     0.239     0.223
           (0.031)   (0.092)   (0.124)   (0.108)
α_2^(s)     0.622     0.654     0.684     0.693
           (0.023)   (0.034)   (0.033)   (0.043)
α_3^(s)     0.610     0.643     0.676     0.674
           (0.024)   (0.036)   (0.036)   (0.046)

Table: Comparison of $a^{(s)}$ for $s = 1, 2, 5, \infty$.

SLIDE 19

Conclusions

◮ The weights attached to the compromise solution for $s = 1$ are in general fairly close to those obtained in the usual STATIS-ACT method.
◮ The compromise obtained with the $s = 1$ approach simply requires elementary operations, whereas the usual compromise needs the calculation of the dominant eigenvector of $C$.
◮ The power parameter $s$ is related to the robustness of the compromise solution.
◮ When there are some "outlying" matrices $W_k$, increasing the power parameter $s$ in the generalized criterion downweights the influence of these matrices on the compromise, thus enhancing the well known "majority effect" of the STATIS method.

SLIDE 20

References

◮ Bénasséni, J. & Bennani Dosse, M., 2010. Analyzing multiset data by the Power STATIS-ACT method. Advances in Data Analysis and Classification, to appear.
◮ Lavit, C., Escoufier, Y., Sabatier, R. & Traissac, P., 1994. The ACT (STATIS method). Computational Statistics & Data Analysis, 18, 97-117.
◮ Lavit, C., 1985. Application de la méthode STATIS. Statistique et Analyse des données, 10(1), 103-116.
◮ Gower, J.C., 1975. Generalised Procrustes Analysis. Psychometrika, 40, 33-51.
