SLIDE 1

New pictures for correlation structure

Jan Graffelman

Department of Statistics and Operations Research

Universitat Politècnica de Catalunya

jan.graffelman@upc.edu

6th Carme conference, Rennes, 10th of February, 2011

Graffelman 6th Carme Conference, Rennes, February 2011

SLIDE 6

Outline

1 Correlations and cosines
2 The interpretation function
3 Biplots and interpretation function (p = 2)
4 Biplots and interpretation function (p > 2)
5 Final comments

SLIDE 7

Correlations are cosines

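Sample geometry of X: in $\mathbb{R}^n$ the cosines of the angles between the p variable vectors equal their correlations. With $\bar{x}$ and $\bar{y}$ the sample means, and with x and y taken as centered vectors in the vector form:

$$ r(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}} = \frac{x'y}{\|x\|\,\|y\|} = \cos(\alpha). $$

  • In full-space PCA biplots, the cosine of the angle between two biplot vectors equals the sample correlation coefficient of the corresponding variables.
  • In CCA, the canonical correlation is the cosine of the angle between two subspaces.
  • ...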

It seems natural to represent correlations by cosines.
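The identity above can be checked numerically. The following is a minimal sketch in plain Python; the two data vectors are made-up illustration values, not data from the talk, and the function names are hypothetical.

```python
import math

def correlation(x, y):
    """Pearson correlation computed from its definition."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cosine_of_centered(x, y):
    """Cosine of the angle between the centered vectors in R^n."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    xc = [a - mx for a in x]
    yc = [b - my for b in y]
    dot = sum(a * b for a, b in zip(xc, yc))
    norm_x = math.sqrt(sum(a * a for a in xc))
    norm_y = math.sqrt(sum(b * b for b in yc))
    return dot / (norm_x * norm_y)

# Hypothetical illustration data (n = 5).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

r = correlation(x, y)
c = cosine_of_centered(x, y)
alpha = math.acos(c)  # angle between the two variable vectors, in radians
print(r, c, alpha)
```

Note that the equality only holds after centering; the cosine between the raw, uncentered vectors is in general a different number.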

SLIDE 8

Correlations are not necessarily cosines

[Figure: (a) PCA biplot (r = 0.5)]

SLIDE 9

Correlations are not necessarily cosines

[Figure: (b) Scatterplot (r = 0.5)]

SLIDE 10

The interpretation function f(α)

You can relate r to the angle in any way you like:

[Figure: candidate interpretation functions; x-axis: angle (in radians); y-axis: correlation (−1 to 1)]

SLIDE 14

The interpretation function f(α)

The number of possible ways to represent a correlation is infinite.

Which interpretation function makes most sense?

SLIDE 15

A biplot questionnaire

[Figure: six biplots, panels (a) to (f), each showing two biplot vectors x1 and x2; for each panel the respondent fills in an estimate of r(x1, x2)]

SLIDE 17

A snapshot of the results

[Figure: boxplots of estimated correlations for panels a to f; y-axis from −1.0 to 1.0; legend entries: cosine, linear]

SLIDE 18

Suggestion

The human eye relates the correlation to the angle in a linear way, despite knowing that r = cos(α).

Note: The median of the estimates is closer to the true sample correlation if a linear interpretation is used.

If p > 2, errors pile up: we visually approximate approximated correlations.

Then ... construct biplots that have r linear in the angle!

SLIDE 19

The interpretation function revisited: lincos

[Figure: the lincos interpretation function; x-axis: angle (in radians); y-axis: correlation (−1 to 1)]
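The slide shows lincos only as a plot, so its formula here is an assumption: a piecewise-linear function equal to 1 at angle 0, 0 at π/2 and −1 at π, mirrored on (π, 2π). This choice is consistent with the closed-form construction for p = 2 given in the talk, where the angle works out to (π/2)(1 − r). The function names below are hypothetical.

```python
import math

def lincos(alpha):
    """Assumed linear interpretation function: 1 at 0, -1 at pi, period 2*pi."""
    a = math.fmod(alpha, 2.0 * math.pi)
    if a < 0.0:
        a += 2.0 * math.pi
    if a > math.pi:
        a = 2.0 * math.pi - a  # mirror (pi, 2*pi) back onto (0, pi)
    return 1.0 - 2.0 * a / math.pi

def lincos_inverse(r):
    """Angle in [0, pi] that the assumed lincos maps to correlation r."""
    return (math.pi / 2.0) * (1.0 - r)

# Compare the two interpretation functions at a few angles.
for degrees in (0, 30, 60, 90, 120, 150, 180):
    alpha = math.radians(degrees)
    print(degrees, round(math.cos(alpha), 3), round(lincos(alpha), 3))
```

The two functions agree at 0, π/2 and π and differ in between; at 60 degrees, for example, the cosine reading is 0.5 while the linear reading is 1/3.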

SLIDE 20

How to make a plot with r linear in the angle?

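Let $b_x$ be a vector representing x and $b_y$ a vector representing y.

Take:

$$ b_x = (1, 0), \qquad b_y = \left( \sin\left(\tfrac{\pi}{2} r\right),\ \pm \cos\left(\tfrac{\pi}{2} r\right) \right). $$

Get your cases in: $B = [b_x, b_y]$, $X_t = X B^{-}$.

More general:

$$ b_y = \left( \cos\left(f^{-1}(r)\right),\ \pm \sin\left(f^{-1}(r)\right) \right), $$

where $f^{-1}(r)$ is the inverse interpretation function.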
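As a quick check of the p = 2 construction, the sketch below builds the two vectors for several correlations and measures the angle between them. It uses only the formulas on this slide; the function names and the list of test correlations are hypothetical. The angle should come out as (π/2)(1 − r), that is, linear in r.

```python
import math

def biplot_vectors(r, sign=1.0):
    """Vectors b_x and b_y from the slide; sign picks the + or - branch."""
    bx = (1.0, 0.0)
    by = (math.sin(math.pi / 2.0 * r), sign * math.cos(math.pi / 2.0 * r))
    return bx, by

def angle_between(u, v):
    """Angle in [0, pi] between two 2-D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    cosine = dot / (math.hypot(u[0], u[1]) * math.hypot(v[0], v[1]))
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp rounding noise

correlations = (-1.0, -0.5, 0.0, 0.5, 1.0)
angles = []
for r in correlations:
    bx, by = biplot_vectors(r)
    angles.append(angle_between(bx, by))
    print(r, angles[-1], math.pi / 2.0 * (1.0 - r))
```

Equal steps in r give equal steps in the angle, which is the property the construction is after.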

SLIDE 23

A compelling question

Can we transform the data such that the standard tools (SVD, spectral decomposition) do produce the desired plot?

$$ r^\star = \sin\left(\tfrac{\pi}{2} r\right) $$

does the job.

$$ R^\star = \sin\left(\tfrac{\pi}{2} R\right) = U D V', \qquad G = U D^{1/2} $$
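A minimal NumPy sketch of this transformation for p = 2 follows. It applies the sine elementwise and uses the spectral decomposition (`numpy.linalg.eigh`), which coincides with the SVD when the symmetric matrix is positive semi-definite. The value r = 0.5 and the function name are illustrative assumptions.

```python
import numpy as np

def lincos_biplot_coordinates(R):
    """Transform R elementwise, decompose, and return 2-D biplot coordinates G."""
    R_star = np.sin(np.pi / 2.0 * R)
    eigval, eigvec = np.linalg.eigh(R_star)  # eigh returns ascending order
    order = np.argsort(eigval)[::-1]         # reorder to descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    # G = U D^(1/2), keeping the first two dimensions; negative values clipped to 0
    G = eigvec[:, :2] * np.sqrt(np.clip(eigval[:2], 0.0, None))
    return R_star, eigval, G

r = 0.5
R = np.array([[1.0, r], [r, 1.0]])
R_star, eigval, G = lincos_biplot_coordinates(R)

# Angle between the two biplot vectors (rows of G).
cosine = G[0] @ G[1] / (np.linalg.norm(G[0]) * np.linalg.norm(G[1]))
alpha = np.arccos(np.clip(cosine, -1.0, 1.0))
print(alpha, np.pi / 2.0 * (1.0 - r))
```

For p = 2 the rows of G reproduce R⋆ exactly, so the angle between them equals (π/2)(1 − r), the same angle as in the closed-form construction.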

SLIDE 24

Effect of the transformation: biplots of R and R⋆

[Figure: biplots of R and R⋆; biplot vectors labeled with correlations from 0.1 to 1]

SLIDE 25

Effect of the transformation: biplots of R and R⋆

[Figure: biplots of R and R⋆; biplot vectors labeled with correlations from 0.1 to 1]

SLIDE 26

Two problems

Multiple biplot vectors for one and the same variable

  • R⋆ is not always positive semi-definite. In practice there are often negative eigenvalues.
  • If p = 2 then R⋆ is positive semi-definite.
  • If p > 2 the first two eigenvalues are positive.

Optimality

  • PCA was not designed to optimally display R by cosines.
  • “PCA” of R⋆ will not be optimal for displaying R by lincosines.
  • The approximation of R by angles (cos or lincos) needs to be optimized explicitly.
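The loss of positive semi-definiteness is easy to reproduce. The sketch below uses a constructed example that is not from the talk: a 3 × 3 equicorrelation matrix with all off-diagonal correlations equal to −0.4. The helper name is hypothetical.

```python
import numpy as np

def equicorrelation(p, rho):
    """p x p correlation matrix with every off-diagonal entry equal to rho."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))

R = equicorrelation(3, -0.4)          # a valid correlation matrix
R_star = np.sin(np.pi / 2.0 * R)      # elementwise transformation

eig_R = np.linalg.eigvalsh(R)              # ascending order
eig_R_star = np.linalg.eigvalsh(R_star)    # ascending order

print("eigenvalues of R :", eig_R)
print("eigenvalues of R*:", eig_R_star)
```

Here R itself has only positive eigenvalues, while R⋆ has one negative eigenvalue; its two largest eigenvalues remain positive, so a two-dimensional plot can still be drawn.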

SLIDE 27

Trosset’s correlogram

Explicitly fits angles to correlations.

$\theta = (0, \theta_1, \ldots, \theta_p)$

Minimize $f(\theta) = \| R - C(\theta) \|^2$, with $C(\theta) = (\cos(\theta_j - \theta_k))$.

Numerical minimization with R’s routine nlminb.

A minor modification of the objective function:

Minimize $f(\theta) = \| R - C(\theta) \|^2$, with $C(\theta) = (\operatorname{lincos}(\theta_j - \theta_k))$.
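The fitting problem can be sketched in a few lines of NumPy. This is not the talk's implementation: a plain gradient-descent loop with finite-difference gradients stands in for nlminb, the lincos formula is an assumed piecewise-linear form, the starting angles are a heuristic, and the 3 × 3 matrix is made-up illustration data. All function names are hypothetical.

```python
import numpy as np

def cos_model(theta):
    """C(theta): cosines of all pairwise angle differences."""
    return np.cos(theta[:, None] - theta[None, :])

def lincos_model(theta):
    """C(theta) under an assumed linear interpretation function."""
    a = np.mod(theta[:, None] - theta[None, :], 2.0 * np.pi)
    a = np.minimum(a, 2.0 * np.pi - a)  # fold onto [0, pi]
    return 1.0 - 2.0 * a / np.pi

def loss(free, R, model):
    """Squared Frobenius norm of R - C(theta); the first angle is fixed at 0."""
    theta = np.concatenate(([0.0], free))
    return float(np.sum((R - model(theta)) ** 2))

def fit_correlogram(R, model, steps=2000, lr=0.02, eps=1e-6):
    """Gradient descent with central finite differences; returns best angles and loss."""
    free = np.arccos(np.clip(R[0, 1:], -1.0, 1.0))  # heuristic start from first row
    best, best_loss = free.copy(), loss(free, R, model)
    for _ in range(steps):
        grad = np.zeros_like(free)
        for j in range(free.size):
            h = np.zeros_like(free)
            h[j] = eps
            grad[j] = (loss(free + h, R, model) - loss(free - h, R, model)) / (2 * eps)
        free = free - lr * grad
        current = loss(free, R, model)
        if current < best_loss:
            best, best_loss = free.copy(), current
    return np.concatenate(([0.0], best)), best_loss

# Hypothetical correlation matrix for illustration.
R_example = np.array([[1.0, 0.8, 0.3],
                      [0.8, 1.0, 0.5],
                      [0.3, 0.5, 1.0]])

start = np.arccos(R_example[0, 1:])
theta_cos, loss_cos = fit_correlogram(R_example, cos_model)
theta_lin, loss_lin = fit_correlogram(R_example, lincos_model)
print(theta_cos, loss_cos)
print(theta_lin, loss_lin)
```

Switching between the cosine and linear correlogram only changes the model function passed in, which mirrors the "minor modification of the objective function" on the slide. A production version would use a proper optimizer and several starting points, since the objective can have local minima.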

SLIDE 28

Example: Manly’s archeological goblets (n = 25, p = 6)

[Figure: four panels, each displaying variables 1 to 6]

  • PCA biplot (scp: 0.049; cos: 0.230)
  • PFA biplot (scp: 0.026; cos: 0.304)
  • Cos correlogram (cos: 0.197)
  • Lin correlogram (lincos: 0.068)

SLIDE 29

Final comments

Scalar products do better than angles in approximating R (more flexible).

If you insist on using angles, then using a linear interpretation function may be preferable.

There is a closed-form solution for p = 2, and a numerical solution for p > 2.

Multivariate graphics should not be tailored to having “nice” (but arbitrary) mathematical properties, but should be tailored to the human eye.

SLIDE 30

References

Gabriel, K.R. (1971) The biplot graphic display of matrices with application to principal component analysis. Biometrika 58(3), pp. 453–467.

Rodgers, J.L. (1988) Thirteen ways to look at the correlation coefficient. The American Statistician 42(1), pp. 59–66.

Manly, B.F.J. (1989) Multivariate statistical methods: a primer. Chapman and Hall, London.

Friendly, M. (2002) Corrgrams: exploratory displays for correlation matrices. The American Statistician 56(4), pp. 316–324.

Trosset, M.W. (2005) Visualizing correlation. Journal of Computational and Graphical Statistics 14(1), pp. 1–19.
