Correspondence Analysis Outliers Confidence regions
Correspondence Analysis and Moderate Outliers
Anna Langovaya, Sonja Kuhnt
TU Dortmund
February 9, 2011
TU Dortmund A.Langovaya, S.Kuhnt CA and Outliers 1 / 23
Correspondence Analysis: Model, CA
Row coordinates: $D_r^{-1/2} U \Sigma$
Column coordinates: $D_c^{-1/2} V \Sigma$
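The formulas on this slide did not survive extraction. As a reading aid, here is a minimal sketch of the usual SVD formulation of correspondence analysis for a two-way table. It assumes the standard definitions (correspondence matrix P, row masses r, column masses c, standardized residuals S = UΣVᵀ). The function name `correspondence_analysis` and the toy table are illustrative and do not come from the slides, which used the R package ca.

```python
import numpy as np

def correspondence_analysis(counts):
    """Sketch of simple CA: returns row coords, column coords, singular values.

    Assumes every row and column of the table has a positive total;
    a zero margin would cause a division by zero below.
    """
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                       # correspondence matrix
    r = P.sum(axis=1)                     # row masses
    c = P.sum(axis=0)                     # column masses
    expected = np.outer(r, c)             # cell probabilities under independence
    S = (P - expected) / np.sqrt(expected)            # standardized residuals
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    F = (U * sigma) / np.sqrt(r)[:, None]     # row principal coordinates
    G = (Vt.T * sigma) / np.sqrt(c)[:, None]  # column principal coordinates
    return F, G, sigma

# Toy 3 x 3 table (hypothetical data, for illustration only)
table = np.array([[20, 10, 5],
                  [10, 20, 10],
                  [5, 10, 30]])
F, G, sigma = correspondence_analysis(table)
```

The first two columns of `F` and `G` give the points drawn in a two-dimensional CA map, and the sum of the squared singular values equals the chi-squared statistic divided by the table total.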
Outliers: Idea, Design, Results
1. Randomly generate marginal probabilities π_i.., π_.j., π_..k
2. Define probabilities π_ijk = π_i.. · π_.j. · π_..k
3. Simulate n observations from Multinomial(n, (π_l), l = 1, ..., IJK)
4. Apply correspondence analysis (R package: ca)
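Steps 1 to 3 above can be sketched in a few lines. This is a hedged illustration, not the authors' code: the slides do not say how the marginals were drawn, so normalised uniform weights are an assumption, and the names `random_marginal` and `simulate_independent_table` are hypothetical. The sample size n = 1000 and the 4 × 4 × 4 dimensions are illustrative choices. Step 4 is not reproduced here; the slides used the R package ca for it.

```python
import random
from itertools import product

def random_marginal(k, rng):
    """Draw k positive weights and normalise them to sum to 1 (assumed scheme)."""
    weights = [1.0 - rng.random() for _ in range(k)]   # values in (0, 1]
    total = sum(weights)
    return [w / total for w in weights]

def simulate_independent_table(I, J, K, n, seed=0):
    rng = random.Random(seed)
    # Step 1: random marginal probabilities for the three variables
    pi_i, pi_j, pi_k = (random_marginal(m, rng) for m in (I, J, K))
    # Step 2: cell probabilities under complete independence
    cells = list(product(range(I), range(J), range(K)))
    probs = [pi_i[i] * pi_j[j] * pi_k[k] for i, j, k in cells]
    # Step 3: n multinomial observations, tallied per cell
    counts = [0] * len(cells)
    for cell in rng.choices(range(len(cells)), weights=probs, k=n):
        counts[cell] += 1
    # Flatten to an (I*J) x K table: rows are (X1, X2) pairs, columns are X3
    table = [counts[row * K:(row + 1) * K] for row in range(I * J)]
    return probs, table

probs, table = simulate_independent_table(4, 4, 4, n=1000)
```

The flattened layout (16 rows for the X1, X2 combinations and 4 columns for X3) is chosen so that a two-way correspondence analysis can be applied to the three-way table.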
1. Randomly generate marginal probabilities π_i.., π_.j., π_..k
2. Define probabilities π_ijk = π_i.. · π_.j. · π_..k
3. Outlier generation: replace the chosen π_ijk by 1.2 · max(π_ijk)
4. Rescale the probabilities to sum to 1
5. Simulate n observations from Multinomial(n, (π_l), l = 1, ..., IJK)
6. Apply correspondence analysis (R package: ca)
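The outlier-generation and rescaling steps can be illustrated as follows. This is a sketch under stated assumptions: "(1.2)max(πijk)" is read as 1.2 times the largest cell probability, the rescaling target is taken to be a total of 1, and the function name `contaminate` and the toy probabilities are hypothetical.

```python
import random

def contaminate(probs, outlier_cells, factor=1.2):
    """Set the chosen cells to factor * max(probs), then renormalise."""
    peak = max(probs)                  # largest probability before contamination
    out = list(probs)
    for idx in outlier_cells:
        out[idx] = factor * peak       # outlier generation
    total = sum(out)
    return [p / total for p in out]    # rescale so the probabilities sum to 1

# Toy cell probabilities (hypothetical, for illustration only)
base = [0.1, 0.2, 0.3, 0.4]
contaminated = contaminate(base, outlier_cells=[0])

# Simulate n = 1000 observations from the contaminated multinomial
rng = random.Random(1)
draws = rng.choices(range(len(contaminated)), weights=contaminated, k=1000)
counts = [draws.count(cell) for cell in range(len(contaminated))]
```

After rescaling, the contaminated cell remains 1.2 times as likely as the previously largest cell, which is what makes the outlier moderate rather than extreme.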
Simulated tables A to D. Rows: levels of X1 and X2; columns a to d: levels of X3.

[[A]]
X1 X2 |  a  b  c  d
 1  1 | 62  0  0  1
 1  2 |  0  0  0  2
 1  3 |  0  0  1  2
 1  4 |  0  0  0  1
 2  1 |  4 22  0 16
 2  2 | 19 45  7 32
 2  3 |  5 42  7 39
 2  4 | 15 50  6 23
 3  1 |  4 25  1 13
 3  2 | 12 73  8 41
 3  3 | 12 60  4 29
 3  4 | 12 58 11 43
 4  1 |  1 11  0  9
 4  2 | 10 29  4 21
 4  3 |  9 31  7 22
 4  4 |  8 19  1 11

[[B]]
X1 X2 |  a  b  c  d
 1  1 | 61 20 15  4
 1  2 |  1 64 47 39
 1  3 |  0 21 20 14
 1  4 |  3 53 51 41
 2  1 |  0 11 10  6
 2  2 |  0 46 34 31
 2  3 |  0 20 10 11
 2  4 |  0 27 34 37
 3  1 |  0  3  3  2
 3  2 |  0 11  3  9
 3  3 |  0  4  2  1
 3  4 |  0  5  6  5
 4  1 |  1 11  6  7
 4  2 |  0 32 28 16
 4  3 |  0  7  7  7
 4  4 |  0 31 40 22

[[C]]
X1 X2 |  a  b  c  d
 1  1 | 54  5  3  4
 1  2 |  0  4  2 11
 1  3 |  3  2  6  6
 1  4 |  2  1  8  5
 2  1 |  8 15 23 24
 2  2 | 13 29 22 52
 2  3 | 10 40 24 43
 2  4 |  7 18 17 33
 3  1 | 13 22 18 28
 3  2 | 13 45 33 50
 3  3 | 10 42 39 60
 3  4 |  7 33 13 38
 4  1 |  2  2  3  4
 4  2 |  1  5  2  1
 4  3 |  2  2  6  5
 4  4 |  1  0  3  3

[[D]]
X1 X2 |  a  b  c  d
 1  1 | 90 13 37 35
 1  2 | 10 21 53 63
 1  3 |  0  2 13  9
 1  4 |  4  7 28 25
 2  1 |  0  1  3  3
 2  2 |  1  3  2  6
 2  3 |  0  0  2  0
 2  4 |  1  1  3  2
 3  1 |  5 18 65 71
 3  2 |  7 19 84 84
 3  3 |  1  5 15 13
 3  4 |  5 17 39 34
 4  1 |  1  3  9 10
 4  2 |  2  1 12 16
 4  3 |  0  0  7  1
 4  4 |  1  3  6  8
[Figure: CA maps for tables A, B, C and D. Row points are labelled by number, column points by a, b, c, d.]
[Figure: frequency histograms of the CA coordinates of cell 1 over Dimension 1 and Dimension 2. Panels: independence, rows; independence, columns; outlier, rows; outlier, columns.]
[Figure: scatter plots of the CA coordinates of cell 1. Panels: rows under independence (axes xr, yr); columns under independence (axes xc, yc); rows with outlier (axes xr, yr); columns with outlier (axes xc, yc).]
Confidence regions: One outlier, Several outliers, Outlook
[Figure, shown on three consecutive slides: Simulated confidence regions, independence with an outlier. Axes: Dimension 1, Dimension 2.]
[Figure: three CA maps. Row points are labelled by number, column points by a, b, c, d.]
[Figure, shown on two consecutive slides: Simulated confidence regions, independence with outliers. Axes: Dimension 1, Dimension 2.]