Data Mining II
Prof. Dr. Karsten Borgwardt, Department Biosystems, ETH Zürich
Basel, Spring Semester 2016
D-BSSE
Our course - The team
Dr. Damian Roqueiro, Dr. Dean Bodenham, Dr. Dominik Grimm, Dr. Xiao He
D-BSSE Karsten Borgwardt Data Mining II Course, Basel Spring Semester 2016 2 / 117
[Figure: 3D scatter plot with axes x, y, z; points labeled class 1 to class 5]
[Figure: data after transformation; axes: Transformed X Values, Transformed Y Values]
Variance Explained - PC1: 0.55, PC2: 0.32, PC3: 0.13
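The covariance matrix is $\Sigma = \frac{1}{n} \sum_{i=1}^{n} x_i x_i^\top = \frac{1}{n} X X^\top$.
Let $v^*$ be an eigenvector of $K = X^\top X$ with eigenvalue $\lambda$, so that $K v^* = \lambda v^*$.
Multiplying from the left by $\frac{1}{n} X$ and using the definition of $K$ we obtain
$\Sigma (X v^*) = \frac{\lambda}{n} (X v^*)$.
Hence $\frac{X v^*}{\|X v^*\|}$ is an eigenvector of $\Sigma$ with eigenvalue $\frac{\lambda}{n}$.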
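The statement that Xv∗/||Xv∗|| is an eigenvector of Σ with eigenvalue λ/n can be checked numerically. The following is a minimal sketch, assuming that the columns of X are the centered data points and that K = X⊤X; the toy dimensions and the random data are hypothetical and not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 8                            # assumed toy sizes: many features, few samples
X = rng.normal(size=(d, n))             # columns are the data points x_i
X = X - X.mean(axis=1, keepdims=True)   # center the data

Sigma = X @ X.T / n                     # d x d covariance matrix (1/n) X X^T
K = X.T @ X                             # n x n matrix (assumed definition of K)

# Eigendecomposition of the small matrix K; eigh returns ascending eigenvalues.
lams, V = np.linalg.eigh(K)
lam, v_star = lams[-1], V[:, -1]        # largest eigenvalue and its eigenvector

# Map the eigenvector of K to a unit-length eigenvector of Sigma.
u = X @ v_star
u = u / np.linalg.norm(u)

# Should be close to zero if Sigma u = (lam / n) u holds.
residual = np.linalg.norm(Sigma @ u - (lam / n) * u)
print(residual)
```

When n is much smaller than d, decomposing the n x n matrix is cheaper than decomposing the d x d covariance matrix, which is the practical appeal of this route.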
i ||Xv∗
Fraction of variance explained by the first $k$ of $d$ principal components: $\frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{d} \lambda_i}$.
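The sums over the eigenvalues λi are the kind of quantity behind "variance explained" figures such as those quoted for PC1, PC2 and PC3 earlier. A minimal sketch, assuming the usual definition (each eigenvalue of the covariance matrix divided by the sum of all eigenvalues); the toy data and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical toy data: 200 points (rows) in 3 dimensions with unequal spread.
X = rng.normal(size=(200, 3)) * np.array([3.0, 2.0, 1.0])
X = X - X.mean(axis=0)                    # center each feature

Sigma = X.T @ X / X.shape[0]              # 3 x 3 covariance matrix
lams = np.linalg.eigvalsh(Sigma)[::-1]    # eigenvalues, largest first

ratio = lams / lams.sum()                 # share of variance per component
cumulative = np.cumsum(ratio)             # share captured by the first k components
print(ratio, cumulative)
```

Reading off the smallest k at which `cumulative` passes a chosen threshold is one common way to pick the number of components.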
[Figure: 2D scatter plot; axes x, y]
[Figure: 2D scatter plot with the directions PC1 and PC2 marked; axes x, y]
[Figure: data after transformation; axes: Transformed X Values, Transformed Y Values]
$\begin{pmatrix} L_{1,1} & \cdots & L_{1,n} \\ \vdots & \ddots & \vdots \\ L_{n,1} & \cdots & L_{n,n} \end{pmatrix}$
[Figure: Nonlinear 2D Dataset; axes x, y]
[Figure: Nonlinear 2D Dataset; axes x, y]
Source of Table and subsequent figure: Wolfgang Karl Härdle, Leopold Simar. Applied Multivariate Statistical Analysis. Springer 2015, Chapter 17
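$A = \{-\frac{1}{2} d_{ij}^2\}$.
$B = HAH$ with the centering matrix $H = I - \frac{1}{n} \mathbf{1}\mathbf{1}^\top$.
$X = \Gamma_p \Lambda_p^{\frac{1}{2}}$, where $\Lambda_p$ holds the $p$ largest eigenvalues of $B$ and $\Gamma_p$ the corresponding eigenvectors, and then the coordinates of the points are given by the rows of $X$.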
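The fragments above (a matrix built from −½ d_ij², a centering matrix involving (1/n)11⊤, and coordinates read from the rows of X) match the usual recipe for classical multidimensional scaling. The sketch below follows that recipe under this assumption; the function name `classical_mds` and the test configuration are hypothetical, and the slides may present the steps differently.

```python
import numpy as np

def classical_mds(D, p):
    """Embed n points in p dimensions from an n x n distance matrix D."""
    n = D.shape[0]
    A = -0.5 * D**2                          # A = {-1/2 d_ij^2}
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix H = I - (1/n) 1 1^T
    B = H @ A @ H                            # doubly centered matrix
    lams, G = np.linalg.eigh(B)              # eigenvalues in ascending order
    idx = np.argsort(lams)[::-1][:p]         # indices of the p largest eigenvalues
    lam_p = np.clip(lams[idx], 0.0, None)    # guard against tiny negative values
    return G[:, idx] * np.sqrt(lam_p)        # rows are the point coordinates

# Hypothetical check: recover a planar configuration from its distances.
rng = np.random.default_rng(2)
P = rng.normal(size=(6, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
X = classical_mds(D, p=2)
D_hat = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
print(np.abs(D - D_hat).max())
```

The recovered coordinates generally differ from the original ones by a rotation, reflection or translation, so the check compares pairwise distances rather than the coordinates themselves.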
[Figure: a point x0 and its neighbors x1, x2, x3 with weights w1, w2, w3]
$x_0 = x_1 w_1 + x_2 w_2 + x_3 w_3$
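The relation x0 = x1w1 + x2w2 + x3w3 expresses a point as a weighted combination of its neighbors. A minimal sketch of how such weights could be obtained by least squares follows; the neighbor coordinates and the "true" weights are hypothetical. Methods such as locally linear embedding additionally constrain the weights to sum to one, which this sketch does not enforce.

```python
import numpy as np

# Hypothetical neighbors x1, x2, x3 (one per row) in three dimensions.
neighbors = np.array([[1.0, 0.0, 0.5],
                      [0.0, 1.0, 0.5],
                      [1.0, 1.0, 0.0]])
true_w = np.array([0.5, 0.3, 0.2])
x0 = true_w @ neighbors                  # x0 = x1*w1 + x2*w2 + x3*w3

# Least-squares weights: minimize ||x0 - sum_j w_j x_j||^2.
w, *_ = np.linalg.lstsq(neighbors.T, x0, rcond=None)
reconstruction_error = np.linalg.norm(x0 - w @ neighbors)
print(w, reconstruction_error)
```

Because the three neighbors here are linearly independent, the weights are unique and the reconstruction is exact up to rounding; with more neighbors than dimensions some form of regularization is typically needed.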
[Figure: plot with axis labels Iteration and Reconstruction Error]