RECSM Summer School: Machine Learning for Social Sciences
Session 3.2: Principal Components Analysis
Reto Wüest
Department of Political Science and International Relations, University of Geneva
$\sum_{j=1}^{p} \phi_{j1}^{2} = 1.$
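$\sum_{j=1}^{p} \phi_{j1}^{2} = 1.$
$\max_{\phi_{11},\dots,\phi_{p1}} \left\{ \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{p} \phi_{j1} x_{ij} \right)^{2} \right\}$ subject to $\sum_{j=1}^{p} \phi_{j1}^{2} = 1.$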
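The first loading vector φ11,...,φp1 maximizes the sample variance of the scores under the unit-norm constraint. A minimal sketch of that idea follows, assuming centered data and using power iteration on the matrix X^T X / n. Power iteration is a technique swapped in here for illustration, not a method taken from the slides, and the toy data and the function names `center_columns`, `first_loading`, and `score_variance` are hypothetical. In practice a linear algebra library would normally be used instead.

```python
import math

def center_columns(X):
    """Subtract each column's mean so every variable has mean zero."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def first_loading(X, iterations=500):
    """Approximate the first loading vector by power iteration on X^T X / n."""
    Xc = center_columns(X)
    n, p = len(Xc), len(Xc[0])
    # Covariance matrix with divisor n, matching the 1/n in the objective.
    S = [[sum(r[j] * r[k] for r in Xc) / n for k in range(p)] for j in range(p)]
    phi = [1.0 / math.sqrt(p)] * p  # unit-norm starting vector (an assumption)
    for _ in range(iterations):
        v = [sum(S[j][k] * phi[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in v))
        phi = [c / norm for c in v]  # re-impose sum_j phi_j1^2 = 1
    return phi

def score_variance(X, phi):
    """Objective value (1/n) * sum_i (sum_j phi_j x_ij)^2 on centered data."""
    Xc = center_columns(X)
    scores = [sum(f * x for f, x in zip(phi, row)) for row in Xc]
    return sum(z * z for z in scores) / len(Xc)

# Hypothetical toy data: 6 observations, 2 positively correlated variables.
X = [[2.0, 1.9], [0.5, 0.7], [3.1, 2.9], [1.0, 1.2], [4.0, 4.2], [2.5, 2.3]]
phi1 = first_loading(X)
print("first loading vector:", phi1)
print("variance of first scores:", score_variance(X, phi1))
```

The sketch assumes the starting vector is not orthogonal to the leading direction and that the data are not constant; both hold for the toy data but are not guaranteed in general.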
[Figure: biplot of the 50 US states on the First Principal Component and Second Principal Component, with loading vectors for Murder, Assault, UrbanPop, and Rape.]
(Source: James et al. 2013, 378)
[Figure: Ad Spending (vertical axis) plotted against Population (horizontal axis).]
(Source: James et al. 2013, 230)
[Figure: two panels; the projection panel has axes First principal component and Second principal component.]
(Left: the first two principal component directions span the plane that best fits the data. Right: Projection of the observations onto the plane; the variance on the plane is maximized. Source: James et al. 2013, 380)
[Figure: two biplots on the First Principal Component and Second Principal Component, each with loading vectors for Murder, Assault, UrbanPop, and Rape. Panels: Scaled, Unscaled.]
(Source: James et al. 2013, 381)
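The contrast between scaled and unscaled variables can be illustrated with a small sketch. It assumes two variables only, so the first loading vector has a closed form, and it uses hypothetical toy data in which the second variable is measured on a much larger scale than the first. These numbers are invented for illustration and are not the data behind the figure. The function names `standardize` and `first_loading_2d` are also hypothetical.

```python
import math

def standardize(X, scale):
    """Center each column; if scale is True, also divide by its standard deviation."""
    n, p = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in X) / n) for j in range(p)]
    return [[(r[j] - means[j]) / (sds[j] if scale else 1.0) for j in range(p)]
            for r in X]

def first_loading_2d(Z):
    """Closed-form first loading vector for two centered variables."""
    n = len(Z)
    s11 = sum(r[0] * r[0] for r in Z) / n
    s22 = sum(r[1] * r[1] for r in Z) / n
    s12 = sum(r[0] * r[1] for r in Z) / n
    # Angle of the direction of maximal variance for a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * s12, s11 - s22)
    return [math.cos(theta), math.sin(theta)]

# Hypothetical toy data: the second column has far larger variance than the first.
X = [[1.0, 120.0], [2.0, 310.0], [3.0, 150.0],
     [4.0, 420.0], [5.0, 280.0], [6.0, 390.0]]

unscaled = first_loading_2d(standardize(X, scale=False))
scaled = first_loading_2d(standardize(X, scale=True))
print("unscaled first loading:", unscaled)
print("scaled first loading:  ", scaled)
```

Without scaling, the first loading vector places almost all of its weight on the high-variance variable; after scaling both variables to unit standard deviation, the two loadings have equal magnitude.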
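$\sum_{j=1}^{p} \mathrm{Var}(X_j) = \sum_{j=1}^{p} \frac{1}{n} \sum_{i=1}^{n} x_{ij}^{2}.$
$\frac{1}{n} \sum_{i=1}^{n} z_{im}^{2} = \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{p} \phi_{jm} x_{ij} \right)^{2}.$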
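Proportion of variance explained by the $m$th principal component:
$\frac{\sum_{i=1}^{n} \left( \sum_{j=1}^{p} \phi_{jm} x_{ij} \right)^{2}}{\sum_{j=1}^{p} \sum_{i=1}^{n} x_{ij}^{2}}.$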
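The proportion of variance explained can be computed directly from this ratio. The sketch below assumes centered data with two variables, so that both loading vectors are available in closed form; the toy data and the function names `center_columns` and `pve` are hypothetical and serve only to illustrate the formula.

```python
import math

def center_columns(X):
    """Subtract each column's mean so every variable has mean zero."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def pve(Xc, phi):
    """sum_i (sum_j phi_jm x_ij)^2 divided by sum_j sum_i x_ij^2."""
    explained = sum(sum(f * x for f, x in zip(phi, row)) ** 2 for row in Xc)
    total = sum(x * x for row in Xc for x in row)
    return explained / total

# Hypothetical toy data: 6 observations, 2 variables.
X = [[2.0, 1.9], [0.5, 0.7], [3.1, 2.9], [1.0, 1.2], [4.0, 4.2], [2.5, 2.3]]
Xc = center_columns(X)
n = len(Xc)

# Entries of the 2x2 covariance matrix (divisor n).
s11 = sum(r[0] * r[0] for r in Xc) / n
s22 = sum(r[1] * r[1] for r in Xc) / n
s12 = sum(r[0] * r[1] for r in Xc) / n

# With two variables, the loading vectors follow from a single rotation angle.
theta = 0.5 * math.atan2(2.0 * s12, s11 - s22)
phi1 = [math.cos(theta), math.sin(theta)]   # first loading vector
phi2 = [-math.sin(theta), math.cos(theta)]  # second, orthogonal to the first

pve1, pve2 = pve(Xc, phi1), pve(Xc, phi2)
cumulative = [pve1, pve1 + pve2]
print("PVE per component:", [pve1, pve2])
print("cumulative PVE:   ", cumulative)
```

Because the two loading vectors form an orthonormal basis, the two proportions sum to one, which is what a cumulative plot reaches at the last component.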
[Figure: two panels with Principal Component on the horizontal axis; one panel shows Cumulative Prop. Variance Explained.]
(Source: James et al. 2013, 383)