RECSM Summer School: Machine Learning for Social Sciences
Session 3.3: K-Means Clustering
Reto Wüest
Department of Political Science and International Relations, University of Geneva
Clustering
Clustering refers to a set of
[Figure: three panels showing clustering results for K=2, K=3, and K=4. The colors of the observations are the output of the clustering algorithm: they indicate the cluster to which each observation was assigned.]
1. C1 ∪ C2 ∪ ... ∪ CK = {1, ..., n}. In other words, each observation belongs to at least one of the K clusters.
2. Ck ∩ Ck′ = ∅ for all k ≠ k′. In other words, no observation belongs to more than one cluster.
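The two set conditions can be checked mechanically. The following is a minimal sketch in Python; the function name `is_partition` and the example clusters are hypothetical and used only for illustration.

```python
def is_partition(clusters, n):
    """Return True if `clusters` (a list of sets of indices) partitions {1, ..., n}."""
    everything = set(range(1, n + 1))
    union = set().union(*clusters)
    # Property 1: the union of all clusters covers every observation.
    covers_all = union == everything
    # Property 2: clusters are pairwise disjoint, so their sizes add up
    # to the size of the union (no index is counted twice).
    disjoint = sum(len(c) for c in clusters) == len(union)
    return covers_all and disjoint


print(is_partition([{1, 2}, {3}, {4, 5}], n=5))     # True: valid partition
print(is_partition([{1, 2}, {2, 3}, {4, 5}], n=5))  # False: observation 2 is in two clusters
print(is_partition([{1, 2}, {4, 5}], n=5))          # False: observation 3 is unassigned
```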
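K-means chooses the clusters C1, ..., CK to minimize the total within-cluster variation:

\[ \underset{C_1, \ldots, C_K}{\mathrm{minimize}} \; \sum_{k=1}^{K} W(C_k) \]

With squared Euclidean distance as the measure of within-cluster variation,

\[ W(C_k) = \frac{1}{|C_k|} \sum_{i, i' \in C_k} \sum_{j=1}^{p} (x_{ij} - x_{i'j})^2 \]

the optimization problem becomes

\[ \underset{C_1, \ldots, C_K}{\mathrm{minimize}} \; \sum_{k=1}^{K} \frac{1}{|C_k|} \sum_{i, i' \in C_k} \sum_{j=1}^{p} (x_{ij} - x_{i'j})^2 \qquad (3.3.3) \]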
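As a rough illustration of the objective referred to as (3.3.3), the sketch below computes the within-cluster variation the way it is usually defined in James et al. (2013): the squared Euclidean distances summed over all ordered pairs of observations in a cluster, divided by the cluster size. The function names and the toy data are hypothetical, and clusters are numbered from 0 rather than 1 for convenience.

```python
def within_cluster_variation(points):
    """W(C_k): squared Euclidean distances summed over all ordered pairs
    (i, i') in the cluster, divided by the cluster size |C_k|."""
    total = 0.0
    for a in points:
        for b in points:
            total += sum((aj - bj) ** 2 for aj, bj in zip(a, b))
    return total / len(points)


def objective(X, labels, K):
    """Sum of W(C_k) over clusters k = 0, ..., K-1 (empty clusters contribute 0)."""
    value = 0.0
    for k in range(K):
        members = [x for x, lab in zip(X, labels) if lab == k]
        if members:
            value += within_cluster_variation(members)
    return value


# Hypothetical toy data: n = 4 observations, p = 2 features.
X = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(objective(X, [0, 0, 1, 1], K=2))  # 8.0   (tight clusters)
print(objective(X, [0, 1, 0, 1], K=2))  # 200.0 (poor assignment)
```

The assignment that groups nearby observations gives the smaller value, which is what the minimization rewards.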
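1. Randomly assign a number, from 1 to K, to each of the observations. These serve as the initial cluster assignments.
2. Iterate until the cluster assignments stop changing:
   2a. For each of the K clusters, compute the cluster centroid (the vector of the p feature means of the observations in that cluster).
   2b. Assign each observation to the cluster whose centroid is closest, where closeness is measured by Euclidean distance.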
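The algorithm can be sketched in a few lines of plain Python. This is an illustrative sketch, not the course's implementation. It assumes the standard steps in James et al. (2013): Step 2a computes each cluster's centroid, and Step 2b reassigns each observation to the nearest centroid. The function name `kmeans`, the `max_iter` safeguard, and the handling of empty clusters are assumptions added here. Clusters are numbered 0 to K-1.

```python
import random


def kmeans(X, K, seed=0, max_iter=100):
    """K-means from random initial assignments; returns labels in 0..K-1."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    # Step 1: randomly assign a cluster number to each observation.
    labels = [rng.randrange(K) for _ in range(n)]
    for _ in range(max_iter):
        # Step 2a: compute each cluster's centroid (feature-wise mean).
        centroids = []
        for k in range(K):
            members = [X[i] for i in range(n) if labels[i] == k]
            if not members:
                # Assumption: place an empty cluster's centroid at a random observation.
                members = [rng.choice(X)]
            centroids.append([sum(m[j] for m in members) / len(members) for j in range(p)])
        # Step 2b: assign each observation to the closest centroid
        # (squared Euclidean distance gives the same ordering as Euclidean distance).
        new_labels = [
            min(range(K), key=lambda k: sum((X[i][j] - centroids[k][j]) ** 2 for j in range(p)))
            for i in range(n)
        ]
        # Stop when the cluster assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Hypothetical toy data with two well-separated groups.
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kmeans(X, K=2))
```

The cluster numbers themselves are arbitrary: which group ends up labelled 0 or 1 depends on the random initial assignment.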
[Figure panels: Data; Step 1; Iteration 1, Step 2a; Iteration 1, Step 2b; Iteration 2, Step 2a; Final Results]
(Source: James et al. 2013, 389)
[Figure: six plots with objective values 320.9, 235.8, 235.8, 235.8, 235.8, and 310.9.]
(Above each plot is the value of the objective (3.3.3). Source: James et al. 2013, 390)
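The differing objective values suggest that separate runs of K-means can end in different solutions. A common way to deal with this is to run the algorithm several times from different random initial assignments and keep the run with the smallest objective. The sketch below illustrates that idea under stated assumptions: all function names, the default of six starts, and the empty-cluster handling are hypothetical, and the K-means routine is a compact stand-in rather than the course's code.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two observations."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def objective(X, labels, K):
    """Per cluster: pairwise squared distances divided by cluster size, summed over clusters."""
    total = 0.0
    for k in range(K):
        C = [x for x, lab in zip(X, labels) if lab == k]
        if C:
            total += sum(sq_dist(a, b) for a in C for b in C) / len(C)
    return total


def kmeans_once(X, K, rng, max_iter=100):
    """One K-means run from a random initial assignment (compact sketch)."""
    labels = [rng.randrange(K) for _ in X]
    for _ in range(max_iter):
        cents = []
        for k in range(K):
            # Assumption: an empty cluster gets its centroid at a random observation.
            C = [x for x, lab in zip(X, labels) if lab == k] or [rng.choice(X)]
            cents.append([sum(col) / len(C) for col in zip(*C)])
        new = [min(range(K), key=lambda k: sq_dist(x, cents[k])) for x in X]
        if new == labels:
            break
        labels = new
    return labels


def best_of(X, K, n_starts=6, seed=0):
    """Run K-means n_starts times and keep the run with the smallest objective."""
    rng = random.Random(seed)
    runs = [kmeans_once(X, K, rng) for _ in range(n_starts)]
    scores = [objective(X, labels, K) for labels in runs]
    best = min(range(n_starts), key=scores.__getitem__)
    return runs[best], scores


# Hypothetical toy data with three loose groups.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0),
     (10.0, 11.0), (11.0, 10.0), (20.0, 0.0), (20.0, 1.0)]
best_labels, scores = best_of(X, K=3)
print(scores)       # one objective value per run
print(best_labels)  # assignments of the run with the lowest objective
```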