Introduction to Machine Learning Part 2
Yingyu Liang yliang@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison
[Based on slides from Jerry Zhu]
K-means clustering
Very popular clustering method
Don't confuse
Randomly pick positions as initial cluster centers (not necessarily a data point)
Each point finds which cluster center it is closest to (very much like 1NN). The point belongs to that cluster.
Each cluster computes its new centroid, based on which points belong to it
Repeat until convergence (cluster centers no longer move)…
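The loop described above can be sketched in plain Python. This is a minimal illustration, not the slides' own code: the function names (`assign`, `update`, `kmeans`), the `seed` and `max_iter` parameters, drawing the initial centers uniformly from the bounding box of the data, and leaving an empty cluster's center where it is are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """For each point, the index of the closest center (as in 1NN)."""
    return [min(range(len(centers)),
                key=lambda z: squared_distance(x, centers[z]))
            for x in points]


def update(points, labels, centers):
    """New centroid of each cluster, based on which points belong to it.

    Assumption: a cluster with no points keeps its old center.
    """
    new_centers = []
    for z, old in enumerate(centers):
        members = [x for x, y in zip(points, labels) if y == z]
        if members:
            new_centers.append(
                tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers


def kmeans(points, k, seed=0, max_iter=100):
    """Alternate assignment and centroid updates until centers stop moving."""
    rng = random.Random(seed)
    columns = list(zip(*points))
    # Assumption: initial centers are random positions inside the bounding
    # box of the data, so they are not necessarily data points.
    centers = [tuple(rng.uniform(min(c), max(c)) for c in columns)
               for _ in range(k)]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = update(points, labels, centers)
        if new_centers == centers:  # convergence: centers no longer move
            break
        centers = new_centers
    return centers, assign(points, centers)
```

Calling `kmeans(points, k)` on a list of equal-length tuples returns the final centers and one cluster index per point. Which clustering comes out depends on the random starting centers.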
$y(x_1), \ldots, y(x_n),\; c_1(1), \ldots, c_1(D),\; \ldots,\; c_k(1), \ldots, c_k(D)$
location of cluster centers
$\min \sum_{x} \sum_{d=1}^{D} [x(d) - c_{y(x)}(d)]^2 = \min \sum_{z=1}^{k} \sum_{y(x)=z} \sum_{d=1}^{D} [x(d) - c_z(d)]^2$
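$\frac{\partial}{\partial c_z(d)} \sum_{z=1}^{k} \sum_{y(x)=z} \sum_{d=1}^{D} [x(d) - c_z(d)]^2 = 0$
$c_z(d) = \frac{\sum_{y(x)=z} x(d)}{|n_z|}$
where $|n_z|$ is the number of points assigned to cluster z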
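The step from the zero-derivative condition to the centroid formula can be written out as follows. The primed indices $z'$ and $d'$ are a renaming introduced here only to keep the summation indices apart from the fixed cluster $z$ and dimension $d$ being differentiated. Only the terms with $z' = z$ and $d' = d$ depend on $c_z(d)$.

```latex
\begin{align*}
\frac{\partial}{\partial c_z(d)}
  \sum_{z'=1}^{k} \sum_{y(x)=z'} \sum_{d'=1}^{D} [x(d') - c_{z'}(d')]^2
  &= -2 \sum_{y(x)=z} [x(d) - c_z(d)] = 0 \\
\sum_{y(x)=z} x(d) &= |n_z| \, c_z(d) \\
c_z(d) &= \frac{\sum_{y(x)=z} x(d)}{|n_z|}
\end{align*}
```

So each coordinate of the new center is the average of that coordinate over the points in the cluster.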
in step 2.
repeat
There are a finite number of points
Finite ways of assigning points to clusters
In step 1, an assignment that reduces distortion has to be a new assignment not used before
Step 1 will terminate
So will step 2
So k-means terminates
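The termination argument rests on the distortion never increasing from one step to the next. A small sketch that records the distortion after every assignment step and every center update makes this visible on toy data. The function names, the one-dimensional data set and the starting centers are all made up for illustration.

```python
def distortion(points, labels, centers):
    """Sum over points of the squared distance to the assigned center."""
    return sum((a - b) ** 2
               for x, y in zip(points, labels)
               for a, b in zip(x, centers[y]))


def nearest(x, centers):
    """Index of the center closest to point x."""
    return min(range(len(centers)),
               key=lambda z: sum((a - b) ** 2 for a, b in zip(x, centers[z])))


def trace_kmeans(points, centers):
    """Run k-means from the given centers, recording distortion per step."""
    labels = [nearest(x, centers) for x in points]  # step 1: assign points
    trace = [distortion(points, labels, centers)]
    while True:
        # step 2: move each non-empty cluster's center to its points' mean
        centers = list(centers)
        for z in range(len(centers)):
            members = [x for x, y in zip(points, labels) if y == z]
            if members:
                centers[z] = tuple(sum(c) / len(members)
                                   for c in zip(*members))
        trace.append(distortion(points, labels, centers))
        new_labels = [nearest(x, centers) for x in points]  # step 1 again
        if new_labels == labels:  # no new assignment: nothing left to change
            return trace
        labels = new_labels
        trace.append(distortion(points, labels, centers))


# Hypothetical 1-D data, with both starting centers at the left end.
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
trace = trace_kmeans(points, [(0.0,), (1.0,)])
# Each recorded value is no larger than the one before it.
assert all(later <= earlier for earlier, later in zip(trace, trace[1:]))
```

On this data the recorded distortion starts at 82.0 and ends at 2.0 after two rounds, with the final clusters {0, 1, 2} and {10}.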
for the k clusters
#dimensions #clusters #points