Machine Learning
Lecture Notes on Clustering (II) 2017-2018
Davide Eynard
davide.eynard@usi.ch
Institute of Computational Science, Università della Svizzera italiana
Today's Outline
- K-Means limits
- K-Means extensions: K-Medoids (PAM) and Fuzzy C-Means
- Hierarchical clustering
K-Means limits
- Importance of choosing initial centroids
- Differing sizes
- Differing density
- Non-globular shapes
(each case is illustrated with a figure in the original slides)
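To make the first limitation concrete, here is a minimal k-means sketch in pure NumPy (the function name `kmeans` and its helpers are illustrative, not from the slides). Running it from several random seeds and keeping the run with the lowest sum of squared errors (SSE) is the usual remedy for bad initial centroids.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means; returns (centroids, labels, sse). The result depends on the seed."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute each centroid as the mean of its assigned points
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    sse = (d[np.arange(len(X)), labels] ** 2).sum()
    return centroids, labels, sse

# Three well-separated blobs; different seeds can still converge to different local minima.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in [(0, 0), (5, 0), (0, 5)]])

sses = [kmeans(X, k=3, seed=s)[2] for s in range(10)]
best = min(sses)  # restart several times, keep the lowest-SSE run
```

Comparing the ten SSE values shows directly why the choice of initial centroids matters.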
K-Means: higher K
What if we tried to increase K to solve K-Means problems? Using many small clusters and putting them together afterwards can mitigate the size, density, and shape problems above (the original slides illustrate this on the same datasets).
K-Medoids
- A centroid is the representative point of a cluster, but depending on the distribution of the data it could be not part of the cluster itself.
- A medoid is the representative point of a set chosen among the members of the set: the most centrally located object in the cluster.
PAM
PAM means Partitioning Around Medoids. The algorithm follows:
1. Arbitrarily choose K objects as the initial medoids.
2. Assign each remaining object to the cluster of its nearest medoid.
3. For each pair of a medoid m and a non-medoid object o, compute the change in total cost (squared-error criterion) obtained by swapping m with o.
4. Perform the swaps that decrease the cost; repeat from step 2 until no swap improves the cost.
- A medoid is more robust to outliers than a mean (can you tell why?).
- PAM works well on small data sets, but it does not scale to large data sets: each iteration examines O(K(N − K)²) swaps.
- PAM needs no vector representation of the data: it never uses the positions of points but just their distances!
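The swap-based search above can be sketched as follows. This is a minimal, illustrative implementation (the function name `pam` and the first-improvement swap strategy are my choices, not the slides'); note that it consumes only a distance matrix, never coordinates, exactly as the last point claims.

```python
import numpy as np

def pam(D, k, max_iter=20, seed=0):
    """PAM sketch on a precomputed n x n distance matrix D.
    Returns (medoid indices, labels, total cost)."""
    rng = np.random.default_rng(seed)
    n = len(D)
    medoids = list(rng.choice(n, size=k, replace=False))

    def cost(meds):
        # total distance of every point to its nearest medoid
        return D[:, meds].min(axis=1).sum()

    current = cost(medoids)
    for _ in range(max_iter):
        improved = False
        # try every swap (medoid position mi, non-medoid o); keep it if cost drops
        for mi in range(k):
            for o in range(n):
                if o in medoids:
                    continue
                candidate = medoids[:mi] + [o] + medoids[mi + 1:]
                c = cost(candidate)
                if c < current:
                    medoids, current, improved = candidate, c, True
        if not improved:
            break
    labels = D[:, medoids].argmin(axis=1)
    return medoids, labels, current

# toy example: points on a line, with an outlier at 100 (hypothetical data)
pts = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 100.0])
D = np.abs(pts[:, None] - pts[None, :])
medoids, labels, total = pam(D, k=2)
```

Because medoids are actual data points, the outlier cannot drag a representative away from the bulk of a cluster the way it drags a mean.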
Fuzzy C-Means
Fuzzy C-Means (FCM, developed by Dunn in 1973 and improved by Bezdek in 1981) is a method of clustering which allows one piece of data to belong to two or more clusters. It is based on the minimization of the objective function

    J_m = Σ_{i=1}^{N} Σ_{j=1}^{C} u_ij^m ||x_i − c_j||²,   1 < m < ∞

where: m is any real number greater than 1 (fuzziness coefficient), u_ij is the degree of membership of x_i in cluster j, x_i is the i-th of the d-dimensional measured data, c_j is the d-dimensional center of cluster j, and ||·|| is any norm expressing the similarity between measured data and the center.
K-Means vs. FCM
- In K-Means each point is assigned entirely to a single cluster (in the original figure, to the cluster of centroid B).
- In FCM points are not forced into a single cluster: they may belong to several clusters (with different membership values).
Data representation
The membership matrix U_{N×C} makes the difference explicit (here C = 2). In K-Means each row contains a single 1 (hard membership); in FCM each row contains degrees of membership that sum to 1:

    (KM) U_{N×C} =        (FCM) U_{N×C} =
      1   0                 0.8 0.2
      0   1                 0.3 0.7
      1   0                 0.6 0.4
      ...                   ...
      1   0                 0.9 0.1
FCM Algorithm
The algorithm is composed of the following steps:
1. Initialize the membership matrix U = [u_ij], U^(0).
2. At step k, calculate the center vectors C^(k) = [c_j] from U^(k):

       c_j = ( Σ_{i=1}^{N} u_ij^m · x_i ) / ( Σ_{i=1}^{N} u_ij^m )

3. Update the memberships, obtaining U^(k+1):

       u_ij = 1 / Σ_{k=1}^{C} ( ||x_i − c_j|| / ||x_i − c_k|| )^{2/(m−1)}

4. If ||U^(k+1) − U^(k)|| < ε, stop; otherwise return to step 2.
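The steps above translate almost line by line into NumPy. This is a minimal sketch (function name `fcm`, the random initialization, and the toy two-blob dataset are my additions); the two vectorized updates implement exactly the c_j and u_ij formulas of steps 2 and 3.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, eps=1e-5, seed=0):
    """Fuzzy C-Means sketch. Returns centers (c x d) and memberships U (N x c)."""
    rng = np.random.default_rng(seed)
    N = len(X)
    U = rng.random((N, c))
    U /= U.sum(axis=1, keepdims=True)        # step 1: rows of U^(0) sum to 1
    for _ in range(max_iter):
        Um = U ** m
        # step 2: c_j = sum_i u_ij^m x_i / sum_i u_ij^m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                # guard against division by zero
        # step 3: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        # step 4: stop when the memberships barely change
        if np.linalg.norm(U_new - U) < eps:
            U = U_new
            break
        U = U_new
    return centers, U

# two hypothetical Gaussian blobs around (0, 0) and (5, 5)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (40, 2)), rng.normal(5, 0.5, (40, 2))])
centers, U = fcm(X, c=2)
```

Every row of the returned U sums to 1, so each point carries a full fuzzy membership profile rather than a single label.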
An Example
(a worked FCM example, shown as figures in the original slides)
FCM Demo
Time for a demo!
Hierarchical Clustering
Agglomerative Hierarchical Clustering
Given a set of N items to be clustered, and an N×N distance (or dissimilarity) matrix, the basic process of agglomerative hierarchical clustering is the following:
1. Start by assigning each item to its own cluster, so that with N items you have N clusters, each containing one item. Let the distances (dissimilarities) between the clusters be the same as the dissimilarities between the items they contain.
2. Find the closest (most similar) pair of clusters and merge them into a single cluster. Now, you have one cluster less.
3. Compute the distances (dissimilarities) between the new cluster and each of the old clusters.
4. Repeat steps 2 and 3 until all items are clustered into a single cluster of size N.
Single Linkage (SL) clustering
The distance between two clusters is the shortest distance from any member of one cluster to any member of the other one (greatest similarity).
Complete Linkage (CL) clustering
The distance between two clusters is the greatest distance from any member of one cluster to any member of the other one (smallest similarity).
Group Average (GA) clustering
The distance between two clusters is the average distance from any member of one cluster to any member of the other one.
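The three linkage criteria differ only in how the pairwise item distances between two clusters are reduced to a single number. A naive sketch makes this explicit (the function name `agglomerative` and the merge-list output format are my choices; real implementations such as SciPy's `linkage` use much faster update formulas):

```python
import numpy as np

def agglomerative(D, linkage="single"):
    """Naive agglomerative clustering on an N x N distance matrix.
    Returns the sequence of merges as (members_a, members_b, distance)."""
    clusters = {i: [i] for i in range(len(D))}
    merges = []
    # the only difference between SL, CL and GA is this reduction
    reducers = {"single": min, "complete": max,
                "average": lambda xs: sum(xs) / len(xs)}
    link = reducers[linkage]
    while len(clusters) > 1:
        # find the closest pair of clusters under the chosen linkage
        best = None
        keys = list(clusters)
        for ai, a in enumerate(keys):
            for b in keys[ai + 1:]:
                d = link([D[i][j] for i in clusters[a] for j in clusters[b]])
                if best is None or d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        merges.append((clusters[a][:], clusters[b][:], d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# hypothetical 1-D points: two tight pairs and one far-away point
pts = [0.0, 1.0, 5.0, 6.0, 20.0]
D = [[abs(p - q) for q in pts] for p in pts]
merges = agglomerative(D, linkage="single")
# the first two merges join the nearest pairs (0, 1) and (5, 6), each at distance 1
```

Swapping `"single"` for `"complete"` or `"average"` changes only the reduction, which is why the three methods agree when the data cluster strongly (next slide).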
About distances
- If the data exhibit strong clustering tendency, all 3 methods produce similar results.
- SL requires only that pairs of points be close: the produced clusters can violate the "compactness" property (clusters with large diameters).
- CL produces compact clusters, but points can end up closer to members of other clusters than to members of their own (it can violate the "closeness" property).
- GA is a compromise: it tends to produce clusters that are relatively compact and relatively far apart. BUT it depends on the dissimilarity scale.
Hierarchical algorithms limits
- Strength of MIN (single linkage): it can handle clusters of non-elliptical shapes.
- Limitations of MIN: it is sensitive to noise and outliers.
- Strength of MAX (complete linkage): it is less susceptible to noise and outliers.
- Limitations of MAX: it tends to break large clusters and is biased towards globular shapes.
(each case is illustrated with a figure in the original slides)
Hierarchical clustering: Summary
- No need to fix the number of clusters in advance: cutting the dendrogram at the proper level yields any desired collection of groups.
- Space and time requirements grow at least quadratically with the number of objects, since the methods operate on the full proximity matrix.
Hierarchical Clustering Demo
Time for another demo!
Bibliography
- Tutorial Slides by A. Moore
- Tutorial Slides by P.L. Lanzi
- Online tutorials by K. Teknomo