Boston University Slideshow Title Goes Here
Partitional Clustering
- Clustering: David Arthur, Sergei Vassilvitskii. k-means++: The Advantages of Careful Seeding. In SODA 2007
- Thanks to A. Gionis and S. Vassilvitskii for the slides
✦ Why do we care?
  ✦ stand-alone tool to gain insight into the data
  ✦ visualization
  ✦ preprocessing step for other algorithms
  ✦ indexing or compression often relies on clustering
functionality etc)
✦ Basic questions:
  ✦ what does similar mean?
  ✦ what is a good partition of the objects?
  ✦ how to find a good partition?
[Figure: How many clusters? Panels: Four Clusters, Two Clusters, Six Clusters]
[Figure: Traditional Hierarchical Clustering with Traditional Dendrogram, and Non-traditional Hierarchical Clustering with Non-traditional Dendrogram, over points p1, p2, p3, p4]
[Figure: Original Points; A Partitional Clustering]
so that the cost
$\sum_{i=1}^{n} \min_{j} L_2^2(x_i, c_j) = \sum_{i=1}^{n} \min_{j} \|x_i - c_j\|_2^2$
is minimized
cluster center, so that the cost
$\sum_{i=1}^{n} \min_{j} \|x_i - c_j\|_2^2 = \sum_{j=1}^{k} \sum_{x \in X_j} \|x - c_j\|_2^2$
is minimized
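The two ways of writing the k-means cost (a minimum over centers for each point, or a sum over clusters after assigning each point to its nearest center) can be checked against each other on a toy input. The sketch below is illustrative only: the function names and the sample points are assumptions, not taken from the slides.

```python
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(x: Point, c: Point) -> float:
    """Squared Euclidean distance ||x - c||_2^2."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def cost_min_form(points: List[Point], centers: List[Point]) -> float:
    """Cost as a sum over points of the squared distance to the nearest center."""
    return sum(min(sq_dist(x, c) for c in centers) for x in points)


def cost_partition_form(points: List[Point], centers: List[Point]) -> float:
    """Same cost, computed by first partitioning points by nearest center."""
    clusters: List[List[Point]] = [[] for _ in centers]
    for x in points:
        nearest = min(range(len(centers)), key=lambda j: sq_dist(x, centers[j]))
        clusters[nearest].append(x)
    return sum(
        sq_dist(x, centers[j]) for j, cluster in enumerate(clusters) for x in cluster
    )


# Hypothetical toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
print(cost_min_form(points, centers), cost_partition_form(points, centers))
```

Both functions return the same value for any set of centers, because assigning each point to its nearest center is exactly what the inner minimum does.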
but not always
[Figure: Original Points; K-means (3 Clusters)]
[Figure: Original Points; K-means (2 Clusters)]
but not always
✦ a = 0: random initialization
✦ a = ∞: furthest-first traversal
✦ a = 2: k-means++
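One way to read the three cases is as a single seeding rule with an exponent a: each new center is drawn with probability proportional to D(x)^a, where D(x) is the distance from x to its nearest already-chosen center. The sketch below assumes that reading. The function name `seed_centers` and the sample data are hypothetical, and the code assumes the input has at least k distinct points.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, ...]


def sq_dist(x: Point, c: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def seed_centers(points: List[Point], k: int, a: float, rng: random.Random) -> List[Point]:
    """Choose k initial centers by sampling proportionally to D(x)^a.

    a = 0        -> every point is equally likely (random initialization)
    a = 2        -> D^2 weighting, as in k-means++
    a = math.inf -> always take the point furthest from the chosen centers
    """
    centers = [rng.choice(points)]  # the first center is uniform at random
    while len(centers) < k:
        # D(x): distance from each point to its nearest chosen center.
        d = [math.sqrt(min(sq_dist(x, c) for c in centers)) for x in points]
        if math.isinf(a):
            nxt = points[max(range(len(points)), key=lambda i: d[i])]
        else:
            # Note: 0.0 ** 0 == 1.0 in Python, so a = 0 gives uniform weights.
            weights = [di ** a for di in d]
            nxt = rng.choices(points, weights=weights, k=1)[0]
        centers.append(nxt)
    return centers


# Hypothetical toy data: two groups on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
rng = random.Random(0)
print(seed_centers(points, 2, 2, rng))
print(seed_centers(points, 2, math.inf, rng))
```

With a = 2 a point that is already a center has weight zero and cannot be picked again, whereas a = 0 may select the same point twice.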
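✦ expected error:
$E[\phi(A)] = \sum_{a_0 \in A} \frac{1}{|A|} \sum_{a \in A} \|a - a_0\|^2 = 2 \sum_{a \in A} \|a - \bar{a}\|^2$
where $\bar{a}$ is the centroid of $A$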
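The k-means++ paper cited above shows that picking a center uniformly at random from a cluster A gives, in expectation, exactly twice the cost of using the centroid of A. The sketch below checks that identity numerically on a small, made-up cluster. The function names and the data are assumptions for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def sq_dist(x: Point, c: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def centroid(A: List[Point]) -> Point:
    """Coordinate-wise mean of the points in A."""
    dim = len(A[0])
    return tuple(sum(a[t] for a in A) / len(A) for t in range(dim))


def expected_cost_uniform_center(A: List[Point]) -> float:
    """Average, over every choice of center a0 in A, of the cost of A under a0."""
    return sum(sum(sq_dist(a, a0) for a in A) for a0 in A) / len(A)


def optimal_cost(A: List[Point]) -> float:
    """Cost of A when the center is its centroid."""
    m = centroid(A)
    return sum(sq_dist(a, m) for a in A)


# Hypothetical cluster: the corners of a 2 x 2 square.
A = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(expected_cost_uniform_center(A), 2 * optimal_cost(A))
```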
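$E[\phi(B)] = \sum_{b_0 \in B} \frac{D^2(b_0)}{\sum_{b \in B} D^2(b)} \sum_{b \in B} \min\{D(b), \|b - b_0\|\}^2$
$D^2(b_0) \le \frac{2}{|B|} \sum_{b \in B} D^2(b) + \frac{2}{|B|} \sum_{b \in B} \|b - b_0\|^2$
✦ recall
$2 \sum_{b \in B} \|b - \bar{b}\|^2 = \frac{1}{|B|} \sum_{b_0 \in B} \sum_{b \in B} \|b - b_0\|^2$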
cluster median, so that the cost
$\sum_{i=1}^{n} \min_{j} \|x_i - c_j\| = \sum_{j=1}^{k} \sum_{x \in X_j} \|x - c_j\|$
is minimized
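A small one-dimensional sketch can show why the center is called a median here. It assumes the k-median cost sums plain (unsquared) distances, which is how the objective is usually stated. Under that assumption, for a single cluster on a line, the median minimizes the sum of distances while the mean minimizes the sum of squared distances. The data and function names are made up for illustration.

```python
import statistics
from typing import List


def sum_abs(xs: List[float], c: float) -> float:
    """Sum of plain distances |x - c| (assumed k-median style cost, k = 1)."""
    return sum(abs(x - c) for x in xs)


def sum_sq(xs: List[float], c: float) -> float:
    """Sum of squared distances (x - c)^2 (k-means style cost, k = 1)."""
    return sum((x - c) ** 2 for x in xs)


# Hypothetical cluster with one outlier.
xs = [0.0, 1.0, 2.0, 3.0, 100.0]
mean = statistics.mean(xs)
median = statistics.median(xs)

print("sum of distances:  median", sum_abs(xs, median), "mean", sum_abs(xs, mean))
print("sum of squares:    median", sum_sq(xs, median), "mean", sum_sq(xs, mean))
```

The outlier pulls the mean far from most of the points, which is one reason the unsquared objective is often considered more robust.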
cluster center, so that the cost
$\max_{i=1}^{n} \min_{j=1}^{k} \|x_i - c_j\|_2$
is minimized
R_j = d(j,{1,2,...,j-1})
    ≤ d(j,{1,2,...,i-1})    // j > i
    ≤ d(i,{1,2,...,i-1}) = R_i
d(i,{1,2,...,k}) ≤ d(k+1,{1,2,...,k}) = R_{k+1}
R(C) ≤ 2R(C*) ✪
points in {1,…,k}
½ R_k ≥ ½ R_{k+1} = ½ R(C)
R(C) ≤ x ≤ z + R(C*) ≤ 2R(C*)
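The bound R(C) ≤ 2R(C*) can be sanity-checked on a small instance. The sketch below implements a furthest-first traversal and compares its radius with the best radius found by brute force over centers drawn from the input points. That brute-force value is an assumption made to keep the check finite: it can only be larger than the true optimum R(C*), so the comparison is a weaker check than the stated bound. All names and data are hypothetical.

```python
import itertools
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def radius(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """k-center cost: the largest distance from any point to its nearest center."""
    return max(min(math.dist(x, c) for c in centers) for x in points)


def furthest_first(points: List[Point], k: int, first: int = 0) -> List[Point]:
    """Start from points[first]; repeatedly add the point furthest from the chosen centers."""
    centers = [points[first]]
    while len(centers) < k:
        nxt = max(points, key=lambda x: min(math.dist(x, c) for c in centers))
        centers.append(nxt)
    return centers


def best_radius_over_input_points(points: List[Point], k: int) -> float:
    """Brute force over all k-subsets of the input (exponential, tiny inputs only)."""
    return min(radius(points, combo) for combo in itertools.combinations(points, k))


# Hypothetical toy data: two groups plus one point in between.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0), (4.0, 0.0)]
k = 2
C = furthest_first(points, k)
print("furthest-first radius:", radius(points, C))
print("best radius over input points:", best_radius_over_input_points(points, k))
```

Note that `math.dist` requires Python 3.8 or later.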