SLIDE 19 Cluster Validity 10/14/2010 Erin Wirch & Wenbo Wang
Hard Clustering Indices (Cont'd)
◮ The Davies-Bouldin (DB) and DB-like indices:
  ◮ $s_i$ is a measure of the spread of cluster $C_i$ around its mean vector
  ◮ dissimilarity function between two clusters: $d_{ij} = d(C_i, C_j)$
  ◮ the similarity index $R_{ij}$ between $C_i$ and $C_j$ has the properties:
    ◮ if $s_j > s_k$ and $d_{ij} = d_{ik}$ then $R_{ij} > R_{ik}$
    ◮ if $s_j = s_k$ and $d_{ij} < d_{ik}$ then $R_{ij} > R_{ik}$
  ◮ choose $R_{ij} = \frac{s_i + s_j}{d_{ij}}$, $R_i = \max_{j=1,\dots,m,\ j \neq i} R_{ij}$

$$DB_m = \frac{1}{m} \sum_{i=1}^{m} R_i \qquad (3)$$
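A minimal Python sketch of equation (3) may help make the definitions concrete. The slide leaves $s_i$ and $d(C_i, C_j)$ open, so this sketch assumes two common choices: $s_i$ is the average Euclidean distance of the points in $C_i$ to their mean vector, and $d_{ij}$ is the Euclidean distance between the two mean vectors. The function names (`mean_vector`, `spread`, `davies_bouldin`) are illustrative and not taken from the slides.

```python
import math

def mean_vector(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]

def spread(points, center):
    """s_i: assumed here to be the average Euclidean distance to the mean."""
    return sum(math.dist(p, center) for p in points) / len(points)

def davies_bouldin(clusters):
    """DB_m = (1/m) * sum_i max_{j != i} (s_i + s_j) / d_ij, for m >= 2 clusters."""
    m = len(clusters)
    centers = [mean_vector(c) for c in clusters]
    s = [spread(c, w) for c, w in zip(clusters, centers)]
    total = 0.0
    for i in range(m):
        # R_i is the largest similarity R_ij of cluster i to any other cluster.
        r_i = max(
            (s[i] + s[j]) / math.dist(centers[i], centers[j])
            for j in range(m) if j != i
        )
        total += r_i
    return total / m

# Two tight, well-separated clusters: s_1 = s_2 = 1 and d_12 = 10.
clusters = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 0.0), (10.0, 2.0)],
]
db = davies_bouldin(clusters)
```

Under these assumptions, smaller values of `db` correspond to compact clusters that lie far apart, which is why the index is usually minimized over the number of clusters $m$.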
◮ The DB-like indices based on MST
  ◮ $R_{ij}^{MST} = \frac{s_i^{MST} + s_j^{MST}}{d_{ij}}$
  ◮ $DB_m^{MST} = \frac{1}{m} \sum_{i=1}^{m} R_i^{MST}$
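The MST-based variant can be sketched the same way. The slide does not define $s_i^{MST}$, so the code below assumes it is the longest edge of the Euclidean minimum spanning tree built on the points of $C_i$, that $d_{ij}$ is the distance between cluster mean vectors, and that $R_i^{MST} = \max_{j \neq i} R_{ij}^{MST}$ by analogy with the plain DB index. All function names are hypothetical, and Prim's algorithm is used only because it is short to write for small clusters.

```python
import math

def mst_edge_weights(points):
    """Edge weights of a Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known link from each point to the tree
    best[0] = 0.0
    weights = []
    for step in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((k for k in range(n) if not in_tree[k]), key=best.__getitem__)
        in_tree[u] = True
        if step > 0:
            weights.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return weights

def mst_spread(points):
    """s_i^MST: assumed here to be the longest MST edge of the cluster."""
    return max(mst_edge_weights(points), default=0.0)

def mean_vector(points):
    """Component-wise mean of a list of equal-length points."""
    return [sum(c) / len(points) for c in zip(*points)]

def db_mst(clusters):
    """DB_m^MST = (1/m) * sum_i max_{j != i} (s_i^MST + s_j^MST) / d_ij."""
    m = len(clusters)
    centers = [mean_vector(c) for c in clusters]
    s = [mst_spread(c) for c in clusters]
    r = [
        max((s[i] + s[j]) / math.dist(centers[i], centers[j])
            for j in range(m) if j != i)
        for i in range(m)
    ]
    return sum(r) / m

# Each cluster has MST edges of length 1 and 2, so s^MST = 2; the means are 10 apart.
clusters_mst = [
    [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)],
    [(10.0, 0.0), (11.0, 0.0), (13.0, 0.0)],
]
db_mst_value = db_mst(clusters_mst)
```

Because the spread here depends only on the longest tree edge, this version responds to gaps inside a cluster rather than to its average radius, which is one motivation for graph-based indices on non-compact clusters.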