SLIDE 6 Dataset Clustering Framework Validation
Objective Validation Strategy
Algorithm 2: Objective Validation Strategy
p ← number of partitions for each household
dist(·, ·) ← function to calculate Euclidean distance
▷ Output from clustering algorithm
labels ← labels assigned to each household
C^(k×d′) ← cluster centers of each cluster
Initialize: match, misMatch, counter = 0;
repeat
    foreach household do
        D ← data for each household (days × 24/r);
        D′ ← randomly shuffled data by rows;
        Make p equal partitions from rows of D′;
        ▷ Perform Pre-Processing steps
        M_p^(p×(24/r)) ← new medians of p partitions;
        M′_p ← ℓ2-Normalization(M_p, row-wise);
        ▷ Do Dimensionality Reduction
        N_p^(p×d′) ← dimReduce(M′_p, d′);
        foreach partition ∈ {1 ··· p} as part do
            ▷ Find Closest Cluster
            CC ← argmin_i(dist(N_p[part], C[i, :]));
            if CC == labels[household] then
                match++;
            else
                misMatch++;
    counter++;
until counter ≥ 100;
avgMatches = match/100;
avgMisMatches = misMatch/100;
Result: avgMatches & avgMisMatches
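The loop in Algorithm 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `objective_validation` and `l2_normalize_rows`, the dictionary-based inputs, and the `dim_reduce` callable (standing in for the already-fitted FA or PCA projection) are all hypothetical. `np.array_split` is used so that the number of days need not divide evenly by p, which is a small relaxation of "p equal partitions"; each household is assumed to have at least p days of data.

```python
import numpy as np


def l2_normalize_rows(m):
    """Scale each row to unit Euclidean norm (all-zero rows are left unchanged)."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    return m / np.where(norms == 0.0, 1.0, norms)


def objective_validation(data, labels, centers, dim_reduce, p, n_iter=100, seed=0):
    """Count how often a household's partition medians fall in its own cluster.

    data       : dict household -> array of shape (days, 24/r)
    labels     : dict household -> cluster index from the original clustering
    centers    : array of shape (k, d') with cluster centers in the reduced space
    dim_reduce : callable mapping a (p, 24/r) array to a (p, d') array
                 (hypothetical stand-in for the fitted FA or PCA transform)
    """
    rng = np.random.default_rng(seed)
    match = mismatch = 0
    for _ in range(n_iter):
        for household, d in data.items():
            shuffled = rng.permutation(d, axis=0)        # shuffle rows (days)
            parts = np.array_split(shuffled, p, axis=0)  # p near-equal partitions
            medians = np.vstack([np.median(part, axis=0) for part in parts])
            reduced = dim_reduce(l2_normalize_rows(medians))
            # Euclidean distance from every partition to every cluster center
            dists = np.linalg.norm(
                reduced[:, None, :] - centers[None, :, :], axis=2
            )
            closest = dists.argmin(axis=1)
            hits = int(np.sum(closest == labels[household]))
            match += hits
            mismatch += p - hits
    # Averages per iteration, as in the last two steps of the algorithm
    return match / n_iter, mismatch / n_iter
```

The returned values are average counts per iteration. To express them as percentages, one plausible conversion (an assumption, since the slide does not state it) is to divide each by the number of comparisons per iteration, that is, the number of households times p.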
Results obtained by performing objective validation of the 4 clustering frameworks.
Clustering Framework    %Matches    %Mis-Matches
p = 2
FA + SC                 22.67       77.33
FA + KMC                29.07       70.93
PCA + SC                18.78       81.22
PCA + KMC               76.28       23.72
p = 3
FA + SC                 21.34       78.66
FA + KMC                24.98       75.02
PCA + SC                17.60       82.40
PCA + KMC               67.15       32.85

SEST2020 Istanbul Clustering Framework 5/7