 
Levelwise Clustering under a Maximum SSE Constraint
Jeroen De Knijf, Bart Goethals and Adriana Prado
ADReM, Computer Science and Mathematics Department, Antwerp University
Outline: Introduction, Algorithm, Experimental Results, Conclusion

Research Question
Can we use techniques from frequent itemset mining to find an optimal clustering of points in Euclidean space?

Introduction
Set of points P ⊂ R^d.
Sum of Squared Errors (SSE) as optimization criterion.
SSE is monotone with respect to set inclusion: adding points to a set never decreases its SSE.

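The monotonicity property can be checked on a small example. The sketch below assumes the usual definition of SSE as the sum of squared Euclidean distances to the centroid; the helper names `centroid`, `sse` and `is_monotone` are illustrative and do not come from the slides.

```python
from itertools import combinations

def centroid(points):
    """Coordinate-wise mean of a non-empty collection of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def sse(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(points)
    return sum((x - m) ** 2 for p in points for x, m in zip(p, c))

def is_monotone(points, tol=1e-12):
    """Check that removing any one point from any subset never increases SSE."""
    for k in range(2, len(points) + 1):
        for s in combinations(points, k):
            for sub in combinations(s, k - 1):
                if sse(sub) > sse(s) + tol:
                    return False
    return True

P = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0), (5.0, 1.0), (4.0, 4.0)]
print(sse(P[:2]), is_monotone(P))
```

This is the same anti-monotone pruning condition that frequent itemset mining relies on: once a set exceeds the threshold, all of its supersets do too.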
Basic Algorithm
1. Derive all sets with an SSE value lower than MAXSSE (Apriori-style mining algorithm).
2. Use the derived sets to construct a partition of the data (greedy heuristic).
3. Create a new dataset by replacing the points in each cluster by the cluster's centroid.
Continue with the first step.

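One round of the three steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slides do not specify the greedy heuristic, so the one used here (prefer larger sets, break ties by lower SSE) is an assumption, and all function names are hypothetical. It also assumes `max_sse > 0`, so that every single point qualifies.

```python
from itertools import combinations

def centroid(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def sse(points):
    c = centroid(points)
    return sum((x - m) ** 2 for p in points for x, m in zip(p, c))

def mine_sets(points, max_sse):
    """Step 1: levelwise search for all index sets with SSE below max_sse."""
    level = [frozenset([i]) for i in range(len(points))]  # singletons have SSE 0
    found = list(level)
    while level:
        current = set(level)
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) != len(a) + 1:
                continue  # a and b do not differ in exactly one element
            # Apriori pruning: every subset one element smaller must qualify.
            if all(union - {i} in current for i in union):
                candidates.add(union)
        level = [s for s in candidates
                 if sse([points[i] for i in s]) < max_sse]
        found.extend(level)
    return found

def greedy_partition(points, sets):
    """Step 2 (assumed heuristic): take larger sets first, ties by lower SSE."""
    ranked = sorted(
        sets,
        key=lambda s: (-len(s), sse([points[i] for i in s]), sorted(s)),
    )
    used, clusters = set(), []
    for s in ranked:
        if used.isdisjoint(s):
            clusters.append(s)
            used |= s
    return clusters

def compress(points, clusters):
    """Step 3: replace each cluster by its centroid."""
    return [centroid([points[i] for i in c]) for c in clusters]

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
clusters = greedy_partition(points, mine_sets(points, max_sse=1.0))
print(compress(points, clusters))
```

Because the singletons are always among the mined sets, the greedy pass covers every point, so the result is a partition. Feeding the output of `compress` back into `mine_sets` corresponds to "continue with the first step".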
One max SSE value for the whole input space?

Mining Algorithm
Perform |P| mining runs, one for each point p ∈ P and its region.
The MAXSSE value for the mining run of point p depends on the density of the region of p.

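To make the idea of a density-dependent threshold concrete, here is a purely hypothetical sketch. The surviving slide text does not define the region of a point or the exact dependence on density, so both choices below are assumptions made for illustration only: the region is taken to be the point plus its `k` nearest neighbours, and the local threshold is `alpha` times the mean squared deviation of that region.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def sse(points):
    c = centroid(points)
    return sum((x - m) ** 2 for p in points for x, m in zip(p, c))

def region(points, i, k):
    """Assumed region: point i followed by its k nearest neighbours (indices)."""
    order = sorted(
        range(len(points)),
        # Point i always sorts first; the rest by distance, ties by index.
        key=lambda j: (j != i, squared_distance(points[i], points[j]), j),
    )
    return order[:k + 1]

def local_max_sse(points, i, k, alpha):
    """Assumed rule: threshold proportional to the spread of the region."""
    idx = region(points, i, k)
    return alpha * sse([points[j] for j in idx]) / len(idx)

points = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0),
          (10.0, 0.0), (20.0, 0.0), (30.0, 0.0)]
# A point in a dense area gets a tighter threshold than one in a sparse area.
print(local_max_sse(points, 1, k=2, alpha=1.0),
      local_max_sse(points, 4, k=2, alpha=1.0))
```

Under these assumptions, each point's mining run would be restricted to the indices returned by `region` and use the value from `local_max_sse` as its MAXSSE.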
The region of a point
[Figure: the region of a point]

Dataset
Dataset   #points   #dimensions   #classes
Iris      150       5             3
Ecoli     336       8             8
Sonar     208       61            2

Results on the Iris and Ecoli datasets.
[Figure: two plots, one per dataset, of SSE (y-axis) against # of clusters (x-axis)]

Results on the Sonar dataset.
[Figure: plot of SSE (y-axis) against # of clusters (x-axis)]

Conclusion
Our approach is able to capture essential clusters in a dataset.
The number of clusters obtained is unpredictable and rather sensitive to the input parameter.