Levelwise Clustering under a Maximum SSE Constraint, Jeroen De Knijf

SLIDE 1

Introduction Algorithm Experimental Results Conclusion

Levelwise Clustering under a Maximum SSE Constraint

Jeroen De Knijf, Bart Goethals and Adriana Prado

ADReM, Computer Science and Mathematics Department, Antwerp University

Jeroen De Knijf, Bart Goethals and Adriana Prado Levelwise Clustering under a Maximum SSE Constraint

SLIDE 2

Research Question

Question: Can we use techniques from frequent itemset mining to find an optimal clustering of points in Euclidean space?

SLIDE 3

Introduction

Set of points P ⊂ R^d.

Sum of Squared Errors (SSE) as optimization criterion.

Sum of Squared Errors is monotone with respect to set inclusion.
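The monotonicity claim can be checked on a small example. The sketch below assumes the usual definition of SSE (sum of squared Euclidean distances from each point to the centroid of the set); the function name `sse` and the sample points are illustrative and not taken from the slides. Adding a point to a set can only keep the SSE equal or increase it, which is the property that allows Apriori-style pruning.

```python
def sse(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[k] for p in points) / n for k in range(dim)]
    return sum((p[k] - centroid[k]) ** 2 for p in points for k in range(dim))


# Illustrative data: `superset` contains every point of `subset` plus one more.
subset = [(0.0, 0.0), (2.0, 0.0)]
superset = subset + [(1.0, 3.0)]

print(sse(subset))    # 2.0
print(sse(superset))  # 8.0
```

Because the SSE never decreases when a set grows, any set that already exceeds the threshold can be discarded together with all of its supersets.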

SLIDE 4

Basic Algorithm

1. Derive all sets with an SSE value lower than MAXSSE (Apriori-style mining algorithm).

2. Use the derived sets to construct a partition of the data (greedy heuristic).

3. Create a new dataset by replacing the points in each cluster by the cluster's centroid. Continue with the first step.
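One round of the three steps above might look like the following minimal sketch. Several details are assumptions, since the slides do not specify them: the threshold test is written as `<=`, the greedy heuristic is taken to be "largest sets first, skip any set overlapping an already chosen one", and all function names are hypothetical. The exhaustive levelwise search is only practical for small inputs.

```python
from itertools import combinations


def sse(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    n, dim = len(points), len(points[0])
    c = [sum(p[k] for p in points) / n for k in range(dim)]
    return sum((p[k] - c[k]) ** 2 for p in points for k in range(dim))


def mine_sets(points, max_sse):
    """Step 1: levelwise search for index sets whose SSE is at most max_sse."""
    level = [frozenset([i]) for i in range(len(points))]  # singletons, SSE 0
    found = list(level)
    while level:
        prev = set(level)
        # Join two sets of size k that differ in one element to get size k+1.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [
            c for c in candidates
            # Apriori pruning: every subset one element smaller must qualify.
            if all(c - {i} in prev for i in c)
            and sse([points[i] for i in c]) <= max_sse
        ]
        found.extend(level)
    return found


def greedy_partition(sets):
    """Step 2 (assumed heuristic): largest sets first, skipping overlaps."""
    covered, clusters = set(), []
    for s in sorted(sets, key=lambda s: (-len(s), sorted(s))):
        if not (s & covered):
            clusters.append(s)
            covered |= s
    return clusters


def centroids(points, clusters):
    """Step 3: replace each cluster by its centroid."""
    dim = len(points[0])
    return [tuple(sum(points[i][k] for i in c) / len(c) for k in range(dim))
            for c in clusters]


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
clusters = greedy_partition(mine_sets(points, max_sse=1.0))
new_points = centroids(points, clusters)
print(new_points)  # [(0.0, 0.5), (10.0, 10.5)]
```

The list `new_points` would be the input of the next round. How many rounds to run, and whether MAXSSE changes between rounds, is not stated on the slide and is left open here.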

SLIDE 5

One MAXSSE value for the whole input space?

SLIDE 6

Mining Algorithm

Run |P| mining passes, one for each point p ∈ P and its region. The MAXSSE value used in the mining pass for point p depends on the density of the region of p.
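To illustrate the idea of a density-dependent threshold, here is a rough sketch. Both definitions in it are assumptions made only for illustration, because the slides do not define them in text: the region of p is taken to be its k nearest neighbours, and the local MAXSSE is taken to be a scaled mean squared distance from p to that region. The names `region`, `local_max_sse`, `k` and `scale` are hypothetical.

```python
import math


def region(points, i, k):
    """Hypothetical region: indices of the k nearest neighbours of point i."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: (math.dist(points[i], points[j]), j))
    return others[:k]


def local_max_sse(points, i, k, scale=1.0):
    """Hypothetical threshold: scaled mean squared distance to the region."""
    nbrs = region(points, i, k)
    total = sum(math.dist(points[i], points[j]) ** 2 for j in nbrs)
    return scale * total / len(nbrs)


points = [(0, 0), (0, 1), (1, 0), (10, 10)]
dense = local_max_sse(points, 0, k=2)   # point in a dense area
sparse = local_max_sse(points, 3, k=2)  # isolated point
print(dense, sparse)
```

Under these assumptions a point in a dense area gets a small threshold and an isolated point gets a large one, so each of the |P| mining passes adapts to local density.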

SLIDE 7

The region of a point

SLIDE 8

The region of a point

SLIDE 9

Dataset

Dataset   #points   #dimensions   #classes
Iris      150       5             3
Ecoli     336       8             8
Sonar     208       61            2

SLIDE 10

Results on the Iris and Ecoli datasets.

[Figure: two plots, one per dataset, with axes labeled "SSE" and "# of clusters".]

SLIDE 11

Results on the Sonar dataset.

[Figure: one plot for the Sonar dataset, with axes labeled "SSE" and "# of clusters".]

SLIDE 12

Conclusion

Our approach is able to capture essential clusters in a dataset. The number of clusters obtained is unpredictable and rather sensitive to the input parameter.
