Using Local Neighborhoods to Find Subspace Clusters
Emin Aksehirli with Bart Goethals, Emmanuel Müller and Jilles Vreeken
FIM Clustering
Figure: F1 Score (0.0 to 1.0) of Our Method, CartiClus, FIRES, PROCLUS, STATPC, and SUBCLU; x-axis ticks 1, 2, 4, 8, 16, 32, 100, 200.
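The slides do not say how the F1 Score is computed. A minimal sketch, assuming the object-based variant that is common in subspace clustering evaluation: each true cluster is matched to the found cluster that gives it the best F1, and these best scores are averaged. The function names `f1` and `clustering_f1` are hypothetical and not taken from the authors' code.

```python
def f1(found, truth):
    """F1 between two clusters, each given as a set of object ids."""
    overlap = len(found & truth)
    if overlap == 0:
        return 0.0
    precision = overlap / len(found)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


def clustering_f1(found_clusters, true_clusters):
    """Average, over true clusters, of the best F1 reached by any found cluster.

    Assumption: this matching scheme is one common choice; the presentation
    may use a different definition.
    """
    if not found_clusters or not true_clusters:
        return 0.0
    best_scores = [max(f1(f, t) for f in found_clusters) for t in true_clusters]
    return sum(best_scores) / len(true_clusters)


# Toy example: the first true cluster is recovered exactly, the second only half.
true_clusters = [{1, 2, 3, 4}, {5, 6, 7, 8}]
found_clusters = [{1, 2, 3, 4}, {5, 6}]
score = clustering_f1(found_clusters, true_clusters)
```

This variant ignores the subspaces in which the clusters were found and compares object memberships only.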
Figure: Run Time (seconds, axis ticks 1 to 10000) on S1500, S2500, S3500, S4500, and S5500 for Our Method, CartiClus, FIRES, PROCLUS, STATPC, and SUBCLU.
Figure: Run Time (seconds, axis ticks 1 to 10000) on D5, D10, D15, D25, D50, and D75 for Our Method, CartiClus, FIRES, PROCLUS, STATPC, and SUBCLU.
Star Wars: A New Hope (a.k.a. Star Wars) (1977)
Star Wars: The Empire Strikes Back (1980)
Star Wars: Return of the Jedi (1983)
LotR: The Fellowship of the Ring, The (2001)
LotR: The Two T...
LotR: The Return of the King, The (2003)
Back to the Future (1985)
Terminator, The (1984)
Terminator 2: Judgment Day (1991)
Die Hard (1988)
Terminator, The (1984)
Terminator 2: Judgment Day (1991)
Usual Suspects, The (1995)
Pulp Fiction (1994)
Silence of the Lambs, The (1991)
Star Wars: A New Hope (1977)
Star Wars: The Empire Strikes Back (1980)
Star Wars: Return of the Jedi (1983)
LotR: The Fellowship of the Ring, The (2001)
LotR: The Two T...
LotR: The Return of the King, The (2003)
Brazil (1985)
Clockwork Orange, A (1971)
2001: A Space Odyssey (1968)
Blade Runner (1982)
Alien (1979)
Chinatown (1974)
Rear Window (1954)
North by Northwest (1959)
Vertigo (1958)
Psycho (1960)
Silence of the Lambs, The (1991)
Third Man, The (1949)
Citizen Kane (1941)
Godfather: Part II, The (1974)
Chinatown (1974)
Godfather, The (1972)
Taxi Driver (1976)
→ Code and data are available at our website.
Figure: F1 Score (0.1 to 1) on S1500, S2500, S3500, S4500, and S5500 for Our Method, CartiClus, FIRES, PROCLUS, STATPC, and SUBCLU.
Figure: F1 Score (0.1 to 1) on D05, D10, D15, D25, D50, and D75 for Our Method, CartiClus, FIRES, PROCLUS, STATPC, and SUBCLU.
PCA and Random Projection
Figure: Quality of the found clusters (F1 and E4SC, 0.2 to 1) for Our Method, CSPA, Proclus, K-Means, PCA+KM, and RP+KM.
Data: 10 clusters in 10 dimensions, plus 200 irrelevant dimensions.
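To make the RP+KM baseline and the data setting concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the slides do not give the data generator, the cluster shapes, the projection dimension, or the k-means settings. The names `make_data`, `random_projection`, and `kmeans` are hypothetical, and all clusters are assumed to share the same 10 relevant dimensions.

```python
import random


def make_data(n_clusters=10, points_per_cluster=20, relevant=10,
              irrelevant=200, seed=0):
    """Gaussian clusters in `relevant` dims, uniform noise in `irrelevant` dims."""
    rng = random.Random(seed)
    data, labels = [], []
    for c in range(n_clusters):
        center = [rng.uniform(0.0, 1.0) for _ in range(relevant)]
        for _ in range(points_per_cluster):
            point = [rng.gauss(m, 0.02) for m in center]
            point += [rng.uniform(0.0, 1.0) for _ in range(irrelevant)]
            data.append(point)
            labels.append(c)
    return data, labels


def random_projection(data, target_dim, seed=0):
    """Project onto `target_dim` random Gaussian directions."""
    rng = random.Random(seed)
    scale = target_dim ** -0.5
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(target_dim)]
              for _ in range(len(data[0]))]
    return [[sum(x * row[j] for x, row in zip(point, matrix))
             for j in range(target_dim)]
            for point in data]


def kmeans(data, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, k)]
    assignment = [0] * len(data)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(data):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
        # Update step: move each non-empty center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(data, assignment) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


data, labels = make_data()                           # 200 points, 210 dimensions
projected = random_projection(data, target_dim=10)   # RP step
clusters = kmeans(projected, k=10)                   # KM step
```

A random projection mixes the 200 noise dimensions into every projected coordinate, which is one plausible reason such a baseline could struggle in this setting; the sketch does not reproduce the reported scores.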