Subspace Clustering, Ensemble Clustering, Alternative Clustering, Multiview Clustering: What Can We Learn From Each Other?


  1. Subspace Clustering, Ensemble Clustering, Alternative Clustering, Multiview Clustering: What Can We Learn From Each Other?
     MultiClust@KDD 2010
     Hans-Peter Kriegel, Arthur Zimek
     Ludwig-Maximilians-Universität München, Institute for Informatics, Database Systems Group
     Munich, Germany
     http://www.dbs.ifi.lmu.de
     {kriegel, zimek}@dbs.ifi.lmu.de

  2. Outline
     1. Subspace Clustering
     2. Ensemble Clustering
     3. Alternative Clustering
     4. Multiview Clustering
     5. Discussion
     Kriegel/Zimek: What can we learn from each other? (MultiClust@KDD 2010)

  3. Subspace Clustering
     • Task: identify clusters of similar objects
     • similarity defined w.r.t. a certain subspace of the data space
     • different subspaces for different clusters

  4. Subspace Clustering
     • Subspaces: different selection, weighting, or combination of attributes
     • learn subspace and clustering simultaneously (interdependency)
     • strategies:
       – top-down (learn spatial characteristics of initially built sets of objects)
       – bottom-up (learn 1-d clusters, combine them to 2-d clusters, etc. (APRIORI)) => many irrelevant clusters
     [Figure: example data plotted over an "irrelevant attribute" and a "relevant attribute / relevant subspace"]
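
The bottom-up strategy on this slide can be sketched in a few lines. The following is a minimal illustration, not the method of any particular algorithm from the talk: it assumes a fixed-width grid and a density threshold (`width` and `min_pts` are hypothetical parameters), treats a subspace as interesting if it contains at least one dense grid cell, and uses APRIORI-style pruning so that a k-d subspace is only tested when all of its (k-1)-d projections already qualified.

```python
from itertools import combinations

def dense_cells(points, dims, width, min_pts):
    """Return the grid cells of subspace `dims` that hold at least `min_pts` points."""
    counts = {}
    for p in points:
        cell = tuple(int(p[d] // width) for d in dims)
        counts[cell] = counts.get(cell, 0) + 1
    return {cell for cell, n in counts.items() if n >= min_pts}

def bottom_up_subspaces(points, width=1.0, min_pts=3):
    """List all subspaces (tuples of attribute indices) containing a dense cell."""
    n_dims = len(points[0])
    # Level 1: single attributes that contain at least one dense 1-d cell.
    current = [(d,) for d in range(n_dims)
               if dense_cells(points, (d,), width, min_pts)]
    found = list(current)
    while current:
        known = set(current)
        candidates = set()
        for a, b in combinations(current, 2):
            merged = tuple(sorted(set(a) | set(b)))
            if len(merged) != len(a) + 1:
                continue  # a and b do not differ in exactly one attribute
            # APRIORI pruning: every (k-1)-d projection must have qualified.
            if all(sub in known for sub in combinations(merged, len(a))):
                candidates.add(merged)
        # Keep only the candidates that really contain a dense cell.
        current = sorted(c for c in candidates
                         if dense_cells(points, c, width, min_pts))
        found.extend(current)
    return found
```

Note that the sketch reports every qualifying subspace, including all lower-dimensional projections of a higher-dimensional one, which is one way the large number of reported results mentioned on the slide can arise.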

  5. Ensemble Clustering
     • basic idea: combine different clusterings to obtain one single, more reliable clustering
     • tasks:
       – how to create diverse clusterings
       – how to combine different clusterings
     • induce diversity of clusterings
       – use different feature subsets
       – use different database subsets
       – use different clustering algorithms
     • correspondence between clusterings
       – useful for judging the redundancy of clusters?
       – a lot of different answers
       – but: could it not be that different clusterings are just different, yet both meaningful?
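
One common way to combine clusterings without solving the label correspondence problem directly is a co-association matrix. The slide does not name a specific combination method, so the sketch below is an assumption chosen for illustration: it counts how often two objects share a cluster across the input clusterings and then groups objects whose agreement exceeds a hypothetical `threshold`.

```python
def co_association(labelings):
    """Fraction of clusterings in which objects i and j share a cluster."""
    n = len(labelings[0])
    m = len(labelings)
    return [[sum(lab[i] == lab[j] for lab in labelings) / m
             for j in range(n)]
            for i in range(n)]

def consensus_clusters(labelings, threshold=0.5):
    """Connected components of the graph linking pairs with agreement > threshold."""
    co = co_association(labelings)
    n = len(co)
    label = [-1] * n
    next_id = 0
    for start in range(n):
        if label[start] != -1:
            continue  # already assigned to a consensus cluster
        label[start] = next_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if label[j] == -1 and co[i][j] > threshold:
                    label[j] = next_id
                    stack.append(j)
        next_id += 1
    return label
```

Because only pairwise agreement is used, the cluster ids of the input clusterings may be permuted arbitrarily. A clustering that disagrees with the majority is simply outvoted by this scheme, which is the loss of exceptional clusterings that the discussion slides question.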

  6. Alternative Clustering
     • given a clustering, use diversity or non-redundancy as a constraint to find a different clustering
     • techniques:
       – ensemble techniques
       – use different subspaces
     • relationship to subspace clustering:
       – subspace clustering can learn from the treatment of non-redundancy
       – alternative clustering can learn to allow for a certain level of redundancy

  7. Multiview Clustering
     • seek different clusterings in different subspaces
     • special case of alternative clustering?
       – constraint: orthogonality of subspaces
     • special case of subspace clustering?
       – allowing maximal overlap of clusters
       – seeking minimally redundant clusters by accommodating different concepts
     • emphasizes the observation known from subspace clustering: highly overlapping clusters in different subspaces need not be redundant nor meaningless

  8. Discussion
     • subspace clustering
       – goal: different clusters in different subspaces
       – problem: redundancy of clusters (same clusters reported for different subspaces)
     • ensemble clustering
       – goal: different subspaces shall induce the same clusters
       – problem: correspondence of clusterings? What about actually different clusterings?
     • alternative clustering
       – goal: given a clustering, find a different clustering
       – problem: which level of redundancy is admissible?
     • multiview clustering
       – goal: find different cluster concepts in different subspaces
       – problem: balance between admissible overlap of clusters and difference between concepts

  9. Discussion
     • how should we treat diversity of clustering solutions?
       – should diverse clusterings always be unified (ensemble)?
       – under which conditions is a unification of diverse clusterings meaningful?
     • can we learn from diversity itself?
       – again ensemble: exceptional clustering in one subspace will be outnumbered and lost
       – could it not be especially interesting?
     • how to treat redundancy (esp. overlap)?
       – when does a cluster qualify as redundant w.r.t. another cluster, when does it represent a different concept (despite a certain overlap)?
       [Figure: spectrum from alternative clustering (low redundancy) to subspace clustering (high redundancy)]
     • how to assess similarity between clustering solutions?
       – possible overlap between clusters makes this problem really difficult
       – no simple mapping
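
To make the last question concrete, here is one simple way to compare two clustering solutions that may contain overlapping clusters. It is an illustrative assumption, not a measure proposed in the talk: each clustering is a list of sets of object ids, and the two solutions are compared by the Jaccard coefficient of their co-clustered object pairs, which needs no mapping between clusters.

```python
from itertools import combinations

def co_clustered_pairs(clusters):
    """All object pairs that share at least one cluster (overlap allowed)."""
    pairs = set()
    for cluster in clusters:
        pairs.update(combinations(sorted(cluster), 2))
    return pairs

def pair_jaccard(clustering_a, clustering_b):
    """Jaccard coefficient of the co-clustered pairs of two clusterings."""
    pairs_a = co_clustered_pairs(clustering_a)
    pairs_b = co_clustered_pairs(clustering_b)
    union = pairs_a | pairs_b
    # Two clusterings without any co-clustered pair are treated as identical.
    return len(pairs_a & pairs_b) / len(union) if union else 1.0
```

This measure ignores the subspaces in which clusters were found, so two clusters with the same objects in different subspaces count as fully redundant. Whether that is appropriate is exactly the open question raised on this slide.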
