Clustering
Unsupervised learning: learning without a teacher (lecture slides)
So far in the course
Supervised learning: learning with a teacher
‣ Training data consisted of (feature, label) pairs, and the goal was to learn a mapping from features to labels.
Clustering: unsupervised learning, learning without a teacher
‣ Only features, no labels.
Why is unsupervised learning useful?
‣ Discover hidden structure in the data (clustering, today's topic).
‣ Visualization via dimensionality reduction; lower-dimensional features might also help learning.
CMPSCI 689: Machine Learning, Subhransu Maji (UMASS), 7 April 2015

Clustering
Basic idea: group together similar instances.
Example: 2D points.
What could "similar" mean?
‣ One option: small squared Euclidean distance,

  dist(x, y) = ||x − y||₂²

‣ Clustering results depend crucially on the chosen measure of similarity (or distance) between the points to be clustered.
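The squared Euclidean distance above can be computed in a couple of lines; a minimal sketch (the function name is illustrative, not from the slides):

```python
import numpy as np

def sq_euclidean(x, y):
    """Squared Euclidean distance ||x - y||_2^2 between two points."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(d @ d)

# Example with two 2D points: sq_euclidean([0, 0], [3, 4]) gives 25.0
```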

Clustering algorithms
Simple clustering: organize elements into k groups
‣ K-means
‣ Mean shift
‣ Spectral clustering
Hierarchical clustering: organize elements into a hierarchy
‣ Bottom up: agglomerative
‣ Top down: divisive

Clustering examples
‣ Image segmentation: break up the image into similar regions (image credit: Berkeley segmentation benchmark).
‣ Clustering news articles.
‣ Clustering search queries.

More clustering examples
‣ Clustering people by space and time (image credit: Pilho Kim).
‣ Clustering species (phylogeny), e.g., the phylogeny of canid species (dogs, wolves, foxes, jackals, etc.) [K. Lindblad-Toh, Nature 2005].

Clustering using k-means
Given points (x₁, x₂, …, xₙ), partition the n observations into k (≤ n) sets S = {S₁, S₂, …, Sₖ} so as to minimize the within-cluster sum of squared distances. The objective is

  arg min_S Σ_{i=1}^{k} Σ_{x ∈ S_i} ||x − μ_i||²

where μ_i is the mean (cluster center) of S_i.

Lloyd's algorithm for k-means
‣ Initialize the k centers by picking k points at random among all the points.
‣ Repeat until convergence (or a maximum number of iterations):
  - Assign each point to the nearest center (assignment step).
  - Re-estimate the mean of each group (update step).
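The two alternating steps of Lloyd's algorithm can be sketched in NumPy as follows; this is a minimal illustration (function and variable names are mine, not from the slides):

```python
import numpy as np

def lloyd_kmeans(X, k, max_iters=100, seed=0):
    """Minimal sketch of Lloyd's algorithm for k-means.

    X: (n, d) array of points. Returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize k centers by picking k distinct points at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center (O(NKD)).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each center becomes the mean of its group (O(ND));
        # an empty cluster keeps its old center.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):  # partitions stopped changing
            break
        centers = new_centers
    return centers, labels
```

On two well-separated blobs this recovers the obvious partition; the objective decreases monotonically, as the slides note, but only a local optimum is guaranteed.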

k-means in action
Animation: http://simplystatistics.org/2014/02/18/k-means-clustering-in-a-gif/

Properties of Lloyd's algorithm
‣ Guaranteed to converge in a finite number of iterations:
  - The objective decreases monotonically over time.
  - A local minimum is reached when the partitions stop changing; since there are finitely many partitions, the algorithm must converge.
‣ Running time per iteration:
  - Assignment step: O(NKD)
  - Computing cluster means: O(ND)
‣ Issues with the algorithm:
  - Worst-case running time is super-polynomial in the input size.
  - No guarantees about global optimality; finding the optimal clustering is NP-hard even for 2 clusters [Aloise et al., 09].

k-means++ algorithm
A way to pick good initial centers; the intuition is to spread out the k initial cluster centers. The algorithm proceeds as normal k-means once the centers are initialized.
k-means++ initialization [Arthur and Vassilvitskii '07]:
1. Choose one center uniformly at random among all the points.
2. For each point x, compute D(x), the distance between x and the nearest center that has already been chosen.
3. Choose one new data point at random as a new center, using a weighted probability distribution where a point x is chosen with probability proportional to D(x)².
4. Repeat steps 2 and 3 until k centers have been chosen.
The resulting clustering is an O(log k) approximation in expectation. http://en.wikipedia.org/wiki/K-means%2B%2B

k-means for image segmentation
Grouping pixels based on intensity similarity; feature space: intensity value (1D). Shown for K=2 and K=3.
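The four initialization steps above can be sketched directly; a minimal NumPy version (names are illustrative). Since `d2` below holds squared distances, sampling proportional to it is exactly the D(x)² weighting:

```python
import numpy as np

def kmeanspp_init(X, k, seed=0):
    """Sketch of k-means++ seeding. Returns the indices of the k chosen centers."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Step 1: first center uniformly at random.
    centers = [rng.integers(n)]
    for _ in range(k - 1):
        # Step 2: D(x)^2 = squared distance to the nearest chosen center.
        d2 = ((X[:, None, :] - X[centers][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        # Step 3: sample the next center with probability proportional to D(x)^2.
        centers.append(rng.choice(n, p=d2 / d2.sum()))
    return np.array(centers)
```

Already-chosen points have D(x) = 0, so they can never be picked again; distant points are favored, which spreads the centers out as intended.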

Clustering using density estimation
One issue with k-means is that it is sometimes hard to pick k. The mean shift algorithm instead seeks the modes (local maxima) of a density estimate in the feature space, and so determines the number of clusters automatically.

Kernel density estimator (Gaussian kernel with bandwidth h):

  K(x) = (1/Z) Σ_i exp(−||x − x_i||² / h²)

A small h implies more modes (a bumpier distribution).

Mean shift algorithm
Mean shift simply follows the gradient of the density. For each point, repeat until convergence:
‣ Compute the mean shift vector

  m(x) = [ Σ_{i=1}^{n} x_i g(||x − x_i||² / h²) ] / [ Σ_{i=1}^{n} g(||x − x_i||² / h²) ] − x

  where g is the derivative of the kernel profile.
‣ Translate the kernel window by m(x).

Illustration (slides by Y. Ukrainitz & B. Sarel): at each step the procedure computes the center of mass of the points inside the current search window; the mean shift vector is the displacement from the window center to that center of mass, and translating the window by it moves the window uphill toward a mode.
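The window-tracking procedure illustrated above can be sketched for a single query point; a minimal NumPy version assuming a Gaussian kernel with bandwidth h (names are mine, not from the slides):

```python
import numpy as np

def mean_shift_point(x, X, h, tol=1e-6, max_iters=500):
    """Follow the mean shift trajectory of one point to a mode of the density.

    Repeatedly moves x to the kernel-weighted mean (center of mass) of the
    data X under a Gaussian kernel with bandwidth h."""
    x = np.asarray(x, dtype=float)
    for _ in range(max_iters):
        w = np.exp(-((X - x) ** 2).sum(axis=1) / h**2)      # kernel weights
        shifted = (w[:, None] * X).sum(axis=0) / w.sum()    # center of mass
        if np.linalg.norm(shifted - x) < tol:               # shift vanished: at a mode
            return shifted
        x = shifted
    return x
```

With a large bandwidth the density has a single mode, so every starting point converges to the same location; shrinking h reveals more modes, matching the "small h implies more modes" remark above.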


Mean shift clustering
‣ Cluster all data points in the attraction basin of a mode.
‣ The attraction basin is the region from which all trajectories lead to the same mode; attraction basins correspond to clusters.

Mean shift for image segmentation
‣ Feature: L*u*v* color values.
‣ Initialize a window at each individual feature point.
‣ Perform mean shift for each window until convergence.
‣ Merge windows that end up near the same "peak" or mode.
Clustering results: http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
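The steps above (shift every point to its mode, then merge nearby modes) can be sketched as follows; `merge_tol`, the threshold for "near the same peak", is an assumed parameter, and the names are illustrative:

```python
import numpy as np

def mean_shift_cluster(X, h, merge_tol=0.5):
    """Sketch of mean shift clustering: points whose trajectories end near
    the same mode share an attraction basin and get the same label."""
    def shift(x):
        # Follow the mean shift trajectory of one point (Gaussian kernel).
        for _ in range(300):
            w = np.exp(-((X - x) ** 2).sum(axis=1) / h**2)
            nxt = (w[:, None] * X).sum(axis=0) / w.sum()
            if np.linalg.norm(nxt - x) < 1e-6:
                break
            x = nxt
        return x

    modes, labels = [], []
    for x in X:
        m = shift(x.astype(float))
        # Merge with an existing mode if the trajectory ended near it.
        for j, mo in enumerate(modes):
            if np.linalg.norm(m - mo) < merge_tol:
                labels.append(j)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return np.array(labels), np.array(modes)
```

Note that k never appears: the number of clusters is whatever number of merged modes survives, which is the point of the density-based view.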
 Cluster all data points in the attraction basin of a mode ! window Attraction basin is the region for which all trajectories lead to the Center of same mode — correspond to clusters mass Slide&by&Y.&Ukrainitz&&&B.&Sarel Slide&by&Y.&Ukrainitz&&&B.&Sarel CMPSCI 689 Subhransu Maji (UMASS) 25 /48 CMPSCI 689 Subhransu Maji (UMASS) 26 /48 Mean shift for image segmentation Mean shift clustering results Feature: L*u*v* color values ! Initialize windows at individual feature points ! Perform mean shift for each window until convergence ! Merge windows that end up near the same “peak” or mode http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html CMPSCI 689 Subhransu Maji (UMASS) 27 /48 CMPSCI 689 Subhransu Maji (UMASS) 28 /48

Mean shift discussion
Pros:
‣ Does not assume a particular shape for the clusters.
‣ Only one parameter to choose (the window size).
‣ A generic technique that finds multiple modes.
Cons:
‣ Selecting the window size is difficult.
‣ Rather expensive: O(DN²) per iteration.
‣ Does not work well for high-dimensional features.
(slide credit: Kristen Grauman)

Spectral clustering
Group points based on the links in a graph [Shi & Malik '00; Ng, Jordan, Weiss NIPS '01].
How do we create the graph?
‣ Put weights on the edges based on the similarity between the points; a common choice is the Gaussian kernel

  W(i, j) = exp(−||x_i − x_j||² / (2σ²))

‣ One could create a fully connected graph, or a k-nearest-neighbor graph (each node is connected only to its k nearest neighbors).
(figures from Ng, Jordan, Weiss NIPS '01; slide credit: Alan Fern)
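The graph construction above, plus the spectral step that gives the method its name, can be sketched as follows. This is a simplified version in the spirit of Ng, Jordan, and Weiss (it omits their final k-means step and row normalization); names are illustrative:

```python
import numpy as np

def spectral_embed(X, sigma, k):
    """Build a Gaussian-kernel affinity graph and return the top-k
    eigenvectors of the symmetrically normalized affinity as new features."""
    # W(i, j) = exp(-||x_i - x_j||^2 / (2 sigma^2)), fully connected graph.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)           # no self-edges
    D = W.sum(axis=1)                  # node degrees
    L = W / np.sqrt(np.outer(D, D))    # D^{-1/2} W D^{-1/2}
    vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return vecs[:, -k:]                # eigenvectors of the k largest eigenvalues
```

Each row of the returned matrix is a point's new coordinate; points in the same well-connected group land close together in this embedding, after which a simple algorithm such as k-means separates them.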
