  1. Clustering, Lecture 8. David Sontag, New York University. Slides adapted from Luke Zettlemoyer, Vibhav Gogate, Carlos Guestrin, Andrew Moore, Dan Klein

  2. Clustering. Clustering: – Unsupervised learning – Requires data, but no labels – Detects patterns, e.g. in: • Grouping emails or search results • Customer shopping patterns • Regions of images – Useful when you don’t know what you’re looking for – But: can produce gibberish

  3. Clustering • Basic idea: group together similar instances • Example: 2D point patterns

  4. Clustering • Basic idea: group together similar instances • Example: 2D point patterns

  5. Clustering • Basic idea: group together similar instances • Example: 2D point patterns • What could “similar” mean? – One option: small squared Euclidean distance, dist(x, y) = ||x − y||_2^2 – Clustering results are crucially dependent on the measure of similarity (or distance) between “points” to be clustered
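The squared-Euclidean option above can be written out in a couple of lines (a minimal illustration; the function name is my own):

```python
import numpy as np

def sq_euclidean(x, y):
    # dist(x, y) = ||x - y||_2^2, the squared Euclidean distance
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(d @ d)
```

For example, sq_euclidean([0, 0], [3, 4]) is 3² + 4² = 25.0. Swapping this function for another distance (e.g. cosine) changes which points count as "similar", and therefore changes the clustering.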

  6. Clustering algorithms • Partitioning algorithms (flat) – K-means – Mixture of Gaussians – Spectral clustering • Hierarchical algorithms – Bottom up (agglomerative) – Top down (divisive)

  7. Clustering examples: Image segmentation. Goal: break up the image into meaningful or perceptually similar regions. [Slide from James Hayes]

  8. Clustering examples: Clustering gene expression data [Eisen et al., PNAS 1998]

  9. Clustering examples: Cluster news articles

  10. Clustering examples: Cluster people by space and time [Image from Pilho Kim]

  11. Clustering examples: Clustering languages [Image from scienceinschool.org]

  12. Clustering examples: Clustering languages [Image from dhushara.com]

  13. Clustering examples: Clustering species (“phylogeny”) [Lindblad-Toh et al., Nature 2005]

  14. Clustering examples: Clustering search queries

  15. K-Means • An iterative clustering algorithm – Initialize: pick K random points as cluster centers – Alternate: 1. Assign data points to the closest cluster center 2. Change each cluster center to the average of its assigned points – Stop when no point’s assignment changes

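The initialize/alternate loop above is short enough to write out directly. A minimal NumPy sketch (the `init` argument is my own addition so the random seeding can be bypassed):

```python
import numpy as np

def kmeans(X, K, init=None, n_iter=100, seed=0):
    """K-means as on the slide: pick K random points as centers, then
    alternate assignment and averaging until assignments stop changing."""
    rng = np.random.default_rng(seed)
    if init is None:
        centers = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    else:
        centers = np.array(init, dtype=float)
    assign = None
    for _ in range(n_iter):
        # Step 1: assign each point to its closest cluster center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_assign = d2.argmin(axis=1)
        if assign is not None and np.array_equal(new_assign, assign):
            break  # no assignment changed: converged
        assign = new_assign
        # Step 2: move each center to the average of its assigned points
        for k in range(K):
            members = X[assign == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    return centers, assign
```

On two well-separated blobs with sensible starting centers, this converges in a couple of iterations to the obvious partition.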

  17. K-means clustering: Example • Pick K random points as cluster centers (means). Shown here for K = 2

  18. K-means clustering: Example. Iterative Step 1 • Assign data points to the closest cluster center

  19. K-means clustering: Example. Iterative Step 2 • Change the cluster center to the average of the assigned points

  20. K-means clustering: Example • Repeat until convergence

  21. Properties of K-means algorithm • Guaranteed to converge in a finite number of iterations • Running time per iteration: 1. Assigning data points to the closest cluster center: O(KN) time 2. Changing each cluster center to the average of its assigned points: O(N) time

  22. K-means convergence. Objective: min over μ and C of Σ_{j=1}^{K} Σ_{i ∈ C_j} ||x_i − μ_j||_2^2 1. Fix μ, optimize C (Step 1 of K-means) 2. Fix C, optimize μ (Step 2 of K-means) – Taking the partial derivative with respect to μ_j and setting it to zero recovers the average of the assigned points. K-means takes an alternating optimization approach; each step is guaranteed to decrease the objective, and thus the algorithm is guaranteed to converge. [Slide from Alan Fern]
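The objective on slide 22 is the usual K-means cost; a sketch of evaluating it, assuming the standard sum-of-squared-distances form:

```python
import numpy as np

def kmeans_objective(X, centers, assign):
    # sum_j sum_{i in C_j} ||x_i - mu_j||^2
    # assign[i] gives the cluster index j of point x_i
    return float(((X - centers[assign]) ** 2).sum())
```

Each of the two alternating steps can only lower (or leave unchanged) this value, which is why the algorithm must converge: moving a center to the mean of its assigned points, for example, strictly reduces the within-cluster sum of squares unless the center is already the mean.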

  23. Example: K-Means for Segmentation. Goal of segmentation is to partition an image into regions, each of which has reasonably homogeneous visual appearance. [Image panels: Original image, K = 2, K = 3, K = 10]

  24. Example: K-Means for Segmentation [Image panels: Original image, K = 2, K = 3, K = 10]


  26. Example: Vector quantization. FIGURE 14.9. Sir Ronald A. Fisher (1890–1962) was one of the founders of modern day statistics, to whom we owe maximum likelihood, sufficiency, and many other fundamental concepts. The image on the left is a 1024 × 1024 grayscale image at 8 bits per pixel. The center image is the result of 2 × 2 block VQ, using 200 code vectors, with a compression rate of 1.9 bits/pixel. The right image uses only four code vectors, with a compression rate of 0.50 bits/pixel. [Figure from Hastie et al. book]
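The 2 × 2 block VQ in the caption is just K-means run on image patches: the code vectors are the cluster centers, and each patch is replaced by its nearest code vector. A sketch of the idea (my own illustrative code, not the book's):

```python
import numpy as np

def block_vq(img, codebook_size, block=2, n_iter=20, seed=0):
    """2x2 block vector quantization with a K-means codebook.
    img: 2-D grayscale array whose sides are divisible by `block`."""
    H, W = img.shape
    # Cut the image into block x block patches, one row vector per patch
    patches = (img.reshape(H // block, block, W // block, block)
                  .transpose(0, 2, 1, 3)
                  .reshape(-1, block * block)).astype(float)
    rng = np.random.default_rng(seed)
    code = patches[rng.choice(len(patches), codebook_size, replace=False)]
    for _ in range(n_iter):
        # assign each patch to its nearest code vector, then re-average
        d2 = ((patches[:, None, :] - code[None, :, :]) ** 2).sum(axis=2)
        a = d2.argmin(axis=1)
        for k in range(codebook_size):
            if np.any(a == k):
                code[k] = patches[a == k].mean(axis=0)
    # Reconstruct each patch from its nearest code vector
    rec = code[a].reshape(H // block, W // block, block, block)
    return rec.transpose(0, 2, 1, 3).reshape(H, W)
```

With 200 code vectors, each 2 × 2 patch is stored as one index in [0, 200), i.e. about 8 bits per 4 pixels plus the codebook, which is where the quoted ~1.9 bits/pixel rate comes from.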

  27. Initialization • K-means algorithm is a heuristic – It requires initial means, and it matters which ones you pick! – What can go wrong? Poor starting centers can lead to a bad local optimum – Various schemes for preventing this kind of thing: variance-based split/merge, initialization heuristics
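The slide doesn't name a specific initialization heuristic, so as one concrete example (my choice, not necessarily the lecture's): k-means++-style seeding spreads the initial centers out by preferring points far from the centers chosen so far.

```python
import numpy as np

def plusplus_init(X, K, seed=0):
    """k-means++-style seeding: first center uniform at random, each
    later center drawn with probability proportional to its squared
    distance from the nearest center chosen so far."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))].astype(float)]
    for _ in range(K - 1):
        # squared distance from each point to its nearest chosen center
        d2 = np.min(((X[:, None, :] - np.array(centers)[None, :, :]) ** 2)
                    .sum(axis=2), axis=1)
        probs = d2 / d2.sum()
        centers.append(X[rng.choice(len(X), p=probs)].astype(float))
    return np.array(centers)
```

Because an already-chosen point has squared distance zero, it can never be picked twice, and far-apart regions of the data are likely to each receive a center.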

  28. K-Means Getting Stuck. A local optimum: would be better to have one cluster here… and two clusters here

  29. K-means is not able to properly cluster [Scatter plot with axes X and Y]

  30. Changing the features (distance function) can help [Same data replotted with axes R and θ]
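For ring-shaped data like the figure's, re-expressing each point in polar coordinates makes the rings separable along the radius axis, where plain K-means works. A small sketch:

```python
import numpy as np

def polar_features(X):
    # Map 2-D points (x, y) to (r, theta); concentric rings become
    # bands along the r axis, which K-means can separate.
    x, y = X[:, 0], X[:, 1]
    return np.stack([np.hypot(x, y), np.arctan2(y, x)], axis=1)
```

This is the point of the slide: the clustering algorithm is unchanged, only the feature space (and hence the distance function) is.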

  31. Hierarchical Clustering

  32. Agglomerative Clustering • Agglomerative clustering: – First merge very similar instances – Incrementally build larger clusters out of smaller clusters • Algorithm: – Maintain a set of clusters – Initially, each instance is in its own cluster – Repeat: • Pick the two closest clusters • Merge them into a new cluster • Stop when there’s only one cluster left • Produces not one clustering, but a family of clusterings represented by a dendrogram
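The repeat-merge loop above can be sketched naively in a few lines (single-link shown; O(n³) per run, fine for small data; names are my own):

```python
import numpy as np

def agglomerative_single_link(X):
    """Agglomerative clustering as on the slide: every point starts as
    its own cluster; repeatedly merge the two closest clusters (here
    "closest" = closest pair of points) and record each merge. The
    sequence of merges is exactly the dendrogram."""
    clusters = [[i] for i in range(len(X))]
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    merges = []
    while len(clusters) > 1:
        # find the pair of clusters with the closest pair of points
        i, j = min(((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
                   key=lambda p: min(dist[u, v]
                                     for u in clusters[p[0]]
                                     for v in clusters[p[1]]))
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

Cutting the merge sequence at any point yields one clustering from the family the slide mentions.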

  33. Agglomerative Clustering • How should we define “closest” for clusters with multiple elements?

  34. Agglomerative Clustering • How should we define “closest” for clusters with multiple elements? • Many options: – Closest pair (single-link clustering) – Farthest pair (complete-link clustering) – Average of all pairs • Different choices create different clustering behaviors
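The three options listed can be written down directly as cluster-distance functions (illustrative names of my own):

```python
import numpy as np

def single_link(A, B):
    # closest pair across the two clusters
    return min(float(np.linalg.norm(a - b)) for a in A for b in B)

def complete_link(A, B):
    # farthest pair across the two clusters
    return max(float(np.linalg.norm(a - b)) for a in A for b in B)

def average_link(A, B):
    # average over all cross-cluster pairs
    return float(np.mean([np.linalg.norm(a - b) for a in A for b in B]))
```

Single-link tends to produce long "chained" clusters, complete-link compact ones, with average-link in between, which is the behavioral difference the next slides illustrate.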

  35. Agglomerative Clustering • How should we define “closest” for clusters with multiple elements? [Figure: the same eight points clustered by closest pair (single-link clustering) vs. farthest pair (complete-link clustering)] [Pictures from Thorsten Joachims]

  36. Clustering Behavior [Figure: results using average, farthest, and nearest linkage. Mouse tumor data from Hastie et al.]

  37. Agglomerative Clustering. When can this be expected to work? Strong separation property: all points are more similar to points in their own cluster than to any points in any other cluster. Then, the true clustering corresponds to some pruning of the tree obtained by single-link clustering! Slightly weaker (stability) conditions are solved by average-link clustering (Balcan et al., 2008) [Figure: closest pair (single-link clustering) on eight points]

  38. Spectral Clustering. Slides adapted from James Hays, Alan Fern, and Tommi Jaakkola

  39. Spectral clustering [Figure: “two circles” data, 2 clusters, as clustered by K-means vs. spectral clustering] [Shi & Malik ‘00; Ng, Jordan, Weiss NIPS ‘01]

  40. Spectral clustering [Figure panels: nips, 8 clusters; lineandballs, 3 clusters; fourclouds, 2 clusters; squiggles, 4 clusters; threecircles-joined, 3 clusters; twocircles, 2 clusters; threecircles-joined, 2 clusters] [Figures from Ng, Jordan, Weiss NIPS ‘01]

  41. Spectral clustering. Group points based on links in a graph [Figure: graph with groups A and B] [Slide from James Hays]
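A minimal two-way version of the graph idea (a sketch of the Laplacian trick, not the exact Shi-Malik or Ng-Jordan-Weiss algorithms, which use normalized variants and K-means on several eigenvectors):

```python
import numpy as np

def spectral_cut(W):
    """Split a graph with symmetric affinity matrix W into two groups
    by the sign of the second-smallest eigenvector (Fiedler vector)
    of the unnormalized graph Laplacian L = D - W."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    vals, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = vecs[:, 1]             # 2nd-smallest eigenvector
    return (fiedler > 0).astype(int)
```

On a graph made of two dense groups joined by a weak edge, the Fiedler vector is nearly constant within each group with opposite signs, so thresholding it at zero recovers the two groups A and B from the slide.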
