
Lecture 21: Clustering — K-Means. Aykut Erdem, December 2018, Hacettepe University.

Last time: Boosting. Idea: given a weak learner, run it multiple times on (reweighted) training data, then let the learned classifiers vote.


  1. Clustering algorithms
• Partitioning algorithms: construct various partitions and then evaluate them by some criterion
  - K-means
  - Mixture of Gaussians
  - Spectral clustering
• Hierarchical algorithms: create a hierarchical decomposition of the set of objects using some criterion
  - Bottom-up (agglomerative)
  - Top-down (divisive)
slide by Eric Xing

  2. Desirable Properties of a Clustering Algorithm
• Scalability (in terms of both time and space)
• Ability to deal with different data types
• Minimal requirements for domain knowledge to determine input parameters
• Ability to deal with noisy data
• Interpretability and usability
• Optional: incorporation of user-specified constraints
slide by Andrew Moore

  3. K-Means Clustering

  4. K-Means Clustering Benefits
• Fast
• Conceptually straightforward
• Popular
slide by Tamara Broderick

  5. K-Means: Preliminaries

  6. K-Means: Preliminaries
Datum: vector of continuous values — e.g. a point such as (1.5, 6.2), plotted against axes Distance East and Distance North.

  9. K-Means: Preliminaries
Datum: vector of continuous values, e.g. x_3 = (1.5, 6.2).

       East | North
  x_1:  1.2 |  5.9
  x_2:  4.3 |  2.1
  x_3:  1.5 |  6.2
  ...
  x_N:  4.1 |  2.3

  10. K-Means: Preliminaries
Datum: vector of continuous values. The axes are now generic: x_3 = (1.5, 6.2) plotted against Feature 1 and Feature 2.

  11. K-Means: Preliminaries
Datum: vector of D continuous values. In general notation, x_n = (x_{n,1}, x_{n,2}, ..., x_{n,D}); e.g. x_3 = (x_{3,1}, x_{3,2}) = (1.5, 6.2).

  14. K-Means: Preliminaries
Dissimilarity: distance as the crow flies, e.g. between two points x_3 and x_17 in feature space.

  17. K-Means: Preliminaries
Dissimilarity: Euclidean distance.

  18. K-Means: Preliminaries
Dissimilarity: squared Euclidean distance (for D = 2):
  dis(x_3, x_17) = (x_{3,1} − x_{17,1})² + (x_{3,2} − x_{17,2})²

  19. K-Means: Preliminaries
Dissimilarity: squared Euclidean distance, summed over all D features:
  dis(x_3, x_17) = Σ_{d=1}^{D} (x_{3,d} − x_{17,d})²
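The per-feature sum translates directly into code. A minimal Python sketch (the second point `x17` here is a made-up example value, not from the slides):

```python
def sq_euclidean(x, y):
    """Squared Euclidean distance: sum over features d of (x_d - y_d)^2."""
    assert len(x) == len(y)
    return sum((xd - yd) ** 2 for xd, yd in zip(x, y))

x3 = (1.5, 6.2)   # example datum from the slides
x17 = (4.0, 2.0)  # hypothetical second point, for illustration only
d = sq_euclidean(x3, x17)
```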

  20. K-Means: Preliminaries
Dissimilarity

  21. K-Means: Preliminaries
Cluster summary:
• K = number of clusters
• K cluster centers µ_1, µ_2, ..., µ_K, where e.g. µ_1 = (µ_{1,1}, µ_{1,2})
• Data assignments to clusters S_1, S_2, ..., S_K, where S_k = set of points in cluster k
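As a sketch of how this cluster summary might be represented in code (NumPy arrays; the concrete values are arbitrary placeholders, not from the lecture):

```python
import numpy as np

K, D = 3, 2                      # 3 clusters, 2 features, as in the slides
mu = np.zeros((K, D))            # K cluster centers; mu[k] = (mu_{k,1}, ..., mu_{k,D})
labels = np.array([0, 2, 1, 0])  # cluster index assigned to each of N = 4 data points
# S_k, the set of points in cluster k, is recoverable from the labels:
S = [np.flatnonzero(labels == k) for k in range(K)]
```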

  33. K-Means: Preliminaries
Dissimilarity (global):
  dis_global = Σ_{k=1}^{K} Σ_{n : x_n ∈ S_k} Σ_{d=1}^{D} (x_{n,d} − µ_{k,d})²
(for each cluster; for each data point in the kth cluster; for each feature)
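The triple sum can be sketched in NumPy, with the sums over points and features vectorized (the example data and centers below are made up for illustration):

```python
import numpy as np

def dis_global(X, mu, labels):
    """Global dissimilarity: for each cluster k, for each point in S_k,
    for each feature d, accumulate (x_{n,d} - mu_{k,d})^2."""
    total = 0.0
    for k in range(len(mu)):
        Xk = X[labels == k]                 # points assigned to cluster k
        total += ((Xk - mu[k]) ** 2).sum()  # inner two sums, vectorized
    return total

X = np.array([[0.0, 0.0], [2.0, 0.0], [5.0, 5.0]])
mu = np.array([[1.0, 0.0], [5.0, 5.0]])
labels = np.array([0, 0, 1])
# cluster 0 contributes (0-1)^2 + (2-1)^2 = 2, cluster 1 contributes 0
```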

  39. K-Means Algorithm
• Initialize K cluster centers
• Repeat until convergence:
  ✦ Assign each data point to the cluster with the closest center.
  ✦ Assign each cluster center to be the mean of its cluster's data points.

  41. K-Means Algorithm
• For k = 1,…,K:
  ✦ Randomly draw n from 1,…,N without replacement
  ✦ µ_k ← x_n
• Repeat until S_1,…,S_K don't change:
  ✦ Assign each data point to the cluster with the closest center.
  ✦ Assign each cluster center to be the mean of its cluster's data points.

  46. K-Means Algorithm
• For k = 1,…,K:
  ✦ Randomly draw n from 1,…,N without replacement
  ✦ µ_k ← x_n
• Repeat until S_1,…,S_K don't change (or until no change in dis_global):
  ✦ Assign each data point to the cluster with the closest center.
  ✦ Assign each cluster center to be the mean of its cluster's data points.


  48. K-Means Algorithm
• For k = 1,…,K:
  ✦ Randomly draw n from 1,…,N without replacement
  ✦ µ_k ← x_n
• Repeat until S_1,…,S_K don't change:
  ✦ For n = 1,…,N:
    ❖ Find k with smallest dis(x_n, µ_k)
    ❖ Put x_n ∈ S_k (and no other S_j)
  ✦ Assign each cluster center to be the mean of its cluster's data points.

  52. K-Means Algorithm
• For k = 1,…,K:
  ✦ Randomly draw n from 1,…,N without replacement
  ✦ µ_k ← x_n
• Repeat until S_1,…,S_K don't change:
  ✦ For n = 1,…,N:
    ✤ Find k with smallest dis(x_n, µ_k)
    ✤ Put x_n ∈ S_k (and no other S_j)
  ✦ For k = 1,…,K:
    ✤ µ_k ← |S_k|⁻¹ Σ_{n : x_n ∈ S_k} x_n


  62. K-Means: Evaluation

  63. K-Means: Evaluation
• Will it terminate? Yes, always.
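One way to see why: the assignment step and the update step can each only lower (or keep) dis_global, and there are finitely many possible assignments, so the loop cannot cycle. A small sketch on synthetic data that records the objective after every assignment step (my own loop and data, not code from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)),   # synthetic cluster near (0, 0)
               rng.normal(8, 1, (20, 2))])  # synthetic cluster near (8, 8)
K = 2
mu = X[rng.choice(len(X), size=K, replace=False)].copy()

history = []
for _ in range(20):
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    history.append(d2[np.arange(len(X)), labels].sum())  # current dis_global
    for k in range(K):
        if (labels == k).any():  # guard against an empty cluster
            mu[k] = X[labels == k].mean(axis=0)
# history is non-increasing: each full iteration can only improve the objective
```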
