APPLIED MACHINE LEARNING (MACHINE LEARNING - MSc Course)
Methods for Clustering: K-means, Soft K-means, DBSCAN


SLIDE 1

APPLIED MACHINE LEARNING
Methods for Clustering: K-means, Soft K-means, DBSCAN

SLIDE 2

Objectives

Learn basic techniques for data clustering

  • K-means and soft K-means, GMM (next lecture)
  • DBSCAN

Understand the issues and major challenges in clustering

  • Choice of metric
  • Choice of number of clusters
SLIDE 3

What is clustering?

Clustering is a type of multivariate statistical analysis also known as cluster analysis, unsupervised classification analysis, or numerical taxonomy. Clustering is the process of partitioning a set of data (or objects) into a set of meaningful sub-classes, called clusters.

Cluster: a collection of data objects that are “similar” to one another and thus can be treated collectively as one group.

SLIDE 4

Classification versus Clustering

Supervised Classification = Classification: we know the class labels and the number of classes.

Unsupervised Classification = Clustering: we do not know the class labels and may not know the number of classes.

SLIDE 5

Classification versus Clustering

Unsupervised Classification = Clustering: a hard problem when no pair of objects has exactly the same features. We need to determine how similar two or more objects are to one another.

SLIDE 6

Which clusters can you create?

Which two subgroups of pictures are similar and why?


SLIDE 8

What is Good Clustering?

A good clustering method produces high-quality clusters when:

  • The intra-class (that is, intra-cluster) similarity is high.
  • The inter-class similarity is low.

Note that the quality measure of a cluster depends on the similarity measure used!

SLIDE 9

Exercise:

Intra-class similarity is the highest when:
  a) you choose to classify images with and without glasses
  b) you choose to classify images of person1 against person2

(Legend: Person1 with glasses; Person1 without glasses; Person2 without glasses; Person2 with glasses.)

SLIDE 10

Exercise:

Projection onto first two principal components after PCA

(Legend: Person1 with glasses; Person1 without glasses; Person2 without glasses; Person2 with glasses.)

Intra-class similarity is the highest when:
  a) you choose to classify images with and without glasses
  b) you choose to classify images of person1 against person2

SLIDE 11

Exercise:

The eigenvector e1 is composed of a mix of the main characteristics of the two faces and is hence explanatory of both. However, since the two faces have little in common, the two groups have different coordinates on e1 but quasi-identical coordinates for the glasses within each subgroup. Projecting onto e1 hence offers a means to compute a metric of similarity across the two persons.

Projection onto e1 against e2
(Legend: Person1 with glasses; Person1 without glasses; Person2 without glasses; Person2 with glasses.)

SLIDE 12

Exercise:

When projecting onto e1 and e3, we can separate the images of person1 with and without glasses, as the eigenvector e3 embeds features distinctive primarily of person1.

Projection onto e1 against e3
(Legend: Person1 with glasses; Person1 without glasses; Person2 without glasses; Person2 with glasses.)

SLIDE 13

Exercise:

How would you design a method to find the groups when you no longer have the class labels?

Projection onto first two principal components after PCA

SLIDE 14

Sensitivity to Prior Knowledge

Priors:
  • Data cluster within a circle
  • There are 2 clusters

(Figure: axes x1, x2, x3; outliers (noise) versus relevant data.)

SLIDE 15

Sensitivity to Prior Knowledge

Priors:
  • Data follow a complex distribution
  • There are 3 clusters

(Figure: axes x1, x2, x3.)

SLIDE 16

Clusters' Types

Globular clusters versus non-globular clusters: K-means produces globular clusters; DBSCAN produces non-globular clusters.

SLIDE 17

What is Good Clustering?

Requirements for good clustering:

  • Discovery of clusters with arbitrary shape
  • Ability to deal with noise and outliers
  • Insensitivity to the ordering of input records
  • Scalability
  • Ability to handle high dimensionality
  • Interpretability and reusability

SLIDE 18

How to cluster?

What choice of model (circle, ellipse) for the cluster? How many models?

SLIDE 19

What choice of model (circle, ellipse) for the cluster? How many models? Circle; fixed number: K = 2. Where to place them for optimal clustering?

K-means Clustering

K-means clustering generates a number K of disjoint clusters so as to minimize:

J(μ_1, …, μ_K) = Σ_{k=1}^{K} Σ_{x_i ∈ c_k} ||x_i − μ_k||²

where x_i is the ith data point, μ_k the geometric centroid of cluster k, and c_k the cluster with label (or number) k.
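As a quick illustration, the objective J can be computed in a few lines of numpy (a minimal sketch; the function name and argument layout are my own, not from the lecture):

```python
import numpy as np

def kmeans_objective(X, centroids, labels):
    """Distortion J: total squared distance of each point to its assigned centroid."""
    return sum(np.sum((X[labels == k] - c) ** 2) for k, c in enumerate(centroids))
```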

SLIDE 20

K-means Clustering

Initialization: initialize at random the positions of the centers of the clusters

In mldemos, centroids are initialized on one datapoint with no overlap across centroids.

SLIDE 21

K-means Clustering

Assignment Step:

  • Calculate the distance from each data point to each centroid.
  • Assign the responsibility of each data point to its “closest” centroid.

If a tie happens (i.e., two centroids are equidistant to a data point), one assigns the data point to the smallest winning centroid.

k_i = argmin_k d(x_i, μ_k)

Responsibility of cluster k for point x_i:
r_k^i = 1 if k = k_i, 0 otherwise

where x_i is the ith data point and μ_k the geometric centroid of cluster k.
SLIDE 22

Update step (M-Step): Recompute the position of centroid based on the assignment of the points

K-means Clustering

μ_k = Σ_i r_k^i x_i / Σ_i r_k^i

with k_i = argmin_k d(x_i, μ_k) and the responsibility of cluster k for point x_i: r_k^i = 1 if k = k_i, 0 otherwise.

SLIDE 23

K-means Clustering

k_i = argmin_k d(x_i, μ_k)

Responsibility of cluster k for point x_i: r_k^i = 1 if k = k_i, 0 otherwise

Assignment Step:

  • Calculate the distance from each data point to each centroid.
  • Assign the responsibility of each data point to its "closest" centroid.

If a tie happens (i.e., two centroids are equidistant to a data point), one assigns the data point to the smallest winning centroid.

μ_k = Σ_i r_k^i x_i / Σ_i r_k^i

SLIDE 24

K-means Clustering

Update step (M-Step): Recompute the position of each centroid based on the assignment of the points.

Stopping Criterion: Go back to step 2 and repeat the process until the clusters are stable.

SLIDE 25

K-means Clustering

K-means creates a hard partitioning of the dataset.

(Figure: cluster boundaries and their intersection points.)

SLIDE 26

Effect of the distance metric on K-means

(Panels: L1-Norm, L2-Norm, L3-Norm, L8-Norm.)
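To see concretely what switching the norm does, here is a small illustrative helper (not from the lecture) computing the Minkowski L_p distance that K-means would minimize under each choice; as p grows the distance approaches the max norm, which reshapes the cluster boundaries accordingly:

```python
import numpy as np

def lp_distance(a, b, p):
    """Minkowski L_p distance: p=1 Manhattan, p=2 Euclidean, large p approaches the max norm."""
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

a, b = np.array([0.0, 0.0]), np.array([1.0, 2.0])
for p in (1, 2, 3, 8):
    print(f"L{p}-norm distance: {lp_distance(a, b, p):.3f}")
```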

SLIDE 27

K-means Clustering: Algorithm

  1. Initialization: Pick K arbitrary centroids and set their geometric means to random values (in mldemos, centroids are initialized on one datapoint with no overlap across centroids).
  2. Calculate the distance from each data point to each centroid.
  3. Assignment Step (E-step): Assign the responsibility of each data point to its "closest" centroid, k_i = argmin_k d(x_i, μ_k), i.e. r_k^i = 1 if k = k_i, 0 otherwise. If a tie happens (i.e., two centroids are equidistant to a data point), one assigns the data point to the smallest winning centroid.
  4. Update Step (M-step): Adjust the centroids to be the means of all data points assigned to them: μ_k = Σ_i r_k^i x_i / Σ_i r_k^i.
  5. Go back to step 2 and repeat the process until the clusters are stable.
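Putting steps 1-5 together, the following is a minimal numpy sketch of the full loop (an illustration under my own simplifications, e.g. leaving an empty cluster's centroid in place; it initializes on K distinct datapoints, as mldemos does, and stops once the assignments are stable):

```python
import numpy as np

def kmeans(X, K, max_iter=100, seed=None):
    """Hard K-means following steps 1-5 above; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1: initialize the centroids on K distinct datapoints.
    centroids = X[rng.choice(len(X), size=K, replace=False)].copy()
    labels = None
    for _ in range(max_iter):
        # Steps 2-3 (E-step): assign every point to its closest centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = d.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # Step 5: assignments are stable, so the clusters have converged.
        labels = new_labels
        # Step 4 (M-step): move each centroid to the mean of its assigned points.
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    return centroids, labels
```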

SLIDE 28

K-means Clustering

The K-means algorithm is a simple version of Expectation-Maximization applied to a model composed of isotropic Gaussian functions (see the next lecture).

SLIDE 29

K-means Clustering: Properties

  • There are always K clusters.
  • The clusters do not overlap (soft K-means relaxes this assumption, see the next slides).
  • Each member of a cluster is closer to its cluster than to any other cluster.

The algorithm is guaranteed to converge in a finite number of iterations, but it converges to a local optimum! It is hence very sensitive to the initialization of the centroids.

SLIDE 30

Soft K-means Clustering

Assignment Step (E-step):

  • Calculate the distance from each data point to each centroid.
  • Assign the responsibility of each data point to its “closest” centroid.

Each data point is given a soft "degree of assignment" to each of the means μ_k:

r_k^i = exp(−β d(x_i, μ_k)) / Σ_{k'} exp(−β d(x_i, μ_{k'}))

r_k^i ∈ [0, 1] is the responsibility of cluster k for point x_i, normalized over the clusters: Σ_k r_k^i = 1.

SLIDE 31

Soft K-means Clustering

Update step (M-Step): Recompute the position of each centroid based on the assignment of the points.

The model parameters, i.e. the means, are adjusted to match the weighted sample means of the data points that they are responsible for:

μ_k = Σ_i r_k^i x_i / Σ_i r_k^i

The update algorithm of soft K-means is identical to that of hard K-means, aside from the fact that the responsibilities to a particular cluster are now real numbers varying between 0 and 1.

SLIDE 32

Soft K-means Clustering

β is the stiffness; 1/β (~ σ) measures the disparity across clusters:
  • small β ~ large σ
  • large β ~ small σ

r_k^i = exp(−β d(x_i, μ_k)) / Σ_{k'} exp(−β d(x_i, μ_{k'})),   r_k^i ∈ [0, 1],   Σ_k r_k^i = 1
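A sketch of one soft K-means iteration with stiffness β, following the two formulas above (the helper and its return convention are mine): the responsibilities are a softmax over negative scaled distances, and the M-step uses them as weights.

```python
import numpy as np

def soft_kmeans_step(X, centroids, beta):
    """One E/M iteration of soft K-means; returns (updated centroids, responsibilities)."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)  # (M, K) distances
    r = np.exp(-beta * d)
    r /= r.sum(axis=1, keepdims=True)  # normalize so each point's responsibilities sum to 1
    # M-step: each mean becomes the responsibility-weighted mean of all data points.
    return (r.T @ X) / r.sum(axis=0)[:, None], r
```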

SLIDE 33

Soft K-means Clustering

Soft K-means algorithm with a small (left), medium (center) and large (right) stiffness β (β ∈ {1, 5, 10}).

SLIDE 34

Soft K-means Clustering

Iterations of the soft K-means algorithm from the random initialization (left) to convergence (right). Computed with β = 10.

SLIDE 35

(soft) K-means Clustering: Properties

Advantages:

  • Computationally faster than other clustering techniques.
  • Produces tighter clusters, especially if the clusters are globular.
  • Guaranteed to converge.

Drawbacks:

  • Does not work well with non-globular clusters.
  • Sensitive to the choice of initial partitions: different initial partitions can result in different final clusters.
  • Assumes a fixed number K of clusters.

It is, therefore, good practice to run the algorithm several times using different K values, to determine the optimal number of clusters.


SLIDE 37

K-means Clustering: Weaknesses

  • Unbalanced clusters: K-means takes into account only the distance between the means and the data points; it has no representation of the variance of the data within each cluster.
  • Elongated clusters: K-means imposes a fixed shape (a sphere) for each cluster.

SLIDE 38

K-means Clustering: Weaknesses

Very sensitive to the choice of the number of clusters K and to the initialization (see the mldemos example).

SLIDE 39

K-means: Limitations

K-means would not be able to reject outliers.

(Figure: axes x1, x2, x3; outliers (noise) versus relevant data.)

SLIDE 40

K-means: Limitations

K-means would not be able to reject outliers: it assigns all datapoints to a cluster, so outliers get assigned to the closest cluster.

DBSCAN can determine outliers and can generate non-globular clusters.

SLIDE 41

Density Based Spatial Clustering of Applications with Noise (DBSCAN)

(Figure: axes x1, x2, x3; a neighborhood of radius ε around a candidate outlier (noise).)

1. Pick a datapoint at random.
2. Compute the number of datapoints within ε.
3. If this number is < m_data, mark the datapoint as an outlier.
4. Go back to 1.

SLIDE 42

Density Based Spatial Clustering of Applications with Noise (DBSCAN)

(Figure: axes x1, x2, x3; outliers (noise) and Cluster 1.)

1. Pick a datapoint at random.
2. Compute the number of datapoints within ε.
3. Assign each datapoint found to the same cluster.
4. Go back to 1.

SLIDE 43

Density Based Spatial Clustering of Applications with Noise (DBSCAN)

(Figure: axes x1, x2, x3; outliers (noise); Cluster 1 and Cluster 2 merge into Cluster 1.)

1. Pick a datapoint at random.
2. Compute the number of datapoints within ε.
3. Assign each datapoint found to the same cluster.
4. Merge two clusters if the distance between them is < ε.

SLIDE 44

Density Based Spatial Clustering of Applications with Noise (DBSCAN)

(Figure: axes x1, x2, x3; outliers (noise); Cluster 1 and Cluster 2.)

Hyperparameters:

  • ε: size of the neighborhood
  • m_data: minimum number of datapoints
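For reference, scikit-learn's DBSCAN exposes these two hyperparameters as eps and min_samples; a brief usage sketch on made-up data (the values chosen here are arbitrary):

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.random.default_rng(0).normal(size=(200, 2))  # hypothetical 2-D dataset
db = DBSCAN(eps=0.3, min_samples=5).fit(X)          # eps ~ epsilon, min_samples ~ m_data
# DBSCAN labels outliers (noise) with -1; all other labels are cluster indices.
print("clusters:", sorted(set(db.labels_) - {-1}), "| outliers:", int(np.sum(db.labels_ == -1)))
```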

SLIDE 45

Comparison: K-means / DBSCAN

                      K-means                  DBSCAN
Hyperparameters       K: number of clusters    ε: neighborhood size; m_data: min. number of datapoints
Computational cost    O(K·M)                   O(M·log(M)), with M the number of datapoints
Type of cluster       Globular                 Non-globular (arbitrary shapes, non-linear boundaries)
Robustness to noise   Not robust               Robust to outliers within ε

K-means is computationally cheap. However, it is not robust to noise and produces only globular clusters. DBSCAN is computationally more intensive, but it can automatically detect noise and produces clusters of arbitrary shape. Both K-means and DBSCAN depend on a good choice of the hyperparameters. → To determine the hyperparameters, use evaluation methods for clustering (next).

SLIDE 46

Evaluation of Clustering Methods

Clustering methods rely on hyperparameters:
  • number of clusters, elements in the cluster, distance metric
→ We need to determine the goodness of these choices.

Clustering is unsupervised classification:
→ We do not know the real number of clusters nor the data labels.
→ It is difficult to evaluate these choices without ground truth.

SLIDE 47

Evaluation of Clustering Methods

Two types of measures: internal versus external measures.

Internal measures rely on measures of similarity:

  • (low) intra-cluster distance versus (high) inter-cluster distance
  • Internal measures are problematic, as the metric of similarity is often already optimized by the clustering algorithm.

External measures rely on ground truth (class labels):

  • Given a (sub)set of known class labels, compute the similarity of the clusters to the class labels.
  • In real-world data, it is hard or infeasible to gather ground truth.

SLIDE 48

Internal Measure: RSS

The Residual Sum of Squares (RSS) is an internal measure (available in mldemos). It computes the distance (in norm-2) of each datapoint from its centroid, summed over all clusters:

RSS = Σ_{k=1}^{K} Σ_{x ∈ C_k} ||x − μ_k||²
SLIDE 49

RSS for K-Means

The goal of K-means is to find cluster centers μ_k which minimize the distortion:

RSS = Σ_{k=1}^{K} Σ_{x ∈ C_k} ||x − μ_k||²

(Figure: RSS as a measure of distortion; with K = M clusters, RSS = 0; M = 100 datapoints, N = 2 dimensions.)

By increasing K we decrease RSS. What is the optimal K such that RSS → 0?

  • RSS = 0 when K = M: one has as many clusters as datapoints!
  • However, RSS can still be used to determine an 'optimal' K by monitoring the slope of the decrease of the measure as K increases.
SLIDE 50

K-means Clustering: Examples

Procedure: run K-means, increasing the number of clusters monotonically; for each number of clusters, run K-means with several initializations and take the best run; use the RSS measure to quantify the improvement in clustering → determine a plateau.

The optimal K is at the 'elbow' of the curve.

(Figure: M = 100 datapoints, N = 2 dimensions; K = 4 clusters.)
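A sketch of this procedure, reusing the hypothetical kmeans() and kmeans_objective() helpers from the earlier slides: for each K it keeps the best of several random restarts, and the resulting curve is then inspected for an elbow or plateau.

```python
import numpy as np

def rss_curve(X, k_max, n_init=10):
    """Best-of-n_init RSS for K = 1..k_max; look for the elbow where the decrease flattens."""
    return np.array([
        min(kmeans_objective(X, *kmeans(X, K, seed=s)) for s in range(n_init))
        for K in range(1, k_max + 1)
    ])
```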

SLIDE 51

K-means with RSS: Examples

Cluster Analysis of Hedge Funds (fonds speculatifs)

[N. Das, 9th Int. Conf. on Computing in Economics and Finance, 2011]

There is no legal definition of hedge funds; they consist of a wide category of investment funds with high risk & high returns and a variety of strategies for guiding the investment.
Research question: classify the type of hedge fund based on the information provided to the client.
Data dimensions (features): asset class, size of the hedge fund, incentive fee, risk level, and liquidity of the hedge fund.

SLIDE 52

K-means with RSS: Examples

Cluster Analysis of Hedge Funds (fonds speculatifs)

[N. Das, 9th Int. Conf. on Computing in Economics and Finance, 2011]

Procedure: run K-means, increasing the number of clusters monotonically; run K-means with several initializations and take the best run; use the RSS measure to quantify the improvement in clustering → determine a plateau (the cutoff on the curve of RSS against the number of clusters K).

Optimal results are found with 7 clusters.

SLIDE 53

K-means Clustering: Examples

Which one is the 'optimal' K?

The 'elbow' or 'plateau' method for choosing the optimal K from the RSS curve can be unreliable for certain datasets (here M = 100 datapoints, N = 3 dimensions): the curve suggests both K = 2 and K = 11.

We don't know! We need an additional penalty or criterion!

SLIDE 54

Other Metrics to Evaluate Clustering Methods

AIC and BIC determine how well the model fits the dataset in a probabilistic sense (a maximum-likelihood measure). The measure is balanced by how many parameters are needed to get a good fit:

  • Akaike Information Criterion: AIC = −2 ln(L) + 2B
  • Bayesian Information Criterion: BIC = −2 ln(L) + ln(M)·B

with L: maximum likelihood of the model; B: number of free parameters; M: number of datapoints. The second term penalizes the increase in computational cost due to the number of parameters and (for BIC) the number of datapoints.

As the number of datapoints (observations) increases, BIC assigns more weight to simpler models than AIC. A low BIC implies either fewer explanatory variables, a better fit, or both.

Choosing AIC versus BIC depends on the application: is the purpose of the analysis to make predictions, or to decide which model best represents reality? AIC may have better predictive ability than BIC, but BIC finds a computationally more efficient solution.

SLIDE 55

AIC for K-Means

For the particular case of K-means, we do not have a maximum-likelihood estimate of the model:

AIC = −2 ln(L) + 2B,   with L the likelihood of the model and B the number of free parameters.

However, we can formulate a metric based on the RSS that penalizes for model complexity (the number of clusters K), conceptually following AIC:

AIC_RSS = RSS + B,   RSS = Σ_{k=1}^{K} Σ_{x ∈ C_k} ||x − μ_k||²

where B = K·N is the number of free parameters (K: number of clusters, N: number of dimensions) and acts as the weighting factor.
SLIDE 56

BIC for K-Means

For the particular case of K-means, we do not have a maximum-likelihood estimate of the model:

BIC = −2 ln(L) + ln(M)·B

However, we can formulate a metric based on the RSS that penalizes for model complexity (the number of clusters K and of datapoints M), conceptually following BIC:

BIC_RSS = RSS + ln(M)·B,   RSS = Σ_{k=1}^{K} Σ_{x ∈ C_k} ||x − μ_k||²

where B = K·N is the number of free parameters (K: number of clusters, N: number of dimensions); the weighting factor ln(M) penalizes with respect to the number of datapoints (i.e., computational complexity).
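Taking the slides' RSS-based formulation at face value (B = K·N free parameters, M datapoints), both surrogates are one-liners; the helper below is my own sketch:

```python
import numpy as np

def aic_bic_rss(rss, K, N, M):
    """RSS-based surrogates as above: AIC_RSS = RSS + B, BIC_RSS = RSS + ln(M)*B, B = K*N."""
    B = K * N
    return rss + B, rss + np.log(M) * B
```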

SLIDE 57

K-means Clustering: Examples

Procedure: run K-means, increasing the number of clusters monotonically; run K-means with several initializations and take the best run;
  • use the AIC/BIC curves to find the optimal K, which is at min(AIC) or min(BIC).

Both min(AIC) and min(BIC) give K = 2 clusters.

(Figure: M = 100 datapoints, N = 3 dimensions.)

SLIDE 58

BIC for K-Means

BIC_RSS = RSS + ln(M)·(K·N)

(Figure: M = 100 datapoints, N = 2 dimensions; K = 14 clusters.)

SLIDE 59

BIC for K-Means

BIC_RSS = RSS + ln(M)·(K·N)

(Figure: M = 100 datapoints, N = 2 dimensions; K = 4 clusters.)

SLIDE 60

AIC / BIC for DBSCAN

Compute the centroid of each cluster and apply the AIC/BIC of K-means.

        DBSCAN large ε   DBSCAN medium ε   DBSCAN small ε
RSS     43               26                0.5
BIC     42               34                78
AIC     69               51                24

SLIDE 61

AIC / BIC for DBSCAN

Compute the centroid of each cluster and apply the AIC/BIC of K-means.

        K-means   DBSCAN large ε   DBSCAN medium ε   DBSCAN small ε
RSS     51        95               59                0.6
BIC     65        118              88                331
AIC     55        102              67                93

SLIDE 62

Evaluation of Clustering Methods

Two types of measures: internal versus external measures.

External measures assume that a subset of the datapoints has class labels → semi-supervised learning. They measure how well these labeled datapoints are clustered.
→ One needs an idea of the number of existing classes and must have labeled some datapoints.
→ Interesting mainly when labeling is highly time-consuming and the data is very large (e.g., in speech recognition).

SLIDE 63

Semi-Supervised Learning

Clustering F1-Measure (careful: similar to, but not the same as, the F-measure we will see for classification!)

Tradeoff between clustering all datapoints of the same class into the same cluster and making sure that each cluster contains points of only one class:

F(C, K) = Σ_{c_i ∈ C} (|c_i| / M) · max_k F(c_i, k)
F(c_i, k) = 2 R(c_i, k) P(c_i, k) / (R(c_i, k) + P(c_i, k))
R(c_i, k) = n_ik / |c_i|,   P(c_i, k) = n_ik / |k|

with M: number of labeled datapoints; C: the set of classes c_i; K: number of clusters; n_ik: number of members of class c_i in cluster k.
SLIDE 64

Recall: the proportion of datapoints of a class correctly clustered together, R(c_i, k) = n_ik / |c_i|.
Precision: the proportion of datapoints of the same class within a cluster, P(c_i, k) = n_ik / |k|.

(Figure: Class 1 and Class 2, with labeled and unlabeled datapoints.)

R(c_1, k_1) = 2/2 = 1,   R(c_2, k_2) = 4/4 = 1
P(c_1, k_1) = 2/6,       P(c_2, k_2) = 4/6

SLIDE 65

The measure penalizes by the fraction of labeled points in each class and picks, for each class, the cluster with the maximal F1 value:

F(C, K) = (2/6)·F(c_1, k_1) + (4/6)·F(c_2, k_2) ≈ 0.7

(Figure: Class 1 and Class 2, with labeled and unlabeled datapoints.)

SLIDE 66

Summary of F1-Measure

Clustering F1-Measure (careful: similar to, but not the same as, the F-measure we will see for classification!): a tradeoff between clustering all datapoints of the same class into the same cluster and making sure that each cluster contains points of only one class.

The measure picks, for each class, the cluster with the maximal F1 value, weighted by the fraction of labeled points in each class.
Recall: the proportion of datapoints of a class correctly clustered together.
Precision: the proportion of datapoints of the same class within a cluster.

F(C, K) = Σ_{c_i ∈ C} (|c_i| / M) · max_k F(c_i, k)
F(c_i, k) = 2 R(c_i, k) P(c_i, k) / (R(c_i, k) + P(c_i, k))
R(c_i, k) = n_ik / |c_i|,   P(c_i, k) = n_ik / |k|

with M: number of labeled datapoints; C: the set of classes c_i; K: number of clusters; n_ik: number of members of class c_i in cluster k.
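A sketch of this measure over the labeled subset (my own helper, following the definitions above; it assumes labels_true and labels_pred are given for the labeled datapoints only):

```python
import numpy as np

def clustering_f1(labels_true, labels_pred):
    """F(C, K): for each class take its best-matching cluster, weighted by class size."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    M, total = len(labels_true), 0.0
    for c in np.unique(labels_true):
        in_class = labels_pred[labels_true == c]   # cluster labels of the class-c points
        best = 0.0
        for k in np.unique(in_class):
            n_ck = np.sum(in_class == k)
            R = n_ck / len(in_class)               # recall of cluster k for class c
            P = n_ck / np.sum(labels_pred == k)    # precision of cluster k for class c
            best = max(best, 2 * R * P / (R + P))
        total += len(in_class) / M * best
    return total

# For instance, clustering_f1([1, 1, 2, 2, 2, 2], [0, 0, 1, 1, 1, 1]) returns 1.0,
# since each class maps perfectly onto one cluster.
```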

SLIDE 67

Summary of Lecture

Introduced two clustering techniques: K-means and DBSCAN. Discussed their pros and cons in terms of computational time and power of representation (globular/non-globular clusters).

Introduced metrics to evaluate clustering and to help choose the hyperparameters:

  • Internal measures (RSS, AIC, BIC)
  • External measures: F1-measure (also called F-measure for clustering)

Next week: Practical on Clustering. You will compare the performance of K-means and DBSCAN on your datasets and use the internal and external measures to assess this performance and to choose the hyperparameters.

SLIDE 68

Robotic Application of Clustering Method

There is a variety of hand postures when grasping objects. How can we generate the correct hand posture on robots?

El-Khoury, S., Li, M. and Billard, A. (2013) On the Generation of a Variety of Grasps. Robotics and Autonomous Systems Journal.

SLIDE 69

Robotic Application of Clustering Method

4-DOF industrial hand (Barrett Technology); 9-DOF humanoid hand (iCub robot).
Problem: choose the points of contact and generate a feasible posture for the fingers to touch the object at the correct points and with the desired force.
Difficulty: high degrees of freedom (a large number of possible points of contact, a large number of DOFs to control).

SLIDE 70

Formulate the problem as constraint-based optimization:

Minimize the generated torques at the fingertips under constraints:

  • Force closure
  • Kinematic feasibility
  • Collision avoidance

Non-convex optimization → yields several local / feasible solutions. From 1890 trials, the optimization converged to 791 feasible solutions in one setup (~2.65 s per solution) and to 612 feasible solutions in the other (~12.14 s per solution). This took too long for a realistic application.

SLIDE 71

Apply K-means on all solutions and group them into clusters

(Figures: solutions grouped into 11 clusters and into 20 clusters.)

SLIDE 72

  • A. Shukla and A. Billard, NIPS 2012