

SLIDE 1

Clustering algorithms

Machine Learning Hamid Beigy

Sharif University of Technology

Fall 1393

Hamid Beigy (Sharif University of Technology) Clustering algorithms Fall 1393 1 / 22

SLIDE 2

Table of contents

1. Supervised & unsupervised learning
2. Clustering
3. Hierarchical clustering
4. Non-hierarchical clustering


SLIDE 3

Supervised & unsupervised learning

The learning methods covered in class up to this point have focused on classification and regression.

An example consists of a pair of variables (x, t), where x is a feature vector and t is the label/value. Such learning problems are called supervised, since the system is given both the feature vector and the correct answer.

We will investigate methods that operate on unlabeled data.

Given a collection of feature vectors X = {x1, x2, . . . , xN} without labels/values ti, these methods attempt to build a model that captures the structure of the data. These methods are called unsupervised since they are not provided with the correct answer.

Although unsupervised learning methods may appear to have limited capabilities, there are several reasons that make them useful:

- Labeling large data sets can be a costly procedure, but raw data is cheap.
- Class labels may not be known beforehand.
- Large datasets can be compressed by finding a small set of prototypes.
- One can train with a large amount of unlabeled data, and then use supervision to label the groupings found.
- Unsupervised methods can be used for feature extraction.
- Exploratory data analysis can provide insight into the nature or structure of the data.


SLIDE 4

Unsupervised Learning

Unsupervised learning algorithms

- Non-parametric methods: make no assumption about the underlying densities; instead we seek a partition of the data into clusters.
- Parametric methods: model the underlying class-conditional densities with a mixture of parametric densities; the objective is to find the model parameters:

p(x|θ) = Σ_i p(x|ω_i, θ_i) P(ω_i)

Examples of unsupervised learning:

- Dimensionality reduction
- Latent variable learning
- Clustering

A cluster is a number of similar objects collected or grouped together. A clustering algorithm partitions examples into groups when no labels are available. Clusters are connected regions of a multidimensional space containing a relatively high density of points, separated from other such regions by regions containing a relatively low density of points. Sample applications: novelty detection and outlier detection.


SLIDE 5

Applications of Clustering

- Clustering retrieved documents to present more organized and understandable results to the user ("diversified retrieval")
- Detecting near-duplicates, e.g. entity resolution
- Exploratory data analysis
- Automated (or semi-automated) creation of taxonomies
- Comparison


SLIDE 6

Why do Unsupervised Learning?

Clustering is a very difficult problem because data can reveal clusters with different shapes and sizes. How many clusters do you see in the above figure?


SLIDE 7

Why do Unsupervised Learning? (cont.)

How many clusters do you see in the figure?


SLIDE 8

Why do Unsupervised Learning? (cont.)

How many clusters do you see in the figure?


SLIDE 9

Why do Unsupervised Learning? (cont.)

How many clusters do you see in the figure?


SLIDE 10

Why do Unsupervised Learning? (cont.)

How many clusters do you see in the figure?


SLIDE 11

Clustering

Clustering algorithms can be divided into several groups

- Exclusive (each pattern belongs to only one cluster) vs. non-exclusive (each pattern can be assigned to several clusters).
- Hierarchical (a nested sequence of partitions) vs. partitional (a single partition).

Clustering algorithms

- Hierarchical clustering
- Centroid-based clustering
- Distribution-based clustering
- Density-based clustering
- Grid-based clustering
- Constraint clustering


SLIDE 12

Clustering

Challenges in clustering

- Selection of an appropriate measure of similarity to define clusters; this is often both data-dependent (cluster shape) and context-dependent.
- Choice of the criterion function to be optimized.
- Evaluation function.
- Optimization method.

Similarity/distance measures

- Euclidean distance (L2 norm): L2(x, y) = sqrt( Σ_{i=1}^{N} (x_i − y_i)^2 )
- L1 norm: L1(x, y) = Σ_{i=1}^{N} |x_i − y_i|
- Cosine similarity: cosine(x, y) = (x · y) / (||x|| ||y||)

Evaluation function: assigns a (usually real-valued) score to a clustering; this score is typically a function of within-cluster similarity and between-cluster dissimilarity. Optimization method: find a clustering that maximizes the criterion. This can be done by global optimization methods (often intractable), greedy search methods, or approximation algorithms.
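As a concrete illustration, the three measures can be written directly in Python (a minimal sketch; the function names are ours, not from the slides):

```python
import math

def l2(x, y):
    # Euclidean distance: sqrt of the sum of squared coordinate differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def l1(x, y):
    # L1 (Manhattan) distance: sum of absolute coordinate differences
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine(x, y):
    # Cosine similarity: dot product normalized by the two vector lengths
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) *
                  math.sqrt(sum(b * b for b in y)))
```

For example, l2((0, 0), (3, 4)) gives 5.0, and parallel vectors have cosine similarity 1.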


SLIDE 13

Hierarchical clustering

Organizes the clusters in a hierarchical way and produces a rooted tree (dendrogram).

(Example dendrogram: Animal → Vertebrate {Fish, Reptile, Amphibian, Mammal} and Invertebrate {Worm, Insect, Crustacean}.)

Recursive application of a standard clustering algorithm can produce a hierarchical clustering


SLIDE 14

Hierarchical clustering (cont.)

Types of hierarchical clustering

- Agglomerative (bottom-up): methods start with each example in its own cluster and iteratively combine them to form larger and larger clusters.
- Divisive (top-down): methods separate all examples recursively into smaller clusters.


SLIDE 15

Agglomerative (bottom up)

Assumes a similarity function for determining the similarity of two clusters. Starts with each instance in a separate cluster and repeatedly joins the two clusters that are most similar, until there is only one cluster. The history of merging forms a binary tree (hierarchy). Basic algorithm:

Start with all instances in their own clusters.
Until there is only one cluster:
  Among the current clusters, determine the two clusters c_i and c_j that are most similar.
  Replace c_i and c_j with a single cluster c_i ∪ c_j.

Cluster similarity: how do we compute the similarity of two clusters, each possibly containing multiple instances?

- Single linkage: similarity of the two most similar members.
- Complete linkage: similarity of the two least similar members.
- Group average: average similarity between members. This method uses the average similarity across all pairs within the merged cluster to measure the similarity of two clusters, and is a compromise between single and complete linkage.
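The agglomerative loop above can be sketched in Python with single linkage (a naive illustration, not the slides' code; `single_link_hac` and the `num_clusters` stopping parameter are our additions — the slides merge all the way down to one cluster):

```python
import math

def single_link_hac(points, num_clusters):
    # Start with each point in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest single-link
        # distance (distance between their two closest members).
        best = (None, None, float("inf"))
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(x, y)
                        for x in clusters[i] for y in clusters[j])
                if d < best[2]:
                    best = (i, j, d)
        i, j, _ = best
        # Replace c_i and c_j with the single cluster c_i ∪ c_j.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Each merge scans all cluster pairs, so this naive version is far from the heap-based complexity discussed later; it is meant only to make the loop structure concrete.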


SLIDE 16

Single-Link (bottom-up)

sim(c_i, c_j) = max_{x∈c_i, y∈c_j} sim(x, y)


SLIDE 17

Complete-Link (bottom-up)

sim(c_i, c_j) = min_{x∈c_i, y∈c_j} sim(x, y)


SLIDE 18

Computational Complexity of HAC

In the first iteration, all HAC methods need to compute the similarity of all pairs of the n individual instances, which is O(n^2). In each of the subsequent O(n) merging iterations, we must find the smallest-distance pair of clusters; maintaining a heap gives O(n^2 log n) overall. In each of those iterations we must also compute the distance between the most recently created cluster and all other existing clusters. Can this be done in constant time, so that the overall cost stays O(n^2 log n)?


SLIDE 19

Centroid-Based Clustering

Assumes instances are real-valued vectors. Clusters are represented by centroids (for example, the average of the points in a cluster):

µ(c) = (1/|c|) Σ_{x∈c} x

Reassignment of instances to clusters is based on distance to the current cluster centroids.

K-means algorithm:

Input: k = number of clusters, distance measure d.
Select k random instances s_1, s_2, ..., s_k as seeds.
Until the clustering converges or another stopping criterion is met:
  For each instance x_i: assign x_i to the cluster c_j such that d(x_i, s_j) is minimal.
  For each cluster c_j: update its centroid, s_j = µ(c_j).
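The two alternating steps can be sketched as follows (an illustrative implementation, not the slides' code; the names and the `max_iters` safeguard are ours):

```python
import math
import random

def k_means(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Select k random instances as initial seeds.
    centroids = rng.sample(points, k)
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda j: math.dist(x, centroids[j]))
            clusters[nearest].append(x)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: assignments are stable
            break
        centroids = new_centroids
    return centroids, clusters
```

On four points forming two tight pairs, k_means([(0, 0), (0, 1), (10, 10), (10, 11)], 2) converges to the centroids (0, 0.5) and (10, 10.5).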


SLIDE 20

Time Complexity

Assume computing the distance between two instances is O(D), where D is the dimensionality of the vectors.

- Reassigning clusters for N points: O(kN) distance computations, i.e. O(kND).
- Computing centroids: each instance is added once to some centroid: O(ND).
- Assuming these two steps are each done once in each of m iterations: O(mkND).

Problems with K-means:

- Results can vary based on random seed selection, especially for high-dimensional data.
- Some seeds can result in a poor convergence rate, or in convergence to sub-optimal clusterings.
- Sensitive to outliers.

Idea: combine HAC and K-means clustering. Convergence of K-means.


SLIDE 21

Gaussian mixture model

A mixture model is a linear combination of K densities:

p(x|θ) = Σ_{k=1}^{K} π_k N(x|µ_k, Σ_k)

The set of parameters is θ = {{π_k}, {µ_k}, {Σ_k}}. The mixing coefficients π form a discrete distribution, i.e. 0 ≤ π_k ≤ 1 and Σ_{k=1}^{K} π_k = 1.

Each component is a multivariate Gaussian:

N(x|µ_k, Σ_k) = 1 / ((2π)^{D/2} |Σ_k|^{1/2}) · exp( −(1/2) (x − µ_k)^T Σ_k^{−1} (x − µ_k) )

To generate a sample x from the mixture model: (1) sample a mixture component z ∼ π, (2) sample x ∈ R^D from the z-th component, x ∼ N(µ_z, Σ_z).

An alternative viewpoint: z is a 1-of-K binary vector, and

p(x) = Σ_z p(x|z) p(z) = Σ_{k=1}^{K} π_k N(x|µ_k, Σ_k)

The posterior distribution is

p(z_k|x) = p(x|z_k) p(z_k) / p(x) = π_k N(x|µ_k, Σ_k) / Σ_{j=1}^{K} π_j N(x|µ_j, Σ_j)
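The two-step sampling procedure can be illustrated in Python for the one-dimensional case (our simplification: scalar means and standard deviations in place of the D-dimensional µ_k and Σ_k; the function name is ours):

```python
import random

def sample_gmm(pis, mus, sigmas, rng=random):
    # 1-D sketch: pis sums to 1, mus/sigmas are scalars per component.
    # (1) sample the mixture component z ~ pi
    z = rng.choices(range(len(pis)), weights=pis)[0]
    # (2) sample x from the z-th Gaussian component, x ~ N(mu_z, sigma_z)
    return rng.gauss(mus[z], sigmas[z])
```

Passing an explicit random.Random instance as rng makes draws reproducible.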


SLIDE 22

Gaussian Mixtures and EM

Initialize π, µ, and Σ. Repeat:

E-step: evaluate the posterior probabilities

p(z_k|x_n) = π_k N(x_n|µ_k, Σ_k) / Σ_{j=1}^{K} π_j N(x_n|µ_j, Σ_j)

M-step: update the parameter values

N_k = Σ_{n=1}^{N} p(z_k|x_n)
µ_k = (1/N_k) Σ_{n=1}^{N} p(z_k|x_n) x_n
Σ_k = (1/N_k) Σ_{n=1}^{N} p(z_k|x_n) (x_n − µ_k)(x_n − µ_k)^T
π_k = N_k / N

Until convergence.
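The E- and M-steps above can be sketched for the one-dimensional case (our simplifications: scalar variances in place of covariance matrices, initial means spread over the sorted data, and a fixed iteration count instead of a convergence test):

```python
import math

def em_gmm_1d(xs, k, iters=50):
    # 1-D EM sketch for a k-component Gaussian mixture.
    n = len(xs)
    s = sorted(xs)
    pis = [1.0 / k] * k
    mus = [s[i * (n - 1) // max(k - 1, 1)] for i in range(k)]
    vars_ = [1.0] * k  # variances sigma_k^2

    def normal(x, mu, var):
        # 1-D Gaussian density N(x | mu, var)
        return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

    for _ in range(iters):
        # E-step: responsibilities p(z_k | x_n)
        gamma = []
        for x in xs:
            w = [pis[j] * normal(x, mus[j], vars_[j]) for j in range(k)]
            total = sum(w)
            gamma.append([wj / total for wj in w])
        # M-step: re-estimate mu_k, sigma_k^2, and pi_k = N_k / N
        for j in range(k):
            nk = sum(g[j] for g in gamma)
            mus[j] = sum(g[j] * x for g, x in zip(gamma, xs)) / nk
            vars_[j] = sum(g[j] * (x - mus[j]) ** 2
                           for g, x in zip(gamma, xs)) / nk
            pis[j] = nk / n
    return pis, mus, vars_
```

On two well-separated groups of points the estimated means settle near the two group centers, with the mixing coefficients summing to 1.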
