
CSCE 478/878 Lecture 8: Clustering (Stephen Scott, sscott@cse.unl.edu) - PDF document



Introduction

If no label information is available, we can still perform unsupervised
learning: instead of a label prediction function, we look for structural
information about the instance space. Approaches include density
estimation, clustering, and dimensionality reduction. Clustering
algorithms group similar instances together based on a similarity measure.

[Figure: unlabeled points in the (x1, x2) plane are fed to a clustering
algorithm, which groups them into clusters.]

Outline

  - Clustering background
  - Similarity/dissimilarity measures: point-point, point-set, set-set
  - k-means clustering
  - Hierarchical clustering

Clustering Background

Goal: place patterns into "sensible" clusters that reveal similarities and
differences. The definition of "sensible" depends on the application; for
example, animals can be sensibly clustered by (a) how they bear young,
(b) existence of lungs, (c) environment, or (d) both (a) and (b).

Types of clustering problems:

  - Hard (crisp): partition the data into non-overlapping clusters; each
    instance belongs to exactly one cluster.
  - Fuzzy: each instance can be a member of multiple clusters, with a
    real-valued function indicating its degree of membership.
  - Hierarchical: partition the instances into numerous small clusters,
    then group the clusters into larger ones, and so on (applicable to
    phylogeny); the result is a tree with the instances at the leaves.

(Dis-)similarity Measures: Between Instances

Dissimilarity measure (DM): the weighted L_p norm

    $L_p(x, y) = \left( \sum_{i=1}^{n} w_i \, |x_i - y_i|^p \right)^{1/p}$

Special cases include the weighted Euclidean distance (p = 2), the
weighted Manhattan distance

    $L_1(x, y) = \sum_{i=1}^{n} w_i \, |x_i - y_i|$,

and the weighted L_infinity norm

    $L_\infty(x, y) = \max_{1 \le i \le n} \{ w_i \, |x_i - y_i| \}$.

Similarity measure (SM): the dot product between two vectors (a kernel).
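A minimal NumPy sketch of the weighted norms above; the example vectors
and the uniform weights are illustrative choices, not from the slides:

    import numpy as np

    def weighted_lp(x, y, w, p):
        """Weighted L_p dissimilarity between instances x and y."""
        return float(np.sum(w * np.abs(x - y) ** p) ** (1.0 / p))

    def weighted_linf(x, y, w):
        """Weighted L_infinity dissimilarity: largest weighted coordinate gap."""
        return float(np.max(w * np.abs(x - y)))

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([2.0, 0.0, 3.0])
    w = np.ones(3)                        # uniform weights

    print(weighted_lp(x, y, w, p=1))      # weighted Manhattan: 3.0
    print(weighted_lp(x, y, w, p=2))      # weighted Euclidean: ~2.236
    print(weighted_linf(x, y, w))         # weighted L_infinity: 2.0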
(Dis-)similarity Measures: Between Instances (cont'd)

If attributes come from {0, ..., k - 1}, we can use the measures for
real-valued attributes, plus:

  - Hamming distance: a DM measuring the number of places where x and y
    differ.
  - Tanimoto measure: an SM measuring the number of places where x and y
    are the same, divided by the total number of places, ignoring places i
    where x_i = y_i = 0. Useful for ordinal features where x_i is the
    degree to which x possesses the ith feature.

A small sketch of both measures appears below.
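This sketch follows the slide's wording for the Tanimoto measure (matching
places divided by places considered, skipping places where both values are
0); the helper names and example vectors are my own:

    import numpy as np

    def hamming(x, y):
        """Hamming distance (a DM): number of places where x and y differ."""
        return int(np.sum(x != y))

    def tanimoto(x, y):
        """Tanimoto measure (an SM): matching places / places considered,
        ignoring places where both vectors are 0."""
        considered = (x != 0) | (y != 0)   # drop places i where x_i = y_i = 0
        if not np.any(considered):
            return 0.0
        return float(np.sum((x == y) & considered) / np.sum(considered))

    x = np.array([1, 0, 2, 2])
    y = np.array([1, 0, 0, 2])
    print(hamming(x, y))    # 1
    print(tanimoto(x, y))   # 2 matches in 3 considered places -> 0.666...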
(Dis-)similarity Measures: Between Instance and Set

We may want to measure the proximity of a point x to an existing cluster
C. The proximity α can be measured either by using all points of C or by
using a representative of C. If all points of C are used, common choices
are

    $\alpha^{ps}_{\max}(x, C) = \max_{y \in C} \{ \alpha(x, y) \}$

    $\alpha^{ps}_{\min}(x, C) = \min_{y \in C} \{ \alpha(x, y) \}$

    $\alpha^{ps}_{avg}(x, C) = \frac{1}{|C|} \sum_{y \in C} \alpha(x, y)$,

where α(x, y) is any measure between x and y.

Alternatively, measure the distance between the point x and a
representative of the cluster C. Common representatives:

  - Mean vector: $m_p = \frac{1}{|C|} \sum_{y \in C} y$
  - Mean center $m_c \in C$:
      $\sum_{y \in C} d(m_c, y) \le \sum_{y \in C} d(z, y) \;\; \forall z \in C$,
    where d(·, ·) is a DM (if an SM is used, reverse the inequality).
  - Median center: for each point y in C, find the median dissimilarity
    from y to all other points of C, then take the minimum; i.e.,
    $m_{med} \in C$ satisfies
      $\mathrm{med}_{y \in C} \{ d(m_{med}, y) \} \le \mathrm{med}_{y \in C} \{ d(z, y) \} \;\; \forall z \in C$.

We can then measure the proximity between C's representative and x with
the standard point-point measures.

(Dis-)similarity Measures: Between Sets

Given sets of instances C_i and C_j and a proximity measure α(·, ·):

  - Max: $\alpha^{ss}_{\max}(C_i, C_j) = \max_{x \in C_i,\, y \in C_j} \{ \alpha(x, y) \}$
  - Min: $\alpha^{ss}_{\min}(C_i, C_j) = \min_{x \in C_i,\, y \in C_j} \{ \alpha(x, y) \}$
  - Average: $\alpha^{ss}_{avg}(C_i, C_j) = \frac{1}{|C_i|\,|C_j|} \sum_{x \in C_i} \sum_{y \in C_j} \alpha(x, y)$
  - Representative (mean): $\alpha^{ss}_{mean}(C_i, C_j) = \alpha(m_{C_i}, m_{C_j})$,
    where m_{C_i} is the mean vector of C_i.

A small sketch of these point-set and set-set measures appears below.
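This is a minimal sketch of both families of measures, assuming unweighted
Euclidean distance as the underlying point-point measure α; the names
alpha_ps and alpha_ss are mine, not from the slides:

    import numpy as np

    def alpha(x, y):
        """Underlying point-point measure: unweighted Euclidean distance (a DM)."""
        return float(np.linalg.norm(x - y))

    def alpha_ps(x, C, kind="avg"):
        """Point-set proximity of point x to cluster C (a list of vectors)."""
        vals = [alpha(x, y) for y in C]
        return {"max": max, "min": min, "avg": np.mean}[kind](vals)

    def alpha_ss(Ci, Cj, kind="avg"):
        """Set-set proximity between clusters Ci and Cj."""
        if kind == "mean":  # compare the clusters' mean vectors
            return alpha(np.mean(Ci, axis=0), np.mean(Cj, axis=0))
        vals = [alpha(x, y) for x in Ci for y in Cj]
        return {"max": max, "min": min, "avg": np.mean}[kind](vals)

With a dissimilarity measure as the underlying α, the min and max set-set
variants correspond to the single-link and complete-link distances
commonly used in hierarchical clustering.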
k-Means Clustering

k-means is a very popular clustering algorithm. It represents cluster i
(out of k total) by specifying its representative m_i, which is not
necessarily part of the original set of instances X. Each instance x in X
is assigned to the cluster with the nearest representative. The goal is to
find a set of k representatives such that the sum of distances between
instances and their representatives is minimized; finding such a set is
NP-hard (intractable) in general. We therefore use an algorithm that
alternates between determining representatives and assigning clusters
until convergence, in the style of the EM algorithm (a runnable sketch
follows the pseudocode below).

k-Means Clustering Algorithm

  1. Choose a value for the parameter k.
  2. Initialize k arbitrary representatives m_1, ..., m_k, e.g., k
     randomly selected instances from X.
  3. Repeat until the representatives m_1, ..., m_k don't change:
     a. For all x in X, assign x to the cluster C_j such that
        $\| x - m_j \|$ (or another measure) is minimized, i.e., the
        cluster with the nearest representative.
     b. For each j in {1, ..., k}, set the representative to the cluster
        mean:

          $m_j = \frac{1}{|C_j|} \sum_{y \in C_j} y$
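A minimal NumPy sketch of the loop above, assuming Euclidean distance and
random initialization from X; the empty-cluster handling and the
convergence test via np.allclose are my choices, since the slides leave
them unspecified:

    import numpy as np

    def k_means(X, k, seed=None):
        """Cluster the rows of X into k clusters.

        Returns (representatives m_1..m_k, cluster index per instance)."""
        rng = np.random.default_rng(seed)
        # Initialize the k representatives as randomly selected instances of X.
        m = X[rng.choice(len(X), size=k, replace=False)].copy()
        while True:
            # Assignment step: each instance joins its nearest representative.
            dists = np.linalg.norm(X[:, None, :] - m[None, :, :], axis=2)
            assign = dists.argmin(axis=1)
            # Update step: each representative becomes its cluster's mean vector.
            new_m = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                              else m[j]  # empty cluster: keep old representative
                              for j in range(k)])
            if np.allclose(new_m, m):  # representatives no longer change
                return new_m, assign
            m = new_m

    # Example: two well-separated blobs are recovered as two clusters.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(5.0, 0.5, (20, 2))])
    reps, labels = k_means(X, k=2, seed=1)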