SLIDE 1

On the Approximability of Information Theoretic Clustering

Ferdinando Cicalese, U. Verona; Eduardo Laber, PUC-RIO; Lucas Murtinho, PUC-RIO. POSTER 165, Pacific Ballroom

[Figure: the entropy curve H(X); both axes range from 0 to 1.]

SLIDE 2

Impurity Measures

  • Maps a vector v in R^d to a non-negative value
  • The more homogeneous v is with respect to its components, the larger the impurity
    – (1,0,0,19): small impurity; (5,5,5,5): large impurity
  • Well-known impurity measures:

$$I_{\mathrm{Ent}}(v) = \|v\|_1 \sum_{i=1}^{d} \frac{v_i}{\|v\|_1}\,\log\frac{\|v\|_1}{v_i}, \qquad I_{\mathrm{Gini}}(v) = \|v\|_1 \sum_{i=1}^{d} \frac{v_i}{\|v\|_1}\left(1-\frac{v_i}{\|v\|_1}\right)$$
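Both measures are easy to compute directly. Here is a minimal NumPy sketch (ours, not from the paper; we use the natural log, since the base only rescales I_Ent) that reproduces the intuition above:

```python
import numpy as np

def entropy_impurity(v):
    """I_Ent(v) = ||v||_1 * sum_i (v_i/||v||_1) * log(||v||_1/v_i)."""
    v = np.asarray(v, dtype=float)
    s = v.sum()
    p = v[v > 0] / s                 # terms with v_i = 0 contribute 0
    return -s * float(p @ np.log(p))

def gini_impurity(v):
    """I_Gini(v) = ||v||_1 * sum_i (v_i/||v||_1) * (1 - v_i/||v||_1)."""
    v = np.asarray(v, dtype=float)
    s = v.sum()
    p = v / s
    return s * float(p @ (1.0 - p))

print(entropy_impurity([1, 0, 0, 19]))  # concentrated vector: small impurity
print(entropy_impurity([5, 5, 5, 5]))   # homogeneous vector: large impurity
```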

SLIDE 3

Clustering with minimum impurity

Input

  • V : a set of non-negative vectors in R^d
  • I : an impurity measure
  • k : the number of clusters

Goal: partition V into k groups, P = (V^(1), …, V^(k)), so that the impurity of the partition

$$I(P) = \sum_{i=1}^{k} I\big(V^{(i)}\big)$$

is minimized, where I(V^(i)) denotes the impurity of the sum of the vectors in V^(i).
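To make the objective concrete, here is a toy computation (our own example, reusing the entropy impurity from above): grouping vectors with the same dominant component keeps each group's sum-vector skewed, and hence the partition impurity low.

```python
import numpy as np

def ent(u):
    """Entropy impurity of a sum-vector u."""
    s = u.sum()
    p = u[u > 0] / s
    return -s * float(p @ np.log(p))

def partition_impurity(P):
    """I(P) = sum over groups of the impurity of the group's sum-vector."""
    return sum(ent(np.sum(group, axis=0)) for group in P)

V = [np.array([9., 1.]), np.array([8., 2.]),
     np.array([1., 9.]), np.array([2., 8.])]
like_with_like = [[V[0], V[1]], [V[2], V[3]]]   # sums (17,3) and (3,17)
mixed          = [[V[0], V[2]], [V[1], V[3]]]   # sums (10,10) and (10,10)
print(partition_impurity(like_with_like) < partition_impurity(mixed))  # True
```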

SLIDE 4

Applications / Motivations

  • Generalizes clustering with KL-divergence
    – The Entropy impurity and the KL-divergence objective of a clustering differ by an additive term that does not depend on the partition
  • Clustering probability distributions
  • Clustering nominal attributes in decision tree / random forest construction
  • Channel quantizer design [Inf. Theory]
SLIDE 5

Our Contributions

Approximation Algorithms

  • 3-approximation for Gini in linear time (arbitrary k)
  • O(log²(min{d, k}))-approximation for Entropy in polynomial time
    – First algorithm with an approximation guarantee independent of n that makes no assumptions on the input domain
SLIDES 6-8 (builds on Slide 5; the animation adds the key ideas behind the Entropy algorithm)

  • Project the vectors to dimension k, incurring a small additive loss
  • Each cluster is pure: all of its vectors have the same largest component
  • There is a clustering with exactly one non-pure cluster and impurity O(log² d)·OPT; find this clustering in a 2-dimensional projection using dynamic programming

SLIDE 9

Our Contributions

APX-Hardness for Entropy

  • Reduction from c-gap vertex cover in cubic graphs
  • Solves an open question from [Chaudhuri and McGregor, COLT 2008] and [Ackermann et al., ECCC 2011]

SLIDES 10-11 (builds on Slide 9; the animation adds the reduction details)

  • Edges become 0/1 vectors with exactly two 1's:
    0..010…010...00
    0..000…010...01
  • Theorem. Let k'(G, k) = 3 log 3 · |E| + 6(1 − log 3) · k. Then
    – MinVertexCover ≤ k ⇒ Opt-Impurity ≤ k'(G, k)
    – MinVertexCover > ck ⇒ Opt-Impurity > c'·k'(G, k)
  • Lemma. If G is cubic and MinVertexCover ≤ k, then G decomposes into stars with either 2 or 3 edges each.
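The instance construction itself is simple. The following sketch (ours, illustrating only the "edges to vectors" step, not the gap argument) builds the clustering instance from a cubic graph:

```python
import numpy as np

def edges_to_vectors(n_vertices, edges):
    """Each edge {u, w} becomes a 0/1 vector with exactly two 1's
    (at positions u and w), so every vector has l1-norm 2."""
    vecs = []
    for u, w in edges:
        x = np.zeros(n_vertices)
        x[u] = x[w] = 1.0
        vecs.append(x)
    return vecs

# K4, the smallest cubic graph: every vertex has degree 3
K4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
V = edges_to_vectors(4, K4)   # 6 vectors in R^4, each with two 1's
```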

SLIDE 12

Our Contributions

Ratio-Greedy Algorithm

  • Built on top of the theoretical ideas
  • Promising preliminary experimental comparisons
    – much faster than a k-means-based method
    – comparable impurity

SLIDE 15

New Results on Information Theoretic Clustering

Ferdinando Cicaleseᵃ & Eduardo Laberᵇ & Lucas Murtinhoᵇ

ᵃDepartment of Computer Science, University of Verona; ᵇDepartamento de Informática, PUC-RIO

ICML | 2019

Thirty-sixth International Conference on Machine Learning

Abstract. We study the problem of optimizing the clustering of a set of vectors when the quality of the clustering is measured by the Entropy impurity measure. This is typical of situations where the items to be clustered are represented by vectors of frequency counts or probability distributions. Our results contribute to the state of the art both in terms of best known approximation guarantees and inapproximability bounds.

Problem Definition

An impurity measure $I : \mathbb{R}^d \to \mathbb{R}_+$ is a function that assigns to a vector $v$ a non-negative value $I(v)$, so that the more homogeneous $v$ is with respect to the values of its coordinates, the larger its impurity. A well-known example of an impurity measure is the Entropy impurity (aka Information Gain in the context of random forests):

$$I_{\mathrm{Ent}}(v) = \|v\|_1 \sum_{i=1}^{d} \frac{v_i}{\|v\|_1} \log \frac{\|v\|_1}{v_i}.$$

Given a collection $V$ of $n$ many $d$-dimensional vectors with non-negative values and an integer $k > 1$, the goal is to find a partition $P$ of $V$ into $k$ disjoint groups of vectors $V_1, \ldots, V_k$ so as to minimize the sum of their impurities, i.e.,

$$I_{\mathrm{Ent}}(P) = \sum_{m=1}^{k} I_{\mathrm{Ent}}\Big( \sum_{v \in V_m} v \Big). \tag{1}$$

We refer to this problem as the PARTITION WITH MINIMUM WEIGHTED IMPURITY PROBLEM (PMWIP_Ent).

Why Study this Problem?

The problem arises in many applications:

  • Clustering of datasets with nominal attributes. Since there is no natural distance between the values of the attributes, impurity measures are widely used instead of geometrically defined distances.

  • Clustering of probability distributions. Typically, the Kullback-Leibler divergence is used as a measure of distance [9]. The resulting optimization criterion is closely related (and in some cases equivalent) to minimizing the Entropy impurity.

  • Quantization of discrete memoryless channels. Here the goal is to build quantizers that maximize the mutual information between the channel input and the quantizer's output. This is also directly expressible as an instance of PMWIP_Ent [11, 15].

  • Attribute selection for decision trees/random forests. The partition of the values of an attribute during the branching phase of decision-tree construction is done by optimizing the change in impurity due to the split [4, 8].

Our Contributions

  • A simple linear-time algorithm that guarantees
    (i) an O(log(Σ_{v∈V} ||v||_1)) approximation for PMWIP_Ent;
    (ii) an O(log n + log d) approximation for the case where all vectors in V have the same ℓ1 norm.

  • A second algorithm providing an O(log²(min{k, d}))-approximation for PMWIP_Ent in polynomial time. This is the first approximation algorithm for clustering based on entropy minimization whose guarantee does not depend on n.

  • An inapproximability result showing that PMWIP_Ent is APX-hard even for the case where all vectors have the same ℓ1-norm. This result solves a problem that remained open in previous work [6, 2].

  • An experimental evaluation of a new clustering method developed on top of our theoretical tools/findings, with the aim of assessing their potential in practical applications.

Related Work

  • Theoretical results on the structure of the optimal solution. PMWIP_Ent can be solved in polynomial time when d = 2 [11]. This relies on a characterization of the optimal partition in terms of hyperplanes in R^d [7, 5, 8], which provides an O(n^d) optimal algorithm for k = 2. For unbounded dimension d, PMWIP_Ent is NP-hard even for k = 2. For k = 2, constant-factor approximation algorithms have been given for a class of impurity measures including I_Ent [13]. These algorithms do not extend to k > 2.

  • Clustering probability distributions. PMWIP_Ent is a generalization of MTC_KL [6], the problem of clustering a set of n probability distributions into k groups minimizing the total Kullback-Leibler (KL) divergence from the distributions to the centroids of their assigned groups. MTC_KL corresponds to the particular case of PMWIP_Ent where each vector in V has the same ℓ1 norm. While the optimal solutions of PMWIP_Ent and MTC_KL coincide, the problems differ in terms of approximation. In [6], an O(log n) approximation for MTC_KL is given. Some (1+ε)-approximation algorithms, with exponential worst-case time bounds, were proposed for a constrained version of MTC_KL where every element of every probability distribution lies in a bounded interval [1, 3, 14]. Using similar assumptions on the components of the input probability distributions, Jegelka et al. [10] show that Lloyd's k-means algorithm, which also has exponential worst-case time complexity, obtains an O(log k) approximation for MTC_KL.

The Dominance Algorithm DOM

Our first algorithm makes use of a simple and fast approach based on dimensionality reduction.

DOM(V, k)
1: If d < k, create k − d new components for each vector, all of them set to 0
2: Reorder the components of all vectors so that, for u = Σ_{v∈V} v, it holds that u_i ≥ u_{i+1} for i = 1, …, d − 1
3: Let e_i be the i-th standard direction, for i < k, and let e_k = 1 − Σ_{i=1}^{k−1} e_i (where 1 is the all-ones vector)
4: Project each v ∈ V onto Span({e_1, …, e_k})
5: V_i ← {v | the largest component of proj(v) is i}
6: return the partition (V_1, …, V_k)
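A minimal NumPy sketch of DOM as described above (ours, not the authors' code; empty clusters are possible and left as-is):

```python
import numpy as np

def dom(V, k):
    V = np.asarray(V, dtype=float)
    n, d = V.shape
    if d < k:                                   # step 1: pad with zeros
        V = np.hstack([V, np.zeros((n, k - d))])
        d = k
    V = V[:, np.argsort(-V.sum(axis=0))]        # step 2: heaviest columns first
    # steps 3-4: keep the first k-1 components; collapse the rest into one
    proj = np.hstack([V[:, :k - 1], V[:, k - 1:].sum(axis=1, keepdims=True)])
    labels = np.argmax(proj, axis=1)            # step 5: dominant projected comp.
    return [V[labels == i] for i in range(k)]   # step 6: the partition
```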

We have the following results regarding algorithm DOM.

  • Theorem. DOM is a linear-time O(log(Σ_{v∈V} ||v||_1))-approximation algorithm for PMWIP_Ent.

  • Remark. DOM also guarantees a 3-approximation when the Gini impurity measure is used instead of I_Ent. This result is tight in the sense that Gini minimization is APX-hard [12].

O(log²(min{d, k}))-approximation for PMWIP_Ent

  • The first step of the algorithm employs an extension of the approach introduced in [13] to reduce the dimension of the vectors in V to k, when d > k. This step incurs an O(log k) additive loss in the approximation ratio.

  • The remaining steps are based on the following results:
    (i) the existence of an optimal algorithm for d = 2 [11];
    (ii) the existence of a mapping ψ : R^d → R² such that, for a pure set of vectors B (i.e., a set of vectors sharing the same dominant component), I_Ent(Σ_{v∈B} v) = O(log d) · I_Ent(Σ_{v∈B} ψ(v));
    (iii) a structural theorem stating that there exists a partition whose impurity is within an O(log² d) factor of the optimal one and such that at most one of its groups is mixed, i.e., not pure.

  A partition of this type with low impurity is constructed using dynamic programming over the vectors obtained via the mapping ψ; this yields a pseudo-polynomial time complexity. To obtain a polynomial-time algorithm, a filtering technique similar to that used in the FPTAS for the subset-sum problem is employed.
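For ingredient (i), the structural fact behind [11, 7, 5, 8] is that for d = 2 some optimal partition consists of consecutive groups once the vectors are sorted by v_1/||v||_1. Under that assumption, a straightforward O(n²k) interval DP recovers the optimum; the sketch below is ours and is not the paper's filtered, polynomial-time DP:

```python
import numpy as np

def ent(u):
    s = u.sum()
    p = u[u > 0] / s
    return -s * float(p @ np.log(p))

def optimal_2d(V, k):
    """Interval DP for d = 2, assuming optimal clusters are consecutive
    in the ordering by v_1 / ||v||_1 (nonzero vectors, n >= k)."""
    V = sorted((np.asarray(v, float) for v in V), key=lambda v: v[0] / v.sum())
    n = len(V)
    pref = np.vstack([np.zeros(2), np.cumsum(V, axis=0)])  # prefix sums
    cost = lambda i, j: ent(pref[j] - pref[i])             # impurity of V[i:j]
    INF = float("inf")
    dp = [[INF] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for j in range(1, n + 1):                # first j vectors ...
        for m in range(1, min(k, j) + 1):    # ... split into m groups
            dp[j][m] = min(dp[t][m - 1] + cost(t, j) for t in range(m - 1, j))
    return dp[n][k]                          # minimum total impurity
```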

Inapproximability results

We reduce the c-gap problem associated with minimum vertex cover in a cubic graph G to a c′-gap problem on an instance R = (V, I_Ent, k) of PMWIP_Ent in which every vector has ℓ1-norm 2:

(a) if G has a vertex cover of size k, then R has a partition with impurity at most k′ = (|E| − 2k)(6 + 3 log 3) + (3k − |E|) · 6;
(b) if the size of a minimum vertex cover of G is larger than ck, then every partition of size k in R has impurity larger than c′k′, for some constant c′ > 1.

The correctness of item (a) relies on the following structural property of cubic graphs:

  • Proposition. If a cubic graph G = (V, E) has a minimal vertex cover with k vertices, then it is possible to decompose G into k stars such that each of them has either 2 or 3 edges.

The value of k′ above is exactly the impurity of the clustering induced by this set of k stars. The same arguments can be used to show the inapproximability of instances where all vectors have ℓ1-norm equal to any constant value, and in particular 1, i.e., the case where PMWIP_Ent corresponds to MTC_KL. Then, we have:

  • Theorem. PMWIP_Ent is APX-hard even for the case where all vectors have the same ℓ1 norm. Hence, MTC_KL is APX-hard.

Experiments

Although the focus of our research is mainly theoretical, we also designed RATIO-GREEDY, a fast and practical algorithm that relies on our theoretical results.

RATIO-GREEDY(V, k)
1: if k ≤ d then return DOM(V, k)
2: Divide V into d sets V_1, …, V_d according to the largest component
3: Sort each V_i into a list L_i of singleton clusters {v}, ordered by ratio(v) = ||v||_∞ / (||v||_1 − ||v||_∞)
4: Reduce the number of clusters from n to k by repeatedly applying the following operations:
5:   Pick a pair C, C′ of adjacent clusters in some L_i that minimizes loss(C, C′) = I_Ent(C ∪ C′) − I_Ent(C) − I_Ent(C′)
6:   Replace C, C′ with C ∪ C′
7: return the collection of resulting clusters in the d lists
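A compact sketch of RATIO-GREEDY (ours; it tracks only cluster sum-vectors, since I_Ent depends only on sums, and uses a plain scan where the paper's implementation uses a binary heap):

```python
import numpy as np

def ent(u):
    s = u.sum()
    p = u[u > 0] / s
    return -s * float(p @ np.log(p))

def ratio_greedy(V, k):
    V = np.asarray(V, dtype=float)
    n, d = V.shape
    assert k > d, "for k <= d, fall back to DOM (line 1 of the pseudocode)"
    # line 2: bucket the vectors by their largest component
    buckets = [[] for _ in range(d)]
    for v in V:
        buckets[int(np.argmax(v))].append(v)
    # line 3: sort each bucket by ratio(v) = ||v||_inf / (||v||_1 - ||v||_inf)
    def ratio(v):
        top, tot = v.max(), v.sum()
        return float("inf") if tot == top else top / (tot - top)
    lists = [sorted(b, key=ratio) for b in buckets if b]
    # lines 4-6: merge the adjacent pair with minimum impurity loss until
    # only k clusters remain (loss >= 0 by superadditivity of I_Ent)
    while sum(len(L) for L in lists) > k:
        best = None
        for L in lists:
            for j in range(len(L) - 1):
                loss = ent(L[j] + L[j + 1]) - ent(L[j]) - ent(L[j + 1])
                if best is None or loss < best[0]:
                    best = (loss, L, j)
        _, L, j = best
        L[j:j + 2] = [L[j] + L[j + 1]]
    return [c for L in lists for c in L]   # the k cluster sum-vectors
```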

Complexity and guarantee

  • RATIO-GREEDY can be implemented to run in O(n log n + nd) time, exploiting a binary heap to select the adjacent clusters in each L_i whose merge incurs the minimum loss.

  • The impurity of the partition obtained by RATIO-GREEDY is no worse than that obtained by DOM, due to the superadditivity of I_Ent; thus it inherits DOM's approximation guarantees.

  • Baseline. We compared RATIO-GREEDY with DIVISIVE CLUSTERING (DC for short), an adaptation of the k-means method proposed in [9] to solve PMWIP_Ent.

  • Datasets. We tested these methods on clustering 51,480 words from the 20NEWS corpus and 170,946 words from the RCV1 corpus, according to their distributions over 20 and 103 classes, respectively.

  • Result analysis. The figure below shows the impurities of the partitions obtained for different values of k on both datasets. DC-INIT, DC-ITER1 and DC-ALL correspond, respectively, to different points in the execution of DC: right after its initialization, after its first iteration, and at the end. The key advantage of RATIO-GREEDY is its execution time: on RCV1 with k = 2000, for example, it is 55 times faster than a single iteration of DC. Moreover, after 5 iterations of DC, RATIO-GREEDY still had a partition with lower impurity.

References

[1] M.R. Ackermann, J. Blömer. Coresets and approximate clustering for Bregman divergences. In Proc. of SODA 2009, pp. 1088–1097, 2009.
[2] M.R. Ackermann, J. Blömer, C. Scholz. Hardness and non-approximability of Bregman clustering problems. ECCC, 18:15, 2011.
[3] M.R. Ackermann, J. Blömer, C. Sohler. Clustering for metric and nonmetric distance measures. ACM Trans. Algorithms, 6(4):59:1–59:26, 2010.
[4] L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone. Classification and Regression Trees. Wadsworth, 1984.
[5] D. Burshtein, V. Della Pietra, D. Kanevsky, A. Nadas. Minimum impurity partitions. Ann. Stat., 1992.
[6] K. Chaudhuri, A. McGregor. Finding metric structure in information theoretic clustering. In Proc. of COLT 2008, pp. 391–402, 2008.
[7] P.A. Chou. Optimal partitioning for classification and regression trees. IEEE Trans. on Pattern Analysis and Mach. Int., 13(4), 1991.
[8] D. Coppersmith, S.J. Hong, J.R.M. Hosking. Partitioning nominal attributes in decision trees. Data Min. Knowl. Discov., 3(2):197–217, 1999.
[9] I.S. Dhillon, S. Mallela, R. Kumar. A divisive information-theoretic feature clustering algorithm for text classification. JMLR, 3:1265–1287, 2003.
[10] S. Jegelka, S. Sra, A. Banerjee. Approximation algorithms for Bregman co-clustering and tensor clustering. CoRR, abs/0812.0389, 2008.
[11] B.M. Kurkoski, H. Yagi. Quantization of binary-input discrete memoryless channels. IEEE Trans. Inf. Th., 60(8):4544–4552, 2014.
[12] E.S. Laber, L. Murtinho. Minimization of Gini impurity: NP-completeness and approximation algorithms via connections with the k-means problem. In Proc. of LAGOS, 2019. To appear.
[13] E. Laber, M. Molinaro, F. Mello. Binary partitions with approximate minimum impurity. In Proc. of ICML 2018, vol. 80 of Proc. MLR, pp. 2854–2862, 2018.
[14] M. Lucic, O. Bachem, A. Krause. Strong coresets for hard and soft Bregman clustering with applications to exponential family mixtures. In Proc. of Machine Learning Res., pp. 1–9, 2016.
[15] U. Pereg, I. Tal. Channel upgradation for non-binary input alphabets and MACs. IEEE Trans. Inf. Th., 63(3):1410–1424, 2017.

See you tonight! Pacific Ballroom #165