SLIDE 1 Integrating Constraints and Metric Learning in Semi-Supervised Clustering
- M. Bilenko, S. Basu, R. J. Mooney
Presenter: Lei Tang, Arizona State University
Machine Learning Seminar
SLIDE 2
1 Introduction
2 Formulation
  K-means
  Integrating Constraints and Metric Learning
3 MPCK-Means Algorithm
  Initialization
  E-step
  M-step
4 Experiment Results
SLIDE 3
Semi-supervised Clustering
1 Constraint-based methods
Seeded KMeans and Constrained KMeans are given partial label information; COP-KMeans is given pairwise constraints (must-link, cannot-link).
2 Metric-based methods
Learn a metric that satisfies the constraints, so that points in the same cluster move closer together while points in different clusters move further apart.
Limitations: previous metric-learning approaches exclude unlabeled data during metric training, and a single distance metric is used for all clusters, forcing them to have the same shape.
SLIDE 4 Constraint-based Method
K-means clustering: minimize
$\sum_{x_i \in X} \|x_i - \mu_{l_i}\|^2$
Semi-supervised clustering with constraints: minimize
$\sum_{x_i \in X} \|x_i - \mu_{l_i}\|^2 + \sum_{(x_i, x_j) \in M} w_{ij}\,\mathbb{1}[l_i \neq l_j] + \sum_{(x_i, x_j) \in C} \bar{w}_{ij}\,\mathbb{1}[l_i = l_j]$
where $M$ and $C$ are the sets of must-link and cannot-link constraints and $w_{ij}$, $\bar{w}_{ij}$ are the violation costs. (A small sketch of this objective follows below.)
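As a concrete illustration (my sketch, not the authors' code), a minimal numpy version of this penalized objective, assuming points are rows of `X` and each constraint is an `(i, j, weight)` index triple:

```python
import numpy as np

def pckmeans_objective(X, labels, centroids, must, cannot):
    """Constrained K-means objective: distortion plus weighted
    penalties for violated must-link / cannot-link constraints."""
    # Squared-distance term: each point to its assigned centroid.
    J = sum(np.sum((X[i] - centroids[labels[i]]) ** 2) for i in range(len(X)))
    # Penalize must-links whose endpoints landed in different clusters.
    J += sum(w for i, j, w in must if labels[i] != labels[j])
    # Penalize cannot-links whose endpoints landed in the same cluster.
    J += sum(w for i, j, w in cannot if labels[i] == labels[j])
    return J
```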
SLIDE 5 Metric-based Method
Euclidean distance: $\|x_i - x_j\| = \sqrt{(x_i - x_j)^T (x_i - x_j)}$
Mahalanobis distance: $\|x_i - x_j\|_A = \sqrt{(x_i - x_j)^T A (x_i - x_j)}$, where $A \succeq 0$.
If a metric $A$ is used to calculate distances, then each cluster is modeled as a multivariate Gaussian distribution with covariance $A^{-1}$. (A one-line numpy version follows below.)
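As a quick illustration (mine, not from the paper), the parameterized distance in numpy:

```python
import numpy as np

def mahalanobis_sq(xi, xj, A):
    """Squared Mahalanobis distance (xi - xj)^T A (xi - xj).
    A must be positive (semi-)definite; A = I recovers squared Euclidean."""
    d = xi - xj
    return float(d @ A @ d)
```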
SLIDE 6 Clustering with Different Shapes
What if the shapes of the clusters differ? Use a different $A_h$ for each cluster (i.e., assign each cluster its own covariance). Maximizing the likelihood then boils down to: minimize
$\sum_{x_i \in X} \left( \|x_i - \mu_{l_i}\|^2_{A_{l_i}} - \log(\det A_{l_i}) \right)$
SLIDE 7 Combining Constraints and Metric Learning
Minimize
$\sum_{x_i \in X} \left[ \|x_i - \mu_{l_i}\|^2_{A_{l_i}} - \log(\det A_{l_i}) \right] + \sum_{(x_i, x_j) \in M} w_{ij}\,\mathbb{1}[l_i \neq l_j] + \sum_{(x_i, x_j) \in C} \bar{w}_{ij}\,\mathbb{1}[l_i = l_j]$
Intuitively, the penalties $w_{ij}$ and $\bar{w}_{ij}$ should be based on the distance between the two points. Minimize
$\sum_{x_i \in X} \left[ \|x_i - \mu_{l_i}\|^2_{A_{l_i}} - \log(\det A_{l_i}) \right] + \sum_{(x_i, x_j) \in M} w_{ij} f_M(x_i, x_j)\,\mathbb{1}[l_i \neq l_j] + \sum_{(x_i, x_j) \in C} \bar{w}_{ij} f_C(x_i, x_j)\,\mathbb{1}[l_i = l_j]$
SLIDE 8 Penalty Based on Distance
Must-link: a violation means the two points were assigned to different clusters.
$f_M(x_i, x_j) = \frac{1}{2}\left( \|x_i - x_j\|^2_{A_{l_i}} + \|x_i - x_j\|^2_{A_{l_j}} \right)$
The further apart the two points are, the greater the penalty.
Cannot-link: a violation means the two points were assigned to the same cluster.
$f_C(x_i, x_j) = \|x'_{l_i} - x''_{l_i}\|^2_{A_{l_i}} - \|x_i - x_j\|^2_{A_{l_i}}$
where $(x'_{l_i}, x''_{l_i})$ is the maximally separated pair of points under metric $A_{l_i}$.
The closer the two points are, the greater the penalty. (A sketch of both penalties and the full objective follows below.)
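A hedged numpy sketch of the two penalties and the resulting objective. `far_pair[h]`, caching the maximally separated pair $(x'_h, x''_h)$ under $A_h$, is my own bookkeeping, not notation from the paper:

```python
import numpy as np

def d2(x, y, A):
    """Squared Mahalanobis distance under metric A."""
    v = x - y
    return float(v @ A @ v)

def f_M(xi, xj, A, li, lj):
    """Must-link penalty: average of the pair's squared distances
    under both clusters' metrics."""
    return 0.5 * (d2(xi, xj, A[li]) + d2(xi, xj, A[lj]))

def f_C(xi, xj, A, li, far_pair):
    """Cannot-link penalty: maximum separation achievable under A[li]
    minus the pair's distance, so closer pairs are penalized more."""
    xp, xpp = far_pair[li]
    return d2(xp, xpp, A[li]) - d2(xi, xj, A[li])

def mpck_objective(X, labels, mu, A, far_pair, must, cannot):
    """Full objective: distortion - log det + constraint penalties."""
    J = sum(d2(X[i], mu[labels[i]], A[labels[i]])
            - np.log(np.linalg.det(A[labels[i]]))
            for i in range(len(X)))
    J += sum(w * f_M(X[i], X[j], A, labels[i], labels[j])
             for i, j, w in must if labels[i] != labels[j])
    J += sum(w * f_C(X[i], X[j], A, labels[i], far_pair)
             for i, j, w in cannot if labels[i] == labels[j])
    return J
```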
SLIDE 9
Metric Pairwise Constrained K-Means (MPCK-Means)
General framework of the MPCK-Means algorithm, based on EM:
Initialize the clusters. Repeat until convergence: assign points to clusters so as to minimize the objective (E-step); estimate the means and update the metrics (M-step).
Differences from K-means: cluster assignment takes the constraints into account, and the metric is updated in every round. A stripped-down sketch of the loop is given below.
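A deliberately stripped-down, runnable sketch of just the loop structure; the constraint handling and the metric update are elided here (they are detailed on the next slides), and random seeding stands in for the constraint-based initialization:

```python
import numpy as np

def mpck_means_skeleton(X, K, max_iter=100, seed=0):
    """Bare EM loop of MPCK-Means with constraints and metric updates
    elided. Assignment uses each cluster's current Mahalanobis metric."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), K, replace=False)]   # stand-in for constraint-based seeding
    A = np.stack([np.eye(X.shape[1])] * K)         # one metric per cluster, start at identity
    for _ in range(max_iter):
        # E-step: assign each point to the cluster minimizing its objective term.
        diff = X[:, None, :] - mu                  # (n, K, d) point-to-centroid differences
        d = np.einsum('nkd,kde,nke->nk', diff, A, diff)
        d -= np.linalg.slogdet(A)[1]               # include the -log det(A_h) term
        labels = d.argmin(axis=1)
        # M-step: re-estimate means (metric update omitted; see later slide).
        new_mu = np.stack([X[labels == h].mean(axis=0) if np.any(labels == h) else mu[h]
                           for h in range(K)])
        if np.allclose(new_mu, mu):                # centroids stabilized
            break
        mu = new_mu
    return labels, mu, A
```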
SLIDE 10
Initialization
Basic idea: construct the transitive closure of the must-link constraints; choose the mean of each resulting component as a seed; extend the sets of must-link and cannot-link constraints accordingly.
Example: Must-link: {AB, BC, DE}; Cannot-link: {BE}. The closure yields components {A, B, C} and {D, E}, adds the must-link AC, and extends the cannot-link to every pair across the two components: {AD, AE, BD, BE, CD, CE}. (A sketch of this construction follows below.)
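A sketch of the neighborhood construction using union-find; it assumes points are indexed 0..n-1 and constraints are index pairs (the representation is mine):

```python
def mustlink_closure(n, must, cannot):
    """Take the transitive closure of must-links with union-find, then
    extend the cannot-link set across the resulting neighborhoods."""
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i
    for i, j in must:
        parent[find(i)] = find(j)
    # Group points into must-link neighborhoods (connected components).
    comps = {}
    for i in range(n):
        comps.setdefault(find(i), []).append(i)
    # A cannot-link between two neighborhoods separates every cross pair.
    extended_cannot = set()
    for i, j in cannot:
        for a in comps[find(i)]:
            for b in comps[find(j)]:
                extended_cannot.add((min(a, b), max(a, b)))
    return list(comps.values()), extended_cannot
```

With A..E mapped to indices 0..4, `mustlink_closure(5, [(0, 1), (1, 2), (3, 4)], [(1, 4)])` returns the two neighborhoods and the six extended cannot-links; the mean of each neighborhood then serves as a cluster seed.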
SLIDE 11 Cluster Assignment
1 Randomly re-order the data points.
2 Assign each data point to the cluster that minimizes the objective function (a per-point sketch follows below): minimize
$J = \sum_{x_i \in X} \left[ \|x_i - \mu_{l_i}\|^2_{A_{l_i}} - \log(\det A_{l_i}) \right] + \sum_{(x_i, x_j) \in M} w_{ij} f_M(x_i, x_j)\,\mathbb{1}[l_i \neq l_j] + \sum_{(x_i, x_j) \in C} \bar{w}_{ij} f_C(x_i, x_j)\,\mathbb{1}[l_i = l_j]$
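A sketch of the greedy per-point step, holding all other labels fixed; `far_pair[h]` again denotes the cached maximally separated pair under $A_h$ (my notation):

```python
import numpy as np

def assign_point(i, X, labels, mu, A, must, cannot, far_pair):
    """Greedily assign point i to the cluster h minimizing its contribution
    to the objective, given the current labels of all other points."""
    best_h, best_cost = 0, np.inf
    for h in range(len(mu)):
        diff = X[i] - mu[h]
        cost = diff @ A[h] @ diff - np.log(np.linalg.det(A[h]))
        for a, b, w in must:                      # must-links violated by choosing h
            j = b if a == i else (a if b == i else None)
            if j is not None and labels[j] != h:
                dij = X[i] - X[j]
                cost += w * 0.5 * (dij @ A[h] @ dij + dij @ A[labels[j]] @ dij)
        for a, b, w in cannot:                    # cannot-links violated by choosing h
            j = b if a == i else (a if b == i else None)
            if j is not None and labels[j] == h:
                xp, xpp = far_pair[h]
                dpp, dij = xp - xpp, X[i] - X[j]
                cost += w * (dpp @ A[h] @ dpp - dij @ A[h] @ dij)
        if cost < best_cost:
            best_h, best_cost = h, cost
    return best_h
```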
SLIDE 12 Updating the Metric
1 Update the centroid of each cluster.
2 Update the distance metric of each cluster: take the derivative of the objective with respect to $A_h$, set it to zero, and solve for the new metric (a sketch of this update follows below):
$A_h = |X_h| \Big( \sum_{x_i \in X_h} (x_i - \mu_h)(x_i - \mu_h)^T + \sum_{(x_i, x_j) \in M_h} \tfrac{1}{2} w_{ij} (x_i - x_j)(x_i - x_j)^T\,\mathbb{1}[l_i \neq l_j] + \sum_{(x_i, x_j) \in C_h} \bar{w}_{ij} \big[ (x'_h - x''_h)(x'_h - x''_h)^T - (x_i - x_j)(x_i - x_j)^T \big]\,\mathbb{1}[l_i = l_j] \Big)^{-1}$
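A numpy sketch of this update for one cluster h, with the same hypothetical `far_pair` cache as before (singularity handling is deferred to the next slide):

```python
import numpy as np

def update_metric(h, X, labels, mu, must, cannot, far_pair):
    """M-step metric update for cluster h: invert the weighted sum of
    within-cluster scatter and constraint-violation outer products."""
    idx = np.where(labels == h)[0]
    D = X[idx] - mu[h]
    S = D.T @ D                                   # within-cluster scatter
    for i, j, w in must:                          # violated must-links touching h
        if (labels[i] == h or labels[j] == h) and labels[i] != labels[j]:
            d = X[i] - X[j]
            S += 0.5 * w * np.outer(d, d)
    xp, xpp = far_pair[h]
    dpp = xp - xpp
    for i, j, w in cannot:                        # violated cannot-links inside h
        if labels[i] == h and labels[j] == h:
            d = X[i] - X[j]
            S += w * (np.outer(dpp, dpp) - np.outer(d, d))
    return len(idx) * np.linalg.inv(S)            # A_h = |X_h| * S^{-1}
```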
SLIDE 13
Some Issues
1 Singularity: if the sum is singular, set $A_h^{-1} = A_h^{-1} + \epsilon\,\mathrm{tr}(A_h^{-1})\,I$ to ensure nonsingularity.
2 Positive semi-definiteness: if $A_h$ is not positive semi-definite, project it onto the set $C = \{A : A \succeq 0\}$ by setting its negative eigenvalues to 0.
3 Computational cost: use diagonal matrices, or the same distance metric for all clusters.
4 Convergence: in theory each step reduces the objective, but once the singularity and positive semi-definiteness fixes kick in, convergence is no longer guaranteed. In practice, however, the algorithm converges reliably. (Sketches of the first two fixes follow below.)
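Minimal numpy sketches of fixes 1 and 2; the helper names and the rank check are my choices, not the paper's:

```python
import numpy as np

def condition_inverse(S, eps=0.01):
    """If the scatter sum S (i.e. A_h^{-1} up to the |X_h| factor) is
    singular, condition it with a trace-scaled identity before inverting."""
    if np.linalg.matrix_rank(S) < S.shape[0]:
        S = S + eps * np.trace(S) * np.eye(S.shape[0])
    return S

def project_psd(A):
    """Project symmetric A onto {A : A >= 0} by clipping negative eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(A)
    return eigvecs @ np.diag(np.clip(eigvals, 0.0, None)) @ eigvecs.T
```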
SLIDE 14
Experiment Results (1)
A single diagonal matrix is used.
SLIDE 15
Experiment Results (2)
A single diagonal matrix is compared with multiple full matrices. Some observations: using a different metric for each cluster and using full matrices clearly improves performance. When the constraints are few, RCA seems to work better than MPCK-Means. Why?
SLIDE 16 Conclusions
By integrating metric learning and constraints during clustering, the combined approach outperforms either technique alone.
Questions? Thank you!