

  1. When do birds of a feather flock together? k-means, proximity, and conic programming

  Shuyang Ling
  Courant Institute of Mathematical Sciences, NYU
  ICCHA7 2018, Nashville, TN, May 14, 2018

  2. Acknowledgement

  Research in collaboration with:
  - Prof. Xiaodong Li (Statistics, UC Davis)
  - Prof. Thomas Strohmer and Yang Li (Mathematics, UC Davis)
  - Prof. Ke Wei (School of Data Sciences, Fudan University, Shanghai)

  3. k-means

  Question: Given a set of N data points in R^m, how do we partition them into k clusters?

  Criterion: minimize the k-means objective function (the within-cluster sum of squares)

  $$\min_{\{\Gamma_l\}_{l=1}^{k}} \; \sum_{l=1}^{k} \sum_{i \in \Gamma_l} \| x_i - c_l \|^2,$$

  where {Γ_l} is a partition of {1, ..., N} and c_l is the sample mean of the data points in Γ_l.
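  To make the objective concrete, here is a minimal NumPy sketch that evaluates the within-cluster sum of squares for a given partition. The function name kmeans_objective and the label-vector encoding of {Γ_l} are illustrative choices, not from the talk.

```python
import numpy as np

def kmeans_objective(points, labels, k):
    """Within-cluster sum of squares for a given partition.

    points: (N, m) array of data points in R^m.
    labels: length-N array; labels[i] = l means point i belongs to Gamma_l.
    Assumes every cluster 0, ..., k-1 is nonempty.
    """
    total = 0.0
    for l in range(k):
        cluster = points[labels == l]      # points in Gamma_l
        c_l = cluster.mean(axis=0)         # sample mean of Gamma_l
        total += np.sum((cluster - c_l) ** 2)
    return total
```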

  4. Difficulty of k-means

  Importance: k-means is widely used in vector quantization, unsupervised learning, Voronoi tessellation, etc.

  Difficulty: the problem is NP-hard, even for m = 2 [Mahajan et al. 09].

  Heuristic: Lloyd's algorithm [Lloyd 82] works well in practice, but its convergence is not always guaranteed: it may take a number of steps exponential in N to converge, and the limit is only a stationary point, not necessarily even a local minimum.
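  The assign-update structure of Lloyd's algorithm is easy to state; the sketch below is a plain, unoptimized version with random initialization, and it ignores the empty-cluster edge case. None of these choices come from the talk.

```python
import numpy as np

def lloyd(points, k, max_iters=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iters):
        # assignment step: each point goes to its nearest centroid
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # update step: each centroid becomes the mean of its assigned points
        new_centroids = np.array([points[labels == l].mean(axis=0) for l in range(k)])
        if np.allclose(new_centroids, centroids):   # reached a stationary point
            break
        centroids = new_centroids
    return labels, centroids
```

  Each iteration can only decrease the objective, which is why the method is popular despite the worst-case behavior noted above.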

  5. Convex relaxation of k-means

  Focus of the talk: we are interested in the convex relaxation of k-means [Peng, Wei 07].

  To minimize the k-means objective, it suffices to optimize over all possible choices of the partition {Γ_l}:

  $$f(\{\Gamma_l\}) := \sum_{l=1}^{k} \sum_{i \in \Gamma_l} \| x_i - c_l \|^2.$$

  6. Convex relaxation of k-means

  An equivalent form: since c_l is the sample mean of Γ_l, the objective can be rewritten over the partition alone,

  $$f(\{\Gamma_l\}_{l=1}^{k}) = \sum_{l=1}^{k} \sum_{i \in \Gamma_l} \| x_i - c_l \|^2 = \sum_{l=1}^{k} \frac{1}{2|\Gamma_l|} \sum_{i \in \Gamma_l,\, j \in \Gamma_l} \| x_i - x_j \|^2,$$

  which is the (normalized) sum of the squared pairwise deviations of points in the same cluster.
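  The identity holds because c_l is the sample mean of its cluster. A quick numerical check (illustrative, not from the talk; it assumes every cluster is nonempty):

```python
import numpy as np

pts = np.random.default_rng(1).normal(size=(12, 2))
labels = np.repeat([0, 1, 2], 4)   # three nonempty clusters of four points each

# centroid form: sum_l sum_{i in Gamma_l} ||x_i - c_l||^2
centroid_form = sum(
    np.sum((pts[labels == l] - pts[labels == l].mean(axis=0)) ** 2)
    for l in range(3)
)

# pairwise form: sum_l 1/(2|Gamma_l|) sum_{i,j in Gamma_l} ||x_i - x_j||^2
pairwise_form = 0.0
for l in range(3):
    P = pts[labels == l]
    pairwise_form += np.sum((P[:, None, :] - P[None, :, :]) ** 2) / (2 * len(P))

assert np.isclose(centroid_form, pairwise_form)
```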

  7. A bit more calculation

  Up to the constant factor 1/2 (which does not affect the minimizer), f({Γ_l}_{l=1}^k) is an inner product between two matrices:

  $$2 f(\{\Gamma_l\}) = \sum_{i=1}^{N} \sum_{j=1}^{N} \underbrace{\| x_i - x_j \|^2}_{D_{ij}} \cdot \underbrace{\frac{1}{|\Gamma_l|} 1_{\{i \in \Gamma_l,\, j \in \Gamma_l\}}}_{X_{ij}} = \langle D, X \rangle,$$

  where D = (‖x_i − x_j‖^2)_{1≤i,j≤N} is the squared distance matrix and

  $$X = \Big( \frac{1}{|\Gamma_l|} \cdot 1_{\{i \in \Gamma_l,\, j \in \Gamma_l\}} \Big)_{1 \le i,j \le N}.$$

  We simply call X the partition matrix. What properties does X have for any given partition {Γ_l}_{l=1}^k?
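  A short sketch that builds D and X explicitly and checks ⟨D, X⟩ = 2 f({Γ_l}); the helper name partition_matrix is chosen here for illustration only.

```python
import numpy as np

def partition_matrix(labels, k):
    """X_ij = 1/|Gamma_l| if i and j both lie in cluster Gamma_l, else 0."""
    N = len(labels)
    X = np.zeros((N, N))
    for l in range(k):
        idx = np.flatnonzero(labels == l)
        X[np.ix_(idx, idx)] = 1.0 / len(idx)
    return X

pts = np.random.default_rng(2).normal(size=(12, 2))
labels = np.repeat([0, 1, 2], 4)

D = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=2)  # D_ij = ||x_i - x_j||^2
X = partition_matrix(labels, 3)

f = sum(np.sum((pts[labels == l] - pts[labels == l].mean(axis=0)) ** 2)
        for l in range(3))
assert np.isclose(np.sum(D * X), 2 * f)   # <D, X> = 2 f({Gamma_l})
```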

  8. Relaxation

  Up to a permutation of the data points, the partition matrix X is block-diagonal:

  $$X = \begin{pmatrix} \frac{1}{|\Gamma_1|} 1_{|\Gamma_1|} 1_{|\Gamma_1|}^\top & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \frac{1}{|\Gamma_k|} 1_{|\Gamma_k|} 1_{|\Gamma_k|}^\top \end{pmatrix}.$$

  We want to find a larger, convex search space that contains all such X as a proper subset. What constraints does X satisfy?

  Four constraints (verified numerically in the sketch after this list):
  - Nonnegativity: X ≥ 0 (entrywise).
  - Positive semidefiniteness: X ⪰ 0.
  - Trace: Tr(X) = k (note that the exact condition rank(X) = k is nonconvex).
  - Row sums: X 1_N = 1_N, i.e., the leading eigenvalue 1 has multiplicity k.
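  All four constraints are easy to check for a ground-truth partition matrix; the sketch below builds one with SciPy's block_diag (an illustrative dependency, not from the talk).

```python
import numpy as np
from scipy.linalg import block_diag

sizes, k = [4, 5, 3], 3   # cluster sizes |Gamma_1|, ..., |Gamma_k|; here N = 12
X = block_diag(*[np.full((s, s), 1.0 / s) for s in sizes])
N = X.shape[0]

assert np.all(X >= 0)                            # nonnegativity (entrywise)
assert np.linalg.eigvalsh(X).min() >= -1e-10     # positive semidefiniteness
assert np.isclose(np.trace(X), k)                # Tr(X) = k
assert np.allclose(X @ np.ones(N), np.ones(N))   # X 1_N = 1_N
```

  The eigenvalues of each diagonal block (1/|Γ_l|) 1 1^⊤ are 1 (once) and 0, which is where Tr(X) = k and the multiplicity-k eigenvalue 1 come from.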

  9. Convex relaxation

  Semidefinite programming relaxation [Peng, Wei 07]: the convex relaxation of k-means is

  $$\min_{Z} \; \langle D, Z \rangle \quad \text{s.t.} \quad Z \ge 0, \; Z \succeq 0, \; \operatorname{Tr}(Z) = k, \; Z 1_N = 1_N.$$

  Key question: if {Γ_l}_{l=1}^k is the ground-truth partition, when does the SDP relaxation recover

  $$X = \sum_{l=1}^{k} \frac{1}{|\Gamma_l|} 1_{\Gamma_l} 1_{\Gamma_l}^\top \, ?$$
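  The relaxation is a standard semidefinite program and can be prototyped directly, for instance with cvxpy. This sketch assumes cvxpy and an SDP-capable solver (such as the bundled SCS) are installed; the function name kmeans_sdp is an illustrative choice, not from the talk.

```python
import numpy as np
import cvxpy as cp

def kmeans_sdp(D, k):
    """Peng-Wei SDP relaxation: minimize <D, Z> over the convex constraint set."""
    N = D.shape[0]
    Z = cp.Variable((N, N), symmetric=True)
    constraints = [
        Z >= 0,                        # entrywise nonnegativity
        Z >> 0,                        # positive semidefiniteness
        cp.trace(Z) == k,              # Tr(Z) = k
        Z @ np.ones(N) == np.ones(N),  # Z 1_N = 1_N
    ]
    cp.Problem(cp.Minimize(cp.sum(cp.multiply(D, Z))), constraints).solve()
    return Z.value
```

  For well-separated clusters, the minimizer Z should match the ground-truth partition matrix X up to solver tolerance; the clusters can then be read off from the support of Z or from its top-k eigenvectors.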

  10. A short literature review

  There are many excellent works on learning mixtures of distributions and on the SDP relaxation of k-means:
  - SDP relaxations of k-means: [Peng, Wei 07], [Bandeira, Villar, Ward, et al. 17], [Mixon, Villar, et al. 15], etc.
  - Spectral-projection based approaches: [Dasgupta 99], [Vempala, Wang 04], [Achlioptas, McSherry 05], etc.

  Almost all of these works have one thing in common: the data are assumed to be sampled from a generative model, e.g., the stochastic ball model or a Gaussian mixture model.
