Convex Biclustering
Eric Chi
Rice University; joint work with Genevera Allen and Rich Baraniuk
The Biclustering Problem
Task: Given a data matrix X ∈ R^{p×n}, find subgroups of rows and columns that go together.
- Text mining: similar documents share a small set of highly correlated words.
- Collaborative filtering: like-minded customers share similar preferences for a subset of products.
- Cancer genomics: subtypes of cancerous tumors share similar molecular profiles over a subset of genes.
Cancer Genomics
“Lung cancer” is heterogeneous at the molecular level. Which genes are driving “lung cancer”? These genes are potential drug targets. Collect expression data.

[Figure: genes × tissue-samples expression heatmap.]
[Figure: heatmap column labels; the tissue samples include Colon, Carcinoid, and SmallCell tumors.]

Simple Solution: Cluster Dendrogram
[Figure: the heatmap with genes and tissue samples reordered by hierarchical clustering.]

Hierarchical Clustering
[Animation: points labeled A through I are merged pairwise, greedily building up a dendrogram from the closest pairs.]
Simple Solution: Cluster Dendrogram
The Good:
- Easy to interpret
- Fast computation (greedy algorithm)

The Bad:
- Non-convex optimization problem
- Local minimizers
- Instability (to initialization, tuning parameters, or the data)

The Ugly:
- How to choose the number of biclusters?
More Sophisticated Approaches
SVD-like methods:
- Plaid: Lazzeroni & Owen (2000)
- Iterative signature algorithm: Bergmann et al. (2003)
- Sparse SVD: Lee et al. (2010)

Graph cut:
- Dhillon (2001); Kluger (2003)

Other approaches:
- LAS: Shabalin et al. (2009)
- Sparse transposable biclustering: Tan & Witten (2013)
- Harmonic analysis of digital databases: Coifman & Gavish (2010)

Goal: simple and interpretable like the clustered dendrogram, with good algorithmic behavior:
- Global minimizer
- Stability with respect to the data and other inputs
Solution: Convex Relaxation
Replace a combinatorially hard problem with a convex surrogate:
- All local minima are global minima.
- Algorithms converge to the global minimizer regardless of initialization.
- Solve a convex optimization problem to go from A to B.
Convex Biclustering
Contributions:
- Characterization of the solution to the convex program
- Stability of the solution in the tuning parameters and the data
- A simple, intuitive meta-algorithm that attains the unique global minimizer by alternating convex clustering of rows and columns
- Essentially one tuning parameter controls the number of biclusters
- A data-adaptive way to select the number of biclusters
Convex Clustering
Not much existing work, and most of it is recent: Pelckmans et al. (2005), Lindsten et al. (2011), Hocking et al. (2011), Chi & Lange (2013).

$$\underset{u_1,\dots,u_n}{\text{minimize}}\quad \frac{1}{2}\sum_{i=1}^{n}\|x_i - u_i\|_2^2 \;+\; \gamma \sum_{i<j} w_{ij}\,\|u_i - u_j\|_2$$

Assign a centroid u_i to each data point x_i (taken at face value, this is too many degrees of freedom!). The convex fusion penalty:
- shrinks the cluster centroids together;
- induces sparsity in the pairwise differences of centroids: u_i − u_j = 0 ⟺ x_i and x_j belong to the same cluster.

γ tunes the overall amount of regularization; the w_ij fine-tune the pairwise shrinkage. This generalizes the fused lasso / edge lasso (Sharpnack et al., 2012). Any ℓ_p norm with p ≥ 1 is okay in the fusion penalty.
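To make the objective concrete, here is a minimal numpy sketch (my own illustration, not the authors' code) that evaluates the convex clustering objective for given data, centroids, weights, and γ:

```python
import numpy as np

def convex_clustering_objective(X, U, W, gamma):
    """Evaluate (1/2) sum_i ||x_i - u_i||_2^2 + gamma * sum_{i<j} w_ij ||u_i - u_j||_2.

    X, U : p x n arrays (columns are data points / their centroids).
    W    : n x n symmetric weight matrix; only the i < j entries are used.
    """
    fidelity = 0.5 * np.sum((X - U) ** 2)
    penalty = 0.0
    n = X.shape[1]
    for i in range(n):
        for j in range(i + 1, n):
            if W[i, j] > 0:  # sparse weights: most pairs contribute nothing
                penalty += W[i, j] * np.linalg.norm(U[:, i] - U[:, j])
    return fidelity + gamma * penalty
```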
Choosing weights
Rules of thumb:
- w_ij ∝ ‖x_i − x_j‖^{−1}
- Most w_ij = 0

Why?
- Encourages similar points to fuse early → better clusterings.
- Computation and storage scale with the number of non-zero w_ij.
- Fiddle-free: set and forget.
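As an illustration of these rules of thumb, here is a hedged sketch (function name and rescaling are my own choices, not from the talk) that builds sparse inverse-distance weights, keeping only each point's k nearest neighbors so that most w_ij = 0:

```python
import numpy as np

def knn_inverse_distance_weights(X, k=5, eps=1e-8):
    """Sparse weights: w_ij proportional to ||x_i - x_j||^{-1}, restricted to
    k-nearest-neighbor pairs so that most w_ij = 0.

    X : p x n data matrix (columns are points). Returns an n x n symmetric W.
    """
    n = X.shape[1]
    # Pairwise Euclidean distances between columns of X.
    D = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)
    W = np.zeros((n, n))
    for i in range(n):
        # Indices of the k nearest neighbors of point i (index 0 is i itself).
        nbrs = np.argsort(D[i])[1:k + 1]
        W[i, nbrs] = 1.0 / (D[i, nbrs] + eps)
    W = np.maximum(W, W.T)      # symmetrize: keep a pair if either point chose it
    return W / W[W > 0].mean()  # optional rescaling so the weights are O(1)
```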
The Solution Path
$$\underset{u_1,\dots,u_n}{\text{minimize}}\quad \frac{1}{2}\sum_{i=1}^{n}\|x_i - u_i\|_2^2 \;+\; \gamma \sum_{i<j} w_{ij}\,\|u_i - u_j\|_2$$

[Animation: as γ increases, centroids successively fuse, tracing out a solution path from every point in its own cluster down to a single cluster.]

γ tunes the number of clusters.
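At any point along the path, the cluster assignments can be read off the solution by grouping points whose centroids have (numerically) fused. A small sketch, assuming a tolerance `tol` for declaring u_i − u_j = 0:

```python
import numpy as np

def clusters_from_centroids(U, tol=1e-6):
    """Group columns of U whose centroids coincide (u_i - u_j ~ 0).

    Returns an integer label per column; the connected components of the
    "fused" graph are the clusters.
    """
    n = U.shape[1]
    labels = -np.ones(n, dtype=int)
    c = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        labels[i] = c
        stack = [i]
        # Sweep: any unlabeled point fused with a member joins cluster c.
        while stack:
            a = stack.pop()
            for b in range(n):
                if labels[b] < 0 and np.linalg.norm(U[:, a] - U[:, b]) < tol:
                    labels[b] = c
                    stack.append(b)
        c += 1
    return labels
```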
Convex Biclustering
Apply fusion penalties to both the columns and the rows:

$$\underset{U}{\text{minimize}}\quad F_\gamma(U) = \frac{1}{2}\|X - U\|_F^2 + \gamma\, J(U)$$

$$J(U) = \sum_{i<j} w_{ij}\,\|U_{\cdot i} - U_{\cdot j}\|_2 \;+\; \sum_{k<l} \tilde{w}_{kl}\,\|U_{k\cdot} - U_{l\cdot}\|_2$$

This leads to a “checkerboard” biclustering: every (i, j)th entry is assigned to one bicluster.
- The ith and jth columns belong to the same column cluster iff U_{·i} − U_{·j} = 0.
- The kth and lth rows belong to the same row cluster iff U_{k·} − U_{l·} = 0.
Biclustering path

$$\underset{U}{\text{minimize}}\quad \frac{1}{2}\|X - U\|_F^2 + \gamma\left[\sum_{i<j} w_{ij}\,\|U_{\cdot i} - U_{\cdot j}\|_2 + \sum_{k<l} \tilde{w}_{kl}\,\|U_{k\cdot} - U_{l\cdot}\|_2\right]$$

[Animation: as γ increases, columns and rows fuse simultaneously, coarsening the checkerboard from one bicluster per entry down to a single bicluster.]
Key Property of the Solution: Stability
$$U^\star = \underset{U}{\arg\min}\ \frac{1}{2}\|X - U\|_F^2 + \gamma\, J(U),\qquad J(U) = \sum_{i<j} w_{ij}\,\|U_{\cdot i} - U_{\cdot j}\|_2 + \sum_{k<l} \tilde{w}_{kl}\,\|U_{k\cdot} - U_{l\cdot}\|_2$$

Proposition: U⋆ exists, is unique, and depends continuously on (γ, W, W̃, X).

Implication: stability to perturbations in the data. The ith and jth columns belong to the same column cluster iff U⋆_{·i} − U⋆_{·j} = 0; since U⋆ is continuous in X, the difference U⋆_{·i} − U⋆_{·j} is also continuous in X.
The Meta-Algorithm: COBRA
COnvex BiclusteRing Algorithm: an instance of the Dykstra-like proximal algorithm (Bauschke & Combettes, 2008).

Algorithm 1 COBRA
Set U_0 = X, P_0 = 0, Q_0 = 0. For m = 0, 1, . . .:
1: repeat
2:   Y_m^T = prox_{γΩ_W̃}(U_m^T + P_m^T)   (convex cluster the rows)
3:   P_{m+1} = U_m + P_m − Y_m   (update correction)
4:   U_{m+1} = prox_{γΩ_W}(Y_m + Q_m)   (convex cluster the columns)
5:   Q_{m+1} = Y_m + Q_m − U_{m+1}   (update correction)
6: until convergence

COBRA converges to the global minimizer. Each proximal step is a convex clustering problem, solved via fast AMA (Chi & Lange, 2013):

$$\text{prox}_{\gamma \Omega_W}(Z) = \underset{U}{\arg\min}\ \frac{1}{2}\|Z - U\|_F^2 + \gamma \sum_{i<j} w_{ij}\,\|U_{\cdot i} - U_{\cdot j}\|_2$$
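The meta-algorithm itself is only a few lines once a convex clustering solver for the prox is available. Below is a sketch of the Dykstra-like iteration of Algorithm 1; `prox_row` and `prox_col` stand in for the convex clustering subproblem solvers (e.g., the fast AMA of Chi & Lange, 2013) and are assumed, not implemented here. The stopping rule is my own addition.

```python
import numpy as np

def cobra(X, prox_row, prox_col, max_iter=100, tol=1e-6):
    """Dykstra-like proximal iteration for convex biclustering (Algorithm 1).

    prox_row(Z): argmin_U 0.5||Z - U||_F^2 + gamma * Omega_Wtilde(U),
                 i.e. convex clustering with the row weights;
    prox_col(Z): same with the column weights W. Both are assumed black boxes.
    """
    U = X.copy()
    P = np.zeros_like(X)  # correction term for the row step
    Q = np.zeros_like(X)  # correction term for the column step
    for _ in range(max_iter):
        U_old = U
        Y = prox_row((U + P).T).T  # convex cluster the rows (via the transpose)
        P = U + P - Y              # update correction
        U = prox_col(Y + Q)        # convex cluster the columns
        Q = Y + Q - U              # update correction
        if np.linalg.norm(U - U_old, 'fro') <= tol * max(1.0, np.linalg.norm(U_old, 'fro')):
            break
    return U
```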
Model Selection
Validation scheme (Wold, 1978): randomly select a hold-out set Θ ⊂ {1, . . . , p} × {1, . . . , n} (∼10% of all entries), and decompose X = P_{Θᶜ}(X) + P_Θ(X), where P_Θ keeps the entries in Θ and zeroes out the rest.

Solve the matrix completion problem for a sequence of candidate γ_m:

$$U^\star_m = \underset{U}{\arg\min}\ \frac{1}{2}\|P_{\Theta^c}(X) - P_{\Theta^c}(U)\|_F^2 + \gamma_m\, J(U)$$

Pick the γ_m that minimizes the prediction error on Θ.

[Plot: hold-out prediction error versus −log(γ).]
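A sketch of the validation loop, assuming a hypothetical fitting routine `solve(X, mask, gamma)` (for instance, COBRA-POD below) that treats the masked entries as missing and returns the fitted U:

```python
import numpy as np

def holdout_validation(X, solve, gammas, frac=0.10, seed=0):
    """Wold-style hold-out: mask ~10% of entries, fit along a gamma grid,
    and score the prediction error on the held-out entries.

    solve(X, mask, gamma) is an assumed black-box fitting routine;
    mask is True on the held-out (Theta) entries.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape) < frac  # Theta: the held-out entries
    errors = []
    for gamma in gammas:
        U = solve(X, mask, gamma)
        errors.append(np.linalg.norm((X - U)[mask]) ** 2 / mask.sum())
    best = gammas[int(np.argmin(errors))]  # gamma with lowest hold-out error
    return best, errors
```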
Matrix Completion with COBRA-POD
COBRA with Partially Observed Data:
- Fill in the missing entries using the last iterate.
- Use COBRA to get the next iterate.
- Repeat.

Algorithm 2 COBRA-POD
1: Initialize U^(0).
2: repeat
3:   M ← P_{Θᶜ}(X) + P_Θ(U^(k))
4:   U^(k+1) ← COBRA(M)
5: until convergence

This is a majorization-minimization (MM) algorithm; a very similar majorization is used in Mazumder et al. (2010). It converges to the solution of the matrix completion problem, enabling biclustering when data are missing.
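The MM imputation loop is equally short. A sketch, assuming the `cobra` routine from the earlier sketch (wrapped as `cobra_solve` with its weights and γ fixed); the zero initialization and stopping rule are my own choices:

```python
import numpy as np

def cobra_pod(X, mask, cobra_solve, max_iter=50, tol=1e-6):
    """COBRA with Partially Observed Data (Algorithm 2).

    mask is True on the missing/held-out entries (Theta). Each MM step fills
    in the missing entries with the current iterate and reruns COBRA.
    """
    U = np.where(mask, 0.0, X)  # U^(0): observed entries of X, zeros elsewhere
    for _ in range(max_iter):
        M = np.where(mask, U, X)   # M = P_{Theta^c}(X) + P_Theta(U^(k))
        U_new = cobra_solve(M)     # U^(k+1) = COBRA(M)
        if np.linalg.norm(U_new - U) <= tol * max(1.0, np.linalg.norm(U)):
            return U_new
        U = U_new
    return U
```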
Comparison: Checkerboard + Gaussian
X = “checkerboard” + i.i.d. N(0, σ²) noise.

[Plot: distribution of the Rand index (higher is better) for COBRA versus sparse biclustering (spBC) under low-variance and high-variance noise.]

Comparison: Stability
Lung cancer data: baseline clustering versus clustering on perturbed data.
Summary
Convex relaxation of biclustering:
- Simple, interpretable solutions like the clustered dendrogram.
- Algorithmically well behaved: a unique global minimizer, with stability with respect to initialization, parameters, and data.
- The meta-algorithm uses convex clustering as a primitive.
- Essentially one tuning parameter controls the number of biclusters, with a data-dependent way of selecting it.

Future work
- Inexact solutions for very large problems.
- Connections to computational harmonic analysis.
Thank you!