Finding low-rank structure in messy data Laura Balzano University - PowerPoint PPT Presentation

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Finding low-rank structure in messy data Laura Balzano University of Michigan Michigan Institute for Data Science March 2017 Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Big Data means Messy Data Ozone Concentration 42 0.04 41 0.03 Latitude 40 0.02 0.01 39 0 38 − 81 − 80 − 79 − 78 − 77 − 76 − 75 Longitude Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Big Data means Messy Data &'+#/).0%%%1%%%'*-&)23-'.% &'()#"% *$#+),-'.% !"#$"% Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Structure In all these cases, we believe there is some structure in the data. That structure can help us predict, interpolate, detect anomalies, etc. Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Structure In all these cases, we believe there is some structure in the data. That structure can help us predict, interpolate, detect anomalies, etc. Much of my work focuses on low-rank structure. Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Subspace Representations 50 100 150 − 200 10 0 − sp ordered singular ordered singular values − values (normalized) (normalized) ordered singular values 10 -1 (normalized) − 10 -2 state 10 -3 Temperature%data%from%UCLA%Sensornet% Byte%Count%data%from%UW%network% 0 10 20 30 40 50 60 Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Subspace Representations Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Low-rank structure for Messy Data ! PCA with heteroscedastic data ! Structured Single Index Models E[ y| x] = g(x T w) 1 ! Union of Idealized ISE response 0.8 subspace 0.6 ! Matrix data – active 0.4 completion or 0.2 clustering or factorization with 0 completion 0 0.2 0.4 0.6 0.8 1 streaming data Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Collaborators NSF, Army Research Office, MCubed Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion LRMC with Monotonic Observations Low-rank Matrix Completion under Monotonic Transformation: Can we recover a low-rank matrix where every entry has been perturbed using an unknown monotonic function? Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Low-rank Matrix Completion We have an n × m , rank r matrix X . However, we only observe a subset of the entries, Ω ⊂ { 1 , . . . , n } × { 1 , . . . , m } . Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Example 1: Recommender Systems &'+#/).0%%%1%%%'*-&)23-'.% &'()#"% *$#+),-'.% !"#$"% Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Example 1: Recommender Systems Ne#lix'Prize' Compe//on' 200642009' Winning'team' received'$1M' Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Low-rank Matrix Completion We have an n × m , rank r matrix X . However, we only observe a subset of the entries, Ω ⊂ { 1 , . . . , n } × { 1 , . . . , m } . We may find a solution by solving the following NP-hard optimization: minimize rank( M ) M subject to M Ω = X Ω Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Low-rank Matrix Completion We have an n × m , rank r matrix X . However, we only observe a subset of the entries, Ω ⊂ { 1 , . . . , n } × { 1 , . . . , m } . Or we may solve this convex problem: n � minimize � M � ∗ = σ i ( M ) M i =1 subject to M Ω = X Ω Exact recovery guarantees: X is exactly low-rank and incoherent. MSE guarantees: X is nearly low-rank with bounded ( r + 1) th singular value. Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Low-rank Matrix Completion Algorithms There are a plethora of algorithms to solve the nuclear norm problem or reformulations. LMaFit, APGL, FPCA Singular value thresholding: iterated SVD, SVT, FRSVT Grassmannian: OptSpace, GROUSE Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Example 1: Recommender Systems &'+#/).0%%%1%%%'*-&)23-'.% &'()#"% *$#+),-'.% !"#$"% Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Example 2: Blind Sensor Calibration Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Example 2: Blind Sensor Calibration Ion Selective Electrodes have a 1 nonlinear response to their ions Idealized ISE response 0.8 (pH, ammonium, calcium, etc) 0.6 0.4 0.2 0 0 0.2 0.4 0.6 0.8 1 Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Single Index Model Suppose we have predictor variables x and response variables y , and we seek a transformation g and vector w relating the two such that � � x T w E [ y | x ] = g . Generalized Linear Model: g is known, y | x are RVs from an exponential family distribution parameterized by w . Includes linear regression, log-linear regression, and logistic regression Single Index Model: Both g and w are unknown. Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Single Index Model Learning Algorithm 1 Lipshitz-Isotron Algorithm [Kakade et al., 2011] Given T > 0, ( x i , y i ) p i =1 ; Set w (1) := 1; for t = 1 , 2 , . . . , T do Update g using Lipschitz-PAV: g ( t ) = LPAV � � i w ( t ) , y i ) p ( x T . i =1 Update w using gradient descent: p w ( t +1) = w ( t ) + 1 � � � y i − g ( t ) ( x T i w ( t ) ) x i p i =1 end for Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Lipschitz Pool Adjacent Violator The Pool Adjacent Violator 6 (PAV) algorithm pools 5 points and averages to 4 minimize mean squared error 3 g ( x i ) − y i . PAV 2 Data L-PAV adds the additional PAV 1 LPAV constraint of a given 0 0 0.2 0.4 0.6 0.8 1 Lipschitz constant. Laura Balzano University of Michigan Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion High-rank (and effective rank) matrices For Z low-rank, 1 Y ij = g ( Z ij ) = 1+exp − γ Zij , Y has full rank. Y ij = g ( Z ij ) = quantize to grid ( Z ij ), Y has full rank. These matrices even have high effective rank. For a rank-50, 1000x1000 matrix Z , we can plot the eff rank of Y : Logistic function Quantizing to a grid 1000 1000 800 800 ǫ =0.001 effective rank ǫ =0.01 effective rank 600 600 400 400 200 200 0 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0 20 40 60 80 100 120 gamma number of grid points Laura Balzano University of Michigan erank Low-rank structure in messy data

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Optimization Formulation We observe Y ij = g ∗ ( Z ∗ ij ) + N ij for ( i , j ) ∈ Ω, where Ω is the set of observed entries. � ( g ( Z i , j ) − Y i , j ) 2 min g , Z Ω subj. to g : R → R is Lipschitz and monotone rank( Z ) ≤ r Non-convex in each variable, but we can alternate the standard approaches: Use gradient descent and projection onto the low-rank cone for Z . Use LPAV for g . We call this algorithm MMC-LS. Laura Balzano University of Michigan Low-rank structure in messy data

Finding low-rank structure in messy data Laura Balzano University - PowerPoint PPT Presentation

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Finding low-rank structure in messy data Laura Balzano University of Michigan Michigan Institute for Data Science March 2017 Laura Balzano University of

Introduction to Data Science: Common observation to be religion, income, frequency where sex and

Starting point: Mission Data Set - Messy Data Set Messy Storage Messy to Arrange

2 3 4 5 8 9 MINNEAPOLIS MILWAUKEE MSA RANK #16 MSA RANK #39 CHICAGO MSA RANK #3

Tidy data Tidy datasets are all alike but every messy dataset is messy in its own way

Parallel Numerical Algorithms Chapter 6 Matrix Models Section 6.2 Low Rank Approximation

On the minimum rank of a graph Jisu Jeong June 21, 2013 Jisu Jeong On the minimum rank of a

1 SVD applications: rank, column, row, and null spaces Rank : the rank of a matrix is equal to:

A new family of maximum rank distance codes or: Maximum rank distance codes and finite semifields

Predictive low-rank decomposition for kernel methods Francis Bach Michael Jordan Ecole des

Finding your way in a graph Finding your way in a graph Finding your way in a graph Finding your

Bayesian Estimation of Low-rank Matrices Pierre Alquier Journes de Statistique du Sud,

The Statistics of Dirty Data Sanjay Krishnan coax treasure out of messy, unstructured data 204

Finding Hidden Supernovae with Finding Hidden Supernovae with Finding Hidden Supernovae with

MESSY DATA AND RELUCTANT USERS - THE TROUBLE WITH HEALTHCARE DATA Sam Bail @spbail DataCouncil

2018 - 2019 Teacher Salary Comparison Report 0-Year 5-Year 10-Year 15-Year 20-Year District

Introduction to rank-based cryptography Philippe Gaborit University of Limoges, France ASCRYPTO

Ranking and Calibrating Click-Attributed Purchases in Performance Display Advertising Sougata

Approximating likelihood ratios with calibrated classifiers Gilles Louppe DIANA meeting

Op#mizing u#lity: Postprocessing to ensure constraints CompSci

Order Restricted Clustering for Dose- Response Microarray Data Adetayo Kasim Interuniversity

Machine learning on the symmetric group Jean-Philippe Vert ML ML ML ML What if inputs are

Between Discrete and Continuous Optimization: Submodularity & Optimization Stefanie

Structured Graph Learning Via Laplacian Spectral Constraints Sandeep Kumar, Jiaxi Ying, Jos

Linear Dimension Reduction (in L 2 ) Linear Dimension Reduction: R D R d Goal: Find a low-dim.

Sambuz

Useful Links

Newsletter

Mail Us

Finding low-rank structure in messy data Laura Balzano University - PowerPoint PPT Presentation

Introduction LRMC with Monotonic Observations PCA for Heteroscedastic Data Conclusion Finding low-rank structure in messy data Laura Balzano University of Michigan Michigan Institute for Data Science March 2017 Laura Balzano University of

Introduction to Data Science: Common observation to be religion, income, frequency where sex and

Starting point: Mission Data Set - Messy Data Set Messy Storage Messy to Arrange

2 3 4 5 8 9 MINNEAPOLIS MILWAUKEE MSA RANK #16 MSA RANK #39 CHICAGO MSA RANK #3

Tidy data Tidy datasets are all alike but every messy dataset is messy in its own way

Parallel Numerical Algorithms Chapter 6 Matrix Models Section 6.2 Low Rank Approximation

On the minimum rank of a graph Jisu Jeong June 21, 2013 Jisu Jeong On the minimum rank of a

1 SVD applications: rank, column, row, and null spaces Rank : the rank of a matrix is equal to:

A new family of maximum rank distance codes or: Maximum rank distance codes and finite semifields

Predictive low-rank decomposition for kernel methods Francis Bach Michael Jordan Ecole des

Finding your way in a graph Finding your way in a graph Finding your way in a graph Finding your

Bayesian Estimation of Low-rank Matrices Pierre Alquier Journes de Statistique du Sud,

The Statistics of Dirty Data Sanjay Krishnan coax treasure out of messy, unstructured data 204

Finding Hidden Supernovae with Finding Hidden Supernovae with Finding Hidden Supernovae with

MESSY DATA AND RELUCTANT USERS - THE TROUBLE WITH HEALTHCARE DATA Sam Bail @spbail DataCouncil

2018 - 2019 Teacher Salary Comparison Report 0-Year 5-Year 10-Year 15-Year 20-Year District

Introduction to rank-based cryptography Philippe Gaborit University of Limoges, France ASCRYPTO

Ranking and Calibrating Click-Attributed Purchases in Performance Display Advertising Sougata

Approximating likelihood ratios with calibrated classifiers Gilles Louppe DIANA meeting

Op#mizing u#lity: Postprocessing to ensure constraints CompSci

Order Restricted Clustering for Dose- Response Microarray Data Adetayo Kasim Interuniversity

Machine learning on the symmetric group Jean-Philippe Vert ML ML ML ML What if inputs are

Between Discrete and Continuous Optimization: Submodularity &amp; Optimization Stefanie

Structured Graph Learning Via Laplacian Spectral Constraints Sandeep Kumar, Jiaxi Ying, Jos

Linear Dimension Reduction (in L 2 ) Linear Dimension Reduction: R D R d Goal: Find a low-dim.

Sambuz

Useful Links

Newsletter

Mail Us

Between Discrete and Continuous Optimization: Submodularity & Optimization Stefanie