Low-rank Matrix Completion via Convex Optimization Ben Recht

Low-rank Matrix Completion via Convex Optimization Ben Recht Center for the Mathematics of Information Caltech

Recommender Systems

Netflix Prize • One million big ones! • Given 100 million ratings on a scale of 1 to 5, predict 3 million ratings as accurately as possible • 17,770 total movies × 480,189 total users • Over 8 billion possible ratings • How to fill in the blanks?

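Abstract Setup: Matrix Completion • $X_{ij}$ known for black cells, $X_{ij}$ unknown for white cells • Rows index movies, columns index users • How do you fill in the missing data? • Low-rank factorization: $X = L R^*$, where $X$ is $k \times n$ ($kn$ entries), $L$ is $k \times r$, and $R^*$ is $r \times n$, so the factors have only $r(k+n)$ entries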
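To make the entry counts on this slide concrete, here is a small arithmetic sketch using the Netflix dimensions quoted earlier. The rank r = 10 is a hypothetical value chosen only for illustration; the slides do not state a rank.

```python
def dense_entries(k, n):
    """Number of entries in a full k x n matrix."""
    return k * n


def factored_entries(k, n, r):
    """Number of entries in the factors L (k x r) and R (n x r) of X = L R^*."""
    return r * (k + n)


# Netflix Prize dimensions from the slide: 17770 movies x 480189 users.
k, n = 17770, 480189
r = 10  # hypothetical rank, an assumption for illustration only

print("full matrix entries:", dense_entries(k, n))
print("factor entries at rank", r, ":", factored_entries(k, n, r))
```

Under this assumed rank, the factors hold a few million numbers, against several billion for the full matrix, which is the gap the low-rank model exploits.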

Low-rank Matrix Completion • $X_{ij}$ known for black cells, $X_{ij}$ unknown for white cells • How do you fill in the missing data?

Rank of: • Matrix of Classifiers (Multitask Learning) • Data Matrix (Recommender Systems) • Gram Matrix (Euclidean Embedding) Constraints involving the rank of the Hankel Operator, Matrix, or Singular Values: • Controller Design • Model Reduction • System Identification

Affine Rank Minimization • PROBLEM: Find the matrix of lowest rank that satisfies or approximates the underdetermined linear system $\mathcal{A}(X) = b$ • NP-HARD: – Reduce to finding solutions to polynomial systems – Hard to approximate – Exact algorithms are awful

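Proposed Heuristic • Affine Rank Minimization: minimize $\operatorname{rank}(X)$ subject to $\mathcal{A}(X) = b$ • Convex Relaxation: minimize the nuclear norm $\|X\|_* = \sum_i \sigma_i(X)$ subject to $\mathcal{A}(X) = b$ • Proposed by Fazel (2002). • Nuclear norm is the “numerical rank” in numerical analysis • The “trace heuristic” from controls if X is p.s.d.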
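The relationship between rank, nuclear norm, and trace can be checked numerically. The sketch below uses NumPy; the helper name `nuclear_norm` and the random test matrices are illustrative assumptions, not part of the talk.

```python
import numpy as np


def nuclear_norm(X):
    """Sum of the singular values of X."""
    return np.linalg.svd(X, compute_uv=False).sum()


rng = np.random.default_rng(0)

# A rank-2 positive semidefinite matrix: nuclear norm equals trace.
B = rng.standard_normal((6, 2))
X_psd = B @ B.T

# A general rank-2 rectangular matrix: no trace is available,
# but the nuclear norm is still defined.
X_gen = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))

print("rank(X_psd) =", np.linalg.matrix_rank(X_psd))
print("nuclear norm vs trace:", nuclear_norm(X_psd), np.trace(X_psd))
print("rank(X_gen) =", np.linalg.matrix_rank(X_gen))
print("nuclear norm of X_gen:", nuclear_norm(X_gen))
```

For a p.s.d. matrix the singular values coincide with the eigenvalues, so the nuclear norm reduces to the trace. That is why the trace heuristic is the special case mentioned on the slide.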

Parsimonious Models • model = sum, over k from 1 to rank, of weight_k × atom_k • Search for best linear combination of fewest atoms • “rank” = fewest atoms needed to describe the model

• 2x2 matrices • plotted in 3d • rank 1: $x^2 + z^2 + 2y^2 = 1$ • Convex hull: [figure]

• 2x2 matrices • plotted in 3d • Projection onto x-z plane is $\ell_1$ ball
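The geometry on these slides can be spot-checked numerically. The sketch assumes the 2x2 matrices are symmetric and parametrized as [[x, y], [y, z]], which is consistent with the unit-norm equation shown for rank 1 but is not stated explicitly in the extracted text. The function name is hypothetical.

```python
import numpy as np


def nuclear_norm_sym2(x, y, z):
    """Nuclear norm of the symmetric matrix [[x, y], [y, z]]
    (sum of absolute eigenvalues)."""
    X = np.array([[x, y], [y, z]], dtype=float)
    return np.abs(np.linalg.eigvalsh(X)).sum()


rng = np.random.default_rng(0)

# On the x-z plane (y = 0) the nuclear norm is |x| + |z|, the l1 norm.
x, z = rng.standard_normal(2)
on_plane = nuclear_norm_sym2(x, 0.0, z)

# Off the plane the nuclear norm is never smaller than |x| + |z|,
# so projecting the nuclear norm ball onto the plane gives the l1 ball.
x2, y2, z2 = rng.standard_normal(3)
off_plane = nuclear_norm_sym2(x2, y2, z2)

# A unit-norm rank-1 matrix v v^T satisfies x^2 + z^2 + 2 y^2 = 1.
v = rng.standard_normal(2)
v /= np.linalg.norm(v)
a, b, c = v[0] ** 2, v[0] * v[1], v[1] ** 2
ellipsoid_value = a ** 2 + c ** 2 + 2 * b ** 2

print(on_plane, abs(x) + abs(z))
print(off_plane, abs(x2) + abs(z2))
print(ellipsoid_value)
```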

[Figure: the affine set $\mathcal{A}(X) = b$, with axes $w_1$ and $w_2$]

• 2x2 matrices • plotted in 3d • Not polyhedral… So how do we compute it? And when does it work?

Equivalent Formulations • Semidefinite embedding: • Low rank parametrization:
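The equations for these two formulations did not survive extraction. The block below gives the standard statements of each from the nuclear norm literature, as a reconstruction rather than the slide's exact notation. The symbols $W_1$, $W_2$, $L$, and $R$ are assumed names.

```latex
\|X\|_* \;=\; \min_{W_1,\, W_2} \ \tfrac{1}{2}\left(\operatorname{tr} W_1 + \operatorname{tr} W_2\right)
\quad \text{s.t.} \quad
\begin{bmatrix} W_1 & X \\ X^* & W_2 \end{bmatrix} \succeq 0
\qquad \text{(semidefinite embedding)}

\|X\|_* \;=\; \min_{L,\, R \,:\, X = L R^*} \ \tfrac{1}{2}\left(\|L\|_F^2 + \|R\|_F^2\right)
\qquad \text{(low-rank parametrization)}
```

The second identity is attained by splitting the SVD $X = U \Sigma V^*$ as $L = U \Sigma^{1/2}$, $R = V \Sigma^{1/2}$, which gives $\tfrac{1}{2}(\|L\|_F^2 + \|R\|_F^2) = \sum_i \sigma_i(X)$.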

Computationally: Gradient Descent! • “Method of multipliers” • Schedule for the penalty parameter controls the noise in the data • Same global minimum as nuclear norm • Dual certificate for the optimal solution • When will this fail, and when might it succeed?
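As a rough illustration of gradient descent on the low-rank parametrization, the sketch below minimizes a penalized least-squares objective over the factors. It is a simplified stand-in, not the method of multipliers from the slide. The function name, step size, regularization weight `lam`, iteration count, and the synthetic test problem are all assumptions.

```python
import numpy as np


def complete_low_rank(M, mask, r, lam=1e-3, step=0.005, iters=3000, seed=0):
    """Plain gradient descent on
        0.5 * ||mask * (L R^T - M)||_F^2 + 0.5 * lam * (||L||_F^2 + ||R||_F^2).
    Only entries where mask == 1 are treated as observed."""
    rng = np.random.default_rng(seed)
    k, n = M.shape
    L = 0.1 * rng.standard_normal((k, r))
    R = 0.1 * rng.standard_normal((n, r))
    for _ in range(iters):
        resid = mask * (L @ R.T - M)      # residual on observed entries only
        grad_L = resid @ R + lam * L
        grad_R = resid.T @ L + lam * R
        L -= step * grad_L
        R -= step * grad_R
    return L, R


# Synthetic rank-2 problem with roughly 60% of the entries observed.
rng = np.random.default_rng(1)
M_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = (rng.random(M_true.shape) < 0.6).astype(float)

L, R = complete_low_rank(M_true, mask, r=2)
rel_err = np.linalg.norm(L @ R.T - M_true) / np.linalg.norm(M_true)
print("relative error over all entries:", rel_err)
```

The Frobenius penalty on the factors is the low-rank form of the nuclear norm, so small `lam` biases the fit toward low nuclear norm while matching the observed entries.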

First theory result • If $m > c_0\, r(k+n-r)\log(kn)$, the heuristic succeeds for most $\mathcal{A}$ (Recht, Fazel, and Parrilo, 2007). • Number of measurements: $c_0$ (constant) × $r(k+n-r)$ (intrinsic dimension) × $\log(kn)$ (ambient dimension $kn$) • Approach: Show that a random $\mathcal{A}$ is nearly an isometry on the manifold of low-rank matrices. • Stable to noise in measurement vector $b$ and returns as good an answer as a truncated SVD of the true $X$.
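To get a feel for the scaling of this bound, the helper below evaluates it for given dimensions. The constant $c_0$ is not specified on the slide, so the default `c0=1.0` is a placeholder assumption. The function name is hypothetical.

```python
import math


def measurement_bound(k, n, r, c0=1.0):
    """Evaluate c0 * r * (k + n - r) * log(k * n).
    c0 = 1.0 is a placeholder; the slide does not give its value."""
    return c0 * r * (k + n - r) * math.log(k * n)


k, n, r = 100, 100, 5
print("intrinsic dimension r(k+n-r):", r * (k + n - r))
print("bound with c0 = 1:", measurement_bound(k, n, r))
print("ambient dimension kn:", k * n)
```

The point of the comparison is that the bound grows with the intrinsic dimension, while the ambient dimension enters only through the logarithm.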

Low-rank Matrix Completion • $X_{ij}$ known for black cells, $X_{ij}$ unknown for white cells • How do you fill in the missing data?

Which matrices? • If $X$ has a single nonzero entry in position (1,1), any subset of entries that misses the (1,1) component tells you nothing! • If only the first row of $X$ is nonzero, you still need to see the entire first row • Want each entry to provide nearly the same amount of information

Incoherence • Let $U$ be a subspace of $\mathbb{R}^n$ of dimension $r$ and $P_U$ be the orthogonal projection onto $U$. Then the coherence of $U$ (with respect to the standard basis $e_i$) is defined to be $\mu(U) = \frac{n}{r} \max_{1 \le i \le n} \|P_U e_i\|^2$ • $\mu(U) \ge 1$ – e.g., span of $r$ columns of the Fourier transform • $\mu(U) \le n/r$ – e.g., any subspace that contains a standard basis element • $\mu(U) = O(1)$ – sampled from the uniform distribution with $r > \log n$
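The two extreme cases on this slide can be reproduced with a short computation. The sketch assumes the standard definition of coherence, $(n/r)\max_i \|P_U e_i\|^2$, and takes a basis of the subspace as the columns of a full-column-rank matrix. The function name and the chosen column indices are illustrative.

```python
import numpy as np


def coherence(basis):
    """Coherence of the column span of `basis` (n x r, full column rank)
    with respect to the standard basis: (n / r) * max_i ||P_U e_i||^2."""
    Q, _ = np.linalg.qr(basis)                     # orthonormal basis for U
    n, r = Q.shape
    row_norms_sq = np.sum(np.abs(Q) ** 2, axis=1)  # ||P_U e_i||^2 for each i
    return (n / r) * row_norms_sq.max()


n, r = 16, 3

# Span of r columns of the (unitary) discrete Fourier transform matrix.
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
mu_fourier = coherence(F[:, [0, 3, 5]])

# Span of r standard basis vectors.
mu_standard = coherence(np.eye(n)[:, :r])

print("Fourier columns:", mu_fourier)
print("standard basis vectors:", mu_standard, "vs n/r =", n / r)
```

The Fourier case spreads its mass evenly over all coordinates and hits the lower bound, while a subspace containing a standard basis vector concentrates on a single coordinate and hits the upper bound.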

Matrix Completion • Suppose $X$ is $k \times n$ ($k \le n$), has rank $r$, and has row and column spaces with incoherence bounded above by $\mu$. Then the nuclear norm heuristic recovers $X$ from most subsets of entries $\Omega$ with cardinality at least [bound not preserved] • If, in addition, $r \le \mu^{-1} n^{1/5}$, then [bound not preserved] entries suffice. Candès and Recht. 2008

Proof Tools • Convex Analysis – KKT Conditions: Find dual certificate proving minimum nuclear norm solution is the hidden low rank matrix – Compressed Sensing: Use ansatz for multiplier and bound its norm • Probability on Banach Spaces – Moment bounds for norms of matrix valued random variables [Rudelson] – Decoupling [Bourgain-Tzafriri, de la Peña et al.]: Indicator variables can be treated as independent – Non-commutative Khintchine Inequality [Lust-Piquard]: Tightly bound the operator norm in terms of the largest entry.

[Figure] • Mixture of hundreds of models, including nuclear norm • Gradient descent on low-rank nuclear norm parameterization

Parsimonious Modeling: A road map • Open Problems in rank minimization: optimal bounds, noise performance, faster algorithms, more mining of connections with compressed sensing • Expanding the parsimony catalog: dynamical systems, nonlinear models, tensors, completely positive matrices, Jordan Algebras, and beyond • Automatic parsimonious programming: computational complexity of norms, algorithm and proof generation • Broad applied impact: data mining time series in biology, medicine, social networks, and human computer interfaces

Acknowledgements • See: http://www.ist.caltech.edu/~brecht/publications.html for all references • Results developed in collaboration with Emmanuel Candès, John Doyle, Babak Hassibi, and Weiyu Xu at Caltech, Ali Rahimi at Intel Research, Harvey Cohen, Lawrence Recht, and John Whitin at Stanford, Maryam Fazel at U Washington, and Pablo Parrilo and the RealNose team at MIT.
