Rank aggregation via nuclear norm minimization, by David F. Gleich


  1. Rank aggregation via nuclear norm minimization. David F. Gleich (Purdue University, @dgleich) and Lek-Heng Lim (University of Chicago). KDD2011, San Diego, CA. Lek funded by NSF CAREER award (DMS-1057064); David funded by DOE John von Neumann and NSERC. David F. Gleich (Purdue) KDD 2011 1/20

  2. Which is a better list of good DVDs?
     Standard rank aggregation (the mean rating): Lord of the Rings 3: The Return of …; Lord of the Rings 1: The Fellowship; Lord of the Rings 2: The Two Towers; Lost: Season 1; Battlestar Galactica: Season 1; Fullmetal Alchemist; Trailer Park Boys: Season 4; Trailer Park Boys: Season 3; Tenchi Muyo!; Shawshank Redemption.
     Nuclear norm based rank aggregation (not matrix completion on the Netflix rating matrix): Lord of the Rings 3: The Return of …; Lord of the Rings 1: The Fellowship; Lord of the Rings 2: The Two Towers; Star Wars V: Empire Strikes Back; Raiders of the Lost Ark; Star Wars IV: A New Hope; Shawshank Redemption; Star Wars VI: Return of the Jedi; Lord of the Rings 3: Bonus DVD; The Godfather.

  3. Rank Aggregation. Given partial orders on subsets of items, rank aggregation is the problem of finding an overall ordering. Voting: find the winning candidate. Program committees: find the best papers given reviews. Dining: find the best restaurant in San Diego (subject to a budget?).

  4. Ranking is really hard. Ken Arrow: all rank aggregations involve some measure of compromise. John Kemeny: a good ranking is the “average” ranking under a permutation distance. Dwork, Kumar, Naor, Sivakumar: NP hard to compute Kemeny’s ranking.

  5. Given a hard problem, what do you do? Numerically relax! It’ll probably be easier. (Image: Embody chair, John Cantrell, flickr.)

  6. Suppose we had scores. Let $s_i$ be the score of the ith movie/song/paper/team to rank. Suppose we can compare the ith to jth: $Y_{ij} = s_i - s_j$. Then $Y = s e^T - e s^T$ (with $e$ the vector of all ones) is skew-symmetric, rank 2. Also works for $Y_{ij} = s_i / s_j$ with an extra log. Numerical ranking is intimately intertwined with skew-symmetric matrices. Reference: Kemeny and Snell, Mathematical Models in the Social Sciences (1978).

  7. Using ratings as comparisons. Ratings induce various skew-symmetric matrices. Examples: arithmetic mean, log-odds. Reference: David, The Method of Paired Comparisons (1988).
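
One way to read the "arithmetic mean" construction is sketched below. The slide's formula did not survive in this transcript, so the definition used here (the mean of rating differences over users who rated both items) is an assumption, and the function name `arithmetic_mean_comparisons` and the toy ratings are hypothetical.

```python
from itertools import combinations


def arithmetic_mean_comparisons(ratings, n_items):
    """Build a skew-symmetric comparison matrix from per-user ratings.

    ratings: list of dicts, one per user, mapping item index -> rating.
    Assumed definition: Y[i][j] is the mean of (rating_i - rating_j) over
    users who rated both i and j; None marks pairs nobody compared.
    """
    total = [[0.0] * n_items for _ in range(n_items)]
    count = [[0] * n_items for _ in range(n_items)]
    for user in ratings:
        for i, j in combinations(sorted(user), 2):
            diff = user[i] - user[j]
            total[i][j] += diff
            total[j][i] -= diff          # keeps the matrix skew-symmetric
            count[i][j] += 1
            count[j][i] += 1
    Y = []
    for i in range(n_items):
        row = []
        for j in range(n_items):
            if i == j:
                row.append(0.0)          # an item compared with itself
            elif count[i][j]:
                row.append(total[i][j] / count[i][j])
            else:
                row.append(None)         # no user rated both items
        Y.append(row)
    return Y, count


# Three toy users rating three items (made-up data).
ratings = [{0: 5, 1: 3}, {0: 4, 1: 2, 2: 1}, {1: 4, 2: 2}]
Y, count = arithmetic_mean_comparisons(ratings, 3)
```

The `count` matrix is returned alongside `Y` because the number of users behind each entry is what later decides whether a comparison is trusted.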

  8. Extracting the scores. Given the comparison matrix $Y$ with all entries, $s = \frac{1}{n} Y e$ (with $e$ the vector of all ones) is the Borda count, the least-squares solution to $Y = s e^T - e s^T$. How many $Y_{ij}$ do we have? Most. Do we trust all $Y_{ij}$? Not really. Netflix data: 17k movies, 500k users, 100M ratings, 99.17% filled. [Figure: number of movie pairs versus number of comparisons, logarithmic axes.]
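
A minimal sketch of the Borda count, assuming it means the row mean of a fully observed skew-symmetric comparison matrix, $s = \frac{1}{n} Y e$. The helper name `borda_scores` and the toy scores are hypothetical. When the comparisons are exact differences of zero-mean scores, the row means return those scores.

```python
def borda_scores(Y):
    """Row means of a fully observed comparison matrix: s = (1/n) * Y * e."""
    n = len(Y)
    return [sum(row) / n for row in Y]


# Toy scores that sum to zero, so the recovered scores match them exactly.
s_true = [2.0, 0.5, -1.0, -1.5]
Y = [[si - sj for sj in s_true] for si in s_true]   # Y[i][j] = s_i - s_j

s = borda_scores(Y)
# Rank items from highest score to lowest.
ranking = sorted(range(len(s)), key=lambda i: -s[i])
```

Scores are only determined up to an additive shift, which is why the example centers them at zero.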

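  9. Only partial info? Complete it! Let $Y_{ij}$ be known for $(i,j) \in \Omega$. We trust these scores. Goal: find the simplest (lowest-rank) skew-symmetric matrix that matches the data. Noiseless: minimize $\operatorname{rank}(X)$ subject to $X = -X^T$ and $X_{ij} = Y_{ij}$ for $(i,j) \in \Omega$. Noisy: the same objective, with the known entries matched only up to a noise tolerance. Both of these are NP-hard too.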

  10. Solution: Go Nuclear. (Image: a French nuclear test in 1970, from http://picdit.wordpress.com/2008/07/21/8-insane-nuclear-explosions/)

  11. The nuclear norm. The analog of the 1-norm, or $\ell_1$ norm, for matrices. For vectors: minimizing $\|x\|_0$ is NP-hard, while minimizing $\|x\|_1$ is convex and gives the same answer “under appropriate circumstances”. For matrices: let $A = U \Sigma V^T$ be the SVD; the nuclear norm $\|A\|_* = \sum_i \sigma_i$ is the best convex underestimator of rank on the unit ball.
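
The vector and matrix analogy can be checked numerically. The sketch below uses NumPy; the example vector and matrix are arbitrary choices, not taken from the talk.

```python
import numpy as np

# Vector side: the 0-"norm" counts nonzeros, the 1-norm sums magnitudes.
x = np.array([0.0, -2.0, 0.0, 5.0])
l0 = np.count_nonzero(x)
l1 = np.abs(x).sum()

# Matrix side: rank counts nonzero singular values, the nuclear norm sums them.
A = np.array([[3.0, 0.0],
              [0.0, -4.0]])
sigma = np.linalg.svd(A, compute_uv=False)   # singular values, descending
rank = np.count_nonzero(sigma > 1e-12)
nuclear = sigma.sum()

# NumPy exposes the same quantity directly via ord="nuc".
nuclear_builtin = np.linalg.norm(A, ord="nuc")
```

So the nuclear norm is the 1-norm of the singular value vector, just as rank is its 0-"norm".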

  12. Only partial info? Complete it! Let $Y_{ij}$ be known for $(i,j) \in \Omega$. We trust these scores. Goal: find the simplest skew-symmetric matrix that matches the data. NP hard: minimize $\operatorname{rank}(X)$. Heuristic, convex: minimize $\|X\|_*$ instead, subject to the same constraints.

  13. Solving the nuclear norm problem. Use a LASSO formulation, solved by an iteration (REPEAT ... UNTIL) whose central step is a rank-k SVD. Jain et al. propose SVP for this problem without the skew-symmetric constraint (Jain et al., NIPS 2010).
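
The individual steps of the slide's algorithm are not preserved here, so the following is only a sketch of a singular value projection (SVP) style iteration in the spirit of Jain et al.: take a gradient step on the observed entries, then project back to rank k with a truncated SVD. The function name `svp_complete`, the step size of 1, the stopping rule, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def svp_complete(Y, mask, k, step=1.0, tol=1e-8, max_iter=500):
    """Sketch of an SVP-style iteration for matrix completion.

    Y    : matrix holding the observed values (other entries are ignored)
    mask : 1.0 where an entry is observed, 0.0 elsewhere
    k    : target rank
    step : gradient step size (1.0 is an assumption chosen for simplicity)
    """
    X = np.zeros_like(Y)
    for _ in range(max_iter):
        residual = mask * (X - Y)             # error on observed entries only
        if np.linalg.norm(residual) <= tol:
            break
        Z = X - step * residual               # gradient step
        U, sigma, Vt = np.linalg.svd(Z)
        X = (U[:, :k] * sigma[:k]) @ Vt[:k]   # keep the top-k singular triplets
    return X


rng = np.random.default_rng(0)
n = 8
s = rng.standard_normal(n)
Y = np.subtract.outer(s, s)                   # Y[i, j] = s[i] - s[j], rank 2

# Observe a symmetric pattern of pairs: (i, j) is seen exactly when (j, i) is.
upper = np.triu(rng.random((n, n)) < 0.7, 1)
mask = (upper | upper.T).astype(float)

X = svp_complete(Y, mask, k=2)
initial_residual = np.linalg.norm(mask * Y)   # residual of the zero start
final_residual = np.linalg.norm(mask * (X - Y))
skew_error = np.linalg.norm(X + X.T)          # zero for a skew-symmetric X
```

With a symmetric observation pattern and an even target rank, every iterate in this sketch stays skew-symmetric up to rounding, so no explicit constraint is imposed.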

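  14. Skew-symmetric SVDs. Let $A = -A^T$ be an $n \times n$ skew-symmetric matrix with eigenvalues $\mathrm{i}\lambda_1, -\mathrm{i}\lambda_1, \ldots, \mathrm{i}\lambda_j, -\mathrm{i}\lambda_j$, where $\lambda_m > 0$ and $j = \lfloor n/2 \rfloor$. Then the SVD of $A$ is given by $A = U \operatorname{diag}(\lambda_1, \lambda_1, \ldots, \lambda_j, \lambda_j) V^T$ for $U$ and $V$ given in the proof. Proof: use the Murnaghan-Wintner form and the SVD of a 2x2 skew-symmetric block. This means that SVP will give us the skew-symmetric constraint “for free”.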
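
A quick numerical check of the paired singular values, as a sketch on a random matrix (the size, the seed, and the variable names are arbitrary). It also illustrates why a truncated SVD of even rank keeps skew-symmetry.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
A = B - B.T                                   # a random skew-symmetric matrix

eigenvalues = np.linalg.eigvals(A)            # purely imaginary, in +/- pairs
eig_magnitudes = np.sort(np.abs(eigenvalues.imag))[::-1]
U, sigma, Vt = np.linalg.svd(A)               # singular values, descending

# Keep the leading pair of singular values: a rank-2 truncation.
A2 = (U[:, :2] * sigma[:2]) @ Vt[:2]
skew_error = np.linalg.norm(A2 + A2.T)
```

Here `sigma` matches `eig_magnitudes`, its entries come in equal pairs, and `skew_error` is at rounding level, provided the cut does not split a pair.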

  15. Exact recovery results. David Gross showed how to recover Hermitian matrices, i.e. the conditions under which recovery is exact. Note that $\mathrm{i}Y$ is Hermitian when $Y$ is real and skew-symmetric. Thus our new result! Reference: Gross, arXiv 2010.

  16. Recovery Discussion and Experiments. Confession: if …, then just look at differences from a connected set. Constants? Not very good. Intuition for the truth.
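
The condition in the confession is missing from this transcript. Assuming it refers to the exact case, where every known entry equals a score difference $Y_{ij} = s_i - s_j$, the remark can be sketched as a walk over the comparison graph. The function name `scores_from_differences` and the toy data are hypothetical.

```python
from collections import deque


def scores_from_differences(n, known):
    """Recover scores (up to a shift) from exact pairwise differences.

    known maps (i, j) -> Y_ij, assumed to equal s_i - s_j exactly.
    The pairs must connect all n items.
    """
    neighbors = {i: [] for i in range(n)}
    for (i, j), y in known.items():
        neighbors[i].append((j, -y))    # s_j = s_i - Y_ij
        neighbors[j].append((i, y))     # s_i = s_j + Y_ij
    scores = {0: 0.0}                   # fix the free shift at item 0
    queue = deque([0])
    while queue:                        # breadth-first walk from item 0
        i = queue.popleft()
        for j, delta in neighbors[i]:
            if j not in scores:
                scores[j] = scores[i] + delta
                queue.append(j)
    if len(scores) < n:
        raise ValueError("comparison graph is not connected")
    mean = sum(scores.values()) / n
    return [scores[i] - mean for i in range(n)]   # center at zero


# Three exact differences that connect four items (made-up data).
known = {(0, 1): 1.0, (1, 2): 1.0, (3, 2): -1.0}
s = scores_from_differences(4, known)
```

In this exact setting only n - 1 well-placed comparisons are needed, which is far fewer than generic recovery bounds would ask for.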

  17. The Ranking Algorithm.
     0. INPUT: ratings data and c (for trust on comparisons).
     1. Compute the comparison matrix from the ratings.
     2. Discard entries with fewer than c comparisons.
     3. Set the known data to be the indices and values of what’s left.
     4. Run SVP on the known data.
     5. OUTPUT the result.
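
Steps 1 to 3 can be sketched as follows, assuming the arithmetic mean of rating differences is used as the comparison. The name `trusted_comparisons` and the toy ratings are hypothetical, and the SVP step is left out.

```python
from itertools import combinations


def trusted_comparisons(ratings, c):
    """Return {(i, j): value} for pairs backed by at least c comparisons.

    ratings: list of dicts, one per user, mapping item index -> rating.
    Each value is the mean rating difference over users who rated both items.
    """
    sums, counts = {}, {}
    for user in ratings:
        for i, j in combinations(sorted(user), 2):
            sums[(i, j)] = sums.get((i, j), 0.0) + (user[i] - user[j])
            counts[(i, j)] = counts.get((i, j), 0) + 1
    omega = {}
    for (i, j), total in sums.items():
        if counts[(i, j)] >= c:                  # discard poorly supported pairs
            omega[(i, j)] = total / counts[(i, j)]
            omega[(j, i)] = -omega[(i, j)]       # skew-symmetric counterpart
    return omega


ratings = [{0: 5, 1: 3}, {0: 4, 1: 2, 2: 1}, {1: 4, 2: 2}]
omega = trusted_comparisons(ratings, c=2)
```

The pair (0, 2) is dropped because only one user rated both items. The surviving indices and values are what a completion routine would take as input.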

  18. Synthetic Results. Construct an Item Response Theory model. Vary number of ratings per user and a noise/error level. [Figure panels: “Our Algorithm!” and “The Average Rating”.]

  19. Conclusions and Future Work.
     Conclusions: “aggregate, then complete”. Rank aggregation with the nuclear norm is principled and easy to compute. The results are much better than simple approaches.
     Future work: 1. Compare against others. 2. Noisy recovery! More realistic sampling. 3. Skew-symmetric Lanczos based SVD?

  20. Google nuclear ranking gleich
