lecture 20

Lecture 20, Jan-Willem van de Meent

Unsupervised Machine Learning and Data Mining, DS 5230 / DS 4420, Fall 2018


  1. Unsupervised Machine Learning and Data Mining • DS 5230 / DS 4420 - Fall 2018 • Lecture 20 • Jan-Willem van de Meent

  2. Schedule

  3. Schedule Adjustments • Wed 28 Nov: Review Lecture • Mon 3 Dec: Project Presentations • Fri 7 Dec: Project Reports Due • Wed 12 Dec: Final Exam • Fri 14 Dec: Peer Reviews Due

  4. Project

  5. Project Reports • ~10 pages (rough guideline) • Guidelines for contents • Introduction / Motivation • Exploratory analysis (if applicable) • Data mining analysis • Discussion of results

  6. Project Review • 2 per person (randomly assigned) • Reviews should discuss 4 aspects of the report • Clarity (is the writing clear?) • Technical merit (are methods valid?) • Reproducibility (is it clear how results were obtained?) • Discussion (are results interpretable?)

  7. Recommender Systems

  8. The Long Tail (from: https://www.wired.com/2004/10/tail/)

  9. The Long Tail (from: https://www.wired.com/2004/10/tail/)

  10. The Long Tail (from: https://www.wired.com/2004/10/tail/)

  11. Problem Setting

  12. Problem Setting

  13. Problem Setting

  14. Problem Setting • Task : Predict user preferences for unseen items

  15. Content-based Filtering [Figure: movies placed on two axes, serious vs. escapist and geared towards males vs. geared towards females. Movies shown: Braveheart, The Color Purple, Amadeus, Lethal Weapon, Sense and Sensibility, Ocean's 11, Dave, The Lion King, Dumb and Dumber, The Princess Diaries, Independence Day. User shown: Gus] Two Approaches: 1. Predict rating using item features on a per-user basis 2. Predict rating using user features on a per-item basis

  16. Collaborative Filtering [Figure: user Joe, with labels #1 to #4] Idea: Predict rating based on similarity to other users

  17. Problem Setting • Task : Predict user preferences for unseen items • Content-based filtering : Model user/item features • Collaborative filtering : Implicit similarity of users or items

  18. Applications of Recommender Systems • Movie recommendation (Netflix) • Related product recommendation (Amazon) • Web page ranking (Google) • Social recommendation (Facebook) • Priority inbox & spam filtering (Google) • Online dating (OK Cupid) • Computational Advertising (Everyone)

  19. Challenges • Scalability • Millions of objects • 100s of millions of users • Cold start • Changing user base • Changing inventory • Imbalanced dataset • User activity / item reviews are power-law distributed • Ratings are not missing at random

  20. Running Example: Netflix Data
      Training data                    Test data
      user  movie  date      score     user  movie  date      score
      1     21     5/7/02    1         1     62     1/6/05    ?
      1     213    8/2/04    5         1     96     9/13/04   ?
      2     345    3/6/01    4         2     7      8/18/05   ?
      2     123    5/1/05    4         2     3      11/22/05  ?
      2     768    7/15/02   3         3     47     6/13/02   ?
      3     76     1/22/01   5         3     15     8/12/01   ?
      4     45     8/3/00    4         4     41     9/1/00    ?
      5     568    9/10/05   1         4     28     8/27/05   ?
      5     342    3/5/03    2         5     93     4/4/05    ?
      5     234    12/28/00  2         5     74     7/16/03   ?
      6     76     8/11/02   5         6     69     2/14/04   ?
      6     56     6/15/03   4         6     83     10/3/03   ?
      • Released as part of $1M competition by Netflix in 2006 • Prize awarded to BellKor in 2009

  21. Running Yardstick: RMSE $\mathrm{rmse}(S) = \sqrt{|S|^{-1} \sum_{(i,u) \in S} (\hat{r}_{ui} - r_{ui})^2}$

  22. Running Yardstick: RMSE $\mathrm{rmse}(S) = \sqrt{|S|^{-1} \sum_{(i,u) \in S} (\hat{r}_{ui} - r_{ui})^2}$ (doesn't tell you how to actually do recommendation)
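The RMSE yardstick can be sketched in a few lines. This is a minimal illustration, assuming predictions and true ratings are stored in dicts keyed by (user, item); the toy ratings are hypothetical and not taken from the Netflix data.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error over a set S of (user, item) pairs.

    Both arguments are dicts keyed by (user, item); the keys of `actual` define S.
    """
    if not actual:
        raise ValueError("S must contain at least one rating")
    squared_error = sum((predicted[key] - actual[key]) ** 2 for key in actual)
    return math.sqrt(squared_error / len(actual))

# Hypothetical toy ratings for illustration only.
actual = {(1, 62): 4.0, (1, 96): 2.0, (2, 7): 5.0}
predicted = {(1, 62): 3.0, (1, 96): 2.0, (2, 7): 3.0}
print(rmse(predicted, actual))
```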

  23. Content-based Filtering

  24. Item-based Features

  25. Item-based Features

  26. Item-based Features

  27. Per-user Regression • Learn a set of regression coefficients for each user: $w_u = \operatorname{argmin}_w \| r_u - X w \|^2$
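A minimal sketch of the per-user regression, assuming NumPy and a small hypothetical feature matrix X whose rows are the items one user has rated. The feature names and values are made up for illustration.

```python
import numpy as np

def fit_user_weights(X, r_u):
    """Least-squares weights for one user: argmin_w ||r_u - X w||^2.

    X   : (n_rated_items, n_features) features of the items this user rated
    r_u : (n_rated_items,) the user's ratings for those items
    """
    w_u, *_ = np.linalg.lstsq(X, r_u, rcond=None)
    return w_u

# Hypothetical item features: [constant, "serious" score, "action" score]
X = np.array([[1.0, 0.9, 0.1],
              [1.0, 0.2, 0.8],
              [1.0, 0.5, 0.5],
              [1.0, 0.1, 0.2]])
w_true = np.array([3.0, 2.0, -1.0])
r_u = X @ w_true                      # noise-free ratings, so the fit is exact
w_u = fit_user_weights(X, r_u)

x_new = np.array([1.0, 0.7, 0.3])     # features of an unseen item
print(w_u, x_new @ w_u)               # predicted rating is the dot product
```

With few ratings per user this fit can overfit badly, which is one reason the ridge penalty on the later slides is useful.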

  28. User Bias and Item Popularity

  29. Bias

  30. Bias [Figure: example ratings for Moonrise Kingdom]

  31. Bias [Figure: example ratings for Moonrise Kingdom] Problem: Some movies are universally loved / hated

  32. Bias [Figure: example ratings for Moonrise Kingdom] Problem: Some movies are universally loved / hated; some users are more picky than others

  33. Bias [Figure: example ratings for Moonrise Kingdom] Problem: Some movies are universally loved / hated; some users are more picky than others • Solution: Introduce a per-movie and per-user bias
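One way to picture the per-movie and per-user bias is a baseline of the form global mean + user bias + item bias. The sketch below estimates the biases by simple sequential averaging; that estimation scheme and all names are assumptions for illustration, not something stated on the slides.

```python
from collections import defaultdict

def fit_biases(ratings):
    """Estimate a global mean, per-user bias, and per-item bias.

    ratings: list of (user, item, rating) triples.
    Item biases are mean residuals after removing the global mean;
    user biases are mean residuals after also removing the item bias.
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)

    item_resid = defaultdict(list)
    for _, i, r in ratings:
        item_resid[i].append(r - mu)
    b_item = {i: sum(v) / len(v) for i, v in item_resid.items()}

    user_resid = defaultdict(list)
    for u, i, r in ratings:
        user_resid[u].append(r - mu - b_item[i])
    b_user = {u: sum(v) / len(v) for u, v in user_resid.items()}
    return mu, b_user, b_item

def baseline(mu, b_user, b_item, u, i):
    """Baseline prediction; unknown users or items get a bias of zero."""
    return mu + b_user.get(u, 0.0) + b_item.get(i, 0.0)

# Hypothetical ratings: item "A" is well liked, user "bob" is picky.
ratings = [("ann", "A", 5), ("ann", "B", 3), ("bob", "A", 4), ("bob", "B", 2)]
mu, b_user, b_item = fit_biases(ratings)
print(baseline(mu, b_user, b_item, "ann", "A"))
```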

  34. Collaborative Filtering

  35. Neighborhood Based Methods [Figure: user Joe, with labels #1 to #4] Users and items form a bipartite graph (edges are ratings)

  36. Neighborhood Based Methods
      (user, user) similarity • predict rating based on average from k-nearest users • good if item base is small • good if item base changes rapidly
      (item, item) similarity • predict rating based on average from k-nearest items • good if the user base is small • good if user base changes rapidly

  37. Parzen-Window Style CF [Figure: user Joe, with labels #1 to #4] • Define a similarity $s_{ij}$ between items • Find set $\varepsilon_k(i, u)$ of k-nearest neighbors to i that were rated by user u • Predict rating using weighted average over set • How should we define $s_{ij}$?
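The three steps above can be sketched as follows. This is a minimal illustration assuming non-negative similarities stored in a dict; the function name, the data layout, and the toy values are hypothetical.

```python
def predict_rating(item, user_ratings, similarity, k=2):
    """Predict one user's rating for `item` from the k most similar items they rated.

    user_ratings: dict item -> rating for a single user u
    similarity  : dict (item_a, item_b) -> s_ij, assumed non-negative here
    """
    # Candidate neighbors: every item j that user u has already rated.
    rated = [(similarity[(item, j)], r) for j, r in user_ratings.items() if j != item]
    # Keep the k neighbors with the largest similarity to `item`.
    neighbors = sorted(rated, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(s for s, _ in neighbors)
    if total == 0:
        return None  # no usable neighbors; a real system would fall back to a baseline
    # Similarity-weighted average of the neighbors' ratings.
    return sum(s * r for s, r in neighbors) / total

# Hypothetical data: predict the user's rating for item "A".
user_ratings = {"B": 5.0, "C": 1.0, "D": 3.0}
similarity = {("A", "B"): 0.9, ("A", "C"): 0.1, ("A", "D"): 0.3}
print(predict_rating("A", user_ratings, similarity, k=2))
```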

  38. Pearson Correlation Coefficient • Each item is rated by a distinct set of users
      User ratings for item i: ? ? 1 ? ? 5 5 3 ? ? 4 2 ? ? ? 4 ? 5 4 1 ?
      User ratings for item j: ? ? 4 2 5 ? ? 1 2 5 ? ? 2 ? ? 3 ? ? ? 5 4
      $s_{ij} = \dfrac{\mathrm{Cov}[r_{ui}, r_{uj}]}{\mathrm{Std}[r_{ui}]\,\mathrm{Std}[r_{uj}]}$
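Because each item is rated by a different set of users, the correlation can only be estimated on users who rated both items. The sketch below uses the two rating rows from this slide and centers on the mean over co-rating users; that centering choice is an assumption made to keep the example short.

```python
import math

def pearson_on_overlap(ratings_i, ratings_j):
    """Pearson correlation over users who rated both items.

    Each argument is a list with one entry per user; None marks a missing rating.
    Returns (rho, support), where support is the number of co-rating users.
    """
    pairs = [(a, b) for a, b in zip(ratings_i, ratings_j)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return 0.0, n
    mean_i = sum(a for a, b in pairs) / n
    mean_j = sum(b for a, b in pairs) / n
    cov = sum((a - mean_i) * (b - mean_j) for a, b in pairs)
    var_i = sum((a - mean_i) ** 2 for a, b in pairs)
    var_j = sum((b - mean_j) ** 2 for a, b in pairs)
    if var_i == 0 or var_j == 0:
        return 0.0, n  # correlation is undefined for constant ratings
    return cov / math.sqrt(var_i * var_j), n

NA = None  # shorthand for a missing rating ("?" on the slide)
item_i = [NA, NA, 1, NA, NA, 5, 5, 3, NA, NA, 4, 2, NA, NA, NA, 4, NA, 5, 4, 1, NA]
item_j = [NA, NA, 4, 2, 5, NA, NA, 1, 2, 5, NA, NA, 2, NA, NA, 3, NA, NA, NA, 5, 4]
rho, support = pearson_on_overlap(item_i, item_j)
print(rho, support)
```

Only a handful of the 21 users rated both items here, which is why the next slide shrinks the estimate when the support is small.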

  39. (item,item) similarity
      Empirical estimate of Pearson correlation coefficient:
      $\hat{\rho}_{ij} = \dfrac{\sum_{u \in U(i,j)} (r_{ui} - b_{ui})(r_{uj} - b_{uj})}{\sqrt{\sum_{u \in U(i,j)} (r_{ui} - b_{ui})^2 \, \sum_{u \in U(i,j)} (r_{uj} - b_{uj})^2}}$
      U(i, j): set of users who have rated both i and j
      Regularize towards 0 for small support:
      $s_{ij} = \dfrac{|U(i,j)| - 1}{|U(i,j)| - 1 + \lambda} \, \hat{\rho}_{ij}$
      Regularize towards baseline for small neighborhood
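The shrinkage towards 0 is a one-line computation. A minimal sketch follows; the value of lambda used here is a hypothetical choice, not one given on the slides.

```python
def shrunk_similarity(rho_hat, support, lam=100.0):
    """s_ij = (|U(i,j)| - 1) / (|U(i,j)| - 1 + lambda) * rho_hat.

    rho_hat : empirical Pearson correlation between items i and j
    support : |U(i,j)|, the number of users who rated both items
    lam     : shrinkage strength (must be positive); larger values shrink more
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    n = support - 1
    return n / (n + lam) * rho_hat

# The same correlation counts for much less when few users support it.
print(shrunk_similarity(0.8, support=5))
print(shrunk_similarity(0.8, support=1001))
```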

  40. Similarity for binary labels
      Pearson correlation is not meaningful for binary labels (e.g. Views, Purchases, Clicks)
      Jaccard similarity: $s_{ij} = \dfrac{m_{ij}}{\alpha + m_i + m_j - m_{ij}}$
      Observed / Expected ratio: $s_{ij} = \dfrac{\text{observed}}{\text{expected}} \approx \dfrac{m_{ij}}{\alpha + m_i m_j / m}$
      $m_i$: users acting on i • $m_{ij}$: users acting on both i and j • $m$: total number of users
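Both binary-label similarities depend only on counts. A minimal sketch, assuming each item is represented by the set of users who acted on it; the user sets and the total user count are hypothetical.

```python
def jaccard(m_ij, m_i, m_j, alpha=0.0):
    """Jaccard similarity: m_ij / (alpha + m_i + m_j - m_ij)."""
    return m_ij / (alpha + m_i + m_j - m_ij)

def observed_over_expected(m_ij, m_i, m_j, m, alpha=0.0):
    """Observed co-occurrences over the count expected under independence."""
    return m_ij / (alpha + m_i * m_j / m)

# Hypothetical click data: sets of user ids that acted on each item.
users_i = {1, 2, 3, 4}
users_j = {3, 4, 5}
m = 10                                   # total number of users
m_i, m_j = len(users_i), len(users_j)
m_ij = len(users_i & users_j)            # users acting on both items

print(jaccard(m_ij, m_i, m_j))
print(observed_over_expected(m_ij, m_i, m_j, m))
```

A positive alpha keeps both ratios small when the counts themselves are small, playing the same role as lambda in the shrunk Pearson estimate.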

  41. Matrix Factorization Methods

  42. Matrix Factorization [Figure: example ratings for Moonrise Kingdom]

  43. Matrix Factorization [Figure: example ratings for Moonrise Kingdom] Idea: pose as (biased) matrix factorization problem

  44. Matrix Factorization [Figure: sparse items × users rating matrix approximated by the product of an items × 3 factor matrix and a 3 × users factor matrix] A rank-3 SVD approximation

  45. Prediction [Figure: the same factorization, with one unobserved entry of the rating matrix marked "?"] A rank-3 SVD approximation

  46. Prediction [Figure: the same factorization, with the unobserved entry filled in by the predicted value 2.4] A rank-3 SVD approximation
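Prediction from a rank-3 factorization is a dot product between an item's factor row and a user's factor column. The sketch below uses a hypothetical, fully observed matrix of exact rank 3 so that plain `np.linalg.svd` applies; it does not reproduce the numbers in the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical fully observed 6 items x 8 users matrix of exact rank 3.
R = rng.normal(size=(6, 3)) @ rng.normal(size=(3, 8))

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 3
item_factors = U[:, :k] * s[:k]   # (items, k), singular values folded into the items
user_factors = Vt[:k, :]          # (k, users)

def predict(i, u):
    """Predicted rating of user u for item i from the rank-k factors."""
    return item_factors[i] @ user_factors[:, u]

# Because R has rank 3, the rank-3 truncation reproduces it exactly.
print(np.allclose(item_factors @ user_factors, R))
print(predict(2, 5), R[2, 5])
```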

  47. SVD with missing values [Figure: sparse items × users rating matrix approximated by the product of two rank-3 factor matrices] • Pose as regression problem • SVD isn't defined when entries are unknown • Regularize using Frobenius norm
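The formulas on this slide did not survive extraction. A common way to write the regression view with a Frobenius-norm penalty is given below as an assumption, not as the slide's exact notation. Here $S$ is the set of observed (item, user) pairs, $x_i$ is the factor vector of item $i$, $w_u$ is the factor vector of user $u$, and $X$ and $W$ stack these vectors as rows.

```latex
\min_{X, W} \sum_{(i,u) \in S} \left( r_{ui} - x_i^\top w_u \right)^2
  + \lambda \left( \lVert X \rVert_F^2 + \lVert W \rVert_F^2 \right)
```

The sum runs only over observed ratings, which is what makes the problem well defined when entries are unknown.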

  48. Alternating Least Squares [Figure: sparse items × users rating matrix approximated by the product of two rank-3 factor matrices] • SVD isn't defined when entries are unknown • (regress $w_u$ given $X$)

  49. Alternating Least Squares [Figure: sparse items × users rating matrix approximated by the product of two rank-3 factor matrices] • SVD isn't defined when entries are unknown • (regress $w_u$ given $X$) • L2: closed form solution • Remember ridge regression? $w = (X^T X + \lambda I)^{-1} X^T y$
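A minimal sketch of alternating least squares built on the ridge-regression closed form above, assuming NumPy. The function names, the boolean mask layout, the hyperparameter values, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam I)^{-1} X^T y."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

def als(R, mask, k=3, lam=0.1, n_iters=20, seed=0):
    """Alternating least squares on the observed entries of R.

    R    : (n_items, n_users) ratings; values where mask is False are ignored
    mask : boolean array, True where a rating is observed
    Returns item factors X (n_items, k) and user factors W (n_users, k).
    """
    rng = np.random.default_rng(seed)
    n_items, n_users = R.shape
    X = rng.normal(scale=0.1, size=(n_items, k))
    W = rng.normal(scale=0.1, size=(n_users, k))
    for _ in range(n_iters):
        for u in range(n_users):          # regress w_u given X
            rated = mask[:, u]
            if rated.any():
                W[u] = ridge(X[rated], R[rated, u], lam)
        for i in range(n_items):          # regress x_i given W
            raters = mask[i, :]
            if raters.any():
                X[i] = ridge(W[raters], R[i, raters], lam)
    return X, W

def objective(R, mask, X, W, lam):
    """Squared error on observed entries plus the Frobenius-norm penalty."""
    err = (R - X @ W.T)[mask]
    return (err ** 2).sum() + lam * ((X ** 2).sum() + (W ** 2).sum())

# Hypothetical data: a rank-3 matrix with roughly 30% of entries hidden.
rng = np.random.default_rng(1)
R = rng.normal(size=(8, 3)) @ rng.normal(size=(3, 10))
mask = rng.random(R.shape) < 0.7
X, W = als(R, mask)
print(objective(R, mask, X, W, lam=0.1))
```

Each inner update solves its own ridge problem exactly, so the regularized objective cannot increase from one sweep to the next.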
