Collaborative Filtering Presentation by Alex Hugger

Filtering Documents (slides dated Wednesday, 28 April 2010)

Content-Based Methods
- Find other popular items by the same author or with similar keywords
- Recommendation quality is relatively poor

Filtering Music

Filtering Jokes (www.xkcd.org)

Filtering Jokes
- Let the users rate the jokes
- Sort by average rating
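The ranking described on this slide can be sketched in a few lines of Python. The joke ids and ratings below are made-up illustration data, and `rank_by_average` is a hypothetical helper name, not something from the presentation.

```python
from statistics import mean

# Hypothetical ratings: joke id -> list of user ratings (1 = bad, 5 = great).
ratings = {
    "joke_a": [5, 4, 4],
    "joke_b": [2, 3],
    "joke_c": [5, 5, 1, 1],
}

def rank_by_average(ratings):
    """Return joke ids sorted by mean rating, best first."""
    return sorted(ratings, key=lambda joke: mean(ratings[joke]), reverse=True)

ranking = rank_by_average(ratings)
print(ranking)
```

Every user gets the same list from this approach. The polarised ratings of `joke_c` are averaged away, which is the kind of information collaborative filtering tries to exploit.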

Collaborative Filtering
- People who have agreed in the past tend to agree in the future

Good or Bad?
- Dirty Dancing (1987)
- Die Hard (1988)

Good or Bad?

Jester 4.0 (http://eigentaste.berkeley.edu)

MovieLens (http://movielens.org)

Netflix
- www.netflix.org
- DVD/Blu-ray rental and video streaming
- US$1,000,000 for the first team to beat the current recommendation algorithm by 10%
- Competition started in October 2006
- Ended July 2009

GroupLens: An Open Architecture for Collaborative Filtering of NetNews
Research paper from 1994 by:
- Paul Resnick, MIT Center for Coordination Science
- Neophytos Iacovou, University of Minnesota
- Mitesh Suchak, MIT Center for Coordination Science
- Peter Bergstrom, University of Minnesota
- John Riedl, University of Minnesota

NetNews

Problems of NetNews
Signal-to-noise ratio is too low
- Splitting the bulletin board into newsgroups
- Moderated newsgroups
- News clients
  - Summary of the author and subject line
  - Display discussion threads together
  - String search facilities
  - Kill files

Modification to NetNews
[Screenshot not preserved; the values 2.61 and 3.72 appear in it]

Predicting Scores
- The score prediction system is robust to certain differences in how users interpret the rating scale
  - One user rates 3-5 and the other 1-3
  - One user thinks 1 is the best score and the other thinks 5 is
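A small sketch of why such differences need not matter, assuming similarity is measured with the Pearson correlation that the presentation introduces next. The rating lists are hypothetical and `pearson` is an illustrative helper, not code from the GroupLens system.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equally long rating lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

# Hypothetical ratings of the same five articles.
narrow_high = [3, 4, 5, 4, 3]                   # a user who only uses 3-5
narrow_low = [x - 2 for x in narrow_high]       # same taste, but uses 1-3
reversed_scale = [6 - x for x in narrow_high]   # treats 1 as the best score

shifted = pearson(narrow_high, narrow_low)
flipped = pearson(narrow_high, reversed_scale)
print(round(shifted, 2), round(flipped, 2))
```

A shifted scale leaves the correlation at +1. A reversed scale gives -1, which is still fully informative because a high rating from that user then counts as evidence for a low one.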

Predicting Scores
- Predictions can be modeled as matrix filling

Item #   Ken   Lee   Meg   Nan
1        1     4     2     2
2        5     2     4     4
3        -     -     3     -
4        2     5     -     5
5        4     1     -     1
6        ?     2     5     -

(- = not rated, ? = score to be predicted)

Predicting Scores
- Assign similarities to each of the other people
- Compute over articles rated by both
- Pearson correlation coefficients
- Between -1 and 1

$$ r_{\text{Ken},\text{Lee}} = \frac{\operatorname{cov}(K, L)}{\sigma_{\text{Ken}} \cdot \sigma_{\text{Lee}}} = \frac{E[(K - \mu_{\text{Ken}})(L - \mu_{\text{Lee}})]}{\sigma_{\text{Ken}} \cdot \sigma_{\text{Lee}}} $$

$\sigma_{\text{Ken}}$ = standard deviation of Ken's ratings
$\mu_{\text{Ken}}$ = average of Ken's ratings
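As a sketch, the coefficient can be computed directly from the example rating matrix in the slides. Using `None` for an unrated article and the helper name `pearson` are illustrative choices, not part of the presentation. The means are taken over the co-rated articles only, as the slide's "compute over articles rated by both" suggests.

```python
from math import sqrt

# Ratings for articles 1-6; None marks an article the user did not rate.
ratings = {
    "Ken": [1, 5, None, 2, 4, None],
    "Lee": [4, 2, None, 5, 1, 2],
    "Meg": [2, 4, 3, None, None, 5],
    "Nan": [2, 4, None, 5, 1, None],
}

def pearson(a, b):
    """Pearson correlation over the articles rated by both users."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    mean_a = sum(x for x, _ in pairs) / len(pairs)
    mean_b = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    var_a = sum((x - mean_a) ** 2 for x, _ in pairs)
    var_b = sum((y - mean_b) ** 2 for _, y in pairs)
    return cov / sqrt(var_a * var_b)

correlations = {
    other: pearson(ratings["Ken"], ratings[other])
    for other in ("Lee", "Meg", "Nan")
}
for other, r in correlations.items():
    print(other, round(r, 2))
```

The sketch does not guard against users with fewer than two co-rated articles or with constant ratings; both would make the denominator zero.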

Predicting Scores
- Correlation coefficients of Ken (computed from the rating matrix)

User   Correlation
Lee    -0.8
Meg    +1
Nan     0

Predicting Scores
- Weighted average of all ratings on article 6
- Ken's prediction is 4.56

$$ K_{6,\text{Prediction}} = \mu_{\text{Ken}} + \frac{\sum_{J \in \text{Raters}} (J_6 - \mu_J) \cdot r_{\text{Ken},J}}{\sum_{J \in \text{Raters}} |r_{\text{Ken},J}|} $$
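The 4.56 can be reproduced with a short sketch. The numbers come from the slides' example; the variable and function names are illustrative. One assumption is worth flagging: each rater's mean is taken over the articles co-rated with Ken (3.0 for both Lee and Meg), because that choice reproduces the value on the slide.

```python
# Average of Ken's ratings (1, 5, 2, 4).
ken_mean = 3.0

# For each user who rated article 6:
# (rating on article 6, mean over articles co-rated with Ken, correlation with Ken).
raters = {
    "Lee": (2, 3.0, -0.8),
    "Meg": (5, 3.0, 1.0),
}

def predict(user_mean, raters):
    """Mean-centred, correlation-weighted average of the other users' ratings."""
    weighted = sum((rating - mean) * r for rating, mean, r in raters.values())
    norm = sum(abs(r) for _, _, r in raters.values())
    return user_mean + weighted / norm

prediction = predict(ken_mean, raters)
print(round(prediction, 2))  # 4.56
```

Lee's below-average rating of article 6 raises the prediction, because Lee and Ken are negatively correlated. Nan has not rated article 6 and therefore does not contribute.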

Scaling Issues
- Relevant performance measures
  - Prediction quality
  - Compute time and disk storage
- Each rating is small, but each article may be rated by many users
- Volume of ratings could exceed volume of news

Scaling Issues
- Pre-fetching ratings and pre-computing predictions keeps the time the user waits constant
- High computational complexity
- Volume of all ratings may exceed the storage capacity
- Example: 100,000 users each rate 10 articles per day at 100 bytes per rating, so 1 GB of storage is required every 10 days
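The storage estimate is simple arithmetic, spelled out below. The constant names are illustrative, and a gigabyte is assumed to mean 10^9 bytes.

```python
USERS = 100_000
RATINGS_PER_USER_PER_DAY = 10
BYTES_PER_RATING = 100

bytes_per_day = USERS * RATINGS_PER_USER_PER_DAY * BYTES_PER_RATING
days_per_gb = 10**9 / bytes_per_day  # decimal gigabyte (10^9 bytes)

print(bytes_per_day)  # 100000000 bytes, i.e. 100 MB per day
print(days_per_gb)    # 10.0
```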

Cluster Models

Cluster Models
- Better online scalability and performance than classical collaborative filtering
- Complex and extensive clustering is run offline
- Prediction quality is reduced

Item-to-Item Collaborative Filtering

Item-to-Item Collaborative Filtering
- Amazon.com uses recommendation algorithms extensively
- 10,000,000 products and customers
- Results returned in real time (< 0.5 s)
- Algorithm must respond immediately to new information

Amazon.com

How It Works - Offline
- Similar-items table
- Calculating similarity between a single product and all related products
- Complexity: O(mn^2) in the worst case, in practice O(mn)
  - m: number of users
  - n: number of items
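A minimal sketch of the offline step, under stated assumptions: the slides do not name a similarity measure, so cosine similarity over binary purchase vectors is used here as one common choice. The purchase data and the name `build_similar_items` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical purchase data: customer -> set of purchased items.
purchases = {
    "u1": {"book", "lamp"},
    "u2": {"book", "lamp", "pen"},
    "u3": {"pen"},
}

def build_similar_items(purchases):
    """Return {item: {other_item: cosine similarity}} for co-purchased items."""
    buyers = defaultdict(int)    # item -> number of customers who bought it
    together = defaultdict(int)  # (item_a, item_b) -> co-purchase count
    for items in purchases.values():
        for item in items:
            buyers[item] += 1
        for a, b in combinations(sorted(items), 2):
            together[(a, b)] += 1

    table = defaultdict(dict)
    for (a, b), count in together.items():
        # Cosine of two binary vectors: shared buyers / sqrt(|A| * |B|).
        sim = count / sqrt(buyers[a] * buyers[b])
        table[a][b] = sim
        table[b][a] = sim
    return dict(table)

similar_items = build_similar_items(purchases)
print(similar_items["book"])
```

Only item pairs that some customer actually bought together are visited. That is one way to read the slide's remark that the practical cost is closer to O(mn) than to the worst case.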

How It Works - Online
- Given a similar-items table
- Find all items similar to each of the user's ratings and purchases
- Aggregate those items
- Recommend the most popular and correlated items
- Number of users has no effect on performance
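The online step might look like the sketch below. The table contents are hypothetical, and summing similarity scores is only one plausible reading of "aggregate"; the slides do not specify the aggregation rule.

```python
from collections import defaultdict

# Hypothetical pre-computed similar-items table: item -> {similar item: score}.
similar_items = {
    "book": {"lamp": 1.0, "pen": 0.5},
    "lamp": {"book": 1.0, "pen": 0.5},
    "pen": {"book": 0.5, "lamp": 0.5},
}

def recommend(user_items, similar_items, top_n=3):
    """Aggregate the neighbours of the user's items and rank them by total score."""
    scores = defaultdict(float)
    for item in user_items:
        for other, sim in similar_items.get(item, {}).items():
            if other not in user_items:  # skip what the user already has
                scores[other] += sim
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

print(recommend({"book"}, similar_items))
```

The lookup touches only the user's own items and their table entries, so its cost does not depend on the total number of users, in line with the last bullet on the slide.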

General difficulties
- Cold start
- Self-fulfilling prophecy
- Recommendations for groups
- Evaluation of recommendation systems

Conclusion
- Effective form of targeted marketing
- Mostly used in e-commerce
- But can be used whenever the signal-to-noise ratio is too low

Questions?

References
- GroupLens: An Open Architecture for Collaborative Filtering of NetNews. Published 1994. Paul Resnick (MIT Center for Coordination Science), Neophytos Iacovou (University of Minnesota), Mitesh Suchak (MIT Center for Coordination Science), Peter Bergstrom (University of Minnesota), John Riedl (University of Minnesota).
- Amazon.com Recommendations. Published 2003. Greg Linden, Brent Smith, Jeremy York.
