Recommender Systems
Jee-Hyong Lee Information & Intelligence System Lab. Department of Computer Science & Engineering Sungkyunkwan University
Outline
1. Introduction
2. Collaborative Filtering
3. Content-based Recommendation
4. Context-aware Recommendation
5. Other Approaches
6. Concluding Remarks

1. Introduction
Recommender Systems
Recommender Systems
– 2/3 of the movies watched are recommended
– Recommendations generate 38% more clickthrough
– 35% sales from recommendations
– 28% of the people would buy more music if they found what they liked
Definition of Recommender Systems
– User profile (usage history, demographics, …) – Items (with or without additional information)
– Relevance scores of unseen items – List of unseen items
– Information Retrieval: document models, similarity, ranking – Machine Learning & Data Mining: classification, clustering, regression, probability, association – Others: user modeling, HCI
Approaches
– Memory based CF
– Model based CF
Boltzmann machine, Probabilistic approach, Other classifiers
– Content/User modeling & similarity
– Pre-filtering, Post-filtering – Contextual modeling
Approaches
– Combining Multiple Recommendation Approach – Combining Multiple Information
– Diversity in Recommendation – Division of Profiles into Sub-Profiles – Recommendation for group users
2. Collaborative Filtering
Overview
[Figure: collaborative filtering overview. Labels: Other people's data; Target User; Candidate Items; Item/Score table (I101: 0.7, I12: 0.9, I32: 1.0, …); List (I21, I213, …)]
Overview
– Customers who had similar tastes in the past will have similar tastes in the future – Implicit or explicit user ratings of items are available
– Based on big data: commercial e-commerce sites – Easy to explain: wisdom of the crowd – Flexible: various algorithms exist – Examples: books, movies, DVDs, …
Collaborative Filtering
– User-based CF – Item-based CF
– Dimension reduction (Matrix Factorization) – Clustering – Association rule mining – Restricted Boltzmann machine – Probabilistic models – Various machine learning approaches
User-based Collaborative Filtering
– Predict the ratings of the active user based on the ratings of similar users
         I1  I2  I3  I4  I5
Active    4   3   ?   5   4
U1        2   2   2   3   3
U2        3   2   4   5   4
U3        2   3   3   2   5
U4        1   5   1   4   2
User-based Collaborative Filtering
– $r_{u,i}$: rating of user u for item i – $\bar{r}_u$: user u's average rating
\[
sim(u_1, u_2) = \frac{\sum_{i \in I} (r_{u_1,i} - \bar{r}_{u_1})(r_{u_2,i} - \bar{r}_{u_2})}{\sqrt{\sum_{i \in I} (r_{u_1,i} - \bar{r}_{u_1})^2}\,\sqrt{\sum_{i \in I} (r_{u_2,i} - \bar{r}_{u_2})^2}}
\]
         I1  I2  I3  I4  I5
Active    4   3   ?   5   4
U1        2   2   2   3   3
U2        3   2   4   5   4
U3        2   3   3   2   5
U4        1   5   1   4   2
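The similarity on this slide is a Pearson correlation between two users' ratings. A minimal sketch of how it could be computed is given below. The toy `ratings` dictionary and the function name `pearson` are illustrative assumptions, not the slide's data, and the means are taken over co-rated items only, which is one of several conventions in use.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not the table from the slide.
ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4, "i4": 4},
    "bob": {"i1": 3, "i2": 1, "i3": 2, "i4": 3, "i5": 3},
    "carol": {"i1": 1, "i2": 5, "i3": 5, "i4": 2, "i5": 1},
}


def pearson(ratings, u1, u2):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(ratings[u1]) & set(ratings[u2]))
    if len(common) < 2:
        return 0.0
    # Assumption: each user's mean is computed over the co-rated items only.
    mean1 = sum(ratings[u1][i] for i in common) / len(common)
    mean2 = sum(ratings[u2][i] for i in common) / len(common)
    num = sum((ratings[u1][i] - mean1) * (ratings[u2][i] - mean2) for i in common)
    den1 = sqrt(sum((ratings[u1][i] - mean1) ** 2 for i in common))
    den2 = sqrt(sum((ratings[u2][i] - mean2) ** 2 for i in common))
    if den1 == 0 or den2 == 0:
        return 0.0  # a user with constant ratings has no defined correlation
    return num / (den1 * den2)


sim_bob = pearson(ratings, "alice", "bob")
sim_carol = pearson(ratings, "alice", "carol")
```

With this toy data, `sim_bob` is positive and `sim_carol` is negative, so only `bob` would be a useful neighbour for `alice`.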
User-based Collaborative Filtering
\[
pred(u, i) = \bar{r}_u + \frac{\sum_{v \in U} sim(u, v)\,(r_{v,i} - \bar{r}_v)}{\sum_{v \in U} sim(u, v)}
\]
         I1  I2  I3  I4  I5  Sim.
Active    4   3   ?   5   4
U1        2   2   2   3   3  0.71
U2        3   2   4   5   4  0.85
U3        2   3   3   2   5  0.24
U4        1   5   1   4   2
pred(Target, I3) = ….43
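The prediction rule on this slide adds a similarity-weighted average of the neighbours' mean-centred ratings to the target user's own mean. The sketch below shows one way to code it. The function name `predict`, the toy neighbours, and the choice to skip neighbours with non-positive similarity are assumptions for illustration.

```python
def predict(target_mean, neighbors, item):
    """Predict a rating from (similarity, ratings_dict) pairs of neighbouring users."""
    num = 0.0
    den = 0.0
    for sim, user_ratings in neighbors:
        # Assumption: only positively correlated users who rated the item contribute.
        if sim <= 0 or item not in user_ratings:
            continue
        mean_v = sum(user_ratings.values()) / len(user_ratings)
        num += sim * (user_ratings[item] - mean_v)
        den += sim
    if den == 0:
        return target_mean  # no usable neighbour: fall back to the user's mean
    return target_mean + num / den


# Hypothetical neighbours of a target user whose mean rating is 4.0.
neighbors = [
    (0.8, {"a": 3, "b": 1, "c": 5}),   # rates "c" two points above own mean
    (0.4, {"a": 4, "b": 4, "c": 1}),   # rates "c" two points below own mean
    (-0.5, {"a": 1, "b": 1, "c": 4}),  # negatively correlated, ignored
]
prediction = predict(4.0, neighbors, "c")
```

Mean-centring matters here: a neighbour who rates everything low still pushes the prediction up when they rate the item above their own average.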
User-based Collaborative Filtering
– Sparsity
– Scalability (m = |users|, n = |items|)
– Solution
Model‐based Collaborative Filtering
– Lazy learning: User/Item-based collaborative filtering – Eager learning: Model-based collaborative filtering
– Build preference model from rating matrix – Use the models for predictions – Possibly computationally expensive
Model‐based Collaborative Filtering
– Dimension reduction (Matrix Factorization) – Clustering – Association rule mining – Restricted Boltzmann machine – Probabilistic models – Various machine learning approaches
Matrix Factorization
– Possibly 8,500M ratings (500,000 x 17,000) – But, there are only 100 M non-zero ratings
– Matrix Factorization – Clustering – Projection (PCA…)
– Worst case: O(mn) – In practice: O(m + n)
Matrix Factorization
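As a rough illustration of the idea, latent factors can be fitted to the observed ratings only, which is what makes factorization attractive for a sparse rating matrix. The sketch below uses stochastic gradient descent with L2 regularization. The toy ratings, the function names `factorize`, `predict_rating` and `rmse`, and all hyperparameters are assumptions for illustration, not taken from the slides.

```python
import random
from math import sqrt


def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit user factors P and item factors Q so that P[u] . Q[i] approximates r."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:  # only observed ratings are visited
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict_rating(P, Q, u, i):
    """Predicted rating is the dot product of the user and item factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error over the given (user, item, rating) triples."""
    total = sum((r - predict_rating(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return sqrt(total / len(ratings))


# Hypothetical observed ratings as (user index, item index, rating).
observed = [
    (0, 0, 5), (0, 1, 3), (0, 3, 1),
    (1, 0, 4), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 3, 5),
    (3, 0, 1), (3, 3, 4),
    (4, 1, 1), (4, 2, 5), (4, 3, 4),
]
P0, Q0 = factorize(observed, n_users=5, n_items=4, epochs=0)  # untrained factors
P, Q = factorize(observed, n_users=5, n_items=4)
error_before = rmse(observed, P0, Q0)
error_after = rmse(observed, P, Q)
```

After training, `predict_rating(P, Q, u, i)` can also be evaluated for user-item pairs that have no observed rating, which is how unseen items get scored.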
Matrix Factorization
– PLSA (Probabilistic Latent Semantic Analysis) – LDA (Latent Dirichlet Allocation)
[Figure: user purchase model; user rating model]
Matrix Factorization
– Interpret the matrix entries as user-item probabilities – Decompose the probability matrix P using an EM approach – Comparison to SVD
stochastic model
Collaborative Filtering
– Requires minimal knowledge engineering efforts – No need for knowledge of the items' internal structure or characteristics
– Requires a large number of reliable ratings – Assumes that prior behavior determines current behavior – Cold start problems: New user, new items – Sparsity problems
3. Content-based Recommendation
Overview
[Figure: content modeling, similar content, recommendation item list]
Overview
– Explicit attributes or characteristics (e.g. for a movie)
– Textual content (e.g. for a book)
– Any features or keywords which can describe items
Overview
– Customers will like content similar to what they liked in the past
– Items are “described” by their features (e.g. keywords) – Users are described by the keywords in the items they bought
– Easy to apply to text-based products or products with text description – Based on match between the content (item keywords) and user keywords – Many machine learning approaches are applicable
Content/User Modeling
– Usually, bag of words model is adopted – Some important words can be selected
– User Modeling
Text:       Aa cc dd aa bb ff dd dd hh …
Vocabulary: ( aa, bb, cc, dd, ee, ff, gg, hh, … )
Vector:     (  2,  1,  1,  3,  0,  1,  0,  1, … )
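The bag-of-words model above can be sketched in a few lines. The function name `bag_of_words` is an assumption, and the input is only the part of the slide's example text that is visible before the ellipsis; words are lower-cased so that "Aa" and "aa" count as the same term.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Return the term-count vector of `text` over a fixed, ordered vocabulary."""
    counts = Counter(text.lower().split())
    # Counter returns 0 for terms that never occur in the text.
    return [counts[term] for term in vocabulary]


vocabulary = ["aa", "bb", "cc", "dd", "ee", "ff", "gg", "hh"]
document = "Aa cc dd aa bb ff dd dd hh"
vector = bag_of_words(document, vocabulary)
```

Selecting "important words", as the slide mentions, would amount to shrinking `vocabulary` before building the vectors.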
Content-User Matching
– Cosine similarity
[Figure: term vector space showing the documents read by the user, the user model, and new documents]
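Matching a user model against new documents with cosine similarity could look like the sketch below. Building the user model as the average of the term vectors of the documents the user has read is an assumption made for illustration, as are the names `cosine` and `user_model` and the toy vectors.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length term vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def user_model(doc_vectors):
    """Assumed profile: the component-wise mean of the vectors of read documents."""
    n = len(doc_vectors)
    return [sum(column) / n for column in zip(*doc_vectors)]


# Hypothetical term vectors over a four-word vocabulary.
read_docs = [[2, 1, 0, 0], [1, 2, 0, 0]]
profile = user_model(read_docs)
new_docs = {"d1": [1, 1, 0, 0], "d2": [0, 0, 3, 1]}

# Rank the new documents by similarity to the user profile, best first.
ranked = sorted(new_docs, key=lambda d: cosine(profile, new_docs[d]), reverse=True)
```

Because cosine similarity ignores vector length, a long document is not favoured merely for containing more words.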
Advantages of CBR
– No first-rater problem or sparsity problems – Able to recommend new and unpopular items
– Recommendations can be explained by listing the content features that caused an item to be recommended
– News, email, events, etc.
Disadvantages of CBR
– Book, web pages, news articles, music, video
– Users are only recommended items similar to those they have already watched, so there is no serendipity
4. Context-aware Recommendation
Overview
– Are based on the ratings of user u for item i – Accumulate data of the form <user, item, rating> – Build a relation R: Users × Items → Rating, in order to estimate a user's ratings for unseen items
– Data: <user, item, rating, context> – Relation: Users × Items × Context → Rating
Overview
– Any information other than users and items – Can be used for better recommendations
book? – For what purpose is the book bought? (Work, leisure, …) – When will the book be read? (Weekday, weekend, …) – Where will the book be read? (At home, at school, on a plane, …)
Context is any information or conditions that can influence the perception of the usefulness of an item for a user
Architectural Models of Context Integration
[Figure: three architectures: Contextual Pre-Filtering, Contextual Post-Filtering, Contextual Modeling]
Contextual Pre-Filtering
– Select the relevant data using the given context – Generate recommendations from the selected data using a traditional recommendation approach
– How to efficiently extract relevant data – Exact filtering vs. Generalized filtering
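A minimal sketch of exact pre-filtering follows: the records are first reduced to those given in the target context, and a context-free recommender is then run on what remains. The per-item average stands in for the "traditional recommendation approach" and is an assumption, as are the names `prefilter` and `item_averages` and the toy records.

```python
def prefilter(records, context):
    """Exact filtering: keep only ratings given in exactly this context."""
    return [(user, item, rating) for user, item, rating, ctx in records if ctx == context]


def item_averages(triples):
    """Stand-in for a traditional 2-D recommender: mean rating per item."""
    sums, counts = {}, {}
    for _, item, rating in triples:
        sums[item] = sums.get(item, 0) + rating
        counts[item] = counts.get(item, 0) + 1
    return {item: sums[item] / counts[item] for item in sums}


# Hypothetical <user, item, rating, context> records.
records = [
    ("u1", "book_a", 5, "weekend"),
    ("u2", "book_a", 4, "weekend"),
    ("u3", "book_a", 1, "weekday"),
    ("u1", "book_b", 2, "weekend"),
    ("u2", "book_b", 5, "weekday"),
]
weekend_scores = item_averages(prefilter(records, "weekend"))
```

Generalized filtering would relax the `ctx == context` test, for example by also accepting a broader context when too few records match exactly.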
Contextual Post-Filtering
– Convert into two-dimensional data (drop the context information) – Build two models
– Generate recommendation by the traditional recommendation approach – Adjust the obtained recommendation using contextual information
– How to adjust the recommendation – How to apply generalized context
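Post-filtering can be sketched as two steps: score items while ignoring context, then adjust the scores with contextual information. How to adjust is an open design choice. The version below weights each context-free score by the share of that item's ratings that were given in the target context, which is only one assumed possibility. All names and records are hypothetical.

```python
def item_averages(records):
    """Context-free model: mean rating per item, ignoring the context field."""
    sums, counts = {}, {}
    for _, item, rating, _ in records:
        sums[item] = sums.get(item, 0) + rating
        counts[item] = counts.get(item, 0) + 1
    return {item: sums[item] / counts[item] for item in sums}


def context_weights(records, context):
    """Context model: fraction of each item's ratings that occurred in `context`."""
    total, matching = {}, {}
    for _, item, _, ctx in records:
        total[item] = total.get(item, 0) + 1
        if ctx == context:
            matching[item] = matching.get(item, 0) + 1
    return {item: matching.get(item, 0) / total[item] for item in total}


def postfilter(records, context):
    """Adjust the context-free scores by the contextual weights."""
    base = item_averages(records)
    weights = context_weights(records, context)
    return {item: base[item] * weights[item] for item in base}


# Hypothetical <user, item, rating, context> records.
records = [
    ("u1", "book_a", 5, "weekend"),
    ("u2", "book_a", 4, "weekend"),
    ("u3", "book_a", 1, "weekday"),
    ("u1", "book_b", 2, "weekend"),
    ("u2", "book_b", 5, "weekday"),
]
base_scores = item_averages(records)
weekend_scores = postfilter(records, "weekend")
```

In this toy data the context-free ranking prefers `book_b`, while the adjusted weekend ranking prefers `book_a`.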
Contextual Modeling
information into the recommendation model – Three-dimensional model – Rating = f (User, Item, Context)
– How to efficiently build a model – How to apply generalized context
Contextual Modeling
– Extension of two-dimensional models – Tensor factorization (like SVD)
Users × Items × Context → Rating
Extension of two-dimensional models
– Traditional user-based collaborative filtering:
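\[
pred(u, i) = \bar{r}_u + \frac{\sum_{v \in U} sim(u, v)\,(r_{v,i} - \bar{r}_v)}{\sum_{v \in U} sim(u, v)}
\]
– Extension with context (user u in context c, neighbour v in context k):
\[
pred(u, i, c) = \bar{r}_{u,c} + \frac{\sum_{v \in U} \sum_{k \in C} sim((u, c), (v, k))\,(r_{v,i,k} - \bar{r}_{v,k})}{\sum_{v \in U} \sum_{k \in C} sim((u, c), (v, k))}
\]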
Tensor Factorization
– Loss function – Regularization – Objective function
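One common way to write these three pieces is given below, purely as an illustrative assumption: it uses a CP-style model, and the factor matrices $A$ (users), $B$ (items), $D$ (contexts), the rank $d$, the weight $\lambda$ and the set $K$ of observed ratings are hypothetical symbols that do not come from the slides.

```latex
% Assumed CP-style prediction for user u, item i, context c
\hat{r}_{u,i,c} = \sum_{f=1}^{d} A_{u,f}\, B_{i,f}\, D_{c,f}

% Loss function: squared error over the observed ratings K
L(A, B, D) = \sum_{(u,i,c) \in K} \left( r_{u,i,c} - \hat{r}_{u,i,c} \right)^2

% Regularization: Frobenius-norm penalty on the factors
\Omega(A, B, D) = \lambda \left( \lVert A \rVert_F^2 + \lVert B \rVert_F^2 + \lVert D \rVert_F^2 \right)

% Objective function: minimize loss plus regularization
\min_{A, B, D} \; L(A, B, D) + \Omega(A, B, D)
```

This mirrors regularized matrix factorization with one extra factor matrix for the context dimension; other tensor decompositions would change the prediction term but keep the same loss-plus-regularization shape.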
Tensor Factorization
Context-aware Recommendation
– Simple: using only the ratings in the same context – Works with large amounts of data
– Simple: Averaging ratings under different context – Takes into account context interactions
– Extension of 2-D model
– Tensor Factorization
5. Other Approaches
Overview
– Hybrid Information Network based CF – Collective matrix factorization
– Group profile based – Consensus function based
Combining Multiple Information
– User-user relation – User-program relation – Program-genre/channel/time relations
Combining Multiple Information
– Evaluate user-user similarity through multiple path – Recommend based on user-based CF
Combining Multiple Information
– Predicted rating
Combining Multiple Information
Group Recommendation
– If group profile is available – Treats a group as a single user – Most existing recommender systems can be adopted easily, but it is difficult to obtain group profiles
– If single user profile is available but group profile is not – Imitates decision-making process – It is easy to apply, but it needs domain knowledge to select consensus function
– Regular recommender systems are applicable to group profiles
– A virtual group profile is generated through a consensus function, then regular recommender systems are applied
[Figure: a Consensus Function applied to member rating vectors]
Group Recommendation
[Figure: rating vectors fed to the RS, which outputs a Recommendation List (two variants shown)]
Group Recommendation
– Least Misery Strategy – Most Pleasure Strategy – Average Strategy
Min
Member 1:  2  2  4  5  1  4
Member 2:  4  2  2  3  3  2
Group:     2  2  2  3  1  2
Max
Member 1:  2  2  4  5  1  4
Member 2:  4  2  2  3  3  2
Group:     4  2  4  5  3  4
Avg
Member 1:  2  2  4  5  1  4
Member 2:  4  2  2  3  3  2
Group:     3  2  3  4  2  3
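The three strategies reduce to taking the minimum, maximum, or mean of each item's ratings across the group members. The sketch below applies them to the two rating vectors from the slide; the function names are illustrative assumptions.

```python
# The two members' ratings for six items, as shown on the slide.
member_ratings = [
    [2, 2, 4, 5, 1, 4],
    [4, 2, 2, 3, 3, 2],
]


def least_misery(rows):
    """Group rating per item is the lowest rating among the members."""
    return [min(column) for column in zip(*rows)]


def most_pleasure(rows):
    """Group rating per item is the highest rating among the members."""
    return [max(column) for column in zip(*rows)]


def average(rows):
    """Group rating per item is the mean rating of the members."""
    return [sum(column) / len(column) for column in zip(*rows)]


group_min = least_misery(member_ratings)
group_max = most_pleasure(member_ratings)
group_avg = average(member_ratings)
```

The resulting vector acts as the profile of a virtual group user, which a regular recommender can then consume.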
Group Recommendation
– Consensus-Recommendation
between group members
– Recommendation-Consensus
group member’s preference
[Figure: two pipelines that combine a Consensus Function and an RS on the members' rating vectors]
6. Concluding Remarks
Summary
– Collaborative Filtering – Content-based Recommendation – Context-aware Recommendation – Others…
technology
Thank you for your attention