SLIDE 1 EBCR: Empirical Bayes Concordance Rate to weight similarity measurement in collaborative filtering recommendations
- Y. Du, LGI2P, IMT Mines Alès
- S. Ranwez, LGI2P, IMT Mines Alès
- V. Ranwez, AGAP, Montpellier SupAgro
- N. Sutton-Charani, LGI2P, IMT Mines Alès
SLIDE 2
Collaborative Filtering recommender systems
[Figure: user Rose; legend: icon = like movie]
SLIDE 3
Collaborative Filtering recommender systems
[Figure: users Rose, Alice and Bob; legend: icon = like movie]
SLIDE 4
Collaborative Filtering recommender systems
[Figure: users Rose, Alice and Bob; legend: icon = like movie]
SLIDE 5
Memory-based collaborative filtering algorithm
Input: a User-Item-Rating matrix $R_{m \times n}$ (users U as rows, items I as columns)
Output: $\{\hat{r}_{u,i} \mid u \in U,\ i \in I \text{ and } r_{u,i} = \text{unknown}\}$
Algorithm: weighted average of the ratings of u's neighbors
$\hat{r}_{u,i} = \frac{\sum_{v=1}^{k} r_{v,i} \cdot \mathrm{sim}(u,v)}{\sum_{v=1}^{k} \mathrm{sim}(u,v)}$
Rating matrix (columns i1, i2, …, i, …, in):
u1:   (5, ?, …, 1, …, 2)
u2:   (?, 1, …, ?, …, ?)
…
u:    (5, 2, …, ?, …, ?)
…
um-1: (?, ?, …, 1, …, 2)
um:   (5, 2, …, 2, …, 4)
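The weighted-average prediction on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `predict_rating`, the dictionary layout, the toy data and the choice to skip non-positive similarities are all assumptions made for the example.

```python
from typing import Dict, Optional


def predict_rating(
    ratings: Dict[str, Dict[str, float]],
    sims: Dict[str, float],
    item: str,
    k: int = 2,
) -> Optional[float]:
    """Predict a rating as the similarity-weighted average over k neighbors."""
    # Neighbors must have rated the item; non-positive similarities are
    # skipped (an assumption of this sketch, not stated on the slide).
    candidates = [
        (sim, ratings[v][item])
        for v, sim in sims.items()
        if sim > 0 and item in ratings.get(v, {})
    ]
    # Keep the k most similar neighbors.
    neighbors = sorted(candidates, reverse=True)[:k]
    total_sim = sum(sim for sim, _ in neighbors)
    if total_sim == 0:
        return None  # no usable neighbor: the rating stays unknown
    return sum(sim * r for sim, r in neighbors) / total_sim


# Hypothetical toy data: ratings of three neighbors and their similarity to u.
ratings = {"v1": {"i1": 5.0}, "v2": {"i1": 2.0}, "v3": {"i2": 4.0}}
sims = {"v1": 0.9, "v2": 0.3, "v3": 0.8}
prediction = predict_rating(ratings, sims, "i1")  # (0.9*5 + 0.3*2) / (0.9 + 0.3)
```

Because the prediction is only as good as the similarity weights, any unreliable similarity value propagates directly into the predicted rating.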
SLIDE 6
Most commonly used similarity measures in CF approaches
- PCC (Pearson Correlation Coefficient): linear correlation between two rating vectors
- COS (Cosine similarity): cosine of the angle between the two rating vectors u and v
- MSD (Mean Square Distance): normalized distance between two vectors in a Euclidean space
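The three measures can be sketched as follows, each computed on the items rated by both users. This is an illustrative sketch under stated assumptions: the function names are hypothetical, user means in PCC are taken over the co-rated items, undefined cases return 0.0, and the MSD similarity is written as 1 / (1 + mean squared difference), which is one common convention and not necessarily the one used by the authors.

```python
import math
from typing import Dict, List, Tuple

Ratings = Dict[str, float]


def co_rated(u: Ratings, v: Ratings) -> Tuple[List[float], List[float]]:
    """Return the two rating vectors restricted to the items rated by both users."""
    items = sorted(set(u) & set(v))
    return [u[i] for i in items], [v[i] for i in items]


def pcc(u: Ratings, v: Ratings) -> float:
    x, y = co_rated(u, v)
    if not x:
        return 0.0
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)) * math.sqrt(sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0  # 0.0 when undefined (assumption)


def cos_sim(u: Ratings, v: Ratings) -> float:
    x, y = co_rated(u, v)
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return num / den if den else 0.0


def msd_sim(u: Ratings, v: Ratings) -> float:
    x, y = co_rated(u, v)
    if not x:
        return 0.0
    msd = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    return 1.0 / (1.0 + msd)  # assumed normalization


# Two hypothetical users with four co-rated items.
a = {"i1": 1, "i2": 3, "i3": 2, "i5": 1}
b = {"i1": 1, "i2": 2, "i3": 2, "i5": 1}
pcc_ab, cos_ab, msd_ab = pcc(a, b), cos_sim(a, b), msd_sim(a, b)
```

Note that none of the three functions uses the number of co-rated items beyond computing averages, so a pair with a single co-rated item is treated exactly like a pair with hundreds.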
SLIDE 7
What is the problem here?
PCC, COS, MSD: consider only the rating distributions of u and v restricted to their co-rated items, i.e. $I_{u,v}$.
         i1  i2  i3  i4  i5
Rose      1   3   2   ∅   1
Alice     1   ∅   ∅   5   ∅
Bob       1   2   2   ∅   1
$I_{Rose,Alice} = \{i1\}$
$I_{Rose,Bob} = \{i1, i2, i3, i5\}$
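The co-rated item sets of this example can be checked with a short sketch. The dictionary layout and the helper name `co_rated_items` are assumptions for illustration; the assignment of the two four-rating rows to Rose and Bob is also assumed, which does not affect the resulting sets.

```python
# Ratings from the slide's example; unrated items (∅) are simply absent.
ratings = {
    "Rose": {"i1": 1, "i2": 3, "i3": 2, "i5": 1},
    "Alice": {"i1": 1, "i4": 5},
    "Bob": {"i1": 1, "i2": 2, "i3": 2, "i5": 1},
}


def co_rated_items(u: str, v: str) -> list:
    """Items rated by both users, in sorted order."""
    return sorted(set(ratings[u]) & set(ratings[v]))


rose_alice = co_rated_items("Rose", "Alice")
rose_bob = co_rated_items("Rose", "Bob")
```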
SLIDE 8
Here is the problem!
PCC, COS, MSD: consider only the rating distributions of u and v restricted to their co-rated items, which ignores the number of co-rated items.
Why do we have to consider the number of co-rated items, i.e. $|I_{u,v}|$?
PCC(R, A) = 1 > PCC(R, B) = 0.905
COS(R, A) = 1 > COS(R, B) = 0.9798
MSD(R, A) = 1 > MSD(R, B) = 0.8
These values are NOT reliable when $|I_{u,v}|$ is small, so they need to be adjusted.
         i1  i2  i3  i4  i5
Rose      1   3   2   ∅   1
Alice     1   ∅   ∅   5   ∅
Bob       1   2   2   ∅   1
SLIDE 9
Proposed method: EBCR (Empirical Bayes Concordance Rate)
Discretize user ratings into three categories of taste, written T(u,i) for the taste of user u on item i
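A discretization of this kind could look like the sketch below. The thresholds and the name of the middle category are assumptions: the deck only shows ratings 1 and 2 mapped to "dislike" and 5 mapped to "like" in its example, so the cut-offs at 2 and 4 and the label "neutral" are hypothetical.

```python
def taste(rating: float) -> str:
    """Map a 1-5 rating to one of three taste categories (assumed thresholds)."""
    if rating <= 2:
        return "dislike"
    if rating >= 4:
        return "like"
    return "neutral"  # hypothetical name for the third category


tastes = [taste(r) for r in (1, 5, 2)]
```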
SLIDE 10
Proposed method : EBCR (Empirical Bayes Concordance Rate)
$C_{u,v}$: the set of u and v's concordantly co-rated items: $i \in C_{u,v}$ if $T(u,i) = T(v,i)$
Concordance rate of u and v: $\mathrm{CR}_{u,v} = |C_{u,v}| / |I_{u,v}|$
- Interpretation of CR : Probability of two users having the same taste on an item
BUT what if $|I_{u,v}|$ is small? A rate of 1/1 = 1 does not carry the same evidence as a rate of 2000/2000 = 1.
CR: concordance rate of a given user pair
u: (1, 1, ?, 5, 4, 5, ?, ?, ..., 2)
v: (?, 2, 5, ?, ?, 1, 1, ?, ..., 5)
Restricted to co-rated items: u: (1, 5, 2), v: (2, 1, 5)
As tastes: u: (dislike, like, dislike), v: (dislike, dislike, like)
$|C_{u,v}| = 1$, $|I_{u,v}| = 3$, $\mathrm{CR}_{u,v} = 1/3$
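The worked example on this slide can be reproduced with a small sketch. The item identifiers (`p1`, `p2`, ..., `last`) are placeholders for the positions in the two vectors, and the `taste` thresholds are assumptions chosen to agree with the like/dislike labels shown here.

```python
from fractions import Fraction
from typing import Dict, Tuple


def taste(rating: float) -> str:
    """Assumed thresholds: <= 2 dislike, >= 4 like, otherwise neutral."""
    if rating <= 2:
        return "dislike"
    if rating >= 4:
        return "like"
    return "neutral"


def concordance(u: Dict[str, float], v: Dict[str, float]) -> Tuple[int, int]:
    """Return (number of concordant co-rated items, number of co-rated items)."""
    common = set(u) & set(v)
    concordant = {i for i in common if taste(u[i]) == taste(v[i])}
    return len(concordant), len(common)


# The two rating vectors of the slide; unknown ratings (?) are left out.
u = {"p1": 1, "p2": 1, "p4": 5, "p5": 4, "p6": 5, "last": 2}
v = {"p2": 2, "p3": 5, "p6": 1, "p7": 1, "last": 5}

n_concordant, n_co_rated = concordance(u, v)
cr = Fraction(n_concordant, n_co_rated)
```

Keeping the two counts separate, rather than only the ratio, matters for the next step: the adjustment needs to know how many co-rated items stand behind each rate.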
SLIDE 11
Here comes Empirical Bayes
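- 1. Model the set of all CR values with a Beta prior distribution
- 2. Find $\alpha_0$ and $\beta_0$ that best fit the data, i.e. the set of CR values
- 3. Use the prior to adjust each CR value:
$\mathrm{CR}_{u,v} = \frac{|C_{u,v}|}{|I_{u,v}|} \;\rightarrow\; \mathrm{EBCR}_{u,v} = \frac{|C_{u,v}| + \alpha_0}{|I_{u,v}| + \alpha_0 + \beta_0}$
- 4. Use EBCR to weight the similarity measure:
$\mathrm{sim}'(u,v) = \mathrm{sim}(u,v) \cdot \mathrm{EBCR}_{u,v}$
Figure taken from Google Images
Expectation of the Beta($\alpha_0$, $\beta_0$) distribution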
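The four steps can be sketched end to end as below. This is a sketch under stated assumptions: the slide only says the prior parameters should "best fit" the CR values, so the method-of-moments fit used here is a stand-in (a maximum-likelihood fit would be an equally plausible reading), and the function names and the toy list of (concordant, co-rated) counts are hypothetical.

```python
from statistics import mean, pvariance
from typing import List, Tuple


def fit_beta_moments(rates: List[float]) -> Tuple[float, float]:
    """Fit Beta(alpha0, beta0) to observed rates by matching mean and variance."""
    m = mean(rates)
    var = pvariance(rates, m)
    if not 0 < var < m * (1 - m):
        raise ValueError("method of moments requires 0 < variance < m * (1 - m)")
    common = m * (1 - m) / var - 1
    return m * common, (1 - m) * common


def ebcr(n_concordant: int, n_co_rated: int, alpha0: float, beta0: float) -> float:
    """Shrink the raw concordance rate toward the prior mean."""
    return (n_concordant + alpha0) / (n_co_rated + alpha0 + beta0)


# Hypothetical (|C_uv|, |I_uv|) counts for several user pairs.
pairs = [(1, 1), (1, 3), (2, 4), (30, 40), (1500, 2000), (3, 10), (2000, 2000)]
rates = [c / n for c, n in pairs]

alpha0, beta0 = fit_beta_moments(rates)
small = ebcr(1, 1, alpha0, beta0)        # raw CR = 1, backed by a single item
large = ebcr(2000, 2000, alpha0, beta0)  # raw CR = 1, backed by 2000 items

# Step 4: weight any base similarity by the adjusted rate.
weighted_sim = 1.0 * small
```

With these toy counts the pair with one co-rated item is pulled well below 1, while the pair with 2000 co-rated items stays close to its raw rate, which is the behavior the adjustment is meant to provide.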
SLIDE 12
Evaluation and results
- Dataset: MovieLens-1M → 1 million movie ratings by 6,040 users on 3,900 items
- Evaluation metric: MAE (Mean Absolute Error)
- Evaluation protocol: 10-fold cross-validation
[Figure: results plot, with the "better" direction indicated]
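The evaluation protocol can be sketched with the standard library alone. The helper names `mae` and `k_fold_splits`, the fixed seed and the toy numbers are assumptions for illustration; the deck does not describe how the folds were drawn.

```python
import random
from typing import List, Sequence, Tuple


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Error between known and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_splits(n_ratings: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle rating indices and return k (train, test) index splits."""
    indices = list(range(n_ratings))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::k] for f in range(k)]
    splits = []
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        splits.append((train, test))
    return splits


error = mae([5, 3, 1], [4, 3, 3])  # (1 + 0 + 2) / 3
splits = k_fold_splits(25)
```

In a full run, each of the 10 splits would train the recommender on the `train` indices, predict the ratings at the `test` indices, and the 10 MAE values would then be averaged; a lower MAE is better.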
SLIDE 13
Progress and perspectives
Timeline: 1st Year, 2nd Year, 3rd Year
- Oct. 2018
- Apr. 2019
- Jun. 2019
- Nov. 2019
- 1. Literature on RS
- 2. Literature on knowledge-based RS
- 1st paper submitted to the IC 2019 conference, accepted in May
- 2nd submission, to the LFA 2019 conference
- Plan to submit an English version of EBCR to ECAI2019
- Proposition: RS + semantics
- Bring together ontology, knowledge graph, knowledge base and model-based RS for recommendation diversity and explanation
SLIDE 14
Thank you for your attention
SLIDE 15
Formulas