Unbiased Offline Recommender Evaluation for Missing-Not-At-Random Implicit Feedback

  1. Unbiased Offline Recommender Evaluation for Missing-Not-At-Random Implicit Feedback. Serge Belongie, Deborah Estrin, Longqi Yang, Yuan Xuan, Chenyang Wang, Yin Cui. Funders: [logos not recovered]

  2. Offline Evaluation of Recommendation Algorithm. [Diagram: user-item interactions, recommendation algorithms, rewards R.]

  3. Offline Evaluation of Recommendation Algorithm. [Diagram: user interaction history, recommendation algorithms, rewards R.] Pros: • Cost effective. • Efficient. • Iterate faster. • Experiment before deployment.

  4. Offline Evaluation of Recommendation Algorithm. [Diagram: user interaction history, recommendation algorithms, rewards R.] Pros: • Cost effective. • Efficient. • Iterate faster. • Experiment before deployment. Cons: • The data is Missing-Not-At-Random (MNAR).

  5. Offline Evaluation procedure. [Diagram: a user, an item, and a mark showing that the user interacted with the item.]

  6. Offline Evaluation procedure. [Diagram: train/test split.]

  7. Offline Evaluation procedure. 1. Train and validate a recommendation model. 2. Average performance over held-out (user, item) interaction pairs (Average-Over-All).

  8. Offline Evaluation procedure. Rating-based recommendation systems; implicit feedback-based recommendation systems. 1. Train and validate a recommendation model (policy). 2. Average performance over held-out (user, item) interaction pairs (Average-Over-All).
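The Average-Over-All step can be sketched as a plain mean over held-out pairs. This is a minimal illustration, not the authors' code: the function names, the `rank_of` lookup, and the hit-at-2 scoring rule are all hypothetical, and the slides do not say whether scores are first normalized per user.

```python
from typing import Callable, Dict, Iterable, Tuple

Pair = Tuple[str, str]  # (user, item)


def average_over_all(
    held_out: Iterable[Pair],
    rank_of: Dict[Pair, int],
    score: Callable[[int], float],
) -> float:
    """Mean of score(rank) over every held-out (user, item) pair."""
    pairs = list(held_out)
    if not pairs:
        raise ValueError("no held-out pairs to evaluate")
    return sum(score(rank_of[p]) for p in pairs) / len(pairs)


def hit_at_2(rank: int) -> float:
    """Hypothetical scoring rule: 1 if the item is ranked in the top 2."""
    return 1.0 if rank <= 2 else 0.0


# Toy held-out interactions and the rank the model gave each item
# (assumed data, for illustration only).
held_out = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "c")]
rank_of = {("u1", "a"): 1, ("u1", "b"): 5, ("u2", "a"): 2, ("u3", "c"): 3}

aoa = average_over_all(held_out, rank_of, hit_at_2)
print(aoa)
```

Every held-out pair carries equal weight here, so items that appear often in the logged interactions dominate the average.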

  9. Previous work: Average-Over-All is biased for rating-based recommendation systems, because ratings are MNAR [Marlin et al. 09], [Schnabel et al. 16], [Steck 10], [Steck 11], and [Steck 13].

  10. Previous work: Average-Over-All is biased for rating-based recommendation systems, because ratings are MNAR [Marlin et al. 09], [Schnabel et al. 16], [Steck 10], [Steck 11], and [Steck 13]. Previous work: Average-Over-All is unbiased for implicit feedback-based recommendation systems, because implicit feedback is missing uniformly at random [Lim 15].

  11. This work: Average-Over-All is biased for implicit feedback-based recommendation systems, because implicit feedback is NOT missing uniformly at random.

  12. This work: Average-Over-All is biased for implicit feedback-based recommendation systems, because implicit feedback is NOT missing uniformly at random. [Images: trending, recommendation.] Popularity bias (users are more likely to be exposed to popular items).

  13. A Hypothetical Example

                                               Popular Items   Long-tail Items
      # of liked items (over all items)              1       :       10
      # of liked items (over observations)          10       :        1
      Algorithm 1 Performance                       0.8              0
      Algorithm 2 Performance                       0.75             0.75

  18. A Hypothetical Example (same table as slide 13), with the annotation "Any sensible evaluation".

  19. A Hypothetical Example (same table as slide 13), with the annotation "Average-Over-All".
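The arithmetic behind the hypothetical example can be worked through as follows. The slides give only the per-group performance and the liked-item ratios; combining them as a weighted average, with the number of liked items as the weight, is an assumption made here for illustration, and all names in the sketch are hypothetical.

```python
from typing import Dict


def weighted_performance(perf: Dict[str, float], weights: Dict[str, float]) -> float:
    """Average per-group performance, weighting each group by its liked-item count."""
    total = sum(weights.values())
    return sum(perf[group] * weights[group] for group in perf) / total


# Ratios of liked items from the slide (popular : long-tail).
liked_over_all_items = {"popular": 1, "long_tail": 10}
liked_over_observations = {"popular": 10, "long_tail": 1}

# Per-group performance from the slide.
algorithms = {
    "Algorithm 1": {"popular": 0.8, "long_tail": 0.0},
    "Algorithm 2": {"popular": 0.75, "long_tail": 0.75},
}

results = {}
for name, perf in algorithms.items():
    results[name] = {
        "over_all_items": weighted_performance(perf, liked_over_all_items),
        "over_observations": weighted_performance(perf, liked_over_observations),
    }
    print(name, results[name])
```

Under this assumption, weighting by observations puts Algorithm 1 at about 0.73 against 0.75 for Algorithm 2, while weighting by liked items over all items puts Algorithm 1 at about 0.07 against 0.75. The observed data hides most of Algorithm 1's failure on long-tail items.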

  20. Formalize Reward. \hat{Z}: item rankings predicted by an algorithm. Ideal evaluation:

      $$R(\hat{Z}) = \frac{1}{|U|} \sum_{u \in U} \frac{1}{|S_u|} \sum_{i \in S_u} c(\hat{Z}_{u,i})$$
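The ideal evaluation on slide 20 is a per-user mean of a score c over the items in S_u, averaged again over all users in U. The sketch below is one way to compute it, under stated assumptions: S_u is taken to be the set of items user u likes, c is taken to be reciprocal rank, and the function and variable names are hypothetical. The slide fixes none of these.

```python
from typing import Callable, Dict, Set


def ideal_reward(
    ranks: Dict[str, Dict[str, int]],
    liked: Dict[str, Set[str]],
    c: Callable[[int], float],
) -> float:
    """R = (1/|U|) * sum over u of (1/|S_u|) * sum over i in S_u of c(rank of i for u).

    Users with an empty S_u are skipped, since 1/|S_u| is undefined for them.
    """
    per_user = []
    for user, s_u in liked.items():
        if not s_u:
            continue
        per_user.append(sum(c(ranks[user][item]) for item in s_u) / len(s_u))
    if not per_user:
        raise ValueError("no user has any liked item")
    return sum(per_user) / len(per_user)


def reciprocal_rank(rank: int) -> float:
    """Assumed choice of c: 1 / rank, with ranks starting at 1."""
    return 1.0 / rank


# Toy data (assumed): predicted rank of each liked item, per user.
ranks = {"u1": {"a": 1, "b": 4}, "u2": {"c": 2}}
liked = {"u1": {"a", "b"}, "u2": {"c"}}

reward = ideal_reward(ranks, liked, reciprocal_rank)
print(reward)
```

The catch is that `liked` must cover every item a user would like, including items the user was never exposed to, which logged implicit feedback does not provide.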
