Unbiased Offline Recommender Evaluation for Missing-Not-At-Random Implicit Feedback



SLIDE 1

Unbiased Offline Recommender Evaluation for Missing-Not-At-Random Implicit Feedback

Longqi Yang, Yin Cui, Yuan Xuan, Chenyang Wang, Serge Belongie, Deborah Estrin

SLIDE 4

Offline Evaluation of Recommendation Algorithm

Diagram labels: user interaction history as (user, item) pairs; evaluator R; recommendation algorithms; rewards.

Pros:

  • Cost effective.
  • Efficient.
  • Iterate faster.
  • Experiment before deployment.

Cons:

  • The data is Missing-Not-At-Random (MNAR).
SLIDE 5

Offline Evaluation Procedure

Figure: user u, item i; a mark denotes that user u interacted with item i.

SLIDE 6

Offline Evaluation Procedure

Figure: train/test split of the interactions.


SLIDE 8

Offline Evaluation Procedure

  • 1. Train and validate a recommendation model (policy).
  • 2. Average the performance over held-out (user, item) interaction pairs (Average-Over-All).

Used for: rating-based recommendation systems; implicit feedback-based recommendation systems.

SLIDE 9

Previous work: Average-Over-All is biased for rating-based recommendation systems, because ratings are MNAR.

[Marlin et al. 09], [Schnabel et al. 16], [Steck 10], [Steck 11], and [Steck 13]

SLIDE 10

Previous work: Average-Over-All is unbiased for implicit feedback-based recommendation systems, because implicit feedback is missing uniformly at random.

[Lim 15]

SLIDE 11

This work: Average-Over-All is biased for implicit feedback-based recommendation systems, because implicit feedback is NOT missing uniformly at random.

SLIDE 12

Popularity bias (users are more likely to be exposed to popular items), e.g., trending recommendation.

SLIDE 13

A Hypothetical Example

                                        Popular Items : Long-tail Items
# of liked items (over all items)                   1 : 10
# of liked items (over observations)               10 : 1

Algorithm 1 Performance, Algorithm 2 Performance: 0.8, 0.75, 0.75


SLIDE 18

A Hypothetical Example

Annotation on the Algorithm 1 and Algorithm 2 performance values: Any sensible evaluation

SLIDE 19

A Hypothetical Example

Annotation on the Algorithm 1 and Algorithm 2 performance values: Average-Over-All

SLIDE 20

Formalize Reward

Ideal evaluation:

$$R(\hat{Z}) = \frac{1}{|U|} \sum_{u \in U} \frac{1}{|S_u|} \sum_{i \in S_u} c(\hat{Z}_{u,i})$$

where $\hat{Z}$ denotes the item rankings predicted by an algorithm.

SLIDE 21

Formalize Reward

  • $\hat{Z}_{u,i}$: predicted ranking of item i for user u.
  • $S_u$: items liked by user u among the entire item set.
  • $c(\cdot)$: scoring metric.
  • $c(\hat{Z}_{u,i})$: reward for the (u, i) pair.

SLIDE 22

Formalize Reward

  • $\frac{1}{|S_u|} \sum_{i \in S_u} c(\hat{Z}_{u,i})$: reward for user u.

SLIDE 23

Formalize Reward

  • $R(\hat{Z})$: reward for the algorithm.
SLIDE 24

Formalize Reward

Average-Over-All:

$$\hat{R}_{AOA}(\hat{Z}) = \frac{1}{|U|} \sum_{u \in U} \frac{1}{|S_u^*|} \sum_{i \in S_u^*} c(\hat{Z}_{u,i})$$

where $S_u^*$ denotes the items liked by user u (observed).
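
A minimal sketch of the Average-Over-All estimator, for illustration only. It assumes the scoring metric c is a Recall@k-style indicator (1 if the predicted rank is within the top k, 0 otherwise) and that predicted ranks live in a plain dictionary. The names `aoa_reward`, `recall_at_k`, `ranks`, and `observed_liked` are hypothetical and do not come from the slides.

```python
from typing import Callable, Dict, Set, Tuple

Rank = Dict[Tuple[str, str], int]  # (user, item) -> predicted rank, 1 = top


def recall_at_k(k: int) -> Callable[[int], float]:
    """Assumed scoring metric c: 1.0 if the rank is within the top k, else 0.0."""
    return lambda rank: 1.0 if rank <= k else 0.0


def aoa_reward(ranks: Rank,
               observed_liked: Dict[str, Set[str]],
               c: Callable[[int], float]) -> float:
    """Average c over each user's observed liked items S*_u, then over users."""
    per_user = []
    for user, items in observed_liked.items():
        if not items:  # skipping users without observations is an assumption
            continue
        per_user.append(sum(c(ranks[(user, i)]) for i in items) / len(items))
    return sum(per_user) / len(per_user)


# Made-up toy data: two users, each with two observed liked items.
ranks = {("u1", "a"): 1, ("u1", "b"): 7, ("u2", "a"): 2, ("u2", "c"): 3}
observed_liked = {"u1": {"a", "b"}, "u2": {"a", "c"}}
score = aoa_reward(ranks, observed_liked, recall_at_k(5))
print(score)
```

Every observed pair carries the same weight here, which is exactly the property the following slides question.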

SLIDE 25

Formalize Bias

$O_{u,i} = 1$ if $(u, i)$ is observed, and $O_{u,i} = 0$ otherwise, with

$$O_{u,i} \sim \mathcal{B}(1, P_{u,i})$$

$$\mathbb{E}_O\left[\hat{R}_{AOA}(\hat{Z})\right] \neq R(\hat{Z})$$
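
The inequality can be checked exactly on a toy case. The sketch below assumes a single user with two liked items, made-up scores and propensities, and the convention that Average-Over-All equals 0 when nothing is observed. It enumerates every observation pattern O and weights it by its Bernoulli probability. All names and numbers are illustrative.

```python
from itertools import product

# Hypothetical toy data for one user: item -> (score c, propensity P).
items = {"popular": (1.0, 0.9), "long_tail": (0.0, 0.1)}

# Ideal reward R: average score over all liked items.
true_reward = sum(c for c, _ in items.values()) / len(items)


def expected_aoa(items):
    """Exact E_O[AOA], computed by enumerating all observation patterns."""
    names = list(items)
    total = 0.0
    for pattern in product([0, 1], repeat=len(names)):
        prob = 1.0
        for name, o in zip(names, pattern):
            p = items[name][1]
            prob *= p if o else 1.0 - p
        seen = [items[n][0] for n, o in zip(names, pattern) if o]
        aoa = sum(seen) / len(seen) if seen else 0.0  # assumed convention
        total += prob * aoa
    return total


print(true_reward, expected_aoa(items))
```

With these numbers the expected Average-Over-All value is about 0.855 while the ideal reward is 0.5, because the well-ranked popular item is observed far more often than the poorly ranked long-tail item.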

SLIDE 26

Inverse-Propensity-Scoring (IPS)

$$\hat{R}_{AOA}(\hat{Z}) = \frac{1}{|U|} \sum_{u \in U} \frac{1}{|S_u^*|} \sum_{i \in S_u^*} c(\hat{Z}_{u,i})$$

$$\hat{R}_{IPS}(\hat{Z} \mid P) = \frac{1}{|U|} \sum_{u \in U} \frac{1}{|S_u|} \sum_{i \in S_u^*} \frac{c(\hat{Z}_{u,i})}{P_{u,i}}$$

SLIDE 27

Inverse-Propensity-Scoring (IPS)

$$\mathbb{E}_O\left[\hat{R}_{IPS}(\hat{Z} \mid P)\right] = R(\hat{Z})$$
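
The unbiasedness of IPS follows from linearity of expectation. The derivation sketch below assumes that $P_{u,i} > 0$ for every liked pair and that $S_u^* = \{i \in S_u : O_{u,i} = 1\}$, so the sum over $S_u^*$ can be rewritten as a sum over $S_u$ weighted by $O_{u,i}$.

```latex
\begin{aligned}
\mathbb{E}_O\left[\hat{R}_{IPS}(\hat{Z} \mid P)\right]
&= \frac{1}{|U|} \sum_{u \in U} \frac{1}{|S_u|} \sum_{i \in S_u}
   \mathbb{E}_O\left[O_{u,i}\right] \frac{c(\hat{Z}_{u,i})}{P_{u,i}} \\
&= \frac{1}{|U|} \sum_{u \in U} \frac{1}{|S_u|} \sum_{i \in S_u}
   P_{u,i} \, \frac{c(\hat{Z}_{u,i})}{P_{u,i}} \\
&= \frac{1}{|U|} \sum_{u \in U} \frac{1}{|S_u|} \sum_{i \in S_u} c(\hat{Z}_{u,i})
 = R(\hat{Z})
\end{aligned}
```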

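SLIDE 28

Self-Normalized Inverse-Propensity-Scoring (SNIPS)

[Swaminathan et al. 15]

$$\hat{R}_{IPS}(\hat{Z} \mid P) = \frac{1}{|U|} \sum_{u \in U} \frac{1}{|S_u|} \sum_{i \in S_u^*} \frac{c(\hat{Z}_{u,i})}{P_{u,i}}$$

$$\hat{R}_{SNIPS}(\hat{Z} \mid P) = \frac{1}{|U|} \sum_{u \in U} \frac{1}{\sum_{i \in S_u^*} \frac{1}{P_{u,i}}} \sum_{i \in S_u^*} \frac{c(\hat{Z}_{u,i})}{P_{u,i}}$$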
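
A minimal sketch contrasting the two estimators, under the assumption that each user's observed liked items are given as (score, propensity) pairs. The function names, the data layout, and the numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# user -> list of (score c, propensity P) for the observed liked items S*_u
Observed = Dict[str, List[Tuple[float, float]]]


def ips_reward(observed: Observed, n_liked: Dict[str, int]) -> float:
    """IPS: weight each observed score by 1/P and normalize by |S_u|."""
    users = list(observed)
    total = 0.0
    for u in users:
        total += sum(c / p for c, p in observed[u]) / n_liked[u]
    return total / len(users)


def snips_reward(observed: Observed) -> float:
    """SNIPS: same weighted sum, normalized by the sum of 1/P instead."""
    users = [u for u in observed if observed[u]]  # skipping empty users is an assumption
    total = 0.0
    for u in users:
        weighted = sum(c / p for c, p in observed[u])
        norm = sum(1.0 / p for _, p in observed[u])
        total += weighted / norm
    return total / len(users)


# Made-up example: one user, two observed liked items, four liked items overall.
observed = {"u1": [(1.0, 0.5), (0.0, 0.25)]}
ips = ips_reward(observed, n_liked={"u1": 4})
snips = snips_reward(observed)
print(ips, snips)
```

Note that `snips_reward` never needs $|S_u|$, the number of liked items over the entire item set, since the formula normalizes by the sum of inverse propensities instead.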

SLIDE 29

Estimating Propensity Scores

Factor: Popularity bias (users are more likely to be exposed to popular items).

Assumptions:

  • User-independence assumption: $P_{u,i} = P(O_{u,i} = 1) = P(O_{*,i} = 1) = P_{*,i}$
  • Two-steps assumption: $P_{*,i} = P_{*,i}^{\text{select}} \cdot P_{*,i}^{\text{interact} \mid \text{select}}$
  • User preference is not affected by item presentation: $P_{*,i}^{\text{interact} \mid \text{select}} = P_{*,i}^{\text{interact}}$

SLIDE 30

Estimating Propensity Scores

Popularity bias model [Steck 11]:

$$\hat{P}_{*,i}^{\text{select}} \propto (n_i^*)^{\gamma}$$

where $n_i^*$ is the observed item popularity.

SLIDE 31

Estimating Propensity Scores

Popularity bias model [Steck 11]:

$$\hat{P}_{*,i} \propto (n_i^*)^{\left(\frac{\gamma + 1}{2}\right)}$$

Estimated from known online content serving policy.
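
A minimal sketch of the popularity-based propensity estimate, assuming the observed interactions are available as (user, item) pairs and that $\gamma$ is supplied by the caller. The function name and the scaling to a maximum of 1.0 are illustrative choices, since the slide fixes the propensities only up to a proportionality constant.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def estimate_propensities(interactions: Iterable[Tuple[str, str]],
                          gamma: float) -> Dict[str, float]:
    """Return propensities proportional to (n*_i) ** ((gamma + 1) / 2)."""
    counts = Counter(item for _, item in interactions)  # observed popularity n*_i
    exponent = (gamma + 1.0) / 2.0
    raw = {item: n ** exponent for item, n in counts.items()}
    top = max(raw.values())
    # Scaling so that the most popular item gets 1.0 is an assumption.
    return {item: value / top for item, value in raw.items()}


# Made-up interactions: item "a" observed four times, item "b" once.
interactions = [("u1", "a"), ("u2", "a"), ("u3", "a"), ("u4", "a"), ("u1", "b")]
props = estimate_propensities(interactions, gamma=3.0)
print(props)
```

Because SNIPS divides by the sum of inverse propensities, any common scaling factor applied to all propensities cancels in that estimator, so the choice of normalization does not change its value.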

SLIDE 32

Measuring bias in recommender evaluation (Yahoo! music rating dataset)

Model   Average-Over-All   SNIPS (γ = 1.5)   SNIPS (γ = 2.0)   SNIPS (γ = 2.5)   SNIPS (γ = 3.0)
U-CML   0.401              0.270             0.260             0.253             0.248
A-CML   0.399              0.274             0.264             0.258             0.253
BPR     0.380              0.275             0.268             0.262             0.258
PMF     0.386              0.267             0.259             0.252             0.248

Mean Absolute Error (MAE), Recall

SNIPS produces significantly lower MAE.

SLIDE 33

Measuring bias in recommender evaluation (Yahoo! music rating dataset)

The accuracy of recommending popular items is a significant overestimation of the true recommendation performance.

SLIDE 34

Please come to our poster or refer to our paper for:

  • Proofs
  • Experimental details.
  • More experiments.
  • Deeper analysis of the unbiased evaluator.
SLIDE 35

Conclusions and Future Work

$$\hat{R}_{SNIPS}(\hat{Z} \mid P) = \frac{1}{|U|} \sum_{u \in U} \frac{1}{\sum_{i \in S_u^*} \frac{1}{P_{u,i}}} \sum_{i \in S_u^*} \frac{c(\hat{Z}_{u,i})}{P_{u,i}}$$

$$\mathbb{E}_O\left[\hat{R}_{IPS}(\hat{Z} \mid P)\right] = R(\hat{Z})$$

  • Understanding variance of evaluators.
  • Propensity estimation (e.g., incorporate auxiliary user and item information).
  • Debias training of recommendation systems (e.g., [Liang et al. 16]).

SLIDE 36

http://www.openrec.ai

GitHub link, documents, and tutorials

Longqi Yang

Ph.D. candidate, Computer Science, Cornell Tech, Cornell University
Email: ylongqi@cs.cornell.edu
Web: bit.ly/longqi
Twitter: @ylongqi
Connected Experiences Lab: http://cx.jacobs.cornell.edu/
Small Data Lab: http://smalldata.io/