
A Preference-Based Bandit Framework for Personalized Recommendation



  1. A Preference-Based Bandit Framework for Personalized Recommendation. Maryam Tavakol and Ulf Brefeld. Paderborn, Nov 8, 2016

  2. Introduction
     • Personalized Recommendation
     • Preference Learning
     • Multi-armed bandits

  3. Recommendation

  4. Recommendation

  5. Preference Model
     • Item i: {Shirt, Blue, Women, Cheap}
     • Item k: {Polo shirt, White, Women, Expensive}
     • Item i ≻ Item k: {Shirt-Polo shirt, Blue-White, Women-Women, Cheap-Expensive}
     $z_{i \succ k} := z_i - z_k$
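The pairwise representation on this slide can be illustrated with a minimal sketch. It assumes a one-hot encoding of attribute values, which the slides do not specify; the `VOCAB` list and the helper names `encode` and `preference_vector` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical attribute vocabulary built from the two example items on the slide.
VOCAB = ["Shirt", "Polo shirt", "Blue", "White", "Women", "Cheap", "Expensive"]

def encode(item_attributes):
    """One-hot encode a set of attribute values over VOCAB (assumed encoding)."""
    return np.array([1.0 if v in item_attributes else 0.0 for v in VOCAB])

def preference_vector(item_i, item_k):
    """Return z_{i>k} = z_i - z_k for the preference 'item i over item k'."""
    return encode(item_i) - encode(item_k)

item_i = {"Shirt", "Blue", "Women", "Cheap"}
item_k = {"Polo shirt", "White", "Women", "Expensive"}

z_ik = preference_vector(item_i, item_k)
z_ki = preference_vector(item_k, item_i)
print(dict(zip(VOCAB, z_ik)))
```

Under this encoding, attributes shared by both items (here "Women") cancel to zero, so only the attributes that distinguish the two items contribute to the preference.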

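  6. Payoff Model
     • Personalized model + average component
     • Personalized: one model per user (User 1, User 2, …, User m)
     • Average: a single model over User 1 + User 2 + … + User m
     $\mathbb{E}[r_{t, i \succ k} \mid u_t = u_j] = \beta_t^\top z_{i \succ k} + \theta^\top z_{i \succ k}$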
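As a small worked example of the payoff model, the sketch below evaluates the sum of a per-user term and a shared term for one preference vector. All numeric values, the user keys `"u1"` and `"u2"`, and the function name `expected_payoff` are made up for illustration and do not come from the slides.

```python
import numpy as np

def expected_payoff(z_ik, beta_user, theta):
    """Personalized term plus average term: beta_user^T z + theta^T z."""
    return float(beta_user @ z_ik + theta @ z_ik)

# Illustrative values only (not from the presentation).
theta = np.array([0.5, -0.2, 0.1])            # shared (average) weights
betas = {
    "u1": np.array([0.3, 0.0, -0.1]),         # personalized weights, user 1
    "u2": np.array([-0.4, 0.2, 0.0]),         # personalized weights, user 2
}
z_ik = np.array([1.0, -1.0, 0.0])             # a preference vector z_i - z_k

payoff_u1 = expected_payoff(z_ik, betas["u1"], theta)
payoff_u2 = expected_payoff(z_ik, betas["u2"], theta)
print(payoff_u1, payoff_u2)
```

The same preference receives different expected payoffs for the two users because only the personalized term changes; the shared term contributes equally to both.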

  7. Personalized Recommendation with Qualitative Bandit
     • For t = 1, …, T:
       1. The world generates some context
       2. The learner chooses an action
       3. The world reacts with a reward
     • Choose the arm with the highest mean reward + confidence interval (general case of LinUCB)
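The three-step interaction protocol can be sketched with a plain LinUCB-style learner. This is not the presentation's algorithm: it assumes a single shared weight vector, a simulated world with a linear payoff plus Gaussian noise, and hypothetical parameter names (`lam`, `c`, `n_arms`).

```python
import numpy as np

def run_bandit(T=200, d=4, n_arms=5, lam=1.0, c=1.0, seed=0):
    """Simulate T rounds of a LinUCB-style learner in a toy linear world."""
    rng = np.random.default_rng(seed)
    theta_true = rng.normal(size=d)   # hidden weights of the simulated world
    A = lam * np.eye(d)               # running Z^T Z + lam * I
    b = np.zeros(d)                   # running Z^T r
    theta_hat = np.zeros(d)
    total_reward = 0.0
    for _ in range(T):
        # 1. The world generates some context: one feature vector per arm.
        Z_t = rng.normal(size=(n_arms, d))
        # 2. The learner picks the arm with the highest mean + confidence width.
        A_inv = np.linalg.inv(A)
        theta_hat = A_inv @ b
        mean = Z_t @ theta_hat
        width = c * np.sqrt(np.einsum("ad,de,ae->a", Z_t, A_inv, Z_t))
        arm = int(np.argmax(mean + width))
        # 3. The world reacts with a reward (assumed linear payoff plus noise).
        z = Z_t[arm]
        r = float(z @ theta_true + 0.1 * rng.normal())
        A += np.outer(z, z)
        b += r * z
        total_reward += r
    return theta_hat, theta_true, total_reward

theta_hat, theta_true, total_reward = run_bandit()
print(np.round(theta_hat, 2), np.round(theta_true, 2))
```

The confidence width shrinks in directions the learner has already explored, so the choice gradually shifts from exploration toward the arm with the highest estimated mean.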

  8. Unified Optimization
     • Solving the objective function in dual space
     • With arbitrary loss function
     • Using Fenchel-Legendre conjugate
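To make the role of the Fenchel-Legendre conjugate concrete, here is a short worked derivation for the squared loss. It assumes the loss is scaled as $\ell(x) = \frac{C}{2}(x - r)^2$, which the slides do not state explicitly; the symbols $x$ and $a$ are introduced here only for the derivation.

```latex
% Fenchel-Legendre conjugate of a function \ell:
\ell^*(a) = \sup_{x} \big( a x - \ell(x) \big)

% Assumed squared loss with trade-off parameter C and observed reward r:
\ell(x) = \tfrac{C}{2} (x - r)^2

% Stationarity of a x - \ell(x) in x:  a - C (x - r) = 0
% \Rightarrow x^\star = r + a / C

% Substituting x^\star:
\ell^*(a) = a \Big( r + \tfrac{a}{C} \Big) - \tfrac{C}{2} \cdot \tfrac{a^2}{C^2}
          = a r + \tfrac{1}{2C} a^2

% Contribution of one sample with dual variable \alpha to the dual objective:
-\ell^*(-\alpha) = r \alpha - \tfrac{1}{2C} \alpha^2
```

Summing the last expression over all samples gives the terms $-\frac{1}{2C}\alpha^\top\alpha + r^\top\alpha$, which is the form a squared-loss dual takes under this scaling.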

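  9. Squared Loss
     $\max_{\alpha} \; -\frac{1}{2C} \alpha^\top \alpha + r^\top \alpha - \frac{1}{2} \alpha^\top \Big[ ZZ^\top + \frac{1}{\mu} \Big( \sum_j \phi_j \otimes \phi_j^\top \Big) \circ ZZ^\top \Big] \alpha$
     • The problem reduces to standard quadratic optimization
     • Model parameters $(\theta, \beta_j)$ are obtained from $\alpha$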
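Since the squared-loss dual is an unconstrained concave quadratic, setting its gradient to zero yields a linear system. The sketch below solves it under several assumptions that the slides do not confirm: each φ_j is taken to be the indicator vector of the samples belonging to user j, the bracketed matrix combines the Gram matrix with an elementwise same-user mask, and the parameters are recovered as θ = Zᵀα and β_j = (1/μ) Σ_{i ∈ user j} α_i z_i. The function name and the toy data are hypothetical.

```python
import numpy as np

def solve_squared_loss_dual(Z, users, r, C=1.0, mu=1.0):
    """Solve max_a -a'a/(2C) + r'a - a'Ka/2 in closed form (assumed setup)."""
    Z = np.asarray(Z, dtype=float)
    r = np.asarray(r, dtype=float)
    users = np.asarray(users)
    gram = Z @ Z.T
    # Assumed meaning of sum_j phi_j phi_j^T: 1 where two samples share a user.
    same_user = (users[:, None] == users[None, :]).astype(float)
    K = gram + (same_user * gram) / mu
    # Gradient -a/C + r - K a = 0  =>  (K + I/C) a = r
    alpha = np.linalg.solve(K + np.eye(len(r)) / C, r)
    # Assumed recovery of the primal parameters from alpha.
    theta = Z.T @ alpha
    betas = {
        u.item(): (Z[users == u].T @ alpha[users == u]) / mu
        for u in np.unique(users)
    }
    return alpha, theta, betas, K

# Toy data: four preference vectors from two users (illustrative only).
Z = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [-1.0, 1.0]])
users = np.array([0, 0, 1, 1])
r = np.array([1.0, 0.0, 1.0, 0.0])
alpha, theta, betas, K = solve_squared_loss_dual(Z, users, r, C=10.0, mu=2.0)
print(np.round(alpha, 3), np.round(theta, 3))
```

With these assumptions, the fitted value for a sample of user j, computed as (θ + β_j)ᵀz, coincides with the corresponding entry of Kα, which is a quick consistency check on the recovered parameters.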

  10. Squared Loss
      • In the contextual bandit framework:
      • Mean: $\beta_t^\top z_{i \succ k} + \theta^\top z_{i \succ k}$
      • Confidence bound: $c \sqrt{z_{i \succ k}^\top (Z^\top Z + \lambda I)^{-1} z_{i \succ k}}$
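The mean and confidence bound on this slide combine into an upper-confidence score. The following sketch computes both for one preference vector; the function names `confidence_width` and `ucb_score` and the example numbers are assumptions made for illustration.

```python
import numpy as np

def confidence_width(z, Z, lam=1.0, c=1.0):
    """c * sqrt(z^T (Z^T Z + lam * I)^{-1} z) for a preference vector z."""
    A = Z.T @ Z + lam * np.eye(Z.shape[1])
    return c * float(np.sqrt(z @ np.linalg.solve(A, z)))

def ucb_score(z, Z, beta_t, theta, lam=1.0, c=1.0):
    """Mean payoff (personalized + average) plus the confidence width."""
    mean = float(beta_t @ z + theta @ z)
    return mean + confidence_width(z, Z, lam, c)

z = np.array([3.0, 4.0])
no_data = np.zeros((0, 2))           # no observations yet
one_obs = np.array([[3.0, 4.0]])     # the same direction observed once

width_before = confidence_width(z, no_data)   # ||z|| / sqrt(lam) = 5.0
width_after = confidence_width(z, one_obs)    # sqrt(25 / 26)
score = ucb_score(z, one_obs, beta_t=np.zeros(2), theta=np.array([0.1, 0.0]))
print(width_before, width_after, score)
```

Observing a direction once already shrinks its width from 5.0 to about 0.98, which is how the bound favors preference vectors that have not been explored yet.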

  11. Algorithm

  12. Summary
      • Personalized recommendation
      • Pairwise learning in bandit framework
      • Optimization in dual space
      • Learning algorithm for squared loss

  13. Thanks for your attention. Questions? Email: tavakol@leuphana.de
