
A Unified Contextual Bandit Framework for Long- and Short-Term Recommendations

Maryam Tavakol and Ulf Brefeld, {tavakol,brefeld}@leuphana.de, Leuphana University Lüneburg. Skopje, Sep 21, 2017.


  1. A Unified Contextual Bandit Framework for Long- and Short-Term Recommendations Maryam Tavakol and Ulf Brefeld {tavakol,brefeld}@leuphana.de Skopje - Sep 21, 2017

  2. Recommendation Tavakol & Brefeld, Leuphana University Lüneburg 2/22

  3. Recommendation

  4. Personalization

  5. Short-Term Zeitgeist

  6. Proposed Approach
     • Goal: combine the long- and short-term interests of users in one unified model
       ➡ Long-term part + short-term component
       ✤ subject to: generality in terms of optimization
     • Framework: contextual multi-armed bandit (MAB), e.g., LinUCB

  7. Unified Model
     $$\mathbb{E}[r_{t,a_i} \mid u_j] = \underbrace{\theta_i^\top x_t}_{\text{Short-term}} + \underbrace{\beta_j^\top z_{a_i}}_{\text{Long-term}} + b_i$$
     • $r_{t,a_i}$: outcome; $u_j$: current user
     • $x_t$: context; $z_{a_i}$: item features; $b_i$: bias term
     • $\theta_i$: parameters of the short-term model
     • $\beta_j$: parameters of the long-term model
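The unified model of slide 7 is a sum of two inner products and a bias, which a few lines of code make concrete. The following is only a minimal sketch: the function names and all numeric values are hypothetical and chosen for illustration, not taken from the authors' code.

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def expected_reward(theta_i, x_t, beta_j, z_ai, b_i) -> float:
    """theta_i^T x_t (short-term) + beta_j^T z_ai (long-term) + b_i (bias)."""
    short_term = dot(theta_i, x_t)
    long_term = dot(beta_j, z_ai)
    return short_term + long_term + b_i


# Hypothetical toy values, for illustration only.
theta_i = [0.5, -1.0]      # short-term parameters of item a_i
x_t = [2.0, 1.0]           # context at time t
beta_j = [1.0, 0.0, 2.0]   # long-term parameters of user u_j
z_ai = [0.0, 3.0, 0.5]     # features of item a_i
b_i = 0.25                 # bias term of item a_i

score = expected_reward(theta_i, x_t, beta_j, z_ai, b_i)
```

Note that the context and the item features may have different dimensions, since each is paired with its own parameter vector.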

  8. General Optimization
     • Objective function with arbitrary loss, $V(\cdot, r_t)$:
     $$\inf_{\substack{\theta_1,\dots,\theta_n \\ \beta_1,\dots,\beta_m \\ b}} \; \frac{1}{T}\sum_{t=1}^{T} V\big(\theta_t^\top x_t + \beta_{\hat t}^\top z_t + b_t,\; r_t\big) + \frac{\lambda}{2}\sum_i \|\theta_i\|^2 + \frac{\mu}{2}\sum_j \|\beta_j\|^2$$
     • Regularization: the two squared-norm terms
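To make the structure of this objective tangible, here is a minimal sketch that evaluates an average loss plus the two squared-norm regularizers. It assumes the squared loss as one possible choice of V, a per-item bias, and an event layout of (item, user, x_t, z_t, r_t); those choices, the names, and the toy numbers are illustrative assumptions rather than the authors' implementation.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Event = Tuple[str, str, Vector, Vector, float]  # (item, user, x_t, z_t, r_t)


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def squared_loss(pred: float, r: float) -> float:
    """One admissible loss V; any other loss with this signature can be passed."""
    return 0.5 * (pred - r) ** 2


def objective(
    events: List[Event],
    theta: Dict[str, Vector],   # short-term parameters, one vector per item
    beta: Dict[str, Vector],    # long-term parameters, one vector per user
    bias: Dict[str, float],     # assumed here: one bias per item
    lam: float,
    mu: float,
    loss: Callable[[float, float], float] = squared_loss,
) -> float:
    """Average loss over T events plus (lam/2)*sum||theta||^2 + (mu/2)*sum||beta||^2."""
    data_term = sum(
        loss(dot(theta[i], x) + dot(beta[j], z) + bias[i], r)
        for i, j, x, z, r in events
    ) / len(events)
    reg_theta = 0.5 * lam * sum(dot(v, v) for v in theta.values())
    reg_beta = 0.5 * mu * sum(dot(v, v) for v in beta.values())
    return data_term + reg_theta + reg_beta


# Hypothetical toy data.
events = [
    ("shoe", "ann", [1.0, 1.0], [1.0, 0.5], 2.0),
    ("shoe", "ann", [0.0, 1.0], [1.0, 1.0], 1.0),
]
theta = {"shoe": [1.0, 0.0]}
beta = {"ann": [0.0, 2.0]}
bias = {"shoe": 0.5}

value = objective(events, theta, beta, bias, lam=0.2, mu=0.1)
```

Passing the loss as a function argument mirrors the slide's point that the loss is arbitrary: only the data term changes, while the regularizers stay the same.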

  9. General Optimization
     • Using the Fenchel-Legendre conjugate of the loss function in the dual space:
     $$\sup_{\alpha:\ \mathbf{1}^\top\alpha = 0} \; -C\sum_{t=1}^{T} V^*\!\Big(-\frac{\alpha_t}{C},\, r_t\Big) - \frac{1}{2}\,\alpha^\top\Big[\Big(\sum_i \delta_i \otimes \delta_i^\top\Big)\circ XX^\top + \frac{1}{\mu}\Big(\sum_i \phi_i \otimes \phi_i^\top\Big)\circ ZZ^\top\Big]\alpha$$
     • Kernel trick

  10. Optimization
      • Gradient-based approaches (in dual or primal)
      • Calculating the gradient depends on the loss function
      • Model parameters, $(\theta_i, \beta_j)$, are obtained from $\alpha$
      • Kernel functions applicable
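As an illustration of the gradient-based route in the primal, the sketch below runs plain gradient descent on a regularized squared-loss objective. It is a simplified stand-in under stated assumptions (squared loss, per-item bias, fixed step size, hypothetical toy data), not the authors' optimizer, and it does not cover the dual or kernelized variants.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Event = Tuple[str, str, Vector, Vector, float]  # (item, user, x_t, z_t, r_t)
Params = Dict[str, Vector]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def predict(theta: Params, beta: Params, bias: Dict[str, float], e: Event) -> float:
    item, user, x, z, _ = e
    return dot(theta[item], x) + dot(beta[user], z) + bias[item]


def objective(events, theta, beta, bias, lam, mu) -> float:
    """Average squared loss plus the two squared-norm regularizers."""
    data = sum(0.5 * (predict(theta, beta, bias, e) - e[4]) ** 2 for e in events)
    reg = 0.5 * lam * sum(dot(v, v) for v in theta.values())
    reg += 0.5 * mu * sum(dot(v, v) for v in beta.values())
    return data / len(events) + reg


def gradient_step(events, theta, beta, bias, lam, mu, lr) -> None:
    """One in-place gradient-descent step on all parameters."""
    # Gradients of the regularizers: lam * theta_i and mu * beta_j.
    g_theta = {i: [lam * w for w in v] for i, v in theta.items()}
    g_beta = {j: [mu * w for w in v] for j, v in beta.items()}
    g_bias = {i: 0.0 for i in bias}
    for e in events:
        item, user, x, z, r = e
        # Derivative of the averaged squared loss w.r.t. the prediction.
        resid = (predict(theta, beta, bias, e) - r) / len(events)
        g_theta[item] = [g + resid * xi for g, xi in zip(g_theta[item], x)]
        g_beta[user] = [g + resid * zi for g, zi in zip(g_beta[user], z)]
        g_bias[item] += resid
    for i in theta:
        theta[i] = [w - lr * g for w, g in zip(theta[i], g_theta[i])]
    for j in beta:
        beta[j] = [w - lr * g for w, g in zip(beta[j], g_beta[j])]
    for i in bias:
        bias[i] -= lr * g_bias[i]


# Hypothetical toy data: two items, two users, all parameters start at zero.
events = [
    ("shoe", "ann", [1.0, 0.0], [1.0, 0.5], 1.0),
    ("shoe", "bob", [0.0, 1.0], [1.0, 0.5], 0.0),
    ("hat", "ann", [1.0, 1.0], [0.0, 1.0], 1.0),
    ("hat", "bob", [0.5, 0.5], [0.0, 1.0], 0.0),
]
theta = {"shoe": [0.0, 0.0], "hat": [0.0, 0.0]}
beta = {"ann": [0.0, 0.0], "bob": [0.0, 0.0]}
bias = {"shoe": 0.0, "hat": 0.0}

initial = objective(events, theta, beta, bias, lam=0.1, mu=0.1)
for _ in range(200):
    gradient_step(events, theta, beta, bias, lam=0.1, mu=0.1, lr=0.1)
final = objective(events, theta, beta, bias, lam=0.1, mu=0.1)
```

Only the line computing `resid` depends on the loss; swapping in another differentiable loss would change that derivative and nothing else, which is the sense in which the gradient "depends on the loss function".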

  11–18. Algorithm

  19. Instantiation: Squared Loss
      • Conjugate of the loss function:
      $$V^*\!\Big(-\frac{\alpha_t}{C},\, r_t\Big) = \frac{1}{2C^2}\,\alpha_t^2 - \frac{1}{C}\,\alpha_t r_t$$
      • Becomes a standard quadratic optimization with a constraint
      • Confidence bound:
      $$c\,\sqrt{x_t^\top (X^\top X)^{-1} x_t + z_t^\top (Z^\top Z)^{-1} z_t}$$
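The squared-loss confidence bound on this slide can be computed directly from the two design matrices. The sketch below uses only the standard library and a small Gaussian-elimination solver; the optional `ridge` argument is an added assumption for numerical safety when a Gram matrix is singular and is not part of the slide's formula. All names and numbers are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def gram(rows: Matrix, ridge: float = 0.0) -> Matrix:
    """Return X^T X + ridge * I for a design matrix given as a list of rows."""
    d = len(rows[0])
    g = [[ridge if p == q else 0.0 for q in range(d)] for p in range(d)]
    for row in rows:
        for p in range(d):
            for q in range(d):
                g[p][q] += row[p] * row[q]
    return g


def solve(a: Matrix, rhs: Vector) -> Vector:
    """Solve a @ y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [list(a[p]) + [rhs[p]] for p in range(n)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda p: abs(m[p][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; consider a positive ridge")
        m[col], m[pivot] = m[pivot], m[col]
        for p in range(col + 1, n):
            factor = m[p][col] / m[col][col]
            for q in range(col, n + 1):
                m[p][q] -= factor * m[col][q]
    y = [0.0] * n
    for p in reversed(range(n)):
        tail = sum(m[p][q] * y[q] for q in range(p + 1, n))
        y[p] = (m[p][n] - tail) / m[p][p]
    return y


def quad_form_inv(rows: Matrix, v: Vector, ridge: float = 0.0) -> float:
    """Compute v^T (rows^T rows + ridge * I)^{-1} v without forming the inverse."""
    y = solve(gram(rows, ridge), v)
    return sum(a * b for a, b in zip(v, y))


def confidence_bound(X, x_t, Z, z_t, c=1.0, ridge=0.0) -> float:
    """c * sqrt(x_t^T (X^T X)^{-1} x_t + z_t^T (Z^T Z)^{-1} z_t)."""
    return c * math.sqrt(quad_form_inv(X, x_t, ridge) + quad_form_inv(Z, z_t, ridge))


# Hypothetical toy data: past contexts X and past item features Z.
X = [[1.0, 0.0], [0.0, 2.0]]
Z = [[2.0]]
bound = confidence_bound(X, [1.0, 2.0], Z, [2.0], c=2.0)
```

In a UCB-style policy this quantity would be added to the predicted reward of each candidate, so that rarely observed directions in context or feature space receive a larger exploration bonus.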

  20. Instantiation: Logistic Loss
      • Conjugate of the loss function:
      $$V^*\!\Big(-\frac{\alpha_t}{C},\, r_t\Big) = \Big(1 - \frac{\alpha_t}{C r_t}\Big)\log\Big(1 - \frac{\alpha_t}{C r_t}\Big) + \frac{\alpha_t}{C r_t}\log\Big(\frac{\alpha_t}{C r_t}\Big)$$
      • Confidence bound, where $V_a$ and $V_u$ are diagonal matrices of the sigmoid model:
      $$c\,\sqrt{x_t^\top (X^\top V_a X)^{-1} x_t + z_t^\top (Z^\top V_u Z)^{-1} z_t}$$
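For readers who want to see where the entropy-like form of the logistic conjugate comes from, here is a short derivation. It assumes the standard logistic loss $V(y, r) = \log(1 + e^{-ry})$ with labels $r \in \{-1, +1\}$; the slide does not spell out the loss, so this definition is an assumption.

```latex
\begin{align*}
V^*(s, r) &= \sup_{y}\; \bigl[\, s\,y - \log\bigl(1 + e^{-r y}\bigr) \bigr] \\
\text{stationarity:}\quad s &= -r\,\sigma(-r y),
  \qquad \sigma(a) = \frac{1}{1 + e^{-a}} \\
\text{define}\quad u &:= \sigma(-r y) = -\frac{s}{r} \in (0, 1)
  \;\Longrightarrow\; -r y = \log\frac{u}{1-u} \\
s\,y &= u \log\frac{u}{1-u},
  \qquad \log\bigl(1 + e^{-r y}\bigr) = -\log(1 - u) \\
V^*(s, r) &= u \log u + (1 - u)\log(1 - u) \\
\text{with } s = -\frac{\alpha_t}{C}:\quad u &= \frac{\alpha_t}{C\, r_t}
\end{align*}
```

Substituting $u = \alpha_t / (C r_t)$ into the last expression for $V^*$ gives the conjugate as stated on the slide.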

  21. Model Simplification
      • Focus on the item model
        • Short-Term: $\mathbb{E}[r_{t,a_i}] = \theta_i^\top x_t$
        • Short-Term+Average: $\mathbb{E}[r_{t,a_i}] = \theta_i^\top x_t + \beta^\top z_{a_i}$
      • Focus on the user model
        • Long-Term: $\mathbb{E}[r_{t,a_i} \mid u_j] = \beta_j^\top z_{a_i}$
        • Long-Term+Average: $\mathbb{E}[r_{t,a_i} \mid u_j] = \beta_j^\top z_{a_i} + \theta^\top z_{a_i}$

  22. Empirical Study
      • Using squared loss function
      • Dataset: user transactions from Zalando*
      • Baseline: Matrix Factorization (MF)
      • Performance measure: normalized average rank
      *www.zalando.com
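The slides do not define the normalized average rank, so the sketch below shows one common convention as an assumption: for each test event, count how many candidates are scored above the item the user actually chose, divide by the number of other candidates, and average over events. Under this convention 0.0 is best and 1.0 is worst; the authors' exact definition may differ, and all names and scores here are hypothetical.

```python
from typing import Dict, List, Tuple


def normalized_rank(scores: Dict[str, float], target: str) -> float:
    """Fraction of other candidates scored strictly above the target item."""
    if len(scores) < 2:
        raise ValueError("need at least two candidate items")
    better = sum(
        1 for item, s in scores.items() if item != target and s > scores[target]
    )
    return better / (len(scores) - 1)


def normalized_average_rank(events: List[Tuple[Dict[str, float], str]]) -> float:
    """Mean normalized rank over (candidate scores, chosen item) pairs."""
    return sum(normalized_rank(scores, target) for scores, target in events) / len(events)


# Hypothetical model scores for two test events.
events = [
    ({"a": 0.9, "b": 0.5, "c": 0.1}, "a"),  # chosen item ranked first -> 0.0
    ({"a": 0.2, "b": 0.5, "c": 0.1}, "c"),  # chosen item ranked last  -> 1.0
]
nar = normalized_average_rank(events)
```

Ties are resolved in favor of the chosen item here, which is another illustrative choice.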

  23. No New User/Item
      • The combined approach outperforms either short- or long-term models, but not the baseline!

  24. Cold Start Scenarios
      • Robustness of the combined model in case of a new user/item: it generalizes well in both cases

  25. Time Complexity
      • The optimization time of the combined model is exponential

  26. Short-Term Models
      • The average term compensates for the new items

  27. Long-Term Models
      • The average term compensates for the new users

  28. Conclusion
      • The short- and long-term interests of users are combined in one model
      • Free choice of loss function and model complexity
      • There is not one best model: the choice depends on the application

  29. Questions? Thanks for your attention
      A Unified Contextual Bandit Framework for Long- and Short-Term Recommendations
      Maryam Tavakol & Ulf Brefeld, {tavakol,brefeld}@leuphana.de
      Source code available at https://github.com/marytavakol/Bandits
