A Unified Contextual Bandit Framework for Long- and Short-Term Recommendations
Maryam Tavakol and Ulf Brefeld
{tavakol,brefeld}@leuphana.de
Skopje - Sep 21, 2017

Recommendation Tavakol & Brefeld, Leuphana University Lüneburg 2/22

Personalization

Short-Term Zeitgeist

Proposed Approach
• Goal: Combination of long- and short-term interests of users in one unified model
  ➡ Long-term part + Short-term component
  ✤ s.t.: Generality in terms of optimization
• Framework: Contextual Multi-Armed Bandit (MAB)
  • e.g., LinUCB
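Since the slide names LinUCB as an example of a contextual bandit, a minimal sketch of the disjoint LinUCB rule may help: each arm keeps a ridge-regression estimate and adds an upper-confidence bonus. The class name `LinUCBArm`, the helper `select_arm`, and the exploration weight `alpha` are illustrative choices, not taken from the slides or the authors' code.

```python
import numpy as np


class LinUCBArm:
    """One arm of disjoint LinUCB: ridge regression plus a confidence bonus."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha        # exploration weight (assumed hyperparameter)
        self.A = np.eye(dim)      # identity plus the sum of x x^T seen so far
        self.b = np.zeros(dim)    # sum of reward * x seen so far

    def ucb(self, x):
        theta = np.linalg.solve(self.A, self.b)                 # ridge estimate
        bonus = self.alpha * np.sqrt(x @ np.linalg.solve(self.A, x))
        return theta @ x + bonus

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x


def select_arm(arms, x):
    """Return the index of the arm with the highest upper confidence bound."""
    return int(np.argmax([arm.ucb(x) for arm in arms]))
```

In use, one would call `select_arm(arms, x)` for each incoming context, observe the reward, and call `update` on the chosen arm only.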

Unified Model
$$E[r_{t,a_i} \mid u_j] = \underbrace{\theta_i^\top x_t}_{\text{Short-term}} + \underbrace{\beta_j^\top z_{a_i}}_{\text{Long-term}} + b_i$$
• $r_{t,a_i}$: outcome; $u_j$: current user; $x_t$: context; $z_{a_i}$: item features; $b_i$: bias term
• $\theta_i$: parameters of short-term model
• $\beta_j$: parameters of long-term model
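The unified model adds a short-term term (item parameters applied to the context), a long-term term (user parameters applied to the item features), and a bias. The sketch below only evaluates that sum; the function name `expected_reward` and the argument names are illustrative.

```python
import numpy as np


def expected_reward(theta_i, x_t, beta_j, z_ai, b_i):
    """Expected reward of item a_i for user u_j at time t under the unified model."""
    short_term = theta_i @ x_t   # item parameters applied to the current context
    long_term = beta_j @ z_ai    # user parameters applied to the item features
    return short_term + long_term + b_i
```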

General Optimization
• Objective function with arbitrary loss, $V(\cdot, r_t)$:
$$\inf_{\substack{\theta_1,\dots,\theta_n \\ \beta_1,\dots,\beta_m \\ b}} \; \frac{1}{T}\sum_{t=1}^{T} V\big(\theta_t^\top x_t + \beta_t^\top z_t + b_t,\; r_t\big) + \frac{\lambda}{2}\sum_i \|\theta_i\|^2 + \frac{\hat{\mu}}{2}\sum_j \|\beta_j\|^2$$
• The last two sums are the regularization terms.
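To make the objective concrete, here is a sketch that evaluates it for a pluggable loss: the average loss over the T interactions plus the two squared-norm regularizers. It assumes that the parameters used at time t are those of the item and user active at that step (index arrays `items` and `users`) and that the bias is stored per item; both are assumptions of this sketch, as are all function and argument names.

```python
import numpy as np


def squared_loss(pred, r):
    return 0.5 * (pred - r) ** 2


def primal_objective(Theta, Beta, b, X, Z, items, users, r,
                     lam, mu_hat, loss=squared_loss):
    """Average loss plus regularization.

    Theta: (n, d) item parameters, Beta: (m, k) user parameters,
    b: (n,) per-item bias (assumption), X: (T, d) contexts,
    Z: (T, k) item features, items/users: (T,) integer indices.
    """
    preds = (np.sum(Theta[items] * X, axis=1)
             + np.sum(Beta[users] * Z, axis=1)
             + b[items])
    data_term = np.mean(loss(preds, r))
    reg = 0.5 * lam * np.sum(Theta ** 2) + 0.5 * mu_hat * np.sum(Beta ** 2)
    return data_term + reg
```

Passing a different `loss` callable (for example a logistic loss) changes only the data term, which mirrors the "arbitrary loss" point on the slide.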

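General Optimization
• Using the Fenchel-Legendre conjugate of the loss function in the dual space:
$$\sup_{\alpha,\; \mathbf{1}^\top \alpha = 0} \; -C\sum_{t=1}^{T} V^*\Big(-\frac{\alpha_t}{C},\, r_t\Big) - \frac{1}{2}\,\alpha^\top \Big[\Big(\sum_i \delta_i \otimes \delta_i^\top\Big) \circ XX^\top + \frac{1}{\mu}\Big(\sum_j \phi_j \otimes \phi_j^\top\Big) \circ ZZ^\top\Big]\alpha$$
• The Gram matrices $XX^\top$ and $ZZ^\top$ allow for the kernel trick.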

Optimization
• Gradient-based approaches (in dual or primal)
• Calculating the gradient depends on the loss function
• Model parameters, $(\theta_i, \beta_j)$, are obtained from $\alpha$
• Kernel functions applicable
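As an illustration of the gradient-based route in the primal, the sketch below runs plain gradient descent on the regularized objective with squared loss. It is not the authors' solver: the step size, iteration count, per-item bias, and all names (`predict`, `fit_primal`) are assumptions made for the example.

```python
import numpy as np


def predict(Theta, Beta, b, X, Z, items, users):
    return (np.sum(Theta[items] * X, axis=1)
            + np.sum(Beta[users] * Z, axis=1)
            + b[items])


def fit_primal(X, Z, items, users, r, n_items, n_users,
               lam=0.1, mu_hat=0.1, lr=0.1, steps=300):
    """Gradient descent on mean squared loss plus L2 regularization."""
    T, d = X.shape
    k = Z.shape[1]
    Theta = np.zeros((n_items, d))
    Beta = np.zeros((n_users, k))
    b = np.zeros(n_items)
    for _ in range(steps):
        # Derivative of the averaged squared loss w.r.t. each prediction.
        resid = (predict(Theta, Beta, b, X, Z, items, users) - r) / T
        g_Theta = lam * Theta
        g_Beta = mu_hat * Beta
        g_b = np.zeros(n_items)
        # Accumulate per-step gradients into the rows of the active item/user.
        np.add.at(g_Theta, items, resid[:, None] * X)
        np.add.at(g_Beta, users, resid[:, None] * Z)
        np.add.at(g_b, items, resid)
        Theta -= lr * g_Theta
        Beta -= lr * g_Beta
        b -= lr * g_b
    return Theta, Beta, b
```

With another loss, only the line computing `resid` would change, which is the sense in which the gradient depends on the loss function.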

Algorithm

Instantiation: Squared Loss
• Conjugate of the loss function:
$$V^*\Big(-\frac{\alpha_t}{C},\, r_t\Big) = \frac{1}{2C^2}\,\alpha_t^2 - \frac{1}{C}\,\alpha_t r_t$$
• Becomes a standard quadratic optimization with a constraint
• Confidence bound:
$$c\,\sqrt{x_t^\top (X^\top X)^{-1} x_t + z_t^\top (Z^\top Z)^{-1} z_t}$$
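The squared-loss confidence bound is a scaled square root of two quadratic forms, one in the context and one in the item features. A small sketch of that computation follows; the function name and the optional `ridge` term (added to keep the matrices invertible when there are few observations) are assumptions of this example, not part of the slide's formula.

```python
import numpy as np


def confidence_bound(x_t, z_t, X, Z, c=1.0, ridge=0.0):
    """c * sqrt(x^T (X^T X)^-1 x + z^T (Z^T Z)^-1 z), with optional ridge."""
    A_x = X.T @ X + ridge * np.eye(X.shape[1])
    A_z = Z.T @ Z + ridge * np.eye(Z.shape[1])
    width_sq = (x_t @ np.linalg.solve(A_x, x_t)
                + z_t @ np.linalg.solve(A_z, z_t))
    return c * np.sqrt(width_sq)
```

Using `np.linalg.solve` avoids forming the explicit inverses, which is usually the more stable choice.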

Instantiation: Logistic Loss
• Conjugate of the loss function:
$$V^*\Big(-\frac{\alpha_t}{C},\, r_t\Big) = \Big(1 - \frac{\alpha_t}{C r_t}\Big)\log\Big(1 - \frac{\alpha_t}{C r_t}\Big) + \frac{\alpha_t}{C r_t}\log\Big(\frac{\alpha_t}{C r_t}\Big)$$
• Confidence bound, where $V_a$ and $V_u$ are diagonal matrices of the sigmoid model:
$$c\,\sqrt{x_t^\top (X^\top V_a X)^{-1} x_t + z_t^\top (Z^\top V_u Z)^{-1} z_t}$$
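The logistic conjugate can be checked numerically. Writing s = α_t / (C r_t), the closed form is (1 - s) log(1 - s) + s log(s), and it should match the supremum over y of u·y - log(1 + exp(-r·y)) at u = -α_t / C. The sketch below assumes labels r in {-1, +1} and 0 < s < 1, and approximates the supremum on a grid; the function names are illustrative.

```python
import numpy as np


def logistic_loss(y, r):
    return np.logaddexp(0.0, -r * y)   # log(1 + exp(-r * y)), numerically stable


def logistic_conjugate(alpha_t, r_t, C):
    """Closed-form conjugate evaluated at -alpha_t / C."""
    s = alpha_t / (C * r_t)
    if not 0.0 < s < 1.0:
        raise ValueError("alpha_t / (C * r_t) must lie strictly between 0 and 1")
    return (1.0 - s) * np.log(1.0 - s) + s * np.log(s)


def conjugate_by_grid(u, r, lo=-30.0, hi=30.0, n=120001):
    """Approximate sup_y [u * y - V(y, r)] on a dense grid."""
    y = np.linspace(lo, hi, n)
    return np.max(u * y - logistic_loss(y, r))
```

For example, comparing `logistic_conjugate(0.3, 1.0, 1.0)` with `conjugate_by_grid(-0.3, 1.0)` should give values that agree to several decimal places.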

Model Simplification
• Focus on the item model
  • Short-Term: $E[r_{t,a_i}] = \theta_i^\top x_t$
  • Short-Term+Average: $E[r_{t,a_i}] = \theta_i^\top x_t + \beta^\top z_{a_i}$
• Focus on the user model
  • Long-Term: $E[r_{t,a_i} \mid u_j] = \beta_j^\top z_{a_i}$
  • Long-Term+Average: $E[r_{t,a_i} \mid u_j] = \beta_j^\top z_{a_i} + \theta^\top z_{a_i}$

Empirical Study
• Using squared loss function
• Dataset: User transactions from Zalando*
• Baseline: Matrix Factorization (MF)
• Performance measure: normalized average rank
*www.zalando.com

No New User/Item
• The combined approach outperforms both the short- and long-term models, but not the baseline!

Cold Start Scenarios
• Robustness of the combined model for new users/items: it generalizes well in both cases

Time Complexity
• The optimization time of the combined model is exponential

Short-Term Models
• The average term compensates for the new items

Long-Term Models
• The average term compensates for the new users

Conclusion
• The short- and long-term interests of users are combined in one model
• Free choice of loss function and model complexity
• There is not one best model: the choice depends on the application

Questions?
Thanks for your attention
Source code available at https://github.com/marytavakol/Bandits
