A Contextual-Bandit Approach to Personalized News Article Recommendation

Mar 09, 2023

A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li, Wei Chu, John Langford, Robert E. Schapire
Presenter: Qingyun Wu
News Recommendation Cycle
A K-armed Bandit Formulation
• A gambler must decide which of K non-identical slot machines (we call them arms) to play in a sequence of trials in order to maximize total reward.
• News website <-> gambler
• Candidate news articles <-> arms
• User click <-> reward
How to pull arms to maximize reward?
A K-armed Bandit Formulation: Setting
• Set of $K$ choices (arms)
• Each choice $a$ is associated with an unknown probability distribution $p_a$ supported in $[0, 1]$
• We play the game for $T$ rounds
• In each round $t$: (1) we pick article $j$; (2) we observe a random sample $X_t$ from $p_j$
Our goal: maximize $\sum_{t=1}^{T} X_t$
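The setting above can be sketched as a small simulation. This is only an illustration under stated assumptions: the class `BernoulliBandit`, the function `play`, and the click probabilities 0.9, 0.5 and 0.1 are hypothetical and not from the slides. Bernoulli rewards are chosen because a click is either 0 or 1, which is one distribution supported in $[0, 1]$.

```python
import random


class BernoulliBandit:
    """K arms; arm a pays 1 with hidden probability means[a], else 0."""

    def __init__(self, means, seed=0):
        self.means = list(means)          # hypothetical click probabilities
        self.rng = random.Random(seed)

    def pull(self, arm):
        # One random sample X_t from the distribution of the chosen arm.
        return 1.0 if self.rng.random() < self.means[arm] else 0.0


def play(bandit, choose_arm, rounds):
    """Play `rounds` rounds; choose_arm(t, history) returns the arm to pull."""
    history = []                          # list of (arm, reward) pairs
    total = 0.0                           # running value of sum_t X_t
    for t in range(1, rounds + 1):
        arm = choose_arm(t, history)
        reward = bandit.pull(arm)
        history.append((arm, reward))
        total += reward
    return total


bandit = BernoulliBandit([0.9, 0.5, 0.1])
# A policy that always shows article 0, just to exercise the loop.
total = play(bandit, lambda t, history: 0, rounds=1000)
```

Any strategy discussed later fits the `choose_arm` slot: it sees the round number and the history of (arm, reward) pairs, and never the hidden means.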
Ideal Solution
Pick $\arg\max_a \mu_a$
But we DO NOT know the mean.
Feasible Solution Every time we pull an arm we learn a bit more about the distribution.
Exploitation vs. Exploration
• Exploitation: pull the arm for which we currently have the highest estimate of the mean reward
• Exploration: pull an arm we have never pulled before
Extreme examples:
• Greedy strategy: take the arm with the highest average reward (too confident)
• Random strategy: randomly choose an arm (too unconfident)
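A minimal sketch of the two extremes, assuming Bernoulli click rewards. The helper names, the means `[0.1, 0.5, 0.9]`, the initial estimate of 0.0 for unpulled arms, and the lowest-index tie-break are all assumptions made for illustration. Under those assumptions the greedy strategy never leaves the first arm, which is the "too confident" failure described above.

```python
import random


def simulate(means, choose_arm, rounds, seed=0):
    """Run a strategy on a Bernoulli bandit; return (pull counts, total reward)."""
    rng = random.Random(seed)
    counts = [0] * len(means)
    sums = [0.0] * len(means)
    for _ in range(rounds):
        arm = choose_arm(counts, sums, rng)
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts, sum(sums)


def greedy(counts, sums, rng):
    # Pure exploitation: highest average reward so far.
    # Assumption: unpulled arms start at 0.0 and ties go to the lowest index.
    averages = [s / n if n else 0.0 for s, n in zip(sums, counts)]
    return max(range(len(averages)), key=averages.__getitem__)


def uniform_random(counts, sums, rng):
    # Pure exploration: ignore everything learned so far.
    return rng.randrange(len(counts))


means = [0.1, 0.5, 0.9]                   # hypothetical click probabilities
greedy_counts, greedy_total = simulate(means, greedy, rounds=2000)
random_counts, random_total = simulate(means, uniform_random, rounds=2000)
```

With this setup greedy spends every round on the worst arm, while the random strategy spreads its pulls evenly and never concentrates on the best arm.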
How to Make the Trade-off
Exploitation vs. Exploration
Don't just look at the mean (that's the expected reward), but also the confidence!
UCB (Upper Confidence Bound) Algorithm
Pick $\arg\max_a \left( \hat{\mu}_a + \alpha \cdot \text{Variance} \right)$
Pick $\arg\max_a \left( \hat{\mu}_a + \alpha \cdot \text{UCB} \right)$
A confidence interval is a range of values within which we are sure the mean lies with a certain probability.
UCB1: $\arg\max_a \left( \hat{\mu}_a + \sqrt{\frac{2 \ln T}{n_a}} \right)$, where $n_a$ is the number of times arm $a$ has been pulled.
Reference: Finite-time Analysis of the Multiarmed Bandit Problem, Peter Auer, Nicolò Cesa-Bianchi, Paul Fischer http://homes.di.unimi.it/~cesabian/Pubblicazioni/ml-02.pdf
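The UCB1 rule can be sketched as follows. This is an illustration under assumptions, not the presenter's code: the function name `ucb1`, the Bernoulli rewards, and the means `[0.1, 0.5, 0.9]` are hypothetical. The sketch uses the number of rounds played so far, `t`, in the logarithm, as in the cited paper, where the slide writes $T$; each arm is pulled once first so that $n_a > 0$.

```python
import math
import random


def ucb1(means, rounds, seed=0):
    """Play UCB1 on a Bernoulli bandit; return (pull counts, total reward)."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k                      # n_a: pulls of each arm
    sums = [0.0] * k                      # accumulated reward of each arm

    def index(a, t):
        # Empirical mean plus the confidence width sqrt(2 ln t / n_a).
        return sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])

    for t in range(1, rounds + 1):
        if t <= k:
            arm = t - 1                   # initialization: try every arm once
        else:
            arm = max(range(k), key=lambda a: index(a, t))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts, sum(sums)


counts, total = ucb1([0.1, 0.5, 0.9], rounds=5000)
```

The width shrinks as an arm is pulled more often, so rarely tried arms keep getting occasional pulls while most rounds go to the arm with the highest estimated mean.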
Make Use of Contextual Information
• User features: demographic information, geographic features, behavioral categories
• Article features: URL categories, topic categories
Assumption about the reward: the expected reward of an arm $a$ is linear in its $d$-dimensional feature $x_{t,a}$, with some unknown coefficient vector $\theta_a^*$; namely, for all $t$,
$E(r_{t,a} \mid x_{t,a}) = x_{t,a}^T \theta_a^*$
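UCB (Upper Confidence Bound) Algorithm
Assumption: $E(r_{t,a} \mid x_{t,a}) = x_{t,a}^T \theta_a^*$
Parameter estimation (ridge regression): $\hat{\theta}_a = (D_a^T D_a + I_d)^{-1} D_a^T c_a$, where the rows of $D_a$ are the contexts previously observed for arm $a$ and $c_a$ is the corresponding vector of observed rewards.
Bound of the variance (the bound we need!):
$\left| x_{t,a}^T \hat{\theta}_a - E(r_{t,a} \mid x_{t,a}) \right| \le \alpha \sqrt{x_{t,a}^T (D_a^T D_a + I_d)^{-1} x_{t,a}}$
Pick $\arg\max_a \left( x_{t,a}^T \hat{\theta}_a + \alpha \sqrt{x_{t,a}^T (D_a^T D_a + I_d)^{-1} x_{t,a}} \right)$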
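A minimal pure-Python sketch of this selection rule, with one ridge-regression model per arm. Several things here are assumptions for illustration only: the class `LinUCBArm`, the helpers `dot` and `matvec`, the two-dimensional one-hot user contexts, the coefficient vectors in `true_theta`, and `alpha = 1.0`. Instead of re-inverting $D_a^T D_a + I_d$ every round, the sketch keeps the inverse and updates it with the Sherman-Morrison rank-one formula, which yields the same matrix.

```python
import math
import random


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


class LinUCBArm:
    """Ridge-regression state for one arm: (D^T D + I)^{-1} and D^T c."""

    def __init__(self, d):
        self.a_inv = [[float(i == j) for j in range(d)] for i in range(d)]
        self.b = [0.0] * d

    def score(self, x, alpha):
        theta = matvec(self.a_inv, self.b)            # ridge estimate
        width = math.sqrt(max(0.0, dot(x, matvec(self.a_inv, x))))
        return dot(x, theta) + alpha * width          # estimate + bound

    def update(self, x, reward):
        # Sherman-Morrison: (A + x x^T)^{-1} = A^{-1} - (A^{-1}x)(A^{-1}x)^T / (1 + x^T A^{-1} x)
        ax = matvec(self.a_inv, x)
        denom = 1.0 + dot(x, ax)
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
            self.b[i] += reward * x[i]


def run(rounds=3000, alpha=1.0, seed=0):
    """Return the fraction of rounds in which the best arm was chosen."""
    rng = random.Random(seed)
    true_theta = [[0.8, 0.2], [0.2, 0.8]]             # hypothetical theta_a^*
    arms = [LinUCBArm(2) for _ in true_theta]
    optimal = 0
    for _ in range(rounds):
        # Hypothetical context: one of two user types, shared by both arms.
        x = [1.0, 0.0] if rng.random() < 0.5 else [0.0, 1.0]
        chosen = max(range(len(arms)), key=lambda a: arms[a].score(x, alpha))
        reward = 1.0 if rng.random() < dot(x, true_theta[chosen]) else 0.0
        arms[chosen].update(x, reward)
        best = max(range(len(arms)), key=lambda a: dot(x, true_theta[a]))
        optimal += chosen == best
    return optimal / rounds


optimal_fraction = run()
```

Only the chosen arm is updated each round, since only its reward is observed. Larger `alpha` widens the bound and favors exploration; smaller `alpha` moves the rule toward the greedy strategy.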
Performance Evaluation
Summary
• Model news recommendation as a K-armed bandit problem
• UCB-type algorithm
• Take contextual information into consideration
Q&A
