The multi-armed bandit problem
(with covariates if we have time)
Vianney Perchet (LPMA, Université Paris Diderot) & Philippe Rigollet (ORFE, Princeton University)
Algorithms and Dynamics for Games and Optimization, October 14-18, 2013
Introduction
Static case Framework Static Case Successive Elimination (SE)
$\log(T\Delta_{\min}^2)/\Delta_{\min}$
Draw each arm once: $Y^1_1, \dots, Y^K_K$ (Round 1)
After round $\tau$: an estimate of $f^k$
$\sum_k \log(T)/\Delta_k$
$K\log(T\Delta_{\min}^2/K)/\Delta_{\min}$ sufficient with covariates.
$\sum_k \frac{\log(T)}{\Delta_k}$, MOSS: $\frac{K\log(T\Delta_{\min}^2/K)}{\Delta_{\min}}$
$\sum_k \Delta_k\,\mathbb{E}[n_k]$, with $n_k$ the number of draws of arm $k$
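As a rough illustration of Successive Elimination and of the sum $\sum_k \Delta_k n_k$, here is a minimal Python sketch. It rests on several assumptions that are not in the slides: rewards are Bernoulli, the confidence radius is $\sqrt{2\log(T)/n}$ (the exact threshold used in the talk is not recoverable from this transcript), and the names `successive_elimination` and `pseudo_regret` are hypothetical.

```python
import math
import random


def successive_elimination(means, horizon, seed=0):
    """Sketch of Successive Elimination on Bernoulli arms.

    `means` holds the true means f^k (known only to the simulator).
    Returns the list of draw counts n_k after `horizon` draws.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    active = list(range(n_arms))
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    t = 0
    while t < horizon:
        # One round: draw every arm still in play once.
        for k in active:
            if t >= horizon:
                break
            reward = 1.0 if rng.random() < means[k] else 0.0
            counts[k] += 1
            sums[k] += reward
            t += 1
        if t >= horizon:
            break

        # Assumed confidence radius (not the slides' exact threshold).
        def radius(k):
            return math.sqrt(2.0 * math.log(horizon) / counts[k])

        def mean(k):
            return sums[k] / counts[k]

        # Keep an arm only if its upper bound reaches the best lower bound.
        best_lower = max(mean(k) - radius(k) for k in active)
        active = [k for k in active if mean(k) + radius(k) >= best_lower]
    return counts


def pseudo_regret(means, counts):
    """Sum over arms of Delta_k * n_k, with Delta_k the gap to the best mean."""
    best = max(means)
    return sum((best - m) * n for m, n in zip(means, counts))


if __name__ == "__main__":
    arm_means = [0.9, 0.5, 0.1]
    draws = successive_elimination(arm_means, horizon=5000)
    print(draws, pseudo_regret(arm_means, draws))
```

The arm with the highest lower bound always survives its own test, so the active set never becomes empty.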
Dynamic Framework Framework Binned Successive Elimination (BSE) Adaptively BSE (ABSE)
Maximize $\sum_{t=1}^{T} f^{k_t}(X_t)$ or minimize regret
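To make the two objectives concrete, the sketch below computes the cumulative reward $\sum_t f^{k_t}(X_t)$ and the regret against an oracle that plays the best arm for each covariate. The oracle comparison is the usual definition of regret with covariates and is an assumption here, since the slide's own formula did not survive; the function names are hypothetical.

```python
def cumulative_reward(reward_fns, arms, covariates):
    """Sum of f^{k_t}(X_t) over the played arms k_t and covariates X_t."""
    return sum(reward_fns[k](x) for k, x in zip(arms, covariates))


def oracle_regret(reward_fns, arms, covariates):
    """Gap between the covariate-wise best arm and the arms actually played."""
    oracle = sum(max(f(x) for f in reward_fns) for x in covariates)
    return oracle - cumulative_reward(reward_fns, arms, covariates)


if __name__ == "__main__":
    # Two illustrative mean-reward functions on [0, 1].
    fns = [lambda x: x, lambda x: 1.0 - x]
    print(oracle_regret(fns, arms=[0, 0], covariates=[0.25, 0.75]))
```

Maximizing the first quantity and minimizing the second are equivalent, since the oracle term does not depend on the policy.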
$\bar f^k_B = \frac{1}{P(B)} \int_B f^k(x)\,dP(x)$, with means $(\bar f^1_B, \dots, \bar f^K_B)$.
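One way to picture Binned Successive Elimination is to split the covariate space into bins and run an independent elimination procedure in each bin. The sketch below is written under loud assumptions: covariates are uniform on $[0,1]^d$, bins are equal hypercubes, rewards are Bernoulli, and the confidence radius $\sqrt{2\log(T)/n}$ is a placeholder rather than the threshold from the talk. `BinState`, `bin_index` and `binned_se` are hypothetical names.

```python
import math
import random


class BinState:
    """Per-bin elimination state: one static bandit per bin."""

    def __init__(self, n_arms):
        self.active = list(range(n_arms))
        self.counts = [0] * n_arms
        self.sums = [0.0] * n_arms
        self.cursor = 0  # position inside the current round

    def next_arm(self):
        return self.active[self.cursor]

    def update(self, arm, reward, horizon):
        self.counts[arm] += 1
        self.sums[arm] += reward
        self.cursor += 1
        if self.cursor < len(self.active):
            return
        # Round complete in this bin: apply the assumed elimination rule.
        self.cursor = 0

        def mean(k):
            return self.sums[k] / self.counts[k]

        def radius(k):
            return math.sqrt(2.0 * math.log(horizon) / self.counts[k])

        best_lower = max(mean(k) - radius(k) for k in self.active)
        self.active = [k for k in self.active
                       if mean(k) + radius(k) >= best_lower]


def bin_index(x, bins_per_axis):
    """Map x in [0, 1]^d to the index tuple of its hypercube."""
    return tuple(min(int(xi * bins_per_axis), bins_per_axis - 1) for xi in x)


def binned_se(reward_fns, horizon, dim, bins_per_axis, seed=0):
    """Run one elimination procedure per bin; return bin states and pulls."""
    rng = random.Random(seed)
    states = {}
    pulls = []
    for _ in range(horizon):
        x = [rng.random() for _ in range(dim)]
        b = bin_index(x, bins_per_axis)
        if b not in states:
            states[b] = BinState(len(reward_fns))
        arm = states[b].next_arm()
        reward = 1.0 if rng.random() < reward_fns[arm](x) else 0.0
        states[b].update(arm, reward, horizon)
        pulls.append((b, arm))
    return states, pulls
```

The number of bins per axis is left as a free parameter here; choosing it as a function of the horizon and the smoothness is the tuning question the binned approach raises.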
Exponent $\cdots/(2\beta+d)$, with the choice $\cdots/(2\beta+d)$
[Equations on a bin $B$ with terms $n_B$, $|B|^{\beta}$, $|B|^{2\beta}$, $T|B|^d$ and $\Delta_B$; exponent $\cdots/(2\beta+d)$]
Conclusion and Remark