Nonparametric Bandits with Covariates
Philippe Rigollet
Princeton University
with A. Zeevi (Columbia University)
Support from NSF (DMS-0906424)
Example: Real time web page optimization
The policy $\pi^\star = (\pi^\star_1, \pi^\star_2, \ldots)$ pulls at each time $t$ the ...
$\pi^\star_t = \operatorname{argmax}_{i=1,2} \ldots$
$\mathbb{E}[Y^{(i)}_t \mid X_t] = f^{(i)}(X_t), \quad i = 1, 2$
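A minimal sketch of the reward model above, under assumptions that are not in the slides: the covariate lives in [0, 1], rewards are Bernoulli, and the two mean-reward functions `f1` and `f2` are hypothetical choices made only for illustration. It draws rewards for one arm at a fixed covariate value and compares their average with $f^{(i)}(x)$.

```python
import random

# Hypothetical mean-reward functions f^(1), f^(2) on [0, 1]; not from the slides.
def f1(x):
    return 0.5 + 0.3 * (x - 0.5)

def f2(x):
    return 0.5 - 0.3 * (x - 0.5)

MEAN_REWARD = {1: f1, 2: f2}

def draw_reward(arm, x, rng):
    """Draw Y^(arm) given X = x as a Bernoulli variable with mean f^(arm)(x)."""
    p = MEAN_REWARD[arm](x)
    return 1 if rng.random() < p else 0

def empirical_mean(arm, x, n_draws, rng):
    """Average n_draws rewards of one arm at a fixed covariate value x."""
    return sum(draw_reward(arm, x, rng) for _ in range(n_draws)) / n_draws

rng = random.Random(0)  # fixed seed so the run is reproducible
x = 0.9
est = empirical_mean(1, x, 20000, rng)
print(est, f1(x))  # the two values should be close
```

Bernoulli rewards are used here only because they keep the sketch short; any reward distribution with conditional mean $f^{(i)}(x)$ would fit the same equation.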
$\mathbb{E}[Y^{(i)}_t \mid Z_t = j] = \bar f^{(i)}_j$
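The binned quantity above can be illustrated with a short sketch. The slides as extracted do not say how the bins are formed, so this assumes $M$ regular intervals of width $1/M$ on [0, 1] and a uniformly distributed covariate; `bin_index` and `bin_average` are hypothetical helper names.

```python
def bin_index(x, M):
    """Return j in {1, ..., M} such that x lies in the j-th interval of width 1/M."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    return min(int(x * M) + 1, M)  # x = 1.0 is assigned to the last bin

def bin_average(f, j, M, grid=1000):
    """Approximate the average of f over bin j with a midpoint rule.

    Under the uniform-covariate assumption this equals E[f(X) | Z = j].
    """
    left = (j - 1) / M
    width = 1.0 / M
    return sum(f(left + (k + 0.5) * width / grid) for k in range(grid)) / grid

print(bin_index(0.3, 4))                 # bin containing x = 0.3 when M = 4
print(bin_average(lambda x: x, 2, 4))    # average of f(x) = x over [0.25, 0.5)
```

With this discretization each bin can be treated as its own two-armed problem with mean rewards $\bar f^{(1)}_j$ and $\bar f^{(2)}_j$.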
$\operatorname{argmax}_{i=1,2} \left\{ \ldots^{(i)}_j(t) + B_t\big(N^{(i)}_j(t)\big) \right\}$
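The index rule above can be sketched as follows. Several pieces are assumptions rather than content recovered from the slides: the first term is taken to be the running average of rewards observed for arm $i$ in bin $j$, $N^{(i)}_j(t)$ is taken to be the number of times that arm was pulled in that bin, and the bonus is given the common form $B_t(N) = \sqrt{2 \log t / N}$. The class name `BinnedUCB` is hypothetical.

```python
import math

def bonus(t, n_pulls):
    """Assumed exploration bonus B_t(N) = sqrt(2 log t / N); infinite if N = 0."""
    if n_pulls == 0:
        return math.inf
    return math.sqrt(2.0 * math.log(t) / n_pulls)

class BinnedUCB:
    """Keeps, for every bin j and arm i, a pull count and an average reward."""

    def __init__(self, n_bins, arms=(1, 2)):
        self.arms = arms
        self.count = {(j, i): 0 for j in range(1, n_bins + 1) for i in arms}
        self.mean = {(j, i): 0.0 for j in range(1, n_bins + 1) for i in arms}

    def choose(self, j, t):
        """Pick the arm maximizing average + bonus in bin j at time t."""
        return max(self.arms,
                   key=lambda i: self.mean[(j, i)] + bonus(t, self.count[(j, i)]))

    def update(self, j, arm, reward):
        """Fold one observed reward into the running average of (bin j, arm)."""
        self.count[(j, arm)] += 1
        n = self.count[(j, arm)]
        self.mean[(j, arm)] += (reward - self.mean[(j, arm)]) / n

ucb = BinnedUCB(n_bins=2)
arm = ucb.choose(j=1, t=1)      # no data yet, so ties go to the first arm
ucb.update(j=1, arm=arm, reward=1.0)
```

The infinite bonus for an unpulled arm forces every arm to be tried once in each bin before the averages start to matter; statistics in different bins never interact.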
Exponents $\frac{1}{2\beta+1}$ and $\frac{2\beta}{2\beta+1}$, with $\log n$ terms.
Exponent $\frac{\beta(1+\alpha)}{\beta(1+\alpha)+1}$.
$f^{(1)}, f^{(2)} \in \Sigma(\beta, L)$
$2\beta + d$
$i \neq i^\star(X)$: $\; |f^{(i)}(X) - f^{(i^\star(X))}(X)| \le \delta$
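The event in the line above can be checked pointwise with a few lines of code. This is only a sketch: it assumes $i^\star(x)$ denotes the arm with the largest mean reward at $x$, and it reads the condition as holding when at least one competing arm is within $\delta$ of the best one. The function names and the example reward functions are hypothetical, and arms are indexed from 0.

```python
def best_arm(fs, x):
    """Index i*(x) of the arm with the largest mean reward at x."""
    return max(range(len(fs)), key=lambda i: fs[i](x))

def within_margin(fs, x, delta):
    """True if some arm i != i*(x) satisfies |f_i(x) - f_{i*(x)}(x)| <= delta."""
    star = best_arm(fs, x)
    return any(abs(fs[i](x) - fs[star](x)) <= delta
               for i in range(len(fs)) if i != star)

# Hypothetical mean-reward functions that cross at x = 0.5.
fs = [lambda x: x, lambda x: 1.0 - x]
print(within_margin(fs, 0.9, 0.5))   # gap at x = 0.9 is 0.8
print(within_margin(fs, 0.55, 0.2))  # gap at x = 0.55 is about 0.1
```

Covariate values where the check succeeds are those at which the best arm is hard to distinguish from a competitor.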