Bandits in Auctions (& more)
Vianney Perchet joint work with P. Rigollet (MIT) and J. Weed (MIT)
CEMRACS 2017 July 20 2017 CMLA, ENS Paris-Saclay & Criteo Research
Motivations & Objectives
Classical Examples of Bandit Problems
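Rewards $X^i_t \in \mathbb{R}$, (sub-)Gaussian, for each arm $i \in \{1, \dots, K\}$:
$X^i_1, X^i_2, \dots \sim \mathcal{N}$, with mean $\mu^i$.
The decision $\pi_t$ depends on the past observations $X^{\pi_1}_1, X^{\pi_2}_2, \dots, X^{\pi_{t-1}}_{t-1}$.
Objective: maximize $\sum_{t=1}^{T} \mathbb{E} X^{\pi_t}_t = \sum_{t=1}^{T} \mu^{\pi_t}$.
Regret: $R_T = \max_{i \in \{1,\dots,K\}} \sum_{t=1}^{T} \mu^{i} - \sum_{t=1}^{T} \mu^{\pi_t} = \sum_{t=1}^{T} \Delta_{\pi_t}$, where $\Delta_i = \mu^\star - \mu^i$ and $\mu^\star = \max_i \mu^i$.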
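UCB: $\pi_{t+1} = \arg\max_i \Big\{ \overline{X}^i_t + \sqrt{\tfrac{2\log(t)}{T^i_t}} \Big\}$,
where $T^i_t = \sum_{s=1}^{t} \mathbb{1}\{\pi_s = i\}$ and $\overline{X}^i_t = \frac{1}{T^i_t} \sum_{s \,:\, \pi_s = i} X^i_s$.
Regret: $\mathbb{E} R_T \lesssim \sum_k \frac{\log(T)}{\Delta_k}$.
Worst case over $\Delta$: $\mathbb{E} R_T \lesssim \sqrt{K T \log(T)}$.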
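A minimal sketch of a UCB-style index policy may help make the rule concrete. It assumes unit-variance Gaussian rewards and an exploration bonus of $\sqrt{2\log(t)/T^i_t}$; the function name `ucb`, the initial round-robin over the arms, and the seed are illustrative choices, not taken from the slides.

```python
import math
import random

def ucb(means, horizon, seed=0):
    """Simulate a UCB index policy on Gaussian arms (unit variance is an assumption)."""
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k    # T^i_t: number of times arm i has been pulled
    sums = [0.0] * k   # running sum of the rewards collected on arm i
    best = max(means)
    regret = 0.0       # pseudo-regret: sum of the gaps of the pulled arms
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once so that all indices are defined
        else:
            arm = max(
                range(k),
                key=lambda i: sums[i] / pulls[i]
                + math.sqrt(2 * math.log(t) / pulls[i]),
            )
        reward = rng.gauss(means[arm], 1.0)
        pulls[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return pulls, regret
```

Calling `ucb([0.0, 1.0], 2000)` returns the pull counts and the pseudo-regret, which here equals the number of pulls of the suboptimal arm times its gap.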
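Idea of the proof: a suboptimal arm $i$ is pulled at stage $t+1$ only if $\overline{X}^\star_t + \sqrt{\tfrac{2\log(t)}{T^\star_t}} \le \overline{X}^i_t + \sqrt{\tfrac{2\log(t)}{T^i_t}}$.
This happens at most about $\frac{\log(T)}{\Delta_i^2}$ times for arm $i$; each mistake costs $\Delta_i$.
Regret: $\sum_i \frac{\log(T)}{\Delta_i^2} \times \Delta_i = \sum_i \frac{\log(T)}{\Delta_i}$.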
$\sum_k \frac{\log(T\Delta_k)}{\Delta_k}$
$\Delta_{\min}$
$T\varepsilon \le T^{2/3}$
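Rewards $X^i_t \in \mathbb{R}$, bounded in $[0, 1]$: $X^i_1, X^i_2, \dots$
The decision $\pi_t$ depends on the past observations $X^{\pi_1}_1, X^{\pi_2}_2, \dots, X^{\pi_{t-1}}_{t-1}$.
Regret: $R_T = \max_{i \in [K]} \sum_{t=1}^{T} X^i_t - \sum_{t=1}^{T} X^{\pi_t}_t$.
$\ldots \sum_{t=1} X^i_t$, from $\Delta([K])$ to $[0, 1]$.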
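$p^i_t = \dfrac{e^{\eta \sum_{s=1}^{t-1} X^i_s}}{\sum_{j \in [K]} e^{\eta \sum_{s=1}^{t-1} X^j_s}}$,
Only $X^{\pi_t}_t$ is observed, not $X_t$. Estimate $X_t$ by
$\widehat{X}^i_t = 1 - \dfrac{1 - X^i_t}{p^i_t}\, \mathbb{1}\{\pi_t = i\}$.
$\mathbb{E}\widehat{X}^i_t = 1 - \Big[ (1 - p^i_t) \cdot 0 + p^i_t\, \dfrac{1 - X^i_t}{p^i_t} \Big] = X^i_t$, unbiased estimator.
$\sum_{i \in [K]} p^i_t\, \mathbb{E}\big(\widehat{X}^i_t\big)^2 \le 1 + \sum_{i \in [K]} p^i_t\, \dfrac{1 - X^i_t}{p^i_t} \le K + 1$, bounded variance.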
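The unbiasedness of the importance-weighted estimate can be checked numerically. The sketch below assumes the estimator $1 - (1 - X^i_t)\mathbb{1}\{\pi_t = i\}/p^i_t$ and exponential weights on cumulative estimated rewards; the helper names `exp_weights`, `estimate`, and `expected_estimate` are hypothetical, introduced only for this illustration.

```python
import math

def exp_weights(cum_est, eta):
    """Probabilities proportional to exp(eta * cumulative estimated reward)."""
    m = max(cum_est)  # subtracting the max leaves the ratio unchanged, avoids overflow
    w = [math.exp(eta * (c - m)) for c in cum_est]
    total = sum(w)
    return [x / total for x in w]

def estimate(i, played, reward, p):
    """Estimate of X^i: equals 1 for every arm that was not played."""
    if i != played:
        return 1.0
    return 1.0 - (1.0 - reward) / p[i]

def expected_estimate(i, x, p):
    """Exact expectation of the estimate of arm i when arm j is played w.p. p[j]."""
    return sum(p[j] * estimate(i, j, x[j], p) for j in range(len(p)))
```

For any reward vector `x` in $[0,1]^K$ and any probability vector `p` with positive entries, `expected_estimate(i, x, p)` returns `x[i]` up to floating-point error, which is the unbiasedness property.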
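Cumulative utility: $\sum_{t=1}^{T} (v_t - m_t)\, \mathbb{1}\{b_t > m_t\}$.
Regret: $\max_{b \in [0,1]} \sum_{t=1}^{T} (v_t - m_t)\, \mathbb{1}\{b > m_t\} - \sum_{t=1}^{T} (v_t - m_t)\, \mathbb{1}\{b_t > m_t\}$.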
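As an illustration of the regret against the best fixed bid, here is a small sketch. It reads $v_t$ as the bidder's value, $m_t$ as the highest competing bid (paid on a win), and $b_t$ as the bid, all assumed to lie in $[0, 1]$; the function names are hypothetical. Since the payoff of a fixed bid only changes when it crosses some $m_t$, it is enough to try $0$, $1$, and the observed $m_t$ themselves.

```python
def utility(value, max_bid, bid):
    """Payoff of one round: the bidder wins (and pays max_bid) only if bid > max_bid."""
    return value - max_bid if bid > max_bid else 0.0

def auction_regret(values, max_bids, bids):
    """Regret of the bid sequence against the best fixed bid in [0, 1]."""
    earned = sum(utility(v, m, b) for v, m, b in zip(values, max_bids, bids))
    # A fixed bid b wins exactly the rounds with m_t < b, so these candidates suffice.
    candidates = {0.0, 1.0} | set(max_bids)
    best_fixed = max(
        sum(utility(v, m, b) for v, m in zip(values, max_bids))
        for b in candidates
    )
    return best_fixed - earned
```

With values `[0.8, 0.8, 0.8]`, competing bids `[0.5, 0.9, 0.3]`, and bids `[0.4, 1.0, 0.6]`, the bidder earns about 0.4 while the fixed bid 0.9 earns about 0.8, so the regret is about 0.4.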
$\sum_{t=1}^{T} \ldots \quad \sum_{t=1}^{T} \ldots$
$b_t = \min\Big\{ \overline{v}_t + \sqrt{\tfrac{3\log(t)}{2\omega_t}},\ 1 \Big\}$
$\Delta$
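A hedged sketch of an optimistic bidding rule of this form: bid the empirical mean of the values seen so far plus a bonus $\sqrt{3\log(t)/(2\omega_t)}$, capped at 1. It assumes that $\omega_t$ counts the auctions won so far and that a value is only observed after a win; the function name `ucb_bid` and the choice to bid 1 before the first win are assumptions made for this illustration.

```python
import math

def ucb_bid(t, wins, value_sum):
    """Optimistic bid at round t (t >= 1).

    wins      -- number of auctions won so far (omega_t, assumed meaning)
    value_sum -- sum of the values observed on those wins
    """
    if wins == 0:
        return 1.0  # nothing observed yet: bid the maximum to gather a first sample
    bonus = math.sqrt(3 * math.log(t) / (2 * wins))
    return min(value_sum / wins + bonus, 1.0)
```

With few wins the bonus dominates and the bid is capped at 1; after many wins the bid settles slightly above the empirical mean value.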
$T^{\frac{1-\alpha}{2}} \log^{\frac{1+\alpha}{2}}(T)$
$T^{\frac{1-\alpha}{2}}$
$\max_{b \in [0,1]} \sum_{t=1}^{T} \ldots \quad \sum_{t=1}^{T} \ldots$
$\ldots X_s \ldots$, where the term indexed by $t+1$ is an unbiased estimator.
$\frac{1}{32}$
$\sum_{t=1}^{T} (v_t - m_t)\, \mathbb{1}\{b > m_t\} - \sum_{t=1}^{T} (v_t - m_t)\, \mathbb{1}\{b_t > m_t\}$
$\frac{1-\alpha}{2}$, $\frac{1+\alpha}{2}$
$\Delta$
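$X^k_t \in [0, 1] \sim \nu_k(\omega_t)$, $\mathbb{E}[X^k \mid \omega] = \mu_k(\omega)$
Regret: $\sum_{t=1}^{T} \big( \mu_{\pi^\star(\omega_t)}(\omega_t) - \mu_{\pi_t}(\omega_t) \big)$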
$\left( \frac{K \log(K)}{T} \right)^{\frac{\,\cdot\,}{2\beta+d}}$, bin side $\left( \frac{K \log(K)}{T} \right)^{\frac{1}{2\beta+d}}$.
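To see what a bin side of order $(K\log(K)/T)^{1/(2\beta+d)}$ means in practice, the sketch below computes it and maps a covariate to its bin. It assumes covariates $\omega \in [0,1]^d$ and a regular grid of hypercubes; the function names `bin_side` and `bin_index` are hypothetical, and constants in front of the rate are ignored.

```python
import math

def bin_side(K, T, beta, d):
    """Bin side (K log K / T)^(1 / (2 beta + d)), with constants ignored."""
    return (K * math.log(K) / T) ** (1.0 / (2 * beta + d))

def bin_index(omega, side):
    """Index of the grid cell containing omega, assumed to lie in [0, 1]^d."""
    n = max(1, math.ceil(1.0 / side))  # number of bins per coordinate
    # min(...) keeps the boundary point 1.0 inside the last bin
    return tuple(min(int(x * n), n - 1) for x in omega)
```

A separate bandit instance can then be run inside each bin, treating the covariates that fall in the same cell as interchangeable.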
$\left( \frac{K \log(K)}{T} \right)^{\frac{\,\cdot\,}{2\beta+d}}$.
$\frac{\beta}{2\beta+d}$
5% T ??