1/7
Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs
Han Shao∗, Xiaotian Yu∗, Irwin King and Michael R. Lyu
Department of Computer Science and Engineering, The Chinese University of Hong Kong
NeurIPS, Dec. 2018
2/7
[Figure: arms such as $x_{1,t} \in \mathbb{R}^d$ and $x_{4,t}$, with the true optimal arm and the empirically optimal arm at time $t$ marked]
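▶ 1. Given a set of arms represented by $D \subseteq \mathbb{R}^d$
▶ 2. At time $t$, select an arm $x_t \in D$ and observe the payoff $y_t(x_t)$
▶ 3. The goal is to maximize $\sum_{t=1}^{T} \mathbb{E}[y_t(x_t)]$
▶ 4. The noise $\eta_t$ in the payoff follows a sub-Gaussian distribution ($\mathbb{E}[\eta_t^2] < \infty$)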
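The four steps above can be sketched as a small simulation loop. This is only an illustrative sketch, not the authors' code: the parameter vector `theta`, the arm set, the uniform-random policy `uniform_choice`, and the function name `play_linear_bandit` are all hypothetical, and the payoff is assumed to be the inner product of the arm with `theta` plus noise.

```python
import random


def play_linear_bandit(arms, theta, horizon, noise, choose, seed=0):
    """Run the select-then-observe loop; return the sum of expected payoffs."""
    rng = random.Random(seed)
    history = []            # (arm index, observed payoff) pairs seen so far
    expected_total = 0.0
    for _ in range(horizon):
        i = choose(history, len(arms), rng)            # step 2: select an arm
        mean = sum(a * b for a, b in zip(arms[i], theta))
        y = mean + noise(rng)                          # assumed linear payoff plus noise
        history.append((i, y))
        expected_total += mean                         # step 3: quantity to maximize
    return expected_total


def uniform_choice(history, n_arms, rng):
    """Hypothetical baseline policy: ignore the history, pick an arm at random."""
    return rng.randrange(n_arms)


# Hypothetical instance: four arms in R^2 and an unknown parameter theta.
arms = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (-1.0, 0.0)]
theta = (0.8, 0.3)
gaussian = lambda rng: rng.gauss(0.0, 1.0)             # step 4: sub-Gaussian noise

total = play_linear_bandit(arms, theta, 1000, gaussian, uniform_choice)
best = 1000 * max(sum(a * b for a, b in zip(x, theta)) for x in arms)
regret = best - total                                  # gap to always playing the best arm
print(f"expected payoff {total:.1f}, regret {regret:.1f}")
```

Swapping `gaussian` for a heavy-tailed sampler leaves the loop unchanged; only the noise assumption in step 4 differs.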
3/7
▶ High-probability extreme returns in financial markets
[Figure: Gaussian vs. NASDAQ returns]
▶ Many other real cases
4/7
▶ Multi-armed bandits (MAB) with heavy-tailed payoffs
    $\mathbb{E}\big[|\eta_t|^{1+\epsilon}\big] < +\infty$ for some $\epsilon \in (0, 1]$    (1)
▶ Our setting: LSB with $\eta_t$ satisfying Eq. (1)
▶ Weaker assumption than sub-Gaussian
▶ Medina and Yang (2016) studied LSB with heavy-tailed payoffs
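▶ Known regret bounds
    MAB, sub-Gaussian payoffs: $\widetilde{O}(T^{\frac{1}{2}})$
    MAB, heavy-tailed payoffs ($\epsilon = 1$): $\widetilde{O}(T^{\frac{1}{2}})$ by Bubeck et al. (2013)
    LSB, sub-Gaussian payoffs: $\widetilde{O}(T^{\frac{1}{2}})$
    LSB, heavy-tailed payoffs ($\epsilon = 1$): $\widetilde{O}(T^{\frac{3}{4}})$ by Medina and Yang (2016)
▶ Can we achieve $\widetilde{O}(T^{\frac{1}{2}})$?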
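To make "weaker than sub-Gaussian" concrete, here is a minimal sketch that assumes the heavy-tailed condition means a finite moment of order $1+\epsilon$. It uses a Pareto distribution with scale 1 purely as an illustration (the slides do not name a distribution), for which $\mathbb{E}[X^p] = \alpha / (\alpha - p)$ when $p < \alpha$ and is infinite otherwise. Centering the variable does not change which moments are finite. The function name `pareto_moment` is hypothetical.

```python
import math


def pareto_moment(alpha, p):
    """E[X^p] for a Pareto variable with scale 1 and shape alpha (so X >= 1)."""
    if alpha <= 0 or p < 0:
        raise ValueError("alpha must be positive and p non-negative")
    # The p-th moment is finite only when p is strictly below the shape alpha.
    return alpha / (alpha - p) if p < alpha else math.inf


alpha = 1.5                                   # illustrative heavy-tailed shape
moment_125 = pareto_moment(alpha, 1.25)       # order 1 + eps with eps = 0.25
variance_like = pareto_moment(alpha, 2.0)     # second moment

print(moment_125)      # finite: the (1 + eps)-moment condition holds for eps = 0.25
print(variance_like)   # inf: no finite variance, so the noise cannot be sub-Gaussian
```

With shape 1.5 the moment of order 1.25 is finite while the second moment is not, so such noise would satisfy the heavy-tailed condition for small $\epsilon$ without being sub-Gaussian.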
5/7
6/7
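▶ Upper bounds
    $\widetilde{O}(T^{\frac{1+2\epsilon}{1+3\epsilon}})$
    $\widetilde{O}(T^{\frac{1}{1+\epsilon}})$
    $\widetilde{O}(T^{\frac{1}{2} + \frac{1}{2(1+\epsilon)}})$
    $\widetilde{O}(T^{\frac{1}{1+\epsilon}})$
▶ Lower bound: $\Omega(T^{\frac{1}{1+\epsilon}})$, which is $\Omega(T^{\frac{1}{2}})$ when $\epsilon = 1$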
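The exponents on this slide can be compared with exact arithmetic. The sketch below reads the four upper-bound exponents as $\frac{1+2\epsilon}{1+3\epsilon}$, $\frac{1}{1+\epsilon}$, $\frac{1}{2} + \frac{1}{2(1+\epsilon)}$ and $\frac{1}{1+\epsilon}$, in slide order, and the lower-bound exponent as $\frac{1}{1+\epsilon}$. The function names are hypothetical and the code only evaluates the formulas; it says nothing about constants or logarithmic factors.

```python
from fractions import Fraction


def upper_bound_exponents(eps):
    """Exponents of T in the four upper bounds, in the order listed on the slide."""
    eps = Fraction(eps)
    return [
        (1 + 2 * eps) / (1 + 3 * eps),
        1 / (1 + eps),
        Fraction(1, 2) + 1 / (2 * (1 + eps)),
        1 / (1 + eps),
    ]


def lower_bound_exponent(eps):
    """Exponent of T in the lower bound."""
    return 1 / (1 + Fraction(eps))


for eps in (Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    uppers = ", ".join(str(e) for e in upper_bound_exponents(eps))
    print(f"eps={eps}: upper [{uppers}], lower {lower_bound_exponent(eps)}")
```

For every $\epsilon \in (0, 1]$ the second and fourth exponents coincide with the lower-bound exponent, while the first and third are strictly larger; at $\epsilon = 1$ both of those evaluate to $\frac{3}{4}$.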
7/7