Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs - PowerPoint PPT Presentation

  1. 1/7 Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs. Han Shao∗, Xiaotian Yu∗, Irwin King and Michael R. Lyu. Department of Computer Science and Engineering, The Chinese University of Hong Kong. NeurIPS, Dec. 2018.

  2. 2/7 Linear Stochastic Bandits (LSB). [Figure: learning setting with arms x_{1,t}, ..., x_{4,t} ∈ R^d; exploration vs. exploitation; empirically optimal vs. true optimal arm at time t.]
  ▶ 1. Given a set of arms represented by D ⊆ R^d
  ▶ 2. At time t, select an arm x_t ∈ D, and observe y_t(x_t) = ⟨x_t, θ∗⟩ + η_t
  ▶ 3. The goal is to maximize ∑_{t=1}^{T} E[y_t(x_t)]
  ▶ 4. η_t follows a sub-Gaussian distribution (E[η_t^2] < ∞)
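Read as a protocol, the slide describes a repeated interaction between learner and environment. The snippet below is a minimal simulation sketch of that loop, assuming a randomly generated arm set, a unit-norm θ∗, and a placeholder uniform-exploration policy; none of these choices come from the paper.

```python
import numpy as np

# Minimal sketch of the LSB protocol above. The arm set, theta_star, and the
# uniform-exploration policy are illustrative assumptions, not the paper's method.
rng = np.random.default_rng(0)
d, K, T = 5, 20, 1000
D = rng.normal(size=(K, d))                     # arm set D ⊂ R^d (hypothetical)
D /= np.linalg.norm(D, axis=1, keepdims=True)   # normalize arms
theta_star = rng.normal(size=d)                 # unknown parameter θ*
theta_star /= np.linalg.norm(theta_star)

cumulative_payoff = 0.0
for t in range(T):
    x_t = D[rng.integers(K)]            # step 2: select an arm (here: uniformly at random)
    eta_t = rng.normal(scale=0.1)       # step 4: sub-Gaussian noise
    y_t = x_t @ theta_star + eta_t      # observed payoff ⟨x_t, θ*⟩ + η_t
    cumulative_payoff += y_t            # step 3: the learner tries to make this sum large
```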

  3. 3/7 What Is A Heavy-Tailed Distribution? [Figure: Gaussian density vs. NASDAQ returns.]
  ▶ High-probability extreme returns in financial markets
  ▶ Many other real cases
  Practical scenarios:
  1. Delays in communication networks (Liebeherr et al., 2012)
  2. Analysis of biological data (Burnecki et al., 2015)
  3. ...

  4. 4/7 LSB with Heavy-Tailed Payoffs. Problem definition:
  ▶ Multi-armed bandits (MAB) with heavy-tailed payoffs (Bubeck et al., 2013): E[|η|^{1+ϵ}] < +∞, where ϵ ∈ (0, 1]   (1)
  ▶ Our setting: LSB with η_t satisfying Eq. (1)
  ▶ Weaker assumption than sub-Gaussian
  ▶ Medina and Yang (2016) studied LSB with heavy-tailed payoffs
  Known regret bounds after T rounds:
  ▶ MAB, sub-Gaussian: O(T^{1/2})
  ▶ MAB, heavy-tailed (ϵ = 1): O(T^{1/2}) by Bubeck et al. (2013)
  ▶ LSB, sub-Gaussian: Õ(T^{1/2})
  ▶ LSB, heavy-tailed (ϵ = 1): Õ(T^{3/4}) by Medina and Yang (2016)
  ▶ Can we achieve Õ(T^{1/2})?
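To see what assumption (1) allows beyond the sub-Gaussian case, the sketch below (an illustration, not from the slides) samples noise from a Student-t distribution with 3 degrees of freedom: its variance is finite, so Eq. (1) holds with ϵ = 1, yet its tails are polynomial rather than sub-Gaussian.

```python
import numpy as np

# Illustration of assumption (1), not from the slides: Student-t noise with 3 degrees
# of freedom has E[|η|^{1+ε}] < ∞ for ε = 1 (finite variance), but it is not sub-Gaussian,
# so extreme payoffs appear with non-negligible probability.
rng = np.random.default_rng(0)
eta = rng.standard_t(df=3, size=1_000_000)

print("empirical E[|η|^2]:", np.mean(np.abs(eta) ** 2))   # stabilizes near 3 (the true variance)
print("empirical E[|η|^4]:", np.mean(np.abs(eta) ** 4))   # does not stabilize: the 4th moment is infinite
print("P(|η| > 5)        :", np.mean(np.abs(eta) > 5))    # much larger than for a standard Gaussian
```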

  5. 5/7 Algorithm: Median of Means Under OFU (MENU). [Figure: framework comparison with MoM by Medina and Yang (2016).]
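MENU's robustness comes from the median-of-means idea: replace a single noisy average with the median of several group averages. The sketch below shows that generic estimator on a scalar sample; it is only the building block, not the paper's MENU algorithm, which applies the idea within the optimism-in-the-face-of-uncertainty (OFU) framework for linear bandits.

```python
import numpy as np

def median_of_means(samples, k=9):
    """Split the samples into k groups, average each group, and return the
    median of the group means. This is only the robust-estimation building
    block behind MENU, not the paper's full algorithm."""
    groups = np.array_split(np.asarray(samples, dtype=float), k)
    return float(np.median([g.mean() for g in groups]))

rng = np.random.default_rng(0)
payoffs = rng.standard_t(df=3, size=10_000) + 1.0    # heavy-tailed payoffs with true mean 1.0
print("empirical mean :", payoffs.mean())            # sensitive to extreme observations
print("median of means:", median_of_means(payoffs))  # robust estimate close to 1.0
```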

  6. 6/7 Regret Bounds
  ▶ Upper bounds:
  algorithm   MoM (Medina and Yang, 2016)   MENU (ours)      CRT (Medina and Yang, 2016)   TOFU (ours)
  regret      Õ(T^{(1+2ϵ)/(1+3ϵ)})          Õ(T^{1/(1+ϵ)})   Õ(T^{1/2 + 1/(2(1+ϵ))})       Õ(T^{1/(1+ϵ)})
  ▶ Lower bound: Ω(T^{1/(1+ϵ)})
  ▶ When ϵ = 1, our algorithms achieve Õ(T^{1/2})
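To make the comparison concrete, the snippet below just evaluates the exponents from the table at ϵ = 1, the finite-variance case highlighted on slide 4.

```python
# Regret exponents from the table above, evaluated at ε = 1 (finite variance):
eps = 1.0
print("MoM :", (1 + 2 * eps) / (1 + 3 * eps))   # 0.75 -> Õ(T^{3/4})
print("CRT :", 0.5 + 1 / (2 * (1 + eps)))       # 0.75 -> Õ(T^{3/4})
print("MENU:", 1 / (1 + eps))                   # 0.50 -> Õ(T^{1/2})
print("TOFU:", 1 / (1 + eps))                   # 0.50 -> Õ(T^{1/2})
print("lower bound:", 1 / (1 + eps))            # 0.50, matched by MENU and TOFU in T
```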

  7. 7/7 See You at the Poster Session. Time: Dec. 5th, 10:45 AM – 12:45 PM. Location: Room 210 & 230 AB, #158.
