
Discounted UCB, Levente Kocsis and Csaba Szepesvári, MTA SZTAKI



  1. Discounted UCB. Levente Kocsis and Csaba Szepesvári, MTA SZTAKI, Hungary

  2. Contents
     - UCB1-tuned
     - Discounted UCB1-tuned
     - Experiments
     - Other algorithms
     - Conclusions

  3. UCB1-tuned+

     $$s_{it} = \sum_{\tau=0}^{t} \mathbb{I}(I_\tau = i)\, x_\tau$$

     $$n_{it} = \sum_{\tau=0}^{t} \mathbb{I}(I_\tau = i)$$

     $$\mu_{it} = s_{it} / n_{it}, \qquad n_t = \sum_i n_{it}$$

     $$I_{t+1} = \operatorname*{argmax}_i \left( \mu_{it} + \sqrt{\frac{\max(\mu_{it}(1 - \mu_{it}),\, 0.002)\, \ln n_t}{n_{it}}} \right)$$
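
The UCB1-tuned+ selection rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `ucb1_tuned_plus_choice` is hypothetical, rewards are assumed to lie in [0, 1], and pulling every unplayed arm first is an assumption the slide does not state.

```python
import math

def ucb1_tuned_plus_choice(s, n):
    """Return the index of the next arm to play.

    s[i] is the sum of rewards received from arm i (s_it on the slide),
    n[i] is the number of times arm i has been played (n_it).
    """
    # Assumption (not on the slide): play each unplayed arm once first,
    # so that mu = s / n is always defined below.
    for i, n_i in enumerate(n):
        if n_i == 0:
            return i

    n_total = sum(n)  # n_t: total number of plays so far
    best_arm, best_index = 0, -math.inf
    for i, (s_i, n_i) in enumerate(zip(s, n)):
        mu = s_i / n_i  # empirical mean reward of arm i
        # Variance proxy mu * (1 - mu), floored at 0.002 as on the slide.
        variance_term = max(mu * (1.0 - mu), 0.002)
        index = mu + math.sqrt(variance_term * math.log(n_total) / n_i)
        if index > best_index:
            best_arm, best_index = i, index
    return best_arm
```

With equal empirical means, the rule prefers the arm that has been played less often, because its exploration bonus is larger.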

  4. Discounted UCB1-tuned+

     $$s_{it} = \sum_{\tau=0}^{t} \mathbb{I}(I_\tau = i)\, \gamma^{t-\tau} x_\tau$$

     $$n_{it} = \sum_{\tau=0}^{t} \mathbb{I}(I_\tau = i)\, \gamma^{t-\tau}$$

     $$\mu_{it} = s_{it} / n_{it}, \qquad n_t = \sum_i n_{it}$$

     $$I_{t+1} = \operatorname*{argmax}_i \left( \mu_{it} + \sqrt{\frac{\max(\mu_{it}(1 - \mu_{it}),\, 0.002)\, \ln n_t}{n_{it}}} \right)$$
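
The discounted sums on this slide do not need the full reward history: multiplying every arm's statistics by gamma at each step and then adding the new observation reproduces the gamma^(t - tau) weights. The sketch below illustrates this under stated assumptions. The class name `DiscountedUCB1Tuned` and its methods are hypothetical, rewards are assumed to lie in [0, 1], and the rule of playing arms with zero discounted count first is an assumption rather than something the slides specify.

```python
import math

class DiscountedUCB1Tuned:
    """Illustrative discounted UCB1-tuned+ bandit (hypothetical API)."""

    def __init__(self, n_arms, gamma):
        self.gamma = gamma          # discount factor; 1.0 means no discounting
        self.s = [0.0] * n_arms     # discounted reward sums s_it
        self.n = [0.0] * n_arms     # discounted play counts n_it

    def update(self, arm, reward):
        # Decay all statistics by gamma, then add the new observation
        # with weight 1. This equals the sums with gamma^(t - tau) weights.
        for i in range(len(self.s)):
            self.s[i] *= self.gamma
            self.n[i] *= self.gamma
        self.s[arm] += reward
        self.n[arm] += 1.0

    def select(self):
        # Assumption (not on the slides): arms with zero discounted count
        # are played first so that s / n is defined.
        for i, n_i in enumerate(self.n):
            if n_i == 0.0:
                return i
        # The most recent play has weight 1, so n_total >= 1 and the log is >= 0.
        n_total = sum(self.n)
        best_arm, best_index = 0, -math.inf
        for i in range(len(self.s)):
            mu = self.s[i] / self.n[i]
            variance_term = max(mu * (1.0 - mu), 0.002)
            index = mu + math.sqrt(variance_term * math.log(n_total) / self.n[i])
            if index > best_index:
                best_arm, best_index = i, index
        return best_arm
```

A typical loop would call `arm = bandit.select()`, observe a reward, and then call `bandit.update(arm, reward)`. Setting `gamma=1.0` leaves the counts undiscounted, which matches the gamma=1.0 curves in the experiment plots.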

  5. Experiments: Task 1 (averaged over 1000 seeds). [Plot: regret (0.01 to 1000) versus iteration (10 to 100000), both axes logarithmic. Curves: UCB1-tuned, Exp3, gamma=1.0, gamma=0.99999, gamma=0.9999.]

  6. Experiments: Task 1 (averaged over test seeds). [Plot: regret (0.1 to 1000) versus iteration (10 to 100000), both axes logarithmic. Curves: UCB1-tuned, Exp3, gamma=1.0, gamma=0.99999, gamma=0.9999.]

  7. Experiments: Task 2 (averaged over 1000 seeds). [Plot: regret (10 to 100000, logarithmic) versus iteration (0 to 100000, linear). Curves: UCB1-tuned, Exp3, gamma=0.999, gamma=0.99, periodic with gamma=0.999.]

  8. Experiments: Task 2 (averaged over test seeds). [Plot: regret (10 to 100000, logarithmic) versus iteration (0 to 100000, linear). Curves: UCB1-tuned, Exp3, gamma=0.999, gamma=0.99, periodic with gamma=0.999.]

  9. Experiments: Task 3 (averaged over 1000 seeds). [Plot: regret (0.1 to 1000) versus iteration (10 to 100000), both axes logarithmic. Curves: UCB1-tuned, Exp3, gamma=1.0, gamma=0.99999, gamma=0.9999.]

  10. Experiments: Task 3 (averaged over test seeds). [Plot: regret (0.1 to 1000) versus iteration (10 to 100000), both axes logarithmic. Curves: UCB1-tuned, Exp3, gamma=1.0, gamma=0.99999, gamma=0.9999.]

  11. Other algorithms
      - line fitting
      - discounted UCB + exploiting periodicity
      - adaptive discounted UCB

  12. Conclusions
      - Challenging challenge
      - Task 4(?): mixing Tasks 1 and 2
      - Regret bounds depending on how fast the response rate varies?
      - Universal algorithms (algorithms adapting to the response rate)
