SLIDE 1

Solving Dynamic Bandit Problems and Decentralized Games using the Kalman Bayesian Learning Automaton

By Stian Berg
Supervisor: Ole-Christoffer Granmo, University of Agder

SLIDE 2

Introduction

  • Thesis topic: Evaluation of a novel approach to dynamic bandit problems
  • Bandit problem example: Link relevance

SLIDE 3

Stationary bandit problem
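As a rough illustration of the stationary case, the sketch below models each option (for example, a link whose relevance is unknown) as an arm with a fixed click probability. The class name `StationaryBernoulliBandit`, the helper `empirical_means`, and all parameter values are hypothetical and not taken from the slides.

```python
import random


class StationaryBernoulliBandit:
    """Hypothetical stationary bandit: each arm pays 1 with a fixed probability."""

    def __init__(self, click_probs, seed=0):
        self.click_probs = list(click_probs)  # never changes over time
        self.rng = random.Random(seed)

    def pull(self, arm):
        # Reward is 1.0 (e.g. the link was clicked) or 0.0 (it was not).
        return 1.0 if self.rng.random() < self.click_probs[arm] else 0.0


def empirical_means(bandit, pulls_per_arm):
    """Estimate each arm's payoff by pulling it a fixed number of times."""
    return [
        sum(bandit.pull(arm) for _ in range(pulls_per_arm)) / pulls_per_arm
        for arm in range(len(bandit.click_probs))
    ]
```

Because the probabilities are fixed, plain averaging converges on the true payoff of each arm, so the only difficulty is how to split pulls between exploring and exploiting.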

SLIDE 4

Dynamic bandit problem
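One common way to model a dynamic bandit, assumed here purely for illustration, is to let every arm's mean reward drift as a Gaussian random walk. The class name `RandomWalkBandit` and the parameters `drift_std` and `noise_std` are hypothetical; the slides do not state which drift model the thesis uses.

```python
import random


class RandomWalkBandit:
    """Hypothetical dynamic bandit: arm means drift as Gaussian random walks."""

    def __init__(self, initial_means, drift_std=0.1, noise_std=1.0, seed=0):
        self.means = list(initial_means)
        self.drift_std = drift_std  # how fast the true means move
        self.noise_std = noise_std  # how noisy each observed reward is
        self.rng = random.Random(seed)

    def pull(self, arm):
        # Observe a noisy reward around the arm's current mean.
        reward = self.rng.gauss(self.means[arm], self.noise_std)
        # After every pull, all means take one random-walk step,
        # so the best arm can change over time.
        self.means = [m + self.rng.gauss(0.0, self.drift_std) for m in self.means]
        return reward
```

Under this model, an estimate built from old observations gradually goes stale, which is why long-run averaging alone is no longer enough.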

SLIDE 5

The Kalman Bayesian Learning Automaton (KBLA)

  • Kalman filtering
      • Position tracking
      • Robot navigation
      • Electronic equipment
      • Stock estimation
      • Forecasting
      • Computer vision
  • KBLA
      • Kalman filtering adapted to work in a bandit setting
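The slides do not give the KBLA update rules, so the following is only a sketch of one plausible reading: a scalar Kalman filter per arm, with the arm to pull chosen by sampling from each arm's Gaussian belief. The class name `KalmanThompsonSampler` and the parameters `obs_var`, `drift_var`, `prior_mean`, and `prior_var` are assumptions made for illustration, not the thesis's notation.

```python
import math
import random


class KalmanThompsonSampler:
    """Illustrative sketch (not the thesis code): one scalar Kalman filter per arm."""

    def __init__(self, n_arms, obs_var=1.0, drift_var=0.01,
                 prior_mean=0.0, prior_var=100.0, seed=0):
        self.mean = [prior_mean] * n_arms  # belief about each arm's current mean
        self.var = [prior_var] * n_arms    # uncertainty of that belief
        self.obs_var = obs_var             # assumed reward-noise variance
        self.drift_var = drift_var         # assumed per-step drift variance
        self.rng = random.Random(seed)

    def select_arm(self):
        # Draw one sample from each arm's Gaussian belief and pick the largest.
        samples = [self.rng.gauss(m, math.sqrt(v))
                   for m, v in zip(self.mean, self.var)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Predict step: every arm may have drifted, so all uncertainties grow.
        self.var = [v + self.drift_var for v in self.var]
        # Correct step: only the pulled arm receives a new observation.
        gain = self.var[arm] / (self.var[arm] + self.obs_var)
        self.mean[arm] += gain * (reward - self.mean[arm])
        self.var[arm] *= 1.0 - gain
```

In use, each round calls `select_arm()`, pulls that arm in the environment, and passes the observed reward to `update()`. Arms that have not been pulled for a while accumulate uncertainty, so they are eventually sampled again. The two variance parameters must be chosen by the user in this sketch.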

SLIDE 6

Summary of results

  • Among the top performers in all experiments
  • Scaled rather well with the number of options
  • Could handle various types of feedback

However...

  • May need significant tuning for good performance

SLIDE 7

Conclusion

  • Empirical evaluation of the KBLA
      • Performance
      • Scalability
      • Robustness
  • Overall, we believe this is a very promising approach
  • Further work
      • Parameter problem
      • Combining ideas from other bandit algorithms
