useR! 2006 Vienna, June 15-17, 2006

Markov Decision Processes, Dynamic Programming, and Reinforcement Learning in R

Jeffrey Todd Lins, Thomas Jakobsen
Saxo Bank A/S
jtl@saxobank.com, tj@saxobank.com

Agent-Environment Interface

Source: Sutton & Barto, 2001

Markov Decision Process

Dynamic Programming

Bellman Equation
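
As a reference point for this slide, the Bellman equation for the state-value function of a fixed policy can be written as below. The notation (policy probabilities, transition probabilities, expected rewards, discount factor) follows Sutton and Barto (1998) and is an assumption about what the slide showed, not a copy of it.

```latex
% State-value function of policy \pi (Sutton & Barto 1998 notation, assumed):
%   \pi(s,a)    probability of taking action a in state s
%   P^a_{ss'}   probability of moving from s to s' under action a
%   R^a_{ss'}   expected reward on that transition
%   \gamma      discount factor, 0 \le \gamma < 1
V^{\pi}(s) = \sum_{a} \pi(s,a) \sum_{s'} P^{a}_{ss'}
             \left[ R^{a}_{ss'} + \gamma V^{\pi}(s') \right]
```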

Bellman Optimality Equation
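
For reference, the optimality equations in Sutton and Barto's (1998) notation are sketched below. The symbols are assumed from that book rather than taken from the slide: transition probabilities, expected rewards, and a discount factor.

```latex
% Optimal state-value and action-value functions (notation assumed):
%   P^a_{ss'}   probability of moving from s to s' under action a
%   R^a_{ss'}   expected reward on that transition
%   \gamma      discount factor
V^{*}(s)   = \max_{a} \sum_{s'} P^{a}_{ss'}
             \left[ R^{a}_{ss'} + \gamma V^{*}(s') \right]

Q^{*}(s,a) = \sum_{s'} P^{a}_{ss'}
             \left[ R^{a}_{ss'} + \gamma \max_{a'} Q^{*}(s',a') \right]
```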

Value Iteration
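
A minimal sketch of value iteration, written in Python for brevity rather than in the R used for the talk. The two-state MDP, the names `P`, `q_value`, and `value_iteration`, and the tolerance are all hypothetical choices made for illustration; the talk's own implementation is not shown here.

```python
# Hypothetical 2-state, 2-action MDP (not from the talk).
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(V, s, a, gamma=GAMMA):
    """Expected one-step return of action a in state s under values V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(tol=1e-10, gamma=GAMMA):
    """Apply the Bellman optimality backup in place until values settle."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Read off a greedy policy from the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}
    return V, policy


V, policy = value_iteration()
```

On this toy problem the greedy policy chooses action 1 in both states, since staying in state 1 earns the larger reward on every step.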

Policy Iteration
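
A minimal sketch of policy iteration, alternating iterative policy evaluation with greedy improvement. It is written in Python rather than R, and the toy MDP and the names `P`, `evaluate`, and `policy_iteration` are hypothetical, introduced only for illustration.

```python
# Hypothetical 2-state, 2-action MDP (not from the talk).
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}


def backup(V, s, a, gamma):
    """Expected one-step return of action a in state s under values V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def evaluate(policy, gamma, tol=1e-10):
    """Iterative policy evaluation for a deterministic policy."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v = backup(V, s, policy[s], gamma)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V


def policy_iteration(gamma=0.9):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = {s: 0 for s in P}  # arbitrary starting policy
    while True:
        V = evaluate(policy, gamma)
        improved = {s: max(P[s], key=lambda a: backup(V, s, a, gamma)) for s in P}
        if improved == policy:
            return V, policy
        policy = improved


V, policy = policy_iteration()
```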

Reinforcement Learning

Temporal Difference Learning
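
A minimal sketch of tabular TD(0) prediction, in Python rather than R. The function name `td0`, the episode format (lists of `(state, reward, next_state)` with `None` marking termination), and the two-step example episode are hypothetical choices for illustration only.

```python
from collections import defaultdict


def td0(episodes, alpha=0.1, gamma=1.0):
    """Tabular TD(0): move V(s) toward the target r + gamma * V(s')."""
    V = defaultdict(float)
    for episode in episodes:
        for s, r, s_next in episode:
            # A terminal successor (None) contributes no bootstrapped value.
            bootstrap = gamma * V[s_next] if s_next is not None else 0.0
            V[s] += alpha * (r + bootstrap - V[s])
    return dict(V)


# Hypothetical deterministic episode: A -> B -> terminal, reward 1 at the end.
episode = [("A", 0.0, "B"), ("B", 1.0, None)]
V = td0([episode] * 500)
```

With no discounting, both states have a true value of 1, and repeated replay of the episode drives the estimates toward it.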

Q-Learning
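
A minimal sketch of tabular Q-learning with epsilon-greedy exploration, in Python rather than R. The deterministic toy environment `step`, the function name `q_learning`, and all hyperparameter values are hypothetical and chosen only to keep the example small.

```python
import random


def step(s, a):
    """Hypothetical deterministic environment: returns (next_state, reward)."""
    if a == 0:
        return 0, 0.0                       # action 0 always leads to state 0
    return 1, (1.0 if s == 0 else 2.0)      # action 1 always leads to state 1


def q_learning(n_steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Off-policy TD control: bootstrap from the best next action."""
    rng = random.Random(seed)
    states, actions = (0, 1), (0, 1)
    Q = {(s, a): 0.0 for s in states for a in actions}
    s = 0
    for _ in range(n_steps):
        # Epsilon-greedy behaviour policy.
        if rng.random() < epsilon:
            a = rng.choice(actions)
        else:
            a = max(actions, key=lambda b: Q[(s, b)])
        s2, r = step(s, a)
        target = r + gamma * max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2
    return Q


Q = q_learning()
greedy = {s: max((0, 1), key=lambda b: Q[(s, b)]) for s in (0, 1)}
```

Because the update bootstraps from the maximum over next actions, the learned values approximate the optimal action values even though the behaviour policy keeps exploring.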

Linear Architectures

Least Squares TD Learning
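
For orientation, a common statement of LSTD(0) with a linear architecture is sketched below. The feature map, weight vector, and sample indexing are assumed notation, not taken from the slide, and the talk may have used a variant such as the LSTDQ form from Lagoudakis and Parr.

```latex
% Linear value-function approximation (symbols assumed):
%   \phi(s)  feature vector of state s
%   w        weight vector to be estimated
V(s) \approx \phi(s)^{\top} w

% LSTD(0) estimate from samples (s_t, r_{t+1}, s_{t+1}), t = 0, \dots, T-1:
A = \sum_{t=0}^{T-1} \phi(s_t) \left( \phi(s_t) - \gamma\, \phi(s_{t+1}) \right)^{\top},
\qquad
b = \sum_{t=0}^{T-1} \phi(s_t)\, r_{t+1},
\qquad
w = A^{-1} b
```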

Examples of RL in Finance

  • Performance Functions and Reinforcement Learning for Trading Systems and Portfolios. John Moody, Lizhong Wu, Yuansong Liao & Matthew Saffell. Journal of Forecasting, Volume 17, Pages 441-470, 1998.
  • Intraday FX trading: Reinforcement learning vs evolutionary learning. M. A. H. Dempster, T. W. Payne, & V. S. Romahi. Working Paper No. 23/01, Judge Institute of Management, University of Cambridge, 2001.

Advantages of RL in R

  • Vectorized Programming
  • Flexible, Interactive Simulation Environment
  • Wide Range of Possibilities for Linear Basis Functions
  • Interface to Existing Packages: HMMs, SVMs, GAs, Neural Networks

References

Richard Sutton and Andrew Barto. Reinforcement Learning: An Introduction. The MIT Press, Cambridge, Massachusetts, 1998.

Michail G. Lagoudakis and Ronald Parr. “Least-Squares Policy Iteration,” Journal of Machine Learning Research, 4, 2003, pp. 1107-1149.