CS 478 - Reinforcement Learning 1
Reinforcement Learning
• Variation on Supervised Learning
• Exact target outputs are not given
• Some variation of reward is given, either immediately or after some steps
   – Chess
   – Path Discovery
• RL systems learn a mapping from states to actions by trial-and-error interactions with a dynamic environment
• TD-Gammon (Neuro-Gammon)
• Deep RL (RL with deep neural networks)
   – Showing tremendous potential
   – Especially nice for games because easy to generate data through self-play
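
To make the "mapping from states to actions by trial-and-error" idea concrete, here is a minimal sketch of a path-discovery task in which reward arrives only after several steps. It uses tabular Q-learning, which is one common RL method and is not named on this slide. The five-cell corridor, the function names (step, train, greedy_policy), and the values of ALPHA, GAMMA, and EPSILON are all illustrative assumptions rather than course material.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # illustrative values, not from the slides


def step(state, action):
    """Toy environment: move, stay inside the corridor, reward 1 only at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Learn action values purely from interaction; no target actions are given."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial and error: usually take the best-known action, sometimes explore.
            if rng.random() < EPSILON:
                a = rng.randrange(2)
            else:
                best = max(q[state])
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # The delayed reward is propagated backward through the value estimates.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """The learned mapping from each non-goal state to an action."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])] for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Under these assumptions the printed policy should choose the step-right action in every non-goal cell, even though the only reward signal is a single 1 at the end of each episode.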