Reinforcement Learning in Psychology and Neuroscience
with thanks to Elliot Ludvig Princeton University
Psychology has identified two primitive kinds of learning
❖ Classical Conditioning
❖ Operant Conditioning (a.k.a. Instrumental
❖ Classical = Prediction
❖ Operant = Control
❖ Widely cited and used
❖ TD learning as extension of RW
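To make the RW (Rescorla-Wagner) starting point concrete, here is a minimal sketch of the standard trial-level rule, in which every cue present on a trial shares one prediction error. The function name, the learning rate `alpha`, and the trial encoding are illustrative assumptions, not taken from the slides. TD learning reduces to this rule when each trial is a single time step and the next-step value is ignored.

```python
def rescorla_wagner(trials, n_cues, alpha=0.1):
    """Trial-level Rescorla-Wagner learning (illustrative sketch).

    trials: list of (x, r) pairs, where x is a tuple of 0/1 cue indicators
            and r is the reward (US magnitude) on that trial.
    Returns the learned associative strengths, one per cue.
    """
    w = [0.0] * n_cues
    for x, r in trials:
        v = sum(wi * xi for wi, xi in zip(w, x))  # summed prediction of all present cues
        delta = r - v                             # prediction error shared by all cues
        for i in range(n_cues):
            w[i] += alpha * delta * x[i]          # only cues that are present are updated
    return w


# Acquisition: one cue repeatedly paired with reward.
w_acq = rescorla_wagner([((1,), 1.0)] * 200, n_cues=1)

# Blocking: cue A is pretrained alone, then A and B are presented together.
blocking_trials = [((1, 0), 1.0)] * 200 + [((1, 1), 1.0)] * 200
w_block = rescorla_wagner(blocking_trials, n_cues=2)
```

In the blocking run, cue B gains almost no strength because cue A already predicts the reward, so the shared error is close to zero by the time B appears.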
❖ Decide which response to make?
❖ Decide how much to respond?
❖ Decide when to respond?
❖ What function is being fulfilled?
❖ How is it accomplished?
❖ What physical substrate is involved?
$\delta_t = r_{t+1} + \gamma V_{t+1} - V_t$
$V_t = w_t^\top x_t = \sum_{i=1}^{n} w_t(i)\, x_t(i)$
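[Figure: TD model diagram. States are encoded as Features $x_i$; the Value of state is the weighted sum $\sum_i w_i \cdot x_i$; Reward and value combine into the TD Error $\delta$; each weight $w_i$ changes as $\Delta w_i \sim \delta \cdot e_i$, where $e_i$ is the Eligibility Trace ($\lambda$).]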
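The quantities in the TD model (linear value, TD error, eligibility traces) can be combined into one update loop. The following is a minimal sketch of TD(λ) with linear function approximation and accumulating traces. The function name, the step size `alpha`, the parameter values, and the three-state chain used as input are all assumptions made for illustration.

```python
import numpy as np


def td_lambda_episode(X, r, w, alpha=0.1, gamma=0.95, lam=0.9):
    """One episode of linear TD(lambda) with accumulating traces (sketch).

    X: array of shape (T+1, n), feature vectors x_0 .. x_T
    r: array of shape (T,), where r[t] is the reward r_{t+1} on the step t -> t+1
    w: weight vector of shape (n,)
    Returns the updated weights and the TD error at every step.
    """
    e = np.zeros_like(w)                          # eligibility trace, one entry per feature
    deltas = np.zeros(len(r))
    for t in range(len(r)):
        v_t = w @ X[t]                            # V_t = w^T x_t
        v_next = w @ X[t + 1]                     # V_{t+1}
        delta = r[t] + gamma * v_next - v_t       # TD error
        e = gamma * lam * e + X[t]                # decay old traces, mark current features
        w = w + alpha * delta * e                 # weight change proportional to delta * e_i
        deltas[t] = delta
    return w, deltas


# Hypothetical three-state chain: one-hot features, all-zero terminal state,
# and a single reward of 1 on the final transition.
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
r = np.array([0.0, 0.0, 1.0])

w = np.zeros(3)
for _ in range(2000):
    w, deltas = td_lambda_episode(X, r, w)
```

With one-hot features the weights are the state values themselves, so after training they approach the discounted time to reward: roughly γ², γ, and 1 for the three states.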
Hammer, Menzel
Honeybee Brain VUM Neuron
What signal does this neuron carry?
❖ Diffuse projections from mid-brain
Wolfram Schultz, et al.
[Figure: TD error in three conditions. Reward Unexpected: traces for Reward, Value, TD error. Reward Expected: traces for Cue, Value, TD error. Reward Absent: traces for Value, TD error.]
$\text{TD error}_t = r_{t+1} + \gamma V_{t+1} - V_t$
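The three conditions (reward unexpected, reward expected, reward absent) can be reproduced qualitatively with a small simulation of this TD error. This sketch assumes a tapped-delay-line stimulus representation (one feature per time step since cue onset), TD(0) updates, and γ = 1. The function name, trial timing, and parameter values are hypothetical choices, not taken from the slides.

```python
import numpy as np


def run_trial(w, cue_t=5, rew_t=15, T=25, reward=1.0,
              alpha=0.2, gamma=1.0, learn=True):
    """Simulate one conditioning trial and return (weights, TD error per time step).

    Assumed representation: feature k is active exactly k steps after cue onset,
    for k = 0 .. len(w) - 1. All features are zero before the cue.
    """
    n = len(w)

    def features(t):
        x = np.zeros(n)
        k = t - cue_t
        if 0 <= k < n:
            x[k] = 1.0
        return x

    deltas = np.zeros(T)
    for t in range(T - 1):
        x_t, x_next = features(t), features(t + 1)
        r_next = reward if t + 1 == rew_t else 0.0
        delta = r_next + gamma * (w @ x_next) - (w @ x_t)
        if learn:
            w = w + alpha * delta * x_t
        deltas[t + 1] = delta  # the error is registered when r_{t+1} arrives
    return w, deltas


n_features = 10  # spans the cue-reward interval (rew_t - cue_t)

# Reward unexpected: untrained weights, so the error appears at reward time.
_, d_unexpected = run_trial(np.zeros(n_features), learn=False)

# Train on cue-reward pairings.
w = np.zeros(n_features)
for _ in range(500):
    w, _ = run_trial(w)

# Reward expected: the error has moved to cue onset and vanishes at reward time.
_, d_expected = run_trial(w, learn=False)

# Reward absent: omitting the reward gives a negative error at the expected time.
_, d_absent = run_trial(w, reward=0.0, learn=False)
```

Under these assumptions the error at reward time is 1 before training, near 0 after training, and near -1 when a predicted reward is omitted, while the error at cue onset grows from 0 to about 1.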