SLIDE 2 4/28/2014 2
The Authors
Michael S. Brainard: Principal Investigator at the University of California, San Francisco
Howard Hughes Medical Institute professor; also a professor of physiology and psychiatry
BS, biochemistry, Harvard University; PhD, neurobiology, Stanford University
Howard Hughes Medical Institute (2014). Retrieved from: http://www.hhmi.org/scientists/michael-brainard
In "actor-critic" models of reinforcement learning, three events must take place for learning to occur. What are these three events, and how do they influence learning?
“Actor/Critic Models of Reinforcement Learning”
Reinforcement learning, or “trial and error” learning, was first characterized by Thorndike’s (1911) “Law of Effect”, which states that a random action that produces a satisfying effect is more likely to occur again in that same situation.
The three conditions for reinforcement learning are:
1) The situation (context, state, timing).
2) The action (what the animal or “actor” did: a motor act, a plan or a thought).
3) The reward.
Thus: if, in a given situation, a given action is followed by a reward (i.e. a satisfying effect or a sense of comfort), then the action will be more likely to occur again in that same situation. By contrast, if in a given situation an action is followed by a negative reward (i.e. one that produces discomfort or dissatisfaction), that action will be less likely to occur again in the same context.
The reward, which can be either positive or negative, is the input from the “critic”.
Therefore: if an action occurs in a given context and is followed by a signal from the critic, the action becomes more likely to be repeated when the signal is positive and less likely when it is negative.
Thorndike, Edward Lee (1911) Animal Intelligence. Macmillan, New York, 297 pp.
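The Law of Effect described on this slide can be illustrated with a toy simulation. The following is only a minimal sketch under stated assumptions: the class name `LawOfEffectLearner`, the `learning_rate` value, the multiplicative update rule, and the example situation and action names are all hypothetical choices made for illustration. They do not come from Thorndike or from the authors' model.

```python
from collections import defaultdict


class LawOfEffectLearner:
    """Toy learner that keeps one propensity per (situation, action) pair.

    Hypothetical illustration only; not the authors' actor-critic model.
    """

    def __init__(self, actions, learning_rate=0.5):
        self.actions = list(actions)
        self.learning_rate = learning_rate  # assumed value, chosen for clarity
        # Every (situation, action) pair starts with the same propensity.
        self.propensity = defaultdict(lambda: 1.0)

    def probability(self, situation, action):
        """Chance of the actor choosing `action` in `situation`."""
        total = sum(self.propensity[(situation, a)] for a in self.actions)
        return self.propensity[(situation, action)] / total

    def reinforce(self, situation, action, reward):
        """Apply the critic's signal: reward > 0 strengthens, reward < 0 weakens."""
        key = (situation, action)
        updated = self.propensity[key] * (1.0 + self.learning_rate * reward)
        # Keep the propensity strictly positive so probabilities stay valid.
        self.propensity[key] = max(updated, 1e-6)


learner = LawOfEffectLearner(actions=["press_lever", "groom"])

before = learner.probability("light_on", "press_lever")
learner.reinforce("light_on", "press_lever", reward=+1)  # satisfying effect
after = learner.probability("light_on", "press_lever")

learner.reinforce("light_on", "groom", reward=-1)  # discomfort
punished = learner.probability("light_on", "groom")

print(before, after, punished)
```

In this sketch the situation, the action, and the reward are the three arguments to `reinforce`. Because propensities are stored per situation, rewarding an action in one context leaves its probability in other contexts unchanged, which mirrors the "in that same situation" clause of the Law of Effect.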
The authors use birdsong as an example of learned behavior. What is the evidence that birds actually learn their songs?
1) P. Marler and W. Thorpe, working in Cambridge, England, in the 1950s, discovered that chaffinches sang only 2 songs as adults, but that the songs differed from one geographic area to another (dialects).
2) To test whether the birds were learning their songs, they raised birds in acoustic isolation.
3) If tutored with a song played from a tape recorder, an isolated bird will copy the tutor song as an adult. If presented with a tutor song from a different dialect, the bird copies the tutor’s song, not its own native dialect.
4) White-crowned sparrows in California (P. Marler) had similar geographic dialects and similar learning rules.
5) Birds that are deafened before they learn to sing will sing an aberrant song; if they are deafened after they have learned to sing, the deafening has no effect.
6) Male birds often sing exactly the same song as their father. There is a lot of variation in the songs within a species, but sons will often replicate the exact syllables that their father sang.