4/28/2014 The Scope of This Talk Covert skill learning in a - - PDF document



slide-1
SLIDE 1

4/28/2014 1

Covert skill learning in a cortical-basal ganglia circuit

Charlesworth, J.D., Warren, T.L., & Brainard, M.S. (2012). Covert skill learning in a cortical-basal ganglia circuit. Nature 486, 251-255. doi: 10.1038/nature11078

BIONB 4110
April 28, 2014
Presented by: Jennifer Hoots and Professor Carl Hopkins

The Scope of This Talk

• Journal
• Institution
• Authors
• Background
• Figures
• Their Conclusions
• Further Thoughts

The Journal

• Nature
• Published weekly
• Impact factor 38.597
• Interdisciplinary
• International
• Peer-reviewed

The Institution

• University of California San Francisco
  • W. M. Keck Center for Integrative Neuroscience
  • Department of Physiology
  • Neuroscience Graduate Program

The Authors

• Jonathan D. Charlesworth
• Studied molecular biology as an undergraduate at Princeton University ’07
• PhD in neuroscience at University of California San Francisco ’12
• Postdoc at Neurotek (now thync)
• Current senior scientist at thync
• Performed the experiments with APV in RA
• Analyzed the data

Jonathan Charlesworth. Retrieved from: http://blogs.princeton.edu/paw/2012/05/tiger-of-the-we-115/

The Authors

• Timothy L. Warren
• Performed the experiments with LMAN inactivations
• Studied at Harvard as an undergraduate
• UCSF graduate student at the time

Timothy L. Warren. Retrieved from: http://keck.ucsf.edu/~twarren/


The Authors

• Michael S. Brainard
• Principal Investigator at University of California, San Francisco
  • Howard Hughes Medical Institute professor
  • Also a professor of physiology and psychiatry
• BS, biochemistry, Harvard University
• PhD, neurobiology, Stanford University

Howard Hughes Medical Institute (2014). Retrieved from: http://www.hhmi.org/scientists/michael-brainard

In the "actor-critic" models of reinforcement learning three events must occur for learning to occur. What are these three events and how do they influence learning?

“Actor/Critic Models of Reinforcement Learning”

Reinforcement learning or “trial and error” learning was first characterized in Thorndike’s (1911) “Law of Effect” which states that a random action that produces a satisfying effect is more likely to occur again in that same situation.

Thorndike, Edward Lee (1911) Animal Intelligence. Macmillan, New York, 297 pp.

“Actor/Critic Models of Reinforcement Learning”

Reinforcement learning or “trial and error” learning was first characterized by Thorndike’s (1911) “Law of Effect”. This states that a random action that produces a satisfying effect is more likely to occur again in that same situation.

The three conditions for reinforcement learning are:
1) The situation (context, state, timing).
2) The action (what the animal or “actor” did: a motor act, a plan or a thought).
3) The reward.

Thus: If, in a given situation, a given action is followed by a reward (i.e. a satisfying effect or a sense of comfort), then the action will be more likely to occur again in that same situation. By contrast, if in a given situation an action is followed by a negative reward (i.e. one that produces discomfort or dissatisfaction), the action will be less likely to occur again in the same context. Reward, which can be either positive or negative, is input from the “critic”.

Therefore: If an action occurs in a given context and is followed by a signal from the critic, the action becomes either more or less likely to be repeated in that context.

Thorndike, Edward Lee (1911) Animal Intelligence. Macmillan, New York, 297 pp.
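The three conditions (situation, action, reward) can be illustrated with a toy actor-critic loop. This is only a minimal sketch under stated assumptions, not a model taken from the paper: there is a single fixed situation, the reward values are made up, and the function name `train_actor_critic`, the softmax policy and the learning rates are all hypothetical choices.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into choice probabilities."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(rewards, trials=2000, actor_lr=0.1, critic_lr=0.1, seed=0):
    """Toy actor-critic in ONE fixed situation (assumption).

    rewards[a] is the made-up reward that follows action a.
    Returns the final action probabilities and the critic's value estimate.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)  # actor: preference for each action in this situation
    value = 0.0                   # critic: reward it expects in this situation
    for _ in range(trials):
        probs = softmax(prefs)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        error = reward - value          # critic's signal: better or worse than expected
        value += critic_lr * error      # critic updates its expectation
        for a in range(len(prefs)):
            # Actor: make the chosen action more likely after a positive error,
            # less likely after a negative error.
            chosen = 1.0 if a == action else 0.0
            prefs[a] += actor_lr * error * (chosen - probs[a])
    return softmax(prefs), value


# Action 0 is followed by a satisfying effect, action 1 by discomfort (made-up values).
policy, value = train_actor_critic(rewards=[1.0, -1.0])
```

After training, `policy[0]` is close to 1: the rewarded action has become the likely one in that situation, which is the Law of Effect stated above.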

The authors use birdsong as an example of learned behavior. What is the evidence that birds actually learn their songs?

1) P. Marler and W. Thorpe, working in Cambridge, England in the 1950s, discovered that chaffinches sang only 2 songs as adults, but that the songs differed from one geographic area to another (dialects).
2) To prove that the birds were learning their songs, they raised birds in acoustic isolation.
3) If tutored with a sound from a tape recorder, the isolated bird will copy the tutor song as an adult. If presented with a different dialect, the bird copies the tutor’s song, not its own native dialect.
4) White-crowned sparrows in California (P. Marler) had similar geographic dialects and similar learning rules.
5) Birds that are deafened before they learn to sing will sing an aberrant song; if they are deafened after they have learned to sing, the deafening has no effect.
6) Male birds often sing exactly the same song as their father. There is a lot of variation in the songs within a species, but sons will often replicate the exact same syllables that their father sang.

Zebra Finch (Taeniopygia guttata)

Native to deserts of Australia. Huge flocks, migratory.

Neuroanatomy of song production

The brain areas involved in song production were established by tract-tracing studies done in the 1970s by Fernando Nottebohm (silver degeneration techniques).

LMAN is essential for song learning but not for song production

Science (1984), pp. 901-903. Figure panels: abnormal song after LMAN lesion; normal song after control lesion to forebrain.


Brainard and Doupe (2000) develop error model for song learning

Alexay A. Kozhevnikov, Michale S. Fee (2007) Singing-Related Activity of Identified HVC Neurons in the Zebra Finch. Journal of Neurophysiology. Vol. 97, pp. 4271-4283.

Recordings of single identified units in HVC during bouts of natural singing showed three different but stereotyped patterns of firing with respect to the vocal output. Single units were identified by antidromic stimulation of X and RA.

1) Units that projected to area X in the Anterior Forebrain Nucleus fire in bursts of one to four times per motif.
2) Units projecting to RA fire very rarely, phase locked to no more than one syllable per motif. This is a sparse code for one piece of the song.
3) Interneurons within HVC fire throughout the song with tonic firing.

Simplified model for how the brain controls a complex, learned vocalization.

1) Neurons in HVC fire in sparse code, one neuron per syllable. Each neuron connects to the next neuron in the timing chain.
2) HVC neurons send output to one or more RA neurons. RA neurons fire at syllable-specific times in the song. RA codes for individual muscle contractions within the song.
3) Each syllable is composed of a complex of muscle contractions linked to the active units in RA.

Anthony Leonardo and Michale S. Fee (2005) J. Neurosci.
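The three steps of this simplified model can be written as a small feedforward lookup. This is a sketch only: the wiring tables, the muscle names and the function `run_song_model` are invented for illustration and are not taken from Leonardo and Fee (2005).

```python
def run_song_model(n_syllables, hvc_to_ra, ra_to_muscle):
    """Step through a feedforward HVC timing chain.

    At each step exactly one HVC neuron is active (sparse code), it drives its
    RA targets, and the active RA units determine which muscles contract.
    """
    timeline = []
    for step in range(n_syllables):  # HVC neuron `step` fires once, then hands off to the next
        ra_active = sorted(hvc_to_ra.get(step, []))
        muscles = sorted({m for ra in ra_active for m in ra_to_muscle.get(ra, [])})
        timeline.append({"hvc": step, "ra": ra_active, "muscles": muscles})
    return timeline


# Hypothetical wiring: HVC neuron index -> RA units, RA unit -> muscles.
hvc_to_ra = {0: [0, 1], 1: [2], 2: [1, 3]}
ra_to_muscle = {
    0: ["syringeal_a"],
    1: ["syringeal_b"],
    2: ["respiratory"],
    3: ["syringeal_a", "respiratory"],
}

timeline = run_song_model(3, hvc_to_ra, ra_to_muscle)
for entry in timeline:
    print(entry)
```

Each syllable ends up as a set of muscle contractions selected by whichever RA units the currently active HVC neuron drives. RA unit 1 is reused by two different syllables in this made-up wiring.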

Purpose & Hypothesis

• Purpose: To further resolve the function of cortical-basal ganglia circuits in trial and error skill learning.
• Hypothesis: “…learning requires the reinforcement of exploratory behavioural variation generated by the AFP; therefore, preventing the AFP from contributing to behavioural variation during training should prevent trial-and-error learning.”

Study Organism

• Bengalese finches (Lonchura striata domestica)
• Adult males (more than 120 days old)
• Housed in sound-attenuating chambers
• All recorded songs were undirected (no female present)

Beckham, R. (2013) Society Finch - Lonchura striata domestica. efinch.com. Retrieved from: http://www.efinch.com/species/society.htm

Training

• Tumer, E.C. & Brainard, M.S. (2007) Performance variability enables adaptive plasticity of “crystallized” adult birdsong. Nature. doi: 10.1038/nature06390
• There is trial-by-trial variation in stable adult song
• A computerized system monitors pitch variation and delivers real-time auditory disruption (white noise) to a subset of those variations
• Birds adjust their song to avoid the disruption
• Threshold for avoiding white noise was set at about the baseline median fundamental frequency (FF)
• White noise was delivered for 4-14 hrs while birds were awake
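The contingency described above can be sketched in a few lines. This is an illustration, not the authors' software: the FF values are made up, the function names are hypothetical, and the `reinforce` direction parameter is an assumption (the slide only says the threshold sat near the baseline median).

```python
import statistics


def white_noise_threshold(baseline_ff_hz):
    """Set the threshold at the median FF of baseline renditions."""
    return statistics.median(baseline_ff_hz)


def gets_white_noise(ff_hz, threshold_hz, reinforce="up"):
    """Decide whether one rendition of the target syllable is disrupted.

    reinforce="up" (assumed convention): renditions below threshold get white
    noise, so singing higher escapes it. Any other value reverses the rule.
    """
    if reinforce == "up":
        return ff_hz < threshold_hz
    return ff_hz > threshold_hz


# Made-up baseline FF values (Hz) for one syllable.
baseline = [2290.0, 2305.0, 2310.0, 2298.0, 2320.0]
threshold = white_noise_threshold(baseline)
hits = [gets_white_noise(ff, threshold) for ff in baseline]
print(threshold, hits)
```

With the threshold at the median, roughly half of baseline renditions are disrupted at the start of training, so the bird has both "hit" and "escape" outcomes to learn from.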


Figure 1 (a & b)

Tumer, E.C. & Brainard, M.S. (2007) Performance variability enables adaptive plasticity of “crystallized” adult birdsong. Nature. doi: 10.1038/nature06390

Figure 1 (c-g)

Use APV in RA to block input from LMAN

Figure 2 (a)

APV infusion

Brainard, M.S., & Doupe, A.J. (2000) Auditory feedback in learning and maintenance of vocal behaviour. Nature Reviews Neuroscience 1, 31-40. doi: 10.1038/35036205

APV Injection

• Reverse microdialysis diffuses the solution into the brain area (RA) across the membrane of the implanted probe
• ACSF was dialysed for 48 hrs
• The NMDAR antagonist DL-APV (DL-2-amino-5-phosphonopentanoic acid) was dialysed for at least 1.5 hrs before white noise training
• After training, the solution was switched back to ACSF and birds were prevented from singing for at least 1.5 hrs to allow washout before the first post-training song recording

Sigma-Aldrich Co. LLC. (2014). Retrieved from: http://www.sigmaaldrich.com/catalog/product/sigma/a5282?lang=en&region=US

Figure 2 (b & c)


Figure 3 (a & b)

Figure 3 (c & d)

Figure 3 (e & f)

Figure 4 (a)

Muscimol (GABA-A receptor agonist) or lidocaine (Na+ channel blocker)

Brainard, M.S., & Doupe, A.J. (2000) Auditory feedback in learning and maintenance of vocal behaviour. Nature Reviews Neuroscience 1, 31-40. doi: 10.1038/35036205

Figure 4 (b-c)

Their Conclusions

• “Our results motivate a revision to models of song plasticity [10–12] and influential actor–critic models of skill learning [2,3], which propose that essential learning-related signals develop only in brain regions that are ‘acting’ (that is, controlling behaviour).”

• 10. Fee, M.S. & Goldberg, J.H. A hypothesis for basal ganglia-dependent reinforcement learning in the songbird. Neuroscience 198, 152-170 (2011).
• 11. Fiete, I.R., Fee, M.S. & Seung, H.S. Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances. J. Neurophysiol. 98, 2038-2057 (2007).

Their Conclusions

• Learning can happen in the AFP even when it is not acting
• Variation generated by the AFP is not necessary for learning
• A different source of variation can be exploited for reinforcement learning
  • Possibly variation in the RA
• Information about variation may converge at the AFP and be associated with reinforcement signals

Supplementary Figure 1

1. Efference copy of motor command to AFP
2. Efference copy and reinforcement signals converge, allowing the AFP to identify successful motor commands
3. When AFP output is unblocked, functional connections between the AFP and motor pathway allow the AFP to implement more successful motor commands
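The three steps in Supplementary Figure 1 can be turned into a toy simulation. This is a sketch under loud assumptions, not the authors' model: the learning rule, the parameter values and the function name `simulate_covert_learning` are hypothetical, pitch is expressed in Hz relative to baseline, and white noise is assumed to hit renditions below baseline.

```python
import random


def simulate_covert_learning(trials=500, lr=0.01, noise_sd=10.0, seed=1):
    """Train with AFP output blocked, then report what unblocking would express.

    Returns (mean pitch during training, pitch after unblocking), both in Hz
    relative to baseline.
    """
    rng = random.Random(seed)
    afp_bias = 0.0   # pitch bias stored in the AFP, not expressed while blocked
    expected = 0.0   # running estimate of average reinforcement
    training_pitch = []
    for _ in range(trials):
        # Assumption: variation comes from the motor pathway (e.g. RA), not the AFP.
        variation = rng.gauss(0.0, noise_sd)
        pitch = variation  # AFP output is blocked, so afp_bias does not reach the song
        reinforcement = -1.0 if pitch < 0.0 else 0.0  # white noise below threshold
        # Steps 1 and 2: the efference copy (variation) and the reinforcement
        # signal converge in the AFP, which credits the variations that did well.
        afp_bias += lr * (reinforcement - expected) * variation
        expected += 0.1 * (reinforcement - expected)
        training_pitch.append(pitch)
    mean_training = sum(training_pitch) / trials
    # Step 3: once unblocked, the AFP adds its stored bias to the motor command.
    return mean_training, afp_bias


mean_training, pitch_after_unblock = simulate_covert_learning()
print(round(mean_training, 2), round(pitch_after_unblock, 2))
```

In this sketch the song stays near baseline throughout training, yet a positive pitch shift appears as soon as AFP output is restored. That is the sense in which the learning is hidden during training.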

Discussion Questions

The authors refer to "covert skill learning" in their paper. Why exactly do they say the learning is covert?

Discussion Questions

Suppose that APV does not eliminate all connections between the anterior forebrain pathway and nucleus RA (primary motor cortex), but merely reduces or blocks some of the connections. What could you conclude about covert learning if this were the case?

Using the Brainard method of experimentally driving learning, Andalman and Fee (2009) were able to cause shifts in FF of zebra finch syllables.

Methods: the sound is recorded with a microphone attached to the skull, and feedback is provided through a speaker delivering sound to the cranial air sac.

Aaron S. Andalman and Michale S. Fee. A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors. PNAS, July 28, 2009, vol. 106, no. 30.

Figure: frequency of one syllable across days, during pitch-up and pitch-down training.

However, when the Anterior Forebrain Pathway is knocked out by inserting bilateral cannulae into LMAN for TTX infusion, the induced learning is completely abolished (TTX is compared to CSF infusion as control). This result suggests that LMAN provides a corrective pre-motor bias to the song frequency, causing up and down shifts in the frequency of the song. This bias is completely abolished by TTX infusion into LMAN.
