A reinforcement learning model of song acquisition in the bird




slide-1
SLIDE 1

A reinforcement learning model of song acquisition in the bird

Michale Fee

McGovern Institute Department of Brain and Cognitive Sciences Massachusetts Institute of Technology

9.54 November 12, 2014

slide-2
SLIDE 2

[Spectrogram: frequency axis 0 to 10 kHz; scale bar 1 s]

Motif, Motif

Syllable (~100 ms)

Note (~10 ms)

Structure of zebra finch song

slide-3
SLIDE 3

Songbirds learn to sing by imitating their parents

Decreased Variability Increased Similarity to Tutor

Tutor Song Subsong Plastic Song Crystallized

slide-4
SLIDE 4

Overview

  • The songbird as a model system for understanding how the brain generates and learns complex sequential behaviors

  • Review some current understanding of the mechanisms of song production

  • Describe progress in elucidating the role of cortical and basal ganglia circuits in song learning

  • Some speculations on how insights from the songbird may inform our understanding of mammalian BG function

slide-5
SLIDE 5

A circuit for vocal production

RA nXII Motor Pathway HVC

Cortex

Uva

Thalamus

Nottebohm et al, 1976, 1982

slide-6
SLIDE 6

Hahnloser, Kozhevnikov and Fee, 2002

Antidromic activation allows identification of RA-projecting neurons in HVC

HVC RA X Stimulation electrode Extracellular recording electrode

slide-7
SLIDE 7

HVC neurons burst throughout the song

Lynch, Okubo and Fee, in preparation

[Burst rasters, burst number vs. time; scale bar 100 ms]

Bird A: 66 bursts, 40 neurons

Bird B: 56 bursts, 44 neurons

Bird C: 91 bursts, 64 neurons

slide-8
SLIDE 8

Activity of RA neurons during singing

Yu and Margoliash, 1996

Motif

Leonardo and Fee, 2005

HVC RA Extracellular recording electrode

slide-9
SLIDE 9

Simple sequence generation circuit

Sparse representation of time

Output

Leonardo and Fee, 2005
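
The sequence generation idea on this slide can be sketched in a few lines. This is a minimal illustration, not the model from the cited paper: it assumes that exactly one HVC-like unit is active at each time step (the sparse representation of time) and that each output is a weighted sum of that activity. The function name `run_sequence` and the weight values are hypothetical.

```python
def run_sequence(weights, n_steps):
    """Drive output units from a chain of HVC-like units, one active per step.

    weights[i][j] is the (hypothetical) connection strength from HVC-like
    unit i to output unit j.
    """
    n_hvc = len(weights)
    n_out = len(weights[0])
    outputs = []
    for t in range(n_steps):
        # Sparse code for time: only unit (t mod n_hvc) bursts at step t.
        hvc = [1.0 if i == t % n_hvc else 0.0 for i in range(n_hvc)]
        # Each output is the weighted sum of the current HVC activity.
        out = [sum(hvc[i] * weights[i][j] for i in range(n_hvc))
               for j in range(n_out)]
        outputs.append(out)
    return outputs


# Three time steps, two output channels (values are illustrative only).
weights = [[0.2, 0.9], [0.5, 0.1], [0.8, 0.4]]
song = run_sequence(weights, 3)
```

Because only one unit is active per step, the output at step t is simply row t of the weight matrix, so changing one row changes the song at one moment in time and nowhere else.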

slide-10
SLIDE 10

Sparse representation of time

Output

Simple sequence generation circuit

Leonardo and Fee, 2005

slide-11
SLIDE 11

HVC is the ‘clock’ of the song motor pathway

Brain cooling to localize dynamics

[Diagram: motor pathway (HVC, RA, nXII) with cooling device; current settings labeled from 0.0A to 0.75A]

Bilateral cooling of HVC causes uniform slowing of the song

5 mm

Long and Fee, Nature 2008

slide-12
SLIDE 12

A simple reinforcement model of song learning

Song motor system

Song

Exploratory variability Song evaluation

Auditory feedback

Auditory Memory

Doya and Sejnowski, 1989

  • Error/Reinforcement signal
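
The loop on this slide (motor system, exploratory variability, song evaluation, reinforcement signal) can be illustrated with a toy trial-and-error learner. This is a sketch under stated assumptions, not the Doya and Sejnowski model: the song is reduced to a short list of numbers, the evaluation is a squared distance to a stored target, and the reinforcement signal is binary (better or not better than the unperturbed song). All names and parameter values are hypothetical.

```python
import random


def song_error(song, target):
    """Song evaluation: squared distance between the song and the memorized target."""
    return sum((s - t) ** 2 for s, t in zip(song, target))


def learn_song(target, n_trials=2000, noise=0.05, rate=0.5, seed=0):
    """Trial-and-error learning driven by exploratory variability."""
    rng = random.Random(seed)
    motor = [0.0] * len(target)  # motor parameters before learning
    for _ in range(n_trials):
        # Exploratory variability: a random perturbation of the motor command.
        explore = [rng.gauss(0.0, noise) for _ in motor]
        song = [m + e for m, e in zip(motor, explore)]
        # Reinforcement signal: did this variation sound closer to the target?
        reinforced = song_error(song, target) < song_error(motor, target)
        if reinforced:
            # Keep a fraction of the variation that was reinforced.
            motor = [m + rate * e for m, e in zip(motor, explore)]
    return motor


target = [0.6, -0.3]
learned = learn_song(target)
```

With a convex error like this one, keeping a fraction of any improving perturbation never increases the error, so the motor parameters drift steadily toward the target.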
slide-13
SLIDE 13

RA nXII Motor Pathway HVC

A separate circuit for song learning

Cortex

LMAN DLM

Thalamus

Anterior Forebrain Pathway (AFP)

  • The learning pathway is not necessary for adult song production, but is required for learning (Bottjer, 1984; Scharff and Nottebohm, 1991)

Instructive signal

  • Bottjer proposed that the AFP transmits an instructive signal that guides plasticity in the motor pathway

Basal Ganglia

Area X

slide-14
SLIDE 14

RA nXII LMAN HVC

Separate premotor pathways for stereotyped song and variability

Sequence generator Variability generator

Kao et al, 2005 Ölveczky et al, 2005 Aronov et al, 2008 Stepanek and Doupe, 2010

slide-15
SLIDE 15

RA nXII LMAN HVC TTX or Muscimol

Separate premotor pathways for stereotyped song and variability

Sequence generator Variability generator

Kao et al, 2005 Ölveczky et al, 2005 Aronov et al, 2008 Stepanek and Doupe, 2010

slide-16
SLIDE 16

Transient inactivation of the learning pathway

55 day old bird

RA nXII LMAN HVC

Ölveczky, Andalman, and Fee, 2005

slide-17
SLIDE 17

LMAN drives exploratory variability in song

[Plot: residual pitch (Hz) vs. time (ms), LMAN intact vs. LMAN inactivated]

slide-18
SLIDE 18

LMAN intact

250 ms

LMAN inactivated

30 dB

LMAN also drives early song ‘babbling’

Goldberg and Fee, 2011

slide-19
SLIDE 19

RA nXII LMAN Motor Pathway Learning Pathway (AFP) HVC

HVC lesions abolish all stereotyped song structure

slide-20
SLIDE 20

HVC lesions abolish all stereotyped song structure

Pre HVC lesion Post HVC lesion

  • Transient pharmacological inactivation of HVC produces the same effect

Aronov, Andalman and Fee, Science 2008

Subsong bird Plastic song bird Adult bird

slide-21
SLIDE 21

The basal ganglia are not necessary for subsong or vocal variability in juvenile birds

30 dB

Subsong

Pre-lesion 250 ms Post-lesion

HVC RA nXIIts LMAN X DLM

X

Goldberg and Fee, 2011

  • Lesions of the BG have little or no acute effect on juvenile song variability.

  • Local cooling in LMAN slows the timescales of babbling, indicating that exploratory vocal variability is generated by local circuit dynamics within LMAN.

slide-22
SLIDE 22

RA nXII

Separate premotor pathways for stereotyped song and variability

HVC

Sequence generator

LMAN

Variability generator

Kao et al, 2005 Ölveczky et al, 2005 Aronov et al, 2008 Stepanek and Doupe, 2010

slide-23
SLIDE 23

RA nXII LMAN HVC

Separate premotor pathways for stereotyped song and variability

Sequence generator Variability generator

Kao et al, 2005 Ölveczky et al, 2005 Aronov et al, 2008 Stepanek and Doupe, 2010

Instructive signal

slide-24
SLIDE 24

RA nXII LMAN HVC

Separate premotor pathways for stereotyped song and variability

Sequence generator Variability generator

Instructive signal

Area X DLM

slide-25
SLIDE 25

Tchernichovski, Mitra, Lints, Nottebohm, 2001

Tutor: harmonic stack pitch 606 Hz

Pupil: harmonic stack pitch by days of training

Days of training:  5    8    12   20   30
Pitch (Hz):        568  554  551  596  607

Song learning is slow

slide-26
SLIDE 26

Experimental control of song learning

[Figure: vocalized vs. heard song; targeted region of syllable; pitch axis (kHz) with marks at 0.5 and 0.6; labels: feedback noise, pitch threshold]

Andalman and Fee 2009; Tumer and Brainard 2007

Speaker

DSP

Brain Cranial airsac Mic
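
The logic of conditional feedback can be sketched as a simple threshold test on the pitch measured in the targeted region of the syllable. This is an illustration only: the slide does not state which side of the threshold triggers the noise, so the direction is a parameter here, and the function name, pitch values and threshold are hypothetical.

```python
def conditional_feedback(pitches_hz, threshold_hz, noise_below=True):
    """Return, for each rendition, whether the feedback noise is played.

    noise_below=True  : noise on renditions below the threshold
                        (assumed here to push pitch upward over time).
    noise_below=False : noise on renditions above the threshold.
    """
    if noise_below:
        return [p < threshold_hz for p in pitches_hz]
    return [p > threshold_hz for p in pitches_hz]


# Four renditions of the targeted syllable (illustrative values).
renditions = [585.0, 612.0, 598.0, 630.0]
noise = conditional_feedback(renditions, threshold_hz=600.0)
escaped = [p for p, hit in zip(renditions, noise) if not hit]
```

Renditions that escape the noise are the ones the bird is expected to produce more often, which is what makes the feedback an effective reinforcement signal.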

slide-27
SLIDE 27

Conditional auditory feedback drives pitch learning

Tumer and Brainard 2007

[Plots: pitch (Hz), range 550 to 650; scale bar 25 ms; time points 0 h, 2 h, 4 h]

Andalman and Fee 2009

slide-28
SLIDE 28

Many days of sequential learning

[Plot: pitch (Hz) vs. days post hatch, days 120 to 165, with expanded views of days 141 to 144 and days 161 to 164]

[Histograms: ΔPitch during the day (Hz) and ΔPitch overnight (Hz), number of observations, shown for up days and down days]

slide-29
SLIDE 29

Motor parameter space

AFP-driven variability

Where does this learning occur in the song control circuit?

Motor pathway

slide-30
SLIDE 30

Motor parameter space

AFP-driven variability AFP-driven bias

Error gradient (reduced error)

Motor pathway

Where does this learning occur in the song control circuit?

slide-31
SLIDE 31

Motor parameter space

AFP-driven variability AFP-driven bias

Error gradient (reduced error)

Motor pathway Plasticity in motor pathway

Where does this learning occur in the song control circuit?

slide-32
SLIDE 32

AFP-driven bias Plasticity in motor pathway

HVC RA nXIIts LMAN

X

DLM

Motor Pathway Anterior Forebrain Pathway (AFP)

HVC RA nXIIts LMAN

X

DLM

Where does this learning occur in the song control circuit?

slide-33
SLIDE 33

[Traces: pitch (Hz) around TTX and vehicle (PBS) infusions; scale bars 25 Hz, 2 h]

[Histograms: ΔPitch (Hz) after TTX and after vehicle, number of observations, up days vs. down days]

[Diagram: drug delivery device with drug reservoir, cap, inflow tube, outlet tube, dialysis membrane, skull, dental acrylic, LMAN]

Andalman and Fee, PNAS 2009

slide-34
SLIDE 34

Does AFP-driven variability become biased to reduce vocal errors?

Motor parameter space

AFP-driven variability AFP-driven error-reducing bias

Error gradient (reduced error)

Plasticity in motor pathway

Yes!!

Motor pathway

slide-35
SLIDE 35

Is all song learning mediated by AFP bias?

[Plot: pitch (Hz) vs. days post hatch, days 120 to 165]

Many days of sequential learning

slide-36
SLIDE 36

Is all song learning mediated by AFP bias?

[Plot: pitch (Hz) vs. days post-hatch, days 120 to 165; traces for baseline, LMAN(+) and LMAN(-)]

slide-37
SLIDE 37

AFP bias is highly predictive of motor pathway plasticity within the next 24 hours

[Schematic: pitch across Day 1, Night, Day 2, Day 3, marking the AFP bias β and the motor pathway change Δm]

Andalman and Fee, 2009; Warren et al, 2011

[Plot: correlation coefficient (r²) vs. lag (days)]

[Scatter plots: Δm (Hz, down days inverted) vs. estimated AFP bias (Hz, down days inverted) at lag = -2 d, lag = -1 d and lag = 0 d]
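
A lagged-correlation analysis of this kind can be sketched as follows. This is an illustration with synthetic numbers, not the analysis or data from the cited papers. The lag convention is an assumption made for this sketch (a lag of k pairs the bias on day d with Δm on day d + k) and may differ from the sign convention on the slide; the function names are hypothetical.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def lagged_r2(bias, delta_m, lag):
    """r^2 between bias on day d and delta_m on day d + lag (assumed convention)."""
    if lag >= 0:
        x, y = bias[:len(bias) - lag], delta_m[lag:]
    else:
        x, y = bias[-lag:], delta_m[:len(delta_m) + lag]
    return pearson_r2(x, y)


# Synthetic example: the motor pathway change on each day equals the
# previous day's bias, so the correlation peaks at a lag of one day.
bias = [10.0, 25.0, 5.0, 30.0, 15.0, 20.0, 0.0, 12.0]
delta_m = [0.0] + bias[:-1]
r2_same_day = lagged_r2(bias, delta_m, 0)
r2_next_day = lagged_r2(bias, delta_m, 1)
```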

slide-38
SLIDE 38

Motor parameter space

[Schematic labels: AFP-driven variability; AFP-driven bias; motor pathway; motor pathway plasticity; error gradient (reduced error); Day 1, Day 2, Day 3]

Motor pathway plasticity appears to ‘integrate’ AFP bias
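
One way to read "integrate" is as a running sum: each day the AFP adds a bias on top of the motor pathway, and by the next day the motor pathway has absorbed it. The sketch below assumes full transfer of each day's bias within 24 hours, which is a simplification chosen for illustration; the function name and numbers are hypothetical.

```python
def simulate_consolidation(daily_bias_hz, start_pitch_hz):
    """Track pitch with the AFP silenced and intact across days.

    Returns one (motor_only, with_afp) pair per day, where motor_only is the
    pitch expected with LMAN inactivated and with_afp the pitch with LMAN intact.
    """
    motor = start_pitch_hz
    history = []
    for bias in daily_bias_hz:
        with_afp = motor + bias  # AFP bias rides on top of the motor pathway
        history.append((motor, with_afp))
        # Assumption: the motor pathway absorbs the full bias by the next day.
        motor = motor + bias
    return history


# Three "up" days with a 10 Hz bias each, starting from 500 Hz.
history = simulate_consolidation([10, 10, 10], 500)
```

In this toy version the AFP contribution on any given day is only that day's bias, while everything learned earlier already sits in the motor pathway.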

slide-39
SLIDE 39

X

How is AFP bias generated?

RA nXII HVC LMAN

  • Area X receives an efference copy of variability signals sent to RA.
  • If Area X also receives an evaluation signal, then X could figure out which variations lead to better song performance.
  • Dopaminergic midbrain (VTA) has been shown to signal reward prediction error.
  • Do X-projecting VTA neurons carry error-related signals?

VTA

Schultz, 2000

slide-40
SLIDE 40

A descending pathway from higher-order auditory areas to VTA/SNc

AIV

RA

VTA

X

CM

Aud

Keller and Hahnloser, 2008; Gale and Perkel, 2008; Mandelblat-Cerf et al, 2014

Retrograde label from VTA AIV Ventral Intermediate Arcopallium (AIV)

slide-41
SLIDE 41

X

Is AIV necessary for song learning?

nXII HVC LMAN VTA

NeuN

Las, Denisenko, Mandelblat-Cerf, eLife, 2014

slide-42
SLIDE 42

Is AIV necessary for song learning?

[Experimental timeline, age in days post hatch (axis marks at 40 and 90): bird tutored in home cage, AIV lesion, bird isolated, check imitation]

slide-43
SLIDE 43

AIV lesioned pupil #2 – Adult song Example 1 Example 2 Tutor

AIV lesion produces profound song learning deficits

slide-44
SLIDE 44

AIV lesions produce profound song learning deficits

AIV lesioned – adult song Tutor

Lesioned control Unlesioned control AIV lesion

Similarity of unrelated birds

slide-45
SLIDE 45

Do AIV neurons transmit an ‘error’ signal to VTA during singing?

X nXII HVC LMAN VTA

[Plot: voltage (mV) vs. time from stim (ms)]

slide-46
SLIDE 46

200 ms

Noise burst

Do AIV neurons transmit an ‘error’ signal to VTA during singing?

slide-47
SLIDE 47

AIV neurons show error-related signals

Mandelblat-Cerf, Las, Denisenko, under review

slide-48
SLIDE 48

A descending pathway from higher-order auditory areas to VTA/SNc

AIV

RA

VTA

X

CM

Aud

Keller and Hahnloser, 2008; Gale and Perkel, 2008; Mandelblat-Cerf et al, 2014

Retrograde label from VTA AIV Ventral Intermediate Arcopallium (AIV)

slide-49
SLIDE 49

RA nXII HVC LMAN

How it all works: a hypothesis

slide-50
SLIDE 50

VTA

X

How it all works: a hypothesis

CM

Aud

slide-51
SLIDE 51

X

VTA LMAN HVC

How it all works: a hypothesis

slide-52
SLIDE 52

DLM

X

LMAN HVC

How it all works: a hypothesis

slide-53
SLIDE 53

RA HVC LMAN

How it all works: a hypothesis

slide-54
SLIDE 54

3 2 1

VTA HVC (Time Sequence)

LMAN

Pallidal Thalamus

MSN

To RA

Area X

HVC(X) firing patterns

[Spectrogram and rasters: frequency mark at 4 kHz; neurons 1 to 7; scale bar 100 ms]

A model of basal ganglia function with functionally distinct inputs for context, motor efference copy, and reward

The AFP forms a classic cortical-BG-thalamo-cortical loop

slide-55
SLIDE 55

3 2 1

VTA HVC (Time Sequence)

LMAN

Pallidal Thalamus

MSN

To RA

Area X

Learning rule: Strengthen HVC synapse after coincidence of LMAN, HVC and DA inputs

VTA

A model of basal ganglia function with functionally distinct inputs for context, motor efference copy, and reward

slide-56
SLIDE 56

3 2 1

VTA HVC (Time Sequence)

LMAN

Pallidal Thalamus

MSN

To RA

Area X

A model of basal ganglia function with functionally distinct inputs for context, motor efference copy, and reward

Time-dependent bias of one LMAN neuron Goldberg and Fee 2010

LMAN

HVC

1 2 3

MSN To RA

slide-57
SLIDE 57

3 2 1

VTA HVC (Time Sequence)

LMAN

Pallidal Thalamus

MSN

To RA

Area X

HVC synapses: Timing; Drive MSNs; Plastic; Selective for single synapses

LMAN synapses: Action; Do NOT drive MSNs; Not plastic; Global signal

A model of basal ganglia function with functionally distinct inputs for context, motor efference copy, and reward

slide-58
SLIDE 58

VTA LMAN

MSN

HVC LMAN

MSN

HVC

A learning rule with an eligibility trace allows delayed reward

LMAN HVC

VTA

Eligibility trace:

$E_{\mathrm{HVC\text{-}X}} = L\,H$

Weight change:

$\Delta W_{\mathrm{HVC\text{-}X}} = \beta\, E_{\mathrm{HVC\text{-}X}}\, R$
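
The rule on this slide can be sketched in discrete time. The reading of the symbols is an assumption based on the slide labels: L is LMAN activity, H is the activity of each HVC input, R is the reward signal from VTA, and the scale factor is a learning rate. The exponential decay of the trace is also an assumption, added so that a reward arriving a few steps late can still reach the synapse that was eligible. All names and values are hypothetical.

```python
def update_weights(hvc, lman, reward, beta=0.1, decay=0.5):
    """Apply an eligibility-trace learning rule over a sequence of time steps.

    hvc[t][i] : activity of HVC input i at step t
    lman[t]   : LMAN activity at step t
    reward[t] : reward signal at step t
    Returns the accumulated change in each HVC-to-X weight.
    """
    n_syn = len(hvc[0])
    weights = [0.0] * n_syn
    trace = [0.0] * n_syn
    for h_t, l_t, r_t in zip(hvc, lman, reward):
        # Eligibility: coincidence of LMAN and HVC input (E = L * H),
        # carried forward with an assumed exponential decay.
        trace = [decay * e + l_t * h for e, h in zip(trace, h_t)]
        # Weight change: learning rate * eligibility * reward.
        weights = [w + beta * e * r_t for w, e in zip(weights, trace)]
    return weights


# HVC inputs fire one after another; LMAN fires only at step 0;
# the reward arrives two steps later.
hvc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
lman = [1.0, 0.0, 0.0]
reward = [0.0, 0.0, 1.0]
weights = update_weights(hvc, lman, reward)
```

Only the synapse from the HVC input that was active together with LMAN is strengthened, even though the reward arrives after that input has stopped firing.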

slide-59
SLIDE 59

HVC 1 HVC 2 LMAN MSN

Hypothesis for HVC-LMAN synaptic interaction on striatal MSNs

HVC on spines LMAN on dendritic shafts

slide-60
SLIDE 60

t = 1 HVC 1 HVC 2 LMAN MSN

Hypothesis for HVC-LMAN synaptic interaction on striatal MSNs
slide-61
SLIDE 61

t = 2 HVC 1 HVC 2 LMAN MSN

Hypothesis for HVC-LMAN synaptic interaction on striatal MSNs
slide-62
SLIDE 62

t = 2 HVC 1 HVC 2 LMAN MSN

Hypothesis for HVC-LMAN synaptic interaction on striatal MSNs
slide-63
SLIDE 63

t = 2 HVC 1 HVC 2 LMAN MSN

Hypothesis for HVC-LMAN synaptic interaction on striatal MSNs
slide-64
SLIDE 64

t = 2 HVC 1 HVC 2 LMAN MSN

Hypothesis for HVC-LMAN synaptic interaction on striatal MSNs

Dopamine

slide-65
SLIDE 65

HVC 1 HVC 2 LMAN MSN

Synapse Strengthened

Hypothesis for HVC-LMAN synaptic interaction on striatal MSNs
slide-66
SLIDE 66

Serial Block Face Scanning EM

Collaboration with Winfried Denk and Jörgen Kornfeld

slide-67
SLIDE 67

LMAN

200 µm 200 µm

HVC

200 µm 200 µm

Distinct morphology of HVC and LMAN axons

Michael Stetner

Axonal arbor of LMAN neuron in Area X

Axonal arbor of HVC neuron in Area X

slide-68
SLIDE 68

MSN HVC-like LMAN-like

~94% of synapses onto spines are from HVC-like axons

Inputs onto MSN spines originate primarily from HVC

Putative LMAN axons Putative HVC axons

slide-69
SLIDE 69

The role of the basal ganglia in songbird vocal learning

  • LMAN directly drives ‘exploratory variability’ in the song motor pathway.
  • LMAN-driven variability becomes biased during learning, in the direction of improved song performance.
  • We have found evidence that a dopaminergic pathway to the songbird BG may carry ‘performance’ error-related information.
  • We hypothesize that the basal ganglia determine which song variations lead to better performance and bias the variability in the direction of improved performance.
  • We have proposed a testable model of basal ganglia function that explicitly incorporates an efference copy of cortically-generated motor actions.

slide-70
SLIDE 70

The Fee Lab

Current Lab Members

  • Anusha Narayan
  • Natalia Denisenko
  • Tatsuo Okubo
  • Michael Stetner
  • Emily Mackevicius
  • Galen Lynch

web.mit.edu/feelab

Former Lab Members

  • Richard Hahnloser
  • Alexay Kozhevnikov
  • Anthony Leonardo
  • Michael Long
  • Bence Ölveczky
  • Dmitriy Aronov
  • Aaron Andalman
  • Lena Veit
  • Jakob Förster
  • Liora Las
  • Jesse Goldberg
  • Yael Mandelblat

Funding:

National Institutes of Health - NIMH, NIDCD

slide-71
SLIDE 71

Separate premotor pathways for stereotyped song and variability

RA Sequence Stereotypy Precision HVC Uva Randomness Variability Exploration LMAN DLM Motor Output Subsong Adult song

X

slide-72
SLIDE 72

LMAN drives subsong

Stimulating electrode nXII LMAN Area X DLM HVC RA

slide-73
SLIDE 73

Instantaneous firing rate (Hz) 700

20 dB

Sound amplitude

250 ms

LMAN(RA) neurons exhibit premotor correlation with subsong syllables

slide-74
SLIDE 74

Instantaneous firing rate (Hz) 700

20 dB

Sound amplitude

200 ms

LMAN(RA) neurons exhibit premotor correlation with subsong syllables

slide-75
SLIDE 75

[Plot: instantaneous firing rate (Hz) of neuron 14 and sound amplitude; scale bars 20 dB, 250 ms]

[Plot: mean firing rate (Hz) vs. time relative to offset (ms)]

LMAN(RA) neurons exhibit premotor correlation with subsong acoustic structure

slide-76
SLIDE 76

Summary

  • The AFP can generate a direct premotor bias that reduces vocal errors.
  • The learning accumulated across many days of training is encoded primarily in plasticity in the motor pathway.
  • The contribution of the AFP is limited to the learning that occurred most recently (during the same day).
  • AFP bias is predictive of subsequent plasticity in the motor pathway within the next 24 hours.