COMP 546 Lecture 21: Cochlea to brain, Source Localization

SLIDE 1

COMP 546

Lecture 21 Cochlea to brain, Source Localization

Tues. April 3, 2018

SLIDE 2

Ear

Figure labels: pinna, auditory canal, cochlea; outer, middle, and inner ear.

SLIDE 3

Eye

Lens
Retina
Photoreceptors (light -> chemical)
Ganglion cells (spikes)
Optic nerve

Ear

?
?
?
?
?

SLIDE 4

Eye

Lens
Retina
Photoreceptors (light -> chemical)
Ganglion cells (spikes)
Optic nerve

Ear

Outer ear
Cochlea
Hair cells (mechanical -> chemical)
Ganglion cells (spikes)
Vestibulocochlear nerve

SLIDE 5

Basilar Membrane

BM fibres have bandpass mechanical frequency responses. In the figure, the frequency labels run from 20,000 Hz at one end of the BM to 20 Hz at the other.

SLIDE 6

Basilar Membrane: Place code (“tonotopic”)

Nerve cells (hair + ganglion) are distributed along the BM. They have similar bandpass frequency response functions. In the figure, the frequency labels run from 20,000 Hz at one end of the BM to 20 Hz at the other.

SLIDE 7

Bandpass responses (more details next lecture)

[Figure: bandpass response curves on a frequency axis running from 0 to 22,000 Hz.]

SLIDE 8

Neural coding of sound in cochlea

Basilar membrane responds to sound by vibrating.
Hair cells at each BM location release neurotransmitter that signals the BM amplitude at that location.
Ganglion cells respond to neurotransmitter signals by spiking.

SLIDE 9

Louder sound within frequency band → greater amplitude of BM vibration at that location → greater release of neurotransmitter by hair cell → greater probability of spike at each peak of filtered wave

SLIDE 10

Hair cell neurotransmitter release can signal the exact timing of BM amplitude peaks for frequencies up to ~2 kHz.

For higher frequencies, hair cells encode only the envelope of BM vibrations.

SLIDE 11

Timing of ganglion cell spikes: for frequencies up to 2 kHz (“phase locking”)

Hair cells release more neurotransmitter at BM amplitude peaks. Ganglion cells respond to neurotransmitter peaks by spiking. This allows exact timing of BM vibrations to be encoded by spikes.

[Figure: BM vibration and the resulting spikes.]
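The chain from amplitude to spike probability to spike timing can be illustrated with a toy simulation. This is only a sketch: the function name `phase_locked_spikes`, the `gain` parameter, and the rule that the spike probability at each peak equals `min(1, gain * amplitude)` are assumptions made for illustration, not something stated in the lecture.

```python
import random

def phase_locked_spikes(freq_hz, amplitude, duration_s, gain=0.5, seed=0):
    """Spike times (in seconds) of one model cell driven by a pure tone.

    Assumption, not from the slides: at each positive peak of the
    filtered wave the cell spikes with probability min(1, gain * amplitude).
    """
    rng = random.Random(seed)
    period = 1.0 / freq_hz
    p_spike = min(1.0, gain * amplitude)
    n_peaks = round(duration_s * freq_hz)
    spikes = []
    for k in range(n_peaks):
        # sin(2*pi*f*t) has its k-th positive peak a quarter period into cycle k.
        peak_time = (k + 0.25) * period
        if rng.random() < p_spike:
            spikes.append(peak_time)
    return spikes

quiet = phase_locked_spikes(500.0, amplitude=0.4, duration_s=0.1)
loud = phase_locked_spikes(500.0, amplitude=1.2, duration_s=0.1)
print(len(quiet), len(loud))
```

Every spike lands on a peak of the wave, so the intervals between spikes are whole multiples of the period. A louder tone only changes how many peaks are marked, not where the spikes fall.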

SLIDE 12

3,000 hair cells in each cochlea (left and right)
30,000 ganglion cells in each cochlea
Cochlear nerve (to brain)

Ganglion cells cannot spike faster than 500 times per second. So we need many ganglion cells for each hair cell.

SLIDE 13

“Volley” code

Different ganglion cells at the same spatial position on the BM spike at different peaks, so that together they can signal every peak.
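A small worked example of why several cells are needed: with a ceiling of 500 spikes per second, one cell can mark at most every fourth peak of a 2 kHz tone. The sketch below shares the peaks among cells in strict round-robin order, which is an illustrative assumption (real cells fire stochastically); the names `cells_needed` and `volley_assignment` are hypothetical.

```python
import math

MAX_RATE_HZ = 500.0  # ceiling on the ganglion cell spike rate, from the slides

def cells_needed(freq_hz, max_rate_hz=MAX_RATE_HZ):
    """Smallest number of cells that can jointly mark every peak."""
    return max(1, math.ceil(freq_hz / max_rate_hz))

def volley_assignment(freq_hz, n_peaks):
    """Share successive peaks among cells in round-robin order (an assumption).

    Returns one list of peak indices per cell.
    """
    n_cells = cells_needed(freq_hz)
    trains = [[] for _ in range(n_cells)]
    for k in range(n_peaks):
        trains[k % n_cells].append(k)
    return trains

trains = volley_assignment(2000.0, n_peaks=12)
for cell, peaks in enumerate(trains):
    print(cell, peaks)
```

Each cell in this example spikes on every fourth peak, that is once every 2 ms, while the group as a whole marks every peak.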

SLIDE 14

From cochlea to brain stem

cochlea → cochlear nucleus → lateral and medial superior olive (LSO, MSO) … → auditory cortex

BRAIN STEM

SLIDE 15

Tonotopic maps

[Figure labels: cochlea, auditory nerve, cochlear nucleus (CN), lateral superior olive (LSO), medial superior olive (MSO); frequency axis from high $\omega$ to low $\omega$.]

SLIDE 16

Binaural Hearing

[Diagram, mirrored for the left and right sides. Labels: CN, LSO (high $\omega$), MSO (low $\omega$).]

SLIDE 17

Binaural Hearing

[Diagram, mirrored for the left and right sides. Labels: CN, LSO (high $\omega$), MSO (low $\omega$).]

MSO combines low frequency signals.

SLIDE 18

Binaural Hearing

[Diagram, mirrored for the left and right sides. Labels: CN, LSO (high $\omega$), MSO.]

LSO combines high frequency signals.

SLIDE 19

Levels of Analysis

what is the task? what problem is being solved? (high level)
brain areas and pathways
neural coding
neural mechanisms (low level)

SLIDE 20

For high frequency bands,

the head casts a shadow
the timing of the peaks cannot be accurately coded by the spikes (only the rate of spikes is informative)

For low frequency bands,

the head casts a weak shadow only
the timing of the peaks can be encoded by spikes

SLIDE 21

Duplex theory of binaural hearing

(Rayleigh, 1907)

level differences computed for higher frequencies (ILD: interaural level differences)

timing differences computed for lower frequencies (ITD: interaural timing differences)

SLIDE 22

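Level differences (high frequencies)

[Diagram, mirrored for the left and right sides. Labels: CN, LSO (high $\omega$), MSO; each LSO receives one excitatory (+) and one inhibitory (−) input.]

Excitatory input comes from the ear on the same side. Inhibitory input comes from the ear on the opposite side.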
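One simple way to picture this opponent wiring is a rectified difference of the two input levels. The sketch below is an assumption for illustration only: the slides say which input is excitatory and which is inhibitory, but not how they are combined, and the names `lso_response` and `lso_pair` are hypothetical.

```python
def lso_response(same_side_db, opposite_side_db, gain=1.0):
    """Excitation from the same side minus inhibition from the opposite side.

    The subtraction followed by rectification at zero is an assumed model.
    """
    return max(0.0, gain * (same_side_db - opposite_side_db))

def lso_pair(left_level_db, right_level_db):
    """Responses of the left and right LSO to a pair of ear levels."""
    left_lso = lso_response(left_level_db, right_level_db)
    right_lso = lso_response(right_level_db, left_level_db)
    return left_lso, right_lso

print(lso_pair(60.0, 50.0))  # source louder at the left ear
print(lso_pair(50.0, 60.0))  # source louder at the right ear
```

In this toy model only the LSO on the louder side responds, and its response grows with the level difference.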

SLIDE 23

Timing differences (low frequencies)

[Diagram, mirrored for the left and right sides. Labels: CN, LSO (high $\omega$), MSO (low $\omega$); each MSO receives excitatory (+) input from both sides.]

Sum excitatory input from both sides. Reminiscent of binocular complex cells in V1?

SLIDE 24

Jeffress Model (1948) for timing differences

http://auditoryneuroscience.com/topics/jeffress-model-animation

[Diagram: detectors A to E, with one input line from the left ear and one from the right ear.]

SLIDE 25

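Spike timing precision required for Jeffress model?

[Diagram: detectors A to E, with one input line from the left ear and one from the right ear.]

$$ \text{distance} = \frac{1}{10}\ \text{millimetres} $$

$$ \text{speed of spike} = 10\ \text{metres second}^{-1} $$

$$ \Longrightarrow\ \Delta\,\text{time} = \frac{\text{distance}}{\text{speed}} = \frac{1}{100}\ \text{millisecond} $$

See Exercises 19 Q2c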
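The arithmetic above can be turned into a toy delay-line model. The sketch below makes two assumptions that the slide does not state: the distance is taken to be the spacing between neighbouring detectors, and the left-ear spike is taken to enter at A and travel towards E while the right-ear spike enters at E and travels towards A. The names `arrival_mismatch` and `best_detector` are hypothetical.

```python
SPACING_M = 1e-4        # 1/10 millimetre, from the slide; assumed detector spacing
SPEED_M_PER_S = 10.0    # spike conduction speed, from the slide
STEP_S = SPACING_M / SPEED_M_PER_S   # extra travel time per detector
LABELS = "ABCDE"

def arrival_mismatch(k, t_left, t_right):
    """Gap between the two spikes' arrival times at detector k.

    Assumed geometry: the left-ear spike enters at A and travels towards E,
    the right-ear spike enters at E and travels towards A.
    """
    n = len(LABELS)
    left_arrival = t_left + k * STEP_S
    right_arrival = t_right + (n - 1 - k) * STEP_S
    return abs(left_arrival - right_arrival)

def best_detector(t_left, t_right):
    """Label of the detector where the two spikes arrive most nearly together."""
    k = min(range(len(LABELS)), key=lambda i: arrival_mismatch(i, t_left, t_right))
    return LABELS[k]

print(STEP_S * 1e3, "ms per detector")
print(best_detector(0.0, 0.0))
print(best_detector(0.0, 2 * STEP_S))
```

With this geometry, moving the coincidence by one detector changes the encoded interaural delay by two steps, because one path gets longer while the other gets shorter. Spike times would therefore need to be reliable to roughly a hundredth of a millisecond for neighbouring detectors to be told apart.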

SLIDE 26

Jeffress model remains controversial. It is not known exactly how “coincidence detection” occurs in MSO.

Coincidence detection for each low frequency band

[Figure: one row of detectors A to E for each frequency band, with the bands arranged along the $\omega$ axis.]

SLIDE 27

Naïve Computational Model of Source Localization (Recall lecture 20)

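$$ I_l(t) = \alpha\, I_r(t - \tau) + \epsilon(t) $$

where $\alpha$ models the shadow, $\tau$ is the delay, and $\epsilon(t)$ is the model error.

Find the $\alpha$ and $\tau$ that minimize

$$ \sum_{t=1}^{T} \{ I_l(t) - \alpha\, I_r(t - \tau) \}^2 $$

where $\tau < 0.5$ ms.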
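A minimal sketch of this minimization for sampled signals, assuming the delay is a whole number of samples. For each candidate lag the best shadow factor has a closed form (the least-squares ratio of two sums), so only the lag needs to be searched. The function name `fit_shadow_and_delay` and the synthetic test signal are illustrative assumptions.

```python
import random

def fit_shadow_and_delay(i_left, i_right, max_lag):
    """Grid search over integer lags with a closed-form alpha per lag.

    Returns (alpha, lag, squared_error) minimising
    sum over t of (i_left[t] - alpha * i_right[t - lag]) ** 2.
    """
    n = len(i_left)
    # Use the same window of t for every lag so the errors are comparable.
    ts = range(max_lag, n - max_lag)
    best = None
    for lag in range(-max_lag, max_lag + 1):
        num = sum(i_left[t] * i_right[t - lag] for t in ts)
        den = sum(i_right[t - lag] ** 2 for t in ts)
        if den == 0.0:
            continue
        alpha = num / den
        err = sum((i_left[t] - alpha * i_right[t - lag]) ** 2 for t in ts)
        if best is None or err < best[2]:
            best = (alpha, lag, err)
    return best

# Synthetic example: the left signal is a weaker, delayed copy of the right.
rng = random.Random(1)
i_right = [rng.uniform(-1.0, 1.0) for _ in range(200)]
true_alpha, true_lag = 0.5, 3
i_left = [true_alpha * i_right[t - true_lag] if t >= true_lag else 0.0
          for t in range(200)]
alpha, lag, err = fit_shadow_and_delay(i_left, i_right, max_lag=8)
print(alpha, lag, err)
```

The lag is in samples, so the 0.5 ms bound corresponds to a `max_lag` of 0.0005 times the sampling rate.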

SLIDE 28

Timing difference: find the $\tau$ that maximizes

$$ \sum_{t} I_l(t)\, I_r(t - \tau) . $$

Level difference:

$$ 10 \log_{10} \frac{\sum_{t=1}^{T} I_l(t)^2}{\sum_{t=1}^{T} I_r(t)^2} $$
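The two quantities on this slide can be computed separately, as in the sketch below. It assumes integer-sample lags and uses hypothetical function names; the synthetic signals are made up for the example.

```python
import math
import random

def itd_by_cross_correlation(i_left, i_right, max_lag):
    """Integer lag that maximises sum over t of i_left[t] * i_right[t - lag]."""
    n = len(i_left)
    ts = range(max_lag, n - max_lag)

    def corr(lag):
        return sum(i_left[t] * i_right[t - lag] for t in ts)

    return max(range(-max_lag, max_lag + 1), key=corr)

def level_difference_db(i_left, i_right):
    """10 log10 of the ratio of summed squared intensities, left over right."""
    e_left = sum(x * x for x in i_left)
    e_right = sum(x * x for x in i_right)
    return 10.0 * math.log10(e_left / e_right)

rng = random.Random(2)
right = [rng.uniform(-1.0, 1.0) for _ in range(200)]
delayed_left = [right[t - 3] if t >= 3 else 0.0 for t in range(200)]
louder_left = [2.0 * x for x in right]

print(itd_by_cross_correlation(delayed_left, right, max_lag=8))
print(level_difference_db(louder_left, right))
```

Doubling the amplitude at one ear gives a level difference of about 6 dB, since the energy ratio is 4.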

SLIDE 29

For each low frequency band $j$, find the $\tau$ that maximizes

$$ \sum_{t} I^j_{\mathrm{left}}(t)\, I^j_{\mathrm{right}}(t - \tau) $$

(or use a summation model similar to binocular cells, or the Jeffress model).

An estimated value of delay $\tau$ in frequency band $j$ is consistent with various possible source directions $(\phi, \theta)$.

Similar to cone of confusion, but more general because of frequency dependence.

SLIDE 30

For each high frequency band $j$, compute the interaural level difference (ILD):

$$ \mathrm{ILD}^j = 10 \log_{10} \frac{\sum_{t=1}^{T} I^j_{\mathrm{left}}(t)^2}{\sum_{t=1}^{T} I^j_{\mathrm{right}}(t)^2} $$

What does each $\mathrm{ILD}^j$ tell us?
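A per-band level difference can be sketched with a discrete Fourier transform, taking each band filter to be an ideal brick-wall filter over a range of frequency bins. That filter shape is an assumption made here for simplicity (the lecture's band filters are bandpass but not specified), and the names `dft`, `band_energy`, and `ild_db` are hypothetical.

```python
import cmath
import math

def dft(x):
    """Plain O(n^2) discrete Fourier transform, enough for a short example."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def band_energy(x, k_lo, k_hi):
    """Energy of x in DFT bins k_lo..k_hi and their mirror-image bins.

    For an ideal brick-wall band filter (an assumption), this equals the
    sum over t of the squared band-filtered signal.
    """
    n = len(x)
    spectrum = dft(x)
    bins = set(range(k_lo, k_hi + 1))
    bins |= {(n - k) % n for k in bins}
    return sum(abs(spectrum[k]) ** 2 for k in bins) / n

def ild_db(left, right, k_lo, k_hi):
    """Interaural level difference in dB within one band, left over right."""
    return 10.0 * math.log10(band_energy(left, k_lo, k_hi) /
                             band_energy(right, k_lo, k_hi))

# Made-up signals: the high-frequency component is halved at the left ear.
n = 64
low = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
high = [math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
right = [a + b for a, b in zip(low, high)]
left = [a + 0.5 * b for a, b in zip(low, high)]

print(ild_db(left, right, 1, 5))
print(ild_db(left, right, 18, 22))
```

In this example the low band shows a level difference near 0 dB and the high band about -6 dB, so each band reports how strongly the head attenuated that band at one ear relative to the other.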

SLIDE 31

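Recall head related impulse response function (HRIR) from last lecture. If the source direction is $(\phi, \theta)$, and $g_j(t)$ is the filter for band $j$, then…

$$ I^j_{\mathrm{left}}(t; \phi, \theta) = g_j(t) * h_{\mathrm{left}}(t; \phi, \theta) * I_{\mathrm{src}}(t; \phi, \theta) $$

$$ I^j_{\mathrm{right}}(t; \phi, \theta) = g_j(t) * h_{\mathrm{right}}(t; \phi, \theta) * I_{\mathrm{src}}(t; \phi, \theta) $$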

SLIDE 32

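Take the Fourier transform and apply convolution theorem (hats denote Fourier transforms):

$$ \hat{I}^j_{\mathrm{left}}(\omega; \phi, \theta) = \hat{g}_j(\omega)\, \hat{h}_{\mathrm{left}}(\omega; \phi, \theta)\, \hat{I}_{\mathrm{src}}(\omega; \phi, \theta) $$

$$ \hat{I}^j_{\mathrm{right}}(\omega; \phi, \theta) = \hat{g}_j(\omega)\, \hat{h}_{\mathrm{right}}(\omega; \phi, \theta)\, \hat{I}_{\mathrm{src}}(\omega; \phi, \theta) $$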

SLIDE 33

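Take the Fourier transform and apply convolution theorem:

$$ \hat{I}^j_{\mathrm{left}}(\omega; \phi, \theta) = \hat{g}_j(\omega)\, \hat{h}_{\mathrm{left}}(\omega; \phi, \theta)\, \hat{I}_{\mathrm{src}}(\omega; \phi, \theta) $$

$$ \hat{I}^j_{\mathrm{right}}(\omega; \phi, \theta) = \hat{g}_j(\omega)\, \hat{h}_{\mathrm{right}}(\omega; \phi, \theta)\, \hat{I}_{\mathrm{src}}(\omega; \phi, \theta) $$

If there is just one source direction $(\phi, \theta)$, then for each frequency $\omega$ within band $j$:

$$ \frac{\hat{I}^j_{\mathrm{left}}(\omega)}{\hat{I}^j_{\mathrm{right}}(\omega)} \approx \frac{\hat{h}_{\mathrm{left}}(\omega; \phi, \theta)}{\hat{h}_{\mathrm{right}}(\omega; \phi, \theta)} $$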

SLIDE 34

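One can show using Parseval’s theorem of Fourier transforms that if $\hat{h}_{\mathrm{left}}(\omega; \phi, \theta)$ and $\hat{h}_{\mathrm{right}}(\omega; \phi, \theta)$ are approximately constant within band $j$, then:

$$ \frac{\sum_{t=1}^{T} I^j_{\mathrm{left}}(t)^2}{\sum_{t=1}^{T} I^j_{\mathrm{right}}(t)^2} \approx \frac{|\hat{h}^j_{\mathrm{left}}(\phi, \theta)|^2}{|\hat{h}^j_{\mathrm{right}}(\phi, \theta)|^2} $$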

SLIDE 35

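One can show using Parseval’s theorem of Fourier transforms that if $\hat{h}_{\mathrm{left}}(\omega; \phi, \theta)$ and $\hat{h}_{\mathrm{right}}(\omega; \phi, \theta)$ are approximately constant within band $j$, then:

$$ \frac{\sum_{t=1}^{T} I^j_{\mathrm{left}}(t)^2}{\sum_{t=1}^{T} I^j_{\mathrm{right}}(t)^2} \approx \frac{|\hat{h}^j_{\mathrm{left}}(\phi, \theta)|^2}{|\hat{h}^j_{\mathrm{right}}(\phi, \theta)|^2} $$

The ear can measure the ratio on the left… and can infer source directions $(\phi, \theta)$ that are consistent with it.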
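The Parseval step can be spelled out as follows. This is a sketch under stated assumptions: a discrete Fourier transform of length $T$ with the $1/T$ factor placed on the frequency-domain sum, hats denoting Fourier transforms, and $\hat{h}^j_{\mathrm{left}}(\phi, \theta)$ denoting the approximately constant value of $\hat{h}_{\mathrm{left}}(\omega; \phi, \theta)$ within band $j$.

```latex
\begin{align*}
\sum_{t=1}^{T} I^j_{\mathrm{left}}(t)^2
  &= \frac{1}{T} \sum_{\omega} \bigl| \hat{I}^j_{\mathrm{left}}(\omega) \bigr|^2
     && \text{(Parseval)} \\
  &= \frac{1}{T} \sum_{\omega} |\hat{g}_j(\omega)|^2 \,
     |\hat{h}_{\mathrm{left}}(\omega; \phi, \theta)|^2 \,
     |\hat{I}_{\mathrm{src}}(\omega)|^2
     && \text{(convolution theorem)} \\
  &\approx |\hat{h}^j_{\mathrm{left}}(\phi, \theta)|^2 \;
     \frac{1}{T} \sum_{\omega} |\hat{g}_j(\omega)|^2 \, |\hat{I}_{\mathrm{src}}(\omega)|^2
     && \text{($\hat{h}_{\mathrm{left}}$ roughly constant where $\hat{g}_j \neq 0$)}
\end{align*}
```

The same three lines hold with "right" in place of "left". The factor $\frac{1}{T} \sum_{\omega} |\hat{g}_j(\omega)|^2 |\hat{I}_{\mathrm{src}}(\omega)|^2$ is common to both ears, so it cancels in the ratio and only the two squared HRTF magnitudes remain.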

SLIDE 36

https://auditoryneuroscience.com/topics/acoustic-cues-sound-location

Interaural Level Difference (dB) as a function of $(\phi, \theta)$ for two fixed $\omega$: 700 Hz and 11,000 Hz.

Each iso-contour in each frequency band is consistent with a measured level difference (dB).

SLIDE 37

Monaural spectral cues (Spatial localization with one ear?)

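$$ I^j(t; \phi, \theta) = g_j(t) * h(t; \phi, \theta) * I_{\mathrm{src}}(t; \phi, \theta) $$

$$ \hat{I}^j(\omega; \phi, \theta) = \hat{g}_j(\omega)\, \hat{h}^j(\omega; \phi, \theta)\, \hat{I}_{\mathrm{src}}(\omega; \phi, \theta) $$

If the source is noise, then all frequencies make the same contribution on average. The pattern of peaks and notches across bands will be due to the HRTF, not to the source.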

SLIDE 38

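HRTF from last lecture, e.g. medial plane (azimuth $\theta = 0$)

“Pinnal notch” frequency varies with elevation of source, e.g. in the medial plane.

$$ \hat{I}^j(\omega; \phi, \theta) = \hat{g}_j(\omega)\, \hat{h}^j(\omega; \phi, \theta)\, \hat{I}_{\mathrm{src}}(\omega; \phi, \theta) $$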

SLIDE 39

Levels of Analysis

what is the task? what problem is being solved? (high level)

Source localization using level and timing differences within frequency channels.

brain areas and pathways

(cochlea to CN to MSO and LSO in the brainstem)

neural coding

(gave sketch only)

neural mechanisms (low level)