10/20/20
Primer on Auditory Processing
Mounya Elhilali
Department of Electrical & Computer Engineering
Johns Hopkins University
mounya@jhu.edu
601.467/667 Introduction to Human Language Technology
Speech as waves
the wave. A given molecule vibrates back and forth about a fixed location.
[Figure: sound pressure (high, normal, low) as a function of time]
point in space
– denoted in cycles per second (cps) or hertz (Hz).
Period of oscillation
Amplitude
Frequency (F)
Wavelength (λ)
Period (T)
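
The quantities listed above are linked by two standard relations: T = 1/F and λ = c/F, where c is the speed of sound. A minimal sketch, assuming c = 343 m/s (dry air at roughly 20 °C); the function names and test frequencies are illustrative and not from the slides:

```python
# Assumed speed of sound: dry air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_PER_S = 343.0


def period_s(frequency_hz: float) -> float:
    """Period T (seconds) of an oscillation at frequency F (Hz): T = 1 / F."""
    return 1.0 / frequency_hz


def wavelength_m(frequency_hz: float,
                 speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Wavelength (meters) of a wave at frequency F: lambda = c / F."""
    return speed_m_per_s / frequency_hz


# Illustrative frequencies spanning much of the audible range.
for f in (100.0, 1000.0, 10000.0):
    print(f"F = {f:7.0f} Hz   T = {period_s(f) * 1e3:7.3f} ms   "
          f"λ = {wavelength_m(f):.4f} m")
```

Higher frequencies give shorter periods and shorter wavelengths; amplitude does not enter either relation.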
Physical Properties of Sound    Perceptual Dimensions
Amplitude/Intensity             Loudness
Frequency                       Pitch
Complexity                      Timbre (frequency content & time)
Note: Listening to loud music will gradually damage your hearing!
Each contour connects tones that are perceived as equally loud
Loudness (dB)
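
Levels in dB are a logarithmic measure of sound pressure relative to a reference. A minimal sketch of the usual sound pressure level (SPL) formula, assuming the conventional reference of 20 µPa in air; note that this is a physical level, not perceived loudness, which also depends on frequency. The function name and example pressures are illustrative:

```python
import math

# Conventional reference pressure for SPL in air: 20 micropascals.
P_REF_PA = 20e-6


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB SPL: 20 * log10(p / p_ref)."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


# Illustrative RMS pressures: each factor of 10 in pressure adds 20 dB.
for p in (20e-6, 2e-3, 2.0):
    print(f"p = {p:.6f} Pa  ->  {spl_db(p):6.1f} dB SPL")
```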
a simple periodic signal is determined by its frequency.
vocal cords) naturally oscillate at a fundamental frequency (F0) as well as at its integer multiples (called harmonics/partials/overtones).
signal is often determined by its fundamental frequency (F0)
[Figure: harmonics 1f, 2f (1 octave), 3f, 4f (2 octaves), 8f (3 octaves)]
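
The harmonic series and its octave spacing can be sketched as follows: harmonics are integer multiples of the fundamental, and each doubling of frequency is one octave, so the distance in octaves is log2(f / f0). The fundamental of 110 Hz and the function names are illustrative assumptions:

```python
import math


def harmonics(f0_hz: float, count: int) -> list:
    """First `count` harmonics of a fundamental: f0, 2*f0, 3*f0, ..."""
    return [k * f0_hz for k in range(1, count + 1)]


def octaves_above(f_hz: float, f0_hz: float) -> float:
    """Distance from f0 to f in octaves (one octave = a doubling)."""
    return math.log2(f_hz / f0_hz)


f0 = 110.0  # hypothetical fundamental frequency in Hz
for k, f in enumerate(harmonics(f0, 8), start=1):
    print(f"{k}f = {f:6.1f} Hz   {octaves_above(f, f0):.2f} octaves above f")
```

Only the harmonics 2f, 4f, and 8f fall on whole octaves; 3f, for example, lies about 1.58 octaves above the fundamental.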
– Mel-scaling is used in signal processing to build filters that approximate human pitch perception (MFCC)
It’s a relative scale, based on pitch comparisons
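
A minimal sketch of a Hz-to-mel conversion, assuming one widely used closed-form variant, mel = 2595 · log10(1 + f/700); other variants exist, and the slides do not say which one is intended. The function names and the choice of 10 filter centers over 0-8000 Hz are illustrative:

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to mel with the 2595 * log10(1 + f/700) variant."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_centers(f_low_hz: float, f_high_hz: float, count: int) -> list:
    """Center frequencies (Hz) spaced uniformly on the mel axis."""
    lo, hi = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (hi - lo) / (count + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(count)]


# Centers crowd together at low frequencies and spread out at high ones.
for fc in mel_spaced_centers(0.0, 8000.0, 10):
    print(f"{fc:8.1f} Hz")
```

Spacing filter centers uniformly in mel rather than in Hz is what gives MFCC-style filterbanks finer resolution at low frequencies.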
another sound
interference in sound perception
then are transformed into neural firings
from the outside world into a signal of nerve impulses sent to the brain.
– The external ear plays the role of an acoustic antenna
– It diffracts and focuses sound waves (pinna), while the ear canal acts as a resonator => amplifies sounds in the 2-5 kHz range
– The end of the canal has an eardrum, which vibrates with sound
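
The 2-5 kHz range quoted above is consistent with a simple textbook idealization: treat the ear canal as a tube open at one end and closed by the eardrum at the other, which resonates near f = c / (4L). A rough sketch, assuming a canal length of about 2.5 cm and c = 343 m/s; both values and the function name are assumptions, and a real ear canal is not a uniform rigid tube:

```python
# Assumed values for the idealized model (not from the slides).
SPEED_OF_SOUND_M_PER_S = 343.0   # air at about 20 degrees Celsius
EAR_CANAL_LENGTH_M = 0.025       # roughly 2.5 cm


def quarter_wave_resonance_hz(length_m: float,
                              speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """First resonance of a tube closed at one end: f = c / (4 * L)."""
    return speed_m_per_s / (4.0 * length_m)


f_res = quarter_wave_resonance_hz(EAR_CANAL_LENGTH_M)
print(f"Estimated ear canal resonance: {f_res:.0f} Hz")
```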
– Eardrum (or tympanic membrane) vibrations cause mechanical motion of the bones of the middle ear (malleus, incus & stapes) [the 3 smallest bones in the human body]
– The middle ear acts as an impedance adapter to adjust for the energy difference between the air environment and the fluid environment
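
To see why an impedance adapter is needed, consider how little sound energy crosses directly from air into fluid. A rough sketch using the standard normal-incidence formula for the transmitted intensity fraction, T = 4·Z1·Z2 / (Z1 + Z2)², and assuming textbook impedances for air (about 413 Pa·s/m) and water (about 1.48 × 10⁶ Pa·s/m) as a stand-in for the inner-ear fluid; these values and names are assumptions, not from the slides:

```python
import math

# Assumed characteristic acoustic impedances (Pa*s/m).
Z_AIR = 413.0        # air at about 20 degrees Celsius
Z_FLUID = 1.48e6     # water, used as a stand-in for inner-ear fluid


def transmitted_fraction(z1: float, z2: float) -> float:
    """Fraction of incident intensity transmitted across a boundary."""
    return 4.0 * z1 * z2 / (z1 + z2) ** 2


t = transmitted_fraction(Z_AIR, Z_FLUID)
loss_db = 10.0 * math.log10(t)
print(f"Transmitted fraction: {t:.5f}  ({loss_db:.1f} dB)")
```

Under these assumptions only about 0.1% of the energy would be transmitted without some form of impedance matching, a loss of roughly 30 dB.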
physical vibrations into electrical signals for the brain to process
frequency analyzer
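
As a loose analogy for what a frequency analyzer does, the sketch below splits a two-tone signal into its component frequencies with a naive discrete Fourier transform. This is not a model of the cochlea; the sample rate, tone frequencies, and function name are illustrative assumptions:

```python
import cmath
import math


def dft_magnitudes(samples: list) -> list:
    """Magnitudes of the first N/2 DFT bins (naive O(N^2) transform)."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags


SAMPLE_RATE_HZ = 8000
N = 400                      # 50 ms of signal, so bins are 20 Hz apart
TONES_HZ = (440.0, 1000.0)   # hypothetical mixture of two pure tones

signal = [sum(math.sin(2 * math.pi * f * i / SAMPLE_RATE_HZ) for f in TONES_HZ)
          for i in range(N)]

mags = dft_magnitudes(signal)
# Pick the two strongest bins and convert bin indices back to Hz.
peak_bins = sorted(range(len(mags)), key=mags.__getitem__, reverse=True)[:2]
peak_freqs = sorted(k * SAMPLE_RATE_HZ / N for k in peak_bins)
print("Strongest components (Hz):", peak_freqs)
```

The mixture looks like a single complicated waveform in time, but the analysis recovers the two tones as separate peaks along the frequency axis.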
inner ear organ that converts sound waves into neural signals.
are passed to the brain via the auditory nerve.
– Very complex. Just some major pathways shown.
– Extensive binaural interactions
– General principle:
  – Increasing complexity
vision, touch)
FUNCTION
– Identify and process complex sounds
– Principal relay to cortex
– Form full spatial map
– Locate sound sources in space
– Start sound feature processing
– Sound sensor / periphery
sound is processed
A1 are tonotopically organized (inherit cochleotopy from periphery)
the one observed in the cochlea.
[Figure panels: Cochlea, A1]
[Figure: Range of temporal modulations (3000 Hz, 300 Hz, 30 Hz; fast, medium, slow) across auditory nerve, midbrain, and cortex. Labeled nuclei: AVCN, PVCN, DCN, TB, LL, NLL, IC, MGB]
Example of good decomposition… A non-trivial task