Topic 4: Pitch & Frequency

SLIDE 1

EECS 352: Machine Perception of Music and Audio Bryan Pardo 2008

Topic 4

Pitch & Frequency

(Some slides are adapted from Zhiyao Duan’s course slides on Computer Audition and Its Applications in Music)

SLIDE 2

A musical interlude

  • KOMBU
    – This solo by Kaigal-ool of Huun-Huur-Tu (accompanying himself on doshpuluur) demonstrates perfectly the characteristic sound of the Xorekteer voice
    – An example of Tuvan throat-singing, or Khoomei

SLIDE 3

The Cochlea

  • Each point on the basilar membrane resonates to a particular frequency
  • At the resonance point, the membrane moves
SLIDE 4

Thanks to Oarih Ropshkow

Cross section of Cochlea

Inner hair cells

SLIDE 5

Frequency Sensitivity

  • single nerve measurements

Basilar Membrane Width

SLIDE 6

We decompose sounds into sines

[Figure: a complex sound (input) enters the peripheral auditory system (cochlea, auditory nerve), which outputs sine waves at 220 Hz, 660 Hz, and 1100 Hz.]
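The decomposition on this slide can be imitated numerically. The sketch below is only an illustration, not a model of the cochlea: it builds a complex sound from the three frequencies named on the slide, then recovers them with a naive discrete Fourier transform. The sample rate, duration, amplitudes, and all function names are assumptions chosen for this example.

```python
import cmath
import math

SAMPLE_RATE = 4000  # Hz (assumed for this sketch)
N_SAMPLES = 400     # 0.1 s of audio, so DFT bins fall every 10 Hz


def synthesize(freqs_hz, amps):
    """Sum sine waves with the given frequencies (Hz) and amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * f * n / SAMPLE_RATE)
            for f, a in zip(freqs_hz, amps))
        for n in range(N_SAMPLES)
    ]


def dft_magnitudes(signal):
    """Magnitudes of a naive O(N^2) DFT for bins 0..N/2."""
    n_total = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * n / n_total)
                for n, x in enumerate(signal)))
        for k in range(n_total // 2 + 1)
    ]


def strongest_components(signal, n_peaks):
    """Frequencies (Hz) of the n_peaks largest DFT bins, in ascending order."""
    mags = dft_magnitudes(signal)
    top_bins = sorted(range(len(mags)), key=lambda k: mags[k])[-n_peaks:]
    return sorted(k * SAMPLE_RATE / len(signal) for k in top_bins)


# Input: complex sound. Output: the sine-wave frequencies it contains.
complex_sound = synthesize([220, 660, 1100], [1.0, 0.6, 0.4])
components = strongest_components(complex_sound, 3)
print(components)
```

The frequencies were chosen to land exactly on DFT bins; with arbitrary frequencies the energy would leak into neighboring bins and a windowed FFT would be the more usual tool.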

SLIDE 7

Masking

  • A loud tone masks perception of tones at nearby frequencies

1000 Hz 1000_975_20dB 1000_975_6dB 1000_475_20dB

SLIDE 8

Critical Band

  • Critical band – the frequency range over which a pure tone interferes with perception of other pure tones
  • Critical bands get wider as frequency increases
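The slide gives no formula for critical bandwidth. As a rough illustration of the widening, the sketch below uses the equivalent rectangular bandwidth (ERB) approximation of Glasberg and Moore (1990), a related but not identical measure of auditory filter width. Bringing it in here is an assumption of this example, not part of the lecture, and the function name is hypothetical.

```python
def erb_hz(center_hz):
    """Equivalent rectangular bandwidth (Hz) at a center frequency in Hz.

    Glasberg & Moore (1990) approximation; used here only as a stand-in
    for critical bandwidth, which the slides do not quantify.
    """
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


# Bandwidth grows with center frequency.
for center in (250, 1000, 4000):
    print(center, round(erb_hz(center), 1))
```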
SLIDE 9

More Critical Bands

SLIDE 10

Coding frequency information (a simplified story)

  • Frequencies under 5 kHz
    – Individual harmonics are resolved by the cochlea
    – Coded by place (which nerve bundles along the cochlea are firing)
    – Coded by time (nerves fire in synchrony to harmonics)
  • Frequencies over 5 kHz
    – Individual harmonics can’t be resolved by the inner ear; the frequency is revealed by temporal modulations of the waveform amplitude (resulting in synched neuron activity)

SLIDE 11

Pitch (ANSI 1994 Definition)

  • That attribute of auditory sensation in terms of which sounds may be ordered on a scale extending from low to high. Pitch depends mainly on the frequency content of the sound stimulus, but also depends on the sound pressure and waveform of the stimulus.

SLIDE 12

Pitch (Operational)

  • A sound has a certain pitch if it can be reliably matched to a sine tone of a given frequency at 40 dB SPL

SLIDE 13

Mel Scale

  • A perceptual scale of pitches judged by listeners to be equal in distance from one another. The reference point between this scale and normal frequency measurement is defined by equating a 1000 Hz tone, 40 dB SPL, with a pitch of 1000 mels.

SLIDE 14

Mel Scale

$$\text{Mel} = 2595 \log_{10}\left(1 + \frac{f}{700}\right)$$

where $f$ is the frequency in Hz.
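The formula on this slide translates directly into code. The sketch below is a minimal version: `hz_to_mel` follows the slide, while `mel_to_hz` is the algebraic inverse, which the slide does not state. Both function names are chosen for this example.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels using the formula on the slide."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel (derived by solving the formula for frequency)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# The reference point: a 1000 Hz tone sits very close to 1000 mels.
print(round(hz_to_mel(1000.0), 2))
# Equal steps in Hz shrink in mels as frequency rises.
print(round(hz_to_mel(1500.0) - hz_to_mel(1000.0), 1),
      round(hz_to_mel(4500.0) - hz_to_mel(4000.0), 1))
```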

SLIDE 15

Mel Scale

  • Above about 500 Hz, larger and larger intervals are judged by listeners to produce equal pitch increments.
  • The name mel comes from the word melody to indicate that the scale is based on pitch comparisons.
  • Proposed by Stevens, Volkmann and Newman (Journal of the Acoustical Society of America 8(3), pp. 185-190, 1937)

SLIDE 16

Ear Craziness

  • Binaural Diplacusis
    – Left ear hears a different pitch from the right.
    – Can be up to 4% difference in perceived pitch
  • Otoacoustic Emissions
    – Ears sometimes make noise.
    – Thought to be a by-product of the sound amplification system in the inner ear.
    – Caused by activity of the outer hair cells in the cochlea.

SLIDE 17

Harmonic Sound

  • A complex sound with strong sinusoid components at integer multiples of a fundamental frequency. These components are called harmonics, overtones, or partials
  • Sine waves and harmonic sounds are the sounds that may give a perception of “pitch”
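A harmonic sound as defined here can be synthesized by adding sinusoids at integer multiples of a fundamental. The sketch below is one minimal way to do that; the sample rate, duration, amplitude values, and the function name are assumptions for illustration.

```python
import math


def harmonic_tone(f0_hz, amplitudes, sample_rate=8000, duration=0.05):
    """Sum sinusoids at integer multiples of f0_hz.

    amplitudes[k - 1] scales harmonic k, so amplitudes[0] is the fundamental.
    Returns a list of samples.
    """
    n_samples = round(sample_rate * duration)
    return [
        sum(a * math.sin(2 * math.pi * k * f0_hz * n / sample_rate)
            for k, a in enumerate(amplitudes, start=1))
        for n in range(n_samples)
    ]


# Four harmonics of 220 Hz (220, 440, 660, 880 Hz) with decreasing strength.
tone = harmonic_tone(220.0, [1.0, 0.5, 0.33, 0.25])
print(len(tone))
```

Because every component is a multiple of the fundamental, the summed waveform repeats once per fundamental period, whatever the individual amplitudes are.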

SLIDE 18

Continuity of Sounds

  • Sine wave
  • Strongly harmonic (Flute)
  • Somewhat harmonic (Me)
  • Not very harmonic (Vacuum cleaner)
  • Absolutely not harmonic (White noise)
SLIDE 19

Classify Sounds by Harmonicity

  • Sine wave
  • Strongly harmonic

Oboe Clarinet

SLIDE 20

Classify Sounds by Harmonicity

  • Somewhat harmonic (quasi-harmonic)

Marimba

[Figure: waveform (Amplitude vs. Time in ms) and spectrum (Magnitude in dB vs. Frequency in Hz)]

Human voice

SLIDE 21

Classify Sounds by Harmonicity

  • Inharmonic

Gong

SLIDE 22

Frequency (often) equals pitch

  • Complex tones
    – Strongest frequency?
    – Lowest frequency?
    – Something else?
  • Let’s listen and explore…
SLIDE 23

Hypothesis

  • Pitch is determined by the lowest strong frequency component in a complex tone.

SLIDE 24

The Missing Fundamental

[Figure: plot with axes Time and Frequency (linear)]

SLIDE 25

Hypothesis

  • Pitch is determined by the lowest strong frequency component in a complex tone.
  • The case of the missing fundamental proves that ain’t always so.
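The missing-fundamental case can be reproduced numerically. The sketch below builds a tone from harmonics 2 through 5 of 200 Hz, confirms there is essentially no energy at 200 Hz, and then shows that the waveform still repeats every 1/200 s by picking the strongest autocorrelation lag. Autocorrelation is used here only as a convenient periodicity measure; it is an assumption of this example, not a claim about how the ear works. The sample rate, search range, and function names are also assumptions.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (assumed)
N_SAMPLES = 800     # 0.1 s


def tone(freqs_hz):
    """Equal-amplitude sum of sine waves at the given frequencies."""
    return [sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in freqs_hz)
            for n in range(N_SAMPLES)]


def magnitude_at(signal, freq_hz):
    """Magnitude of the signal's projection onto a complex sinusoid at freq_hz."""
    return abs(sum(x * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
                   for n, x in enumerate(signal)))


def periodicity_hz(signal, min_hz=50, max_hz=500):
    """Repetition rate (Hz) whose lag maximizes the raw autocorrelation."""
    min_lag = SAMPLE_RATE // max_hz
    max_lag = SAMPLE_RATE // min_hz

    def autocorr(lag):
        return sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return SAMPLE_RATE / best_lag


# Harmonics 2..5 of 200 Hz; the 200 Hz fundamental itself is absent.
signal = tone([400, 600, 800, 1000])
print(magnitude_at(signal, 200) < 1e-6)  # no component at 200 Hz
print(periodicity_hz(signal))            # yet the waveform repeats at 200 Hz
```

The lowest component present is 400 Hz, so the "lowest strong component" hypothesis would predict a pitch an octave above the rate at which the waveform actually repeats.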

SLIDE 26

Hypothesis

  • Pitch is determined by the strongest frequency component in a harmonic tone.
  • Tuvan throat singing seems to back this up.
  • But what about that case of the missing fundamental?

SLIDE 27

Hypothesis – “It’s complicated”

  • We hear which frequency components are loudest
  • We decide if they all go together
    – Do they all start together?
    – Do they modulate together?
  • We hear how they are spaced in frequency
    – Are they all spaced at intervals which are multiples of a common frequency?
    – Are their frequencies multiples of the same common frequency?
  • We hear (or don’t hear) a pitch.
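The spacing question (are the component frequencies multiples of one common frequency?) can be posed as a small search. The sketch below is a crude illustration covering only that one step, not a model of pitch perception: it tries the lowest component divided by 1, 2, 3, ... as a candidate fundamental and accepts the first candidate that every component matches. The tolerance, divisor limit, and function name are assumptions.

```python
def common_fundamental(freqs_hz, tolerance=0.03, max_divisor=10):
    """Largest candidate f0 of which every frequency is a near-integer multiple.

    Candidates are min(freqs_hz) / d for d = 1..max_divisor. A frequency
    matches when f / f0 is within `tolerance` of a whole number.
    Returns None when no candidate fits all frequencies.
    """
    lowest = min(freqs_hz)
    for divisor in range(1, max_divisor + 1):
        f0 = lowest / divisor
        if all(abs(f / f0 - round(f / f0)) <= tolerance for f in freqs_hz):
            return f0
    return None


print(common_fundamental([220, 440, 660]))        # fundamental present
print(common_fundamental([400, 600, 800, 1000]))  # fundamental missing
print(common_fundamental([200, 317, 523]))        # no common frequency found
```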
SLIDE 28

Shepard Tones

Shepard Risset http://www.cs.ubc.ca/nest/imager/contributions/flinn/Illusions/ST/st.html

SLIDE 29

Shepard tones

  • Make a sound composed of sine waves spaced at octave intervals.
  • Control their amplitudes by imposing a Gaussian (or something like it) filter in the (log of the) frequency dimension
  • Move all the sine waves up a musical half step.
  • Wrap around in frequency.
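The recipe above can be sketched as follows. This is one possible reading of the steps, not the construction used for the lecture's audio examples: the base frequency, number of octaves, envelope center and width, and the function names are all assumptions.

```python
import math

BASE_HZ = 27.5       # lowest component at step 0 (assumed)
N_OCTAVES = 8        # number of octave-spaced sine waves (assumed)
SIGMA_OCTAVES = 1.5  # width of the Gaussian envelope, in octaves (assumed)
# Center of the envelope on the log2(frequency) axis (assumed: mid-range).
CENTER_LOG2 = math.log2(BASE_HZ) + N_OCTAVES / 2


def shepard_components(step):
    """(frequency_hz, amplitude) pairs for the tone `step` half steps up."""
    # Wrap around in frequency: 12 half steps (one octave) returns to the start.
    shift = (step % 12) / 12.0
    components = []
    for octave in range(N_OCTAVES):
        log2_freq = math.log2(BASE_HZ) + octave + shift
        # Gaussian envelope applied in the log-frequency dimension.
        amp = math.exp(-0.5 * ((log2_freq - CENTER_LOG2) / SIGMA_OCTAVES) ** 2)
        components.append((2.0 ** log2_freq, amp))
    return components


def shepard_sample(step, t_seconds):
    """Value of the Shepard tone for `step` at time t_seconds."""
    return sum(a * math.sin(2 * math.pi * f * t_seconds)
               for f, a in shepard_components(step))


for freq, amp in shepard_components(1):
    print(round(freq, 1), round(amp, 3))
```

Because the envelope leaves the lowest and highest components nearly silent, swapping the top component for a new bottom one at the wrap point changes the sound very little, which is what lets the sequence seem to rise indefinitely.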