Sound 2: frequency analysis (Tues. March 27, 2018)



slide-1
SLIDE 1

1

COMP 546

Lecture 19

Sound 2: frequency analysis

  • Tues. March 27, 2018
slide-2
SLIDE 2

Speed of Sound

Sound travels at about 340 m/s, or 34 cm/ms. (This depends on temperature and other factors.)

2

slide-3
SLIDE 3

Wave equation

3

Pressure $= I_{\text{atm}} + I(X, Y, Z, t)$

$I(X, Y, Z, t)$ is not an arbitrary function. Rather:

$$\left( \frac{\partial^2}{\partial X^2} + \frac{\partial^2}{\partial Y^2} + \frac{\partial^2}{\partial Z^2} \right) I(X, Y, Z, t) = \frac{1}{v^2} \frac{\partial^2}{\partial t^2} I(X, Y, Z, t), \qquad v = 340 \text{ m/s}$$

slide-4
SLIDE 4

The wave equation + boundary conditions give complicated shadow and reflection effects. What happens when sound enters the ear?

4

[Figures: plane wave + single slit; sea waves + islands]

slide-5
SLIDE 5

5

Musical sounds

(brief introduction)

slide-6
SLIDE 6

6

Example: guitar

Write a string displacement at t = 0 as a sum of sines. The modes are $\sin\left( \frac{\pi}{L} j x \right)$, where $L$ is the length of the string and $j$ is an integer.
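The sum-of-sines idea can be checked numerically. The sketch below is an illustration only, not part of the lecture: it samples a hypothetical plucked (triangular) string shape, computes one coefficient per mode $\sin\left( \frac{\pi}{L} j x \right)$ with a discrete sine sum, and rebuilds the shape from those modes. The number of samples `N`, the pluck position `PLUCK`, and the function names are assumptions made for the example.

```python
import math

N = 16      # assumed number of intervals; samples at x_i = i * L / N
PLUCK = 4   # assumed index of the plucked point (x = L / 4)

# Triangular displacement at t = 0, zero at both fixed ends of the string.
shape = [i / PLUCK if i <= PLUCK else (N - i) / (N - PLUCK)
         for i in range(N + 1)]

def sine_coefficients(f):
    """Coefficient b_j of the mode sin(pi * j * x / L), for j = 1 .. n-1."""
    n = len(f) - 1
    return [2.0 / n * sum(f[i] * math.sin(math.pi * j * i / n)
                          for i in range(1, n))
            for j in range(1, n)]

def synthesize(coeffs, n):
    """Sum of the weighted modes, evaluated at the same sample points."""
    return [sum(b * math.sin(math.pi * j * i / n)
                for j, b in enumerate(coeffs, start=1))
            for i in range(n + 1)]

coeffs = sine_coefficients(shape)
rebuilt = synthesize(coeffs, N)

# Largest difference between the original shape and the sum of sines.
max_error = max(abs(u - v) for u, v in zip(shape, rebuilt))
print(max_error)
```

With N intervals there are N - 1 interior samples, so N - 1 modes are enough to reproduce the sampled shape up to rounding error.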

slide-7
SLIDE 7

7

Physics says: $\omega = \frac{c}{L}$, where the constant $c$ depends on physical properties of the string (mass density, tension) and $L$ is the length of the string.

slide-8
SLIDE 8

8

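Modes of a vibrating string each have fixed points which reduce the effective length.

Physics says:

Effective length: $L, \ \frac{L}{2}, \ \frac{L}{3}, \ \frac{L}{4}$

Frequency: $\omega = \frac{c}{L}, \ \frac{2c}{L}, \ \frac{3c}{L}, \ \frac{4c}{L}$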

slide-9
SLIDE 9

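9

$\omega = \frac{c}{L}, \ \frac{2c}{L}, \ \frac{3c}{L}, \ \frac{4c}{L}$

The lowest frequency $\omega_0$ is the "fundamental" (1st harmonic); the higher frequencies are "overtones". The temporal frequency $m \, \omega_0$ is called the $m$-th harmonic.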

slide-10
SLIDE 10

10

For stringed instruments, most of the sound is produced by vibrations of the instrument body (neck, front and back plates).

http://www.acs.psu.edu/drussell/guitars/hummingbird.html

The lines in the sketches below are the nodal points. They don't move.

These are vibration modes, not harmonics. The guitar sound is a sum of these modes.

slide-11
SLIDE 11

11

Difference of two frequencies $\omega_1$ and $\omega_2$: $\log_2 \frac{\omega_2}{\omega_1}$ octaves.

e.g. 1 octave is a doubling of frequency.

slide-12
SLIDE 12

(Western) Musical Notes

Each "octave" ABCDEFGA is divided into 12 "semitones", each spanning 1/12 of an octave.

C-D, D-E, F-G, G-A, A-B are two semitones each.
E-F, B-C are one semitone each.

12

slide-13
SLIDE 13

13

Q: How many semitones are there from $\omega_0$ to $\omega$?

slide-14
SLIDE 14

14

Q: How many semitones are there from $\omega_0$ to $\omega$?

A: $12 \log_2 \frac{\omega}{\omega_0}$

[Figure: axis "Fundamental frequency of note", with $\omega_0$ and $\omega$ marked]

slide-15
SLIDE 15

15

[Figure: 88 fundamental frequencies (Hz) on a keyboard]

The fundamental frequencies of successive notes define a geometric progression. This is different from the harmonics of a vibrating string, which define an arithmetic progression.
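The two progressions can be compared with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: it takes the common tuning convention that A4 = 440 Hz and that A4 is key 49 of an 88-key keyboard (neither value appears in the slides), and the function names are made up for the example.

```python
import math

A4_HZ = 440.0   # assumed reference pitch (not stated in the slides)
A4_KEY = 49     # assumed position of A4 on an 88-key keyboard

def key_frequency(key):
    """Fundamental frequency of a key: one semitone is a factor 2**(1/12)."""
    return A4_HZ * 2.0 ** ((key - A4_KEY) / 12.0)

def semitones_between(omega0, omega):
    """Number of semitones from omega0 to omega: 12 * log2(omega / omega0)."""
    return 12.0 * math.log2(omega / omega0)

def string_harmonics(omega0, count):
    """Harmonics m * omega0 of a vibrating string, m = 1 .. count."""
    return [m * omega0 for m in range(1, count + 1)]

keyboard = [key_frequency(k) for k in range(1, 89)]

# Geometric progression: the ratio of successive notes is constant.
ratios = [keyboard[i + 1] / keyboard[i] for i in range(len(keyboard) - 1)]

# Arithmetic progression: the difference of successive harmonics is constant.
harmonics = string_harmonics(110.0, 5)
steps = [harmonics[i + 1] - harmonics[i] for i in range(len(harmonics) - 1)]

print(semitones_between(220.0, 440.0))  # one octave
print(ratios[0], steps[0])
```

Doubling the frequency gives 12 semitones, in agreement with the statement that one octave is a doubling of frequency.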

slide-16
SLIDE 16

Speech Sounds

16

slide-17
SLIDE 17

What determines speech sounds?

  • voiced vs. unvoiced

'zzzz' vs. 'ssss', 'vvvv' vs. 'ffff'

  • articulators (jaw, tongue, lips)

'aaaa', 'eeee', 'oooo', ...

17

slide-18
SLIDE 18

18

Voiced sounds are produced by "glottal pulses".

$$\sum_{j=0}^{n_{\text{glottal}}-1} g(t - j \, T_{\text{glottal}})$$

[Figure: pulse train with period $T_{\text{glottal}}$ marked]

slide-19
SLIDE 19

19

Exercise 16 Q7.

$$g(t - t_0) = g(t) * \delta(t - t_0)$$

slide-20
SLIDE 20

20

Voiced sounds are produced by "glottal pulses".

$$\sum_{j=0}^{n_{\text{glottal}}-1} g(t - j \, T_{\text{glottal}}) = g(t) * \sum_{j=0}^{n_{\text{glottal}}-1} \delta(t - j \, T_{\text{glottal}})$$

[Figure: pulse train with period $T_{\text{glottal}}$ marked]

slide-21
SLIDE 21

21

$$\sum_{j=0}^{n_{\text{glottal}}-1} g(t - j \, T_{\text{glottal}})$$

[Figure: pulse train with period $T_{\text{glottal}}$ marked]

Decreasing $T_{\text{glottal}}$ (by increasing the tension in the vocal cords) is equivalent to increasing the frequency of the pulses.

slide-22
SLIDE 22

22

Let $a(t)$ be the impulse response function of the articulators (jaw, tongue, lips).

$$I(t) = a(t) * g(t) * \sum_{j=0}^{n_{\text{glottal}}-1} \delta(t - j \, T_{\text{glottal}})$$
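The convolution model can be sketched in discrete time. The example below is an illustration only: the pulse shape `g`, the articulator response `a`, and the values of `n_glottal` and `T_glottal` are hypothetical numbers chosen to keep the arrays small. It checks that a sum of shifted pulses equals the pulse convolved with an impulse train, and that the order of the convolutions does not matter.

```python
def convolve(x, h):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            out[i + k] += xi * hk
    return out

def impulse_train(n_pulses, period):
    """Unit impulses at t = 0, period, 2 * period, ... (n_pulses of them)."""
    train = [0.0] * (n_pulses * period)
    for j in range(n_pulses):
        train[j * period] = 1.0
    return train

# Hypothetical shapes, chosen only for illustration.
g = [0.0, 1.0, 0.5, 0.25]   # one glottal pulse g(t)
a = [1.0, -0.6, 0.3]        # articulator impulse response a(t)

n_glottal, T_glottal = 3, 8
train = impulse_train(n_glottal, T_glottal)

# Left-hand side: sum of shifted copies g(t - j * T_glottal).
shifted_sum = [0.0] * (len(train) + len(g) - 1)
for j in range(n_glottal):
    for k, gk in enumerate(g):
        shifted_sum[j * T_glottal + k] += gk

# Right-hand side: g(t) convolved with the impulse train.
pulses = convolve(g, train)

# Sound model: a(t) * g(t) * impulse train, grouped in two ways.
sound = convolve(a, pulses)
sound_regrouped = convolve(convolve(a, g), train)
```

Because convolution is associative, `a * g` can be computed once and then repeated at each pulse position, which is what `sound_regrouped` does.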

slide-23
SLIDE 23

23

[Figure; axis label: $n_{\text{pulse}} - 1$]

slide-24
SLIDE 24

24

[Figure; axis label: $n_{\text{pulse}} - 1$]

slide-25
SLIDE 25

25

slide-26
SLIDE 26

26

The oral and nasal cavities have resonant modes of vibration, as the air cavity in a guitar does.

slide-27
SLIDE 27

27

[Figure panels: time domain; temporal frequency domain]

Peaks in the temporal frequency domain are called "formants".

slide-28
SLIDE 28

28

$T_{\text{glottal}}$ is the period of the glottal pulse train.

The pulse train has $n_{\text{glottal}}$ pulses in $T$ time steps, i.e. $T_{\text{glottal}} \, n_{\text{glottal}} = T$.

Assume that the Fourier transform is taken over $T$ samples.

$$\mathbf{F} \sum_{j=0}^{n_{\text{glottal}}-1} \delta(t - j \, T_{\text{glottal}}) = \ ?$$

slide-29
SLIDE 29

29

Assignment 3: Show

$$\mathbf{F} \sum_{j=0}^{n_{\text{glottal}}-1} \delta(t - j \, T_{\text{glottal}}) = n_{\text{glottal}} \sum_{m=0}^{T_{\text{glottal}}-1} \delta(\omega - m \, n_{\text{glottal}})$$

[Figure labels: $n_{\text{glottal}}$, $T_{\text{glottal}}$]

slide-30
SLIDE 30

Units of temporal frequency $\omega$

30

$T_{\text{glottal}}$ is the period of the glottal pulse train.

There are $n_{\text{glottal}}$ pulses in $T$ time samples. To convert $n_{\text{glottal}}$ to 'pulses per second', we divide by $T$ (to get pulses per sample) and then multiply by 'time samples per second'. High quality audio uses 44,100 samples per second.

slide-31
SLIDE 31

31

$n_{\text{glottal}}$ is the fundamental frequency of the voiced sound. It determines the "pitch".

  • Adult males: 100-150 Hz
  • Adult females: 150-250 Hz
  • Children: over 250 Hz

slide-32
SLIDE 32

32

[Figure: two columns, $\omega_0 = 100$ Hz and $\omega_0 = 200$ Hz; each column shows the glottal pulse spectrum, the formant spectrum ("formants"), and the sound spectrum]

slide-33
SLIDE 33

33

Voiced vowel sounds
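The harmonic structure of a voiced sound comes from the transform of the glottal impulse train, which can be checked numerically. The sketch below is an illustration with small hypothetical values (`n_glottal = 4`, `T_glottal = 8`); it uses a direct, slow discrete Fourier transform written out by hand, and the helper `cycles_per_block_to_hz` is an assumed name for the unit conversion "divide by $T$, multiply by samples per second".

```python
import cmath

def dft(x):
    """Direct discrete Fourier transform (slow, for small inputs only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * w * t / n) for t in range(n))
            for w in range(n)]

# Hypothetical small values: n_glottal pulses, each T_glottal samples apart.
n_glottal, T_glottal = 4, 8
T = n_glottal * T_glottal

train = [1.0 if t % T_glottal == 0 else 0.0 for t in range(T)]
magnitudes = [abs(c) for c in dft(train)]

# Frequencies (in cycles per T samples) where the transform is non-zero.
peaks = [w for w, m in enumerate(magnitudes) if m > 1e-9]

def cycles_per_block_to_hz(cycles, block_len, sample_rate=44100):
    """Divide by T (cycles per sample), then multiply by samples per second."""
    return cycles / block_len * sample_rate

print(peaks)                             # multiples of n_glottal
print(cycles_per_block_to_hz(10, 4410))  # 10 pulses in 4410 samples
```

The non-zero frequencies are the multiples of `n_glottal`, each with magnitude `n_glottal`, and there are `T_glottal` of them. In the second print, 10 pulses in 4410 samples at 44,100 samples per second corresponds to about 100 pulses per second.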

slide-34
SLIDE 34

Unvoiced sounds

noise instead of glottal pulses

34

slide-35
SLIDE 35

Unvoiced sounds

noise instead of glottal pulses

35

Flat amplitude spectrum on average ('white noise')
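The phrase "on average" matters: a single noise sample has a ragged spectrum, and the flatness only appears after averaging. The sketch below is an illustration under assumptions not in the slides: Gaussian noise with unit variance, a block of `N = 32` samples, and `TRIALS = 400` independent blocks. It averages the power spectrum over the trials and normalizes by `N`, so a perfectly flat spectrum would give 1 at every frequency.

```python
import cmath
import random

def power_spectrum(x):
    """Squared magnitude of the direct DFT (slow, for small inputs only)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * w * t / n)
                    for t in range(n))) ** 2
            for w in range(n)]

random.seed(0)          # fixed seed so the run is repeatable
N, TRIALS = 32, 400     # assumed block length and number of noise blocks

mean_power = [0.0] * N
for _ in range(TRIALS):
    noise = [random.gauss(0.0, 1.0) for _ in range(N)]
    for w, p in enumerate(power_spectrum(noise)):
        mean_power[w] += p / TRIALS

# For unit-variance noise the expected power at every frequency is N,
# so each normalized value should be close to 1.
flatness = [p / N for p in mean_power]
print(min(flatness), max(flatness))
```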
slide-36
SLIDE 36

Consonants

Restrict flow of air by moving tongue, lips into contact with the teeth & palate.

Fricatives

  • voiced z, v, zh, th (the)
  • unvoiced ?

Stops

  • voiced b, d, g
  • unvoiced ?

Nasals (closed mouth)

  • m, n, ng

36

slide-37
SLIDE 37

Consonants

Restrict flow of air by moving tongue, lips into contact with the teeth & palate.

Fricatives

  • voiced z, v, zh, th (the)
  • unvoiced s, f, sh, th (theta)

Stops

  • voiced b, d, g
  • unvoiced p, t, k

Nasals (closed mouth)

  • m, n, ng

37

slide-38
SLIDE 38

38

I did not have time to cover the following slides properly. I will present them again in lecture 22.

slide-39
SLIDE 39

Spectrogram

39

Partition a sound signal into $B$ blocks of $T$ samples each (i.e. the sound has $BT$ samples in total). Take the Fourier transform of each block.

slide-40
SLIDE 40

Spectrogram

40

Partition a sound signal into $B$ blocks of $T$ samples each (i.e. the sound has $BT$ samples in total). Take the Fourier transform of each block. Let $b$ be the block number, and let the units of $\omega$ be cycles per block.
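The block-by-block procedure can be sketched as follows. This is a minimal illustration, not the lecture's code: the test signal (a tone at 4 cycles per block that jumps to 8 cycles per block halfway through), the sizes `T = 32` and `B = 4`, and the function names are assumptions, and the transform is a direct, slow DFT.

```python
import cmath
import math

def dft_magnitudes(block):
    """Amplitude spectrum of one block (direct DFT, for small blocks only)."""
    n = len(block)
    return [abs(sum(block[t] * cmath.exp(-2j * cmath.pi * w * t / n)
                    for t in range(n)))
            for w in range(n)]

def spectrogram(signal, T):
    """Split the signal into blocks of T samples; one spectrum per block."""
    B = len(signal) // T
    return [dft_magnitudes(signal[b * T:(b + 1) * T]) for b in range(B)]

T, B = 32, 4   # assumed block length and number of blocks

# Hypothetical test signal: the tone changes frequency after two blocks.
signal = []
for b in range(B):
    cycles = 4 if b < 2 else 8   # cycles per block
    signal += [math.sin(2 * math.pi * cycles * t / T) for t in range(T)]

S = spectrogram(signal, T)       # S[b][w]: block b, w cycles per block

def peak_frequency(row):
    """Strongest frequency among 0 .. T/2 - 1 cycles per block."""
    half = row[:len(row) // 2]
    return half.index(max(half))

peak_bins = [peak_frequency(row) for row in S]
print(peak_bins)
```

Each row of `S` is indexed by frequency in cycles per block, so the change of tone shows up as a change in the peak position between block 1 and block 2.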

slide-41
SLIDE 41

41

[Figure: spectrogram; vertical axis: cycles per second (Hz); horizontal axis: time (samples); annotation: "$\omega_0 =$"]

slide-42
SLIDE 42

42

e.g. $T = 512$ samples (12 ms), $\omega_0 = 86$ Hz

$T = 2048$ samples (46 ms), $\omega_0 = 21.5$ Hz

slide-43
SLIDE 43

43

e.g. $T = 512$ samples (12 ms), $\omega_0 = 86$ Hz

$T = 2048$ samples (46 ms), $\omega_0 = 21.5$ Hz

You cannot simultaneously localize the frequency and the time. This is a fundamental tradeoff. We have seen it before (recall the Gaussian).
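The tradeoff can be made concrete with the two block sizes from the example. The sketch below assumes the sampling rate of 44,100 samples per second mentioned earlier for high quality audio; the function names are made up for the illustration.

```python
SAMPLE_RATE = 44100   # samples per second (assumed, as for high quality audio)

def block_duration_ms(T, rate=SAMPLE_RATE):
    """Duration of a block of T samples, in milliseconds."""
    return 1000.0 * T / rate

def frequency_step_hz(T, rate=SAMPLE_RATE):
    """Frequency of one cycle per block, in Hz (spacing between bins)."""
    return rate / T

for T in (512, 2048):
    duration = block_duration_ms(T)
    step = frequency_step_hz(T)
    # Duration in seconds times frequency step in Hz is always 1.
    product = (duration / 1000.0) * step
    print(T, round(duration, 1), round(step, 1), round(product, 6))
```

A longer block gives finer frequency spacing but coarser timing, and the product of the two stays fixed, which is one way to see why both cannot be made small at once.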
slide-44
SLIDE 44

44

Narrowband

(good frequency resolution, poor temporal resolution, ~50 ms)

Wideband

(poor frequency resolution, good temporal resolution)

slide-45
SLIDE 45

45

Examples: Spectrograms of 10 vowel sounds