Speech Processing 15-492/18-492: Human Speech Processing, Phonetics and Phonology

SLIDE 1

Speech Processing 15-492/18-492

Human Speech Processing
Phonetics and Phonology
This lecture is recorded

SLIDE 2

The vocal tract

SLIDE 3

From meat to voice

• Blow air through lungs
• Vibrate larynx
• Vocal tract shape defines resonance
• Obstructions modify sound
  • Tongue, teeth, lips, velum (nasal passage)

SLIDE 4

The ear

SLIDE 5

From sound to brain waves

• Sound waves
  • Vibrate eardrum
  • Cause fluid in cochlea to vibrate
• Spiral cochlea
  • Vibrate hairs inside cochlea
  • Different frequencies vibrate different hairs
  • Converts time domain to frequency domain
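
The time-to-frequency conversion that the cochlea performs mechanically has a rough computational analogue in the discrete Fourier transform. The sketch below is only an illustration, not a model of the ear: the sample rate, window length, and the two test tones are assumptions chosen so that each tone falls exactly on one frequency bin.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this example)
N = 400             # window length: 50 ms at 8 kHz, so each bin is 20 Hz wide


def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin from 0 Hz up to the Nyquist bin."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def tone(freq_hz):
    """A pure sine tone of the given frequency, N samples long."""
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(N)]


# Time domain: one waveform that mixes a 200 Hz and a (quieter) 1000 Hz tone.
signal = [a + 0.5 * b for a, b in zip(tone(200), tone(1000))]

# Frequency domain: the two tones show up as two separate peaks.
mags = dft_magnitudes(signal)
bin_hz = SAMPLE_RATE / N
strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
peak_freqs = sorted(k * bin_hz for k in strongest)
print(peak_freqs)  # [200.0, 1000.0]
```

In the waveform the two tones are mixed into one signal; in the spectrum they are separated, much as different frequencies excite different hairs along the cochlea.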

SLIDE 6

From grunts to meaning

• Grunts and vocalization
  • Lots of variation available
  • (continuous systems, not discrete)
• Noises become distinct, recognizable
  • Grow into languages, dialects and idiolects
• What are the fundamental units?

SLIDE 7

Articulatory Movements

SLIDE 8

Electromagnetic Articulograph

SLIDE 9

Phonemes

• Defined as fundamental units of speech
• If you change one, it (can) change the meaning
  • “pat” to “bat”
  • “pat” to “pam”
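
The "change one phoneme, change the meaning" test can be sketched as a check for minimal pairs: two words whose phoneme sequences differ in exactly one position. The tiny lexicon and the lowercase phone symbols below are assumptions for illustration; only the three example words come from the slide.

```python
# Hypothetical mini-lexicon: word -> phoneme sequence.
LEXICON = {
    "pat": ["p", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "pam": ["p", "ae", "m"],
}


def differing_positions(a, b):
    """Indices where two equal-length phoneme sequences differ, else None."""
    if len(a) != len(b):
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def is_minimal_pair(word_a, word_b):
    """True if the two words differ in exactly one phoneme."""
    diff = differing_positions(LEXICON[word_a], LEXICON[word_b])
    return diff is not None and len(diff) == 1


print(is_minimal_pair("pat", "bat"))  # True: /p/ vs /b/ in first position
print(is_minimal_pair("pat", "pam"))  # True: /t/ vs /m/ in last position
print(is_minimal_pair("bat", "pam"))  # False: two positions differ
```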

SLIDE 10

IPA

• International Phonetic Alphabet
• Defines everything
  • All vowels, consonants, modifications
  • All distinctions, for all languages
• Uses Latin++ character set to do it
  • But it can be hard to type in computer programs
• Organized by
  • Vowels
  • Consonants
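
One common workaround for the typing problem is to write phones with plain ASCII symbols in code and convert to IPA only for display. The sketch below assumes a small hand-picked mapping; it covers just a few symbols and is not a complete or official table. The IPA characters are written as Unicode escapes so the source file itself stays ASCII.

```python
import unicodedata

# Partial ASCII-to-IPA mapping (an illustrative subset, chosen for this example).
ASCII_TO_IPA = {
    "P": "p",
    "SH": "\u0283",  # esh
    "ZH": "\u0292",  # ezh
    "NG": "\u014b",  # eng
    "AE": "\u00e6",  # ash
    "IH": "\u026a",  # small capital i
    "DX": "\u027e",  # flap
}


def to_ipa(phones):
    """Convert a list of ASCII phone symbols to an IPA string.

    Raises KeyError for symbols outside the partial table above.
    """
    return "".join(ASCII_TO_IPA[p] for p in phones)


ship = to_ipa(["SH", "IH", "P"])
print(ship)
for ch in ship:
    # Show the code point and official Unicode name of each character.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```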

SLIDE 11

Vowel Space

• One or two banded frequencies (formants)

SLIDE 12

Consonant Chart

• Place and Manner of Articulation

Source: Wikipedia, IPA

SLIDE 13

English (US) Vowels

AA  wAshington
AE  fAt, bAd
AH  bUt, hUsh
AO  lAWn, mAll
AW  hOW, sOUth
AX  About, cAnoe
AY  hIde, bUY
EH  gEt, fEAther
ER  makER, sEARch
EY  gAte, EIght
IH  bIt, shIp
IY  bEAt, shEEp
OW  lOne, nOse
OY  tOY, OYster
UH  fUll
UW  fOOl

SLIDE 14

English Consonants

• Stops: P, B, T, D, K, G
• Fricatives: F, V, HH, S, Z, SH, ZH
• Affricates: CH, JH
• Nasals: N, M, NG
• Glides: L, R, Y, W
• Note: voiced vs unvoiced:
  • P vs B, F vs V
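
The consonant classes above map naturally onto a lookup table. In this sketch the manner classes are copied from the slide; the voicing pairs beyond P/B and F/V (T/D, K/G, S/Z, SH/ZH, CH/JH) are standard pairings added here, and the function names are hypothetical.

```python
# Manner-of-articulation classes as listed on the slide.
MANNER_CLASSES = {
    "stop": ["P", "B", "T", "D", "K", "G"],
    "fricative": ["F", "V", "HH", "S", "Z", "SH", "ZH"],
    "affricate": ["CH", "JH"],
    "nasal": ["N", "M", "NG"],
    "glide": ["L", "R", "Y", "W"],
}

# Unvoiced consonant -> its voiced counterpart (same place and manner).
UNVOICED_TO_VOICED = {
    "P": "B", "T": "D", "K": "G",
    "F": "V", "S": "Z", "SH": "ZH",
    "CH": "JH",
}

# Invert the class table so each phone can be looked up directly.
PHONE_TO_MANNER = {
    phone: manner
    for manner, phones in MANNER_CLASSES.items()
    for phone in phones
}


def is_voiced(phone):
    """True if the consonant is voiced. HH is unvoiced and has no voiced partner here."""
    if phone not in PHONE_TO_MANNER:
        raise KeyError(f"unknown consonant: {phone}")
    return phone not in UNVOICED_TO_VOICED and phone != "HH"


for phone in ["P", "B", "F", "V", "NG"]:
    print(phone, PHONE_TO_MANNER[phone], "voiced" if is_voiced(phone) else "unvoiced")
```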

SLIDE 15

Number of Phonemes in Language

• US English: 43
• UK English: 44
• Japanese: 25
• Hindi: 81
• Numbers aren’t definite though
  • Depends on who you ask,
  • And what you want it for

SLIDE 16

Not all variation is Phonemic

• Phonology: linguistically discrete units
  • May be a number of different ways to say them
  • /r/ trill (Scottish or Spanish) vs US way
• Phonetics vs Phonemics
  • Phonetics: all sounds
  • Phonemics: discrete units
• /t/ in US English: becomes “flap”
  • “water” / w ao t er / (phonemic)
  • “water” / w ao dx er / (as pronounced, with flap)
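
The flap example can be written as a small rewrite rule over a phone sequence. This is a deliberately simplified sketch: it flaps every /t/ that sits between two vowels, whereas real US English flapping also depends on stress (the /t/ in "attack" is not flapped). The vowel set is taken from the English (US) vowel list in this lecture; the function name is hypothetical.

```python
# Vowel symbols from the English (US) vowel list, lowercased.
VOWELS = {
    "aa", "ae", "ah", "ao", "aw", "ax", "ay", "eh",
    "er", "ey", "ih", "iy", "ow", "oy", "uh", "uw",
}


def apply_flapping(phones):
    """Replace /t/ with the flap /dx/ when it occurs between two vowels.

    Simplification: stress is ignored, so this over-applies the rule.
    """
    out = list(phones)
    for i in range(1, len(phones) - 1):
        if phones[i] == "t" and phones[i - 1] in VOWELS and phones[i + 1] in VOWELS:
            out[i] = "dx"
    return out


print(apply_flapping(["w", "ao", "t", "er"]))  # ['w', 'ao', 'dx', 'er']
print(apply_flapping(["s", "t", "aa", "p"]))   # unchanged: /t/ follows a consonant
```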

SLIDE 17

Dialect and Idiolect

• Variation within language (and speakers)
• Phonetic
  • “Don” vs “Dawn”, “Cot” vs “Caught”
  • R deletion (Haavaad vs Harvard)
• Word choice:
  • Y’all, Yinz
• Politeness levels

SLIDE 18

Not all languages use the same set

• Aspirated stops (Korean, Hindi)
  • P vs PH
  • English uses both, but doesn’t care (the difference never changes meaning)
  • Pot vs sPot (hold a hand in front of your mouth to feel the puff of air)
• L-R in Japanese not phonological
• US English dialects:
  • Mary, Merry, Marry
• Scottish English vs US English
  • No distinction between “pull” and “pool”
  • Distinction between “for” and “four”

SLIDE 19

Different language dimensions

• Vowel length
  • Bit vs beat
  • Japanese: shujin (husband) vs shuujin (prisoner)
• Tones
  • F0 (tune) used phonemically (it distinguishes words)
  • Chinese, Thai, Burmese
• Clicks
  • Xhosa

SLIDE 20

Co-articulation

• Voicing actually doesn’t always stop
  • “have honey”, “impossible”
• Nasalized voices, lip rounding
  • “min” vs “bit”, “sow” vs “see”
• Lexical stress:
  • EMphasis, emPHAsis
  • PROject, proJECT
• Reduction, contraction
  • “A boy is riding a bike”
  • “I want to go to Disneyland.”
  • “I will go tomorrow”

SLIDE 21

Prosody

• Intonation
  • Tune
• Duration
  • How long/short each phoneme is
• Phrasing
  • Where the breaks are

SLIDE 22

Intonation (F0)

• Rate of vocal fold vibration during voiced speech
  • Males: 80-140 times a second
  • Females: 130-220 times a second
  • Children: 180-320 times a second
• Used for:
  • Emphasis
  • Style: questions, statements, confidence, etc.
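
F0 can be estimated from a waveform by finding the time shift at which the signal best matches itself (autocorrelation). The sketch below is a minimal illustration on a synthetic sine wave, not a production pitch tracker: the sample rate and test signal are assumptions, and the search range simply spans the 80-320 vibrations per second quoted above.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this example)


def estimate_f0(samples, sample_rate, f0_min=80.0, f0_max=320.0):
    """Estimate F0 in Hz as sample_rate / (best-matching lag).

    The score for each lag is the autocorrelation averaged over the
    overlapping samples, so longer lags are not penalized for having
    fewer terms.
    """
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        overlap = len(samples) - lag
        score = sum(samples[i] * samples[i + lag] for i in range(overlap)) / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic "voiced" signal: 100 ms of a 100 Hz sine (within the male range).
signal = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(800)]
f0 = estimate_f0(signal, SAMPLE_RATE)
print(f0)  # 100.0
```

Real speech is noisier and richer in harmonics than a sine wave, so practical trackers add voicing detection and smoothing on top of this idea.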

SLIDE 23

Intonation Contour

SLIDE 24

Intonation Information

• Large pitch range (female)
• Authoritative, since the pitch goes down at the end
  • News reader
• Emphasis for Finance H*
• Final has a rise: more information to come
• Female American newsreader from WBUR
  • (Boston University Radio)

SLIDE 25

Words

• Words
  • The things with space around them (sort of)
  • Chinese, Thai, Japanese don’t use spaces
  • Speech doesn’t use spaces
  • Blackboard vs Black Board
• English
  • Morphology: walk, walks, walking, walked
• Japanese
  • Morphology: aruku, arukimasu, arukimashita, aruite, arukitai, arukitakatta, arukemasu, ….

SLIDE 26

Speech Acts

• Words aren’t always what they seem
  • Can you pass the salt?
  • Boston. Boston! Boston?
  • Yeah, right
• Multiple ways to say the same thing:
  • I want to go to Boston.
  • Yes

SLIDE 27

Human Speech

• Human production and perception
  • Quite different from computers
• Phonology
  • Defining the alphabet of speech
  • Different languages make different distinctions
• Intonation
  • How it’s said
