Speech Processing (Course Number: 40967, Semester: 1397-2)

SLIDE 1

Speech Processing

SLIDE 2

Course Number: 40967

Semester: 1397-2

Instructor: Hossein Sameti

Room CE706

sameti@sharif.edu

Home page: CE courses

SLIDE 3

Speech Processing:

• Review of DSP Concepts
• Review of Probability and Stochastic Processes
• Anatomy and Physiology of Speech Production System
• Phonemics and Phonetics
• Spectrogram Reading
• Linear Prediction Analysis
• Speech Coding and Compression
• Speech Synthesis (Text to Speech)
• Speech Quality Assessment (Subjective and Objective)
• Speech Recognition (Speech to Text)
• Speech Enhancement

SLIDE 4

Speech Processing:

• Marking Scheme:
  – Homework (written and programming): 20%
  – Course Projects: 10%
  – Quizzes: 15%
  – Midterm: 25%
  – Final Exam: 30%

SLIDE 5

Speech Processing:

• Text:
  – Spoken Language Processing (Huang, Acero, Hon, 2000)
  – Introduction to Digital Speech Processing (Lawrence R. Rabiner and Ronald W. Schafer, 2007)
  – Discrete-Time Processing of Speech Signals (Deller, Proakis, Hansen, 1993)
  – Fundamentals of Speech Recognition (Rabiner, Juang, 1993)
• Password for any documents for the course: 40967spring97

SLIDE 6

Aristotle:

"Man is a speaking (rational) animal."

SLIDE 7

Old Speech Synthesizers

– Speech organ of Wheatstone, based on a system proposed by Wolfgang von Kempelen in 1791

SLIDE 8

Old Speech Synthesizers (cont’d)

– Speech organ of Joseph Faber (1830-40)

SLIDE 9

Old Speech Synthesizers (cont’d)

– Voder demonstrated in 1939

Source: http://www.ling.su.se/staff/hartmut/kemplne.htm

SLIDE 10

More modern labs

(ICP lab in Grenoble, France)

– Study of facial movements for inclusion in speech synthesis (and recognition).

SLIDE 11

Communication via Spoken Language

SLIDE 12

Communication via Spoken Language

SLIDE 13

Virtues of Spoken Language

• Natural: Requires no special training
• Flexible: Leaves hands and eyes free
• Efficient: Has high data rate
• Economical: Communicated inexpensively
• Expressive: Conveys more than just words
• Popular/preferred: Verbal-acoustic problem solving
• Much longer evolution, compared to written language

SLIDE 14

Virtues of Spoken Language

• Speech interfaces are ideal for information access and management when:
  – The information space is broad and complex,
  – The users are not allowed, not at ease, or not able to use their eyes to read text messages,
  – The users are technically naive, or
  – Only telephones are available.

SLIDE 15

Diverse Sources of Constraint for Spoken Language Communication

• Acoustic: human vocal tract
• Phonetic: "let us pray" vs. "lettuce spray"
• Phonological: "gas shortage", "fish sandwich"
• Phonotactic: "sprachst" (German)
• Syntactic: "I am flying to Chicago tomorrow" vs. "tomorrow I flying Chicago am to"
• Semantic: "Is the baby crying" vs. "Is the bay bee crying"
• Contextual: "It is easy to recognize speech" vs. "It is easy to wreck a nice beach"

SLIDE 16

A Conversational System Architecture
