Speech Processing Course Number: 40967 Semester: 1397-2

Nov 07, 2022


Speech Processing
  Course Number: 40967
  Semester: 1397-2
  Instructor: Hossein Sameti
  Room CE706
  sameti@sharif.edu
  Home page: CE courses

Speech Processing:
  Review of DSP Concepts
  Review of Probability and Stochastic Processes
  Anatomy and Physiology of Speech Production System
  Phonemics and Phonetics
  Spectrogram Reading
  Linear Prediction Analysis
  Speech Coding and Compression
  Speech Synthesis (Text to Speech)
  Speech Quality Assessment (Subjective and Objective)
  Speech Recognition (Speech to Text)
  Speech Enhancement

Speech Processing: Marking Scheme
  Homework (written and programming): 20%
  Course Projects: 10%
  Quizzes: 15%
  Midterm: 25%
  Final Exam: 30%

Speech Processing: Text
  Spoken Language Processing; Huang, Acero, Hon; 2000
  Introduction to Digital Speech Processing; Lawrence R. Rabiner and Ronald W. Schafer; 2007
  Discrete-Time Processing of Speech Signals; Deller, Proakis, Hansen; 1993
  Fundamentals of Speech Recognition; Rabiner, Juang; 1993
  Password for any documents for the course: 40967spring97
Aristotle: Man is a speaking (rational) animal.
Old Speech Synthesizers – Speech organ of Wheatstone, based on a system proposed by Wolfgang von Kempelen in 1791

Old Speech Synthesizers (cont'd) – Speech organ of Joseph Faber (1830-40)

Old Speech Synthesizers (cont'd) – Voder, demonstrated in 1939
Source: http://www.ling.su.se/staff/hartmut/kemplne.htm

More modern labs (ICP lab in Grenoble, France) – Study of face movements to be included in speech synthesis (and recognition).
Communication via Spoken Language
Virtues of Spoken Language
  Natural: Requires no special training
  Flexible: Leaves hands and eyes free
  Efficient: Has high data rate
  Economical: Communicated inexpensively
  Expressive: Conveys more than just words
  Popular/preferred: Verbal-acoustic problem solving
  Much longer evolution, compared to written language

Virtues of Spoken Language
  Speech interfaces are ideal for information access and management when:
    The information space is broad and complex,
    The users are not allowed, not at ease, or not able to use their eyes to read text messages,
    The users are technically naive, or
    Only telephones are available.
Diverse Sources of Constraint for Spoken Language Communication
  Acoustic: human vocal tract
  Phonetic: "let us pray" / "lettuce spray"
  Phonological: "gas shortage" / "fish sandwich"
  Phonotactic: "sprachst" (German)
  Syntactic: "I am flying to Chicago tomorrow" / "tomorrow I flying Chicago am to"
  Semantic: "Is the baby crying" / "Is the bay bee crying"
  Contextual: "It is easy to recognize speech" / "It is easy to wreck a nice beach"
A Conversational System Architecture
