Effective Open Source Speech Recognition in Your Application

Sep 25, 2023

Peter Grasch (peter@grasch.net), #kde-speech
The Basics
● Decoder
● Speech model
  – Acoustic model: sounds
  – Language model: vocabulary, grammar
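As a rough illustration of how these pieces fit together, the sketch below shows a decoder picking the word sequence that scores best under both models, which is the standard formulation of statistical speech recognition. The candidate phrases and probabilities are made up for this example; a real decoder derives its scores from trained models and searches a far larger space.

```python
import math

# Hypothetical toy scores, invented for illustration only.
# Acoustic model: how well each word sequence matches the recorded sounds.
ACOUSTIC_LOGP = {
    "open browser": math.log(0.30),
    "open browse her": math.log(0.35),
}
# Language model: how plausible each word sequence is, given the
# vocabulary and grammar.
LANGUAGE_LOGP = {
    "open browser": math.log(0.20),
    "open browse her": math.log(0.001),
}


def decode(candidates):
    """Return the candidate with the highest combined log score."""
    return max(candidates, key=lambda words: ACOUSTIC_LOGP[words] + LANGUAGE_LOGP[words])


best = decode(list(ACOUSTIC_LOGP))
print(best)
```

The acoustically closer candidate loses here because the language model considers it implausible, which is why both models are needed.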
Open Source Speech Recognition
              Decoder   Trainer   UI
CMU SPHINX    ✓         ✓                   (PocketSphinx, SphinxTrain)
Julius        ✓
KALDI         ✓         ✓
Simon         ✓         ✓         ✓
Standard Architecture (diagram): Commands, Simond, Simon, Your application (?), Acoustic model, Language model
Standard Architecture (diagram): Commands, Simond, Simon, Scenarios, Your application, Acoustic model, Language model
Headless Architecture (diagram): Commands, Simond, Simon, Your application, Acoustic model, Language model
Embedded Architecture (diagram): Commands, Simond, Simon, Your application, Acoustic model, Language model, Decoder
Writing your Scenario
● Lay out the commands you want to support
● Create:
  – Vocabulary
  – Grammar
  – Commands
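The three parts of a scenario can be sketched as a small data structure. This is not Simon's scenario format or API: the `Scenario` class, the word categories, and the action names are all hypothetical, and the sketch assumes that a grammar is a list of allowed category sequences. It only illustrates why laying out the commands first is useful: every trigger phrase has to be covered by the vocabulary and the grammar.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical stand-in for a scenario; not Simon's real format."""
    vocabulary: dict  # word -> category (assumed representation)
    grammar: list     # allowed sentence structures, as tuples of categories
    commands: dict    # trigger phrase -> action name (illustrative names)

    def accepts(self, phrase):
        """True if every word is known and the category sequence is allowed."""
        words = phrase.lower().split()
        if any(word not in self.vocabulary for word in words):
            return False
        structure = tuple(self.vocabulary[word] for word in words)
        return structure in self.grammar

    def uncovered_commands(self):
        """Trigger phrases that the vocabulary and grammar cannot produce."""
        return [trigger for trigger in self.commands if not self.accepts(trigger)]


player = Scenario(
    vocabulary={"play": "VERB", "pause": "VERB", "next": "MODIFIER", "track": "NOUN"},
    grammar=[("VERB",), ("MODIFIER", "NOUN")],
    commands={
        "play": "player.play",
        "pause": "player.pause",
        "next track": "player.next",
        "previous track": "player.previous",
    },
)

print(player.uncovered_commands())
```

In this made-up example, "previous track" is reported as uncovered because "previous" is missing from the vocabulary.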
Writing your Scenario Demonstration
Tighter Integration: Write a Custom Command Plug-In
● Full, programmatic control of the scenario
● Meta information of recognition results:
  – Phonetic transcriptions
  – Confidence scores*
  – Alternative results*
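One way a plug-in might use this meta information is to act only on results that are both confident and clearly ahead of the alternatives. The sketch below is not the Simon plug-in interface: `RecognitionResult`, `pick_result`, the thresholds, and the sample transcriptions are hypothetical, and it assumes confidence scores are reported on a 0 to 1 scale.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecognitionResult:
    """Hypothetical result record; field names are assumptions."""
    sentence: str
    phonetic: str      # phonetic transcription (illustrative notation)
    confidence: float  # assumed to lie between 0 and 1


def pick_result(results: List[RecognitionResult],
                min_confidence: float = 0.7,
                min_margin: float = 0.1) -> Optional[RecognitionResult]:
    """Return the best result, or None if it is weak or ambiguous."""
    if not results:
        return None
    ranked = sorted(results, key=lambda r: r.confidence, reverse=True)
    best = ranked[0]
    if best.confidence < min_confidence:
        return None  # not confident enough to act on
    if len(ranked) > 1 and best.confidence - ranked[1].confidence < min_margin:
        return None  # an alternative result is too close to call
    return best


clear = [
    RecognitionResult("play", "p l ey", 0.92),
    RecognitionResult("pause", "p ao z", 0.40),
]
ambiguous = [
    RecognitionResult("play", "p l ey", 0.80),
    RecognitionResult("pause", "p ao z", 0.78),
]

print(pick_result(clear))
print(pick_result(ambiguous))
```

Rejecting ambiguous input rather than guessing is a design choice; an application could instead ask the user to choose among the alternatives.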
Tighter Integration: Write a Custom Command Plug-In Demonstration
Q & A #kde-speech Peter Grasch peter@grasch.net
Thank you for your attention
