SLIDE 1
Effective Open Source Speech Recognition in Your Application
#kde-speech Peter Grasch peter@grasch.net
SLIDE 2 The Basics
Speech model
  Acoustic model
    - Sounds
  Language model
    - Vocabulary
    - Grammar
Decoder
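A minimal sketch of how vocabulary and grammar together limit what a decoder may output. Every name here (`Word`, `VOCABULARY`, `GRAMMAR`, `is_recognizable`) and every phoneme string is a hypothetical illustration and is not taken from Simon, CMU Sphinx, Julius or Kaldi.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str      # written form
    phonemes: str  # pronunciation: the link to the sounds of the acoustic model
    category: str  # grammatical category referenced by the grammar


# Hypothetical vocabulary; the phoneme strings are illustrative only.
VOCABULARY = [
    Word("open", "OW P AH N", "VERB"),
    Word("close", "K L OW Z", "VERB"),
    Word("browser", "B R AW Z ER", "NOUN"),
    Word("window", "W IH N D OW", "NOUN"),
]

# Hypothetical grammar: each entry is one allowed sequence of categories.
GRAMMAR = {("VERB", "NOUN"), ("VERB",)}


def is_recognizable(sentence, vocabulary=VOCABULARY, grammar=GRAMMAR):
    """Return True if every word is known and the category sequence is allowed."""
    by_text = {w.text: w for w in vocabulary}
    words = sentence.lower().split()
    if not all(t in by_text for t in words):
        return False  # at least one word is missing from the vocabulary
    return tuple(by_text[t].category for t in words) in grammar


print(is_recognizable("open browser"))  # known words, VERB NOUN is allowed
print(is_recognizable("browser open"))  # known words, NOUN VERB is not allowed
```

A real decoder additionally scores the audio against the acoustic model; this sketch covers only the language model side.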
SLIDE 3
Open Source Speech Recognition
                                         Decoder  Trainer  UI
CMU SPHINX (PocketSphinx, SphinxTrain)   ✓        ✓
Julius                                   ✓
KALDI                                    ✓        ✓
Simon                                    ✓        ✓        ✓
SLIDE 4 Standard Architecture
[Diagram: Simon, acoustic model, language model, Simond, commands, and "Your application" marked with "?"]
SLIDE 5
Standard Architecture
[Diagram: Simon, acoustic model, language model, your application, three scenarios, Simond, commands]
SLIDE 6
Headless Architecture
[Diagram: Simon, acoustic model, language model, Simond, commands, your application]
SLIDE 7
Embedded Architecture
[Diagram: Simon, acoustic model, language model, commands, your application, Simond, decoder]
SLIDE 8
Standard Architecture
[Diagram: Simon, acoustic model, language model, your application, three scenarios, Simond, commands]
SLIDE 9 Writing your Scenario
- Lay out the commands you want to support
- Create:
  – Vocabulary
  – Grammar
  – Commands
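The planning step above can be sketched as follows: start from the command phrases and derive the words the vocabulary needs and the sentence structures the grammar needs. This is only an illustration under stated assumptions. `COMMANDS`, `CATEGORIES`, `plan_scenario` and the action identifiers are hypothetical and do not reflect Simon's scenario file format.

```python
# Hypothetical commands: spoken phrase -> action identifier in your application.
COMMANDS = {
    "next slide": "presentation.next",
    "previous slide": "presentation.previous",
    "start presentation": "presentation.start",
}

# Hypothetical grammatical category for every word used in the phrases above.
CATEGORIES = {
    "next": "DIRECTION",
    "previous": "DIRECTION",
    "start": "ACTION",
    "slide": "OBJECT",
    "presentation": "OBJECT",
}


def plan_scenario(commands, categories):
    """Derive the required vocabulary and grammar from the command phrases."""
    vocabulary = set()
    grammar = set()
    for phrase in commands:
        words = phrase.split()
        vocabulary.update(words)  # every spoken word needs a vocabulary entry
        grammar.add(tuple(categories[w] for w in words))  # one structure per phrase
    return sorted(vocabulary), sorted(grammar)


vocabulary, grammar = plan_scenario(COMMANDS, CATEGORIES)
print(vocabulary)
print(grammar)
```

Note that the derived grammar accepts more than the three phrases (for example "next presentation" also fits DIRECTION OBJECT), so the command list still decides which recognized sentences trigger an action.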
SLIDE 10
Writing your Scenario
Demonstration
SLIDE 11 Tighter Integration: Write a Custom Command Plug-In
- Full, programmatic control of the scenario
- Meta information of recognition results:
  – Phonetic transcriptions
  – Confidence scores*
  – Alternative results*
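A minimal sketch of how an application might use this meta information: pick the most confident of several alternative results and reject it if the score is too low. `RecognitionResult`, `pick_result`, the 0.0 to 1.0 confidence range and the 0.7 threshold are all assumptions made for illustration. This is not the Simon plug-in API, which is C++.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecognitionResult:
    sentence: str      # recognized text
    phonetic: str      # phonetic transcription (illustrative notation)
    confidence: float  # assumed to lie between 0.0 and 1.0


def pick_result(
    alternatives: List[RecognitionResult], min_confidence: float = 0.7
) -> Optional[RecognitionResult]:
    """Return the most confident alternative, or None if it is below the threshold."""
    if not alternatives:
        return None
    best = max(alternatives, key=lambda r: r.confidence)
    return best if best.confidence >= min_confidence else None


# Hypothetical alternatives for a single utterance.
alternatives = [
    RecognitionResult("next slide", "N EH K S T S L AY D", 0.82),
    RecognitionResult("next side", "N EH K S T S AY D", 0.41),
]
chosen = pick_result(alternatives)
print(chosen.sentence if chosen else "rejected")
```

Returning `None` for low-confidence input lets the application ignore the utterance or ask the user to repeat it instead of executing a doubtful command.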
SLIDE 12
Tighter Integration: Write a Custom Command Plug-In
Demonstration
SLIDE 13
Q & A
#kde-speech Peter Grasch peter@grasch.net
SLIDE 14
Thank you for your attention