
A Realtime, Open-Source Speech-Processing Platform for Research in Hearing Loss Compensation
openspeech.ucsd.edu
Harinath Garudadri, Arthur Boothroyd, Ching-Hua Lee, Swaroop Gadiyaram, Justyn Bell, Dhiman Sengupta, Sean Hamilton, Krishna Chaithanya Vastare, Rajesh Gupta, Bhaskar D. Rao
The 51st Asilomar Conference on Signals, Systems and Computers
November 1, 2017

Outline
• Open Speech Platform (OSP): an architecture that enables advanced research on hearing loss compensation.
• Real-Time Master Hearing Aid (RT-MHA): software implementing the basic and advanced features found in commercial hearing aids (HAs).
• Current signal processing libraries and reference designs.
• User device for remote control of the HA parameters.
• Performance comparison with commercial HAs.

The Open Speech Platform (OSP)

OSP for Hearing Loss Research
• Realtime, wearable, open source.
• Offloads processing from the ear-level assemblies, thereby eliminating the CPU bottleneck and the communication bottleneck between the left and right HAs.
• Can be configured at compile time and at run time.
• Aims to support audiologists and hearing aid (HA) researchers in investigating advanced HA algorithms in field studies.

Real-Time Master Hearing Aid (RT-MHA)
• The basic functionality of hearing aid (HA) software is complete in our OSP.
• Libraries are implemented in C for (i) basic and (ii) advanced features found in commercial HAs.
• Runs on a MacBook with an overall latency of 7.98 ms.
• The software works with off-the-shelf microphones and speakers for real-time input and output.

[Figure: RT-MHA block diagram. 96 kHz domain: ADC, mic array processing, resample 3:1. 32 kHz domain: the feedback estimate ŷ(n) is subtracted from x(n) to give e(n); e(n) is split into Subband-1 to Subband-6, each followed by its own WDRC stage (WDRC-1 to WDRC-6), giving s(n); feedback path estimation updates the filter taps of the feedback cancellation filter. 96 kHz domain: resample 1:3, DAC.]

Processing parameters
• 32 kHz domain: block size = 32 (1 ms)
• 96 kHz domain: input buffer size = 96 (1 ms), output buffer size = 96 (1 ms)
• HA process latency = 3 ms
• Subband filter length = 193
• Feedback filter length = 128

Latency
• Input buffer = 1 ms
• Mic array processing = 0.03 ms
• Resample 3:1 = 0.125 ms
• HA process = 3 ms
• Resample 1:3 = 0.125 ms
• Output buffer = 1 ms
• H/W - OS (measured) = 4.7 ms
• Total latency = 7.98 ms

System
• Laptop/Wearable: Hearing Aid, OSP Layer
• Android Device: Hearing Aid Device Control, OSP Layer
• The two OSP Layers are connected over TCP/IP.
• The OSP Layer communicates subband compression/gain parameters to the HA device and receives data and diagnostics.

Hearing aid functionality is simulated in software. This implementation meets ANSI S3.22 requirements and is currently being ported to an embedded platform.
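The buffer durations quoted in the figure follow directly from the buffer sizes and sampling rates. The sketch below reproduces that arithmetic. The function names are illustrative and not part of the OSP libraries, and the FIR delay formula assumes linear-phase subband filters, which the slides do not state.

```c
/* Duration in milliseconds of n_samples at a sampling rate of fs_hz. */
double samples_to_ms(int n_samples, double fs_hz)
{
    return 1000.0 * (double)n_samples / fs_hz;
}

/* Group delay in milliseconds of an FIR filter with n_taps coefficients.
 * ASSUMPTION: the filter is linear phase, so its delay is (n_taps - 1) / 2
 * samples. The slides do not say whether the subband filters are. */
double fir_delay_ms(int n_taps, double fs_hz)
{
    return 1000.0 * ((n_taps - 1) / 2.0) / fs_hz;
}
```

With these helpers, `samples_to_ms(32, 32000.0)` and `samples_to_ms(96, 96000.0)` both give 1 ms, matching the block and buffer durations in the figure. Under the linear-phase assumption, `fir_delay_ms(193, 32000.0)` gives 3 ms, which is consistent with the quoted HA process latency.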

RT-MHA System Description
• The architecture with different sampling rates (96 kHz for I/O and 32 kHz for main processing) has the benefit of minimizing hardware latency and improving spatial resolution of beamforming with multiple microphones.
• The basic functions are implemented in the 32 kHz domain:
  (i) Subband Decomposition
  (ii) Wide Dynamic Range Compression (WDRC)
  (iii) Adaptive Feedback Cancellation (AFC)
• Algorithms are provided in source code and compiled libraries.

Subband Decomposition
• Enables independent gain control in multiple frequency regions, called subbands, obtained by decomposing the signal with a set of FIR filters.
• The filters are designed in MATLAB and saved in .flt files for inclusion with the RT-MHA software.
• The bandwidths and the upper and lower cut-off frequencies of the filters are determined by a set of critical frequency values.
• Users can modify the MATLAB scripts to use FIR filters of a different length and a different number of subbands.
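A minimal sketch of a subband decomposition of this kind: one block of samples is passed through a bank of FIR filters, one per subband. This is not the RT-MHA source. The function names, the back-to-back coefficient layout, and the per-band delay lines are assumptions made for illustration.

```c
#include <stddef.h>

/* Direct-form FIR filter over one block.
 * h    : n_taps coefficients (n_taps >= 1)
 * hist : delay line of n_taps samples, newest first, zeroed before first use
 * in   : n input samples, out : n output samples */
void fir_filter_block(const float *h, size_t n_taps, float *hist,
                      const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        /* Shift the delay line by one sample and insert the new input. */
        for (size_t k = n_taps - 1; k > 0; --k)
            hist[k] = hist[k - 1];
        hist[0] = in[i];

        /* Convolution sum for the current output sample. */
        float acc = 0.0f;
        for (size_t k = 0; k < n_taps; ++k)
            acc += h[k] * hist[k];
        out[i] = acc;
    }
}

/* Split one block into n_bands subbands.
 * h and hist hold n_bands filters / delay lines of n_taps each, back to back
 * (hypothetical layout); out holds n_bands blocks of n samples, back to back. */
void subband_decompose(const float *h, float *hist,
                       size_t n_bands, size_t n_taps,
                       const float *in, float *out, size_t n)
{
    for (size_t b = 0; b < n_bands; ++b)
        fir_filter_block(h + b * n_taps, n_taps, hist + b * n_taps,
                         in, out + b * n, n);
}
```

Keeping a separate delay line per band lets the function be called block after block (for example 32 samples at a time) without discontinuities at block boundaries.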

Frequency Responses of the Subband Filters

WDRC
• The WDRC algorithm in the RT-MHA is based on a version by Prof. James Kates, utilizing:
  (i) Envelope Detection (Peak Detector)
  (ii) Nonlinear Amplification (Compression Rule)
• Primary control parameters: Compression Ratio (CR), Attack Time (AT), Release Time (RT), and Upper and Lower Knee-points (K_up and K_low).
• These WDRC parameters can be specified at compile time and changed at run time using the user device.

Peak Detector and Compression Rule
• In each subband, the peak detector tracks the envelope variations and estimates the signal power accordingly.
• The estimated input power level is then passed to a compression rule that determines the amount of amplification.

Peak Detector
• Tracking the envelope by a recursive update:

$$
p(n) =
\begin{cases}
\alpha\, p(n-1) + (1 - \alpha)\, |x(n)|, & \text{if } |x(n)| \ge p(n-1) \\
\beta\, p(n-1), & \text{otherwise}
\end{cases}
$$

where $\alpha$ and $\beta$ are constants determined from AT and RT, respectively.
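The recursive update above can be sketched in C as follows. The function names are illustrative. The attack and release constants are taken as inputs (`alpha`, `beta`) because the slides do not give their mapping from AT and RT.

```c
#include <stddef.h>

/* One step of the peak detector: returns p(n) from p(n-1) and x(n).
 * alpha and beta are the attack and release constants; how they are derived
 * from AT and RT is not given in the slides, so they are passed in directly. */
float peak_detect_step(float p_prev, float x, float alpha, float beta)
{
    float mag = (x < 0.0f) ? -x : x;   /* |x(n)| */
    if (mag >= p_prev)
        return alpha * p_prev + (1.0f - alpha) * mag;  /* attack branch */
    return beta * p_prev;                              /* release branch */
}

/* Run the detector over a block. *p_state carries p(n-1) between blocks. */
void peak_detect_block(float *p_state, float alpha, float beta,
                       const float *x, float *p, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        *p_state = peak_detect_step(*p_state, x[i], alpha, beta);
        p[i] = *p_state;
    }
}
```

One `p_state` value would be kept per subband, since each subband tracks its own envelope.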

Compression Rule
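The slide's compression curve is not reproduced in this transcript. As an illustration only, the sketch below uses a common three-segment static rule: constant gain below the lower knee-point, compression with ratio CR between the knee-points, and output limiting above the upper knee-point. The struct, the function name, and the `gain_db` parameter are assumptions, and the rule used in the RT-MHA may differ.

```c
/* Parameters of an assumed three-segment WDRC rule, all levels in dB. */
typedef struct {
    float cr;        /* compression ratio (CR), >= 1 */
    float k_low_db;  /* lower knee-point (K_low) */
    float k_up_db;   /* upper knee-point (K_up) */
    float gain_db;   /* ASSUMED parameter: gain below the lower knee-point */
} wdrc_params;

/* Gain in dB to apply for an estimated input level in dB. */
float wdrc_gain_db(const wdrc_params *w, float level_db)
{
    /* dB of gain removed per dB of input inside the compression region. */
    float slope = 1.0f - 1.0f / w->cr;

    if (level_db <= w->k_low_db)
        return w->gain_db;                                   /* linear region */
    if (level_db <= w->k_up_db)
        return w->gain_db - slope * (level_db - w->k_low_db); /* compression */

    /* Above K_up: hold the output level constant (limiting). */
    return w->gain_db - slope * (w->k_up_db - w->k_low_db)
                      - (level_db - w->k_up_db);
}
```

The input level would come from the peak detector output converted to dB, and the returned gain would be converted back to a linear factor before being applied to the subband signal.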

AFC
• Least Mean Square (LMS) based algorithms.
• Filtered-X LMS (FXLMS), Proportionate Normalized LMS (PNLMS), and Sparsity promoting LMS (SLMS) [1].
• A new approach to estimating the Added Stable Gain (ASG) of AFC algorithms [2] for researchers to compare AFC systems in file-based mode.

[1] Ching-Hua Lee, Bhaskar Rao, and Harinath Garudadri, "Sparsity promoting LMS for adaptive feedback cancellation," European Signal Processing Conference (EUSIPCO), 2017.
[2] Ching-Hua Lee, James Kates, Bhaskar Rao, and Harinath Garudadri, "Speech quality and stable gain trade-offs in adaptive feedback cancellation for hearing aids," The Journal of the Acoustical Society of America Express Letters (JASA-EL), 2017.
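For orientation, the AFC algorithms listed on this slide build on the basic LMS idea of adapting a feedback-path estimate to reduce the error signal. The sketch below is a plain normalized LMS (NLMS) step, swapped in as the simplest member of the family. It is not FXLMS, PNLMS, or SLMS, and it is not the OSP code. The function name, step size `mu`, and regularizer `eps` are illustrative.

```c
#include <stddef.h>

/* One sample of a plain NLMS feedback canceller (illustrative only).
 * w      : n_taps coefficients of the feedback-path estimate, updated in place
 * s_hist : the last n_taps loudspeaker samples s(n), newest first
 * x      : current microphone sample x(n)
 * mu     : step size, eps : small constant that avoids division by zero
 * Returns the error e(n) = x(n) - y_hat(n), the feedback-compensated signal. */
float nlms_afc_step(float *w, const float *s_hist, size_t n_taps,
                    float x, float mu, float eps)
{
    float y_hat = 0.0f;   /* estimated feedback signal */
    float power = 0.0f;   /* energy of the reference signal in the window */
    for (size_t k = 0; k < n_taps; ++k) {
        y_hat += w[k] * s_hist[k];
        power += s_hist[k] * s_hist[k];
    }

    float e = x - y_hat;

    /* Normalized gradient step on every tap. */
    float g = mu * e / (power + eps);
    for (size_t k = 0; k < n_taps; ++k)
        w[k] += g * s_hist[k];

    return e;
}
```

The variants named on the slide change how this update is filtered or weighted per tap; see [1] for the sparsity promoting version.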

Software Modules in Release 2017a

Reference Designs
• The reference design is provided in the files ospprocess.c and ospprocess.h.
• If you are working on alternate implementations of basic HA functions, we suggest cloning a given function and calling the clone in the reference design.
• Additional functionality can also be implemented by adding the related .c and .h files to libosp and modifying the reference directory accordingly.
• Keeping interfaces the same will minimize code changes.

User Device
• An Android app that provides real-time changes to WDRC parameters.
• Implemented above the TCP/IP layer in a software stack called OSPLayer.
• The modular structure enables investigations of self-fitting and auto-fitting algorithms.

User Interface

RT-MHA Performance
• Compared with 4 commercial HAs (Systems A to D)

Metric               | Units  | System A | System B | System C | System D  | OSP Low-power Rx | OSP High-power Rx
---------------------|--------|----------|----------|----------|-----------|------------------|------------------
Nominal Gain         | dB     | 40       | 40       | 25       | 35        | 40               | 40
Max OSPL90           | dB SPL | 107      | 112      | 110      | 111       | 121              | 130
Avg OSPL90           | dB SPL | 106      | 109      | 108      | 106       | 112              | 126
Avg Gain @ 50 dB     | dB     | 37       | 39       | 25       | 35        | 35               | 41
Freq. Response       | kHz    | 0.2-5    | 0.2-6    | 0.2-5    | 0.2-6.725 | 0.2-8            | 0.2-6.3
Eq. Input Noise      | dB SPL | 27       | 26       | 30       | 27        | 29               | 28
Distortion @ 500 Hz  | % THD  | 1        | 1        | 0        | 0         | 2                | 1
Distortion @ 800 Hz  | % THD  | 1        | 1        | 0        | 0         | 3                | 2
Distortion @ 1600 Hz | % THD  | 0        | 0        | 0        | 0         | 1                | 1

Summary and Future Plans
• Takeaway message: an open source, realtime, wearable speech lab that DSP experts can contribute to, enabling new discoveries in hearing aids, hearables, and hearing healthcare in general.
• Release 2017b: bug fixes and optimizations for the wearable device
• Release 2018a: RT-MHA ported to the wearable device hardware
