Telephone Based Automatic Voice Pathology Assessment. Rosalyn Moran 1, R. B. Reilly 1, P.D. Lacy 2. 1 Department of Electronic and Electrical Engineering, University College Dublin, Ireland


SLIDE 1

D.S.P. Research Group, University College Dublin

Telephone Based Automatic Voice Pathology Assessment

Rosalyn Moran 1, R. B. Reilly 1, P.D. Lacy 2

1 Department of Electronic and Electrical Engineering, University College Dublin, Ireland

2 Royal Victoria Eye and Ear Hospital (RVEEH), Dublin, Ireland

AAAI Fall Symposium, Dialogue Systems for Health Communication, Washington DC, Oct 2004.

…To Follow

  • Why a system for voice disease?
  • The Anatomy of Voice
  • Generic Voice Classifiers
  • The next obvious step
  • System Design
  • Results
  • Refinement
SLIDE 2

Why is this area important?

Voice disorders are relatively common in the general population.

  • 5% suffer abnormalities requiring medical intervention.
  • Cancerous tumors of the vocal folds account for 40% of all head and neck carcinomas.
  • At-risk professionals: logistically difficult to monitor.

Currently, accurate diagnosis requires visualisation of the larynx.

  • Videostroboscopy is the current gold standard.
  • It is costly, time consuming, often subjective and labour intensive.

Anatomy of the Vocal Folds: Healthy Waves

  • The vagus nerve activates fold closure.
  • Air pressure from the lungs forces the folds apart.
  • For short “voiced” phonation (long vowels) the folds move periodically: the mucosal wave.

SLIDE 3

Vocal Fold Pathologies: Unhealthy Waves

Structural (growths) or neurological (loss of effective nerve action):

  • Lack of constancy
  • Escaping Air due to incomplete fold closure

Generic voice pathology classifier

Acquisition -> Feature Extraction -> Classifier -> Normal / Abnormal

  • Acquisition: sustained phonation of the vowel sound /a/ (language independent!); high quality speech, 25kHz, sound proof chamber.
  • Feature Extraction: measures of vocalisation constancy (pitch, amplitude, noise).
  • Classifier: various (HMMs, ANNs, LDA).

SLIDE 4

Where have we come to date?

  • Successful automatic classifiers: classification rate in excess of 90% for separating normal from pathological voice.

(Godino-Llorente, P Gomez-Vilda, IEEE Transactions on Biomedical Engineering; C. Maguire, P de Chazal, R Reilly, P.D. Lacy, World Congress on Medical Physics and Biomedical Engineering.)


Motivation: Can we make this more useful? Could you use a telephone…

Under Investigation…

New method of acquisition employing IVRs (interactive voice response systems) to allow transfer of data across telephone networks and the internet.

  • Remote
  • Secure
  • Identifiable

Acquisition: IVR -> Feature Extraction -> Classifier -> Normal / Abnormal

System Infrastructure: VoiceXML

SLIDE 5

An intelligent dialogue system

  • DSP: incorporation of digital signal processing algorithms
  • Comms: transmission characteristics

Voice XML Acquisition

  • VoiceXML scripts held on a web server.
  • Transferred to the VoiceXML gateway Voxpilot for TTS and speaker recognition.
  • Dial up applications using any telephone.
SLIDE 6

Database: Disordered Voice Database Model 4337

  • Massachusetts Eye and Ear Infirmary
  • 631 valid patient samples of sustained phonation of /a/
  • Wide variety of pathologies condensed to normal / abnormal in this study
  • Prelabelled by a panel of experts
  • Recorded in a soundproof environment using a high quality microphone

…Corrupting The Corpus

To identify causes of information loss:

  • Imitate telephone conditions by progressively degrading the quality of the database.
  • Examine feature accuracies at each stage.

SLIDE 7

…Creating 5 Test Corpora

  • 1. Begin: 631 high quality speech files @ 10kHz
  • 2. Degrade: resample to 8kHz
  • 3. Degrade: bandpass filter from 100Hz-3.2kHz
  • 4. Degrade: add noise
  • 5. Transmit: original database
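
The noise step (4) can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes white Gaussian noise, uses the 30 dB SNR quoted with the results, and runs on a synthetic 150 Hz sine as a stand-in for a sustained vowel. The function names are hypothetical. Resampling and bandpass filtering (steps 2 and 3) would normally be done with a signal processing library and are not shown.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise scaled to the requested SNR in dB.

    Assumption: the noise is white and Gaussian; the slides do not say which
    noise model was used.
    """
    rng = rng or random.Random(0)
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR actually achieved by comparing noisy against clean."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Synthetic stand-in for a sustained vowel: one second of a 150 Hz sine at 8 kHz.
fs = 8000
clean = [math.sin(2 * math.pi * 150 * n / fs) for n in range(fs)]
noisy = add_noise_at_snr(clean, 30.0, random.Random(1))
```

Calling measured_snr_db(clean, noisy) afterwards is a quick check that the corrupted file sits close to the intended SNR.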

Transmission Channels: Analog and Digital Long Distance Links

  • 1. VoiceXML Calling Application: plays 30 speech files.
  • 2. VoiceXML Application: “answer” and save transmitted speech files.

*CORPUS 5

SLIDE 8

Features Features Features!

…of medical relevance, in conjunction with our medical consultants

  • Pitch Perturbation Features, Jitter (12)
  • Amplitude Perturbation Features, Shimmer (12)
  • Energy Measures, Harmonic to Noise Ratio HNR (11)
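
As an illustration of what a perturbation feature measures, the sketch below computes one common form of relative jitter and shimmer: the mean absolute cycle-to-cycle difference as a percentage of the mean. The slides list 12 jitter and 12 shimmer features without defining them, so this is only one plausible member of each group. The function name and the sample values are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive cycles, as a % of the mean.

    Applied to pitch periods this gives a relative jitter; applied to
    per-cycle peak amplitudes it gives a relative shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical per-cycle measurements from a sustained /a/.
periods_ms = [8.0, 8.1, 7.9, 8.0, 8.2]
peak_amplitudes = [1.00, 0.95, 1.05, 1.00, 0.90]

jitter_percent = relative_perturbation(periods_ms)
shimmer_percent = relative_perturbation(peak_amplitudes)
```

A perfectly periodic voice would give 0% for both, so larger values indicate the "lack of constancy" associated with pathology.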

Classifier / Performance Estimation

Classifier

  • Linear Discriminant Analysis
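
A two-class linear discriminant can be sketched in a few lines. This is a minimal illustration, not the classifier used in the study: it assumes only two features per recording, equal class priors, and a pooled within-class scatter matrix. The function names and the (jitter %, shimmer %) training values are hypothetical.

```python
def fit_lda_2d(class0, class1):
    """Fit a two-class Fisher/LDA rule for 2-D feature vectors (equal priors).

    Returns (w, threshold); a point x is labelled 1 when w . x > threshold.
    """
    def mean(points):
        n = len(points)
        return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

    m0, m1 = mean(class0), mean(class1)

    # Pooled within-class scatter matrix [[a, b], [b, d]].
    a = b = d = 0.0
    for points, m in ((class0, m0), (class1, m1)):
        for x, y in points:
            dx, dy = x - m[0], y - m[1]
            a += dx * dx
            b += dx * dy
            d += dy * dy

    det = a * d - b * b
    if det == 0:
        raise ValueError("singular scatter matrix")

    # w = S_w^{-1} (m1 - m0), using the closed-form inverse of a 2x2 matrix.
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [(d * diff[0] - b * diff[1]) / det, (-b * diff[0] + a * diff[1]) / det]

    # With equal priors the boundary passes through the midpoint of the means.
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def predict(w, threshold, x):
    """Return 1 (abnormal) or 0 (normal) for a 2-D feature vector x."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0


# Hypothetical (jitter %, shimmer %) pairs for illustration only.
normal = [(0.4, 2.0), (0.5, 2.5), (0.6, 2.2), (0.5, 1.8)]
abnormal = [(1.5, 6.0), (1.8, 7.5), (1.2, 5.5), (2.0, 6.8)]

w, threshold = fit_lda_2d(normal, abnormal)
```

With the full feature set the same idea applies in higher dimensions, where a linear algebra library would replace the hand-written 2x2 inverse.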

Performance Estimation

  • Normal recordings duplicated to balance classes
  • 10 runs of 10 fold cross-validation, with independent training and testing sets
  • Specificity, sensitivity, predictivities, accuracy
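
The evaluation scheme above can be sketched as follows. This is an assumption-laden outline rather than the authors' code: "positive" is taken to mean abnormal, the confusion counts are invented for illustration, and the splitter simply reshuffles and partitions the sample indices on each run. Function names are hypothetical.

```python
import random


def confusion_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    Assumption: 'positive' means an abnormal (pathological) voice.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictivity": tp / (tp + fp),
        "negative_predictivity": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def repeated_kfold(n_samples, k=10, runs=10, seed=0):
    """Yield (train_idx, test_idx) pairs for `runs` shuffled k-fold partitions."""
    rng = random.Random(seed)
    for _ in range(runs):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [order[i::k] for i in range(k)]
        for i, test_idx in enumerate(folds):
            train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
            yield train_idx, test_idx


# Invented counts, for illustration only.
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)

# 10 runs x 10 folds = 100 train/test splits over 100 sample indices.
splits = list(repeated_kfold(n_samples=100))
```

One design point worth noting: if normal recordings are duplicated before the split, copies of the same recording can fall on both sides of a fold. The slides do not say how this was handled.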

SLIDE 9

Classification Performance

Composite Feature Breakdown

Accuracy (%) by feature group for each test corpus:

  • Clean (10kHz): Jitter 68.93, Shimmer 77.1, HNR 77.85
  • Bandlimited (sampling rate 8kHz): Jitter 66.09, Shimmer 77.97, HNR 79.79
  • Filtered 100Hz-3200Hz: Jitter 66, Shimmer 75.63, HNR 63.66
  • Noise corrupted (30dB SNR): Jitter 75.7, Shimmer 74.85, HNR 57.16

SLIDE 10

Composite Feature Breakdown: Telephone Corpus

  • Accuracy (%): Jitter 64.7, Shimmer 73.03, HNR 57.85

Shimmer group proving most robust. HNR accuracies fall significantly.

Moving On

Classification rate of 74% for separating normal from pathological voice… over the telephone.

Further Refinement

  • Homogeneous data sets: Physical, Neuromuscular, Mixed

SLIDE 11

Physical Pathology

  • Occurs when the function of the larynx has been affected by a physical change in the anatomy of the larynx, for example an arytenoid granuloma.

Neuromuscular Pathology

  • Occurs when the nerves that control the movement of the muscles in the larynx have been altered in some way. An instance of this is vocal fold paralysis.

SLIDE 12

Telephone Based Results: Improved Accuracy

  • Neuromuscular: 87% accuracy
  • Physical: 78% accuracy
  • Mixed: 61% accuracy

Wider Impact for Healthcare

Opportunities for providing related health care information by voice applications.

  • Voice assessment
  • Speech training
  • Improving literacy

SLIDE 13

Thank you

How is your voice?

rosalyn.moran@ee.ucd.ie