
Spoken Language Biomarkers for Detecting Cognitive Impairment

Spoken Language Biomarkers for Detecting Cognitive Impairment. Tuka Alhanai. Advisor: James Glass. Spoken Language Systems Group, Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology. 3rd May 2018. tuka@mit.edu


  1. Spoken Language Biomarkers for Detecting Cognitive Impairment. Tuka Alhanai. Advisor: James Glass. Spoken Language Systems Group, Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology. 3rd May 2018. tuka@mit.edu, talhanai, talhanai.com

  3. Objective: Automatically detect cognitive conditions using spoken language.

  4. Cognitive impairment. Definition: decline in mental abilities that is severe enough to interfere with daily life. • Alzheimer’s • Vascular Dementia • Lewy Body Dementia

  5. Cognitive impairment. Definition: decline in mental abilities that is severe enough to interfere with daily life. • 2nd to spinal cord injuries in terms of its debilitating effects [WHO, 2003]. • $200B expenditure in USA [Alzheimer’s Association, 2015]; equivalent value as: (figure)

  6. Why detect it?

  7. Figure (Nestor et al. 2004): cognitive function and pathological load across the Normal, MCI, and Dementia stages.

  8. Plan • Hospital in the home: 35% hosp. visits, 4% mortality rates. • 50% suffer from depression. • $80K a year.

  9. Lifestyle • Delay onset by 7 months • Delay onset by 4 years • Delay onset by 2 months • 3 times a week, 45% lower risk • Fish meal a week, 70% lower risk

  10. Prevention (Vascular, Lewy Body, Parkinson’s, Alzheimer’s) • SIRT3 protein protects brain cells against degeneration. • AD alone less damaging than mixed pathologies. • Non-steroidal anti-inflammatory drugs lower risk.

  12. Data: Audio recordings of neuropsychological exams at the Framingham Heart Study.

  14. Framingham Heart Study • since 1948 • 15,000+ subjects • recording audio since 2006 • neuropsychological exams

  15. Outcome • recall details • describe scene • recall verbal pair associates

  16. Outcome • severity • onset • cause • reviewed

  17. Study: 92 subjects (21 impaired)

  18. Data statistics • N Subjects: 92 • N Impaired: 21 (22.8%) • Age: 68 years (+/- 17) • Gender: 47 male (51 female) • Duration: 65 minutes (+/- 18) • Vocabulary Size: 527 words (+/- 181) • Transcript Size: 2,496 words (+/- 1,508)

  19. Outcome of interest • Binary cognitive impairment • According to dementia review panel assessment • Pathology: 14 Alzheimer’s, 5 Vascular Dementia • Severity: 10 < mild, 6 mild, 5 moderate

  20. Assessment • AUC: Area Under the Receiver Operating Characteristic (ROC) Curve • TPR: True Positive Rate • FPR: False Positive Rate • HL-test: Hosmer-Lemeshow test for statistical calibration • LOOCV: Leave-one-out cross-validation
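
The AUC and "TPR at a fixed FPR" metrics listed on this slide can be illustrated with a small sketch. It is not taken from the talk or its source code: the function names and the toy labels and scores are assumptions made for illustration, and only the Python standard library is used.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, sweeping the threshold from high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def tpr_at_fpr(points, max_fpr=0.10):
    """Highest TPR reachable without exceeding the given FPR."""
    return max(tpr for fpr, tpr in points if fpr <= max_fpr)


# Toy data (invented): 1 = impaired, 0 = not impaired.
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
toy_points = roc_points(toy_labels, toy_scores)
toy_auc = auc(toy_points)
toy_tpr = tpr_at_fpr(toy_points, max_fpr=0.10)
```

Under LOOCV, the scores passed to such a function would be the held-out predictions, one per subject, each produced by a model trained on the remaining subjects.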

  21. Modeling

  22. Inside the box. Models: • Support vector machine (SVM) • Discriminant analysis • Decision tree • K-nearest neighbor • Logistic regression (interpretable, best performing)

  24. Baseline model • Output: binary cognitive impairment • Model: logistic regression • Features: age, education (high school, some college, college), employment (part-time, never, retired, volunteer, unemployed, other, disability), gender (female)
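
A baseline of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the category levels come from the slide, but the age scaling, the plain gradient-descent fit, the function names, and every toy row are invented for the example.

```python
import math

# Category levels as listed on the slide; everything else here is assumed.
EDUCATION = ["high school", "some college", "college"]
EMPLOYMENT = ["part-time", "never", "retired", "volunteer",
              "unemployed", "other", "disability"]


def encode(age, education, employment, gender):
    """Turn one subject into a numeric feature vector (one-hot categories)."""
    features = [(age - 65) / 10.0]  # centred, scaled age (assumed scaling)
    features += [1.0 if education == level else 0.0 for level in EDUCATION]
    features += [1.0 if employment == level else 0.0 for level in EMPLOYMENT]
    features.append(1.0 if gender == "female" else 0.0)
    return features


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """Predicted probability of impairment for one feature vector."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by full-batch gradient descent."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(weights, bias, xi) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Invented toy rows: (age, education, employment, gender, impaired).
rows = [
    (82, "high school", "retired", "male", 1),
    (79, "some college", "retired", "male", 1),
    (85, "high school", "disability", "female", 1),
    (77, "high school", "unemployed", "male", 1),
    (58, "college", "part-time", "female", 0),
    (62, "college", "retired", "female", 0),
    (55, "some college", "part-time", "male", 0),
    (64, "college", "volunteer", "female", 0),
]
X = [encode(*row[:4]) for row in rows]
y = [row[4] for row in rows]
weights, bias = fit_logistic(X, y)
p_older = predict(weights, bias, encode(85, "high school", "disability", "female"))
p_younger = predict(weights, bias, encode(58, "college", "part-time", "female"))
```

Logistic regression is a natural fit for a baseline like this because each coefficient can be read directly as the direction and strength of a feature's association with the outcome.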

  25. Baseline model coefficients (figure), grouped by Age, Education, Employment, and Gender, with significance markers (***, *). Modeled on N = 6,258, evaluated on the 92 subjects. Impairment is more likely with: • Increasing age • Less education • Less employment • Male

  30. Audio pre-processing

  31. Extracting features

  33. Modeling

  34. Features (inputs) • Pitch • Jitter • Spectral Energy • Shimmer • RMS Energy • Segment Duration • Speaking Rate • Question Mark • # Words • Lexical Overlap • Language Perplexity
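
A few of these features have simple textbook definitions, sketched below. The slides do not say how each feature was computed, so the formulas (RMS over a frame, "local" jitter and shimmer as mean absolute change between neighbouring cycles divided by the mean, speaking rate as words per second), the function names, and all input values are assumptions for illustration.

```python
import math


def rms_energy(samples):
    """Root-mean-square energy of one audio frame."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def local_perturbation(values):
    """Mean absolute difference between neighbours, relative to the mean.

    Applied to pitch periods this gives a local jitter estimate;
    applied to per-cycle peak amplitudes it gives a local shimmer estimate.
    """
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def speaking_rate(words, duration_seconds):
    """Words spoken per second within a segment."""
    return len(words) / duration_seconds


# Invented example inputs.
frame = [3.0, -3.0, 3.0, -3.0]
pitch_periods = [0.010, 0.011, 0.010, 0.011]   # seconds per glottal cycle
peak_amplitudes = [1.0, 0.9, 1.0, 0.9]
segment_words = "the boy is reaching for the cookie jar".split()

energy = rms_energy(frame)
jitter = local_perturbation(pitch_periods)
shimmer = local_perturbation(peak_amplitudes)
rate = speaking_rate(segment_words, duration_seconds=4.0)
```

Low variation in pitch and energy across segments would correspond to the "monotonous voice" pattern, and a low word count or speaking rate to a "limited response".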

  36. Model coefficients (figure), grouped by Spectral Energy, Prosody, and Text features. Impairment is associated with: • Monotonous voice • Hesitation • Limited response

  40. Results (Features: AUC, TPR @ FPR 10%, HL-test) • Text: 0.69, 0.14, > 0.05 • Demographic: 0.79, 0.38, < 0.05 • Audio: 0.90, 0.71, > 0.05 • Text + Audio: 0.92, 0.76, > 0.05. Findings: • Text + Audio best performing (better than demographic) • Text + Audio also has best recall rate • Best performing model is well-calibrated
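
The HL-test column reports a Hosmer-Lemeshow calibration check, where a value above 0.05 is read as no evidence of miscalibration. The sketch below shows how the test statistic is usually formed; the grouping scheme, function name, and toy data are assumptions, and it is not the authors' code.

```python
def hosmer_lemeshow(labels, probs, groups=10):
    """Return (statistic, degrees of freedom) for the Hosmer-Lemeshow test.

    Subjects are sorted by predicted probability and split into equal-size
    groups; observed and expected event counts are compared per group.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    n = len(order)
    statistic = 0.0
    for g in range(groups):
        members = order[g * n // groups:(g + 1) * n // groups]
        if not members:
            continue
        size = len(members)
        observed = sum(labels[i] for i in members)
        expected = sum(probs[i] for i in members)
        # Skip degenerate groups where the variance term would be zero.
        if 0.0 < expected < size:
            statistic += (observed - expected) ** 2 / (
                expected * (1.0 - expected / size))
    return statistic, groups - 2


# Invented toy data: four probability levels, five subjects each.
toy_probs = [0.2] * 5 + [0.4] * 5 + [0.6] * 5 + [0.8] * 5
calibrated = ([1, 0, 0, 0, 0] + [1, 1, 0, 0, 0] +
              [1, 1, 1, 0, 0] + [1, 1, 1, 1, 0])
miscalibrated = [1 - label for label in calibrated]

stat_good, df = hosmer_lemeshow(calibrated, toy_probs, groups=4)
stat_bad, _ = hosmer_lemeshow(miscalibrated, toy_probs, groups=4)
```

The p-value is the upper tail of a chi-square distribution with `groups - 2` degrees of freedom evaluated at the statistic; if SciPy is available, `scipy.stats.chi2.sf(statistic, df)` computes it.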

  41. Conclusion • A method to quantify speech patterns to model cognitive impairment. • Utilize findings without formally deploying the model. • Don’t necessarily need to know exam structure.

  42. Future Work • 5,000+ subjects • 7,000+ audio recordings

  43. Details in Publication: “Spoken Language Biomarkers for Detecting Cognitive Impairment,” T. Alhanai, R. Au, and J. Glass, IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), December 2017. [Paper]: https://groups.csail.mit.edu/sls/publications/2017/ASRU17_alhanai.pdf [Source Code]: https://github.com/talhanai/asru2017-method.git
