  1. Evaluation of Objective Features for Classification of Clinical Depression in Speech by Genetic Programming
     Juan Torres (1), Ashraf Saad (2), Elliot Moore (1)
     (1) School of Electrical and Computer Engineering, Georgia Institute of Technology, Savannah, GA 31407, USA
     (2) Computer Science Department, School of Computing, Armstrong Atlantic State University, Savannah, GA 31419, USA
     juan.torres@gatech.edu, emoore@gtsav.gatech.edu, ashraf@cs.armstrong.edu

  2. Clinical Depression Classification
     - Goal: detect clinical depression by analyzing a patient's speech.
     - Binary decision classification problem.
     - The dataset contains a large number of features, so feature selection is necessary for:
       - designing a robust classifier;
       - identifying a small set of useful features, which may in turn provide physiological insight.

  3. Speech Database
     - 15 patients (6 male, 9 female)
     - 18 control subjects (9 male, 9 female)
     - Corpus: a 65-sentence short story
     - Observation groupings:
       - G1: 13 observations/speaker (5 sentences each)
       - G2: 5 observations/speaker (13 sentences each)
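Purely as an illustration (the helper name and the use of averaging are mine; the paper instead computes observation-level statistics over each grouping, as described on slide 6), the two groupings could be formed like this:

```python
import numpy as np

def group_observations(sentence_features, sentences_per_obs):
    """Group consecutive sentence-level feature rows into observations.

    sentence_features: array of shape (65, n_features) for one speaker.
    sentences_per_obs: 5 -> 13 observations (G1); 13 -> 5 observations (G2).
    Averaging is used here only as a stand-in for the paper's
    observation-level statistics.
    """
    n_obs = len(sentence_features) // sentences_per_obs
    trimmed = np.asarray(sentence_features)[:n_obs * sentences_per_obs]
    return trimmed.reshape(n_obs, sentences_per_obs, -1).mean(axis=1)

# Example: 65 sentences x 10 raw features for one speaker
feats = np.random.rand(65, 10)
g1 = group_observations(feats, 5)    # shape (13, 10)
g2 = group_observations(feats, 13)   # shape (5, 10)
```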

  4. Speech Features
     - Prosodics
     - Vocal tract resonant frequencies (formants)
     - Glottal waveform
     - Teager FM

  5. Speech Features (cont.)
     - Raw features are extracted frame by frame (25-30 ms) and grouped into 10 categories:
       Pitch (PCH), Energy Median Statistics (EMS), Energy Deviation Statistics (EDS),
       Speaking Rate (SPR), Glottal Timing (GLT), Glottal Ratios (GLR), Glottal Spectrum (GLS),
       Formant Locations (FMT), Formant Bandwidths (FBW), Teager FM (TFM)
     - EDS = STD(DFS(E_v)), EMS = MED(DFS(E_v))

  6. Statistics
     - Sentence-level statistics were computed for each raw feature → Direct Feature Statistics (DFS).
     - The same set of statistics was applied to the DFSs over each entire observation → Observation-Level Statistics.

     Statistic                   Equation
     Average (AVG)               (1/N) * Sum{x_i}
     Median (MED)                50th percentile
     Standard Deviation (STD)    sqrt( (1/(N-1)) * Sum{(x_i - mean(x))^2} )
     Minimum (MIN)               5th percentile
     Maximum (MAX)               95th percentile
     Range (RNG)                 MAX - MIN
     Dynamic Range (DRNG)        log10(MAX) - log10(MIN)
     Interquartile Range (IQR)   75th percentile - 25th percentile
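A minimal sketch (assuming a 1-D array of frame-level values for one sentence; the function name is hypothetical) of the eight statistics in the table:

```python
import numpy as np

def sentence_statistics(x):
    """Compute the eight sentence-level statistics for one raw feature's
    frame-level values within a sentence."""
    x = np.asarray(x, dtype=float)
    mn = np.percentile(x, 5)    # MIN is defined as the 5th percentile
    mx = np.percentile(x, 95)   # MAX is defined as the 95th percentile
    return {
        "AVG":  x.mean(),
        "MED":  np.median(x),
        "STD":  x.std(ddof=1),                      # unbiased (N-1) estimate
        "MIN":  mn,
        "MAX":  mx,
        "RNG":  mx - mn,
        "DRNG": np.log10(mx) - np.log10(mn),        # assumes positive values, e.g. pitch in Hz
        "IQR":  np.percentile(x, 75) - np.percentile(x, 25),
    }

# Example: pitch values (Hz) for the frames of one sentence
stats = sentence_statistics(np.random.uniform(80, 250, size=120))
```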

  7. Final Feature Sets
     - Result: 2000+ distinct features.
     - Statistical significance tests (ANOVA) were used to initially prune the feature set.
     - Final size: 298-1246 features → a large feature selection problem.

     Experiment   Observations   Features (OFS)
     MG1          195            724
     MG2          75             298
     FG1          234            1246
     FG2          90             857
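A minimal sketch of ANOVA-based pruning (the significance threshold of 0.05 is an assumption; the slide does not state the value used):

```python
import numpy as np
from scipy.stats import f_oneway

def anova_prune(X, y, alpha=0.05):
    """Keep only features whose class-conditional means differ significantly.

    X: (n_observations, n_features) feature matrix
    y: binary labels (0 = control, 1 = patient)
    """
    keep = []
    for j in range(X.shape[1]):
        _, p = f_oneway(X[y == 0, j], X[y == 1, j])  # one-way ANOVA across the two classes
        if p < alpha:
            keep.append(j)
    return keep

# Example with synthetic data
X = np.random.rand(195, 2000)
y = np.random.randint(0, 2, size=195)
selected = anova_prune(X, y)
```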

  8. Feature Selection
     - Goal: select a (small) group of features that maximizes classifier performance.
     - Approaches (see the sketch below):
       - Filter: optimize a computationally inexpensive fitness function.
       - Wrapper: fitness function = classification performance.
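A minimal sketch contrasting the two approaches (the ANOVA F-statistic filter and the naive-Bayes wrapper shown here are illustrative choices, not the paper's):

```python
import numpy as np
from scipy.stats import f_oneway
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def filter_score(X, y, subset):
    """Filter: a cheap statistic, no classifier in the loop (mean ANOVA F-value here)."""
    return np.mean([f_oneway(X[y == 0, j], X[y == 1, j])[0] for j in subset])

def wrapper_score(X, y, subset):
    """Wrapper: fitness is the cross-validated accuracy of an actual classifier."""
    return cross_val_score(GaussianNB(), X[:, subset], y, cv=5).mean()
```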

  9. Genetic Programming for Classification and FS (GPFS)
     - Estimates the optimal feature set and classifier simultaneously → "online approach" (Muni, Pal, Das 2006).
     - Advantages:
       - Evolutionary search explores a (potentially) large portion of the feature space.
       - The resulting classifier is a simple algebraic expression (easy to read and interpret).
       - Stochastic: multiple runs yield different solutions, so given a large number of runs, the frequency with which a feature is selected can be regarded as an approximate fitness measure.

  10. Genetic Programming
      - The classifier consists of expression trees; a binary decision needs only a single tree T.
      - Class is assigned by the algebraic sign of the evaluation: T > 0 → class 1, T < 0 → class 2.
      - Internal nodes: { +, -, *, / (protected) }
      - External nodes: { features, rnd_dbl(0-10) }
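A minimal sketch (the node representation and names are my own, not the paper's) of how such a single-tree classifier can be represented and evaluated; the value returned by protected division on a zero denominator is an arbitrary choice here:

```python
import operator

def protected_div(a, b):
    """Protected division: avoid crashing on divide-by-zero (1.0 is an arbitrary fallback)."""
    return a / b if abs(b) > 1e-12 else 1.0

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul, '/': protected_div}

def evaluate(node, x):
    """Evaluate an expression tree on feature vector x.

    A node is either ('feat', index), ('const', value), or (op, left, right).
    """
    kind = node[0]
    if kind == 'feat':
        return x[node[1]]
    if kind == 'const':
        return node[1]
    return OPS[kind](evaluate(node[1], x), evaluate(node[2], x))

def classify(tree, x):
    """Sign of the tree output decides the class: > 0 -> class 1, otherwise class 2."""
    return 1 if evaluate(tree, x) > 0 else 2

# Example tree: (x[3] * x[7]) - 2.5
tree = ('-', ('*', ('feat', 3), ('feat', 7)), ('const', 2.5))
print(classify(tree, [0.0] * 10))
```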

  11. Genetic Programming (cont.)
      - A large population of classifier trees is evolved over several generations.
      - Population initialization: random trees (height 2-6), ramped half-and-half method.
      - Fitness function = classification performance.
      - Evolutionary operators:
        - Reproduction (fitness-proportional selection)
        - Mutation (random selection)
        - Crossover (tournament selection)
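A skeleton of such an evolutionary loop, purely as an illustration (operator rates follow the parameter table on slide 14; `crossover` and `mutate` are left as caller-supplied tree operators):

```python
import random

def tournament_select(population, scores, k):
    """Pick the fittest of k randomly drawn individuals."""
    idx = random.sample(range(len(population)), k)
    return population[max(idx, key=lambda i: scores[i])]

def fitness_proportional_select(population, scores):
    """Roulette-wheel selection proportional to (non-negative) fitness."""
    return random.choices(population, weights=scores, k=1)[0]

def evolve(population, fitness_fn, crossover, mutate, generations,
           p_crossover=0.80, p_reproduction=0.05, tournament_size=10):
    """Generic GP loop: build each generation by crossover, reproduction, or mutation.

    crossover(p1, p2) returns two offspring trees; mutate(t) returns one tree.
    """
    for _ in range(generations):
        scores = [fitness_fn(t) for t in population]
        new_pop = []
        while len(new_pop) < len(population):
            r = random.random()
            if r < p_crossover:
                p1 = tournament_select(population, scores, tournament_size)
                p2 = tournament_select(population, scores, tournament_size)
                new_pop.extend(crossover(p1, p2))
            elif r < p_crossover + p_reproduction:
                new_pop.append(fitness_proportional_select(population, scores))
            else:
                new_pop.append(mutate(random.choice(population)))
        population = new_pop[:len(population)]
    return population
```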

  12. Evolutionary Rules for Simultaneous Feature Selection
      - Initial tree generation: the probability of selecting a feature set decreases linearly with feature set size (see the sketch below).
      - Fitness: biased toward trees that use few features.
      - Crossover:
        - Homogeneous: only between parents with the same feature set.
        - Heterogeneous: biased toward selecting parents with similar feature sets.
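A minimal sketch of the size-biased initialization and size-penalized fitness; the specific weighting and penalty formulas here are placeholders (the paper follows Muni, Pal, and Das 2006):

```python
import random

def sample_feature_subset(n_features, max_size):
    """Draw a feature subset whose size is chosen with probability
    decreasing linearly with subset size."""
    sizes = list(range(1, max_size + 1))
    weights = [max_size + 1 - s for s in sizes]          # linear decrease with size
    size = random.choices(sizes, weights=weights, k=1)[0]
    return random.sample(range(n_features), size)

def size_penalized_fitness(accuracy, n_used, n_total, bias=0.1):
    """Bias fitness toward trees that use few features (penalty weight is illustrative)."""
    return accuracy - bias * (n_used / n_total)
```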

  13. Dynamic Parameters
      - The fitness bias toward smaller subsets decreases with generations.
      - The probability of heterogeneous crossover decreases with generations.
      - Motivation: explore the feature space during the first few generations, then gradually concentrate on improving classification performance with the current feature sets.
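One possible realization of such a schedule, assuming a simple linear decay purely for illustration:

```python
def decayed(value_start, value_end, generation, total_generations):
    """Linearly interpolate a parameter from its initial to its final value."""
    frac = generation / max(total_generations - 1, 1)
    return value_start + (value_end - value_start) * frac

# e.g., the size-bias weight falls from 0.2 to 0.0 and the heterogeneous-crossover
# probability falls from 0.5 to 0.1 over 30 generations (values are placeholders)
bias  = [decayed(0.2, 0.0, g, 30) for g in range(30)]
p_het = [decayed(0.5, 0.1, g, 30) for g in range(30)]
```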

  14. GP Parameters

      Parameter                                             Value
      Crossover probability                                 0.80
      Reproduction probability                              0.05
      Mutation probability                                  0.15
      Prob. of selecting int./ext. node during crossover    0.8 / 0.2
      Prob. of selecting int./ext. node during mutation     0.7 / 0.3
      Tournament size                                       10
      Number of generations                                 30 for G1 / 20 for G2
      Initial height of trees                               2-6
      Maximum allowed nodes of a tree                       350
      Maximum height of a tree                              12
      Population size                                       3000 for G1 / 2000 for G2

  15. GP Results
      Classification performance, averaged over 10 runs of leave-one-out cross-validation:

                                  Male            Female
      Metric                      G1      G2      G1      G2      Mean
      Classification Accuracy     71.2    71.3    84.9    82.2    77.4
      Sensitivity                 80.9    74.7    85.4    82.7    80.9
      Specificity                 64.8    69.1    84.4    81.8    75.0
      Feature Set Size            18.5    15.3    16.1    14.2    16.0

  16. Feature Selection Histograms
      [Figure-only slide: histograms of how frequently each feature was selected across GP runs.]

  17. "Best" Features -- Males

      Male - G1                   Male - G2
      GLT: Max((CP)MIN)           GLT: Max((CP)MIN)
      GLT: DRng((CP)IQR)          PCH: Med(A1)
      GLS: Med((gSt1000)MAX)      EDS: Avg(MED)
      GLT: Std((OP)IQR)           GLT: IQR((CP)IQR)
      GLR: Rng((rCPO)IQR)         GLR: Min((rOPO)IQR)
      GLS: Avg((gSt1000)MAX)      EDS: Avg(AVG)
      EDS: Avg(AVG)               GLR: Med((rCPOP)MIN)
      EDS: Avg(MED)               GLT: Std((CP)MIN)
      EDS: Med(MED)               GLR: Max((rCPOP)MIN)
      GLT: Med((CP)MIN)           GLS: Avg((gSt1000)MAX)

  18. "Best" Features -- Females

      Female - G1                 Female - G2
      EMS: Med(MR)                EMS: IQR(AVG_1)
      EMS: Med(STD_1)             EMS: Med(STD_1)
      EMS: Max(MR)                PCH: IQR(IQR)
      EMS: Med(RNG)               EMS: Med(STD)
      EMS: Max(STD_1)             EMS: Max(MR)
      EMS: Med(AVG)               TFM: Avg(MAX(IQR))
      EMS: Max(MAX)               EMS: Med(MR)
      EMS: Avg(STD_1)             FBW: Med((bwF3)IQR)
      EMS: Avg(MED)               EMS: Med(MAX)
      EMS: Avg(AVG)               EMS: Med(RNG)

  19. GP Results (cont.)
      - The GP results were not as good as hoped for. However, the fact that certain features were selected in the final solutions more frequently than others can be regarded as a measure of their usefulness.
      - To test this hypothesis, we train Bayesian classifiers using the 16 features most frequently selected by GP.
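A minimal sketch (the data layout is assumed, not taken from the paper) of ranking features by how often they appear in the final GP solutions:

```python
from collections import Counter

def top_features(final_solutions, k=16):
    """Count how often each feature index appears across the final trees of
    all GP runs and return the k most frequently selected ones."""
    counts = Counter(f for solution in final_solutions for f in set(solution))
    return [feat for feat, _ in counts.most_common(k)]

# Example: each run's final solution is the set of feature indices its tree uses
runs = [{3, 17, 42}, {3, 42, 101}, {17, 42}]
print(top_features(runs, k=2))   # e.g., [42, 3]
```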

  20. Naive Bayesian Classification
      - Assign the class C_j with the highest probability given the observation (feature vector) X.
      - The posterior can be estimated using Bayes' rule:

        p(C_j | X) = p(X | C_j) P(C_j) / p(X)

      - Under the naive assumption, the class-conditional distributions can be expressed as:

        p(X | C_j) = prod_i p(x_i | C_j)
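A minimal sketch of this decision rule in log form; the per-feature density estimators are supplied externally (see the estimation methods on the next slide):

```python
import numpy as np

def naive_bayes_predict(x, priors, feature_pdfs):
    """Pick the class with the highest posterior under the naive assumption.

    priors:       {class_label: P(C_j)}
    feature_pdfs: {class_label: [pdf_0, pdf_1, ...]}, one 1-D density per feature
    """
    log_post = {}
    for c, prior in priors.items():
        # log P(C_j) + sum_i log p(x_i | C_j); p(X) is a common factor and can be ignored
        log_post[c] = np.log(prior) + sum(np.log(pdf(xi) + 1e-300)
                                          for pdf, xi in zip(feature_pdfs[c], x))
    return max(log_post, key=log_post.get)
```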

  21. PDF Estimation Methods
      - Uniform Bins: a histogram with N uniformly spaced intervals (bins) is computed for each feature and each class using the training data. The optimum value of N was found by exhaustive search.
      - Optimal Threshold: similar to uniform bins with N = 2, but the cutoff threshold between the two bins is chosen separately for each feature.
      - (Naive) Gaussian Assumption: the PDF of each feature and each class is modeled as a 1-D Gaussian density function whose mean and variance are taken as the sample mean and (unbiased) variance of the training data.
      - Gaussian Mixtures: each likelihood function p(X | C_j) is modeled as a weighted sum of multivariate Gaussian densities. The expectation-maximization (EM) algorithm is used to estimate the means, covariance matrices, and weights. We use diagonal covariance matrices and limit the number of mixtures to 3 for the G1 experiments and 2 for the G2 experiments in order to reduce the number of parameters to be estimated.
      - Multivariate Gaussian: each (class-conditional) likelihood function is modeled as a single multivariate Gaussian PDF with a full covariance matrix. Like the GMM, this method does not follow the naive assumption.
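A minimal sketch of two of these per-feature estimators (uniform bins and the 1-D Gaussian); the GMM and full-covariance variants could be built analogously, e.g. with sklearn.mixture.GaussianMixture:

```python
import numpy as np

def uniform_bin_pdf(train_values, n_bins):
    """Histogram-based density estimate with N uniformly spaced bins."""
    hist, edges = np.histogram(train_values, bins=n_bins, density=True)
    def pdf(x):
        i = np.clip(np.searchsorted(edges, x, side='right') - 1, 0, n_bins - 1)
        return hist[i]
    return pdf

def gaussian_pdf(train_values):
    """1-D Gaussian fit using the sample mean and unbiased sample variance."""
    mu, var = np.mean(train_values), np.var(train_values, ddof=1)
    return lambda x: np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Example: one density per feature per class, plugged into naive_bayes_predict above
train = np.random.randn(100)
p = gaussian_pdf(train)
print(p(0.0))
```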

  22. Results

      Male - G1                                   Female - G1
      Method              Acc   Sen   Spec        Method              Acc   Sen   Spec
      Unif Bin (N = 8)    86.7  83.3  88.9        Unif Bin (N = 9)    88.0  85.5  90.6
      Opt Thresh          82.6  82.1  82.9        Opt Thresh          78.6  65.8  91.5
      Gaussian            87.2  88.5  86.3        Gaussian            87.2  91.5  82.9
      GMM                 88.7  87.2  89.7        GMM                 87.6  88.0  87.7
      MVG                 84.1  83.3  84.6        MVG                 85.5  83.8  87.2

      Male - G2                                   Female - G2
      Method              Acc   Sen   Spec        Method              Acc   Sen   Spec
      Unif Bin (N = 2)    90.7  93.3  88.9        Unif Bin (N = 5)    93.3  93.3  93.3
      Opt Thresh          73.3  50.0  88.9        Opt Thresh          86.7  75.6  97.8
      Gaussian            89.3  93.3  86.7        Gaussian            91.1  95.6  86.7
      GMM                 90.7  90.0  91.1        GMM                 88.0  83.3  91.1
      MVG                 86.7  80.0  91.1        MVG                 92.2  86.7  97.8

      Average Improvement: 18.5% (Males), 7.1% (Females)
