SLIDE 1

http://pitt.edu/~emotion

Criteria and metrics for thresholded AU detection

Jeff Girard and Jeff Cohn, University of Pittsburgh

BeFIT Workshop, ICCV 2011

University of Pittsburgh – Affect Analysis Group

SLIDE 2

Facial Action Coding System

  • Facial action informs:
    – Emotion Displays
    – Pain Displays
    – Social Signaling

  • FACS Coding System:
    – Anatomically-based
    – Common vocabulary
    – Objective and reliable

[Figure: Action Unit 6+12+25+26]

BeFIT Workshop, ICCV 2011 2/16 November 13, 2011

SLIDE 3

Automatic AU Detection

  • FACS training, reliability, and manual coding are time consuming
  • Automatic AU detection would be faster and enable “online” coding
  • However, automatic coding requires a trained classifier
  • Classifier training requires time, expertise, and ground truth coding
  • Classifiers are considered corpus-specific; thus a new classifier is usually trained for each corpus

SLIDE 4

Classifier Strategy Comparison

Novel Classifier Training
Data Collection → Groundtruth Coding (subset) → Classifier Training (subset) → Automatic Coding

Strengths:
  + Classifier trained on same database
Limitations:
  • Requires ground truth coding
  • Requires classifier training

Naïve Classifier Implementation
Data Collection → Classifier from other Database → Automatic Coding

Strengths:
  + Requires no ground truth coding
  + Requires no classifier training
Limitations:
  • Classifier not trained on same database

SLIDE 5

Threshold Analysis Alternative

Data Collection → Classifier from other Database → Threshold Analysis (subset) → Automatic Coding

Strengths:
  + Requires no new classifier training
  + Threshold optimized for current database
Limitations:
  • Classifier not trained on current database
  • Requires some ground truth coding

SLIDE 6

Step 1 – Obtain Classifier

SVM Classifier

  • SIFT Feature Data
  • Radial Basis Kernel
  • 3-fold cross-validation

RU-FACS-1 Database

  • Classifier Training Set
    – 17 subjects (97000 frames)
  • Classifier Testing Set
    – 11 subjects (67000 frames)
  • False Opinion Paradigm
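For orientation, the decision value that an RBF-kernel SVM assigns to a single frame can be sketched as follows. This is only an illustration of the general formula, not the classifier used in the study: the support vectors, coefficients, bias, and `gamma` are made-up placeholders, and the function names are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma):
    """Radial basis kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def svm_decision_value(x, support_vectors, dual_coefs, bias, gamma):
    """Signed score f(x) = sum_i coef_i * K(sv_i, x) + bias.

    dual_coefs[i] stands for alpha_i * y_i of support vector i.
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias

# Hypothetical toy model with two support vectors (not from the study).
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coefs = [-1.0, 1.0]
bias = 0.0
gamma = 0.5

score = svm_decision_value([1.0, 1.0], support_vectors, dual_coefs, bias, gamma)
naive_label = 1 if score > 0 else 0   # conventional SVM threshold at zero
```

The last line applies the conventional zero threshold. The later steps replace that fixed cut-off with one tuned on the new database.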

SLIDE 7

Step 2 – Implement Classifier

Spectrum Database

  • Threshold Training Set
    – 23 subjects (88000 frames)
  • Threshold Testing Set
    – 13 subjects (37000 frames)
  • HRSD-17 Depression Interview

[Figure: Action Units 4, 10, 12, and 14]

SLIDE 8

Step 3 – Identify Potential Thresholds

[Figure: SVM decision value for SVM_12 plotted against frame number]

  • Find minimum and maximum SVM decision values for each AU
  • Separate this range into equal steps to identify potential thresholds
  • This study compared 250 thresholds for each of the action units
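The two bullets on range and equal steps can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the function name is hypothetical, the sample decision values are invented, and including both the minimum and the maximum as candidates is a choice made here rather than something the slide specifies.

```python
def candidate_thresholds(decision_values, n_thresholds=250):
    """Return n_thresholds equally spaced values from min to max (inclusive)."""
    lo, hi = min(decision_values), max(decision_values)
    if n_thresholds < 2 or lo == hi:
        return [lo]                      # degenerate range: one candidate
    step = (hi - lo) / (n_thresholds - 1)
    return [lo + i * step for i in range(n_thresholds)]

# Hypothetical SVM decision values for one action unit.
values = [-1.0, -0.4, 0.1, 0.9, 1.5]
thresholds = candidate_thresholds(values)
```

Each action unit gets its own list because the range of decision values differs from one classifier to the next.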

SLIDE 9

Step 4 – Generate Predictions

[Figure: SVM_12 decision values with a threshold, and the resulting thresholded prediction, plotted against frame number]
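Turning decision values into frame-level predictions is a single comparison per frame. The sketch below assumes that a frame is labeled present when its value is strictly above the threshold. The slide does not say whether the comparison is strict, and the values shown are invented.

```python
def threshold_predictions(decision_values, threshold):
    """Label a frame 1 (AU present) when its decision value exceeds the threshold."""
    return [1 if v > threshold else 0 for v in decision_values]

# Hypothetical decision values for six consecutive frames.
decision_values = [-0.8, -0.2, 0.3, 0.7, 0.1, -0.5]
predictions = threshold_predictions(decision_values, threshold=0.2)
# predictions == [0, 0, 1, 1, 0, 0]
```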

SLIDE 10

Step 5 – Compare to Groundtruth

[Figure: thresholded prediction and groundtruth labels plotted against frame number]

Accuracy = 0.855, F1 = 0.756, Kappa = 0.656
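The three metrics can be computed from the frame-level confusion counts. The sketch below assumes that Kappa means Cohen's kappa for two raters (prediction and groundtruth). The function names and the toy label sequences are illustrative and do not reproduce the figures reported on the slide.

```python
def confusion_counts(pred, truth):
    """Return (tp, tn, fp, fn) for two equal-length 0/1 sequences."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    return tp, tn, fp, fn

def accuracy(pred, truth):
    tp, tn, fp, fn = confusion_counts(pred, truth)
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(pred, truth):
    tp, tn, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def cohen_kappa(pred, truth):
    tp, tn, fp, fn = confusion_counts(pred, truth)
    n = tp + tn + fp + fn
    p_observed = (tp + tn) / n
    # Agreement expected by chance from the marginal label frequencies.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

# Hypothetical labels for ten frames (three positives in the groundtruth).
truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = (accuracy(pred, truth), f1_score(pred, truth), cohen_kappa(pred, truth))
```

Accuracy counts agreement on absent frames as success, so it runs high when an action unit is rare. F1 ignores true negatives and kappa corrects for chance agreement.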

SLIDE 11

Step 6 – Identify Optimal Thresholds

[Figure: AU_10 Threshold Training and AU_4 Threshold Training. Score on performance metric (Accuracy, F1, Kappa) plotted against threshold value]

SLIDE 12

Step 6 – Identify Optimal Thresholds

[Figure: AU_14 Threshold Training and AU_12 Threshold Training. Score on performance metric (Accuracy, F1, Kappa) plotted against threshold value]
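Identifying the optimal threshold amounts to scoring every candidate on the threshold training set and keeping the best one. The sketch below uses F1 as the metric and a strict comparison when thresholding. Both are assumptions for illustration, as are the function names and the toy data.

```python
def f1_score(pred, truth):
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_threshold(decision_values, truth, thresholds, metric):
    """Return (threshold, score) maximizing metric; ties keep the first candidate."""
    best_t, best_score = None, float("-inf")
    for t in thresholds:
        pred = [1 if v > t else 0 for v in decision_values]
        score = metric(pred, truth)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Hypothetical data: the classifier's scores sit lower than zero would suggest.
decision_values = [-0.9, -0.6, -0.3, -0.1, 0.2, 0.4]
truth = [0, 0, 1, 1, 1, 1]
thresholds = [-1.0, -0.5, 0.0, 0.5]

best_t, best_f1 = best_threshold(decision_values, truth, thresholds, f1_score)
```

In this toy case the zero threshold misses two of the four positive frames, while the tuned threshold of -0.5 recovers all of them. Passing a different `metric` function gives the accuracy-maximizing or kappa-maximizing threshold instead.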

SLIDE 13

Overall Results in Testing Subset

[Figure: score on performance metric (Accuracy, F1, Kappa) in the testing subset, Naïve Classifier vs. Threshold Analysis]

p < .0001, p < .002, p < .0001

SLIDE 14

Comparison to FERA Winner

[Figure: overall F1 for Naïve Implementation, Threshold Analysis, and FERA Winner*]

SLIDE 15

Performance Gains by Action Unit

[Figure: percent increase in performance (Accuracy, F1, Kappa) for AU_4, AU_10, AU_12, and AU_14; axis from 0% to 80%]
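A percent increase of this kind is usually computed relative to the baseline score. The sketch below assumes that definition, since the slide does not state the formula. The two scores are hypothetical and are not results from the study.

```python
def percent_increase(baseline, improved):
    """Relative gain of `improved` over `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (improved - baseline) / baseline

# Hypothetical scores for one action unit and one metric.
naive_f1 = 0.40
tuned_f1 = 0.60
gain = percent_increase(naive_f1, tuned_f1)   # about 50 percent
```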

SLIDE 16

Future Directions

  • Additional Databases and Action Units
  • Additional Feature and Classifier types
  • Determine required size of training set
  • Smoothing to remove noise in predictions
  • Compare directly to Novel Classifier Training


jmg174@pitt.edu

SLIDE 17

Predictions with Smoothing

[Figure: thresholded prediction, thresholded prediction with smoothing, and groundtruth labels plotted against frame number]

Thresholded Prediction: Accuracy = 0.855, F1 = 0.756, Kappa = 0.656
Thresholded Prediction (with smoothing): Accuracy = 0.896, F1 = 0.826, Kappa = 0.754
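The slide does not say which smoothing method was used. One common option for binary frame labels is a sliding-window majority vote (equivalent to a median filter on 0/1 values), sketched below as an assumption. The window size, function name, and sample sequence are all hypothetical.

```python
def smooth_predictions(pred, window=5):
    """Sliding-window majority vote over a 0/1 sequence.

    Windows are truncated at the sequence edges; a tie in a truncated
    window resolves to 0.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    smoothed = []
    for i in range(len(pred)):
        segment = pred[max(0, i - half):min(len(pred), i + half + 1)]
        smoothed.append(1 if 2 * sum(segment) > len(segment) else 0)
    return smoothed

# Hypothetical noisy prediction: an isolated spike and a one-frame dropout.
noisy = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0]
smoothed = smooth_predictions(noisy, window=3)
# smoothed == [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0]
```

The isolated positive frame is removed and the one-frame gap inside the longer event is filled, which is the kind of noise frame-by-frame thresholding tends to produce.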

SLIDE 18

Threshold Training Set

[Figure: score on performance metric (Accuracy, F1, Kappa) in the threshold training set, Naïve Implementation vs. Threshold Analysis]

SLIDE 19

Results by Threshold Type

[Figure: score on performance metric (Accuracy, F1, Kappa) by threshold type: Zero, maxAc, maxF1, maxKa, EER]

The threshold that maximized Accuracy performed poorly on F1 and Kappa. The thresholds that maximized F1 or Kappa, and the EER threshold, performed best on all metrics.
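Unlike the other threshold types, an EER threshold is not chosen by maximizing a score. Assuming EER stands for equal error rate, it is the candidate at which the false positive rate and the false negative rate are closest to equal. The sketch below illustrates that selection with hypothetical function names and invented data.

```python
def error_rates(decision_values, truth, threshold):
    """Return (false positive rate, false negative rate) at a threshold."""
    pred = [1 if v > threshold else 0 for v in decision_values]
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    negatives = truth.count(0)
    positives = truth.count(1)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr

def eer_threshold(decision_values, truth, thresholds):
    """Pick the candidate where |FPR - FNR| is smallest."""
    def gap(t):
        fpr, fnr = error_rates(decision_values, truth, t)
        return abs(fpr - fnr)
    return min(thresholds, key=gap)

# Hypothetical data: five negative frames and three positive frames.
decision_values = [-0.9, -0.7, -0.4, -0.2, 0.1, -0.3, 0.3, 0.6]
truth = [0, 0, 0, 0, 0, 1, 1, 1]
thresholds = [-1.0, -0.5, -0.25, 0.0, 0.5]

chosen = eer_threshold(decision_values, truth, thresholds)
```

Because both error rates are normalized by their own class size, this criterion does not reward labeling every frame as absent when an action unit is rare. That is one way a threshold tuned for accuracy alone can score poorly on F1 and Kappa.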
