High-Resolution Comprehensive 3-D Dynamic Database for Facial Articulation Analysis


SLIDE 1

High-Resolution Comprehensive 3-D Dynamic Database for Facial Articulation Analysis

Bogdan J. Matuszewski, Wei Quan, Lik-Kwan Shark (bmatuszewski1@uclan.ac.uk)

Applied Digital Signal and Image Processing (ADSIP) Research Centre School of Computing, Engineering and Physical Sciences University of Central Lancashire (UCLan), Preston PR1 2HE, UK

SLIDE 2

Presentation Outline

  • Motivation
  • Structure of the Hi4D-ADSIP
  • Validation of facial expressions
  • Facial dysfunction analysis
  • Conclusions
SLIDE 3

Facial Articulation Databases

A representative sample of the existing databases

SLIDE 4

Motivation

  • High-resolution 3D dynamic facial scans represent the facial structure more closely, relating to the “internal” face anatomy rather than only the external appearance; such data therefore promise greater applicability for bio-medical applications:
      • head and neck radiation therapy;
      • corrective plastic surgery;
      • quantitative assessment of neurological conditions (stroke, Bell’s palsy, Parkinson’s disease);
      • aging.
  • 3D dynamic data should enable the construction of more accurate facial models, e.g. for HCI, biometrics and security:
      • facial composites from crime witness accounts (E-FIT/EvoFIT) – there is some evidence that facial expressions and facial dynamics can improve the success rate of such systems.

Hi4D-ADSIP database

But the Hi4D-ADSIP database has been designed for general use!

SLIDE 5
Experimental Setup

  • 3D facial sequences captured at 60 fps using six cameras of 2352×1728 pixels each (a scanner from Dimensional Imaging was used).
  • Audio synchronised with the 3D recordings.
  • Sessions also recorded on a camcorder.

SLIDE 6

Database Structure


SLIDE 7
  • Currently there are 80 “control” subjects in the database; 65 of them are undergraduate students from the Performing Arts Department at UCLan, and the rest are postgraduate students and staff from the University. They are of different ethnic origins, with ages ranging between 18 and 60; 48 are female and 32 are male.


Database Structure

SLIDE 8

Database Structure


SLIDE 9

  • Seven expressions were performed (acted) by each subject: Anger, Disgust, Fear, Happiness, Sadness, Surprise and Pain, each at three intensity levels: ‘mild’, ‘normal’ and ‘extreme’. Additionally, each subject was asked to articulate the mouth and eyebrows, as well as to read five phrases typically used in the assessment of some neurological conditions, again at three intensity levels.


Database Structure

SLIDE 10


SLIDE 11

  • Recorded sequences last between three and five seconds.
  • In total there are 3,360 recorded 3D sequences (~610,000 3D face models).


Database Structure

SLIDE 12

Database Structure


SLIDE 13

SLIDE 14

Three levels of disgust; mouth and eyebrows articulation

SLIDE 15
  • 1. You know how
  • 2. Down to earth
  • 3. I got home from work
  • 4. Near the table in the dining room
  • 5. They heard him speak on the radio last night

Standard phrases used in the assessment of neurological patients

SLIDE 16

Facial Expression Validation

Each recorded video clip is assessed by 5 observers. To make the task manageable for observers, during an observation session each observer assesses 105 video clips (5 subjects with 7 expressions at 3 expression intensity levels), with subjects assigned to the observer randomly. Observers are asked to provide a confidence rating for each observed sequence, with values in the range of 0 to 100%. For a given video clip, the ratings may be distributed over the various expressions as long as the scores add up to 100%.

Each expression has an associated mean confidence vector. Confidence scores have a grand mean of 60%; by actor: 54%–80%; by expression: 35%–83%; by intensity level: 57%–65%.
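As a rough illustration of this aggregation step (the ratings and helper names below are hypothetical, not the authors' tooling), per-expression mean confidence vectors can be computed like this:

```python
from collections import defaultdict

EXPRESSIONS = ["anger", "disgust", "fear", "happiness",
               "sadness", "surprise", "pain"]

def mean_confidence(ratings_per_clip):
    """Average per-expression confidence scores over all observer ratings
    of clips with the same intended expression. Each rating distributes
    100% of confidence over EXPRESSIONS."""
    result = {}
    for intended, ratings in ratings_per_clip.items():
        acc = defaultdict(float)
        for rating in ratings:
            # per the protocol, scores for one clip must add up to 100%
            assert abs(sum(rating.values()) - 100.0) < 1e-6
            for expr, score in rating.items():
                acc[expr] += score
        result[intended] = {e: acc[e] / len(ratings) for e in EXPRESSIONS}
    return result

# Two hypothetical observer ratings for one 'happiness' clip
full = lambda **kw: {e: kw.get(e, 0.0) for e in EXPRESSIONS}
ratings = {"happiness": [full(happiness=90.0, surprise=10.0),
                         full(happiness=80.0, pain=20.0)]}
vec = mean_confidence(ratings)["happiness"]
print(vec["happiness"])  # -> 85.0
```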

SLIDE 17

Happiness expressions were given high confidence scores of 83% on average, whereas fear expressions were rated worst, with an average confidence score of only 35%. The normal intensity level was on average rated somewhat better than mild, and extreme on average somewhat better than normal. Interestingly, for two expressions, happiness and pain, the extreme-level confidence scores were lower than the normal-level ones.


Intensity   Anger (%)  Disgust (%)  Fear (%)  Happiness (%)  Sadness (%)  Surprise (%)  Pain (%)  Mean (%)
Mild        59.44      50.42        25.13     86.50          39.63        60.50         75.62     56.75
Normal      60.92      53.07        32.66     85.67          51.71        61.50         79.33     60.69
Extreme     64.36      61.50        48.25     78.21          61.70        64.75         75.33     64.87
Mean        61.57      54.99        35.34     83.46          51.01        62.25         76.76     60.77

Facial Expression Validation

Mean human-observer confidence results for the seven expressions

SLIDE 18

Rows: intended expression; columns: perceived expression (%). One entry in the Fear row is missing in the source (shown as n/a).

            Anger   Disgust   Fear   Happiness  Sadness  Surprise   Pain
Anger       61.57   18.38     4.22    0.56       2.93     3.60      8.75
Disgust     16.37   54.99     7.93    0.58       8.31     5.24      6.58
Fear         3.84    9.69    35.34    n/a        9.86    33.71      7.56
Happiness    0.44    1.44     2.64   83.46       2.36     7.82      1.83
Sadness      2.19    9.33     7.23    0.69      62.25     5.25     13.06
Surprise     1.06    1.68     9.10    6.17       3.65    76.76      1.58
Pain         7.67   14.24     7.43    1.64      15.72     2.29     51.01

Human observer confidence confusion matrix


Facial Expression Validation

SLIDE 19

Baseline automatic facial expression recognition


Rows: actual expression; columns: predicted expression (%).

            Anger   Disgust   Fear   Happiness  Sadness  Surprise   Pain
Anger       82.70    5.30     6.67    0.00       1.30     1.30      2.67
Disgust     10.67   68.00     6.67    0.00       8.00     0.00      6.67
Fear         6.67    0.00    82.70    0.00       2.70     5.30      2.67
Happiness    0.00    0.00     0.00   98.70       0.00     1.33      0.00
Sadness      9.30    4.30     1.30    1.30      77.30     0.00      6.67
Surprise     0.00    0.00     1.30    2.67       2.67    81.33      0.00
Pain         8.00    1.30     2.67    0.00       2.67     0.00     85.33

Confusion matrix of the kNN classifier with the Fisher-face representation. The recognition rates for different facial expressions vary between 98.7% for the happiness expression and 68% for the disgust expression. This spread mirrors the recognition rates obtained for human observers, with happiness having the highest recognition rate; anger, surprise and pain medium rates; and disgust and sadness the lowest.
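For orientation, the Fisher-face-plus-kNN idea behind this baseline can be sketched with numpy on toy feature vectors. This is an illustrative re-implementation under assumed conventions, not the authors' code; the extraction of feature vectors from the 3D face models is omitted entirely.

```python
import numpy as np

def fisher_projection(X, y, n_components):
    """LDA ('Fisher-face') projection: directions maximising between-class
    scatter relative to within-class scatter."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # generalised eigenproblem Sb w = lambda Sw w, via pinv for stability
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)
    return evecs.real[:, order[:n_components]]

def knn_predict(train_X, train_y, test_X, k=3):
    """Plain kNN with Euclidean distance and majority vote."""
    preds = []
    for x in test_X:
        nearest = train_y[np.argsort(np.linalg.norm(train_X - x, axis=1))[:k]]
        labels, counts = np.unique(nearest, return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Toy data: three well-separated "expression" classes in a 5-D feature space
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c * 3.0, 0.3, size=(20, 5)) for c in range(3)])
y = np.repeat(np.arange(3), 20)
W = fisher_projection(X, y, n_components=2)
preds = knn_predict(X @ W, y, X @ W, k=3)
print((preds == y).mean())  # -> 1.0 on this well-separated toy set
```

In the paper's setting, X would hold per-frame features derived from the 3D face models and y the seven expression labels; the projection reduces dimensionality before the nearest-neighbour vote.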

SLIDE 20

Facial dysfunction analysis

The constructed database can benefit research and development in diverse applications, including psychology, security, biometrics, entertainment and medical assessment. To the best of the authors’ knowledge this is, to date, the most comprehensive repository of 3D dynamic facial articulations. In parallel to the proposed “control subjects” database, a “clinical subjects” database is also being constructed for stroke, Bell’s palsy and Parkinson’s disease. These databases are currently being used in studies on the detection and quantification of facial dysfunction in neurological patients. Based on analysis of facial asymmetry, the preliminary results from the ongoing study of stroke patients suggest that dynamic 3D optical scanning is a feasible technique for accurate and robust quantification of facial paresis.
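The facial-asymmetry idea can be illustrated with a toy landmark-based index (a hypothetical function and coordinate convention, not the authors' method): mirror each right-side landmark about an assumed sagittal plane x = 0 and measure the residual distance to its left-side partner.

```python
import numpy as np

def asymmetry_index(landmarks, pairs):
    """Toy facial-asymmetry measure: mean distance between each left-side
    landmark and the mirror image (x -> -x) of its paired right-side
    landmark. landmarks: (n, 3) array in a head-centred frame whose
    sagittal plane is x = 0; pairs: iterable of (left_idx, right_idx)."""
    total = 0.0
    for li, ri in pairs:
        mirrored = landmarks[ri] * np.array([-1.0, 1.0, 1.0])  # reflect x
        total += float(np.linalg.norm(landmarks[li] - mirrored))
    return total / len(pairs)

# Symmetric toy mouth corners give index 0; a drooping left corner raises it
sym = np.array([[-30.0, -40.0, 80.0], [30.0, -40.0, 80.0]])
droop = np.array([[-30.0, -48.0, 80.0], [30.0, -40.0, 80.0]])
print(asymmetry_index(sym, [(0, 1)]))    # -> 0.0
print(asymmetry_index(droop, [(0, 1)]))  # -> 8.0
```

Tracking such an index over a 60 fps dynamic sequence, rather than a single static scan, is what makes the dynamic 3D data useful for quantifying conditions like facial paresis.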

SLIDE 21
  • The presentation introduced the Hi4D-ADSIP 3D dynamic facial articulation database.
  • It currently contains 3,360 recorded sequences showing 14 different articulations acquired from 80 subjects.
  • Another ~20 older (>50 years old) subjects will be added soon to reduce the age bias in the database.
  • A clinical version of the database for neurological patients (stroke, Bell’s palsy and Parkinson’s disease) is under construction, currently containing 32 subjects (mostly stroke and Parkinson’s disease).
  • Subject to final validation of the tracking results and completion of the human observer validation, the database will soon (January/February 2012) be made publicly available for research (non-profit) purposes.

Conclusions

A sample of the database is currently available (for non-profit use); please send an email to bmatuszewski1@uclan.ac.uk. We would be grateful for feedback, as it will help us improve the database for the full release.