Learning Face Recognition from Limited Training Data using Deep Neural Networks
Xi Peng*, Nalini Ratha**, Sharathchandra Pankanti**
*Rutgers University, New Jersey, 08854 **IBM T. J. Watson Research Center, New York 10598
Motivation
Sources of variation in face images (example subject: Angelina Jolie): pose, low resolution, illumination, chronology, expression, decoration, occlusion
Genuine pair (same person) vs. impostor pair (different person)
Private Dataset   Group      Approach        #subjects   #images       Accuracy on LFW
SFC               Facebook   DeepFace/2014   4,030       4,400,000     97.35%
WebFace           Google     FaceNet/2015    8,000,000   200,000,000   98.87%

Public Dataset   #subjects   #images
LFW              5,749       13,233
PubFig           200         58,797
FaceScrub        530         107,818
CACD             2,000       163,446
CASIA            10,575      494,414
DNNs are extremely data-thirsty. However, limited training data is available.
Require multiple networks:
alignment network: 224x224x3 input, outputs affine parameters
recognition network: 100x100x3 input, outputs face identity
alignment network: 224x224x3 input -> conv 7x7x128 -> max pooling 2x2 -> conv 5x5x128 -> max pooling 2x2 -> conv 3x3x128 -> max pooling 2x2 -> fully connected 256 -> fully connected 64 -> 6 affine parameters
The 6 affine parameters warp the 224x224x3 input into the 100x100x3 input of the recognition network.
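How the 6 affine parameters turn a 224x224x3 image into a 100x100x3 crop can be sketched as follows. The slides do not specify the warp itself, so the normalized-coordinate convention (as in spatial-transformer-style warps), the nearest-neighbor sampling, and the name `affine_warp` are all assumptions made for illustration.

```python
import numpy as np

def affine_warp(image, theta, out_size=(100, 100)):
    """Resample `image` (H x W x C) into an out_size crop with a 2x3 affine matrix.

    Assumption: theta maps normalized output coordinates in [-1, 1]
    to normalized input coordinates; sampling is nearest neighbor.
    """
    h_in, w_in = image.shape[:2]
    h_out, w_out = out_size
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, h_out),
                         np.linspace(-1.0, 1.0, w_out), indexing="ij")
    # Homogeneous output grid, one column per output pixel: 3 x N
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(h_out * w_out)])
    src = np.asarray(theta, dtype=float).reshape(2, 3) @ grid  # 2 x N
    # Convert normalized source coordinates to pixel indices, clipped to the image
    cols = np.clip(np.rint((src[0] + 1.0) * 0.5 * (w_in - 1)), 0, w_in - 1).astype(int)
    rows = np.clip(np.rint((src[1] + 1.0) * 0.5 * (h_in - 1)), 0, h_in - 1).astype(int)
    return image[rows, cols].reshape(h_out, w_out, -1)

# Hypothetical input; the identity parameters simply rescale the full image.
image = np.random.default_rng(0).random((224, 224, 3))
aligned = affine_warp(image, [1, 0, 0, 0, 1, 0])
# Halving the scale terms samples only the central half of the image.
center_crop = affine_warp(image, [0.5, 0, 0, 0, 0.5, 0])
```

A trainable version would need differentiable (for example bilinear) sampling so that gradients can reach the alignment network; nearest-neighbor sampling is used here only to keep the sketch short.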
recognition network: 100x100x3 input -> convolution module 1 -> convolution module 2 -> inception module 3 -> inception module 4 -> inception module 5 -> fully connected 256 -> softmax loss and contrastive loss, with four max pooling 2x2 layers between the modules
convolution module 1: conv 1a: 3x3 and conv 1b: 3x3, each with LRN and ReLU
convolution module 2: conv 2a: 3x3 and conv 2b: 3x3, each with LRN and ReLU
inception module 3: incept3a -> depth concat -> incept3b -> depth concat
inception module 4: incept4a -> depth concat -> incept4b -> depth concat
inception module 5: incept5a -> depth concat -> incept5b -> depth concat
inception block: previous layer -> four parallel branches (conv 1x1; conv 1x1 -> conv 3x3; conv 1x1 -> conv 5x5; pool 3x3 -> conv 1x1) -> depth concat
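The inception block with its four branches and depth concat can be sketched in NumPy as below. This is a minimal illustration, not the network from the slides: the channel counts, the weight-dictionary keys, and the helper names are hypothetical, and biases and nonlinearities are omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_same(x, w):
    """Stride-1 'same' convolution. x: H x W x C_in, w: k x k x C_in x C_out (k odd)."""
    k = w.shape[0]
    p = k // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    win = sliding_window_view(xp, (k, k), axis=(0, 1))  # H x W x C_in x k x k
    return np.einsum("hwcij,ijcd->hwd", win, w)

def max_pool_same(x, k=3):
    """Stride-1 k x k max pooling that keeps the spatial size."""
    p = k // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)), constant_values=-np.inf)
    return sliding_window_view(xp, (k, k), axis=(0, 1)).max(axis=(-2, -1))

def inception_block(x, w):
    """Four parallel branches joined by a depth (channel) concatenation."""
    b1 = conv_same(x, w["1x1"])
    b2 = conv_same(conv_same(x, w["3x3_reduce"]), w["3x3"])
    b3 = conv_same(conv_same(x, w["5x5_reduce"]), w["5x5"])
    b4 = conv_same(max_pool_same(x), w["pool_proj"])
    return np.concatenate([b1, b2, b3, b4], axis=-1)

# Hypothetical sizes: 8 input channels, branch outputs of 4, 6, 3 and 2 channels.
rng = np.random.default_rng(0)
x = rng.standard_normal((12, 12, 8))
weights = {
    "1x1": rng.standard_normal((1, 1, 8, 4)),
    "3x3_reduce": rng.standard_normal((1, 1, 8, 3)),
    "3x3": rng.standard_normal((3, 3, 3, 6)),
    "5x5_reduce": rng.standard_normal((1, 1, 8, 2)),
    "5x5": rng.standard_normal((5, 5, 2, 3)),
    "pool_proj": rng.standard_normal((1, 1, 8, 2)),
}
out = inception_block(x, weights)
```

Every branch preserves the spatial size, so the output depth is the sum of the branch depths (here 4 + 6 + 3 + 2 = 15 channels).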
softmax loss: fully connected 256 -> fully connected 10575 -> softmax loss against the identity label
contrastive loss: computed on pairs of fully connected 256 features
training pairs: all genuine pairs, top-k closest impostor pairs
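The pair selection (all genuine pairs, top-k closest impostor pairs) and a contrastive loss over those pairs can be sketched as follows. The slides do not give the exact loss formula; the common margin-based form is assumed here, and the margin value, the Euclidean distance, and the function names are hypothetical.

```python
import numpy as np

def select_pairs(features, labels, k):
    """Return index pairs: all genuine pairs and the k closest impostor pairs."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    i, j = np.triu_indices(len(labels), k=1)  # every unordered pair once
    dist = np.linalg.norm(features[i] - features[j], axis=1)
    same = labels[i] == labels[j]
    genuine = np.stack([i[same], j[same]], axis=1)
    imp_idx = np.flatnonzero(~same)
    # Closest impostor pairs are the hardest ones to separate
    hardest = imp_idx[np.argsort(dist[imp_idx], kind="stable")[:k]]
    impostor = np.stack([i[hardest], j[hardest]], axis=1)
    return genuine, impostor

def contrastive_loss(features, genuine, impostor, margin=1.0):
    """Assumed form: pull genuine pairs together, push impostors beyond `margin`."""
    features = np.asarray(features, dtype=float)
    d_gen = np.linalg.norm(features[genuine[:, 0]] - features[genuine[:, 1]], axis=1)
    d_imp = np.linalg.norm(features[impostor[:, 0]] - features[impostor[:, 1]], axis=1)
    losses = np.concatenate([d_gen ** 2, np.maximum(margin - d_imp, 0.0) ** 2])
    return 0.5 * losses.mean()

# Toy 2-D features standing in for the 256-D features; two identities.
feats = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [3.0, 0.0]]
ids = [0, 0, 1, 1]
genuine, impostor = select_pairs(feats, ids, k=2)
loss = contrastive_loss(feats, genuine, impostor, margin=1.0)
```

Restricting impostors to the closest pairs keeps the loss focused on pairs that still violate the margin, since distant impostor pairs contribute nothing.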
224x224x3 -> face detection -> landmark detection -> landmark centers -> mirroring -> 2D alignment -> 100x100x3
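One plausible way to derive a 2D alignment from detected landmarks is a least-squares affine fit onto template landmark positions, sketched below. The slides do not state how the alignment is computed; the template coordinates, the landmark values, and the name `fit_affine` are hypothetical, and the mirroring step is not covered.

```python
import numpy as np

def fit_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine A with A @ [x, y, 1]^T ~= dst for each landmark."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    homog = np.hstack([src, np.ones((len(src), 1))])   # N x 3
    sol, *_ = np.linalg.lstsq(homog, dst, rcond=None)  # 3 x 2
    return sol.T                                       # 2 x 3, i.e. 6 parameters

# Hypothetical landmarks (x, y) in a 224x224 image: eyes, nose tip, mouth corners.
landmarks = np.array([[80, 90], [140, 90], [110, 130], [85, 160], [135, 160]], dtype=float)
# Hypothetical template: the same layout scaled by 0.5 and shifted by (-5, -10).
template = 0.5 * landmarks + np.array([-5.0, -10.0])
A = fit_affine(landmarks, template)
```

At least three non-collinear landmarks are needed for the fit to be unique; with more landmarks the solution averages out detection noise.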
alignment network
recognition network loss