Deep Face Recognition Challenges and Tips for Real-life Deployment
research@hertasecurity.com
1. Deep Face Recognition
2. Public DBs
3. Public models
4. Managing imbalance
5. Embeddings
6. Conclusions
HERTA
www.hertasecurity.com
Deep Face Recognition
GPU-powered face recognition
Offices in Barcelona, Madrid, London, Los Angeles
Crowds, unconstrained
Deep Face Recognition
- Large training DBs: >100K images, >1K subjects (public DBs)
- Public models (Inception, VGG, ResNet, SENet…), close to state of the art
- Typically, the embedding layer (yielding the facial descriptor) feeds a one-hot encoding
- Unconstrained (in-the-wild) environments
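The "embedding layer feeds one-hot encoding" arrangement can be illustrated with a toy forward pass. This is a minimal sketch, not the presenters' code: the dimensions, the random weights, and the helper names (`linear`, `softmax`, `cross_entropy`) are all assumptions chosen for illustration.

```python
import math
import random


def linear(x, weights, bias):
    """Fully connected layer: one output score per subject."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Turn scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot):
    """Loss against a one-hot identity target."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot) if t > 0)


random.seed(0)
EMBEDDING_DIM = 8   # hypothetical; real facial descriptors are far larger
NUM_SUBJECTS = 4    # hypothetical number of training identities

# Stand-in for the facial descriptor produced by the embedding layer.
embedding = [random.gauss(0, 1) for _ in range(EMBEDDING_DIM)]

# Classification layer that is only needed during training.
weights = [[random.gauss(0, 0.1) for _ in range(EMBEDDING_DIM)]
           for _ in range(NUM_SUBJECTS)]
bias = [0.0] * NUM_SUBJECTS

probs = softmax(linear(embedding, weights, bias))
target = [0, 0, 1, 0]  # one-hot encoding of the true subject
loss = cross_entropy(probs, target)
```

At deployment time only `embedding` would be kept and compared between faces; the classification layer exists to shape the embedding during training.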
Public DBs
- Mostly celebrities: subjects overlap
Subjects per DB: CWF 10.6K, LFW 5.7K, VGG Face 2.6K, VGG Face 2 9.1K, IJB-B 1.8K
Public DBs
- Highly imbalanced
[Figure: LFW and CWF distributions by demographic group and by images per subject]
Public models
- Trained on public DBs (DIY):
    FaceNet (2015): CWF / MS-1MC
    VGGFace (2015): VGG
    SphereFace (2017): CWF
    VGGFace2 (2017): MS-1MC + VGG2
- Validate with a demographically balanced DB (50% same ID, 50% different ID):
    Asian female: 1M pairs
    Asian male: 1M pairs
    Black female: 1M pairs
    Black male: 1M pairs
    White female: 1M pairs
    White male: 1M pairs
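Validation on balanced pairs amounts to measuring error rates separately for each demographic group. The sketch below shows one way to do that bookkeeping; it is an illustration only. The function name `rates_by_group`, the similarity threshold, the group names, and the toy scores are all made up and do not come from the presentation.

```python
from collections import defaultdict


def rates_by_group(pairs, threshold):
    """Per-group false positive and false negative rates.

    Each pair is (group, similarity, same_id). A pair is predicted
    "same ID" when its similarity reaches the threshold.
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "pos": 0, "neg": 0})
    for group, similarity, same_id in pairs:
        c = counts[group]
        predicted_same = similarity >= threshold
        if same_id:
            c["pos"] += 1
            c["fn"] += not predicted_same   # missed a genuine match
        else:
            c["neg"] += 1
            c["fp"] += predicted_same       # accepted an impostor
    return {
        group: {
            "fnr": c["fn"] / c["pos"] if c["pos"] else 0.0,
            "fpr": c["fp"] / c["neg"] if c["neg"] else 0.0,
        }
        for group, c in counts.items()
    }


# Toy scores with arbitrary values, 50% same ID and 50% different ID per group.
pairs = [
    ("group A", 0.9, True), ("group A", 0.8, True),
    ("group A", 0.2, False), ("group A", 0.1, False),
    ("group B", 0.9, True), ("group B", 0.4, True),
    ("group B", 0.7, False), ("group B", 0.3, False),
]
result = rates_by_group(pairs, threshold=0.5)
```

Comparing the per-group rates at a single shared threshold is what exposes a biased model: an aggregate accuracy figure can hide a group whose error rates are much higher.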
Public models: examples of failures
[Figure: examples of false positives and false negatives]
Public models: evaluation
[Figure: evaluation of FaceNet (2015), SphereFace (2017), VGGFace (2015) and VGGFace2 (2017) by training DB (1MC, CWF, VGG, VG2); groups shown: white male, black male, Asian female]
Managing imbalance
- Sampling (data-oriented): undersampling, oversampling
- Training loss (model-oriented): cost-sensitive learning, multi-task learning (id, gender, ethnicity)
“Features get better at understanding faces, improving performances of individual tasks”
R. Ranjan, V. M. Patel, R. Chellappa. “HyperFace: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition.” TPAMI 2017
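As an illustration of the model-oriented route, cost-sensitive learning can be sketched as a cross-entropy loss in which each class is weighted by its inverse frequency. The weighting formula and the names `inverse_frequency_weights` and `weighted_cross_entropy` are assumptions chosen for this sketch, not the scheme used in the presentation.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes cost more.

    n is the number of samples and k the number of classes; a perfectly
    balanced set gives every class a weight of 1.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_cross_entropy(prob_of_true_class, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its class."""
    return -weights[label] * math.log(prob_of_true_class)


# Toy label set: 8 samples of a frequent class, 2 of a rare one.
labels = ["frequent"] * 8 + ["rare"] * 2
weights = inverse_frequency_weights(labels)

# The same prediction confidence is penalised more on the rare class.
loss_frequent = weighted_cross_entropy(0.7, "frequent", weights)
loss_rare = weighted_cross_entropy(0.7, "rare", weights)
```

Unlike oversampling, this leaves the training set untouched and changes only how much each mistake contributes to the gradient.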
Managing imbalance – Data augmentation
- Data augmentation: makes imbalance mitigation much more effective
Pipeline: Database → Oversampled DB → Stochastic data augmentation → DNN
I Masi et al. "Do we really need to collect millions of faces for effective face recognition?" ECCV 2016.
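The combination of oversampling and stochastic augmentation can be sketched as below: replicated samples are perturbed at random each time they are drawn, so the network rarely sees the same image twice. This is a minimal sketch under assumptions. Horizontal flips and brightness jitter are simple stand-ins and are not the augmentation of the cited paper; images are plain nested lists, and all function names are hypothetical.

```python
import random


def horizontal_flip(image):
    """Mirror each pixel row."""
    return [row[::-1] for row in image]


def jitter_brightness(image, rng, max_delta=0.1):
    """Shift all pixels by one random offset, clamped to [0, 1]."""
    delta = rng.uniform(-max_delta, max_delta)
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in image]


def augment(image, rng):
    """Random flip (probability 0.5) followed by brightness jitter."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    return jitter_brightness(image, rng)


def oversample(dataset, rng):
    """Replicate random samples until every class matches the largest."""
    by_class = {}
    for image, label in dataset:
        by_class.setdefault(label, []).append(image)
    target = max(len(images) for images in by_class.values())
    result = list(dataset)
    for label, images in by_class.items():
        for _ in range(target - len(images)):
            result.append((rng.choice(images), label))
    return result


rng = random.Random(0)
majority = [[0.2, 0.8], [0.4, 0.6]]   # toy 2x2 grayscale "images"
minority = [[0.9, 0.1], [0.5, 0.5]]
dataset = [(majority, "majority")] * 4 + [(minority, "minority")]

oversampled = oversample(dataset, rng)
# Augmentation is applied when the batch is built, not stored in the DB.
batch = [(augment(image, rng), label) for image, label in oversampled]
```

Applying the augmentation at batch time rather than once up front is what makes it stochastic: each epoch yields different variants of the replicated minority samples.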
Managing imbalance – Proposal
Traditional imbalance: $\max(X) / \min(X)$
Proposal: $\mathrm{IDR} = D_9(X) / D_1(X)$ (robust to outliers)
Iterative multi-label oversampling:
- 1. Find most imbalanced label L
- 2. Find most imbalanced category C within L
- 3. Draw random sample from C, replicate
[Figure: $\max(X) / \min(X)$ and $D_9(X) / D_1(X)$ plotted against the number of samples added]
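The imbalance measures and the three-step loop can be sketched as follows. This is a minimal sketch under stated assumptions: IDR is read as the ratio of the ninth to the first decile of the per-category sample counts, "most imbalanced category" is taken to mean the least populated one, and the names `quantile`, `idr`, `iterative_oversample`, `target_idr` and `max_added` are hypothetical.

```python
import random
from collections import Counter


def quantile(values, q):
    """Quantile with linear interpolation between sorted neighbours."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def max_min_ratio(counts):
    """Traditional imbalance: dominated by the two extreme categories."""
    return max(counts) / min(counts)


def idr(counts):
    """Assumed reading of IDR: ninth decile over first decile."""
    return quantile(counts, 0.9) / quantile(counts, 0.1)


def category_counts(samples, label):
    return Counter(sample[label] for sample in samples)


def iterative_oversample(samples, labels, rng, target_idr=1.5, max_added=1000):
    """Replicate samples one at a time until every label is balanced enough."""
    samples = list(samples)
    for _ in range(max_added):
        scores = {label: idr(list(category_counts(samples, label).values()))
                  for label in labels}
        worst_label = max(scores, key=scores.get)           # step 1
        if scores[worst_label] <= target_idr:
            break
        counts = category_counts(samples, worst_label)
        rare = min(counts, key=counts.get)                  # step 2 (assumption)
        pool = [s for s in samples if s[worst_label] == rare]
        samples.append(dict(rng.choice(pool)))              # step 3
    return samples


# Two outlier categories (1 and 1000 images) barely move the IDR.
counts = [1, 20, 22, 25, 25, 27, 30, 30, 32, 35, 1000]
traditional = max_min_ratio(counts)
robust = idr(counts)

# Toy multi-label DB: every sample carries a gender and an ethnicity label.
toy = ([{"gender": "m", "ethnicity": "x"}] * 8
       + [{"gender": "f", "ethnicity": "y"}] * 2)
balanced = iterative_oversample(toy, ["gender", "ethnicity"], random.Random(0))
```

Because each replicated sample carries all of its labels, raising one category can shift the balance of another label, which is why the most imbalanced label is re-evaluated on every iteration.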
Managing imbalance – Sample training batch
[Figure: sample training batch before oversampling and after]
Managing imbalance
- Results with ResNet 20 (tiny network, for comparison only)
- Better with almost 6X fewer subjects, 2X fewer images! (1.8K subjects, 295K images vs. 10.6K subjects, 494K images)
Sparse embedding
Typically, in deep face recognition: image → CNN → embedding layer → one-hot encoding
- What about ReLU + embedding + one-hot encoding? (e.g. VGGFace)
- Why more dimensions, if 90% are zero? Larger representation subspace, at the expense of computational efficiency
- But can gain it back! ~200M comp/s
[Figure: Sparse 4096-d, Dense 512-d, Dict + Dense 256-d]
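One way the efficiency of a mostly-zero embedding can be recovered is to store and compare only the non-zero entries. The sketch below illustrates that idea with a sparse dot product. It is an assumption about the general technique, not the presenters' implementation: the Python `dict` is just an index-to-value store and is not meant to explain the "Dict" variant named on the slide, and all function names are hypothetical.

```python
import random


def to_sparse(vector):
    """Keep only non-zero entries as an index -> value mapping."""
    return {i: v for i, v in enumerate(vector) if v != 0.0}


def sparse_dot(a, b):
    """Dot product that visits only the non-zeros of the smaller vector."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[i] for i, v in a.items() if i in b)


def dense_dot(a, b):
    """Reference dot product over every dimension."""
    return sum(x * y for x, y in zip(a, b))


def random_relu_embedding(rng, dim=4096, zero_fraction=0.9):
    """Toy stand-in for a post-ReLU descriptor: non-negative, mostly zero."""
    return [0.0 if rng.random() < zero_fraction else rng.random()
            for _ in range(dim)]


rng = random.Random(0)
first = random_relu_embedding(rng)
second = random_relu_embedding(rng)
sparse_first, sparse_second = to_sparse(first), to_sparse(second)

score_sparse = sparse_dot(sparse_first, sparse_second)
score_dense = dense_dot(first, second)
```

With roughly 90% zeros, each comparison touches about a tenth of the 4096 dimensions, which is in the same range as a dense 512-d descriptor.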
Conclusions
- Public training / validation DBs: heavily biased at multiple levels
- Without balancing, trained models will be biased, too!
- Prefer “better data” over “more data”
- Machine Learning