Deep Face Recognition Challenges and Tips for Real-life Deployment



slide-1
SLIDE 1

Deep Face Recognition Challenges and Tips for Real-life Deployment

research@hertasecurity.com

slide-2
SLIDE 2

1. Deep Face Recognition
2. Public DBs
3. Public models
4. Managing imbalance
5. Embeddings
6. Conclusions

slide-3
SLIDE 3

HERTA

www.hertasecurity.com

  • GPU-powered face recognition
  • Offices in Barcelona, Madrid, London, Los Angeles
  • Crowds, unconstrained

Deep Face Recognition

  • Large training DBs, >100K images, >1K subjects (Public DBs)
  • Public models (Inception, VGG, ResNet, SENet…), close to state-of-the-art
  • Typically, embedding layer (yielding facial descriptor) feeds one-hot encoding
  • Unconstrained (in-the-wild) environments
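The "embedding layer feeds one-hot encoding" setup can be illustrated with a minimal sketch. It assumes the common arrangement in which the facial descriptor goes through a linear identity classifier and a softmax, and is trained with cross-entropy against a one-hot identity target. The toy descriptor, the weights, and all function names are hypothetical and not taken from the slides.

```python
import math

def linear(x, weights, bias):
    # One logit per training identity: w . x + b
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    # Subtract the maximum for numerical stability
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(index, n):
    return [1.0 if i == index else 0.0 for i in range(n)]

def cross_entropy(probs, target):
    # Only the entry selected by the one-hot target contributes
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

# Hypothetical 4-d facial descriptor produced by the embedding layer
descriptor = [0.5, -1.0, 2.0, 0.0]
# Hypothetical classifier over 3 training identities
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
b = [0.0, 0.0, 0.0]

probs = softmax(linear(descriptor, W, b))
loss = cross_entropy(probs, one_hot(2, 3))
```

In this arrangement the classifier only exists to shape the descriptor during training; comparing faces afterwards uses the descriptor itself.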

slide-4
SLIDE 4

Public DBs

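[Figure: public DBs CWF, LFW, VGG Face, VGG Face 2 and IJB-B; subject-count labels in extraction order, not matched to the DB names: 2.6K, 9.1K, 10.6K, 1.8K, 5.7K]

  • Mostly celebrities: subjects overlap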
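Subject overlap between two DBs can be checked before one is used to validate a model trained on the other. The sketch below assumes each DB exposes a list of subject names and that a simple normalization (lowercase, underscores to spaces) is enough to match them; real name matching is usually messier. The function names and the sample names are hypothetical.

```python
def normalize(name):
    # Lowercase, treat underscores as spaces, collapse repeated whitespace
    return " ".join(name.lower().replace("_", " ").split())

def overlapping_subjects(db_a, db_b):
    # Subjects present in both name lists after normalization
    return {normalize(n) for n in db_a} & {normalize(n) for n in db_b}

# Hypothetical subject lists
shared = overlapping_subjects(["Jane_Doe", "John  Roe"],
                              ["jane doe", "Richard Miles"])
```

Subjects found in both sets would need to be removed from one side before using one DB for training and the other for validation.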

slide-5
SLIDE 5

Public DBs

  • Highly imbalanced

[Figure: LFW and CWF distributions by demographic group and by images / subject]

slide-6
SLIDE 6

Public models

  • trained on public DBs (DIY):
      FaceNet (2015): CWF / MS-1MC
      VGGFace (2015): VGG
      SphereFace (2017): CWF
      VGGFace2 (2017): MS-1MC + VGG2

Validate with

  • demographically-balanced DB (50% same ID, 50% different ID):
      Asian female: 1M pairs
      Asian male: 1M pairs
      Black female: 1M pairs
      Black male: 1M pairs
      White female: 1M pairs
      White male: 1M pairs
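A balanced validation set of this kind (50% same ID, 50% different ID) can be drawn per demographic group roughly as follows. This is a sketch under assumptions: the input is a hypothetical mapping from subject to image identifiers for a single group, and different-ID pairs are drawn within that same group, which the slide does not state.

```python
import random

def sample_pairs(images_by_subject, n_pairs, seed=0):
    """Return (image_a, image_b, same_id) tuples, alternating same/different ID."""
    rng = random.Random(seed)
    all_subjects = list(images_by_subject)
    # Same-ID pairs need subjects with at least two images
    multi = [s for s in all_subjects if len(images_by_subject[s]) >= 2]
    pairs = []
    for k in range(n_pairs):
        if k % 2 == 0:
            # Same ID: two distinct images of one subject
            s = rng.choice(multi)
            a, b = rng.sample(images_by_subject[s], 2)
            pairs.append((a, b, True))
        else:
            # Different ID: one image from each of two distinct subjects
            s1, s2 = rng.sample(all_subjects, 2)
            pairs.append((rng.choice(images_by_subject[s1]),
                          rng.choice(images_by_subject[s2]), False))
    return pairs

# Hypothetical group with three subjects
group = {"s1": ["s1_a", "s1_b"], "s2": ["s2_a", "s2_b", "s2_c"], "s3": ["s3_a"]}
pairs = sample_pairs(group, 10)
```

Running the same routine once per group, with the same `n_pairs`, keeps every group equally represented in the validation set.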

slide-7
SLIDE 7

Public models: examples of failures

[Figure: example face pairs, false positives and false negatives]

slide-8
SLIDE 8

Public models: evaluation

[Figure: evaluation of FaceNet (2015), SphereFace (2017), VGGFace (2015) and VGGFace2 (2017) by training DB (1MC, CWF, VGG, VG2); groups labeled: White male, Black male, Asian female]

slide-9
SLIDE 9

Managing imbalance

SAMPLING (DATA-ORIENTED)

  • Undersampling
  • Oversampling

TRAINING LOSS (MODEL-ORIENTED)

  • Cost-sensitive learning
  • Multi-task learning (id, gender, ethnicity)

“Features get better at understanding faces, improving performances of individual tasks”

R Ranjan, VM Patel, R Chellappa. “Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition.” TPAMI 2017
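As an illustration of the model-oriented option, cost-sensitive learning can be sketched as a cross-entropy loss scaled by a per-class weight. The inverse-frequency weighting used here is one common choice and an assumption, not something the slides specify; the function names are hypothetical.

```python
import math
from collections import Counter

def class_weights(labels):
    # Inverse-frequency weights, scaled so a perfectly balanced set gets 1.0
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def weighted_cross_entropy(prob_of_target, target, weights):
    # Errors on rare classes cost more than errors on frequent ones
    return -weights[target] * math.log(prob_of_target)

# Hypothetical training labels: 8 samples of class "a", 2 of class "b"
labels = ["a"] * 8 + ["b"] * 2
weights = class_weights(labels)
loss_a = weighted_cross_entropy(0.5, "a", weights)
loss_b = weighted_cross_entropy(0.5, "b", weights)
```

With these labels the rare class "b" is weighted four times as heavily as "a" for the same predicted probability.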

slide-10
SLIDE 10

Managing imbalance – Data augmentation

  • Data augmentation: makes imbalance mitigation much more effective

[Figure: Database → Oversampled DB → Stochastic data augmentation → DNN]

I Masi et al. "Do we really need to collect millions of faces for effective face recognition?" ECCV 2016.

slide-11
SLIDE 11

Managing imbalance – Proposal

Traditional imbalance: $\max(X) / \min(X)$
Proposal: IDR (robust to outliers): $D_9(X) / D_1(X)$

Iterative multi-label oversampling:

  • 1. Find most imbalanced label L
  • 2. Find most imbalanced category C within L
  • 3. Draw random sample from C, replicate

[Figure: $\max(X) / \min(X)$ and $D_9(X) / D_1(X)$ versus #samples added, with $D_1$ and $D_9$ marked]
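The three steps can be sketched as a loop. Several points are assumptions rather than the authors' method: IDR is read as the ratio of the ninth to the first decile of the per-category sample counts, the "most imbalanced category" is taken to be the rarest one, and the stopping threshold `target` and the cap `max_added` are hypothetical parameters.

```python
import random
import statistics
from collections import Counter

def idr(counts):
    # Assumed reading of IDR: ninth decile over first decile of category counts
    if len(counts) < 2:
        return 1.0
    deciles = statistics.quantiles(counts, n=10, method="inclusive")
    return deciles[8] / deciles[0]

def oversample(samples, labels, target=1.5, max_added=10_000, seed=0):
    """samples: list of dicts mapping each label name to a category."""
    rng = random.Random(seed)
    data = list(samples)
    for _ in range(max_added):
        counts = {lab: Counter(s[lab] for s in data) for lab in labels}
        # 1. Find most imbalanced label L
        worst = max(labels, key=lambda lab: idr(list(counts[lab].values())))
        if idr(list(counts[worst].values())) <= target:
            break
        # 2. Find most imbalanced category C within L (assumed: the rarest)
        rare = min(counts[worst], key=counts[worst].get)
        # 3. Draw random sample from C, replicate
        pool = [s for s in data if s[worst] == rare]
        data.append(dict(rng.choice(pool)))
    return data

# Hypothetical single-label example: 9 samples of "m", 1 of "f"
toy = [{"gender": "m"}] * 9 + [{"gender": "f"}]
balanced = oversample(toy, ["gender"], target=1.5)
```

Because a replicated sample carries all of its labels, fixing one label can shift the balance of another, which is why the loop re-evaluates every label on each iteration.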

slide-12
SLIDE 12

Managing imbalance – Sample training batch

[Figure: sample training batch before oversampling and after]

slide-13
SLIDE 13

Managing imbalance

  • Results with ResNet 20 (tiny network, for comparison only)
  • Better with almost 6X fewer subjects, 2X fewer images!

10.6K subjects, 494K images
1.8K subjects, 295K images

slide-14
SLIDE 14

Sparse embedding

Typically, in deep face recognition:

image → CNN → embedding layer → one-hot encoding

  • What about ReLU + embedding + one-hot encoding? (e.g. VGGFace)

Why more dimensions, if 90% zero? Larger representation subspace, at expense of computational efficiency

  • But can gain it back!
  • ~200M comp/s

[Figure: Sparse 4096-d, Dense 512-d, Dict + Dense 256-d]
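One way the efficiency of a mostly-zero descriptor might be gained back is to store and compare only its non-zero entries. The sketch below is an illustration under that assumption, using a plain index-to-value dictionary; it is not the implementation behind the figures on the slide, and the function names and toy vectors are hypothetical.

```python
import math

def to_sparse(vec):
    # Keep only non-zero entries (after ReLU, most entries are exactly zero)
    return {i: v for i, v in enumerate(vec) if v != 0.0}

def sparse_dot(a, b):
    # Iterate over the shorter map; only shared indices contribute
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[i] for i, v in a.items() if i in b)

def cosine(a, b):
    norm_a = math.sqrt(sparse_dot(a, a))
    norm_b = math.sqrt(sparse_dot(b, b))
    return sparse_dot(a, b) / (norm_a * norm_b)

# Hypothetical 6-d descriptors with mostly zero entries
x = to_sparse([0.0, 2.0, 0.0, 0.0, 3.0, 0.0])
y = to_sparse([1.0, 4.0, 0.0, 0.0, 0.0, 5.0])
similarity = cosine(x, y)
```

The cost of one comparison then scales with the number of non-zero entries rather than with the nominal dimension.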

slide-15
SLIDE 15

Conclusions

  • Public training / validation DBs: heavily biased at multiple levels
  • Without balancing, trained models will be biased, too!
  • Prefer “better data” over “more data”
  • Machine Learning vs Machine Teaching
      Machine Learning: designing algorithms to passively train models
      Machine Teaching: choosing which examples to show a learner
  • Explainable ML

Zhu, Xiaojin, et al. "An Overview of Machine Teaching." arXiv preprint arXiv:1801.05927 (2018).

slide-16
SLIDE 16

Questions?

research@hertasecurity.com