SLIDE 1 Breakthroughs in Face Recognition Capability via Deep Learning and GPUs
- Prof. Neil M. Robertson
- Dr. Sankha Mukherjee, Dr. Rolf Baxter
- Dr. Steven Lu
- Dr. Guosheng Hu
- Dr. Yang Hua
- Dr. Yuan Yang
- Dr. Elyor Qodirov
- Soumya Ghosh
SLIDE 2 Face Recognition that really works
- Face recognition in unconstrained scenes is hard
- Changes in pose and lighting make matching faces from different sources very
difficult (e.g. passport to CCTV)
- Recognition, especially at a large scale, is even harder
Anyvision technology matches millions of identities across the range of appearances
SLIDE 3 Pipeline
Face Detection -> Face Alignment -> Discriminative Feature Extraction -> Tracking -> DB Search
40x faster on GPU compared to CPU alone
SLIDE 4 Face Detection in SD -> 4k
- Huge impact on system performance and speed
- High recall and precision needed in real situations: pose, image quality, illumination
- Good methods are not fast enough
  - OTS DL methods (Faster RCNN … YOLO) are all fully convolutional networks
  - They cannot save any computation via early rejection
- Fast methods are not accurate enough
  - Traditional methods, e.g. Haar, cascade method, sliding window
- Cascade structure is essential for speed
  - The image is large
  - Rejecting false positives at an early stage reduces computation at later stages
SLIDE 5 Face Detection
- Reuse feature map to generate multi-scale proposals (scale1, scale2)
- Reuse feature map to filter out false positive detections at an early stage (det->stage1->stage2->out)
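The early-rejection idea can be sketched in a few lines. This is a toy illustration only, not the detector from the slides: the `run_cascade` function, the stage scorers, the thresholds and the toy windows are all hypothetical. Each candidate window is passed to a stage only if it survived the previous one, so cheap early stages remove most false positives before the costlier stages run.

```python
from typing import Callable, List, Sequence, Tuple

# A stage is a (scorer, threshold) pair; the scorer maps a window to a face score.
# This interface is an assumption made for the sketch.
Stage = Tuple[Callable[[Sequence[float]], float], float]


def run_cascade(windows: List[Sequence[float]], stages: List[Stage]):
    """Return (indices of windows passing every stage, number of scorer calls)."""
    survivors = list(range(len(windows)))
    evaluations = 0
    for scorer, threshold in stages:
        kept = []
        for idx in survivors:
            evaluations += 1
            if scorer(windows[idx]) >= threshold:
                kept.append(idx)
        survivors = kept
        if not survivors:
            break  # nothing left to evaluate in later stages
    return survivors, evaluations


# Toy windows: each vector stands in for a region of the shared feature map.
windows = [[0.9, 0.8], [0.1, 0.9], [0.2, 0.1], [0.8, 0.3]]
stages = [
    (lambda w: w[0], 0.5),               # cheap first stage
    (lambda w: (w[0] + w[1]) / 2, 0.7),  # costlier second stage
]
accepted, cost = run_cascade(windows, stages)
print(accepted, cost)
```

In this toy run the first stage rejects two of the four windows, so the second stage is evaluated twice instead of four times.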
SLIDE 6
8 ms on Quadro/Tesla (at HD)
SLIDE 7
Data, algorithms and computation must be treated holistically
SLIDE 8 Data collection and cleanup
- Use a small number of clean annotations from images and videos covering a wide range of
pose, lighting and resolution
- Dynamic clustering and ranking to clean up millions of images and video using the latest
best trained model
- Image attributes and probabilistic graphical models used to quantify weakness of the
features from the net
- Invariant to pose, lighting, facial hair, hairstyle, glasses
- Male vs. Female distinct
- Virtuous cycle of data clean up using current best net and the graphical model
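One simple way to picture the ranking step of the clean-up is sketched below. It is an assumption, not the method from the slides: the images labelled as one identity are embedded with the current best net, ranked by cosine similarity to the identity's mean embedding, and the lowest-ranked ones are flagged for review. The function name, the toy embeddings and the 0.5 threshold are hypothetical.

```python
import math


def _unit(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def rank_by_centroid(embeddings):
    """Rank embeddings of one labelled identity by cosine similarity to their centroid."""
    units = [_unit(e) for e in embeddings]
    dim = len(units[0])
    centroid = _unit([sum(u[d] for u in units) for d in range(dim)])
    scores = [sum(a * b for a, b in zip(u, centroid)) for u in units]
    order = sorted(range(len(units)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy 2-D embeddings: three consistent images and one likely mislabelled image.
embeddings = [[1.0, 0.1], [0.9, 0.0], [1.0, -0.1], [-0.2, 1.0]]
order, scores = rank_by_centroid(embeddings)
suspects = [i for i in order if scores[i] < 0.5]  # hypothetical threshold
print(order, suspects)
```

Retraining on the cleaned set and re-ranking with the new net would give one turn of the cycle described above.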
SLIDE 9 3D Data Augmentation
- Decompose non-linear (non-convex) cost functions into linear (convex) ones
- Closed-form solutions
- Perspective camera
- Phong illumination model
- Contour landmarks
- Efficient and accurate
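Two of the ingredients listed above, the perspective camera and the Phong illumination model, have standard textbook forms, sketched below. This shows only the forward models that could render a fitted 3D face under a new pose or light; it is not the fitting procedure from the slides. The function names and the coefficient values (`ka`, `kd`, `ks`, `shininess`) are assumptions.

```python
import math


def project(point, focal_length):
    """Pinhole perspective projection of a 3D point (x, y, z) with z > 0."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def phong(normal, light_dir, view_dir, ka=0.1, kd=0.7, ks=0.2, shininess=10):
    """Phong intensity at a surface point.

    light_dir points from the surface towards the light,
    view_dir from the surface towards the camera.
    """
    n, l, v = _normalize(normal), _normalize(light_dir), _normalize(view_dir)
    n_dot_l = _dot(n, l)
    diffuse = max(n_dot_l, 0.0)
    # Mirror reflection of the light direction about the normal.
    r = tuple(2 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    specular = max(_dot(r, v), 0.0) ** shininess if n_dot_l > 0 else 0.0
    return ka + kd * diffuse + ks * specular


print(project((1.0, 2.0, 4.0), focal_length=2.0))
print(phong((0, 0, 1), (0, 0, 1), (0, 0, 1)))   # light and camera facing the surface
print(phong((0, 0, 1), (0, 0, -1), (0, 0, 1)))  # light behind the surface: ambient only
```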
SLIDE 10
3D Data Augmentation
SLIDE 11
3D Data Augmentation
SLIDE 12 Generative approaches to missing data
- Generator and Discriminator autoencoders
- Loss function is the difference of the encoding loss distribution between the
original image and the generated image
- Loss = F(|A(i_real)-A(i_gen)|)
[A=encoding loss of discriminator, i_real=real image, i_gen=generated image]
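A scalar toy version of this loss is sketched below. The slide does not specify F or the autoencoder, so several things here are assumptions: A is taken to be the mean absolute reconstruction error of the discriminator autoencoder, F defaults to the identity, and `toy_autoencode` is a hypothetical stand-in that simply clamps pixel values. The slide speaks of a loss distribution; this sketch compares single values only.

```python
def encoding_loss(image, autoencode):
    """A(i): mean absolute reconstruction error of the discriminator autoencoder."""
    recon = autoencode(image)
    return sum(abs(p - q) for p, q in zip(image, recon)) / len(image)


def generator_loss(i_real, i_gen, autoencode, F=lambda d: d):
    """Loss = F(|A(i_real) - A(i_gen)|); F defaults to the identity (assumption)."""
    return F(abs(encoding_loss(i_real, autoencode) - encoding_loss(i_gen, autoencode)))


def toy_autoencode(image):
    """Hypothetical autoencoder: reproduces pixels in [0.2, 0.8], clamps the rest."""
    return [min(max(p, 0.2), 0.8) for p in image]


i_real = [0.2, 0.5, 0.8, 0.5]  # reconstructed perfectly, so A(i_real) = 0
i_gen = [0.0, 0.5, 1.0, 0.5]   # two pixels fall outside the reproducible range
loss = generator_loss(i_real, i_gen, toy_autoencode)
print(loss)
```

The loss is zero when the discriminator reconstructs the generated image as well as it reconstructs the real one.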
SLIDE 13 Generative Approach- Face Enhancement
Image columns: Generated, Ground Truth, Input
SLIDE 14
Attribute feature fusion
SLIDE 15 Tensor Fusion- Gated Two Stream Networks
Optimisation
Tucker Decomposition
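The slide gives no detail on how the Tucker decomposition enters the fusion, so the following is a generic sketch under stated assumptions: two feature vectors x and y are fused by a bilinear map with a three-way weight tensor W, and W is stored in Tucker form (a small core plus three factor matrices). All names, shapes and values are hypothetical, and the gating of the two streams is not modelled. The point shown is that the factored form gives the same output as the full tensor without ever building it.

```python
def tucker_fuse(x, y, core, U1, U2, U3):
    """Bilinear fusion through Tucker factors, without forming the full tensor."""
    r1, r2, r3 = len(core), len(core[0]), len(core[0][0])
    # Project each input onto its factor matrix.
    px = [sum(U1[i][a] * x[i] for i in range(len(x))) for a in range(r1)]
    py = [sum(U2[j][b] * y[j] for j in range(len(y))) for b in range(r2)]
    # Mix the projections through the small core tensor.
    g = [sum(core[a][b][c] * px[a] * py[b] for a in range(r1) for b in range(r2))
         for c in range(r3)]
    # Expand to the output dimension.
    return [sum(U3[k][c] * g[c] for c in range(r3)) for k in range(len(U3))]


def full_tensor(core, U1, U2, U3):
    """Reconstruct W[i][j][k] = sum_abc core[a][b][c] * U1[i][a] * U2[j][b] * U3[k][c]."""
    r1, r2, r3 = len(core), len(core[0]), len(core[0][0])
    return [[[sum(core[a][b][c] * U1[i][a] * U2[j][b] * U3[k][c]
                  for a in range(r1) for b in range(r2) for c in range(r3))
              for k in range(len(U3))]
             for j in range(len(U2))]
            for i in range(len(U1))]


def bilinear_fuse(x, y, W):
    """Reference fusion with the full tensor: z[k] = sum_ij W[i][j][k] * x[i] * y[j]."""
    return [sum(W[i][j][k] * x[i] * y[j] for i in range(len(x)) for j in range(len(y)))
            for k in range(len(W[0][0]))]


x, y = [1, 2, 3], [4, 5]                 # two toy feature streams
U1 = [[1, 0], [0, 1], [1, 1]]            # 3 x 2
U2 = [[1, 2], [0, 1]]                    # 2 x 2
U3 = [[1, 0], [1, 1], [0, 2]]            # 3 outputs x 2
core = [[[1, 0], [0, 1]], [[2, 1], [1, 0]]]  # 2 x 2 x 2

z_factored = tucker_fuse(x, y, core, U1, U2, U3)
z_full = bilinear_fuse(x, y, full_tensor(core, U1, U2, U3))
print(z_factored, z_full)
```

With realistic feature sizes the full tensor has one entry per (input, input, output) triple, while the Tucker form stores only the core and three thin matrices.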
SLIDE 16 Feature Invariance – Data and Algorithmic Harmony
- Glasses (Blue) vs. No Glasses (Red)
- Blurry (Blue) vs. Not Blurry (Red)
- Facial Hair (Blue) vs. No Facial Hair (Red)
- Goatee, Moustache
SLIDE 17 Our Nets Contain Attributes
- Female (Red) vs. Male (Blue)
- Young (Blue) vs. Not Young (Red)
SLIDE 18
DB Search
SLIDE 19
Using GPU
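The slides do not describe the search itself. A common baseline, sketched here as an assumption, is nearest-neighbour search over face feature vectors by cosine similarity. The gallery contents and function names are hypothetical. On a GPU the same computation would typically be done as one batched matrix product between the query features and the gallery matrix; plain Python is used here only to keep the sketch self-contained.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


def search(query, gallery, top_k=1):
    """Return the top_k (identity, score) pairs for a query feature."""
    scored = sorted(((cosine(query, feat), name) for name, feat in gallery.items()),
                    reverse=True)
    return [(name, score) for score, name in scored[:top_k]]


# Hypothetical gallery of enrolled identities.
gallery = {
    "id_a": [1.0, 0.0, 0.0],
    "id_b": [0.0, 1.0, 0.0],
    "id_c": [0.6, 0.8, 0.0],
}
matches = search([0.1, 0.9, 0.0], gallery, top_k=2)
print(matches)
```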
SLIDE 20 Real World Testimony
- We are deployed right now all over the world
- The system runs real-time on 5 Megapixel video streams.
- We run 10+ cameras on a single commercial GPU
A workstation can support multiple GPUs … scale up
- From a current deployment in high profile, high security locations
  - 120K people pass through the system every 3 days, all in real time
  - 50 people in the watchlist
  - 99.9% recognition accuracy
  - <1 false positive per day
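The deployment figures above imply a very low false positive rate, which a short calculation makes concrete. This is back-of-envelope arithmetic on the quoted numbers only. It assumes each person is compared once against each watchlist entry, whereas a real system sees many frames per person, so the per-comparison rate would need to be lower still.

```python
people_per_3_days = 120_000
watchlist_size = 50
max_false_positives_per_day = 1  # the slide states fewer than 1 per day

people_per_day = people_per_3_days / 3
# Upper bound on the false positive rate per person passing through.
fp_rate_per_person = max_false_positives_per_day / people_per_day
# Upper bound per person-vs-watchlist-entry comparison (one comparison each, by assumption).
comparisons_per_day = people_per_day * watchlist_size
fp_rate_per_comparison = max_false_positives_per_day / comparisons_per_day

print(people_per_day, fp_rate_per_person, fp_rate_per_comparison)
```

That is 40,000 people per day, a false positive rate below 1 in 40,000 per person, and below 1 in 2,000,000 per comparison under the stated assumption.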
SLIDE 21
Pose Invariant Recognition NIST rank #1
IJB-A
SLIDE 22
Results- Pose
SLIDE 23
Illumination and Expression
SLIDE 24
NIR Cross-Modality
SLIDE 25
Labeled Faces in the Wild
SLIDE 26
“The numbers make it tactical”
SLIDE 27
We are hiring in the UK
- Algorithms engineers
- Machine learning specialists
- Hardware experts - GPU/FPGA
info@anyvision.co.uk