Breakthroughs in Face Recognition Capability via Deep Learning and - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Breakthroughs in Face Recognition Capability via Deep Learning and GPUs

  • Prof. Neil M. Robertson
  • Dr. Sankha Mukherjee, Dr. Rolf Baxter
  • Dr. Steven Lu
  • Dr. Guosheng Hu
  • Dr. Yang Hua
  • Dr. Yuan Yang
  • Dr. Elyor Qodirov
  • Soumya Ghosh

slide-2
SLIDE 2

Face Recognition that really works

  • Face recognition in unconstrained scenes is hard
  • Changes in pose and lighting make matching faces from different sources very difficult (e.g. passport to CCTV)
  • Recognition, especially at a large scale, is even harder

Anyvision technology matches millions of identities across the range of appearances

slide-3
SLIDE 3

Pipeline

Face Detection -> Face Alignment -> Discriminative Feature Extraction -> Tracking -> DB Search

40x faster on GPU compared to CPU alone

slide-4
SLIDE 4

Face Detection in SD -> 4k

  • Huge impact on system performance and speed
  • High recall and precision in real situations (pose, image quality, illumination)
  • Good methods are not fast enough
      • Off-the-shelf (OTS) DL methods (Faster R-CNN … YOLO) are all fully convolutional networks
      • They cannot save any computation via early rejection
  • Fast methods are not accurate enough
      • Traditional methods, e.g. Haar, cascade method, sliding window
  • Cascade structure is essential for speed
      • The image is large
      • Reject false positives at an early stage
      • This reduces computation at later stages
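
The early-rejection argument above can be illustrated with a toy cascade. This is a minimal sketch, not the presenters' detector: the `run_cascade` helper, the stage functions and the thresholds are all hypothetical, and each "window" is just a number standing in for how face-like an image region is. The point it shows is that a candidate rejected by a cheap early stage never reaches the later stages, so the total number of stage evaluations drops.

```python
from typing import Callable, List, Sequence, Tuple

# A stage is (scoring function, rejection threshold). Both are hypothetical
# stand-ins; a real detector would score image windows with learned models.
Stage = Tuple[Callable[[float], float], float]


def run_cascade(
    candidates: Sequence[float], stages: Sequence[Stage]
) -> Tuple[List[float], int]:
    """Return the surviving candidates and the number of stage evaluations."""
    evaluations = 0
    survivors: List[float] = []
    for cand in candidates:
        accepted = True
        for score_fn, threshold in stages:
            evaluations += 1
            if score_fn(cand) < threshold:
                accepted = False  # early rejection: later stages are skipped
                break
        if accepted:
            survivors.append(cand)
    return survivors, evaluations


# Toy data: each window is a single face-likeness value (an assumption).
windows = [0.05, 0.10, 0.20, 0.55, 0.90, 0.95]
stages = [
    (lambda w: w, 0.5),       # first stage rejects most windows
    (lambda w: w * w, 0.5),   # stricter second stage
    (lambda w: w ** 3, 0.5),  # strictest final stage
]
faces, cost = run_cascade(windows, stages)
print(faces, cost, len(windows) * len(stages))
```

In this toy run the cascade performs 11 stage evaluations instead of the 18 that scoring every window at every stage would need, and the saving grows with the share of windows rejected early.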

slide-5
SLIDE 5

Face Detection

  • Reuse feature map to generate multi-scale proposals (scale1, scale2)
  • Reuse feature map to filter out false positive detections at an early stage (det -> stage1 -> stage2 -> out)

slide-6
SLIDE 6

8 ms on Quadro/Tesla (at HD)

slide-7
SLIDE 7

Data, algorithms and computation must be treated holistically

slide-8
SLIDE 8

Data collection and cleanup

  • Use a small number of clean annotations from images and videos covering a wide range of pose, lighting and resolution
  • Dynamic clustering and ranking to clean up millions of images and videos using the latest best trained model
  • Image attributes and probabilistic graphical models used to quantify weakness of the features from the net
  • Invariant to pose, lighting, facial hair, hairstyle, glasses
  • Male vs. Female distinct
  • Virtuous cycle of data clean-up using current best net and the graphical model
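
One way to picture the ranking step in the clean-up loop is sketched below. It is an illustration under stated assumptions, not the presenters' pipeline: the embeddings are taken to come from the current best model, the helper names (`cosine`, `rank_by_centroid`, `clean`) and the `min_similarity` threshold are hypothetical, and the graphical-model component is left out. Images labelled with one identity are ranked by cosine similarity to their mean embedding, and low-ranked images are flagged as likely label noise.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_centroid(
    embeddings: Dict[str, Sequence[float]]
) -> List[Tuple[float, str]]:
    """Rank images of one labelled identity by similarity to their mean embedding."""
    dim = len(next(iter(embeddings.values())))
    n = len(embeddings)
    centroid = [sum(e[i] for e in embeddings.values()) / n for i in range(dim)]
    scored = [(cosine(e, centroid), name) for name, e in embeddings.items()]
    return sorted(scored, reverse=True)


def clean(
    embeddings: Dict[str, Sequence[float]], min_similarity: float = 0.8
) -> Tuple[List[str], List[str]]:
    """Split images into (keep, drop) using a hypothetical similarity threshold."""
    ranked = rank_by_centroid(embeddings)
    keep = [name for sim, name in ranked if sim >= min_similarity]
    drop = [name for sim, name in ranked if sim < min_similarity]
    return keep, drop


# Toy 2-D embeddings for images that all carry the same identity label.
identity_images = {
    "img_a": [1.0, 0.1],
    "img_b": [0.9, 0.2],
    "img_c": [1.0, 0.0],
    "img_d": [-0.2, 1.0],  # far from the others: likely mislabelled
}
keep, drop = clean(identity_images)
print(keep, drop)
```

In a loop of the kind the slide describes, the cleaned set would be used to retrain the network, and the retrained network would then re-embed and re-rank the data.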
slide-9
SLIDE 9

3D Data Augmentation

  • Decompose non-linear (non-convex) cost functions into linear (convex) ones
  • Closed-form solutions
  • Perspective camera
  • Phong illumination model
  • Contour landmarks
  • Efficient and accurate
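
For reference, the two standard models named above can be written as follows. These are the textbook forms, not necessarily the exact parametrisation used in the presenters' 3D fitting, and none of the symbols appear on the slide. A perspective camera with focal length $f$ and principal point $(c_x, c_y)$ projects a 3D point $(X, Y, Z)$ to image coordinates $(u, v)$, and the Phong model gives the intensity $I$ at a surface point with unit normal $\hat{N}$, viewed from direction $\hat{V}$ and lit from directions $\hat{L}_m$.

```latex
% Perspective projection
u = f\,\frac{X}{Z} + c_x, \qquad v = f\,\frac{Y}{Z} + c_y

% Phong illumination: ambient + diffuse + specular terms
I = k_a\, i_a + \sum_{m \in \text{lights}}
    \left[ k_d\, (\hat{L}_m \cdot \hat{N})\, i_{m,d}
         + k_s\, (\hat{R}_m \cdot \hat{V})^{\alpha}\, i_{m,s} \right],
\qquad
\hat{R}_m = 2\,(\hat{L}_m \cdot \hat{N})\,\hat{N} - \hat{L}_m
```

Here $k_a$, $k_d$, $k_s$ are the ambient, diffuse and specular reflection coefficients, $\alpha$ is the shininess exponent, and $i_a$, $i_{m,d}$, $i_{m,s}$ are the light intensities. Varying pose through the camera and lighting through the $\hat{L}_m$ terms is what lets a fitted 3D face be re-rendered under new conditions.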
slide-10
SLIDE 10

3D Data Augmentation

slide-11
SLIDE 11

3D Data Augmentation

slide-12
SLIDE 12

Generative approaches to missing data

  • Generator and Discriminator autoencoders
  • Loss function is the difference of the encoding loss distribution between the original image and the generated image
  • Loss = F(|A(i_real) - A(i_gen)|)

[A = encoding loss of discriminator, i_real = real image, i_gen = generated image]
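
The loss formula above can be sketched in a few lines. This is only an illustration under assumptions the slide does not state: `A` is taken to be the mean absolute reconstruction error of the discriminator autoencoder, `F` defaults to the identity, and the "trained discriminator" is a hypothetical stand-in that reconstructs every image as mid-grey.

```python
from typing import Callable, Sequence

Image = Sequence[float]  # flattened pixel intensities in [0, 1]


def encoding_loss(autoencoder: Callable[[Image], Image], image: Image) -> float:
    """A(i): mean absolute reconstruction error (an assumed definition)."""
    recon = autoencoder(image)
    return sum(abs(p - q) for p, q in zip(image, recon)) / len(image)


def generative_loss(
    autoencoder: Callable[[Image], Image],
    real: Image,
    generated: Image,
    f: Callable[[float], float] = lambda d: d,  # F is unspecified; identity assumed
) -> float:
    """Loss = F(|A(i_real) - A(i_gen)|)."""
    gap = abs(encoding_loss(autoencoder, real) - encoding_loss(autoencoder, generated))
    return f(gap)


def toy_discriminator(image: Image) -> Image:
    """Hypothetical stand-in: reconstructs every pixel as mid-grey (0.5)."""
    return [0.5 for _ in image]


i_real = [0.5, 0.5, 1.0, 0.0]
i_flat = [0.5, 0.5, 0.5, 0.5]   # generated image that is too easy to reconstruct
i_match = [1.0, 0.0, 0.5, 0.5]  # generated image with the same encoding loss

loss_flat = generative_loss(toy_discriminator, i_real, i_flat)
loss_match = generative_loss(toy_discriminator, i_real, i_match)
print(loss_flat, loss_match)
```

The loss is zero when the generated image is exactly as hard for the discriminator to reconstruct as the real one, which is the matching condition the formula expresses.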

slide-13
SLIDE 13

Generative Approach - Face Enhancement

[Figure panels: Generated | Ground Truth | Input]

slide-14
SLIDE 14

Attribute feature fusion

slide-15
SLIDE 15

Tensor Fusion - Gated Two-Stream Networks

Optimisation

Tucker Decomposition
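
For readers unfamiliar with the term, the standard Tucker decomposition of a third-order tensor is given below. How it is applied inside the gated two-stream network is not stated on the slide, so the reading of $\mathcal{W}$ as a fusion tensor linking two feature streams is an assumption; the symbols are the conventional ones.

```latex
\mathcal{W} \approx \mathcal{G} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)},
\qquad
w_{ijk} \approx \sum_{p=1}^{R_1} \sum_{q=1}^{R_2} \sum_{r=1}^{R_3}
    g_{pqr}\, u^{(1)}_{ip}\, u^{(2)}_{jq}\, u^{(3)}_{kr}
```

Here $\mathcal{W} \in \mathbb{R}^{D_1 \times D_2 \times D_3}$, the core tensor is $\mathcal{G} \in \mathbb{R}^{R_1 \times R_2 \times R_3}$, and the factor matrices are $U^{(n)} \in \mathbb{R}^{D_n \times R_n}$. The decomposition stores $R_1 R_2 R_3 + D_1 R_1 + D_2 R_2 + D_3 R_3$ parameters instead of $D_1 D_2 D_3$, which is why it is a common choice when a full fusion tensor would be too large to optimise directly.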

slide-16
SLIDE 16

Feature Invariance – Data and Algorithmic Harmony

[Figure legend: Glasses (Blue) vs. No Glasses (Red); Blurry (Blue) vs. Not Blurry (Red); Facial Hair (Blue) vs. No Facial Hair (Red); Goatee; Moustache]

slide-17
SLIDE 17

Our Nets Contain Attributes

[Figure legend: Female (Red) vs. Male (Blue); Young (Blue) vs. Not Young (Red)]

slide-18
SLIDE 18

DB Search

slide-19
SLIDE 19

Using GPU

slide-20
SLIDE 20

Real World Testimony

  • We are deployed right now all over the world
  • The system runs in real time on 5-megapixel video streams
  • We run 10+ cameras on a single commercial GPU
  • A workstation can support multiple GPUs … scale up
  • From a current deployment in high-profile, high-security locations:
      • 120K people pass through the system every 3 days, all in real time
      • 50 people in the watchlist
      • 99.9% recognition accuracy
      • <1 false positive per day
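
The deployment figures above imply a very low false-positive rate, which a short calculation makes concrete. The sketch assumes one probe per person and that each probe is compared against every watchlist entry; neither assumption is stated on the slide, and the variable names are illustrative.

```python
# Figures quoted on the slide.
people_per_period = 120_000
period_days = 3
watchlist_size = 50
max_false_positives_per_day = 1  # the slide says "<1", so this is an upper bound

# Derived quantities (assuming one probe per person, compared against
# every watchlist entry).
people_per_day = people_per_period / period_days
comparisons_per_day = people_per_day * watchlist_size
fp_rate_per_person = max_false_positives_per_day / people_per_day
fp_rate_per_comparison = max_false_positives_per_day / comparisons_per_day

print(f"people per day:           {people_per_day:,.0f}")
print(f"comparisons per day:      {comparisons_per_day:,.0f}")
print(f"false positives / person: < {fp_rate_per_person:.1e}")
print(f"false positives / match:  < {fp_rate_per_comparison:.1e}")
```

Under these assumptions the system handles about 40,000 people and 2 million face comparisons per day, so fewer than one false positive per day corresponds to a rate below 2.5 in 100,000 per person, or below 5 in 10 million per comparison.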

slide-21
SLIDE 21

Pose Invariant Recognition NIST rank #1

IJB-A

slide-22
SLIDE 22

Results - Pose

slide-23
SLIDE 23

Illumination and Expression

slide-24
SLIDE 24

NIR Cross-Modality

slide-25
SLIDE 25

Labeled Faces in the Wild

slide-26
SLIDE 26

“The numbers make it tactical”

slide-27
SLIDE 27

We are hiring in the UK

  • Algorithms engineers
  • Machine learning specialists
  • Hardware experts - GPU/FPGA

info@anyvision.co.uk