SLIDE 1

TensorRT Optimizations for Embedded Facial Recognition

Alexey Kadeishvili, CTO, Vocord

SLIDE 2

Vocord Company: Main Facts

www.vocord.com 2

■ Developer of video surveillance and video analytics systems since 1999
■ Deep expertise in facial recognition
■ Top-rated in NIST and Megaface face recognition tests
■ NVIDIA Metropolis program member

Our customers and partners

SLIDE 3

Notable figures

250+ projects for public and private sectors
140 million faces in the enrollment database of a single project
200,000 cameras managed by VOCORD video analysis software
350,000 API requests per month to the VOCORD FaceMatica cloud
Geography: Europe, Middle East, SE Asia, East Asia, Latin America, Oceania

SLIDE 4

Face recognition products

All products support NVIDIA GPU

VOCORD FaceMatica

Face recognition engine in a Cloud

VOCORD NetCam

New generation face recognition camera

VOCORD NanoFace

NVIDIA Jetson-based embedded face recognition solution

VOCORD FaceControl

“Faces in the crowd” FR system

VOCORD FaceControl 3D

Free flow 3D facial recognition

Face Recognition SDK

Face recognition engine SDK

SLIDE 5

Main Factors Impacting Facial Recognition

Enrollment DB, recognition engine, inbound image quality

Enrollment DB quality: something beyond control
Recognition engine: already works as in the Marvel movies

SLIDE 6

VOCORD Facial Recognition Engine

TOP in Megaface Face Scrub Open Challenge 2015-2018

With accuracy 91.76%

TOP in NIST Face Recognition Vendor Test 2016-2018

TPR at FPR = 10^-4: 98.7%; TPR at FPR = 10^-6: 96.6%
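
As an illustration of how a "TPR at fixed FPR" figure is typically obtained, here is a minimal sketch. It assumes similarity scores for genuine (same-person) and impostor (different-person) pairs are available as plain lists; the function name and the toy scores are hypothetical and are not taken from the NIST evaluation.

```python
def tpr_at_fpr(genuine, impostor, target_fpr):
    """Return (threshold, tpr) such that at most target_fpr of the
    impostor pairs score strictly above the threshold.

    Scores are similarities: a pair is accepted when score > threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    imp = sorted(impostor, reverse=True)
    # Number of impostor pairs that may be accepted without exceeding the target.
    allowed = int(target_fpr * len(imp))
    # Accepting only scores strictly above the (allowed + 1)-th highest
    # impostor score admits at most `allowed` impostor pairs.
    threshold = imp[allowed] if allowed < len(imp) else float("-inf")
    tpr = sum(s > threshold for s in genuine) / len(genuine)
    return threshold, tpr


# Toy data for illustration only.
genuine_scores = [0.9, 0.8, 0.55, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.6, 0.05, 0.15, 0.25, 0.35, 0.45]
threshold, tpr = tpr_at_fpr(genuine_scores, impostor_scores, target_fpr=0.1)
print(f"threshold={threshold}, TPR={tpr}")
```

Measuring a rate as low as FPR = 10^-6 this way needs millions of impostor comparisons, which is why such figures come from large benchmark runs.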

SLIDE 7

Cross Nation Invariance

Source: NIST Face recognition vendor test, 2018

SLIDE 8

Pose Invariance

[Figure: FRR (0.05 to 0.25) versus FAR (1.E-07 to 1.E00) by pose angle, with enrollment DB < 30˚. Group 1: < 10˚; Group 2: 10 to 30˚; Group 3: 30 to 45˚; Group 4: 45 to 60˚; Group 5: > 60˚. An additional curve shows > 60˚ with enrollment DB > 60˚.]

SLIDE 9

Image Resolution Impact

[Figure: face identification probability (true identification rate at FAR = 10^-4, 0.7 to 1.0) versus pixels between the eyes, L (12 to 72). Marked points: L = 48 pix and L = 24 pix. Labels: optimal resolution, recommended minimum.]

L: the distance between the eyes, in pixels
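
To relate L to a camera setup, the following sketch estimates the pixels between the eyes from the horizontal image resolution and the width of the scene covered at the subject's distance. The function names are hypothetical, and the 0.063 m eye distance is an assumed typical adult value, not a figure from the slides.

```python
def pixels_between_eyes(image_width_px, scene_width_m, eye_distance_m=0.063):
    """Estimate L for a face at a distance where the frame spans scene_width_m.

    eye_distance_m is an assumed typical adult value (about 63 mm).
    """
    if scene_width_m <= 0:
        raise ValueError("scene_width_m must be positive")
    pixels_per_meter = image_width_px / scene_width_m
    return pixels_per_meter * eye_distance_m


def max_scene_width_m(image_width_px, target_l_px, eye_distance_m=0.063):
    """Widest scene (in meters) that still yields target_l_px between the eyes."""
    if target_l_px <= 0:
        raise ValueError("target_l_px must be positive")
    return image_width_px * eye_distance_m / target_l_px


# Example: a 2560-pixel-wide frame and the two L values marked on the chart.
for target in (48, 24):
    width = max_scene_width_m(2560, target)
    print(f"L = {target} pix -> scene width up to {width:.2f} m")
```

Halving the required L doubles the scene width one camera can cover, which is the trade-off the chart quantifies in terms of identification rate.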

SLIDE 10

How to improve recognition?

Recognition engine: already works as in the Marvel movies
Enrollment DB quality: something beyond control
The quality of acquired face images: point of growth

Enrollment DB Recognition Engine Inbound Image Quality

SLIDE 11

Different types of test datasets

NIST FRVT Report 2017 10 03

SLIDE 12

“Controlled” dataset

Algorithm A Algorithm B

NIST FRVT Report 2017 10 03

SLIDE 13

“Uncontrolled” dataset

Algorithm A Algorithm B

NIST FRVT Report 2017 10 03

SLIDE 14

Controlled vs. Uncontrolled (FRR log scale)

[Figure: FRR (0.1 to 0.7) versus FAR (1.E-07 to 1.E-02). Curves: Algorithm A, uncontrolled environment; Algorithm B, uncontrolled environment; Algorithm A, controlled environment; Algorithm B, controlled environment.]

SLIDE 15

Controlled vs. Uncontrolled (linear scale)

[Figure: FRR (0.1 to 0.7) versus FAR (1.E-07 to 1.E-02). Curves: Algorithm A, uncontrolled environment; Algorithm B, uncontrolled environment; Algorithm A, controlled environment; Algorithm B, controlled environment.]

SLIDE 16

Hit the bottom: Images from IP camera

SLIDE 17

The Advantages of Edge Video Analysis

■ Face recognition onboard
■ No compression artifacts: the image is taken directly from the sensor
■ Dynamic Region of Interest for every intelligent algorithm
■ Algorithm adjustment for a particular camera setup

VOCORD NetCam.AI edge video analytics camera

SLIDE 18

Video Enhancement Onboard

12-bit image with static ROI
12-bit image with dynamic ROI
Backlight, no enhancement

Dynamic ROI enhances the quality of image in the face area

SLIDE 19

VOCORD NetCam.AI HW Features

Automated lens control
High quality sensor
NVIDIA Jetson TX1 GPU

SLIDE 20

VOCORD NetCam.AI Tech Specs

Camera specs
Resolution: 3 to 5 Mpix
Temperature range: -25 °C to +50 °C
Ingress Protection: IP 67
Dimensions: 20x71x150 mm
Power consumption: 15 W

Built-in facial recognition engine specs
Min face resolution for face recognition: 12 pixels between the eyes
Number of faces detected in one frame: up to 25
Latency of biometric template extraction: up to 150 ms per face
Face recognition performance: up to 32 faces/s
Inference framework: TensorRT
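
A quick back-of-the-envelope reading of these specs: the sketch below estimates how long the camera would need to process every face in a crowded frame. It assumes the peak rate of 32 faces/s is sustained, which the slides do not state; the function name is hypothetical.

```python
def seconds_to_process(num_faces, faces_per_second=32.0):
    """Time to extract templates for num_faces at a sustained throughput."""
    if faces_per_second <= 0:
        raise ValueError("faces_per_second must be positive")
    return num_faces / faces_per_second


# A frame with the maximum of 25 detected faces, at the peak rate of 32 faces/s.
print(f"{seconds_to_process(25):.2f} s for a full frame of 25 faces")
```

Under that assumption a fully loaded frame takes under a second, so every face in it can be processed at roughly one frame per second.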

SLIDE 21

Performance on Different Platforms

[Chart: performance by platform for "Shallow", "Medium", and "Deep" CNNs, axis 0 to 35]
NVIDIA Jetson TX1: 32 ("Shallow"), 19 ("Medium"), 12 ("Deep")
Intel Movidius: 9, 6, 4
Qualcomm Snapdragon 820: 2.2, 1.4, 0.9

SLIDE 22

Higher FPS Improves Accuracy

[Figure: FRR versus FAR (1.E-07 to 1.E-02) for “Shallow”, “Medium”, and “Deep” CNNs, comparing single-face recognition with recognition over a track (multiple faces).]
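
The chart contrasts recognition from a single face with recognition over a track (multiple faces of the same person); a higher frame rate yields more faces per track. The slides do not say how the faces in a track are combined. The sketch below assumes one common approach: averaging L2-normalized templates and comparing by cosine similarity. All function names and the toy vectors are hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a template to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def track_template(templates):
    """Fuse the templates of one track: mean of unit vectors, renormalized."""
    if not templates:
        raise ValueError("track must contain at least one template")
    normalized = [l2_normalize(t) for t in templates]
    dim = len(normalized[0])
    mean = [sum(t[i] for t in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a, b):
    """Cosine similarity between two templates."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Toy 2-D templates: two noisy observations of the same enrolled face.
enrolled = [1.0, 0.0]
track = [[1.0, 0.5], [1.0, -0.5]]
single = cosine_similarity(track[0], enrolled)
fused = cosine_similarity(track_template(track), enrolled)
print(f"single face: {single:.3f}, track: {fused:.3f}")
```

In this toy case the per-frame noise cancels in the average, so the fused template matches the enrolled one more closely than either single frame does.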

SLIDE 23

TensorRT vs. MXNet Performance

Platform: NVIDIA Jetson TX1

FPS (TensorRT / MXNet), axis 5 to 35:
“Shallow” CNN: 32 / 18
“Medium” CNN: 19 / 10
“Very” CNN: 12 / 6
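
The speedup implied by these frame rates can be computed directly. The sketch below assumes the chart values pair up as TensorRT/MXNet per network (32/18, 19/10, 12/6), which is a reading of the chart rather than something the slide states explicitly.

```python
# FPS on NVIDIA Jetson TX1, as read from the chart (pairing is an assumption).
FPS = {
    "Shallow": {"TensorRT": 32, "MXNet": 18},
    "Medium": {"TensorRT": 19, "MXNet": 10},
    "Very": {"TensorRT": 12, "MXNet": 6},
}


def speedups(table):
    """Return the TensorRT-over-MXNet speedup factor for each network."""
    return {name: row["TensorRT"] / row["MXNet"] for name, row in table.items()}


for name, factor in speedups(FPS).items():
    print(f"{name} CNN: {factor:.2f}x")
```

Under this reading, TensorRT is roughly 1.8x to 2x faster than MXNet on the same hardware.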

SLIDE 24

WHAT’S THE PROFIT?

SLIDE 25

Face recognition systems architectures

Edge analytics system with VOCORD NetCam.AI cameras
LAN, Wi-Fi
One archive server
95% of processing happens on the cameras

VS

“Traditional” server architecture approach with regular IP cameras
LAN
Data center with many expensive rack servers
95% of processing happens in the data center

SLIDE 26

Cost-Efficiency: 100 High Loaded Cameras

Edge computing with VOCORD NetCam.AI

Cameras: USD 2,000 x 100 = USD 200,000
Server for matching and archive: USD 10,000
CAPEX: USD 210,000
Maintenance costs: power supply (800 W), bandwidth (2 Gbps), rack space
OPEX: USD 2,000 per year

VS

“Traditional” server architecture with IP cameras

Cameras: USD 500 x 100 = USD 50,000
Detection: 2 servers, 4xCPU, 32 cores each: USD 60,000
Template extraction: 4 servers, 2 Tesla P40 GPUs each: USD 120,000
Server for matching and archive: USD 10,000
CAPEX: USD 240,000
Maintenance costs: power supply (7-8 kW), bandwidth (2 Gbps), rack space
OPEX: USD 30,000 per year
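
The cost comparison can be extended to a multi-year total. This sketch uses only the CAPEX and OPEX figures from the slide; the class and variable names are hypothetical, and constant yearly OPEX is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Architecture:
    name: str
    capex_usd: int
    opex_usd_per_year: int

    def total_cost(self, years):
        """CAPEX plus OPEX over the given number of years (constant OPEX assumed)."""
        return self.capex_usd + self.opex_usd_per_year * years


# Line items from the slide, for 100 high-load cameras.
EDGE = Architecture("Edge (NetCam.AI)", 200_000 + 10_000, 2_000)
TRADITIONAL = Architecture(
    "Traditional servers", 50_000 + 60_000 + 120_000 + 10_000, 30_000
)

for years in (1, 3, 5):
    edge = EDGE.total_cost(years)
    trad = TRADITIONAL.total_cost(years)
    print(f"{years} yr: edge USD {edge:,}, traditional USD {trad:,}")
```

The CAPEX gap is modest (USD 30,000), so most of the difference over several years comes from OPEX.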

SLIDE 27

WHAT’S NEXT?

  • Uploading various video analytics algorithms
  • Highly customized algorithms
  • Interacting cameras as a part of IoT
  • 3D vision

SLIDE 28

Open Platform: Easy Algorithm Uploading

Facial recognition
Behavioral analysis
License plate recognition
Emergency cases
Lost and found objects
Vehicle types

SLIDE 29

Camera-Dependent Algorithm Customization

Step 1. The camera collects images and uploads them to the server
Step 2. The neural network is retrained on the server using the new images
Step 3. A customized, lightweight neural network is uploaded back to the camera

SLIDE 30

Customization to restricted data

Deeper DNNs provide better performance on unrestricted data
On restricted data, the difference between deep and shallow networks is negligible

[Figure: two panels, "Unrestricted data" and "Restricted data", each plotting FRR (0.005 to 0.04) versus FAR for a “Deep” neural network and a “Shallow” neural network.]

SLIDE 31

Intercamera Tracking

NetCam.AI #1 NetCam.AI #2

Face Jeans Bag

SLIDE 32

Obtaining 3D Models

■ Building a 3D object from synchronous snapshots from multiple cameras
■ Feature preprocessing for conjugate points search

SLIDE 33

E-mail: sales@vocord.com Website: www.vocord.com

Thank you for your attention! Questions?