Improving RGB-D face recognition via transferring pretrained 2D networks

Aug 31, 2023

SLIDE 1

INSTITUTE OF COMPUTING TECHNOLOGY

Improving RGB-D face recognition via transferring pretrained 2D networks

Xingwang Xiong, Xu Wen, and Cheng Huang

https://github.com/XingwXiong/Face3D-Pytorch

3D Face Recognition Algorithm Challenge (3DFRAC)

ICT, Chinese Academy of Sciences

SLIDE 2

3DFRAC Bench 19

Face Representations

- Point cloud
- 3D mesh
- Depth image
- RGB image

3D Face Images

SLIDE 3

3D Face Recognition Algorithm Challenge

- RGB-D Face Recognition
  - Each value in the depth image reflects the distance of the scene object's surface from the viewpoint.

[Figure: RGB image, depth image, and RGB-D image; depth camera (Kinect V2)]
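One way to picture an RGB-D image is as an RGB image with the depth map appended as a fourth channel. The sketch below assumes NumPy arrays that are already pixel-aligned; the function name `stack_rgbd`, the array sizes, and the depth values are hypothetical and not taken from the slides.

```python
import numpy as np


def stack_rgbd(rgb, depth):
    """Stack an H x W x 3 RGB image and an H x W depth map into H x W x 4."""
    if rgb.shape[:2] != depth.shape:
        raise ValueError("RGB and depth images must share height and width")
    # Convert both inputs to float32 so they can live in one array.
    rgb_f = rgb.astype(np.float32)
    depth_f = depth.astype(np.float32)[..., None]  # add a channel axis
    return np.concatenate([rgb_f, depth_f], axis=-1)


# Hypothetical inputs: a black 4 x 6 RGB image and a constant depth map.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
depth = np.full((4, 6), 1200, dtype=np.uint16)  # illustrative distance values
rgbd = stack_rgbd(rgb, depth)
print(rgbd.shape)
```

The first three channels carry color and the last carries distance, so a network consuming this array needs a 4-channel input layer.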

SLIDE 4

Why do we need RGB-D FR?

- 2D FR is sensitive to external variations
  - Poses
  - Facial expressions
  - Illuminations
- Extra low-level patterns on depth images
  - Smooth variations
  - Contrasts
  - Borders & global layouts
- Face anti-spoofing (ICPR 2018)

SLIDE 5

Open-set vs. Closed-set FR

- Open-set FR
  - Face embedding
  - Similarity comparison
- Closed-set FR
  - Classification
- 3DFRAC
  - A closed-set problem

(CVPR 2017)
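The difference between the two settings can be sketched in a few lines, following the common convention that closed-set recognition picks one of the known identities (classification), while open-set recognition compares face embeddings. The function names, the toy vectors, and the 0.5 threshold below are illustrative assumptions, not values from the slides.

```python
import numpy as np


def closed_set_predict(logits):
    """Closed-set: return the index of the highest-scoring known identity."""
    return int(np.argmax(logits))


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def open_set_same_identity(emb_a, emb_b, threshold=0.5):
    """Open-set: two faces match if their embeddings are similar enough."""
    return cosine_similarity(emb_a, emb_b) >= threshold


# Toy examples with made-up numbers.
predicted_id = closed_set_predict([0.1, 2.0, -1.0])
same = open_set_same_identity([1.0, 0.0], [0.9, 0.1])
different = open_set_same_identity([1.0, 0.0], [0.0, 1.0])
```

In the closed-set case every test identity was seen in training, so plain classification accuracy is a natural metric.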

SLIDE 6

RGB Images vs. Depth Images

RGB images

- High-frequency patterns
  - Textures & details
- Easy to obtain
- Massive scale
  - ∼3.3 million faces [1]
  - ∼9K identities [1]

Depth images

- Low-frequency patterns
  - Smooth variations
  - Contrasts
  - Borders & global layouts
- Not enough data to train a deep CNN
  - ∼403K faces [2]
  - ∼1.2K identities [2]

[1] VGGFace2
[2] Intellifusion RGB-D face dataset

SLIDE 7

Goal

- To leverage both conventional RGB-based works and depth features

SLIDE 8

Inter-modal Transfer Learning

[Figure: weights are copied from the 2D pretrained network to the RGB-D network]

SLIDE 9

Inter-modal Transfer Learning

- Use ResNet-50 as the backbone network
- Copy pretrained weights of middle layers
- Fine-tune the whole model with 224x224 RGB-D images

SLIDE 10

Preprocessing

- Face detection & alignment
  - MTCNN (SPL 2016)
- Random horizontal flipping
- Normalizing
  - 0-1 range

[Figure: face detection with MTCNN locates the face]
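The flipping and normalizing steps can be sketched with NumPy as follows. The sketch assumes the face has already been detected and cropped, and that all channels are stored as 8-bit values so that dividing by 255 gives the 0-1 range; the function name `preprocess` and the `flip_prob` parameter are hypothetical, and the detection step itself is not shown.

```python
import numpy as np


def preprocess(face, rng, flip_prob=0.5):
    """Normalize a cropped H x W x C uint8 face to [0, 1] and randomly mirror it."""
    out = face.astype(np.float32) / 255.0  # scale to the 0-1 range
    if rng.random() < flip_prob:
        out = out[:, ::-1, :]  # horizontal flip: reverse the width axis
    return np.ascontiguousarray(out)


rng = np.random.default_rng(0)
face = np.arange(2 * 3 * 4, dtype=np.uint8).reshape(2, 3, 4)  # toy 2 x 3 RGB-D crop
flipped = preprocess(face, rng, flip_prob=1.0)    # always flips
unflipped = preprocess(face, rng, flip_prob=0.0)  # never flips
```

Flipping is applied to all channels together so that color and depth stay pixel-aligned.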

SLIDE 11

Results

- 94.64% accuracy on the Intellifusion RGB-D dataset
- Won 1st place in 3DFRAC

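| Method                 | Input data   | CNN model      | Accuracy (%) |
|------------------------|--------------|----------------|--------------|
| Pretrained on ImageNet | RGB images   | ResNet-50      | 94.47        |
| Train from scratch     | RGB-D images | RGB-D ResNet50 | 88.36        |
| Pretrained on ImageNet | RGB-D images | RGB-D ResNet50 | 94.64        |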

SLIDE 12

Conclusions

- Inter-modal transfer learning from pretrained 2D networks to RGB-D networks improves recognition accuracy
- Code is open-sourced
  - https://github.com/XingwXiong/Face3D-Pytorch
- Email
  - xiongxingwang@ict.ac.cn
