Data Bias in Visual Recognition
Speaker: Weihong Deng, Beijing University of Posts and Telecommunications
- Mar. 2020 VALSE
Visual recognition
Courtesy of Prof. Fei-fei Li
History of CNN
Kunihiko Fukushima Geoff Hinton Yann LeCun
K Fukushima, Biological Cybernetics, 1980
Y LeCun, et al., Proceedings of the IEEE, 1998
A Krizhevsky, I Sutskever, GE Hinton, NIPS 2012
Real-world Recognition Bias
Google Photos, Amazon Rekognition, Tesla Autopilot
Data bias, algorithm bias
What causes the bias in visual recognition?
Racial Faces in-the-Wild (RFW)
Mei Wang, Weihong Deng, et al., Racial Faces in-the-Wild: Reducing Racial Bias by Information Maximization Adaptation Network, ICCV 2019.
RFW
Model            Caucasian  Indian  Asian  African
SOTA algorithms
Center-loss      87.18      81.92   79.32  78.00
SphereFace       90.80      87.02   82.95  82.28
ArcFace          92.15      88.00   83.98  84.93
VGGFace2         89.90      86.13   84.93  83.38
Mean             90.01      85.77   82.80  82.15
Commercial APIs
Face++           93.90      88.55   92.47  87.50
Baidu            89.13      86.53   90.27  77.97
Amazon           90.45      87.20   84.87  86.27
Microsoft        87.60      82.83   79.67  75.83
Mean             90.27      86.28   86.82  81.89
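To make the disparity in the RFW table concrete, the following Python sketch recomputes, for each model, the spread between its best and its worst racial group. The accuracies are copied from the table; the helper name `race_gap` is ours and purely illustrative.

```python
# Accuracies (%) copied from the RFW table: Caucasian, Indian, Asian, African.
RFW = {
    "Center-loss": (87.18, 81.92, 79.32, 78.00),
    "SphereFace":  (90.80, 87.02, 82.95, 82.28),
    "ArcFace":     (92.15, 88.00, 83.98, 84.93),
    "VGGFace2":    (89.90, 86.13, 84.93, 83.38),
    "Face++":      (93.90, 88.55, 92.47, 87.50),
    "Baidu":       (89.13, 86.53, 90.27, 77.97),
    "Amazon":      (90.45, 87.20, 84.87, 86.27),
    "Microsoft":   (87.60, 82.83, 79.67, 75.83),
}


def race_gap(accuracies):
    """Difference between the best and the worst group accuracy."""
    return max(accuracies) - min(accuracies)


gaps = {model: round(race_gap(acc), 2) for model, acc in RFW.items()}

# Print the models from the smallest to the largest gap.
for model, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{model:12s} gap = {gap:5.2f} points")
```

With these numbers, every listed model shows a gap of more than five points between its best and worst group.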
A major driver of bias in face recognition
Current training databases (pie chart): Caucasian 78%, Asian 5%, Indian 3%, African 14%
Racial distribution (%)
Database         Caucasian  Asian  Indian  African
CASIA-WebFace    84.5       2.6    1.6     11.3
VGGFace2         74.2       6.0    4.0     15.8
MS-Celeb-1M      76.3       6.6    2.6     14.5
Average          78.3       5.0    2.7     13.8
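One way to quantify how skewed these databases are is the ratio between the largest and the smallest group share. The sketch below computes that ratio from the table and, as a generic illustration only, derives inverse-frequency group weights. Inverse-frequency re-weighting is a common baseline that we assume here for illustration; it is not the method of the cited papers, and both function names are hypothetical.

```python
# Racial distribution (%) copied from the table: Caucasian, Asian, Indian, African.
DISTRIBUTION = {
    "CASIA-WebFace": (84.5, 2.6, 1.6, 11.3),
    "VGGFace2":      (74.2, 6.0, 4.0, 15.8),
    "MS-Celeb-1M":   (76.3, 6.6, 2.6, 14.5),
}


def imbalance_ratio(shares):
    """Largest group share divided by the smallest one."""
    return max(shares) / min(shares)


def inverse_frequency_weights(shares):
    """Per-group weights proportional to 1/share, normalised to a mean of 1."""
    inv = [1.0 / s for s in shares]
    mean_inv = sum(inv) / len(inv)
    return [w / mean_inv for w in inv]


ratios = {name: imbalance_ratio(s) for name, s in DISTRIBUTION.items()}
weights = inverse_frequency_weights(DISTRIBUTION["CASIA-WebFace"])

for name, ratio in ratios.items():
    print(f"{name:14s} largest/smallest share = {ratio:.1f}")
```

The rarest group receives the largest weight, which is the basic idea behind re-weighting an imbalanced training set.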
Racial bias: a special imbalanced learning problem
Mei Wang, Weihong Deng, Mitigating Bias in Face Recognition using Skewness-Aware Reinforcement Learning, CVPR 2020
Ethnicity-Aware Training Sets for RFW
BUPT-Globalface (2M images): Caucasian 38%, Asian 31%, Indian 18%, African 13%
BUPT-Balancedface (1.3M images): Caucasian 25%, Asian 25%, Indian 25%, African 25%
Mei Wang, Weihong Deng, Mitigating Bias in Face Recognition using Skewness-Aware Reinforcement Learning, CVPR 2020
Deficiency of Current Training Datasets
We summarize some interesting findings and problems concerning these training sets: depth vs. breadth, long-tail distribution, data noise, and data bias.
Long-tail distribution
The long-tail property refers to the condition where only a limited number of object classes appear frequently, while most others appear relatively rarely.
Mei Wang & Weihong Deng, Deep Face Recognition: A Survey, arXiv:1804.06655
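The long-tail property can be checked on any dataset by measuring how many samples the most frequent classes hold. The sketch below does this on a hypothetical Zipf-like dataset; the per-class counts are synthetic and the helper name `head_share` is ours.

```python
def head_share(counts, top_fraction=0.2):
    """Fraction of all samples that belong to the most frequent classes."""
    ordered = sorted(counts, reverse=True)
    k = max(1, int(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical Zipf-like dataset: the class at rank r has about 1000 / r images.
counts = [1000 // rank for rank in range(1, 101)]
share = head_share(counts)
print(f"top 20% of classes hold {share:.0%} of the images")
```

A share far above `top_fraction` indicates a long tail; for a perfectly balanced dataset the two values coincide.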
Contain a sufficient number of samples to model intra-class variability
Contain a sufficient number ... class variability
Yaoyao Zhong, Weihong Deng, Mei Wang, Jiani Hu, et al., Unequal-training for deep face recognition with long-tailed noisy data, CVPR 2019.
Overview
Bingyu Liu, Weihong Deng, et al., Fair Loss: Margin-aware Reinforcement Learning for Deep Face Recognition, ICCV 2019.
Class Grouping according to sample size
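As a minimal sketch of grouping classes by sample size, the function below assigns each class to a bin according to how many training images it has. The bin boundaries (10 and 100) and the identity names are hypothetical choices for illustration; the cited paper may group classes differently.

```python
from bisect import bisect_right


def group_by_sample_size(class_counts, boundaries):
    """Split classes into len(boundaries) + 1 groups by their sample count.

    A class with n samples goes to group g, where g is the number of
    boundaries that are less than or equal to n.
    """
    groups = [[] for _ in range(len(boundaries) + 1)]
    for cls, n in class_counts.items():
        groups[bisect_right(boundaries, n)].append(cls)
    return groups


# Hypothetical identities with their number of training images.
class_counts = {"id_a": 3, "id_b": 10, "id_c": 57, "id_d": 100, "id_e": 420}
groups = group_by_sample_size(class_counts, boundaries=[10, 100])

for label, members in zip(("n < 10", "10 <= n < 100", "n >= 100"), groups):
    print(f"{label:14s} {members}")
```

Each group can then be treated differently during training, for example with its own margin or sampling rate.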
What causes the bias in visual recognition?
Data collection and annotation process
Collection: keywords (‘smile’, ‘crying’, ‘OMG’, …) -> download XML -> parse URLs -> 60,000 images downloaded
Annotation: crowdsourcing with 315 online volunteers, each image labelled 40 times, 1.2M labels in total
Reliability estimation: an EM framework filters out unreliable labels (enhanced reliability)
Learning from labels: 30K images, single-label / multi-label
Example label probabilities for one image (axis from 0 to 1): 0.12, 0.34, 0.11, 0.02, 0.39, 0.01
Datasets: RAF-DB, RAF-ML
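The reliability-estimation step can be illustrated with a small EM loop. The sketch below assumes a simple "one-coin" annotator model (each annotator is correct with probability `acc` and otherwise picks one of the other classes uniformly); this model, the function name `one_coin_em`, and the toy votes are our assumptions and may differ from the EM framework used for RAF-DB.

```python
from collections import defaultdict


def one_coin_em(votes, num_classes, iters=20):
    """votes: list of (item, annotator, label) triples with integer labels."""
    items = sorted({i for i, _, _ in votes})
    annotators = sorted({a for _, a, _ in votes})
    # Initialise the label posteriors with per-item vote fractions.
    post = {i: [0.0] * num_classes for i in items}
    for i, _, lab in votes:
        post[i][lab] += 1.0
    for i in items:
        total = sum(post[i])
        post[i] = [v / total for v in post[i]]
    acc = {}
    for _ in range(iters):
        # M-step: accuracy = expected fraction of an annotator's correct votes.
        hit, cnt = defaultdict(float), defaultdict(int)
        for i, a, lab in votes:
            hit[a] += post[i][lab]
            cnt[a] += 1
        # Clamp to keep every likelihood term strictly positive.
        acc = {a: min(max(hit[a] / cnt[a], 1e-3), 1 - 1e-3) for a in annotators}
        prior = [sum(post[i][k] for i in items) / len(items)
                 for k in range(num_classes)]
        # E-step: posterior over the true label of every item.
        new_post = {i: [max(p, 1e-6) for p in prior] for i in items}
        for i, a, lab in votes:
            wrong = (1 - acc[a]) / (num_classes - 1)
            for k in range(num_classes):
                new_post[i][k] *= acc[a] if k == lab else wrong
        for i in items:
            z = sum(new_post[i])
            new_post[i] = [v / z for v in new_post[i]]
        post = new_post
    return post, acc


# Toy data: annotators A, B, C are always right, D always picks a wrong class.
truth = [0, 1, 2, 0, 1, 2]
votes = []
for item, t in enumerate(truth):
    for name in ("A", "B", "C"):
        votes.append((item, name, t))
    votes.append((item, "D", (t + 1) % 3))

posteriors, accuracy = one_coin_em(votes, num_classes=3)
estimated = [max(range(3), key=lambda k: posteriors[i][k])
             for i in range(len(truth))]
print(estimated, {a: round(v, 3) for a, v in accuracy.items()})
```

Annotators with a low estimated accuracy can then be down-weighted or removed, and the per-item posteriors give a label distribution that supports both single-label and multi-label use.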
Shan Li, Weihong Deng, Blended Emotion in-the-Wild: Multi-label Facial Expression Recognition Using Crowdsourced Annotations and Deep Locality Feature Learning. IJCV 2019.
Figure: compound expression vs. blended expression, with example label probabilities 0.419355, 0.032258, 0.064516, 0.281250, 0.375000, 0.343750
Shan Li, Weihong Deng, Reliable Crowdsourcing and Deep Locality-Preserving Learning for Unconstrained Facial Expression Recognition. IEEE TIP 2019.
Labels of real-world face datasets are noisy
Motivation: when the face-recognition accuracy of deep models is already much higher than that of humans, the machine may be able to boost itself through automatic data cleansing.
Mei Wang & Weihong Deng, Deep Face Recognition: A Survey, arXiv:1804.06655
The image pairs are from the Similar-Looking LFW database
Weihong Deng, et al., Pattern Recognition, 2017
Deep CNN versus My Students
Figure: accuracy (%) of humans and a CNN (ArcFace, CVPR19) on LFW, SLLFW, CALFW and CPLFW; axis from 80 to 100. Plotted values: 99.55, 96.78, 93.75, 80.45 and 99.85, 92.03, 87.33, 88.42. Annotations: CNN ~ Human ~ 100%; CNN > Human; CNN >> Human; Human > CNN.
Methodology – Overview
Yaobing Zhang, Weihong Deng, et al., Global-Local GCN: Large-Scale Label Noise Cleansing for Face Recognition, CVPR 2020
Methodology – Local Graph Net
Yaobing Zhang, Weihong Deng, et al., Global-Local GCN: Large-Scale Label Noise Cleansing for Face Recognition, CVPR 2020
Experiments – MillionCelebs (2/3)
MegaFace Challenge, IJB-B and IJB-C
Yaobing Zhang, Weihong Deng, et al., Global-Local GCN: Large-Scale Label Noise Cleansing for Face Recognition, CVPR 2020
What causes the bias in visual recognition?
Ethnicity-Aware Training Sets for RFW
BUPT-Transferface: Caucasian 75% (labeled); Asian, Indian and African 8% each (unlabeled)
Mei Wang, Weihong Deng, et al., Racial Faces in-the-Wild: Reducing Racial Bias by Information Maximization Adaptation Network, ICCV 2019.
Clustering to generate pseudo-labels
Learn a discriminative distribution at the cluster level for the non-Caucasian races
Methods          Caucasian  Indian  Asian  African
Softmax          94.12      88.33   84.60  83.47
DDC-S            ...        ...     86.32  84.95
DAN-S            ...        ...     85.53  84.10
IMAN-S (ours)    ...        ...     89.88  89.13
Recognition accuracy on the non-Caucasian races is boosted
Mei Wang, Weihong Deng, et al., Racial Faces in-the-Wild: Reducing Racial Bias by Information Maximization Adaptation Network, ICCV 2019.
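Generating pseudo-labels by clustering can be sketched as follows: unlabeled face embeddings are clustered, and the cluster index is used as a stand-in identity label. The sketch assumes plain k-means with a deterministic farthest-point initialisation and toy 2-D features; the clustering algorithm actually used in the cited paper is not specified here, and all names are illustrative.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pseudo_labels(features, k, iters=10):
    """Cluster the features and return one cluster index per sample."""
    # Deterministic farthest-point initialisation of the k centres.
    centers = [features[0]]
    while len(centers) < k:
        centers.append(
            max(features, key=lambda f: min(sq_dist(f, c) for c in centers)))
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest centre for every sample.
        labels = [min(range(k), key=lambda c: sq_dist(f, centers[c]))
                  for f in features]
        # Update step: each centre becomes the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy embeddings of unlabeled target-domain faces (two obvious groups).
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
pseudo_labels = kmeans_pseudo_labels(features, k=2)
print(pseudo_labels)
```

The pseudo-labels can then supervise training on the unlabeled groups in place of the missing identity annotations.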
Datasets play an important role in the progress of facial expression recognition algorithms, but they may suffer from obvious biases caused by different cultures and collection conditions. Hence, methods evaluated only with an intra-database protocol may lack generalization capability on unseen samples at test time.
Shan Li, and Weihong Deng, A Deeper Look at Facial Expression Dataset Bias. IEEE TAC 2020.
Capture bias:
Each dataset tends to have its own preferences during the construction process.
Experiment Ⅰ: database recognition
Experiment Ⅱ: cross-dataset generalization
Category bias:
Annotators of each dataset may perceive the emotion conveyed in an image differently, and many images express more than one expression, which increases the uncertainty of annotation.
Shan Li, and Weihong Deng, A Deeper Look at Facial Expression Dataset Bias. IEEE TAC 2020.
Man-made Adversarial Uncertainty
Different people. Confidence is 0.08944
The same person. Confidence is 0.91928
First step: seek potential adversarial examples by exploiting gradient vulnerability
Second step: conduct triplet metric learning based on the anchors (adversarial samples)
Actively mine the potential noisy points
Set them as anchor samples for triplet metric learning
Address the adversarial samples
Yaoyao Zhong, Weihong Deng, Adversarial Learning with Margin-based Triplet Embedding Regularization, ICCV 2019.
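The second step relies on a triplet objective. The sketch below shows a generic margin-based triplet loss on embedding vectors, assuming squared Euclidean distance and a hypothetical margin of 0.2; the regularizer in the cited paper may be defined differently, and the toy embeddings are invented for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss: the positive must be closer to the anchor than the
    negative by at least `margin`, otherwise the violation is penalised."""
    return max(0.0,
               sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


anchor = (0.0, 0.0)         # e.g. embedding of a mined adversarial example
positive = (0.0, 1.0)       # embedding with the same identity
easy_negative = (3.0, 0.0)  # far from the anchor: constraint satisfied
hard_negative = (1.0, 0.0)  # as close as the positive: constraint violated

easy = triplet_loss(anchor, positive, easy_negative)
hard = triplet_loss(anchor, positive, hard_negative)
print(easy, hard)
```

Only triplets that violate the margin contribute to the loss, which is why mining hard anchors, such as adversarial examples, matters.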
The experimental results on MNIST, CASIA-WebFace, VGGFace2 and MS-Celeb-1M reveal that our method increases the robustness of the network against adversarial attacks in simple object classification and deep face recognition.
Figure 2. Accuracy on clean images, and adversarial examples
Figure 1. Embedding space visualization of MNIST trained with Softmax and Softmax+MTER.
Conclusions
Real-world imbalanced data bias is more complex than that in simulation experiments.
Reinforcement / transfer learning for mitigating racial bias (CVPR20a, ICCV19b)
Grouping-based unequal training for long-tailed datasets (CVPR19, ICCV19a)
Models learned on automatically cleansed datasets can improve SOTA performance.
Top performance on IJB-C for face recognition (CVPR20b)
Data collection and labelling, e.g. for emotions, are not merely manual labor; they require interdisciplinary knowledge and robust label estimation.
RAF-DB and RAF-ML for expression analysis (TIP19, IJCV19, TAC20)
Adversarial samples are very dangerous, even for tasks with massive training data and perfect accuracy.
Adversarial training is useful, but does not solve the problem. (ICCV19c)
References
[CVPR20a] Mei Wang, Weihong Deng, Mitigating Bias in Face Recognition using Skewness-Aware Reinforcement Learning, CVPR 2020.
[CVPR19] Yaoyao Zhong, Weihong Deng, Mei Wang, Jiani Hu, et al., Unequal-training for deep face recognition with long-tailed noisy data, CVPR 2019.
[ICCV19a] Bingyu Liu, Weihong Deng, et al., Fair Loss: Margin-aware Reinforcement Learning for Deep Face Recognition, ICCV 2019.
[TIP19] Shan Li, Weihong Deng, Reliable Crowdsourcing and Deep Locality-Preserving Learning for Unconstrained Facial Expression Recognition, IEEE TIP 2019.
[IJCV19] Shan Li, Weihong Deng, Blended Emotion in-the-Wild: Multi-label Facial Expression Recognition Using Crowdsourced Annotations and Deep Locality Feature Learning, IJCV 2019.
[CVPR20b] Yaobing Zhang, Weihong Deng, et al., Global-Local GCN: Large-Scale Label Noise Cleansing for Face Recognition, CVPR 2020.
[TAC20] Shan Li, Weihong Deng, A Deeper Look at Facial Expression Dataset Bias, IEEE TAC 2020.
[ICCV19b] Mei Wang, Weihong Deng, et al., Racial Faces in-the-Wild: Reducing Racial Bias by Information Maximization Adaptation Network, ICCV 2019.
[ICCV19c] Yaoyao Zhong, Weihong Deng, Adversarial Learning with Margin-based Triplet Embedding Regularization, ICCV 2019.
http://www.whdeng.cn