SLIDE 1 Bounding Box Regression With Uncertainty for Accurate Object Detection
Yihui He1, Chenchen Zhu1, Jianren Wang1, Marios Savvides1, Xiangyu Zhang2
1Carnegie Mellon University 2Megvii
SLIDE 2 Ambiguity: inaccurate labelling
SLIDE 3 Ambiguity: inaccurate labelling
SLIDE 4 Ambiguity: introduced by occlusion
SLIDE 5 Ambiguity: object boundary itself is ambiguous
SLIDE 6
Classification Score & Localization misalignment
MS-COCO VGG-16 Faster RCNN
SLIDE 7 Standard Faster R-CNN Pipeline
fc outputs: 1024 × 81 (class scores), 1024 × 81 × 4 (box regression)
Cross entropy/focal loss
SLIDE 8 Modeling bounding box prediction
- Predict a Gaussian distribution instead of a single number
https://upload.wikimedia.org/wikipedia/commons/9/9e/Normal_Distribution_NIST.gif
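A minimal sketch of this modeling step: the predicted coordinate is treated as a single-variate Gaussian with mean $x_e$ (the estimated box coordinate) and variance $\sigma^2$ (the predicted localization uncertainty), both output by the network:

```latex
P_\Theta(x) = \frac{1}{\sqrt{2\pi\sigma^2}}
              \exp\!\left(-\frac{(x - x_e)^2}{2\sigma^2}\right)
```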
SLIDE 9 Modeling ground truth bounding box
https://upload.wikimedia.org/wikipedia/commons/b/b4/Dirac_function_approximation.gif
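The ground-truth box has no uncertainty, so it can correspondingly be written as a Dirac delta centered at the labeled coordinate $x_g$, i.e. the $\sigma \to 0$ limit of a Gaussian:

```latex
P_D(x) = \delta(x - x_g)
```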
SLIDE 10
KL Loss: Gaussian meets delta function
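Putting the two together, the regression loss is the KL divergence of the ground-truth delta from the predicted Gaussian; dropping the terms that do not depend on the network parameters gives:

```latex
\begin{aligned}
L_{reg} &= D_{KL}\big(P_D(x)\,\|\,P_\Theta(x)\big) \\
        &= \frac{(x_g - x_e)^2}{2\sigma^2}
           + \frac{1}{2}\log\sigma^2
           + \frac{1}{2}\log 2\pi
           + H\big(P_D(x)\big) \\
        &\propto \frac{(x_g - x_e)^2}{2\sigma^2} + \frac{1}{2}\log\sigma^2
\end{aligned}
```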
SLIDE 11
Architecture
An additional fully-connected branch predicts the variance (1024 × 81 × 4), alongside the existing classification (1024 × 81) and box regression (1024 × 81 × 4) outputs
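A hypothetical sketch of the head on this slide: a shared 1024-d RoI feature feeds three linear branches, where the third (log-variance) branch is the addition proposed by the paper. Names and initialization are illustrative, not the authors' exact code.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES, FEAT_DIM = 81, 1024

# Three parallel fc branches on top of the shared 1024-d feature.
W_cls = rng.normal(0, 0.01, (FEAT_DIM, NUM_CLASSES))        # class scores
W_box = rng.normal(0, 0.01, (FEAT_DIM, NUM_CLASSES * 4))    # box deltas
W_var = rng.normal(0, 0.01, (FEAT_DIM, NUM_CLASSES * 4))    # added: log-variance

def head(feat):
    """feat: (N, 1024) RoI features -> class scores, box deltas, alpha = log(sigma^2)."""
    return feat @ W_cls, feat @ W_box, feat @ W_var

feat = rng.normal(size=(2, FEAT_DIM))
scores, deltas, alpha = head(feat)
print(scores.shape, deltas.shape, alpha.shape)  # (2, 81) (2, 324) (2, 324)
```

Predicting per-coordinate, per-class variance keeps the extra cost to a single fc layer.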
SLIDE 12
Why KL Loss
(1) The ambiguities in a dataset can be successfully captured: the bounding box regressor gets a smaller loss from ambiguous bounding boxes.
(2) The learned variance is useful during post-processing: we propose var voting (variance voting) to vote the location of a candidate box using its neighbors' locations, weighted by the predicted variances, during non-maximum suppression (NMS).
(3) The learned probability distribution is interpretable: since it reflects the level of uncertainty of the bounding box prediction, it can potentially be helpful in downstream applications like self-driving cars and robotics.
SLIDE 13
KL Loss: Degradation Case
SLIDE 14
KL Loss: Reparameterization trick
predict α = log(σ²) during training; convert α back to σ during testing
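With the reparameterization on this slide, the loss no longer divides by σ² (which can explode when σ is small early in training); the objective becomes:

```latex
L_{reg} = \frac{e^{-\alpha}}{2}(x_g - x_e)^2 + \frac{\alpha}{2},
\qquad \alpha = \log\sigma^2
```

At test time the variance is recovered as $\sigma^2 = e^{\alpha}$.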
SLIDE 15 KL Loss: Robust L1 Loss (Smooth L1 Loss)
Smooth L1 Loss KL Loss
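A sketch of the robust form of the KL loss, which switches from the quadratic term to a linear term for large errors in the same way smooth L1 does; variable names are illustrative, and α = log(σ²) is the predicted log-variance.

```python
import numpy as np

def kl_loss(x_g, x_e, alpha):
    """Element-wise KL regression loss with the smooth-L1-style robust extension
    (constant terms dropped); alpha = log(sigma^2)."""
    diff = np.abs(x_g - x_e)
    quadratic = np.exp(-alpha) * 0.5 * diff**2   # used when |diff| <= 1
    linear = np.exp(-alpha) * (diff - 0.5)       # used when |diff| > 1
    return np.where(diff <= 1.0, quadratic, linear) + 0.5 * alpha

# With alpha = 0 (sigma = 1), the loss reduces to the standard smooth L1 loss.
out = kl_loss(np.array([0.3, 2.0]), np.array([0.0, 0.0]), np.array([0.0, 0.0]))
print(out)  # [0.045 1.5  ] -- 0.5*0.3^2 and (2.0 - 0.5)
```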
SLIDE 16
KL Loss: Uncertainty Prediction
Sigma in Green box
SLIDE 17
KL Loss: Uncertainty Prediction
Sigma in Green box
SLIDE 18
KL Loss: Uncertainty Prediction
Sigma in Green box
SLIDE 19
KL Loss: Uncertainty Prediction
Sigma in Green box
SLIDE 20 Variance Voting
- Neighbors with larger IoU get higher voting weight
- Neighbors with lower variance get higher voting weight
- Classification scores are left unchanged (score invariance)
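The bullets above can be sketched as follows. This is an illustrative numpy implementation of var voting under the weighting described on the slide (IoU-proximity weight times inverse predicted variance); function and parameter names are assumptions, not the authors' code.

```python
import numpy as np

def iou(a, b):
    """IoU between one box a = (x1, y1, x2, y2) and boxes b of shape (N, 4)."""
    x1 = np.maximum(a[0], b[:, 0]); y1 = np.maximum(a[1], b[:, 1])
    x2 = np.minimum(a[2], b[:, 2]); y2 = np.minimum(a[3], b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter)

def var_vote(box, boxes, variances, sigma_t=0.02):
    """Re-estimate `box` coordinates from neighboring `boxes` (N, 4)
    with per-coordinate predicted `variances` (N, 4)."""
    ious = iou(box, boxes)
    p = np.exp(-((1.0 - ious) ** 2) / sigma_t)   # larger IoU -> higher weight
    w = p[:, None] / variances                   # lower variance -> higher weight
    return (w * boxes).sum(axis=0) / w.sum(axis=0)

# A neighbor with much smaller predicted variance dominates the vote.
boxes = np.array([[0., 0., 10., 10.], [2., 2., 12., 12.]])
variances = np.array([[1e-6] * 4, [1.0] * 4])
print(var_vote(boxes[0], boxes, variances))
```

Classification scores are not touched; only box coordinates are updated, and σ_t controls how far a neighbor can be and still vote (see the ablation slide).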
SLIDE 21
Variance Voting
Before after
SLIDE 22
Variance Voting
Before after
SLIDE 23
Variance Voting
Before after
SLIDE 24
Variance Voting
Before after
SLIDE 25 Ablation Study: KL Loss, soft-NMS, Variance Voting
SLIDE 26
Ablation Study: does #params in head matter?
The larger the R-CNN head, the better
SLIDE 27
Ablation Study: Variance Voting Threshold
- σt = 0: standard NMS
- Large σt: farther boxes are also considered
SLIDE 28 Improving State-of-the-Art
SLIDE 29 Inference Latency
- VGG-16
- single image
- single GTX 1080 Ti GPU
- only ~2 ms added latency
SLIDE 30
Other models on MS-COCO
SLIDE 31
VGG on PASCAL VOC
SLIDE 32
Join us at Tuesday Afternoon Poster Session #41
Bounding Box Regression with Uncertainty for Accurate Object Detection