SLIDE 1 Bounding Box Regression With Uncertainty for Accurate Object Detection
Yihui He1, Chenchen Zhu1, Jianren Wang1, Marios Savvides1, Xiangyu Zhang2
1Carnegie Mellon University 2Megvii
SLIDE 2 Ambiguity: inaccurate labelling
SLIDE 3 Ambiguity: inaccurate labelling
SLIDE 4 Ambiguity: introduced by occlusion
SLIDE 5 Ambiguity: object boundary itself is ambiguous
SLIDE 6
Classification Score & Localization misalignment
MS-COCO VGG-16 Faster RCNN
SLIDE 7 Standard Faster R-CNN Pipeline
fc outputs: 1024 × 81 (class scores), 1024 × 81 × 4 (box regression)
Cross entropy/focal loss
SLIDE 8 Modeling bounding box prediction
- Predict a Gaussian distribution instead of a single number
https://upload.wikimedia.org/wikipedia/commons/9/9e/Normal_Distribution_NIST.gif
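A minimal sketch of this modeling step: the predicted coordinate is treated as a single-variate Gaussian with mean $x_e$ (the estimated box coordinate) and variance $\sigma^2$ (the predicted localization uncertainty), both output by the network:

```latex
P_\Theta(x) = \frac{1}{\sqrt{2\pi\sigma^2}}
              \exp\!\left(-\frac{(x - x_e)^2}{2\sigma^2}\right)
```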
SLIDE 9 Modeling ground truth bounding box
https://upload.wikimedia.org/wikipedia/commons/b/b4/Dirac_function_approximation.gif
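The ground-truth box has no uncertainty, so it can correspondingly be written as a Dirac delta centered at the labeled coordinate $x_g$, i.e. the $\sigma \to 0$ limit of a Gaussian:

```latex
P_D(x) = \delta(x - x_g)
```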
SLIDE 10
KL Loss: Gaussian meets delta function
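Putting the two together, the regression loss is the KL divergence of the ground-truth delta from the predicted Gaussian; dropping the terms that do not depend on the network parameters gives:

```latex
\begin{aligned}
L_{reg} &= D_{KL}\big(P_D(x)\,\|\,P_\Theta(x)\big) \\
        &= \frac{(x_g - x_e)^2}{2\sigma^2}
           + \frac{1}{2}\log\sigma^2
           + \frac{1}{2}\log 2\pi
           + H\big(P_D(x)\big) \\
        &\propto \frac{(x_g - x_e)^2}{2\sigma^2} + \frac{1}{2}\log\sigma^2
\end{aligned}
```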
SLIDE 11
Architecture
An additional fully-connected branch predicts the variance (1024 × 81 × 4), alongside the existing classification (1024 × 81) and box regression (1024 × 81 × 4) outputs
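A hypothetical sketch of the head on this slide: a shared 1024-d RoI feature feeds three linear branches, where the third (log-variance) branch is the addition proposed by the paper. Names and initialization are illustrative, not the authors' exact code.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES, FEAT_DIM = 81, 1024

# Three parallel fc branches on top of the shared 1024-d feature.
W_cls = rng.normal(0, 0.01, (FEAT_DIM, NUM_CLASSES))        # class scores
W_box = rng.normal(0, 0.01, (FEAT_DIM, NUM_CLASSES * 4))    # box deltas
W_var = rng.normal(0, 0.01, (FEAT_DIM, NUM_CLASSES * 4))    # added: log-variance

def head(feat):
    """feat: (N, 1024) RoI features -> class scores, box deltas, alpha = log(sigma^2)."""
    return feat @ W_cls, feat @ W_box, feat @ W_var

feat = rng.normal(size=(2, FEAT_DIM))
scores, deltas, alpha = head(feat)
print(scores.shape, deltas.shape, alpha.shape)  # (2, 81) (2, 324) (2, 324)
```

Predicting per-coordinate, per-class variance keeps the extra cost to a single fc layer.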
SLIDE 12
Why KL Loss
(1) The ambiguities in a dataset can be successfully captured: the bounding box regressor gets a smaller loss from ambiguous bounding boxes.
(2) The learned variance is useful during post-processing: we propose var voting (variance voting) to vote the location of a candidate box using its neighbors' locations, weighted by the predicted variances, during non-maximum suppression (NMS).
(3) The learned probability distribution is interpretable: since it reflects the level of uncertainty of the bounding box prediction, it can potentially be helpful in downstream applications like self-driving cars and robotics.
SLIDE 13
KL Loss: Degradation Case
SLIDE 14
KL Loss: Reparameterization trick
predict α = log(σ²) during training; convert α back to σ during testing
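With the reparameterization on this slide, the loss no longer divides by σ² (which can explode when σ is small early in training); the objective becomes:

```latex
L_{reg} = \frac{e^{-\alpha}}{2}(x_g - x_e)^2 + \frac{\alpha}{2},
\qquad \alpha = \log\sigma^2
```

At test time the variance is recovered as $\sigma^2 = e^{\alpha}$.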
SLIDE 15 KL Loss: Robust L1 Loss (Smooth L1 Loss)
Smooth L1 Loss KL Loss
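A sketch of the robust form of the KL loss, which switches from the quadratic term to a linear term for large errors in the same way smooth L1 does; variable names are illustrative, and α = log(σ²) is the predicted log-variance.

```python
import numpy as np

def kl_loss(x_g, x_e, alpha):
    """Element-wise KL regression loss with the smooth-L1-style robust extension
    (constant terms dropped); alpha = log(sigma^2)."""
    diff = np.abs(x_g - x_e)
    quadratic = np.exp(-alpha) * 0.5 * diff**2   # used when |diff| <= 1
    linear = np.exp(-alpha) * (diff - 0.5)       # used when |diff| > 1
    return np.where(diff <= 1.0, quadratic, linear) + 0.5 * alpha

# With alpha = 0 (sigma = 1), the loss reduces to the standard smooth L1 loss.
out = kl_loss(np.array([0.3, 2.0]), np.array([0.0, 0.0]), np.array([0.0, 0.0]))
print(out)  # [0.045 1.5  ] -- 0.5*0.3^2 and (2.0 - 0.5)
```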
SLIDE 16
KL Loss: Uncertainty Prediction
Sigma in Green box
SLIDE 17
KL Loss: Uncertainty Prediction
Sigma in Green box
SLIDE 18
KL Loss: Uncertainty Prediction
Sigma in Green box
SLIDE 19
KL Loss: Uncertainty Prediction
Sigma in Green box
SLIDE 20 Variance Voting
- Neighbors with larger IoU get higher voting weight
- Neighbors with lower variance get higher voting weight
- Classification scores are left unchanged (score invariance)
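The bullets above can be sketched as follows. This is an illustrative numpy implementation of var voting under the weighting described on the slide (IoU-proximity weight times inverse predicted variance); function and parameter names are assumptions, not the authors' code.

```python
import numpy as np

def iou(a, b):
    """IoU between one box a = (x1, y1, x2, y2) and boxes b of shape (N, 4)."""
    x1 = np.maximum(a[0], b[:, 0]); y1 = np.maximum(a[1], b[:, 1])
    x2 = np.minimum(a[2], b[:, 2]); y2 = np.minimum(a[3], b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter)

def var_vote(box, boxes, variances, sigma_t=0.02):
    """Re-estimate `box` coordinates from neighboring `boxes` (N, 4)
    with per-coordinate predicted `variances` (N, 4)."""
    ious = iou(box, boxes)
    p = np.exp(-((1.0 - ious) ** 2) / sigma_t)   # larger IoU -> higher weight
    w = p[:, None] / variances                   # lower variance -> higher weight
    return (w * boxes).sum(axis=0) / w.sum(axis=0)

# A neighbor with much smaller predicted variance dominates the vote.
boxes = np.array([[0., 0., 10., 10.], [2., 2., 12., 12.]])
variances = np.array([[1e-6] * 4, [1.0] * 4])
print(var_vote(boxes[0], boxes, variances))
```

Classification scores are not touched; only box coordinates are updated, and σ_t controls how far a neighbor can be and still vote (see the ablation slide).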
SLIDE 21
Variance Voting
Before after
SLIDE 22
Variance Voting
Before after
SLIDE 23
Variance Voting
Before after
SLIDE 24
Variance Voting
Before after
SLIDE 25 Ablation Study: KL Loss, soft-NMS, Variance Voting
SLIDE 26
Ablation Study: does #params in head matter?
The larger the R-CNN head, the better
SLIDE 27
Ablation Study: Variance Voting Threshold
- σt = 0: standard NMS
- Large σt: farther boxes are also considered
SLIDE 28 Improving State-of-the-Art
SLIDE 29 Inference Latency
- VGG-16
- single image
- single GTX 1080 Ti GPU
- only ~2 ms added latency
SLIDE 30
Other models on MS-COCO
SLIDE 31
VGG on PASCAL VOC
SLIDE 32
Join us at Tuesday Afternoon Poster Session #41
Bounding Box Regression with Uncertainty for Accurate Object Detection