CenterNet2
Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl UT Austin & Intel Labs
Conventional two-stage detector
Backbone → … → ROIAlign → Classifier, BB regression
Stage 1: 45 ms. Stage 2: 8 ms.
Ren et al., Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS 2015
Backbone → … → ROIAlign → Classifier, BB regression → ROIAlign → Classifier, BB regression → …
Stage 1: 45 ms. Stage 2: 8 ms. Stage 3: 8 ms.
Cai et al., Cascade R-CNN: Delving into High Quality Object Detection, CVPR 2018
Backbone → Classifier, BB regression
Stage 1: 53 ms.
Lin et al., Focal Loss for Dense Object Detection, ICCV 2017
Backbone → keypoint detection, size regression
Stage 1: 53 ms.
Zhou et al., Objects as Points, arXiv 2019
Backbone → keypoint detection, size regression → … → ROIAlign → Classifier, BB regression → … → ROIAlign → Classifier, BB regression
Stage 1: 51 ms. Stage 2: 2 ms. Stage 3: 2 ms.
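The per-stage timings on the preceding slides can be added up to compare how each design splits its time between the first stage and the later stages. The sketch below only sums the numbers printed on the slides. The detector names used as dictionary keys are labels chosen here for illustration (the last diagram is assumed to be CenterNet2, since it carries no caption), and the sums need not equal a measured end-to-end runtime.

```python
# Per-stage latencies in milliseconds, copied from the slide diagrams.
# The keys are illustrative labels, not names printed on every slide.
STAGE_MS = {
    "Faster R-CNN": [45, 8],
    "Cascade R-CNN": [45, 8, 8],
    "Focal Loss one-stage detector": [53],
    "Objects as Points": [53],
    "CenterNet2 (assumed)": [51, 2, 2],
}


def total_ms(stages):
    """Sum of all stage latencies for one detector."""
    return sum(stages)


def later_stage_ms(stages):
    """Time spent after the first stage (0 for one-stage detectors)."""
    return sum(stages[1:])


TOTALS = {name: total_ms(stages) for name, stages in STAGE_MS.items()}
LATER = {name: later_stage_ms(stages) for name, stages in STAGE_MS.items()}

if __name__ == "__main__":
    for name in STAGE_MS:
        print(f"{name}: total {TOTALS[name]} ms, later stages {LATER[name]} ms")
```

On these numbers, the later stages cost 16 ms in the cascade built on a 45 ms first stage and 4 ms in the last design, whose first stage takes 51 ms.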
COCO box mAP: CenterNet-FPN 39.6, CascadeRCNN 42.1, CenterNet2 42.9
COCO runtime (ms): CenterNet-FPN 53, CascadeRCNN 70, CenterNet2 60
LVIS box mAP: CascadeRCNN 24, CenterNet2 26.9
Negatives, Positives, Unlabeled
Gupta et al., LVIS: A Dataset for Large Vocabulary Instance Segmentation, CVPR 2019
Positives, Negatives
foreground
Tan et al., Equalization Loss for Long-Tailed Object Recognition, CVPR 2020
loss
Box AP, APr, APc, APf by loss:
Softmax-CE: 24, 7.6, 22.9, 32.7
Sigmoid-CE: 23.3, 8.2, 21.9, 31.5
EQL: 25.7, 15.5, 24.6, 31.5
FedLoss: 27.1, 16.1, 26, 33.3
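The slides name FedLoss without spelling out how it is computed. One plausible reading, given the positives / negatives / unlabeled split on the LVIS slide, is a per-class sigmoid cross-entropy evaluated only on a sampled subset of classes for each image. The sketch below follows that reading. It is an illustration under assumptions, not the authors' implementation: the function names, the square-root-frequency sampling of extra classes, the default subset size of 50, and the normalization by the number of boxes are all assumed here.

```python
import math
import random


def sample_class_subset(positive_classes, class_freq, subset_size, rng):
    """Keep every positive class, then pad with sampled other classes.

    Assumption: extra classes are drawn without replacement with
    probability proportional to sqrt(class frequency).
    """
    subset = sorted(set(positive_classes))
    weights = {
        c: math.sqrt(f)
        for c, f in enumerate(class_freq)
        if c not in subset and f > 0
    }
    while len(subset) < subset_size and weights:
        classes = list(weights)
        pick = rng.choices(classes, weights=[weights[c] for c in classes], k=1)[0]
        subset.append(pick)
        del weights[pick]  # without replacement
    return subset


def bce_with_logits(z, y):
    """Numerically stable sigmoid cross-entropy for one logit and one target."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


def federated_loss(logits, targets, class_freq, subset_size=50, rng=None):
    """Sigmoid cross-entropy restricted to a sampled class subset.

    logits, targets: lists of rows, one row per box, one entry per class.
    Returns (loss, subset); classes outside the subset contribute nothing.
    """
    rng = rng or random.Random()
    num_classes = len(class_freq)
    positives = [c for c in range(num_classes) if any(row[c] for row in targets)]
    subset = sample_class_subset(positives, class_freq, subset_size, rng)
    total = sum(
        bce_with_logits(z_row[c], y_row[c])
        for z_row, y_row in zip(logits, targets)
        for c in subset
    )
    return total / max(len(logits), 1), subset
```

A call such as `federated_loss(logits, targets, class_freq, subset_size=50)` returns the loss together with the classes that were actually penalized, which makes it easy to check that unlabeled classes outside the subset receive no gradient.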
mAP by added component:
CenterNet2: box 28.2
+Mask: box 28.6, mask 25.3
+2x: box 30.6, mask 27.4
+FPN2-6: box 31.5, mask 28.2
+X101: box 33.9, mask 30.3
+DCN: box 35.9, mask 32.1
+PointRend: box 36.7, mask 34
+Larger input: box 37.3, mask 34.9
+Test-aug: box 38.5, mask 36.1
Official baseline: 27.2