Dive Deeper Into Box for Object Detection
Ran Chen1, Yong Liu2, Mengdan Zhang2, Shu Liu3, Bei Yu1, Yu-Wing Tai4
1The Chinese University of Hong Kong 2Tencent Youtu Lab 3SmartMore 4The Hong Kong University of Science and Technology
2 / 10
Introduction
Box Decomposition and Recombination
◮ Reorganize the boundaries of boxes during training.
◮ Optimal boxes that enclose instances more tightly provide better localization.
Semantic inconsistency in annotations
◮ Background pixels regarded as positive are noise for training.
◮ A self-adaptive module is proposed to tackle this problem.
3 / 10
Box Decomposition and Recombination
[Figure: box decomposition and recombination in four panels: (a) Decomposition, (b) Ranking, (c) Recombination, (d) Assignment. Example rank: δ3 > δ1 > δ2.]
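\[
L_{IoU} = -\frac{1}{N_{pos}} \sum_{I} \sum_{i}^{n} \log\big(\mathrm{IoU}(p_i, p_I^{*})\big), \tag{1}
\]
\[
L_{IoU}^{D\&R} = \frac{1}{N_{pos}} \sum_{I} \Big( \mathbb{1}_{\{S'_I > S_I\}}\, L_{IoU}(B'_I, T_I) + \mathbb{1}_{\{S_I \geq S'_I\}}\, L_{IoU}(B_I, T_I) \Big), \tag{2}
\]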
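The selection in Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: boxes are taken to be axis-aligned `(x1, y1, x2, y2)` tuples, there is one predicted box per instance, the score S is taken to be the IoU of the box with its target, and the per-box loss is `-log(IoU)`. The function names `iou`, `iou_loss`, and `dr_iou_loss` are hypothetical, and the recombined boxes B′ are assumed to be given.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(box, target, eps=1e-9):
    """Per-box loss -log(IoU); eps avoids log(0) (an assumption of this sketch)."""
    return -math.log(max(iou(box, target), eps))


def dr_iou_loss(boxes, recombined, targets):
    """Average per-instance loss, choosing B' when S' > S and B otherwise."""
    if not boxes:
        return 0.0
    total = 0.0
    for box, box_rec, target in zip(boxes, recombined, targets):
        s, s_rec = iou(box, target), iou(box_rec, target)
        # Indicator terms of Eq. (2): exactly one branch is active per instance.
        total += iou_loss(box_rec, target) if s_rec > s else iou_loss(box, target)
    return total / len(boxes)
```

For example, with a target `(0, 0, 10, 10)`, an original box `(0, 0, 10, 5)` and a recombined box `(0, 0, 10, 8)`, the recombined box has the higher IoU, so its loss is the one that contributes.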
4 / 10
Box Decomposition and Recombination
[Plot: IoU versus training epoch (epochs 2 to 12, IoU from 0.3 to 0.9), comparing IoU w/ D&R and IoU w/o D&R.]
◮ Boxes optimized by D&R have higher IoU scores and lower variances.
5 / 10
Box Decomposition and Recombination
6 / 10
Semantic Consistency
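[Figure: pixels inside an instance split into a positive set (R_I↑, C_I↑) and a negative set.]
\[
\begin{cases}
C_{I\downarrow} \cap R_{I\downarrow} \leftarrow \text{negative}, \\
C_{I\uparrow} \cup R_{I\uparrow} \leftarrow \text{positive},
\end{cases}
\qquad c_i = \max_{j=0}^{g}(c_j) \in C_I, \tag{3}
\]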
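The labeling rule in Eq. (3) can be illustrated with a small Python sketch. Several points here are assumptions rather than statements from the slides: the ↑ and ↓ subsets are taken to be pixels strictly above, or at or below, the per-instance mean of the corresponding score; positives are the union of the two ↑ subsets; and negatives are the intersection of the two ↓ subsets. The function name `consistency_labels` and its argument layout are hypothetical.

```python
def consistency_labels(iou_scores, class_scores):
    """Label each pixel of one instance as 'positive' or 'negative'.

    iou_scores:   per-pixel localization (IoU) scores, the set R_I.
    class_scores: per-pixel lists of scores over categories.
    """
    # c_i is the maximum score over categories, as in Eq. (3).
    c = [max(scores) for scores in class_scores]
    # Assumption: the split point is the per-instance mean of each score.
    r_mean = sum(iou_scores) / len(iou_scores)
    c_mean = sum(c) / len(c)
    labels = []
    for r_i, c_i in zip(iou_scores, c):
        # Positive if the pixel is in either "up" subset (union);
        # negative only if it is in both "down" subsets (intersection).
        is_positive = r_i > r_mean or c_i > c_mean
        labels.append("positive" if is_positive else "negative")
    return labels
```

Under these assumptions every pixel receives exactly one label, since the union of the ↑ subsets and the intersection of the ↓ subsets partition the instance.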
Settings  AP    AP50  AP75  APS   APM   APL
None      33.6  53.1  35.0  18.9  38.2  43.7
PN        34.2  53.2  36.3  20.8  38.9  44.2
PNI       33.7  53.0  35.5  17.9  38.3  44.1
Ours      35.3  55.4  37.1  20.9  39.6  45.9
7 / 10
Semantic Consistency
8 / 10
Results
Modules (Baseline, D&R, Consistency)  AP    AP50  AP75  APS   APM   APL
-                                     33.6  53.1  35.0  18.9  38.2  43.7
-                                     34.8  54.0  36.4  19.7  39.0  44.9
-                                     37.2  55.4  39.5  21.0  41.7  48.6
-                                     38.0  56.5  40.8  21.6  42.4  50.4
9 / 10
Results
Method                  Backbone                  AP    AP50  AP75  APS   APM   APL
Two-stage methods:
Faster R-CNN w/ FPN     ResNet-101-FPN            36.2  59.1  39.0  18.2  39.0  48.2
Faster R-CNN w/ TDM     Inception-ResNet-v2-TDM   36.8  57.7  39.2  16.2  39.8  52.1
Faster R-CNN by G-RMI   Inception-ResNet-v2       34.7  55.5  36.7  13.5  38.1  52.0
RPDet                   ResNet-101-DCN            42.8  65.0  46.3  24.9  46.2  54.7
Cascade R-CNN           ResNet-101                42.8  62.1  46.3  23.7  45.5  55.2
One-stage methods:
YOLOv2                  DarkNet-19                21.6  44.0  19.2  5.0   22.4  35.5
SSD                     ResNet-101                31.2  50.4  33.3  10.2  34.5  49.8
DSSD                    ResNet-101                33.2  53.3  35.2  13.0  35.4  51.1
FSAF                    ResNet-101                40.9  61.5  44.0  24.0  44.2  51.3
RetinaNet               ResNet-101-FPN            39.1  59.1  42.3  21.8  42.7  53.9
CornerNet               Hourglass-104             40.5  56.5  43.1  19.4  42.7  53.9
ExtremeNet              Hourglass-104             40.1  55.3  43.2  20.3  43.2  53.1
FCOS†                   ResNet-101-FPN            41.5  60.7  45.0  24.4  44.8  51.6
FCOS†                   ResNeXt-64x4d-101-FPN     43.2  62.8  46.6  26.5  46.2  53.3
FCOS† w/ improvements   ResNeXt-64x4d-101-FPN     44.7  64.1  48.4  27.6  47.5  55.6
DDBNet (Ours)           ResNet-101-FPN            42.0  61.0  45.1  24.2  45.0  53.3
DDBNet (Ours)           ResNeXt-64x4d-101-FPN     43.9  63.1  46.7  26.3  46.5  55.1
DDBNet (Ours)§          ResNeXt-64x4d-101-FPN     45.5  64.5  48.5  27.8  47.7  57.1
10 / 10