1 / 10 Dive Deeper Into Box for Object Detection Ran Chen 1 , Yong - - PowerPoint PPT Presentation




SLIDE 1

SLIDE 2

Dive Deeper Into Box for Object Detection

Ran Chen1, Yong Liu2, Mengdan Zhang2, Shu Liu3, Bei Yu1, Yu-Wing Tai4

1The Chinese University of Hong Kong 2Tencent Youtu Lab 3SmartMore 4The Hong Kong University of Science and Technology

SLIDE 3

Introduction

Box Decomposition and Recombination

◮ Box boundaries are reorganized during training.

◮ Optimal boxes that enclose instances more tightly provide better localization.

Semantic inconsistency in annotations

◮ Background pixels treated as positives act as noise during training.

◮ A self-adaptive module is proposed to tackle this problem.

SLIDE 4

Box Decomposition and Recombination

[Figure: the D&R module in four panels: (a) Decomposition, (b) Ranking, (c) Recombination, (d) Assignment. Boundaries are labeled with the scores of their boxes (S0, S1, S2 and primed scores after recombination). The ranking panel shows the example order δ3 > δ1 > δ2, and the assignment panel keeps the larger of two scores, e.g. max(S1, S′0).]
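The four steps named in the figure (decomposition, ranking, recombination, assignment) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `iou` and `decompose_and_recombine` are hypothetical, boxes are `(left, top, right, bottom)` tuples, and boundaries are assumed to be ranked by their absolute deviation δ from the matching ground-truth boundary.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def decompose_and_recombine(preds: List[Box], target: Box):
    """Hypothetical helper: returns (recombined boxes, their IoUs, per-boundary scores)."""
    n = len(preds)
    # Decomposition: each boundary inherits the IoU score S of its source box.
    s = [iou(p, target) for p in preds]
    # Ranking (assumed criterion): per side, sort box indices by the
    # absolute deviation of that boundary from the target boundary.
    order = [
        sorted(range(n), key=lambda i, k=k: abs(preds[i][k] - target[k]))
        for k in range(4)
    ]
    # Recombination: boundaries that share a rank form a new box B'.
    recombined = [
        tuple(preds[order[k][r]][k] for k in range(4)) for r in range(n)
    ]
    s_new = [iou(b, target) for b in recombined]
    # Assignment: each boundary keeps the larger of S and S'.
    scores = [[0.0] * 4 for _ in range(n)]
    for r in range(n):
        for k in range(4):
            i = order[k][r]
            scores[i][k] = max(s[i], s_new[r])
    return recombined, s_new, scores


if __name__ == "__main__":
    target = (0.0, 0.0, 10.0, 10.0)
    preds = [(0.0, 0.0, 14.0, 10.0), (3.0, 0.0, 10.0, 10.0)]
    boxes, ious, scores = decompose_and_recombine(preds, target)
    print(boxes[0], ious[0])
```

In this toy input each prediction has one poor boundary, and the rank-0 recombination takes the accurate boundary from each box, which is the effect the slide describes as "reorganizing boundaries".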

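$$
L_{IoU} = -\frac{1}{N_{pos}} \sum_{I} \sum_{i}^{n} \log\big(IoU(p_i, p^{*}_{I})\big) \tag{1}
$$

$$
L^{D\&R}_{IoU} = \frac{1}{N_{pos}} \sum_{I} \Big( \mathbb{1}_{\{S'_I > S_I\}}\, L_{IoU}(B'_I, T_I) + \mathbb{1}_{\{S_I \ge S'_I\}}\, L_{IoU}(B_I, T_I) \Big) \tag{2}
$$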
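The selection in Eq. (2) can be illustrated with a short Python sketch: for each sample, the loss is taken on the recombined box B′ when its IoU S′ exceeds the original S, and on the original box B otherwise. The function names, the `eps` clamp, and the choice to normalize by the number of samples passed in (standing in for N_pos) are assumptions made for illustration, not details from the slides.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(box: Box, target: Box, eps: float = 1e-7) -> float:
    """Per-box term -log(IoU); eps (an assumption) avoids log(0)."""
    return -math.log(max(iou(box, target), eps))


def dr_iou_loss(originals: List[Box], recombined: List[Box],
                targets: List[Box]) -> float:
    """Sketch of Eq. (2): pick B' if S' > S, otherwise keep B."""
    total = 0.0
    for b, b_new, t in zip(originals, recombined, targets):
        s, s_new = iou(b, t), iou(b_new, t)
        chosen = b_new if s_new > s else b
        total += iou_loss(chosen, t)
    # Assumption: every sample passed in counts as positive (N_pos).
    return total / max(len(originals), 1)
```

Because the two indicator terms are complementary, exactly one box contributes per sample, so the loss is never worse than the plain IoU loss on the original boxes.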

SLIDE 5

Box Decomposition and Recombination

[Plot: IoU over training epochs (x-axis: epoch, 2 to 12; y-axis: IoU, 0.3 to 0.9), comparing two curves: IoU w/ D&R and IoU w/o D&R.]

◮ Boxes optimized by D&R have higher IoU scores and lower variances.

SLIDE 6

Box Decomposition and Recombination

SLIDE 7

Semantic Consistency

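[Figure: pixels inside a box divided into a positive set and a negative set according to R_I↑ and C_I↑.]

$$
\begin{cases}
C_{I\downarrow} \cap R_{I\downarrow} \leftarrow \text{negative}, \\
C_{I\uparrow} \cup R_{I\uparrow} \leftarrow \text{positive},
\end{cases}
\qquad c_i = \max_{j=0}^{g}(c_j) \in C_I \tag{3}
$$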
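A rough Python sketch of the labeling rule in Eq. (3) follows. It rests on assumptions the slide does not state: that ↑ and ↓ mean above and below the per-instance mean, that a pixel is negative only when both its class score and its IoU are low, and that it is positive when either is high. The function name `consistency_labels` is hypothetical.

```python
from typing import List


def consistency_labels(class_scores: List[List[float]],
                       ious: List[float]) -> List[int]:
    """Label the pixels of one instance: 1 = positive, 0 = negative.

    class_scores[i] holds the per-category scores of pixel i;
    ious[i] is the IoU of the box regressed at pixel i with the ground truth.
    """
    # c_i is the maximum score over all categories, as in Eq. (3).
    c = [max(row) for row in class_scores]
    # Assumed thresholds: the per-instance means of C_I and R_I.
    c_mean = sum(c) / len(c)
    r_mean = sum(ious) / len(ious)
    labels = []
    for ci, ri in zip(c, ious):
        # Assumed rule: negative only if both scores fall below their means.
        low_both = ci < c_mean and ri < r_mean
        labels.append(0 if low_both else 1)
    return labels


if __name__ == "__main__":
    scores = [[0.9, 0.1], [0.2, 0.1], [0.3, 0.8], [0.1, 0.2]]
    ious = [0.8, 0.2, 0.3, 0.9]
    print(consistency_labels(scores, ious))
```

Under these assumptions the thresholds adapt to each instance, which is one way to read the "self-adaptive" description in the introduction.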

Settings  AP    AP50  AP75  APS   APM   APL
None      33.6  53.1  35.0  18.9  38.2  43.7
PN        34.2  53.2  36.3  20.8  38.9  44.2
PNI       33.7  53.0  35.5  17.9  38.3  44.1
Ours      35.3  55.4  37.1  20.9  39.6  45.9

SLIDE 8

Semantic Consistency

SLIDE 9

Results

Modules (Baseline, D&R, Consistency)  AP    AP50  AP75  APS   APM   APL
•                                     33.6  53.1  35.0  18.9  38.2  43.7
•                                     34.8  54.0  36.4  19.7  39.0  44.9
•                                     37.2  55.4  39.5  21.0  41.7  48.6
•                                     38.0  56.5  40.8  21.6  42.4  50.4

SLIDE 10

Results

Method                 Backbone                 AP    AP50  AP75  APS   APM   APL

Two-stage methods:
Faster R-CNN w/ FPN    ResNet-101-FPN           36.2  59.1  39.0  18.2  39.0  48.2
Faster R-CNN w/ TDM    Inception-ResNet-v2-TDM  36.8  57.7  39.2  16.2  39.8  52.1
Faster R-CNN by G-RMI  Inception-ResNet-v2      34.7  55.5  36.7  13.5  38.1  52.0
RPDet                  ResNet-101-DCN           42.8  65.0  46.3  24.9  46.2  54.7
Cascade R-CNN          ResNet-101               42.8  62.1  46.3  23.7  45.5  55.2

One-stage methods:
YOLOv2                 DarkNet-19               21.6  44.0  19.2   5.0  22.4  35.5
SSD                    ResNet-101               31.2  50.4  33.3  10.2  34.5  49.8
DSSD                   ResNet-101               33.2  53.3  35.2  13.0  35.4  51.1
FSAF                   ResNet-101               40.9  61.5  44.0  24.0  44.2  51.3
RetinaNet              ResNet-101-FPN           39.1  59.1  42.3  21.8  42.7  53.9
CornerNet              Hourglass-104            40.5  56.5  43.1  19.4  42.7  53.9
ExtremeNet             Hourglass-104            40.1  55.3  43.2  20.3  43.2  53.1
FCOS†                  ResNet-101-FPN           41.5  60.7  45.0  24.4  44.8  51.6
FCOS†                  ResNeXt-64x4d-101-FPN    43.2  62.8  46.6  26.5  46.2  53.3
FCOS† w/improvements   ResNeXt-64x4d-101-FPN    44.7  64.1  48.4  27.6  47.5  55.6
DDBNet (Ours)          ResNet-101-FPN           42.0  61.0  45.1  24.2  45.0  53.3
DDBNet (Ours)          ResNeXt-64x4d-101-FPN    43.9  63.1  46.7  26.3  46.5  55.1
DDBNet (Ours)§         ResNeXt-64x4d-101-FPN    45.5  64.5  48.5  27.8  47.7  57.1
