A Solution for Densely Annotated Large Scale Object Detection Task



SLIDE 1

A Solution for Densely Annotated Large Scale Object Detection Task

Yuan Gao, Hui Shen, Donghong Zhong, Jian Wang, Zeyu Liu, Ti Bai, Xiang Long and Shilei Wen

SLIDE 2

Object365 Dataset

| Dataset | # Class | # Image | # Box in Total | Box Num Avg | Image Height Avg | Image Width Avg | Box Area Avg (Pixel) | Max # Box |
|---|---|---|---|---|---|---|---|---|
| COCO17 (Train) | 80 | 118287 | 0.86M | 7.27 | 484 | 577 | 12025 | 93 |
| Object365 (Train) | 365 | 608606 | 9.62M | 15.81 | 536 | 662 | 14074 | 835 |
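The average number of boxes per image follows from the image and box totals. A quick check in Python reproduces the reported averages. The variable names are illustrative, and the box totals are the rounded figures from the slide, so the results are approximate.

```python
# Totals as reported on the slide; box counts are rounded to 0.01M there.
datasets = {
    "COCO17 (Train)": {"images": 118_287, "boxes": 0.86e6},
    "Object365 (Train)": {"images": 608_606, "boxes": 9.62e6},
}

# Average boxes per image = total boxes / total images.
boxes_per_image = {
    name: stats["boxes"] / stats["images"] for name, stats in datasets.items()
}

for name, value in boxes_per_image.items():
    print(f"{name}: {value:.2f} boxes per image")

# How much denser the Object365 annotation is compared with COCO17.
density_ratio = boxes_per_image["Object365 (Train)"] / boxes_per_image["COCO17 (Train)"]
print(f"Density ratio: {density_ratio:.2f}")
```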

SLIDE 3

Full Track

SLIDE 4

R50 Cascade RCNN

[Bar chart: Object365 Validation (mAP). Baseline R50: 22.73]

SLIDE 5

Neural Architecture Search

[Diagram labels: C2, C3, C4, C5; P2, P3, P4, P5, P6; NASFPN; RPN on each output level]

  • An RL-based Neural Architecture Search is adopted.
  • The NAS-FPN module is cascaded directly behind the original FPN module.
  • A strong architecture found with prior knowledge [1] is used to initialize the NAS-FPN search procedure.

[1] Ghiasi G, Lin T Y, Pang R, et al. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection. arXiv:1904.07392, 2019.
SLIDE 6

Neural Architecture Search

[Figure: (a) FPN, (b) Our NAS-FPN. The architecture graphs of the original FPN and of NAS-FPN after ~400 episodes.]

SLIDE 7

Neural Architecture Search

[Bar chart: Object365 Validation (mAP). Baseline R50: 22.73; NASFPN: 23.66]

SLIDE 8

Class Diversity Sensitive Sampling

[Example images: one containing 15 classes, one containing 4 classes]

Sampling every image with equal probability is not appropriate.

SLIDE 9

Class Diversity Sensitive Sampling

$W_i$ = sampling weight of the $i$-th image.
$C_i$ = total number of classes in the $i$-th image.
$N$ = total number of classes in the dataset.
$P_c$ = prior probability of the $c$-th class, according to the total box number of the dataset.
$I_c$ = 1 if the image contains class $c$, otherwise 0.

[Example image: the $i$-th image contains 15 classes.]

$$W_i = \ln(C_i + \epsilon) \sum_{c=1}^{N} P_c I_c$$
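A minimal Python sketch of the sampling weight, under stated assumptions. It reads the flattened formula as a product of the log term and the summed class priors. If the original slide laid it out as a fraction, the sum would move to the denominator instead. The function name `cdss_weights`, the default value of `eps`, and the toy data are hypothetical and not taken from the slides.

```python
import math
import random


def cdss_weights(image_classes, class_box_counts, eps=1.0):
    """Return one sampling weight per image.

    image_classes: list of sets, the class ids present in each image.
    class_box_counts: dict mapping class id -> total boxes of that class.
    eps: constant inside the logarithm (its value here is an assumption).
    """
    total_boxes = sum(class_box_counts.values())
    # Class prior computed from box counts over the whole dataset.
    prior = {c: n / total_boxes for c, n in class_box_counts.items()}
    weights = []
    for classes in image_classes:
        diversity = math.log(len(classes) + eps)     # log of (class count + eps)
        prior_mass = sum(prior[c] for c in classes)  # priors of the classes present
        weights.append(diversity * prior_mass)       # product reading (assumption)
    return weights


# Toy dataset: three images over four classes.
image_classes = [{0}, {0, 1}, {0, 1, 2, 3}]
class_box_counts = {0: 50, 1: 30, 2: 15, 3: 5}

weights = cdss_weights(image_classes, class_box_counts)
probabilities = [w / sum(weights) for w in weights]

# Draw a mini-batch of image indices in proportion to the weights.
rng = random.Random(0)
batch = rng.choices(range(len(image_classes)), weights=weights, k=8)
```

With these toy numbers the image containing all four classes receives the largest weight, which matches the motivation that images with more classes should not be sampled with the same probability as images with few.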

SLIDE 10

Class Diversity Sensitive Sampling

[Line chart: Object365 Validation (mAP) against training iteration (49999 to 349999), comparing Random sampling with Class Diversity Sensitive sampling.]

SLIDE 11

Class Diversity Sensitive Sampling

[Bar chart: Object365 Validation (mAP). Baseline R50: 22.73; NASFPN: 23.66; CDSS: 25.01; SENet154+GN: 30.1; OHEM+Deformable: 30.7]

SLIDE 12

Large Resolution Box Head

[Diagram: two box heads. First: RoI-Align 7x7, four conv + GN layers, Box and Class outputs. Second: RoI-Align 9x9, five conv + GN layers, Class and Box outputs.]

SLIDE 13

Large Resolution Box Head

[Bar chart: Object365 Validation (mAP). Baseline R50: 22.73; NASFPN: 23.66; CDSS: 25.01; SENet154+GN: 30.1; OHEM+Deformable: 30.7; Large Resolution Head: 30.9]

SLIDE 14

Cascade RCNN Testing

[Diagram labels: feature map F; pool; heads H1, H2, H3; boxes B0, B1, B2, B3; class scores C1, C2, C3, C]

  • Use the predicted bbox of the 2nd stage to extract the feature (standard Cascade RCNN [1]).

[1] Cai Z, Vasconcelos N. Cascade R-CNN: Delving into High Quality Object Detection. CVPR, 2018.

SLIDE 15

Cascade RCNN Adaptive Testing

[Diagram labels: feature map F; pool; heads H1, H2, H3; boxes B0, B1, B2, B3; class scores C1, C2, C3, C]

  • Use the predicted bbox of each stage itself to extract the feature.
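The difference between the two testing procedures can be sketched in Python. This is one reading of the slides, not the authors' implementation. Standard testing pools features once, from the box predicted by the 2nd stage, and scores them with every head. Adaptive testing re-scores each stage on features pooled from the box that stage itself predicted. Averaging the per-stage scores into the final score follows the Cascade R-CNN ensemble and is an assumption here. `pool`, the head callables, and the 1-D toy "boxes" are hypothetical stand-ins.

```python
def cascade_boxes(proposal, pool, heads):
    """Run the regression cascade and return [B0, B1, ..., Bk]."""
    boxes = [proposal]
    for head in heads:
        _, refined = head(pool(boxes[-1]))
        boxes.append(refined)
    return boxes


def standard_test(proposal, pool, heads):
    """Score every head on features pooled from the second-to-last box."""
    boxes = cascade_boxes(proposal, pool, heads)
    feature = pool(boxes[-2])  # B2 when there are three heads
    scores = [head(feature)[0] for head in heads]
    return sum(scores) / len(scores), boxes[-1]


def adaptive_test(proposal, pool, heads):
    """Score head k on features pooled from the box that stage k predicted."""
    boxes = cascade_boxes(proposal, pool, heads)
    scores = [head(pool(boxes[k + 1]))[0] for k, head in enumerate(heads)]
    return sum(scores) / len(scores), boxes[-1]


# Toy stand-ins: a "box" is a single float and the feature is the box itself.
def toy_pool(box):
    return box


def make_toy_head(target):
    def head(feature):
        score = 1.0 / (1.0 + abs(target - feature))       # higher when closer to target
        refined_box = feature + 0.5 * (target - feature)  # move halfway to target
        return score, refined_box
    return head


heads = [make_toy_head(10.0)] * 3
std_score, std_box = standard_test(2.0, toy_pool, heads)
ada_score, ada_box = adaptive_test(2.0, toy_pool, heads)
```

Both functions return the same final box. Only the features used for classification differ, which is why adaptive testing needs one extra pooling step on the last predicted box.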

SLIDE 16

Cascade RCNN Adaptive Testing

[Bar chart: Object365 Validation (mAP). Baseline R50: 22.73; NASFPN: 23.66; CDSS: 25.01; SENet154+GN: 30.1; OHEM+Deformable: 30.7; Large Resolution Head: 30.9; Adaptive Cascade Testing: 31.1; MS Training and Testing: 33.1]

SLIDE 17

Implementation Details

  • Use COCO Pretrained model, mAP 52.9 on COCO17 minival.
  • Training multiscale size (400, 1400), max size 1600.
  • Testing multiscale size (400, 1400), max size 2100.
  • 8 V100(32GB) x 2 for 7 days.
  • Weight Standardization brings model diversity.
  • SoftNMS is adopted.
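The multiscale settings above can be illustrated with a short Python sketch. It assumes the common detection convention in which "size" is the target length of the shorter image side and "max size" caps the longer side. It also assumes the short side is sampled uniformly from the range. The slides confirm neither convention, and the function names are hypothetical.

```python
import random


def resized_shape(height, width, short_side, max_size):
    """Scale so the short side equals short_side, unless the long side would exceed max_size."""
    scale = short_side / min(height, width)
    if max(height, width) * scale > max_size:
        scale = max_size / max(height, width)
    return round(height * scale), round(width * scale)


def sample_train_shape(height, width, rng, size_range=(400, 1400), max_size=1600):
    """Pick a random short side within size_range (uniform sampling is an assumption)."""
    short_side = rng.randint(*size_range)
    return resized_shape(height, width, short_side, max_size)


rng = random.Random(0)

# Average Object365 image size reported in the dataset statistics: 536 x 662.
train_shape = sample_train_shape(536, 662, rng)
capped_shape = resized_shape(536, 662, short_side=1400, max_size=1600)
test_shape = resized_shape(536, 662, short_side=1400, max_size=2100)
```

Under this convention, the larger test-time cap of 2100 lets an average image reach the full 1400-pixel short side. The training cap of 1600 limits it first.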
SLIDE 18

Implementation Details

[Bar chart: Object365 Validation (mAP). Baseline R50: 22.73; NASFPN: 23.66; CDSS: 25.01; SENet154+GN: 30.1; OHEM+Deformable: 30.7; Large Resolution Head: 30.9; Adaptive Cascade Testing: 31.1; MS Training and Testing: 33.1; Ensemble 5 models: 36.5]
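The per-component gains implied by the chart can be tabulated with a few lines of Python. This assumes the bars are cumulative, so that each bar adds the named component to the previous configuration. It also assumes the values pair with the labels in the order they appear.

```python
# (component, Object365 validation mAP) in the order shown on the chart.
steps = [
    ("Baseline R50", 22.73),
    ("NASFPN", 23.66),
    ("CDSS", 25.01),
    ("SENet154+GN", 30.1),
    ("OHEM+Deformable", 30.7),
    ("Large Resolution Head", 30.9),
    ("Adaptive Cascade Testing", 31.1),
    ("MS Training and Testing", 33.1),
    ("Ensemble 5 models", 36.5),
]

# Gain of each step over the configuration immediately before it.
gains = [
    (name, round(score - previous, 2))
    for (_, previous), (name, score) in zip(steps, steps[1:])
]
total_gain = round(steps[-1][1] - steps[0][1], 2)

for name, gain in gains:
    print(f"{name:<26} +{gain:.2f}")
print(f"{'Total':<26} +{total_gain:.2f}")
```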

SLIDE 19

Visualization

SLIDE 20

Visualization

SLIDE 21

Tiny Track

SLIDE 22

Full Track Pretrain

| Pretrain | Full Val mAP | Tiny Val mAP | Gain | Tiny Test mAP |
|---|---|---|---|---|
| COCO Pretrain | - | 28.9 | - | - |
| Obj365 Full Pretrain | 30.7 | 33.0 | +4.1 | - |
| Obj365 Full Pretrain | 32.9 | 34.8 | +5.9 | - |
| Ensemble 8 models | - | 37.6 | +8.7 | 29.0 |

  • Multiscale input with flip in Training and Testing
SLIDE 23

Visualization

  • Full Track Pretrained
  • COCO17 Pretrained
SLIDE 24

PaddlePaddle Detection

  • Fast/Faster R-CNN, FPN, Mask RCNN, Cascade R-CNN, Yolo v3, RetinaNet, SSD ...
  • GN, SyncBN, Deformable Conv v1/v2 ...
  • https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/object_detection

  • Training framework will be released soon.
SLIDE 25

Thank you!

Please feel free to contact us if you have any questions: gaoyuan18@baidu.com