A Solution for Densely Annotated Large Scale Object Detection Task
Yuan Gao, Hui Shen, Donghong Zhong, Jian Wang, Zeyu Liu, Ti Bai, Xiang Long and Shilei Wen
Object365 Dataset

Dataset            # Class  # Image  # Box in Total  Box Num Avg  Image Height Avg  Image Width Avg  Box Area Avg (Pixel)  Max # Box
COCO17 (Train)     80       118287   0.86M           7.27         484               577              12025                 93
Object365 (Train)  365      608606   9.62M           15.81        536               662              14074                 835
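As a quick consistency check on the statistics above, the average box count per image can be recomputed from the total box and image counts. This is only a sketch: the totals are the rounded figures from the table (0.86M and 9.62M), and the dictionary layout is an illustrative choice, not part of the original slides.

```python
# Dataset totals as listed in the table (box totals are rounded to 0.01M).
stats = {
    "COCO17 (Train)": {"images": 118287, "boxes": 0.86e6},
    "Object365 (Train)": {"images": 608606, "boxes": 9.62e6},
}

# Average number of boxes per image = total boxes / total images.
boxes_per_image = {
    name: round(s["boxes"] / s["images"], 2) for name, s in stats.items()
}

for name, avg in boxes_per_image.items():
    print(f"{name}: {avg} boxes per image")
```

The recomputed values agree with the "Box Num Avg" column, so Object365 images carry roughly twice as many boxes on average as COCO17 images.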
Object365 Validation (mAP): Baseline R50 22.73
[Figure: backbone levels C2-C5, FPN levels P2-P5, NAS-FPN, RPN]
is adopted.
cascaded behind the original FPN module.
searching procedure.
[1] Ghiasi G, Lin T Y, Pang R, et al. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection. CVPR, 2019.
[Figure: The architecture graph of (a) the original FPN and (b) our NAS-FPN after ~400 episodes. Node labels: C2-C5, P2-P6, N2-N6.]
Object365 Validation (mAP): Baseline R50 22.73, NASFPN 23.66
[Figure: two example images, one containing 15 classes and one containing 4 classes]
Sampling every image with equal probability is not appropriate.
$X_i$ = Sampling weight of the i-th image.
$C_i$ = Total number of classes in the i-th image.
$N$ = Total number of classes in the dataset.
$Q_c$ = Prior probability of the c-th class, according to the total box number of the dataset.
$I_c$ = 1 if the image contains class c, otherwise 0.
Example: the i-th image contains 15 classes.

$X_i = \ln C_i + \zeta \sum_{c=1}^{N} Q_c I_c$
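The sampling weight defined above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function name `cdss_weights`, the toy data, and the coefficient value `zeta=1.0` are all hypothetical, every image is assumed to contain at least one annotated class, and the prior term is taken as the sum of class priors over the classes present in the image, as the formula reads here.

```python
import math
import random


def cdss_weights(image_classes, class_box_counts, zeta=1.0):
    """Return one sampling weight per image (hypothetical helper).

    image_classes: list of sets, the class ids present in each image.
    class_box_counts: dict mapping class id -> total boxes of that class.
    zeta: coefficient on the class-prior term (value is an assumption).
    """
    total_boxes = sum(class_box_counts.values())
    # Class prior: share of all boxes in the dataset that belong to class c.
    prior = {c: n / total_boxes for c, n in class_box_counts.items()}
    weights = []
    for classes in image_classes:
        if not classes:
            raise ValueError("each image must contain at least one class")
        num_classes = len(classes)  # number of distinct classes in the image
        # Sum of priors over classes present (indicator = 1 only for these).
        prior_term = sum(prior[c] for c in classes)
        weights.append(math.log(num_classes) + zeta * prior_term)
    return weights


# Toy dataset: three classes and three images (illustrative numbers only).
class_box_counts = {0: 60, 1: 30, 2: 10}
image_classes = [{0, 1, 2}, {0}, {2}]
weights = cdss_weights(image_classes, class_box_counts)

# Draw a batch of image indices in proportion to the weights.
rng = random.Random(0)
batch = rng.choices(range(len(image_classes)), weights=weights, k=5)
print(weights, batch)
```

In this toy setup the image containing all three classes receives the largest weight, so it is drawn more often than the single-class images, which is the intended effect of weighting by class diversity.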
[Figure: Object365 Validation (mAP) versus training iteration (49999 to 349999), comparing Random sampling with Class Diversity Sensitive sampling]
Object365 Validation (mAP): Baseline R50 22.73, NASFPN 23.66, CDSS 25.01, SENet154+GN 30.1, OHEM+Deformable 30.7
[Figure: two detection heads]
RoI-Align 7x7 -> (conv, GN) x 4 -> Box, Class
RoI-Align 9x9 -> (conv, GN) x 5 -> Class, Box
Object365 Validation (mAP): Baseline R50 22.73, NASFPN 23.66, CDSS 25.01, SENet154+GN 30.1, OHEM+Deformable 30.7, Large Resolution Head 30.9
[Figure: Cascade R-CNN diagram with feature map F, three pool operations, heads H1-H3, boxes B0-B3, and class outputs C1-C3]
stage to extract the feature. (Standard Cascade RCNN)
[1] Cai Z, Vasconcelos N. Cascade R-CNN: Delving into High Quality Object Detection. CVPR, 2018.
[Figure: modified cascade diagram with feature map F, four pool operations, heads H1-H3 (each appearing twice), boxes B0-B3, and class outputs C1-C3]
stage itself to extract the feature.
Object365 Validation (mAP): Baseline R50 22.73, NASFPN 23.66, CDSS 25.01, SENet154+GN 30.1, OHEM+Deformable 30.7, Large Resolution Head 30.9, Adaptive Cascade Testing 31.1, MS Training and Testing 33.1
Object365 Validation (mAP): Baseline R50 22.73, NASFPN 23.66, CDSS 25.01, SENet154+GN 30.1, OHEM+Deformable 30.7, Large Resolution Head 30.9, Adaptive Cascade Testing 31.1, MS Training and Testing 33.1, Ensemble 5 models 36.5
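The step-by-step gains behind the Object365 validation chart can be tabulated with a short script. This is a sketch that assumes each bar adds its component on top of the previous configuration, which is how the chart appears to be laid out; the variable names are illustrative.

```python
# Object365 validation mAP after each component, in the order shown on the chart.
steps = [
    ("Baseline R50", 22.73),
    ("NASFPN", 23.66),
    ("CDSS", 25.01),
    ("SENet154+GN", 30.1),
    ("OHEM+Deformable", 30.7),
    ("Large Resolution Head", 30.9),
    ("Adaptive Cascade Testing", 31.1),
    ("MS Training and Testing", 33.1),
    ("Ensemble 5 models", 36.5),
]

# Gain of each step over the one before it, rounded to two decimals.
gains = [
    (name, round(cur - prev, 2))
    for (_, prev), (name, cur) in zip(steps, steps[1:])
]
total_gain = round(steps[-1][1] - steps[0][1], 2)

for name, gain in gains:
    print(f"{name}: +{gain:.2f} mAP")
print(f"Total over baseline: +{total_gain:.2f} mAP")
```

Read this way, the largest single step is the switch to SENet154+GN, followed by the five-model ensemble and multi-scale training and testing.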
Pretrain       Full Val mAP  Tiny Val mAP  Gain  Tiny Test mAP
COCO Pretrain
Pretrain       30.7          33.0          +4.1
Pretrain       32.9          34.8          +5.9
models                                     +8.7  29.0
Yolo v3, RetinaNet, SSD ……
p/PaddleCV/object_detection
Please feel free to contact us if you have any questions: gaoyuan18@baidu.com