SLIDE 9 Full resolution feature network
[Figure: network diagram. Legend: Upsampling (by 2); Max-pooling (by 2); Conv (3×3 + BN + ReLU); Conv (1×1 + BN + ReLU). Labels: Input image; Prediction (three outputs). Legible feature-map sizes include 64×512×512, 128×256×256 and 256×64×64.]
Figure 5. The structure of the full resolution feature network.
The output feature maps have a higher resolution than those of FPN [2] and carry more accurate local information. In contrast to FPN [2], element-wise addition is replaced by concatenation to softly merge the feature maps.
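The difference between the two merge operations can be made concrete with a minimal sketch. It is not the network's implementation: the helper names (`upsample2x`, `merge_add`, `merge_concat`, `shape`) and the toy inputs are hypothetical, and feature maps are plain nested lists in [channel][row][col] order. Nearest-neighbour upsampling is assumed here because the slide does not state the interpolation mode.

```python
# Toy feature maps stored as [channel][row][col] nested lists (no framework needed).

def upsample2x(fmap):
    """Nearest-neighbour upsampling by 2 along both spatial axes (assumed mode)."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column twice
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row twice
        out.append(rows)
    return out

def merge_add(a, b):
    """FPN-style merge: element-wise sum, so channel counts must match."""
    assert len(a) == len(b), "addition needs equal channel counts"
    return [[[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def merge_concat(a, b):
    """Concatenation merge: stack the channels, keeping both inputs intact."""
    return a + b

def shape(fmap):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))

# Hypothetical inputs: a 2-channel 4x4 fine map and a 2-channel 2x2 coarse map.
fine = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
coarse = [[[10.0, 20.0], [30.0, 40.0]] for _ in range(2)]

coarse_up = upsample2x(coarse)                 # now 2 x 4 x 4
added = merge_add(fine, coarse_up)             # 2 x 4 x 4, values are mixed
concatenated = merge_concat(fine, coarse_up)   # 4 x 4 x 4, values are preserved
```

Addition fixes the mixing weights at one to one and discards the individual contributions, whereas concatenation passes both sets of channels on unchanged. A following learned layer, such as the 1×1 convolution listed in the figure legend, could then weight them; that is one plausible reading of "softly merge", not something the slide states.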
2. Lin T Y, Dollár P, Girshick R, et al. Feature Pyramid Networks for Object Detection[J]. 2016.