
PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale - PowerPoint PPT Presentation



  1. PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer Anbang Yao Qifeng Chen Duo Li

  2. Highlights • Investigates multi-scale architecture through the lens of kernel engineering instead of network engineering • Extends the scope of the conventional mono-scale convolution operation with our Poly-Scale Convolution (PSConv) • Brings consistent performance improvements on classification, detection, and segmentation tasks with no extra computational overhead

  3. Motivation: Multi-Scale Architecture Design • Single-Scale • AlexNet • VGGNet • …… • Multi-Scale • FCN -> skip connections • Inception -> parallel streams • …… Long et al., Fully Convolutional Networks for Semantic Segmentation, CVPR 2015 Szegedy et al., Going Deeper with Convolutions, CVPR 2015

  4. Previous Work: Layer-Level Skip Connection

  5. Previous Work: Filter-Level Parallel Stream • Kernel Size • Dilation Rate

  6. Previous Work: Filter-Level Feature Pyramid

  7. Motivation: Kernel-Level Feature Pyramid (figure: input feature map and convolutional filter banks; different colors -> different dilation rates)

  8. Method: Poly-Scale Convolution (figure: standard convolution vs. dilated convolution)
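The idea on this slide can be sketched in a few lines of NumPy. This is a toy illustration, not the authors' implementation: it assumes 3x3 kernels, "same" padding, and a simple cyclic assignment of dilation rates over output channels (the paper varies rates along both input- and output-channel axes), with hypothetical names `dilated_conv2d` and `psconv`.

```python
import numpy as np

def dilated_conv2d(x, w, d):
    """Naive single-channel 3x3 dilated convolution ('same' padding).
    For a 3x3 kernel, padding equal to the dilation rate d keeps the
    spatial size unchanged."""
    H, W = x.shape
    xp = np.pad(x, d)
    out = np.zeros((H, W))
    for i in range(3):
        for j in range(3):
            # Each kernel tap samples the input at stride d (the dilation).
            out += w[i, j] * xp[i * d : i * d + H, j * d : j * d + W]
    return out

def psconv(x, weights, rates=(1, 2, 4)):
    """Toy poly-scale convolution: the dilation rate varies cyclically
    with the output-channel index, so one layer mixes receptive fields
    of several scales (a kernel-level feature pyramid).
    x: (C_in, H, W), weights: (C_out, C_in, 3, 3) -> (C_out, H, W)."""
    C_out, C_in = weights.shape[:2]
    out = np.zeros((C_out,) + x.shape[1:])
    for o in range(C_out):
        d = rates[o % len(rates)]  # cyclic dilation assignment
        for c in range(C_in):
            out[o] += dilated_conv2d(x[c], weights[o, c], d)
    return out
```

Because dilation only changes where the kernel taps sample the input, the parameter count and multiply-add count match a standard 3x3 convolution, which is why the slide can claim no extra computational overhead.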

  9. Efficient Implementation • Observation: feature channel indices are interchangeable • Implementation: group kernels sharing the same dilation rate together and implement them with group convolution
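The grouping trick above can be sketched as follows. This is a minimal NumPy sketch under the same toy assumptions as before (3x3 kernels, "same" padding, cyclic rate assignment over output channels); the names `dilated_conv_group` and `psconv_grouped` are illustrative, not from the paper's codebase. Since channel indices are interchangeable, all output channels that share a dilation rate can be gathered and computed in one vectorized pass per rate instead of one dispatch per channel.

```python
import numpy as np

def dilated_conv_group(x, w, d):
    """One 3x3 dilated convolution (rate d) applied to a whole group
    of output channels at once.
    x: (C_in, H, W), w: (G_out, C_in, 3, 3) -> (G_out, H, W)."""
    C_in, H, W = x.shape
    xp = np.pad(x, ((0, 0), (d, d), (d, d)))
    out = np.zeros((w.shape[0], H, W))
    for i in range(3):
        for j in range(3):
            patch = xp[:, i * d : i * d + H, j * d : j * d + W]  # (C_in, H, W)
            # Contract over input channels for every output channel at once.
            out += np.einsum('oc,chw->ohw', w[:, :, i, j], patch)
    return out

def psconv_grouped(x, weights, rates=(1, 2, 4)):
    """Gather output channels by dilation rate, then run one grouped
    pass per rate -- mirroring the group-convolution implementation trick."""
    C_out = weights.shape[0]
    assign = np.array([rates[o % len(rates)] for o in range(C_out)])
    out = np.empty((C_out,) + x.shape[1:])
    for d in rates:
        idx = np.where(assign == d)[0]  # channels sharing rate d
        out[idx] = dilated_conv_group(x, weights[idx], d)
    return out
```

In a real framework this maps onto a small fixed number of grouped convolution calls (one per distinct rate), which is why the trick recovers the speed of ordinary convolution kernels.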

  10. Quantitative Results: ILSVRC 2012 Comparison to baseline models and SOTA multi-scale architectures on ImageNet

  11. Quantitative Results: MS COCO 2017 Comparison to baseline with basic/cascade detectors on COCO detection track

  12. Quantitative Results: MS COCO 2017 Comparison to baseline with basic/cascade detectors on COCO segmentation track

  13. Qualitative Results: Scale Allocation PS-ResNet-50 on ImageNet PS-ResNeXt-29 on CIFAR-100 ■ marks the first residual block of each stage

  14. Conclusion • a plug-and-play convolution operation for deep learning models • leads to consistent and considerable performance margins across a wide range of vision tasks, without bells and whistles • code available for reproducibility: https://github.com/d-li14/PSConv

  15. Thanks!
