PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer

SLIDE 1

PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer

Duo Li, Anbang Yao, Qifeng Chen

SLIDE 3

Highlights

  • investigates multi-scale architecture through the lens of kernel engineering instead of network engineering
  • extends the scope of the conventional mono-scale convolution operation by developing our Poly-Scale Convolution (PSConv)
  • brings performance improvements on classification, detection, and segmentation tasks with no computational overhead

SLIDE 4

Motivation: Multi-Scale Architecture Design

  • Single-Scale
    • AlexNet
    • VGGNet
    • …
  • Multi-Scale
    • FCN -> skip connection
    • Inception -> parallel stream
    • …

Long et al., Fully Convolutional Networks for Semantic Segmentation, CVPR 2015.
Szegedy et al., Going Deeper with Convolutions, CVPR 2015.

SLIDE 5

Previous Work: Layer-Level Skip Connection

SLIDE 6

Previous Work: Filter-Level Parallel Stream

[Figure: parallel streams varying the dilation rate and the kernel size]
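For contrast with PSConv, here is a minimal PyTorch sketch of the filter-level parallel-stream idea (an ASPP/Inception-style block; the class name ParallelStreamBlock and the chosen rates are illustrative, not code from the paper). Each branch convolves the same input at one dilation rate, and scales are only fused when the branch outputs are concatenated.

```python
import torch
import torch.nn as nn

class ParallelStreamBlock(nn.Module):
    """Illustrative filter-level parallel-stream block (ASPP/Inception
    style). Each branch handles a single scale via its dilation rate;
    multi-scale fusion happens only at the final concatenation."""

    def __init__(self, in_ch, branch_ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, kernel_size=3,
                      padding=d, dilation=d, bias=False)
            for d in dilations
        )

    def forward(self, x):
        # padding=d keeps the spatial size fixed for a 3x3 kernel.
        return torch.cat([branch(x) for branch in self.branches], dim=1)
```

PSConv instead moves this scale diversity out of parallel branches and into the kernels of a single convolutional layer.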

SLIDE 7

Previous Work: Filter-Level Feature Pyramid

SLIDE 8

Motivation: Kernel-Level Feature Pyramid

[Figure: input feature map and convolutional filter banks; different colors denote different dilation rates]

SLIDE 9

Method

[Figure: standard convolution vs. dilated convolution vs. poly-scale convolution]
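In equations (notation mine; the cyclic rule D below is one concrete assignment consistent with the slide's description, not copied verbatim from the paper), PSConv generalizes dilated convolution by letting the dilation rate depend on the channel pair:

```latex
% Poly-scale convolution, paraphrased. Notation is mine; the cyclic
% rule D is an assumed concrete instance of the channel-wise layout.
\[
  y_{c'}(h, w) = \sum_{c} \sum_{(i, j) \in \Omega}
    k_{c', c}(i, j)\,
    x_{c}\bigl(h + D(c', c)\, i,\; w + D(c', c)\, j\bigr),
  \qquad
  D(c', c) = d_{\left((c' + c) \bmod t\right) + 1},
\]
% where \Omega is the kernel support (e.g. a 3x3 window) and
% \{d_1, \dots, d_t\} is a small cyclic set of dilation rates.
% Standard convolution is recovered with t = 1, d_1 = 1; ordinary
% dilated convolution with t = 1, d_1 > 1.
```

Because only the sampling locations change, the kernel size, parameter count, and FLOPs all match the mono-scale baseline.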

SLIDE 10

Efficient Implementation

Observation: feature channel indices are interchangeable.
Implementation: group kernels sharing the same dilation rate together and realize them with group convolution.
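A simplified PyTorch sketch of this grouping trick follows. Assumptions mine: channels are split into t equal groups, and the kernel block linking output group g_out to input group g_in uses rate d[(g_out - g_in) mod t]; the official implementation at https://github.com/d-li14/PSConv differs in detail.

```python
import torch
import torch.nn as nn

class PSConv2dSketch(nn.Module):
    """Simplified sketch of PSConv via group convolution (not the
    official code). After permuting channels so that kernels sharing a
    dilation rate are contiguous, each rate becomes one grouped dilated
    convolution over a cyclically rolled input."""

    def __init__(self, in_ch, out_ch, k=3, dilations=(1, 2, 4, 8)):
        super().__init__()
        t = len(dilations)
        assert in_ch % t == 0 and out_ch % t == 0
        self.group_size = in_ch // t
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, k, padding=d * (k // 2),
                      dilation=d, groups=t, bias=False)
            for d in dilations
        )

    def forward(self, x):
        # Rolling the channel groups by s realizes the cyclic pairing:
        # output group g receives input group (g - s) mod t at rate d_s.
        out = 0
        for s, conv in enumerate(self.convs):
            out = out + conv(torch.roll(x, shifts=s * self.group_size, dims=1))
        return out
```

Each of the t grouped convolutions costs 1/t of a dense convolution, so the total matches a standard convolution, consistent with the claim of no computational overhead.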

SLIDE 11

Quantitative Results: ILSVRC 2012

Comparison with baseline models and state-of-the-art multi-scale architectures on ImageNet

SLIDE 12

Quantitative Results: MS COCO 2017

Comparison with baselines using basic and cascade detectors on the COCO detection track

SLIDE 13

Quantitative Results: MS COCO 2017

Comparison with baselines using basic and cascade detectors on the COCO segmentation track

SLIDE 14

Qualitative Results: Scale Allocation

PS-ResNet-50 on ImageNet

■ indicates the starting residual block of each stage

PS-ResNeXt-29 on CIFAR-100

SLIDE 15

Conclusion

  • a plug-and-play convolution operation for deep learning models
  • leads to consistent and considerable performance margins in a wide range of vision tasks, without bells and whistles
  • code available for reproducibility: https://github.com/d-li14/PSConv

SLIDE 16

Thanks!