Deformable Convolutional Networks

Deformable Convolutional Networks Jifeng Dai^ With Haozhi Qi*^, Yuwen Xiong*^, Yi Li*^, Guodong Zhang*^, Han Hu, Yichen Wei Visual Computing Group Microsoft Research Asia (* interns at MSRA, ^ equal contribution)

Highlights
• Enabling effective modeling of spatial transformation in ConvNets
• No additional supervision for learning spatial transformation
• Significant accuracy improvements on sophisticated vision tasks
Code is available at https://github.com/msracver/Deformable-ConvNets

Modeling Spatial Transformations
• A long-standing problem in computer vision
• Sources of variation illustrated on the slide: deformation, scale, viewpoint variation, intra-class variation
(Some examples are taken from Li Fei-Fei's course CS223B, 2009-2010)

Traditional Approaches
• 1) Build training datasets with sufficient desired variations
• 2) Use transformation-invariant features and algorithms, e.g. Scale Invariant Feature Transform (SIFT) and Deformable Part-based Model (DPM)
• Drawbacks: geometric transformations are assumed to be fixed and known; invariant features and algorithms must be hand-crafted

Spatial Transformations in CNNs
• Regular CNNs are inherently limited in modeling large, unknown transformations
• The limitation originates from the fixed geometric structures of CNN modules
Figure panels: regular convolution; 2 layers of regular convolution; regular RoI pooling

Spatial Transformer Networks
• Learning a global, parametric transformation on feature maps
• The transformation family is fixed in advance, which is infeasible for complex vision tasks

Deformable Convolution
• Local, dense, non-parametric transformation
• Learning to deform the sampling locations in the convolution/RoI Pooling modules
Figure panels: regular; deformed; scale & aspect ratio; rotation

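Deformable Convolution
Regular convolution: $y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n) \cdot x(p_0 + p_n)$
Deformable convolution: $y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n) \cdot x(p_0 + p_n + \Delta p_n)$
where $\Delta p_n$ is generated by a sibling branch of regular convolution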
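The sampling rule on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: it handles a single channel and a single output location, takes the per-tap offsets as an input rather than predicting them with a sibling convolution, and uses bilinear interpolation for fractional sampling positions. The names `bilinear`, `GRID` and `deform_conv_at` are hypothetical.

```python
import math

def bilinear(x, r, c):
    """Sample the 2-D list x at a fractional (row, col); pixels outside x count as 0."""
    h, w = len(x), len(x[0])
    r0, c0 = math.floor(r), math.floor(c)
    value = 0.0
    for rr in (r0, r0 + 1):
        for cc in (c0, c0 + 1):
            if 0 <= rr < h and 0 <= cc < w:
                # Weight falls off linearly with distance along each axis.
                value += (1 - abs(r - rr)) * (1 - abs(c - cc)) * x[rr][cc]
    return value

# Regular 3x3 sampling grid, in row-major order.
GRID = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

def deform_conv_at(x, weights, p0, offsets):
    """Output at p0: sum over taps n of weights[n] * x(p0 + p_n + offsets[n])."""
    total = 0.0
    for (dr, dc), w_n, (off_r, off_c) in zip(GRID, weights, offsets):
        total += w_n * bilinear(x, p0[0] + dr + off_r, p0[1] + dc + off_c)
    return total

# Toy 5x5 input with x[r][c] = 5*r + c, and an all-ones kernel (both assumptions).
x = [[5 * r + c for c in range(5)] for r in range(5)]
ones = [1.0] * 9
regular = deform_conv_at(x, ones, (2, 2), [(0.0, 0.0)] * 9)  # zero offsets: plain convolution
shifted = deform_conv_at(x, ones, (2, 2), [(0.5, 0.5)] * 9)  # every tap moved by half a pixel
```

With all offsets set to zero the function reduces to a regular 3x3 convolution at `p0`, which is the sense in which the deformable module keeps the same input and output as the plain one.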

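Deformable RoI Pooling
Regular RoI pooling: $y(i, j) = \sum_{p \in \mathrm{bin}(i, j)} x(p_0 + p) / n_{ij}$
Deformable RoI pooling: $y(i, j) = \sum_{p \in \mathrm{bin}(i, j)} x(p_0 + p + \Delta p_{ij}) / n_{ij}$
where $\Delta p_{ij}$ is generated by a sibling fc branch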
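A comparable sketch for RoI pooling, again under assumptions: each of the k x k bins averages the pixels it covers, and a deformable bin shifts all of its samples by one shared offset. The offsets are passed in directly, in pixels; how the fc branch predicts and scales them is outside this sketch. The function names and the RoI layout `(top, left, height, width)` are hypothetical.

```python
import math

def bilinear(x, r, c):
    """Sample the 2-D list x at a fractional (row, col); pixels outside x count as 0."""
    h, w = len(x), len(x[0])
    r0, c0 = math.floor(r), math.floor(c)
    value = 0.0
    for rr in (r0, r0 + 1):
        for cc in (c0, c0 + 1):
            if 0 <= rr < h and 0 <= cc < w:
                value += (1 - abs(r - rr)) * (1 - abs(c - cc)) * x[rr][cc]
    return value

def deform_roi_pool(x, roi, k, offsets):
    """Average-pool roi = (top, left, height, width) into k x k bins.

    offsets[i][j] = (d_row, d_col) shifts every sample of bin (i, j).
    """
    top, left, height, width = roi
    out = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            d_row, d_col = offsets[i][j]
            # Pixel positions covered by bin (i, j), relative to the RoI's top-left corner.
            rows = range(math.floor(i * height / k), math.ceil((i + 1) * height / k))
            cols = range(math.floor(j * width / k), math.ceil((j + 1) * width / k))
            samples = [bilinear(x, top + r + d_row, left + c + d_col)
                       for r in rows for c in cols]
            out[i][j] = sum(samples) / len(samples)  # divide by the pixel count of the bin
    return out

# Toy 6x6 feature map with x[r][c] = 6*r + c and a 4x4 RoI pooled into 2x2 bins.
x = [[6 * r + c for c in range(6)] for r in range(6)]
zero = [[(0.0, 0.0)] * 2 for _ in range(2)]
plain = deform_roi_pool(x, (1, 1, 4, 4), 2, zero)
moved = [[(0.5, -0.5), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]]
deformed = deform_roi_pool(x, (1, 1, 4, 4), 2, moved)
```

Only the top-left bin changes in `deformed`, since it is the only bin given a non-zero offset; each bin can move independently of the others.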

Deformable ConvNets
• Same input & output as the plain versions
• Regular convolution -> deformable convolution
• Regular RoI pooling -> deformable RoI pooling
• End-to-end trainable without additional supervision

Sampling Locations of Deformable Convolution (a) standard convolution (b) deformable convolution

Part Offsets in Deformable RoI Pooling

Ablation Experiments on VOC & Cityscapes
• Number of deformable convolutional layers (using ResNet-101); @V = VOC, @C = Cityscapes

                                DeepLab                 Class-aware RPN         Faster R-CNN (2fc)      R-FCN
# deformable layers             mIoU@V (%)  mIoU@C (%)  mAP@0.5 (%) mAP@0.7 (%) mAP@0.5 (%) mAP@0.7 (%) mAP@0.5 (%) mAP@0.7 (%)
None (0, baseline)              69.7        70.4        68.0        44.9        78.1        62.1        80.0        61.8
Res5c (1)                       73.9        73.5        73.5        54.4        78.6        63.8        80.6        63.0
Res5b, c (2)                    74.8        74.4        74.3        56.3        78.5        63.3        81.0        63.8
Res5a, b, c (3) (default)       75.2        75.2        74.5        57.2        78.6        63.3        81.4        64.7
Res5 & res4b22, b21, b20 (6)    74.8        75.1        74.6        57.7        78.7        64.0        81.5        65.4

Deformable ConvNets vs. dilated convolution

                                         DeepLab          Class-aware RPN  Faster R-CNN     R-FCN
Deformable modules                       mIoU@V/@C        mAP@0.5/@0.7     mAP@0.5/@0.7     mAP@0.5/@0.7
Dilated convolution (2, 2, 2) (default)  69.7 / 70.4      68.0 / 44.9      78.1 / 62.1      80.0 / 61.8
Dilated convolution (4, 4, 4)            73.1 / 71.9      72.8 / 53.1      78.6 / 63.1      80.5 / 63.0
Dilated convolution (6, 6, 6)            73.6 / 72.7      73.6 / 55.2      78.5 / 62.3      80.2 / 63.5
Dilated convolution (8, 8, 8)            73.2 / 72.4      73.2 / 55.1      77.8 / 61.8      80.3 / 63.2
Deformable convolution                   75.3 / 75.2      74.5 / 57.2      78.6 / 63.3      81.4 / 64.7
Deformable RoI pooling                   N.A.             N.A.             78.3 / 66.6      81.2 / 65.0
Deformable convolution & RoI pooling     N.A.             N.A.             79.3 / 66.9      82.6 / 68.5

Figure panels: regular convolution; dilated convolution; deformable convolution
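One way to read this comparison: a dilated convolution samples a fixed, enlarged grid, which corresponds to a deformable convolution whose offsets are frozen at (rate - 1) times each tap position instead of being learned per location. The sketch below only checks that geometry for a 3x3 kernel; the helper names are hypothetical and nothing here is taken from the authors' code.

```python
# Regular 3x3 sampling grid, in row-major order.
GRID = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

def sampling_locations(p0, offsets):
    """Locations p0 + p_n + offsets[n] visited by a 3x3 deformable kernel."""
    return [(p0[0] + dr + o_r, p0[1] + dc + o_c)
            for (dr, dc), (o_r, o_c) in zip(GRID, offsets)]

def dilation_offsets(rate):
    """Fixed offsets (rate - 1) * p_n that turn the regular grid into a dilated one."""
    return [((rate - 1) * dr, (rate - 1) * dc) for dr, dc in GRID]

regular = sampling_locations((4, 4), dilation_offsets(1))  # rate 1: ordinary 3x3 grid
dilated = sampling_locations((4, 4), dilation_offsets(2))  # rate 2: taps two pixels apart
```

A learned offset field can reproduce any of the dilation settings in the table and can also vary from one location to the next, which a single fixed dilation rate cannot.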

Model Complexity and Runtime on VOC & Cityscapes
• Deformable ConvNets vs. regular ConvNets

Method                            # params    Net forward (sec)   Runtime (sec)
Regular DeepLab @Cityscapes       46.0 M      0.610               0.650
Deformable DeepLab @Cityscapes    46.1 M      0.656               0.696
Regular DeepLab @VOC              46.0 M      0.084               0.094
Deformable DeepLab @VOC           46.1 M      0.088               0.098
Regular Class-aware RPN           46.0 M      0.142               0.323
Deformable Class-aware RPN        46.1 M      0.152               0.334
Regular Faster R-CNN (2fc)        58.3 M      0.147               0.190
Deformable Faster R-CNN (2fc)     59.9 M      0.192               0.234
Regular R-FCN                     47.1 M      0.143               0.170
Deformable R-FCN                  49.5 M      0.169               0.193
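To put the figures above in relative terms, the short script below computes the percentage overhead of each deformable model over its regular counterpart. The numbers are copied from the table; the dictionary layout and the `overhead_percent` helper are illustrative assumptions.

```python
# (params in millions, runtime in seconds) from the table: regular first, deformable second.
ROWS = {
    "DeepLab @Cityscapes": ((46.0, 0.650), (46.1, 0.696)),
    "DeepLab @VOC": ((46.0, 0.094), (46.1, 0.098)),
    "Class-aware RPN": ((46.0, 0.323), (46.1, 0.334)),
    "Faster R-CNN (2fc)": ((58.3, 0.190), (59.9, 0.234)),
    "R-FCN": ((47.1, 0.170), (49.5, 0.193)),
}

def overhead_percent(regular, deformable):
    """Relative increase of the deformable figure over the regular one, in percent."""
    return 100.0 * (deformable - regular) / regular

# Map each method to (parameter overhead %, runtime overhead %).
overheads = {
    name: (overhead_percent(reg[0], dfm[0]), overhead_percent(reg[1], dfm[1]))
    for name, (reg, dfm) in ROWS.items()
}

for name, (params_pct, runtime_pct) in overheads.items():
    print(f"{name}: params +{params_pct:.1f}%, runtime +{runtime_pct:.1f}%")
```

For R-FCN, for example, this works out to roughly 5.1% more parameters and 13.5% more runtime.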

Object Detection on COCO
• Deformable ConvNets vs. regular ConvNets (bar chart, mAP in %)

Method (backbone)                   Deformable  Regular
FPN++ (Aligned-Xception)            48.5        45.2
FPN+OHEM (Aligned-Xception)         43.3        40.2
FPN+OHEM (ResNet-101)               40.5        37.4
R-FCN (Aligned-Inception-ResNet)    37.5        34.5
R-FCN (ResNet-101)                  35.7        32.1
Faster R-CNN, 2fc (ResNet-101)      35          30.3
Class-aware RPN (ResNet-101)        25.8        23.2

Conclusion
• Deformable ConvNets for dense spatial modeling
• Simple, efficient, deep, and end-to-end
• No additional supervision
• Feasible and effective on sophisticated vision tasks for the first time
