Context for Semantic Segmentation
Gang Yu
Collaborators
- Changqian Yu, Jingbo Wang, Chao Peng, Xiangyu Zhang, Changxin Gao, Nong Sang, Gang Yu, Jian Sun
Outline
- Revisit Semantic Segmentation
- Context for Semantic Segmentation
- Backbone
- Head
- Loss
- Conclusion
What is Semantic Segmentation?
- Classification + Localization
- Visual Recognition
- Classification
- Semantic Segmentation
- Instance Segmentation
- Panoptic Segmentation
- Detection
- Keypoint Detection
Pipeline
- Backbone: VGG16, ResNet, ResNeXt, …
- Head: U-Shape, 4/8-Sampling + Dilation, …
- Loss: Softmax, L2, …
Challenges in Semantic Segmentation?
- Speed
- Performance
- Per-pixel Accuracy
- Boundary
What is Context?
- According to the dictionary: "the parts of a discourse that surround a word or passage and can throw light on its meaning"
[Figure: example image with labels Sports ball, Grass, Play Fields, Person]
Outline
- Revisit Semantic Segmentation
- Context for Semantic Segmentation
- Backbone
- Head
- Loss
- Conclusion
Context in Backbone
- Motivation
- Traditional Backbone is designed for Classification
- Large Receptive field by compromising spatial resolution
- Segmentation requires both Classification & Localization
- Maintain both Receptive Field (context) & Spatial resolution
- Computational cost?
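The trade-off in the motivation above can be checked with the standard receptive-field recurrence for a stack of convolutions. This is a minimal sketch, not code from the talk: the function name and the two five-layer stacks are illustrative assumptions. It shows that replacing late strides with dilation keeps the receptive field while keeping the output at 1/8 instead of 1/32 of the input resolution, at the price of larger feature maps in the last layers.

```python
def receptive_field(layers):
    """Return (receptive_field, output_stride) in input pixels.

    layers: list of (kernel_size, stride, dilation) tuples, ordered
    from the input to the output of the network.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf, jump


# Hypothetical classification-style stack: five 3x3 convs, each with stride 2.
strided = [(3, 2, 1)] * 5

# Hypothetical segmentation-style stack: the last two strides are removed and
# the final layer uses dilation 2 to compensate for the removed stride.
dilated = [(3, 2, 1)] * 3 + [(3, 1, 1), (3, 1, 2)]

print(receptive_field(strided))  # (63, 32)
print(receptive_field(dilated))  # (63, 8)
```

Both stacks see 63 input pixels, but the dilated one produces a 4x larger feature map per side, which is where the computational-cost question comes from.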
Context in Backbone - BiSeNet
BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation, Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang, ECCV, 2018
- BiSeNet: Bilateral Segmentation Network
Context in Backbone - BiSeNet
- Pipeline
Context in Backbone - BiSeNet
- Results
Context in Backbone - BiSeNet
- Ablation Results
Context in Backbone - BiSeNet
- Speed
Context in Backbone - BiSeNet
- Summary
- Two paths in the backbone: Spatial Path + Context Path
- Context is implicitly encoded in the receptive field
- Fast inference speed
- Code: https://github.com/ycszen/TorchSeg
- Context:
- A branch encodes semantic meaning with large receptive field?
- Related work:
- ICNet for Real-Time Semantic Segmentation on High-Resolution Images, Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia, ECCV 2018
- Stacked Hourglass Networks for Human Pose Estimation, Alejandro Newell, Kaiyu Yang, Jia Deng, ECCV 2016
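The two-path idea summarized above can be illustrated with simple resolution bookkeeping. This is a rough sketch under stated assumptions, not the released TorchSeg code: the helper names are hypothetical, and the strides (1/8 for the spatial path, 1/16 and 1/32 for the context path) are assumptions taken as defaults that can be changed.

```python
from dataclasses import dataclass


@dataclass
class FeatureMap:
    name: str
    height: int
    width: int
    stride: int  # total downsampling factor relative to the input image


def downsample(height, width, stride):
    # Ceiling division, as produced by "same"-padded strided convolutions.
    return -(-height // stride), -(-width // stride)


def two_path_shapes(height, width, spatial_stride=8, context_strides=(16, 32)):
    """Return the spatial-path map, the context-path maps, and the
    upsampling factor each context map needs before it can be fused."""
    sh, sw = downsample(height, width, spatial_stride)
    spatial = FeatureMap("spatial", sh, sw, spatial_stride)
    context = []
    for stride in context_strides:
        ch, cw = downsample(height, width, stride)
        context.append(FeatureMap(f"context_1/{stride}", ch, cw, stride))
    upsample = {c.name: c.stride // spatial.stride for c in context}
    return spatial, context, upsample


spatial, context, upsample = two_path_shapes(1024, 2048)
print(spatial)   # high-resolution map that keeps spatial detail
print(context)   # low-resolution maps with a large receptive field
print(upsample)  # factors needed to bring context maps to the spatial size
```

The spatial path stays shallow and wide in resolution, while the context path trades resolution for receptive field; the fusion step only has to upsample the cheap, low-resolution maps.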
Context in Head
- Motivation
- Large Receptive field without compromising boundary results
- Why working on Head?
- Efficient speed
- Obvious gain from increasing the receptive field
- Simple to implement
Context in Head – Large Kernel
- Receptive Field vs Valid Receptive Field
Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network, Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun, CVPR, 2017
Context in Head – Large Kernel
- Large Kernel Matters
Context in Head – Large Kernel
- Large Kernel Matters
- Why Boundary Refinement?
- Large receptive field will blur the object boundary
Context in Head – Large Kernel
- Large Kernel Matters
- Ablation: Why Boundary Refinement?
- Large receptive field will blur the object boundary
Context in Head – Large Kernel
- Large Kernel Matters
- Ablation: Different kernel size?
Context in Head – Large Kernel
- Large Kernel Matters
- Ablation: Are more parameters helpful?
Context in Head – Large Kernel
- Large Kernel Matters
- Ablation: GCN vs. Stack of small convolutions
Context in Head – Large Kernel
- Large Kernel Matters
- Ablation: GCN in Backbone
Context in Head – Large Kernel
- Large Kernel Matters: illustrative examples
Context in Head – Large Kernel
- Summary
- Global Convolutional Network (GCN) increases the receptive field
- Large separable convolution is an efficient implementation
- Context
- Large receptive field?
- Related work
- PSPNet: Pyramid Scene Parsing Network, Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia, CVPR 2017
- DeepLabv3: Rethinking Atrous Convolution for Semantic Image Segmentation, Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam
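The efficiency claim for large separable convolutions in the summary above can be made concrete with a parameter count. This is a back-of-the-envelope sketch, assuming a GCN block with two branches (k x 1 followed by 1 x k, and 1 x k followed by k x 1) whose outputs are summed, and ignoring biases. The kernel size and channel counts below are illustrative assumptions.

```python
def dense_conv_params(kernel, c_in, c_out):
    """Weights in a single dense kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out


def gcn_params(kernel, c_in, c_out):
    """Weights in a two-branch separable block.

    Each branch applies one 1-D convolution from c_in to c_out channels,
    then the transposed 1-D convolution from c_out to c_out channels.
    """
    per_branch = kernel * c_in * c_out + kernel * c_out * c_out
    return 2 * per_branch


# Illustrative values: 15x15 kernel, 2048 input channels, 21 output channels.
k, c_in, c_out = 15, 2048, 21
print(dense_conv_params(k, c_in, c_out))  # 9676800
print(gcn_params(k, c_in, c_out))         # 1303470
```

The dense kernel grows with k squared, while the separable block grows linearly in k, so the gap widens as the kernel gets larger.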
Context in Head – DFN
- Motivation:
- Large kernel (GCN) is computationally intensive
- Global pooling is efficient to compute and can obtain the global context
- A large receptive field does not equal good context
- Attention strategy to adaptively aggregate the features
Learning a Discriminative Feature Network for Semantic Segmentation, Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang, CVPR, 2018
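The combination of global pooling and attention in the motivation above can be sketched in a few lines. This is a simplified illustration, not the exact block from the DFN paper: the function names are hypothetical, and the `weights` matrix and `bias` vector stand in for a learned 1x1 convolution.

```python
import math


def global_avg_pool(feature):
    """feature: list of C channels, each an H x W nested list.
    Returns one mean value per channel (the global context vector)."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feature, weights, bias):
    """Reweight each channel by a gate computed from global context.

    weights: C x C matrix, bias: length-C vector (stand-ins for a
    learned 1x1 convolution applied to the pooled vector).
    Returns (reweighted_feature, gates).
    """
    pooled = global_avg_pool(feature)
    gates = [
        sigmoid(sum(w * p for w, p in zip(row, pooled)) + b)
        for row, b in zip(weights, bias)
    ]
    reweighted = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature)
    ]
    return reweighted, gates


feature = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
zero_weights = [[0.0, 0.0], [0.0, 0.0]]
out, gates = channel_attention(feature, zero_weights, [0.0, 0.0])
print(gates)  # [0.5, 0.5]
```

The pooling step costs one pass over the feature map regardless of kernel size, which is why it is much cheaper than a large-kernel convolution for capturing image-wide context.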
Context in Head – DFN
- DFN: Pipeline
Context in Head – DFN
- DFN: Ablation
Context in Head – DFN
- DFN: Results
Context in Head – DFN
- Summary
- Global pooling is an efficient and effective way to capture long-range context
- Attention adaptively adjusts feature weights
- Code: https://github.com/ycszen/TorchSeg/
- Context
- Receptive field & feature aggregation?
- Related work
- Non-local Neural Networks, Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He, CVPR 2018
- CCNet: Criss-Cross Attention for Semantic Segmentation, Zilong Huang, Xinggang Wang, Lichao Huang, Chang Huang, Yunchao Wei, Wenyu Liu
- PSANet: Point-wise Spatial Attention Network for Scene Parsing, Hengshuang Zhao, Yi Zhang, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, Jiaya Jia, ECCV 2018
- OCNet: Object Context Network for Scene Parsing, Yuhui Yuan, Jingdong Wang
- ParseNet: Looking Wider to See Better, Wei Liu, Andrew Rabinovich, Alexander C. Berg
Context in Loss
- Motivation
- “Thing” may be important for stuff prediction
COCO2018 Panoptic Segmentation Challenge, http://presentations.cocodataset.org/ECCV18/COCO18-Panoptic-Megvii.pdf
[Figure: example image with labels Sports ball, Grass, Play Fields, Person]
[Figure: pipeline diagram. Labels: Encoder, Res-Block, Multi Types Context (Objects, Semantic, Stuff), Train Supervision, Train/Inference, Inference, Merge]
Context in Loss
- Pipeline
Context in Loss
- COCO2018 Panoptic Segmentation Challenge
[Figure: bar chart, "Results of Stuff Regions on COCO2018 Panoptic Segmentation Validation Dataset". Metric: Mean IoU (%). Configurations: Res50, +Encoder, +Extra Res Blocks, +Multi Context, +Huge Backbone, +Multi-Scale Flip Test. Bar values in extraction order: 49.3, 49.6, 54.1, 54.5, 50.8]
- Finally, we ensembled three models and achieved 55.9% mIoU on this dataset.
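For reference, the mean IoU metric used for these results can be computed as follows. This is a minimal sketch of the general definition, not the challenge's evaluation code: the function name is an assumption, predictions are taken as flat lists of class indices, and classes absent from both prediction and ground truth are skipped.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred, target: equal-length sequences of per-pixel class indices.
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes that appear in neither map
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Benchmarks usually accumulate the intersection and union counts over the whole dataset before dividing, rather than averaging per-image scores.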
Context in Loss
- COCO2018 Panoptic Segmentation Challenge
Context in Loss
- Summary
- “Thing” and “stuff” are complementary
- Loss is a good approach to encode the context
- Better feature representation
- Context
- A loss to encode the semantic meaning?
- Related work
- Context Encoding for Semantic Segmentation, Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal, CVPR 2018
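The idea of encoding context through the loss, as summarized above, can be sketched as a weighted sum of a main "stuff" loss and an auxiliary "thing" (object) loss that is only used during training. This is an illustrative sketch, not the challenge entry's implementation: the function names and the weight `w_objects` are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one pixel, computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def mean_pixel_loss(logit_map, label_map):
    """Average cross-entropy over a flat list of pixels."""
    losses = [softmax_cross_entropy(l, y) for l, y in zip(logit_map, label_map)]
    return sum(losses) / len(losses)


def multi_context_loss(stuff, objects, w_objects=0.5):
    """Main stuff loss plus a weighted auxiliary object loss.

    stuff, objects: (logit_map, label_map) pairs predicted from shared
    features. The object head supplies extra supervision during training
    and can be dropped at inference.
    """
    return mean_pixel_loss(*stuff) + w_objects * mean_pixel_loss(*objects)


stuff = ([[0.0, 0.0]], [0])    # one pixel, two stuff classes
objects = ([[0.0, 0.0]], [1])  # one pixel, two object classes
print(multi_context_loss(stuff, objects))  # 1.5 * ln(2)
```

Because both heads read the same features, gradients from the object loss push the shared encoder toward representations that also help stuff prediction.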
Outline
- Revisit Semantic Segmentation
- Context for Semantic Segmentation
- Backbone
- Head
- Loss
- Conclusion
Conclusion
- Context in different parts
- Backbone, Head, Loss
- What is Context?
- Large receptive field?
- A semantic branch?
- Spatial/feature aggregation?
- Future work
- Explicitly show what context is
- Panoptic seg: Stuff vs Thing
Reference
- Pyramid Scene Parsing Network, Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia, CVPR 2017
- ICNet for Real-Time Semantic Segmentation on High-Resolution Images, Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia, ECCV 2018
- Context Encoding for Semantic Segmentation, Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal, CVPR 2018
- Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam, ECCV 2018
- Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network, Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun, CVPR 2017
- Learning a Discriminative Feature Network for Semantic Segmentation, Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang, CVPR 2018
- BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation, Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang, ECCV 2018
Q&A
- Megvii Detection Zhihu column
- Webpage: http://www.skicyyu.org/
- Email: yugang@megvii.com