Pixel-Level Image Understanding with Semantic Segmentation and Panoptic Segmentation
Hengshuang Zhao The Chinese University of Hong Kong May 29, 2019
Part I: Semantic Segmentation
Semantic Segmentation (example labels: background, car, person)
[Figure: original image and per-pixel annotation (person, horse, car, background). Images adapted from PASCAL VOC 2012 and ADE20K.]
FCN [Long et al. 2015]
DeepLabV1 [Chen et al. 2015], DPN [Liu et al. 2015], CRF-RNN [Zheng et al. 2015]
UNet [Ronneberger et al. 2015], DeconvNet [Noh et al. 2015], SegNet [Badrinarayanan et al. 2015], LRR [Ghiasi et al. 2016], RefineNet [Lin et al. 2017], FRRN [Pohlen et al. 2017]
DeepLabV1 [Chen et al. 2015], Dilation [Yu and Koltun 2016]
Pooling: ParseNet [Liu et al. 2015], PSPNet [Zhao et al. 2017], DeepLabV2 [Chen et al. 2016]
Large Kernel: GCN [Peng et al. 2017]
Search for backbone: Auto-DeepLab [Liu et al. 2019]
Search for head: DPC [Chen et al. 2018]
Spatial attention (dot product): Transformer [Vaswani et al. 2017], Non-Local-Net [Wang et al. 2018], OCNet [Yuan et al. 2018], DANet [Fu et al. 2018], CCNet [Huang et al. 2018]
Channel reweighting: SENet [Hu et al. 2018], EncNet [Zhang et al. 2018], DFN [Yu et al. 2018]
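As a rough illustration of the dot-product spatial attention listed above, the sketch below lets every position aggregate features from all positions, weighted by softmax-normalized dot products. It is a minimal standard-library sketch and not the implementation of any cited network: the function name `dot_product_attention` and the list-of-vectors input are assumptions made for illustration, and query, key and value are all taken to be the raw feature vector, with no learned projections.

```python
import math

def dot_product_attention(features):
    """Aggregate features over all positions with dot-product weights.

    features: list of N feature vectors, one per spatial position.
    Assumption: query, key and value are all the raw feature vector;
    real networks typically apply learned projections first.
    """
    out = []
    for q in features:
        # Similarity of this position to every position.
        scores = [sum(a * b for a, b in zip(q, k)) for k in features]
        # Softmax over positions (shifted by the max for stability).
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of all feature vectors.
        dim = len(q)
        out.append([sum(w * v[d] for w, v in zip(weights, features))
                    for d in range(dim)])
    return out
```

Each output vector is a convex combination of the input vectors, so positions with similar features reinforce one another regardless of how far apart they are in the image.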
[Figure labels: information collection branch, information distribution branch; over-completed, compact]
feature fusion: local & global
ADE20K: information aggregation approaches
ADE20K: result on val set
PASCAL VOC 2012: result on val set
result on val set
result on test set (train with fine set)
result on test set (train with fine+coarse set)
semantic segmentation: instances indistinguishable
instance segmentation: stuff unsolved
panoptic segmentation: stuff and things are solved, instances distinguishable
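The difference between the three tasks can be made concrete with a toy example. The sketch below uses a hypothetical one-row "image" of six pixels; the class names, the `STUFF` set and the helper `to_panoptic` are illustrative assumptions, not part of any dataset API.

```python
# Hypothetical 1 x 6 "image": two people in front of sky and road.
STUFF = {"sky", "road"}  # amorphous regions without instances

# Semantic segmentation: one class per pixel, instances indistinguishable.
semantic = ["sky", "person", "person", "road", "person", "road"]

# Instance segmentation: ids for thing pixels only, stuff left unlabeled.
instance = [None, 1, 1, None, 2, None]

def to_panoptic(semantic, instance):
    """Pair each pixel's class with an instance id (0 for stuff)."""
    panoptic = []
    for cls, inst in zip(semantic, instance):
        if cls in STUFF:
            panoptic.append((cls, 0))
        else:
            panoptic.append((cls, inst))
    return panoptic

# Panoptic segmentation: every pixel labeled, instances distinguishable.
panoptic = to_panoptic(semantic, instance)
```

Pixels 1 and 4 carry the same semantic label but different panoptic labels, while the sky and road pixels, which the instance map leaves empty, receive a class in the panoptic output.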
Instance: Mask R-CNN [He et al. 2017]
Semantic: PSPNet [Zhao et al. 2017]
Redundant computation for independent models
Heuristic Merge
heuristic merge logic is not end-to-end trainable
heuristic combination
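To show why such a merge step is not trainable, here is a minimal sketch of one common heuristic: instance masks are pasted in descending score order, and leftover pixels take the stuff label from the semantic prediction. The exact rule used in the talk is not given, so the function `heuristic_merge`, its input format and the `VOID` label are assumptions made for illustration.

```python
VOID = ("void", 0)  # hypothetical label for pixels neither model explains

def heuristic_merge(instances, semantic, stuff_classes):
    """Merge independent instance and semantic predictions.

    instances: list of dicts with 'cls', 'score', 'mask' (set of pixel indices).
    semantic: list of per-pixel class names from the semantic model.
    stuff_classes: set of class names treated as stuff.
    """
    panoptic = [None] * len(semantic)
    ordered = sorted(instances, key=lambda d: d["score"], reverse=True)
    for inst_id, inst in enumerate(ordered, start=1):
        for p in inst["mask"]:
            # Higher-scoring instances win overlapping pixels.
            if panoptic[p] is None:
                panoptic[p] = (inst["cls"], inst_id)
    for p, cls in enumerate(semantic):
        if panoptic[p] is None:
            # Remaining pixels take the stuff label, or void when the
            # semantic model predicted a thing class with no instance.
            panoptic[p] = (cls, 0) if cls in stuff_classes else VOID
    return panoptic
```

The sort, the first-come pixel assignment and the fallback are all hard decisions applied after both networks have run, so no gradient flows through them.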
Unified Backbone Network: Save Computation!
Pixel-wise Classification: Consistent Estimation!
Semantic Head: FPN with Deformable Conv
Instance Head: Same as Mask R-CNN
[Figure: panoptic head. Labels: mask logits Z_j from the instance head (resize/pad); thing and stuff logits Y_thing and Y_stuff from the semantic head; Y_mask_j; panoptic logits with O_inst and O_stuff channels, each H x W; logits for unknown (max, max, 1).]
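One plausible reading of the panoptic head diagram can be sketched as follows. The combination rules are assumptions, since only the figure labels survive: each instance channel is taken to be the sum of its resized/padded mask logits and the thing logits of its class inside its box, and the unknown channel is taken to be the difference of two per-pixel maxima. The image is flattened to a list of pixels, and the names `panoptic_logits` and `predict` are hypothetical.

```python
def panoptic_logits(stuff_logits, thing_logits, instances):
    """Assemble per-pixel panoptic logits for a flattened image.

    stuff_logits: [n_stuff][n_pixels] logits from the semantic head.
    thing_logits: [n_thing][n_pixels] logits from the semantic head.
    instances: list of dicts with 'cls' (thing class index), 'box'
        (set of pixel indices) and 'mask' (mask logits already
        resized/padded to n_pixels, 0.0 outside the box).
    Returns (channels, unknown): stuff channels followed by one
    channel per instance, plus a separate unknown channel.
    """
    n_pixels = len(stuff_logits[0])
    channels = [list(row) for row in stuff_logits]
    inst_thing = []
    for inst in instances:
        # Thing logits of the instance's class, kept only inside its box.
        sem = [thing_logits[inst["cls"]][p] if p in inst["box"] else 0.0
               for p in range(n_pixels)]
        inst_thing.append(sem)
        # Assumption: instance channel = semantic evidence + mask evidence.
        channels.append([s + m for s, m in zip(sem, inst["mask"])])
    unknown = []
    for p in range(n_pixels):
        best_thing = max(row[p] for row in thing_logits)
        best_inst = max((sem[p] for sem in inst_thing), default=0.0)
        # Assumption: thing evidence not claimed by any instance is "unknown".
        unknown.append(best_thing - best_inst)
    return channels, unknown

def predict(channels, unknown):
    """Per-pixel argmax; index len(channels) means 'unknown'."""
    labels = []
    for p in range(len(unknown)):
        scores = [row[p] for row in channels] + [unknown[p]]
        labels.append(scores.index(max(scores)))
    return labels
```

Because the final label is an argmax over logits that both heads feed into, a formulation like this is differentiable up to the argmax, unlike a post-hoc merge of two finished predictions.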
[Figure: UPSNet vs. MR-CNN-PSP. Panels: Results on COCO (800 x 1300); Results on Cityscapes (1024 x 2048).]
result on COCO
result on Cityscapes
result on internal data
run time comparison
I. Semantic Segmentation:
II. Panoptic Segmentation: