An Overview of Semantic Image Segmentation with Deep Learning

SLIDE 1

An Overview of Semantic Image Segmentation with Deep Learning

Simone Bonechi

SLIDE 2

Outline

➢ Semantic Image Segmentation
➢ Deep Network for Semantic Segmentation

  • FCN (Fully Convolutional Neural Network)
  • DeconvNet
  • PSPNet (Pyramid Scene Parsing Network)

➢ Work in progress…

SLIDE 3

Semantic Image Segmentation

SLIDE 4

Instance-Level Segmentation

➢ Its main purpose is to identify objects of the same class and separate them into distinct instances
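
A toy illustration of splitting one class into instances, assuming that objects of the class never touch, so each connected blob counts as one object. Real instance segmentation methods do not rely on this assumption; `label_instances` and the mask values are hypothetical.

```python
from collections import deque


def label_instances(mask):
    """Give each 4-connected blob of 1s in a binary class mask its own positive id."""
    h, w = len(mask), len(mask[0])
    ids = [[0] * w for _ in range(h)]
    next_id = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not ids[r][c]:
                next_id += 1
                ids[r][c] = next_id
                queue = deque([(r, c)])
                # Breadth-first flood fill over the 4-neighbourhood.
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not ids[ny][nx]:
                            ids[ny][nx] = next_id
                            queue.append((ny, nx))
    return ids, next_id


# Semantic output: every pixel of the class carries the same label (1).
semantic_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
# Instance-level output: pixels of the same class get different ids per object.
instance_ids, count = label_instances(semantic_mask)
```

The semantic mask cannot tell the two objects apart, while `instance_ids` assigns them the ids 1 and 2.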

SLIDE 5

Results on PASCAL VOC 2012

SLIDE 6

Fully Convolutional Neural Network (FCN)

Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3431-3440).

SLIDE 7

FCN Overview

➢ Tested with AlexNet, VGG and GoogLeNet
➢ Reinterpret standard classification convnets as “fully convolutional” networks (FCN) for semantic segmentation
➢ Combine information from different layers for segmentation

SLIDE 8

Replace FC with Convolutions

A classification network
Becoming fully convolutional
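
A minimal sketch of the replacement in plain Python, with no framework assumed: a fully connected unit over a 2×2 input gives the same score as a convolution whose kernel covers the whole input, and the same kernel applied to a larger image yields a map of scores. The helper names `fc` and `conv2d_valid` and all weights are illustrative, not taken from the FCN code.

```python
def fc(x, weights):
    """Fully connected unit: dot product of the flattened input with the weights."""
    flat_x = [v for row in x for v in row]
    flat_w = [v for row in weights for v in row]
    return sum(a * b for a, b in zip(flat_x, flat_w))


def conv2d_valid(x, kernel):
    """Stride-1 'valid' cross-correlation (what deep learning frameworks call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


# Hypothetical weights of one FC unit trained on 2x2 inputs, kept in 2-D layout.
weights = [[1, 0], [0, -1]]
small = [[3, 1], [4, 2]]
large = [[3, 1, 0], [4, 2, 5], [1, 1, 1]]

fc_score = fc(small, weights)               # a single score
conv_small = conv2d_valid(small, weights)   # 1x1 map holding the same score
conv_large = conv2d_valid(large, weights)   # 2x2 map: one score per window
```

Because the convolutional form accepts inputs of any size, the network produces a coarse score map instead of a single label.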

SLIDE 9

Upsampling the output

SLIDE 10

Convolution & Deconvolution

➢ Deconvolution
➢ Transposed convolution
➢ Fractionally strided convolution
➢ Backward strided convolution
➢ Upconvolution
➢ …
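
In the deep learning literature these names are commonly used for the same upsampling operation. A minimal 1-D sketch, assuming no padding: each input value scatters a scaled copy of the kernel into the output at a position given by the stride. The function name and the kernels are illustrative; in a network the kernel is learned.

```python
def conv_transpose1d(x, kernel, stride):
    """1-D transposed convolution without padding.

    Output length is (len(x) - 1) * stride + len(kernel).
    """
    out = [0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for k, w in enumerate(kernel):
            # Input element i contributes v * kernel, shifted by i * stride.
            out[i * stride + k] += v * w
    return out


# A box kernel reproduces nearest-neighbour upsampling by a factor of 2.
nearest = conv_transpose1d([1, 2, 3], [1, 1], stride=2)

# A triangular kernel interpolates linearly between the input samples.
linear = conv_transpose1d([1, 2, 3], [0.5, 1, 0.5], stride=2)
```

With the box kernel the output is `[1, 1, 2, 2, 3, 3]`; with the triangular kernel the interior values are linear interpolations of the neighbouring inputs.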

SLIDE 11

Upsampling the output

SLIDE 12

FCN Limitations

➢ Fixed-size receptive field

  • FCN has a fixed-size receptive field; objects substantially larger or smaller than the receptive field may be fragmented or mislabeled
  • The label map is so small that the detailed structures of objects tend to be lost
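
To see why the receptive field is fixed, its size can be computed from the kernel sizes and strides alone. A minimal sketch using the standard recurrence; the layer stack is a hypothetical VGG-like prefix chosen only for illustration.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    """
    rf, jump = 1, 1   # jump = distance in input pixels between adjacent units
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical stack: two 3x3 convs, 2x2 pooling, two 3x3 convs, 2x2 pooling.
stack = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]
rf_size = receptive_field(stack)
```

The result depends only on the architecture, not on the image, so every output unit sees a window of the same size regardless of how large the object is.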
SLIDE 13

FCN skip architecture
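
The FCN paper fuses coarse predictions from deep layers with finer predictions from shallower layers by upsampling the coarse scores and summing the two. A minimal 1-D sketch, assuming nearest-neighbour upsampling in place of the learned upsampling layer and made-up score values:

```python
def upsample_nearest(scores, factor):
    """Repeat every score `factor` times (stand-in for a learned upsampling layer)."""
    return [v for v in scores for _ in range(factor)]


def fuse(coarse, fine, factor=2):
    """Upsample the coarse scores and add them element-wise to the finer scores."""
    upsampled = upsample_nearest(coarse, factor)
    if len(upsampled) != len(fine):
        raise ValueError("fine scores must be `factor` times longer than coarse scores")
    return [a + b for a, b in zip(upsampled, fine)]


coarse = [2.0, -1.0]               # hypothetical scores from a deep, low-resolution layer
fine = [0.5, -0.5, 0.25, 1.5]      # hypothetical scores from a shallower layer
fused = fuse(coarse, fine)
```

The fused scores keep the coarse layer's semantic evidence while the finer layer adds local detail.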

SLIDE 14

FCN Results

➢ Results on PASCAL VOC 2012

SLIDE 15

DeconvNet

Noh, H., Hong, S., & Han, B. (2015). Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1520-1528).

SLIDE 16

Pooling & Unpooling

➢ Unpooling

  • Recovers the spatial structure of the original activation map
  • The original activation size is restored, but the map is still sparse
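
A minimal sketch of pooling with recorded maximum locations and the matching unpooling step, assuming 2×2 windows with stride 2 and even input dimensions. The function names and the input values are illustrative.

```python
def max_pool_2x2(x):
    """2x2 max pooling with stride 2; also returns the (row, col) of each maximum."""
    pooled, switches = [], []
    for i in range(0, len(x), 2):
        pooled_row, switch_row = [], []
        for j in range(0, len(x[0]), 2):
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in range(2) for dj in range(2)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            switch_row.append(position)
        pooled.append(pooled_row)
        switches.append(switch_row)
    return pooled, switches


def max_unpool(pooled, switches, height, width):
    """Put each pooled value back at its recorded location; all other entries stay 0."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, switch_row in zip(pooled, switches):
        for value, (r, c) in zip(pooled_row, switch_row):
            out[r][c] = value
    return out


x = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 1],
]
pooled, switches = max_pool_2x2(x)
restored = max_unpool(pooled, switches, 4, 4)
```

`restored` has the same 4×4 size as `x`, but only the four maxima are non-zero.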
SLIDE 17

Convolution & Deconvolution

➢ Deconvolution

  • Densifies the sparse activation map
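
A minimal 1-D sketch of the densifying effect, assuming a stride-1 transposed convolution cropped to the input length and a fixed, made-up kernel (in DeconvNet the filters are learned):

```python
def deconv1d_same(x, kernel):
    """Stride-1 transposed convolution, cropped to len(x); expects an odd kernel size."""
    full = [0] * (len(x) + len(kernel) - 1)
    for i, v in enumerate(x):
        for k, w in enumerate(kernel):
            # Every input value spreads a scaled copy of the kernel around itself.
            full[i + k] += v * w
    pad = (len(kernel) - 1) // 2
    return full[pad:pad + len(x)]


sparse = [0, 4, 0, 0, 5, 0]            # e.g. one row of an unpooled activation map
dense = deconv1d_same(sparse, [1, 2, 1])

nonzero_before = sum(v != 0 for v in sparse)
nonzero_after = sum(v != 0 for v in dense)
```

Each isolated activation is spread over its neighbours, so the number of non-zero entries grows from 2 to 6 in this example.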
SLIDE 18

Visualization of activations

SLIDE 19

Results - Comparisons

SLIDE 20

PSPNet

Zhao, H., et al. (2017). Pyramid scene parsing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

SLIDE 21

Atrous Convolution

➢ Use atrous (dilated) convolution to compute features densely
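
A minimal 1-D sketch of atrous convolution, assuming no padding: the kernel taps are spaced `rate` samples apart, which widens the field of view without adding weights or striding over the input. The function name, signal and kernel are illustrative.

```python
def atrous_conv1d(x, kernel, rate):
    """1-D 'valid' atrous convolution (cross-correlation form) with dilation `rate`."""
    span = (len(kernel) - 1) * rate + 1   # input extent covered by the dilated kernel
    return [sum(x[i + k * rate] * w for k, w in enumerate(kernel))
            for i in range(len(x) - span + 1)]


x = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

standard = atrous_conv1d(x, kernel, rate=1)   # taps at i, i+1, i+2
dilated = atrous_conv1d(x, kernel, rate=2)    # taps at i, i+2, i+4
```

With `rate=2` the same three weights cover five input samples, and an output is still computed at every valid position rather than at every second one.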

SLIDE 22

PSPNet Results