Semantic segmentation



  1. Semantic segmentation. CV3DST | Prof. Leal-Taixé

  2. Task definition: semantic segmentation. Classify the main object in the image. CAT, GRASS, TREE, SKY. No objects, just classify each pixel.

  3. Semantic Segmentation - Every pixel in the image needs to be labelled with a category label. - Do not differentiate between instances (note how pixels coming from different cows are not differentiated).

  4. Fully Convolutional Networks

  5. Fully convolutional neural networks • An FCN is able to deal with any input/output size. Long, Shelhamer, Darrell - Fully Convolutional Networks for Semantic Segmentation, CVPR 2015, PAMI 2016

  6. Fully convolutional neural networks 1. Replace FC layers with convolutional layers. 2. Convert the last layer output to the original resolution. 3. Apply softmax cross-entropy between the pixelwise predictions and the segmentation ground truth. 4. Backprop and SGD. Figure label: convolutional layers.
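
Step 3 above can be illustrated with a minimal sketch of a per-pixel softmax cross-entropy, averaged over the image. It uses plain nested lists instead of a deep learning framework, and the function name `pixelwise_cross_entropy` and the `[K][H][W]` layout are assumptions made for this illustration, not something taken from the slides.

```python
import math

def pixelwise_cross_entropy(logits, target):
    """Mean softmax cross-entropy over all pixels.

    logits: nested list [K][H][W] of class scores (K classes).
    target: nested list [H][W] of ground-truth class indices.
    """
    num_classes = len(logits)
    height, width = len(target), len(target[0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            scores = [logits[k][y][x] for k in range(num_classes)]
            # Subtract the max before exponentiating for numerical stability.
            m = max(scores)
            log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
            # This is -log(softmax(scores)[true class]) for the current pixel.
            total += log_sum_exp - scores[target[y][x]]
    return total / (height * width)

# Toy example: 2 classes, a 1x2 image, identical scores for both classes.
logits = [[[0.0, 0.0]], [[0.0, 0.0]]]
target = [[0, 1]]
loss = pixelwise_cross_entropy(logits, target)
```

With identical scores for both classes, every pixel contributes log 2, so the mean loss is log 2 as well.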

  7. “Convolutionalization” 1x1 convolutions!

  8. “Convolutionalization” See a more detailed explanation in this Quora answer.
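
The idea of convolutionalization can be sketched as follows: a fully connected layer that acts on a feature vector is the same computation as a 1x1 convolution applied at every spatial location. The helper names `fully_connected` and `conv1x1` are hypothetical, and the sketch only covers the case where the dense layer sees one feature vector per location.

```python
def fully_connected(weights, bias, features):
    """Dense layer: weights [out][in], bias [out], features [in]."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]

def conv1x1(weights, bias, feature_map):
    """1x1 convolution: the same dense weights applied at every location.

    feature_map: [in][H][W]; returns [out][H][W].
    """
    in_ch = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in weights]
    for y in range(height):
        for x in range(width):
            pixel = [feature_map[c][y][x] for c in range(in_ch)]
            for o, score in enumerate(fully_connected(weights, bias, pixel)):
                out[o][y][x] = score
    return out

weights = [[1.0, 2.0], [0.5, -1.0]]   # 2 outputs, 2 inputs
bias = [0.0, 1.0]
# A 2-channel 1x1 "image" reproduces the dense layer exactly ...
single = conv1x1(weights, bias, [[[3.0]], [[4.0]]])
# ... and a larger input simply yields a larger score map.
larger = conv1x1(weights, bias, [[[3.0, 1.0]], [[4.0, 0.0]]])
```

Because the weights are shared across locations, the same layer accepts inputs of any spatial size, which is what lets an FCN deal with arbitrary input and output sizes.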

  9. Semantic Segmentation (FCN) Fully Convolutional Networks for Semantic Segmentation • How do we upsample? Long, Shelhamer, Darrell - Fully Convolutional Networks for Semantic Segmentation, CVPR 2015, PAMI 2016

  10. Network's architecture Predict the segmentation mask from high-level features.

  11. Network's architecture Predict the segmentation mask from high-level features. Predict the segmentation mask from mid-level features.

  12. Network's architecture Predict the segmentation mask from high-level features. Predict the segmentation mask from mid-level features. Predict the segmentation mask from low-level features.

  13. Network's architecture Hierarchical training: the network is initially trained based only on high-level features and then fine-tuned based on mid- and low-level features.

  14. Network's architecture This is important because it allows the network to also learn the mid- and low-level details of the image, in addition to the high-level ones.
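
One way to picture how predictions from different feature levels are combined is the fusion described in the FCN paper: a coarse score map is upsampled and added element-wise to a finer one. The sketch below assumes nearest-neighbour 2x upsampling purely for brevity (the paper uses learned upsampling), and the function names are hypothetical.

```python
def upsample2x(score_map):
    """Nearest-neighbour 2x upsampling of an [H][W] score map."""
    return [[score_map[y // 2][x // 2] for x in range(2 * len(score_map[0]))]
            for y in range(2 * len(score_map))]

def fuse_scores(coarse, fine):
    """Upsample the coarse (high-level) scores and add the finer scores."""
    upsampled = upsample2x(coarse)
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(upsampled, fine)]

coarse = [[1.0]]                        # 1x1 score map from high-level features
fine = [[0.5, -0.5], [0.25, 0.0]]       # 2x2 score map from mid-level features
fused = fuse_scores(coarse, fine)
```

The coarse map contributes the overall class evidence, while the finer map adjusts it per location.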

  15. Qualitative results. Figure labels: good, better, best.

  16. Qualitative results SDS is an R-CNN-based method, i.e., it uses object proposals. In general, FCN significantly outperforms pre-deep-learning and quasi-deep-learning methods, both qualitatively and quantitatively, and is recognized as the AlexNet of semantic segmentation.

  17. Autoencoder-style architecture

  18. SegNet • Step-wise upsampling Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016

  19. SegNet • Encoder: normal convolutional filters + pooling • Decoder: upsampling + convolutional filters Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016

  21. SegNet • Encoder: normal convolutional filters + pooling • Decoder: upsampling + convolutional filters • The convolutional filters in the decoder are learned using backprop, and their goal is to refine the upsampling. Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016

  22. Transposed convolution • Transposed convolution - Unpooling - Convolution filter (learned) - Also called up-convolution (never deconvolution). Figure: input 3x3, output 5x5.
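
A transposed convolution can be sketched as a scatter operation: every input value writes a scaled copy of the kernel into the output, and overlapping contributions are summed. The slide only states the 3x3 input and 5x5 output sizes, so the two settings below (kernel 3 with stride 1 and no padding, and kernel 3 with stride 2 and padding 1) are assumptions that both happen to produce that shape. The all-ones kernel stands in for a learned filter.

```python
def transposed_conv2d(inp, kernel, stride=1, padding=0):
    """Transposed 2D convolution on nested lists (single channel)."""
    in_h, in_w = len(inp), len(inp[0])
    k_h, k_w = len(kernel), len(kernel[0])
    full_h = (in_h - 1) * stride + k_h
    full_w = (in_w - 1) * stride + k_w
    full = [[0.0] * full_w for _ in range(full_h)]
    for i in range(in_h):
        for j in range(in_w):
            for a in range(k_h):
                for b in range(k_w):
                    # Scatter a scaled copy of the kernel; overlaps add up.
                    full[i * stride + a][j * stride + b] += inp[i][j] * kernel[a][b]
    # Padding in a transposed convolution crops the border of the full output.
    return [row[padding:full_w - padding]
            for row in full[padding:full_h - padding]]

ones = [[1.0] * 3 for _ in range(3)]   # 3x3 input and 3x3 stand-in kernel
out_stride1 = transposed_conv2d(ones, ones, stride=1, padding=0)
out_stride2 = transposed_conv2d(ones, ones, stride=2, padding=1)
```

In both cases the output size follows (input - 1) * stride - 2 * padding + kernel, which gives 5 for a 3x3 input here.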

  23. SegNet • Encoder: normal convolutional filters + pooling • Decoder: upsampling + convolutional filters • Softmax layer: the output of the soft-max classifier is a K-channel image of probabilities, where K is the number of classes. Badrinarayanan et al. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”. TPAMI 2016
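
The K-channel probability image can be made concrete with a small sketch: a softmax over the class axis at every pixel, followed by picking the most probable class. The `[K][H][W]` nested-list layout and the function names are assumptions for illustration only.

```python
import math

def softmax_over_classes(logits):
    """Turn [K][H][W] class scores into [K][H][W] per-pixel probabilities."""
    num_classes = len(logits)
    height, width = len(logits[0]), len(logits[0][0])
    probs = [[[0.0] * width for _ in range(height)] for _ in range(num_classes)]
    for y in range(height):
        for x in range(width):
            scores = [logits[k][y][x] for k in range(num_classes)]
            m = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - m) for s in scores]
            total = sum(exps)
            for k in range(num_classes):
                probs[k][y][x] = exps[k] / total
    return probs

def predict_labels(probs):
    """Per-pixel label: index of the most probable class."""
    num_classes = len(probs)
    return [[max(range(num_classes), key=lambda k: probs[k][y][x])
             for x in range(len(probs[0][0]))]
            for y in range(len(probs[0]))]

logits = [[[2.0, 0.0]], [[0.0, 0.0]], [[0.0, 3.0]]]   # K = 3 classes, 1x2 image
probs = softmax_over_classes(logits)
labels = predict_labels(probs)
```

At each pixel the K probabilities sum to one, and the label map keeps only the index of the largest.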

  24. Upsampling

  25. Types of upsampling • 1. Interpolation?

  26. Types of upsampling • 1. Interpolation. Figure panels: original image, nearest neighbor interpolation, bilinear interpolation, bicubic interpolation. Image: Michael Guerzhoy

  27. Types of upsampling • 1. Interpolation: few artifacts
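
Two of the interpolation schemes named above can be sketched in a few lines. The bilinear version assumes that the corner pixels of input and output are aligned and that both are at least 2x2; other coordinate conventions exist, so treat this as one possible variant rather than the one used in the figure.

```python
def upsample_nearest(image, factor):
    """Nearest-neighbour upsampling: repeat each pixel factor x factor times."""
    return [[image[y // factor][x // factor]
             for x in range(len(image[0]) * factor)]
            for y in range(len(image) * factor)]

def upsample_bilinear(image, out_h, out_w):
    """Bilinear upsampling with aligned corners (sizes must be >= 2)."""
    in_h, in_w = len(image), len(image[0])
    out = []
    for y in range(out_h):
        src_y = y * (in_h - 1) / (out_h - 1)   # position in the input grid
        y0 = min(int(src_y), in_h - 2)
        dy = src_y - y0
        row = []
        for x in range(out_w):
            src_x = x * (in_w - 1) / (out_w - 1)
            x0 = min(int(src_x), in_w - 2)
            dx = src_x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = image[y0][x0] * (1 - dx) + image[y0][x0 + 1] * dx
            bottom = image[y0 + 1][x0] * (1 - dx) + image[y0 + 1][x0 + 1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out

image = [[0.0, 2.0], [4.0, 6.0]]
nearest = upsample_nearest(image, 2)
bilinear = upsample_bilinear(image, 3, 3)
```

Nearest-neighbour output is blocky because values are copied, while the bilinear output ramps smoothly between the original pixels.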

  28. Types of upsampling • 2. Fixed unpooling + convs (efficient). A. Dosovitskiy, “Learning to Generate Chairs, Tables and Cars with Convolutional Networks”. TPAMI 2017
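
A minimal sketch of fixed unpooling, assuming the common variant in which each value is written to the top-left corner of a 2x2 block and the remaining entries are zero. The function name is hypothetical, and the convolutions that would follow to fill in the zeros are omitted.

```python
def fixed_unpool_2x2(feature_map):
    """Place each value in the top-left corner of a 2x2 block of zeros."""
    height, width = len(feature_map), len(feature_map[0])
    out = [[0.0] * (2 * width) for _ in range(2 * height)]
    for y in range(height):
        for x in range(width):
            out[2 * y][2 * x] = feature_map[y][x]
    return out

unpooled = fixed_unpool_2x2([[1.0, 2.0], [3.0, 4.0]])
```

The placement is fixed in advance, so no extra information has to be stored from the encoder.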

  29. Types of upsampling • 3. Unpooling “à la DeconvNet”: keep the locations where the max came from. Zeiler and Fergus. “Visualizing and understanding convolutional neural networks”. ECCV 2014

  30. Types of upsampling • 3. Unpooling “à la DeconvNet”: keep the details of the structures
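
Unpooling that keeps the locations of the maxima can be sketched as a pair of functions: pooling records where each maximum came from, and unpooling writes the values back to exactly those positions. The names are hypothetical, and the sketch assumes a single channel with even height and width.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling that also returns the location of each maximum."""
    pooled, locations = [], []
    for y in range(0, len(feature_map), 2):
        pooled_row, loc_row = [], []
        for x in range(0, len(feature_map[0]), 2):
            window = [(y + dy, x + dx) for dy in range(2) for dx in range(2)]
            best = max(window, key=lambda pos: feature_map[pos[0]][pos[1]])
            pooled_row.append(feature_map[best[0]][best[1]])
            loc_row.append(best)
        pooled.append(pooled_row)
        locations.append(loc_row)
    return pooled, locations

def max_unpool_2x2(pooled, locations, height, width):
    """Write each pooled value back to the position its maximum came from."""
    out = [[0.0] * width for _ in range(height)]
    for pooled_row, loc_row in zip(pooled, locations):
        for value, (y, x) in zip(pooled_row, loc_row):
            out[y][x] = value
    return out

feature_map = [[1.0, 2.0, 0.0, 0.0],
               [3.0, 4.0, 0.0, 5.0],
               [0.0, 0.0, 9.0, 8.0],
               [7.0, 0.0, 6.0, 0.0]]
pooled, locations = max_pool_2x2(feature_map)
restored = max_unpool_2x2(pooled, locations, 4, 4)
```

Compared with fixed unpooling, the restored map keeps each strong response at its original position, which helps preserve fine structures.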

  31. Skip connections (U-Net)

  32. Skip Connections • U-Net. Figure labels: pass the low-level information; high-level information. Recall ResNet. O. Ronneberger et al. “U-Net: Convolutional Networks for Biomedical Image Segmentation”. MICCAI 2015

  33. Skip Connections • U-Net: zoom in. Figure label: append. O. Ronneberger et al. “U-Net: Convolutional Networks for Biomedical Image Segmentation”. MICCAI 2015

  34. Skip Connections • Concatenation connections. C. Hazirbas et al. “Deep depth from focus”. ACCV 2018
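
A concatenation skip connection can be sketched as appending the encoder's feature channels to the decoder's channels at the same resolution. The `[C][H][W]` nested-list layout and the name `concat_channels` are assumptions, and any cropping or resizing needed to make the spatial sizes match is left out.

```python
def concat_channels(encoder_features, decoder_features):
    """Concatenate two [C][H][W] feature maps along the channel axis."""
    enc_size = (len(encoder_features[0]), len(encoder_features[0][0]))
    dec_size = (len(decoder_features[0]), len(decoder_features[0][0]))
    if enc_size != dec_size:
        raise ValueError("spatial sizes must match before concatenation")
    return encoder_features + decoder_features

# 1 low-level channel from the encoder, 2 high-level channels from the decoder.
encoder = [[[1.0, 2.0], [3.0, 4.0]]]
decoder = [[[0.0, 0.0], [0.0, 0.0]], [[5.0, 5.0], [5.0, 5.0]]]
merged = concat_channels(encoder, decoder)
```

The convolutions that follow see both sets of channels and can learn how to combine them, in contrast to a ResNet-style skip connection, which adds the two tensors.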

  35. DeepLab

  37. Semantic Segmentation: 3 challenges • Reduced feature resolution – Proposed solution: atrous convolutions • Objects exist at multiple scales – Proposed solution: pyramid pooling, as in detection • Poor localization of the edges – Proposed solution: refinement with a Conditional Random Field (CRF)

  39. Wish: no reduced feature resolution. Fully Convolutional Network with just convs and activations: pixels in (width x height x RGB) -> conv -> conv -> conv -> conv -> pixels out (width x height x classes). Super expensive!

  40. Alternative: Dilated (atrous) convolutions. Sparse feature extraction with standard convolution on a low-resolution input feature map. Dense feature extraction with atrous convolution with rate r = 2, applied on a high-resolution input feature map.

  42. Dilated (atrous) convolutions, 1D. (a) Sparse feature extraction with standard convolution on a low-resolution input feature map. (b) Dense feature extraction with atrous convolution with rate r = 2, applied on a high-resolution input feature map.
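
The contrast between (a) and (b) can be checked numerically with a small 1D sketch. The function name `atrous_conv1d`, the toy signal, and the kernel are made up for illustration. As in deep learning libraries, "convolution" here means cross-correlation without padding.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """1D convolution whose kernel taps are spaced `rate` samples apart."""
    span = (len(kernel) - 1) * rate + 1   # extent covered by the dilated kernel
    return [sum(kernel[k] * signal[i + k * rate] for k in range(len(kernel)))
            for i in range(len(signal) - span + 1)]

signal = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
kernel = [1.0, 0.0, -1.0]

# (a) Standard convolution on a signal downsampled by 2: sparse output.
sparse = atrous_conv1d(signal[::2], kernel, rate=1)
# (b) Atrous convolution with rate 2 on the full-resolution signal: dense output.
dense = atrous_conv1d(signal, kernel, rate=2)
```

Every second value of the dense output equals the sparse output, and the values in between are the extra responses gained by staying at full resolution.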

  43. Dilated (atrous) convolutions in 2D. An analogy for dilated convolution is a conv filter with holes. Standard conv has dilation 1. Figure labels: input, output. class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=2) class torch.nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=2)
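
The effect of the `dilation` argument on the footprint of a filter can be worked out with two small helpers. They are hypothetical names, not part of PyTorch. The output-size expression follows the formula given in the `torch.nn.Conv2d` documentation, applied to one spatial dimension.

```python
def effective_kernel_size(kernel_size, dilation):
    """Extent covered by a kernel once holes are inserted between its taps."""
    return dilation * (kernel_size - 1) + 1

def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

footprint = effective_kernel_size(3, 2)                   # 3x3 kernel, dilation 2
standard_out = conv_output_size(7, 3)                     # dilation 1
dilated_out = conv_output_size(7, 3, dilation=2)          # no padding
same_size_out = conv_output_size(7, 3, padding=2, dilation=2)
```

A 3x3 kernel with dilation 2 covers a 5x5 window while still using nine weights, and a padding of 2 keeps the output the same size as the input in that setting.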
