SLIDE 1

LEARNING AFFINITY VIA SPATIAL PROPAGATION NETWORK

Sifei Liu, Shalini De Mello, Jinwei Gu, Guangyu Zhong, Ming-Hsuan Yang, Jan Kautz NVIDIA Research, March 27, 2018

SLIDE 2

WHAT IS AFFINITY?

  • The pairwise relation between two pixels/regions
  • A typical model combines pixel locations and intensities:

$w_{ij} = K_G(i, j) \cdot K_I(I_i, I_j)$

where $K_G$ measures geometric closeness and $K_I$ measures intensity closeness; both are manually designed kernel functions (e.g., Gaussians).
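To make the kernel form concrete, here is a minimal sketch (ours, not from the slides) of such a hand-crafted affinity with Gaussian kernels, as used in bilateral filtering and dense CRFs; the sigma values are illustrative:

```python
import numpy as np

def bilateral_affinity(p_i, p_j, I_i, I_j, sigma_s=3.0, sigma_r=0.1):
    """Hand-crafted affinity w_ij = K_G(i, j) * K_I(I_i, I_j):
    a geometric-closeness kernel times an intensity-closeness kernel,
    both Gaussian (illustrative sigmas)."""
    p_i, p_j = np.asarray(p_i, float), np.asarray(p_j, float)
    I_i, I_j = np.asarray(I_i, float), np.asarray(I_j, float)
    k_g = np.exp(-np.sum((p_i - p_j) ** 2) / (2 * sigma_s ** 2))  # K_G: locations
    k_i = np.exp(-np.sum((I_i - I_j) ** 2) / (2 * sigma_r ** 2))  # K_I: intensities
    return k_g * k_i

# Nearby, similar pixels get high affinity; distant or dissimilar pixels do not.
print(bilateral_affinity((10, 10), (11, 10), 0.50, 0.52))  # close in space and intensity
print(bilateral_affinity((10, 10), (40, 10), 0.50, 0.52))  # far apart: affinity ~ 0
```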

SLIDE 3

WHY AFFINITY?

A semantic-level example

[Figure: an image and its semantic labels (nose, hair, eye), with pixel pairs that are spatially adjacent; far away but with similar texture; and far away with different shape and appearance.]

$w_{ij} = K_G(i, j) \cdot K_I(I_i, I_j)$

SLIDE 4

[Figure: an image patch processed by two models.]

  • a CNN: deep; pixels are independent
  • a propagation network: shallow; pixels are correlated


SLIDE 6

HOW TO USE IT?

[Figure: CNN-based semantic segmentation: image → CNN → pixel-wise probability → softmax → segmentation.]

  • Standard CNN-based image segmentation does not explicitly model the pairwise relations of pixels.

SLIDE 7

WHY AFFINITY?

[Figure: image, CNN-based segmentation, our result.]

Refine the segmentation

SLIDE 8

WHY AFFINITY?

[Figure: image, CNN-based segmentation, our result.]

Improve the context

SLIDE 9

PROPOSED ARCHITECTURE

[Figure: a guidance network (deep CNN) maps the RGB image to the affinity $w_t$; the SPN takes the coarse mask through conv layers and outputs the refined result.]

SLIDE 10

PROPOSED ARCHITECTURE

Spatial propagation networks (SPN)

[Figure: coarse mask → conv → … → conv → refined result.]

SLIDE 11

SPATIAL PROPAGATION NETWORK

ℎ" = $ − &" '" + )"ℎ"*+, . ∈ 2, 1

  • ℎ": the ."2 column in the SPN hidden layer
  • )": an 1×1 transform matrix between t-1 and t
  • &": the degree matrix normalizing the response of ℎ"
  • &" 4, 4 = ∑

)" 6,7

8 79+,7:6

Row/column-wise linear propagation

[Figure: left-to-right propagation from $h_{t-1}$ to $h_t$.]
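A minimal numpy sketch of this one-directional pass (ours, not the authors' code), assuming a full $n \times n$ transform matrix per column:

```python
import numpy as np

def spn_left_to_right(X, W):
    """Row/column-wise linear propagation h_t = (I - d_t) x_t + w_t h_{t-1}.
    X: (n, T) feature map with columns x_1..x_T of height n.
    W: (T, n, n) transform matrices; W[t] links column t-1 to column t.
    d_t is diagonal with d_t(i, i) = sum_{j != i} w_t(i, j)."""
    n, T = X.shape
    H = np.zeros_like(X, dtype=float)
    H[:, 0] = X[:, 0]                              # h_1 = x_1
    I = np.eye(n)
    for t in range(1, T):
        w = W[t]
        d = np.diag(w.sum(axis=1) - np.diag(w))    # degree matrix of w_t
        H[:, t] = (I - d) @ X[:, t] + w @ H[:, t - 1]
    return H

rng = np.random.default_rng(0)
X = rng.random((8, 16))
W = rng.random((16, 8, 8)) * 0.1   # small weights keep the recursion stable
print(spn_left_to_right(X, W).shape)   # (8, 16)
```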

SLIDE 12

SPATIAL PROPAGATION NETWORK

  • ℎ" = $",
  • ℎ& = ' − )& $& + +&ℎ",
  • ℎ, = ' − ), $, + +,ℎ,-&,
  • ℎ. = ' − ). $. + +.ℎ.-&,

Row/column-wise linear propagation

… …

  • /0: the span of ℎ&-. 23 ℎ&, ℎ4, … , ℎ.

6

  • 70: the span of $&-. 23 $&, $4, … , $.

6

  • 8: an 9×9 9 = ;4 matrix
  • <, = ' − ),

Affinity: the off-diagonal entities of G
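Unrolling the recursion makes $G$ explicit: block $(t, s)$ of $G$ is $w_t w_{t-1} \cdots w_{s+1} \lambda_s$ for $s < t$ and $\lambda_t$ on the diagonal. The sketch below (ours; it reuses `spn_left_to_right` from the previous snippet) builds $G$ block by block and checks it against the recursive pass:

```python
import numpy as np

def unroll_to_G(W):
    """Build G (shape (n*T, n*T)) such that vec(H) = G vec(X), by unrolling
    h_t = lam_t x_t + w_t h_{t-1} with lam_t = I - d_t and lam_1 = I."""
    T, n, _ = W.shape
    I = np.eye(n)
    lam = [I] + [I - np.diag(W[t].sum(axis=1) - np.diag(W[t])) for t in range(1, T)]
    G = np.zeros((n * T, n * T))
    for t in range(T):
        G[t*n:(t+1)*n, t*n:(t+1)*n] = lam[t]       # diagonal block: lam_t
        P = np.eye(n)
        for s in range(t - 1, -1, -1):             # accumulate w_t ... w_{s+1}
            P = P @ W[s + 1]
            G[t*n:(t+1)*n, s*n:(s+1)*n] = P @ lam[s]
    return G

# Sanity check: the unrolled linear map matches the recursive propagation.
rng = np.random.default_rng(1)
n, T = 4, 5
X = rng.random((n, T))
W = rng.random((T, n, n)) * 0.1
H = (unroll_to_G(W) @ X.T.ravel()).reshape(T, n).T   # vec(X) stacks the columns
assert np.allclose(H, spn_left_to_right(X, W))
# The learned pairwise affinities are the off-diagonal entries of G.
```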

SLIDE 13

SPATIAL PROPAGATION NETWORK

Advantage: a compact representation

Affinity: the off-diagonal entries of $G$

  • Parameters:
  • Dense affinity (dense CRF): $n^2 \times n^2$
  • Affinity in SPN (for all $w_t$): $n^2 \times n$

learnable affinity matrix = learn all $w_1, w_2, \dots, w_N$
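For a concrete sense of scale (our arithmetic, assuming a 128×128 image, so $n^2$ = 16,384 pixels):

```python
# Dense CRF: one weight per pixel pair vs. SPN: n weights per pixel (all w_t combined).
n = 128
dense_crf_entries = (n * n) ** 2    # n^2 x n^2 = 268,435,456 pairwise terms
spn_entries = (n * n) * n           # n^2 x n  =   2,097,152 learned weights
print(f"dense CRF: {dense_crf_entries:,}  SPN: {spn_entries:,}")
```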

SLIDE 14

LEARNING THE AFFINITY MATRIX

  • Basic idea: learn all the entries of $w_t$ w.r.t. the 4 directions
  • Disadvantage: huge dimension of the network output (number of channels $= n$, one weight for every pixel of the previous row/column, per direction)

[Figure: image → CNN → a fully-connected spatial propagation over $h_t, h_{t-1}, h_{t-2}, \dots$]

$\{w_1, w_2, \dots, w_N\}$ learned for each of the four directions: $\rightarrow$, $\leftarrow$, $\uparrow$, $\downarrow$

SLIDE 15
REDUCING CONNECTIONS

Three-way connection

  • Each pixel is connected to 3 adjacent pixels in the previous row/column
  • Tridiagonal transform matrix $w_t$

ℎ$." = 1 − ) *$,"

$∈-

.$," + ) *$,"ℎ$,"01

$∈-

where 2 = 3 ℎ" = 1 − 4" ." + !"ℎ"01, 5 ∈ 2, 7

!" =

ℎ" ℎ"01 ℎ" ℎ"01

  • For each pixel, only 3 scalar weights need to be learned for each direction.
  • The total CNN output is: $n^2 \times 12$ (3 weights $\times$ 4 directions per pixel).
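A direct, unvectorized sketch of the three-way update (ours; the weight layout `P[k, t, offset]` for offsets $-1, 0, +1$ is our assumption):

```python
import numpy as np

def spn_three_way(X, P):
    """Three-way propagation: pixel k of column t receives from pixels
    k-1, k, k+1 of column t-1, i.e. w_t is tridiagonal.
    X: (n, T) map; P: (n, T, 3) scalar weights for offsets (-1, 0, +1)."""
    n, T = X.shape
    H = np.zeros_like(X, dtype=float)
    H[:, 0] = X[:, 0]
    for t in range(1, T):
        for k in range(n):
            acc = wsum = 0.0
            for o, off in enumerate((-1, 0, 1)):
                i = k + off
                if 0 <= i < n:                      # skip out-of-image neighbors
                    acc += P[k, t, o] * H[i, t - 1]
                    wsum += P[k, t, o]
            # h_{k,t} = (1 - sum_i p_{i,t}) x_{k,t} + sum_i p_{i,t} h_{i,t-1}
            H[k, t] = (1.0 - wsum) * X[k, t] + acc
    return H
```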

SLIDE 16

PROPOSED ARCHITECTURE

Learnable affinity through a guidance network (CNN)

[Figure: the guidance network (deep CNN) maps the RGB image to the affinity weights $w_t$ (3 scalars per pixel per direction) that feed the SPN.]

SLIDE 17
REDUCING CONNECTIONS

Three-way connection

  • The integration of 4 directions results in dense connections between all pixels.

SLIDE 18

IMAGE SEGMENTATION REFINEMENT

  • Helen dataset: high-resolution face parsing with 11 classes
  • VOC 2012 dataset: general object semantic segmentation with 21 classes

SLIDE 19

A TYPICAL CNN IMAGE SEGMENTATION

[Figure: a fully convolutional network applied to an RGB image.]

Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully convolutional networks for semantic segmentation. CVPR 2015.

SLIDE 20

SPATIAL PROPAGATION NETWORK

[Figure: the coarse mask (128×128×1) is downsampled by 4×4×16 convolutions with stride 2 to 64×64×16, propagated by the SPN with node-wise max-pooling over the directions, and upsampled back to the 128×128×1 refined result.]

  • Input and output: the probability map of all classes (we show one class as an example).
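The four directional passes are combined by the node-wise max-pooling mentioned above. This sketch (ours; it reuses `spn_three_way` from the earlier snippet and assumes a square map so the transposed passes share shapes) flips and transposes the input to reuse the single left-to-right routine:

```python
import numpy as np

def integrate_directions(X, P4):
    """Run the three-way SPN in all four directions and merge the results
    with a node-wise (element-wise) max. X: (n, n) map; P4: (4, n, n, 3)."""
    passes = [
        spn_three_way(X, P4[0]),                              # left -> right
        np.fliplr(spn_three_way(np.fliplr(X), P4[1])),        # right -> left
        spn_three_way(X.T, P4[2]).T,                          # top -> bottom
        np.flipud(spn_three_way(np.flipud(X).T, P4[3]).T),    # bottom -> top
    ]
    return np.maximum.reduce(passes)

rng = np.random.default_rng(2)
X = rng.random((64, 64))              # one class's coarse probability map
P4 = rng.random((4, 64, 64, 3)) * 0.3
print(integrate_directions(X, P4).shape)   # (64, 64)
```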
SLIDE 21

GUIDANCE NETWORK

[Figure: a symmetric CNN with skip links maps the RGB image to $w_t$ of one direction (64×64×16), fed to the propagation module.]

  • Helen: a relatively small network, learned from scratch.
  • VOC: VGG-16 conv1~pool5 with pre-trained weights; symmetric upsampling layers learned from scratch.

SLIDE 22

IMPLEMENTATION

  • Baseline CNN
  • Helen: we train a CNN-base network whose output is 1/2 the size of the input image.
  • VOC: we directly use FCN-8s.
  • Train a universal SPN
  • Coarse mask: segmentation results on the training set by CNN-base.
  • Guidance network: an independent deep CNN.
  • Test on any base network (for VOC only)
  • We directly apply the SPN to any CNN-base network (e.g., DeepLab VGG-16 and ResNet-101).
SLIDE 23

HELEN FACE PARSING

  • CNN-base: the baseline CNN network
  • SPN: the three-way model, with the two different connections and the same guidance network

[Figure: Original, CNN-base, CNN+[1], Ours.]

[1] Sifei Liu, Jinshan Pan and Ming-Hsuan Yang. Learning recursive filters for low-level vision via a hybrid neural network. ECCV 2016.


SLIDE 25

QUANTITATIVE EVALUATION

f-score           skin   brows  eyes   nose   mouth  lip_upper  lip_lower  lip_inner  overall
Multi-obj [1]     90.87  69.89  74.74  90.23  82.07  59.22      66.30      81.70      83.68
CNN base          90.53  70.09  74.86  89.16  83.83  55.61      64.88      71.27      82.89
CNN Highres       91.78  71.84  74.46  89.42  81.83  68.15      72.00      71.95      83.21
CNN + [2]         92.26  75.05  85.44  91.51  88.13  77.61      70.81      79.95      87.09
CNN + SPN (ours)  93.10  78.53  87.71  92.62  91.08  80.17      71.53      83.13      89.30

[1] Sifei Liu, Jimei Yang, Chang Huang and Ming-Hsuan Yang. Multi-objective convolutional learning for face labeling. CVPR 2015.
[2] Sifei Liu, Jinshan Pan and Ming-Hsuan Yang. Learning recursive filters for low-level vision via a hybrid neural network. ECCV 2016.

SLIDE 26

VOC2012 OBJECT SEGMENTATION

  • Segmentation networks
  • FCN-8s
  • DeepLab VGG-16
  • DeepLab ResNet-101
  • Refinement models
  • Dense CRF
  • SPN (trained on FCN-8s)

Experimental comparison

[Figure: a universal SPN refines FCN, DeepLab VGG-16 and DeepLab ResNet-101.]

SLIDE 27

VOC2012 OBJECT SEGMENTATION

  • CNN base: ResNet-101 with dilated convolution (DeepLab pre-trained model)
  • Dense CRF: CNN base + Dense CRF
  • SPN: pre-trained CNN base + SPN

[Figure: Original, CNN base, Dense CRF, SPN.]


SLIDE 29

VOC2012 OBJECT SEGMENTATION

SLIDE 30

VOC2012 OBJECT SEGMENTATION

Quantitative results on the validation set

model       F      F+[2]  F+SPN  V      V+[2]  V+SPN  R      R+[2]  R+SPN
overall AC  91.22  90.64  92.90  92.61  92.16  93.83  94.63  94.12  95.49
mean AC     77.61  70.64  79.49  80.97  73.53  83.15  84.16  77.46  86.09
mean IoU    65.51  60.95  69.86  68.97  64.42  73.12  76.46  72.02  79.76

  • F: FCN-8s
  • V: VGG-16
  • R: ResNet-101

[2] Sifei Liu, Jinshan Pan and Ming-Hsuan Yang. Learning recursive filters for low-level vision via a hybrid neural network. ECCV 2016

SLIDE 31

VOC2012 OBJECT SEGMENTATION

Quantitative results on the validation set

mean IoU           CNN base  +Dense CRF  +SPN
VGG-16 (val)       68.97     71.57       73.12
ResNet-101 (val)   76.40     77.69       79.76
ResNet-101 (test)  -         79.70       80.22

[1] Liang-Chieh Chen*, George Papandreou*, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. arXiv preprint, 2016.
[2] Sifei Liu, Jinshan Pan and Ming-Hsuan Yang. Learning recursive filters for low-level vision via a hybrid neural network. ECCV 2016.

SLIDE 32

RUNTIME ANALYSIS (VOC MODEL)

Computational efficiency on 512×512 images

method          runtime
Dense CRF [1]   ~1 s
Dense CRF [2]   3.2 s
CRF as RNN [3]  4.4 s
Ours            0.08 s

[1] Philipp Krähenbühl, Vladlen Koltun. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. NIPS 2011.
[2] Liang-Chieh Chen*, George Papandreou*, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. arXiv preprint arXiv:1606.00915, 2016.
[3] Shuai Zheng*, Sadeep Jayasumana*, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, and Philip H. S. Torr. Conditional Random Fields as Recurrent Neural Networks. ICCV 2015.

SLIDE 33

RUNTIME ANALYSIS

Spatial propagation network

[Figure: runtime breakdown. The guidance network (VGG-16; conv1~conv3) maps the image to the guidance in 57 ms; the coarse mask passes through SPN1 and SPN2 (1.3 ms each) at 64×64×32, followed by bilinear upsampling to the refined mask.]

SLIDE 34

CONCLUSIVE COMPARISON

Graphical model (dense CRF) vs. SPN

                         Dense CRF [1]             SPN
computation              iterative convolution     propagation
learning/inference       several iterations        one pass
pairwise connection      manually designed kernel  end-to-end learned weights
improvement (on VGG-16)  3.77%                     6.02%
run-time (512×512)       3.2 s                     0.08 s

[1] Liang-Chieh Chen*, George Papandreou*, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. arXiv preprint, 2016

SLIDE 35

TEMPORAL PROPAGATION NETWORK

Extension to the spatial-temporal domain

" # # + % propagated

  • riginal video

propagated k k+10 k+20 color propagation

  • riginal video

propagated HDR propagation

…… ……

key- frame property guided informatio n

SLIDE 36
  • Input: a colored key-frame + the gray-scale frames in between key-frames
  • Output: colored sequences (K = 30)

SLIDE 37
  • Input: HDR key-frames + LDR frames (we only show LDR here)
  • Output: HDR sequences (K = 20)

SLIDE 38

[Figure: gray-scale video, a key-frame, and its sparse annotation (automatically generated from superpixels).]

  • We use the SPN to propagate the very sparse annotation, generating the following color images.
  • During training, we sample one pixel per superpixel at random, where 300 superpixels are produced from a 256×512 gray-scale image.
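A small sketch of this sampling step (ours; it assumes a precomputed superpixel label map, e.g., from SLIC):

```python
import numpy as np

def sample_sparse_annotation(labels, rng=None):
    """Keep exactly one randomly chosen pixel per superpixel.
    labels: (H, W) integer superpixel label map (e.g., ~300 SLIC segments
    on a 256 x 512 gray-scale frame). Returns a boolean annotation mask."""
    rng = rng or np.random.default_rng()
    mask = np.zeros(labels.shape, dtype=bool)
    for lab in np.unique(labels):
        ys, xs = np.nonzero(labels == lab)
        pick = rng.integers(len(ys))       # one random pixel of this superpixel
        mask[ys[pick], xs[pick]] = True
    return mask
```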

SLIDE 39

[Figure: gray-scale video, a key-frame, sparse annotation; results at the 0th, 20th and 200th frames, by SPN and by TPN.]

Our propagation-network-based method (SPN + TPN) performs image-edit propagation as a full pipeline.

SLIDE 40

THANK YOU