
Recurrent Transformer Networks for Semantic Correspondence



  1. Neural Information Processing Systems (NeurIPS) 2018. Recurrent Transformer Networks for Semantic Correspondence. Seungryong Kim 1, Stephen Lin 2, Sangryul Jeon 1, Dongbo Min 3, Kwanghoon Sohn 1. Dec. 05, 2018

  2. Introduction: Semantic Correspondence • Establishing dense correspondences between semantically similar images, i.e., different instances within the same object or scene categories • For example, the wheels of two different cars, or the bodies of people or animals

  3. Introduction: Challenges in Semantic Correspondence • Photometric deformations: intra-class appearance and attribute variations, etc. • Geometric deformations: different viewpoint or baseline, non-rigid shape deformations, etc. • Lack of supervision: labor-intensive annotation, degraded by subjectivity, etc.

  4. Problem Formulation: Objective • The goal is a locally-varying affine transformation field $T_i = [A_i, f_i]$ that assigns each pixel $i$ an affine matrix $A_i$ and a translation $f_i$, warping it to its correspondence $i' = T_i\,i$ • How can such locally-varying affine transformation fields be estimated without ground-truth supervision?
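To make the formulation concrete, here is a minimal NumPy sketch (not the authors' code) of a locally-varying affine field: every pixel carries its own $A_i$ and $f_i$, initialized to the identity, and its coordinate is warped to $i' = A_i i + f_i$. Array shapes are illustrative assumptions.

```python
import numpy as np

# Sketch only: a locally-varying affine transformation field T_i = [A_i, f_i].
# Each pixel i has its own 2x2 affine matrix A_i and 2D translation f_i,
# and is warped to i' = A_i @ i + f_i.  Shapes are illustrative assumptions.
H, W = 8, 8
A = np.tile(np.eye(2), (H, W, 1, 1))     # per-pixel affine part, initialized to identity
f = np.zeros((H, W, 2))                  # per-pixel translation (flow) part

ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
coords = np.stack([xs, ys], axis=-1).astype(np.float64)   # pixel coordinates i = (x, y)

# Apply each pixel's own transformation: i' = T_i(i)
warped = np.einsum("hwab,hwb->hwa", A, coords) + f         # shape (H, W, 2)
```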

  5. Background: Methods for Geometric Invariance in the Feature Extraction Step • Spatial Transformer Networks (STNs)-based methods [Jaderberg et al., NeurIPS’15]: $A_i$ is learned without ground truth $A_i^*$, but $f_i$ is learned with ground truth $f_i^*$ • Geometric inference is based on only the source or target image • Examples: UCN [Choy et al., NeurIPS’16], CAT-FCSS [Kim et al., TPAMI’18], etc.

  6. Background: Methods for Geometric Invariance in the Regularization Step • $T_i$ is learned without ground truth $T_i^*$ using self- or meta-supervision • Geometric inference uses both the source and target images, but only globally-varying geometric inference is performed, and only fixed, untransformed versions of the features are used • Examples: GMat. [Rocco et al., CVPR’17], GMat. w/Inl. [Rocco et al., CVPR’18], etc.

  7. Recurrent Transformer Networks (RTNs): Networks Configuration • Weaves the advantages of STN-based methods and geometric matching methods by recursively estimating geometric transformation residuals using geometry-aligned feature activations

  8. Recurrent Transformer Networks (RTNs): Feature Extraction Networks • Input images $I^s$ and $I^t$ are passed through Siamese convolutional networks with parameters $\mathbf{W}_F$ such that $D_i = \mathcal{F}(I \,|\, \mathbf{W}_F)$ • Backbones: CAT-FCSS, VGGNet (conv4-4), or ResNet (conv4-23)
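The Siamese weight sharing can be sketched as follows (a toy backbone stands in for the CAT-FCSS / VGG / ResNet features; shapes are assumptions, not the paper's configuration):

```python
import torch
import torch.nn as nn

# Sketch only: Siamese feature extraction D = F(I | W_F).  A toy backbone stands in
# for the CAT-FCSS / VGG conv4-4 / ResNet conv4-23 features used in the paper;
# the same module (shared weights) processes both source and target images.
backbone = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
)

I_s = torch.randn(1, 3, 64, 64)   # source image I^s (dummy tensor)
I_t = torch.randn(1, 3, 64, 64)   # target image I^t (dummy tensor)

D_s = backbone(I_s)               # D^s, shape (1, 128, 64, 64)
D_t = backbone(I_t)               # D^t, same shared parameters W_F
```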

  9. Recurrent Transformer Networks (RTNs): Recurrent Geometric Matching Networks • Constrained correlation volume construction: $C(D_i^s, D^t(T_j)) = \langle D_i^s, D^t(T_j) \rangle \,/\, \lVert \langle D_i^s, D^t(T_j) \rangle \rVert_2$ [Figure: source and target images]
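A rough PyTorch sketch of this construction, assuming the K candidate target features $D^t(T_j)$ per pixel have already been sampled; normalizing by the L2 norm over the candidate axis is an assumption of this sketch rather than the paper's exact formula:

```python
import torch

# Sketch only: constrained correlation volume between a source feature map and
# K candidate target features per pixel (one per candidate transformation T_j).
# The L2 normalization over the candidate axis is an assumption of this sketch.
def correlation_volume(D_s, cand, eps=1e-6):
    # D_s:  (B, C, H, W)      source features D^s_i
    # cand: (B, C, K, H, W)   target features D^t(T_j) sampled under K candidates
    corr = (D_s.unsqueeze(2) * cand).sum(dim=1)            # <D^s_i, D^t(T_j)>, (B, K, H, W)
    norm = corr.pow(2).sum(dim=1, keepdim=True).sqrt()     # L2 norm over the K candidates
    return corr / (norm + eps)

D_s  = torch.randn(1, 64, 32, 32)
cand = torch.randn(1, 64, 9, 32, 32)       # e.g. a 3x3 grid of candidate transformations
C = correlation_volume(D_s, cand)          # (1, 9, 32, 32)
```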

  10. Recurrent Transformer Networks (RTNs): Recurrent Geometric Matching Networks • Recurrent geometric inference: $T_i^{k} - T_i^{k-1} = \mathcal{F}(C(D_i^s, D^t(T_i^{k-1})) \,|\, \mathbf{W}_G)$ [Figure: source and target images warped over iterations 1-4]
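A compact, runnable sketch of the recurrent residual update; the translation-only field, the single-channel correlation, and the one-layer `matching_net` are stand-ins for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch only: the recurrent residual update T^k = T^{k-1} + F(C(D^s, D^t(T^{k-1})) | W_G).
# Simplifications (not the paper's architecture): a translation-only field, a single
# correlation channel, and one small conv layer standing in for the matching network.
B, C, H, W = 1, 16, 32, 32
D_s, D_t = torch.randn(B, C, H, W), torch.randn(B, C, H, W)
matching_net = nn.Conv2d(1, 2, kernel_size=3, padding=1)   # predicts a 2-channel flow residual

# Base sampling grid in normalized [-1, 1] coordinates, shape (B, H, W, 2)
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
base_grid = torch.stack([xs, ys], dim=-1).unsqueeze(0)

T = torch.zeros(B, 2, H, W)                                # field initialized to the identity
for k in range(4):                                         # e.g. 4 recurrent iterations
    grid = base_grid + T.permute(0, 2, 3, 1)               # warp with the current field T^{k-1}
    D_t_warped = F.grid_sample(D_t, grid, align_corners=True)   # D^t(T^{k-1})
    corr = (D_s * D_t_warped).sum(dim=1, keepdim=True)     # reduced stand-in for C(...)
    T = T + 0.1 * matching_net(corr)                       # add the predicted residual
```

Starting from the identity and adding a small residual at each pass keeps every step's warp close to the previous one, so the geometry-aligned target features can be re-sampled and re-matched at each iteration.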

  11. Recurrent Transformer Networks (RTNs): Weakly-supervised Learning • Intuition: the matching score between the source feature $D_i^s$ at each pixel $i$ and the target feature $D^t(T_i)$ should be maximized while keeping the scores of other candidates low • Loss function: $L(D_i^s, D^t(T^*)) = -\sum_{j \in M_i} p_j^* \log(p(D_i^s, D^t(T_j)))$, where $p(D_i^s, D^t(T_j)) = \exp(C(D_i^s, D^t(T_j))) \,/\, \sum_{l \in M_i} \exp(C(D_i^s, D^t(T_l)))$ is a softmax probability and $p_j^*$ denotes a class label defined as 1 if $j = i$ and 0 otherwise
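A small PyTorch sketch of this loss, treating each pixel's candidate scores as softmax logits and the center candidate ($j = i$) as the positive class; the shapes and the choice of center index are assumptions:

```python
import torch
import torch.nn.functional as F

# Sketch only: the weakly-supervised loss.  For each source pixel i, the correlation
# scores over K candidate transformations T_j are treated as softmax logits, and the
# cross-entropy target is the "center" candidate (class label p*_j = 1 iff j = i).
# Shapes and the choice of the center index are assumptions of this sketch.
B, K, H, W = 1, 9, 32, 32
corr = torch.randn(B, K, H, W)              # C(D^s_i, D^t(T_j)) for K candidates per pixel
center = K // 2                             # candidate whose transformation keeps j = i

log_p = F.log_softmax(corr, dim=1)          # log p(D^s_i, D^t(T_j)) over the candidates
target = torch.full((B, H, W), center, dtype=torch.long)
loss = F.nll_loss(log_p, target)            # -sum_j p*_j log p(...), averaged over pixels
```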

  12. Experimental Results: Results on the TSS Benchmark [Figure: source and target images with results of SCNet (Han et al., ICCV’17), GMat. w/Inl. (Rocco et al., CVPR’18), and RTNs]

  13. Experimental Results: Results on the PF-PASCAL Benchmark [Figure: source and target images with results of SCNet (Han et al., ICCV’17), GMat. w/Inl. (Rocco et al., CVPR’18), and RTNs]

  14. Experimental Results: Results on the PF-PASCAL Benchmark [Figure: additional source and target images with results of SCNet (Han et al., ICCV’17), GMat. w/Inl. (Rocco et al., CVPR’18), and RTNs]

  15. Concluding Remarks • RTNs learn to infer locally-varying geometric fields for semantic correspondence in an end-to-end and weakly-supervised fashion • The key idea is to utilize and iteratively refine the transformations and convolutional activations through matching between the image pair • A technique is presented for weakly-supervised training of RTNs

  16. Thank you! See you at 210 & 230 AB #119 Seungryong Kim, Ph.D. Digital Image Media Lab. Yonsei University, Seoul, Korea Tel: +82-2-2123-2879 E-mail: srkim89@yonsei.ac.kr Homepage: http://diml.yonsei.ac.kr/~srkim/
