
Delving Deep into Computer Vision (Caner Hazirbas, Machine Learning Meetup #1)



  1. Delving Deep into Computer Vision. Caner Hazirbas, Machine Learning Meetup #1

  2. Delving Deep into Computer Vision: FlowNet, FuseNet, PoseLSTM, DDFF. Caner Hazirbas | hazirbas@cs.tum.edu

  3. Delving Deep into Computer Vision: FlowNet. [Figure: FlowNetSimple architecture: conv1, conv2, conv3, conv3_1, conv4, conv4_1, conv5, conv5_1, conv6, refinement, prediction]

  4. FlowNet: Learning Optical Flow with Convolutional Networks (ICCV’15). [Figure: FlowNetSimple and FlowNetCorr architectures; FlowNetCorr adds a correlation layer (corr) and conv_redir]

  5. Flying Chairs (FlowNet)

  6. FlowNetSimple (FlowNet). [Figure: FlowNetSimple architecture: conv1, conv2, conv3, conv3_1, conv4, conv4_1, conv5, conv5_1, conv6, refinement, prediction]

  7. FlowNetCorr (FlowNet). [Figure: FlowNetCorr architecture: conv1, conv2, conv3, corr, conv_redir, conv3_1, conv4, conv4_1, conv5, conv5_1, conv6, refinement, prediction] Correlation of two patches centred at $\mathbf{x}_1$ and $\mathbf{x}_2$: $c(\mathbf{x}_1, \mathbf{x}_2) = \sum_{\mathbf{o} \in [-k,k] \times [-k,k]} \langle \mathbf{f}_1(\mathbf{x}_1 + \mathbf{o}), \mathbf{f}_2(\mathbf{x}_2 + \mathbf{o}) \rangle$, with patch size $K := 2k + 1$.
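
The correlation on this slide sums inner products of feature vectors over a square patch of size K = 2k + 1. A minimal NumPy sketch of that sum for one pair of patch centres follows. The function name `patch_correlation`, the (H, W, C) array layout, and the requirement that both centres lie at least k pixels from the border are assumptions made for illustration, not details taken from the talk.

```python
import numpy as np

def patch_correlation(f1, f2, x1, x2, k):
    """Correlation c(x1, x2) of two (2k+1) x (2k+1) patches.

    f1, f2: feature maps of shape (H, W, C) (assumed layout).
    x1, x2: (row, col) patch centres, assumed >= k pixels from the border.
    """
    r1, c1 = x1
    r2, c2 = x2
    p1 = f1[r1 - k:r1 + k + 1, c1 - k:c1 + k + 1, :]
    p2 = f2[r2 - k:r2 + k + 1, c2 - k:c2 + k + 1, :]
    # Summing element-wise products over offsets and channels equals
    # summing the per-offset inner products <f1(x1 + o), f2(x2 + o)>.
    return float(np.sum(p1 * p2))

# Usage: two small random feature maps, patch size K = 3 (k = 1).
rng = np.random.default_rng(0)
a = rng.standard_normal((5, 5, 2))
b = rng.standard_normal((5, 5, 2))
print(patch_correlation(a, b, (2, 2), (2, 2), k=1))
```

The sketch evaluates a single pair of centres; a full correlation layer would repeat this for many displacements of x2 relative to x1.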

  8. Simple vs. Corr (FlowNet): Flying Chairs. [Figure: FlowNetCorr architecture; panels labelled FlowNetS and FlowNetCorr]

  9. Simple vs. Corr (FlowNet): Sintel. [Figure: FlowNetCorr architecture; panels labelled FlowNetS and FlowNetCorr]

  10. FlowNet: Learning Optical Flow with Convolutional Networks

  11. Delving Deep into Computer Vision: FlowNet, FuseNet

  12. FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture (ACCV’16)

  13. A conventional way: HHA (FuseNet). Reference: Multi-Scale Convolutional Architecture for Semantic Segmentation, Raj et al., Tech. Report CMU-RI-TR-15-21, 2015

  14. A deep way… (FuseNet)

  15. Why a second encoder for depth input? (FuseNet)

  16. Are we any better than HHA? (FuseNet) The proposed network improves all segmentation metrics.

  17. What about the others? (FuseNet) The proposed network improves all segmentation metrics. Metrics:
 • Global: percentage of correctly classified pixels
 • Mean: average class accuracy
 • IoU: average intersection over union across classes
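
The three metrics listed on this slide can all be read off a confusion matrix. The sketch below shows one common way to compute them. The function name `segmentation_metrics` is hypothetical, and the code assumes every class occurs at least once in the ground truth, since otherwise the per-class ratios would divide by zero. It is not the evaluation code used for the results in the talk.

```python
import numpy as np

def segmentation_metrics(conf):
    """Global accuracy, mean class accuracy, and mean IoU.

    conf[i, j] = number of pixels of true class i predicted as class j.
    Assumes each class has at least one ground-truth pixel.
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)            # correctly classified pixels per class
    gt = conf.sum(axis=1)         # ground-truth pixels per class
    pred = conf.sum(axis=0)       # predicted pixels per class
    global_acc = tp.sum() / conf.sum()
    mean_acc = np.mean(tp / gt)
    mean_iou = np.mean(tp / (gt + pred - tp))
    return global_acc, mean_acc, mean_iou

# Usage with a toy two-class confusion matrix.
g, m, iou = segmentation_metrics([[3, 1], [2, 4]])
print(g, m, iou)
```

Global accuracy is dominated by frequent classes, whereas the mean accuracy and IoU weight every class equally, which is why the three numbers are usually reported together.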

  18. Delving Deep into Computer Vision: FlowNet, FuseNet, PoseLSTM. [Figure: PoseLSTM pipeline: pretrained GoogLeNet CNNs, FC, y ∈ R^2048, Y ∈ R^(32×64), LSTMs, z ∈ R^128, FC, outputs p ∈ R^3 and q ∈ R^4]

  19. PoseLSTM: Image-based localization using LSTMs for structured feature correlation (ICCV’17). [Figure: PoseLSTM pipeline: pretrained GoogLeNet CNNs, FC, y ∈ R^2048, Y ∈ R^(32×64), LSTMs, z ∈ R^128, FC, outputs p ∈ R^3 and q ∈ R^4]

  20. PoseNet (PoseLSTM). [Figure: PoseNet pipeline: pretrained GoogLeNet CNNs, FC, y ∈ R^2048, FC, R^128, outputs p ∈ R^3 and q ∈ R^4]

  21. Structured Feature Correlation (PoseLSTM). [Figure: PoseLSTM pipeline: pretrained GoogLeNet CNNs, FC, y ∈ R^2048, Y ∈ R^(32×64), LSTMs, z ∈ R^128, FC, outputs p ∈ R^3 and q ∈ R^4]
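
The dimensions on this slide (y ∈ R^2048, Y ∈ R^(32×64), z ∈ R^128) suggest a reshape followed by recurrent layers. The following is only a sketch of that dimension bookkeeping under assumptions the slide does not state: four LSTMs with 32 hidden units each, swept over the rows and columns of Y in both directions, with randomly initialised weights. The names `lstm_last_hidden`, `random_params`, and `structured_reduction` are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_last_hidden(seq, W, U, b):
    """Run one LSTM over seq of shape (T, D); return the last hidden state (H,).

    W: (4H, D), U: (4H, H), b: (4H,); gates ordered input, forget, cell, output.
    """
    H = U.shape[1]
    h = np.zeros(H)
    c = np.zeros(H)
    for x in seq:
        gates = W @ x + U @ h + b
        i = sigmoid(gates[:H])
        f = sigmoid(gates[H:2 * H])
        g = np.tanh(gates[2 * H:3 * H])
        o = sigmoid(gates[3 * H:])
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def random_params(rng, D, H=32):
    """Random (untrained) LSTM weights, for shape checking only."""
    return (0.1 * rng.standard_normal((4 * H, D)),
            0.1 * rng.standard_normal((4 * H, H)),
            np.zeros(4 * H))

def structured_reduction(y, params):
    """Map y in R^2048 to z in R^128 via Y in R^(32x64) and four sweeps (assumed)."""
    Y = y.reshape(32, 64)
    # Assumed sweep order: top-down, bottom-up, left-right, right-left.
    sweeps = [Y, Y[::-1], Y.T, Y.T[::-1]]
    outs = [lstm_last_hidden(s, *p) for s, p in zip(sweeps, params)]
    return np.concatenate(outs)

rng = np.random.default_rng(0)
params = [random_params(rng, 64), random_params(rng, 64),
          random_params(rng, 32), random_params(rng, 32)]
z = structured_reduction(rng.standard_normal(2048), params)
print(z.shape)
```

With 32 hidden units per sweep, concatenating the four final states gives the 128-dimensional z shown on the slide; other arrangements could produce the same sizes.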

  22. Winner in Outdoor: SIFT (PoseLSTM)

  23. Where SIFT dies… (PoseLSTM) TUM-LSI Dataset: the map cannot be reconstructed due to a lack of sufficient matches (repeated structures, textureless areas).

  24. Delving Deep into Computer Vision: FlowNet, FuseNet, PoseLSTM, DDFF

  25. Deep Depth From Focus (DDFF)
 • The image of a point lies on the camera sensor when the point is in focus.
 • Therefore, sharpness determines the focused regions in the images.
 Source: https://inst.eecs.berkeley.edu/~cs39j/sp02/session12.html

  26. Conventional DFF methods (DDFF)
 • The image of a point lies on the camera sensor when the point is in focus.
 • Therefore, sharpness determines the focused regions in the images.
 • The distance of a point from the camera can be formulated with respect to focus.
 • Components: measure of sharpness [Pertuz et al.] and optimizer [Moeller et al.]
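
The idea that sharpness identifies the in-focus slice can be sketched in a few lines. The example below uses a squared 4-neighbour Laplacian as the sharpness measure and a per-pixel argmax over the focal stack; both choices are assumptions for illustration, and the sketch omits the regularising optimizer mentioned on the slide. The names `laplacian_energy` and `depth_from_focus`, and the `focus_distances` parameter, are hypothetical.

```python
import numpy as np

def laplacian_energy(img):
    """Squared 4-neighbour Laplacian as a simple per-pixel sharpness measure."""
    p = np.pad(np.asarray(img, dtype=np.float64), 1, mode="edge")
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
           - 4.0 * p[1:-1, 1:-1])
    return lap ** 2

def depth_from_focus(stack, focus_distances):
    """Pick, per pixel, the focus distance of the sharpest slice.

    stack: (S, H, W) focal stack; focus_distances: (S,) distance at which
    slice s is in focus (assumed known).
    """
    sharp = np.stack([laplacian_energy(s) for s in stack])
    best = np.argmax(sharp, axis=0)          # index of sharpest slice per pixel
    return np.asarray(focus_distances)[best]

# Usage: a toy stack in which only the middle slice contains texture.
checker = (np.indices((8, 8)).sum(axis=0) % 2).astype(np.float64)
stack = np.stack([np.full((8, 8), 0.5), checker, np.full((8, 8), 0.5)])
depth = depth_from_focus(stack, [0.2, 0.5, 1.0])
print(depth[0, 0])
```

A per-pixel argmax is noisy in textureless regions, where no slice is clearly sharpest; that is the kind of failure a subsequent optimisation step would be expected to smooth out.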

  27. Deep Depth From Focus (DDFF)
 • Focus gradually changes from image to image in the stack.
 • End-to-end trained convolutional auto-encoder
 • Depth (disparity) from focal stack

  28. How to get data? (DDFF)
