NeurVPS: Neural Vanishing Point Scanner via Conic Convolution
Yichao Zhou*, Haozhi Qi*, Jingwei Huang ⱡ, Yi Ma*
* University of California, Berkeley   ⱡ Stanford University
NeurIPS 2019
Introduction
• Parallel lines in 3D intersect at a single point after projection onto the image: the vanishing point
• Vanishing points are important because they give the direction of those lines in 3D
Image source: Military Science and Tactics.
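The two bullets above can be checked numerically. The sketch below assumes a pinhole camera with focal length `f` and the principal point at the image origin; the function names and the specific numbers are illustrative and do not come from the slides. It shows that two parallel 3D lines project toward the same image point, and that the point in turn determines the 3D line direction.

```python
import math

def project(point, f=1.0):
    """Pinhole projection of a 3D point (x, y, z) with z > 0 onto the plane z = f."""
    x, y, z = point
    return (f * x / z, f * y / z)

def vanishing_point(direction, f=1.0):
    """Image of the point at infinity along a 3D direction (requires dz != 0)."""
    dx, dy, dz = direction
    return (f * dx / dz, f * dy / dz)

def direction_from_vp(vp, f=1.0):
    """Unit 3D direction of the lines that meet at the vanishing point vp."""
    vx, vy = vp
    norm = math.sqrt(vx * vx + vy * vy + f * f)
    return (vx / norm, vy / norm, f / norm)

def point_on_line(origin, direction, t):
    """Point origin + t * direction on a 3D line."""
    return tuple(o + t * d for o, d in zip(origin, direction))

# Two parallel lines (same direction, different origins); values are illustrative.
direction = (1.0, 0.5, 2.0)
vp = vanishing_point(direction)
far_a = project(point_on_line((0.0, -1.0, 1.0), direction, 1e6))
far_b = project(point_on_line((3.0, 2.0, 1.0), direction, 1e6))
recovered = direction_from_vp(vp)
```

Points far along either line project arbitrarily close to `vp`, and `recovered` is parallel to `direction`, which is why a detected vanishing point fixes the 3D orientation of the lines once the focal length is known.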
Related Work (Traditional Approaches)
• Two-stage pipeline
  • Heuristic line segment detection
    • Canny edge + Hough transform [1]
    • LSD [2]
    • Contour [3]
  • Line clustering
    • J-Linkage [4]
    • Line RANSAC [5]
    • Angle histogram [6]
• Problems
  • Edges do not have semantic meaning
  • Edges can be noisy
  • Outliers can result in total failure
[1] Kiryati, Nahum, Yuval Eldar, and Alfred M. Bruckstein. "A probabilistic Hough transform." Pattern Recognition 24.4 (1991): 303-316.
[2] Von Gioi, et al. "LSD: A fast line segment detector with a false detection control." PAMI 32.4 (2008).
[3] Zhou, Zihan, Farshid Farhat, and James Z. Wang. "Detecting dominant vanishing points in natural scenes with application to composition-sensitive image retrieval." IEEE Transactions on Multimedia 19.12 (2017).
[4] Tardif, Jean-Philippe. "Non-iterative approach for fast and accurate vanishing point detection." 2009 ICCV.
[5] Bazin, Jean-Charles, and Marc Pollefeys. "3-line RANSAC for orthogonal vanishing point detection." 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2012.
[6] Li, Bo, et al. "Vanishing point detection using cascaded 1D Hough Transform from single images." Pattern Recognition Letters 33.1 (2012): 1-8.
Related Work (Neural Network Era)
• Recent data-driven approaches
  • [1], [2], [3]: divide the image into patches and treat detection as classification
    • Hard to find vanishing points outside the image
  • [4] uses a neural network to filter outliers
• Challenges
  • Neural networks do not have a geometric understanding of vanishing points
  • CNNs provide only coarse estimates of vanishing points
[1] Ali Borji. "Vanishing point detection with convolutional neural networks." arXiv 2016.
[2] Chin-Kai Chang, Jiaping Zhao, and Laurent Itti. "DeepVP: Deep learning for vanishing point detection on 1 million street view images." ICRA 2018.
[3] Xiaodan Zhang, Xinbo Gao, Wen Lu, Lihuo He, and Qi Liu. "Dominant vanishing point detection in the wild with application in composition analysis." Neurocomputing 2018.
[4] Menghua Zhai, Scott Workman, and Nathan Jacobs. "Detecting Vanishing Points using Global Image Context in a Non-Manhattan World." CVPR 2016.
Poor Accuracy of CNNs on VP Detection
[Figure with a zoomed-in view]
Design Philosophy of NeurVPS
• The overall approach has the advantages of
  • the accuracy of traditional line clustering algorithms
  • the robustness of neural network-based algorithms
• Neural networks should be trained end-to-end
  • without relying on line segment detectors
• New operators that capture geometric cues
  • vanishing points are the intersections of lines
  • operators should be local and stackable
Image source: Wikipedia
Our Methods
• Input
  • An image
  • A coordinate (vanishing point candidate)
• Output
  • Likelihood of the existence of a vanishing point near that coordinate
• Key component
  • Conic convolution
• Network architecture (inputs: Image, Vanishing Points)
  1. Hourglass backbone
  2. 1x1, 32, Conv (BN, ReLU)
  3. x4: 3x3, 64/128/256/256, Conic Conv (BN, ReLU), with 3x3, stride 2 max pooling
  4. 1024, FC (ReLU)
  5. 1024, FC (ReLU)
  6. R, FC (Sigmoid)
  7. Output
Conic Convolution
• Guided by vanishing point candidates (convolution center)
[Figure, shown over several animation frames: 3x3 sampling grids with positions numbered 1 to 9 and the vanishing point candidate marked V, for three cases: Plain Convolution, Conic Convolution (vanishing point inside image plane), and Conic Convolution (vanishing point outside image plane)]
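One way to read the figure is that the 3x3 kernel is rotated at every pixel so that one of its axes points toward the vanishing point candidate V. The sketch below is an interpretation under that assumption, not the authors' implementation: the names `conic_offsets`, `bilinear`, and `conic_conv_at` are hypothetical, features are a plain 2D list, and off-grid positions are read with bilinear interpolation.

```python
import math

def conic_offsets(p, v):
    """3x3 sampling positions around pixel p, rotated so the kernel's
    first axis points from p toward the candidate v (both as (x, y))."""
    px, py = p
    tx, ty = v[0] - px, v[1] - py
    norm = math.hypot(tx, ty)
    if norm == 0.0:
        tx, ty = 1.0, 0.0          # degenerate case: fall back to the plain grid
    else:
        tx, ty = tx / norm, ty / norm
    nx, ny = -ty, tx               # t rotated by 90 degrees
    return [[(px + dx * tx + dy * nx, py + dx * ty + dy * ny)
             for dx in (-1, 0, 1)] for dy in (-1, 0, 1)]

def bilinear(feat, x, y):
    """Bilinear interpolation of feat[row][col] at (x, y); zero outside the map."""
    h, w = len(feat), len(feat[0])
    x0, y0 = math.floor(x), math.floor(y)
    ax, ay = x - x0, y - y0
    total = 0.0
    for yy, wy in ((y0, 1.0 - ay), (y0 + 1, ay)):
        for xx, wx in ((x0, 1.0 - ax), (x0 + 1, ax)):
            if 0 <= xx < w and 0 <= yy < h:
                total += wx * wy * feat[yy][xx]
    return total

def conic_conv_at(feat, kernel, p, v):
    """Response of a 3x3 kernel at pixel p, sampled on the rotated grid."""
    pos = conic_offsets(p, v)
    return sum(kernel[i][j] * bilinear(feat, *pos[i][j])
               for i in range(3) for j in range(3))

# Illustrative 5x5 feature map and kernel.
feat = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
right = conic_conv_at(feat, kernel, (2, 2), (102, 2))   # V far to the right
left = conic_conv_at(feat, kernel, (2, 2), (-98, 2))    # V far to the left
```

When V lies far along the positive x axis the rotated grid coincides with the ordinary one, so `right` equals a plain 3x3 convolution at that pixel; with V on the opposite side the grid is turned by 180 degrees, which matches the reordering of the numbered positions in the figure.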
Intuition Behind Conic Convolution
[Figure with panels labeled False Proposal, Ground Truth, and True Proposal]
Coarse-to-Fine Inference
• Our network is essentially a vanishing point classifier
• During evaluation
  1. Sample vanishing point candidates
  2. Test each candidate with our network classifier
• How to sample vanishing points?
A Very Brief Review of the Gaussian Sphere
• How can we sample vanishing points uniformly?
• We put the image on a sphere (Gaussian sphere representation)
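A minimal sketch of the Gaussian sphere idea follows. It assumes a pinhole camera with focal length `f` and image coordinates measured from the principal point. The Fibonacci lattice used for near-uniform sampling is one common choice picked here for illustration; the slides do not name a sampling scheme.

```python
import math

def image_to_sphere(x, y, f=1.0):
    """Map an image point to a unit vector on the Gaussian sphere."""
    n = math.sqrt(x * x + y * y + f * f)
    return (x / n, y / n, f / n)

def sphere_to_image(d, f=1.0):
    """Map a unit vector back to the image; None if it is parallel to the image plane."""
    dx, dy, dz = d
    if abs(dz) < 1e-12:
        return None
    return (f * dx / dz, f * dy / dz)

def fibonacci_hemisphere(n):
    """n roughly uniformly spread unit vectors on the hemisphere z > 0."""
    golden = (1.0 + math.sqrt(5.0)) / 2.0
    points = []
    for i in range(n):
        z = 1.0 - (i + 0.5) / n              # equal steps in z give equal-area bands
        r = math.sqrt(1.0 - z * z)
        phi = 2.0 * math.pi * i / golden     # golden-ratio spacing in azimuth
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points

candidates = fibonacci_hemisphere(100)
round_trip = sphere_to_image(image_to_sphere(0.3, -0.2))
```

Sampling on the sphere rather than on the image plane treats vanishing points far outside the image, or at infinity, the same way as those inside it, since every candidate is just a direction.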
Hierarchical Inference
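The hierarchical search can be sketched as repeated sampling inside a shrinking spherical cap around the best candidate so far. Everything specific below is an assumption made for illustration: the half-angle schedule, the number of samples per round, the Fibonacci-style cap sampling, and above all the `oracle` scorer, which stands in for the trained network classifier by using the known true direction.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def cap_samples(axis, half_angle, n):
    """n roughly uniform unit vectors within half_angle (radians) of the unit vector axis."""
    helper = (1.0, 0.0, 0.0) if abs(axis[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(helper, axis))       # u, w, axis form an orthonormal basis
    w = cross(axis, u)
    golden = (1.0 + math.sqrt(5.0)) / 2.0
    samples = []
    for i in range(n):
        z = 1.0 - (1.0 - math.cos(half_angle)) * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = 2.0 * math.pi * i / golden
        a, b = r * math.cos(phi), r * math.sin(phi)
        samples.append(tuple(a * u[k] + b * w[k] + z * axis[k] for k in range(3)))
    return samples

def coarse_to_fine(score, half_angles_deg=(90.0, 20.0, 5.0, 1.5), n=128):
    """Keep the best-scoring candidate of each round and resample around it."""
    best = (0.0, 0.0, 1.0)                   # first round covers the whole hemisphere
    for deg in half_angles_deg:
        candidates = cap_samples(best, math.radians(deg), n)
        best = normalize(max(candidates, key=score))
    return best

# Stand-in scorer: cosine similarity to a known direction (hypothetical;
# in the method described here this role is played by the network).
truth = normalize((0.6, -0.3, 0.5))

def oracle(d):
    return sum(a * b for a, b in zip(d, truth))

estimate = coarse_to_fine(oracle)
error_deg = math.degrees(math.acos(min(1.0, oracle(estimate))))
```

With four rounds of 128 samples the search evaluates 512 candidates, far fewer than a single dense sampling of the hemisphere at the same final angular resolution would need.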
Training
• We train multiple classifiers, each of which corresponds to a different threshold;
• For each threshold, sample one positive and one negative vanishing point;
• Randomly sample three vanishing points to reduce bias.
[Figure: the network diagram from "Our Methods" (inputs: Image, Vanishing Points), shown with a Positive Sample and a Negative Sample]
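The sampling scheme in the bullets above might look like the following sketch. Several details are assumptions, not taken from the slides: thresholds are treated as angles to the ground-truth direction, a positive is drawn at an angle below the threshold and a negative between one and two times the threshold, each sample carries one binary label per threshold, and the threshold values themselves are illustrative.

```python
import math
import random

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def at_angle(axis, angle, rng):
    """Unit vector at `angle` radians from the unit vector axis, with random azimuth."""
    helper = (1.0, 0.0, 0.0) if abs(axis[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(helper, axis))
    w = cross(axis, u)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s, c = math.sin(angle), math.cos(angle)
    return tuple(s * math.cos(phi) * u[k] + s * math.sin(phi) * w[k] + c * axis[k]
                 for k in range(3))

def labels_for(angle, thresholds):
    """One binary label per threshold: 1 if the sample is closer than that threshold."""
    return [1 if angle < t else 0 for t in thresholds]

def training_samples(gt, thresholds_deg, rng, n_random=3):
    thresholds = [math.radians(t) for t in thresholds_deg]
    samples = []
    for t in thresholds:
        # One positive (inside the threshold) and one negative (assumed range [t, 2t]).
        for lo, hi in ((0.0, t), (t, 2.0 * t)):
            angle = rng.uniform(lo, hi)
            samples.append((at_angle(gt, angle, rng), labels_for(angle, thresholds)))
    for _ in range(n_random):
        # Random directions over the sphere; d and -d denote the same vanishing point.
        d = normalize(tuple(rng.gauss(0.0, 1.0) for _ in range(3)))
        dot = abs(sum(a * b for a, b in zip(d, gt)))
        samples.append((d, labels_for(math.acos(min(1.0, dot)), thresholds)))
    return samples

gt = normalize((0.2, 0.1, 1.0))
samples = training_samples(gt, (30.0, 10.0, 3.0), random.Random(0))
```

Drawing negatives just outside each threshold gives every classifier hard examples near its decision boundary, while the random samples cover the rest of the sphere so the classifiers also see candidates far from any true vanishing point.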