Efficient Large Scale 3D Reconstruction (Wenbing Tao)

Efficient Large Scale 3D Reconstruction
Wenbing Tao (陶文兵)
School of Automation, Institute for Pattern Recognition and Artificial Intelligence; National Key Laboratory of Science and Technology on Multi-spectral Information Processing; Key Laboratory of Ministry of Education for Image Processing and Intelligence Control; Huazhong University of Science & Technology
Main collaborators: Qingshan Xu (徐青山), Kun Sun (孙琨), Tao Xu (徐涛)

Contents
01 Background
02 GPU accelerated large scale image matching
03 Large scale Structure from Motion
04 Multi-view stereo for 3D dense reconstruction

PART 1 Background

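Background 1
A three-dimensional model provides the most faithful perception of the world.
• 3D data to 2D image: dimensionality is reduced and information is lost
• Multiple images to 3D data: the information is recovered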

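Background 2
Three-dimensional city models have extensive applications: municipal planning, post-disaster rescue, virtual landscapes, digital campuses, 3D navigation, public safety, traffic management, map queries.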

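Existing 3D modeling method 1. Modeling with geometric modeling techniques
Advantages:
• Mature technology, with many popular commercial software packages
Disadvantages:
• Poor reconstruction accuracy; cannot reflect real dimensions
• Poor realism; the result is overly virtual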

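Existing 3D modeling method 2. Active contact-type 3D modeling (LiDAR scanners, structured-light scanners, infrared rangefinders)
Advantages:
• Active measurement yields the 3D point cloud directly, with no complex subsequent computation or processing
Disadvantages:
• Equipment is complex to operate
• Reconstruction cost is very high
• Accuracy is poor at long range
• Realism of the reconstruction is poor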

Existing 3D modeling method 3. Passive 3D modeling (vision algorithms)
• Shape from X (shading, texture, occlusion, etc.)
• Binocular stereo
• Structure from Motion (SfM)

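Multiple-view 3D reconstruction
Vision-based 3D reconstruction:
• Data is easy to acquire
• High degree of automation
• Wide range of applicability
About 880 billion new images were produced worldwide in 2014; by 2017 the figure had reached 1.3 trillion.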

The basic procedure
Image matching → Structure from Motion → Dense representation → Surface reconstruction → Texture mapping

PART 2 GPU Accelerated Cascade Hashing Image Matching

Introduction SIFT, Kd-Tree, CasHash and siftGPU
• SIFT matching (Lowe 1999): brute-force search, O(N^2); find the smallest Euclidean distance and significant point; a pair of images costs 4-5 seconds
• Kd-Tree (Muja 2009): binary search tree, O(log N); approximate nearest neighbor (ANN) search; 2-4 pairs/s
• Cascade Hashing (Cheng 2014): hashing lookup, two-level hashing filtering, hashing remapping; lower algorithm complexity; ANN search <10; 10-20 pairs/s
• siftGPU (Wu 2013): 40-50 pairs/s
10^4 SIFT points

Introduction Cascade Hashing
SIFT points: about 10,000 per image
1. First filtering: 8-bit hashing code, hashing mapping (hashing bucket); 8 products (Reduce) for each feature point
2. Second filtering: 128-bit hashing code; 128 products (Reduce) for each feature point
3. Euclidean distance calculation; 1 product (Reduce) for each feature point
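The three filtering stages on this slide can be sketched in a few lines of NumPy. This is only an illustrative CPU version under stated assumptions: the hash functions are random hyperplanes, a single 8-bit bucket table is used, and the names `hash_codes`, `cascade_match`, and `top_k` are hypothetical. It is not the authors' implementation.

```python
import numpy as np

def hash_codes(desc, planes):
    """Binary code per descriptor: one bit per (assumed random) hyperplane."""
    return (desc @ planes.T > 0).astype(np.uint8)

def cascade_match(desc_a, desc_b, seed=0, top_k=4):
    """For each row of desc_a return the index of its match in desc_b, or -1."""
    rng = np.random.default_rng(seed)
    dim = desc_a.shape[1]
    coarse_planes = rng.standard_normal((8, dim))    # 8-bit bucket code
    fine_planes = rng.standard_normal((128, dim))    # 128-bit ranking code

    # Centre the data so each hyperplane splits it roughly in half.
    mean = np.vstack([desc_a, desc_b]).mean(axis=0)
    a, b = desc_a - mean, desc_b - mean

    weights = 1 << np.arange(8)                      # pack 8 bits into a bucket id
    bucket_a = hash_codes(a, coarse_planes) @ weights
    bucket_b = hash_codes(b, coarse_planes) @ weights
    fine_a = hash_codes(a, fine_planes)
    fine_b = hash_codes(b, fine_planes)

    matches = np.full(len(a), -1)
    for i in range(len(a)):
        # First filtering: keep only points that fall in the same bucket.
        cand = np.flatnonzero(bucket_b == bucket_a[i])
        if cand.size == 0:
            continue
        # Second filtering: rank the survivors by 128-bit Hamming distance.
        hamming = (fine_b[cand] != fine_a[i]).sum(axis=1)
        cand = cand[np.argsort(hamming, kind="stable")[:top_k]]
        # Final step: exact Euclidean distance on the few remaining candidates.
        dist = np.linalg.norm(b[cand] - a[i], axis=1)
        matches[i] = cand[np.argmin(dist)]
    return matches
```

Each stage is cheaper per candidate than the next, so the expensive Euclidean comparison runs on only a handful of points. A single bucket table can miss true neighbors that fall on the other side of a hyperplane; using several tables is the usual remedy and is omitted here.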

GPU Accelerated CasHash GPU algorithms
• Fast computation of reduction
• Data exchange strategy: GPU-Memory-Disk
• Improved parallel hashing ranking
SIFT points: about 10,000 per image
1. First filtering: 8-bit hashing code, hashing mapping (hashing bucket); 8 products (Reduce) for each feature point
2. Second filtering: 128-bit hashing code; 128 products (Reduce) for each feature point
3. Euclidean distance calculation; 1 product (Reduce) for each feature point
Tao Xu, Kun Sun and Wenbing Tao*, GPU Accelerated Cascade Hashing Image Matching for Large Scale 3D Reconstruction, arXiv:1805.08995

GPU Accelerated CasHash Data Scheduling Strategy

Experiments Results on Publicly Available Datasets

Experiments on large image sets Multiple GPU acceleration
The relationship between the number of GPU cards and matching speed. Left: timing on Data-Dubrovnik (6K). Right: timing on Data-Rome (16K).

Experiments Geometry-aware CasHashGPU
• The top 20% of SIFT features by scale are used for exhaustive image matching (Wu 2013) with CasHashGPU
• The resulting information is used to guide the remaining matching procedure

Experiments GPS-aware CasHashGPU

Related work Vocabulary tree
Fast searching for nearest neighbors.
(Figure panels: Bag of words; Vocabulary tree)

Introduction Our improvement on overlap detection
A fast GPU vocabulary indexing implementation

1DSfM_Roman_Forum, 2360 images
Stage             GPU Time (s)   CPU Time (s)   Speedup factor
Pre-Process       0.782          0              -
Search(+Sparse)   7.854          267.478        34.0
Weight            0.005          0.220          -
Normalize         0.182          0.544          -
Score             0.506          1.027          -
Data Copy         2.444          0              -
Others            0.501          0.242          -
Total             12.274         269.511        21.9

1DSfM_Vienna_Cathedral, 6280 images
Stage             GPU Time (s)   CPU Time (s)   Speedup factor
Pre-Process       0.892          0              -
Search(+Sparse)   29.317         837.375        28.5
Weight            0.023          0.346          -
Normalize         0.466          1.284          -
Score             5.821          19.399         -
Data Copy         6.852          0              -
Others            1.910          0.930          -
Total             45.281         859.334        18.9

All the tests are performed on a machine with 256GB RAM, one Intel Xeon E5-2630 v3 @ 2.40GHz CPU and one NVIDIA GeForce GTX Titan X GPU card.
Expect to process 10000 images within 1 minute.
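The Weight, Normalize and Score stages in the timing tables correspond to the usual bag-of-words scoring steps. The sketch below shows one common way to do them on the CPU, assuming TF-IDF weighting and cosine similarity (the slides do not state which weighting or norm is used). The Search stage, which assigns each descriptor to a visual word, is taken as already done, and `bow_similarity` is a hypothetical name.

```python
import numpy as np

def bow_similarity(word_ids_per_image, vocab_size):
    """Return an (n_images, n_images) similarity matrix from visual-word ids."""
    n = len(word_ids_per_image)
    # Term frequency: how often each visual word occurs in each image.
    tf = np.zeros((n, vocab_size))
    for i, ids in enumerate(word_ids_per_image):
        tf[i] = np.bincount(ids, minlength=vocab_size)
    # Weight: down-weight words that occur in many images (assumed IDF).
    df = np.count_nonzero(tf, axis=0)
    idf = np.log(n / np.maximum(df, 1))
    vec = tf * idf
    # Normalize: unit L2 length so image size does not dominate the score.
    norms = np.linalg.norm(vec, axis=1, keepdims=True)
    vec = vec / np.maximum(norms, 1e-12)
    # Score: cosine similarity between every pair of images.
    return vec @ vec.T
```

Image pairs with a high score are likely to overlap and can be passed to feature matching, while low-scoring pairs are skipped.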

Experiments GPU-based F-matrix and H-matrix estimation

PART 3 Multiple starting points selection and data partition for large scale SfM

Introduction Structure from Motion
Given a set of images, estimate the camera poses and the sparse 3D structure.
• Scene geometry (structure): Given 2D point matches in two or more images, where are the corresponding points in 3D?
• Correspondence (matching): Given a point in just one image, how does it constrain the position of the corresponding point in another image?
• Camera geometry (motion): Given a set of corresponding points in two or more images, what are the camera matrices for these views?

Introduction Structure from Motion The general pipeline of the SfM algorithm

Introduction Structure from Motion Matching graph construction

Introduction Structure from Motion Epipolar Geometry estimated by RANSAC

Introduction Structure from Motion Build tracks from matches
• Link up matches between pairs of images into tracks between multiple images
• Each track corresponds to a 3D point
(Figure: one feature tracked across Image 1, Image 2, Image 3, Image 4)
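Linking pairwise matches into tracks amounts to finding connected components over (image, feature) observations. A minimal union-find sketch follows. The function name `build_tracks` and the rule that drops tracks containing two features from the same image are assumptions for illustration, not something stated on the slide.

```python
def build_tracks(pair_matches):
    """pair_matches: iterable of ((img_a, feat_a), (img_b, feat_b)) pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps trees shallow
            x = parent[x]
        return x

    # Union the two observations of every pairwise match.
    for a, b in pair_matches:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Group observations by their root to obtain the tracks.
    tracks = {}
    for obs in list(parent):
        tracks.setdefault(find(obs), []).append(obs)

    # Assumed consistency rule: a track may see each image at most once.
    return [sorted(t) for t in tracks.values()
            if len({img for img, _ in t}) == len(t)]
```

For example, matches (1,10)-(2,20) and (2,20)-(3,30) merge into one three-view track, which later yields a single 3D point.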

Introduction Structure from Motion Choose two views
• They have the largest number of feature correspondences
• They have a wide baseline (the baseline can be gauged by the inlier ratio of a planar homography: a low ratio suggests a wide baseline)

Introduction Structure from Motion Estimate relative pose using two-view geometry
• Camera intrinsics known: essential matrix, E (5 points)
• Camera intrinsics unknown: fundamental matrix, F (7 points)
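Both matrices encode the same epipolar constraint, in normalized coordinates for E and in pixel coordinates for F. The sketch below does not implement the 5-point or 7-point solvers. It only builds E and F from a known relative pose, assuming the convention X2 = R X1 + t, so the constraint those solvers fit can be checked numerically. All function names are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_from_pose(R, t):
    """E = [t]_x R for the assumed convention X2 = R @ X1 + t."""
    return skew(t) @ R

def fundamental_from_essential(E, K1, K2):
    """F = K2^{-T} E K1^{-1}, so that x2^T F x1 = 0 holds in pixel coordinates."""
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)
```

With known intrinsics the pixel coordinates can be normalized by K^{-1} and E estimated directly, which is why fewer correspondences suffice than for F.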

Introduction Structure from Motion Triangulate inlier correspondences
• Given projections of a 3D point in two or more images (with known camera matrices), find the coordinates of the point

Introduction Structure from Motion Triangulation
• We want to intersect the two visual rays R1 and R2 corresponding to x1 and x2, but because of noise and numerical errors, they don't meet exactly
(Figure: rays R1 and R2 from camera centers O1 and O2 through image points x1 and x2; unknown 3D point X)

Introduction Structure from Motion Triangulation
• Find the shortest segment connecting the two viewing rays and let X be the midpoint of that segment
(Figure: camera centers O1 and O2, image points x1 and x2, midpoint X)
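The midpoint rule reduces to a small least-squares problem. In this sketch each ray is given by a camera center `o` and a direction `d`. These parameter names and the function name `triangulate_midpoint` are assumptions, and the rays are assumed not to be parallel.

```python
import numpy as np

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + s*d1 and o2 + t*d2."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    # Choose s, t to minimise |(o1 + s*d1) - (o2 + t*d2)|^2,
    # i.e. solve [d1, -d2] @ [s, t] ~= o2 - o1 in the least-squares sense.
    A = np.stack([d1, -d2], axis=1)
    s, t = np.linalg.lstsq(A, o2 - o1, rcond=None)[0]
    p1 = o1 + s * d1          # closest point on the first ray
    p2 = o2 + t * d2          # closest point on the second ray
    return 0.5 * (p1 + p2)
```

The midpoint is cheap to compute and is normally only an initial estimate; the point is refined later by minimizing reprojection error.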

Introduction Structure from Motion Bundle Adjustment
• Refine 3D points
• Refine camera parameters
• Minimize reprojection error:
$$E(P, X) = \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} \, D(x_{ij}, P_i X_j)^2$$
where w_ij is an indicator variable for the visibility of point X_j in camera P_i
• Minimizing this function is called bundle adjustment
  - Optimized using non-linear least squares, e.g. Levenberg-Marquardt
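The objective on this slide can be evaluated directly. The sketch below only computes E(P, X) and leaves the optimization to a non-linear least-squares solver. It assumes D is the Euclidean distance in the image plane, and the array layout and the name `reprojection_error` are illustrative choices.

```python
import numpy as np

def reprojection_error(P, X, x, w):
    """
    P: (m, 3, 4) camera matrices      X: (n, 3) 3D points
    x: (m, n, 2) observed pixels      w: (m, n) visibility flags (0 or 1)
    Returns E(P, X) = sum_ij w_ij * ||x_ij - proj(P_i X_j)||^2.
    """
    Xh = np.hstack([X, np.ones((len(X), 1))])     # homogeneous points, (n, 4)
    proj = np.einsum('mij,nj->mni', P, Xh)        # P_i X_j for all i, j: (m, n, 3)
    pix = proj[..., :2] / proj[..., 2:3]          # perspective division
    sq_dist = np.sum((x - pix) ** 2, axis=-1)     # D(x_ij, P_i X_j)^2
    return float(np.sum(w * sq_dist))
```

Because w_ij is zero for most camera-point pairs in a large scene, practical solvers exploit this sparsity rather than forming dense arrays as done here.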

Introduction Structure from Motion Add new cameras

Introduction Structure from Motion Add new cameras
• 2D-2D correspondences

Introduction Structure from Motion Add new cameras
• Feature tracks help a lot
• Maximize number of 2D-3D correspondences

Introduction Structure from Motion Add new cameras
• Solve Perspective-n-Point problem
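One simple way to recover a camera from 2D-3D correspondences is the Direct Linear Transform, sketched below as a linear stand-in for a dedicated PnP solver. It estimates a full 3x4 projection matrix rather than a pose with known intrinsics, needs at least six points in general position, and skips coordinate normalization and RANSAC. The name `camera_from_dlt` is hypothetical.

```python
import numpy as np

def camera_from_dlt(X, x):
    """Estimate a 3x4 projection matrix (up to scale) from X (n, 3) <-> x (n, 2)."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    rows = []
    for Xi, (u, v) in zip(Xh, x):
        # u = (p1 . X) / (p3 . X)  ->  p1 . X - u * (p3 . X) = 0
        rows.append(np.concatenate([Xi, np.zeros(4), -u * Xi]))
        # v = (p2 . X) / (p3 . X)  ->  p2 . X - v * (p3 . X) = 0
        rows.append(np.concatenate([np.zeros(4), Xi, -v * Xi]))
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    return Vt[-1].reshape(3, 4)
```

In an incremental pipeline the 3D points come from existing tracks, so the camera sharing the most 2D-3D correspondences with the current model is usually the safest one to add next.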

Introduction Structure from Motion Add new cameras
• Triangulate new points
• Bundle adjustment

Introduction Difficulties
The difficulties in SfM for large scale unordered images.
1. Explosive image data (e.g. 100 million images on Yahoo):
• Image matching is time consuming
• Adding images sequentially is time consuming
• How to partition the image set properly?

Introduction Difficulties
The difficulties in SfM for large scale unordered images.
2. Unordered (unstructured vs. structured image sets):
• Unknown neighborhood, unknown scene overlap
• Burdensome image matching procedure
