Ben Eckart, NVIDIA Research, Learning and Perception Group, 3/20/2019
GPU-ACCELERATED 3D POINT CLOUD PROCESSING WITH HIERARCHICAL GAUSSIAN MIXTURES
2
3D POINT CLOUD DATA
Basic data type for unstructured 3D data. The emergence of commercial depth sensors has made it ubiquitous.
3
POINT CLOUD PROCESSING CHALLENGES
- Points are non-differentiable, non-probabilistic
- Large amounts of often noisy data
- Often spatially redundant, with wide-ranging density variance
4
PREVIOUS APPROACHES
What have people done before?
Discrete Approaches: Voxel Grids/Lists, Octrees, TSDFs
Though efficient, they inherit the same non-differentiable, non-probabilistic problems as point clouds.
(Figure: OctoMap)
5
PREVIOUS APPROACHES
What have people done before?
Continuous Approaches: Gaussian Mixture Models, Gaussian Processes
Though theoretically attractive, in practice they tend to be too slow for many applications.
(Figures: GMM, Gaussian Process)
Proposal: Hierarchical Gaussian Mixture
(Figure: HGMM tree — a root GMM whose J=8 components each spawn a child J=8 GMM, shown at "Level 2", "Level 3", and "Level 4")
Goals:
- Efficiency benefits of hierarchical structures like the Octree
- Theoretical benefits of a probabilistic generative model
Talk Overview
- Background
  – Theory of generative modeling for point clouds
- Single-Layer Model (GMMs)
  – GPU-Accelerated Construction Algorithm
  – Benefits: Compact and Data-Parallel
  – Limitations: Scaling with model size, lack of memory coherence
- Hierarchical Models (HGMMs)
  – GPU-Accelerated Construction Algorithm
  – Benefits: Fast and Parallelizable on GPU
  – Application: Registration
8
STATISTICAL / GENERATIVE MODELS
Interpret point cloud data (PCD) as an iid sampling of some unknown latent spatial probabilistic function.
Generative property: the full joint probability space is represented.
Modeling as an MLE Optimization
- Given a set of parameters describing the model, find the parameters that best "explain" the data (Maximum Data Likelihood):

$$\hat{\Theta} \;=\; \operatorname*{argmax}_{\Theta}\; p(\mathcal{Z} \mid \Theta)$$

(model parameters $\Theta$, point data $\mathcal{Z}$)
Parametric Model as a Modified GMM
Interpret point cloud data as an iid sampling from a small number (J << N) of Gaussian and Uniform distributions:
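The slide's equation did not survive extraction; a plausible rendering of the density it describes is a J-component GMM augmented with one uniform component to absorb outliers, using the same π, μ, Σ notation as the EM slides that follow:

$$p(z \mid \Theta) \;=\; \sum_{j=1}^{J} \pi_j\,\mathcal{N}(z \mid \mu_j, \Sigma_j) \;+\; \pi_{J+1}\,\mathcal{U}(z), \qquad \sum_{j=1}^{J+1}\pi_j = 1$$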
GMM for Point Clouds: Intuition
Point samples representing pieces of the same local geometry could be aggregated into clusters with the local geometry encoded inside the covariance of that cluster.
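A concrete version of this intuition (standard covariance analysis, not taken from the slide): for a locally planar patch of surface, the cluster covariance has one small eigenvalue, and the corresponding eigenvector approximates the surface normal,

$$\Sigma \;=\; \sum_{a=1}^{3} \lambda_a\, n_a n_a^{\mathsf T}, \qquad \lambda_3 \ll \lambda_1, \lambda_2 \;\Rightarrow\; n_3 \approx \text{surface normal.}$$

This is the same structure that later lets registration reduce to point-to-plane distances.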
12
SOLVING FOR THE MLE GMM PARAMETERS
Typically done via the Expectation Maximization (EM) Algorithm
EM Algorithm
(Figure: the EM loop over the point cloud — E Step: update point-cluster associations; M Step: update Θ; iterating from Θ_init to Θ_final)
E Step: A Single Point
For each point z_i, we want to find the relative likelihood (expectation) of it having been generated by each cluster.

E Step: Expectation Vector
We calculate the probability of each point with respect to each of the J Gaussian clusters. The expected associations are denoted by the N×J matrix γ.
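For reference, the standard E-step expectation this slide describes (implied rather than shown on the slide; if the uniform outlier component is used, its term also appears in the denominator):

$$\gamma_{ij} \;=\; \frac{\pi_j\,\mathcal{N}(z_i \mid \mu_j, \Sigma_j)}{\sum_{k=1}^{J}\pi_k\,\mathcal{N}(z_i \mid \mu_k, \Sigma_k)}$$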
15
M STEP: CLOSED FORM WEIGHTED SUMS
For the GMM case, the M Step has closed-form solutions given the N×J matrix γ:
"A probabilistic generalization of K-Means Clustering"
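The closed forms referenced here are the textbook GMM M-step updates (the slide's own equations were lost in the page export); each is a γ-weighted sum over points, which maps directly onto GPU parallel reductions:

$$\pi_j = \frac{1}{N}\sum_{i=1}^{N}\gamma_{ij}, \qquad \mu_j = \frac{\sum_i \gamma_{ij}\, z_i}{\sum_i \gamma_{ij}}, \qquad \Sigma_j = \frac{\sum_i \gamma_{ij}\,(z_i - \mu_j)(z_i - \mu_j)^{\mathsf T}}{\sum_i \gamma_{ij}}$$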
GPU Data Parallelism
GMM Model Limitations
- Each point needs to access all J cluster parameters in CUDA (poor memory locality and linear scaling with J)
- N×J expectation matrix is mostly sparse (thus wasted computation)
- Static number of Gaussians that must be set a priori
18
HIERARCHICAL GAUSSIAN MIXTURE
Suppose we restrict J to be only 8 Gaussians:
- The model would fit entirely in shared memory for each CUDA threadblock, removing the need for global memory accesses
- The expectation matrix will be dense (N×8)
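As a concrete illustration, here is a minimal sketch of what such a J=8 E-step kernel could look like. The `Gaussian3` struct (with precomputed precision matrices and normalization constants) and all names are illustrative assumptions, not the speaker's actual implementation:

```cuda
#define J 8

struct Gaussian3 {
    float weight;    // mixing weight pi_j (folded into norm below)
    float mean[3];   // mu_j
    float prec[6];   // upper triangle of Sigma_j^{-1}: xx, xy, xz, yy, yz, zz
    float norm;      // precomputed pi_j / sqrt((2*pi)^3 * |Sigma_j|)
};

__global__ void estep_j8(const float3* __restrict__ points,
                         const Gaussian3* __restrict__ mixture,
                         float* __restrict__ gamma,   // dense N x 8, row-major
                         int n)
{
    __shared__ Gaussian3 s_mix[J];   // entire model fits in shared memory

    // Cooperatively stage the 8 Gaussians once per threadblock
    // (assumes blockDim.x >= 8).
    if (threadIdx.x < J) s_mix[threadIdx.x] = mixture[threadIdx.x];
    __syncthreads();

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float3 z = points[i];
    float lik[J], sum = 0.f;

    for (int j = 0; j < J; ++j) {
        const Gaussian3& g = s_mix[j];
        float dx = z.x - g.mean[0], dy = z.y - g.mean[1], dz = z.z - g.mean[2];
        // Squared Mahalanobis distance via the symmetric precision matrix.
        float m = g.prec[0]*dx*dx + g.prec[3]*dy*dy + g.prec[5]*dz*dz
                + 2.f*(g.prec[1]*dx*dy + g.prec[2]*dx*dz + g.prec[4]*dy*dz);
        lik[j] = g.norm * __expf(-0.5f * m);
        sum += lik[j];
    }
    // Normalize into the expected associations gamma_ij.
    for (int j = 0; j < J; ++j)
        gamma[i * J + j] = lik[j] / fmaxf(sum, 1e-30f);
}
```

Because the 8 Gaussians live in shared memory, each thread touches global memory only for its own point and its 8 output expectations.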
19
HIERARCHICAL GAUSSIAN MIXTURE
After convergence of the J=8 GMM, we can use the N×8 expectation matrix as a partition function: each point is assigned to the cluster of its maximum expectation. Now we have 8 partitions of roughly size N/8.
20
HIERARCHICAL GAUSSIAN MIXTURE
We can now run the algorithm recursively on each partition: each partition contains ~N/8 points that will be modeled as another J=8 GMM. Note that this will produce 64 clusters in total.
21
PARALLEL PARTITIONING USING CUDA
Given each point's max expectation and associated cluster index, we can "invert" this index using parallel scans to group together point IDs having the same partition #:

[0 0 1 0 1 1 1 2 0 2 2 2] ➔ [[0 1 3 8] [2 4 5 6] [7 9 10 11]]  (Cluster 1, Cluster 2, Cluster 3)

Now we can run a 2D CUDA kernel where
- Dimension 1: index into the original point cloud
- Dimension 2: cluster of the parent

e.g. 3 clusters, 12 points, 2 threads/threadblock ➔ grid size of (2, 3)
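One way to realize this inversion with off-the-shelf primitives — a sketch, assuming Thrust; the talk says "parallel scans", and a stable sort-by-key plus a vectorized search yields the same grouping (names are illustrative):

```cpp
#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/sort.h>
#include <thrust/binary_search.h>

void invert_partition(thrust::device_vector<int>& labels,    // per-point cluster index (sorted in place)
                      thrust::device_vector<int>& point_ids, // out: point IDs grouped by cluster
                      thrust::device_vector<int>& offsets,   // out: start of each cluster's group
                      int num_clusters)
{
    int n = static_cast<int>(labels.size());
    point_ids.resize(n);
    thrust::sequence(point_ids.begin(), point_ids.end());    // [0, 1, ..., n-1]

    // Group point IDs by cluster label:
    // [0 0 1 0 1 1 1 2 0 2 2 2] -> point_ids = [0 1 3 8 | 2 4 5 6 | 7 9 10 11]
    thrust::stable_sort_by_key(labels.begin(), labels.end(), point_ids.begin());

    // Find where each cluster's run begins, so a 2D kernel can index
    // (point-within-partition, parent-cluster) directly.
    offsets.resize(num_clusters);
    thrust::device_vector<int> search(num_clusters);
    thrust::sequence(search.begin(), search.end());          // [0, 1, ..., K-1]
    thrust::lower_bound(labels.begin(), labels.end(),
                        search.begin(), search.end(), offsets.begin());
}
```

The `offsets` array plays the role of the per-cluster start indices in the grouped example above.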
22
HGMM COMPLEXITY
Even though we now have 64 clusters, we only need to query 8 clusters for each point (avoiding the computation of all N×J (sparse) expectations). Due to the 2D CUDA grid and indexing structure, this segmentation of the points into 64 clusters has the exact same complexity/speed as the original "simple" J=8 GMM. Thus, we can keep increasing the complexity of the model eightfold while incurring only a linear time penalty.
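Spelled out: every level touches all N points against exactly 8 clusters, so building L levels costs time linear in L while the model grows geometrically,

$$\text{work}(L) \;=\; \sum_{\ell=1}^{L} O(8N) \;=\; O(8NL), \qquad \text{leaf clusters}(L) \;=\; 8^{L}.$$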
23
HGMM ALGORITHM
Small EM algorithms (8 clusters at a time) are recursively performed on increasingly smaller partitions of the point cloud data:
- E Step: Associate points to clusters
- M Step: Update mixture means, covariances, and weights
- Partition Step: Before each recursion step, new point partitions are determined by the maximum-likelihood point-cluster associations from the last E Step
24
HGMM DATA STRUCTURE
(Figure: HGMM tree data structure — a root GMM whose J=8 components each own a child J=8 GMM, shown at "Level 2", "Level 3", and "Level 4")
- Efficiency benefits of hierarchical structures like the Octree
- Theoretical benefits of a probabilistic generative model
25
E Step Performance
27
COMPACTNESS VS FIDELITY
(Plot: Reconstruction Error (PSNR) vs Model Size (kB), with a 20 kB model highlighted)
28
MODELING LARGE POINT CLOUDS
- HGMM Level 6: <12 MB
- Volume created from stochastically sampled Marching Cubes
- Visualization is real-time: ~20 fps on Titan X
- Endeavor snapshots: ~80 GB of point cloud data each
29
ENDEAVOR DATA: BILLIONS OF POINTS
30
APPLICATION: RIGID REGISTRATION
Point-sampled surfaces displaced by some rigid transformation; recover the translation and rotation that best overlap the point clouds.
31
MLE over Space of Rotations, Translations
Goal: Maximize data likelihood over T given some probability model Θ
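Written out — a standard formalization of this goal, not shown verbatim on the slide:

$$\hat{T} \;=\; \operatorname*{argmax}_{T \in SE(3)}\; \sum_{i=1}^{N} \ln p\big(T(z_i) \mid \Theta\big)$$

where Θ is the HGMM built from the reference cloud and T(z_i) applies the candidate rotation and translation to each point of the moving cloud.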
Registration as EM with HGMM
Outdoor Urban Velodyne Data
- Velodyne VLP-16: ~15k pts/frame, ~10 frames/sec
- Frame-to-frame model-building and registration with overlap estimation
HGMM-Based Registration
- Average frame-to-frame error: 0.0960

Robust Point-to-Plane ICP
- Average frame-to-frame error: 0.1519 (best result on libpointmatcher)
35
Speed vs Accuracy Trade-Off
Test: random transformations of point cloud pairs while varying the subsampling rate. Less subsampling yields better accuracy, but slower speeds; the bottom left of the plot is fastest and most accurate. Our proposed methods are red/teal/black.
36
HGMM COMING TO ISAAC
- ~350 fps on Titan Xp
- ~30 fps on Xavier
- Error: ~0.05° yaw (median, 4 Hz updates)
37
DRIVEWORKS (Future Release)
With Velodyne HDL-64E: ~300 FPS on Titan Xp, ~30 FPS on Xavier
38
DNN-BASED STEREO DEPTH MAPS
39
FINAL REMARKS
HGMMs have many nice properties for modeling point clouds:
- Efficient: fast to compute via CUDA/GPU, even scaling to billions of points
- Multi-Level: can model the data distribution well at multiple levels simultaneously
- Probabilistic: allows Bayesian optimization for applications like registration
- Compact and Continuous: no voxels and no aliasing artifacts, easy to transform
40
QUESTIONS?
42
REGISTRATION FROM DNN-BASED STEREO
Noisy point cloud output is well-suited for HGMM representation
43
Frame-to-frame registration from point cloud data only (no depth maps), subsampled to 2000 points, first 100 frames. Histograms of average Euler angle error per frame are shown. (Legend: GMM-Based, ICP-Based, Proposed)
Stanford Lounge Dataset (Kinect)
44
Noise Handling
- Test: random (uniform) noise injected at increasing amounts
- Result: mixture components "stick" to geometrically coherent, dense areas, disregarding areas of noise
46
SAMPLING FOR PROBABILISTIC OCCUPANCY
$$\hat{q} \;=\; M_{\Sigma}\, q + \mu \qquad \forall\, (\mu, \Sigma) \in \Theta$$

(draw q ~ N(0, I) and map it through each component's covariance factor M_Σ, with M_Σ M_Σᵀ = Σ, plus its mean μ)
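A minimal host-side sketch of this sampling transform, taking M_Σ as a Cholesky factor of Σ — illustrative code, not the talk's implementation:

```cpp
#include <Eigen/Dense>
#include <random>

// Draw one sample q_hat ~ N(mu, sigma) from a single HGMM component:
// q_hat = L*q + mu, where L*L^T = sigma and q ~ N(0, I).
Eigen::Vector3f sample_component(const Eigen::Vector3f& mu,
                                 const Eigen::Matrix3f& sigma,
                                 std::mt19937& rng)
{
    std::normal_distribution<float> n01(0.f, 1.f);
    Eigen::Vector3f q(n01(rng), n01(rng), n01(rng));   // q ~ N(0, I)
    Eigen::Matrix3f L = sigma.llt().matrixL();         // Cholesky: L * L^T = sigma
    return L * q + mu;                                 // q_hat ~ N(mu, sigma)
}
```

Sampling every component in proportion to its mixture weight yields the stochastic point sets used for occupancy and Marching Cubes above.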
47
MESHING UNDER NOISE
48
ADAPTIVE MULTI-SCALE
49
MULTI-SCALE MODELING
Multilevel cross-sections can be adaptively chosen for robustness
50
E Step: Parallelized Tree Search
Point-model associations are found through a parallelized adaptive tree search in CUDA. A per-node complexity heuristic determines how deep to descend, but other suitable heuristics are possible.
Adaptive Thresholding Finds the Most Appropriate Scale to Associate Point Data to the Point Cloud Model
51
M-Step: Mahalanobis Estimation
We seek the transformation that maximizes the expected joint log-likelihood of our data and latent associations (binary indicators in {0,1}) with respect to the posterior over our current association estimates. The resulting form (1) is a weighted sum of squared Mahalanobis distances, further reduced to (2) by writing it in terms of sufficient statistics $M_k$. Lastly, covariance eigendecomposition produces an equivalent weighted point-to-plane distance measure (3), which we can solve efficiently with least squares.
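A sketch of the underlying math — standard EM-registration forms consistent with the prose, not copied from the slide. The weighted Mahalanobis objective behind (1) is

$$\hat T \;=\; \operatorname*{argmin}_{T}\; \sum_{i,k} \gamma_{ik}\,\big(T(z_i) - \mu_k\big)^{\mathsf T}\, \Sigma_k^{-1}\, \big(T(z_i) - \mu_k\big),$$

and (2) gathers the per-point terms into per-cluster sufficient statistics $M_k$ (weights and weighted centroids). The exact eigendecomposition identity

$$x^{\mathsf T}\Sigma_k^{-1}x \;=\; \sum_{a=1}^{3} \frac{1}{\lambda_{ka}}\big(n_{ka}^{\mathsf T} x\big)^{2}, \qquad \Sigma_k = \sum_{a} \lambda_{ka}\, n_{ka} n_{ka}^{\mathsf T}$$

is what turns each Mahalanobis term into the three weighted point-to-plane terms of (3): each $(n_{ka}^{\mathsf T} x)^2/\lambda_{ka}$ penalizes displacement along eigenvector $n_{ka}$, so flat clusters (one small eigenvalue) act like point-to-plane constraints, solvable by weighted least squares.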