1. 3D Deep Learning on Geometric Forms (Hao Su)

2. Many 3D representations are available. Candidates: multi-view images, depth map, volumetric, polygonal mesh, point cloud, primitive-based CAD models.

3. 3D representation. Candidates: multi-view images, depth map, volumetric, polygonal mesh, point cloud, primitive-based CAD models. Novel view image synthesis [Su et al., ICCV 2015; Dosovitskiy et al., ECCV 2016].

4.-8. 3D representation. Candidates: multi-view images, depth map, volumetric, polygonal mesh, point cloud, primitive-based CAD models (e.g., a chair assembled from cuboids).

9. Two groups of representations.
• Rasterized form (regular grids): multi-view images, depth map, volumetric.
• Geometric form (irregular): polygonal mesh, point cloud, primitive-based CAD models.

10. Extant 3D DNNs work on grid-like representations. Candidates: multi-view images, depth map, volumetric, polygonal mesh, point cloud, primitive-based CAD models.

11.-13. Ideally, a 3D representation should be:
Friendly to learning
• easily formulated as the input/output of a neural network
• fast forward-/backward-propagation
• etc.
Flexible
• can precisely model a great variety of shapes
• etc.
Geometrically manipulable for networks
• geometrically deformable, interpolable, and extrapolable for networks
• convenient to impose structural constraints
• etc.
Others

14. The problem of grid representations, judged on affability to learning, flexibility, and geometric manipulability:
• Multi-view images
• Volumetric occupancy: expensive to compute, $O(N^3)$
• Depth map: cannot model the "back side"

15. Typical artifacts of volumetric reconstruction: missing or extra thin structures. Volumes are hard for the network to rotate / deform / interpolate.

16. Learn to analyze / generate geometric forms?
• Rasterized form (regular grids): multi-view images, depth map, volumetric.
• Geometric form (irregular): polygonal mesh, point cloud, primitive-based CAD models.

17. Outline:
• Motivation
• 3D point cloud / CAD model reconstruction
• 3D point cloud analysis, e.g., segmentation

18. 3D point clouds: a dual formulation of occupancy, compared on flexibility, geometric manipulability, and affability to learning.
• Volumetric occupancy: the Eulerian view, analogous to a probability distribution on a grid.
• Point clouds: the Lagrangian view, analogous to the particles in a particle filter.
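To make this duality concrete, here is a minimal NumPy sketch (an illustration, not code from the talk) that converts between the Eulerian occupancy grid and the Lagrangian point cloud; the grid size, voxel size, and occupancy threshold are arbitrary choices:

```python
import numpy as np

def occupancy_to_points(occ, voxel_size=1.0):
    """Eulerian -> Lagrangian: turn an occupancy grid into the centers of occupied voxels."""
    idx = np.argwhere(occ > 0.5)                 # (k, 3) integer voxel indices
    return (idx + 0.5) * voxel_size              # voxel centers as 3D points

def points_to_occupancy(points, grid_dim, voxel_size=1.0):
    """Lagrangian -> Eulerian: rasterize a point cloud into an occupancy grid."""
    occ = np.zeros((grid_dim,) * 3, dtype=np.float32)
    idx = np.floor(points / voxel_size).astype(int)
    idx = np.clip(idx, 0, grid_dim - 1)
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return occ

# Round trip on a random sparse grid
occ = (np.random.rand(32, 32, 32) > 0.99).astype(np.float32)
pts = occupancy_to_points(occ)
assert np.allclose(occ, points_to_occupancy(pts, grid_dim=32))
```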

19.-20. Results: 3D reconstruction from real images (figures: input image and the reconstructed 3D point cloud).

21. An end-to-end synthesis-for-learning system: a 3D model is rendered into the input image, and points are sampled from it to form the ground-truth point cloud $\{(x'_1, y'_1, z'_1), (x'_2, y'_2, z'_2), \ldots, (x'_n, y'_n, z'_n)\}$.
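One standard way to obtain such a ground-truth point cloud is to sample the surface of the 3D model uniformly by area. The NumPy sketch below (an assumed implementation, not the talk's pipeline) picks triangles with probability proportional to their area and then samples barycentric coordinates:

```python
import numpy as np

def sample_point_cloud(vertices, faces, n_points=1024):
    """Uniformly sample points on a triangle mesh (area-weighted + barycentric coords)."""
    tris = vertices[faces]                                   # (F, 3, 3)
    cross = np.cross(tris[:, 1] - tris[:, 0], tris[:, 2] - tris[:, 0])
    areas = 0.5 * np.linalg.norm(cross, axis=1)              # per-triangle area
    probs = areas / areas.sum()
    choice = np.random.choice(len(faces), size=n_points, p=probs)
    u, v = np.random.rand(n_points, 1), np.random.rand(n_points, 1)
    flip = (u + v) > 1.0                                     # fold samples back into the triangle
    u, v = np.where(flip, 1.0 - u, u), np.where(flip, 1.0 - v, v)
    t = tris[choice]
    return t[:, 0] + u * (t[:, 1] - t[:, 0]) + v * (t[:, 2] - t[:, 0])

# Example: sample 1024 points from the surface of a unit tetrahedron
verts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float64)
faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])
pts = sample_point_cloud(verts, faces, n_points=1024)        # (1024, 3)
```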

22.-24. An end-to-end learning system: a deep neural network maps the image to a predicted point set $\{(x_1, y_1, z_1), (x_2, y_2, z_2), \ldots, (x_n, y_n, z_n)\}$, and a point-set distance compares it with the ground-truth point cloud $\{(x'_1, y'_1, z'_1), \ldots, (x'_n, y'_n, z'_n)\}$.

25.-26. Network architecture: vanilla version. A convolutional encoder maps the input image to a shape embedding; a fully connected layer serves as the predictor, as in a standard classification network, and independently regresses the $n \times 3$ coordinates of the point set from the embedding.
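A minimal PyTorch sketch of this vanilla design follows; the layer sizes, the 128 x 128 input resolution, and n = 1024 points are assumptions for illustration rather than the talk's exact configuration:

```python
import torch
import torch.nn as nn

class VanillaPointSetNet(nn.Module):
    """Conv encoder -> shape embedding -> fully connected predictor of an n x 3 point set."""
    def __init__(self, n_points=1024, embed_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(                      # input: (B, 3, 128, 128)
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(256, embed_dim), nn.ReLU(),
        )
        # Fully connected predictor: independently regresses n * 3 numbers.
        self.predictor = nn.Linear(embed_dim, n_points * 3)
        self.n_points = n_points

    def forward(self, img):
        emb = self.encoder(img)                            # (B, embed_dim)
        return self.predictor(emb).view(-1, self.n_points, 3)  # (B, n, 3)

# points = VanillaPointSetNet()(torch.randn(2, 3, 128, 128))  # -> (2, 1024, 3)
```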

27. Natural statistics of geometry:
• Many objects, especially man-made objects, contain large smooth surfaces
• Deconvolution can generate locally smooth textures for images

28.-30. Network architecture: output from a deconv branch (two-branch version). The encoder feeds two predictors whose outputs are merged by set union: a deconv branch producing a 3-channel map of XYZ coordinates, i.e. $n_1 = 24 \times 32 = 768$ points, and a fully connected branch producing $n_2 = 256$ points. The union is a row-wise concatenation $C = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}$ with $C_1 \in \mathbb{R}^{n_1 \times 3}$ and $C_2 \in \mathbb{R}^{n_2 \times 3}$.
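A hedged PyTorch sketch of the two-branch predictor is given below; the 768 + 256 point split follows the slide, while the specific layer shapes and the 6 x 8 seed map are assumptions:

```python
import torch
import torch.nn as nn

class TwoBranchPredictor(nn.Module):
    """Deconv branch (smooth surfaces) + fully connected branch (intricate structures),
    concatenated into one point set of 768 + 256 = 1024 points."""
    def __init__(self, embed_dim=512, n_fc_points=256):
        super().__init__()
        # Deconv branch: embedding -> 3-channel XYZ map of size 24 x 32 = 768 points.
        self.fc_seed = nn.Linear(embed_dim, 256 * 6 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),  # 12 x 16
            nn.ConvTranspose2d(128, 3, 4, stride=2, padding=1),               # 24 x 32
        )
        # Fully connected branch: embedding -> 256 x 3 coordinates.
        self.fc_branch = nn.Linear(embed_dim, n_fc_points * 3)
        self.n_fc_points = n_fc_points

    def forward(self, emb):                                  # emb: (B, embed_dim)
        seed = self.fc_seed(emb).view(-1, 256, 6, 8)
        xyz_map = self.deconv(seed)                          # (B, 3, 24, 32)
        pts_deconv = xyz_map.flatten(2).transpose(1, 2)      # (B, 768, 3)
        pts_fc = self.fc_branch(emb).view(-1, self.n_fc_points, 3)  # (B, 256, 3)
        return torch.cat([pts_deconv, pts_fc], dim=1)        # (B, 1024, 3)
```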

31. Network architecture: the role of the two branches.
• Blue: the deconv branch captures large, consistent, smooth structures.
• Red: the fully connected branch flexibly reconstructs intricate structures.

32. An end-to-end learning system (recap): the network's predicted point set $\{(x_i, y_i, z_i)\}_{i=1}^{n}$ is compared with the ground-truth point cloud $\{(x'_i, y'_i, z'_i)\}_{i=1}^{n}$ through a point-set loss.
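Putting the pieces together, one training step might look like the following sketch; `VanillaPointSetNet` refers to the earlier illustrative model (an assumption, not the talk's code), and the loss here is a simple Chamfer distance built from torch.cdist:

```python
import torch

def chamfer_loss(pred, gt):
    """Symmetric Chamfer distance between batched point sets of shape (B, n, 3)."""
    d = torch.cdist(pred, gt)                        # (B, n_pred, n_gt) pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

net = VanillaPointSetNet()                           # illustrative model from the earlier sketch
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

images = torch.randn(8, 3, 128, 128)                 # stand-in batch of input images
gt_points = torch.randn(8, 1024, 3)                  # stand-in ground-truth point clouds

pred = net(images)                                   # (8, 1024, 3)
loss = chamfer_loss(pred, gt_points)
opt.zero_grad()
loss.backward()
opt.step()
```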

33. Distance metrics between point sets: given two sets of points, measure their discrepancy.

34. Common distance metrics:
• Worst case: Hausdorff distance (HD)
• Average case: Chamfer distance (CD)
• Optimal case: Earth Mover's distance (EMD)

35. Common distance metrics, worst case: the Hausdorff distance (HD),
$d_{HD}(S_1, S_2) = \max\left\{ \max_{x_i \in S_1} \min_{y_j \in S_2} \lVert x_i - y_j \rVert, \; \max_{y_j \in S_2} \min_{x_i \in S_1} \lVert x_i - y_j \rVert \right\}.$
A single farthest pair determines the distance; in other words, it is not robust to outliers.

36. Common distance metrics, average case: the Chamfer distance (CD), which averages, over all points and in both directions, the distance from each point to its nearest neighbor in the other set.
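A common written form of the Chamfer distance (following the point-set generation literature; whether the distances are squared varies between papers) is:

```latex
d_{CD}(S_1, S_2) \;=\; \sum_{x \in S_1} \min_{y \in S_2} \lVert x - y \rVert_2^2
\;+\; \sum_{y \in S_2} \min_{x \in S_1} \lVert x - y \rVert_2^2
```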

37. Common distance metrics, optimal case: the Earth Mover's distance (EMD), which solves an optimal transportation (bipartite matching) problem.
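For intuition, here is a small NumPy/SciPy sketch (an illustration; practical implementations run approximate versions on the GPU) that computes both distances for equal-sized point sets:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def chamfer_distance(s1, s2):
    """Sum of squared nearest-neighbor distances in both directions."""
    d = cdist(s1, s2) ** 2                       # (n1, n2) squared pairwise distances
    return d.min(axis=1).sum() + d.min(axis=0).sum()

def earth_mover_distance(s1, s2):
    """Cost of the optimal bipartite matching (assumes |s1| == |s2|)."""
    d = cdist(s1, s2)
    rows, cols = linear_sum_assignment(d)        # Hungarian algorithm
    return d[rows, cols].sum()

s1 = np.random.rand(256, 3)
s2 = np.random.rand(256, 3)
print(chamfer_distance(s1, s2), earth_mover_distance(s1, s2))
```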

38.-39. Required properties of distance metrics.
• Geometric requirement: induces a nice shape space; in other words, a good metric should reflect natural shape differences.
• Computational requirement: defines a loss that is numerically easy to optimize.

40.-41. How does the distance metric affect the learned geometry? A fundamental issue: there is always uncertainty in the prediction, due to
• limited network capacity
• insufficient training data
• the inherent ambiguity of the ground truth in 2D-to-3D dimension lifting
• etc.
Through loss minimization, the network tends to predict a "mean shape" that averages out the uncertainty in geometry.

42.-43. Mean shapes are affected by the distance metric; the mean shape carries characteristics of the metric:
$\bar{x} = \operatorname*{argmin}_{x} \; \mathbb{E}_{s \sim S}\left[ d(x, s) \right]$
(figures: input, EMD mean, and Chamfer mean, for a continuous hidden variable (radius) and a discrete hidden variable (add-on location))

44. Comparison of predictions by CD versus EMD (figure: input, Chamfer result, EMD result).
