 
Deformation-Aware 3D Model Embedding and Retrieval
Mikaela Uy¹, Jingwei Huang¹, Minhyuk Sung², Tolga Birdal¹, Leo Guibas¹
¹ Stanford University, ² Adobe Research

Motivation
[Figure: (a) Real Scan, (b) CAD Model, (c) Overlay. Photo taken from [1].]
[1] End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans. Avetisyan et al., ICCV 2019.

Goal
Retrieve:
[Figure: Query Model and Closest Model. Chamfer Distance: 4.45×10^[illegible]]

Goal
Retrieve, then deform:
[Figure: Query Model; Ours Retrieved, Chamfer Distance: 7.09×10^[illegible] (↑); Ours Deformed, Chamfer Distance: 1.71×10^[illegible] (↓)]

Problem
[Figure: Input (3D Model / Scan / Image) → Retrieval from a 3D Model Database (TurboSquid, 3D Warehouse) → Retrieved Model → Deform → Deformed Model]

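Fitting Gap
• Deformations introduce constraints/regularizations that prevent perfect fitting. The constraints ensure plausible variations without losing the original CAD model features.
• Fitting gap $f_{\mathcal{E}}(\mathbf{t}, \mathbf{u}) = e(\mathcal{E}(\mathbf{t}; \mathbf{u}), \mathbf{u})$: the fitting distance ($e$) after deforming a database shape ($\mathbf{t}$) to the query ($\mathbf{u}$) using the deformation function $\mathcal{E}$.
[Figure: Source $\mathbf{t}$ → Deform $\mathcal{E}$ → Deformed Source $\mathcal{E}(\mathbf{t}; \mathbf{u})$, compared with Target $\mathbf{u}$; one example matches (=), another does not (≠).]
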
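The fitting gap can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the slide does not say which Chamfer variant is used, so the symmetric squared-distance form below is an assumption, and `toy_deform` is a hypothetical stand-in for $\mathcal{E}$ (a per-axis rescaling), not the ARAP-style deformation.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    # Mean nearest-neighbour squared distance in both directions.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

def fitting_gap(source, target, deform):
    """f_E(t, u) = e(E(t; u), u): distance left after deforming t toward u."""
    return chamfer_distance(deform(source, target), target)

def toy_deform(source, target):
    """Hypothetical E: rescale each axis so the bounding boxes coincide."""
    src_min, src_max = source.min(axis=0), source.max(axis=0)
    tgt_min, tgt_max = target.min(axis=0), target.max(axis=0)
    scale = (tgt_max - tgt_min) / np.maximum(src_max - src_min, 1e-12)
    return (source - src_min) * scale + tgt_min

rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=(256, 3))                               # database shape
u = rng.uniform(-1.0, 1.0, size=(256, 3)) * np.array([3.0, 1.0, 1.0])  # stretched query
before = chamfer_distance(t, u)        # e(t, u)
after = fitting_gap(t, u, toy_deform)  # f_E(t, u)
```

Because the toy deformation is restricted to axis scaling, `after` stays above zero for shapes that differ in more than their proportions, which is the gap the slide describes.
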
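Properties of Fitting Gap
• Fitting gap measures the distance in the real space.
• Properties of fitting gap:
  1. (Non-negativity) $f_{\mathcal{E}}(\mathbf{t}, \mathbf{u}) \geq 0$
  2. (Identity) $f_{\mathcal{E}}(\mathbf{u}, \mathbf{u}) = 0$
  3. (Asymmetry) $f_{\mathcal{E}}(\mathbf{t}, \mathbf{u}) \neq f_{\mathcal{E}}(\mathbf{u}, \mathbf{t})$
  Not a metric!
[Figure: two Deform examples.]
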
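Egocentric Distance Field $\mathcal{H}(\mathbf{t})$
• $\mathcal{H}$ is source-dependent.
• $\mathcal{H}$ is represented with a positive semi-definite matrix: $\mathcal{H}(\mathbf{t}) \in \mathbb{T}$, the set of positive semi-definite matrices ($\mathcal{H}(\mathbf{t}) \succeq 0$).
[Figure: Egocentric Distance Field around a source, with a Query.]
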
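Distance in Embedding Space
$\varepsilon(\mathbf{t}; \mathbf{u}) = (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))^{\top} \mathcal{H}(\mathbf{t}) (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))$
Properties
1. (Non-negativity) $\varepsilon(\mathbf{t}; \mathbf{u}) \geq 0$
2. (Identity) $\varepsilon(\mathbf{u}; \mathbf{u}) = 0$
3. (Asymmetry) $\varepsilon(\mathbf{t}; \mathbf{u}) \neq \varepsilon(\mathbf{u}; \mathbf{t})$, because
   $\varepsilon(\mathbf{t}; \mathbf{u}) = (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))^{\top} \mathcal{H}(\mathbf{t}) (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))$
   $\varepsilon(\mathbf{u}; \mathbf{t}) = (\mathcal{F}(\mathbf{t}) - \mathcal{F}(\mathbf{u}))^{\top} \mathcal{H}(\mathbf{u}) (\mathcal{F}(\mathbf{t}) - \mathcal{F}(\mathbf{u}))$
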
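The three properties can be checked numerically with a small sketch. The embeddings and matrices here are random placeholders, `random_psd` is a hypothetical helper, and the distance is the plain quadratic form as it appears on the slide; whether a square root is applied on top of it is not visible in the source.

```python
import numpy as np

def embedding_distance(f_t, f_u, h_t):
    """eps(t; u) = (F(u) - F(t))^T H(t) (F(u) - F(t))."""
    diff = f_u - f_t
    return float(diff @ h_t @ diff)

def random_psd(k, rng):
    """Hypothetical helper: A A^T is positive semi-definite for any A."""
    a = rng.normal(size=(k, k))
    return a @ a.T

rng = np.random.default_rng(0)
k = 4                                    # embedding dimension (illustrative)
f_t, f_u = rng.normal(size=k), rng.normal(size=k)
h_t, h_u = random_psd(k, rng), random_psd(k, rng)

eps_tu = embedding_distance(f_t, f_u, h_t)  # t is the source: uses H(t)
eps_ut = embedding_distance(f_u, f_t, h_u)  # roles swapped: uses H(u)
eps_uu = embedding_distance(f_u, f_u, h_u)  # identical shapes
```

The asymmetry comes only from the matrix: replacing both `h_t` and `h_u` with `np.eye(k)` makes the two directions equal, which corresponds to the symmetric case in which the field is fixed to identity.
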
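Deformation-Aware Embedding
• Query branch: $\mathbf{u} \in \mathbb{R}^{N \times 3}$ → PointNet Encoder → MLPs → $\mathcal{F}(\mathbf{u}) \in \mathbb{R}^{k}$
• Source branch: $\mathbf{t} \in \mathbb{R}^{N \times 3}$ → PointNet Encoder → MLPs → $\mathcal{F}(\mathbf{t}) \in \mathbb{R}^{k}$, plus MLPs → $\mathcal{H}(\mathbf{t}) \in \mathbb{T}$ (egocentric distance field)
• The PointNet encoder and the MLPs producing $\mathcal{F}$ are shared between the two branches.
• $f_{\mathcal{E}}(\mathbf{t}, \mathbf{u}) \sim \varepsilon(\mathbf{t}; \mathbf{u}) = (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))^{\top} \mathcal{H}(\mathbf{t}) (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))$
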
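A minimal NumPy sketch of the two-branch forward pass follows. Everything specific in it is an assumption made for illustration: the weights are random and untrained, the layer sizes are arbitrary, the encoder is a bare shared-MLP-plus-max-pool in the spirit of PointNet, and the A Aᵀ construction is just one hypothetical way to keep $\mathcal{H}(\mathbf{t})$ positive semi-definite. The slides do not say how that constraint is enforced.

```python
import numpy as np

def init(sizes, rng):
    """Random (weight, bias) pairs for consecutive layer sizes."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, layers):
    """Linear layers with ReLU in between (none after the last layer)."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

def encode(points, params):
    """Map an (N, 3) point cloud to F in R^k and a PSD matrix H (k, k)."""
    per_point = np.maximum(mlp(points, params["shared"]), 0.0)  # same MLP on every point
    pooled = per_point.max(axis=0)            # order-invariant global feature
    f = mlp(pooled, params["f_head"])         # embedding F(.)
    k = f.shape[0]
    a = mlp(pooled, params["h_head"]).reshape(k, k)
    return f, a @ a.T                         # A A^T is PSD by construction

rng = np.random.default_rng(0)
k = 8
params = {
    "shared": init([3, 32, 64], rng),
    "f_head": init([64, 32, k], rng),
    "h_head": init([64, 32, k * k], rng),
}
t = rng.normal(size=(128, 3))   # source point cloud
u = rng.normal(size=(128, 3))   # query point cloud
f_t, h_t = encode(t, params)    # the source needs both F(t) and H(t)
f_u, _ = encode(u, params)      # the query only needs F(u)
diff = f_u - f_t
eps = float(diff @ h_t @ diff)  # eps(t; u)
```

Only the source branch uses the field, so at retrieval time the per-source outputs can be computed once for the whole database and reused for every query.
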
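Network Training: Margin-loss-based approach
• Network: PNet → MLPs → $\mathcal{F}(\mathbf{u}) \in \mathbb{R}^{k}$; PNet → MLPs → $\mathcal{F}(\mathbf{t}) \in \mathbb{R}^{k}$ and MLPs → $\mathcal{H}(\mathbf{t}) \in \mathbb{T}$, with $\varepsilon(\mathbf{t}; \mathbf{u}) = (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))^{\top} \mathcal{H}(\mathbf{t}) (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))$.
• We precompute the fitting gap ($f_{\mathcal{E}}$).
• For a target $\mathbf{u}$ with candidate sources $\mathbf{Y}_{\mathbf{u}}$:
  Positives: $P_{\mathbf{u}} = \{\mathbf{t} \in \mathbf{Y}_{\mathbf{u}} \mid f_{\mathcal{E}}(\mathbf{t}, \mathbf{u}) \leq \tau_{P}\}$
  Negatives: $N_{\mathbf{u}} = \{\mathbf{t} \in \mathbf{Y}_{\mathbf{u}} \mid f_{\mathcal{E}}(\mathbf{t}, \mathbf{u}) > \tau_{N}\}$
• Loss, up to a normalization constant, with margin $n$ [2]:
  $\sum_{\mathbf{o} \in N_{\mathbf{u}}} \left[ \max_{\mathbf{q} \in P_{\mathbf{u}}} \varepsilon(\mathbf{q}; \mathbf{u}) - \varepsilon(\mathbf{o}; \mathbf{u}) + n \right]_{+}$
[2] FaceNet: A Unified Embedding for Face Recognition and Clustering. Schroff et al., CVPR 2015.
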
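For a single target, the margin loss can be sketched as below. The normalization on the slide is garbled, so averaging over the negatives is an assumption, as are the threshold names `tau_p` and `tau_n` and all numeric values; `eps` and `gap` stand for precomputed embedding distances and fitting gaps over the candidate sources.

```python
import numpy as np

def margin_loss(eps, gap, tau_p, tau_n, margin):
    """Hinge loss for one target u.

    eps: embedding distances eps(t; u) for every candidate source t.
    gap: precomputed fitting gaps f_E(t, u) for the same candidates.
    """
    pos = eps[gap <= tau_p]   # candidates that deform well to u
    neg = eps[gap > tau_n]    # candidates that deform poorly to u
    if pos.size == 0 or neg.size == 0:
        return 0.0
    hardest_pos = pos.max()   # max over positives, as in the slide's loss
    # Each negative should sit at least `margin` farther than the hardest positive.
    return float(np.maximum(hardest_pos - neg + margin, 0.0).mean())

gap = np.array([0.1, 0.2, 0.9, 1.5])   # illustrative fitting gaps
eps = np.array([0.3, 0.5, 0.4, 2.0])   # illustrative embedding distances
loss = margin_loss(eps, gap, tau_p=0.3, tau_n=0.8, margin=0.5)
```

In this toy input the third candidate is a negative that the embedding places closer than the hardest positive, so it contributes 0.5 − 0.4 + 0.5 = 0.6, while the fourth is already far enough away and contributes 0; the mean over the two negatives is 0.3.
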
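Network Training: Regression-based approach
• Network: PNet → MLPs → $\mathcal{F}(\mathbf{u}) \in \mathbb{R}^{k}$; PNet → MLPs → $\mathcal{F}(\mathbf{t}) \in \mathbb{R}^{k}$ and MLPs → $\mathcal{H}(\mathbf{t}) \in \mathbb{T}$, with $\varepsilon(\mathbf{t}; \mathbf{u}) = (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))^{\top} \mathcal{H}(\mathbf{t}) (\mathcal{F}(\mathbf{u}) - \mathcal{F}(\mathbf{t}))$.
• We precompute the fitting gap ($f_{\mathcal{E}}$).
• Target distribution over the candidate sources $\mathbf{Y}_{\mathbf{u}}$ of a target $\mathbf{u}$ [3]:
  $q(\mathbf{t}; \mathbf{u}) = \dfrac{\exp(-f_{\mathcal{E}}^{2}(\mathbf{t}; \mathbf{u}) / 2\tau^{2})}{\sum_{\mathbf{t}' \in \mathbf{Y}_{\mathbf{u}}} \exp(-f_{\mathcal{E}}^{2}(\mathbf{t}'; \mathbf{u}) / 2\tau^{2})}$
• Predicted distribution from the embedding distance:
  $\hat{q}(\mathbf{t}; \mathbf{u}) = \dfrac{e^{-\varepsilon^{2}(\mathbf{t}; \mathbf{u})}}{\sum_{\mathbf{t}' \in \mathbf{Y}_{\mathbf{u}}} e^{-\varepsilon^{2}(\mathbf{t}'; \mathbf{u})}}$
• Loss, averaged over the set $X_{\mathbf{u}}$ of candidate sources used for the target:
  $\dfrac{1}{|X_{\mathbf{u}}|} \sum_{\mathbf{t} \in X_{\mathbf{u}}} \left| \hat{q}(\mathbf{t}; \mathbf{u}) - q(\mathbf{t}; \mathbf{u}) \right|$
[3] Stochastic Neighbor Embedding. Hinton et al., NeurIPS 2002.
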
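The regression objective compares two softmax-normalized distributions over the candidates of one target. The sketch below assumes the standard Stochastic Neighbor Embedding form with squared distances, since the exponents on the slide are partly garbled; the bandwidth `tau`, the toy inputs, and the choice to average over all candidates are likewise assumptions.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    w = np.exp(scores - scores.max())
    return w / w.sum()

def target_distribution(gap, tau):
    """q(t; u): softmax of -f_E^2 / (2 tau^2) over candidate sources."""
    return softmax(-(gap ** 2) / (2.0 * tau ** 2))

def predicted_distribution(eps):
    """q_hat(t; u): softmax of -eps^2 over candidate sources."""
    return softmax(-(eps ** 2))

def regression_loss(eps, gap, tau):
    """Mean absolute difference between predicted and target distributions."""
    diff = predicted_distribution(eps) - target_distribution(gap, tau)
    return float(np.abs(diff).mean())

gap = np.array([0.1, 0.4, 0.9])        # illustrative fitting gaps
tau = 0.5                              # illustrative bandwidth
eps_ideal = gap / (np.sqrt(2.0) * tau) # embedding distances that reproduce q exactly
eps_flat = np.zeros(3)                 # embedding that cannot tell candidates apart
loss_ideal = regression_loss(eps_ideal, gap, tau)
loss_flat = regression_loss(eps_flat, gap, tau)
```

Unlike the margin loss, this objective uses the magnitude of every fitting gap rather than a positive/negative split, so the embedding is pushed to reproduce the relative ordering and spacing of all candidates.
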
Summary
1. Fitting gap
   • Chamfer distance after deformation (ours): $f_{\mathcal{E}}(\mathbf{t}, \mathbf{u}) = e(\mathcal{E}(\mathbf{t}; \mathbf{u}), \mathbf{u})$
   • Chamfer distance before deformation: $e(\mathbf{t}, \mathbf{u})$
2. Egocentric distance field $\mathcal{H}(\mathbf{t})$
   • $\mathcal{H}$ is fixed to identity: symmetric embedding distance
   • $\mathcal{H}$ is source-dependent (ours): asymmetric embedding distance
3. Training approaches:
   • Margin-loss-based
   • Regression-based

Implementation Details
• Training data: ShapeNet [4] (5 categories)
• Backbone architecture: PointNet [5] (sampling points over the meshes)
• Deformation function $\mathcal{E}$: Simplified as-rigid-as-possible (ARAP) [6]
[4] ShapeNet: An Information-Rich 3D Model Repository. Chang et al., arXiv 2015.
[5] PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Qi et al., CVPR 2017.
[6] As-rigid-as-possible surface modeling. Sorkine et al., SGP 2007.

Quantitative Results
(Mean Chamfer Distance ×10^[illegible] for the best of the top 3 retrieval; * = smallest, † = second smallest in each column)
Before Deformation (B.D.): $e(\mathbf{t}, \mathbf{u})$
After Deformation (A.D.): $f_{\mathcal{E}}(\mathbf{t}, \mathbf{u}) = e(\mathcal{E}(\mathbf{t}; \mathbf{u}), \mathbf{u})$

Method        B.D.     A.D.
Ranked CD     3.025*   1.104
AE            3.188†   1.116
CD-Margin     3.321    1.168
CD-Reg        5.057    2.108
Symm-Margin   3.537    1.092
Symm-Reg      4.649    1.657
Ours-Margin   3.587    1.076†
Ours-Reg      3.650    0.984*

Methods:
• Ranked by Chamfer Distance (Ranked CD): select the shape with the smallest B.D. No embedding space.
• Autoencoder (AE): PointNet autoencoder for reconstruction. Use the bottleneck layer as the embedding space.
• CD-Margin, Symm-Margin, Ours-Margin: margin-loss is used. (PointNet encoder is used for the embedding space.)
• CD-Reg, Symm-Reg, Ours-Reg: reg-loss is used. (PointNet encoder is used for the embedding space.)
[Each of the CD, Symm, and Ours slides also shows the labels "Fitting gap $e(\cdot, \cdot)$" and "Egocentric distance field".]

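The "best of the top 3 retrieval" protocol in the caption can be read as follows: for each query, take the three database shapes with the smallest retrieval score, keep the smallest Chamfer distance among them, and average over queries. The sketch below follows that reading; the function name, matrix layout, and toy numbers are assumptions for illustration and are not the reported data.

```python
import numpy as np

def best_of_top_k(score, cd, k=3):
    """Mean over queries of the smallest Chamfer distance among the top-k retrievals.

    score: (num_queries, num_sources) retrieval scores, smaller = retrieved first.
    cd:    (num_queries, num_sources) Chamfer distances (B.D. or A.D.).
    """
    top = np.argsort(score, axis=1)[:, :k]          # indices of the k retrieved sources
    picked = np.take_along_axis(cd, top, axis=1)    # their Chamfer distances
    return float(picked.min(axis=1).mean())

# Toy example with 2 queries and 4 database shapes.
score = np.array([[0.1, 0.2, 0.3, 0.4],
                  [0.9, 0.1, 0.5, 0.2]])
cd = np.array([[5.0, 2.0, 7.0, 1.0],
               [3.0, 4.0, 6.0, 8.0]])
result = best_of_top_k(score, cd, k=3)
```

For the first query the retrieved distances are 5, 2, 7 (best 2); for the second they are 4, 8, 6 (best 4), so `result` is 3.0. Passing fitting gaps as `cd` would give the A.D. column, and passing undeformed Chamfer distances the B.D. column.
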