SLIDE 1 Learning 3D Reconstruction in Function Space
Andreas Geiger
Autonomous Vision Group, MPI for Intelligent Systems and University of Tübingen
June 15, 2020
SLIDE 2 Collaborators
Lars Mescheder Michael Niemeyer Michael Oechsle Sebastian Nowozin Andreas Geiger
SLIDE 3
Our goal is to make intelligent systems more autonomous, robust and safe
SLIDE 4
Intelligent systems interact with a 3D world
SLIDE 5
3D reconstruction is a hard problem
SLIDE 6 1963: Blocks World
Larry Roberts: Machine Perception of Three-Dimensional Solids. PhD Thesis, MIT, 1963.
SLIDE 7 Traditional 3D Reconstruction Pipeline
Input Images Camera Poses Dense Correspondences Depth Maps Depth Map Fusion 3D Reconstruction
SLIDE 8
Humans recognize 3D from a single 2D image
SLIDE 9
SLIDE 10
Can we learn to infer 3D from a 2D image?
SLIDE 11 3D Reconstruction from a 2D Image
3D Reconstruction Input Images Neural Network
SLIDE 12
What is a good output representation?
SLIDE 13 3D Representations
Voxels:
◮ Discretization of 3D space into grid
◮ Easy to process with neural networks
◮ Cubic memory O(n³) ⇒ limited resolution
◮ Manhattan world bias
[Maturana et al., IROS 2015]
SLIDE 14 3D Representations
Points:
◮ Discretization of surface into 3D points
◮ Does not model connectivity / topology
◮ Limited number of points
◮ Global shape description
[Fan et al., CVPR 2017]
SLIDE 15 3D Representations
Meshes:
◮ Discretization into vertices and faces
◮ Limited number of vertices / granularity
◮ Requires class-specific template – or –
◮ Leads to self-intersections
[Groueix et al., CVPR 2018]
SLIDE 16 3D Representations
This work:
◮ Implicit representation ⇒ No discretization
◮ Arbitrary topology & resolution
◮ Low memory footprint
◮ Not restricted to specific class
SLIDE 17 Occupancy Networks
Key Idea:
◮ Do not represent 3D shape explicitly
◮ Instead, consider surface implicitly as decision boundary of a non-linear classifier:

3D Location + Condition (e.g., Image) → Occupancy Probability
Concurrent work:
◮ DeepSDF [Park et al., CVPR 2019]
◮ IM-NET [Chen et al., CVPR 2019]
Mescheder, Oechsle, Niemeyer, Nowozin and Geiger: Occupancy Networks: Learning 3D Reconstruction in Function Space. CVPR, 2019.
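In code, the decision-boundary view is just a binary classifier over 3D coordinates. A minimal NumPy sketch (the paper's network is much deeper and uses conditional batch normalization; the weights, sizes, and names here are purely illustrative):

```python
import numpy as np

def occupancy_network(p, z, params):
    """f_theta(p, z): 3D point p + condition code z -> occupancy probability.

    The surface is the decision boundary {p : f_theta(p, z) = 0.5}.
    """
    W1, b1, W2, b2 = params
    h = np.maximum(W1 @ np.concatenate([p, z]) + b1, 0.0)  # hidden layer, ReLU
    logit = W2 @ h + b2                                    # scalar logit
    return 1.0 / (1.0 + np.exp(-logit[0]))                 # sigmoid

# Random toy weights: condition code of size 8, hidden width 16.
rng = np.random.default_rng(0)
params = (0.1 * rng.standard_normal((16, 3 + 8)), np.zeros(16),
          0.1 * rng.standard_normal((1, 16)), np.zeros(1))
prob = occupancy_network(np.array([0.1, -0.2, 0.3]),
                         rng.standard_normal(8), params)  # in (0, 1)
```

Because the classifier is a continuous function of the query point, the surface can be evaluated at any resolution without ever storing a grid.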
SLIDE 18 Training Objective
Occupancy Network fθ, Variational Occupancy Encoder qψ:

L(θ, ψ) = Σ_{j=1}^{K} BCE(fθ(p_ij, z_i), o_ij) + KL[ qψ(z | (p_ij, o_ij)_{j=1:K}) ‖ p₀(z) ]

◮ K: Randomly sampled 3D points (K = 2048)
◮ BCE: Cross-entropy loss
◮ qψ: Encoder
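The objective combines a per-point classification loss with a KL regularizer on the latent code. A sketch of the two terms in NumPy (assuming a diagonal Gaussian encoder with its standard closed-form KL, as is usual for variational models; function names are illustrative):

```python
import numpy as np

def bce(prob, occ, eps=1e-8):
    """Binary cross-entropy between predicted occupancy probability and 0/1 label."""
    return -(occ * np.log(prob + eps) + (1.0 - occ) * np.log(1.0 - prob + eps))

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def onet_loss(probs, occs, mu, log_var):
    """Sum of BCE over the K sampled points plus the KL term."""
    return np.sum(bce(probs, occs)) + kl_to_standard_normal(mu, log_var)

# K = 4 sampled points for illustration (the paper uses K = 2048).
probs = np.array([0.9, 0.2, 0.7, 0.5])
occs = np.array([1.0, 0.0, 1.0, 1.0])
loss = onet_loss(probs, occs, mu=np.zeros(8), log_var=np.zeros(8))
```

With mu = 0 and log_var = 0 the encoder already matches the prior, so the KL term vanishes and the loss reduces to the classification term alone.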
SLIDE 19 Results
SLIDE 20
Can we also learn about object appearance?
SLIDE 21 Texture Fields
[Diagram: 3D Model + 2D Image → Texture Field → Textured 3D Model]
Oechsle, Mescheder, Niemeyer, Strauss and Geiger: Texture Fields: Learning Texture Representations in Function Space. ICCV, 2019.
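A texture field has the same functional form as an occupancy network, but maps a 3D point (plus a condition code from the shape and image encoders) to an RGB color instead of a probability. A minimal sketch with illustrative toy weights:

```python
import numpy as np

def texture_field(p, z, params):
    """t_theta(p, z): 3D point p + condition code z -> RGB color in (0, 1)^3."""
    W1, b1, W2, b2 = params
    h = np.maximum(W1 @ np.concatenate([p, z]) + b1, 0.0)  # hidden layer, ReLU
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))            # sigmoid per channel

rng = np.random.default_rng(1)
params = (0.1 * rng.standard_normal((16, 3 + 8)), np.zeros(16),
          0.1 * rng.standard_normal((3, 16)), np.zeros(3))
rgb = texture_field(np.array([0.0, 0.5, -0.5]), rng.standard_normal(8), params)
```

Because color is defined at every continuous 3D location, the same field can texture a mesh at any resolution without a UV map.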
SLIDE 22 Texture Fields
[Pipeline diagram, Conditional Model: a Shape Encoder embeds the 3D shape and an Image Encoder embeds the 2D image; 3D points unprojected from a depth map are colored by the Texture Field, and the rendered predicted image is compared to the true image with a reconstruction loss. Color legend distinguishes Conditional / GAN / VAE models.]
SLIDE 23 Texture Fields
[Pipeline diagram, GAN Model: the same shape-conditioned Texture Field, but the predicted image is scored against the true image by a GAN discriminator with an adversarial loss instead of a reconstruction loss.]
SLIDE 24 Texture Fields
[Pipeline diagram, VAE Model: a VAE encoder maps the true image to a latent code regularized by a KL divergence; the Texture Field renders the predicted image, trained with a reconstruction loss.]
SLIDE 25 Representation Power
◮ Ground truth vs. Texture Field vs. Voxelization
SLIDE 26 Results
SLIDE 27
What about object motion?
SLIDE 28 Occupancy Flow
◮ Extending Occupancy Networks to 4D is hard (curse of dimensionality)
◮ Represent shape at t = 0 using a 3D Occupancy Network
◮ Represent motion by temporally and spatially continuous vector field
◮ Relationship between 3D trajectory s and velocity v given by (differentiable) ODE:

∂s(t)/∂t = v(s(t), t)
Niemeyer, Mescheder, Oechsle and Geiger: Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics. ICCV, 2019.
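The trajectory ODE above can be integrated numerically to advect points of the t = 0 shape forward in time. A minimal forward-Euler sketch (the paper uses a learned velocity network and a Neural ODE solver; the constant velocity field here is only for illustration):

```python
import numpy as np

def integrate_trajectory(p0, velocity, t1=1.0, steps=100):
    """Advect point p0 from t=0 to t=t1 under ds/dt = v(s, t), by forward Euler."""
    s = p0.astype(float)
    dt = t1 / steps
    for i in range(steps):
        s = s + dt * velocity(s, i * dt)  # one Euler step
    return s

# Toy velocity field: constant unit velocity along x translates every point.
v = lambda s, t: np.array([1.0, 0.0, 0.0])
end = integrate_trajectory(np.zeros(3), v)  # point moves to approx (1, 0, 0)
```

Because every surface point is transported by the same continuous vector field, correspondences over time come for free, which is exactly the property the results slide highlights.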
SLIDE 29 Occupancy Flow
SLIDE 30 Results
◮ No correspondences needed ⇒ implicitly established by our model!
SLIDE 31
Can we learn implicit representations from images?
SLIDE 32 Architecture
[Architecture diagram: the network maps a 3D point and the input image encoding to an Occupancy Probability (and a color).]
Niemeyer, Mescheder, Oechsle and Geiger: Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision. CVPR, 2020.
SLIDE 33
Forward Pass
(Rendering)
SLIDE 34 Differentiable Volumetric Rendering
Forward Pass:
◮ For all pixels u
◮ Find surface point p̂ along ray w via ray marching and root finding
◮ Evaluate texture field tθ(p̂) at p̂
◮ Insert color tθ(p̂) at pixel u
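The ray marching plus root finding step can be sketched concretely: march coarsely along the ray, detect the first free-to-occupied sign change of fθ − τ, then refine the crossing with a few secant (regula falsi) iterations. A self-contained toy version with a soft unit sphere as the occupancy field (step counts and names are illustrative):

```python
import numpy as np

def find_surface(occ, r0, w, t_max=2.0, n_steps=64, tau=0.5, n_secant=20):
    """Surface point along the ray r0 + t*w: coarse march, then secant refinement.

    Returns the first point where occ crosses the level tau (free -> occupied).
    """
    ts = np.linspace(0.0, t_max, n_steps)
    vals = np.array([occ(r0 + t * w) for t in ts]) - tau
    crossings = np.where((vals[:-1] < 0) & (vals[1:] >= 0))[0]
    if len(crossings) == 0:
        return None  # ray misses the object
    a, b = ts[crossings[0]], ts[crossings[0] + 1]
    for _ in range(n_secant):
        fa, fb = occ(r0 + a * w) - tau, occ(r0 + b * w) - tau
        m = a - fa * (b - a) / (fb - fa)  # secant step, stays inside [a, b]
        if occ(r0 + m * w) - tau < 0:
            a = m
        else:
            b = m
    return r0 + b * w

# Toy occupancy: a soft unit sphere at the origin (occupied inside).
occ = lambda p: 1.0 / (1.0 + np.exp(10.0 * (np.linalg.norm(p) - 1.0)))
p_hat = find_surface(occ, np.array([0.0, 0.0, -2.0]), np.array([0.0, 0.0, 1.0]))
# The ray from (0, 0, -2) towards +z hits the sphere near (0, 0, -1).
```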
SLIDE 35
Backward Pass
(Differentiation)
SLIDE 36 Differentiable Volumetric Rendering
Backward Pass:
◮ Image observation I
◮ Loss: L(Î, I) = Σ_u ‖Î_u − I_u‖
◮ Gradient of loss function:

∂L/∂θ = Σ_u (∂L/∂Î_u) · (∂Î_u/∂θ),   where   ∂Î_u/∂θ = ∂tθ(p̂)/∂θ + (∂tθ(p̂)/∂p̂) · (∂p̂/∂θ)

◮ Differentiation of fθ(p̂) = τ yields:

∂p̂/∂θ = −w (∂fθ(p̂)/∂p̂ · w)⁻¹ ∂fθ(p̂)/∂θ
⇒ Analytic solution; no need to store intermediate results
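The implicit gradient ∂p̂/∂θ can be sanity-checked numerically. A toy sketch with a one-parameter occupancy field, a soft horizontal wall whose level set fθ = τ sits at height z = θ (the partials of f are taken by central finite differences only to keep the example self-contained; in the paper they come from the network):

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Toy occupancy field: occupied below the plane z = theta, free above it.
f = lambda p, theta: sigmoid(4.0 * (theta - p[2]))

def surface_point(theta, r0, w, tau=0.5):
    """Root of f(r0 + d*w, theta) = tau along the ray, via bisection."""
    lo, hi = 0.0, 10.0  # assumes the crossing lies in this interval
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if f(r0 + mid * w, theta) > tau:
            lo = mid  # still occupied: move outward along the ray
        else:
            hi = mid
    return r0 + 0.5 * (lo + hi) * w

def dphat_dtheta(theta, r0, w, eps=1e-5):
    """Implicit gradient from the slide: dp/dtheta = -w (df/dp . w)^-1 df/dtheta."""
    p_hat = surface_point(theta, r0, w)
    df_dp = np.array([(f(p_hat + eps * e, theta) - f(p_hat - eps * e, theta))
                      / (2 * eps) for e in np.eye(3)])
    df_dth = (f(p_hat, theta + eps) - f(p_hat, theta - eps)) / (2 * eps)
    return -w * df_dth / (df_dp @ w)

w = np.array([0.0, 0.0, 1.0])
grad = dphat_dtheta(1.5, np.zeros(3), w)
# Shifting the wall by d(theta) shifts p_hat by the same amount along z,
# so the gradient is (0, 0, 1), matching the analytic formula.
```

This is why no intermediate results need to be stored: the surface point found in the forward pass plus one local evaluation of f suffice for the backward pass.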
SLIDE 37 Results
SLIDE 38
Summary
SLIDE 39 Summary
Neural Implicit Models:
◮ Effective output representation for shape, appearance, material, motion, etc.
◮ No discretization; model arbitrary topology
◮ Can be efficiently learned using 2D supervision
◮ Many applications: reconstruction, view synthesis, segmentation, etc.

Challenges:
◮ Geometry must be extracted in a post-processing step (1-3 s for ONet)
◮ Extension to 4D not straightforward (curse of dimensionality)
◮ Fully connected architecture and global condition lead to oversmooth results
◮ Promising: local features (ConvONet, PIFu), better input encoding (NeRF)
SLIDE 40
Thank you!
http://autonomousvision.github.io