
Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations

Vincent Sitzmann, Michael Zollhöfer, Gordon Wetzstein


  1. Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations. Vincent Sitzmann, Michael Zollhöfer, Gordon Wetzstein.

  2. Figure labels: single image, camera pose, intrinsics; novel views, surface normals.

  3. Self-supervised Scene Representation Learning. Latent 3D scenes {…}; observations {…}, each an image + pose & intrinsics. What can we learn about latent 3D scenes from observations? Vision: Learn rich representations just by watching video!

  4. Self-supervised Scene Representation Learning. Diagram: observations → model → re-rendered observations, compared against the observations with an image loss.

  5. Self-supervised Scene Representation Learning. Diagram: observations → neural scene representation → re-rendered observations, with an image loss. Neural scene representation: persistent feature representation of the scene.

  6. Self-supervised Scene Representation Learning. Diagram: observations → neural scene representation → neural renderer → re-rendered observations, with an image loss. Neural scene representation: persistent feature representation of the scene. Neural renderer: render from different camera perspectives.

  7. 2D baseline: Autoencoder. Diagram: observations → conv encoder → latent code + output pose → conv decoder → re-rendered observations, with an image loss.

  8. 2D baseline: Autoencoder. Diagram: latent code + output pose → conv decoder → re-rendered observations, with an image loss.

  9. Doesn’t capture 3D properties of scenes. Trained on ~2500 ShapeNet cars with 50 observations each. Need 3D inductive bias!

  10. Related Work. Axes: 3D inductive bias / 3D structure; self-supervised with posed images. Scene Representation Learning: Tatarchenko et al., 2015; Worrall et al., 2017; Eslami et al., 2018; … 2D Generative Models: Goodfellow et al., 2014; Kingma et al., 2013; Kingma et al., 2018; … 3D Computer Vision: Choy et al., 2016; Huang et al., 2018; Park et al., 2018; … Voxel-based Representations: Sitzmann et al., 2019; Lombardi et al., 2019; Phuoc et al., 2019; … Voxel-based representations are memory inefficient (O(n^3)), don’t parameterize scene surfaces smoothly, and make generalization hard.
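The memory point about voxel-based representations can be made concrete with a little arithmetic. The sketch below assumes a dense grid that stores a fixed number of 4-byte values per cell; the function name and the channel count are illustrative, not taken from the slides.

```python
def voxel_grid_bytes(resolution, channels=1, bytes_per_value=4):
    """Bytes needed by a dense resolution^3 grid (illustrative assumption:
    every cell stores `channels` values of `bytes_per_value` bytes each)."""
    return resolution ** 3 * channels * bytes_per_value

# Doubling the side length multiplies the memory by eight.
for n in (32, 64, 128, 256):
    print(n, voxel_grid_bytes(n) / 2 ** 20, "MiB")
```

Doubling the resolution along each axis costs eight times the memory, regardless of how much of the grid is empty space.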

  11. Scene Representation Networks. Diagram: observations → neural scene representation → neural renderer → re-rendered observations, with an image loss.

  12. Scene Representation Networks. Diagram: observations → neural scene representation → neural renderer → re-rendered observations, with an image loss.

  13. Diagram: a scene made up of free space and objects, with two labeled points.

  14. Model scene as function Φ that maps coordinates to features, Φ: ℝ^3 → ℝ^n. Diagram: points in free space and on objects, each mapped to a feature vector.

  15. Scene Representation Network parameterizes Φ as MLP, Φ: ℝ^3 → ℝ^n. Diagram: points in free space and on objects are passed through the Scene Representation Network, each yielding a feature vector.

  16. Scene Representation Network parameterizes Φ as MLP, Φ: ℝ^3 → ℝ^n. Can sample anywhere, at arbitrary resolutions. Parameterizes scene surfaces smoothly. Memory scales with scene complexity.
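A minimal sketch of the idea on slides 14 to 16: a small fully connected network that takes one 3D world coordinate and returns a feature vector. The layer sizes, the ReLU nonlinearity, the random initialization, and the names `init_mlp` and `phi` are assumptions made for illustration; the slides do not specify the architecture.

```python
import math
import random

def init_mlp(sizes, seed=0):
    """Random weights and zero biases for a fully connected network (illustrative only)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def phi(layers, x):
    """Evaluate the network at one 3D world coordinate x and return its feature vector."""
    h = list(x)
    for i, (weights, biases) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:  # ReLU on hidden layers only
            h = [max(0.0, v) for v in h]
    return h

# Hypothetical sizes: 3D input, two hidden layers of 16 units, 8 feature channels.
srn = init_mlp([3, 16, 16, 8])
feature = phi(srn, (0.1, -0.4, 2.0))  # any continuous coordinate can be queried
```

Because the scene is a function rather than a grid, there is no fixed sampling resolution: the same `phi` can be queried at any coordinate, and its storage cost is the number of network weights.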

  17. Scene Representation Networks. Diagram: observations → neural scene representation (Φ: ℝ^3 → ℝ^n) → neural renderer → re-rendered observations, with an image loss.

  18. Scene Representation Networks. Diagram: observations → neural scene representation (Φ: ℝ^3 → ℝ^n) → neural renderer → re-rendered observations, with an image loss.

  19. Neural Renderer. Diagram: free space with two labeled points.

  20. Neural Renderer.

  21. Neural Renderer.

  22. Neural Renderer Step 1: Intersection Testing. Idea: march along the ray until arriving at a surface.

  23. Neural Renderer Step 1: Intersection Testing. Diagram: the world coordinates of the current point on the ray are passed to the scene representation Φ: ℝ^3 → ℝ^n, which returns a feature vector.

  24. Neural Renderer Step 1: Intersection Testing. Diagram: world coordinates → scene representation Φ: ℝ^3 → ℝ^n → feature vector → Ray Marching LSTM → step length to the next point on the ray. Feasible step length: distance to closest scene surface.

  25. Neural Renderer Step 1: Intersection Testing. Iteration 0

  26. Neural Renderer Step 1: Intersection Testing. Iteration 1

  27. Neural Renderer Step 1: Intersection Testing. Iteration 2

  28. Neural Renderer Step 1: Intersection Testing. Iteration 3

  29. Neural Renderer Step 1: Intersection Testing. Iteration 4

  30. Neural Renderer Step 1: Intersection Testing. Iteration …

  31. Neural Renderer Step 1: Intersection Testing.
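The marching loop of Step 1 can be sketched as follows. In the slides the step length is predicted by the Ray Marching LSTM from features of Φ; this sketch swaps in a hand-written distance function for a single sphere so that it runs on its own. The sphere, the stopping threshold `eps`, and the names `distance_to_sphere` and `march` are assumptions for illustration.

```python
import math

def distance_to_sphere(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Distance from point p to the surface of a hypothetical sphere (stand-in scene)."""
    return math.dist(p, center) - radius

def march(origin, direction, distance_fn, max_steps=10, eps=1e-4):
    """Step along a ray; each step length is the distance to the closest surface."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    depth = 0.0
    for step in range(max_steps):
        point = [o + depth * u for o, u in zip(origin, unit)]
        step_length = distance_fn(point)
        if step_length < eps:
            return depth, step  # arrived at the surface
        depth += step_length
    return depth, max_steps  # step budget exhausted without reaching a surface

# Camera at the origin looking down +z at a unit sphere centered at z = 3.
depth, steps = march((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), distance_to_sphere)
print(depth, steps)  # 2.0 1
```

Stepping by the distance to the closest surface can never overshoot that surface, which is why the slide calls it a feasible step length.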

  32. Neural Renderer Step 2: Color Generation. Diagram: scene representation Φ: ℝ^3 → ℝ^n → color MLP.

  33. Can now train end-to-end with posed images only! Diagram: observations → neural scene representation (Φ: ℝ^3 → ℝ^n) → neural renderer → re-rendered observations, with an image loss.

  34. Generalizing across a class of scenes

  35. Each scene represented by its own SRN. Each scene j has its own parameters φ_j ∈ ℝ^l (four example scenes shown).

  36. Each scene represented by its own SRN. Each scene j has parameters φ_j ∈ ℝ^l. The φ_j live on a k-dimensional subspace of ℝ^l, k < l.

  37. Each scene represented by its own SRN. Represent each scene with a low-dimensional embedding: scene j has embedding z_j ∈ ℝ^k and parameters φ_j ∈ ℝ^l.

  38. Each scene represented by its own SRN. Hypernetwork Ψ: ℝ^k → ℝ^l, z_j ↦ Ψ(z_j) = φ_j, maps the embedding z_j ∈ ℝ^k of each scene to its parameters φ_j ∈ ℝ^l.
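A minimal sketch of the hypernetwork idea: one shared map turns a short per-scene embedding into the full parameter vector of that scene's network. Here both the hypernetwork and the scene network are single linear layers, and all sizes, values, and names (`psi`, `srn`, `z_car_a`, `z_car_b`) are illustrative assumptions, not the architecture from the talk.

```python
import random

K = 4                                # embedding size k (hypothetical)
IN_DIM, OUT_DIM = 3, 2               # toy scene network: 3D coordinate -> 2 features
L = OUT_DIM * IN_DIM + OUT_DIM       # number of scene-network parameters l (weights + biases)

rng = random.Random(0)
# The hypernetwork is shared across scenes; here it is one linear map from R^k to R^l.
psi_weights = [[rng.uniform(-0.5, 0.5) for _ in range(K)] for _ in range(L)]

def psi(z):
    """Hypernetwork: embedding z in R^k -> scene-network parameters in R^l."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in psi_weights]

def srn(params, x):
    """Evaluate the toy scene network at coordinate x with parameters produced by psi."""
    n_w = OUT_DIM * IN_DIM
    weights = [params[r * IN_DIM:(r + 1) * IN_DIM] for r in range(OUT_DIM)]
    biases = params[n_w:]
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]

# One embedding per scene; the scenes share psi and differ only in z.
z_car_a = [0.2, -0.1, 0.7, 0.0]
z_car_b = [-0.3, 0.5, 0.1, 0.9]
params_a, params_b = psi(z_car_a), psi(z_car_b)
features_a = srn(params_a, (0.0, 0.5, 1.0))
```

With this split, what is stored per scene is only the k numbers of its embedding, while everything learned about the object class sits in the shared hypernetwork.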

  39. Results

  40. Novel View Synthesis – Baseline Comparison. ShapeNet v2 – single-shot reconstruction of objects in held-out test set. Training: ShapeNet cars / chairs; 50 observations per object. Testing: cars / chairs from unseen test set; single observation! Panels: input pose; Tatarchenko et al. 2015; Worrall et al. 2017; deterministic GQN, adapted (Eslami et al. 2018); SRNs (ours).

  41. Novel View Synthesis – SRN Output. ShapeNet v2 – single-shot reconstruction of objects in held-out test set. Panel: input pose.

  42. Sampling at arbitrary resolutions. Panels: RGB and surface normals at 32x32, 64x64, 128x128, 256x256, and 512x512.

  43. Generalization to unseen camera poses. Panels: camera close-up, camera roll. Row: SRNs.

  44. Generalization to unseen camera poses. Panels: camera close-up, camera roll. Rows: SRNs; Tatarchenko et al. (doesn’t reconstruct geometry in either panel).

  45. Latent code interpolation. Panels: RGB, surface normals.

  46. Latent code interpolation. Panels: RGB, surface normals.
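Latent code interpolation can be sketched as blending two scene embeddings and decoding each intermediate code. The slides do not say which interpolation scheme is used; the sketch assumes plain linear interpolation, and the embeddings and the name `interpolate` are made up for illustration.

```python
def interpolate(z_a, z_b, t):
    """Linear blend of two embeddings: t = 0 gives z_a, t = 1 gives z_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

# Hypothetical 3-dimensional embeddings of two scenes.
z_start = [0.0, 1.0, -2.0]
z_end = [1.0, 3.0, 2.0]

# Five evenly spaced codes from z_start to z_end; each one would be decoded
# into network parameters and rendered to produce one frame of the interpolation.
path = [interpolate(z_start, z_end, i / 4) for i in range(5)]
print(path[2])  # [0.5, 2.0, 0.0]
```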

  47. Can represent room-scale scenes, but aren’t compositional. Training set novel-view synthesis on GQN rooms (Eslami et al. 2018) with ShapeNet cars, 50 observations. Work-in-progress: Compositional SRNs generalize to unseen numbers of objects!

  48. Scene Representation Networks: Continuous 3D-structure-aware Neural Scene Representations. Vincent Sitzmann, Michael Zollhöfer, Gordon Wetzstein. Find me at Poster #71! vsitzmann.github.io, @vincesitzmann. Looking for research positions in scene representation learning. Panels: single-shot reconstruction, interpolation, camera pose extrapolation.
