SLIDE 1: 4DVideo & Dynamic Scenes

Torsten Sattler and Martin Oswald, Spring 2018
Institute of Visual Computing

SLIDE 2: 3D Reconstruction over Time?

SLIDE 3: Motivation: Dynamic Scenes

SLIDE 4: Applications

  • Free Viewpoint TV
  • Markerless Motion Capture
  • Special Effects for Movies, e.g. the "Bullet Time" effect from "The Matrix"
  • Motion Analysis for Sports
  • Content Creation for Movies and Games / VR / AR

SLIDE 5: Overview

Pipeline: input videos → calibrate and synchronize → 3D reconstruction (3D representation of the scene) → video-based rendering → free viewpoint video content
SLIDE 6: Surface Representations

Explicit Representation: the surface is directly parametrized → (n-1)-dimensional space, e.g. mesh, spline function/NURBS
  + efficient storage, small data amounts
  − topology changes are difficult
  − mathematical operations (e.g. set union/intersection, average) are complex

Implicit Representation: an embedding volume is labeled (indicator function / signed distance to surface) → n-dimensional space, e.g. voxel grid, octree, tetrahedra
  + topology changes are trivial
  + watertight, closed manifold
  + mathematical operations (e.g. set union/intersection, average) are simple
  − expensive storage, large data amounts
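The claim that set operations are simple for implicit representations can be made concrete with signed distance functions. A toy 2D sketch (illustrative example, not from the slides): each shape is a function that is negative inside and positive outside, and union/intersection reduce to pointwise min/max.

```python
# Toy illustration: implicit (SDF) representations make set operations trivial.
# Two circles in 2D, each represented by a signed distance function
# (negative inside, positive outside).

import math

def circle_sdf(cx, cy, r):
    """Signed distance to a circle of radius r centered at (cx, cy)."""
    return lambda x, y: math.hypot(x - cx, y - cy) - r

def union(f, g):
    """Union of two implicit shapes is just a pointwise minimum."""
    return lambda x, y: min(f(x, y), g(x, y))

def intersection(f, g):
    """Intersection of two implicit shapes is a pointwise maximum."""
    return lambda x, y: max(f(x, y), g(x, y))

a = circle_sdf(0.0, 0.0, 1.0)
b = circle_sdf(1.5, 0.0, 1.0)

print(a(0.0, 0.0))                    # negative: inside circle a
print(union(a, b)(1.5, 0.0))          # negative: inside the union
print(intersection(a, b)(0.75, 0.0))  # negative: inside the overlap
```

The same operations on an explicit mesh would require computing surface intersections and re-triangulating, which is exactly the complexity the slide refers to.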
SLIDE 7: Overview

  • Model-based Approaches
  • Spatio-temporal 3D Reconstruction
  • Scene Flow Approaches
  • Hybrid Approaches
SLIDE 8: Marker-less Motion Capture

(Ballan and Cortelazzo, 3DPVT‘08)

SLIDE 9: Obtaining a 3D Model

Home-made 3D body scanner [Ballan et al. ‘06]: captured images → silhouette + photoconsistency reconstruction → wavelet-based texture mapping (wireframe and textured models)

SLIDE 10: Recording Studio

  • 4 cameras
  • 21 fps
  • 0.8 Mpixels

SLIDE 11: Marker-less Motion Capture

  • Start with known pose
  • Track pose over time with all cameras
  • Optical flow & silhouette constraints

SLIDE 12: Interactions

More challenging situations: 2 people + an object (83 DOF). The approach fails on close contacts.

SLIDE 13: Model-based Tracking

(Vlasic et al., TOG‘08)

SLIDE 14: Face Tracking (Model-based)

(Cao et al., SIGGRAPH‘15)

SLIDE 15: A More Challenging Problem

Motion capture of interacting hands is much more challenging than tracking closely interacting people:
  • Self-similarities, similar color
  • Occlusions
  • Collisions

SLIDE 16: Motion Capture of Hands

Input: scene recorded from multiple viewing angles + template model + kinematic structure

(Ballan et al., ECCV‘12)

SLIDE 17: Motion Capture of Hands

Input: the recorded scene. Output: scene motion (joint angles and positions).

SLIDE 18: Motion Capture of Hands

Input: the recorded scene. Output: full 3D geometry of the scene.

SLIDE 19: Collisions and Self-Intersections

  • Collisions and self-intersections (colliding faces) arise from overfitting and lack of information
  • Penalized using a local distance field

SLIDE 20: Results

SLIDE 21: How to Handle Additional Objects

  • Scan the object and add its 3D model to the virtual scene as a unique bone
  • Collisions and occlusions are handled transparently
  • Use the algorithm as it is!

SLIDE 22: Results

SLIDE 23: Results

SLIDE 24: Overview

  • Model-based Approaches
  • Spatio-temporal 3D Reconstruction
  • Scene Flow Approaches
  • Hybrid Approaches
SLIDE 25: Shape from Silhouettes

  • Unreliable silhouettes: do not make decisions about their location
  • Do sensor fusion: use all image information simultaneously

(Franco and Boyer, ICCV‘05)

SLIDE 26: Bayesian Formulation

  • Idea: find the content of the scene from images, as a probability grid
  • Modeling the forward problem (explaining the images given the grid state) is easy → sensor model
  • Bayesian inference: formulate the initial inverse problem from the sensor model
  • Simplification for tractability: independent analysis and processing of voxels

(Franco and Boyer, ICCV‘05)

SLIDE 27: Modeling

  • I: color information in images
  • B: background color models
  • F: silhouette detection variable (0 or 1), hidden
  • GX: occupancy at voxel X (0 or 1)

Sensor model: image likelihood and silhouette likelihood. Inference: per-voxel posterior over the occupancy grid GX.

(Franco and Boyer, ICCV‘05)
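The per-voxel inference can be sketched in a few lines of Python. This is a toy Bayesian fusion under the slide's conditional-independence simplification; the detection probabilities and prior below are illustrative values, not the sensor model of Franco and Boyer.

```python
# Toy per-voxel occupancy inference from silhouette observations.
# Assumes cameras are conditionally independent given the occupancy G
# (the tractability simplification from the slide). The probabilities
# p_det, p_false and the prior are made-up illustrative numbers.

def occupancy_posterior(observations, p_det=0.9, p_false=0.1, prior=0.5):
    """P(G=1 | silhouette observations from all cameras).

    observations: list of 0/1 silhouette detections, one per camera.
    p_det:   P(detection | voxel occupied)
    p_false: P(detection | voxel free)
    """
    like_occ, like_free = prior, 1.0 - prior
    for obs in observations:
        like_occ *= p_det if obs else (1.0 - p_det)
        like_free *= p_false if obs else (1.0 - p_false)
    return like_occ / (like_occ + like_free)

print(occupancy_posterior([1, 1, 1, 1]))  # all cameras agree -> near 1
print(occupancy_posterior([1, 1, 0, 1]))  # one unreliable silhouette -> still high
print(occupancy_posterior([0, 0, 0, 0]))  # no detections -> near 0
```

Note how a single missed silhouette does not force a hard decision about the voxel, which is exactly the motivation on the previous slide: unreliable silhouettes are fused probabilistically rather than intersected.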

SLIDE 28: Visualization

(Franco and Boyer, ICCV‘05)

SLIDE 29: Variational Volumetric Reconstruction

(Oswald et al., 4DMOD‘13)

  • 3D reconstruction: interior/exterior labeling with a weighted TV + data term [Kolev et al. IJCV’09]
  • 4D reconstruction: interior/exterior labeling with a data term plus spatial and temporal regularization terms
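The weighted-TV labeling above can be written as a convex energy over an interior/exterior labeling u. The generic form below follows the spirit of Kolev et al.; the symbols (edge weight g, data term ρ, weights λ, μ) are the customary ones, not necessarily the papers' exact notation.

```latex
% Convex relaxation for space-time interior/exterior labeling
% (generic form; g is an edge indicator, rho a photoconsistency
%  data term, lambda and mu trade-off weights -- notation assumed).
E(u) = \int_{V \times T} g\,\lvert \nabla_{\!x} u \rvert \, dx\,dt
     \;+\; \mu \int_{V \times T} \lvert \partial_t u \rvert \, dx\,dt
     \;+\; \lambda \int_{V \times T} \rho\, u \, dx\,dt,
\qquad u : V \times T \to [0,1]
```

The first term is the spatial regularizer (weighted TV), the second couples the labeling over time, and the third attaches the labeling to the image evidence.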

SLIDE 30: Variational Volumetric Reconstruction

(Oswald et al., BMVC‘14)

SLIDE 31: Advanced: Handling Thin Structures

(Oswald et al., ECCV‘14)

Comparison: input vs. no connectivity (Jancosek and Pajdla CVPR‘11; Furukawa et al. PAMI‘11; Oswald and Cremers 4DMOD‘13) vs. generalized connectivity (Oswald et al., ECCV‘14)

SLIDE 32: Advanced: Handling Thin Structures

(Oswald et al., ECCV‘14)

SLIDE 33: Overview

  • Model-based Approaches
  • Spatio-temporal 3D Reconstruction
  • Scene Flow Approaches
  • Hybrid Approaches
SLIDE 34: Scene Flow Approaches

(Guan et al., CVPR‘10)
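Scene flow is the dense 3D motion field of scene points; its projection into a camera yields the familiar 2D optical flow. A toy sketch for a single tracked point (pinhole camera with an assumed focal length; the numbers are illustrative and not from any of the cited papers):

```python
# Toy scene flow: 3D motion of a point between two frames, and the
# 2D optical flow it induces under a pinhole camera.
# Focal length and point positions are made-up illustrative values.

def project(p, f=500.0):
    """Pinhole projection of a 3D point in camera coordinates (z > 0)."""
    x, y, z = p
    return (f * x / z, f * y / z)

def scene_flow(p_t0, p_t1):
    """3D displacement of a tracked point between t and t+1."""
    return tuple(b - a for a, b in zip(p_t0, p_t1))

p0 = (0.1, 0.0, 2.0)
p1 = (0.2, 0.0, 2.5)   # the point moves right and away from the camera

sf = scene_flow(p0, p1)          # 3D motion vector
u0, v0 = project(p0)
u1, v1 = project(p1)
flow_2d = (u1 - u0, v1 - v0)     # induced optical flow in pixels

print(sf)
print(flow_2d)
```

The point of the exercise: optical flow alone conflates motion toward the camera with motion in the image plane, which is why scene flow methods recover the full 3D field from multiple views or depth.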

SLIDE 35: Scene Flow Approaches

(Joo et al., CVPR‘14)

SLIDE 36: Scene Flow Approaches

SLIDE 37: Overview

  • Model-based Approaches
  • Spatio-temporal 3D Reconstruction
  • Scene Flow Approaches
  • Hybrid Approaches
SLIDE 38: Implicit Surface Representation

Signed Distance Function
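For reference, the signed distance function represents a surface implicitly as its zero level set (standard definition):

```latex
% Signed distance function of a closed surface S = \partial\Omega:
\phi(\mathbf{x}) =
\begin{cases}
  -\,d(\mathbf{x}, S) & \mathbf{x} \in \Omega \quad \text{(inside)} \\
  +\,d(\mathbf{x}, S) & \mathbf{x} \notin \Omega \quad \text{(outside)}
\end{cases}
\qquad
S = \{\, \mathbf{x} : \phi(\mathbf{x}) = 0 \,\}
```

Fusion systems like those on the following slides store a truncated version of φ on a voxel grid and extract the surface as its zero crossing.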

SLIDE 39: Static Case: KinectFusion (and variants)

(Whelan et al., RSS‘15)

SLIDE 40: DynamicFusion

(Newcombe et al., CVPR‘15)

SLIDE 41: DynamicFusion

(Newcombe et al., CVPR‘15)

SLIDE 42: VolumeDeform

(Innmann et al., ECCV‘16)

SLIDE 43: KillingFusion

(Slavcheva et al., CVPR‘17)

SLIDE 44: Fusion4D

(Dou et al., SIGGRAPH‘16)

SLIDE 45: Next Lecture: Final Presentations!

SLIDE 46: References

  • L. Ballan and G. M. Cortelazzo, "Marker-less Motion Capture of Skinned Models in a Four Camera Set-up using Optical Flow and Silhouettes", 3DPVT 2008.
  • L. Ballan, G. J. Brostow, J. Puwein and M. Pollefeys, "Unstructured Video-Based Rendering: Interactive Exploration of Casually Captured Videos", SIGGRAPH 2010, http://cvg.ethz.ch/research/unstructured-vbr/
  • C. Cao, D. Bradley, K. Zhou and T. Beeler, "Real-Time High-Fidelity Facial Performance Capture", SIGGRAPH 2015, https://www.disneyresearch.com/publication/realtimeperformancecapture/
  • M. Dou, S. Khamis, Y. Degtyarev, P. Davidson, S. R. Fanello, A. Kowdle, S. Orts Escolano, C. Rhemann, D. Kim, D. Sweeney, J. Taylor, P. Kohli, V. Tankovich and S. Izadi, "Fusion4D: Real-time Performance Capture of Challenging Scenes", SIGGRAPH 2016, https://www.youtube.com/watch?v=2dkcJ1YhYw4
  • T. Whelan, S. Leutenegger, R. F. Salas-Moreno, B. Glocker and A. J. Davison, "ElasticFusion: Dense SLAM Without A Pose Graph", Robotics: Science and Systems (RSS) 2015, https://github.com/mp3guy/ElasticFusion, https://www.youtube.com/watch?v=XySrhZpODYs
  • M. Innmann, M. Zollhöfer, M. Nießner, C. Theobalt and M. Stamminger, "VolumeDeform: Real-time Volumetric Non-rigid Reconstruction", ECCV 2016, http://lgdv.cs.fau.de/publications/publication/Pub.2016.tech.IMMD.IMMD9.volume_6/
  • A. Collet, M. Chuang, P. Sweeney, D. Gillett, D. Evseev, D. Calabrese, H. Hoppe, A. Kirk and S. Sullivan, "High-quality Streamable Free-viewpoint Video", ACM Transactions on Graphics (SIGGRAPH), 34(4), 2015, https://www.youtube.com/watch?v=SkJG-uFU2yA
  • H. Joo, H. S. Park and Y. Sheikh, "MAP Visibility Estimation for Large-Scale Dynamic 3D Reconstruction", CVPR 2014, http://www.cs.cmu.edu/~hanbyulj/14/visibility.html
  • R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohli, J. Shotton, S. Hodges and A. Fitzgibbon, "KinectFusion: Real-time Dense Surface Mapping and Tracking", ISMAR 2011, https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/ismar2011.pdf
  • R. A. Newcombe, D. Fox and S. M. Seitz, "DynamicFusion: Reconstruction and Tracking of Non-rigid Scenes in Real-time", CVPR 2015, https://www.youtube.com/watch?v=i1eZekcc_lM, http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Newcombe_DynamicFusion_Reconstruction_and_2015_CVPR_paper.pdf
  • M. R. Oswald and D. Cremers, "A Convex Relaxation Approach to Space Time Multi-view 3D Reconstruction", ICCV Workshop on Dynamic Shape Capture and Analysis (4DMOD), 2013, http://youtu.be/axGBJbawacA
  • M. R. Oswald, J. Stühmer and D. Cremers, "Generalized Connectivity Constraints for Spatio-temporal 3D Reconstruction", ECCV 2014, http://youtu.be/4H0GmCUDEsc
  • M. R. Oswald and D. Cremers, "Surface Normal Integration for Convex Space-time Multi-view Reconstruction", BMVC 2014, http://youtu.be/e9T4o0WHhPI
  • Y. Liu, J. Gall, C. Stoll, Q. Dai, H.-P. Seidel and C. Theobalt, "Markerless Motion Capture of Multiple Characters Using Multi-view Image Segmentation", IEEE Trans. PAMI 2013, https://www.youtube.com/watch?v=eoUWLip_z8A
  • D. Vlasic, I. Baran, W. Matusik and J. Popović, "Articulated Mesh Animation from Multi-view Silhouettes", ACM Transactions on Graphics 27(3), 2008, http://people.csail.mit.edu/drdaniel/mesh_animation
  • M. Slavcheva, M. Baust, D. Cremers and S. Ilic, "KillingFusion: Non-rigid 3D Reconstruction without Correspondences", CVPR 2017, http://campar.in.tum.de/Chair/PublicationDetail?pub=slavcheva2017cvpr