Marc Habermann
DeepCap: Monocular Human Performance Capture Using Weak Supervision
Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, and Christian Theobalt
DeepCap
Human performance capture from a monocular camera
Challenges
• Monocular setting is inherently ambiguous
• High-dimensional problem
  – Pose and surface deformation
Source: https://www.fiylo.de/
Related Work
• Capture using parametric models
Kanazawa et al. 2018; Xiang et al. 2018
Metaxas et al. 1993, Plaenkers et al. 2001, Sminchisescu et al. 2003, Sigal et al. 2004, Joo et al. 2018, Pavlakos et al. 2018, Kanazawa et al. 2019, Pavlakos et al. 2019, …
Related Work
• Monocular template-free capture
Zheng et al. 2019; Saito et al. 2019
Huang et al. 2018, Varol et al. 2018, Natsume et al. 2019, …
Related Work
• Template-based capture
Habermann et al. 2019; Xu et al. 2018
Carranza et al. 2003, Bray et al. 2006, Starck et al. 2007, De Aguiar et al. 2008, Brox et al. 2010, Cagniart et al. 2010, …
DeepCap
• Learning-based approach
• Pose + surface deformation
• Weak multi-view supervision
Personalized Character Model
• Template mesh
• Embedded graph
• Skeleton
Fully automatic
Inference Time
Direct Supervision?
Difficult to obtain:
• Ground truth 3D pose
• Ground truth 3D surface
Weak Supervision
• Multi-view 2D detections
• Multi-view foreground masks
Differentiable 3D to 2D modules
Training Data – Weak Multi-View
• Calibrated multi-view images
• 2D keypoints: OpenPose (Cao et al. 2019)
• Foreground mask: color keying
Pipeline
PoseNet
[Figure: PoseNet architecture. Labels: Segmented Input Image; PoseNet; Rotation $\alpha$; Joint Angles $\theta$; Kinematics Layer; Root Relative Landmarks; Global Alignment Layer; Global Landmarks $P$; Joint Detections $p_{c,m}$; Pose Prior Loss; Multi-view Sparse Keypoint Loss]
PoseNet regresses:
• Root rotation $\alpha \in \mathbb{R}^{3}$
• Joint angles $\theta \in \mathbb{R}^{27}$
PoseNet
Kinematics Layer
• Input: skeletal pose $(\alpha, \theta)$
• Function $f_m(\alpha, \theta): \mathbb{R}^{30} \to \mathbb{R}^{3}$ per landmark $m$
• Output: camera and root relative 3D landmark positions $P_{c',m} = f_m(\alpha, \theta)$
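A kinematics layer of this kind can be sketched as forward kinematics on a toy chain. This is only an illustration, not the skeleton used in the talk: the two-joint hinge chain, the constants `JOINT_AXES`, `JOINT_OFFSETS`, `LANDMARK_OFFSET`, and the X-Y-Z Euler convention for the root rotation are all assumptions made here.

```python
import numpy as np

def axis_angle_to_matrix(axis, angle):
    """Rodrigues' formula: rotation by `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def euler_xyz_to_matrix(alpha):
    """Root rotation from three Euler angles (assumed order: X, then Y, then Z)."""
    rx = axis_angle_to_matrix([1, 0, 0], alpha[0])
    ry = axis_angle_to_matrix([0, 1, 0], alpha[1])
    rz = axis_angle_to_matrix([0, 0, 1], alpha[2])
    return rz @ ry @ rx

# Hypothetical toy skeleton: a chain of two hinge joints. Axes and offsets are
# expressed in the frame of the parent joint.
JOINT_AXES = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
JOINT_OFFSETS = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
# One landmark attached to the last joint's frame.
LANDMARK_OFFSET = np.array([1.0, 0.0, 0.0])

def kinematics_layer(alpha, theta):
    """Root-relative 3D landmark position for root rotation alpha and joint angles theta."""
    R = euler_xyz_to_matrix(alpha)  # accumulated rotation along the chain
    p = np.zeros(3)                 # root sits at the origin (root relative)
    for axis, offset, angle in zip(JOINT_AXES, JOINT_OFFSETS, theta):
        p = p + R @ offset                          # move to the joint position
        R = R @ axis_angle_to_matrix(axis, angle)   # apply the joint rotation
    return p + R @ LANDMARK_OFFSET
```

Because the function is built only from sums, products, sines, and cosines, it is differentiable in `alpha` and `theta`, which is what allows a loss on the landmarks to train the network that predicts the angles.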
PoseNet
Global Alignment Layer
• Rigid transform for landmark $P_{c',m}$: camera and root relative 3D space → global 3D space
$$P_m = R_{c'}^{T} P_{c',m} + t$$
• $R_{c'}^{T}$: inverse extrinsic rotation of the input camera $c'$
• $t$: global translation
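The rigid transform above is a single matrix product and a vector addition. A minimal NumPy sketch, assuming landmarks are stored as rows of an array (the function name and array layout are choices made for this example):

```python
import numpy as np

def global_alignment(P_rel, R_cam, t):
    """Map camera and root relative landmarks to global 3D space.

    P_rel: (M, 3) landmarks, one per row, in camera and root relative space.
    R_cam: (3, 3) extrinsic rotation of the input camera (world to camera).
    t:     (3,) global translation.
    """
    # For a column vector p the transform is R^T p + t. With landmarks stored
    # as rows, (R^T p)^T = p^T R, so one matrix product handles all landmarks.
    return P_rel @ R_cam + t
```

For example, with a camera rotated by 90 degrees about the z axis, a landmark at `[0, 1, 0]` in camera space maps back to `[1, 0, 0]` before the translation is added.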
PoseNet
Multi-view Sparse Keypoint Loss
• Projecting ($\pi_c$) 3D landmark $P_m$ into camera view $c$
• Comparing to 2D joint detection $p_{c,m}$
$$L_{kp}(P) = \sum_{c} \sum_{m} \left\| \pi_c(P_m) - p_{c,m} \right\|_2^2$$
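The loss on this slide can be written in a few lines of NumPy. The sketch below assumes a standard pinhole camera for $\pi_c$, with intrinsics `K` and extrinsics `R`, `T`; the slides do not spell out the camera model, and any per-joint weighting or detection confidence is left out because the slide formula does not show it.

```python
import numpy as np

def project(P, K, R, T):
    """Pinhole projection of global 3D points P (M, 3) into one camera view."""
    X_cam = P @ R.T + T              # world space to camera space
    x = X_cam @ K.T                  # apply intrinsics
    return x[:, :2] / x[:, 2:3]      # perspective divide, giving pixel coordinates

def keypoint_loss(P, cameras, detections):
    """Sum over cameras c and landmarks m of ||pi_c(P_m) - p_{c,m}||^2.

    cameras:    list of (K, R, T) tuples, one per view.
    detections: list of (M, 2) arrays of 2D joint detections, one per view.
    """
    loss = 0.0
    for (K, R, T), p_c in zip(cameras, detections):
        diff = project(P, K, R, T) - p_c
        loss += np.sum(diff ** 2)
    return loss
```

The 3D landmarks are never compared to 3D ground truth. Only their 2D projections are compared to the detections in every calibrated view, which is what makes the supervision weak.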
DefNet
[Figure: DefNet architecture. Labels: Segmented Input Image; PoseNet; Rotation $\alpha$; Joint Angles $\theta$; DefNet; Rotation $A$; Translation $T$; Deformation Layer; Root Relative Landmarks; Root Relative Vertices; Global Alignment Layer; Global Landmarks $M$; Global Vertices $V$; Joint Detections $p_{c,m}$; Foreground Masks; Multi-view Sparse Keypoint Graph Loss; Multi-view Non-rigid Silhouette Loss; ARAP Loss]
DefNet
• Regresses embedded deformation* in canonical pose
• Per node $k$: rotation angles $A_k$ and translation $T_k$
*(Sumner et al. 2007, Sorkine et al. 2007)
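As a rough illustration of embedded deformation in the sense of Sumner et al. 2007, each vertex is moved by a weighted blend of the rigid transforms of nearby graph nodes. The sketch below assumes Euler angles for the per-node rotation and precomputed skinning weights; the function names and array layouts are choices made for this example.

```python
import numpy as np

def euler_to_matrix(a):
    """Rotation matrix from Euler angles a = (x, y, z); assumed order X, Y, Z."""
    cx, cy, cz = np.cos(a)
    sx, sy, sz = np.sin(a)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def embedded_deformation(vertices, nodes, weights, A, T):
    """Deform template vertices with per-node rotations A and translations T.

    vertices: (N, 3) template vertices in canonical pose.
    nodes:    (K, 3) graph node positions.
    weights:  (N, K) blend weights, each row summing to 1.
    A, T:     (K, 3) per-node Euler angles and translations.
    """
    vertices = np.asarray(vertices, dtype=float)
    deformed = np.zeros_like(vertices)
    for k in range(len(nodes)):
        R_k = euler_to_matrix(A[k])
        # Rotate the vertex about node k, then translate with the node.
        moved = (vertices - nodes[k]) @ R_k.T + nodes[k] + T[k]
        deformed += weights[:, k:k + 1] * moved
    return deformed
```

With all angles and translations set to zero the template is returned unchanged, so the network only has to regress a deviation from the canonical shape.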
DefNet
Deformation Layer
• Deformation: embedded deformation
• Pose: Dual Quaternion Skinning (Kavan et al. 2007)
• Output: posed and deformed landmarks $M_{c',m}$ and vertices $V_{c',i}$
DefNet
Global Alignment Layer
• Rigid transform for landmark $m$ and vertex $i$
• Camera and root relative 3D landmark $M_{c',m}$ and vertex $V_{c',i}$ → global 3D landmark $M_m$ and vertex $V_i$
DefNet
Multi-view Sparse Keypoint Graph Loss
$$L_{kpg}(M) = \sum_{c} \sum_{m} \left\| \pi_c(M_m) - p_{c,m} \right\|_2^2$$
• $M_m$: global 3D landmark
DefNet
Non-rigid Silhouette Loss
$$L_{sil}(V) = \sum_{c} \sum_{i \in \mathcal{B}_c} \left\| D_c\big(\pi_c(V_i)\big) \right\|_2^2$$
• $D_c$: distance transform image
• $\mathcal{B}_c$: set of boundary vertices for camera $c$
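The silhouette term can be illustrated with a brute-force distance transform and a nearest-pixel lookup. This is a sketch under several assumptions: the boundary vertices are taken as already projected to pixel coordinates, the contour is defined by 4-connectivity, and the lookup is not interpolated (a trainable version would need a differentiable lookup such as bilinear sampling).

```python
import numpy as np

def silhouette_distance_image(mask):
    """Per-pixel distance to the nearest contour pixel of a foreground mask.

    Brute force, intended for small images; assumes the mask is not empty.
    """
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    # A foreground pixel is interior if all four direct neighbours are foreground.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    contour = mask & ~interior
    ys, xs = np.nonzero(contour)
    gy, gx = np.mgrid[0:mask.shape[0], 0:mask.shape[1]]
    d = np.sqrt((gy[..., None] - ys) ** 2 + (gx[..., None] - xs) ** 2)
    return d.min(axis=-1)

def silhouette_loss(distance_images, projected_boundary):
    """Sum over views of squared distance-image values at boundary vertices.

    distance_images:    list of (H, W) arrays D_c, one per view.
    projected_boundary: list of (B_c, 2) arrays of (x, y) pixel positions.
    """
    loss = 0.0
    for D, pts in zip(distance_images, projected_boundary):
        cols = np.clip(np.rint(pts[:, 0]).astype(int), 0, D.shape[1] - 1)
        rows = np.clip(np.rint(pts[:, 1]).astype(int), 0, D.shape[0] - 1)
        loss += np.sum(D[rows, cols] ** 2)
    return loss
```

The loss is zero when every boundary vertex projects exactly onto the mask contour and grows with the squared pixel distance otherwise.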
Qualitative Evaluation
[Figure: overlay on input image and overlay on reference view; Ours vs. Habermann et al. 2019]
Qualitative Evaluation
[Figure: overlay on input image and 3D view; Ours vs. Zheng et al. 2019 vs. Saito et al. 2019]
Quantitative Evaluation
Surface reconstruction accuracy

Method (on S4)                  | Multi-view IoU* (in %)
HMR (Kanazawa et al. 2018)      | 65.1
HMMR (Kanazawa et al. 2019)     | 63.79
LiveCap (Habermann et al. 2019) | 59.96
Ours                            | 82.53

*IoU = Intersection over Union
Methods are marked as person-specific or person-unspecific.
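For reference, a multi-view IoU of this kind can be computed from binary masks as follows. Averaging the per-view IoU over all views is an assumption of this sketch; the slide does not state how the views are aggregated.

```python
import numpy as np

def multi_view_iou(pred_masks, gt_masks):
    """Mean intersection over union across views, in percent.

    pred_masks, gt_masks: sequences of binary masks of equal shape, one pair
    per camera view (rendered reconstruction vs. ground truth foreground).
    """
    ious = []
    for pred, gt in zip(pred_masks, gt_masks):
        pred = np.asarray(pred, dtype=bool)
        gt = np.asarray(gt, dtype=bool)
        union = np.logical_or(pred, gt).sum()
        inter = np.logical_and(pred, gt).sum()
        # Two empty masks agree perfectly; avoid dividing by zero.
        ious.append(inter / union if union > 0 else 1.0)
    return 100.0 * float(np.mean(ious))
```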
More results
Thank you!
Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt, Marc Habermann