  1. InteractionFusion: Real-time Reconstruction of Hand Poses and Deformable Objects in Hand-object Interactions
     Hao Zhang, Zi-Hao Bo, Jun-Hai Yong, Feng Xu*
     School of Software, Tsinghua University

  2. Outline
     - Background
     - Overview
     - LSTM-based Pose Prediction
     - Joint Hand-Object Motion Tracking
     - Experiments & Results
     - Limitations & Future Work
     - Conclusion

  3. Background
     - Hand tracking has many applications: HCI, robots, VR/AR
     - The human hand often interacts with objects → hand-object interaction reconstruction

  4. Background — Challenges: Hand-Object Interaction vs. Isolated Hand Tracking
     - More occlusions in interactions
     - Complex motions
     - High-dimensional solution space
     - Lack of geometry/texture features
     - Physical plausibility
     - Self-occlusion
     [Tkach et al. 2016] [Tzionas et al. 2016]

  5. Background
     - Hand tracking in interactions: no object in the output
       [Mueller et al. 2017] [Taylor et al. 2017] [Simon et al. 2017] [Mueller et al. 2018]

  6. Background
     - In-hand object reconstruction: no hand in the output
       [Weise et al. 2008] [Weise et al. 2011] [Yuheng Ren et al. 2013] [Petit et al. 2018]

  7. Background
     - Joint hand-object reconstruction: limited to rigid objects or requiring an initial object template
       [Panteleris et al. 2015] [Wang et al. 2013] [Tzionas et al. 2016] [Tsoli et al. 2018]

  8. Our Work
     - Reconstruct the hand pose, the object model, and the object deformation in real time

  9.-16. Overview (pipeline, built up incrementally over slides 9-16)
     - Input: synchronized depth sequences
     - Hand-object segmentation with a DNN (DenseAttentionSeg)
       [DenseAttentionSeg: Segment Hands from Interacted Objects Using Depth Input. arXiv preprint arXiv:1903.12368 (2019)]
     - Joint hand-object motion tracking and model fusion:
       - Hand motion tracking and object motion tracking
       - LSTM-based pose prediction: an LSTM model supplies a predicted hand pose to the tracker
       - Unified energy optimization with a hand-object interaction term, a new regularizer for hand tracking, and a new regularizer for object tracking
       - Object model fusion
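
As a reading aid, here is a minimal sketch of the per-frame control flow that the overview diagram implies. Every name below is a placeholder for a pipeline stage, not an API from the authors' implementation, and the stage implementations are passed in as callables.

```python
# Per-frame control flow implied by the overview slides. All stage names are
# placeholders; the actual implementations are not part of the talk.
def process_frame(depth_frame, pose_history, object_model,
                  segment, predict_pose, optimize_unified_energy, fuse_object):
    # 1. DNN-based hand-object segmentation of the depth input (DenseAttentionSeg).
    hand_depth, object_depth = segment(depth_frame)
    # 2. LSTM-based prediction of the current hand pose from recent poses.
    predicted_pose = predict_pose(pose_history)
    # 3. Joint hand-object motion tracking via one unified energy optimization.
    hand_pose, object_motion = optimize_unified_energy(
        hand_depth, object_depth, predicted_pose, object_model)
    # 4. Non-rigid fusion of the object depth into the canonical object model.
    object_model = fuse_object(object_model, object_depth, object_motion)
    return hand_pose, object_model
```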

  17. LSTM-based Pose Prediction
     Aim:
     - Learn the hand motion pattern in interactions
     - Improve hand tracking accuracy in interactions
     Structure:
     - Input: 22 DoFs of the hand pose
     - Output: 22 DoFs of the hand pose
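
The slide only fixes the input and output dimensionality (22 DoFs). The following is a minimal sketch of such a predictor in PyTorch; the hidden size, number of layers, and window length are illustrative assumptions, not values from the talk.

```python
# Minimal sketch of a 22-DoF hand-pose predictor. Hidden size, layer count,
# and the 8-frame window are assumptions for illustration only.
import torch
import torch.nn as nn

class PosePredictor(nn.Module):
    def __init__(self, dofs=22, hidden=128, layers=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size=dofs, hidden_size=hidden,
                            num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, dofs)   # map hidden state back to 22 DoFs

    def forward(self, pose_seq):              # pose_seq: (batch, T, 22)
        out, _ = self.lstm(pose_seq)          # out: (batch, T, hidden)
        return self.head(out[:, -1])          # predicted next pose: (batch, 22)

# Example: predict the next pose from a window of 8 past frames.
model = PosePredictor()
window = torch.randn(1, 8, 22)                # placeholder pose window (radians)
predicted_pose = model(window)                # shape (1, 22)
```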

  18. LSTM-based Pose Prediction — Dataset & Training
     - 34 interaction sequences with about 20K frames
     - 90% used as the training set, 10% as the evaluation set
     - In each frame, at most 3 DoFs are selected and corrupted with large Gaussian noise
     - Trained for 100 epochs with the Adam optimizer (learning rate 0.001)
     Mean standard deviation in the LSTM test input — selected DoFs: 0.45 rad; other DoFs: 0.042 rad
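
A small sketch of the noise augmentation described above, assuming NumPy. The noise standard deviation is an assumed value (chosen near the 0.45 rad reported for the selected DoFs in the test input), and the exact sampling scheme is not specified in the talk.

```python
# Per-frame augmentation: corrupt at most `max_dofs` of the 22 DoFs with
# large Gaussian noise. `sigma` and the 1..max_dofs sampling are assumptions.
import numpy as np

def corrupt_pose(pose, max_dofs=3, sigma=0.45, rng=None):
    """pose: (22,) array of joint angles in radians; returns a noisy copy."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = np.asarray(pose, dtype=float).copy()
    k = rng.integers(1, max_dofs + 1)                 # corrupt 1..max_dofs DoFs
    idx = rng.choice(noisy.shape[0], size=k, replace=False)
    noisy[idx] += rng.normal(0.0, sigma, size=k)      # large noise on selected DoFs
    return noisy
```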

  19. Joint Hand-Object Motion Tracking
     - Unified energy: total energy = energy for hand tracking + energy for object tracking + energy for hand-object interaction
     - Energy for hand tracking: fit the model to the depth, fit the model inside the silhouette, static pose prior, joint limits, motion pattern (the output of the LSTM), finger joint positions, collision prior in interaction, temporal smoothness
       [Sphere-meshes for real-time hand modeling and tracking. Anastasia Tkach et al. TOG 2016]
     - Energy for object tracking: fit the model to the depth, constrain the model inside the silhouette, variational rigidity
       [DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time. Richard A. Newcombe et al. CVPR 2015]
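
Written out, the unified energy on this slide decomposes as below; the weights are illustrative placeholders, since the talk does not list their values.

```latex
% Unified energy of slide 19; the weights \lambda_i and \mu_i are placeholders.
\begin{align*}
E_{\text{total}} &= E_{\text{hand}} + E_{\text{object}} + E_{\text{interaction}},\\
E_{\text{hand}}  &= \lambda_1 E_{\text{depth}} + \lambda_2 E_{\text{silhouette}}
  + \lambda_3 E_{\text{pose prior}} + \lambda_4 E_{\text{joint limits}}
  + \lambda_5 E_{\text{motion pattern}}\\
  &\quad + \lambda_6 E_{\text{joint position}} + \lambda_7 E_{\text{collision}}
  + \lambda_8 E_{\text{temporal}},\\
E_{\text{object}} &= \mu_1 E_{\text{depth}} + \mu_2 E_{\text{silhouette}}
  + \mu_3 E_{\text{var. rigidity}}.
\end{align*}
```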

  20. Joint Hand-Object Motion Tracking
     - Hand-object interaction term: contacts between the object surface and the spheres of the hand model provide press and support constraints; a squared-distance penalty is gated by the indicator ψ(d_i) = 1 if d_i ≤ 0, 0 otherwise, so the term is only active at contact
     - Model-to-silhouette term: the reconstructed object with vs. without the term, compared against the reference color image
     - Variational rigidity: small rigidity in areas near contact points, large rigidity in areas far from contact points
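
As an illustration of the indicator-gated contact penalty, here is a small sketch assuming the object surface is available as a signed-distance query (negative inside the object). Both the `object_sdf` interface and the exact penalty form are assumptions for illustration, not the authors' code.

```python
# Indicator-gated contact/penetration penalty between the hand's sphere model
# and the object surface. `object_sdf` is a hypothetical signed-distance query
# of the fused object model (negative inside the surface).
import numpy as np

def interaction_energy(sphere_centers, sphere_radii, object_sdf):
    """sphere_centers: (N, 3), sphere_radii: (N,). Returns a scalar energy."""
    energy = 0.0
    for c, r in zip(sphere_centers, sphere_radii):
        d = object_sdf(c) - r          # signed distance from sphere surface to object surface
        if d <= 0.0:                   # indicator psi(d): active only at contact/penetration
            energy += d * d            # squared penetration depth
    return energy
```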

  21. Experiments & Results — Evaluations
     - Ablation study for hand tracking (metric: mean pixel error; variants: BL = baseline, LSTM = LSTM-based pose prediction, Intr = interaction term)

       Sequence        Frames
       RotatePepper    440
       PourBottle      280
       ReconstructCat  890

       (Chart: mean pixel error per sequence for the BL, LSTM, and Intr variants.)

  22.-24. Experiments & Results — Evaluations
     - Ablation study for object tracking: (a) variational rigidity, (b) interaction term, (c) silhouette term

  25. Experiments & Results — Qualitative Comparison
     - Comparison with KinectFusion

  26. Experiments & Results — Quantitative Comparison
     - Comparison with KinectFusion
     - Comparison with DynamicFusion

  27. Limitations & Future Work
     Limitations:
     - No color information is used in object tracking
     - Only contact constraints are considered
     - Only one hand and one object are handled
     - Topology changes of the object cannot be handled
     Future Work:
     - More realistic interaction reconstruction: color information, two hands with multiple objects, topology changes
     - Reduced equipment requirements: use a single RGB-D camera

  28. Conclusions
     - An LSTM-based pose predictor, a novel interaction term, and variational rigidity
     - A unified framework integrating segmentation information, pose prediction, and the new regularizers
     - A system that simultaneously achieves hand tracking, object fusion, and non-rigid object tracking in real time

  29. Thanks for Your Attention!
