
DYNAMIC FACIAL ANALYSIS: FROM BAYESIAN FILTERING TO RNN
Jinwei Gu, with Xiaodong Yang, Shalini De Mello, and Jan Kautz (2017/4/18)


  1. DYNAMIC FACIAL ANALYSIS: FROM BAYESIAN FILTERING TO RNN Jinwei Gu, 2017/4/18 with Xiaodong Yang, Shalini De Mello, and Jan Kautz

  2. FACIAL ANALYSIS IN VIDEOS: Exploit temporal coherence to track facial features in videos. Figure labels: Head/Face Tracking, Performance 3D Capture. Examples: HeadPoseFromDepth, 2015; DeepHeadPose, 2015; HyperFace, 2016.

  3. CLASSICAL APPROACH: BAYESIAN FILTERING. It is challenging to design Bayesian filters specific to each task! Examples: Particle Filters for Head Pose Tracking [2010]; Tree-based DPM for Face Landmark Tracking [ICCV2015]; Spatial-Temporal RNN for Face Landmark [ECCV2016].

  4. FROM BAYESIAN FILTERING TO RNN: Use RNN to avoid tracker-engineering. Diagram: a Bayesian filter next to an unfolded RNN, each with Input (Measurement) $x_{t-1}, x_t$, Hidden State $h_{t-1}, h_t$, and Output (Target) $y_{t-1}, y_t$.

  5. FROM BAYESIAN FILTERING TO RNN: Use RNN to avoid tracker-engineering.

  6. AN EXAMPLE: KALMAN FILTERS VS. RNN
     Linear Kalman Filter:
       process model (state transition with process noise $n_1$): $x_t = W x_{t-1} + n_1$
       measurement model (noisy observation with measurement noise $n_2$): $y_t = V x_t + n_2$
     Simple RNN (i.e., vanilla RNN):
       estimated state from noisy input $x_t$: $h_t = \sigma_1(W h_{t-1} + U x_t + b_1)$
       target output: $y_t = \sigma_2(V h_t + b_2)$
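The simple RNN on this slide, $h_t = \sigma_1(W h_{t-1} + U x_t + b_1)$ and $y_t = \sigma_2(V h_t + b_2)$, can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the choice of tanh for $\sigma_1$, identity for $\sigma_2$, the zero initial state, the layer sizes, and the random weights are all assumptions made for the example.

```python
import numpy as np

def rnn_step(h_prev, x_t, W, U, V, b1, b2):
    """One step of a simple RNN; returns (h_t, y_t)."""
    h_t = np.tanh(W @ h_prev + U @ x_t + b1)  # sigma_1 = tanh (assumption)
    y_t = V @ h_t + b2                        # sigma_2 = identity (assumption)
    return h_t, y_t

def run_rnn(xs, W, U, V, b1, b2):
    """Unfold the RNN over a sequence xs of shape (T, n_in)."""
    h = np.zeros(W.shape[0])  # initial hidden state h_0 = 0 (assumption)
    ys = []
    for x_t in xs:
        h, y_t = rnn_step(h, x_t, W, U, V, b1, b2)
        ys.append(y_t)
    return np.stack(ys)

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_hidden, n_in, n_out, T = 4, 2, 3, 5
W = 0.1 * rng.standard_normal((n_hidden, n_hidden))
U = 0.1 * rng.standard_normal((n_hidden, n_in))
V = 0.1 * rng.standard_normal((n_out, n_hidden))
b1 = np.zeros(n_hidden)
b2 = np.zeros(n_out)
xs = rng.standard_normal((T, n_in))
ys = run_rnn(xs, W, U, V, b1, b2)
print(ys.shape)
```

The loop in `run_rnn` is the "unfolded" view of the network: the same weights are applied at every time step and only the hidden state carries information forward.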

  7. AN EXAMPLE: KALMAN FILTERS VS. RNN
     Linear Kalman Filter (Kalman gain $K_t$, noisy input $y_t$):
       $x_t = W x_{t-1} + K_t (y_t - V x_{t-1})$
       $x_t = (W - K_t V) x_{t-1} + K_t y_t$
       target output: $\hat{y}_t = V x_t$
     Simple RNN (i.e., vanilla RNN), noisy input $x_t$:
       $h_t = \sigma_1(W h_{t-1} + U x_t + b_1)$
       target output: $y_t = \sigma_2(V h_t + b_2)$

  8. AN EXAMPLE: KALMAN FILTERS VS. RNN
     Linear Kalman Filter (Kalman gain $K_t$, noisy input $y_t$):
       $x_t = W x_{t-1} + K_t (y_t - V x_{t-1})$
       $x_t = (W - K_t V) x_{t-1} + K_t y_t$
       target output: $\hat{y}_t = V x_t$
     Simple RNN (i.e., vanilla RNN), assuming linear activation & no bias, with noisy input $y_t$:
       $x_t = W x_{t-1} + U y_t$
       target output: $\hat{y}_t = V x_t$
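The correspondence on this slide can be checked numerically. The sketch below assumes a fixed (time-invariant) gain $K$ in place of $K_t$, and uses made-up matrices. It runs the Kalman-style update $x_t = W x_{t-1} + K (y_t - V x_{t-1})$ next to a linear, bias-free RNN whose recurrent weight is $W - K V$ and whose input weight is $K$, then compares the outputs $V x_t$.

```python
import numpy as np

def kalman_fixed_gain(ys, W, V, K, x0):
    """Kalman-style recursion from the slide, with a constant gain K (assumption)."""
    x = x0.copy()
    outs = []
    for y_t in ys:
        x = W @ x + K @ (y_t - V @ x)  # x_t = W x_{t-1} + K (y_t - V x_{t-1})
        outs.append(V @ x)             # output V x_t
    return np.stack(outs)

def linear_rnn(ys, W_rnn, U, V, x0):
    """Vanilla RNN with linear activation and no bias."""
    x = x0.copy()
    outs = []
    for y_t in ys:
        x = W_rnn @ x + U @ y_t        # x_t = W_rnn x_{t-1} + U y_t
        outs.append(V @ x)
    return np.stack(outs)

# Hypothetical model: 3-dimensional state, scalar measurement.
rng = np.random.default_rng(0)
W = np.array([[1.0, 1.0, 0.5],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
V = np.array([[1.0, 0.0, 0.0]])
K = np.array([[0.5], [0.2], [0.05]])   # arbitrary fixed gain, not a computed Kalman gain
ys = rng.standard_normal((50, 1))      # stand-in for noisy measurements
x0 = np.zeros(3)

kf_out = kalman_fixed_gain(ys, W, V, K, x0)
rnn_out = linear_rnn(ys, W - K @ V, K, V, x0)
print(np.allclose(kf_out, rnn_out))
```

With a time-varying gain $K_t$ the match is no longer exact for fixed RNN weights, which is one reason gated variants with learned, input-dependent dynamics are of interest.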

  9. A TOY EXAMPLE: TRACKING A MOVING CURSOR
     Input: a noisy curve y(t); state: [x, x', x'']
     Kalman Filter: $x_t = (W - K_t V) x_{t-1} + K_t y_t$, $\hat{y}_t = V x_t$
     LSTM: $x_t = \mathrm{LSTM}(x_{t-1}, y_t)$, $\hat{y}_t = V x_t$
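The Kalman filter side of this toy example can be sketched as follows, assuming the textbook predict/update form with a constant-acceleration model over the state [x, x', x'']. The sine-wave ground truth, the noise level, and the values of `q` and `r` are illustrative assumptions, not settings from the talk. The LSTM counterpart would need training and is not shown.

```python
import numpy as np

def kalman_track(ys, dt=1.0, q=1e-5, r=0.25):
    """Standard linear Kalman filter over state [x, x', x'']; returns position estimates."""
    F = np.array([[1.0, dt, 0.5 * dt**2],
                  [0.0, 1.0, dt],
                  [0.0, 0.0, 1.0]])    # constant-acceleration state transition
    H = np.array([[1.0, 0.0, 0.0]])    # only the position is measured
    Q = q * np.eye(3)                  # process noise covariance (assumed)
    R = np.array([[r]])                # measurement noise variance (assumed)
    x = np.zeros(3)                    # initial state guess
    P = np.eye(3)                      # initial state covariance
    est = []
    for y_t in ys:
        # Predict step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)           # Kalman gain K_t
        x = x + K @ (np.atleast_1d(y_t) - H @ x)
        P = (np.eye(3) - K @ H) @ P
        est.append(x[0])
    return np.array(est)

# Synthetic "cursor" trajectory: a slow sine wave plus Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(400)
truth = np.sin(0.05 * t)
ys = truth + rng.normal(0.0, 0.5, size=t.shape)

est = kalman_track(ys)
mse_noisy = np.mean((ys - truth) ** 2)
mse_kf = np.mean((est - truth) ** 2)
print(f"noisy MSE: {mse_noisy:.3f}, filtered MSE: {mse_kf:.3f}")
```

Here `q` and `r` have to be chosen by hand for the signal at hand, which is the kind of per-task tuning the talk proposes to replace with a learned recurrent model.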

  10. FACIAL ANALYSIS IN VIDEOS WITH RNN. Variants of RNN: FC-RNN*, LSTM, GRU

  11. HEAD POSE FROM VIDEOS: Results on BIWI dataset

  12. HEAD POSE FROM VIDEOS. Video panels: Input, Per-Frame + KF, RNN (Ours)

  13. LARGE SYNTHETIC DATASET MATTERS! The SynHead Dataset:
     • 10 high-quality 3D scans of head models
     • 51,096 head poses from 70 motion tracks
     • 510,960 RGB images in total
     • Accurate head pose and landmark annotations (2D/3D)
     Available at: https://research.nvidia.com
     (BIWI Dataset: 24 videos and 15,678 frames in total)

  14. LARGE SYNTHETIC DATASET MATTERS! The SynHead Database

  15. FACIAL LANDMARKS FROM VIDEO. Panels: HyperFace, Per-Frame, RNN (Ours). Legend: Ground Truth, Estimated

  16. MORE EXAMPLES

  17. VARIANTS OF RNN FOR LANDMARK ESTIMATION (Latest results)
                FC-RNN          FC-LSTM         FC-GRU
     fc6        0.7567, 0.10    0.7690, 0.13    0.7715, 0.15
     fc7        0.7424, 0.06    0.7539, 0.06    0.7554, 0.36
     fc6+fc7    0.7630, 0.28    0.7456, 0.27    0.7605, 0.19

  18. CO-PILOT DEMO IN THE CES KEYNOTE (together with GazeNet by Shalini et al.)

  19. DYNAMIC FACIAL ANALYSIS: From Bayesian Filtering to RNN
     • RNNs can be viewed as a variant of Bayesian Filters
     • A general framework to leverage temporal coherence in videos
     • Large synthetic datasets improve the performance
     • The SynHead Dataset. Available at: https://research.nvidia.com
