Human Pose Estimation and Action Recognition Gang Yu, Megvii - PowerPoint PPT Presentation

ICIP 2019 Tutorial Human Pose Estimation and Action Recognition Gang Yu, Megvii (Face++) Junsong Yuan, SUNY Buffalo Zicheng Liu, Microsoft

Overview • Part1: Human Pose Estimation • 2D Skeleton • Top-Down • Bottom-Up • 3D Skeleton • 2D -> 3D Skeleton • 2D -> 3D Shape • Application • Part2: Action Recognition – Datasets • RGB • RGB-D – Skeleton based approaches • 2D and 3D skeletons – Video based approaches • 2D/3D CNN features

Human Pose Estimation Algorithm and Application Gang Yu yugang@megvii.com

Outline • Introduction to Human Pose Estimation • 2D Skeleton • Top-Down • Bottom-Up • 3D Skeleton • 2D -> 3D Skeleton • 2D -> 3D Shape • Application • Conclusion

What is Human Pose Estimation?

Benchmark and Evaluation • Benchmark • Single-person Estimation • MPII, FLIC, LSP, LIP • Multi-person Keypoint Detection • COCO, CrowdPose • Video • PoseTrack • 3D • Human3.6M, DensePose • Evaluation on COCO
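COCO keypoint evaluation is built on Object Keypoint Similarity (OKS). The sketch below is a simplified, pure-Python reading of that metric, not the official evaluation code: it skips the small epsilon and the fallback for annotations with no labeled keypoints that the reference implementation handles. The function name `oks` and its argument layout are assumptions made for illustration; the per-keypoint constants are the ones published with the COCO evaluation code.

```python
import math

# Per-keypoint constants published with the COCO keypoint evaluation code
# (nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles).
COCO_SIGMAS = [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
               0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]


def oks(pred, gt, visibility, area, sigmas=COCO_SIGMAS):
    """Simplified OKS between one predicted pose and one ground-truth pose.

    pred, gt: lists of (x, y); visibility: COCO flags (0 means unlabeled);
    area: ground-truth object area, used as the squared scale.
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k * k))
        count += 1
    return total / count if count else 0.0
```

A perfect prediction scores 1.0, and the score decays toward 0 as keypoints drift, faster for small objects and for keypoints with small constants such as the nose and eyes.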

Outline • Introduction to Human Pose Estimation • 2D Skeleton • Top-Down • Bottom-Up • 3D Skeleton • 2D -> 3D Skeleton • 2D -> 3D Shape • Application • Conclusion

2D Skeleton: How to Do Pose Estimation • Top-down Approach vs. Bottom-up Approach [Figure: top-down vs. bottom-up; labels: Human, Head, L-Arm] • Top-down • Mask R-CNN, CPN, MSPN • High Performance (good localization ability), High Recall • Bottom-up • OpenPose, Associative Embedding • Clean framework, potentially fast speed Mask R-CNN, Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick, ICCV 2017; Cascaded Pyramid Network for Multi-Person Pose Estimation, Yilun Chen, Zhicheng Wang, Yuxiang Peng, Zhiqiang Zhang, Gang Yu, Jian Sun, CVPR 2018; Rethinking on Multi-Stage Networks for Human Pose Estimation, Wenbo Li, Zhicheng Wang, Binyi Yin, Qixiang Peng, Yuming Du, Tianzi Xiao, Gang Yu, Hongtao Lu, Yichen Wei, Jian Sun; OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, Zhe Cao, Gines Hidalgo, Tomas Simon, Shih-En Wei, Yaser Sheikh; Associative Embedding: End-to-End Learning for Joint Detection and Grouping, Alejandro Newell, Zhiao Huang, Jia Deng, NIPS 2017

Challenges • Ambiguous Appearance • Crowd Case • Large Pose • Inference Speed

Top-Down: Mask R-CNN • Motivation: • Multi-task learning • ROI Pool -> ROI Align Mask R-CNN, Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick, ICCV 2017
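The move from ROI Pool to ROI Align comes down to how a feature map is read at non-integer coordinates: ROI Pool snaps to the integer grid, while ROI Align interpolates bilinearly. The sketch below contrasts the two for a single sample point. It is a simplification (real ROI Align samples several points per output bin and aggregates them), and the function names are illustrative, not taken from any library.

```python
import math


def bilinear_sample(feature, x, y):
    """Read a 2D feature map (indexed [row][col]) at continuous (x, y)."""
    h, w = len(feature), len(feature[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the valid range
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = feature[y0][x0] * (1 - ax) + feature[y0][x1] * ax
    bottom = feature[y1][x0] * (1 - ax) + feature[y1][x1] * ax
    return top * (1 - ay) + bottom * ay


def quantized_sample(feature, x, y):
    """ROI Pool-style read: snap to the integer grid (floor, as a simplification)."""
    return feature[int(y)][int(x)]
```

The quantized read discards the sub-pixel offset, which matters for keypoints because a shift of one feature-map cell corresponds to many pixels in the input image.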

Top-Down: Mask R-CNN • Experiments on COCO Skeleton: Mask R-CNN, Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick, ICCV 2017

Top-Down: Hourglass • Motivation: • Crop & Single Person Skeleton • Multi-stage context refinement Stacked Hourglass Networks for Human Pose Estimation, Alejandro Newell, Kaiyu Yang, and Jia Deng, ECCV 2016

Top-Down: Hourglass • Structure of one block Stacked Hourglass Networks for Human Pose Estimation, Alejandro Newell, Kaiyu Yang, and Jia Deng, ECCV 2016

Top-Down: Hourglass • Experiments Stacked Hourglass Networks for Human Pose Estimation, Alejandro Newell, Kaiyu Yang, and Jia Deng, ECCV 2016

Top-Down: Single Person Skeleton: CPM • Motivation: • Multi-stage context refinement • Large receptive Field -> long range spatial relationship Convolutional Pose Machines, Shih-En Wei, Varun Ramakrishna, Takeo Kanade, Yaser Sheikh, CVPR 2016

Top-Down: Cascaded Pyramid Network • Motivation: How to locate the “hard” joints • Human perspective Cascaded Pyramid Network for Multi-Person Pose Estimation, Yilun Chen, Zhicheng Wang, Yuxiang Peng, Zhiqiang Zhang, Gang Yu, Jian Sun, CVPR 2018

Top-Down: Cascaded Pyramid Network • Motivation: How to locate the “hard” joints • Human perspective • Visible easy keypoints (✓ nose, ✓ left elbow, ✓ right hand): found from the easy visible parts • Visible hard keypoints (✓ left knee, ✓ right knee, ✓ left hip): hard visible parts that are hard to distinguish; found with context and an enlarged view • Invisible part (✓ right shoulder): found with context and an enlarged view

Top-Down: Cascaded Pyramid Network • Motivation: How to locate the “hard” joints • Human perspective: Coarse to Fine • Input image -> coarse parts -> fine parts -> output image, with the receptive view getting larger and more context

Network Architecture • Network Design Principles: ● Inspired by how humans locate keypoints, adapted to a CNN ○ locate easy parts => locate hard parts ● Two stages ○ GlobalNet: locates the easy parts (vanilla L2 loss) ○ RefineNet: locates the hard parts (deep layers) with online hard keypoint mining (hard mining loss)
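The two losses named above can be sketched in a few lines: a plain L2 loss averaged over all keypoints, and a hard-mining variant that keeps only the keypoints with the largest per-keypoint loss. This is an illustrative reading of online hard keypoint mining under stated assumptions, not the released CPN code: heatmaps are nested lists of shape [K][H][W], and `top_k` is a free parameter whose value here is not taken from the slides.

```python
def l2_per_keypoint(pred_heatmaps, gt_heatmaps):
    """Mean squared error for each keypoint heatmap ([K][H][W] nested lists)."""
    losses = []
    for pred, gt in zip(pred_heatmaps, gt_heatmaps):
        squared, n = 0.0, 0
        for pred_row, gt_row in zip(pred, gt):
            for p, g in zip(pred_row, gt_row):
                squared += (p - g) ** 2
                n += 1
        losses.append(squared / n)
    return losses


def global_loss(pred_heatmaps, gt_heatmaps):
    """Vanilla L2 loss: average over every keypoint."""
    per_keypoint = l2_per_keypoint(pred_heatmaps, gt_heatmaps)
    return sum(per_keypoint) / len(per_keypoint)


def hard_mining_loss(pred_heatmaps, gt_heatmaps, top_k):
    """Hard mining loss: average over only the top_k hardest keypoints."""
    per_keypoint = l2_per_keypoint(pred_heatmaps, gt_heatmaps)
    hardest = sorted(per_keypoint, reverse=True)[:top_k]
    return sum(hardest) / len(hardest)
```

Selecting the hardest keypoints per sample at training time means the refinement stage spends its gradient on occluded or ambiguous joints rather than on joints the first stage already locates well.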

Experiments: Person Detector • Det mAP: 36.3, 41.1, 44.3, 49.3, 52.1 • Keypoint mAP: 68.8, 69.4, 69.7, 69.8, 69.8

Experiments: Online Hard Keypoints Mining

Experiments: Design Choices of GlobalNet & RefineNet

Experiments

Summary for CPN • Hard Keypoints with Coarse-to-fine Strategy (context) • Code: https://github.com/chenyilun95/tf-cpn • MS COCO 2017 Challenge Winner

Top-Down: A Simple Baseline • Motivation • Simple Baseline & OKS based tracking • Spatial Resolution Simple Baselines for Human Pose Estimation and Tracking, Bin Xiao, Haiping Wu, Yichen Wei, ECCV 2018

Top-Down: A Simple Baseline • Experiments on COCO and PoseTrack Simple Baselines for Human Pose Estimation and Tracking, Bin Xiao, Haiping Wu, Yichen Wei, ECCV 2018

Top-Down: HRNet • Motivation • High Resolution Feature maps Deep High-Resolution Representation Learning for Human Pose Estimation, Ke Sun, Bin Xiao, Dong Liu, Jingdong Wang, CVPR 2019

Top-Down: HRNet Deep High-Resolution Representation Learning for Human Pose Estimation, Ke Sun, Bin Xiao, Dong Liu, Jingdong Wang, CVPR 2019

Top-Down: HRNet • Experiments Deep High-Resolution Representation Learning for Human Pose Estimation, Ke Sun, Bin Xiao, Dong Liu, Jingdong Wang, CVPR 2019

Top-Down: Multi-stage Pose Estimation • Motivation • Upperbound • Only Two-stages available (limited Context) Rethinking on Multi-Stage Networks for Human Pose Estimation, Wenbo Li, Zhicheng Wang, Binyi Yin, Qixiang Peng, Yuming Du, Tianzi Xiao, Gang Yu, Hongtao Lu, Yichen Wei, Jian Sun

Top-Down: Multi-stage Pose Estimation • Method • Coarse-to-fine with better information flow • Involve more stages

Top-Down: Multi-stage Pose Estimation • Cross Stage Feature Aggregation • Coarse-to-fine Supervision
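One way to read coarse-to-fine supervision is that each stage is trained against a ground-truth Gaussian heatmap of a different width, wide for early stages and narrow for later ones. The sketch below generates such a set of targets. It is a minimal illustration under that assumption; the sigma values and function names are hypothetical and are not taken from the MSPN code.

```python
import math


def gaussian_heatmap(width, height, cx, cy, sigma):
    """Unnormalized 2D Gaussian centered on (cx, cy), indexed [row][col]."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)]
            for y in range(height)]


def coarse_to_fine_targets(width, height, cx, cy, sigmas):
    """One target heatmap per stage; pass sigmas from largest to smallest."""
    return [gaussian_heatmap(width, height, cx, cy, s) for s in sigmas]


# Example: three stages with progressively tighter targets (assumed values).
targets = coarse_to_fine_targets(9, 9, 4, 4, [3.0, 2.0, 1.0])
```

A wide target tolerates a rough localization in the first stage, while the narrow target in the last stage penalizes even small offsets from the true joint position.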

Experiments: More Stages

Experiments: CTF & CSFA

Experiments: COCO test-dev

Experiments: COCO test-Challenge

Summary for MSPN • Refined Coarse-to-fine Strategy • Code: https://github.com/megvii-detection/MSPN • MS COCO 2018 Challenge Winner

Bottom-Up: DeepCut • Motivation • Part Detector • Assemble (Integer Linear Optimization) DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation, Leonid Pishchulin, Eldar Insafutdinov, Siyu Tang, Bjoern Andres, Mykhaylo Andriluka, Peter Gehler, Bernt Schiele, CVPR 2016

Bottom-Up: DeeperCut • Motivation • Deeper Part Detector + Assemble (image-conditioned pairwise terms + incremental optimization) DeeperCut: A Deeper, Stronger, and Faster Multi-Person Pose Estimation Model, Eldar Insafutdinov, Leonid Pishchulin, Bjoern Andres, Mykhaylo Andriluka, Bernt Schiele, ECCV 2016

Bottom-Up: OpenPose • Motivation • Part Detector (CPM) + Assemble (PAF) Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh, CVPR 2017
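The "Assemble (PAF)" step scores a candidate limb by checking how well the part affinity field agrees with the direction between two detected joints. The sketch below approximates that line integral by sampling points along the segment. It is a simplified illustration, not the OpenPose implementation: `paf` is a hypothetical callable returning the field vector at a point, and `num_samples` is an assumed parameter.

```python
import math


def paf_score(paf, p_a, p_b, num_samples=10):
    """Average alignment between the field and the direction from p_a to p_b.

    paf(x, y) must return the 2D field vector (vx, vy) at that location.
    """
    dx, dy = p_b[0] - p_a[0], p_b[1] - p_a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return 0.0
    ux, uy = dx / length, dy / length  # unit vector along the candidate limb
    total = 0.0
    for i in range(num_samples):
        t = i / (num_samples - 1) if num_samples > 1 else 0.0
        vx, vy = paf(p_a[0] + t * dx, p_a[1] + t * dy)
        total += vx * ux + vy * uy  # dot product with the limb direction
    return total / num_samples
```

Scores near 1 mean the field points along the candidate limb, so the two joints likely belong to the same person; the scores for all candidate pairs can then be fed to a matching step to pick the final connections.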

Bottom-Up: OpenPose • Experiments on MPII and COCO Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh, CVPR 2017

Bottom-Up: Associative Embedding • Motivation • Part Detector (Hourglass) + Assemble (AE) Associative Embedding: End-to-End Learning for Joint Detection and Grouping, Alejandro Newell, Zhiao Huang, Jia Deng, NIPS 2017
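In the "Assemble (AE)" step, every detected joint carries a scalar tag, and joints whose tags are close are grouped into the same person. The sketch below is a simplified greedy version of that grouping, offered as an illustration only: the published method orders and matches candidates more carefully, and the `threshold` value and data layout here are assumptions.

```python
def group_by_tags(detections, threshold=1.0):
    """Greedily group joint candidates into people by tag distance.

    detections: one list per joint type, each holding (x, y, tag) candidates.
    Returns a list of people, each {"joints": {joint_idx: (x, y)}, "tags": [...]}.
    """
    people = []
    for joint_idx, candidates in enumerate(detections):
        for x, y, tag in candidates:
            best, best_dist = None, threshold
            for person in people:
                if joint_idx in person["joints"]:
                    continue  # this person already has this joint type
                mean_tag = sum(person["tags"]) / len(person["tags"])
                dist = abs(tag - mean_tag)
                if dist < best_dist:
                    best, best_dist = person, dist
            if best is None:
                # No existing person is close enough: start a new one.
                best = {"joints": {}, "tags": []}
                people.append(best)
            best["joints"][joint_idx] = (x, y)
            best["tags"].append(tag)
    return people
```

Because grouping depends only on tag distance, no separate person detector is needed, which is what makes the pipeline bottom-up.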
