3D Human Action Segmentation and Recognition using Pose Kinetic Energy



SLIDE 1

3D Human Action Segmentation and Recognition using Pose Kinetic Energy

Junjie Shan Srinivas Akella

Department of Computer Science University of North Carolina, Charlotte Charlotte, North Carolina

SLIDE 2

Action Recognition for Patient Safety

[Figure labels: Microsoft Kinect sensor; Linet bed]

SLIDE 3

Human Poses and Actions

  • Pose: Configuration (set of 3D joint coordinates) of the human body
  • Action: Sequence of poses

[Figure: five example poses, Pose 1 through Pose 5]

*Skeletons and RGB images are from the Cornell Activity Dataset

SLIDE 4

Action Recognition Problem

  • Action Recognition: Given a sequence of poses containing 3D skeleton data, what is the action type?

SLIDE 5

Challenges

  • Varied nature of human actions
  • Spatial variations
    – human body size differences
    – orientation and position change
    – pose estimation error
  • Temporal variations
    – non-linear stretching
    – random pauses
    – differences in number of repetitions
SLIDE 6

RGB-D Sensor Features

  • RGB + Depth + Coordinates
  • Depth + Coordinates: most common [LiZL12, WangLWY12, SungPSS12]

  • Coordinates [YangT14]

Source: http://pr.cs.cornell.edu/humanactivities/

SLIDE 7

Classification Approaches

  • Sequence based
    – HMM [CalinonB05], MEMM [SungPSS12], DTW [DarrellP93, GavrilaD95, ShaoL13]
    – Difficult to train
  • High-level feature extraction
    – Extract abstract, meaningful features [WangLWY12, YangT14]
    – Can use many machine learning algorithms

SLIDE 8

Our Approach

  • Normalize spatial features
    – Normalize all human poses to the same scale
    – Rotate and translate all poses to the same position and orientation
    – Repair or discard broken poses
  • Extract temporal features
    – Identify key poses, omit transition poses
    – Ignore random pauses
    – Segment repetitions
  • Apply a machine learning algorithm (Random Forest, SVM, KNN)
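
The spatial normalization steps listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slides do not say which joints anchor the normalization, so the joint indices `ROOT`, `NECK`, `LEFT_HIP`, and `RIGHT_HIP`, the use of the root-to-neck distance as the scale reference, and the restriction to rotation about the vertical (y) axis are all assumptions.

```python
import math

# Hypothetical joint indices; real skeleton formats number joints differently.
ROOT, NECK, LEFT_HIP, RIGHT_HIP = 0, 1, 2, 3


def normalize_pose(pose):
    """Translate, rotate about the vertical y axis, and scale one pose.

    pose: list of (x, y, z) joint coordinates.
    """
    # 1. Translate so the root joint sits at the origin.
    rx, ry, rz = pose[ROOT]
    moved = [(x - rx, y - ry, z - rz) for x, y, z in pose]

    # 2. Rotate about y so the left-to-right hip vector points along +x.
    hx = moved[RIGHT_HIP][0] - moved[LEFT_HIP][0]
    hz = moved[RIGHT_HIP][2] - moved[LEFT_HIP][2]
    angle = math.atan2(hz, hx)
    c, s = math.cos(angle), math.sin(angle)
    rotated = [(c * x + s * z, y, -s * x + c * z) for x, y, z in moved]

    # 3. Scale so the root-to-neck distance equals 1.
    torso = math.dist(rotated[ROOT], rotated[NECK])
    if torso == 0:
        raise ValueError("degenerate pose: root and neck coincide")
    return [(x / torso, y / torso, z / torso) for x, y, z in rotated]
```

Applying `normalize_pose` to every frame puts all subjects in a common body-centered frame, so later features depend on the motion rather than on where the person stood or how tall they are.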

SLIDE 9

Outline of Approach

SLIDE 10

Pose Kinetic Energy

  • Idea: Identify characteristic poses of an action, at the extrema of movements
  • Use kinetic energy
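
One way to make the kinetic energy idea concrete is to treat each joint as a unit mass and estimate its velocity by finite differences between consecutive frames. The sketch below assumes that formulation, since the formula itself is not preserved in this transcript; the function names and the frame interval `dt` are illustrative.

```python
def pose_kinetic_energy(prev_pose, pose, dt=1.0):
    """Kinetic energy of a pose, assuming unit mass per joint.

    The velocity of each joint is approximated by its displacement from
    the previous frame divided by the frame interval dt.
    """
    energy = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(prev_pose, pose):
        vx, vy, vz = (x1 - x0) / dt, (y1 - y0) / dt, (z1 - z0) / dt
        energy += 0.5 * (vx * vx + vy * vy + vz * vz)
    return energy


def energy_profile(poses, dt=1.0):
    """Energy for every frame; the first frame has no predecessor, so it gets 0."""
    return [0.0] + [
        pose_kinetic_energy(a, b, dt) for a, b in zip(poses, poses[1:])
    ]
```

Frames at the extrema of a movement, where the body momentarily stops before reversing or changing direction, show up as dips toward zero in this profile.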
SLIDE 11

Key Poses

  • Key poses: Poses that have zero kinetic energy

  • A key pose P* must satisfy E(P*)=0
  • In practice,
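
Measured joint positions are noisy, so exact zeros rarely occur in sensor data. The practical condition from the slide is not preserved in this transcript; the sketch below assumes a common workaround, which is to accept frames whose energy falls below a small threshold and to keep one frame per low-energy run. The threshold `e_min` and the function name are hypothetical.

```python
def find_key_poses(energies, e_min):
    """Return the frame indices treated as key poses.

    A frame qualifies when its energy is below e_min. Within each run of
    consecutive low-energy frames (for example a pause), only the
    lowest-energy frame is kept, so a long pause yields one key pose.
    """
    keys = []
    run = []  # indices of the current run of low-energy frames
    for i, e in enumerate(energies):
        if e < e_min:
            run.append(i)
        elif run:
            keys.append(min(run, key=lambda j: energies[j]))
            run = []
    if run:  # close a run that reaches the end of the sequence
        keys.append(min(run, key=lambda j: energies[j]))
    return keys
```

Collapsing each run to a single frame is one simple way to keep pauses of arbitrary length from producing extra key poses.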
SLIDE 12

Identifying Key Poses

SLIDE 13

Atomic Action Template

  • 5-tuple of key and intermediate poses
SLIDE 14

Atomic Action Template

  • Intermediate pose: Pose at the middle frame between two consecutive key poses
  • Atomic action templates used as features
  • Templates preserve temporal order, e.g., sit down versus stand up
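
The template construction can be sketched as below. The slides say only that a template is a 5-tuple of key and intermediate poses; the ordering assumed here (key, intermediate, key, intermediate, key over three consecutive key poses) and the function name are illustrative guesses.

```python
def atomic_action_templates(poses, key_indices):
    """Build 5-tuples (key, mid, key, mid, key) from consecutive key poses.

    poses: list of poses, one per frame.
    key_indices: sorted frame indices of the key poses.
    """
    templates = []
    for a, b, c in zip(key_indices, key_indices[1:], key_indices[2:]):
        mid_ab = (a + b) // 2  # middle frame between key poses a and b
        mid_bc = (b + c) // 2  # middle frame between key poses b and c
        templates.append(
            (poses[a], poses[mid_ab], poses[b], poses[mid_bc], poses[c])
        )
    return templates
```

Because each tuple lists its poses in frame order, playing a sequence backwards produces a different template, which is how an action such as sitting down stays distinguishable from standing up.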

SLIDE 15

Classification Results on Cornell Data

Tested on the Cornell Activity Dataset with:

  • Random Forest (RF)
  • Support vector machine (SVM)
  • K-Nearest neighbor (KNN)
  • Hidden Markov Model (HMM)
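
Once each template is flattened into a fixed-length vector, any standard classifier can be applied to it. As an illustration only, here is a plain k-nearest-neighbor vote written without external libraries. The function names, the Euclidean distance, and the default `k` are assumptions, and the sketch does not reproduce the evaluation reported in the talk.

```python
import math
from collections import Counter


def flatten_template(template):
    """Concatenate all joint coordinates of a template into one vector."""
    return [coord for pose in template for joint in pose for coord in joint]


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

A whole action sample usually yields several templates, so a per-sample label would still need some aggregation, such as a vote across its templates; that step is left out here.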
SLIDE 16

Results on Cornell Activity Dataset

SLIDE 17

Microsoft Action3D Dataset

SLIDE 18

Temporal Variations

  • Method works well on actions with small temporal variations
  • In fact, robust to significant temporal variations
  • Tested on randomly stretched action samples
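
The slides do not say how the stretched samples were generated. One plausible scheme, assumed here purely for illustration, inserts a random number of linearly interpolated frames between each pair of consecutive frames, so that different parts of the action slow down by different amounts. The function name and the parameters `max_extra` and `seed` are hypothetical.

```python
import random


def stretch_randomly(poses, max_extra=3, seed=None):
    """Return a temporally stretched copy of a pose sequence.

    Between each pair of consecutive frames, insert a random number
    (0..max_extra) of linearly interpolated frames.
    """
    if not poses:
        return []
    rng = random.Random(seed)
    stretched = []
    for a, b in zip(poses, poses[1:]):
        steps = rng.randint(0, max_extra) + 1  # sub-intervals in this segment
        for s in range(steps):
            t = s / steps  # t = 0 reproduces frame a itself
            stretched.append([
                tuple(pa + t * (pb - pa) for pa, pb in zip(ja, jb))
                for ja, jb in zip(a, b)
            ])
    stretched.append([tuple(j) for j in poses[-1]])
    return stretched
```

Interpolating rather than duplicating frames avoids creating artificial zero-motion frames, which would otherwise look like extra key poses to an energy-based detector.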

SLIDE 19

Random Temporal Stretching of Actions

[Figure panels: Original; Stretched]

SLIDE 20

Randomly Stretched Action Sample

  • Can still identify key poses in randomly stretched action samples

SLIDE 21

Results: Random Stretching

  • Cornell Activity Dataset
SLIDE 22

Conclusion

  • Method to extract features from 3D joint coordinates using kinetic energy, and to recognize actions from those features
  • Atomic action templates with key poses exhibit good discriminative power with multiple classifiers
  • Can perform as well as or better than existing methods while using less data
  • Works robustly on randomly stretched actions
SLIDE 23

Future Work

  • Inter-person variation is still a challenge
  • Identifying actions in the presence of noise and occlusion
  • Evaluation on streaming data
  • Testing on action samples that contain a mix of different types of actions

SLIDE 24

Acknowledgments

  • National Science Foundation Award IIS-1258335