Human Action Recognition Using Semi-Latent Topic Models
SE367 Paper Presentation - Deepak Pathak (10222)
Paper by Yang Wang and Greg Mori, 2009
Introduction
Human Action Recognition (What?)
Still Images (e.g., Poselets) vs. Video Sequences
Bag of words representation
Object Recognition with Bag of Words [Wang, Mori, 2009]
Learning features based on visual cues (motion + shape), such as optical flow
Generative models (e.g., HMM) and discriminative models (e.g., CRF) to model and learn features
Capturing local features, e.g., training an SVM over features obtained by STIP (space-time interest points)
The “Bag of Words” paradigm (analogous to NLP)
Word <-> CodeWord (each frame)
Topic <-> Action Label
Document <-> Video Sequence
Vocabulary <-> CodeBook (all codewords)
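A tiny illustration of this analogy: a video is a "document" whose "words" are per-frame codewords, and its bag-of-words representation is the codeword histogram. The codeword ids below are made up for illustration.

```python
# A video as a sequence of codewords (one per frame); the bag-of-words
# representation is just the histogram of codeword occurrences.
# Codeword ids here are illustrative, not from the paper.
from collections import Counter

codeword_sequence = [3, 3, 1, 0, 3, 1]   # one codeword per frame
bag_of_words = Counter(codeword_sequence)
# bag_of_words == Counter({3: 3, 1: 2, 0: 1})
```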
Track and stabilize the person
Compute optical flow, then frame descriptors
Compute a similarity measure between different frames, giving an affinity matrix (among all frames)
K-medoid clustering into V clusters
Codewords: the medoid of each cluster
* Here codewords capture large-scale features (containing overall temporal information)
* Each video is a sequence of frames, where each frame is represented by some codeword
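The clustering step above can be sketched as follows. This is a minimal k-medoids loop, assuming each frame is already summarized by an optical-flow descriptor vector; the random descriptors, the Euclidean distance, and the choice of V are illustrative stand-ins for the paper's own frame-similarity measure.

```python
# Sketch of codebook construction: k-medoid clustering of frame
# descriptors. Descriptors, distance measure, and V are placeholder
# assumptions, not the paper's exact choices.
import numpy as np

def build_codebook(frame_descriptors, V, n_iter=20, seed=0):
    """Return indices of the V medoid frames (codewords) and the
    codeword assignment of every frame."""
    X = np.asarray(frame_descriptors, dtype=float)
    n = len(X)
    # Pairwise distance matrix (the paper uses an affinity matrix
    # built from its own frame-similarity measure).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    rng = np.random.default_rng(seed)
    medoids = rng.choice(n, size=V, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)   # assign each frame
        new_medoids = medoids.copy()
        for k in range(V):
            members = np.where(labels == k)[0]
            if len(members) == 0:
                continue
            # New medoid: the member minimizing total intra-cluster distance
            within = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[k] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

# Toy usage: 60 frames with 8-dim descriptors, V = 5 codewords.
descriptors = np.random.default_rng(1).normal(size=(60, 8))
codewords, assignment = build_codebook(descriptors, V=5)
```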
LDA: a topic model that learns the distribution of topics (actions) given a document (video), and the distribution over words (codewords) for each topic (action).
CTM: uses a logistic normal distribution to properly capture the correlation of different topics in a document.
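A minimal sketch of how the logistic normal gives correlated topic proportions: draw from a Gaussian with a full covariance matrix, then map to the probability simplex with the logistic (softmax) transformation. The mean and covariance below are illustrative values, not learned parameters.

```python
# Logistic normal draw of topic proportions, as used by CTM.
# mu and Sigma are illustrative, not fitted model parameters.
import numpy as np

def sample_topic_proportions(mu, Sigma, rng):
    eta = rng.multivariate_normal(mu, Sigma)   # correlated Gaussian draw
    theta = np.exp(eta - eta.max())            # numerically stable softmax
    return theta / theta.sum()

rng = np.random.default_rng(0)
mu = np.zeros(3)
Sigma = np.array([[1.0, 0.8, 0.0],
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])   # topics 0 and 1 positively correlated
theta = sample_topic_proportions(mu, Sigma, rng)
# theta lies on the simplex: nonnegative entries summing to 1
```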
Supervised LDA (SLDA): introduces supervision into LDA by making use of the action labels present in the training dataset to learn the parameters of the probability distributions.
Supervised CTM (SCTM): analogous supervision for CTM. Note: we don't have to choose the number of topics, as topics are just equal to the class labels (unlike the unsupervised case).
Proposed Modification
For each frame, calculate its distribution over action labels, i.e., p(zi | W). Here we condition on the whole codeword sequence W instead of just the corresponding frame, so that the action label depends not only on the frame itself but on the video sequence as a whole.
The intractable posterior is approximated using another (simpler) distribution by minimizing the KL divergence between the two (variational EM: expectation maximization).
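The quantity minimized in the variational step, KL(q || p) between the simpler variational distribution q and the true posterior p, can be illustrated for two discrete distributions; the numbers below are made up.

```python
# KL divergence between two discrete distributions, the objective
# minimized when fitting the variational approximation.
import numpy as np

def kl_divergence(q, p, eps=1e-12):
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sum(q * np.log((q + eps) / (p + eps))))

q = [0.7, 0.2, 0.1]   # variational approximation (illustrative)
p = [0.6, 0.3, 0.1]   # "true" posterior (illustrative)
d = kl_divergence(q, p)
# d >= 0, with equality iff q == p
```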
For classification, assign each frame the action label with maximum probability; then, if the video contains a single action, perform majority voting over the frames (per-video classification).
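The per-video classification rule can be sketched as: each frame gets the action label with maximum posterior probability, and the video label is the majority vote over frames. The posterior matrix below is made up for illustration.

```python
# Per-video classification: per-frame argmax over p(zi | W), then a
# majority vote over frames. The posteriors are illustrative values.
from collections import Counter
import numpy as np

def classify_video(frame_posteriors):
    """frame_posteriors: (n_frames, n_actions) array of p(zi | W)."""
    frame_labels = np.argmax(frame_posteriors, axis=1)  # per-frame argmax
    return Counter(frame_labels.tolist()).most_common(1)[0][0]

posteriors = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.8, 0.1, 0.1],
])
label = classify_video(posteriors)   # majority vote -> action 0
```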
SLDA - 91.2% SCTM - 90.33%
SLDA - 100% SCTM - 100%
SLDA - 87.5% SCTM - 76.04%
SCTM - 78.64% SLDA - 77.81%
SCTM - 91.36% SLDA - 88.66%
CTM captures correlations better than LDA, and thus performs better on multiple-action video datasets (i.e., soccer and ballet).
Sample frames from the datasets
[Wang,Mori,2009]
1. A novel "bag of words" approach for representing video sequences, where each frame corresponds to a word, thus capturing large-scale features.
2. Two new models, SLDA and SCTM, which are supervised forms of LDA and CTM; training is easier, with better performance.
3. Per-frame labeling allows frame-wise classification, and thus the models work significantly well on datasets of videos containing multiple actions.
References:
1. Wang, Yang, and Greg Mori. "Human action recognition by semilatent topic models." IEEE Transactions on Pattern Analysis and Machine Intelligence 31.10 (2009): 1762-1774.
2. Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet allocation." Journal of Machine Learning Research 3 (2003): 993-1022.
3. Lucas, Bruce D., and Takeo Kanade. "An iterative image registration technique with an application to stereo vision." Proceedings of the 7th International Joint Conference on Artificial Intelligence, 1981.