BIL 722: Advanced Topics in Computer Vision Mehmet Kerim Y cel - PowerPoint PPT Presentation

BIL 722: Advanced Topics in Computer Vision Mehmet Kerim Y ü cel Deep Structured Models For Group Activity Recognition Deng et al. BMVC 2015 Simon Fraser University, SPORTLOGIQ , Canada

Overview 25 April 2016 Deep Structured Models For Group Activity Recognition - Deng, Zhiwei et al., BMVC 2015 - Simon Fraser University, SPORTLOGIQ, Canada Useful links - Paper http://arxiv.org/pdf/1506.04191.pdf - Code / BMVC presentation not available 2 Brunel University London Deep Structured Models For Group Activity Recognition

Overview 25 April 2016 • Individual / group activity recognition problem • Combining atomic action information with their dependencies • Use deep CNNs to learn atomic actions / scene labels • Then refine these labels with NN-made graphical models • State of the art achieved for Collective Activity and Nursing Home (?) Datasets 3 Brunel University London Deep Structured Models For Group Activity Recognition

Overview 25 April 2016 • Major contributions • First to combine CNNs and GM for group activity recognition • Message passing phase created by Neural Nets • Results comparable to state of the art 4 Brunel University London Deep Structured Models For Group Activity Recognition

Literature Review 25 April 2016 • Event understanding is a notorious problem • Need to acquire spot-on info on atomic actions • Such actions include walking, running, waving, etc... • Hand-crafted features (HOG, MBH, improved dense trajectories) in the context of BoW [ Wang, Heng, and Cordelia Schmid ] • Then feed these into a discriminative or generative model • These are swept by DL approaches [ Karpathy, Andrej et al.] [Simonyan, Karen, and Andrew Zisserman] 5 Brunel University London Deep Structured Models For Group Activity Recognition

Literature Review 25 April 2016 • Action Recognition with Improved Trajectories 1 • Improve dense trajectories with camera motion • Remove trajectories consistent with the estimated camera motion • Cancel out camera motion from optical flow 6 Brunel University London Deep Structured Models For Group Activity Recognition

Literature Review 25 April 2016 • Large-scale Video Classification with convolutional neural networks 2 • CNN variants experimented with (taking into account time-domain) 7 Brunel University London Deep Structured Models For Group Activity Recognition

Literature Review 25 April 2016 • Two-stream Convolutional Networks for Action Recognition in Videos 3 • Spatial stream net trained on single frame • Temporal stream net trained on optical flow 8 Brunel University London Deep Structured Models For Group Activity Recognition

Literature Review 25 April 2016 • Event understanding is a notorious problem • We need : interactions of individuals, higher level information • Such interactions and high level activities are suitable for a hierarchical structure • Rich features to capture context; social cues [ Lan, Tian, Leonid Sigal, and Greg Mori ] • Hierarchical Graphical Models [ Amer, Mohamed Rabie, Peng Lei, and Sinisa Todorovic ] • Dynamic Bayesian Networks [ Zhu, Yingying, Nandita Nayak, and Amit Roy-Chowdhury ] 9 Brunel University London Deep Structured Models For Group Activity Recognition

Literature Review 25 April 2016 • Combination of convolutional neural nets with graphical models • Tompson, Jonathan J., et al. one step message passing implemented as convolution operation, incorporating spatial relations between local responses for human body pose estimation • Deng, Jia, et al. relations between predicted labels considered via training a GM on top of a neural net with joint training 10 Brunel University London Deep Structured Models For Group Activity Recognition

Problem Statement & Motivation 25 April 2016 • Motivation of this work • Further the state of the art in group activity recognition • Accurately detect atomic actions/scene labels • Incorporate dependencies between labels for actions/activities • Perform label refinement through a hierarchical structure incorporating said dependencies ... via using a CNN and a HGM based on a neural net that mimics message passing 11 Brunel University London Deep Structured Models For Group Activity Recognition

Graphical Models in a Neural Network 25 April 2016 • Graphical Models... • Defines a joint distribution over states of a set of nodes • Take a factor graph; • Inference done by belief propagation • Belief propagation , at each step of message passing, collects relevant info from connected nodes to a factor node, then passes these messages to variable nodes 12 Brunel University London Deep Structured Models For Group Activity Recognition

Graphical Models in a Neural Network 25 April 2016 • Key point: Mimic message passing using a neural network! • Represent each combination of states as a neuron ( factor neuron ) • Factor neuron can learn dependencies between states and pass messages • Can adopt various neuron types (linear, ReLU, TanH, etc...) • Parameter sharing due to GM integration into NN; reduced free parameters 13 Brunel University London Deep Structured Models For Group Activity Recognition

Graphical Models in a Neural Network 25 April 2016 14 Brunel University London Deep Structured Models For Group Activity Recognition

Message Passing CNN Architecture 25 April 2016 • Key point: Two-stage architecture • First stage: Fine-tuned CNNs that produce scene scores for a frame, and action and pose scores for each person in that frame • Second stage: Message Passing NN that captures label dependencies 15 Brunel University London Deep Structured Models For Group Activity Recognition

Message Passing CNN Architecture 25 April 2016 16 Brunel University London Deep Structured Models For Group Activity Recognition

Message Passing CNN Architecture 25 April 2016 • First stage: Three separate CNNs, for scene, action and pose information • All are fine-tuned using an AlexNet architecture trained on ImageNet • Quite similar architecture, except pooling is done before normalization • Five convolutional layer, two FC layers with softmax output 17 Brunel University London Deep Structured Models For Group Activity Recognition

Message Passing CNN Architecture 25 April 2016 • Second stage: outputs of first stage taken as input • Can contain several steps of message passing • In each step, two types of passes occur: • from outputs of step k-1 to factor layer • from factor layer to k step outputs 18 Brunel University London Deep Structured Models For Group Activity Recognition

Message Passing CNN Architecture 25 April 2016 • Second stage: • In the k th message passing step, the first pass computes dependencies between the states • Inputs to this step; • The first term is the scene score of Image I for label g • The second term is the action score of person I m for label h • The third term is the pose score of person I m for label z 19 Brunel University London Deep Structured Models For Group Activity Recognition

Message Passing CNN Architecture 25 April 2016 • Second stage: • In the factor layer, interactions of pose, action and scene are calculated as; • α g,h,z is 3-d parameter template for combination of scene g, action h and pose z. 20 Brunel University London Deep Structured Models For Group Activity Recognition

Message Passing CNN Architecture 25 April 2016 • Second stage: • Pose actions for all people in the scene are calculated as; • r is all output nodes for all people, t is the factor neuron index for scene g. • T latent neurons are used for a scene g. • Parameters β & α are shared within factors with the same semantic meaning. 21 Brunel University London Deep Structured Models For Group Activity Recognition

Message Passing CNN Architecture 25 April 2016 • Second stage: Output of k th step message passing, score for the scene label g is; • • . is the factor node connected with scene g in scene-action-pose component. is the pose-global factor node. 22 Brunel University London Deep Structured Models For Group Activity Recognition

Message Passing CNN Architecture 25 April 2016 • Second stage: Output of k th step message passing, action score is; • • Pose score; • Model parameters are weights on the edges of NN. • Concatenation of weights from factor layer to output ( W ) (2 nd pass) β & α are weights from inputs to factor layer (1 st pass) • 23 Brunel University London Deep Structured Models For Group Activity Recognition

Components in Factor Layers 25 April 2016 • Unary component • Group activity scores for an image I, action and pose scores for each person I m in frame I • Acquired from previous message passing step and added to the output of next step 24 Brunel University London Deep Structured Models For Group Activity Recognition

Components in Factor Layers 25 April 2016 • Group activity-action-pose layer ϕ • Measure the compatibility between individuals and groups • Capture dependencies between a person’s fine -grained action and the scene label 25 Brunel University London Deep Structured Models For Group Activity Recognition

BIL 722: Advanced Topics in Computer Vision Mehmet Kerim Y cel - PowerPoint PPT Presentation

BIL 722: Advanced Topics in Computer Vision Mehmet Kerim Y cel Deep Structured Models For Group Activity Recognition Deng et al. BMVC 2015 Simon Fraser University, SPORTLOGIQ , Canada Overview 25 April 2016 Deep Structured Models For

BIL Manage Invest Structuring and servicing your investment fund needs Introducing BIL Manage

BIL-722 ADVANCED TOPICS IN COMPUTER VISION a da Ba , N10266943 Paper: Searching for

ASEAN FOR YOU Analysts Presentation 4Q11 RESULTS 1 27 February 2012 Financial Highlights of

Syllabus for CMSC 722, AI Planning Dana S. Nau University of Maryland 2:06 PM January 25,

2018 INSPIR ING TECHNOLOGICAL TR A N SFO R M ATION TOTAL APPROVED INVESTMENTS IN 2018 TOTAL

Civil Remotely Piloted Aircraft System (RPAS) Regulations in Australia Cees Bil School of

B L-722 ADVANCED TOPICS IN COMPUTER VISION a da Ba , N10266943 Paper: Robust Object

UNIVERSITY OF MASSACHUSETTS AMHERST OFFICE OF THE FACULTY SENATE From the 722 nd Meeting of the

Po p u la tio n G ro w th 324,722 350,000 300,000 Po pula tio n 250,000 200,000 150,000

Oscar Lizardi 943 casualties (221 killed, 722 wounded) 13 law enforcement killed 20 law

!"#$%"&$'(")*' +,-.."./ 01,$2.3'45'6&722,'!"#$%"&$

EMA Conference Raj Chudgar rchudgar@pogens.com 832-722-6388 www.pogens.com www.pogens.com

Caring'Safely Module'722 Organizational'Health Trauma'Informed'Work'and'ACEs 1

Chapter 20 Planning in Robotics Dana S. Nau CMSC 722, AI Planning University of Maryland,

1 & 2 KINGS 1 K 111 1K 122K 17 2K 1825 Single United Divided Kingdom Kingdom

CMSC 722, AI Planning Planning and Scheduling Dana S. Nau University of Maryland Fall 2009 Dana

Concurrency 2 Processes and Threads Alexandre David adavid@cs.aau.dk Credits for the

locks / cache coherency / spinlocks / other sync (intro) 1 Changelog 12 Feb 2020: add solution

Solving Concurrent Multiagent Planning using Classical Planning Daniel Furelos-Blanco and Anders

Memory consistency in C++ Computer Architecture J. Daniel Garca Snchez (coordinator) David

Theory of Interaction Yuxi Fu BASICS, Shanghai Jiao Tong University Talk at BASICS 2009,

Atoms Atoms (also called atomic formulas) over are formed according to this syntax: A , B ::=

Many-Sorted First-Order Model Theory Lecture 10 9 th July, 2020 1 / 48 Ehrenfeucht-Fra ss

Foundations of AI 9. Predicate Logic Syntax and Semantics, Normal Forms, Herbrand Expansion

BIL 722: Advanced Topics in Computer Vision Mehmet Kerim Y cel - PowerPoint PPT Presentation

BIL 722: Advanced Topics in Computer Vision Mehmet Kerim Y cel Deep Structured Models For Group Activity Recognition Deng et al. BMVC 2015 Simon Fraser University, SPORTLOGIQ , Canada Overview 25 April 2016 Deep Structured Models For

BIL Manage Invest Structuring and servicing your investment fund needs Introducing BIL Manage

BIL-722 ADVANCED TOPICS IN COMPUTER VISION a da Ba , N10266943 Paper: Searching for

ASEAN FOR YOU Analysts Presentation 4Q11 RESULTS 1 27 February 2012 Financial Highlights of

Syllabus for CMSC 722, AI Planning Dana S. Nau University of Maryland 2:06 PM January 25,

2018 INSPIR ING TECHNOLOGICAL TR A N SFO R M ATION TOTAL APPROVED INVESTMENTS IN 2018 TOTAL

Civil Remotely Piloted Aircraft System (RPAS) Regulations in Australia Cees Bil School of

B L-722 ADVANCED TOPICS IN COMPUTER VISION a da Ba , N10266943 Paper: Robust Object

UNIVERSITY OF MASSACHUSETTS AMHERST OFFICE OF THE FACULTY SENATE From the 722 nd Meeting of the

Po p u la tio n G ro w th 324,722 350,000 300,000 Po pula tio n 250,000 200,000 150,000

Oscar Lizardi 943 casualties (221 killed, 722 wounded) 13 law enforcement killed 20 law

!&quot;#$%&quot;&amp;$'(&quot;)*' +,-..&quot;./ 01,$2.3'45'6&amp;722,'!&quot;#$%&quot;&amp;$

EMA Conference Raj Chudgar rchudgar@pogens.com 832-722-6388 www.pogens.com www.pogens.com

Caring'Safely Module'722 Organizational'Health Trauma'Informed'Work'and'ACEs 1

Chapter 20 Planning in Robotics Dana S. Nau CMSC 722, AI Planning University of Maryland,

1 &amp; 2 KINGS 1 K 111 1K 122K 17 2K 1825 Single United Divided Kingdom Kingdom

CMSC 722, AI Planning Planning and Scheduling Dana S. Nau University of Maryland Fall 2009 Dana

Concurrency 2 Processes and Threads Alexandre David adavid@cs.aau.dk Credits for the

locks / cache coherency / spinlocks / other sync (intro) 1 Changelog 12 Feb 2020: add solution

Solving Concurrent Multiagent Planning using Classical Planning Daniel Furelos-Blanco and Anders

Memory consistency in C++ Computer Architecture J. Daniel Garca Snchez (coordinator) David

Theory of Interaction Yuxi Fu BASICS, Shanghai Jiao Tong University Talk at BASICS 2009,

Atoms Atoms (also called atomic formulas) over are formed according to this syntax: A , B ::=

Many-Sorted First-Order Model Theory Lecture 10 9 th July, 2020 1 / 48 Ehrenfeucht-Fra ss

Foundations of AI 9. Predicate Logic Syntax and Semantics, Normal Forms, Herbrand Expansion

!"#$%"&$'(")*' +,-.."./ 01,$2.3'45'6&722,'!"#$%"&$

1 & 2 KINGS 1 K 111 1K 122K 17 2K 1825 Single United Divided Kingdom Kingdom