Disentangling and Unifying Graph Convolutions for Skeleton-Based - PowerPoint PPT Presentation

Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition
Ziyu Liu 1,3, Hongwen Zhang 2, Zhenghao Chen 1, Zhiyong Wang 1, Wanli Ouyang 1,3
1 The University of Sydney; 2 University of Chinese Academy of Sciences & CASIA; 3 The University of Sydney, SenseTime Computer Vision Research Group, Australia

Agenda
• Overview
• Contributions
  1. Factorized Modeling → Unified Spatial-Temporal Modeling
  2. Adjacency Powering → Disentangling Neighborhoods
• Experiments & Results

Action Recognition from Skeletons
• Human actions can be efficiently represented by skeletons
• Free of background clutter / lighting conditions / clothing variations
[Figure: Input Video → Estimated 2D/3D Poses (Skeletons) → Skeleton-Based Action Recognition → Predicted Action: “Hand Shaking”]
Image Credit: Amir Shahroudy, Jun Liu, Tian-Tsong Ng, Gang Wang, "NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis", CVPR 2016

Previous Approaches
• Traditional
  • Handcrafted features (e.g. Vemulapalli et al., CVPR’14; Huang et al., CVPR’17)
  • CNNs / RNNs (e.g. Ke et al., Wang et al., Liu et al., CVPR’17; Si et al., ECCV’18)
  • Often overlook semantic connectivity patterns between joints
  [Figure credit: Huang et al., CVPR’17]
• Graph-Based (e.g. Shi et al., Li et al., Si et al., CVPR’19; Li et al., Wen et al., AAAI’19)
  • Graphs naturally capture the structure of human bodies: G = (V, E)
  • Joints → nodes, bones → edges
  • No hand-crafted node traversal

Preliminaries
• Actions as Graph Sequences
  • Structure: $N$-node graph with adjacency matrix $A$ (normalized: $\hat{A}$)
  • Features: joint locations $X$ over $T$ frames
  • Goal: learn to classify graph sequences
Each frame: structure $A \in \mathbb{R}^{N \times N}$, features $X_t \in \mathbb{R}^{N \times C}$
Entire action: structure $A \in \mathbb{R}^{N \times N}$, features $X \in \mathbb{R}^{T \times N \times C}$

Preliminaries
• Feature Learning with Graph Convolutional Nets (GCNs) (Kipf et al., ICLR’17)
  1. Neighborhood feature aggregation
  2. Layer-wise feature update
$X^{(l+1)} = \sigma\left( \hat{A} X^{(l)} \Theta^{(l)} \right)$, where $\hat{A} X^{(l)}$ is the neighborhood aggregation and applying $\Theta^{(l)}$ and $\sigma$ is the feature update
Each frame: structure $A \in \mathbb{R}^{N \times N}$, features $X_t \in \mathbb{R}^{N \times C}$
Entire action: structure $A \in \mathbb{R}^{N \times N}$, features $X \in \mathbb{R}^{T \times N \times C}$
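The two GCN steps (neighborhood aggregation, then feature update) can be sketched in NumPy. This is a minimal illustration, not the authors' code: it assumes the symmetric normalization with added self-loops from Kipf and Welling, ReLU as $\sigma$, and a hypothetical 5-joint chain skeleton; all function and variable names are made up for the example.

```python
import numpy as np

def normalize_adjacency(A):
    """Assumed normalization: D^{-1/2} (A + I) D^{-1/2} (Kipf & Welling style)."""
    A_tilde = A + np.eye(A.shape[0])           # add self-loops
    deg = A_tilde.sum(axis=1)                  # degrees are >= 1 because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, X, Theta):
    """Apply ReLU(A_hat @ X_t @ Theta) to every frame t at once.

    X: (T, N, C_in), A_hat: (N, N), Theta: (C_in, C_out) -> (T, N, C_out)
    """
    aggregated = A_hat @ X                     # 1. neighborhood aggregation (broadcast over T)
    return np.maximum(aggregated @ Theta, 0.0) # 2. feature update

# Hypothetical toy skeleton: 5 joints connected in a chain.
N, T, C_in, C_out = 5, 4, 3, 8
A = np.zeros((N, N))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

rng = np.random.default_rng(0)
X = rng.normal(size=(T, N, C_in))              # joint features over T frames
Theta = rng.normal(size=(C_in, C_out))         # learnable weights (random here)

A_hat = normalize_adjacency(A)
out = gcn_layer(A_hat, X, Theta)
print(out.shape)
```

The same adjacency is shared by every frame, so the layer mixes information across joints but never across time.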

Existing Graph-Based Approaches
1. Factorized Modeling: GCNs + temporal models (spatial aggregation, temporal aggregation); e.g. Li et al. CVPR’19, Shi et al. CVPR’19, Shi et al. CVPR’19, Yan et al. AAAI’18
2. Multi-Scale Graph Convolutions: GCNs over different adjacency powers; e.g. Li et al. CVPR’19, Li et al. AAAI’18

Previous Approach #1: Factorized Modeling
• Learn spatial-temporal features with spatial / temporal modules
  • Spatial: neighborhood aggregation (GCNs)
  • Temporal: node-wise sequence models (1D conv / recurrent)
[Figure: Spatial Aggregation followed by Temporal Aggregation (cf. Factorized 3D CNNs)]
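A factorized block can be sketched as a per-frame graph convolution followed by a per-joint 1D convolution over time. The sketch below is an illustration under assumptions, not any cited model: it uses a simple row-normalized adjacency with self-loops, a temporal kernel of size 3 with zero padding, and a hypothetical 5-joint chain. Feeding in a single impulse shows how far information travels in one block.

```python
import numpy as np

def spatial_gcn(A_hat, X, Theta):
    """Per-frame neighborhood aggregation: ReLU(A_hat @ X_t @ Theta)."""
    return np.maximum(A_hat @ X @ Theta, 0.0)

def temporal_conv(H, W):
    """Node-wise 1D convolution over time.

    H: (T, N, C), W: (k, C, C_out) with odd k; zero-padded so T is preserved.
    """
    k = W.shape[0]
    pad = k // 2
    T = H.shape[0]
    Hp = np.pad(H, ((pad, pad), (0, 0), (0, 0)))
    # Each kernel tap i mixes channels of the frame at temporal offset i - pad.
    return sum(Hp[i:i + T] @ W[i] for i in range(k))

# Hypothetical toy skeleton: 5 joints in a chain, 5 frames.
N, T, C_in, C_mid, C_out = 5, 5, 2, 4, 4
A = np.zeros((N, N))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
A_tilde = A + np.eye(N)
A_hat = A_tilde / A_tilde.sum(axis=1, keepdims=True)   # assumed row normalization

rng = np.random.default_rng(0)
Theta = rng.normal(size=(C_in, C_mid))
W = rng.normal(size=(3, C_mid, C_out))

# Impulse: only joint 0 at frame 2 carries a signal.
X = np.zeros((T, N, C_in))
X[2, 0, :] = 1.0

out = temporal_conv(spatial_gcn(A_hat, X, Theta), W)
print(out.shape)
```

After one block the impulse can only reach joints 0 and 1 in frames 1 to 3; every other entry of `out` is exactly zero, which illustrates why spatial-temporal dependencies need several stacked blocks under factorization.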

Motivation #1: Indirect Information Flow
• Factorization can create bottlenecks for feature propagation
• Unweighted message passing (GCNs) can also make aggregated features generic
• Result: an information bottleneck that makes spatial-temporal relationships hard to learn (cf. Factorized 3D CNNs)

Idea #1: Unified Spatial-Temporal Modeling
• G3D modules: neighborhood aggregation across space and time
• Edges serve as skip connections, allowing more direct information flow
[Figure: G3D spatial-temporal information flow]

Idea #1: Unified Spatial-Temporal Modeling
[Figure: G3D module. Skeleton features on the spatial graph are gathered by a sliding temporal window (window size = $\tau$, dilation = $d$); spatial edges, temporal edges, and G3D spatial-temporal edges connect the joints inside each window.]
(1) Sliding temporal window over the skeleton features $X$, giving window features $X_{(\tau)}$
(2) Extrapolate spatial connectivity across the window: $A$ ($N \times N$) → $A_{(\tau)}$ ($\tau N \times \tau N$)
(3) Graph convolutions (GCN) over windows
(4) Squeeze windows with 1x1 conv, then collapse the window (reshape + FC, BatchNorm)
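The four steps can be sketched in NumPy as follows. This is a rough sketch under stated assumptions rather than the authors' implementation: the block adjacency is built by tiling $A + I$ into a $\tau \times \tau$ grid, windows are zero-padded at the sequence ends, the window is collapsed with a single linear map (BatchNorm and the 1x1 conv are omitted), and all names (`sliding_windows`, `block_adjacency`, `g3d`, `Theta`, `W_out`) are hypothetical.

```python
import numpy as np

def sliding_windows(X, tau, d):
    """(1) For each frame, gather tau frames spaced d apart, centered on it.

    X: (T, N, C) -> (T, tau, N, C); tau must be odd, ends are zero-padded.
    """
    T = X.shape[0]
    pad = (tau - 1) * d // 2
    Xp = np.pad(X, ((pad, pad), (0, 0), (0, 0)))
    return np.stack([Xp[s * d : s * d + T] for s in range(tau)], axis=1)

def block_adjacency(A, tau):
    """(2) Tile (A + I) into a (tau*N, tau*N) matrix and normalize it.

    Every joint is linked to itself and its spatial neighbors in all
    frames of the window.
    """
    A_tilde = A + np.eye(A.shape[0])
    A_big = np.tile(A_tilde, (tau, tau))
    d_inv_sqrt = 1.0 / np.sqrt(A_big.sum(axis=1))
    return A_big * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def g3d(X, A, Theta, W_out, tau=3, d=1):
    """X: (T, N, C_in), Theta: (C_in, C_mid), W_out: (tau*C_mid, C_out)."""
    T, N, _ = X.shape
    win = sliding_windows(X, tau, d).reshape(T, tau * N, -1)  # (T, tau*N, C_in)
    A_tau = block_adjacency(A, tau)
    H = np.maximum(A_tau @ win @ Theta, 0.0)                  # (3) GCN over each window
    H = H.reshape(T, tau, N, -1).transpose(0, 2, 1, 3)        # (T, N, tau, C_mid)
    return H.reshape(T, N, -1) @ W_out                        # (4) collapse the window

# Hypothetical toy input: 5-joint chain, 6 frames.
N, T, C_in, C_mid, C_out, tau, d = 5, 6, 3, 4, 8, 3, 2
A = np.zeros((N, N))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

rng = np.random.default_rng(0)
X = rng.normal(size=(T, N, C_in))
Theta = rng.normal(size=(C_in, C_mid))
W_out = rng.normal(size=(tau * C_mid, C_out))

windows = sliding_windows(X, tau, d)
A_tau = block_adjacency(A, tau)
out = g3d(X, A, Theta, W_out, tau=tau, d=d)
print(windows.shape, A_tau.shape, out.shape)
```

Note that `Theta` has the same shape regardless of `tau`; only the collapse step in this sketch grows with the window size.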

Discussion
• Analogous to 3D convolutions on RGB videos
• Unlike 3D conv, the number of parameters is independent of receptive field size
• Temporal receptive field can be controlled based on input resolution
• Considers more information at once and helps prevent losing features during unweighted spatial aggregation
• Memory footprint
[Figure: Spatial-Temporal Neighborhood Aggregation]

Previous Approach #2: Multi-Scale Graph Convolutions
• Making $k$-hop neighbors reachable with $\tilde{A}^k$
• Mixing features $\hat{A}^k X$ for $k = 0, 1, \ldots$ with normalized $\hat{A}$
Multi-scale aggregation with different $\hat{A}^k$ and $\Theta_{(k)}$:
$X_t^{(l+1)} = \sigma\left( \sum_{k=0}^{K} \hat{A}^k X_t^{(l)} \Theta_{(k)}^{(l)} \right)$
[Figure: example graph with numbered nodes, shown for adjacency powers $A^1$, $A^2$, $A^3$]
e.g. Li et al. CVPR’19, Abu-El-Haija et al. ICML’19, Luan et al. NeurIPS’19, Liao et al. ICLR’19
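Multi-scale aggregation over adjacency powers sums one graph convolution per scale $k$, each with its own weight matrix. A minimal NumPy sketch follows; it assumes ReLU as $\sigma$, a symmetric normalization with self-loops, and a hypothetical 5-joint chain, and the function names are made up for the example.

```python
import numpy as np

def normalize_adjacency(A):
    """Assumed normalization: D^{-1/2} (A + I) D^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def multi_scale_gcn(A_hat, X, Thetas):
    """ReLU( sum_{k=0}^{K} A_hat^k @ X_t @ Theta_(k) ) for every frame t.

    X: (T, N, C_in); Thetas: list of K+1 arrays of shape (C_in, C_out).
    """
    A_k = np.eye(A_hat.shape[0])       # A_hat^0 = I
    total = 0.0
    for Theta_k in Thetas:
        total = total + A_k @ X @ Theta_k
        A_k = A_k @ A_hat              # next power of the adjacency
    return np.maximum(total, 0.0)

# Hypothetical toy skeleton: 5 joints in a chain.
N, T, C_in, C_out, K = 5, 4, 3, 6, 3
A = np.zeros((N, N))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

rng = np.random.default_rng(0)
X = rng.normal(size=(T, N, C_in))
Thetas = [rng.normal(size=(C_in, C_out)) for _ in range(K + 1)]

A_hat = normalize_adjacency(A)
out = multi_scale_gcn(A_hat, X, Thetas)
print(out.shape)
```

With a single scale (`Thetas[:1]`) the layer reduces to a per-joint linear map followed by ReLU, since $\hat{A}^0$ is the identity.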

Motivation #2: Biased Node Weighting
Multi-scale aggregation with different $\hat{A}^k$ and $\Theta_{(k)}$: $X_t^{(l+1)} = \sigma\left( \sum_{k=0}^{K} \hat{A}^k X_t^{(l)} \Theta_{(k)}^{(l)} \right)$
• Node weights from $\hat{A}^k$ are biased towards closer nodes
• More length-$k$ walks to closer nodes due to cyclic walks
[Figure: two plots of Number of Walks vs. Walk Length $k$. Left: number of length-$k$ walks from Node 1 to each of Nodes 1 to 5. Right: number of length-$k$ walks to self for Nodes 1 to 5.]

Motivation #2: Biased Node Weighting
Multi-scale aggregation with different $\hat{A}^k$ and $\Theta_{(k)}$: $X_t^{(l+1)} = \sigma\left( \sum_{k=0}^{K} \hat{A}^k X_t^{(l)} \Theta_{(k)}^{(l)} \right)$
• Node weights from $\hat{A}^k$ are biased towards closer nodes
• More length-$k$ walks to closer nodes due to cyclic walks
[Figure: Number of Walks vs. Walk Length $k$ (1 to 6): number of length-$k$ walks to self for Nodes 1 to 5, with and without self-loops.]
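The walk-count argument can be checked directly, because entry $(i, j)$ of the $k$-th power of an adjacency matrix counts the length-$k$ walks from $i$ to $j$. The sketch below uses a hypothetical 5-node path graph with self-loops (not the graph from the slides) and plain Python integer arithmetic.

```python
def matmul(P, Q):
    """Multiply two square integer matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][m] * Q[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def walk_counts(A, k):
    """Return A^k; entry [i][j] is the number of length-k walks from i to j."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]  # identity = A^0
    for _ in range(k):
        P = matmul(P, A)
    return P

# Hypothetical path graph 0-1-2-3-4 with a self-loop on every node.
n = 5
A = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]

for k in range(1, 5):
    print(k, walk_counts(A, k)[0])   # walks of length k starting at node 0

from_node0 = walk_counts(A, 4)[0]
```

For walks of length 4 starting at node 0, the nearby nodes 0, 1, and 2 receive 9, 12, and 9 walks, while nodes 3 and 4 receive only 4 and 1, so features of distant joints are weighted far less after powering.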
