Learning SPD-matrix-based Representation for Visual Recognition


SLIDE 1

Learning SPD-matrix-based Representation for Visual Recognition

Lei Wang, VILA group, School of Computing and Information Technology, University of Wollongong, Australia

22-OCT-2018

SLIDE 2

Introduction

  • How to represent an image?

– Scale, rotation, illumination, occlusion, background clutter, deformation, …

[Figure: images of a cat under such variations]

SLIDE 3
1. Before year 2000

  • Hand-crafted, global features

– Color, texture, shape, structure, etc.
– Goal: “Invariant and discriminative”

  • Classifier

– K-nearest neighbor, SVMs, Boosting, …
SLIDE 4
2. Days of the Bag of Features (BoF) model

Local Invariant Features

  • Invariant to view angle, rotation, scale, illumination, clutter, ...
  • Obtained by interest point detection or dense sampling

An image becomes “a bag of features”.

SLIDE 5
3. Era of Deep Learning

Deep Local Descriptors

[Figure: a CNN maps an image to a Depth × Height × Width feature map, predicting the label “Cat”; each spatial location yields a deep local descriptor.]

SLIDE 6

Image(s): a set of points/vectors

How to pool a set of points/vectors to obtain a global visual representation?

Applications: object detection & classification, image set classification, action recognition, neuroimaging analysis.

SLIDE 7

Covariance representation

A set of local descriptors x1, x2, ..., xn: how to pool?

  • Max pooling, average (sum) pooling, etc.
  • Covariance pooling: essentially a second-order pooling
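To make the second-order pooling concrete, here is a minimal NumPy sketch (array shapes and names are illustrative, not from the talk):

```python
import numpy as np

def covariance_pool(X):
    """Pool n local d-dimensional descriptors (rows of X) into a d x d
    sample covariance matrix, i.e. a second-order representation."""
    Xc = X - X.mean(axis=0)              # center the descriptors
    return Xc.T @ Xc / (X.shape[0] - 1)  # d x d covariance (symmetric, PSD)

X = np.random.randn(196, 64)             # e.g. 14 x 14 CNN locations, 64 channels
C = covariance_pool(X)                   # global representation of the image
```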

SLIDE 8
  • Introduction on Covariance representation
  • Our research work

– Discriminatively Learning Covariance Representation
– Exploring Sparse Inverse Covariance Representation
– Moving to Kernel-matrix-based Representation (KSPD)
– Learning KSPD in deep neural networks

  • Conclusion
SLIDE 9

Introduction on Covariance representation

Covariance Matrix


SLIDE 10

Introduction on Covariance representation

Use a Covariance matrix as a feature representation


Image is from http://www.statsref.com/HTML/index.html?multivariate_distributions.html

SLIDE 11

Introduction on Covariance representation

A covariance matrix is a Symmetric Positive Definite (SPD) matrix; it resides on a manifold instead of the whole Euclidean space.

SLIDE 12

Introduction on Covariance representation


How to measure the similarity of two SPD matrices?

SLIDE 13

Introduction on SPD matrix

Similarity measures for SPD matrices

Three routes to similarity measures for SPD matrices: geodesic distance, Euclidean mapping, and kernel methods.

SLIDE 14

Introduction on SPD matrix

Geodesic distance

  • Förstner W, Moonen B. A metric for covariance matrices. Geodesy-The Challenge of the 3rd Millennium, 2003.
  • Fletcher P T. Principal geodesic analysis on symmetric spaces: Statistics of diffusion tensors. Computer Vision and Mathematical Methods in Medical and Biomedical Image Analysis, 2004.
  • Pennec X, Fillard P, Ayache N. A Riemannian framework for tensor computing. IJCV, 2006.
  • Lenglet C, et al. Statistics on the manifold of multivariate normal distributions: Theory and application to diffusion tensor MRI processing. Journal of Mathematical Imaging and Vision, 2006.
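For reference, the geodesic (affine-invariant) distance from the Riemannian framework of Pennec et al. cited above takes the form

```latex
d_{\mathrm{AIRM}}(A,B) = \left\| \log\!\left(A^{-1/2}\, B\, A^{-1/2}\right) \right\|_{F},
```

where log is the matrix logarithm and the norm is the Frobenius norm.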

SLIDE 15

Introduction on SPD matrix

Euclidean mapping

  • Veeraraghavan A, et al. Matching shape sequences in video with applications in human movement analysis. IEEE Transactions on PAMI, 2005.
  • Arsigny V, et al. Log-Euclidean metrics for fast and simple calculus on diffusion tensors. Magnetic Resonance in Medicine, 2006.
  • Tuzel O, et al. Pedestrian detection via classification on Riemannian manifolds. IEEE Transactions on PAMI, 2008.
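The Log-Euclidean mapping of Arsigny et al. flattens the manifold through the matrix logarithm, so SPD matrices can be compared with an ordinary Euclidean norm:

```latex
d_{\mathrm{LE}}(A,B) = \left\| \log A - \log B \right\|_{F}
```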

SLIDE 16

Introduction on SPD matrix

Kernel methods

  • Sra S. Positive definite matrices and the S-divergence. arXiv preprint arXiv:1110.1773, 2011.
  • Harandi M, et al. Sparse coding and dictionary learning for SPD matrices: a kernel approach. ECCV, 2012.
  • Wang R, et al. Covariance discriminative learning: A natural and efficient approach to image set classification. CVPR, 2012.
  • Vemulapalli R, Pillai J K, Chellappa R. Kernel learning for extrinsic classification of manifold features. CVPR, 2013.
  • Jayasumana S, et al. Kernel methods on the Riemannian manifold of symmetric positive definite matrices. CVPR, 2013.
  • Quang M H, et al. Log-Hilbert-Schmidt metric between positive definite operators on Hilbert spaces. NIPS, 2014.

SLIDE 17

Introduction on SPD matrix

Integration with deep learning

  • Lin et al. Bilinear CNN Models for Fine-grained Visual Recognition. ICCV, 2015.
  • Ionescu et al. Matrix Backpropagation for Deep Networks with Structured Layers. ICCV, 2015.
  • Huang et al. A Riemannian Network for SPD Matrix Learning. AAAI, 2017.
  • Li et al. Is Second-order Information Helpful for Large-scale Visual Recognition? ICCV, 2017.
  • Lin and Maji. Improved Bilinear Pooling with CNN. BMVC, 2017.
  • Koniusz et al. A Deeper Look at Power Normalizations. CVPR, 2018.

SLIDE 18
  • Introduction on Covariance representation
  • Our research work

– Discriminatively Learning Covariance Representation
– Exploring Sparse Inverse Covariance Representation
– Moving to Kernel-matrix-based Representation (KSPD)
– Learning KSPD in deep neural networks

  • Conclusion
SLIDE 19

Motivation

Covariance Matrix

The covariance matrix needs to be estimated from data.

SLIDE 20

Motivation

  • Covariance estimates become unreliable with

– High-dimensional (d) features
– Small sample size (n)

  • Existing work

– Does not consider the quality of the covariance representation
– Especially the estimate of its eigenvalues

SLIDE 21

Motivation

Stein Kernel

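The formula on this slide is presumably the Stein kernel built on the S-divergence (Sra, 2011, cited earlier):

```latex
S(A,B) = \log\det\!\left(\frac{A+B}{2}\right) - \frac{1}{2}\log\det(AB),
\qquad
k(A,B) = e^{-\theta\, S(A,B)}
```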

SLIDE 22

Motivation

  • Issue 1: Eigenvalue estimation becomes biased when the number of samples is inadequate

SLIDE 23

Motivation

  • Issue 2: The eigenvalues are not collectively manipulated toward greater discrimination

[Figure: eigenvalue spectra of Class 1 vs. Class 2]

SLIDE 24

Proposed method

Let’s do a data-dependent “eigenvalue massage”

[Figure: eigenvalue spectra of Class 1 and Class 2, before and after the adjustment]

SLIDE 25

Proposed method

We propose a “Discriminative Covariance Representation” with two schemes:

  • Power-based adjustment
  • Coefficient-based adjustment

SLIDE 26
Proposed method

Discriminative Stein kernel (DSK)

  • Adjusted S-divergence
  • Power-based adjustment
  • Coefficient-based adjustment
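As a sketch of the two schemes (notation mine; the exact parameterisation is in the TNNLS 2016 paper listed in the conclusion): with the eigen-decomposition X = U diag(λ1, ..., λd) Uᵀ, the adjusted matrix entering the S-divergence is

```latex
\hat{X}_{\mathrm{power}} = U \operatorname{diag}\!\left(\lambda_1^{\alpha_1},\ldots,\lambda_d^{\alpha_d}\right) U^{\top},
\qquad
\hat{X}_{\mathrm{coef}} = U \operatorname{diag}\!\left(c_1\lambda_1,\ldots,c_d\lambda_d\right) U^{\top},
```

with the adjustment parameters (the powers or the coefficients) learned from data.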

SLIDE 27

Proposed method

Discriminative Stein kernel (DSK): how to learn the optimal adjustment parameters?

  • Kernel Alignment based method
  • Class Separability based method
  • Radius-margin Bound based framework
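As one concrete criterion, kernel-target alignment scores how well a kernel matrix matches the ideal label kernel; a minimal sketch (function and variable names are illustrative, not the talk's exact formulation):

```python
import numpy as np

def kernel_alignment(K, y):
    """Alignment between an n x n kernel matrix K and labels y in {-1, +1}:
    <K, y y^T>_F / (||K||_F * ||y y^T||_F). Larger means better aligned."""
    Y = np.outer(y, y)  # ideal target kernel
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))
```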

SLIDE 28

Experimental Result

Data sets

  • Brodatz texture
  • ADNI rs-fMRI
  • ETH-80 object
  • FERET face

SLIDE 29

Experimental Result

The most difficult 15 pairs of the Brodatz texture data set

SLIDE 30

Experimental Result

The most difficult 15 pairs of the Brodatz texture data set

SLIDE 31

Discussion

DSK vs. eigenvalue-estimation improvement methods

[1] X. Mestre, “Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates,” IEEE Trans. Inf. Theory, vol. 54, pp. 5113–5129, Nov. 2008.
[2] B. Efron and C. Morris, “Multivariate empirical Bayes and estimation of covariance matrices,” Ann. Stat., vol. 4, pp. 22–32, 1976.
[3] A. Ben-David and C. E. Davidson, “Eigenvalue estimation of hyper-spectral Wishart covariance matrices from limited number of samples,” IEEE Trans. Geosci. Remote Sens., vol. 50, pp. 4384–4396, May 2012.

SLIDE 32
  • Introduction on Covariance representation
  • Our research work

– Discriminatively Learning Covariance Representation
– Exploring Sparse Inverse Covariance Representation
– Moving to Kernel-matrix-based Representation (KSPD)
– Learning KSPD in deep neural networks

  • Conclusion
SLIDE 33

Introduction

Applications with high feature dimensions but small sample sizes

Small sample size: 10 ~ 300; high dimensions: 50 ~ 400

SLIDE 34

Introduction

This results in a singular covariance estimate, which adversely affects the representation. How to address this? Combine data with prior knowledge: explore the underlying structure of visual features.

SLIDE 35

Proposed SICE representation

Structure sparsity in skeletal human action recognition:

  • Only a small number of joints are directly linked.
  • How to represent such direct links?

Sparse Inverse Covariance Estimation (SICE)
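SICE regularises the inverse covariance with an L1 penalty, which zeroes out entries for feature pairs with no direct link. A minimal sketch using scikit-learn's graphical lasso (data shapes are illustrative):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# SICE solves: max_Theta  log det(Theta) - tr(S Theta) - alpha * ||Theta||_1,
# where S is the sample covariance and alpha controls the structure sparsity.
X = np.random.randn(40, 20)               # e.g. 40 frames x 20 joint features
model = GraphicalLasso(alpha=0.1).fit(X)
Theta = model.precision_                  # sparse inverse covariance (SICE matrix)
```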

SLIDE 36

Proposed SICE representation


SLIDE 37

Proposed SICE representation

Properties of the SICE representation:

  • It is guaranteed to be nonsingular
  • It reduces over-fitting, giving a more reliable representation
  • It measures partial correlations, allowing the sparsity prior to be conveniently imposed

SLIDE 38

Application to Skeletal Action Recognition


SLIDE 39

Application to Skeletal Action Recognition

SLIDE 40

Application to other tasks

The principle of “bet on sparsity”

SLIDE 41
  • Introduction on Covariance representation
  • Our research work

– Discriminatively Learning Covariance Representation
– Exploring Sparse Inverse Covariance Representation
– Moving to Kernel-matrix-based Representation (KSPD)
– Learning KSPD in deep neural networks

  • Conclusion
SLIDE 42

Introduction

Again, look into the covariance representation

SLIDE 43

Introduction

Again, look into the covariance representation

Entry (i, j) relates the i-th feature to the j-th feature: it is just a linear kernel function! (See the identity below.)
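In symbols (notation mine): let f̃ᵢ be the centered vector of the i-th feature's n observations across the local descriptors; then each covariance entry is a linear kernel between feature vectors,

```latex
C_{ij} = \frac{1}{n-1}\,\tilde f_i^{\top} \tilde f_j = k_{\mathrm{lin}}\!\left(\tilde f_i, \tilde f_j\right).
```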

SLIDE 44

Introduction

Covariance representation

Resulting issues:

  • It only models linear correlations between features.
  • It has a single, fixed representation form.
  • The covariance estimate can be unreliable or even singular.
SLIDE 45

Proposed kernel-matrix representation

Let’s use a kernel matrix M instead.

Advantages:

  • It models nonlinear relationships between features;
  • For many kernels, M is guaranteed to be nonsingular, no matter what the feature dimension and sample size are;
  • It maintains the size of the covariance representation and its computational load.

Like the covariance matrix, M is an SPD matrix! (A sketch follows.)
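A minimal sketch of this idea: a kernel matrix over the feature dimensions, here with an RBF kernel (bandwidth handling and names are illustrative; see the ICCV 2015 paper in the conclusion for the exact formulation):

```python
import numpy as np

def ker_rp_rbf(X, gamma=0.5):
    """X: n x d matrix of n local descriptors. Returns the d x d SPD matrix
    M[i, j] = exp(-gamma * ||f_i - f_j||^2), where f_i holds the i-th
    feature's n (centered) observations."""
    F = (X - X.mean(axis=0)).T                    # d x n: one row per feature
    sq = (F ** 2).sum(axis=1)
    D2 = sq[:, None] + sq[None, :] - 2 * F @ F.T  # pairwise squared distances
    return np.exp(-gamma * np.maximum(D2, 0.0))   # RBF kernel matrix (SPD)

M = ker_rp_rbf(np.random.randn(196, 64))          # same 64 x 64 size as Cov-RP
```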

SLIDE 46

Application to Skeletal Action Recognition


SLIDE 47

Application to Skeletal Action Recognition


SLIDE 48

Application to Object Recognition


SLIDE 49

Application to Deep Learning Features


[Bar chart: comparison on the MIT Indoor Scenes data set (classification accuracy, %) for AlexNet (F7), VGG-19 Net (Conv5), Fisher Vector (CVPR15), Cov-RP, and Ker-RP (RBF).]

SLIDE 50

Discussion

SICE vs. Kernel matrix: which is better?


SLIDE 51

Discussion


SICE vs. Kernel matrix representation: which is better?

SLIDE 52
  • Introduction on Covariance representation
  • Our research work

– Discriminatively Learning Covariance Representation
– Exploring Sparse Inverse Covariance Representation
– Moving to Kernel-matrix-based Representation (KSPD)
– Learning KSPD in deep neural networks

  • Conclusion
SLIDE 53

Covariance representation

Integration with Deep Learning

Bilinear CNN Models for Fine-grained Visual Recognition, Lin et al, ICCV2015

SLIDE 54

Covariance representation

Integration with Deep Learning

Matrix Backpropagation for Deep Networks with Structured Layers, Ionescu et al, ICCV2015

SLIDE 55

Covariance representation

Integration with Deep Learning

Improved Bilinear Pooling with CNN, Lin and Maji, BMVC2017

SLIDE 56

Covariance representation

Integration with Deep Learning

Is Second-order Information Helpful for Large-scale Visual Recognition?, Li et al., ICCV2017

SLIDE 57

Proposed DeepKSPD

Motivation

  • The kernel-matrix-based SPD representation
– has not been developed upon deep local descriptors
– has not been jointly learned via deep learning
  • Existing matrix backpropagation for learning covariance representation via deep networks
– encounters numerical stability issues

SLIDE 58

Proposed DeepKSPD

Architecture and layers

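A hedged sketch of the KSPD head described on this slide (GRBF kernel on deep local descriptors, then matrix normalisation; names, shapes, and defaults are illustrative, not the paper's exact design):

```python
import numpy as np

def deep_kspd_head(F, gamma=0.5, alpha=0.5):
    """F: d x n deep local descriptors (d channels, n spatial locations).
    Returns a vectorised, alpha-rooted kernel-matrix (KSPD) representation."""
    sq = (F ** 2).sum(axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * F @ F.T))  # d x d GRBF kernel
    w, U = np.linalg.eigh(K)                                        # K is SPD
    H = (U * np.maximum(w, 1e-12) ** alpha) @ U.T                   # matrix alpha-rooting
    return H[np.triu_indices_from(H)]                               # upper triangle as vector
```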

SLIDE 59

Proposed DeepKSPD

Matrix backpropagation


SLIDE 60

Proposed DeepKSPD

Matrix backpropagation

Given a matrix function H = f(K) applied to the kernel matrix K, how do we backpropagate the loss gradient through it?

SLIDE 61

Proposed DeepKSPD

Existing matrix backpropagation

Matrix Backpropagation for Deep Networks with Structured Layers, Ionescu et al, ICCV2015

SLIDE 62

Proposed DeepKSPD

Result from the literature of Operator Theory (1951)
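This is presumably a Daleckii-Krein-type result on differentiating matrix functions. With K = U diag(λ) Uᵀ and H = f(K) = U diag(f(λ)) Uᵀ, the loss gradient backpropagates in one step through the Loewner matrix of first divided differences:

```latex
P_{ij} =
\begin{cases}
\dfrac{f(\lambda_i)-f(\lambda_j)}{\lambda_i-\lambda_j}, & \lambda_i \neq \lambda_j,\\[1ex]
f'(\lambda_i), & \lambda_i = \lambda_j,
\end{cases}
\qquad
\frac{\partial L}{\partial K} = U \left( P \circ \left( U^{\top} \frac{\partial L}{\partial H} U \right) \right) U^{\top}
```

Because the divided differences of a smooth f stay bounded, this form avoids the raw 1/(λᵢ - λⱼ) terms that cause the numerical stability issue noted in the motivation.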

SLIDE 63

Proposed DeepKSPD

Existing matrix backpropagation (Ionescu et al., ICCV 2015) vs. the proposed matrix backpropagation: what is their relationship?

SLIDE 64

Proposed DeepKSPD

Generalise to matrix α-rooting normalisation

SLIDE 65

Experimental Result

Fine-grained Image Recognition

SLIDE 66

Experimental Result

Fine-grained Image Recognition

SLIDE 67

Experimental Result

Numerical stability of backpropagation

SLIDE 68

Experimental Result

DeepKSPD vs DeepCOV

SLIDE 69

Experimental Result

Ablation study

  • Learning width θ in the GRBF kernel
  • Learning α in matrix α-rooting normalisation
SLIDE 70

Research trend on learning SPD representation

  • Consider higher-order feature relationship

Kernel Pooling for Convolutional Neural Networks, Cui et al, CVPR2017

SLIDE 71

Research trend on learning SPD representation

  • Improve the computational efficiency

– Compact Bilinear Pooling, Gao et al., CVPR 2016
– Low-rank Bilinear Pooling for Fine-Grained Classification, Kong et al., CVPR 2017
– Statistically-motivated Second-order Pooling, Yu and Salzmann, ECCV 2018

SLIDE 72

Conclusion

  • Discriminative Stein kernel to address two issues in covariance representation
  • SICE representation to incorporate structure sparsity
  • Kernel matrix representation to move beyond the linear, fixed covariance representation
  • End-to-end deep learning of the KSPD representation

Publications:

1. J. Zhang, L. Wang, L. Zhou, and W. Li, Learning Discriminative Stein Kernel for SPD Matrices and Its Applications, IEEE Transactions on Neural Networks and Learning Systems (TNNLS), Vol. 27, Issue 5, pp. 1020-1033, May 2016.
2. J. Zhang, L. Wang, L. Zhou, and W. Li, Exploiting Structure Sparsity for Covariance-based Visual Representation, arXiv:1610.08619 [cs.CV].
3. L. Wang, J. Zhang, L. Zhou, C. Tang and W. Li, Beyond Covariance: Feature Representation with Nonlinear Kernel Matrices, IEEE International Conference on Computer Vision (ICCV), December 2015.
4. M. Engin, L. Wang, L. Zhou, and X. Liu, DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition, The 15th European Conference on Computer Vision (ECCV), September 2018.

SLIDE 73

Conclusion

On-going Issues

  • Better understand SPD-matrix-based representation
– What is it modelling? What is its relationship to other pooling schemes?
  • Learn the optimal SPD representation from data
– Optimisation on manifolds, kernel learning, prior knowledge?
  • Computational issues
– How to deal with high-dimensional features and large data sets?
  • Beyond SPD representation
– Rectangular matrices
– Higher-order information
– Spatial or temporal order

SLIDE 74

Other related publications

  • J. Zhang, L. Zhou and L. Wang, Subject-adaptive Integration of Multiple SICE Brain Networks with Different Sparsity, Pattern Recognition, 63, 642-652, 2017.
  • L. Zhou, L. Wang, J. Zhang, Y. Shi and Y. Gao, Revisiting Distance Metric Learning for SPD Matrix based Visual Representation, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
  • L. Zhou, L. Wang, L. Liu, P. Ogunbona, and D. Shen, Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Volume 38, Issue 11, Nov. 2016.
  • J. Zhang, L. Zhou, L. Wang, and W. Li, Functional Brain Network Classification With Compact Representation of SICE Matrices, IEEE Transactions on Biomedical Engineering, 62(6), 1623-1634, 2015.
  • L. Zhou, L. Wang and P. Ogunbona, Discriminative Sparse Inverse Covariance Matrix: Application in Brain Functional Network Classification, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), June 2014.
  • L. Zhou, L. Wang, L. Liu, P. Ogunbona and D. Shen, Max-margin Based Learning for Discriminative Bayesian Network from Neuroimaging Data, In the 17th International Conference on MICCAI, September 2014.

SLIDE 75

Q&A

Images courtesy of Google Images