Learning Kernel-matrix-based Representation for Fine-grained Image Recognition
Lei Wang, VILA group
School of Computing and Information Technology
University of Wollongong, Australia
11-Dec-2019
Introduction
- Fine-grained image recognition
Image courtesy of "Wei et al., Deep learning for fine-grained image analysis: A survey"
Introduction
- Feature: how to represent an image?
– Scale, rotation, illumination, occlusion, deformation, …
– Differences with respect to other classes
– Ultimate goal: “invariant and discriminative”
Cat:
- Hand-crafted, global features
– Color, texture, shape, structure, etc.
– An image becomes a feature vector
- Shallow classifiers
– K-nearest neighbor, SVMs, Boosting, …
- 1. Before year 2000
- Invariant to view angle, rotation, scale, illumination, clutter, ...
- 2. Days of Bag of Features model (2003-12)
Local Invariant Features
Interest point detection
Dense sampling
An image becomes “A set of feature vectors”
- 3. Era of Deep Learning (since 2012)
Deep Local Descriptors
“Cat”
(feature map: depth × height × width)
Again, an image becomes “A set of feature vectors”
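The reshaping above can be sketched in a few lines. This is a minimal NumPy illustration, with random values standing in for real convolutional activations; the shapes (512 channels, 14 × 14 grid) are typical but assumed here:

```python
import numpy as np

# A CNN feature map of shape (depth, height, width); the values are
# random stand-ins for real convolutional activations.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((512, 14, 14))

# Flatten the spatial grid: each of the 14*14 locations yields one
# 512-dimensional deep local descriptor.
descriptors = feature_map.reshape(512, -1)
print(descriptors.shape)  # (512, 196)
```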
Image(s): a set of points/vectors
Applications: image set classification, action recognition, neuroimaging analysis, object recognition
How to pool a set of points/vectors to obtain a global visual representation?
Pooling operation
- Covariance pooling (second-order pooling)
A set of local descriptors: x1, x2, ..., xn
How to pool?
- Max pooling, average (sum) pooling, etc.
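The contrast between first-order and second-order (covariance) pooling can be sketched as follows. A minimal NumPy illustration; the function names are mine, and the descriptors are random stand-ins:

```python
import numpy as np

def average_pooling(X):
    """First-order pooling: mean over the n descriptors (columns of X)."""
    return X.mean(axis=1)                      # shape (d,)

def covariance_pooling(X):
    """Second-order pooling: d x d covariance of the descriptors."""
    Xc = X - X.mean(axis=1, keepdims=True)     # center each feature
    return Xc @ Xc.T / (X.shape[1] - 1)        # shape (d, d)

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 200))   # 200 local descriptors of dimension 64
C = covariance_pooling(X)
print(C.shape)               # (64, 64)
print(np.allclose(C, C.T))   # True: the covariance matrix is symmetric
```

Average pooling keeps only a d-dimensional mean, while covariance pooling keeps the pairwise interactions between feature components, which is what the rest of the talk builds on.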
Outline
- Introduction on Covariance representation
- Our research work
  – Moving to Kernel-matrix-based Representation (KSPD)
  – End-to-end learning KSPD in deep neural networks
- Conclusion
Introduction on Covariance representation
Use a covariance matrix as a feature representation
Image from http://www.statsref.com/HTML/index.html?multivariate_distributions.html
A covariance matrix belongs to the set of Symmetric Positive Definite (SPD) matrices; it resides on a manifold instead of the whole Euclidean space.
How to measure the similarity of two SPD matrices?
Introduction on SPD matrix
Similarity measures for SPD matrices
- Geodesic distance
- Euclidean mapping
- Kernel method
Geodesic distance
- Pennec X., Fillard P., Ayache N., "A Riemannian framework for tensor computing," IJCV, 2006
- Fletcher P. T., "Principal geodesic analysis on symmetric spaces: Statistics of diffusion tensors," Computer Vision and Mathematical Methods in Medical and Biomedical Image Analysis, 2004
- Förstner W., Moonen B., "A metric for covariance matrices," Geodesy - The Challenge of the 3rd Millennium, 2003
- Lenglet C., "Statistics on the manifold of multivariate normal distributions: Theory and application to diffusion tensor MRI processing," Journal of Mathematical Imaging and Vision, 2006
Euclidean mapping
- Arsigny V., et al., "Log-Euclidean metrics for fast and simple calculus on diffusion tensors," Magnetic Resonance in Medicine, 2006
- Veeraraghavan A., et al., "Matching shape sequences in video with applications in human movement analysis," IEEE Transactions on PAMI, 2005
- Tuzel O., et al., "Pedestrian detection via classification on Riemannian manifolds," IEEE Transactions on PAMI, 2008
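The Euclidean-mapping idea (e.g. the log-Euclidean metric of Arsigny et al. above) maps each SPD matrix into Euclidean space via the matrix logarithm and then compares matrices with the Frobenius norm. A minimal NumPy sketch; the helper names are mine:

```python
import numpy as np

def matrix_log(S):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)            # eigenvalues of an SPD matrix are > 0
    return V @ np.diag(np.log(w)) @ V.T

def log_euclidean_distance(S1, S2):
    """Log-Euclidean distance: Frobenius norm between matrix logarithms."""
    return np.linalg.norm(matrix_log(S1) - matrix_log(S2), ord='fro')

# Build two small SPD matrices (A @ A.T plus a ridge is SPD).
rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 5, 5))
S1 = A @ A.T + 1e-3 * np.eye(5)
S2 = B @ B.T + 1e-3 * np.eye(5)

print(round(log_euclidean_distance(S1, S1), 6))  # 0.0 (distance to itself)
```

After the log map, the manifold-valued matrices can be treated as ordinary vectors, which is what makes this family of methods fast and simple.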
Kernel methods
- Sra S., "Positive definite matrices and the S-divergence," arXiv preprint arXiv:1110.1773, 2011
- Harandi M., et al., "Sparse coding and dictionary learning for SPD matrices: A kernel approach," ECCV, 2012
- Wang R., et al., "Covariance discriminative learning: A natural and efficient approach to image set classification," CVPR, 2012
- Vemulapalli R., Pillai J. K., Chellappa R., "Kernel learning for extrinsic classification of manifold features," CVPR, 2013
- Jayasumana S., et al., "Kernel methods on the Riemannian manifold of symmetric positive definite matrices," CVPR, 2013
- Quang, Minh Ha, et al., "Log-Hilbert-Schmidt metric between positive definite operators on Hilbert spaces," NIPS, 2014
Integration with deep learning
- Lin et al., "Bilinear CNN Models for Fine-grained Visual Recognition," ICCV 2015
- Ionescu et al., "Matrix Backpropagation for Deep Networks with Structured Layers," ICCV 2015
- Gao et al., "Compact Bilinear Pooling," CVPR 2016
- Li et al., "Is Second-order Information Helpful for Large-scale Visual Recognition?" ICCV 2017
- Huang et al., "A Riemannian Network for SPD Matrix Learning," AAAI 2017
- Lin and Maji, "Improved Bilinear Pooling with CNN," BMVC 2017
- Wang et al., "G2DeNet: Global Gaussian Distribution Embedding Network and Its Application to Visual Recognition," CVPR 2017
- Cui et al., "Kernel Pooling for Convolutional Neural Networks," CVPR 2017
- Koniusz et al., "A Deeper Look at Power Normalizations," CVPR 2018
- Li et al., "Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization," CVPR 2018
- Wei et al., "Kernelized Subspace Pooling for Deep Local Descriptors," CVPR 2018
- Yu et al., "Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition," ECCV 2018
- Yu and Salzmann, "Statistically-motivated Second-order Pooling," ECCV 2018
- Lin et al., "Second-order Democratic Aggregation," ECCV 2018
- Gao et al., "Global Second-order Pooling Convolutional Networks," CVPR 2019
- Wang et al., "Deep Global Generalized Gaussian Networks," CVPR 2019
- Brooks et al., "Riemannian batch normalization for SPD neural networks," NeurIPS 2019
- Zheng et al., "Learning Deep Bilinear Transformation for Fine-grained Image Representation," NeurIPS 2019
- Fang et al., "Bilinear Attention Networks for Person Retrieval," ICCV 2019
Outline
- Introduction on Covariance representation
- Our research work
  – Moving to Kernel-matrix-based Representation (KSPD)
  – End-to-end learning KSPD in deep neural networks
- Conclusion
Motivation
Covariance Matrix
The covariance matrix needs to be estimated from data
Motivation
- Covariance estimate becomes singular with:
  – High-dimensional (d) features
  – Small sample size (n)
- A covariance matrix only characterises the linear correlation between feature components
Introduction
Applications with high dimensions but small sample issue
Typical regime: small sample size (10 ~ 300) with high feature dimensions (50 ~ 400)
Introduction
Look into Covariance representation
Each entry of the covariance matrix measures the relationship between the i-th and j-th features: it is just a linear kernel function!
Proposed kernel-matrix representation (ICCV15)
Let’s use a kernel function instead
Advantages:
- Models nonlinear relationships between features;
- For many kernels, M is guaranteed to be nonsingular, no matter what the feature dimension (d) and sample size (n) are;
- Maintains the size of the covariance representation at d x d, and therefore the computational load.
From a Covariance matrix to a Kernel matrix!
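The step from covariance to kernel matrix can be sketched concretely. Below is a minimal NumPy illustration of a kernel-matrix representation with an RBF kernel applied between feature dimensions (rows of X), in the spirit of the Ker-RP (RBF) representation named in the slides; the function name and width value are mine, and the data is random:

```python
import numpy as np

def ker_rp_rbf(X, theta=1.0):
    """Kernel-matrix representation: M[i, j] = k(x_i, x_j) with an RBF
    kernel, where x_i is the i-th feature (row of X) observed over the
    n samples. M stays d x d like a covariance matrix, but captures
    nonlinear relationships and stays nonsingular for distinct rows."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # pairwise squared dists
    d2 = np.maximum(d2, 0.0)                          # guard tiny negatives
    return np.exp(-d2 / (2.0 * theta**2))

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 20))   # d=64 features, only n=20 samples (n < d)
M = ker_rp_rbf(X, theta=2.0)
print(M.shape)                       # (64, 64)
print(np.allclose(np.diag(M), 1.0))  # True: k(x_i, x_i) = 1
```

Note that here n < d, where a sample covariance estimate would be singular, yet the RBF kernel matrix remains symmetric positive definite.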
Application to skeletal action recognition
* Cov_JH_SVM uses a kernel function to map each of the n samples into an infinite-dimensional space and implicitly computes a covariance matrix there.
Application to object recognition (handcrafted features)
Application to scene recognition (extracted deep features)
[Bar chart: classification accuracy of AlexNet (F7), VGG-19 Net (Conv5), Fisher Vector (CVPR15), Cov-RP, and Ker-RP (RBF)]
Comparison on MIT Indoor Scenes Data Set
(Classification accuracy in percentage) *
* Cimpoi et al., Deep filter banks for texture recognition and segmentation, CVPR2015
Outline
- Introduction on Covariance representation
- Our research work
  – Moving to Kernel-matrix-based Representation (KSPD)
  – End-to-end learning KSPD in deep neural networks
- Conclusion
Covariance representation: integration with Deep Learning
- Lin et al., "Bilinear CNN Models for Fine-grained Visual Recognition," ICCV 2015
- Ionescu et al., "Matrix Backpropagation for Deep Networks with Structured Layers," ICCV 2015
- Wang et al., "G^2DeNet: Global Gaussian Distribution Embedding Network and Its Application to Visual Recognition," CVPR 2017
- Lin and Maji, "Improved Bilinear Pooling with CNN," BMVC 2017
- Li et al., "Is Second-order Information Helpful for Large-scale Visual Recognition?" ICCV 2017
Proposed DeepKSPD (ECCV18)
Motivation
- The kernel-matrix-based SPD (KSPD) representation has not been developed upon deep local descriptors, and has not been jointly learned via deep learning
- The existing matrix backpropagation for learning covariance representation via deep networks encounters a numerical stability issue
Proposed DeepKSPD
Architecture and layers
Proposed DeepKSPD
Matrix backpropagation
Proposed DeepKSPD
Matrix backpropagation
Apply a matrix function H = f(K) to the kernel matrix K
Proposed DeepKSPD
Existing matrix backpropagation
Matrix Backpropagation for Deep Networks with Structured Layers, Ionescu et al, ICCV2015
Proposed DeepKSPD
A result from the Operator Theory literature (1951)
Proposed DeepKSPD
Existing matrix backpropagation (Ionescu et al., ICCV2015) vs. the proposed matrix backpropagation: what is their relationship?
Proposed DeepKSPD
Generalize to matrix α-rooting normalization
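Matrix α-rooting normalization raises an SPD matrix to a fractional power by acting on its eigenvalues. A minimal NumPy sketch of this operation (the forward pass only, not the backpropagation the slides derive); the function name and the eps clipping value are mine:

```python
import numpy as np

def matrix_alpha_rooting(K, alpha=0.5, eps=1e-12):
    """Matrix alpha-rooting: K^alpha = V diag(w^alpha) V^T, computed
    from the eigendecomposition of the SPD matrix K. Eigenvalues are
    clipped at eps as a simple numerical-stability guard."""
    w, V = np.linalg.eigh(K)
    w = np.maximum(w, eps)
    return V @ np.diag(w**alpha) @ V.T

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
K = A @ A.T + 1e-3 * np.eye(6)       # an SPD matrix

H = matrix_alpha_rooting(K, alpha=0.5)
print(np.allclose(H @ H, K))          # True: the 0.5-root squares back to K
```

With α = 0.5 this is the matrix square-root normalization; treating α as a learnable parameter, as in the slides, generalizes that special case.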
Experimental Result (ECCV18)
Fine-grained Image Recognition
Experimental Result
Numerical stability of backpropagation
Experimental Result
DeepKSPD vs DeepCOV
Experimental Result
Ablation study
- Learning width θ in the GRBF kernel
- Learning α in matrix α-rooting normalisation
Our archive website
https://saimunur.github.io/spd-archive/
[Bar chart: accuracy on CUB-200-2011 dataset (SPD & related methods)]
[Bar chart: accuracy on FGVC-Aircraft dataset (SPD & related methods)]
[Bar chart: accuracy on Stanford Cars dataset (SPD & related methods)]
Research trends on learning SPD representation
- Compactness of second-order feature representation & computational efficiency
- Efficient training of SPD structural layers by considering the underlying manifold structure
- Second-order correlation across layers
- Deeply integrated into convolutional neural networks
- More applications explored
- Generic and Fine-grained image recognition
- Image segmentation, Person reidentification and retrieval
- Action parsing & analysis, Image super-resolution
- More to be explored…
Key references
- 1. R. Wang, H. Guo, L. S. Davis and Q. Dai, "Covariance discriminative learning: A natural and efficient approach to image set classification," CVPR 2012, Providence, RI, pp. 2496-2503.
- 2. S. Jayasumana, R. Hartley, M. Salzmann, H. Li and M. Harandi, "Kernel Methods on the Riemannian Manifold of Symmetric Positive Definite Matrices," CVPR 2013, Portland, OR, pp. 73-80.
- 3. T. Lin, A. RoyChowdhury and S. Maji, "Bilinear CNN Models for Fine-Grained Visual Recognition," ICCV 2015, Santiago, pp. 1449-1457.
- 4. P. Li, J. Xie, Q. Wang and W. Zuo, "Is Second-Order Information Helpful for Large-Scale Visual Recognition?" ICCV 2017, Venice, pp. 2089-2097.
- 5. L. Wang, J. Zhang, L. Zhou, C. Tang and W. Li, "Beyond Covariance: Feature Representation with Nonlinear Kernel Matrices," ICCV 2015, Santiago, pp. 4570-4578.
- 6. M. Engin, L. Wang, L. Zhou and X. Liu, "DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition," ECCV 2018, Munich, pp. 629-645.
- 7. Second- and Higher-order Representations in Computer Vision, Tutorial at ICCV 2019.
- 8. SPD Representations Methods Archive, https://saimunur.github.io/spd-archive/
Q&A
Images courtesy of Google Images