Addressing Inter-Class Similarity in Fine-Grained Visual Classification


  1. Addressing Inter-Class Similarity in Fine-Grained Visual Classification
     Abhimanyu Dubey
     Collaborators: Nikhil Naik, Ryan Farrell, Otkrist Gupta, Pei Guo, Ramesh Raskar

  2. Fine-Grained Visual Classification
     - Image classification with target categories that are visually very similar
     - Classification within subcategories of the same larger visual category
     - Examples:
       - Identifying the make and model of a vehicle
       - Identifying species categorizations among flora/fauna

  3. Fine-Grained Visual Classification

  4. How is this different from large-scale classification?

  5. How is this different from large-scale classification?
     - Foreground vs. background:
       - Diverse (large-scale) problems: background context can be relevant to foreground classification
         - e.g., we are unlikely to come across an image of an airplane in someone's living room
       - Fine-grained problems: the background usually varies independently of the foreground class
         - e.g., many bird species can be photographed in the same setting

  6. How is this different from large-scale classification?
     - Inter-class and intra-class diversity:
       - In large-scale classification, the average visual diversity between classes is typically much larger than the variation within samples of the same class
       - In fine-grained classification:
         - samples within a class can vary significantly in background, pose, and lighting
         - samples across classes exhibit, on average, smaller diversity due to the minute differences between the foreground objects

  7. How is this different from large-scale classification?
     [Figure: samples from the same class (Labrador Retriever) vs. samples from different classes (Norfolk Terrier vs. Cairn Terrier)]

  8. How is this different from large-scale classification?
     - Data collection is harder:
       - Domains require expert knowledge to annotate
       - Datasets are smaller on average, often too small for directly training CNNs
     - Data is imbalanced:
       - Large-scale classification typically has a uniform distribution of labels in the training set
       - In FGVC, some classes may be harder to photograph, giving the label distribution a heavier tail

  9. Approaches to Fine-Grained Visual Classification
     - Object parts are common across classes:
       - We can utilize object-part annotations to remove unwanted context
       - Removes background sensitivity; part-based pooling introduces pose invariance [Cui et al., CVPR09]

  10. Explicit Part Localization [Cui et al., CVPR09]

  11. Part Alignment via Co-Segmentation [Krause et al., CVPR15]

  12. Bilinear Pooling [Lin et al., ICCV15; Cui et al., ICCV17; Gao et al., CVPR16]
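As a reference point for this family of methods, here is a minimal PyTorch sketch of bilinear pooling between two convolutional feature maps, with the signed square-root and L2 normalization described by Lin et al. (the function name is illustrative):

```python
import torch
import torch.nn.functional as F

def bilinear_pool(feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
    # feat_a: (B, C1, H, W) and feat_b: (B, C2, H, W) conv feature maps
    B, C1, H, W = feat_a.shape
    C2 = feat_b.shape[1]
    a = feat_a.reshape(B, C1, H * W)
    b = feat_b.reshape(B, C2, H * W)
    # outer product of the two local descriptors, pooled over all spatial locations
    x = torch.bmm(a, b.transpose(1, 2)) / (H * W)   # (B, C1, C2)
    x = x.reshape(B, C1 * C2)
    # signed square-root and L2 normalization, as in Lin et al.
    x = torch.sign(x) * torch.sqrt(torch.abs(x) + 1e-12)
    return F.normalize(x, dim=1)
```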

  13. Our Intuition:
      - Only experts among humans can identify a fine-grained class with certainty
      - During training, we would therefore expect confusion between classes, rather than memorization of each sample with complete confidence
      - For example: [Figure: two predicted distributions p(y|x) over classes dog1 ... dogN]

  14. Our Hypothesis:
      - Foreground objects in fine-grained samples do not have enough diversity between classes for strongly-discriminative training (minimizing cross-entropy on the training set) to generalize
      - Therefore, to reduce training error, models likely memorize samples based on non-generalizable artefacts (background, distractor objects, occlusions)

  15. A solution: make training less discriminative [ECCV18]
      - By the end of training, cross-entropy forces samples from different classes to have predictions that are very different from each other
      - The most obvious fix: can we bring predictions from different classes closer together?
      [Figure: a distance d(·,·) between the predicted distributions p(y|x) of two samples over classes dog1 ... dogN]

  16. Measuring divergence between predictions
      - KL-divergence: the standard divergence between probability distributions
      - Problem: asymmetric
      - Solution: consider the Jeffreys divergence, KL(p || q) + KL(q || p)
      - New problem: it grows arbitrarily large as predictions concentrate on one class
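For reference, the standard definitions of the two divergences discussed here:

```latex
\mathrm{KL}(p \,\|\, q) = \sum_{i} p_i \log \frac{p_i}{q_i},
\qquad
J(p, q) = \mathrm{KL}(p \,\|\, q) + \mathrm{KL}(q \,\|\, p)
        = \sum_{i} (p_i - q_i) \log \frac{p_i}{q_i}.
```

The Jeffreys divergence diverges whenever one distribution assigns vanishing mass to a class on which the other concentrates, which is exactly what happens as softmax predictions saturate during training.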

  17. Measuring divergence between predictions

  18. Alternative: Euclidean Distance
      - Symmetric
      - Easy to compute
      - Well-behaved: bounded, unlike the Jeffreys divergence
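Concretely, the squared Euclidean distance between two probability vectors is bounded (a standard fact, independent of the slide):

```latex
d(p, q) = \|p - q\|_2^2 = \sum_i (p_i - q_i)^2 \;\le\; 2,
```

with equality only when p and q are one-hot vectors on different classes, so the penalty cannot blow up the way the Jeffreys divergence can.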

  19. Training Pipeline: Pairwise Confusion (see the sketch below)
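A minimal PyTorch sketch of the pairwise-confusion objective under the assumptions above: each mini-batch is split into two halves to form sample pairs, and the Euclidean confusion term is applied to pairs whose labels differ. The function name and the weight lam are illustrative, not the paper's exact implementation:

```python
import torch
import torch.nn.functional as F

def pairwise_confusion_loss(logits_a, logits_b, targets_a, targets_b, lam=10.0):
    """Cross-entropy on both branches plus a Euclidean confusion term
    that pulls the predicted distributions of different-class pairs together."""
    ce = F.cross_entropy(logits_a, targets_a) + F.cross_entropy(logits_b, targets_b)
    p_a = F.softmax(logits_a, dim=1)
    p_b = F.softmax(logits_b, dim=1)
    # squared Euclidean distance between predicted distributions,
    # masked so that only pairs with different labels are penalized
    diff_class = (targets_a != targets_b).float()
    confusion = (diff_class * ((p_a - p_b) ** 2).sum(dim=1)).mean()
    return ce + lam * confusion
```

Both branches share the same network weights, so gradients from the confusion term flow through a single model.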

  20. Results: Fine-Grained Classification
      - We take baseline models and train them with our modified objective
      - "Basic" models (ResNets, Inception, DenseNets): average improvement of 4.5% in top-1 accuracy across 6 datasets
      - "Fine-grained" models (Bilinear Pooling, Spatial Transformer Networks): average improvement of 1.9% in top-1 accuracy across 6 datasets (a 4.5x larger relative improvement)
      - Training time and LR schedule are the same
      - Only minor variations in performance based on the choice of hyperparameter

  21. Results: Large-Scale vs. Fine-Grained
      - We want to compare the importance of "low visual diversity" to the performance of weakly-discriminative training
      - We subsampled all the dog classes from ImageNet (116 classes, ~117K points) and compared performance on this subset against a similarly sized random subsample of ImageNet
      - We obtained an average improvement of around 2.7% in top-1 accuracy on the dog subset, and only 0.18% on the random subset

  22. Results: Robustness to Distractors
      - We compared the overlap between the heatmaps returned by Grad-CAM on our models and the ground-truth object annotations:

  23. A lot was left to be desired:
      - Not really "principled": many different formulations can be derived from the intuition of "weakly" discriminative training
      - How does this objective affect generalization performance?
      - Can we quantify the notion of visual diversity?

  24. Entropy and Weakly Discriminative Training [NeurIPS18]
      - We would like the probability distribution learnt during training to have the weakest discriminatory power possible while still predicting the correct class
      - More formally, we would like p(y|x) to have the maximum entropy possible while ensuring that MODE(p(y|x)) = training label
      - However, directly enforcing a mode-alignment constraint is non-differentiable, so we relax this constraint and minimize cross-entropy
      - Additionally, we wish to maximize the entropy, giving the combined objective
        \mathcal{L}(\theta) = \mathbb{E}_{(x,y)}\Big[ \underbrace{-\log p_\theta(y \mid x)}_{\text{cross-entropy}} \;-\; \gamma \, \underbrace{H\big(p_\theta(\cdot \mid x)\big)}_{\text{entropy}} \Big]
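A minimal PyTorch sketch of this combined objective, assuming a scalar weight gamma on the entropy bonus (the function name and default value are illustrative):

```python
import torch
import torch.nn.functional as F

def max_entropy_loss(logits, targets, gamma=0.1):
    """Cross-entropy with an entropy bonus: fit the training labels
    while keeping the predicted distribution as high-entropy as possible."""
    ce = F.cross_entropy(logits, targets)
    log_p = F.log_softmax(logits, dim=1)
    entropy = -(log_p.exp() * log_p).sum(dim=1).mean()
    return ce - gamma * entropy   # subtracting entropy == maximizing it
```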

  25. Maximum-Entropy and Generalization: Analysis
      Preliminaries:
      - Since we are performing a fine-tuning task, we assume the pre-trained feature map \phi(x) to follow a multivariate mixture of m (unknown and possibly very large) Gaussians for any data distribution p(x):
        \phi(x) \sim \sum_{i=1}^{m} \pi_i \, \mathcal{N}(\mu_i, \Sigma_i)

  26. Maximum-Entropy and Generalization: Analysis
      Preliminaries:
      - Under this assumption, the variance of the feature space, given by the overall covariance matrix \Sigma^*, characterizes the variation of the features under the data distribution.
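For a Gaussian mixture, \Sigma^* has a standard closed form in terms of the component parameters (a known identity, not shown on the slide):

```latex
\mu^* = \sum_{i=1}^{m} \pi_i \mu_i,
\qquad
\Sigma^* = \sum_{i=1}^{m} \pi_i \left( \Sigma_i + \mu_i \mu_i^\top \right) - \mu^* {\mu^*}^{\top}.
```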

  27. Maximum-Entropy and Generalization: Diversity

  28. Maximum-Entropy and Generalization: Diversity
      To see how well this measure of diversity characterizes fine-grained problems, we look at the spread of GoogLeNet pool5 features projected onto their top-2 eigenvectors, for the ImageNet training set (red) and the CUB-2011 training set (blue):
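A small NumPy sketch of this kind of visualization, assuming the features are stacked into an (N, D) array (the helper name is illustrative; the exact preprocessing in the paper may differ):

```python
import numpy as np

def top2_projection(features: np.ndarray) -> np.ndarray:
    """Project (N, D) feature vectors onto the top-2 eigenvectors
    of their covariance matrix (the first two principal components)."""
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    top2 = eigvecs[:, np.argsort(eigvals)[::-1][:2]]  # top-2 by eigenvalue
    return centered @ top2                            # (N, 2) coordinates
```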
