Learning from Fine-Grained and Long-Tailed Visual Data
Yin Cui, Google Research
Dec 11, 2019
Visual Recognition System
Convolutional Neural Network (CNN)
“bird”
Supervised Learning
Database
Convolutional Neural Network (CNN)
“Northern cardinal”
Supervised Learning
Larger Database (more images, more classes)
Visual Recognition System
Convolutional Neural Network (CNN)
“Northern cardinal”
Supervised Learning
Even Larger Database
Visual Recognition System
Convolutional Neural Network (CNN)
“Northern cardinal”
Supervised Learning
Even Larger Database
Problems occur...
- Long-tailed
○ Majority of categories are rare
- Hard to get labels
○ Labeling effort grows dramatically per image.
○ Labels require human expertise.
In reality...
Convolutional Neural Network (CNN)
“bird”
Supervised Learning
Medium-sized Database
Convolutional Neural Network (CNN)
“Northern cardinal”
Supervised Learning
Transfer Learning
Large-Scale Dataset
But luckily, we have transfer learning
Medium-sized Database
A diverse array of data sources
Search Engine Social Network Communities
Can we build a generic, one-size-fits-all pre-trained model for transfer learning?
Large-scale pre-training
- D. Mahajan et al. Exploring the Limits of Weakly Supervised Pretraining. ECCV 2018.
- C. Sun et al. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era. ICCV 2017.
Generic vs. Specialized Model
- ImageNet pre-training vs. iNaturalist pre-training
- iNaturalist 2017 contains 859k images from 5000+ natural categories.
- Fine-tuned on 7 medium-sized datasets.
            CUB-200  Stanford Dogs  Flowers-102  Stanford Cars  Aircraft  Food-101  NA-Birds
ImageNet     82.84       84.19         96.26          91.31       85.49     88.65     82.01
iNat         89.26       78.46         97.64          88.31       82.61     88.80     87.91
Generic vs. Specialized Model
- ImageNet pre-training vs. iNaturalist pre-training
- iNaturalist 2017 contains 859k images from 5000+ natural categories.
- Fine-tuned on 7 medium-sized datasets.
- Combining ImageNet + iNat: more data doesn't always help.
                  CUB-200  Stanford Dogs  Flowers-102  Stanford Cars  Aircraft  Food-101  NA-Birds
ImageNet           82.84       84.19         96.26          91.31       85.49     88.65     82.01
iNat               89.26       78.46         97.64          88.31       82.61     88.80     87.91
ImageNet + iNat    85.84       82.36         97.07          91.38       85.21     88.45     83.98
Model Capacity is not a problem
- Combined training achieves similar performance on each source dataset.
- The model is able to learn both datasets well, but does not transfer well.
○ There is a trade-off between quantity and quality in transfer learning.
○ Pre-training a more specialized model could help.
Domain similarity via Earth Mover’s Distance
- Red: source domain. Green: target domain.
Cui et al. Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018.
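The domain-similarity measure can be sketched as follows: each domain is summarized by per-class prototype features (e.g. mean feature vectors) with weights proportional to image counts, the Earth Mover's Distance between the two weighted sets is solved as a small transportation LP, and similarity is exp(−γ · EMD). This is a minimal sketch, not the released code; the function names and the γ default are illustrative, and SciPy's HiGHS LP solver stands in for a dedicated EMD routine.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist

def emd(src_feats, src_weights, tgt_feats, tgt_weights):
    """Earth Mover's Distance between two weighted sets of class prototypes,
    solved as a transportation LP (weights are normalized to sum to 1)."""
    src_w = np.array(src_weights, dtype=float)
    tgt_w = np.array(tgt_weights, dtype=float)
    src_w /= src_w.sum()
    tgt_w /= tgt_w.sum()
    cost = cdist(src_feats, tgt_feats)  # pairwise ground distances
    m, n = cost.shape
    # Flow variables f_ij, flattened row-major; marginal equality constraints.
    A_eq = []
    for i in range(m):                  # row sums equal source weights
        row = np.zeros(m * n)
        row[i * n:(i + 1) * n] = 1.0
        A_eq.append(row)
    for j in range(n):                  # column sums equal target weights
        col = np.zeros(m * n)
        col[j::n] = 1.0
        A_eq.append(col)
    b_eq = np.concatenate([src_w, tgt_w])
    res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.fun                      # total flow is 1, so this is the EMD

def domain_similarity(src_feats, src_w, tgt_feats, tgt_w, gamma=0.01):
    """Similarity = exp(-gamma * EMD); gamma is a tunable scale."""
    return np.exp(-gamma * emd(src_feats, src_w, tgt_feats, tgt_w))
```

Identical domains give EMD 0 and similarity 1; the more the weighted prototypes diverge, the lower the similarity.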
Source domain selection
- Greedy selection strategy: sort and include most similar source classes.
○ Simple and no guarantee on the optimality, but works well in practice.
Cui et al. Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018.
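A hypothetical simplification of the greedy strategy: rank each source class by the distance from its prototype to the nearest target prototype, then keep the k most similar. The paper scores classes by their contribution to the EMD-based similarity; nearest-prototype distance is a stand-in here, and the function name is illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

def select_source_classes(src_feats, tgt_feats, k):
    """Greedy proxy for source-domain selection: sort source classes by
    distance to their nearest target prototype and keep the k closest."""
    d = cdist(src_feats, tgt_feats).min(axis=1)  # nearest-target distance
    order = np.argsort(d)                        # most similar first
    return order[:k]
```

As the slide notes, this kind of greedy ranking has no optimality guarantee, but it is simple and works well in practice.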
Improved Transfer Learning
- Comparable to the better of ImageNet and iNat pre-training, using only a subset of 585 source classes.
                   CUB-200  Stanford Dogs  Flowers-102  Stanford Cars  Aircraft  Food-101  NA-Birds
ImageNet            82.84       84.19         96.26          91.31       85.49     88.65     82.01
iNat                89.26       78.46         97.64          88.31       82.61     88.80     87.91
ImageNet + iNat     85.84       82.36         97.07          91.38       85.21     88.45     83.98
Ours (585-class)    88.76       85.23         97.37          90.58       86.13     88.37     87.89
Transfer Learning via Fine-tuning
- Transfer learning performance can be estimated by domain similarity.
Discussion
- In the AutoML setting:
○ We need a model that performs well on a small, usually domain-specific, dataset.
○ We have access to large datasets and pre-trained models.
○ The problem cannot be solved by pre-training on a single large source domain.
- Architectural search is one solution.
- Another solution could be from the perspective of source domain selection:
○ A model zoo with models trained on different datasets.
○ Select a source domain / pre-trained model based on domain similarity.
Dealing with long-tailed data distribution
The World is Long-Tailed
- A large number of classes are rare in nature.
- Cannot easily scale the data collection for those classes in the long tail.
Cui et al. Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019.
Overview
- Effective Number of Samples: E_n = (1 − β^n) / (1 − β), where n is the number of samples and β = (N − 1)/N.
- Class-Balanced Loss: CB(p, y) = (1 − β)/(1 − β^{n_y}) · L(p, y), where n_y is the number of samples of class y.
The more data, the better, but...
- As the number of samples increases, the marginal benefit a model can extract
from the data diminishes.
image courtesy: https://me.me/i/ate-too-much-regrets-nothing-5869266
Data Sampling as Random Covering
- To measure data overlap, we associate each sample with a small region of unit volume instead of a single point. The volume of the set of all possible data for a class is assumed to be N.
Theoretical Results
- E_n = (1 − β^n) / (1 − β), where β = (N − 1)/N.
- Proof sketch: by induction, E_1 = 1 and E_n = 1 + β · E_{n−1}, since a new sample falls inside the already-covered region with probability E_{n−1}/N.
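The closed form E_n = (1 − β^n)/(1 − β), with β = (N − 1)/N, can be sanity-checked against the recurrence E_1 = 1, E_n = 1 + β · E_{n−1} implied by the random-covering argument. A minimal sketch:

```python
def effective_number(n, beta):
    """Closed form: E_n = (1 - beta**n) / (1 - beta)."""
    return (1.0 - beta ** n) / (1.0 - beta)

def effective_number_recursive(n, beta):
    """Recurrence from the random-covering argument:
    E_1 = 1, E_n = 1 + beta * E_{n-1}."""
    e = 1.0
    for _ in range(n - 1):
        e = 1.0 + beta * e
    return e
```

Note the two limits: as β → 0 (N = 1), E_n → 1; as n grows with β = (N − 1)/N fixed, E_n saturates at N, which is why reweighting by 1/E_n interpolates between no reweighting and inverse-frequency reweighting.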
Class-Balanced Loss
- Class-Balanced Softmax Cross-Entropy Loss: −(1 − β)/(1 − β^{n_y}) · log( exp(z_y) / Σ_j exp(z_j) )
- Class-Balanced Sigmoid Cross-Entropy Loss: −(1 − β)/(1 − β^{n_y}) · Σ_i log σ(z_i^t), where z_i^t = z_i if i = y, else −z_i
- Class-Balanced Focal Loss: −(1 − β)/(1 − β^{n_y}) · Σ_i (1 − p_i^t)^γ log p_i^t, where p_i^t = σ(z_i^t)
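A NumPy sketch of the class-balanced softmax cross-entropy. Function names are illustrative; the per-class weights are normalized to sum to the number of classes (a common convention, used so the overall loss scale stays comparable to the unweighted loss).

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta):
    """Per-class weights (1 - beta) / (1 - beta**n_y), normalized so
    they sum to the number of classes."""
    n = np.asarray(samples_per_class, dtype=float)
    w = (1.0 - beta) / (1.0 - beta ** n)
    return w / w.sum() * len(n)

def cb_softmax_cross_entropy(logits, labels, samples_per_class, beta=0.999):
    """Standard softmax cross-entropy, reweighted per example by the
    inverse effective number of samples of its class."""
    z = logits - logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    w = class_balanced_weights(samples_per_class, beta)
    return -(w[labels] * log_probs[np.arange(len(labels)), labels]).mean()
```

With β → 0 every class weight collapses to 1 and the standard loss is recovered; with β close to 1 the weights approach inverse class frequency, matching the interpolation described in the paper.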
Datasets
Classification Error Rate of ResNet-32 on CIFAR
- Original losses vs. the best class-balanced variant of each.
- SM: softmax; SGM: sigmoid.
Analysis
Classification Error Rate on ImageNet and iNat
ResNet-50 Training Curves on ImageNet and iNat
ResNet-50 Training Curves on iNat and ImageNet
Discussion
- The concept of effective number of samples for long-tailed data distribution.
- A theoretical framework to quantify effective number of samples.
○ Model each example as a small region instead of a point.
- Class-balanced loss.
- Improved performance on 3 commonly used loss functions.
- Non-parametric: we make no assumption about the distribution of the data.
- Code available at: https://github.com/richardaecn/class-balanced-loss
- Two follow-up works:
○ K. Cao et al. Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss. NeurIPS 2019.
○ B. Kang et al. Decoupling Representation and Classifier for Long-Tailed Recognition. https://arxiv.org/abs/1910.09217.