Combating Label Noise in Deep Learning using Abstention
Speaker: Sunil Thulasidasan (sunil@lanl.gov)
A Practical Challenge for Deep Learning
State-of-the-art models require large amounts of clean, annotated data.
Annotation is labor intensive!
- 49k workers
- 167 countries
- 2.5 years to complete!
ImageNet: 15 million labeled images; over 20,000 classes
The data that transformed AI research—and possibly the world (D. Gershgorn, Quartz, 2017)
Slide from Fei-Fei Li and Jia Deng
Approaches to large-scale labeling
- Crowdsource at scale: labor intensive, but relatively cheap
- Use weak labels from queries, user tags, and pre-trained classifiers

Both approaches can lead to significant labeling errors!
(Figure: example images with noisy labels "Dog", "Taxi", "Banana". Slide credit: S. Guo et al., 2018)
- Label noise is an inconsistent mapping from features X to labels Y
The Deep Abstaining Classifier (DAC)
Approach: use the learning difficulty of incorrectly labeled or confusing samples to defer learning ("abstain") until the correct mapping is learned.
Training a Deep Abstaining Classifier
For a k-class problem, the DAC adds an extra abstention class (class k+1) and trains with the loss

$$\mathcal{L}(x) = \left(1 - p_{k+1}(x)\right)\left(-\sum_{i=1}^{k} t_i(x)\,\log\frac{p_i(x)}{1 - p_{k+1}(x)}\right) + \alpha \log\frac{1}{1 - p_{k+1}(x)}$$

The first term is cross entropy over the actual classes; its $(1 - p_{k+1})$ weighting shrinks the loss as the abstention probability grows, which encourages abstention on difficult samples. The second term penalizes abstention; $\alpha$ is automatically tuned during learning.
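The loss above can be sketched directly in NumPy. This is a minimal illustration, assuming the abstention class is the last output column; the function name `dac_loss` is ours, not the repo's API:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def dac_loss(logits, targets, alpha, eps=1e-7):
    """Abstention loss: logits has k+1 columns, the last being abstention."""
    p = softmax(logits)
    p_abstain = np.clip(p[:, -1], 0.0, 1.0 - eps)   # p_{k+1}(x)
    p_true = p[np.arange(len(targets)), targets]    # probability of the given label
    # cross entropy over the k real classes, renormalized to exclude abstention
    ce = -np.log(np.clip(p_true / (1.0 - p_abstain), eps, None))
    # (1 - p_abstain) downweights the loss when abstaining; alpha makes abstention costly
    return np.mean((1.0 - p_abstain) * ce + alpha * np.log(1.0 / (1.0 - p_abstain)))
```

With alpha = 0 the model could drive the loss toward zero by abstaining on everything; the alpha term makes abstention costly, which is why it must be tuned (automatically, in the DAC) during training.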
Abstention Dynamics
(Plot: abstained percent on the training set vs. epoch, with 10% label noise; annotations mark the ideal rate of abstention and the overfitting regime.)
Introduce abstention after a warmup period. Abstention decreases as the DAC makes learning progress.
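The warmup described above can be implemented as a simple schedule. A hypothetical sketch; the epoch counts and linear ramp are our assumptions, not the paper's exact settings:

```python
def abstention_schedule(epoch, warmup_epochs=20, ramp_epochs=10, alpha_final=1.0):
    """Return None during warmup (train with plain cross entropy, no abstention
    pressure), then ramp the abstention penalty alpha linearly up to alpha_final."""
    if epoch < warmup_epochs:
        return None
    ramp = (epoch - warmup_epochs + 1) / ramp_epochs
    return min(alpha_final, alpha_final * ramp)
```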
The DAC gives state-of-the-art results in label-noise experiments.
(Plots: results on CIFAR-10 with 60% and 80% label noise, and CIFAR-100 with 60% label noise, comparing the DAC against baselines.)
WebVision: real-world noisy dataset; ~2.4M images; ~35-40% label noise.
Training protocol:
- Use the DAC to identify and eliminate label noise.
- Retrain on the cleaner set.
GCE: Generalized Cross-Entropy Loss (Zhang et al., NeurIPS '18); Forward (Patrini et al., CVPR '17); MentorNet (Jiang et al., ICML '18)
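The two-stage protocol reduces to a filtering step between the two training runs. A minimal sketch; the 0.5 threshold and the function name are illustrative assumptions:

```python
import numpy as np

def keep_clean(dac_probs, threshold=0.5):
    """Indices of samples the DAC did NOT abstain on (abstention = last column).
    These are kept as the cleaner set for retraining a standard classifier."""
    return np.nonzero(dac_probs[:, -1] < threshold)[0]
```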
Abstention in the presence of Systematic Label Noise: The Random Monkeys Experiment
All the monkey labels in the training set (STL-10) are randomized. Can the DAC learn that images containing monkey features have unreliable labels and abstain on monkeys in the test set?
Random Monkeys: DAC Predictions on Monkey Images
(Bar chart: DAC prediction frequencies over the ten STL-10 classes, airplane through truck, plus "Abstained", for monkey images in the test set.)
The DAC abstains on most of the monkeys in the test set!
Image Blurring
Blur a subset (20%) of the images in the training set and randomize their labels. Will the DAC learn to abstain on blurred images in the test set?
DAC Behavior on Blurred Images
The DAC abstains on most of the blurred images in the test set. (For the DAC, validation accuracy is calculated on non-abstained samples.)
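The corruption step of this experiment can be sketched as follows. The 3x3 box blur and grayscale HxW image shape are simplifying assumptions; the slides do not specify the blur used:

```python
import numpy as np

def blur_and_randomize(images, labels, num_classes, frac=0.2, seed=0):
    """Blur a random `frac` of (HxW) images with a 3x3 box filter and assign
    them uniformly random labels; returns corrupted copies and chosen indices."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=int(frac * len(images)), replace=False)
    imgs, labs = images.astype(float).copy(), labels.copy()
    for i in idx:
        h, w = imgs[i].shape
        p = np.pad(imgs[i], 1, mode="edge")
        # 3x3 box blur: average of the nine shifted neighborhoods
        imgs[i] = sum(p[r:r + h, c:c + w] for r in range(3) for c in range(3)) / 9.0
        labs[i] = rng.integers(num_classes)  # randomized label
    return imgs, labs, idx
```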
Conclusions
- Abstention training is an effective way to clean label noise in a deep learning pipeline.
- Abstention can also be used as a representation learner for label noise.
- Especially useful for interpretability in "don't-know" decision situations.
Code available at https://github.com/thulas/dac-label-noise
Jamal Mohd-Yusof, Los Alamos National Lab
Tanmoy Bhattacharya, Los Alamos National Lab
Jeff Bilmes, University of Washington
Gopinath Chennupati, Los Alamos National Lab