Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels (PowerPoint PPT Presentation)



SLIDE 1

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels

Lu Jiang, Di Huang, Mason Liu, Weilong Yang.


SLIDE 2

Deep Learning on Noisy Labels

Deep networks are very good at memorizing noisy labels (Zhang et al. 2017). This memorization is a critical issue because noisy labels are inevitable in big data.

Zhang, Chiyuan, et al. "Understanding deep learning requires rethinking generalization." ICLR (2017).

SLIDE 3

Controlled Noisy Labels

Performing controlled experiments on noisy labels is essential in existing works. (Figure: training sets at noise levels of 20%, 40%, and 80%.)

(Legend: wrong label / correct label.)

SLIDE 4

Issues with Controlled Synthetic Labels

Issue: existing studies only perform controlled experiments on synthetic labels (or random labels). 1. Contradictory findings. For example: are DNNs robust to massive label noise?

SLIDE 5

Issues with Controlled Synthetic Labels

Issue: existing studies only perform controlled experiments on synthetic labels (or random labels). 1. Contradictory findings. For example: are DNNs robust to massive label noise?

Zhang, Chiyuan, et al. "Understanding deep learning requires rethinking generalization." ICLR (2017).
Rolnick, D., et al. "Deep learning is robust to massive label noise." arXiv preprint arXiv:1705.10694 (2017).

(Figure panels: Zhang et al. 2017 vs. Rolnick et al. 2017.)

SLIDE 6

Issues with Controlled Synthetic Labels

Issue: existing studies only perform controlled experiments on synthetic labels (or random labels). 2. Inconsistent empirical results. We found that methods that perform well on synthetic noise may not work as well on real-world noisy labels.

  • Motivation of our research project.
SLIDE 7

Our Contributions:

1. We establish the first benchmark of controlled real-world label noise (from the web).
2. We propose a simple but highly effective method to overcome both synthetic and real-world noisy labels (best results on the WebVision benchmark).
3. We conduct the largest study by far into understanding deep neural networks trained on noisy labels, across different noise levels, noise types, network architectures, methods, and training settings.

SLIDE 8

Contribution I: New Dataset

First benchmark of controlled real-world label noise

SLIDE 9

Datasets of noisy training labels

(Diagram: a taxonomy of noisy training data, split into label noise (synthetic vs. real-world) and content noise: image corruption (Hendrycks & Dietterich, 2019) and adversarial attack (Zhang et al., 2019a).)

SLIDE 10

Datasets of noisy training labels

(Diagram: content noise has controlled benchmarks for image corruption (Hendrycks & Dietterich, 2019) and adversarial attack (Zhang et al., 2019a). For label noise, synthetic noise is controlled (Zhang et al. 2017), while real-world datasets such as WebVision and Clothing1M are uncontrolled. The controlled real-world cell is missing.)

  • Our work fills this gap.
SLIDE 11

(Diagram repeated from Slide 10.)
SLIDE 12

Construction of controlled synthetic label noise

1. Start with a well-labeled dataset.
2. Randomly select p% of the examples.
3. Independently flip each selected label to a random incorrect class (symmetric or asymmetric).
4. Repeat Steps 1-3 with a different p (noise level). (A code sketch of this procedure follows below.)

(Figure: Mini-ImageNet examples with correct labels.)
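As a reference, here is a minimal Python sketch of the procedure for the symmetric case (function and variable names are our own, not from the paper):

```python
import numpy as np

def add_symmetric_label_noise(labels, p, num_classes, seed=0):
    """Flip p percent of the labels to a uniformly random incorrect class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    # Step 2: randomly select p% of the examples.
    noisy_idx = rng.choice(len(labels), size=int(round(len(labels) * p)),
                           replace=False)
    for i in noisy_idx:
        # Step 3: draw uniformly from the num_classes - 1 incorrect classes,
        # skipping the true class.
        wrong = rng.integers(num_classes - 1)
        labels[i] = wrong if wrong < labels[i] else wrong + 1
    return labels

# Step 4 amounts to rerunning this with a different p, e.g. 0.2, 0.4, 0.8.
clean = np.random.randint(0, 100, size=50000)   # stand-in for real labels
noisy = add_symmetric_label_noise(clean, p=0.4, num_classes=100)
```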

SLIDE 13

Construction of controlled synthetic label noise

(Steps repeated from Slide 12.)

(Figure: noise level p = 20%. Legend: correct label.)

SLIDE 14

Construction of controlled synthetic label noise

(Steps repeated from Slide 12.)

(Figure: noise level p = 20%. Legend: wrong label / correct label.)

SLIDE 15

Construction of controlled synthetic label noise

(Steps repeated from Slide 12.)

(Figure: noise level p = 40%.)

This process generates controlled synthetic label noise.

(Legend: wrong label / correct label.)

SLIDE 16

(Diagram repeated from Slide 10.)
SLIDE 17

Construction of uncontrolled web label noise

(Figure: images collected from the web; noise level p = ??%.)

This process can automatically collect noisy labeled images from the web, but the noise level is fixed and unknown, which makes it unsuitable for controlled studies.

(Figure note: label correctness unknown.)

SLIDE 18

(Diagram repeated from Slide 10.)
SLIDE 19

From uncontrolled to controlled noise

We have each retrieved image annotated by 3-5 workers using the Google Cloud Data Labeling Service.

https://cloud.google.com/ai-platform/data-labeling/docs

With these annotations, the noise level p is known.

(Figure: annotator judgments per image: correct / incorrect / correct. Legend: wrong label / correct label.)
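With per-image annotator votes in hand, the realized noise level can be measured directly. A small sketch (majority voting as the aggregation rule is our assumption, not stated on the slides):

```python
def measured_noise_level(votes_per_image):
    """votes_per_image: one list of booleans per image from its 3-5
    annotators, True meaning 'the label is correct'."""
    mislabeled = sum(1 for votes in votes_per_image
                     if sum(votes) * 2 <= len(votes))  # majority: incorrect
    return mislabeled / len(votes_per_image)

# Example: three images; the second is judged mislabeled, so p = 1/3.
print(measured_noise_level([[True, True, True],
                            [False, False, True, False],
                            [True, True, False]]))
```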

SLIDE 20

Construction of our dataset

1. Start with a well-labeled dataset.
2. Randomly select p% of the examples.
3. Replace the clean images with incorrectly labeled web images, leaving the labels unchanged*.
4. Repeat Steps 1-3 with a different p (noise level). (A code sketch follows after the footnote.)

(Figure: noise level p = 20%. Legend: wrong label / correct label.)

*We show in the Appendix that an alternative construction, which removes all image-to-image search results, leads to consistent conclusions.
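A minimal sketch of this construction, parallel to the synthetic one above; `web_mislabeled_pool`, a per-class pool of web images whose labels annotators judged incorrect, is a hypothetical helper:

```python
import numpy as np

def add_web_label_noise(images, labels, web_mislabeled_pool, p, seed=0):
    """Replace p percent of the clean images with incorrectly labeled web
    images while keeping the original labels unchanged."""
    rng = np.random.default_rng(seed)
    images = list(images)
    noisy_idx = rng.choice(len(images), size=int(round(len(images) * p)),
                           replace=False)
    for i in noisy_idx:
        # Swap in a web image that was retrieved for class labels[i] but
        # whose annotators marked the label as incorrect.
        pool = web_mislabeled_pool[labels[i]]
        images[i] = pool[rng.integers(len(pool))]
    # The labels stay untouched; the noise lives in the images.
    return images, labels
```

The only difference from the synthetic construction is Step 3: the label is kept and the image is swapped, so the resulting noise reflects real confusions found on the web.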

SLIDE 21

Our Dataset: Controlled Noisy Labels from the Web

We manually annotated 212K images through 800K annotations, establishing the first benchmark of controlled web label noise for two classification tasks: coarse (Mini-ImageNet) and fine-grained (Stanford Cars).

SLIDE 22

(Repeated from Slide 21.)

Red noise: label noise from the web. Blue noise: synthetic label noise.

SLIDE 23

(Figure: example images from Mini-ImageNet and Stanford Cars.)

SLIDE 24

Contribution II: New Method

to overcome synthetic and real-world label noise

SLIDE 25

Problem: given a noisy dataset with an unknown noise level, find a robust learning method that generalizes well on the clean test data.

Prior works: many techniques tackle this problem from multiple directions, among others:

  • Regularization (Azadi et al., 2016; Noh et al., 2017; etc.)
  • Label cleaning (Reed et al., 2014; Goldberger, 2017; Li et al., 2017b; Veit et al., 2017; Song et al., 2019; etc.)
  • Example weighting (Jiang et al., 2018; Ren et al., 2018; Shu et al., 2019; Jiang et al., 2015; Liang et al., 2016; etc.)
  • Data augmentation (Zhang et al., 2018; Cheng et al., 2019)
  • ...

Our method: a simple and effective method called MentorMix. Why yet another method? We show that our method overcomes both synthetic and real-world noisy labels.

Overview

SLIDE 26

MentorMix is inspired by MentorNet (for curriculum learning) and Mixup (for vicinal risk minimization). It comprises four steps: weight¹, sample, mixup, and weight again².

Jiang, Lu, et al. "MentorNet: Learning data-driven curriculum for very deep neural networks on corrupted labels." ICML (2018).
Zhang, Hongyi, et al. "mixup: Beyond empirical risk minimization." ICLR (2018).

Method

1. The simplest MentorNet form is a loss-thresholding function: v_i = 1(ℓ_i ≤ γ), keeping example i only when its loss is below the threshold γ.
2. We found that the second weighting is useful at high noise levels.
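A hedged PyTorch sketch of one MentorMix training step as the four steps describe it (the Beta(α, α) mixing prior follows the Mixup paper; names and details are ours; see the official code at http://www.lujiang.info/cnlw for the exact procedure):

```python
import torch
import torch.nn.functional as F

def mentormix_step(model, x, y, gamma, alpha=8.0):
    # 1) Weight: the simplest MentorNet, a loss-thresholding function
    #    v_i = 1(loss_i <= gamma).
    with torch.no_grad():
        losses = F.cross_entropy(model(x), y, reduction="none")
        v = (losses <= gamma).float()

    # 2) Sample: draw mixup partners with probability proportional to v.
    idx = torch.multinomial(v + 1e-8, num_samples=x.size(0), replacement=True)

    # 3) Mixup: convex combination with the sampled (low-loss) partners.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x + (1 - lam) * x[idx]
    logits = model(x_mix)
    mixed = lam * F.cross_entropy(logits, y, reduction="none") \
        + (1 - lam) * F.cross_entropy(logits, y[idx], reduction="none")

    # 4) Weight again: re-threshold the mixed losses (the slides note this
    #    second weighting helps at high noise levels).
    v2 = (mixed.detach() <= gamma).float()
    return (v2 * mixed).mean()
```

In a training loop, this scalar would be backpropagated in place of the plain cross-entropy; γ can be set, for example, to a percentile of the mini-batch losses so the threshold tracks training progress.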

SLIDE 27

Experimental Results

MentorMix: a simple but highly effective method to overcome both synthetic and real-world noisy labels. On our dataset:

Methods that perform well on synthetic noise may not work as well on real-world noisy labels, and vice versa. MentorMix is able to overcome both synthetic and real-world noisy labels.

(Table: each cell is the mean over 10 different noise levels from 0% to 80%.)

SLIDE 28

Experimental Results

MentorMix: a simple but highly effective method to overcome both synthetic and real-world noisy labels. (Tables: results on public CIFAR with synthetic noise and on public WebVision with real-world noise.)

SLIDE 29

Experimental Results

(Repeated from Slide 28.)

The best published result on the WebVision benchmark!

SLIDE 30

Contribution III: New findings

  • on real-world label noise
SLIDE 31

Contribution III

We conduct the largest study by far into understanding deep neural networks trained on noisy labels. Our study confirms existing findings on synthetic noisy labels and brings forward new findings that may challenge our preconceptions.

SLIDE 32

Blue Noise (symmetric)

(1) DNNs generalize poorly on synthetic label noise (Zhang et al., 2017).

(The colored belt plots the 95% confidence interval across 10 noise levels; a wider belt means poorer generalization.)

SLIDE 33

Blue Noise (symmetric) Red Noise (web)

(1) DNNs generalize poorly on synthetic label noise (Zhang et al., 2017), but generalize much better on web label noise.

(Caption repeated from Slide 32.)

SLIDE 34

(Repeated from Slide 33.)

SLIDE 35

(2) DNNs learn patterns first on noisy training labels (Arpit et al., 2017).

Blue Noise (symmetric)

(Figure annotation: accuracy drop.)

SLIDE 36

(2) DNNs learn patterns first on noisy training labels (Arpit et al., 2017). DNNs may NOT learn patterns first on web label noise.

Blue Noise (symmetric) Red Noise (web)

(Figure annotation: accuracy drop.)

SLIDE 37

Conclusions

SLIDE 38

ImageNet architectures generalize on noisy labels when the networks are fine-tuned. (Figure panel: clean data.)

ImageNet architectures generalize on clean training labels when the networks are fine-tuned (Kornblith et al., 2019). This also holds on noisy labels.

(Figure panels: blue noise and red noise.)

SLIDE 39

Key takeaways:

1. We proposed:
   a. the first benchmark of real-world controlled label noise (from the web),
   b. a simple method (MentorMix) to overcome both synthetic and real-world noisy labels.
2. We found:
   a. DNNs may NOT learn patterns first, but generalize much better, on real-world web label noise.
   b. Methods that perform well on synthetic noise may not work as well on real-world noisy labels.
   c. Advanced pretrained architectures are better at overcoming noisy labels.
   d. Further using MentorMix yields the best results.

SLIDE 40

(Key takeaways repeated from Slide 39.)

Thanks for watching. Please find our data and code at: http://www.lujiang.info/cnlw

SLIDE 41

Appendix

SLIDE 42

Contribution II

MentorMix consists of two key operations: MentorNet (for curriculum learning) and Mixup (for vicinal risk minimization). (Panels: MentorNet as importance sampling; Mixup for minimizing the vicinal risk.)

We use the simplest MentorNet here, which is a thresholding function: v_i = 1(ℓ_i ≤ γ).

Jiang, Lu, et al. "MentorNet: Learning data-driven curriculum for very deep neural networks on corrupted labels." ICML (2018).
Zhang, Hongyi, et al. "mixup: Beyond empirical risk minimization." ICLR (2018).
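In math, under the notation above (ℓ_i the loss of example i, γ the threshold; the Beta prior is from the Mixup paper), a sketch of the two operations:

```latex
% Simplest MentorNet: a per-example loss-thresholding weight
v_i = \mathbf{1}\left(\ell_i \le \gamma\right)

% Importance-sampling view: the weights induce a partner distribution
p_j = \frac{v_j}{\sum_k v_k}

% Mixup with partner j drawn from p (vicinal risk minimization)
\lambda \sim \mathrm{Beta}(\alpha, \alpha), \qquad
\tilde{x}_i = \lambda x_i + (1 - \lambda) x_j, \qquad
\tilde{y}_i = \lambda y_i + (1 - \lambda) y_j
```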

SLIDE 43

(Flow diagram: a mini-batch forward pass produces per-example losses; losses become weights, weights become a sampling distribution; the pipeline is Weight → Sample → Mixup → Weight.)