SLIDE 1

International Conference on Machine Learning

Long Beach, June 2019

Unsupervised Label Noise Modeling and Loss Correction

Eric Arazo*, Diego Ortego*, Paul Albert, Noel O’Connor and Kevin McGuinness

eric.arazo@insight-centre.org, diego.ortego@insight-centre.org

SLIDE 2
Outline

  • Motivation
  • Observations
  • Proposed method
    ○ Label noise modeling
    ○ Loss correction approach
  • Results

SLIDE 3

Motivation: why label noise?


  • Top-performing DNN models: strong supervision
  • Labeled data is a scarce resource
  • Several alternatives to relax strong supervision
SLIDE 4

Motivation: why label noise?


  • Top-performing DNN models: strong supervision
  • Labeled data is a scarce resource
  • Several alternatives to relax strong supervision

[Diagram: semi-supervised learning, where the training data combines a small labeled subset with a large unlabeled one]

SLIDE 5

Motivation: why label noise?


  • Top-performing DNN models: strong supervision
  • Labeled data is a scarce resource
  • Several alternatives to relax strong supervision

[Diagram: automatic labeling introduces label noise, leaving a mix of correctly and incorrectly labeled samples]

SLIDE 6

Observations


  • “Deep neural networks easily fit random labels” [1]

[1] Zhang et al., “Understanding Deep Learning Requires Re-thinking Generalization”, ICLR 2017.

[Figure: CIFAR-10 training curves from [1], showing that networks fit random labels]

SLIDE 7

Observations


  • Noisy samples take longer to learn

○ “Simple patterns are learned first” [2]
○ “Small loss” samples tend to be the clean ones [3]
○ “High learning rate prevents memorization” [4]

[Figure: per-sample training loss vs. epoch; CIFAR-10, 80% uniform label noise]

[2] Arpit et al., “A Closer Look at Memorization in Deep Networks”, ICML 2017.
[3] Yu et al., “How does Disagreement Help Generalization against Label Corruption?”, ICML 2019.
[4] Tanaka et al., “Joint Optimization Framework for Learning with Noisy Labels”, CVPR 2018.
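To see this separation directly, one can record every sample's cross-entropy loss at each epoch and plot the two populations, as in the figure above. A minimal PyTorch sketch, assuming the training loader also yields each sample's dataset index (the function name and loader layout are illustrative):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def per_sample_losses(model, loader, device="cuda"):
    """Cross-entropy loss of every training sample at the current epoch."""
    model.eval()
    losses = torch.empty(len(loader.dataset))
    for x, y, idx in loader:  # assumes the dataset also yields sample indices
        logits = model(x.to(device))
        losses[idx] = F.cross_entropy(logits, y.to(device), reduction="none").cpu()
    return losses
```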

SLIDE 8

Label noise modeling


  • Before label noise memorization: clean and noisy samples are (to some extent) distinguishable in the loss
  • A two-component mixture model suits the problem (see the sketch below)

[Figure: per-sample training loss vs. epoch]
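The paper fits a two-component beta mixture model (BMM) over the max-min normalized per-sample losses with EM; the posterior of the high-loss component then acts as each sample's probability of being noisy. A rough sketch of such a fit with a weighted method-of-moments M-step (function names, iteration count, and numerical guards are illustrative, not the authors' code):

```python
import numpy as np
from scipy.stats import beta

def fit_bmm(losses, n_iters=10, eps=1e-4):
    """Fit a two-component beta mixture to per-sample losses with EM."""
    # Map losses into (0, 1) so the beta densities are defined.
    x = (losses - losses.min()) / (losses.max() - losses.min() + eps)
    x = np.clip(x, eps, 1 - eps)

    # Initialize responsibilities by splitting at the median loss.
    r = np.stack([x <= np.median(x), x > np.median(x)]).astype(float)

    for _ in range(n_iters):
        # M-step: mixture weights and per-component beta parameters
        # via a weighted method of moments.
        pi = r.mean(axis=1)
        ab = []
        for k in range(2):
            wk = r[k] / (r[k].sum() + eps)
            m = (wk * x).sum()                   # weighted mean
            v = (wk * (x - m) ** 2).sum() + eps  # weighted variance
            c = m * (1 - m) / v - 1
            ab.append((max(m * c, eps), max((1 - m) * c, eps)))

        # E-step: responsibilities proportional to weighted beta densities.
        p = np.stack([pi[k] * beta.pdf(x, *ab[k]) for k in range(2)])
        r = p / (p.sum(axis=0, keepdims=True) + eps)

    noisy = int(np.argmax([a / (a + b) for a, b in ab]))  # higher-mean component
    return ab, r[noisy]  # beta parameters and P(noisy | loss) per sample
```

Combined with the previous snippet, `ab, w = fit_bmm(per_sample_losses(model, loader).numpy())` would give each sample's estimated probability `w` of carrying a noisy label.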

SLIDE 13

Loss correction approach


  • Bootstrapping loss correction [5] + mixup data augmentation [6]
  • Our Beta Mixture Model takes this approach a step further by:

○ Preventing memorization
○ Correcting noisy labels to learn from them (see the sketch below)

[5] Reed et al., “Training Deep Neural Networks on Noisy Labels with Bootstrapping”, ICLR 2015.
[6] Zhang et al., “mixup: Beyond Empirical Risk Minimization”, ICLR 2018.
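Concretely, the bootstrapping loss trusts each (possibly noisy) label only partially and fills the rest with the network's own prediction, with the BMM posterior w deciding how much to trust each sample; mixup then trains on convex combinations of sample pairs. A hedged PyTorch sketch of one such training objective (the hard-bootstrapping variant; function names and the mixup alpha are illustrative, and details differ from the authors' implementation):

```python
import torch
import torch.nn.functional as F

def mixup_dyn_bootstrap_loss(model, x, y, w, alpha=32.0):
    """Mixup + dynamic hard bootstrapping.

    w: per-sample probability of being noisy (BMM posterior); the loss
    trusts the dataset label with weight (1 - w) and the network's
    current prediction with weight w.
    """
    # mixup: train on a convex combination of random sample pairs,
    # combining the two per-pair losses with the same coefficient.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(x.size(0), device=x.device)
    logits = model(lam * x + (1 - lam) * x[idx])
    z = logits.argmax(dim=1).detach()  # predictions act as corrected labels

    def boot_ce(target, weight):
        ce_label = F.cross_entropy(logits, target, reduction="none")
        ce_pred = F.cross_entropy(logits, z, reduction="none")
        return ((1 - weight) * ce_label + weight * ce_pred).mean()

    return lam * boot_ce(y, w) + (1 - lam) * boot_ce(y[idx], w[idx])
```

Refitting the BMM on the per-sample losses after each epoch keeps w up to date as training progresses.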

SLIDE 14

Loss correction approach


  • Standard training (left) vs proposed training (right)

[Figure: per-sample training loss vs. epoch for standard training (left) and the proposed training (right); CIFAR-10, 80% uniform label noise]

SLIDE 15

Loss correction approach


  • Original (noisy) training labels (left) vs labels predicted after training (right)
SLIDE 16

Results


[Table: CIFAR-10 results]

Code on GitHub: https://git.io/svE

SLIDE 17

For more details and discussions...


Come to our poster!

(Pacific Ballroom #176)

Thanks!