

SLIDE 1

Generalized Cross Entropy Loss for Noisy Labels

Zhilu Zhang and Mert R. Sabuncu Cornell University


SLIDE 2

Motivation

Deep neural networks:

  • Often require large amounts of cleanly labeled data, which can be expensive to obtain
  • Can overfit to noisy labels [Zhang et al. 2016]

SLIDE 3

Symmetric Loss

  • A loss function $\mathcal{L}$ is symmetric if $\sum_{j=1}^{c} \mathcal{L}(f(\mathbf{x}), j) = C$ for some constant $C$, for all inputs $\mathbf{x}$ and all classifiers $f$ (with $c$ classes)
  • Symmetric losses can be tolerant to noisy labels [Ghosh et al. 2017]
  • MAE for classification with probabilistic outputs is symmetric (see the derivation below)
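To see why MAE is symmetric, note that for a one-hot target $e_j$ and a probabilistic output $f(\mathbf{x})$ with $\sum_i f_i(\mathbf{x}) = 1$, each per-class MAE is $\lVert e_j - f(\mathbf{x}) \rVert_1 = 2 - 2 f_j(\mathbf{x})$, so the sum over classes is the constant

$$\sum_{j=1}^{c} \lVert e_j - f(\mathbf{x}) \rVert_1 = \sum_{j=1}^{c} \big( 2 - 2 f_j(\mathbf{x}) \big) = 2c - 2.$$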


SLIDE 4

Limitations of MAE

  • MAE is noise-robust but can converge to lower accuracy

[Figure: ResNet on CIFAR-10 — test-accuracy curves show a slight gap in final accuracy and much slower convergence for MAE compared to CCE]

SLIDE 5

Limitations of MAE

  • MAE is noise-robust but can converge to lower accuracy

[Figure: ResNet on CIFAR-100 — with MAE the highest accuracy reached is 38.29% even after 2000 epochs, while CCE surpasses it after only 7 epochs, a gap of roughly 20 percentage points]

SLIDE 6


Generalized Cross Entropy (Lq Loss)

CCE:

  • Good convergence, but prone to label noise

MAE:

  • More noise-robust, but poor convergence

Use the Box-Cox transformation to combine them:
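Applying the (negative) Box-Cox transform to the predicted probability of the labeled class gives the $\mathcal{L}_q$ loss:

$$\mathcal{L}_q\big(f(\mathbf{x}), e_j\big) = \frac{1 - f_j(\mathbf{x})^q}{q}, \qquad q \in (0, 1].$$

In the limit $q \to 0$ this recovers CCE ($-\log f_j(\mathbf{x})$), and at $q = 1$ it equals MAE up to a constant factor.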


SLIDE 7


Generalized Cross Entropy (Lq Loss)

[Figure: $\mathcal{L}_q$ loss curves for $q \in (0, 1]$, interpolating between CCE ($q \to 0$) and MAE ($q = 1$)]

  • The Lq loss has a bounded sum over classes for nonzero $q$
  • The tighter the bound, the more noise-robust the Lq loss (a PyTorch sketch of the loss follows)
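A minimal PyTorch sketch of the $\mathcal{L}_q$ loss (the function name and interface are illustrative, not the authors' released code; $q = 0.7$ matches the value used in the experiments):

    import torch
    import torch.nn.functional as F

    def lq_loss(logits: torch.Tensor, targets: torch.Tensor, q: float = 0.7) -> torch.Tensor:
        # Generalized cross entropy: (1 - p_y^q) / q, averaged over the batch.
        # q -> 0 approaches CCE; q = 1 equals MAE up to a constant factor.
        probs = F.softmax(logits, dim=1)                        # (N, C) class probabilities
        p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # probability of the labeled class
        return ((1.0 - p_y.pow(q)) / q).mean()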



SLIDE 8


Generalized Cross Entropy (Lq Loss)

[Figure: ResNet on CIFAR-10 — accuracy curves for CCE ($q \to 0$), MAE ($q = 1$), and intermediate $q \in (0, 1]$]

SLIDE 9

Truncated Lq Loss

  • Propose the truncated Lq loss (defined below)
  • Often has a tighter bound on the sum of losses, hence more noise-robust
  • Use the alternative convex search algorithm for optimization
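The truncation caps the loss at $\mathcal{L}_q(k)$ for samples whose predicted probability on the labeled class falls below a threshold $k$:

$$\mathcal{L}_{\text{trunc}}\big(f(\mathbf{x}), e_j\big) = \begin{cases} \mathcal{L}_q(k) & \text{if } f_j(\mathbf{x}) \le k, \\ \mathcal{L}_q\big(f(\mathbf{x}), e_j\big) & \text{otherwise,} \end{cases} \qquad \text{where } \mathcal{L}_q(k) = \frac{1 - k^q}{q}.$$

A simplified PyTorch sketch (clamping the probability at $k$ gives low-confidence samples a constant loss and zero gradient, which mimics the pruning effect; the paper instead alternates between updating sample weights and network parameters, and the defaults here are illustrative):

    import torch
    import torch.nn.functional as F

    def truncated_lq_loss(logits: torch.Tensor, targets: torch.Tensor,
                          q: float = 0.7, k: float = 0.5) -> torch.Tensor:
        # Samples with p_y <= k contribute the constant Lq(k) and receive no gradient,
        # effectively pruning likely-mislabeled examples.
        probs = F.softmax(logits, dim=1)
        p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        return ((1.0 - p_y.clamp(min=k).pow(q)) / q).mean()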


SLIDE 10

Experiments

  • ResNet on CIFAR-10, CIFAR-100, and FASHION-MNIST with synthetic label noise
  • Consistent improvements over CCE and MAE

CIFAR-100 test accuracy:

  Loss           20% noise   40% noise
  CCE            58.72%      48.20%
  MAE            15.80%       9.03%
  Lq (q = 0.7)   66.81%      61.77%
  Trunc Lq       67.61%      62.64%

CIFAR-10 test accuracy:

  Loss           20% noise   40% noise
  CCE            86.98%      81.88%
  MAE            83.72%      67%
  Lq (q = 0.7)   89.83%      87.13%
  Trunc Lq       89.70%      87.62%

SLIDE 11


  • Thank you very much for your attention!
  • Hope to see you at Poster #101