SELFIE: Refurbishing Unclean Samples for Robust Deep Learning, Hwanjun Song, Minseok Kim, Jae-Gil Lee - PowerPoint PPT Presentation




SLIDE 1

SELFIE: Refurbishing Unclean Samples for Robust Deep Learning

Hwanjun Song†, Minseok Kim†, Jae-Gil Lee†*

† Graduate School of Knowledge Service Engineering, KAIST    * Corresponding Author

SLIDE 2

[Figure: training error (%) over 100 epochs under 0%, 20%, and 40% label noise]

  • Standard Supervised Learning Setting

– Assume: training data {(x_i, y_i)}_{i=1}^{N}, where y_i is the true label
– In practical settings, y_i → ỹ_i, where ỹ_i is a noisy label

  • Difficulties of label annotation

– High cost and time-consuming
– Requires expert knowledge
– Unattainable at scale

  • Learning with Noisy Labels

– Suffers from poor generalization on test data (VGG-19 on CIFAR-10)

[Figure: test error (%) over 100 epochs under 0%, 20%, and 40% label noise]

SLIDE 3
  • Loss Correction

– Modifies the loss ℒ of every sample before the backward step
– Suffers from accumulated noise due to false corrections
→ Fails to handle heavily noisy data

  • Sample Selection (recent direction)

– Selects low-loss (easy) samples as the clean set 𝓓 for SGD
– Explores only part of the entire training data
→ Ignores useful hard samples classified as unclean

[Figure: (a) loss correction updates all corrected samples; (b) sample selection updates only the selected samples]
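The low-loss selection step above can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation; the function name and the assumption that the true noise rate is known are my own:

```python
import numpy as np

def select_clean(losses, noise_rate):
    """Loss-based separation: treat the (100 - noise rate)% lowest-loss
    samples of a mini-batch as the presumed-clean set D."""
    n_keep = int(len(losses) * (1.0 - noise_rate))
    order = np.argsort(losses)          # ascending: low loss first
    return np.sort(order[:n_keep])      # indices of presumed-clean samples
```

With a 50% noise rate, the half of the batch with the lowest losses is kept for the SGD update.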

SLIDE 4
  • SELFIE (SELectively reFurbIsh unclEan samples)

– Hybrid of loss correction and sample selection
– Introduces refurbishable samples 𝓢

  • Samples whose labels can be corrected with high precision

– Modified update equation on a mini-batch {(x_i, ỹ_i)}_{i=1}^{b}

  • Correct the losses of samples in 𝓢
  • Combine them with the losses of samples in 𝓓
  • Exclude the samples not in 𝓢 ∪ 𝓓

θ_{t+1} = θ_t − α ∇ [ (1 / |𝓢 ∪ 𝓓|) ( Σ_{x ∈ 𝓢} ℒ(x, y_x^refurb) + Σ_{x ∈ 𝓓 ∖ 𝓢} ℒ(x, ỹ_x) ) ]

(first sum: corrected losses of the refurbishable samples 𝓢; second sum: selected clean losses of 𝓓)
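Given per-sample losses under the noisy and refurbished labels plus the two membership masks, the combined mini-batch loss can be sketched as follows. The function and variable names and the NumPy formulation are my own, not from the slides:

```python
import numpy as np

def selfie_batch_loss(loss_noisy, loss_refurb, clean_mask, refurb_mask):
    """Combine corrected losses over S with clean losses over D \\ S,
    normalized by |S ∪ D|; samples outside S ∪ D are excluded.

    loss_noisy[i]  -- loss of sample i under its (possibly noisy) label
    loss_refurb[i] -- loss of sample i under its refurbished label
    clean_mask[i]  -- True if sample i is in the clean set D
    refurb_mask[i] -- True if sample i is in the refurbishable set S
    """
    used = clean_mask | refurb_mask                        # S ∪ D
    corrected = loss_refurb[refurb_mask].sum()             # sum over S
    clean = loss_noisy[clean_mask & ~refurb_mask].sum()    # sum over D \ S
    return (corrected + clean) / used.sum()
```

A sample in both 𝓓 and 𝓢 contributes its corrected loss only, matching the 𝓓 ∖ 𝓢 term in the update equation.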

SLIDE 5

[Example: a sample with noisy label "cat" whose network predictions over recent epochs are "dog, cat, dog, dog, dog, dog, dog, …"; the predictions are consistent, so the label is refurbished to "dog"]

  • Clean Samples 𝓓 from the mini-batch 𝓝

– Adopt loss-based separation (Han et al., 2018)
– 𝓓 ← the (100 − noise rate)% lowest-loss samples in 𝓝

  • Refurbishable Samples 𝓢 from 𝓝

– 𝓢 ← the samples with consistent label predictions
– Replace each such label with the most frequently predicted label: ỹ_i → y_i^refurb
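The consistency test and relabeling can be sketched with a simple majority criterion over the recent prediction history. The majority-fraction threshold here is my stand-in for the slides' consistency measure, and the names are illustrative:

```python
from collections import Counter

def refurbish(pred_history, threshold=0.8):
    """Return the most frequently predicted label if it dominates the
    recent prediction history (majority fraction >= threshold), i.e. the
    predictions are consistent; otherwise return None (not refurbishable)."""
    label, count = Counter(pred_history).most_common(1)[0]
    if count / len(pred_history) >= threshold:
        return label
    return None
```

For the cat/dog example above, a history dominated by "dog" yields the refurbished label "dog", while a history split evenly between classes is left alone.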

SLIDE 6
  • Synthetic Noise: pair and symmetric

– Injected the two widely used noise types

  • Realistic Noise

– Built the ANIMAL-10N dataset with real-world noise

  • Crawled 5 pairs of confusing animals, e.g., {(cat, lynx), (jaguar, cheetah), …}
  • Educated 15 participants for one hour
  • Asked the participants to annotate the labels

– Summary

# Training: 50,000    # Test: 5,000    # Classes: 10
Resolution: 64x64 (RGB)    Noise Rate: 8% (estimated)    Data Created: April 2019
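The two synthetic noise types can be sketched as follows. This is a minimal sketch, assuming pair noise flips class c to (c+1) mod C as a stand-in for its confusing pair; the function and parameter names are my own:

```python
import random

def inject_noise(labels, num_classes, rate, kind="symmetric", seed=0):
    """Flip each label with probability `rate`.
    symmetric: flip to a uniformly random *other* class.
    pair: flip to the next class, (c + 1) mod num_classes."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < rate:
            if kind == "pair":
                noisy.append((y + 1) % num_classes)
            else:
                others = [c for c in range(num_classes) if c != y]
                noisy.append(rng.choice(others))
        else:
            noisy.append(y)
    return noisy
```

With rate 0.4, roughly 40% of the labels are corrupted, matching the noise levels varied in the experiments.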

SLIDE 7
  • Results with the two synthetic noise types (CIFAR-10, CIFAR-100)
  • Results with realistic noise (ANIMAL-10N)

[Figure: (a) varying pair noise and (b) varying symmetric noise, on CIFAR-10 and CIFAR-100]
[Figure: results on ANIMAL-10N with (a) DenseNet (L=25, k=12) and (b) VGG-19]
