
A Kernel Theory of Modern Data Augmentation - Tri Dao, Albert Gu, et al.

ICML Oral 06/11/2019 | Poster #227 | Tue Jun 11th, 6:30-9:00 PM


  1. ICML Oral 06/11/2019 | Poster #227 | Tue Jun 11th, 6:30-9:00 PM. A Kernel Theory of Modern Data Augmentation. Tri Dao, Albert Gu, Alex Ratner, Virginia Smith, Chris De Sa, Chris Ré

  2. Data augmentation is important to accuracy…

  3. Data augmentation is important to accuracy… • 3.7 pt. average gain across top ten CIFAR-10 models

  4. Data augmentation is important to accuracy… • 3.7 pt. average gain across top ten CIFAR-10 models • 13.9 pt. average gain for CIFAR-100

  5. Data augmentation is important to accuracy… • 3.7 pt. average gain across top ten CIFAR-10 models • 13.9 pt. average gain for CIFAR-100 • A form of weak supervision: expresses domain knowledge (invariance)

  6. … but is not well understood

  7. … but is not well understood. How does data augmentation affect the model? • Learning process • Parameters and decision surface

  8. Augmentation as sequence modeling • TANDA [Ratner et al., 2017] • AutoAugment [Cubuk et al., 2018]

  9. Augmentation as sequence modeling • TANDA [Ratner et al., 2017] • AutoAugment [Cubuk et al., 2018] • Model augmentation as a Markov chain

  10. Augmentation as kernels: base classifier (k-nearest neighbors) + data augmentation = asymptotic kernel classifier

  11. Effects of data augmentation on kernel classifiers

  12. Effects of data augmentation on kernel classifiers • Invariance [Figure: scatter plot of o and x points]

  13. Effects of data augmentation on kernel classifiers • Invariance • Regularization [Figure: two scatter plots of o and x points, labeled Invariance and Regularization]

  14. Effects of data augmentation on kernel classifiers • Invariance • Regularization • Practical utility [Figure: two scatter plots of o and x points, labeled Invariance and Regularization]

  15. Effects of data augmentation on kernel classifiers • Invariance • Regularization • Practical utility: speeding up training [Figure: two scatter plots of o and x points, labeled Invariance and Regularization]

  16. Effects of data augmentation on kernel classifiers • Invariance • Regularization • Practical utility: speeding up training; as a diagnostic [Figure: two scatter plots of o and x points, labeled Invariance and Regularization]

  17. Model of data augmentation: kernel classifier. Non-augmented: $\min_w \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \phi(x_i))$, where $\ell$ is the loss function and $\phi$ is the feature map.

  18. Model of data augmentation: kernel classifier. Non-augmented: $\min_w \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \phi(x_i))$, where $\ell$ is the loss function and $\phi$ is the feature map. Augmented: $\min_w \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{z_i \sim T(x_i)}\, \ell(w^\top \phi(z_i))$, where $T(x_i)$ is the distribution over transformed versions of the data point $x_i$.

  19. Data augmentation effects: $\frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{z_i \sim T(x_i)}\, \ell(w^\top \phi(z_i)) \approx \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i))$. The inner expectation is the average of augmented features (i.e. kernel mean embedding).

  20. Data augmentation effects: $\frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{z_i \sim T(x_i)}\, \ell(w^\top \phi(z_i)) \approx \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i))$. The inner expectation is the average of augmented features (i.e. kernel mean embedding). • 1st-order effect: induces invariance by feature averaging

  21. Data augmentation effects: $\frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{z_i \sim T(x_i)}\, \ell(w^\top \phi(z_i)) \approx \frac{1}{n} \sum_{i=1}^{n} \ell(w^\top \mathbb{E}_{z_i \sim T(x_i)}\, \phi(z_i))$. The inner expectation is the average of augmented features (i.e. kernel mean embedding). • 1st-order effect: induces invariance by feature averaging • 2nd-order effect: reduces model complexity via a data-dependent regularization

  22. A diagnostic: kernel alignment metric. Averaged features: $\psi(x) = \mathbb{E}_{z \sim T(x)}\, \phi(z)$. Kernel target alignment [Cristianini et al., 2002]: how well separated are features from different classes

  23. A diagnostic: kernel alignment metric [Figure: kernel alignment plots, including a panel labeled MNIST]

  24. A diagnostic: kernel alignment metric [Figure: kernel alignment plots, including a panel labeled MNIST] Kernel alignment correlates with accuracy.

  25. Summary • Data augmentation + k-NN = asymptotic kernel classifier. • Data augmentation induces invariance and regularizes. • Application in speeding up training and diagnostics. Tri Dao, trid@stanford.edu. Poster #227 on Tuesday Jun 11th at 6:30pm
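
The augmented objective on slide 18 and its feature-averaged approximation on slide 19 can be compared numerically. The following is a minimal sketch, not the authors' code: the feature map `phi`, the augmentation sampler `sample_T` (small random 2-D rotations), the logistic loss with labels in {-1, +1}, and the synthetic data are all assumptions made for illustration, and the expectation over T(x_i) is replaced by a Monte Carlo average.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    # Hypothetical feature map: coordinates and their squares.
    return np.concatenate([x, x ** 2], axis=-1)

def sample_T(x, m):
    # Hypothetical augmentation T(x): m small random rotations of a 2-D point.
    theta = rng.uniform(-0.3, 0.3, size=m)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.stack([np.stack([c, -s], axis=-1),
                    np.stack([s, c], axis=-1)], axis=-2)  # shape (m, 2, 2)
    return rot @ x  # shape (m, 2)

def loss(margin):
    # Logistic loss log(1 + exp(-margin)); an assumed choice of the loss.
    return np.logaddexp(0.0, -margin)

def objectives(w, X, y, m=200):
    # Returns (augmented objective, feature-averaged objective),
    # both estimated from the same m sampled transformations per point.
    aug_terms, avg_terms = [], []
    for x_i, y_i in zip(X, y):
        feats = phi(sample_T(x_i, m))                       # (m, 4)
        aug_terms.append(loss(y_i * (feats @ w)).mean())    # mean of losses
        psi_i = feats.mean(axis=0)                          # averaged features
        avg_terms.append(loss(y_i * (psi_i @ w)))           # loss of the mean
    return float(np.mean(aug_terms)), float(np.mean(avg_terms))

# Synthetic data, for illustration only.
X = rng.normal(size=(50, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
w = rng.normal(size=4)

augmented, averaged = objectives(w, X, y)
print(f"augmented objective:        {augmented:.4f}")
print(f"feature-averaged objective: {averaged:.4f}")
```

Because the logistic loss is convex, Jensen's inequality guarantees the feature-averaged value is never larger than the augmented one; the gap between the two is what the second-order term on slide 21 accounts for.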
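
The first- and second-order effects named on slides 20 and 21 can be read off a Taylor expansion of the loss around the averaged features. The derivation below is a sketch under the assumption that $\ell$ is twice differentiable; the symbol $\Sigma(x)$ for the covariance of the augmented features is notation introduced here, not taken from the slides, and $\psi(x) = \mathbb{E}_{z \sim T(x)}\, \phi(z)$ is the averaged feature from slide 22.

```latex
\begin{align*}
\mathbb{E}_{z \sim T(x)}\, \ell\big(w^\top \phi(z)\big)
  &\approx \ell\big(w^\top \psi(x)\big)
   + \ell'\big(w^\top \psi(x)\big)\, w^\top\, \mathbb{E}_{z \sim T(x)}\big[\phi(z) - \psi(x)\big]
   + \tfrac{1}{2}\, \ell''\big(w^\top \psi(x)\big)\, w^\top \Sigma(x)\, w \\
  &= \ell\big(w^\top \psi(x)\big)
   + \tfrac{1}{2}\, \ell''\big(w^\top \psi(x)\big)\, w^\top \Sigma(x)\, w,
\qquad
\Sigma(x) = \mathbb{E}_{z \sim T(x)}\Big[\big(\phi(z) - \psi(x)\big)\big(\phi(z) - \psi(x)\big)^\top\Big].
\end{align*}
```

The linear term vanishes because $\mathbb{E}_{z \sim T(x)}\, \phi(z) = \psi(x)$. What remains is the loss on averaged features (the first-order effect) plus a quadratic penalty on $w$ weighted by the variance of the augmented features, which is a data-dependent regularizer (the second-order effect).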
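
The kernel target alignment diagnostic on slide 22 can be sketched as follows. This is an illustrative implementation of the standard alignment formula, the Frobenius inner product of the kernel matrix with the label matrix divided by the product of their Frobenius norms, assuming binary labels in {-1, +1} and a linear kernel on whatever features are passed in. In the setting of the slides those would be the averaged features ψ(x_i); the function name and the toy inputs here are hypothetical.

```python
import numpy as np

def kernel_target_alignment(features, y):
    # features: (n, d) array, e.g. averaged augmented features psi(x_i).
    # y: (n,) array of labels in {-1, +1} (assumed binary setting).
    K = features @ features.T          # linear kernel matrix
    Y = np.outer(y, y)                 # ideal target kernel y y^T
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return float(np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y)))

# Toy check: features that equal the labels are perfectly aligned,
# features orthogonal to the labels are not aligned at all.
y = np.array([1.0, 1.0, -1.0, -1.0])
aligned = kernel_target_alignment(y[:, None], y)
misaligned = kernel_target_alignment(np.array([[1.0], [-1.0], [1.0], [-1.0]]), y)
print("aligned features:   ", aligned)
print("misaligned features:", misaligned)
```

A higher value means same-class points have more similar features than different-class points, which is the sense in which the slides use alignment as a cheap proxy before training.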
