
TaskNorm: Rethinking Batch Normalization for Meta-Learning

John Bronskill, Jonathan Gordon, James Requeima, Sebastian Nowozin, Richard E. Turner (University of Cambridge; Microsoft Research)


  1. TaskNorm: Rethinking Batch Normalization for Meta-Learning
     John Bronskill, Jonathan Gordon, James Requeima, Sebastian Nowozin, Richard E. Turner
     Affiliations: University of Cambridge (Department of Engineering); Invenia Labs; Microsoft Research
     Paper: Bronskill, J.*, Gordon, J.*, Requeima, J., Nowozin, S. and Turner, R.E. "TaskNorm: Rethinking Batch Normalization for Meta-Learning." Proceedings of the 37th International Conference on Machine Learning, PMLR 119 (2020).
     * Equal contribution.
     Code: https://github.com/cambridge-mlg/cnaps

  2. TaskNorm: Batch Normalization for Meta-learning with Images
     • We demonstrate the significant effect of batch normalization (BN) on meta-learning image classification accuracy and training efficiency.
     • We identify issues with transductive BN schemes used in well-known meta-learning algorithms.
     • We introduce TaskNorm, a normalization algorithm that is tailored for the meta-learning setting and improves both image classification accuracy and training efficiency.

  3. Meta-Learning

  4. Meta-Learning
     ➢ Early Machine Learning: Learn classifier based on engineered features

  5. Meta-Learning
     ➢ Early Machine Learning: Learn classifier based on engineered features
     ➢ Deep Learning: Jointly learn features and classifier

  6. Meta-Learning
     ➢ Early Machine Learning: Learn classifier based on engineered features
     ➢ Deep Learning: Jointly learn features and classifier
     ➢ Meta-Learning: Jointly learn features, classifier, and algorithm [1]
     [1] Hospedales, Timothy, et al. "Meta-learning in neural networks: A survey." arXiv preprint arXiv:2004.05439 (2020).

  7. Meta-Learning
     ➢ Early Machine Learning: Learn model based on engineered features
     ➢ Deep Learning: Jointly learn features and model
     ➢ Meta-Learning: Jointly learn features, model, and algorithm [1]
       Given a task distribution, learn a new task efficiently. [2]
     [1] Hospedales, Timothy, et al. "Meta-learning in neural networks: A survey." arXiv preprint arXiv:2004.05439 (2020).
     [2] Sergey Levine & Chelsea Finn - Meta-Learning: from Few-Shot Learning to Rapid Reinforcement Learning: https://metalearning-cvpr2019.github.io/assets/CVPR_2019_Metalearning_Tutorial_Chelsea_Finn.pdf

  8. Meta-Learning
     ➢ Early Machine Learning: Learn model based on engineered features
     ➢ Deep Learning: Jointly learn features and model
     ➢ Meta-Learning: Jointly learn features, model, and algorithm [1]
       Given a task distribution, learn a new task efficiently. [2]
     ➢ Focus on utilizing meta-learning in the few-shot classification scenario
     [1] Hospedales, Timothy, et al. "Meta-learning in neural networks: A survey." arXiv preprint arXiv:2004.05439 (2020).
     [2] Sergey Levine & Chelsea Finn - Meta-Learning: from Few-Shot Learning to Rapid Reinforcement Learning: https://metalearning-cvpr2019.github.io/assets/CVPR_2019_Metalearning_Tutorial_Chelsea_Finn.pdf

  9. Few-Shot Meta-Training / Meta-Testing

  10. Few-Shot Meta-Training / Meta-Testing
      [Figure: a task 𝜐 is split into a Context Set (𝐸_𝜐) and a Target Set (𝑈_𝜐).]
      Hugo Larochelle – Generalizing From Few Examples With Meta-Learning: https://www.dropbox.com/s/sm68skkkbxbob0i/metalearning.pdf?dl=0

  11. Few-Shot Meta-Training / Meta-Testing
      [Figure: a Meta-Train task with Context Set 𝐸_1 and Target Set 𝑈_1. Image labels shown: meter, watch, stopwatch, clock, clock, stopwatch.]
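The split of a task into a context set and a target set can be sketched in a few lines of Python. This is only an illustration: the slides do not specify a sampling procedure, so `sample_task`, its parameters (`n_way`, `k_shot`, `n_target`), the placeholder file names, and the extra `sundial` class are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_task(examples, n_way, k_shot, n_target, rng):
    """Draw one few-shot task and split it into a context set and a target set.

    `examples` is a list of (item, label) pairs. All names here are
    hypothetical; the slides do not specify a sampling procedure.
    """
    by_label = defaultdict(list)
    for item, label in examples:
        by_label[label].append(item)

    # Pick the classes that define this task.
    classes = rng.sample(sorted(by_label), n_way)

    context, target = [], []
    for label in classes:
        items = rng.sample(by_label[label], k_shot + n_target)
        # The first k_shot items become labeled context examples ...
        context += [(item, label) for item in items[:k_shot]]
        # ... and the remaining items are held out as target examples.
        target += [(item, label) for item in items[k_shot:]]
    return context, target


# Toy pool: 5 classes with 6 images each (file names are placeholders).
labels = ["meter", "watch", "stopwatch", "clock", "sundial"]
pool = [(f"{label}_{i}.png", label) for label in labels for i in range(6)]

context_set, target_set = sample_task(
    pool, n_way=4, k_shot=1, n_target=1, rng=random.Random(0)
)
```

Each call yields a new task, so repeated calls produce the stream of meta-train tasks; the context and target sets of one task share classes but never share images.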

  12. Few-Shot Meta-Training / Meta-Testing
      [Figure: the Meta-Train task from slide 11, with a flow diagram added: Context Images & Labels → Meta-Learner.]

  13. Few-Shot Meta-Training / Meta-Testing
      [Figure: as slide 12, extended: Context Images & Labels → Meta-Learner → Parameters → Learner.]

  14. Few-Shot Meta-Training / Meta-Testing
      [Figure: as slide 13, extended: Target Images → Learner → Predictions.]

  15. Few-Shot Meta-Training / Meta-Testing
      [Figure: as slide 14, extended: the Predictions and the Target Labels feed the Loss.]
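The data flow built up on slides 12 to 15 can be sketched as a single episode. The code below is a minimal NumPy illustration under stated assumptions, not the authors' model: the "meta-learner" is a stand-in that simply returns one mean feature vector per class as the learner parameters (a nearest-class-mean scheme), the inputs are synthetic feature vectors rather than images, and the function names are hypothetical.

```python
import numpy as np


def meta_learner(context_x, context_y, n_classes):
    """Map the context set to learner parameters (assumed: one mean vector per class)."""
    return np.stack([context_x[context_y == c].mean(axis=0) for c in range(n_classes)])


def learner(params, target_x):
    """Predict class probabilities for the target inputs from the learner parameters."""
    # The negative squared distance to each class mean acts as the logit.
    diffs = target_x[:, None, :] - params[None, :, :]
    logits = -np.sum(diffs ** 2, axis=-1)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def episode_loss(context_x, context_y, target_x, target_y, n_classes):
    """Cross-entropy between the learner predictions and the target labels."""
    params = meta_learner(context_x, context_y, n_classes)
    probs = learner(params, target_x)
    picked = probs[np.arange(len(target_y)), target_y]
    return -np.mean(np.log(np.clip(picked, 1e-12, None)))


# Synthetic task: 4 classes, 2 context examples and 1 target example per class.
rng = np.random.default_rng(0)
n_classes, dim = 4, 8
centers = rng.normal(scale=5.0, size=(n_classes, dim))
context_y = np.repeat(np.arange(n_classes), 2)
target_y = np.arange(n_classes)
context_x = centers[context_y] + rng.normal(size=(len(context_y), dim))
target_x = centers[target_y] + rng.normal(size=(len(target_y), dim))

loss = episode_loss(context_x, context_y, target_x, target_y, n_classes)
```

In an actual meta-training run, this loss would be differentiated with respect to the trainable parameters of the meta-learner and learner and averaged over many tasks; the stand-in above has no trainable parameters, so it only shows the forward pass.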

  16. Meta-Training / Meta-Testing
      [Figure: two Meta-Train tasks, each split into a Context Set (𝐸_𝜐) and a Target Set (𝑈_𝜐). Task 1 (𝐸_1, 𝑈_1) image labels: meter, watch, stopwatch, clock, clock, stopwatch. Task 2 (𝐸_2, 𝑈_2) image labels: Aramaic8, Aramaic9, Aramaic15, Aramaic19, Aramaic19, Aramaic9.]

  17. Meta-Training / Meta-Testing
      [Figure: as slide 16, with the flow diagram applied to task 2: Context Images & Labels → Meta-Learner → Parameters → Learner; Target Images → Learner → Predictions; the Predictions and the Target Labels feed the Loss.]

  18. Meta-Training / Meta-Testing
      [Figure: the two Meta-Train tasks (𝐸_1, 𝑈_1) and (𝐸_2, 𝑈_2), followed by "…" to indicate further Meta-Train tasks.]

  19. Meta-Training / Meta-Testing
      [Figure: the Meta-Train tasks from slide 18, plus a Meta-Test task (𝐸*_1, 𝑈*_1). Image labels shown: curve, speed, stop, no trucks; two further images are marked "?".]

  20. Meta-Training / Meta-Testing
      [Figure: as slide 19, with the flow diagram applied to the Meta-Test task: Context Images & Labels → Meta-Learner → Parameters → Learner; Target Images → Learner → Predictions.]

  21. Batch Normalization
      Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015).

  22. Batch Normalization
      ➢ Goal: Normalize each training batch so that it has:
        • zero mean
        • unit variance

  23. Batch Normalization
      ➢ Goal: Normalize each training batch so that it has:
        • zero mean
        • unit variance
      ➢ Accelerates neural network training by:
        • Allowing the use of higher learning rates.
        • Decreasing the sensitivity to network initialization.
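The normalization goal stated above can be checked numerically. The following NumPy sketch is an illustration only: `batch_normalize` and the small constant `eps` are assumptions made for the example, and the learnable scale and shift of the full algorithm are left out. It also shows one simple intuition for the reduced sensitivity to initialization: rescaling the inputs leaves the normalized output almost unchanged.

```python
import numpy as np


def batch_normalize(x, eps=1e-5):
    """Normalize each feature of a (batch, features) array over the batch axis."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)


rng = np.random.default_rng(0)
batch = rng.normal(loc=3.0, scale=2.0, size=(64, 5))  # a toy mini-batch

normalized = batch_normalize(batch)
# After normalization, each feature has mean close to 0 and variance close to 1.
feature_means = normalized.mean(axis=0)
feature_vars = normalized.var(axis=0)

# Rescaling the inputs (for example, through a differently scaled weight
# initialization) leaves the normalized activations almost unchanged.
rescaled = batch_normalize(10.0 * batch)
max_difference = np.abs(rescaled - normalized).max()
```

The statistics here are computed over the batch axis, which is why the result for one example depends on the other examples in the same batch.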

  24. “Conventional” Batch Normalization Algorithm
      Training:

  25. “Conventional” Batch Normalization Algorithm
      Training:
      ⓪ 𝐶 = {𝑦_1, 𝑦_2, …, 𝑦_𝑛}  # a mini-batch
