  1. CS 4803 / 7643: Deep Learning
     Topics: (Continued) Low-label ML Formulations
     Zsolt Kira, Georgia Tech

  2. Administrative
     • Projects!
       – Poster details out on Piazza
       – Note: No late days for anything project related!
       – Also note: Keep track of your GCP usage and costs! Set limits on spending.

  3. Meta-Learning for Few-Shot Recognition
     • Key idea: We want to learn from a few examples (called the support set) to make predictions on a query set for novel classes.
       – Assume: We have a larger labeled dataset for a different set of categories (the base classes).
     • How do we test this? The N-way, k-shot test:
       – k: number of examples per class in the support set
       – N: number of classes ("confusers") among which we have to choose the target class
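As a concrete illustration of the N-way, k-shot protocol, here is a minimal sketch of how one evaluation episode could be sampled. The `data_by_class` dictionary, helper name, and default sizes are assumptions for illustration, not something defined on the slides.

```python
import random

def sample_episode(data_by_class, n_way=5, k_shot=1, n_query=15):
    """Sample one N-way, k-shot episode from a dict {class_label: [examples]}."""
    classes = random.sample(list(data_by_class.keys()), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        examples = random.sample(data_by_class[cls], k_shot + n_query)
        # Labels are re-indexed 0..N-1 within the episode: the model must pick the
        # target class from among the N candidate ("confuser") classes.
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query
```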

  4. Normal Approach
     • Do what we always do: fine-tuning
       – Train a classifier on the base classes
       – Freeze the features
       – Learn classifier weights for the new classes using the small amount of labeled data (during "inference" time!)
     A Closer Look at Few-shot Classification. Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang
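A minimal PyTorch-style sketch of this baseline, assuming a backbone already trained on the base classes; the function and tensor names are hypothetical.

```python
import torch
import torch.nn as nn

def finetune_baseline(backbone, support_x, support_y, n_way, steps=100, lr=1e-2):
    """Freeze pretrained features and fit a new linear head on the support set."""
    backbone.eval()                       # features stay frozen
    with torch.no_grad():
        feats = backbone(support_x)       # (n_way * k_shot, feat_dim)

    head = nn.Linear(feats.shape[1], n_way)
    opt = torch.optim.SGD(head.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):                # note: this optimization happens at "inference" time
        opt.zero_grad()
        loss_fn(head(feats), support_y).backward()
        opt.step()
    return head                           # classify a query x via head(backbone(x))
```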

  5. Cons of the Normal Approach
     • Training on the base classes does not take the eventual task into account.
     • There is no notion that we will be performing a bunch of N-way tests.
     • Idea: simulate what we will see during test time.

  6. Meta-Training Approach
     • Set up a set of smaller tasks during training that simulate what we will be doing during testing.
       – Can optionally pre-train features on held-out base classes (not typical).
     • The testing stage is now the same, but with new classes.
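A rough sketch of such an episodic training loop, reusing the hypothetical `sample_episode` helper above; `meta_learner.episode_loss` is an assumed interface standing in for whichever method (Matching Networks, Prototypical Networks, MAML, ...) is being trained.

```python
def meta_train(meta_learner, base_data_by_class, optimizer, n_iters=10000):
    """Episodic training: every iteration draws a small N-way, k-shot task
    from the base classes so that training matches the test-time protocol."""
    for it in range(n_iters):
        support, query = sample_episode(base_data_by_class, n_way=5, k_shot=1)
        optimizer.zero_grad()
        # The meta-learner conditions on the support set and is scored on the query set.
        loss = meta_learner.episode_loss(support, query)
        loss.backward()
        optimizer.step()
```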

  7. Meta-Learning Approaches
     • Learning a model conditioned on the support set

  8. Meta-Learner
     • How to parametrize learning algorithms?
     • Two approaches to defining a meta-learner:
       – Take inspiration from a known learning algorithm
         • kNN / kernel machine: Matching Networks (Vinyals et al., 2016)
         • Gaussian classifier: Prototypical Networks (Snell et al., 2017)
         • Gradient descent: Meta-Learner LSTM (Ravi & Larochelle, 2017), MAML (Finn et al., 2017)
       – Derive it from a black-box neural network
         • MANN (Santoro et al., 2016)
         • SNAIL (Mishra et al., 2018)
     Slide Credit: Hugo Larochelle

  9. Meta-Learner (same content as slide 8; Slide Credit: Hugo Larochelle)

  10. Matching Networks (Slide Credit: Hugo Larochelle)
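A simplified sketch of the Matching Networks idea as a soft, attention-weighted nearest-neighbor classifier over the support set; the full-context embeddings of the original paper are omitted, and `embed` is an assumed feature extractor.

```python
import torch
import torch.nn.functional as F

def matching_net_predict(embed, support_x, support_y, query_x, n_way):
    """Label each query by a cosine-similarity-weighted vote over support labels."""
    s = F.normalize(embed(support_x), dim=1)       # (S, d) support embeddings
    q = F.normalize(embed(query_x), dim=1)         # (Q, d) query embeddings
    attn = F.softmax(q @ s.t(), dim=1)             # (Q, S) attention over the support set
    onehot = F.one_hot(support_y, n_way).float()   # (S, n_way)
    return attn @ onehot                           # (Q, n_way) class probabilities
```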

  11. Prototypical Networks (Slide Credit: Hugo Larochelle)

  12. Prototypical Networks (Slide Credit: Hugo Larochelle)
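A minimal sketch of the Prototypical Networks classification rule, again assuming a generic `embed` network: each class is summarized by the mean ("prototype") of its embedded support examples, and queries are scored by negative squared Euclidean distance to the prototypes, a Gaussian-classifier-like rule.

```python
import torch

def proto_net_logits(embed, support_x, support_y, query_x, n_way):
    """Nearest-prototype classification in embedding space."""
    s = embed(support_x)                                         # (S, d)
    q = embed(query_x)                                           # (Q, d)
    prototypes = torch.stack(
        [s[support_y == c].mean(dim=0) for c in range(n_way)])   # (n_way, d) class means
    logits = -torch.cdist(q, prototypes) ** 2                    # (Q, n_way)
    return logits                                                # train with cross-entropy on query labels
```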

  13. More Sophisticated Meta-Learning Approaches
     • Learn gradient descent:
       – parameter initialization and update rules
     • Learn just an initialization and use normal gradient descent (MAML)

  14. Meta-Learner LSTM (Slide Credit: Hugo Larochelle)

  15. Meta-Learner LSTM (Slide Credit: Hugo Larochelle)

  16. Meta-Learner LSTM (Slide Credit: Hugo Larochelle)

  17. Meta-Learner LSTM (Slide Credit: Hugo Larochelle)

  18. Meta-Learning Algorithm

  19. Meta-Learner LSTM (Slide Credit: Hugo Larochelle)
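As rough intuition for the Meta-Learner LSTM (Ravi & Larochelle, 2017), here is a heavily simplified, assumption-laden toy sketch rather than the published model: the learner's parameters play the role of the LSTM cell state, and learned input/forget gates generalize the update theta_t = f_t * theta_{t-1} + i_t * (-grad), which reduces to plain SGD when f_t = 1 and i_t equals the learning rate.

```python
import torch
import torch.nn as nn

class LSTMLikeUpdateRule(nn.Module):
    """Toy stand-in for the Meta-Learner LSTM update; the real method feeds preprocessed
    gradients and losses through an actual coordinate-wise LSTM."""
    def __init__(self):
        super().__init__()
        self.gate_net = nn.Linear(2, 2)   # per-parameter features (grad, theta) -> (i, f) gate logits

    def forward(self, theta, grad):
        feats = torch.stack([grad, theta], dim=-1)                       # (num_params, 2)
        i_gate, f_gate = torch.sigmoid(self.gate_net(feats)).unbind(dim=-1)
        return f_gate * theta + i_gate * (-grad)                         # "cell state" update = new parameters
```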

  20. Model-Agnostic Meta-Learning (MAML) (Slide Credit: Hugo Larochelle)

  21. Model-Agnostic Meta-Learning (MAML)

  22. Model-Agnostic Meta-Learning (MAML) (Slide Credit: Sergey Levine)

  23. Model-Agnostic Meta-Learning (MAML) (Slide Credit: Sergey Levine)
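A condensed sketch of one MAML meta-training step with a single inner adaptation step, assuming PyTorch >= 2.0 for `torch.func.functional_call`; all names are illustrative.

```python
import torch
from torch.func import functional_call

def maml_step(model, loss_fn, support, query, meta_opt, inner_lr=0.01):
    """Inner loop: adapt the shared initialization to the support set.
    Outer loop: update the initialization so the adapted weights do well on the query set."""
    (xs, ys), (xq, yq) = support, query

    names, params = zip(*model.named_parameters())
    inner_loss = loss_fn(model(xs), ys)
    grads = torch.autograd.grad(inner_loss, params, create_graph=True)
    fast = {n: p - inner_lr * g for n, p, g in zip(names, params, grads)}  # theta' = theta - alpha * grad

    # Evaluate the adapted ("fast") weights on the query set without mutating the model.
    query_logits = functional_call(model, fast, (xq,))
    outer_loss = loss_fn(query_logits, yq)

    meta_opt.zero_grad()
    outer_loss.backward()   # backpropagates through the inner update (second-order MAML)
    meta_opt.step()
    return outer_loss.item()
```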

  24. Comparison (Slide Credit: Sergey Levine)

  25. Meta-Learner (same content as slide 8; Slide Credit: Hugo Larochelle)

  26. Experiments (Slide Credit: Hugo Larochelle)

  27. Memory-Augmented Neural Network (Slide Credit: Hugo Larochelle)

  28. But beware
     A Closer Look at Few-shot Classification. Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang
     Slide Credit: Hugo Larochelle

  29. (untitled figure slide)

  30. Distribution Shift
     • What if there is a distribution shift (cross-domain)?
     • Lesson: Methods that are successful within-domain might be worse across domains!

  31. Distribution Shift

  32. Random Task Proposals

  33. Does it Work?

  34. Discussions
     • What is the right definition of distributions over problems?
       – varying number of classes / examples per class (meta-training vs. meta-testing)?
       – semantic differences between meta-training and meta-testing classes?
       – overlap between meta-training and meta-testing classes (see the recent "low-shot" literature)?
     • Move from static to interactive learning
       – How should this impact how we generate episodes?
       – Meta-active learning? (few successes so far)
     Slide Credit: Hugo Larochelle
