
Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks
Micah Goldblum, Steven Reich, Liam Fowl, Renkun Ni, Valeriia Cherepanova, Tom Goldstein
University of Maryland, College Park, Maryland, USA


  1. Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks
     Micah Goldblum, Steven Reich, Liam Fowl, Renkun Ni, Valeriia Cherepanova, Tom Goldstein
     University of Maryland, College Park, Maryland, USA
     goldblum@umd.edu
     August 14, 2020

  2. A Brief Synopsis
     What is the difference between meta-learned and classically trained networks?
     • Meta-learners which fix the feature extractor during fine-tuning perform clustering in feature space.
     • Improve the performance of classical training for few-shot problems by encouraging feature-space clustering.
     • Relate Reptile to consensus optimization and improve its performance by enforcing a consensus penalty.

  3. Meta-Learning for Few-Shot Classification
     Algorithm 1: The meta-learning framework
     1  Require: Base model, F_θ, fine-tuning algorithm, A, learning rate, γ, and distribution over tasks, p(T).
     2  Initialize θ, the weights of F;
     3  while not done do
     4      Sample batch of tasks, {T_i}_{i=1}^n, where T_i ∼ p(T) and T_i = (T_i^s, T_i^q).
     5      for i = 1, ..., n do
     6          Fine-tune model on T_i (inner loop). New network parameters are written θ_i = A(θ, T_i^s).
     7          Compute gradient g_i = ∇_θ L(F_{θ_i}, T_i^q)
     8      end for
     9      Update base model parameters (outer loop): θ ← θ − (γ/n) Σ_i g_i
     10 end while
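The framework on this slide can be sketched on a toy problem. Below is a minimal, hypothetical numpy sketch (all names are mine, not from the talk) using a one-parameter linear model and a first-order approximation of the meta-gradient: the query-set gradient g_i is taken at the fine-tuned weights θ_i instead of differentiating through the fine-tuning algorithm A.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_grad(w, x, y):
    # Gradient of mean squared error for the linear model f(x) = w * x
    return np.mean(2.0 * (w * x - y) * x)

def fine_tune(w, x_s, y_s, inner_lr=0.05, steps=5):
    # Inner loop A(theta, T^s): a few gradient steps on the support set
    for _ in range(steps):
        w = w - inner_lr * loss_grad(w, x_s, y_s)
    return w

def meta_train(w=0.0, iterations=200, batch=4, outer_lr=0.05):
    # Outer loop: average first-order query-set gradients over a task batch
    for _ in range(iterations):
        grads = []
        for _ in range(batch):
            a = rng.uniform(-2.0, 2.0)        # each task regresses y = a * x
            x_s = rng.normal(size=10)         # support inputs, T^s
            x_q = rng.normal(size=10)         # query inputs,   T^q
            w_i = fine_tune(w, x_s, a * x_s)  # theta_i = A(theta, T^s)
            grads.append(loss_grad(w_i, x_q, a * x_q))  # g_i on the query set
        w = w - outer_lr * np.mean(grads)     # theta <- theta - (gamma/n) sum_i g_i
    return w
```

Swapping in a different `fine_tune` (full SGD vs. a closed-form head solver) is exactly how the methods on the next slide differ.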

  4. Meta-Learning for Few-Shot Classification
     • Meta-learning methods mainly differ in fine-tuning procedure.
     • MAML: SGD to fine-tune all network parameters [Finn et al. 2017].
     • R2-D2: Ridge regression on the one-hot labels (only fine-tune last linear layer) [Bertinetto et al. 2018].
     • MetaOptNet: Differentiable solver for SVM (only fine-tune last linear layer) [Lee et al. 2019].
     • ProtoNet: Nearest neighbors with class prototypes (only fine-tune last layer) [Snell et al. 2017].
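As a concrete illustration of the ProtoNet-style procedure in the last bullet, here is a small sketch (the function name is mine, not from the cited papers): each class prototype is the mean of that class's support features, and a query takes the label of its nearest prototype.

```python
import numpy as np

def prototype_predict(support_feats, support_labels, query_feats):
    # ProtoNet-style inference: one prototype per class (the mean of that
    # class's support features); queries are labeled by the nearest prototype.
    classes = np.unique(support_labels)
    protos = np.stack([support_feats[support_labels == c].mean(axis=0)
                       for c in classes])
    # Euclidean distance from every query to every prototype
    dists = np.linalg.norm(query_feats[:, None, :] - protos[None, :, :], axis=-1)
    return classes[dists.argmin(axis=1)]
```

Because the prototypes are computed in closed form from the support set, this "fine-tuning" step touches no backbone weights, which is why such methods depend so heavily on the quality of the feature extractor.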

  5. Meta-Learned Feature Extractors Are Better for Few-Shot Classification
     • Meta-learned models perform better than models of the same architecture trained with SGD.
     • Meta-learned models are not simply well-tuned for their own fine-tuning algorithm.

     Table 1: Comparison of meta-learning and classical transfer learning models on 5-way 1-shot mini-ImageNet. Column headers denote the fine-tuning algorithm used for evaluation.

     Model                  SVM     RR      ProtoNet  MAML
     MetaOptNet-Meta        62.64   60.50   51.99     55.77
     MetaOptNet-Classical   56.18   55.09   41.89     48.29
     R2-D2-Meta             51.80   55.89   47.89     53.72
     R2-D2-Classical        48.39   28.77   46.39     44.31

  6. Clustering in Feature Space
     Hypothesis: meta-learning algorithms which fix the feature extractor during the inner loop cluster each class around a point.
     • Visualize feature clustering.
     • Measure feature clustering.
     • Sufficient condition for good few-shot classification.
     • Clustering regularizers improve few-shot performance.
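One simple way to put a number on "clustered" (a stand-in statistic for illustration, not necessarily the exact measure used in the talk) is the ratio of within-class scatter to between-class scatter in feature space; lower values mean each class hugs its centroid more tightly.

```python
import numpy as np

def clustering_ratio(feats, labels):
    # Within-class scatter divided by between-class scatter.
    # Lower values indicate tighter clustering of each class around a point.
    classes = np.unique(labels)
    centroids = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    grand_mean = feats.mean(axis=0)
    within = np.mean([np.sum((feats[labels == c] - centroids[i]) ** 2, axis=1).mean()
                      for i, c in enumerate(classes)])
    between = np.mean(np.sum((centroids - grand_mean) ** 2, axis=1))
    return within / between
```

A differentiable statistic like this can also be added to a classical training loss as a regularizer, which is the spirit of the clustering regularizers mentioned in the last bullet.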
