CS 330
Non-Parametric Few-Shot Learning
Logistics
Homework 1 due tonight; Homework 2 out soon. Fill out the project group form if you haven't already. Project suggestions & project spreadsheet posted.

Plan for Today
Non-Parametric Few-Shot Learning
Recap: parametric approaches produce task-specific parameters ϕ_i from the training set D^tr_i.
Note: some of these methods precede parametric approaches
Koch et al., ICML ‘15
Vinyals et al. Matching Networks, NeurIPS ‘16
bidirectional LSTM, convolutional encoder
ŷ^ts = Σ_{(x_k, y_k) ∈ D^tr} f_θ(x^ts, x_k) y_k

(meta-training: for each task i, sample D^tr_i and D^test_i)
Compute ŷ^ts = Σ_{(x_k, y_k) ∈ D^tr} f_θ(x^ts, x_k) y_k
Update θ using ∇_θ L(ŷ^ts, y^ts)
(parameters ϕ are integrated into f_θ; no separate adaptation step)
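The matching-networks-style prediction above can be sketched in a few lines of NumPy. This is a simplified sketch (plain cosine-similarity attention, no full-context bidirectional-LSTM embeddings); all function and variable names are illustrative, not from the paper's code:

```python
import numpy as np

def matching_net_predict(f_support, y_support, f_query):
    """Predict a soft label: y_hat^ts = sum_k a(x^ts, x_k) y_k.

    f_support: (M, D) embeddings f_theta(x_k) of the support set D^tr
    y_support: (M, N) one-hot labels y_k
    f_query:   (D,)   embedding f_theta(x^ts) of the test input
    """
    # cosine similarity between the query and each support embedding
    s = f_support @ f_query
    s = s / (np.linalg.norm(f_support, axis=1) * np.linalg.norm(f_query) + 1e-8)
    # attention weights via softmax over similarities
    a = np.exp(s - s.max())
    a = a / a.sum()
    # prediction = attention-weighted sum of one-hot support labels
    return a @ y_support
```

Because the prediction is a weighted sum of support labels, no task-specific parameters are ever produced: meta-training just backpropagates the loss on ŷ^ts into the shared embedding f_θ.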
Snell et al. Prototypical Networks, NeurIPS ‘17
c_n = (1/K) Σ_{(x,y) ∈ D^tr_i} 1(y = n) f_θ(x)

p_θ(y = n | x) = exp(−d(f_θ(x), c_n)) / Σ_{n'} exp(−d(f_θ(x), c_{n'}))
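These two equations (class prototypes, then a softmax over negative distances) can be sketched in NumPy; here d is squared Euclidean distance and the names are illustrative:

```python
import numpy as np

def prototypes(f_support, y_support, n_classes):
    """c_n = (1/K) * sum over (x, y) in D^tr_i of 1(y = n) f_theta(x)."""
    return np.stack([f_support[y_support == n].mean(axis=0)
                     for n in range(n_classes)])

def proto_predict(f_query, c):
    """p(y = n | x) = softmax_n( -d(f_theta(x), c_n) )."""
    d2 = ((c - f_query) ** 2).sum(axis=1)   # squared Euclidean distance to each prototype
    logits = -d2                            # note the negative sign: closer = more likely
    p = np.exp(logits - logits.max())
    return p / p.sum()
```

At test time, adding a new class only requires averaging its embedded support examples into a new prototype; no gradient steps on θ are needed.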
Link: https://arxiv.org/abs/1811.03066
(http://www.dermnet.com/)
(Top 200 classes only!)
Goal: acquire accurate classifiers from few examples per condition.

Methods compared:
- PCN: prototypical clustering networks (the paper's method)
- PN: standard ProtoNets, trained on 150 base classes, pre-trained on ImageNet
- FTN-*NN: ImageNet pre-training, fine-tuned ResNet on N classes, *-nearest neighbors in the resulting embedding space
- FT200-*CE: fine-tuned on all 200 classes with cross-entropy (very strong baseline: accesses more info during training, requires re-training for new classes)

Evaluation Metric: mean class accuracy (mca), i.e. the average of per-class accuracies across 200 classes.

Results (k = 5 and k = 10): PCN > PN, PCN > FTN-*NN, PCN ≈ FT200-*CE without requiring re-training. More visualizations and analysis in the paper!
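Mean class accuracy weights every class equally regardless of how many test examples it has, which matters when rare conditions have few examples. A minimal sketch (hypothetical helper, not from the paper):

```python
import numpy as np

def mean_class_accuracy(y_true, y_pred, n_classes):
    """Average of per-class accuracies; each class counts equally,
    no matter how many test examples it contributes."""
    accs = []
    for c in range(n_classes):
        mask = (y_true == c)
        if mask.any():                      # skip classes absent from the test set
            accs.append((y_pred[mask] == c).mean())
    return float(np.mean(accs))
```

Contrast with plain accuracy, which a model can inflate by doing well only on the most common classes.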
ŷ^ts = f(D^tr_i, x^ts)
Jiang et al. CAML ‘19
Triantafillou et al. Proto-MAML ‘19
Rusu et al. LEO ‘19
where c_n = (1/K) Σ_{(x,y) ∈ D^tr_i} 1(y = n) f_θ(x)
Black-box adaptation:
+ easy to combine with a variety of learning problems (e.g. SL, RL)
- challenging optimization (no inductive bias at the initialization)

Optimization-based:
+ positive inductive bias at the start
+ handles varying & large K well
+ model-agnostic
- second-order optimization can be compute- & memory-intensive

Non-parametric:
+ entirely feedforward
+ computationally fast & easy to optimize
Black-box: + complete expressive power
Optimization-based: + consistent, reduces to GD; ~ expressive for very deep models*
Non-parametric: + expressive for most architectures; ~ consistent under certain conditions
*for supervised learning settings
In practice the approaches perform comparably on existing benchmarks (likely says more about the benchmarks than the methods)
(Nguyen et al. Meta-Learning GNN Initializations for Low-Resource Molecular Property Prediction. 2020)
[potentially useful for low-resource drug discovery problems]

(Gui et al. Few-Shot Human Motion Prediction via Meta-Learning. ECCV 2018)
[potentially useful for human-robot interaction, autonomous driving]
spelling correction, simple math problems, translating between languages