CS 4803 / 7643: Deep Learning
Zsolt Kira Georgia Tech
Topics:
– Low-label ML Formulations
Administrivia
Projects!
– Project check-in due April 11th
– Will be graded pass/fail; if you fail, you can address the issues
– Counts for 5 points of project score
– No presentations
(C) Dhruv Batra & Zsolt Kira 2
– Your project should include doing something beyond just downloading open-source code and tuning hyperparameters.
– This can include:
code),
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
A Survey on Transfer Learning Sinno Jialin Pan and Qiang Yang Fellow, IEEE
3D pose estimation
http://taskonomy.stanford.edu/
Slide Credit: Camilo & Higuera
Disentangling Task Transfer Learning, Amir R. Zamir, Alexander Sax*, William B. Shen*, Leonidas Guibas, Jitendra Malik, Silvio Savarese
Builds a graph of transferability between computer vision tasks:
– Hierarchy Process (from pairwise comparisons between all possible sources for each target task)
– selection optimization (best performance from a limited set of source tasks): transfer policy
– Empirical study on performance and data-efficiency gains from transfer using different datasets (Places and ImageNet)
– Unlabeled data (unsupervised learning)
– Multi-modal data (multimodal learning)
– Multi-domain data (transfer learning, domain adaptation)
Slide Credit: Hugo Larochelle
– we want to design a learning algorithm A that outputs good parameters θ of a model M, when fed a small dataset D_train = {(x_i, y_i)}, i = 1, …
– this is known as meta-learning or learning to learn
– ideally there should be no human involved in producing a model for new datasets
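To make the roles of A, θ, M, and D_train concrete, here is a minimal sketch. It uses a hand-designed nearest-centroid rule as a stand-in for A; that choice, the function names `algorithm_a` and `model_m`, and the toy data are all assumptions made for illustration, not something the slides specify. Meta-learning would aim to learn A itself instead of fixing it by hand.

```python
from collections import defaultdict


def algorithm_a(d_train):
    """Hypothetical learning algorithm A: maps a small dataset to parameters theta.

    Here theta is one mean vector (centroid) per class.
    """
    sums, counts = {}, defaultdict(int)
    for x, y in d_train:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def model_m(theta, x):
    """Model M: predicts the class whose centroid is closest to x."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, x))
    return min(theta, key=lambda y: sq_dist(theta[y]))


# A tiny D_train with two examples per class (toy values).
d_train = [([0.0, 0.0], "cat"), ([0.2, 0.1], "cat"),
           ([1.0, 1.0], "dog"), ([0.9, 1.2], "dog")]
theta = algorithm_a(d_train)
prediction = model_m(theta, [0.8, 0.9])
```

The point of the split is that A is the only part that sees D_train, while M only sees θ; replacing the body of `algorithm_a` with something learned leaves the interface unchanged.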
– One-Shot learning of object categories (2006) Fei-Fei Li, Rob Fergus and Pietro Perona
– Knowledge transfer in learning to recognize visual objects classes (2004) Fei-Fei Li
– Object classification from a single example utilizing class relevance pseudo-metrics (2004) Michael Fink
– Cross-generalization: learning novel classes from a single example by feature replacement (2005) Evgeniy Bart and Shimon Ullman
– with recent progress in end-to-end deep learning, we hope to learn a representation better suited for few-shot learning
– Learning a synaptic learning rule (1990) Yoshua Bengio, Samy Bengio, and Jocelyn Cloutier
– The Evolution of Learning: An Experiment in Genetic Connectionism (1990) David Chalmers
– On the search for new learning rules for ANNs (1995) Samy Bengio, Yoshua Bengio, and Jocelyn Cloutier
– Learning to control fast-weight memories: An alternative to dynamic recurrent networks (1992) Jürgen Schmidhuber
– A neural network that embeds its own meta-levels (1993) Jürgen Schmidhuber
– outputs the update, so it can decide to do something other than gradient descent
(2016) Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, and Nando de Freitas
Steven Younger, and Peter R. Conwell
– idea of learning the learning rates and the initialization conditions
and Ryan P. Adams
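As a rough illustration of learning a learning rate, the sketch below tunes a single step size by differentiating the validation loss after one gradient step. The one-parameter regression problem, the single inner step, and all names and constants (`alpha`, `meta_lr`, the data) are assumptions chosen so the derivative can be written by hand; this is not the method of the cited work, and it leaves the initialization fixed.

```python
def loss(w, xs, ys):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grad(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return 2.0 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


# Toy data generated from y = 3x (hypothetical values).
x_train, y_train = [1.0, 2.0], [3.0, 6.0]
x_val, y_val = [0.5, 1.5], [1.5, 4.5]

w0 = 0.0        # fixed initialization
alpha = 0.01    # learning rate being learned
meta_lr = 0.001 # step size for updating alpha

for _ in range(200):
    g = grad(w0, x_train, y_train)
    w1 = w0 - alpha * g  # one ordinary gradient step
    # Chain rule: dL_val(w1)/d alpha = L_val'(w1) * d w1/d alpha = L_val'(w1) * (-g)
    hyper_grad = grad(w1, x_val, y_val) * (-g)
    alpha -= meta_lr * hyper_grad

w_after_step = w0 - alpha * grad(w0, x_train, y_train)
```

With this toy data a single step can reach the true slope, so the learned `alpha` settles at the value that makes the validation loss vanish; on real problems one would differentiate through many steps, which is where the cost of such approaches comes from.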
– input: training set
– input: meta-training set
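The difference between a training set and a meta-training set can be sketched in code. The sketch assumes the common episodic setup, in which a meta-training set is a collection of episodes and each episode is itself a small (training set, test set) pair over a few classes. The function `sample_episode`, the N-way/K-shot parameters, and the placeholder image names are illustrative assumptions, not taken from the slides.

```python
import random


def sample_episode(examples_by_class, n_way, k_shot, n_query, rng):
    """Build one episode: a small training set and a test set over n_way classes."""
    classes = rng.sample(sorted(examples_by_class), n_way)
    d_train, d_test = [], []
    for label in classes:
        # Draw disjoint examples for the episode's training and test sets.
        picked = rng.sample(examples_by_class[label], k_shot + n_query)
        d_train += [(x, label) for x in picked[:k_shot]]
        d_test += [(x, label) for x in picked[k_shot:]]
    return d_train, d_test


rng = random.Random(0)
# Hypothetical pool: 10 classes with 20 placeholder examples each.
pool = {f"class_{c}": [f"img_{c}_{i}" for i in range(20)] for c in range(10)}

# A meta-training set is then a list of such episodes.
meta_train = [sample_episode(pool, n_way=5, k_shot=1, n_query=3, rng=rng)
              for _ in range(4)]
```

An ordinary learning algorithm consumes one `d_train`; a meta-learning algorithm consumes the whole `meta_train` list and is scored on how well the learner it produces does on each episode's `d_test`.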
– Take inspiration from a known learning algorithm
MAML (Finn et al. 2017)
– Derive it from a black box neural network
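For the gradient-descent-inspired route, the idea behind MAML can be sketched on a toy problem: learn an initialization such that one gradient step on a task's training data gives low loss on that task's test data. The example assumes a one-parameter regression model so that the second-order term can be written analytically. The task family, step sizes, and function names are illustrative assumptions and not the setup of Finn et al. (2017).

```python
def mse(w, xs, ys):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return 2.0 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


def inner_step(w, xs, ys, alpha):
    """Task adaptation: one gradient step from the shared initialization."""
    return w - alpha * mse_grad(w, xs, ys)


def maml_meta_gradient(w, task, alpha):
    """Gradient of the post-adaptation test loss with respect to the initialization."""
    (x_tr, y_tr), (x_te, y_te) = task
    w_adapted = inner_step(w, x_tr, y_tr, alpha)
    # d w_adapted / d w = 1 - alpha * (second derivative of the training loss);
    # for this model the second derivative is 2 * mean(x^2).
    d_adapted_d_w = 1.0 - alpha * 2.0 * sum(x * x for x in x_tr) / len(x_tr)
    return mse_grad(w_adapted, x_te, y_te) * d_adapted_d_w


def make_task(slope):
    """Hypothetical task: regress y = slope * x from two training points."""
    x_tr, x_te = [1.0, 2.0], [0.5, 1.5]
    return ((x_tr, [slope * x for x in x_tr]), (x_te, [slope * x for x in x_te]))


tasks = [make_task(a) for a in (1.0, 2.0, 3.0)]
alpha, beta = 0.05, 0.1  # inner and outer step sizes (assumed values)
w = 0.0                  # initialization being meta-learned

for _ in range(100):
    meta_grad = sum(maml_meta_gradient(w, t, alpha) for t in tasks) / len(tasks)
    w -= beta * meta_grad
```

In this toy setting the meta-learned initialization ends up near the middle of the task slopes, from which a single inner step moves closest to any individual task. Dropping the `d_adapted_d_w` factor would give a first-order approximation.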
Slide Credit: Sergey Levine
A Closer Look at Few-shot Classification, Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, Jia-Bin Huang
– varying number of classes / examples per class (meta-training vs. meta-testing)?
– semantic differences between meta-training vs. meta-testing classes?
– overlap in meta-training vs. meta-testing classes (see recent “low-shot” literature)?
– how should this impact how we generate episodes?
– meta-active learning? (few successes so far)