Efficient Meta Learning via Minibatch Proximal Update - Pan Zhou (PowerPoint PPT Presentation)



SLIDE 1

Efficient Meta Learning via Minibatch Proximal Update

Pan Zhou

Joint work with Xiao-Tong Yuan, Huan Xu, Shuicheng Yan, Jiashi Feng National University of Singapore pzhou@u.nus.edu Dec 11, 2019


SLIDE 2

Meta Learning via Minibatch Proximal Update (Meta-MinibatchProx)

Meta-MinibatchProx learns a good prior model initialization $w^*$ from observed tasks such that $w^*$ is close to the optimal models of new, similar tasks, which promotes learning on new tasks.

SLIDE 3

Meta Learning via Minibatch Proximal Update (Meta-MinibatchProx)

  • Training model: given a task distribution $P(\mathcal{T})$, we minimize the bi-level meta learning model

$$\min_{w} \; F(w) := \mathbb{E}_{\mathcal{T}_i \sim P(\mathcal{T})} \Big[ \min_{w_i} \; \mathcal{L}_{S_i}(w_i) + \frac{\lambda}{2}\|w_i - w\|^2 \Big],$$

where each task $\mathcal{T}_i$ has $K$ training samples $S_i = \{(x_j, y_j)\}_{j=1}^{K}$, and $\mathcal{L}_{S_i}(w_i) = \frac{1}{K}\sum_{j=1}^{K} \ell(f(w_i; x_j), y_j)$ is the empirical loss with predictor $f$ and loss $\ell$.

SLIDE 4

Meta Learning via Minibatch Proximal Update (Meta-MinibatchProx)

The inner minimization updates the task-specific solution: for each observed task $\mathcal{T}_i$, $w_i$ solves $\min_{w_i} \mathcal{L}_{S_i}(w_i) + \frac{\lambda}{2}\|w_i - w\|^2$.

SLIDE 5

Meta Learning via Minibatch Proximal Update (Meta-MinibatchProx)

The outer minimization updates the prior model: $w$ is trained to minimize the expectation of the proximally regularized task losses over the task distribution.

SLIDE 6

Meta Learning via Minibatch Proximal Update (Meta-MinibatchProx)

Intuition: minimizing this objective forces the learnt prior $w$ to have a small average distance, in expectation, to the optimum models of all tasks.
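To make the bi-level structure concrete, below is a minimal PyTorch sketch of the inner proximal step for one task, assuming a linear predictor and squared loss; the name proximal_adapt and the hyperparameters (lam, lr, steps) are illustrative choices, not taken from the paper.

```python
import torch

def proximal_adapt(w_prior, X, y, lam=1.0, lr=0.05, steps=50):
    # Approximately solve the inner problem of the bi-level model:
    #   min_{w_i}  L_{S_i}(w_i) + (lam / 2) * ||w_i - w_prior||^2
    # with a linear predictor f(w; x) = <w, x> and squared loss on K samples.
    w = w_prior.clone().requires_grad_(True)
    opt = torch.optim.SGD([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((X @ w - y) ** 2).mean()               # empirical loss L_{S_i}(w)
        prox = 0.5 * lam * (w - w_prior).pow(2).sum()  # pulls w toward the prior
        (loss + prox).backward()
        opt.step()
    return w.detach()
```

The proximal term makes the inner objective strongly convex around the prior, which is why a handful of SGD steps on only $K$ samples already yields a useful task-specific solution.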

SLIDE 7

Meta Learning via Minibatch Proximal Update (Meta-MinibatchProx)

  • Test model: given a randomly sampled new task $\mathcal{T}$ consisting of $K$ samples $S = \{(x_j, y_j)\}_{j=1}^{K}$, we adapt by solving

$$w_{\mathcal{T}} = \arg\min_{w'} \; \mathcal{L}_{S}(w') + \frac{\lambda}{2}\|w' - w^*\|^2,$$

where $w^*$ denotes the learnt prior initialization.

SLIDE 8

Meta Learning via Minibatch Proximal Update (Meta-MinibatchProx)

  • Benefit: a few samples are sufficient for adaptation, because the learnt prior initialization $w^*$ is close, in expectation, to the optimum of a new task when training and test tasks are sampled from the same distribution.
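As a self-contained illustration of test-time adaptation, the sketch below adapts a placeholder prior to a synthetic 5-sample task; the data, the dimension d, and the hyperparameters are invented for the example, and w_star merely stands in for a learnt prior.

```python
import torch

torch.manual_seed(0)

# Hypothetical new few-shot task: K = 5 samples in dimension d = 10.
K, d = 5, 10
X_new, y_new = torch.randn(K, d), torch.randn(K)
w_star = torch.zeros(d)  # placeholder for the learnt prior initialization w*

lam, lr = 1.0, 0.05
w = w_star.clone().requires_grad_(True)
opt = torch.optim.SGD([w], lr=lr)
for _ in range(100):
    opt.zero_grad()
    loss = ((X_new @ w - y_new) ** 2).mean()      # L_S(w) on the K samples
    prox = 0.5 * lam * (w - w_star).pow(2).sum()  # stay close to the learnt prior
    (loss + prox).backward()
    opt.step()

w_T = w.detach()  # adapted task-specific model w_T
```

Note that no second-order information is needed at test time: adaptation is just a few steps of SGD on a proximally regularized loss.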

SLIDE 9

Optimization Algorithm

We use an SGD-based algorithm to solve the bi-level training model:

SLIDE 10

Optimization Algorithm

  • Step 1. Select a mini-batch of tasks $\{\mathcal{T}_i\}_{i=1}^{b}$ of size $b$.

SLIDE 11

Optimization Algorithm

  • Step 2. For each sampled task $\mathcal{T}_i$, compute an approximate minimizer of the inner problem, e.g. by a few steps of SGD from $w^t$:

$$w_i^t \approx \arg\min_{w'} \; \mathcal{L}_{S_i}(w') + \frac{\lambda}{2}\|w' - w^t\|^2.$$
SLIDE 12

Optimization Algorithm

  • Step 3. Update the prior initialization model with the averaged proximal gradient:

$$w^{t+1} = w^t - \eta \cdot \frac{\lambda}{b} \sum_{i=1}^{b} \big(w^t - w_i^t\big).$$
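Putting Steps 1-3 together, below is a minimal end-to-end sketch of the training loop on synthetic linear-regression tasks. The outer step uses the fact that the gradient of the inner (Moreau-envelope) value with respect to $w$ is $\lambda(w^t - w_i^t)$; all names, the synthetic task generator, and the hyperparameters are illustrative.

```python
import torch

def inner_solve(w_prior, X, y, lam, lr=0.05, steps=25):
    # Step 2: approximate minimizer of L_{S_i}(w') + (lam / 2) * ||w' - w_prior||^2.
    w = w_prior.clone().requires_grad_(True)
    opt = torch.optim.SGD([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        obj = ((X @ w - y) ** 2).mean() + 0.5 * lam * (w - w_prior).pow(2).sum()
        obj.backward()
        opt.step()
    return w.detach()

d, K, b = 10, 5, 4      # dimension, shots per task, mini-batch size
lam, eta = 1.0, 0.5     # proximal weight, outer step size
w_t = torch.zeros(d)    # prior initialization w^t

for t in range(200):
    grad = torch.zeros(d)
    for _ in range(b):                      # Step 1: sample a mini-batch of b tasks
        w_true = torch.randn(d)             # hidden ground-truth model of the task
        X = torch.randn(K, d)
        y = X @ w_true
        w_i = inner_solve(w_t, X, y, lam)   # Step 2: task-specific solution w_i^t
        grad += lam * (w_t - w_i)           # gradient of the proximal envelope
    w_t = w_t - eta * grad / b              # Step 3: update the prior w^{t+1}
```

Because the outer gradient only needs the inner solutions $w_i^t$, the update avoids the second-order derivatives that make MAML-style training expensive.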
SLIDE 13

Optimization Algorithm


Theorem 1 (convergence guarantees, informal).

(1) Convex setting, i.e. convex $\ell$: we prove that the algorithm converges to a global optimum of the bi-level training objective $F$. (2) Nonconvex setting, i.e. smooth (possibly nonconvex) $\ell$: we prove that the algorithm converges to a stationary point of $F$.
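For orientation, the informal statements above are consistent with the standard rates for stochastic methods of this type; the display below is an assumed sketch of their shape ($T$ is the number of outer iterations, $\bar{w}_T$ the averaged iterate), not the paper's exact constants or conditions.

```latex
% Convex loss: the averaged iterate approaches the optimum (assumed standard rate).
\mathbb{E}\big[F(\bar{w}_T)\big] - \min_{w} F(w)
  \;\le\; \mathcal{O}\!\left(\tfrac{1}{\sqrt{T}}\right),
  \qquad \bar{w}_T = \tfrac{1}{T}\sum_{t=1}^{T} w^t .

% Smooth (possibly nonconvex) loss: the smallest gradient norm vanishes.
\min_{1 \le t \le T} \; \mathbb{E}\,\big\|\nabla F(w^t)\big\|^2
  \;\le\; \mathcal{O}\!\left(\tfrac{1}{\sqrt{T}}\right).
```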

SLIDE 14

Generalization Performance Guarantee

  • Ideally, for a given task $\mathcal{T}$, one should train the model on the population risk $\mathcal{L}_{\mathcal{T}}(w') = \mathbb{E}_{(x,y)\sim \mathcal{T}}\,\ell(f(w'; x), y)$.
  • In practice, we have only $K$ samples and adapt the learnt prior model $w^*$ to the new task: $w_{\mathcal{T}} = \arg\min_{w'} \mathcal{L}_{S}(w') + \frac{\lambda}{2}\|w' - w^*\|^2$.
  • Since $K$ is small, why is $w_{\mathcal{T}}$ good for generalization in the few-shot learning problem?
SLIDE 15

Generalization Performance Guarantee


Theorem 2 (generalization performance guarantee, informal).

Suppose each loss $\ell$ is convex and smooth, and let $w_{\mathcal{T}}^* = \arg\min_{w'} \mathcal{L}_{\mathcal{T}}(w')$ denote the optimum model of task $\mathcal{T}$. Then

$$\mathbb{E}\big[\mathcal{L}_{\mathcal{T}}(w_{\mathcal{T}})\big] - \mathcal{L}_{\mathcal{T}}(w_{\mathcal{T}}^*) \;\le\; \mathcal{O}\Big(\lambda\,\|w^* - w_{\mathcal{T}}^*\|^2 + \frac{1}{\lambda K}\Big).$$

Remark: this implies strong generalization performance, as our training model guarantees the learnt prior $w^*$ is close to the optimum model $w_{\mathcal{T}}^*$.
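One way to see why a bound of this shape holds is the usual bias-stability decomposition for proximally regularized empirical risk minimization; the display below is a sketch of that standard argument, not necessarily the paper's own proof.

```latex
\mathbb{E}\big[\mathcal{L}_{\mathcal{T}}(w_{\mathcal{T}})\big] - \mathcal{L}_{\mathcal{T}}(w_{\mathcal{T}}^*)
 \;=\; \underbrace{\mathbb{E}\big[\mathcal{L}_{\mathcal{T}}(w_{\mathcal{T}}) - \mathcal{L}_{S}(w_{\mathcal{T}})\big]}_{
   \text{generalization gap: } \mathcal{O}(1/(\lambda K)) \text{ by uniform stability}}
 \;+\; \underbrace{\mathbb{E}\big[\mathcal{L}_{S}(w_{\mathcal{T}}) - \mathcal{L}_{\mathcal{T}}(w_{\mathcal{T}}^*)\big]}_{
   \le\, \frac{\lambda}{2}\|w^* - w_{\mathcal{T}}^*\|^2 \text{ by optimality of } w_{\mathcal{T}}}
```

The proximal weight $\lambda$ trades the two terms off: a larger $\lambda$ stabilizes learning on the $K$ samples but leans more heavily on the prior $w^*$ being close to $w_{\mathcal{T}}^*$.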
SLIDE 16

Experimental results

Few-shot regression: smaller mean square error (MSE) between prediction and ground truth. Few-shot classification: higher classification accuracy.

[Bar charts: few-shot classification accuracy of MAML, FOMAML, Reptile, and Ours on miniImageNet and tieredImageNet. Left panel: 1-shot and 5-shot 5-way, with gains of 0.8%, 1.15%, 3.31%, and 1.44% for Ours. Right panel: 1-shot and 5-shot 20-way and 10-way, with gains of 2.41%, 1.18%, 1.12%, and 5.15%.]

SLIDE 17

POSTER #26, 05:00--07:00 PM @ East Exhibition Hall B + C. Thanks!