Meta Learning: A Brief Introduction (Xiachong Feng)



SLIDE 1

Meta Learning: A Brief Introduction

Xiachong Feng

SLIDE 2

Outline

  • Introduction to Meta Learning
  • Types of Meta-Learning Models
  • Papers:
  • Optimization as a Model for Few-Shot Learning (ICLR 2017)
  • Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (ICML 2017)
  • Meta-Learning for Low-Resource Neural Machine Translation (EMNLP 2018)

  • Conclusion
SLIDE 3

Meta-learning

  • Machine Learning: performs poorly on complex classification.
  • Deep Learning: combined with representation learning, essentially solves one-to-one mapping problems.
  • Reinforcement Learning: sequential decision problems cannot be solved by deep learning alone (combine DL + RL).
  • Meta Learning: earlier approaches depend on massive amounts of training; we should fully exploit prior knowledge and experience to guide learning on new tasks.

The Frontier: A Hundred Schools of Thought in Meta Learning / Learning to Learn, https://zhuanlan.zhihu.com/p/28639662

SLIDE 4

Meta-learning

  • Learning to learn
  • Learning to learn: possessing the ability to learn.
  • An example from Jin Yong's wuxia novels: in Jin Yong's martial-arts world there are all kinds of martial arts, each different from the others, both internal and external. Zhang Wuji is exceptionally powerful because he mastered the Nine Yang Divine Skill. With it, Zhang Wuji picks up new martial arts extremely fast: in the film "Kung Fu Cult Master", he learns Zhang Sanfeng's Tai Chi in minutes and defeats the Xuanming Elders. The Nine Yang Divine Skill is a martial art of learning to learn!
  • Meta learning is the Nine Yang Divine Skill of AI.

Learning to Learn: Giving AI Core Values to Enable Rapid Learning, https://zhuanlan.zhihu.com/p/27629294

SLIDE 5

Example

Machine or deep learning:

  • A human chooses the optimizer (SGD/Adam), the learning rate, the decay schedule, ...
  • Model (used to accomplish a given task): classification, regression, sequence labeling, generation, ...

Meta learning:

  • A meta-learner learns how to optimize the learner.
  • Learner (used to accomplish a given task): classification, regression, sequence labeling, generation, ...

SLIDE 6

Types of Meta-Learning Models

  • Humans learn following different methodologies tailored to specific circumstances.
  • In the same way, not all meta-learning models follow the same techniques.

  • Types of Meta-Learning Models
  • 1. Few Shots Meta-Learning
  • 2. Optimizer Meta-Learning
  • 3. Metric Meta-Learning
  • 4. Recurrent Model Meta-Learning
  • 5. Initializations Meta-Learning

What’s New in Deep Learning Research: Understanding Meta-Learning

SLIDE 7

Few Shots Meta-Learning

  • Create models that can learn from minimalistic datasets (learn from tiny data).

  • Papers
  • Optimization As A Model For Few Shot Learning (ICLR 2017)
  • One-Shot Generalization in Deep Generative Models (ICML 2016)
  • Meta-Learning with Memory-Augmented Neural Networks (ICML 2016)

SLIDE 8

Optimizer Meta-Learning

  • Task: learning how to optimize a neural network to better accomplish a task.
  • There is one network (the meta-learner) which learns to update another network (the learner) so that the learner effectively learns the task.
  • Papers:
  • Learning to Learn by Gradient Descent by Gradient Descent (NIPS 2016)
  • Learning to Optimize Neural Nets
SLIDE 9

Metric Meta-Learning

  • To determine a metric space in which learning is particularly efficient. This approach can be seen as a subset of few shots meta-learning, in which we use a learned metric space to evaluate the quality of learning with a few examples.
  • Papers:
  • Prototypical Networks for Few-shot Learning (NIPS 2017)
  • Matching Networks for One Shot Learning (NIPS 2016)
  • Siamese Neural Networks for One-shot Image Recognition
  • Learning to Learn: Meta-Critic Networks for Sample Efficient Learning

SLIDE 10

Recurrent Model Meta-Learning

  • The meta-learner algorithm trains an RNN model that processes a dataset sequentially and then processes new inputs from the task.
  • Papers:
  • Meta-Learning with Memory-Augmented Neural Networks
  • Learning to Reinforcement Learn
  • RL²: Fast Reinforcement Learning via Slow Reinforcement Learning

SLIDE 11

Initializations Meta-Learning

  • Optimizes for an initial representation that can be effectively fine-tuned from a small number of examples.
  • Papers:
  • Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (ICML 2017)
  • Meta-Learning for Low-Resource Neural Machine Translation (EMNLP 2018)

SLIDE 12

Papers

  • Few Shots, Recurrent Model, Optimizer, Initializations and Supervised Meta-Learning: Optimization As a Model For Few Shot Learning (ICLR 2017)
  • Meta learning in NLP: Meta-Learning for Low-Resource Neural Machine Translation (EMNLP 2018)
  • Modern meta learning: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (ICML 2017)

SLIDE 13

Optimization As a Model For Few Shot Learning

Twitter, Sachin Ravi, Hugo Larochelle, ICLR 2017

  • Few Shots Meta-Learning
  • Recurrent Model Meta-Learning
  • Optimizer Meta-Learning
  • Supervised Meta Learning
  • Initializations Meta-Learning
SLIDE 14

Few Shots Learning

  • Given a tiny labelled training set T with N examples: T = {(x_1, y_1), ..., (x_N, y_N)}
  • In a classification problem:
  • k-shot learning
  • N classes
  • k labelled examples per class (k is usually less than 20)
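The N-way, k-shot episode construction above can be sketched in a few lines (a hypothetical helper; `data_by_class` and the toy string data are illustrative, not from the paper):

```python
import random

def sample_episode(data_by_class, n_way=5, k_shot=1, n_query=1):
    """Sample an N-way, k-shot episode from a labelled dataset.

    data_by_class: dict mapping class label -> list of examples.
    Returns (support, query) lists of (example, episode_label) pairs.
    """
    classes = random.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        examples = random.sample(data_by_class[cls], k_shot + n_query)
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query

# Example: 5 classes, 1 shot (toy data: 8 classes of 20 string "examples")
data = {c: [f"{c}_{i}" for i in range(20)] for c in "abcdefgh"}
support, query = sample_episode(data, n_way=5, k_shot=1, n_query=1)
```

Each episode relabels the sampled classes 0..N-1, so the learner must adapt to the episode rather than memorize global labels.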
SLIDE 15

LSTM Cell-State Update

The new cell state keeps what the forget gate retains from the old cell state and adds the input-gated new candidate values:

c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t

(f_t: forget gate, i_t: input gate, c̃_t: candidate values)

Understanding LSTM Networks, https://www.jianshu.com/p/9dc9f41f0b29
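The cell-state equation is only a few lines of NumPy (a minimal sketch; the gate values below are hand-picked for illustration rather than produced by a trained LSTM):

```python
import numpy as np

def lstm_cell_state_update(c_prev, f, i, c_tilde):
    """New cell state: keep what the forget gate retains from the old
    state, then add the input-gated candidate values."""
    return f * c_prev + i * c_tilde

c_prev = np.array([1.0, -2.0])    # old cell state
f = np.array([0.5, 0.0])          # forget gate (0 = forget, 1 = keep)
i = np.array([1.0, 1.0])          # input gate
c_tilde = np.array([0.1, 0.3])    # candidate values
c_new = lstm_cell_state_update(c_prev, f, i, c_tilde)
# c_new == [0.6, 0.3]
```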

SLIDE 16

Supervised Learning

Neural network NN (used to accomplish a given task):

  • classification
  • regression
  • sequence labeling
  • generation
  • ...

Optimizer: SGD, Adam, ...

The network maps an image x to a label y: f(x) → y

SLIDE 17

Meta Learning

  • Meta-learning suggests framing the learning problem at two levels (Thrun, 1998; Schmidhuber et al., 1997).
  • The first is quick acquisition of knowledge within each separate task presented (fast adaptation).
  • This process is guided by the second, which involves slower extraction of information learned across all the tasks (learning).

SLIDE 18

Motivation

  • Deep learning has shown great success in a variety of tasks with large amounts of labeled data.
  • Gradient-based optimization (momentum, Adagrad, Adadelta and ADAM) in high-capacity classifiers requires many iterative steps over many examples to perform well.
  • It starts from a random initialization of its parameters.
  • It performs poorly on few-shot learning tasks.

Is there an optimizer that can finish the optimization task using just a few examples?
SLIDE 19

Method

Propose an LSTM-based meta-learner model to learn the exact optimization algorithm used to train another learner neural network classifier in the few-shot regime.

LSTM cell-state update: c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
Gradient-based update: θ_t = θ_{t-1} − α_t ∇_{θ_{t-1}} L_t

The gradient step looks like a cell-state update with f_t = 1, c_{t-1} = θ_{t-1}, i_t = α_t and c̃_t = −∇_{θ_{t-1}} L_t.
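The correspondence between the two updates can be checked numerically (a minimal sketch with made-up parameter values; the actual meta-learner learns f_t and i_t instead of fixing them):

```python
import numpy as np

def cell_update(c_prev, f, i, c_tilde):
    # Standard LSTM cell-state update: c_t = f * c_{t-1} + i * c~_t
    return f * c_prev + i * c_tilde

theta_prev = np.array([0.5, -1.0])   # current learner parameters
grad = np.array([0.2, -0.4])         # gradient of the loss
lr = 0.1                             # learning rate

# Plain gradient-descent step.
theta_sgd = theta_prev - lr * grad

# The same step expressed as a cell-state update with
# f_t = 1, i_t = lr, and candidate values = -gradient.
theta_lstm = cell_update(theta_prev, f=1.0, i=lr, c_tilde=-grad)
```

Because gradient descent is a special case of the cell update, letting an LSTM produce the gates yields a learnable optimizer.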

SLIDE 20

Method

  • Learner: the neural network classifier.
  • Meta-learner: learns the optimization algorithm.
  • Gradient-based optimization: new parameters θ_t from the current parameters θ_{t-1} and the gradient ∇_{θ_{t-1}} L.
  • Meta-learner optimization: θ_t = metalearner(θ_{t-1}, ∇_{θ_{t-1}} L).
  • An LSTM-based meta-learner optimizer that is trained to optimize a learner neural network classifier, knowing how to quickly optimize the parameters.

SLIDE 21

Model

[Figure: the meta-learner LSTM; its inputs, the loss and gradient, are given by the learner]

SLIDE 22

Task Description

[Figure: an episode consists of D_train, used to train the learner, and D_test, used to train the meta-learner]

SLIDE 23

Training

  • Example: 5 classes, 1-shot learning
  • D_train, D_test ← random datasets from D_meta-train
  • The learner (neural network classifier, parameters θ_{t-1}) computes the loss L and the gradient ∇_{θ_{t-1}} L on D_train.
  • The meta-learner (learned optimization algorithm, parameters Θ_{d-1}) takes the loss and gradient and outputs the learner's new parameters θ_t.
  • Learner update: repeat this for several steps on D_train.
  • Meta-learner update: evaluate the final learner on D_test to obtain L_test, then Θ_d = Θ_{d-1} − β ∇_{Θ_{d-1}} L_test.
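The two nested loops can be sketched on a toy problem (everything here is illustrative: the learner is a one-parameter regressor, a single learned step size stands in for the LSTM meta-learner, and the meta-gradient is approximated by finite differences instead of backpropagation):

```python
import numpy as np

def episode(lr, seed, n_inner=1):
    """One episode: sample a toy 1-D regression task, take n_inner
    gradient steps on D_train, and return the loss on D_test."""
    rng = np.random.default_rng(seed)
    w = rng.normal()                      # task parameter
    x = rng.normal(size=10)
    x_tr, y_tr = x[:5], w * x[:5]         # D_train
    x_te, y_te = x[5:], w * x[5:]         # D_test
    theta = 0.0                           # learner initialization
    for _ in range(n_inner):
        g = np.mean(2 * (theta * x_tr - y_tr) * x_tr)
        theta = theta - lr * g            # "meta-learner" proposes new params
    return np.mean((theta * x_te - y_te) ** 2)

# Outer loop: adjust the meta-parameter to reduce the D_test loss,
# using a crude finite-difference meta-gradient.
lr, beta, eps = 0.01, 1e-3, 1e-4
for seed in range(300):
    g = (episode(lr + eps, seed) - episode(lr - eps, seed)) / (2 * eps)
    lr -= beta * np.clip(g, -5.0, 5.0)

# Held-out episodes: compare the learned step size to the initial one.
before = float(np.mean([episode(0.01, s) for s in range(300, 400)]))
after = float(np.mean([episode(lr, s) for s in range(300, 400)]))
```

The structure mirrors the slide: the inner loop updates the learner on D_train, and the outer loop updates the meta-parameters from the D_test loss.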

SLIDE 24

Initializations Meta-Learning

  • Initial value of the cell state c_0
  • Initial weights of the classifier θ_0
  • c_0 = θ_0
  • Learning this initial value lets the meta-learner determine the optimal initial weights of the learner.

SLIDE 25

Testing

  • Example: 5 classes, 1-shot learning
  • D_train, D_test ← random datasets from D_meta-test
  • The learner is initialized with the learned θ_0 and computes the loss L and gradient ∇_{θ_{t-1}} L on D_train.
  • The meta-learner (trained optimization algorithm Θ, now fixed) outputs the learner's new parameters θ_t.
  • At test time only the learner is updated; the final learner is evaluated on D_test with the chosen metric.

SLIDE 26

Training

[Figure: the training algorithm, alternating learner updates with meta-learner updates]

SLIDE 27

Tricks

  • Parameter sharing
  • The meta-learner must produce updates for deep neural networks, which consist of tens of thousands of parameters; to prevent an explosion of meta-learner parameters, we need to employ some sort of parameter sharing.
  • Batch normalization
  • Speeds up learning of deep neural networks by reducing internal covariate shift within the learner's hidden layers.

SLIDE 28

About This Paper

  • Few Shots Meta-Learning
  • K-shot image classification
  • Recurrent Model Meta-Learning
  • Use LSTM cell state as optimizer
  • Optimizer Meta-Learning
  • Meta-learner is an optimizer
  • Supervised Meta Learning
  • Image classification task
  • Initializations Meta-Learning
  • Learning this initial value lets the meta-learner determine the optimal initial weights of the learner

SLIDE 29

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

University of California, Berkeley Chelsea Finn, Pieter Abbeel, Sergey Levine ICML 2017

  • Few Shots Meta-Learning
  • Supervised Meta Learning
  • Reinforcement Meta Learning
  • Initializations Meta-Learning
SLIDE 30

Problem

  • Prior meta-learning methods learn an update function or learning rule.
  • They expand the number of learned parameters.
  • They place constraints on the model architecture:
  • recurrent models
  • Siamese (twin) networks
SLIDE 31

Motivation

  • Model-agnostic:
  • works with any model trained with gradient descent
  • applies to a variety of different learning problems: classification, regression, reinforcement learning
  • If the internal representation is suitable to many tasks, simply fine-tuning the parameters slightly can produce good results.
  • The learning process can be viewed as maximizing the sensitivity of the loss functions of new tasks with respect to the parameters: when the sensitivity is high, small local changes to the parameters can lead to large improvements in the task loss.

SLIDE 32

Few-Shot Meta-Learning

  • The goal of few-shot meta-learning is to train a model that can quickly adapt to a new task using only a few data points and training iterations.
  • The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples.
  • Method: train the model's initial parameters.
SLIDE 33

Task Description

  • Model: f maps observations x to outputs a
  • Task: T = {L, q(x_1), q(x_{t+1} | x_t, a_t), H}, consisting of a loss function L, a distribution over initial observations q(x_1), a transition distribution q(x_{t+1} | x_t, a_t), and an episode length H
  • Supervised learning problem: H = 1

SLIDE 34

Model

  • We want to learn a new task T_new quickly.
  • Sample tasks T_1, T_2, ... from p(T).
  • Train f_θ on each task's D_train using a gradient-based method, giving adapted parameters θ_1', θ_2', ... and adapted models f_{θ_1'}, f_{θ_2'}, ...
  • Each adapted model is scored with its task loss: L_{T_1}(f_{θ_1'}), L_{T_2}(f_{θ_2'}), ...

SLIDE 35

Model

Update θ by:

θ ← θ − β ∇_θ Σ_{T_i ∼ p(T)} L_{T_i}(f_{θ_i'})

Objective function:

min_θ Σ_{T_i ∼ p(T)} L_{T_i}(f_{θ_i'}) = Σ_{T_i ∼ p(T)} L_{T_i}(f_{θ − α ∇_θ L_{T_i}(f_θ)})

θ is optimized to be easy to fine-tune.

SLIDE 36

Algorithm
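The algorithm (sample tasks, adapt each with an inner gradient step, then update the shared initialization from the post-adaptation loss) can be sketched on a toy task family. The sketch below uses the first-order approximation that ignores second derivatives, so it is FOMAML rather than full MAML, and all tasks and constants are made up for illustration:

```python
import numpy as np

def task(seed):
    """Toy task family: 1-D linear regression y = w*x with random w."""
    rng = np.random.default_rng(seed)
    w = rng.normal()
    x = rng.normal(size=10)
    return (x[:5], w * x[:5]), (x[5:], w * x[5:])  # D_train, D_test

def grad(theta, xy):
    x, y = xy
    return np.mean(2 * (theta * x - y) * x)

def loss(theta, xy):
    x, y = xy
    return np.mean((theta * x - y) ** 2)

alpha, beta = 0.1, 0.05
theta = 2.0                                        # meta-initialization
for seed in range(1000):                           # meta-training
    train, test = task(seed)
    theta_i = theta - alpha * grad(theta, train)   # inner adaptation step
    theta -= beta * grad(theta_i, test)            # first-order meta-update

def adapted_loss(theta0, seed):
    """Post-adaptation test loss after one gradient step from theta0."""
    train, test = task(seed)
    return loss(theta0 - alpha * grad(theta0, train), test)

held_out = range(2000, 2100)
meta = float(np.mean([adapted_loss(theta, s) for s in held_out]))
naive = float(np.mean([adapted_loss(2.0, s) for s in held_out]))
```

After meta-training, one inner step from the learned initialization fits a held-out task better than the same step from the untrained starting point.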

SLIDE 37

About This Paper

  • This work is a simple model- and task-agnostic algorithm for meta-learning that trains a model's parameters such that a small number of gradient updates will lead to fast learning on a new task.
  • It applies to a variety of different learning problems:
  • classification
  • regression
  • reinforcement learning
SLIDE 38

Meta-Learning for Low-Resource Neural Machine Translation

Jiatao Gu, Yong Wang, Yun Chen, Kyunghyun Cho and Victor O.K. Li (The University of Hong Kong, New York University), EMNLP 2018

SLIDE 39

Author

  • 4th-year Ph.D. student at the University of Hong Kong
  • Former visiting scholar at the CILVR lab, New York University
  • Received a Bachelor's degree from Tsinghua University in 2014
  • Research interests
  • Machine Translation
  • Natural Language Processing
  • Deep Learning
  • 2018 Papers
  • NAACL (1), AAAI (2), ICLR (1), EMNLP (1)
SLIDE 40

Meta Learning

  • Meta-learning tries to solve the problem of "fast adaptation on new training data."
  • One of the most successful applications of meta-learning has been few-shot (or one-shot) learning.
  • Two categories of meta-learning:
  • learning a meta-policy for updating model parameters
  • learning a good parameter initialization for fast adaptation

SLIDE 41

MAML

  • Extend the recently introduced model-agnostic meta-learning (MAML) algorithm to low-resource neural machine translation (NMT).
  • Task:
  • view language pairs as separate tasks
  • use MAML to find an initialization of the model parameters that facilitates fast adaptation to a new language pair with a minimal amount of training examples

SLIDE 42

Meta Learning for LR-NMT

  • Source tasks {T_1, T_2, ..., T_K}: T_1: German→English, T_2: French→English, ..., T_k: Dutch→English, ..., T_K: Polish→English
  • Target task T_0: Turkish→English

SLIDE 43

Meta Learn

  • Sample one task T_k from the source tasks {T_1, T_2, ..., T_K}, e.g. T_k: Dutch→English.
  • Sample a train dataset D_{T_k} and a test dataset D'_{T_k} from that task; both contain Dutch→English sentence pairs.

SLIDE 44

Meta Learn

  • With the sampled train set D_{T_k} and test set D'_{T_k} (Dutch→English sentence pairs):
  • Inner update: NMT(θ) → NMT(θ_k'), where θ_k' = θ − α ∇_θ L^{D_{T_k}}(θ)
  • MAML meta-update: evaluate the adapted model on the test set and update θ using the meta-gradient of L^{D'_{T_k}}(θ_k')

SLIDE 45

Meta Learn

  • Repeat over all source tasks (T_1: German→English, T_2: French→English, ..., T_k: Dutch→English, ..., T_K: Polish→English).
  • MAML meta-update, averaged over the sampled tasks: θ ← θ − β ∇_θ Σ_k L^{D'_{T_k}}(θ_k')

SLIDE 46

Learn

  • Target task T_0: Turkish→English, fine-tuned from the meta-learned initial parameters θ_0.
  • Objective function, given θ_0: the maximum-likelihood criterion often used for training a usual NMT system, plus a term that discourages the newly learned model from deviating too much from the initial parameters.
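A minimal sketch of such an objective on a toy problem (the quadratic `nll`, the weight `lam` and all numbers are illustrative stand-ins, not the paper's actual loss):

```python
import numpy as np

theta0 = np.array([1.0, -0.5])        # meta-learned initialization

MLE_OPTIMUM = np.array([3.0, 0.5])    # where the pure-MLE fit would land

def nll(theta):
    """Stand-in for the target-task negative log-likelihood:
    a quadratic bowl centered on the pure-MLE solution."""
    return np.sum((theta - MLE_OPTIMUM) ** 2)

def finetune_loss(theta, lam=1.0):
    """MLE term plus a proximity term that discourages the adapted
    model from deviating too far from theta0."""
    return nll(theta) + lam * np.sum((theta - theta0) ** 2)

# Gradient descent on the regularized objective (lam = 1).
theta = theta0.copy()
for _ in range(300):
    g = 2 * (theta - MLE_OPTIMUM) + 2 * (theta - theta0)
    theta -= 0.05 * g
# With lam = 1 the minimizer sits midway between the MLE optimum
# and theta0, i.e. at (2.0, 0.0).
```

Larger `lam` pulls the solution toward θ_0; smaller `lam` recovers plain maximum-likelihood fine-tuning.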

SLIDE 47

Meta Learning for LR-NMT

  • Source tasks {T_1, T_2, ..., T_K}: T_1: German→English, T_2: French→English, ..., T_k: Dutch→English, ..., T_K: Polish→English
  • Target task T_0: Turkish→English

SLIDE 48

Transfer vs. Multilingual vs. Meta

  • Transfer learning
  • trains an NMT system specifically for a source language pair (Es-En) and fine-tunes the system for each target language pair (Ro-En, Lv-En)
  • Multilingual learning
  • trains a single NMT system that can handle many different language pairs (Fr-En, Pt-En, Es-En)
  • Meta learning
  • trains the NMT system to be useful for fine-tuning on various tasks, including the source and target tasks

SLIDE 49

Unified Lexical Representation

  • Problem
  • vocabulary mismatch across different languages
  • Method
  • Universal Neural Machine Translation for Extremely Low Resource Languages (NAACL 2018)

[Figure: language-specific vocabularies |V_1|, |V_2|, ..., |V_k| are mapped through query and key embeddings into a shared universal embedding space]

SLIDE 50

Experiment

  • Dataset (all to English)
  • Source tasks (18)
  • Bulgarian (Bg), Czech (Cs), Danish (Da), German (De), Greek (El), Spanish (Es), Estonian (Et), French (Fr), Hungarian (Hu), Italian (It), Lithuanian (Lt), Dutch (Nl), Polish (Pl), Portuguese (Pt), Slovak (Sk), Slovene (Sl), Swedish (Sv) and Russian (Ru)
  • Target tasks (5)
  • Romanian (Ro) from WMT'16
  • Latvian (Lv), Finnish (Fi), Turkish (Tr) from WMT'17
  • Korean (Ko) from the Korean Parallel Dataset
  • Validation (dev)
  • either Ro-En or Lv-En as a validation set for meta-learning

SLIDE 51

Model

  • Transformer
  • d_model = d_hidden = 512
  • N_layer = 6
  • N_head = 8
  • N_batch = 4000
  • T_warmup = 16000
  • Universal lexical representation(ULR)
SLIDE 52

Learning

  • Single gradient step of language-specific learning with Adam.
  • For each target task, sample training examples to form a low-resource task.
  • Build tasks of 4k, 16k, 40k and 160k English tokens for each language.
  • Randomly sample the training set five times for each experiment and report the average score.
  • Each fine-tuning is done on a training set, early-stopped on a validation set and evaluated on a test set.

SLIDE 53

Fine-Tuning Strategies

  • Update all three modules during meta learning.
  • Fine-tuning:
  • fine-tuning all the modules (all)
  • fine-tuning the embedding and encoder, but freezing the parameters of the decoder (emb+enc)
  • fine-tuning the embedding only (emb)
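The three strategies can be expressed as a mapping from strategy name to the parameter groups that remain trainable (the module and parameter names below are hypothetical; a real system would enumerate the Transformer's actual modules):

```python
# Hypothetical parameter names for the three modules.
PARAMS = {
    "emb": ["src_embed.weight", "tgt_embed.weight"],
    "enc": ["encoder.layer0.w", "encoder.layer1.w"],
    "dec": ["decoder.layer0.w", "decoder.layer1.w"],
}

# Which module groups each fine-tuning strategy keeps trainable.
STRATEGIES = {
    "all":     ("emb", "enc", "dec"),
    "emb+enc": ("emb", "enc"),
    "emb":     ("emb",),
}

def trainable_params(strategy):
    """Parameter names updated under a strategy; everything else stays frozen."""
    return [name for group in STRATEGIES[strategy] for name in PARAMS[group]]
```

Under "emb+enc", for example, the decoder's parameters receive no gradient updates during fine-tuning.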
SLIDE 54

vs. Multilingual Transfer Learning
  • Significantly outperforms the multilingual transfer learning strategy across all the target tasks, regardless of which target task was used for early stopping.
  • The emb+enc strategy is most effective for both the meta-learning and transfer learning approaches.
  • The choice of validation task has a non-negligible impact on the final performance.
SLIDE 55

Training Set Size

  • The meta-learning approach is more robust to the drop in the size of the target task's training set.

SLIDE 56

Impact of Source Tasks

  • It is beneficial to use more source tasks.
  • The choice of source languages has different implications for different target languages.

SLIDE 57

Training Curves

  • Multilingual transfer learning rapidly saturates and eventually degrades, as the model overfits to the source tasks.

SLIDE 58

Sample Translations
SLIDE 59

Conclusion
  • Types of Meta-Learning Models
  • 1. Few Shots Meta-Learning
  • 2. Optimizer Meta-Learning
  • 3. Metric Meta-Learning
  • 4. Recurrent Model Meta-Learning
  • 5. Initializations Meta-Learning
  • Two categories of meta-learning
  • learning a meta-policy for updating model parameters
  • learning a good parameter initialization for fast adaptation

SLIDE 60

Thanks!