Me Meta Lear Learnin ing A Bri Brief Introduct ction Xiachong - PowerPoint PPT Presentation

Me Meta Lear Learnin ing A Bri Brief Introduct ction Xiachong Feng TG Ph.D. Student 2018-12-01

Ou Outline • Introduction to Meta Learning • Types of Meta-Learning Models • Papers: • � Optimization as a model for few-shot learning � ICLR2017 • � Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks � ICML2017 • � Meta-Learning for Low-Resource Neural Machine Translation � EMNLP2018 • Conclusion

Me Meta-lear learnin ing Reinforcement learning Machine Learning ��L�� L�� Meta Learning ��R�� Deep Learning �� +�� L�� D�� Meta Learning/Learning to learn https://zhuanlan.zhihu.com/p/28639662

Me Meta-lear learnin ing • Learning to learn �� • �� • �� I�� A�� • Meta learning �� AI �� Learning to Learn �� https://zhuanlan.zhihu.com/p/27629294

Ex Exampl ple Learner �� model �� • �� • �� • �� • �� • �� • �� • �� • �� • �� • � SGD/Adam Meta-learner �� Learner � Learning rate Dacay …… Meta learning Machine or Deep learning

Ty Types of Meta-Le Learn rning Mod Models • Humans learn following different methodologies tailored to specific circumstances. • In the same way, not all meta-learning models follow the same techniques. • Types of Meta-Learning Models 1. Few Shots Meta-Learning 2. Optimizer Meta-Learning 3. Metric Meta-Learning 4. Recurrent Model Meta-Learning 5. Initializations Meta-Learning What’s New in Deep Learning Research: Understanding Meta-Learning

Fe Few Shots Meta ta-Le Learn rning • Create models that can learn from minimalistic datasets mimicking --> (learn from tiny data) • Papers • Optimization As A Model For Few Shot Learning � ICLR2017 � • One-Shot Generalization in Deep Generative Models � ICML2016 � • Meta-Learning with Memory-Augmented Neural Networks � ICML2016 �

Op Optimizer Meta-Le Learn rning • Task: Learning how to optimize a neural network to better accomplish a task. • There is one network (the meta-learner) which learns to update another network (the learner) so that the learner effectively learns the task. • Papers: • Learning to learn by gradient descent by gradient descent (NIPS 2016) • Learning to Optimize Neural Nets

Me Metri ric Me Meta-Le Learn rning • To determine a metric space in which learning is particularly efficient. This approach can be seen as a subset of few shots meta-learning in which we used a learned metric space to evaluate the quality of learning with a few examples • Papers: • Prototypical Networks for Few-shot Learning(NIPS2017) • Matching Networks for One Shot Learning(NIPS2016) • Siamese Neural Networks for One-shot Image Recognition • Learning to Learn: Meta-Critic Networks for Sample Efficient Learning

Re Recurrent Model Meta-Le Learn rning • The meta-learner algorithm will train a RNN model will process a dataset sequentially and then process new inputs from the task • Papers: • Meta-Learning with Memory-Augmented Neural Networks • Learning to reinforcement learn • !" # : Fast Reinforcement Learning via Slow Reinforcement Learning

Initializ Initializatio tions ns Meta-Le Learn rning • Optimized for an initial representation that can be effectively fine-tuned from a small number of examples • Papers: • Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks � ICML 2017 � • Meta-Learning for Low-Resource Neural Machine Translation � EMNLP2018 �

Pa Papers Few Shots Meta-Learning � Recurrent Model Meta- Learning � Optimizer Meta-Learning � Initializations Meta-Learning � Supervised Meta Learning Optimization As a Model For Few Shot Learning (ICLR2017) Modern Meta Learning Meta Learning in NLP Model-Agnostic Meta-Learning for Meta-Learning for Low-Resource Fast Adaptation of Deep Networks Neural Machine Translation (ICML2017) (EMNLP2018)

Op Optimization on As a Mod odel For or Few Sh Shot ot Le Lear arnin ing g Twitter, Sachin Ravi, Hugo Larochelle ICLR2017 Few Shots Meta-Learning • Recurrent Model Meta-Learning • Optimizer Meta-Learning • Supervised Meta Learning • Initializations Meta-Learning •

Fe Few Shots Learning • Given a tiny labelled training set ! � which has " examples, ! = $ % , ' % , … $ ) , ' ) , • In classification problem: • * − ,ℎ./ Learning • " classes • * labelled examples( * is always less than 20)

LSTM TM-Ce Cell state update forgetting the things we decided to forget earlier new cell state old cell state new candidate values �� https://www.jianshu.com/p/9dc9f41f0b29

Su Supervised l learn rning �� NN �� Optimizer �� • SGD �� • Adam �� • …… �� • �� • !(#) → & image label

Me Meta l learn rning • Meta-learning suggests framing the learning problem at two levels. (Thrun, 1998; Schmidhuber et al., 1997) • The first is quick acquisition of knowledge within each separate task presented. (Fast adaption) • This process is guided by the second, which involves slower extraction of information learned across all the tasks.(Learning)

Mot Motivation on • Deep Learning has shown great success in a variety of tasks with large amounts of labeled data. • Gradient-based optimization (momentum, Adagrad, Adadelta and ADAM) in high capacity classifiers requires many iterative steps over many examples to perform well. • Start from a random initialization of its parameters. • Perform poorly on few-shot learning tasks. Is there an optimizer can finish the optimization task using just few examples?

Me Method od LSTM cell-state update � Gradient based update � Propose an LSTM based meta-learner model to learn the exact optimization algorithm used to train another learner neural network classifier in the few-shot regime.

LSTM-based meta-learner Me Method od optimizer that is trained to optimize a learner neural network classifier. Current parameter ! "#$ Gradient ∇ & '() ℒ Meta-learner Learner Learn optimization algorithm Neural network classifier New parameter ! " Gradient-based optimization: Meta-learner optimization: ! " = metalearner(! "#$ , ∇ & '() ℒ) knowing how to quickly optim the parameters

Mod Model Given by learner Given by learner

Ta Task Description episode Used to train learner Used to train meta-learner

Tr Training • Example: 5 classes, 1 shot learning • & "'()* , & ",-" ← Random dataset from & /,"(#"'()* Loss ℒ Learner Neural network classifier ( ! "#$ ) Gradient ∇ 1 234 ℒ Loss ℒ Meta-learner Output of Current param ! "#$ Learn optimization meta learner algorithm( Θ 6#$ ) Gradient ∇ 1 234 ℒ 7 " Output of Learner Learner Update learner meta learner Neural network Update 7 " classifier ( ! " ) Learner Meta-Learner Neural network Loss ℒ ",-" Update classifier ( ! " ) Θ 6 = Θ 6#$ − :∇ ; <34 ℒ ",-"

Initializ Initializatio tions ns Meta-Le Learn rning • Initial value of the cell state ! " • Initial weights of the classifier # " • ! " = # " • Learning this initial value lets the meta-learner determine the optimal initial weights of the learner

Te Testing • Example: 5 classes, 1 shot learning • ' #()*+ , ' #-.# ← Random dataset from ' 0-#)$1231 Loss ℒ Learner (Init with ! " , Current ! #$% ) Gradient ∇ 5 678 ℒ Loss ℒ Meta-learner Output of Current param ! #$% learn optimization meta learner algorithm( Θ ) Gradient ∇ 5 678 ℒ : # Output of Learner Learner Update learner meta learner Neural network Update : # classifier( ! # ) Learner Testing Neural network Metric classifier

Tr Training Learner Update Meta-Learner Update

Tr Trick • Parameter Sharing • meta-learner to produce updates for deep neural networks, which consist of tens of thousands of parameters, to prevent an explosion of meta-learner parameters we need to employ some sort of parameter sharing. • Batch Normalization • Speed up learning of deep neural networks by reducing internal covariate shift within the learner’s hidden layers.

Me Meta Lear Learnin ing A Bri Brief Introduct ction Xiachong - PowerPoint PPT Presentation

Me Meta Lear Learnin ing A Bri Brief Introduct ction Xiachong Feng TG Ph.D. Student 2018-12-01 Ou Outline Introduction to Meta Learning Types of Meta-Learning Models Papers: Optimization as a model for few-shot

Me Meta Lear Learnin ing A Bri Brief Introduct ction Xiachong Feng Ou Outline

A Bri A Brief ef Hi Hist story ory A Br A Brief ief Hi Hist story ory A Bri A Brief

I . Preliminaries: practical matters I . Preliminaries: practical matters A. Office

Meta- Meta -Programming with Programming with Modelica Modelica for Meta- for Meta

Spelling, Punctuation and Grammar Suffixes -ing Year One SPaG | Suffixes -ing Suffixes Suffixes

LEAR C ONTIGUOUS A REAS A NALYSIS (CAA) M APPING R EFINEMENT LEAR Open House Presentation April

Trainin ing and Exercis ising th the Nucle lear Safety and Nucle lear Securit ity In

Ou Outstandin ding g Teachin ing, g, Learnin ing g and As Assessme sment nt (OTLA) A)

Ou Outstandin ding g Teachin ing, g, Learnin ing g and As Assessme sment nt (OTLA) A)

Health Credentials AM OSullivan PR March 2019 Introduct ction Expertise and services AM

CIT CITA PRESENTATIO ION 25 25.0 .09.2018 BRUNSWICK INT INTRODUCT CTION TO MOLA S T R E E

R in Grenoble DATA CHALLENGES Magali Richard & Florent Chuffart Introduct ction Data

2019 Annual Shareholders Meeting IN INTRODUCT CTION OF DIR IRECT CTORS Harold Lynn

Company y Backg ckground, Product ct Introduct ction, and Tech chnology y Roadmap Jan, 2017

June une 2018 2018 Introduct ction ons Steve Farmer, MD, F FACC, CC, F FASE ASE

Introduct ction to Semantic c Web Databases Prepared By: Amgad Madkour Ph.D. Candidate

Student Presentation: Mobile Technologies Talk: Mobile Technology GROUP to research, master

Web Security: Browsers CS 161: Computer Security Prof. David Wagner February 19, 2013

High Resolution Rapid Refresh Model Brian Blaylock and Dr. John Horel Department of Atmospheric

Having Fun with OpenCV Instructor - Simon Lucey 16-423 - Designing Computer Vision Apps ? 2010

SEARCH FOR NEUTRINOLESS DOUBLE BETA DECAY WITH GERDA Luciano Pandola INFN, Laboratori Nazionali

Derivative pricing in fractional SABR model Tai-Ho Wang Conference Honoring Jim Gatherals 60th

HIGHLIGHTS BUSINESS AREAS ACCOUNTS BROADBAND STRATEGY OUTLOOK AND OBJECTIVES 1

Domestic value added content of exports: a cross-country comparison for the major European

Me Meta Lear Learnin ing A Bri Brief Introduct ction Xiachong - PowerPoint PPT Presentation

Me Meta Lear Learnin ing A Bri Brief Introduct ction Xiachong Feng TG Ph.D. Student 2018-12-01 Ou Outline Introduction to Meta Learning Types of Meta-Learning Models Papers: Optimization as a model for few-shot

Me Meta Lear Learnin ing A Bri Brief Introduct ction Xiachong Feng Ou Outline

A Bri A Brief ef Hi Hist story ory A Br A Brief ief Hi Hist story ory A Bri A Brief

I . Preliminaries: practical matters I . Preliminaries: practical matters A. Office

Meta- Meta -Programming with Programming with Modelica Modelica for Meta- for Meta

Spelling, Punctuation and Grammar Suffixes -ing Year One SPaG | Suffixes -ing Suffixes Suffixes

LEAR C ONTIGUOUS A REAS A NALYSIS (CAA) M APPING R EFINEMENT LEAR Open House Presentation April

Trainin ing and Exercis ising th the Nucle lear Safety and Nucle lear Securit ity In

Ou Outstandin ding g Teachin ing, g, Learnin ing g and As Assessme sment nt (OTLA) A)

Ou Outstandin ding g Teachin ing, g, Learnin ing g and As Assessme sment nt (OTLA) A)

Health Credentials AM OSullivan PR March 2019 Introduct ction Expertise and services AM

CIT CITA PRESENTATIO ION 25 25.0 .09.2018 BRUNSWICK INT INTRODUCT CTION TO MOLA S T R E E

R in Grenoble DATA CHALLENGES Magali Richard &amp; Florent Chuffart Introduct ction Data

2019 Annual Shareholders Meeting IN INTRODUCT CTION OF DIR IRECT CTORS Harold Lynn

Company y Backg ckground, Product ct Introduct ction, and Tech chnology y Roadmap Jan, 2017

June une 2018 2018 Introduct ction ons Steve Farmer, MD, F FACC, CC, F FASE ASE

Introduct ction to Semantic c Web Databases Prepared By: Amgad Madkour Ph.D. Candidate

Student Presentation: Mobile Technologies Talk: Mobile Technology GROUP to research, master

Web Security: Browsers CS 161: Computer Security Prof. David Wagner February 19, 2013

High Resolution Rapid Refresh Model Brian Blaylock and Dr. John Horel Department of Atmospheric

Having Fun with OpenCV Instructor - Simon Lucey 16-423 - Designing Computer Vision Apps ? 2010

SEARCH FOR NEUTRINOLESS DOUBLE BETA DECAY WITH GERDA Luciano Pandola INFN, Laboratori Nazionali

Derivative pricing in fractional SABR model Tai-Ho Wang Conference Honoring Jim Gatherals 60th

HIGHLIGHTS BUSINESS AREAS ACCOUNTS BROADBAND STRATEGY OUTLOOK AND OBJECTIVES 1

Domestic value added content of exports: a cross-country comparison for the major European

R in Grenoble DATA CHALLENGES Magali Richard & Florent Chuffart Introduct ction Data