SLIDE 1

Learning to Learn Kernels with Variational Random Features

Xiantong Zhen*, Haoliang Sun*, Yingjun Du*, Jun Xu, Yilong Yin, Ling Shao, Cees Snoek

Presenter: Haoliang Sun

ICML | 2020

SLIDE 2

Meta-Learning (Learning to Learn)

[Diagram: a sequence of related tasks t1, t2, t3, … with corresponding datasets D1, D2, D3, …, and a new task t’ with dataset D’]

Ø Extract prior (meta) knowledge from related tasks (meta learner)
Ø Fast adaptation to a new task (base learner)

Ø Good parameter initialization (Finn et al., 2017)
Ø Efficient optimization update rules (Ravi et al., 2017)
Ø General feature extractors (Vinyals et al., 2016)
...

Meta Knowledge:

[Diagram: a meta learner extracts meta knowledge from base learners 1–3 and transfers it to a new learner. Caption: Meta-Learning.]

SLIDE 3

Few-Shot Learning (FSL) with Meta-Learning (ML)

Ø The episodic training-testing strategy
  • meta-training: a meta-learner is trained to enhance base-learners’ performance on the meta-training set with a batch of few-shot learning tasks
  • meta-testing: base-learners are evaluated on the meta-test set with novel categories of data
Ø An episode (task)
  • sample 𝐷-way 𝑙-shot classification tasks from the meta-training (testing) set
  • 𝑙 is the number of labelled examples for each of the 𝐷 classes
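The episode construction above can be sketched in a few lines of Python; the toy dataset, its labels, and the 15-query-per-class choice are illustrative assumptions, not fixed by the slides.

```python
import random

def sample_episode(dataset, D=5, l=1, query_per_class=15, rng=None):
    """Sample one D-way l-shot episode (task) from a labelled dataset.

    `dataset` maps class label -> list of examples; D classes are drawn,
    each contributing l support shots and held-out query examples.
    """
    rng = rng or random.Random()
    classes = rng.sample(sorted(dataset), D)           # pick D classes
    support, query = [], []
    for c in classes:
        examples = rng.sample(dataset[c], l + query_per_class)
        support += [(x, c) for x in examples[:l]]      # l labelled shots
        query += [(x, c) for x in examples[l:]]        # held-out queries
    return support, query

# toy dataset: 10 classes with 20 examples each (illustrative)
data = {c: [f"img_{c}_{i}" for i in range(20)] for c in range(10)}
S, Q = sample_episode(data, D=5, l=1, query_per_class=15,
                      rng=random.Random(0))
```

At meta-training time a batch of such episodes is drawn per iteration; at meta-testing time the classes come from the disjoint meta-test split.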
SLIDE 4

Few-Shot Learning (FSL) with Meta-Learning (ML)

[Figure: example of the few-shot learning setup (Ravi et al., 2017), showing Episode 1 and Episode 2]

SLIDE 5

An Effective Meta-Learning Scenario

Ø Base-learner:
  • be powerful to solve individual tasks
  • be able to absorb common information

Ø Meta-learner:
  • extract valid prior knowledge

Key idea:

Ø integrate kernel learning with random features and variational inference (VI) into the ML framework for FSL
Ø formulate the optimization as a VI problem by deriving a new ELBO
Ø a context inference puts the inference of random bases of the current task into the context of all previous, related tasks

SLIDE 6

Learning adaptive kernels with data-driven random Fourier features

Problem Statement

Meta-learning with kernels: a practical base-learner (kernel ridge regression)

For task t, with support set S_t = {(x_i, y_i)}, query set Q_t, base-learner f_t, loss L, and feature map φ inducing the kernel K = φ(X)φ(X)ᵀ on the support inputs X, the closed-form solution is α = (K + λI)⁻¹ Y, and the predictor on a query input x is ŷ = k(x, X) α.
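The closed-form base-learner can be sketched in numpy as follows; the RBF kernel, the λ value, and the toy sine task are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def krr_fit(K, Y, lam=1e-3):
    """Closed-form KRR solution on the support set: alpha = (K + lam*I)^{-1} Y."""
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), Y)

def krr_predict(K_qs, alpha):
    """Predictor on the query set: y_hat = K(query, support) @ alpha."""
    return K_qs @ alpha

# toy 10-shot regression task: y = sin(x), RBF kernel (illustrative)
rbf = lambda A, B: np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2)
xs = np.linspace(-3, 3, 10)           # support inputs
ys = np.sin(xs)                       # support targets
xq = np.linspace(-3, 3, 50)           # query inputs
alpha = krr_fit(rbf(xs, xs), ys)
yq = krr_predict(rbf(xq, xs), alpha)
```

Because adaptation reduces to one small linear solve per task, KRR is cheap enough to sit in the inner loop of meta-training.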

SLIDE 7

Problem Statement

Random Fourier Features (RFFs)

Ø learn adaptive kernels in a data-driven way
Ø leverage the shared knowledge by exploring dependencies among related tasks to generate rich features
Ø construct approximate translation-invariant kernels using explicit feature maps via random bases (Bochner’s theorem)

Learning data-driven adaptive kernels amounts to finding the posterior over the random bases, which is formulated as a variational inference problem.
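A minimal sketch of the random-feature construction under Bochner's theorem, assuming the bases are drawn from a fixed Gaussian spectral distribution (which recovers the RBF kernel); MetaVRF's point is to infer the bases per task rather than fix this distribution.

```python
import numpy as np

def rff_features(X, omega, b):
    """Explicit feature map z(x) = sqrt(2/D) * cos(x @ omega + b); inner
    products of z approximate a translation-invariant kernel."""
    D = omega.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ omega + b)

rng = np.random.default_rng(0)
d, D = 3, 5000                        # input dim, number of random bases
omega = rng.normal(size=(d, D))       # Gaussian spectral density <-> RBF kernel
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

x = rng.normal(size=d)
y = rng.normal(size=d)
approx = (rff_features(x[None], omega, b) @ rff_features(y[None], omega, b).T).item()
exact = float(np.exp(-0.5 * np.sum((x - y) ** 2)))   # RBF kernel value
```

The approximation error shrinks as O(1/sqrt(D)), which is why a learned, task-adapted spectral distribution can afford far fewer bases than a fixed one.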

SLIDE 8

Meta Variational Random Features (MetaVRF)

Ø The posterior is intractable; approximate it by using a meta variational distribution
Ø The Evidence Lower Bound (ELBO)
Ø The objective (maximizing the ELBO w.r.t. tasks)

Variational distribution: q(ω | S)

ELBO: log p(y | x, S) ≥ E_{q(ω|S)}[ log p(y | x, S, ω) ] − KL( q(ω|S) ‖ p(ω | x, S) )

The objective function: maximize the ELBO summed over all training tasks.
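The ELBO can be estimated by Monte Carlo with the reparameterization trick. This sketch assumes a diagonal-Gaussian variational distribution and, for simplicity, a standard-normal prior over the bases (MetaVRF's actual prior is conditioned on the task and context); the likelihood function is a stand-in, not the paper's predictive model.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) )."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)

def elbo(mu, logvar, log_likelihood, n_samples=8):
    """ELBO = E_q[log p(y | x, S, omega)] - KL(q(omega|S) || p(omega)),
    estimated with reparameterized samples omega = mu + sigma * eps."""
    sigma = np.exp(0.5 * logvar)
    ll = 0.0
    for _ in range(n_samples):
        omega = mu + sigma * rng.normal(size=mu.shape)  # differentiable sample
        ll += log_likelihood(omega)
    return ll / n_samples - gaussian_kl(mu, logvar)

# stand-in likelihood: Gaussian centred at omega = 1 (illustrative only)
toy_ll = lambda omega: -0.5 * np.sum((omega - 1.0) ** 2)
value = elbo(np.zeros(4), np.zeros(4), toy_ll)
```

Reparameterization keeps the sampled bases differentiable in (mu, logvar), so the per-task ELBOs can be summed and maximized end to end.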

SLIDE 9

Context Inference

Ø generate rich random bases to build strong kernels
Ø put the inference of bases of the current task into the context of all previous, related tasks
Ø the context of related tasks up to the t-th task

[Figure: the directed graphical model, in which the bases of the t-th task depend on the input x, the support set S_t, and the context dependency C]

SLIDE 10

An LSTM-Based Context Inference Network

Ø LSTM transformation with input of the support set and previous cell states
Ø shared MLPs for inference output the parameters of the variational distribution
Ø The optimization objective with the context inference
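A numpy sketch of the idea: a plain LSTM cell carries cell state across a sequence of tasks, and two shared linear heads (standing in for the shared MLPs) output the mean and log-variance of the variational distribution. All sizes, the weight initialization, and the pooled support-set embedding are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h, c, W):
    """One LSTM step: input/forget/output/candidate gates from [x; h]."""
    z = W @ np.concatenate([x, h])
    i, f, o, g = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

d_in, d_h, D = 8, 16, 4              # illustrative sizes
W = rng.normal(scale=0.1, size=(4 * d_h, d_in + d_h))
W_mu = rng.normal(scale=0.1, size=(D, d_h))     # shared heads that output
W_lv = rng.normal(scale=0.1, size=(D, d_h))     # the parameters of q

h, c = np.zeros(d_h), np.zeros(d_h)
for t in range(3):                   # a sequence of related tasks
    s_t = rng.normal(size=d_in)      # pooled embedding of task t's support set
    h, c = lstm_cell(s_t, h, c, W)   # cell state carries context across tasks
    mu, logvar = W_mu @ h, W_lv @ h  # parameters of q(omega | S_t, context)
```

The recurrent cell state is what "puts" each task's inference into the context of the previous, related tasks.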

SLIDE 11

SLIDE 12

Experiments

Ø Few-Shot Regression

  • Fitting a target sine function

Ø Few-Shot Classification

  • Three benchmarks

Ø Further analysis

  • Deep embedding
  • Efficiency
  • Versatility
SLIDE 13

Evaluation: Few-Shot Regression

Figure 1: fitting a target sine function. Reported MSE values:
  • 3-shot: 1.913, 1.072, 0.722, 0.700
  • 5-shot: 0.415, 0.063, 0.047, 0.022
  • 10-shot: 0.294, 0.024, 0.009, 0.003

SLIDE 14

Evaluation: Few-Shot Classification

SLIDE 15

Evaluation: Few-Shot Classification

SLIDE 16

Further Analysis

SLIDE 17

Further Analysis

SLIDE 18

Further Analysis

SLIDE 19

Conclusion

v A novel meta-learning framework, MetaVRF, which introduces RFFs into the meta-learning framework and leverages VI to infer the spectral distribution in a data-driven way.
v The LSTM-based context inference explores the shared knowledge and generates rich random features.
v Achieves state-of-the-art performance.
v Learned kernels exhibit high representational power at a low spectral sampling rate.
v Robust and flexible under a great variety of testing conditions.

SLIDE 20