Laplacian Regularized Few Shot Learning (LaplacianShot) - PowerPoint PPT Presentation


SLIDE 1

Laplacian Regularized Few Shot Learning (LaplacianShot)

Imtiaz Masud Ziko, Jose Dolz, Eric Granger and Ismail Ben Ayed

ETS Montreal

SLIDE 2

Overview

Few-Shot Learning
  • What and why?
  • Brief discussion on existing approaches

Proposed LaplacianShot
  • The context
  • Proposed formulation
  • Optimization
  • Proposed Algorithm

Experiments
  • Experimental Setup
  • SOTA results on 5 different few-shot benchmarks

SLIDE 3

Few-Shot Learning (An example)

SLIDE 4

Few-Shot Learning (An example)

  • Given C = 5 classes
  • Each class c having 1 example (5-way 1-shot)

Learn a model from these (the support examples) to classify this (the query).

SLIDE 5

Few-Shot Learning (An example)

  • Given C = 5 classes
  • Each class c having 1 example (5-way 1-shot)

Learn a model from these (the support examples) to classify this (the query).

SLIDE 6

Few-Shot Learning

Humans recognize perfectly from just a few examples.

SLIDE 7

Few-Shot Learning

❏ Modern ML methods generalize poorly.
❏ Need a better way.

SLIDE 8

Few-shot learning

A very large body of recent work, mostly based on the meta-learning framework.

SLIDE 9

Meta-Learning Framework

SLIDE 10

Meta-Learning Framework

Training set with enough labeled data (base classes different from the test classes).

SLIDE 11

Meta-Learning Framework

Training set with enough labeled data to learn an initial model.

SLIDE 12

Meta-Learning Framework

Create episodes and do episodic training to learn the meta-learner.

Vinyals et al. (NeurIPS '16), Snell et al. (NeurIPS '17), Sung et al. (CVPR '18), Finn et al. (ICML '17), Ravi et al. (ICLR '17), Lee et al. (CVPR '19), Hu et al. (ICLR '20), Ye et al. (CVPR '20), ...

SLIDE 13

Taking a few steps backward...

Recently [Chen et al., ICLR '19; Wang et al., '19; Dhillon et al., ICLR '20]:

Simple baselines outperform the overly convoluted meta-learning based approaches.

SLIDE 14

Baseline Framework

No need to meta-train

SLIDE 15

Baseline Framework

Simple conventional cross-entropy training. The approaches mostly differ during inference.

SLIDE 16

Inductive vs Transductive inference

Inductive: predict for each query/test point one at a time, given the support examples.

Vinyals et al., NeurIPS '16 (attention mechanism); Snell et al., NeurIPS '17 (nearest prototype)

SLIDE 17

Inductive vs Transductive inference

Transductive: predict for all test points together, instead of one at a time.

Liu et al., ICLR '19 (label propagation); Dhillon et al., ICLR '20 (transductive fine-tuning)

SLIDE 18

Proposed LaplacianShot

  • Latent assignment matrix for the N query samples: Y = [y_1, ..., y_N]
  • Label assignment for each query q: y_q = [y_q1, ..., y_qC]
  • And simplex constraints: y_qc ≥ 0, Σ_c y_qc = 1

Laplacian-regularized objective:

E(Y) = Σ_q y_qᵀ a_q + (λ/2) Σ_{q,p} w(x_q, x_p) ‖y_q − y_p‖²

where a_q collects the distances from query x_q to the class prototypes.
SLIDE 19

Proposed LaplacianShot

Laplacian-regularized objective:

Nearest-prototype classification term: similar to ProtoNet (Snell '17) or SimpleShot (Wang '19).

Laplacian regularization term: well known with the graph Laplacian, e.g. spectral clustering (Shi & Malik '00, Von Luxburg '07), SLK (Ziko '18), SSL (Weston '12, Belkin '06).

SLIDE 20

LaplacianShot Takeaways

✓ SOTA results without bells and whistles.
✓ Simple constrained graph clustering works very well.
✓ No network fine-tuning, nor meta-learning.
✓ Fast transductive inference: almost inductive time.
✓ Model agnostic.

SLIDE 21

LaplacianShot: More Details

SLIDE 22

Proposed LaplacianShot

Laplacian-regularized objective: the nearest-prototype classification term.

  • Feature embedding:
  • The prototype can be:
    • the support example in 1-shot, or
    • the simple mean of the support examples, or
    • a weighted mean of both the support and the initially predicted query samples.

Labeling according to the nearest support prototypes.
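The nearest-prototype term can be sketched in a few lines of NumPy (function and variable names here are illustrative, not from the slides; prototypes are taken as simple class means of the support embeddings):

```python
import numpy as np

def class_prototypes(z_s, y_s, n_classes):
    """Prototype m_c = mean of the support embeddings of class c
    (this is just the support example itself in the 1-shot case)."""
    return np.stack([z_s[y_s == c].mean(axis=0) for c in range(n_classes)])

def nearest_prototype(z_q, protos):
    """Unary costs a_q(c) = squared distance of query q to prototype c,
    and the resulting nearest-prototype labels."""
    a = ((z_q[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return a, a.argmin(axis=1)
```

Without the Laplacian term, these labels are exactly the SimpleShot-style predictions.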

SLIDE 23

Proposed LaplacianShot

Laplacian-regularized objective: the Laplacian regularization term.

Well known with the graph Laplacian: encourages nearby points (under the pairwise similarity) to have similar assignments.
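One standard way to instantiate the pairwise similarities is a binary k-nearest-neighbour graph over the query embeddings; a minimal sketch, assuming binary weights (other kernel choices are possible):

```python
import numpy as np

def knn_affinity(z, k=3):
    """Binary kNN affinities: w_qp = 1 if x_p is among the k nearest
    neighbours of x_q, else 0 (no self-edges)."""
    d = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d, np.inf)                # exclude self-similarity
    w = np.zeros_like(d)
    nn = np.argsort(d, axis=1)[:, :k]          # k closest points per row
    w[np.arange(len(z))[:, None], nn] = 1.0
    return w
```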

SLIDE 24

Proposed Optimization

The Laplacian-regularized objective is tricky to optimize due to:

SLIDE 25

Proposed Optimization

Tricky to optimize due to:
✖ Simplex/integer constraints.

SLIDE 26

Proposed Optimization

Tricky to optimize due to:
✖ Laplacian over discrete variables.

SLIDE 27

Proposed Optimization

Relax the integer constraints:
➢ Convex quadratic problem
✖ Requires solving for all N×C variables together
✖ Extra projection steps for the simplex constraints

SLIDE 28

Proposed Optimization

We do:
✓ Concave relaxation
✓ Independent and closed-form updates for each assignment variable
✓ Efficient bound optimization

SLIDE 29

Concave Laplacian

SLIDE 30

Concave Laplacian

‖y_q − y_p‖² = 0 when the two (one-hot) assignments are equal.

SLIDE 31

Concave Laplacian

‖y_q − y_p‖² = 2 when the assignments are not equal.

SLIDE 32

Concave Laplacian

Summing over pairs: (1/2) Σ_{q,p} w_qp ‖y_q − y_p‖² = Σ_q d_q − Σ_{q,p} w_qp y_qᵀ y_p, where d_q = Σ_p w_qp is the degree of point q.

SLIDE 33

Concave Laplacian

Remove the constant terms: up to the constant Σ_q d_q, the Laplacian term equals − Σ_{q,p} w_qp y_qᵀ y_p.

SLIDE 34

Concave Laplacian

− Σ_{q,p} w_qp y_qᵀ y_p = −tr(Yᵀ W Y) is concave for a PSD affinity matrix W.
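The steps on slides 30-34 can be assembled in one place (a reconstruction from the fragments above, assuming one-hot assignment vectors and a symmetric affinity matrix W):

```latex
% For one-hot assignment vectors y_q (so \|y_q\|^2 = 1):
\|y_q - y_p\|^2 = \|y_q\|^2 + \|y_p\|^2 - 2\, y_q^\top y_p
= \begin{cases} 0 & \text{if } y_q = y_p \\ 2 & \text{if } y_q \neq y_p \end{cases}

% Summing with pairwise weights w_{qp}, with degrees d_q = \sum_p w_{qp}:
\tfrac{1}{2} \sum_{q,p} w_{qp} \|y_q - y_p\|^2
= \sum_q d_q - \sum_{q,p} w_{qp}\, y_q^\top y_p

% Dropping the constant \sum_q d_q leaves
- \sum_{q,p} w_{qp}\, y_q^\top y_p = -\operatorname{tr}(Y^\top W Y),
% which is concave whenever W is positive semi-definite.
```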

SLIDE 35

Concave-Convex relaxation

Convex barrier function:
  • Avoids extra dual variables for the non-negativity constraints
  • Closed-form update for the simplex-constraint dual

Putting it all together

SLIDE 36

Bound optimization

First-order approximation of the concave term, keeping the unary term fixed.

SLIDE 37

Bound optimization

We get a tight iterative upper bound, which we optimize iteratively.

SLIDE 38

Bound optimization

Independent upper bound: the bound decomposes over the assignment variables, one per query point.

SLIDE 39

Bound optimization

Minimizing the independent upper bound: the KKT conditions bring closed-form updates.
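These closed-form updates amount to a softmax per query point; a minimal sketch (names are mine, and the exact constant absorbed into the regularization weight `lam` depends on how the Laplacian term is normalized):

```python
import numpy as np

def _softmax(logits):
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def bound_updates(a, w, lam=1.0, n_iter=20):
    """Iterative closed-form updates from the bound optimization.
    a: (N, C) unary costs (distances to prototypes); w: (N, N) affinities.
    Each row y_q is updated independently given the previous iterate:
        y_qc  proportional to  exp(-a_qc + lam * sum_p w_qp * y_pc)."""
    y = _softmax(-a)                 # initialize from the unary term alone
    for _ in range(n_iter):
        y = _softmax(-a + lam * (w @ y))
    return y
```

Each iteration is one matrix product plus a softmax, which is why the transductive inference runs in almost inductive time.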

SLIDE 40

LaplacianShot Algorithm
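Putting the pieces together, inference on one few-shot task might look like the following end-to-end sketch (all names are illustrative assumptions; the official implementation is linked on the final slide):

```python
import numpy as np

def softmax(logits):
    logits = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def laplacian_shot(z_s, y_s, z_q, n_classes, k=3, lam=1.0, n_iter=20):
    """LaplacianShot inference sketch: no fine-tuning, no meta-learning.
    z_s, y_s: support embeddings and labels; z_q: query embeddings."""
    # 1. Prototypes: mean of the support embeddings per class
    m = np.stack([z_s[y_s == c].mean(axis=0) for c in range(n_classes)])
    # 2. Unary costs: squared distance of each query to each prototype
    a = ((z_q[:, None, :] - m[None, :, :]) ** 2).sum(axis=-1)
    # 3. Binary kNN affinities among the query points
    d = ((z_q[:, None, :] - z_q[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d, np.inf)
    w = np.zeros_like(d)
    w[np.arange(len(z_q))[:, None], np.argsort(d, axis=1)[:, :k]] = 1.0
    # 4. Iterative closed-form bound updates, then hard labels
    y = softmax(-a)
    for _ in range(n_iter):
        y = softmax(-a + lam * (w @ y))
    return y.argmax(axis=1)
```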

SLIDE 41

Experiments

Datasets:

1. miniImageNet
2. tieredImageNet
3. CUB-200-2011
4. iNat

Generic classification:
  • miniImageNet splits: 64 base, 16 validation and 20 test classes
  • tieredImageNet splits: 351 base, 97 validation and 160 test classes

Fine-grained classification (CUB):
  • Splits: 100 base, 50 validation and 50 test classes

SLIDE 42

Experiments

Datasets:

1. miniImageNet
2. tieredImageNet
3. CUB-200-2011
4. iNat

Evaluation protocol:
  • 5-way 1-shot/5-shot
  • 15 query samples per class (N = 75)
  • Average accuracy over 10,000 few-shot tasks with 95% confidence interval
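The 5-way 1-shot protocol above can be made concrete with a task sampler (a hypothetical helper, not from the slides):

```python
import numpy as np

def sample_task(feats, labels, n_way=5, n_shot=1, n_query=15, seed=None):
    """Sample one few-shot task: n_way classes, with n_shot support and
    n_query query examples per class (N = n_way * n_query queries)."""
    rng = np.random.default_rng(seed)
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    xs, ys, xq, yq = [], [], [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.flatnonzero(labels == c))
        xs.append(feats[idx[:n_shot]])                  # support examples
        ys += [new_label] * n_shot
        xq.append(feats[idx[n_shot:n_shot + n_query]])  # query examples
        yq += [new_label] * n_query
    return (np.concatenate(xs), np.array(ys),
            np.concatenate(xq), np.array(yq))
```

Averaging accuracy over many such sampled tasks gives the reported numbers and confidence intervals.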

SLIDE 43

Experiments

Datasets:

1. miniImageNet
2. tieredImageNet
3. CUB-200-2011
4. iNat

iNat:
  • More realistic and challenging
  • Recently introduced (Wertheimer & Hariharan, 2019)
  • Subtle class distinctions
  • Imbalanced class distribution with a variable number of supports/queries per class

SLIDE 44

Experiments

Datasets:

1. miniImageNet
2. tieredImageNet
3. CUB-200-2011
4. iNat

Evaluation protocol (iNat):
  • 227-way multi-shot
  • Top-1 accuracy averaged over the test images per class
  • Top-1 accuracy averaged over all the test images (mean)

SLIDE 45

Experiments

We do:
  • Cross-entropy training with the base classes
  • LaplacianShot during inference

SLIDE 46

Results (Mini-ImageNet)

SLIDE 47

Results (Mini-ImageNet)

SLIDE 48

Results (Tiered-ImageNet)

SLIDE 49

Results (CUB)

Cross Domain

SLIDE 50

Results (iNat)

SLIDE 51

Ablation: Choosing λ

SLIDE 52

Ablation: Convergence

SLIDE 53

Ablation: Average Inference time

Transductive

SLIDE 54

LaplacianShot Takeaways

✓ SOTA results without bells and whistles.
✓ Simple constrained graph clustering works very well.
✓ No network fine-tuning, nor meta-learning.
✓ Fast transductive inference: almost inductive time.
✓ Model agnostic: plugs in during inference with any training model, with gains of up to 4-5%!

SLIDE 55

Thank you

Code on: https://github.com/imtiazziko/LaplacianShot