One-shot Learning in Semantic Embedding and Data Augmentation


SLIDE 1

yanweifu@fudan.edu.cn

http://yanweifu.github.io

One-shot Learning in Semantic Embedding and Data Augmentation

Yanwei Fu (付彦伟), School of Data Science, Fudan University

SLIDE 2

One-shot Learning

Object categorization

Fei-Fei et al. A Bayesian Approach to Unsupervised One-Shot Learning of Object Categories. ICCV 2003.
Fei-Fei et al. One-Shot Learning of Object Categories. IEEE TPAMI 2006.

One-shot Learning:

“learning object categories from just a few images, by incorporating ‘generic’ knowledge which may be obtained from previously learnt models of unrelated categories”.

SLIDE 3

One-shot Learning by Semantic Embedding

Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Attribute Learning for Understanding Unstructured Social Activity”, ECCV 2012.
Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Learning Multi-modal Latent Attributes”, IEEE TPAMI 2014.
Fu et al. Semi-supervised Vocabulary-informed Learning. CVPR 2016 (oral).
Fu et al. Vocabulary-informed Zero-shot and Open-set Learning. IEEE TPAMI, to appear.

SLIDE 4

Attribute Learning Pipeline

Figure: shared attributes (stripes, tails) link classes such as zebra, horse, mule, and lion.

Lampert, C. H. Learning to detect unseen object classes by between-class attribute transfer. CVPR 2009
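The cited DAP pipeline can be sketched as a toy nearest-signature classifier; the attribute names, binary signatures, and predicted probabilities below are made up for illustration:

```python
import numpy as np

# Hypothetical binary attribute signatures for unseen classes
# (columns: "stripes", "tail").
signatures = {
    "zebra": np.array([1, 1]),
    "lion":  np.array([0, 1]),
}

def dap_classify(attr_probs, signatures):
    """Simplified Direct Attribute Prediction: score each unseen class by
    the log-likelihood of its attribute signature under the per-attribute
    probabilities p(a=1|x) predicted for the image."""
    best_cls, best_score = None, -np.inf
    for cls, sig in signatures.items():
        # p(a=1|x) where the signature says 1, 1 - p(a=1|x) where it says 0
        probs = np.where(sig == 1, attr_probs, 1.0 - attr_probs)
        score = np.sum(np.log(probs + 1e-12))
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls

# An image whose attribute classifiers report stripes=0.9, tail=0.8:
print(dap_classify(np.array([0.9, 0.8]), signatures))  # prints "zebra"
```

The full DAP model additionally normalizes by class attribute priors; that term is omitted here for brevity.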

SLIDE 5

Semantic Attributes in Zero/One-shot Learning

Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Attribute Learning for Understanding Unstructured Social Activity”, ECCV 2012.
Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Learning Multi-modal Latent Attributes”, IEEE TPAMI 2014.

SLIDE 6

Learning Multi-modal Latent Attributes

Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Attribute Learning for Understanding Unstructured Social Activity”, ECCV 2012.
Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. “Learning Multi-modal Latent Attributes”, IEEE TPAMI 2014.

SLIDE 7

Experimental Settings

Dataset & Settings:

  • USAA dataset (4 source classes, 4 target classes, multiple rounds of class splits);
  • Animals with Attributes (AwA) dataset (40 source classes; 10 target classes);

Comparisons

  • Direct: KNN/SVM from features to classes;
  • DAP: Direct Attribute Prediction [Lampert et al., CVPR 2009];
  • SVM-UD: an SVM generalization of DAP;
  • SCA: topic models in [Wang et al., CVPR 2009];
  • ST: Synthetic Transfer in [Yu et al., ECCV 2010];
SLIDE 8

Unstructured Social Activity Dataset (USAA)

  • Birthday party
  • Graduation
  • Music performance
  • Non-music performance
  • Parade
  • Wedding ceremony
  • Wedding dance
  • Wedding reception

SLIDE 9

One-shot Learning Results

For more results, please check our papers.

SLIDE 10

Vocabulary-informed Learning

Fu et al. Semi-supervised Vocabulary-informed Learning. CVPR 2016 (oral).
Fu et al. Vocabulary-informed Zero-shot and Open-set Learning. IEEE TPAMI, to appear.

SLIDE 11

Supervised Learning

Figure: semantic labels (airplane, car, unicycle, tricycle), each with many labeled instances in the visual feature space.

SLIDE 12

One-shot Learning

Figure: semantic labels (airplane, unicycle, tricycle, car), each with only one labeled instance in the visual feature space.

SLIDE 13

Zero/One-shot Learning by Semantic Embedding (Problem Definition)

Zero/one-shot learning: for new classes such as truck and bicycle, we have zero or one visually labeled instances of what they look like (semantic labels vs. visual feature space).

SLIDE 14

Learning

Figure: learning the mapping between semantic labels (airplane, car, unicycle, tricycle, truck, bicycle) and the visual feature space.

SLIDE 15

Inference

Figure: inference for novel labels (truck, bicycle) among the known classes. Key question: how do we define the semantic space?

SLIDE 16

Semantic Label Vector Spaces

  • Semantic attributes (supervised): good interpretability of each dimension; but require manual annotation and cover a limited vocabulary.
  • Semantic word vectors, e.g. word2vec (unsupervised): good vector representations for a vocabulary of millions of words; but limited interpretability of each dimension.
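With either space, zero-shot inference reduces to nearest-neighbor search among label vectors. A minimal sketch, where the toy 3-D vectors stand in for real word2vec embeddings (typically 300-D, trained on large corpora):

```python
import numpy as np

# Toy 3-D "word vectors" standing in for real word2vec embeddings.
label_vectors = {
    "truck":   np.array([1.0, 0.1, 0.0]),
    "bicycle": np.array([0.0, 1.0, 0.2]),
}

def nearest_label(projected, label_vectors):
    """Zero-shot inference sketch: project an image into the semantic
    space, then return the label whose vector has the highest cosine
    similarity with the projection."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(label_vectors, key=lambda lbl: cos(projected, label_vectors[lbl]))

print(nearest_label(np.array([0.9, 0.2, 0.1]), label_vectors))  # prints "truck"
```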

SLIDE 17
SLIDE 18

Vocabulary-Informed Recognition

Figure: an input image is mapped into the semantic space near related vocabulary (e.g. unicycle, tricycle).

Fu et al. Semi-supervised Vocabulary-informed learning, CVPR 2016 (Oral)

SLIDE 19

Estimating Density of Classes in the Space

Fu et al. Vocabulary-informed Zero-shot and Open-set Learning. IEEE TPAMI to appear

Margin distribution of prototypes in the semantic space

Knowledge of the margin distribution over instances, rather than a single margin across all instances, is crucial for improving the generalization performance of a classifier.

  • Instance margin: the distance between an instance and the separating hyperplane.
  • The distribution of the minimal values of the margin distance is characterized by a Weibull distribution (Extreme Value Theorem).
  • The probability that h(y) is included in the boundary estimated by h(y_j) turns the margin distribution of prototypes into a coverage distribution of prototypes.
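The coverage probability from the extreme-value argument can be sketched with a Weibull CDF; the shape and scale parameters below are hypothetical stand-ins for values fitted to minimal margin distances:

```python
import numpy as np

def weibull_cdf(x, shape, scale):
    """CDF of the Weibull distribution, the limiting law that extreme
    value theory assigns to minima of bounded random variables."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, 1.0 - np.exp(-(x / scale) ** shape), 0.0)

def inclusion_probability(dist, shape, scale):
    """Probability that a point at distance `dist` from a class prototype
    still falls inside that prototype's margin region: 1 - CDF(dist),
    so nearby points score high and distant points score low."""
    return 1.0 - weibull_cdf(dist, shape, scale)

# With hypothetical fitted parameters (shape=2, scale=1):
print(inclusion_probability(0.1, 2.0, 1.0))  # close to 1
print(inclusion_probability(3.0, 2.0, 1.0))  # close to 0
```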

SLIDE 20

Experimental Dataset and Tasks

Dataset:

  • AwA (Animals with Attributes) dataset;
  • ImageNet 2012/2010 dataset.

We can address the following tasks by learning a semantic embedding:

  • SUPERVISED recognition
  • ZERO-SHOT recognition
  • GENERAL-ZERO-SHOT recognition
  • ONE-SHOT recognition
  • OPEN-SET recognition
SLIDE 21

Experimental Settings of Few-shot Learning

  • Learning classifiers from few source training instances:
  • Source classes: one-shot recognition;
  • Target classes: zero-shot recognition;
  • Key insight: leveraging the knowledge from the semantic space (vocabulary-informed).
  • Few-shot target training instances:
  • Few-shot setting, consistent with the general definition.
SLIDE 22

Results on Few-shot Learning

Few shots on the source dataset.

SLIDE 23

Results on Few-shot Learning

SLIDE 24

One-shot learning aims to learn information about object categories from one, or only a few, training images.

Data Augmentation, Meta-Learning, Meta-Augmentation Learning

One-shot Learning by Data Augmentation

SLIDE 25

Multi-level Semantic Feature Augmentation for One-shot Learning

Zitian Chen, Yanwei Fu, Yinda Zhang, Yu-Gang Jiang, Xiangyang Xue, and Leonid Sigal. IEEE Transactions on Image Processing (TIP), 2019.

SLIDE 26

Motivation

  • A straightforward way to tackle one-shot learning is data augmentation.
  • We want to utilize the semantic space.
  • Related concepts in the semantic space help learning.

Figure: image feature space vs. semantic feature space; related classes (antelope, pronghorn, hartebeest; killer whale, orca, whale, sea lion; beaver, muskrat, woodchuck, badger; mountain goat) cluster together in the semantic space. Can they help?

SLIDE 27

Method

Figure: the method learns mappings g(y) and h(y) between the image feature space and the semantic feature space for these related classes.

SLIDE 28

Single-level

  • But we want to utilize visual concepts at different levels.
SLIDE 29

Multi-level

  • Use high-level and low-level features to help encoding.
  • Decoding semantic features back to features at different levels diversifies the augmented features.
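The encode-perturb-decode idea can be sketched with hypothetical linear maps standing in for the paper's learned encoder and decoder (dimensions and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear maps standing in for the learned encoder
# (features -> semantic space) and decoder (semantic space -> features).
D_feat, D_sem = 8, 4
W_enc = rng.normal(size=(D_sem, D_feat))
W_dec = rng.normal(size=(D_feat, D_sem))

def augment_features(x, n_aug=5, noise=0.1):
    """Semantic feature augmentation sketch: encode the single support
    feature into the semantic space, perturb it there (where related
    concepts lie close together), and decode every perturbation back
    into the feature space as a synthetic training example."""
    s = W_enc @ x                                    # encode
    perturbed = s + noise * rng.normal(size=(n_aug, D_sem))
    return perturbed @ W_dec.T                       # decode

x = rng.normal(size=D_feat)        # the one-shot support feature
print(augment_features(x).shape)   # (5, 8): five synthetic features
```

The multi-level variant simply attaches decoders at several feature levels so the synthetic examples are more diverse.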
SLIDE 30

Visualization

SLIDE 31

Image Deformation Meta-Networks for One-Shot Learning

Zitian Chen, Yanwei Fu, Yu-Xiong Wang, Lin Ma, Wei Liu, Martial Hebert

SLIDE 32

The Basic Idea of Jigsaw Augmentation Method

Image Block Augmentation for One-Shot Learning. Zitian Chen, Yanwei Fu, Kaiyu Chen, Yu-Gang Jiang. AAAI 2019

SLIDE 33

Visual content from other images may help to synthesize new images.

SLIDE 34

Ghosted, stitched, montaged, partially occluded.

Humans can learn novel visual concepts even when images undergo various deformations.

SLIDE 35

Deformed images: visual content from other images might be helpful.

SLIDE 36

Approach

SLIDE 37

Motivation: 1. Visual content from other images may help synthesize new images. 2. Humans can learn novel visual concepts even when images undergo various deformations. Approach: we design a deformation sub-network that learns to deform images by fusing a pair of images, a probe image that keeps the visual content and a gallery image that diversifies the deformations.
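A crude stand-in for the deformation sub-network, assuming its output can be approximated by per-pixel convex fusion of the pair (the real sub-network learns these weights; all shapes and values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def deform(probe, gallery, weights):
    """Fuse a probe image (keeps the visual content) with a gallery
    image (diversifies the deformations) using per-pixel weights."""
    return weights * probe + (1.0 - weights) * gallery

probe = np.ones((4, 4))      # toy probe image
gallery = np.zeros((4, 4))   # toy gallery image
# Hypothetical weight map biased toward the probe (the real network
# predicts these weights from the image pair).
w = rng.uniform(0.5, 1.0, size=(4, 4))
print(deform(probe, gallery, w).shape)  # (4, 4)
```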

SLIDE 38

Figure: the embedding sub-network finds a visually similar gallery image for each probe image; the deformation sub-network (ANET/BNET) concatenates the probe and gallery images to produce a deformed image.

SLIDE 39

Chart: Top-1 accuracies (%) on miniImagenet for the 1-shot and 5-shot settings, baseline vs. ours (range shown: 50 to 75%).

SLIDE 40

Figure: real probe image, deformed image, and real image; our deformations compared against Gaussian noise.

SLIDE 41

NeurIPS 2019

SLIDE 42

Hawk vs. Falcon (image source: https://birdeden.com/distinguishing-between-hawks-falcons)

SLIDE 43

Fine-grained Visual Recognition

  • Much harder than normal classification.
  • Difficult to collect data.
  • Crowdsourcing can't be used.
  • Expert annotators are needed.
  • This demands one-shot learning.
SLIDE 44

Can we generate more data?

  • How about state-of-the-art GANs?
  • Challenge: GAN training itself needs a lot of data.
SLIDE 45

Our Idea: Fine-tune GANs trained on ImageNet.

BigGAN

Figure: transfer generative knowledge from one million general images (BigGAN pretraining) to a single domain-specific image.

SLIDE 46

Fine-tune BigGAN with a single image

Figure: the original image and samples generated after fine-tuning.

SLIDE 47

Technical Point: Fine-tune Batch Norm Only

Figure: generated samples from the original model, after fine-tuning all parameters, and after fine-tuning BatchNorm only.
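One way to implement BatchNorm-only fine-tuning in PyTorch (a minimal sketch, not the paper's actual training code; the toy conv+BN model stands in for a pretrained generator):

```python
import torch.nn as nn

def freeze_all_but_batchnorm(model: nn.Module):
    """Freeze every parameter, then re-enable gradients only for the
    affine parameters (weight/bias) of BatchNorm layers; BN running
    statistics still update automatically in train mode."""
    for p in model.parameters():
        p.requires_grad = False
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            for p in m.parameters():
                p.requires_grad = True
    return [p for p in model.parameters() if p.requires_grad]

# Toy stand-in for a pretrained generator: only BN weight/bias stay trainable.
net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))
print(len(freeze_all_but_batchnorm(net)))  # 2
```

Passing only the returned parameter list to the optimizer keeps the rest of the generator fixed.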

SLIDE 48

Our idea: Meta-Augmentation Learning

Image Fusion Net F predicts the fusing weight w. Original: I; Generated: G(I); Fused: wI + (1 − w)G(I).

Use meta-learning to learn the best mixing strategy to help one-shot classifiers.

Learning to reinforce with the original image
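The fusion step itself is a convex combination of the original image and its GAN-generated variant; a minimal sketch with a fixed scalar weight standing in for the per-image weight the fusion net would predict:

```python
import numpy as np

def fuse(original, generated, w):
    """Convex combination of an original image and its GAN-generated
    variant: w * original + (1 - w) * generated. In the meta-learning
    setup the fusing weight is predicted per image; here it is a fixed
    scalar for illustration."""
    return w * original + (1.0 - w) * generated

orig_img = np.full((2, 2), 1.0)       # stand-in original image
gen_img = np.zeros((2, 2))            # stand-in generated image
print(fuse(orig_img, gen_img, 0.75))  # every entry 0.75
```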

SLIDE 49

Examples

SLIDE 50

Our method shows consistent improvements.

SLIDE 51

Embodied One-Shot Video Recognition: Learning from Actions of a Virtual Embodied Agent

Yuqian Fu, Chengrong Wang, Yanwei Fu, Yu-Xiong Wang, Cong Bai, Xiangyang Xue, Yu-Gang Jiang

ACM Multimedia 2019

SLIDE 52

One-Shot Learning Setting Revisited

Figure: visually similar clips labeled “shooting basketball” (source domain) and “running” (target domain).

  • Quite similar video clips may appear in both source and target classes.

P1D-09

SLIDE 53

Embodied One-Shot Video Recognition

SLIDE 54

Learning from Actions of a Virtual Embodied Agent

Figure: virtual environment → virtual embodied agent → virtual action videos.

https://www.unrealengine.com/marketplace/en-US/store

  • Learning from actions of virtual embodied agents to address the limitations.


SLIDE 55

Figure: virtual source data vs. real target data for action classes such as break dancing, throwing, and waving hand.

http://www.sdspeople.fudan.edu.cn/fuyanwei/dataset/UnrealAction/

UnrealAction Dataset

  • 14 action classes.
  • each class has 100 virtual videos and 10 real videos.


SLIDE 56

Embodied One-Shot Video Recognition

Figure: source/target class configurations compared across classical one-shot recognition, embodied one-shot recognition, domain adaptation, and transfer recognition.

SLIDE 57

Figure: a probe video and a gallery video sharing action label c are combined into a segment-augmented video.

Video Segment Augmentation Method

  • Subliminal advertising experiments.
  • Augmenting videos by replacing short segments.
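The replacement step can be sketched on a toy array of per-frame features (frame counts, feature dimensions, and segment indices below are illustrative):

```python
import numpy as np

def augment_video(probe, gallery, start, length):
    """Video segment augmentation sketch: copy the probe video (an
    array of per-frame features) and replace `length` consecutive
    frames, starting at `start`, with the same-class gallery video's
    frames."""
    out = probe.copy()
    out[start:start + length] = gallery[start:start + length]
    return out

probe = np.zeros((8, 4))     # 8 frames with toy 4-D frame features
gallery = np.ones((8, 4))    # a gallery video with the same action label
aug = augment_video(probe, gallery, start=3, length=2)
print(aug[:, 0])  # frames 3 and 4 now come from the gallery video
```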


SLIDE 58

Method

SLIDE 59

Video Segment Augmentation Method

Figure: a CNN segment-level feature extractor embeds the probe segments in W_probe and the gallery segments in the pool H_pool; a semantic correlation score matrix between probe and gallery segments, computed with a sliding window, determines which gallery segments are spliced into the probe video.

SLIDE 60

Framework

Figure: stage 1 trains a ProtoNet feature extractor on the base set E_base with video segment augmentation (gallery segments sampled from the pool H_pool); stage 2 fine-tunes on the novel set E_novel, again with video segment augmentation; testing samples n-way-k-shot episodes with one query video.
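The prototype-based classification used at test time can be sketched as follows (toy 2-D embeddings for a 2-way, 2-shot episode; class names are illustrative):

```python
import numpy as np

def protonet_classify(query, support, labels):
    """Prototypical-network inference sketch: each class prototype is
    the mean of its support embeddings, and the query receives the
    label of the nearest prototype (Euclidean distance)."""
    protos = {c: np.mean([s for s, l in zip(support, labels) if l == c], axis=0)
              for c in set(labels)}
    return min(protos, key=lambda c: np.linalg.norm(query - protos[c]))

support = [np.array([0.0, 0.0]), np.array([0.2, 0.0]),
           np.array([1.0, 1.0]), np.array([1.2, 1.0])]
labels = ["waving", "waving", "throwing", "throwing"]
print(protonet_classify(np.array([0.1, 0.1]), support, labels))  # "waving"
```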

SLIDE 61

Thanks very much!

yanweifu@fudan.edu.cn