On Leveraging Pretrained GANs for Generation with Limited Data - - PowerPoint PPT Presentation

SLIDE 1

Motivation Our Method Related Work Experiments Summary

On Leveraging Pretrained GANs for Generation with Limited Data

Miaoyun Zhao, Yulai Cong, Lawrence Carin

Duke University

August 11, 2020

SLIDE 2

Table of Contents

1. Motivation
2. Our Method
3. Related Work
4. Experiments
5. Summary

SLIDE 3

Motivation

[Figure: sample images generated from BigGAN (left) and StyleGAN (right).]

- GANs can generate highly realistic synthetic ("fake") images.
- They can augment training data with new, realistic samples, which is useful in settings with limited training data.
- However, training the GAN itself is challenging with limited data: it may yield overfitting or training/mode collapse.
- We propose to transfer additional information to facilitate GAN training with limited data, leveraging the valuable generalizable knowledge within GANs trained on different large-scale datasets.

SLIDE 4

Motivation

Key observations associated with generalizable knowledge.

For classification models pretrained on large-scale datasets:
- lower-level filters (those close to the observation x) are fairly general/transferable (Gabor-like)
- higher-level filters are more task-specific

[Diagram: the low-level feature extractor (frozen or fine-tuned) is transferred from source to target; the high-level classifier stays task-specific.]

For pretrained GAN generators:
- lower-level layers portray generally-applicable local patterns
- higher-level layers represent more specific semantic objects or object parts

It is data-demanding to train well-behaved low-level filters, so transfer often delivers better efficiency and performance.

SLIDE 5

Our Contributions

To better transfer common knowledge, we tailor the design of generators trained with limited data, starting from GANs pretrained on large-scale source datasets.

[Diagram: four generator variants.
(a) GP-GAN: the fully trainable source architecture (FC, twelve residual blocks, final convolution), split into high-level and low-level layers.
(b) GPHead (Transfer): the low-level general part is frozen; the high-level specific part (FC plus residual blocks) stays trainable.
(c) SmallHead (Tailor): the trainable specific part is replaced with a compact style-based head (FC, noise MLP, style blocks).
(d) Our (Adapt): as (c), but the frozen general part is additionally modulated with trainable AdaFM blocks.]

SLIDE 6

Table of Contents

1. Motivation
2. Our Method
3. Related Work
4. Experiments
5. Summary

SLIDE 7

Notation

- Within a GAN, there is a generator (actor) and a discriminator (critic).
- The "General-Part" of either the generator or the discriminator comprises those model layers that are generally applicable across a wide range of images.
- The "Specific-Part" of the generator or discriminator comprises layers that are specifically associated with a class of images.
- We seek to transfer the General-Part from GANs learned in data-rich settings to those for which there are limited data.
- The General-Part tends to be at and near the layers that touch the input (discriminator) or output (generator) image.

SLIDE 8

1. On Specifying the General-Part for Transfer

[Diagram: the four generator variants (a) GP-GAN, (b) GPHead (Transfer), (c) SmallHead (Tailor), (d) Our (Adapt), as in the contributions slide.]

Source model: the GP-GAN¹ pretrained on ImageNet.
Target dataset: the perceptually-distinct CelebA.

[Figure: sample images from ImageNet and CelebA.]

¹ Which training methods for GANs do actually converge? ICML 2018.

SLIDE 9

1. On Specifying the General-Part for Transfer

Generator. [Diagram: the generator runs from an FC layer at 4×4 (Group 1) through residual blocks up to a final convolution at 128×128 (Group 8); the transferred low-level groups nearest the output image are frozen, the rest stay trainable.]

FID on CelebA when the m lowest-level generator groups are frozen (GmD0):

  G2D0: 22.33   G4D0: 13.12   G5D0: 15.20   G6D0: 22.98

Discriminator. [Diagram: the discriminator runs from a convolution at 128×128 (Group 1) through residual blocks down to an FC real/fake output at 4×4 (Group 8); the low-level groups nearest the input image are frozen.]

FID when, on top of G4, the n lowest-level discriminator groups are frozen (G4Dn):

  G4D0: 13.12   G4D2: 11.14   G4D3: 13.99   G4D4: 25.08
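The GmDn notation above can be read as a trainability plan. Below is a minimal sketch, assuming eight groups per network and the group orderings shown in the diagrams; the helper `gmdn_plan` is hypothetical, not the authors' code:

```python
def gmdn_plan(m, n, total=8):
    """GmDn: freeze the m generator groups nearest the output image and
    the n discriminator groups nearest the input image."""
    # generator groups run from the latent (Group 1, FC) to the image
    # (Group 8, convolution), so the last m groups are the low-level part
    gen = {g: 'frozen' if g > total - m else 'trainable'
           for g in range(1, total + 1)}
    # discriminator groups run from the image (Group 1, convolution) to the
    # real/fake output (Group 8, FC), so the first n groups are low-level
    disc = {g: 'frozen' if g <= n else 'trainable'
            for g in range(1, total + 1)}
    return gen, disc

gen, disc = gmdn_plan(4, 2)  # the best-performing G4D2 setting
```

In the G4D2 setting this marks generator Groups 5-8 and discriminator Groups 1-2 as frozen, matching the best FID (11.14) reported above.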

SLIDE 10

2. On Tailoring the High-Level Specific-Part

[Diagram: the four generator variants, with the style block detailed as two repetitions of Noise → Convolution → AdaIN → LeakyReLU; the general part is frozen, the specific part trainable.]

Even with the G4D2 general-part, mode collapse may still happen on small data (Flowers, 8,189 images).

Style blocks deliver disentangled high-level attributes, enabling:
- efficient exploration of the underlying data manifold
- better generative quality
- style mixing
- cheaper computation

SLIDE 11

3. On Better Adaptation of the Transferred General-Part

[Diagram: the four generator variants, with the AdaFM block detailed as two repetitions of LeakyReLU → Convolution (AdaFM); the general part is frozen except for the trainable AdaFM parameters.]

We introduce adaptive filter modulation (AdaFM) to better adapt the transferred general-part to target domains and to relax the requirements on the general-part. Given a Conv filter W ∈ R^(C_out × C_in × K_1 × K_2), AdaFM uses learnable γ ∈ R^(C_out × C_in) and β ∈ R^(C_out × C_in) to modulate its statistics:

  W^AdaFM_{i,j,:,:} = γ_{i,j} W_{i,j,:,:} + β_{i,j}    (1)
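Equation (1) is a per-channel-pair affine modulation of a frozen filter. A minimal NumPy sketch with the shapes above; `adafm` is an illustrative helper, not the authors' implementation:

```python
import numpy as np

def adafm(W, gamma, beta):
    """Eq. (1): W_adafm[i, j] = gamma[i, j] * W[i, j] + beta[i, j].

    W:     (C_out, C_in, K1, K2) frozen, transferred conv filter
    gamma: (C_out, C_in) learnable scale
    beta:  (C_out, C_in) learnable shift
    """
    # broadcast the (C_out, C_in) modulation over the K1 x K2 kernel dims
    return gamma[:, :, None, None] * W + beta[:, :, None, None]

W = np.ones((2, 3, 5, 5))
out = adafm(W, np.full((2, 3), 2.0), np.full((2, 3), 1.0))
# every kernel entry becomes 2 * 1 + 1 = 3
```

With γ = 1 and β = 0, AdaFM reduces to the identity, so the transferred filter is recovered exactly; training only γ and β keeps the adaptation lightweight.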

SLIDE 12

3. On Better Adaptation of the Transferred General-Part

The underlying assumption:
- the basic shape/pattern within W_{i,j,:,:} is generally applicable
- the statistics/correlations among the i-, j-channels are target-specific

This is empirically verified in the experiments. Illustrative example: source and target filters share the same basic shape/pattern but differ in their among-channel correlations; AdaFM learns γ_{i,:} = [1/9, 9, 1] to adapt the source W_{i,:,:,:} to the target W^Target_{i,:,:,:}.
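The γ = [1/9, 9, 1] example above can be checked numerically. A small sketch with made-up filter values (β = 0), not data from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
W_src = rng.standard_normal((1, 3, 3, 3))   # source filter: C_out=1, C_in=3, 3x3 kernels
scale = np.array([1 / 9.0, 9.0, 1.0])       # per-input-channel rescaling

# target filter: same per-channel shapes, different among-channel correlations
W_tgt = scale[None, :, None, None] * W_src

# AdaFM with gamma_{i,:} = [1/9, 9, 1] and beta = 0 recovers the target exactly
gamma = scale[None, :]                      # shape (C_out, C_in) = (1, 3)
W_adapted = gamma[:, :, None, None] * W_src
assert np.allclose(W_adapted, W_tgt)
```

The point of the example: no new filter shapes need to be learned on the target data, only a per-channel-pair rescaling of the transferred ones.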

SLIDE 13

Table of Contents

1. Motivation
2. Our Method
3. Related Work
4. Experiments
5. Summary

SLIDE 14

Related Work

Exploit GANs to transfer knowledge for limited-data generation.

[Diagram comparing the transfer strategies:
- TransferGAN: both G and D are fine-tuned on the target data
- BSA: only the batch statistics of G are trainable, with an L1/perceptual loss
- MineGAN: trainable miner networks are prepended to the pretrained G
- FreezeD (concurrent): the general-part of D is frozen, the specific-part trainable
- Our: frozen general-part with trainable AdaFM plus a tailored trainable specific-part, in both G and D]

TransferGAN: Transferring GANs: generating images from limited data. ECCV 2018.
BSA: Image generation from small datasets via batch statistics adaptation. ICCV 2019.
MineGAN: MineGAN: effective knowledge transfer from GANs to target domains with few images. CVPR 2020.
FreezeD: Freeze discriminator: A simple baseline for fine-tuning GANs. arXiv 2020.

SLIDE 15

Table of Contents

1. Motivation
2. Our Method
3. Related Work
4. Experiments
5. Summary

SLIDE 16

Experiments

Comparisons with existing/naive methods on

  • 1. moderate or small datasets
  • 2. limited datasets with 1,000 images
  • 3. extremely limited datasets with 25 images

Analysis of the proposed techniques

  • 1. ablation study of our method
  • 2. modulations from AdaFM
  • 3. style augmentation/mixing with the tailored specific-part
SLIDE 17

Comparisons with Existing/Naive Methods

1. On moderate or small datasets

CelebA (202,599), Flowers (8,189), Cars (8,144), Cathedral (7,350)

Figure 8. FID scores (left) and generated images (right) of Scratch and Our method on 4 target datasets. The transferred general-part dramatically accelerates the training, leading to better performance.

Table 2. FID scores of the compared methods after 60,000 training iterations. Lower is better. "Failed" means training/mode collapse.

Method \ Target   CelebA   Flowers   Cars     Cathedral
TransferGAN       18.69    failed    failed   failed
Scratch           16.51    29.65     11.77    30.59
Our                9.90    16.76     10.10    15.78

TransferGAN vs Scratch/Our: the tailored specific-part mitigates overfitting.
Scratch vs Our: the gains come from (i) the transferred general-part and (ii) AdaFM.

SLIDE 18

Comparisons with Existing/Naive Methods

2. On limited datasets with 1,000 images

Random selection yields CelebA-1K, Flowers-1K, and Cathedral-1K.

Figure 10. FID scores on CelebA-1K (left), Flowers-1K (center), and Cathedral-1K (right). The best FID achieved is marked with a star.

Table 3. The best FID achieved within 60,000 training iterations on the limited-1K datasets. Lower is better.

Method \ Target   CelebA-1K   Flowers-1K   Cathedral-1K
Scratch           20.75       58.18        39.97
Our-G4D2          14.19       46.68        38.17
Our-G4D3          13.99       —            —
Our-G4D5          19.77       43.05        35.88

SLIDE 19

Comparisons with Existing/Naive Methods

3. On extremely limited datasets with 25 images

Random selection yields Flowers-25 and FFHQ-25, following BSA.²

[Figure: generated samples; FID scores — BSA 129.8 vs Our 85.4 (left), BSA 123.2 vs Our 90.79 (right).]

Our setting: G4D6 general-part, with GP (gradient penalty) on both real and fake samples. Results show:
- more realistic generation
- smooth interpolations on the learned data manifold

² Image generation from small datasets via batch statistics adaptation. ICCV 2019.

SLIDE 20

Analysis of the Proposed Techniques

1. Ablation Study of Our Method

- GP-GAN: no filters are transferred; baseline for GPHead
- GPHead: GP-GAN architecture + transferred general-part
- SmallHead: transferred general-part + tailored specific-part
- Our: SmallHead + the proposed AdaFM

Figure 9. FID scores from the ablation studies of our method on CelebA (left) and the 3 small datasets of Flowers, Cars, and Cathedral (right).

Table 1. FID scores from ablation studies on our method after 60,000 training iterations. Lower is better.

Method \ Target   CelebA   Flowers   Cars     Cathedral
(a) GP-GAN        19.48    failed    failed   failed
(b) GPHead        11.15    failed    failed   failed
(c) SmallHead     12.42    29.94     20.64    34.83
(d) Our            9.90    16.76     10.10    15.78

SLIDE 21

Analysis of the Proposed Techniques

2. Modulations from AdaFM

- Boxplots of the learned scale γ and shift β on the target datasets
- All filters are used in the target domains, but with modulations
- Different target datasets prefer different modulations

SLIDE 22

Analysis of the Proposed Techniques

3. Style Mixing/Augmentation with the Tailored Specific-Part

[Figure: style-mixing grids between source and destination samples.]

Style mixing is extremely appealing for limited-data applications:
- vast novel generation via style/attribute combinations
- diverse synthetic augmentation
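At its core, style mixing is just an exchange of per-layer style vectors between two samples. A minimal sketch assuming a StyleGAN-like head in which each style block consumes one style vector; `mix_styles` is a hypothetical helper, not the authors' code:

```python
import numpy as np

def mix_styles(ws_src, ws_dst, crossover):
    """Take coarse (early-layer) styles from the source sample and fine
    (later-layer) styles from the destination sample."""
    return ws_src[:crossover] + ws_dst[crossover:]

rng = np.random.default_rng(0)
ws_a = [rng.standard_normal(512) for _ in range(4)]  # per-layer styles of sample A
ws_b = [rng.standard_normal(512) for _ in range(4)]  # per-layer styles of sample B

# A's coarse attributes combined with B's fine details drive one mixed generation
mixed = mix_styles(ws_a, ws_b, crossover=2)
```

Sweeping the crossover point (and the pairings of samples) is what yields the combinatorially many novel generations used for augmentation.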

SLIDE 23

Conclusions

- For lifelong learning, it is important to appropriately transfer knowledge from the past to new tasks.
- Such transfer is critical for model learning with limited data.
- We have developed a novel means of performing lifelong learning with GAN models.
- It allows generation of realistic synthetic data based on limited training data.
- Via style augmentation, it allows significant expansion of training data, generating new and realistic samples for training other models (e.g., supervised models).