SLIDE 1 Generative Adversarial Network
and its Applications to Human Language Processing
李宏毅 Hung-yi Lee
Full version of the tutorial
SLIDE 2
Outline
Part I: General Introduction of Generative Adversarial Network (GAN)
Part II: Applications to Natural Language Processing
Part III: Applications to Speech Processing
SLIDE 3 Mihaela Rosca, Balaji Lakshminarayanan, David Warde-Farley, Shakir Mohamed, “Variational Approaches for Auto-Encoding Generative Adversarial Networks”, arXiv, 2017
All Kinds of GAN …
https://github.com/hindupuravinash/the-gan-zoo
GAN ACGAN BGAN DCGAN EBGAN fGAN GoGAN CGAN
…… It is a wise choice to attend this tutorial.
SLIDE 4 Generative Adversarial Network (GAN)
- Anime face generation as example
Generator: vector → image (a high-dimensional vector)
Discriminator: image → score
Larger score means real, smaller score means fake.
SLIDE 5 Algorithm
- Initialize generator and discriminator
- In each training iteration:
Step 1: Fix generator G, and update discriminator D.
Sample real objects from the database (label 1) and generated objects from G; the discriminator learns to assign high scores to real objects and low scores to generated objects.
SLIDE 6
- Initialize generator and discriminator
- In each training iteration:
D G
Algorithm
Step 2: Fix discriminator D, and update generator G
Discri- minator NN Generator vector 0.13 hidden layer
update fix large network Generator learns to “fool” the discriminator Backpropagation
SLIDE 7 Algorithm
- Initialize generator and discriminator
- In each training iteration:
Learning D: sample some real objects and generate some fake objects with G; update D (real objects labelled 1, generated ones 0) while G is fixed.
Learning G: update G so that D assigns high scores to the generated objects, while D is fixed.
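The alternating two-step loop above can be sketched end to end on a 1-D toy problem. This is an illustrative sketch only, assuming a linear generator and a logistic-regression discriminator with hand-derived gradients; none of these names or choices come from the tutorial itself:

```python
import numpy as np

# Toy GAN: real data ~ N(3, 0.5); generator G(z) = w_g*z + b_g on z ~ N(0, 1);
# discriminator D(x) = sigmoid(w_d*x + b_d).
rng = np.random.default_rng(0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def G(z, th):                      # generator: noise vector -> "object"
    return th[0] * z + th[1]

def D(x, th):                      # discriminator: object -> score in (0, 1)
    return sigmoid(th[0] * x + th[1])

def d_step(th_d, th_g, lr=0.05, n=64):
    """Step 1: fix G, update D to score real objects high and generated low."""
    x_real = rng.normal(3.0, 0.5, n)
    x_fake = G(rng.normal(0.0, 1.0, n), th_g)
    err_real = D(x_real, th_d) - 1.0           # gradient of -log D(x_real)
    err_fake = D(x_fake, th_d)                 # gradient of -log(1 - D(x_fake))
    grad = np.array([(err_real * x_real).mean() + (err_fake * x_fake).mean(),
                     err_real.mean() + err_fake.mean()])
    return th_d - lr * grad

def g_step(th_g, th_d, lr=0.05, n=64):
    """Step 2: fix D, update G so that D scores the generated objects high."""
    z = rng.normal(0.0, 1.0, n)
    err = D(G(z, th_g), th_d) - 1.0            # gradient of -log D(G(z))
    grad = np.array([(err * th_d[0] * z).mean(),   # chain rule through D
                     (err * th_d[0]).mean()])
    return th_g - lr * grad

theta_g, theta_d = np.array([1.0, 0.0]), np.array([0.0, 0.0])
for _ in range(2000):                          # each training iteration
    theta_d = d_step(theta_d, theta_g)         # learning D
    theta_g = g_step(theta_g, theta_d)         # learning G

samples = G(rng.normal(0.0, 1.0, 5000), theta_g)
print(samples.mean())   # the generated mean drifts toward the real mean (3.0)
```

The same two-step loop is what the slides run with neural generators and discriminators; only the models and gradients get bigger.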
SLIDE 8 The faces generated by machine.
The images are generated by Yen-Hao Chen, Po-Chun Chien, Jun-Chen Xie, Tsung-Han Wu.
SLIDE 9 Conditional Generation
Generation: an NN Generator maps vectors sampled from a specific range (e.g. [0.1, -0.1, ..., 0.7], [-0.3, 0.1, ..., 0.9], [0.3, -0.1, ..., -0.7]) to images.
Conditional Generation: the NN Generator additionally takes a condition such as "Girl with red hair and red eyes" or "Girl with yellow ribbon".
SLIDE 10 Conditional GAN
x = G(c, z), c: red hair, z sampled from a normal distribution
D (better design): takes both c and x, and outputs a scalar covering two criteria: image x is realistic or not + c and x are matched or not.
True text-image pairs are positive examples; (red hair, generated image) and (blue hair, real image with red hair) are negative examples.
Trained with paired data, e.g. images labelled "blue eyes, red hair, short hair".
[Scott Reed, et al., ICML, 2016]
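The three kinds of discriminator inputs on this slide can be made concrete. A minimal sketch, assuming illustrative names (`make_discriminator_batch`, string stand-ins for images); this is not code from the cited paper:

```python
# Build training examples for a conditional GAN discriminator:
# only a condition with its matching real image is positive; generated images
# and mismatched real images are both negative.
def make_discriminator_batch(paired_data, generator):
    inputs, labels = [], []
    # Positive: (c, matching real image) -> 1
    for c, x in paired_data:
        inputs.append((c, x)); labels.append(1)
    # Negative 1: (c, generated image) -> 0   (realistic-or-not criterion)
    for c, _ in paired_data:
        inputs.append((c, generator(c))); labels.append(0)
    # Negative 2: (c, real image of a different condition) -> 0
    # (matched-or-not criterion), e.g. ("blue hair", image with red hair)
    shifted = paired_data[1:] + paired_data[:1]
    for (c, _), (_, x) in zip(paired_data, shifted):
        inputs.append((c, x)); labels.append(0)
    return inputs, labels

pairs = [("red hair", "img_red"), ("blue hair", "img_blue"),
         ("green hair", "img_green")]
inputs, labels = make_discriminator_batch(pairs, generator=lambda c: f"fake_{c}")
print(labels.count(1), labels.count(0))  # → 3 6
```

Without the second kind of negative example, the discriminator could ignore the condition entirely, which is exactly what the "better" design on the slide prevents.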
SLIDE 11 Conditional GAN
x = G(c, z), c: text; trained with paired data (images labelled e.g. "blue eyes, red hair, short hair")
Generated examples: red hair, green eyes / blue hair, red eyes
The images are generated by Yen-Hao Chen, Po-Chun Chien, Jun-Chen Xie, Tsung-Han Wu.
[Scott Reed, et al., ICML, 2016]
SLIDE 12 Conditional GAN
G: c: sound → image, e.g. "a dog barking sound"
Training data collection: video (which provides paired audio and frames).
SLIDE 13 Conditional GAN
https://wjohn1483.github.io/audio_to_scene/index.html
The images are generated by Chia-Hung Wan and Shun-Po Chuang.
Louder input sound changes the generated scene.
SLIDE 14 Conditional GAN - Image-to-label
Multi-label Image Classifier = Conditional Generator
Input condition: image; generated output: labels
SLIDE 15 Conditional GAN - Image-to-label
F1 scores (the classifiers can have different architectures; the classifiers are trained as conditional GAN):

              MS-COCO   NUS-WIDE
VGG-16          56.0      33.9
 + GAN          60.4      41.2
Inception       62.4      53.5
 + GAN          63.8      55.8
Resnet-101      62.8      53.1
 + GAN          64.0      55.4
Resnet-152      63.3      52.1
 + GAN          63.9      54.1
Att-RNN         62.1      54.7
RLSD            62.0      46.9

[Tsai, et al., submitted to ICASSP 2019]
SLIDE 16 Conditional GAN - Image-to-label
(Same F1 table as SLIDE 15.) Att-RNN and RLSD are models designed specifically for multi-label classification; the standard classifiers trained as conditional GAN still compare favorably with them.
SLIDE 17 Conditional GAN – Speech Recognition
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model, https://arxiv.org/abs/1811.00787
SLIDE 18
SLIDE 19
Unsupervised Conditional GAN
G transforms an object from one domain to another without paired data (e.g. style transfer).
Domain X: photos (condition); Domain Y: Vincent van Gogh's paintings (generated object). Not paired.
SLIDE 20 Unsupervised Conditional Generation
- Approach 1: Direct Transformation
  G_X→Y maps domain X directly to domain Y; suitable for texture or color change.
- Approach 2: Projection to Common Space
  EN_X (encoder of domain X) projects the input to a latent code (e.g. face attributes); DE_Y (decoder of domain Y) generates the output.
  Allows larger change; only the semantics are kept.
SLIDE 21 Direct Transformation
G_X→Y : Domain X → Domain Y
D_Y outputs a scalar: does the input image belong to domain Y or not?
The generator output becomes similar to domain Y.
SLIDE 22 Direct Transformation
G_X→Y : Domain X → Domain Y
D_Y scalar: input image belongs to domain Y or not; the output becomes similar to domain Y.
Not what we want! The generator can satisfy D_Y while ignoring its input.
SLIDE 23 Direct Transformation
G_X→Y : Domain X → Domain Y; D_Y scalar: input image belongs to domain Y or not
G_Y→X maps the output back, and the reconstruction should be as close as possible to the input (cycle consistency).
A generator that ignores its input lacks the information needed for reconstruction, so cycle consistency rules this out.
[Jun-Yan Zhu, et al., ICCV, 2017]
SLIDE 24 Cycle GAN
Domain X → G_X→Y → G_Y→X → reconstruction as close as possible to the input
Domain Y → G_Y→X → G_X→Y → reconstruction as close as possible to the input
D_Y scalar: belongs to domain Y or not; D_X scalar: belongs to domain X or not
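The two reconstruction terms can be written down directly. A numeric sketch with toy linear "generators" that happen to be exact inverses; the real G_X→Y and G_Y→X are neural networks trained jointly with D_X and D_Y, and all names here are illustrative:

```python
import numpy as np

# Cycle consistency: x -> G_XY(x) -> G_YX(G_XY(x)) should come back to x,
# and y -> G_YX(y) -> G_XY(G_YX(y)) should come back to y.
G_XY = lambda x: 2.0 * x + 1.0        # toy map: domain X -> domain Y
G_YX = lambda y: (y - 1.0) / 2.0      # toy map: domain Y -> domain X

def cycle_loss(x_batch, y_batch):
    # L1 reconstruction error in both directions.
    loss_x = np.abs(G_YX(G_XY(x_batch)) - x_batch).mean()
    loss_y = np.abs(G_XY(G_YX(y_batch)) - y_batch).mean()
    return loss_x + loss_y

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])
print(cycle_loss(x, y))  # → 0.0, because these toy maps are exact inverses
```

In training this loss is added to the two adversarial losses, so each generator must both fool its discriminator and stay invertible, which is what keeps it from ignoring its input.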
SLIDE 25 Unsupervised Conditional Generation
- Approach 1: Direct Transformation
  G_X→Y maps domain X directly to domain Y; suitable for texture or color change.
- Approach 2: Projection to Common Space
  EN_X (encoder of domain X) projects the input to a latent code (e.g. face attributes); DE_Y (decoder of domain Y) generates the output.
  Allows larger change; only the semantics are kept.
SLIDE 26 Projection to Common Space - Target
EN_X, EN_Y: encoders of domains X and Y; DE_X, DE_Y: decoders
Images from both domains are projected to a common latent space (face attributes) and decoded back into images.
SLIDE 27 Projection to Common Space - Training
Train an auto-encoder for each domain (EN_X + DE_X and EN_Y + DE_Y), minimizing the reconstruction error.
SLIDE 28 Projection to Common Space - Training
Minimizing reconstruction error, with a discriminator (D_X, D_Y) on each decoder's output.
Because we train the two auto-encoders separately, images with the same attributes may not project to the same position in the latent space.
SLIDE 29 Projection to Common Space - Training
Minimizing reconstruction error as before, plus a domain discriminator that predicts whether a latent code comes from EN_X or EN_Y.
EN_X and EN_Y learn to fool the domain discriminator, which forces their outputs to have the same distribution.
[Guillaume Lample, et al., NIPS, 2017]
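The domain-discriminator objective can be sketched numerically. A minimal sketch, assuming given latent codes and a tiny linear discriminator; all names (`domain_prob`, `bce`, the example vectors) are illustrative, not from the cited paper:

```python
import numpy as np

# A domain discriminator tries to tell whether a latent code came from EN_X
# or EN_Y; the encoders are trained with the opposite (flipped-label)
# objective, so the two latent distributions are pushed to match.
def bce(p, label):
    # binary cross-entropy for a single probability p and label in {0, 1}
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))

def domain_prob(z, w):
    # tiny linear domain discriminator: P(latent came from EN_X)
    return 1.0 / (1.0 + np.exp(-(z @ w)))

w = np.array([0.5, -0.2])           # discriminator weights (toy values)
z_from_x = np.array([0.3, 1.0])     # latent of an image from domain X
z_from_y = np.array([0.2, 0.9])     # latent of an image from domain Y

# Discriminator objective: label EN_X latents 1 and EN_Y latents 0.
d_loss = bce(domain_prob(z_from_x, w), 1) + bce(domain_prob(z_from_y, w), 0)
# Encoder objective: the adversarial flipped-label loss; minimizing it makes
# the two latent distributions indistinguishable to the discriminator.
enc_loss = bce(domain_prob(z_from_x, w), 0) + bce(domain_prob(z_from_y, w), 1)
print(float(d_loss), float(enc_loss))
```

In the full model these two losses are minimized alternately, exactly like the generator/discriminator steps of an ordinary GAN, but on latent codes instead of images.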
SLIDE 30
Sharing the parameters of encoders and decoders
Projection to Common Space
Training 𝐹𝑂𝑌 𝐹𝑂𝑍 𝐸𝐹𝑌 𝐸𝐹𝑍 Couple GAN[Ming-Yu Liu, et al., NIPS, 2016] UNIT[Ming-Yu Liu, et al., NIPS, 2017]
SLIDE 31 Projection to Common Space - Training
Cycle consistency: encode with EN_X, decode with DE_Y, then encode with EN_Y and decode with DE_X, minimizing the reconstruction error (with D_X and D_Y as before).
Used in ComboGAN [Asha Anoosheh, et al., arXiv, 2017]
SLIDE 32 Projection to Common Space - Training
Semantic consistency: an image and its transformed version should map to the same point in the latent space (with D_X and D_Y as before).
Used in DTN [Yaniv Taigman, et al., ICLR, 2017] and XGAN [Amélie Royer, et al., arXiv, 2017]
SLIDE 33
Outline
Part I: General Introduction of Generative Adversarial Network (GAN)
Part II: Applications to Natural Language Processing
Part III: Applications to Speech Processing
SLIDE 34 Unsupervised Conditional Generation
Image Style Transfer: photos ↔ Vincent van Gogh's paintings (not paired)
Text Style Transfer: positive ↔ negative sentences (not paired)
e.g. "It is good." / "It's a good day." / "I love you." ↔ "It is bad." / "It's a bad day." / "I don't love you."
SLIDE 35 Cycle GAN
Domain X → G_X→Y → G_Y→X → reconstruction as close as possible to the input
Domain Y → G_Y→X → G_X→Y → reconstruction as close as possible to the input
D_Y scalar: belongs to domain Y or not; D_X scalar: belongs to domain X or not
SLIDE 36 Cycle GAN
G_positive→negative and G_negative→positive, with reconstructions as close as possible to the input.
Discriminators: is it a positive sentence? is it a negative sentence?
e.g. "It is good." → "It is bad." → "It is good."; "I love you." → "I hate you." → "I love you."
SLIDE 37 Discrete Issue
The seq2seq generator ("It is good." → "It is bad.") and the discriminator (positive sentence or not) together form a large network; we would like to update the generator by backpropagation with the discriminator fixed.
This fails because the generator's output words are discrete: sampling from the output distribution is not differentiable, so the gradient cannot flow back through it.
SLIDE 38 Three Categories of Solutions
Gumbel-softmax
- [Matt J. Kusner, et al., arXiv, 2016]
Continuous Input for Discriminator
- [Sai Rajeswar, et al., arXiv, 2017][Ofir Press, et al., ICML workshop, 2017][Zhen Xu, et al., EMNLP, 2017][Alex Lamb, et al., NIPS, 2016][Yizhe Zhang, et al., ICML, 2017]
"Reinforcement Learning"
- [Yu, et al., AAAI, 2017][Li, et al., EMNLP, 2017][Tong Che, et al., arXiv, 2017][Jiaxian Guo, et al., AAAI, 2018][Kevin Lin, et al., NIPS, 2017][William Fedus, et al., ICLR, 2018]
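The first category can be sketched in a few lines. A minimal numpy sketch of the Gumbel-softmax trick, assuming a 3-word vocabulary; illustrative only, not code from the cited papers:

```python
import numpy as np

# Gumbel-softmax: draw a near-one-hot word distribution that is
# differentiable with respect to the logits.
rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau):
    # Adding Gumbel(0,1) noise makes argmax(logits + g) an exact categorical
    # sample; the temperature-tau softmax is its differentiable relaxation.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    z = (logits + g) / tau
    z = z - z.max()          # stabilize the softmax numerically
    y = np.exp(z)
    return y / y.sum()

logits = np.array([1.0, 2.0, 0.5])
soft = gumbel_softmax(logits, tau=1.0)    # soft distribution over 3 "words"
hard = gumbel_softmax(logits, tau=0.01)   # as tau -> 0: approaches one-hot
print(soft, hard)
```

Feeding these relaxed (near-one-hot) vectors to the discriminator lets the gradient pass through where a hard sampled word would block it.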
SLIDE 39 Cycle GAN
The same Cycle GAN as before (positive ↔ negative sentences, with a discriminator for each domain).
Discrete? Word embeddings are used in place of discrete words, giving the discriminator a continuous input.
[Lee, et al., ICASSP, 2018]
SLIDE 40 Cycle GAN
- Negative sentence to positive sentence:
it's a crappy day → it's a great day
i wish you could be here → you could be here
it's not a good idea → it's good idea
i miss you → i love you
i don't love you → i love you
i can't do that → i can do that
i feel so sad → i happy
it's a bad day → it's a good day
it's a dummy day → it's a great day
sorry for doing such a horrible thing → thanks for doing a great thing
my doggy is sick → my doggy is my doggy
my little doggy is sick → my little doggy is my little doggy
Thanks to Yau-Shian Wang for providing the results.
SLIDE 41 Projection to Common Space
Positive and negative sentences are encoded into a common latent space and decoded back.
Decoder hidden layer as discriminator input [Shen, et al., NIPS, 2017]
Domain discriminator: does a latent code come from the positive or the negative encoder? The encoders learn to fool it. [Zhao, et al., ICML, 2018][Fu, et al., AAAI, 2018]
SLIDE 42 Unsupervised Conditional Generation
Image Style Transfer: photos ↔ Vincent van Gogh's paintings (not paired)
Text Style Transfer: document ↔ summary (not paired)
This is unsupervised abstractive summarization.
SLIDE 43 Abstractive Summarization
- Now machine can do abstractive summarization by seq2seq (writing summaries in its own words).
Training data: documents paired with human-written summaries (summary 1, summary 2, summary 3, ...).
Supervised: we need lots of labelled training data.
SLIDE 44 Unsupervised Abstractive Summarization
- Can a seq2seq model write summaries in its own words without document-summary pairs?
Domain X: documents; Domain Y: summaries (not paired).
SLIDE 45 Unsupervised Abstractive Summarization
G (seq2seq): document → word sequence (summary?)
D (discriminator): trained on human-written summaries, judges whether the word sequence is real or not.
SLIDE 46 Unsupervised Abstractive Summarization
G (seq2seq): document → word sequence; D: real or not, trained on human-written summaries.
R (seq2seq): reconstructs the document from the word sequence; minimize the reconstruction error.
SLIDE 47 Unsupervised Abstractive Summarization
G and R (both seq2seq) form a seq2seq2seq auto-encoder: document → word sequence (summary?) → document.
Only a large collection of documents is needed to train the model, using a sequence of words as the latent representation.
Without further constraints, the latent word sequence is not readable.
SLIDE 48
Unsupervised Abstractive Summarization
G R
Seq2seq Seq2seq
word sequence
D
Human written summaries Real or not
Discriminator
Let Discriminator considers my output as real document document Summary? Readable REINFORCE algorithm to deal with the discrete issue
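The REINFORCE trick mentioned above can be sketched on a toy problem: sample a discrete output, treat the discriminator's score as a reward, and update the logits along reward-weighted log-probability gradients. A minimal sketch, assuming three candidate "summaries" and made-up discriminator scores; none of these names come from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.zeros(3)                      # generator's "policy" over 3 outputs
reward_of = {0: 0.1, 1: 0.2, 2: 0.9}     # pretend discriminator scores

for _ in range(500):
    p = softmax(logits)
    a = rng.choice(3, p=p)               # sample a discrete output (no gradient)
    r = reward_of[a]                     # discriminator score as reward
    # REINFORCE: grad of log p(a) w.r.t. logits is one_hot(a) - p;
    # scale it by the reward and ascend.
    one_hot = np.eye(3)[a]
    logits += 0.1 * r * (one_hot - p)

print(softmax(logits).argmax())  # → 2 (the highest-reward output wins)
```

Because the gradient never has to pass through the sampling step, this works even though the generator's output is discrete; the cost is higher variance than backpropagation, which is why baselines and related tricks matter in practice.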
SLIDE 49 Experimental Results
English Gigaword (document title as summary)

                                 ROUGE-1  ROUGE-2  ROUGE-L
Supervised                          33.2     14.2     30.5
Trivial                             21.9      7.7     20.5
Unsupervised (matched data)         28.1     10.0     25.4
Unsupervised (no matched data)      27.2      9.1     24.1

- Matched data: the titles of English Gigaword are used to train the discriminator.
- No matched data: the titles of CNN/Daily Mail are used to train the discriminator.
SLIDE 50 Semi-supervised Learning
[Plot: ROUGE-1 (25 to 34) vs. number of document-summary pairs (10k to 500k), comparing WGAN and REINFORCE (two approaches to the discrete issue) in the unsupervised/semi-supervised setting against purely supervised training; matched data is used; the fully supervised model uses 3.8M pairs.]
SLIDE 51
Outline
Part I: General Introduction of Generative Adversarial Network (GAN)
Part II: Applications to Natural Language Processing
Part III: Applications to Speech Processing
SLIDE 52 Unsupervised Conditional Generation
Image Style Transfer: photos ↔ Vincent van Gogh's paintings (not paired)
Speech Style Transfer: Speaker A ↔ Speaker B (not paired)
This is unsupervised voice conversion.
SLIDE 53 Voice Conversion
SLIDE 54
In the past: parallel data; Speaker A and Speaker B read the same sentences ("How are you?", "Good morning").
With GAN: Speakers A and B can be talking about completely different things, e.g. Speaker A in Mandarin ("天氣真好", "the weather is really nice"; "再見囉", "goodbye") while Speaker B says "How are you?" and "Good morning" in English.
SLIDE 55 Cycle GAN
Domain X → G_X→Y → G_Y→X → reconstruction as close as possible to the input
Domain Y → G_Y→X → G_X→Y → reconstruction as close as possible to the input
D_Y scalar: belongs to domain Y or not; D_X scalar: belongs to domain X or not
SLIDE 56 Cycle GAN for Voice Conversion
The same Cycle GAN, applied to spectrograms: X = Speaker A, Y = Speaker B.
[Takuhiro Kaneko, et al., arXiv, 2017][Fuming Fang, et al., ICASSP, 2018][Yang Gao, et al., ICASSP, 2018]
SLIDE 57 Projection to Common Space
Encoders and decoders, as in the image case, now applied to speech.
SLIDE 58 Projection to Common Space
- All the speakers share the same encoder.
- The model can deal with speakers never seen during training.
SLIDE 59 Projection to Common Space
All the speakers also share the same decoder; a (one-hot) vector represents the speaker identity.
A speaker discriminator predicts which speaker a latent code comes from. We hope the encoder extracts the phonetic information while removing the speaker information, so the encoder learns to fool the discriminator.
SLIDE 60 Projection to Common Space
Training: the encoder extracts the phonetic information from speaker A's "How are you?"; the decoder, given A's identity, reconstructs the input; the speaker discriminator asks which speaker the latent code comes from.
Testing: given speaker B's "Hello", the decoder conditioned on speaker A's identity makes A read the sentence of B.
SLIDE 61 Does the encoder output contain phonetic information?
Visualizations of the latent codes: in one coloring, different colors are different words; in the other, different colors are different speakers.
SLIDE 62
“Audio” Word to Vector
SLIDE 63 Issues
Training: the encoder and decoder always see the same speaker (reconstruction).
Testing: speaker A's decoder reads the sentence of speaker B, so the decoder receives a latent code from a different speaker than in training, leading to low quality.
SLIDE 64 2nd Stage Training
At test time there is no learning target for "A reading B's sentence". Extra criteria for training:
- Cheat a discriminator: real or generated speech?
- Help a speaker classifier decide which speaker produced the output.
SLIDE 65 Experimental Results
- Subjective evaluations (20 speakers in VCTK) [Chou et al., INTERSPEECH, 2018]
Listeners compared "two stages" vs. "one stage" ("two stages" is better / "one stage" is better / indistinguishable), and "projection" vs. "Cycle GAN" ("projection" is better / "Cycle GAN" is better / indistinguishable).
SLIDE 66 Demo
Target / Source / Source to Target
https://jjery2243542.github.io/voice_conversion_demo/
Thanks to Ju-chieh Chou for providing the results.
The decoder conditioned on the target speaker reads the sentence of the source speaker.
SLIDE 67 Source Speaker / Target Speaker: Me (never seen during training!)
Source to Target
https://jjery2243542.github.io/voice_conversion_demo/
Thanks to Ju-chieh Chou for providing the results.
(Doesn't work. Just for fun.)
SLIDE 68 Unsupervised Conditional Generation
Audio ↔ Text (not paired)
This is unsupervised speech recognition.
SLIDE 69 Supervised Speech Recognition
https://devopedia.org/images/article/102/9180.1532710057.png
(I believe you have seen similar figures before.)
- Supervised learning needs lots of annotated speech.
- However, most of the languages are low resourced.
SLIDE 70 Speech Recognition in the Future
http://www.parenting.com/article/teach-baby-to-talk
Learning human language with very little supervision
SLIDE 71 Unsupervised Speech Recognition
- Machine learns to recognize speech from unpaired speech and text: an audio collection (without text annotation) and text documents (not parallel to the audio).
This idea was too crazy to be realized in the past, but it has recently become possible with GAN.
[Liu, et al., INTERSPEECH, 2018] [Chen, et al., arXiv, 2018]
SLIDE 72 Acoustic Token Discovery
Acoustic tokens: chunks of acoustically similar audio segments with token IDs
[Zhang & Glass, ASRU 09] [Huijbregts, ICASSP 11] [Chan & Lee, Interspeech 11]
Acoustic tokens can be discovered from audio collection without text annotation.
SLIDE 73 Acoustic Token Discovery
Token 1 Token 1 Token 1 Token 2 Token 3 Token 3 Token 3
Acoustic tokens: chunks of acoustically similar audio segments with token IDs
[Zhang & Glass, ASRU 09] [Huijbregts, ICASSP 11] [Chan & Lee, Interspeech 11]
Acoustic tokens can be discovered from audio collection without text annotation.
Token 2 Token 4
SLIDE 74 Acoustic Token Discovery
Phonetic-level acoustic tokens are obtained by segmental sequence-to-sequence autoencoder.
[Wang, et al., ICASSP, 2018]
SLIDE 75 Unsupervised Speech Recognition
Phone-level acoustic pattern discovery turns the audio into token sequences (e.g. p1 p1 p3 p2 / p1 p4 p3 p5 / p5 p1 p5 p4 / p3 p1 p2 p3 p4, where "AY" = p1).
A Cycle GAN then maps these token sequences to phoneme sequences from text (e.g. AY L AH V Y UW / G UH D B AY / HH AW AA R Y UW / T AY W AA N / AY M F AY N).
[Liu, et al., INTERSPEECH, 2018] [Chen, et al., arXiv, 2018]
SLIDE 76
Librispeech (word recognition); TIMIT (phoneme recognition) with unmatched audio and text; TIMIT (phoneme recognition) with matched audio and text.
SLIDE 77
Concluding Remarks
Part I: General Introduction of Generative Adversarial Network (GAN)
Part II: Applications to Natural Language Processing
Part III: Applications to Speech Processing
SLIDE 78
To Learn More …..
(My YouTube Channel, 30K subscribers, 2.4M total views)