SLIDE 1

Addressing GAN limitations: resolution, lack of novelty and control on generations

Camille Couprie, Facebook AI Research. Joint works with O. Sbai, M. Aubry, A. Bordes, M. Elhoseiny, M. Riviere, Y. LeCun, M. Mathieu, P. Luc, N. Neverova, J. Verbeek.

2019

SLIDE 2

Why do we care about generative models?

  • Scene understanding can be assessed by checking the ability to generate plausible new scenes.
  • Generative models are interesting if they can be used to go beyond the training data: data of higher resolution, data augmentation to help train better classifiers, reuse of the learned representations in other tasks, or predictions about uncertain events.

SLIDE 3

Outline

  • 1/ Design inspiration from generative adversarial networks
  • 2/ High-resolution, decoupled generation
  • 3/ Vector image generation by learning parametric layer decomposition
  • 4/ Future frame prediction

SLIDE 4

Design inspiration from generative networks

Sbai, Elhoseiny, Bordes, LeCun, Couprie, ECCV Workshop '17

[Figure: example generations rated on Novelty and Hedonic Value]

SLIDE 5

Generative Adversarial Networks (Goodfellow et al., 2014)

[Diagram: random numbers (e.g. 0.3, 0.7, 0.1, 0.8) feed the Generator, which outputs a generated image; the adversarial network labels it Fake]

SLIDE 6

Generative Adversarial Networks (Goodfellow et al., 2014)

[Diagram: the adversarial network now also receives real input images and labels them Real]

SLIDE 7

Deep convolutional GANs

Radford et al., ICLR 2016

SLIDE 8

Training with pictures of about 2,000 clothing items

SLIDE 9

Texture and shape labels

  • Textures: Floral, Striped, Tiled, Uniform, Dotted, Animal print, Graphical
  • Shapes: Dress, Skirt, Jacket, Pullover, T-Shirt, Coat, Top

SLIDE 10

Class conditioned GAN

[Diagram: random numbers feed the Generator; the adversarial network outputs real/fake (0/1) together with a shape class and a texture class]

SLIDE 11

GAN Optimization objectives

  • Generator's loss
  • Discriminator's loss
  • Auxiliary classifier discriminator: an additional classification loss for the generator
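The objectives listed above can be sketched numerically. The following is a minimal numpy illustration (not the slide's actual code) of the standard non-saturating GAN losses plus an AC-GAN-style auxiliary classification term; `gan_losses` and `aux_class_loss` are hypothetical helper names.

```python
import numpy as np

def bce(p, y):
    # binary cross-entropy between predicted probabilities p and labels y
    eps = 1e-8
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)).mean()

def gan_losses(d_real, d_fake):
    """Standard (non-saturating) GAN objectives.
    d_real / d_fake: discriminator probabilities on real / generated samples."""
    d_loss = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))
    g_loss = bce(d_fake, np.ones_like(d_fake))  # generator wants fakes judged real
    return d_loss, g_loss

def aux_class_loss(class_probs, labels):
    """Auxiliary-classifier term: cross-entropy of the discriminator's
    class head, added to the generator's objective for class conditioning."""
    eps = 1e-8
    return -np.log(class_probs[np.arange(len(labels)), labels] + eps).mean()
```

A confident discriminator lowers its own loss while raising the generator's, which is what drives the adversarial game.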

SLIDE 12

Without conditioning vs. with class conditioning

SLIDE 13

Introduction of a Style Deviation criterion

[Diagram: the Generator's output is classified over texture classes (Dotted, Floral, Graphical, Uniform, Tiled, Striped, Animal print); here the classifier is 100% confident in a single class]

SLIDE 14

Introduction of a Style Deviation criterion

[Diagram: same setup, but the texture classifier's output is now spread across classes (e.g. 8%, 6%, 7%, 57%, 6%, 11%, 9%)]

SLIDE 15

With the Style Deviation criterion (CAN H)

[Diagram: Generator and adversarial network as before, with scores over the texture classes Dotted, Floral, Graphical, Uniform, Tiled, Striped, Animal print]

SLIDE 16

Tested deviation objectives

  • Multi-class cross-entropy loss
  • Binary cross-entropy loss
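A minimal sketch of the two deviation objectives, assuming (as in the CAN idea the slides build on) that both push the style classifier's posterior toward the uniform distribution so that generations deviate from known styles; `deviation_losses` is a hypothetical name and the exact formulas on the slide are not reproduced in this transcript.

```python
import numpy as np

def deviation_losses(style_probs):
    """Two candidate style-deviation ("creativity") losses for the generator.
    style_probs: (batch, K) class posteriors from the style classifier head.
    Both are minimized when the posterior is uniform over the K styles."""
    eps = 1e-8
    K = style_probs.shape[1]
    # multi-class cross-entropy against the uniform target 1/K
    mce = -(np.log(style_probs + eps) / K).sum(axis=1).mean()
    # binary cross-entropy per class, each with target 1/K
    bce = -((1 / K) * np.log(style_probs + eps)
            + (1 - 1 / K) * np.log(1 - style_probs + eps)).sum(axis=1).mean()
    return mce, bce
```

Either loss penalizes a generator whose outputs are confidently assigned to one existing style, which is the behavior slides 13 and 14 contrast.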

SLIDE 17

Human Evaluation Study

CAN: GAN with a creativity loss; (H) stands for the use of a holistic loss.

[Plot: overall likability (%) vs. realistic appearance for GAN, CAN texture, Style CAN(H), CAN(H) texture]

SLIDE 18

Models with texture deviation are most popular

Novelty is judged by humans and measured as a distance to similar training images.

[Plot: novelty vs. likability for GAN, CAN texture, CAN(H) shape, CAN(H) texture, CAN(H) style]

SLIDE 19
SLIDE 20

Decoupled adversarial image generation

M. Riviere, C. Couprie, Y. LeCun

Motivation:

  • Take advantage of white-background clothing datasets
  • Potentially avoid defects in generated shapes
  • Better enforce shape conditioning of generations
SLIDE 21

1024x1024 generations on the RTW dataset

Using Morgane Riviere's PyTorch implementation of Progressive Growing of GANs (Karras et al., ICLR '18), available online.

SLIDE 22

Decoupled architecture

[Diagram: a latent code z feeds a shape generator Gs (conditioned on shape and pose classes) and a texture generator Gt (conditioned on texture, color, shape and pose classes); their outputs are multiplied to form the generation. A shape discriminator Ds and a global discriminator Dg each compare against real images and output real/fake]
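Reading the diagram, the decoupled forward pass can be sketched as follows. This is an assumed interpretation, not the authors' code: `decoupled_generate` is a hypothetical name, and compositing onto a white background is my assumption, motivated by the white-background clothing datasets mentioned earlier.

```python
import numpy as np

def decoupled_generate(gs, gt, z):
    """Sketch of the decoupled generation step: a shape generator yields a
    soft mask, a texture generator yields an RGB texture, and the image is
    their product, composited here onto a white background (assumption)."""
    shape_mask = gs(z)   # (H, W, 1), values in [0, 1]
    texture = gt(z)      # (H, W, 3)
    return shape_mask * texture + (1.0 - shape_mask) * 1.0
```

Separating the mask from the texture is what lets the shape discriminator Ds supervise the silhouette independently of the global discriminator Dg.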

SLIDE 23

Random generations

Progressive growing vs. progressive growing with the decoupled architecture

SLIDE 24

Better class conditioning

Accuracy of classifiers trained on FashionGen Clothing and FashionGen Shoes, evaluated on our different models' generations (GAN-test metric): up to +12% and +14%.

Overall average improvement: 4.7%

SLIDE 25

Vector Image Generation by learning parametric layer decomposition

Sbai, Couprie, Aubry, arXiv, Dec. '18

Current deep generative models are great, but they are limited in resolution and in control over generations.

SLIDE 26

Related work

  • Yang, Kannan et al.: Layered Recursive GANs (LR-GAN), ICLR '17
  • Ganin et al.: SPIRAL, ICML '18

SLIDE 27

Our approach

Spoiler alert: yes, we can generate sharper images; this is just an example.

SLIDE 28

Our iterative pipeline for image reconstruction

  • Iterative generation: x_t = G(z, x_{t-1})
  • Vectorized mask generation: m_t(x, y) = f(x, y, p_t)
  • Alpha blending: x_t(x, y) = x_{t-1}(x, y) (1 − m_t(x, y)) + c_t m_t(x, y)

where p_t are the parameters of mask t and c_t its color.
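The alpha-blending recursion above is easy to sketch. The blending step follows the slide's equation; the Gaussian mask is only an illustrative stand-in for the learned parametric masks (`blend_layer` and `gaussian_mask` are hypothetical names).

```python
import numpy as np

def blend_layer(canvas, mask, color):
    """One alpha-blending step of the layered decomposition:
    x_t = x_{t-1} * (1 - m_t) + c_t * m_t, applied per pixel.
    canvas: (H, W, 3), mask: (H, W, 1) in [0, 1], color: (3,)."""
    return canvas * (1.0 - mask) + color * mask

def gaussian_mask(h, w, cx, cy, sigma):
    """A toy parametric mask. Because the mask is a function of continuous
    coordinates, it can be sampled at any output resolution, which is what
    makes the decomposition "vectorized"."""
    ys, xs = np.mgrid[0:h, 0:w]
    m = np.exp(-((xs / w - cx) ** 2 + (ys / h - cy) ** 2) / (2 * sigma ** 2))
    return m[..., None]
```

Iterating `blend_layer` over the sequence of (mask, color) pairs reconstructs the image layer by layer.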
SLIDE 29

Our iterative pipeline for image generation

SLIDE 30

Training criteria

  • Adversarial criterion: Wasserstein loss with Gradient Penalty (WGAN-GP), Gulrajani et al., NIPS '17
  • Our generator loss in the GAN setting
  • Our generator loss in the image reconstruction setting
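For the adversarial criterion, here is a minimal WGAN-GP sketch. To keep it self-contained it uses a linear critic f(x) = w·x, whose input gradient is w everywhere, so the gradient penalty has a closed form; real implementations evaluate the penalty on random interpolates between real and fake samples with autograd, as in Gulrajani et al.

```python
import numpy as np

def wgan_gp_critic_loss(critic_w, real, fake, lam=10.0):
    """WGAN-GP critic loss for a linear critic f(x) = w . x.
    Minimizes E[f(fake)] - E[f(real)] plus a penalty pushing the critic's
    input-gradient norm (here simply ||w||) toward 1."""
    wasserstein = (fake @ critic_w).mean() - (real @ critic_w).mean()
    gp = lam * (np.linalg.norm(critic_w) - 1.0) ** 2
    return wasserstein + gp
```

The penalty enforces the 1-Lipschitz constraint the Wasserstein formulation requires, which is what makes the loss a usable training signal.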

SLIDE 31

Results using an l1 reconstruction loss

SLIDE 32

Results using an l1 reconstruction loss

Target vs. our iterative reconstruction

SLIDE 33

Applications

Editing the original image using extracted masks, by performing local modifications of luminosity (top), or color modification using a blending of masks (bottom).

SLIDE 34

Applications

Image editing using masks: using chosen extracted mask(s) from image reconstruction, we apply object removal (top) and color modifications with object removal (bottom).

SLIDE 35

Applications

Image vectorization: reconstruction results on MNIST images. Our model learns a vectorized mask representation of digits that can be generated at any resolution without interpolation artifacts.

SLIDE 36

Baselines

  • 1/ MLP baseline: z → MLP → ResNet → RGB image

SLIDE 37

Baselines

  • 1/ MLP baseline
  • 2/ ResNet baseline: z → ResNet → RGB image

SLIDE 38

Baselines

  • 1/ MLP baseline
  • 2/ ResNet baseline
  • 3/ MLP-xy baseline: the MLP also receives the (x, y) pixel coordinates

SLIDE 39

Comparative results

  • Using an l1 reconstruction loss
  • Parameters for our approach: 20 masks, p of size 10, c of size 3: 260 parameters
  • Baselines: size of the latent code z = 20 × 13 = 260

SLIDE 40

GAN results

CelebA generations trained on 64x64 images, sampled at 256x256.
CIFAR10 generations trained on 32x32 images, sampled at 256x256.

SLIDE 41

GAN results on ImageNet

Trained on 64x64 images, sampled at 1024x1024.

SLIDE 42

Results

Target vs. result with a perceptual loss

SLIDE 43

Conclusion and future work

  • Faster training
  • Use class conditioning
  • Texture image generation

SLIDE 44

Predicting next frames in videos

Michael Mathieu, Camille Couprie, Yann LeCun, ICLR '16

4 input images → our 2 predictions

SLIDE 45

Deep multi-scale video prediction beyond mean square error

  • Result with a simple convolutional network trained by minimizing an l2 loss
  • Our result using:
    • a multi-scale architecture
    • an image gradient difference loss
    • adversarial training
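The image gradient difference loss (GDL) from this work can be sketched directly: it compares the spatial gradients of the prediction and target rather than raw intensities, which is what sharpens predictions relative to a plain l2 loss. A minimal numpy version (`gradient_difference_loss` is a hypothetical name):

```python
import numpy as np

def gradient_difference_loss(pred, target, alpha=1.0):
    """GDL: penalize mismatch between the absolute spatial gradients of the
    predicted and target frames. Invariant to constant intensity shifts."""
    dy_p, dx_p = np.abs(np.diff(pred, axis=0)), np.abs(np.diff(pred, axis=1))
    dy_t, dx_t = np.abs(np.diff(target, axis=0)), np.abs(np.diff(target, axis=1))
    return (np.abs(dy_p - dy_t) ** alpha).sum() + (np.abs(dx_p - dx_t) ** alpha).sum()
```

A uniformly blurred prediction has weaker gradients than the target, so the GDL penalizes it even where the l2 error is small.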
SLIDE 46

Predicting deeper into the future of semantic segmentation

P. Luc, N. Neverova, C. Couprie, J. Verbeek, Y. LeCun, ICCV '17

4 input images → our 2 predictions

  • Predictions in the RGB space quickly become blurry, despite previous attempts
  • Idea: predict in the space of semantic segmentation
SLIDE 47

Approach: predicting deeper into the future

Three setups (same color = shared weights):

  • Single time-step: inputs S_{t-3}, ..., S_t predict S_{t+1}, with loss L_{t+1}
  • Autoregressive model: each prediction S_{t+1}, S_{t+2}, S_{t+3} is fed back as an input, with losses L_{t+1}, L_{t+2}, L_{t+3}
  • Batch model: S_{t+1}, S_{t+2}, S_{t+3} are predicted jointly

The autoregressive model is either:

  • used for inference without additional training (w.r.t. the single time-step model): AR
  • fine-tuned using BPTT: AR fine-tune

The autoregressive mode is only possible for X2X, S2S, XS2XS.
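The autoregressive rollout described above amounts to a simple feedback loop: predict one step, append the prediction to the input window, repeat. A minimal sketch, where `model` stands in for the trained single time-step network (`autoregressive_rollout` is a hypothetical name):

```python
import numpy as np

def autoregressive_rollout(model, frames, n_future):
    """Autoregressive inference: slide a window over past frames, predict the
    next one, and feed each prediction back as an input."""
    window = list(frames)
    preds = []
    for _ in range(n_future):
        nxt = model(window[-4:])  # the approach uses 4 input frames
        preds.append(nxt)
        window.append(nxt)        # prediction becomes an input
    return preds
```

Fine-tuning with BPTT simply backpropagates through this same loop so the model learns to cope with its own predictions as inputs.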

SLIDE 48

Some results

Baselines:

  • Copy the last input frame to the output
  • Estimate flow between the two last inputs, and project the last input forward using the flow

Performance measure (mean IoU) of our approach and baselines on the Cityscapes dataset.

Flow baseline vs. our autoregressive fine-tuned result
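The flow baseline can be sketched as a forward warp: each pixel of the last input is moved along its estimated flow vector. This toy version (`warp_forward` is a hypothetical name) uses integer displacements for simplicity; real implementations use sub-pixel flow with bilinear interpolation.

```python
import numpy as np

def warp_forward(frame, flow):
    """Project `frame` forward using a per-pixel flow field.
    frame: (H, W), flow: (H, W, 2) giving (dy, dx) per pixel."""
    h, w = frame.shape[:2]
    out = np.zeros_like(frame)
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y, x]
            ny, nx = y + int(dy), x + int(dx)
            if 0 <= ny < h and 0 <= nx < w:  # drop pixels leaving the frame
                out[ny, nx] = frame[y, x]
    return out
```

This baseline captures rigid motion well but cannot anticipate appearance changes, which is where learned forecasting gains its mean-IoU advantage.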

SLIDE 49

Instance-level segmentation: Mask R-CNN

  • Extends Faster R-CNN [Ren et al. '15] by adding a branch for predicting an object mask in parallel with the existing branch for bounding-box recognition
  • K. He, G. Gkioxari, P. Dollar, R. Girshick, '17

SLIDE 50

Predicting future segmentation instances by forecasting convolutional features

P. Luc, C. Couprie, Y. LeCun, J. Verbeek, ECCV '18

[Detections overlaid on a street scene: cars, bicycles, and pedestrians with confidence scores]

Luc, Neverova et al., ICCV '17 vs. F2F predictions

SLIDE 51

Conclusions

Some open problems:

  • Automatic metrics to evaluate generative models' performance
  • Non-deterministic training losses for future prediction

Torch code online:

  • Future video prediction:
    • of RGBs: on Michael Mathieu's github
    • of semantic segmentations: on Pauline Luc's github
    • of instance segmentations: on Pauline Luc's github
  • Vector image generation: available soon on Othman Sbai's github