Image-to-Image Translation with Conditional Adversarial Networks - PowerPoint PPT Presentation



slide-1
SLIDE 1

IMAGE-TO-IMAGE TRANSLATION WITH CONDITIONAL ADVERSARIAL NETWORKS

Yuanjie Lu 9th April

slide-2
SLIDE 2

What is image-to-image translation?

■ A concept in language can be expressed in Chinese, French, Italian, etc.
■ In the same way, a visual scene can be rendered as an RGB image, a gradient field, a boundary map, a semantic label map, etc.
■ For example, we can give the model an outline (edge map) as input and get a matching picture as output
■ Given enough training data, one representation of a scene can be translated into another

slide-3
SLIDE 3

Simple introduction to GAN

■ A generative adversarial network (GAN) consists of 2 important parts:
  – Generator: produces data (in most cases an image); its purpose is to fool the discriminator
  – Discriminator: its purpose is to pick out the "fake data" made by the generator
■ Algorithms:
  • GAN (Generative Adversarial Networks)
  • DCGAN (Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks)
  • CGAN (Conditional Generative Adversarial Nets)
  • CycleGAN (Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks)
  • CoGAN (Coupled Generative Adversarial Networks)
  • ProGAN (Progressive Growing of GANs for Improved Quality, Stability, and Variation)
  • WGAN (Wasserstein GAN)
slide-4
SLIDE 4

3 practical applications of GAN

slide-5
SLIDE 5

What is a conditional adversarial network?

■ Conditional generative adversarial network (c-GAN)
■ It learns not only the mapping from input image to output image, but also a loss function to train this mapping.
■ This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations.

slide-6
SLIDE 6

Why not use a CNN?

– Although CNNs give good results in image recognition, they are troublesome as generative models: we can't simply use Euclidean distance as the loss, because it is minimized by averaging all plausible outputs, which gives blurry results
– We need to tell the CNN what to train on and what to learn.
– Although the learning process is automatic, designing an effective loss still takes a lot of manual effort
– Overall, CNNs learn to minimize a loss function, an objective that scores the quality of results
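The point about Euclidean distance can be illustrated with a toy case. The sketch below is an illustration only, not code from the presentation: it assumes a single hypothetical pixel that is equally likely to be black (0.0) or white (1.0), and the function names are made up for this example. It searches for the single prediction with the lowest mean squared error.

```python
def mean_squared_error(prediction, targets):
    """Average squared (Euclidean) error of one prediction against all targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


def best_constant_prediction(targets, steps=100):
    """Grid-search the value in [0, 1] that minimizes the mean squared error."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda c: mean_squared_error(c, targets))


# Hypothetical pixel: equally likely to be black (0.0) or white (1.0).
plausible_values = [0.0, 1.0]
best = best_constant_prediction(plausible_values)
print(best)  # the minimizer is the average of the plausible values (mid-gray)
```

The lowest-error answer is a gray value that matches neither plausible output, which is the averaging effect that shows up as blur in generated images.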

slide-7
SLIDE 7

Why use c-GAN?

■ What is different:
  – The adversarial loss comes from a discriminator that classifies whether the output image is real or fake.
  – During training, the generator sees not only random noise but also the corresponding input image
  – Overall, with a c-GAN, output pictures are generated according to the input pictures.

slide-8
SLIDE 8

GAN? C-GAN?

■ GAN: G: z → y; c-GAN: G: x, z → y, where x is the input image, z is the random noise vector, and y is the output image.
■ A simple GAN generates an image from noise alone, and is trained until the discriminator cannot tell whether the image is generated or real.
■ A conditional GAN generates an image from given observed data (the input image), again so that the discriminator cannot tell it from a real one
■ An L1 loss is added as well (rather than L2), because L1 gives clearer, less blurry pictures
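For reference, a conditional GAN with an L1 term is commonly trained with an objective of the following form. This is a sketch in the slide's notation (x input image, z noise, y output image); the expectation notation and the weight λ on the L1 term are assumptions that do not appear on the slide.

```latex
\begin{aligned}
\mathcal{L}_{cGAN}(G, D) &= \mathbb{E}_{x,y}\left[\log D(x, y)\right]
  + \mathbb{E}_{x,z}\left[\log\left(1 - D(x, G(x, z))\right)\right] \\
\mathcal{L}_{L1}(G) &= \mathbb{E}_{x,y,z}\left[\lVert y - G(x, z) \rVert_{1}\right] \\
G^{*} &= \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
\end{aligned}
```

The discriminator D sees the input image x together with either the real output y or the generated output G(x, z), which is what makes the GAN conditional.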

slide-9
SLIDE 9

Network Architecture

■ Generator
  – Generator with skips
■ Discriminator
  – Markovian Discriminator (Patch-GAN)

slide-10
SLIDE 10

Why use a generator with skips?

■ The input and output images share the same underlying structure but differ in texture. In other words, the texture can be different, but the structure must be aligned
■ With only a plain encoder-decoder (for example a variational autoencoder, VAE), all information has to pass through every layer, including the bottleneck, so low-level detail shared by input and output can be lost along the way.
■ How to use skip connections?
  – For example, connect layer 1 with layer n, layer 2 with layer n-1, and so on.
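The pairing rule for skip connections can be sketched as follows. This is a minimal illustration that follows the slide's indexing (layer 1 with layer n, layer 2 with layer n-1); the function names are hypothetical, and feature maps are stood in for by plain Python lists of channel names.

```python
def skip_connection_pairs(num_layers):
    """Pair layer i with its mirror layer num_layers + 1 - i (1-based indices)."""
    pairs = []
    for i in range(1, num_layers // 2 + 1):
        mirror = num_layers + 1 - i
        if mirror - i > 1:  # adjacent layers are already connected directly
            pairs.append((i, mirror))
    return pairs


def concatenate_channels(decoder_features, encoder_features):
    """A skip connection concatenates encoder channels onto the decoder channels."""
    return decoder_features + encoder_features


print(skip_connection_pairs(8))  # [(1, 8), (2, 7), (3, 6)]
print(concatenate_channels(["d1", "d2"], ["e1", "e2"]))  # ['d1', 'd2', 'e1', 'e2']
```

Because the encoder features are concatenated rather than squeezed through the bottleneck, detail from early layers reaches the matching late layers directly.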

slide-11
SLIDE 11

Markovian Discriminator (Patch-GAN)

■ Although L1 and L2 losses tend to produce blurry images, they capture low-frequency structure well.
■ So when the L1 loss is used, we only need to design a loss that captures high-frequency information to get the effect we want.
■ That is why the authors chose Patch-GAN, which only judges structure at the scale of local patches
■ The smaller the patch, the fewer the parameters, so the model runs faster and can be applied to large images.
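The patch-based idea can be sketched in a few lines: each patch gets a real/fake score, and the scores are averaged into one decision for the image. This is a toy illustration, not the network from the presentation. In particular, `toy_score` is a hypothetical stand-in for a learned patch classifier, and the image is a plain list of rows.

```python
def patch_scores(image, patch_size, score_patch):
    """Score every patch_size x patch_size patch of a 2-D image (list of rows)."""
    height, width = len(image), len(image[0])
    scores = []
    for top in range(height - patch_size + 1):
        for left in range(width - patch_size + 1):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            scores.append(score_patch(patch))
    return scores


def patch_discriminator(image, patch_size, score_patch):
    """Average the per-patch real/fake scores into one decision for the image."""
    scores = patch_scores(image, patch_size, score_patch)
    return sum(scores) / len(scores)


def toy_score(patch):
    """Hypothetical scorer: 1.0 if the patch has local contrast, else 0.0."""
    values = [v for row in patch for v in row]
    return 1.0 if max(values) - min(values) > 0.5 else 0.0


checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
flat = [[0.5] * 8 for _ in range(6)]
print(patch_discriminator(checkerboard, 2, toy_score))
print(patch_discriminator(flat, 2, toy_score))
```

The same patch scorer runs unchanged on the 4x4 and the 6x8 image, which reflects why a patch-level discriminator can be applied to large images.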

slide-12
SLIDE 12

Optimization and inference

■ To optimize the generator, maximize log D(x, G(x, z)) rather than minimizing log(1 − D(x, G(x, z)))
■ Use minibatch SGD with the Adam solver
■ Learning rate = 0.0002
■ Momentum parameters β1 = 0.5, β2 = 0.999.
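Two points from this slide can be sketched in code. The first part compares the gradient of the two generator objectives with respect to the discriminator's logit, assuming D is a sigmoid of that logit. The second part is one update step of the Adam optimizer, whose parameters are the β1 and β2 listed above. The function names and the `eps` value are assumptions for illustration and do not come from the slides.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def grad_saturating(logit):
    """d/d(logit) of log(1 - D), the term the generator would minimize."""
    return -sigmoid(logit)


def grad_non_saturating(logit):
    """d/d(logit) of -log(D), i.e. maximizing log D instead."""
    return -(1.0 - sigmoid(logit))


def adam_step(param, grad, m, v, t, lr=0.0002, beta1=0.5, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; eps is an assumed default."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Early in training the discriminator rejects fakes confidently (low logit).
early_logit = -5.0
print(grad_saturating(early_logit), grad_non_saturating(early_logit))

new_param, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(new_param)
```

When the discriminator is confident that a sample is fake, the gradient of log(1 − D) is close to zero while the gradient of −log D stays close to −1, so the second form gives the generator a stronger learning signal early on.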

slide-13
SLIDE 13

Evaluating the pictures:

■ First, use Amazon Mechanical Turk (AMT) workers to judge whether images are real or fake
  – The authors argue that the ultimate goal of such an image generation task is to make it impossible for people to tell fake from real, so this is a fitting test
■ Second, the FCN-score
  – A classifier trained on real images is run on the generated images. If the generated images are realistic, a classifier that can classify real images should classify them correctly as well.
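Both evaluations reduce to simple ratios once the judgments are collected. The sketch below is an assumption about how such scores could be computed, not the authors' evaluation code: `fool_rate` takes hypothetical worker votes on generated images, and `per_pixel_accuracy` compares the label map a classifier predicts on a generated image with the label map the image was generated from.

```python
def fool_rate(votes):
    """Fraction of trials in which a worker labeled a generated image 'real'."""
    return sum(1 for vote in votes if vote == "real") / len(votes)


def per_pixel_accuracy(predicted_labels, true_labels):
    """Fraction of pixels whose predicted class matches the ground-truth class."""
    total = 0
    correct = 0
    for pred_row, true_row in zip(predicted_labels, true_labels):
        for pred, true in zip(pred_row, true_row):
            total += 1
            correct += int(pred == true)
    return correct / total


# Hypothetical data for illustration only.
votes = ["real", "fake", "fake", "real"]
true_map = [["road", "road"], ["car", "sky"]]
predicted_map = [["road", "road"], ["car", "road"]]

print(fool_rate(votes))                             # 0.5
print(per_pixel_accuracy(predicted_map, true_map))  # 0.75
```

A higher value is better in both cases: more workers fooled, or more pixels of the generated image recognized as the intended class.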

slide-14
SLIDE 14

Results

slide-15
SLIDE 15

Results

slide-16
SLIDE 16

RESULTS

slide-17
SLIDE 17

THANK YOU