
Image-to-Image Translation with Conditional Adversarial Networks


  1. Image-to-Image Translation with Conditional Adversarial Networks. Yuanjie Lu, 9th April

  2. What is image-to-image translation?
     ■ It resembles language translation: a concept can be expressed in Chinese, French, Italian, etc.
     ■ Likewise, a visual scene can be rendered as an RGB image, a gradient field, a boundary map, a semantic label map, etc.
     ■ For example, given an outline as input, the model outputs a corresponding picture
     ■ Given enough training data, we can translate one representation of a scene into another

  3. Simple introduction to GAN
     ■ A generative adversarial network (GAN) consists of two important parts:
       – Generator: produces data (in most cases an image); its purpose is to fool the discriminator
       – Discriminator: its purpose is to pick out the "fake data" made by the generator
     ■ Algorithms:
       – GAN (Generative Adversarial Networks)
       – DCGAN (Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks)
       – CGAN (Conditional Generative Adversarial Nets)
       – CycleGAN (Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks)
       – CoGAN (Coupled Generative Adversarial Networks)
       – ProGAN (Progressive Growing of GANs for Improved Quality, Stability, and Variation)
       – WGAN (Wasserstein GAN)

  4. 3 practical applications of GAN

  5. What is a conditional adversarial network?
     ■ A conditional generative adversarial network (c-GAN)
     ■ It learns not only the mapping from input image to output image, but also a loss function to train this mapping
     ■ This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations

  6. Why not use a CNN alone?
     – Although CNNs perform well in image recognition, they cause trouble as generative models: we cannot simply use Euclidean distance as the loss, because minimizing it averages over all plausible outputs, which produces blurry results
     – We need to tell the CNN what to minimize
     – Although the learning process is automatic, designing an effective loss still takes a lot of manual effort
     – Overall, CNNs learn to minimize a loss function, an objective that scores the quality of results
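The trouble with a Euclidean loss can be seen in a toy case. The sketch below is an illustration only, not taken from the slides: it assumes two equally plausible sharp targets for the same input (a hypothetical 1-D "edge" at two different positions) and compares the expected squared error of a sharp guess with that of the pixel-wise average.

```python
import numpy as np

# Two equally plausible sharp targets for the same input (hypothetical toy data):
# an edge starting at column 2, or an edge starting at column 5.
target_a = np.array([0, 0, 1, 1, 1, 1, 1, 1], dtype=float)
target_b = np.array([0, 0, 0, 0, 0, 1, 1, 1], dtype=float)

def expected_l2(pred):
    """Expected squared Euclidean error when each target occurs with probability 0.5."""
    return 0.5 * np.sum((pred - target_a) ** 2) + 0.5 * np.sum((pred - target_b) ** 2)

# The pixel-wise average is a blurry ramp: 0.5 wherever the two targets disagree.
blurry = 0.5 * (target_a + target_b)

loss_sharp = expected_l2(target_a)   # commit to one sharp answer
loss_blurry = expected_l2(blurry)    # hedge between both answers

print("sharp guess :", loss_sharp)
print("blurry guess:", loss_blurry)
```

The blurry average scores better under the Euclidean loss than either sharp answer, even though neither target looks like it.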

  7. Why use c-GAN?
     ■ Differences:
       – The adversarial loss is itself learned: the discriminator classifies whether the output image is real or fake
       – During training, the generator sees not only random noise but also the corresponding input image
       – Overall, with a c-GAN, the generated pictures correspond to the input pictures

  8. GAN? c-GAN?
     ■ GAN: G: z → y; c-GAN: G: {x, z} → y, where x is the input image, z is a random noise vector, and y is the output image
     ■ A plain GAN generates an image from noise alone; the generator is trained so that the discriminator cannot tell whether an image is generated or real
     ■ A conditional GAN generates an image from observed data x; the output must be indistinguishable from a real image to a discriminator that also sees x
     ■ An L1 loss is used alongside the adversarial loss, because L1 yields sharper, less blurry images than L2
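For reference, the objective behind this slide can be written out as below. This follows the formulation in the pix2pix paper as best recalled; the weighting factor λ is a hyperparameter that the slide does not mention.

```latex
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\big[\log D(x, y)\big]
                         + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]

G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The discriminator D receives the input image x together with either the real output y or the generated output G(x, z), which is what makes the GAN conditional.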

  9. Network Architecture
     ■ Generator
       – Generator with skip connections
     ■ Discriminator
       – Markovian discriminator (Patch-GAN)

  10. Why use a generator with skip connections?
     ■ The input and output images share the same underlying structure but differ in surface appearance. In other words, textures can differ, but the structure must be aligned
     ■ With a plain encoder-decoder network, all information has to pass through every layer, including the bottleneck, so low-level information shared by input and output can be lost along the way
     ■ How to use skip connections?
       – For example, connect layer 1 with layer n, layer 2 with layer n-1, and so on
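The mirrored skip connections can be sketched in a few lines of NumPy. This is a shape-level illustration under stated assumptions, not the network from the slides: average pooling and nearest-neighbour upsampling stand in for learned convolutions, and the function names are hypothetical.

```python
import numpy as np

def down(x):
    """Halve spatial resolution with 2x2 average pooling (stand-in for a strided conv)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def up(x):
    """Double spatial resolution by nearest-neighbour repetition (stand-in for a transposed conv)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def encoder_decoder_with_skips(x, depth=3):
    # Encoder: remember the activation at every resolution before downsampling.
    skips = []
    for _ in range(depth):
        skips.append(x)
        x = down(x)
    # Decoder: mirror the encoder; the last encoder level pairs with the first
    # decoder level, and so on. Each skip concatenates along the channel axis.
    for skip in reversed(skips):
        x = up(x)
        x = np.concatenate([x, skip], axis=0)
    return x

image = np.random.default_rng(0).random((3, 16, 16))   # (channels, height, width)
out = encoder_decoder_with_skips(image)
print(out.shape)
```

Because the first encoder activation is concatenated at the last decoder level, the full-resolution input reaches the output without going through the bottleneck.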

  11. Markovian Discriminator (Patch-GAN)
     ■ Although L1 and L2 losses often produce blurry images, they capture low-frequency structure well
     ■ With the L1 term handling low frequencies, the discriminator only needs to capture high-frequency information to achieve the desired effect
     ■ So the authors chose Patch-GAN, a discriminator that judges each image patch as real or fake
     ■ The smaller the patch, the fewer the parameters, so the model runs faster and can be applied to large images
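The patch-level idea can be sketched as follows. This is a simplified illustration, not the actual discriminator: it assumes non-overlapping patches (a real Patch-GAN is convolutional, so its receptive fields overlap), and `toy_score` is a hypothetical stand-in for a learned patch classifier.

```python
import numpy as np

def patch_scores(image, patch, score_fn):
    """Score every non-overlapping patch x patch window; returns a 2-D grid of scores."""
    h, w = image.shape
    rows, cols = h // patch, w // patch
    grid = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            window = image[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            grid[i, j] = score_fn(window)
    return grid

def toy_score(window):
    """Hypothetical 'realness' score in (0, 1) based on local contrast only."""
    return 1.0 / (1.0 + np.exp(-(window.std() - 0.25) * 10.0))

rng = np.random.default_rng(0)
small = rng.random((64, 64))
large = rng.random((128, 128))

grid_small = patch_scores(small, 16, toy_score)
grid_large = patch_scores(large, 16, toy_score)

# The image-level decision is the average of all patch-level responses.
decision_small = grid_small.mean()
decision_large = grid_large.mean()
print(grid_small.shape, grid_large.shape)
```

The same patch scorer handles both image sizes unchanged; only the number of patches in the grid grows.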

  12. Optimization and inference
     ■ To optimize the generator, maximize log D(x, G(x, z)) rather than minimizing log(1 − D(x, G(x, z)))
     ■ Use minibatch SGD with the Adam solver
     ■ Learning rate = 0.0002
     ■ Momentum parameters β1 = 0.5, β2 = 0.999
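The preference for maximizing log D(x, G(x, z)) can be checked numerically. The sketch below is an assumption-laden illustration, not part of the slides: it writes the discriminator output as a sigmoid of a logit `a` and compares the gradient of each generator loss with respect to that logit when the discriminator confidently rejects a generated sample.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def grad_saturating(a):
    """d/da of log(1 - sigmoid(a)), the loss the generator would minimize originally."""
    return -sigmoid(a)

def grad_non_saturating(a):
    """d/da of -log(sigmoid(a)); minimizing this is the same as maximizing log D."""
    return -(1.0 - sigmoid(a))

# Early in training the discriminator easily rejects fakes: D is close to 0,
# which corresponds to a strongly negative logit.
a = -6.0
print("D(x, G(x, z))       :", sigmoid(a))
print("saturating grad     :", grad_saturating(a))
print("non-saturating grad :", grad_non_saturating(a))
```

With D near 0, the gradient of log(1 − D) almost vanishes, while the gradient of −log D stays close to −1, so the generator keeps receiving a useful training signal.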

  13. Evaluation
     ■ First, Amazon Mechanical Turk (AMT) workers judge whether images are real or fake
       – The authors argue that the ultimate goal of such image generation tasks is that people cannot tell whether an image is fake or real, so this is a suitable test
     ■ Second, FCN-score
       – A classifier is run on the generated image as if it were a real image. If the generated image is realistic, a classifier trained on real images should classify it correctly
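The scoring step of an FCN-score style evaluation might look like the sketch below. It assumes that a segmentation network has already predicted a label map for the generated image; the label maps here are hypothetical toy data, and per-pixel accuracy and mean class IoU are used as the comparison metrics.

```python
import numpy as np

def per_pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return float(np.mean(pred == truth))

def mean_class_iou(pred, truth, num_classes):
    """Intersection-over-union averaged over the classes present in either map."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, truth == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps; skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Ground-truth labels the generated image was synthesized from (toy example).
truth = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [2, 2, 1, 1],
                  [2, 2, 2, 2]])

# Labels the classifier predicts on the generated image: one pixel is wrong.
pred = truth.copy()
pred[0, 1] = 1

acc = per_pixel_accuracy(pred, truth)
miou = mean_class_iou(pred, truth, num_classes=3)
print(acc, miou)
```

Higher scores mean the classifier recognizes the generated image about as well as it would a real one.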

  14. Results

  15. Results

  16. Results

  17. Thank you
