

  1. ZOOM, ENHANCE, SYNTHESIZE! MAGIC UPSCALING AND MATERIAL SYNTHESIS USING DEEP LEARNING Tuesday, 9 May 2017 Andrew Edelsten - NVIDIA Developer Technologies

  2. DEEP LEARNING FOR ART
     Active R&D but ready now
     ▪ Style transfer
     ▪ Generative networks creating images and voxels
     ▪ Adversarial networks (DCGAN) – still early but promising
     ▪ DL & ML based tools from NVIDIA and partners: NVIDIA, Artomatix, Allegorithmic, Autodesk

  3. STYLE TRANSFER
     Something fun: doodle a masterpiece!
     ▪ Uses a CNN to take the "style" from one image and apply it to the "content" of another
     ▪ Sept 2015: A Neural Algorithm of Artistic Style by Gatys et al.
     ▪ Dec 2015: neural-style (GitHub)
     ▪ Mar 2016: neural-doodle (GitHub); texture-nets (GitHub)
     ▪ Oct 2016: fast-neural-style (GitHub)
     ▪ 2 May 2017 (last week!): Deep Image Analogy (arXiv)
     ▪ Also numerous services: Vinci, Prisma, Artisto, Ostagram

  4. http://ostagram.ru/static_pages/lenta

  5. STYLE TRANSFER
     Something useful
     ▪ Game remaster & texture enhancement
     ▪ Try Neural Style and use a real-world photo for the "style"
     ▪ For stylized or anime up-rez try https://github.com/nagadomi/waifu2x
     ▪ Experiment with art styles
     ▪ Dream or power-up sequences
     ▪ "Come Swim" by Kristen Stewart: https://arxiv.org/pdf/1701.04928v1.pdf

  6. GAMEWORKS: MATERIALS & TEXTURES
     Using DL for game development & content creation
     ▪ Set of tools targeting the game industry using machine learning and deep learning
     ▪ Launched at the Game Developers Conference in March; the tools run as a web service
     ▪ Sign up for the beta at: https://gwmt.nvidia.com
     ▪ Tools in this initial release: Photo to Material (2shot), Texture Multiplier, Super-Resolution

  7. PHOTO TO MATERIAL
     The 2Shot tool
     ▪ From two photos of a surface, generate a "material"
     ▪ Based on a SIGGRAPH 2015 paper by NVIDIA Research & Aalto University (Finland): "Two-Shot SVBRDF Capture for Stationary Materials", https://mediatech.aalto.fi/publications/graphics/TwoShotSVBRDF/
     ▪ Input is pixel-aligned "flash" and "guide" photographs: use a tripod and remote shutter or bracket, or align later
     ▪ Use for flat surfaces with repeating patterns

  8. MATERIAL SYNTHESIS FROM TWO PHOTOS
     [Figure: flash image + guide image in; material maps out: diffuse albedo, specular albedo, normals, glossiness, anisotropy]

  9. TEXTURE MULTIPLIER
     Organic variations of textures
     ▪ Put simply: texture in, new texture out
     ▪ Inspired by Gatys, Ecker & Bethge, "Texture Synthesis Using Convolutional Neural Networks": https://arxiv.org/pdf/1505.07376.pdf
     ▪ Artomatix offers a similar product, "Texture Mutation": https://artomatix.com/

  10. SUPER RESOLUTION

  11. SUPER RESOLUTION
     Zoom.. ENHANCE!
     "Can you zoom in on the license plate?" "Sure!" "Enhance that?" "OK!"

  12. SUPER RESOLUTION
     The task at hand: given a low-resolution image (W × H), upscale it (magic?) to construct a high-resolution image (nW × nH)

  13. UPSCALE: CREATE MORE PIXELS
     An ill-posed task?
     [Figure: the few known pixels of the given image scattered among the many unknown (?) pixels of the upscaled image]

  14. TRADITIONAL APPROACH
     ▪ Interpolation (bicubic, Lanczos, etc.)
     ▪ Interpolation + sharpening (and other filtration), e.g. filter-based sharpening
     ▪ Rough estimation of the data behavior → too general
     ▪ Too many possibilities: an 8×8 grayscale patch has 256^(8∗8) ≈ 10^154 pixel combinations!
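The traditional pipeline on the slide above can be sketched in a few lines of NumPy. This is a minimal stand-in, not the talk's code: nearest-neighbour interpolation plus a box-blur unsharp mask in place of bicubic/Lanczos and fancier filtration.

```python
import numpy as np

def upscale_nearest(img, n=2):
    """Nearest-neighbour interpolation: repeat each pixel n times per axis."""
    return np.repeat(np.repeat(img, n, axis=0), n, axis=1)

def sharpen(img, amount=0.5):
    """Unsharp mask: img + amount * (img - blur), a filter-based sharpening."""
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    # 3x3 box blur built from nine shifted views of the padded image
    blur = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return np.clip(img + amount * (img - blur), 0, 255)

lr = np.array([[10.0, 200.0], [200.0, 10.0]])
hr = sharpen(upscale_nearest(lr, 2))
print(hr.shape)   # (4, 4): 2x more pixels per axis, but no new detail
```

The result illustrates the slide's complaint: the filters are too general to invent plausible detail, they only redistribute what is already there.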

  15. A NEW APPROACH
     First: narrow the possible set
     [Venn diagram: all possible images ⊃ natural images ⊃ photos, textures]
     Focus on the domain of "natural images"

  16. A NEW APPROACH
     Second: place the image in the domain, then reconstruct
     ▪ Data from natural images is sparse; it is compressible in some domain
     ▪ Then "reconstruct" images (rather than create new ones): compress → reconstruct, using prior information and constraints

  17. PATCH-BASED MAPPING: TRAINING
     [Diagram: training images → (LR, HR) pairs of patches → training of model parameters → mapping from low-resolution patch to high-resolution patch]

  18. PATCH-BASED MAPPING
     [Diagram: LR patch x_L → Encode → high-level information about the patch → Decode → HR patch x_H]

  19. PATCH-BASED MAPPING: SPARSE CODING
     [Diagram: LR patch x_L → Encode → high-level information about the patch ("features" / sparse code) → Decode → HR patch x_H]

  20. PATCH FEATURES & RECONSTRUCTION
     An image patch can be reconstructed as a sparse linear combination of features; the features are learned from the dataset over time.
     x = Dα = d_1 α_1 + ⋯ + d_K α_K, where D is the dictionary, x the patch, and α the sparse code
     Example: x ≈ 0.8 · d_47 + 0.3 · d_53 + 0.5 · d_74
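The sparse-coding idea above is easy to make concrete. A minimal NumPy sketch, using a random dictionary (a real one would be learned) and the slide's three-atom example code:

```python
import numpy as np

rng = np.random.default_rng(0)
patch_dim, n_atoms = 64, 100              # 8x8 patches, 100 dictionary atoms
D = rng.normal(size=(patch_dim, n_atoms))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms d_k

alpha = np.zeros(n_atoms)                 # sparse code: mostly zeros
alpha[[47, 53, 74]] = [0.8, 0.3, 0.5]     # the three atoms from the slide

x = D @ alpha                             # patch = sparse linear combination
print(np.count_nonzero(alpha), x.shape)   # 3 (64,)
```

Finding the sparse code for a given patch (the "Encode" step) is the hard part in practice; here only the reconstruction direction x = Dα is shown.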

  21. GENERALIZED PATCH-BASED MAPPING
     [Diagram: LR patch → mapping to a high-level representation ("features") of the LR patch → mapping in feature space to a high-level representation of the HR patch → mapping to the HR patch]

  22. GENERALIZED PATCH-BASED MAPPING
     [Diagram: the same three mappings, now with trainable parameters W_1, W_2, W_3]

  23. MAPPING OF THE WHOLE IMAGE
     Using convolutions
     [Diagram: LR image → mapping → mapping in feature space → mapping → HR image, each stage implemented with convolutional operators]
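Sliding the per-patch mapping over the whole image is exactly what convolutions do. A single-channel NumPy sketch of the three-stage structure (the 9-1-5 kernel sizes follow the common SRCNN layout; the kernels here are fixed placeholders, whereas a real network learns many channels of them):

```python
import numpy as np

def conv2d(img, kernel):
    """'Same'-size 2-D correlation with edge padding (one channel)."""
    kh, kw = kernel.shape
    p = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2), mode="edge")
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(p[i:i + kh, j:j + kw] * kernel)
    return out

# Three stages, each one convolution (+ ReLU): the patch pipeline applied
# densely to every location of the image at once.
img = np.random.default_rng(1).random((16, 16))
feat = np.maximum(conv2d(img, np.full((9, 9), 1 / 81)), 0)   # feature extraction
mapped = np.maximum(conv2d(feat, np.full((1, 1), 1.0)), 0)   # mapping in feature space
hr = conv2d(mapped, np.full((5, 5), 1 / 25))                 # reconstruction
print(hr.shape)   # (16, 16)
```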

  24. AUTO-ENCODERS
     input → output ≈ input

  25. AUTO-ENCODER
     input → Encode → features → Decode → output ≈ input

  26. AUTO-ENCODER
     Parameters: W
     Inference: y = F_W(x)
     Training: W = argmin_W Σ_i Dist(x_i, F_W(x_i)), where {x_i} is the training set
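The training objective on this slide can be demonstrated end-to-end with the smallest possible auto-encoder: a linear encoder and decoder trained by plain gradient descent on mean squared distance. Everything here (sizes, learning rate, step count) is an illustrative choice, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))              # training set {x_i}, one row each
W1 = rng.normal(scale=0.1, size=(8, 3))    # encoder: 8 -> 3 bottleneck
W2 = rng.normal(scale=0.1, size=(3, 8))    # decoder: 3 -> 8

def loss(W1, W2):
    """Mean squared Dist(x_i, F_W(x_i)) over the training set."""
    R = X @ W1 @ W2 - X
    return np.mean(R ** 2)

first = loss(W1, W2)
lr = 0.01
for _ in range(500):                       # full-batch gradient descent
    R = X @ W1 @ W2 - X
    gW2 = (X @ W1).T @ R * (2 / R.size)    # dL/dW2
    gW1 = X.T @ (R @ W2.T) * (2 / R.size)  # dL/dW1
    W1 -= lr * gW1
    W2 -= lr * gW2

print(loss(W1, W2) < first)   # True: reconstruction error went down
```

The 3-wide bottleneck forces lossy compression, which is the next slide's point.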

  27. AUTO-ENCODER
     ▪ Our encoder is LOSSY by definition: input → Encode → information loss

  28. SUPER-RESOLUTION AUTO-ENCODER
     Parameters: W
     Inference: y = F_W(x)
     Training: W = argmin_W Σ_i Dist(x_i, F_W(x_i)), where {x_i} is the training set

  29. SUPER RESOLUTION AE: TRAINING
     [Diagram: ground-truth HR image x → Downscaling D → LR image x̂ → SR AE F_W → reconstructed HR image]
     Training: W = argmin_W Σ_i Dist(x_i, F_W(D(x_i))), where {x_i} is the training set
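A small NumPy sketch of the quantities in this training setup. The downscaling operator D and the distance are real; the network F_W is replaced by a nearest-neighbour upscaler purely so the pipeline runs, and `Dist` is taken as MSE.

```python
import numpy as np

def downscale(x, n=2):
    """D: average-pool by n; produces the LR input used during training."""
    h, w = x.shape
    return x.reshape(h // n, n, w // n, n).mean(axis=(1, 3))

def F_W(x_lr, n=2):
    """Stand-in for the SR auto-encoder; a trained network goes here."""
    return np.repeat(np.repeat(x_lr, n, axis=0), n, axis=1)

def dist(a, b):
    return np.mean((a - b) ** 2)

x = np.arange(16.0).reshape(4, 4)   # ground-truth HR image x_i
recon = F_W(downscale(x))           # F_W(D(x_i))
print(dist(x, recon))               # 4.25 -- the term training minimizes
```

Training then searches for the W that makes this distance small on average over the whole training set.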

  30. SUPER RESOLUTION AE: INFERENCE
     [Diagram: given LR image x̂ → SR AE F_W → constructed HR image]
     Inference: y = F_W(x̂)

  31. SUPER-RESOLUTION: ILL-POSED TASK?

  32. THE LOSS FUNCTION

  33. THE LOSS FUNCTION
     Measuring the "distance" from a good result
     ▪ The distance function is a key element in obtaining good results; the choice of loss function is an important decision
     W = argmin_W Σ_i Dist(x_i, F_W(x_i))

  34. LOSS FUNCTION: MSE
     Mean Squared Error: MSE = (1/N) ‖x − F(x)‖²

  35. LOSS FUNCTION: PSNR
     ▪ MSE (Mean Squared Error): (1/N) ‖x − F(x)‖²
     ▪ PSNR (Peak Signal-to-Noise Ratio): 10 · log10(MAX² / MSE)
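Both metrics are one-liners; a quick NumPy check with a constant-error example:

```python
import numpy as np

def mse(x, fx):
    """Mean squared error between target x and output F(x)."""
    return np.mean((x - fx) ** 2)

def psnr(x, fx, max_val=255.0):
    """PSNR = 10 * log10(MAX^2 / MSE), in decibels; higher is better."""
    return 10.0 * np.log10(max_val ** 2 / mse(x, fx))

x = np.zeros((4, 4))
fx = np.full((4, 4), 16.0)     # constant error of 16 gray levels -> MSE = 256
print(psnr(x, fx))             # approximately 24.05 dB
```

Note that PSNR is a monotone function of MSE, so on its own it ranks results exactly as MSE does; its value is the familiar dB scale.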

  36. LOSS FUNCTION: HFEN
     ▪ MSE (Mean Squared Error): (1/N) ‖x − F(x)‖²
     ▪ PSNR (Peak Signal-to-Noise Ratio): 10 · log10(MAX² / MSE)
     ▪ HFEN (High Frequency Error Norm, see Ref A): ‖HP(x − F(x))‖₂, where HP is a high-pass filter; a perceptual loss
     Ref A: http://ieeexplore.ieee.org/document/5617283/
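HFEN only looks at the error after high-pass filtering, so it ignores smooth discrepancies and focuses on lost detail. A simplified NumPy sketch, using a plain 3×3 Laplacian in place of the Laplacian-of-Gaussian kernel of the HFEN reference:

```python
import numpy as np

# Discrete Laplacian as the high-pass filter HP (a simplified stand-in
# for the Laplacian-of-Gaussian used in the cited paper).
LAP = np.array([[0.0,  1.0, 0.0],
                [1.0, -4.0, 1.0],
                [0.0,  1.0, 0.0]])

def conv2d(img, k):
    """'Same'-size 3x3 correlation with edge padding."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return np.array([[np.sum(p[i:i + 3, j:j + 3] * k) for j in range(w)]
                     for i in range(h)])

def hfen(x, fx):
    """||HP(x - F(x))||_2: error measured only in high-frequency detail."""
    return np.linalg.norm(conv2d(x - fx, LAP))

x = np.random.default_rng(0).random((8, 8))
print(hfen(x, x))        # 0.0: identical images have no high-frequency error
print(hfen(x, x + 5.0))  # also 0.0: a constant offset is purely low-frequency
```

The second print is the interesting one: a loss MSE would punish, HFEN ignores, which is why the two are combined rather than used alone.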

  37. REGULAR LOSS
     [Figure: 4x upscaling results trained with the regular loss]

  38. REGULAR LOSS + PERCEPTUAL LOSS
     [Figure: 4x upscaling results trained with regular + perceptual loss]

  39. WARNING… THIS IS EXPERIMENTAL!

  40. SUPER-RESOLUTION: GAN-BASED LOSS
     [Diagram: Generator F produces F(x); Discriminator D classifies real y vs. fake F(x)]
     GAN loss = −log D(F(x))
     Total loss = regular (MSE + PSNR + HFEN) loss + GAN loss
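The generator's GAN term is tiny in code. A hedged sketch of just the loss arithmetic (no networks); the `gan_weight` used to balance the two terms is an illustrative choice, not a value from the talk:

```python
import numpy as np

def gan_loss(d_of_fake):
    """Generator term -log D(F(x)); D outputs a probability in (0, 1)."""
    return -np.log(d_of_fake)

def combined_loss(regular, d_of_fake, gan_weight=1e-3):
    # regular = the combined MSE/PSNR/HFEN term from the previous slides.
    return regular + gan_weight * gan_loss(d_of_fake)

print(gan_loss(0.5))    # ~0.693: the discriminator is unsure about the fake
print(gan_loss(0.99))   # ~0.010: the generator is fooling the discriminator
```

The loss falls as D(F(x)) approaches 1, i.e. the generator is rewarded for outputs the discriminator accepts as real, which pushes it toward plausible high-frequency detail instead of the blurry MSE optimum.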

  41. QUESTIONS?
     ▪ Extended presentation from the Game Developers Conference 2017: https://developer.nvidia.com/deep-learning-games
     ▪ GameWorks: Materials & Textures: https://gwmt.nvidia.com
