Image Analysis and Synthesis Deep Learning Use cases at technicolor
Louis Chevallier Principal Scientist, Research and Innovation technicolor.com GTC Munich - October 2017
PROVIDING THE BUSINESS A COMPETITIVE EDGE
PROPOSE NEW TOOLS TO ANALYSE, PROCESS, REPRESENT AND RENDER AUDIOVISUAL CONTENT
OFFER PROFESSIONALS AND END-USERS SOLUTIONS TO AMPLIFY THEIR IMMERSIVE EXPERIENCES
IMPROVE THE USER EXPERIENCE AT HOME WITH BETTER NETWORK, VIDEO SERVICES AND CONTEXTUAL SOLUTIONS
EXPLOIT DATA TO OPTIMIZE AND EXPAND OUR BUSINESS WITH AI-BASED INNOVATION
► Dailies and Color pipeline management
► VFX
► Marketing Services
► Sound Finishing
► Color Finishing (including IMAX theatrical and HDR for home)
► DVD manufacturing and distribution

► Dailies and Color pipeline management
► VFX
► Marketing Services
► Sound Finishing
► Color Finishing
► DVD manufacturing and Distribution

► Creative
► VFX
► Sound Finishing
► Color Finishing
► Immersive Experiences
► Original IP and production
► Asset creation
► Full servicing of film and television properties
► Full servicing including asset and level building
► Sound Finishing
► Packaged media manufacturing and distribution
► Immersive Experiences
► Packaged media manufacturing and distribution
Technicolor is a company working in the media and entertainment sector for filmmakers and advertisers. It acknowledges the outstanding performance of deep-learning-based solutions in computer vision.
Figure: degradation model relating the high-resolution image X (HR) to the low-resolution image Y (LR): geometric distortion, blur (lens PSF) and noise, subsampling, quantization.
Dong, Chao, et al. "Learning a deep convolutional network for image super-resolution." European Conference
Accuracy, 2X scale factor (PSNR in dB, Set5): Bicubic 33.19, Deep 37.80
Speed: HD (4K) image: about 1 sec with a GPU
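PSNR is the accuracy measure quoted above. As a reminder of what the figure means, here is a minimal sketch using the standard definition, PSNR = 10 log10(peak^2 / MSE). The function name `psnr`, the flat pixel lists and the 8-bit peak value of 255 are illustrative assumptions, not taken from the slides.

```python
import math

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and have the same length")
    # Mean squared error between the reference and the reconstructed pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy 8-pixel "images" (made-up values, for illustration only).
reference = [52, 55, 61, 59, 79, 61, 76, 61]
upscaled = [54, 55, 60, 59, 77, 62, 76, 60]
print(f"PSNR: {psnr(reference, upscaled):.2f} dB")
```

Because PSNR is a logarithmic function of the mean squared error, a gain of a few dB corresponds to a large reduction in pixel error.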
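Applying filters a, b conditionally: Y = a if K else b, implemented as Y = relu(a + K - 1) + relu(b - K). The identity holds for a binary mask K in {0, 1} and values a, b in [0, 1].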
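The ReLU expression above can be checked numerically. The sketch below is only a scalar illustration: the helper names `relu` and `select` are hypothetical, and it assumes K is a binary value and a, b lie in [0, 1], which is what the identity needs to hold.

```python
import math

def relu(x):
    """Rectified linear unit on a scalar."""
    return max(0.0, x)

def select(a, b, k):
    """Return a when k == 1 and b when k == 0, using only sums and ReLUs.

    Assumption: k is 0 or 1 and both a and b are in [0, 1].
    """
    return relu(a + k - 1) + relu(b - k)

# Exhaustive check over a few sample values inside [0, 1].
for k in (0, 1):
    for a, b in [(0.25, 0.75), (1.0, 0.0), (0.5, 0.5)]:
        expected = a if k else b
        assert math.isclose(select(a, b, k), expected)
```

Outside the assumed range the trick breaks: with a = 1.5 and K = 0, the first term no longer vanishes and leaks 0.5 into the output, so inputs have to be normalised first.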
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2016). Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2921-2929).
Figure: training pipeline (training images, select patch, train, model).
Figure panels: deep approach, ground truth, standard approach (bicubic).
Nice-looking results, but PSNR is no longer applicable as a quality measure.
Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." arXiv preprint arXiv:1609.04802 (2016).
Content loss, discriminator loss: « perceptual loss » != PSNR
Figure panels: bicubic, DNN (MSE), GAN, original.
Size = 4096
Classic audio spectrum feature: Mel-frequency cepstral coefficients (MFCC) + Delta + Delta2, size = 60 * 3 = 180
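To illustrate how 60 coefficients become a 180-dimensional vector, here is a minimal sketch that stacks a per-frame feature vector with its first and second time differences. The slides do not say how the deltas were computed. The central-difference scheme, the function names `deltas` and `stack_with_deltas`, and the placeholder input are all assumptions. Audio toolkits commonly use a regression over a wider window instead.

```python
def deltas(frames):
    """Central difference over time for a list of per-frame feature vectors.

    Edge frames are handled by replicating the first and last frame.
    """
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out

def stack_with_deltas(mfcc):
    """Concatenate static, delta and delta-delta coefficients frame by frame."""
    d1 = deltas(mfcc)
    d2 = deltas(d1)
    return [m + a + b for m, a, b in zip(mfcc, d1, d2)]

# Placeholder input: 5 frames of 60 made-up coefficients (not real MFCCs).
n_coeffs = 60
mfcc = [[float(t * c) for c in range(n_coeffs)] for t in range(5)]
features = stack_with_deltas(mfcc)
print(len(features), len(features[0]))  # prints "5 180"
```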
Predicted interestingness: 0.00040983, ground truth: not interesting
Predicted interestingness: 0.64466870, ground truth: interesting
A ranking based metric, used for MediaEval performance evaluation:
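The metric itself is not named on this line. As an illustration only, the sketch below assumes it is mean average precision (MAP), a common ranking-based measure. The function names and the toy scores and labels are hypothetical, apart from the two predicted values reused from the example above.

```python
def average_precision(scores, labels):
    """Average precision of a ranking: items sorted by decreasing score,
    precision accumulated at the rank of each relevant (label == 1) item."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(per_video):
    """Mean of the per-video average precisions; each entry is (scores, labels)."""
    aps = [average_precision(scores, labels) for scores, labels in per_video]
    return sum(aps) / len(aps)

# Toy ranking: the two extreme scores come from the example above,
# the two middle scores and all labels are made up.
scores = [0.64466870, 0.00040983, 0.30, 0.10]
labels = [1, 0, 0, 1]
print(f"AP: {average_precision(scores, labels):.4f}")
```

A ranking metric of this kind rewards placing interesting items near the top of the list, whatever the absolute values of the predicted scores.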
Fully convolutional neural network + post-filter
Figure: BD-rate comparison, DBF + SAO + ALF vs. CNN; image panels: boundaries, no filters, CNN.
Problem :
Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual losses for real-time style transfer and super-resolution." arXiv preprint arXiv:1603.08155 (2016).
An interesting tool for artists
How to control the output? How to evaluate?