Synthesis Deep Learning Use cases at technicolor Louis Chevallier - PowerPoint PPT Presentation

Image Analysis and Synthesis Deep Learning Use cases at technicolor Louis Chevallier Principal Scientist, Research and Innovation technicolor.com GTC Munich - October 2017

PROVIDING THE BUSINESS A COMPETITIVE EDGE • EXPLOIT DATA TO OPTIMIZE AND EXPAND OUR BUSINESS WITH AI-BASED INNOVATION • IMPROVE THE USER EXPERIENCE AT HOME WITH BETTER NETWORK, VIDEO SERVICES AND CONTEXTUAL SOLUTIONS • OFFER PROFESSIONALS AND END-USERS SOLUTIONS TO AMPLIFY THEIR IMMERSIVE EXPERIENCES • PROPOSE NEW TOOLS TO ANALYSE, PROCESS, REPRESENT AND RENDER AUDIOVISUAL CONTENT

POWERING PREMIUM CONTENT ACROSS MARKETS ► Original IP and production ► Asset creation ► Creative and level building ► Full servicing of film and television properties ► Full servicing including asset pipeline management ► VFX ► Dailies and Color ► Sound Finishing ► Color Finishing (including IMAX theatrical and HDR for home) ► Marketing Services ► Immersive Experiences ► Packaged media manufacturing and distribution ► DVD manufacturing and distribution

Impact of Deep Learning Technicolor is a company working in the media and entertainment sector for filmmakers and advertisers. We acknowledge the outstanding performance of Deep Learning based solutions in computer vision. • New functionalities emerge, requiring existing workflows to be revisited. • Higher performance calls for proper evaluation and metrics. • Deep learning specific requirements raise integration and deployment challenges.

Use cases • Video Enhancement: upscaling, denoising • Asset Management: indexing, retrieval, classification • Video Editing, Augmentation: style transfer, mono to stereo • CGI, Animation: video 2 animation • Video encoding: compression

UC#1 Super Resolution • Converting 2K images into 4K • Degradation model: X (HR) → geometric distortion → blur (lens PSF), noise → subsampling → quantization → Y (LR) • Solution: invert the distortions using a deep network, a stack of convolutional layers • Baseline: Dong, Chao, et al. "Learning a deep convolutional network for image super-resolution." European Conference on Computer Vision. Springer International Publishing, 2014.
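The degradation chain on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the blur is approximated by a box average (not a measured lens PSF), the geometric distortion step is left out, and the function name `degrade` and its parameters are hypothetical, not taken from the talk.

```python
import numpy as np

def degrade(hr, scale=2, noise_sigma=2.0, seed=0):
    """Simulate a low-resolution image Y from a high-resolution image X.

    Assumed steps: box blur + subsampling, additive Gaussian noise, 8-bit quantization.
    """
    rng = np.random.default_rng(seed)
    h, w = hr.shape
    # Crop so that both dimensions are multiples of the scale factor.
    h, w = h - h % scale, w - w % scale
    x = hr[:h, :w].astype(np.float64)
    # Blur and subsample in one step: average each scale x scale block.
    lr = x.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    # Sensor noise (assumed Gaussian).
    lr = lr + rng.normal(0.0, noise_sigma, lr.shape)
    # Quantization back to 8 bits.
    return np.clip(np.rint(lr), 0, 255).astype(np.uint8)

# Hypothetical input: a random grayscale image standing in for X (HR).
hr = np.random.default_rng(1).integers(0, 256, size=(64, 96)).astype(np.uint8)
lr = degrade(hr)
print(hr.shape, "->", lr.shape)
```

A super-resolution network is then trained to map such Y images back to X.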

Evaluation • Images captured at 2 different resolutions: LR, HR (x2) • Accuracy, 2X scale factor, PSNR (dB) on Set5: Bicubic 33.19, Deep 37.80 • Speed: HD (4K) image: about 1 sec with a GPU
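For reference, the PSNR figures above follow the usual definition, 10 · log10(peak² / MSE). A minimal sketch, assuming 8-bit images (peak value 255); the function name `psnr` is illustrative only.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

# Hypothetical example: an image and a copy that is off by one gray level everywhere.
ref = np.full((8, 8), 100, dtype=np.uint8)
est = ref + 1
print(round(psnr(ref, est), 2), "dB")
```

Higher is better, so the gap between the bicubic and deep results on Set5 is a gain of several dB.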

Approach • Knowledge Transfer • Applying filters a, b conditionally: Y = a if K else b, computed as Y = relu(a + K - 1) + relu(b - K), which is exact for K in {0, 1} and a, b in [0, 1] • Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2016). Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2921-2929).
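The conditional formula on this slide can be checked numerically. The sketch below assumes a binary mask K and filter responses a, b scaled to [0, 1]; outside that range the identity no longer holds. The names `relu` and `select` are illustrative.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def select(a, b, k):
    """Return a where k == 1 and b where k == 0, using only additions and ReLUs."""
    # k == 1: relu(a) + relu(b - 1) = a + 0
    # k == 0: relu(a - 1) + relu(b) = 0 + b
    return relu(a + k - 1.0) + relu(b - k)

rng = np.random.default_rng(0)
a = rng.random((4, 4))  # assumed in [0, 1)
b = rng.random((4, 4))  # assumed in [0, 1)
k = rng.integers(0, 2, size=(4, 4)).astype(np.float64)  # binary mask

y = select(a, b, k)
print(np.allclose(y, np.where(k == 1, a, b)))
```

Writing the switch this way keeps it inside the network as ordinary layers, so it stays differentiable with respect to a and b.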

Approach • Selective Sampling: training images → select patch → train model • And better loss, wide receptive field
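The slide does not say how patches are selected. As one possible illustration, the sketch below keeps the candidate patches with the highest pixel variance, on the assumption that textured patches are more informative for super resolution than flat ones. The criterion, the function name `sample_patches` and all parameters are hypothetical.

```python
import numpy as np

def sample_patches(image, patch=16, n_candidates=200, n_keep=32, seed=0):
    """Draw random patches and keep the n_keep with the highest variance (assumed criterion)."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    ys = rng.integers(0, h - patch + 1, size=n_candidates)
    xs = rng.integers(0, w - patch + 1, size=n_candidates)
    patches = np.stack([image[y:y + patch, x:x + patch] for y, x in zip(ys, xs)])
    # Score each candidate by its pixel variance.
    scores = patches.reshape(n_candidates, -1).var(axis=1)
    keep = np.argsort(scores)[::-1][:n_keep]
    return patches[keep]

# Hypothetical training image: flat on the left, textured on the right.
image = np.zeros((64, 128))
image[:, 64:] = np.random.default_rng(1).random((64, 64))
selected = sample_patches(image)
print(selected.shape)
```

The selected patches would then feed the training loop in place of uniformly sampled ones.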

Visual comparison • Ground truth • Deep approach • Standard approach (Bicubic)

High order loss, GAN • « Perceptual loss » != PSNR: content loss + discriminator loss • Results compared: bicubic, DNN (MSE), GAN, original • GAN output is nice looking, but PSNR is no longer applicable • Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." arXiv preprint arXiv:1609.04802 (2016).
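How the two loss terms combine can be sketched as follows. The weighting of 1e-3 on the adversarial term follows Ledig et al.; using pixel-wise MSE as the content loss is a simplification for this sketch (the paper also uses a VGG feature based content loss), and the function name and arguments are hypothetical.

```python
import numpy as np

def perceptual_loss(sr, hr, d_prob_sr, adv_weight=1e-3):
    """Content loss plus weighted adversarial loss for the generator.

    sr, hr     : super-resolved and ground-truth images
    d_prob_sr  : discriminator probabilities that the sr images are real
    """
    sr = np.asarray(sr, dtype=np.float64)
    hr = np.asarray(hr, dtype=np.float64)
    # Content term: pixel-wise MSE (assumption made for this sketch).
    content = np.mean((sr - hr) ** 2)
    # Adversarial term: -log D(G(y)), small when the discriminator is fooled.
    probs = np.clip(np.asarray(d_prob_sr, dtype=np.float64), 1e-12, 1.0)
    adversarial = -np.mean(np.log(probs))
    return float(content + adv_weight * adversarial)

# Hypothetical values: a perfect reconstruction that the discriminator still doubts.
hr_img = np.ones((4, 4))
loss = perceptual_loss(hr_img, hr_img, d_prob_sr=np.array([np.exp(-1.0)]))
print(loss)
```

Because the adversarial term rewards plausibility and not pixel fidelity, a lower loss no longer implies a higher PSNR, which is the point made on the slide.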

UC#2 Video Classification • Problem: predicting interestingness • Which one is more interesting?

Application • Media file search & browse • Advertisement • Filtering and summarization • E-learning

Approach • From image / frame: CNN feature, the output coefficients of a dense layer (fc7) of a CNN model (CaffeNet), size = 4096 • From audio: MFCC feature, a classic audio spectrum feature: Mel-frequency cepstral coefficients + Delta + Delta 2, size = 60 * 3 = 180
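The feature sizes above can be illustrated with placeholder data. The sketch makes several assumptions that the slide does not state: deltas are computed as simple finite differences over time, audio frames are mean-pooled, and the visual and audio vectors are fused by concatenation. Random arrays stand in for real fc7 and MFCC outputs.

```python
import numpy as np

def add_deltas(mfcc):
    """Append first and second temporal differences to an (n_frames, n_coeffs) MFCC matrix."""
    delta = np.gradient(mfcc, axis=0)    # Delta (finite difference, assumed)
    delta2 = np.gradient(delta, axis=0)  # Delta 2
    return np.concatenate([mfcc, delta, delta2], axis=1)

rng = np.random.default_rng(0)
mfcc = rng.normal(size=(100, 60))  # placeholder: 100 audio frames, 60 coefficients
fc7 = rng.normal(size=4096)        # placeholder: one CaffeNet fc7 vector

audio_frames = add_deltas(mfcc)            # 60 * 3 = 180 values per frame
audio_feature = audio_frames.mean(axis=0)  # mean pooling over time (assumed)
clip_feature = np.concatenate([fc7, audio_feature])  # fusion by concatenation (assumed)
print(audio_frames.shape, clip_feature.shape)
```

A classifier trained on such vectors would then output the interestingness score.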

Results • Example 1: predicted interestingness 0.00040983, ground truth: not interesting • Example 2: predicted interestingness 0.64466870, ground truth: interesting

Evaluation • Datasets: Flickr (≈200000, balanced, patented API); MediaEval (≈5000, unbalanced, human annotation) • The Mean Average Precision (MAP) metric: a ranking-based metric, used for MediaEval performance evaluation • Depth of the network • Adversarial content is possible • What about robustness against adversarial examples?
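As a reminder of how the MAP metric works, the sketch below computes average precision for one ranked list and averages it over several lists. It uses the standard definition (mean of the precision values at the rank of each relevant item). Grouping the data into one ranked list per video is an assumption here, and the function names are illustrative.

```python
import numpy as np

def average_precision(scores, labels):
    """Average precision of a ranking: labels are 1 (interesting) or 0 (not interesting)."""
    order = np.argsort(-np.asarray(scores, dtype=np.float64))
    rel = np.asarray(labels)[order].astype(np.float64)
    if rel.sum() == 0:
        return 0.0  # no relevant item in this list
    # Precision after each rank position, counted only where a relevant item appears.
    precision_at_k = np.cumsum(rel) / np.arange(1, rel.size + 1)
    return float((precision_at_k * rel).sum() / rel.sum())

def mean_average_precision(runs):
    """Mean of the average precision over several (scores, labels) lists."""
    return float(np.mean([average_precision(s, l) for s, l in runs]))

# Hypothetical predictions for two videos.
runs = [
    ([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]),  # relevant items at ranks 1 and 3
    ([0.2, 0.9, 0.4], [0, 1, 0]),          # relevant item ranked first
]
print(mean_average_precision(runs))
```

Since only the ordering of the scores matters, MAP is insensitive to their absolute scale, which suits an unbalanced dataset such as MediaEval.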

UC#3 3D Animation Speeding up the production of animated movies

Video 2 Animation • Sketching animations starting from video • Joint coordinates are extracted from images using plausible motions

Integration • Animation is carried out by highly skilled artists with specialized GUIs • Need to devise a new user/machine interface • Mixing rig controllers and learnt manifold: compatibility

UC#4 Post-filters in future video codec • Pipeline: encoder → decoder → post-filter • Post-filter: fully convolutional neural network + boundaries • Results (BD-rate): DBF + SAO + ALF -3.2%, CNN -4.91%

Results • Visual comparison: CNN vs. no filters • DNN has a high computation cost

UC#5 Style Transfer A new image editing tool

Approach • Baseline: Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual losses for real-time style transfer and super-resolution." arXiv preprint arXiv:1603.08155 (2016). • Problems: flickering, stability over time; speed
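The baseline's style term is built on Gram matrices of feature maps. A minimal sketch, assuming features of shape (channels, height, width); random arrays stand in for real network activations, and the function names are illustrative.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a (C, H, W) feature map, normalized by C*H*W."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def style_loss(feat_out, feat_style):
    """Squared Frobenius norm of the difference between the two Gram matrices."""
    diff = gram_matrix(feat_out) - gram_matrix(feat_style)
    return float(np.sum(diff ** 2))

rng = np.random.default_rng(0)
feat_style = rng.normal(size=(8, 16, 16))  # placeholder style features
# Shuffling spatial positions changes the layout but not the Gram matrix.
perm = rng.permutation(16 * 16)
feat_shuffled = feat_style.reshape(8, -1)[:, perm].reshape(8, 16, 16)
print(style_loss(feat_shuffled, feat_style))
```

The Gram matrix discards spatial layout, so the loss captures texture statistics only. Nothing in it ties one frame to the next, which is consistent with the flickering problem listed above.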

UC#5 Style Transfer An interesting tool for artists How to control the output, how to evaluate?

Conclusion • DL allows substantial improvements in several applications • Many new deep learning based tools are emerging and more are still to be discovered • Existing workflows have to be adapted • Integration challenges need to be solved

Thanks
