A 3D-Advert Creation System for Product Placements
ADAPT SFI Research Centre, Trinity College Dublin, Ireland
- Common contextual advertising platforms use information provided
by users to integrate 2D visual ads into videos.
- Existing platforms face many technical challenges, such as integrating
ads with respect to occluding objects and placing ads in 3D.
- The growing demand for video and the increase in user-generated videos
create additional challenges for advertising and marketing agencies.
Introduction
1
- A 3D-advertisement creation system that can automatically analyze the
different depth layers in a video sequence and seamlessly integrate new
3D objects with proper occlusion handling.
Contribution
2
Advert’s Workflow
3
[Workflow diagram. The input video feeds three parallel layers:
- Object/advert layer: mono depth estimation, plane & camera tracking, plane localization, 3D model render.
- Occluding foreground layer: interactive rough segmentation mask, rough segmentation mask propagation, foreground matting, foreground colours.
- Background layer: background plate reconstruction.
The three layers are composited into the output.]
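The workflow above can be sketched as a small orchestration loop. All function names below are illustrative placeholders, not the system's actual API; they only mirror the three layers in the diagram.

```python
# Sketch of the advert-insertion pipeline from the workflow diagram.
# Every function here is an illustrative stub, not the system's real API.

def object_advert_layer(frame):
    # mono depth estimation -> plane & camera tracking ->
    # plane localization -> 3D model render
    return {"layer": "advert", "frame": frame}

def occluding_foreground_layer(frame):
    # interactive rough segmentation -> mask propagation ->
    # foreground matting -> foreground colours
    return {"layer": "foreground", "frame": frame}

def background_layer(frame):
    # background plate reconstruction (video inpainting)
    return {"layer": "background", "frame": frame}

def composite(frame):
    # Stack back-to-front: background plate, rendered advert,
    # then the matted occluders on top.
    return [background_layer(frame), object_advert_layer(frame),
            occluding_foreground_layer(frame)]

video = [f"frame_{i}" for i in range(3)]
output = [composite(f) for f in video]
```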
Aim: Monocular depth estimation is used to understand the 3D geometry of the scene and to anchor the 3D plane on which the object will be placed.
Monocular Depth Estimation*
4
* Hu, J., Ozay, M., Zhang, Y. and Okatani, T., 2019. Revisiting single image depth estimation: Toward higher resolution maps with accurate object boundaries. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV) (pp. 1043-1051).
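A minimal sketch of how a depth map can anchor a 3D plane: back-project a few pixels through a pinhole camera model and fit a plane by least squares. The intrinsics and depth values below are made up for illustration; the technique is a standard one, not the exact code of the cited method.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at the given depth."""
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def fit_plane(points):
    """Least-squares plane through 3D points: returns (normal, centroid)."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                 # direction of least variance
    return normal, centroid

# Synthetic example: a flat surface at z = 2 m seen by a toy camera.
fx = fy = 500.0
cx = cy = 320.0
pixels = [(300, 300), (340, 300), (300, 340), (340, 340)]
pts = np.array([backproject(u, v, 2.0, fx, fy, cx, cy) for u, v in pixels])
normal, anchor = fit_plane(pts)
```

The recovered normal and centroid are enough to place a 3D plane (and the advert rendered onto it) in the scene's coordinate frame.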
Camera Tracking
5
Aim: Allow users to decide which part of the scene is causing the occlusion, giving broader control over tracking the occluding object across the entire video.
Interactive Segmentation*
6
* Oh, S.W., Lee, J.Y., Xu, N. and Kim, S.J., 2019. Fast user-guided video object segmentation by interaction-and-propagation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5247-5256).
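The real system relies on the interaction-and-propagation network cited above; the toy version below only illustrates the idea of carrying a user-edited mask forward to unedited frames, assuming pure global translation between frames.

```python
import numpy as np

def estimate_shift(prev, curr, max_shift=3):
    """Brute-force translation that best aligns prev to curr (SSD)."""
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(prev, dy, axis=0), dx, axis=1)
            err = np.sum((shifted - curr) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def propagate_mask(mask, dy, dx):
    # Carry the user's rough mask to the next frame with the same shift.
    return np.roll(np.roll(mask, dy, axis=0), dx, axis=1)

# Toy frames: a bright square that moves one pixel to the right.
prev = np.zeros((8, 8))
prev[2:5, 2:5] = 1.0
curr = np.roll(prev, 1, axis=1)
mask = (prev > 0).astype(np.uint8)
dy, dx = estimate_shift(prev, curr)
new_mask = propagate_mask(mask, dy, dx)
```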
Aim: Background reconstruction is used to recover the foreground layer and to produce the final composite image.
Background Reconstruction*
7
* Kim, D., Woo, S., Lee, J.Y. and Kweon, I.S., 2019. Deep Video Inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5792-5801)
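The production system uses the deep video inpainting method cited above; the sketch below substitutes a simple diffusion fill to show what recovering the background plate hidden by the foreground means.

```python
import numpy as np

def inpaint(image, hole_mask, iters=200):
    """Fill pixels where hole_mask == 1 by iterative neighbour averaging
    (diffusion inpainting); a toy stand-in for deep video inpainting."""
    out = image.copy()
    out[hole_mask == 1] = out[hole_mask == 0].mean()   # crude initialisation
    for _ in range(iters):
        # 4-neighbour average, applied only inside the hole.
        avg = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[hole_mask == 1] = avg[hole_mask == 1]
    return out

# A smooth horizontal gradient with a square hole punched by the
# foreground object; diffusion recovers the gradient inside the hole.
bg = np.tile(np.linspace(0.0, 1.0, 16), (16, 1))
hole = np.zeros((16, 16), dtype=np.uint8)
hole[6:10, 6:10] = 1
observed = bg.copy()
observed[hole == 1] = 0.0
restored = inpaint(observed, hole)
```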
Aim: Reconstruct the transparency mask of the foreground occlusion layer. This is used to seamlessly recomposite fine details such as hair and effects such as motion blur.
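Recompositing with an alpha matte follows the standard compositing equation C = alpha * F + (1 - alpha) * B; a minimal sketch with dummy images:

```python
import numpy as np

def composite(alpha, fg, bg):
    """alpha in [0, 1], one value per pixel; fg/bg are HxWx3 images.
    Soft alpha values along the boundary are what let fine details such
    as hair and motion blur blend seamlessly into the new composite."""
    a = alpha[..., None]            # broadcast over the colour channels
    return a * fg + (1.0 - a) * bg

h, w = 4, 4
fg = np.ones((h, w, 3))             # white foreground (dummy)
bg = np.zeros((h, w, 3))            # black background plate (dummy)
alpha = np.zeros((h, w))
alpha[:, :2] = 1.0                  # fully opaque foreground
alpha[:, 2] = 0.5                   # soft transition column
out = composite(alpha, fg, bg)
```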
Foreground Matting
8
[Matting inputs and output: frame, rough mask, background, detailed mask.]
- Introducing a Background-Aware Generative Adversarial Network to
estimate alpha channels.
- Unlike conventional methods, this architecture is designed to accept a
7-channel input volume: the first 3 channels contain the RGB image, the
next 3 channels contain the RGB background, and the last channel
contains the trimap.
- Preliminary experiments with the trained model indicate a significant
improvement in the accuracy of the alpha mattes compared to the state of
the art.
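The 7-channel input volume described above can be assembled directly; the shapes and values below are dummies, only the channel layout comes from the text:

```python
import numpy as np

h, w = 8, 8
image = np.random.rand(h, w, 3)        # RGB frame
background = np.random.rand(h, w, 3)   # RGB background plate
trimap = np.zeros((h, w, 1))           # 0 = bg, 0.5 = unknown, 1 = fg
trimap[2:6, 2:6] = 0.5
trimap[3:5, 3:5] = 1.0

# Channels 0-2: RGB image; 3-5: RGB background; 6: trimap.
volume = np.concatenate([image, background, trimap], axis=-1)
```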
Foreground Matting
9
* For more info please refer to: Javidnia, H. and Pitié, F., 2020. Background Matting. arXiv preprint arXiv:2002.04433.
Foreground Matting Evaluation
10
Test Set: 40 images from our synthetic and Adobe Matting datasets

References
1. N. Xu, B. Price, S. Cohen, and T. Huang, "Deep image matting," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2970–2979.
2. Y. Aksoy, T. Ozan Aydin, and M. Pollefeys, "Designing effective inter-pixel information flow for natural image matting," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 29–37.
3. H. Lu, Y. Dai, C. Shen, and S. Xu, "Indices Matter: Learning to Index for Deep Image Matting," arXiv preprint arXiv:1908.00672, 2019.
4. D. Cho, Y.-W. Tai, and I. Kweon, "Natural image matting using deep convolutional neural networks," in European Conference on Computer Vision, 2016, pp. 626–643.
5. A. Levin, D. Lischinski, and Y. Weiss, "A closed-form solution to natural image matting," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 2, pp. 228–242, 2007.
Model                 MSE       SAD      Gradient   Connectivity
Our Model             0.0272    7.31     10.57      6.52
Deep Matting [1]      0.0453    16.35    22.77      16.44
DCNN [4]              0.0788    19.66    28.91      20.47
IndexNet [3]          0.0500    14.04    20.09      13.27
InformationFlow [2]   0.0605    16.92    23.18      17.06
ClosedForm [5]        0.0687    17.75    28.29      18.12
(lower is better)
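The first two metrics in the table can be computed directly from a predicted and a ground-truth alpha matte. A sketch, assuming the conventional reporting of SAD in units of 1000 pixels; the gradient and connectivity errors are more involved and are omitted here:

```python
import numpy as np

def mse(pred, gt):
    # Mean squared error over the alpha matte.
    return np.mean((pred - gt) ** 2)

def sad(pred, gt):
    # Sum of absolute differences, reported in units of 1000 (assumed
    # convention from the matting literature).
    return np.sum(np.abs(pred - gt)) / 1000.0

# Toy mattes: a hard vertical edge, predicted with one soft column wrong.
gt = np.zeros((100, 100))
gt[:, 50:] = 1.0
pred = gt.copy()
pred[:, 50] = 0.5
print(mse(pred, gt), sad(pred, gt))
```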
Foreground Matting Evaluation
11
[Qualitative comparison of alpha mattes: [2], [3], ours, [1], [4].]
Demo System
12
- User Interface: Aurelia.js
- Front end: JS, HTML, CSS
- Back end: Python, Flask (Occlusion Service, Tracking Service, User Service)
- Rendering: WebGL, Three.js
- Misc: D3.js, Bootstrap, Font Awesome, Mousetrap, Popper.js, Shepherd.js, Whammy.js, ...
UI
13
[UI screenshots: panels A, B, C.]
UI – Camera Tracking
14
Processing Speed
15
Component                                              Time per frame   Processed frames
Depth Estimation                                       155ms            one frame
Camera Tracking                                        500-4000ms       all frames
Interactive Segmentation Edit                          10ms             selected frames
Interactive Segmentation Propagation                   23ms             occluded frames
Foreground/Background Layers (total incl. IO, ...)     ~1000ms          occluded frames
  > Background Reconstruction                          70ms             occluded frames
  > Foreground Matting (new)                           50ms             occluded frames
  > Foreground Colours                                 50ms             occluded frames
Final Compositing                                      IO bounded       all frames
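As a rough arithmetic check, the per-frame timings above combine into a per-clip estimate. The clip length, occluded-frame count, and the mid-range tracking time below are illustrative assumptions, not measurements:

```python
# Rough per-clip estimate from the processing-speed table.
# Assumed: a 100-frame clip with 40 occluded frames, and 1000 ms/frame
# for camera tracking (within its reported 500-4000 ms range).

frames_total = 100
frames_occluded = 40

depth = 155                        # ms, runs on one frame only
tracking = 1000 * frames_total     # ms, all frames
segmentation = 23 * frames_occluded
layers = 1000 * frames_occluded    # reconstruction + matting + colours, incl. IO

total_ms = depth + tracking + segmentation + layers
print(f"{total_ms / 1000:.1f} s")  # camera tracking dominates
```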
Result
16
Result
17
www.adaptcentre.ie
18