A 3D-Advert Creation System for Product Placements
ADAPT SFI Research Centre, Trinity College Dublin, Ireland
- Common contextual advertising platforms use information provided
by users to integrate 2D visual ads into videos.
- Existing platforms face many technical challenges, such as integrating
ads with respect to occluding objects and placing ads in 3D.
- The growing demand for video and the increase in user-generated videos
create additional challenges for advertising and marketing agencies.
Introduction
1
- A 3D-advertisement creation system that can automatically analyze the
different depth layers in a video sequence and seamlessly integrate new
3D objects with proper occlusion handling.
Contribution
2
Advert’s Workflow
3
[Workflow diagram. The input video feeds three parallel layers:
- Object/advert layer: mono depth estimation, plane & camera tracking, plane localization, 3D model render.
- Occluding foreground layer: interactive rough segmentation mask, rough segmentation mask propagation, foreground matting, foreground colours.
- Background layer: background plate reconstruction.
The three layers are composited into the output.]
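The workflow above can be sketched as a small orchestration loop. All function names below are illustrative placeholders, not the system's actual API; they only mirror the three layers in the diagram.

```python
# Sketch of the advert-insertion pipeline from the workflow diagram.
# Every function here is an illustrative stub, not the system's real API.

def object_advert_layer(frame):
    # mono depth estimation -> plane & camera tracking ->
    # plane localization -> 3D model render
    return {"layer": "advert", "frame": frame}

def occluding_foreground_layer(frame):
    # interactive rough segmentation -> mask propagation ->
    # foreground matting -> foreground colours
    return {"layer": "foreground", "frame": frame}

def background_layer(frame):
    # background plate reconstruction (video inpainting)
    return {"layer": "background", "frame": frame}

def composite(frame):
    # Stack back-to-front: background plate, rendered advert,
    # then the matted occluders on top.
    return [background_layer(frame), object_advert_layer(frame),
            occluding_foreground_layer(frame)]

video = [f"frame_{i}" for i in range(3)]
output = [composite(f) for f in video]
```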
Aim: Monocular depth estimation is used to understand the 3D geometry of the scene and to anchor the 3D plane on which the object will be placed.
Monocular Depth Estimation*
4
* Hu, J., Ozay, M., Zhang, Y. and Okatani, T., 2019. Revisiting single image depth estimation: Toward higher resolution maps with accurate object boundaries. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV) (pp. 1043-1051).
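A minimal sketch of how a depth map can anchor a 3D plane: back-project a few pixels through a pinhole camera model and fit a plane by least squares. The intrinsics and depth values below are made up for illustration; the technique is a standard one, not the exact code of the cited method.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at the given depth."""
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def fit_plane(points):
    """Least-squares plane through 3D points: returns (normal, centroid)."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                 # direction of least variance
    return normal, centroid

# Synthetic example: a flat surface at z = 2 m seen by a toy camera.
fx = fy = 500.0
cx = cy = 320.0
pixels = [(300, 300), (340, 300), (300, 340), (340, 340)]
pts = np.array([backproject(u, v, 2.0, fx, fy, cx, cy) for u, v in pixels])
normal, anchor = fit_plane(pts)
```

The recovered normal and centroid are enough to place a 3D plane (and the advert rendered onto it) in the scene's coordinate frame.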
Camera Tracking
5
Aim: Allow users to decide which part of the scene is causing the occlusion, giving broader control over tracking the occluding object across the entire video.
Interactive Segmentation*
6
* Oh, S.W., Lee, J.Y., Xu, N. and Kim, S.J., 2019. Fast user-guided video object segmentation by interaction-and-propagation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5247-5256).
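The real system relies on the interaction-and-propagation network cited above; the toy version below only illustrates the idea of carrying a user-edited mask forward to unedited frames, assuming pure global translation between frames.

```python
import numpy as np

def estimate_shift(prev, curr, max_shift=3):
    """Brute-force translation that best aligns prev to curr (SSD)."""
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(prev, dy, axis=0), dx, axis=1)
            err = np.sum((shifted - curr) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def propagate_mask(mask, dy, dx):
    # Carry the user's rough mask to the next frame with the same shift.
    return np.roll(np.roll(mask, dy, axis=0), dx, axis=1)

# Toy frames: a bright square that moves one pixel to the right.
prev = np.zeros((8, 8))
prev[2:5, 2:5] = 1.0
curr = np.roll(prev, 1, axis=1)
mask = (prev > 0).astype(np.uint8)
dy, dx = estimate_shift(prev, curr)
new_mask = propagate_mask(mask, dy, dx)
```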
Aim: Background reconstruction is used to recover the foreground layer and to produce the final composite image.
Background Reconstruction*
7
* Kim, D., Woo, S., Lee, J.Y. and Kweon, I.S., 2019. Deep Video Inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5792-5801)
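The production system uses the deep video inpainting method cited above; the sketch below substitutes a simple diffusion fill to show what recovering the background plate hidden by the foreground means.

```python
import numpy as np

def inpaint(image, hole_mask, iters=200):
    """Fill pixels where hole_mask == 1 by iterative neighbour averaging
    (diffusion inpainting); a toy stand-in for deep video inpainting."""
    out = image.copy()
    out[hole_mask == 1] = out[hole_mask == 0].mean()   # crude initialisation
    for _ in range(iters):
        # 4-neighbour average, applied only inside the hole.
        avg = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[hole_mask == 1] = avg[hole_mask == 1]
    return out

# A smooth horizontal gradient with a square hole punched by the
# foreground object; diffusion recovers the gradient inside the hole.
bg = np.tile(np.linspace(0.0, 1.0, 16), (16, 1))
hole = np.zeros((16, 16), dtype=np.uint8)
hole[6:10, 6:10] = 1
observed = bg.copy()
observed[hole == 1] = 0.0
restored = inpaint(observed, hole)
```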
Aim: Reconstruct the transparency mask of the foreground occlusion layer. This is used to seamlessly recomposite fine details such as hair and effects such as motion blur.
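Recompositing with an alpha matte follows the standard compositing equation C = alpha * F + (1 - alpha) * B; a minimal sketch with dummy images:

```python
import numpy as np

def composite(alpha, fg, bg):
    """alpha in [0, 1], one value per pixel; fg/bg are HxWx3 images.
    Soft alpha values along the boundary are what let fine details such
    as hair and motion blur blend seamlessly into the new composite."""
    a = alpha[..., None]            # broadcast over the colour channels
    return a * fg + (1.0 - a) * bg

h, w = 4, 4
fg = np.ones((h, w, 3))             # white foreground (dummy)
bg = np.zeros((h, w, 3))            # black background plate (dummy)
alpha = np.zeros((h, w))
alpha[:, :2] = 1.0                  # fully opaque foreground
alpha[:, 2] = 0.5                   # soft transition column
out = composite(alpha, fg, bg)
```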
Foreground Matting
8
[Matting inputs and output: frame, rough mask, background, detailed mask.]
- Introducing a Background-Aware Generative Adversarial Network to
estimate alpha channels.
- Unlike conventional methods, this architecture is designed to accept a
7-channel input volume: the first 3 channels contain the RGB image, the
next 3 channels contain the RGB background, and the last channel
contains the trimap.
- Preliminary experiments with the trained model indicate a significant
improvement in the accuracy of the alpha mattes compared to the state of
the art.
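The 7-channel input volume described above can be assembled directly; the shapes and values below are dummies, only the channel layout comes from the text:

```python
import numpy as np

h, w = 8, 8
image = np.random.rand(h, w, 3)        # RGB frame
background = np.random.rand(h, w, 3)   # RGB background plate
trimap = np.zeros((h, w, 1))           # 0 = bg, 0.5 = unknown, 1 = fg
trimap[2:6, 2:6] = 0.5
trimap[3:5, 3:5] = 1.0

# Channels 0-2: RGB image; 3-5: RGB background; 6: trimap.
volume = np.concatenate([image, background, trimap], axis=-1)
```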
Foreground Matting
9
* For more info please refer to: Javidnia, H. and Pitié, F., 2020. Background Matting. arXiv preprint arXiv:2002.04433.
Foreground Matting Evaluation
10
Test Set: 40 images from our synthetic and Adobe Matting datasets

References
1. N. Xu, B. Price, S. Cohen, and T. Huang, "Deep image matting," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2970–2979.
2. Y. Aksoy, T. Ozan Aydin, and M. Pollefeys, "Designing effective inter-pixel information flow for natural image matting," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 29–37.
3. H. Lu, Y. Dai, C. Shen, and S. Xu, "Indices Matter: Learning to Index for Deep Image Matting," arXiv preprint arXiv:1908.00672, 2019.
4. D. Cho, Y.-W. Tai, and I. Kweon, "Natural image matting using deep convolutional neural networks," in European Conference on Computer Vision, 2016, pp. 626–643.
5. A. Levin, D. Lischinski, and Y. Weiss, "A closed-form solution to natural image matting," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 2, pp. 228–242, 2007.
Model                 MSE       SAD      Gradient   Connectivity
Our Model             0.0272    7.31     10.57      6.52
Deep Matting [1]      0.0453    16.35    22.77      16.44
DCNN [4]              0.0788    19.66    28.91      20.47
IndexNet [3]          0.0500    14.04    20.09      13.27
InformationFlow [2]   0.0605    16.92    23.18      17.06
ClosedForm [5]        0.0687    17.75    28.29      18.12
(lower is better)
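The first two metrics in the table can be computed directly from a predicted and a ground-truth alpha matte. A sketch, assuming the conventional reporting of SAD in units of 1000 pixels; the gradient and connectivity errors are more involved and are omitted here:

```python
import numpy as np

def mse(pred, gt):
    # Mean squared error over the alpha matte.
    return np.mean((pred - gt) ** 2)

def sad(pred, gt):
    # Sum of absolute differences, reported in units of 1000 (assumed
    # convention from the matting literature).
    return np.sum(np.abs(pred - gt)) / 1000.0

# Toy mattes: a hard vertical edge, predicted with one soft column wrong.
gt = np.zeros((100, 100))
gt[:, 50:] = 1.0
pred = gt.copy()
pred[:, 50] = 0.5
print(mse(pred, gt), sad(pred, gt))
```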
Foreground Matting Evaluation
11
[Qualitative comparison of alpha mattes: [2], [3], ours, [1], [4].]
Demo System
12
- User Interface: Aurelia.js
- Front end: JS, HTML, CSS
- Back end: Python, Flask (Occlusion Service, Tracking Service, User Service)
- Rendering: WebGL, Three.js
- Misc: D3.js, Bootstrap, Font Awesome, Mousetrap, Popper.js, Shepherd.js, Whammy.js, ...
UI
13
[UI screenshots: panels A, B, C.]
UI – Camera Tracking
14
Processing Speed
15
Component                                              Time per frame   Processed frames
Depth Estimation                                       155ms            one frame
Camera Tracking                                        500-4000ms       all frames
Interactive Segmentation Edit                          10ms             selected frames
Interactive Segmentation Propagation                   23ms             occluded frames
Foreground/Background Layers (total incl. IO, ...)     ~1000ms          occluded frames
  > Background Reconstruction                          70ms             occluded frames
  > Foreground Matting (new)                           50ms             occluded frames
  > Foreground Colours                                 50ms             occluded frames
Final Compositing                                      IO bounded       all frames
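As a rough arithmetic check, the per-frame timings above combine into a per-clip estimate. The clip length, occluded-frame count, and the mid-range tracking time below are illustrative assumptions, not measurements:

```python
# Rough per-clip estimate from the processing-speed table.
# Assumed: a 100-frame clip with 40 occluded frames, and 1000 ms/frame
# for camera tracking (within its reported 500-4000 ms range).

frames_total = 100
frames_occluded = 40

depth = 155                        # ms, runs on one frame only
tracking = 1000 * frames_total     # ms, all frames
segmentation = 23 * frames_occluded
layers = 1000 * frames_occluded    # reconstruction + matting + colours, incl. IO

total_ms = depth + tracking + segmentation + layers
print(f"{total_ms / 1000:.1f} s")  # camera tracking dominates
```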
Result
16
Result
17
www.adaptcentre.ie
18