


IT2EC 2020 Digital Twins to Computer Vision Presentation

Digital twins to computer vision: A rapid path to augmented reality object detection on the battlefield

C. Wythe1, N. Fisher2, S. Bobrov3, B. Russell4, J. Alessi5, C. Gallagher6 and J. Throckmorton7

1Chief Revenue Officer / Principal Investigator, Cape Henry Associates, Virginia Beach, USA
2Systems Engineer, Cape Henry Associates, Virginia Beach, USA
3Principal Architect, Independent, San Francisco, USA
4Software Engineer, Cape Henry Associates, Virginia Beach, USA
5Technologist, Independent, Virginia Beach, USA
6Sr. Multimedia Developer, Cape Henry Associates, Virginia Beach, USA
7Technology Solutions Architect, KOVA Global, Virginia Beach, USA

Abstract — Acquiring real-life training data for the purposes of battlefield object identification and model training is a costly and time-consuming task requiring human intervention. We lay the hypothetical foundation for rapidly developing Artificial Intelligence (AI) object recognition models based solely on available 3-Dimensional (3D) models, to provide rapid and accurate battlefield object detection.

1 Introduction

Several studies have shown the efficacy of leveraging 3D models for object identification in AI model training when utilized in conjunction with real-life imagery [1–3]. We expand upon these studies and venture into the possibilities of solely leveraging 3D models and synthetic images to train AI models for object recognition on the battlefield. Successful object detection and classification using AI algorithms is highly dependent on the availability of training data (e.g. labelled images). Although large repositories of labelled images exist and continue to be generated for research purposes, most labelling is generalized to object types and not to the level of specificity that would produce useful object detection algorithms for battlefield applications. Real-world training imagery is scarce, and labelling is a time-intensive human-in-the-loop event. In practice, thousands of images of two similar but distinct items of interest (e.g. M1A1 Abrams Tank vs. Panther Tank) are required to efficiently train an AI model to a high level of confidence. We explore the practicality of utilizing existing high-fidelity gaming objects and future digital twins for rapid, automated generation of high-volume AI model training data, enabling expedited deployment of AI-powered applications to improve battlefield situational awareness.

1.2 Problem Statement

Sufficient quantities of military-specific labelled images for AI algorithm training do not currently exist, are difficult to obtain, and are time-consuming to label. Two questions arise. Can high-fidelity 3D rendered objects, which exist in large quantity for virtual training environments, be utilized for the automated development of AI training data? Can they successfully train AI models to identify different types of objects with a high degree of specificity and discretion between similar object classes?

3 Approach

We would select high-fidelity 3D models of multiple battlefield vehicles found in online gaming object repositories and generate several thousand synthetic images to serve as the training data set for our AI models. We would automate image generation and labelling through custom scripts within the 3D environment to reduce human interaction, performance errors, and cost. Upon completion of AI model training, we would test and document algorithm performance against video footage of battlefield operations.

3.1 Training Image Generation

We first developed a 3D scene in Unity in which to place our target 3D objects. We then scripted an animation to modify environmental elements of the scene such as weather conditions and lighting. Computer code was developed to automate the export of synthetic imagery, correlating labels, and XML notations of object coordinates. The AI algorithms were trained with 10,000 synthesized images from each vehicle 3D model. The synthesized images of the target objects inherited object bounding boxes, XML annotations, and class labels automatically via programmed code during the image generation process. The image capture process produced a variety of backgrounds, camera angles, weather conditions, light conditions (obscuration), and obstructions (occlusion).
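The per-image annotation record described above can be sketched outside Unity. The paper states only that images inherit bounding boxes, class labels, and "XML notations of object coordinates"; the Pascal VOC-style layout below is an assumption, and every file name, label, and coordinate is illustrative rather than taken from the actual pipeline.

```python
# Sketch of one synthetic image's annotation export. The VOC-style XML
# layout is an assumption; the paper does not specify its exact schema.
import xml.etree.ElementTree as ET

def voc_annotation(filename, width, height, objects):
    """Build a Pascal VOC-style XML annotation for one synthetic image.

    objects: list of (class_label, xmin, ymin, xmax, ymax) tuples, where
    the box corners would come from projecting the 3D model's bounds
    into the rendered camera view.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    for label, xmin, ymin, xmax, ymax in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        box = ET.SubElement(obj, "bndbox")
        ET.SubElement(box, "xmin").text = str(xmin)
        ET.SubElement(box, "ymin").text = str(ymin)
        ET.SubElement(box, "xmax").text = str(xmax)
        ET.SubElement(box, "ymax").text = str(ymax)
    return ET.tostring(root, encoding="unicode")

# Example: one rendered frame containing a single tank model
# (hypothetical file name, label, and pixel coordinates).
xml_text = voc_annotation("m1a1_000001.png", 1920, 1080,
                          [("M1A1_Abrams", 412, 300, 988, 640)])
```

Because the scene script already knows each object's class and screen-space bounds at render time, emitting this record per frame is what removes the human-in-the-loop labelling step discussed in the introduction.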

3.2 AI Model Training Process

For our AI pipeline, we utilized the NVIDIA NGC Transfer Learning Toolkit (TLT) AI models and processed them on NVIDIA DGX on-premise hardware.

3.3 Experimentation Hardware Utilized

Model Training: NVIDIA DGX-1 Deep Learning Server with eight Tesla V100 GPUs.
Synthetic Image Production: Custom workstation with an NVIDIA GTX GPU.
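Before transfer learning, the 10,000 synthetic images per vehicle class would typically be partitioned into training and validation subsets. The paper does not describe its dataset preparation, so the sketch below is a generic, hypothetical split: the 80/20 ratio, the seed, and the frame-naming scheme are all assumptions, not details from the actual TLT workflow.

```python
# Hypothetical train/validation split over one class's synthetic images.
# Only the 10,000-images-per-vehicle count comes from the paper; the
# ratio, seed, and naming are illustrative.
import random

def split_dataset(image_ids, val_fraction=0.2, seed=42):
    """Shuffle image ids reproducibly, then split into train/val lists."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    n_val = int(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]  # (train, val)

# 10,000 synthesized frames for one vehicle class.
frames = [f"m1a1_{i:06d}" for i in range(10_000)]
train, val = split_dataset(frames)
```

Holding out a fixed, seeded validation slice lets the confidence levels reported for different network types be compared on identical data.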

4 Future Work

Further experimentation is needed to add additional object classes and to continue studying the efficiency and confidence levels obtained with additional neural network types. Planned experimentation includes testing the limits of discrete variance identification in similar objects, such as crew-served weapons with additive components. We will also work toward successful deployment of the developed object detection algorithms on devices for practical application (e.g. live video feeds, aerial drone footage, Microsoft HoloLens, and mobile phones).

5 Conclusions

At the time of abstract submission, experiments are ongoing. Preliminary results show promise for the viability of this approach.

Acknowledgements

The authors would like to thank the NVIDIA DGX team for their support during this effort.

References

[1] Su, Hao, et al. "Render for CNN: Viewpoint estimation in images using CNNs trained with rendered 3D model views." Proceedings of the IEEE International Conference on Computer Vision, 2015.
[2] Peng, Xingchao, et al. "Learning deep object detectors from 3D models." Proceedings of the IEEE International Conference on Computer Vision, 2015.
[3] Xiang, Yu, et al. "ObjectNet3D: A large scale database for 3D object recognition." European Conference on Computer Vision, Springer, Cham, 2016.

Author/Speaker Biographies

Chuck Wythe

Chuck is the Chief Revenue Officer for Cape Henry Associates and actively leads teams as a principal investigator on R&D efforts. Chuck has extensive technical experience in the fields of manpower analysis, training content development, artificial intelligence, machine learning, augmented reality, training devices, and simulation.

Nosika Fisher

Nosika’s primary background is in SaaS systems integrations and data science. Currently she is a Systems Engineer at Cape Henry Associates, where she leads artificial intelligence and machine learning projects, including the productization of FogLifter™, a stand-alone mobile artificial intelligence framework for high-volume machine learning computations.

Sergey Bobrov

Sergey’s main focus areas are machine learning, data ingestion platforms, and building infrastructure and models for rapid data analysis. Sergey previously worked on the IoT platform Xively (acquired by Google), architecting and building REST API back-ends for identity management, authorization of publish/subscribe messages, and IoT domain description and management.

Brandon Russell

Brandon is a software engineer at Cape Henry Associates who is heavily involved in the field of artificial intelligence. His main areas of work involve building end-to-end pipelines leveraging the power of machine learning/deep learning to provide intelligent insights on data.

Jeremy Alessi

Jeremy is a technologist with 25 years of experience architecting full-stack software solutions in many fields including gaming, simulation, AR, VR, AI, mobile, streaming, fin-tech, med-tech, blockchain, and transportation. Jeremy has written software that has been used by tens of millions of end-users and has written and spoken extensively on various subjects in the field of software technology.

Chris Gallagher

Fig. 1. AI Pipeline
Chris has spent the last 12 years working in the field of high-end computer-generated imagery. Recently, Chris has been developing augmented, virtual, and mixed reality experiences using Unity’s core platform and its High Definition Render Pipeline.

Joel Throckmorton

Joel is a Technology Solutions Architect for KOVA Global and has worked in the defense industry for over 16 years. His most recent work has been in the development of the Lighthouse™ and FogLifter™ platforms, technology stack integrations, and AI platform implementations for the US Navy and Defense Intelligence Agency.