IT2EC 2020 Digital Twins to Computer Vision Presentation
Digital twins to computer vision: A rapid path to augmented reality object detection on the battlefield
- C. Wythe1, N. Fisher2, S. Bobrov3, B. Russell4, J. Alessi5, C. Gallagher6 and J. Throckmorton7
1 Chief Revenue Officer / Principal Investigator, Cape Henry Associates, Virginia Beach, USA
2 Systems Engineer, Cape Henry Associates, Virginia Beach, USA
3 Principal Architect, Independent, San Francisco, USA
4 Software Engineer, Cape Henry Associates, Virginia Beach, USA
5 Technologist, Independent, Virginia Beach, USA
6 Sr. Multimedia Developer, Cape Henry Associates, Virginia Beach, USA
7 Technology Solutions Architect, KOVA Global, Virginia Beach, USA
Abstract — Acquiring real-life training data for battlefield object identification and training is both costly and time-consuming, requiring human intervention. We lay the hypothetical foundation for rapidly developing Artificial Intelligence (AI) object recognition models based solely on available 3-Dimensional (3D) models to provide rapid and accurate battlefield object detection.
1 Introduction
Several studies have shown the efficacy of leveraging 3D models for object identification in AI model training when used in conjunction with real-life imagery. We expand upon these studies and explore the possibility of solely leveraging 3D models and synthetic images to train AI models for object recognition on the battlefield. Successful object detection and classification using AI algorithms is highly dependent on the availability of training data (e.g., labelled images). Although large repositories of labelled images exist and continue to be generated for research purposes, most labelling is generalized to object types and not to the level of specificity that would produce useful object detection algorithms for battlefield applications. Real-world training imagery is scarce, and labelling is a time-intensive human-in-the-loop event. In practice, thousands of images of two similar but distinct items of interest (e.g., M1A1 Abrams tank vs. Panther tank) are required to efficiently train an AI model to a high level of confidence. We explore the practicality of using existing high-fidelity gaming objects and future digital twins for rapid, automated generation of high-volume AI model training data, enabling expedited deployment of AI-powered applications that improve battlefield situational awareness.

1.2 Problem Statement

Sufficient quantities of military-specific labelled images for AI algorithm training do not currently exist, are difficult to obtain, and are time-consuming to label. Two questions arise. Can high-fidelity 3D rendered objects, which exist in large quantity for virtual training environments, be used to automatically generate AI training data? Can they successfully train AI models to identify different types of objects with a high degree of specificity and discrimination between similar object classes?
3 Approach
We would select high-fidelity 3D models of multiple battlefield vehicles found in online gaming object repositories and generate several thousand synthetic images to serve as the training data set for our AI models. We would automate image generation and labelling through custom scripts within the 3D environment to reduce human interaction, performance errors, and cost. Upon completion of AI model training, we would test and document algorithm performance against video footage of battlefield operations.

3.1 Training Image Generation

We first developed a 3D scene in Unity in which to place our target 3D objects. We then scripted an animation to modify environmental elements of the scene, such as weather conditions and lighting. Computer code was developed to automate the export of synthetic imagery with correlating labels and XML annotations of object coordinates. The AI algorithms were trained with 10,000 synthesized images of each vehicle 3D model. The synthesized images of the target objects inherited object bounding boxes, XML annotations, and class labels automatically via programmed code during the image generation process. The image capture process produced a variety of backgrounds, camera angles, weather conditions, light conditions (obscuration), and obstructions (occlusion).
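As an illustration, the automated labelling step described above can be sketched engine-agnostically: for each rendered frame, the target object's projected screen-space bounds are written out as a Pascal VOC-style XML annotation alongside the image. This is a minimal sketch under assumed conventions; the function name, field layout, and sample values below are illustrative, not the authors' actual export code.

```python
import xml.etree.ElementTree as ET

def make_voc_annotation(filename, width, height, class_label, box):
    """Build a Pascal VOC-style XML annotation for one synthetic image.

    `box` is (xmin, ymin, xmax, ymax) in pixel coordinates, as would be
    projected from the 3D object's screen-space bounds in the engine.
    All names here are illustrative assumptions.
    """
    ann = ET.Element("annotation")
    ET.SubElement(ann, "filename").text = filename
    size = ET.SubElement(ann, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    obj = ET.SubElement(ann, "object")
    ET.SubElement(obj, "name").text = class_label
    bnd = ET.SubElement(obj, "bndbox")
    for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
        ET.SubElement(bnd, tag).text = str(value)
    return ET.tostring(ann, encoding="unicode")

# Hypothetical frame: a rendered M1A1 at an assumed screen position.
xml_text = make_voc_annotation(
    "m1a1_000001.png", 1920, 1080, "M1A1_Abrams", (640, 410, 1210, 760)
)
print(xml_text)
```

In a full pipeline of this kind, a loop over camera poses, lighting states, and weather presets would invoke such a writer once per captured frame, yielding the image, class label, and bounding-box annotation together with no human in the loop.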