Unity VR Engineer
DyViR (Dynamic Virtual Reality), developed in Unity, empowers users to create highly customizable virtual environments and simulate drone flights to generate synthetic datasets for training aerial object detection AI. Designed as part of my master’s thesis and backed by several academic publications, this tool helps bridge the gap between simulated and real-world data in machine learning, accelerating research in computer vision for drones and aerial robotics.
DyViR can produce synthetic datasets as seen through optical, thermal, and depth cameras. This was accomplished using Unity’s post-processing stack along with custom shaders to produce imagery resembling each sensor modality.
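As a minimal sketch of how a depth-camera view can be achieved in Unity's built-in render pipeline (illustrative only, not DyViR's actual source), a camera can be asked to render its depth buffer, and an image-effect material can remap that depth into the final frame:

```csharp
// Illustrative sketch, not DyViR's actual code. Assumes the built-in
// render pipeline, where OnRenderImage runs after the camera renders.
using UnityEngine;

[RequireComponent(typeof(Camera))]
public class DepthCameraEffect : MonoBehaviour
{
    // A material whose shader samples _CameraDepthTexture and remaps
    // depth to grayscale (or a false-color palette for a thermal look).
    public Material depthMaterial;

    void Start()
    {
        // Ask Unity to render depth into the _CameraDepthTexture global.
        GetComponent<Camera>().depthTextureMode = DepthTextureMode.Depth;
    }

    void OnRenderImage(RenderTexture src, RenderTexture dst)
    {
        // Blit the rendered frame through the depth-visualizing shader.
        Graphics.Blit(src, dst, depthMaterial);
    }
}
```

A thermal-style camera can follow the same pattern, with the shader mapping a heat proxy (for example, per-material emissive values) to a heat palette instead of depth.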
The 40+ aerial objects in DyViR, created by Rowan University’s Biomedical Art and Visualization Artists, span 4 classes (drone, airplane, bird, and helicopter) to provide a diverse range of training data.
The aerial objects within DyViR traverse the 3D environment along simulated flight patterns, enabling intention estimation (e.g., recon, area denial, kinetic kill).
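As a hypothetical sketch of how one such pattern might be scripted (the class name and parameters are assumptions, not DyViR's implementation), a recon-style loiter can be expressed as an orbit around a point of interest:

```csharp
// Hypothetical sketch, not DyViR's implementation: an orbiting loiter
// around a target, the kind of trajectory a "recon" intention label
// might correspond to.
using UnityEngine;

public class OrbitFlightPattern : MonoBehaviour
{
    public Transform target;          // point of interest being observed
    public float radius = 50f;        // orbit radius in meters
    public float altitude = 30f;      // height above the target
    public float angularSpeed = 0.5f; // radians per second

    private float angle;

    void Update()
    {
        angle += angularSpeed * Time.deltaTime;
        Vector3 offset = new Vector3(Mathf.Cos(angle), 0f, Mathf.Sin(angle)) * radius;
        Vector3 next = target.position + offset + Vector3.up * altitude;
        transform.LookAt(next); // face the direction of travel before moving
        transform.position = next;
    }
}
```

Other intentions map naturally onto other trajectories; a kinetic-kill pattern, for instance, would be a direct intercept toward a target's predicted position.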
GENERATING REAL-TIME SYNTHETIC DATASETS TO IMPROVE AERIAL OBJECT DETECTION
Abstract: The widespread use of unmanned aerial vehicles (UAVs) across civilian and military applications has necessitated the advancement of real-time drone detection and tracking capabilities. Machine Learning (ML) addresses these requirements; however, training a robust and generalizable model requires large and diverse video datasets. Curating such real-world datasets is often time-consuming and cost-prohibitive. Here, we present DyViR, a real-time, customizable rendering application that automatically generates highly realistic synthetic multi-modal video of aerial objects in digital environments, along with automatically generated and labeled bounding boxes. Synthetic data, coupled with real-world training sets, augments the ML training process, leading to increased performance and detection accuracy. DyViR is designed to enable non-technical users to generate datasets containing 47 different aerial objects, 4 flight patterns, and 8 environments. To verify the benefits of augmenting existing real-world datasets with synthetic data, the YOLOv7-tiny model was evaluated on a fully real-world dataset and on one augmented with synthetically generated data from DyViR; augmentation yielded a 60.4% increase in mean average precision. This research demonstrates the potential of synthetic datasets, especially where real-world data would be impossible or cost-prohibitive to obtain, opening the door to broader applications where data acquisition is challenging.
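The paper does not publish its labeling code, but a common way to auto-generate the 2D bounding boxes it describes is to project the eight corners of each object's world-space bounds into screen space and take the min/max extents. A minimal sketch, assuming the object is in front of the camera:

```csharp
// Illustrative sketch only (the paper does not publish this code).
// Projects the 8 corners of a renderer's world-space AABB to screen
// space and returns the enclosing 2D rectangle. Assumes the object
// is fully in front of the camera (no near-plane clipping handled).
using UnityEngine;

public static class BoundingBoxLabeler
{
    public static Rect ScreenBounds(Camera cam, Renderer renderer)
    {
        Bounds b = renderer.bounds;
        Vector2 min = new Vector2(float.MaxValue, float.MaxValue);
        Vector2 max = new Vector2(float.MinValue, float.MinValue);

        // Enumerate the 8 corners of the axis-aligned bounding box.
        for (int i = 0; i < 8; i++)
        {
            Vector3 corner = b.center + Vector3.Scale(b.extents,
                new Vector3((i & 1) == 0 ? -1f : 1f,
                            (i & 2) == 0 ? -1f : 1f,
                            (i & 4) == 0 ? -1f : 1f));
            Vector2 p = cam.WorldToScreenPoint(corner);
            min = Vector2.Min(min, p);
            max = Vector2.Max(max, p);
        }
        return new Rect(min, max - min);
    }
}
```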
DyViR: dynamic virtual reality dataset for aerial threat object detection
Abstract: Unmanned combat aerial vehicles (i.e., drones) are changing the surveillance, security, and conflict landscape of the modern geopolitical stage. Various technologies and solutions can help track drones; each technology has different advantages and limitations concerning drone size and detection range. Machine learning (ML) can automatically detect and track drones in real time while exceeding human-level accuracy and providing enhanced situational awareness. Unfortunately, ML’s power depends on the quality and quantity of its data. In the drone detection scenario, existing datasets offer limited variation in environment, view angle, view distance, and drone type. We developed a customizable software tool called DyViR that generates large synthetic video datasets for training machine learning algorithms in aerial threat object detection. These datasets contain video and audio renderings of aerial objects within user-specified dynamic simulated biomes (i.e., arctic, desert, and forest). Users can alter the environment on a timeline, allowing behaviors such as drone flight patterns and conditions such as weather to change across a synthetically generated dataset. DyViR supports additional controls such as motion blur, anti-aliasing, and fully dynamic moving cameras to produce imagery across multiple viewing angles. Each aerial object’s classification (drone or airplane) and bounding box data automatically export to a comma-separated-value (CSV) file alongside a video to form a synthetic dataset. We demonstrate the value of DyViR by training a real-time YOLOv7-tiny model on these synthetic datasets. The performance of the object detection model improved by 60.4% over its counterpart not using DyViR. This result suggests a use case for synthetic datasets to surmount the lack of real-world training data for aerial threat object detection.
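A minimal sketch of the kind of per-frame CSV export the abstract describes; the file name and column schema below are assumptions for illustration, not DyViR's actual format:

```csharp
// Hypothetical sketch of a per-frame annotation exporter. The path
// "annotations.csv" and the column names are illustrative assumptions.
using System.IO;
using UnityEngine;

public class AnnotationExporter : MonoBehaviour
{
    private StreamWriter writer;

    void Start()
    {
        writer = new StreamWriter("annotations.csv");
        writer.WriteLine("frame,class,x,y,width,height");
    }

    // Write one labeled bounding box for the given frame index.
    public void WriteBox(int frame, string label, Rect box)
    {
        writer.WriteLine($"{frame},{label},{box.x},{box.y},{box.width},{box.height}");
    }

    void OnDestroy() => writer?.Dispose();
}
```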
Boosting Aerial Object Detection Performance via Virtual Reality Data and Multi-Object Training
Abstract: Deep neural network (DNN) architectures, such as R-CNN and YOLO, have demonstrated impressive performance in object detection tasks with respect to both time and accuracy. However, detecting small aerial objects remains challenging from both data and algorithmic perspectives. Collecting and annotating videos to detect small aerial objects is a time-consuming task and can quickly become a burden when new classes of objects are added to a database. In addition, the current objective functions for DNNs are not specifically designed for smaller objects. To address these challenges, we propose a virtual reality (VR) dataset for aerial object detection, which can generate large volumes of small-object aerial data. By combining VR data with real-world data, we are able to improve the performance of aerial object detection. We also introduce a cost function derived from the normalized Wasserstein distance to replace the Intersection-over-Union loss for YOLO. Experimental results demonstrate that the VR dataset and normalized Wasserstein distance improve the performance of state-of-the-art object detection methods in detecting small aerial objects. Our source code is publicly available at https://github.com/naddeok96/yolov7_mavrc
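For context on the loss mentioned above: in the tiny-object-detection literature this work builds on, each box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), and the normalized Wasserstein distance takes a closed form (the paper's exact variant may differ):

```latex
% Standard normalized Gaussian Wasserstein distance between boxes a and b;
% this is the common formulation from the tiny-object-detection literature,
% and the paper's exact variant may differ. C is a dataset-dependent constant.
W_2^2(\mathcal{N}_a, \mathcal{N}_b) =
  \left\lVert \left( cx_a,\ cy_a,\ \tfrac{w_a}{2},\ \tfrac{h_a}{2} \right)^{\top}
            - \left( cx_b,\ cy_b,\ \tfrac{w_b}{2},\ \tfrac{h_b}{2} \right)^{\top}
  \right\rVert_2^2
\qquad
\mathrm{NWD}(\mathcal{N}_a, \mathcal{N}_b) =
  \exp\!\left( -\frac{\sqrt{W_2^2(\mathcal{N}_a, \mathcal{N}_b)}}{C} \right)
\qquad
\mathcal{L}_{\mathrm{NWD}} = 1 - \mathrm{NWD}(\mathcal{N}_a, \mathcal{N}_b)
```

Unlike IoU, this distance remains smooth and informative even when predicted and ground-truth boxes do not overlap, which is precisely the regime small aerial objects tend to occupy.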