Evaluating Validity of Synthetic Data in Perception Tasks for Autonomous Vehicles

Publication Date


Document Type

Conference Proceeding

Publication Title

2020 IEEE International Conference On Artificial Intelligence Testing (AITest)

Conference Location

Oxford, UK/Virtual



Abstract

Autonomous vehicles have the potential to completely transform the way we travel today; however, deploying them safely at scale is not an easy task. Any autonomous driving system relies on multiple layers of software to function safely. Among these, the perception layer is the most data-intensive and also the most complex to get right. Companies need to collect and annotate large amounts of data to properly train deep learning perception models. Simulation systems have emerged as an alternative to the expensive tasks of data collection and annotation. However, whether simulated data can be used as a proxy for real-world data is an ongoing debate. In this work, we address the question of whether models trained on simulated data can generalize well to the real world. We collect datasets from two different simulators with varying levels of graphical fidelity and use the KITTI dataset as an example of real-world data. We train three separate deep learning based object detection models on each of these datasets and compare their performance on test sets collected from the same sources. We also add the recently released Waymo Open Dataset as a challenging test set. Performance is evaluated using the mean average precision (mAP) metric for object detection. We find that training on simulation in general does not translate to generalizability on real-world data, and that diversity in the training set is much more important than graphical fidelity.
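For readers unfamiliar with the evaluation metric, the sketch below illustrates the mechanics behind mean average precision (mAP) for object detection: detections are matched to ground-truth boxes by intersection-over-union (IoU), and precision is accumulated over recall. The box format `[x1, y1, x2, y2]`, the 0.5 IoU threshold, and the uninterpolated precision-recall sum are illustrative assumptions, not the paper's exact benchmark protocol (KITTI and Waymo each define their own matching and interpolation rules).

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes [x1, y1, x2, y2]."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def average_precision(detections, ground_truths, iou_thresh=0.5):
    """AP for one class. detections: list of (score, box); ground_truths: list of boxes.
    Each ground-truth box may be matched by at most one detection."""
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = set()
    tp_fp = []  # 1 for true positive, 0 for false positive, in score order
    for _, box in detections:
        best_iou, best_gt = 0.0, None
        for i, gt in enumerate(ground_truths):
            if i in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_gt = overlap, i
        if best_iou >= iou_thresh:
            matched.add(best_gt)
            tp_fp.append(1)
        else:
            tp_fp.append(0)
    # Sum precision over recall increments (benchmarks typically interpolate
    # the precision-recall curve; omitted here for brevity).
    ap, cum_tp, prev_recall = 0.0, 0, 0.0
    for rank, t in enumerate(tp_fp, start=1):
        cum_tp += t
        recall = cum_tp / len(ground_truths)
        precision = cum_tp / rank
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap
```

mAP is then simply the mean of these per-class AP values over all object classes in the test set.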

Keywords

Autonomous Driving, Self-driving cars, Perception, YOLOv3, Deep learning




Computer Engineering