Publication Date
4-8-2026
Document Type
Article
Publication Title
IEEE Access
DOI
10.1109/ACCESS.2026.3681929
Abstract
Advances in intelligent transportation systems increasingly demand multimodal sensing frameworks capable of operating reliably under noisy, unpredictable real-world conditions. This research presents a Smart Car Audio Intelligence framework that leverages Machine Learning (ML) and Reinforcement Learning (RL) to enhance real-time vehicle perception, safety, and situational awareness. Unlike traditional single-modality or static audio models, our approach integrates four specialized submodules (Emergency, Alert, Environmental/Human sound, and Accident detection), each optimized through a combination of Transformer, CLAP, and CRNN architectures and augmented with adaptive RL agents. The ML models achieved strong baseline performance across all categories, with up to 99% accuracy and 0.99 AUC, while the integration of RL (DQN and PPO) further improved decision timing, adaptability, and robustness under fluctuating noise and overlapping sound events. Reward-driven optimization reduced false-positive rates by up to 66% and decreased detection latency by more than 60%, enabling the system to react faster and more confidently to critical acoustic events such as sirens, collisions, and in-vehicle alerts. The system is deployed through a framework that supports large-scale fleet integration, real-time dashboard reporting, and continuous learning across distributed edge agents. The proposed hybrid ML-RL audio framework thus transforms conventional vehicle monitoring into an adaptive, self-improving decision system, bridging the gap between static classification and real-world autonomous driving intelligence.
Keywords
Audio Intelligence, CLAP Embeddings, Deep Q-Network (DQN), Reinforcement Learning, Smart Vehicles, Transformer Models
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
Department
Computer Engineering
Recommended Citation
Sweekruthi Balivada, Mansi Sanjaybhai Tanna, Meera Alpesh Dhedia, Jane Wu, and Jerry Gao. "Audio based Intelligence for Smart Autonomous Vehicles using Deep Reinforcement Learning" IEEE Access (2026). https://doi.org/10.1109/ACCESS.2026.3681929