Safe Reinforcement Learning Strategies with Interpretable Decision-Making for Autonomous Driving in Uncertain Traffic Conditions

James Whitmore; Priya Mehra; Oliver Hastings; Emily Linford

doi:10.5281/zenodo.15278751

Authors

James Whitmore Department of Computer Science, University of Leeds, Leeds LS2 9JT, United Kingdom
Priya Mehra School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, United Kingdom
Oliver Hastings Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, United Kingdom
Emily Linford Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, United Kingdom

DOI:

https://doi.org/10.5281/zenodo.15278751

Keywords:

Reinforcement Learning, Safety Strategy, Bayesian Modeling, Interpretability, Autonomous Driving Decision

Abstract

This study aims to improve the safety and interpretability of reinforcement learning for autonomous driving under uncertain traffic conditions. A decision-making model is built on the Soft Actor-Critic algorithm and extended with a module that estimates uncertainty and detects risky situations in real time. To make the system’s behavior easier to understand, a state–action salience mapping shows which inputs have the greatest effect on each decision. The model is tested in simulation environments involving sudden pedestrian crossings, lane changes by other vehicles, and complex traffic flows. Results show that the method reduces the accident rate by 23.5% compared with standard approaches, while also making it easier for users to follow the reasoning behind the system’s actions. These findings suggest that combining risk detection with simple visual explanation tools can help reinforcement learning models behave more reliably and transparently in real-world traffic.
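The two ideas in the abstract, gating actions on estimated uncertainty and ranking inputs by their effect on a decision, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how uncertainty is estimated or how the salience map is computed. The sketch assumes ensemble disagreement as the uncertainty signal and a finite-difference sensitivity as a stand-in for salience. The names `gated_action`, `salience`, and `toy_policy`, the threshold, and the three-feature state are all hypothetical.

```python
import statistics
from typing import Callable, List, Sequence

State = Sequence[float]


def ensemble_uncertainty(q_estimates: Sequence[float]) -> float:
    """Disagreement among value estimates, used here as an assumed uncertainty proxy."""
    return statistics.pstdev(q_estimates)


def gated_action(policy_action: float, q_estimates: Sequence[float],
                 threshold: float, fallback_action: float) -> float:
    """Return the fallback action when the uncertainty proxy exceeds the threshold."""
    if ensemble_uncertainty(q_estimates) > threshold:
        return fallback_action
    return policy_action


def salience(policy: Callable[[State], float], state: State,
             eps: float = 1e-4) -> List[float]:
    """Central finite-difference sensitivity of the action to each state feature,
    normalized so the scores sum to 1 (left unnormalized if all are zero)."""
    raw = []
    for i in range(len(state)):
        hi, lo = list(state), list(state)
        hi[i] += eps
        lo[i] -= eps
        raw.append(abs(policy(hi) - policy(lo)) / (2 * eps))
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else raw


def toy_policy(state: State) -> float:
    """Hypothetical linear policy over (ego speed, gap to lead car, pedestrian proximity)."""
    return 0.1 * state[0] - 0.3 * state[1] + 0.6 * state[2]


# Low disagreement keeps the policy action; high disagreement triggers the fallback.
confident = gated_action(0.8, [1.0, 1.1, 0.9], threshold=0.5, fallback_action=-1.0)
uncertain = gated_action(0.8, [1.0, 3.0, -1.0], threshold=0.5, fallback_action=-1.0)
scores = salience(toy_policy, [12.0, 25.0, 0.4])
```

In this toy setting the third feature receives the largest salience score because the hypothetical policy weights it most heavily. A learned Soft Actor-Critic policy would need gradient-based or perturbation-based attribution on the actual network instead.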


Journal of Theory and Practice in Engineering and Technology (JTPET)
