Enhanced YOLOv8 Infrared Image Object Detection Method with SPD Module

Authors

  • Zheng Ren, College of Computing, Georgia Institute of Technology, North Avenue, Atlanta, GA 30332

Keywords

YOLOv8, Low Resolution Infrared Images, Object Detection

Abstract

This study proposes an improved deep learning-based method for object detection in low-resolution infrared images of factory environments. Because factors such as fog, smoke, and insufficient illumination degrade detection accuracy, multispectral imaging techniques, in particular infrared thermography, are introduced to enhance the recognition capability of the system. We develop a novel algorithm by integrating the SPD module into YOLOv8, which significantly improves recognition accuracy on low-resolution images and for small objects. The SPD module downsamples the input feature maps with a space-to-depth rearrangement rather than strided convolution or pooling, strengthening the model's ability to detect small targets in low-resolution images while also improving inference speed. Experimental results on the LLVIP and VEDAI datasets demonstrate the superior performance of the proposed method for target detection in infrared images.
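
To make the space-to-depth idea concrete, the following is a minimal sketch, assuming a PyTorch setting, of an SPD-style downsampling block in the spirit of the SPD-Conv building block of Sunkara & Luo (2022), which the method builds on. The class name SPDConv and the channel sizes, kernel size, and activation shown here are illustrative assumptions, not the paper's exact implementation.

    # Illustrative sketch (not the authors' exact code) of an SPD-style block:
    # a space-to-depth rearrangement followed by a non-strided convolution,
    # replacing strided convolution/pooling so no pixel information is discarded.
    import torch
    import torch.nn as nn

    class SPDConv(nn.Module):
        """Space-to-depth rearrangement followed by a stride-1 convolution."""

        def __init__(self, in_channels: int, out_channels: int):
            super().__init__()
            # After space-to-depth with scale 2, the channel count grows by 4x.
            self.conv = nn.Conv2d(4 * in_channels, out_channels,
                                  kernel_size=3, stride=1, padding=1, bias=False)
            self.bn = nn.BatchNorm2d(out_channels)
            self.act = nn.SiLU()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Move each 2x2 spatial block into channels: (B, C, H, W) -> (B, 4C, H/2, W/2).
            x = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                           x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)
            return self.act(self.bn(self.conv(x)))

    if __name__ == "__main__":
        block = SPDConv(in_channels=64, out_channels=128)
        feat = torch.randn(1, 64, 80, 80)   # example backbone feature map
        print(block(feat).shape)            # torch.Size([1, 128, 40, 40])

In a YOLOv8-style backbone, a block of this kind would stand in for the strided downsampling convolutions, which is why it helps retain the fine detail that small targets in low-resolution infrared images depend on.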

References

Everingham, M., Van Gool, L., Williams, C. K., Winn, J., & Zisserman, A. (2010). The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88, 303-338.

Geiger, A., Lenz, P., Stiller, C., & Urtasun, R. (2013). Vision meets robotics: The KITTI dataset. The International Journal of Robotics Research, 32(11), 1231-1237.

Fang, Q., Han, D., & Wang, Z. (2021). Cross-modality fusion transformer for multispectral object detection. arXiv preprint arXiv:2111.00273.

Seifert, C., Aamir, A., Balagopalan, A., Jain, D., Sharma, A., Grottel, S., & Gumhold, S. (2017). Visualizations of deep neural networks in computer vision: A survey. Transparent data mining for big and small data, 123-144.

Zhou, F. Y., Jin, L. P., & Dong, J. (2017). Review of convolutional neural network.

Jiang, P., Ergu, D., Liu, F., Cai, Y., & Ma, B. (2022). A review of YOLO algorithm developments. Procedia Computer Science, 199, 1066-1073.

Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE international conference on computer vision (pp. 1440-1448).

Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016). SSD: Single shot multibox detector. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14 (pp. 21-37). Springer International Publishing.

Jia, X., Zhu, C., Li, M., Tang, W., & Zhou, W. (2021). LLVIP: A visible-infrared paired dataset for low-light vision. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 3496-3504).

Razakarivony, S., & Jurie, F. (2016). Vehicle detection in aerial imagery: A small target detection benchmark. Journal of Visual Communication and Image Representation, 34, 187-203.

Jocher, G., Chaurasia, A., & Qiu, J. (2023). Ultralytics YOLO (Version 8.0.0) [Computer software]. URL: https://github.com/ultralytics/ultralytics.

Sunkara, R., & Luo, T. (2022, September). No more strided convolutions or pooling: A new CNN building block for low-resolution images and small objects. In Joint European conference on machine learning and knowledge discovery in databases (pp. 443-459). Cham: Springer Nature Switzerland.

Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., & Sun, J. (2021). RepVGG: Making VGG-style ConvNets great again. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 13733-13742).

Cao, H. Y., Shen, X. L., & Liu, C. (2020). Improved infrared target detection algorithm of YOLOv3. Journal of Electronic Measurement and Instrumentation, 34(8), 188-194.

Gu, J., Li, B., Liu, K., & Jiang, W. (2021). Infrared ship target detection algorithm based on improved Faster R-CNN. Infrared Technology, 43(2), 170-178.

Cai, W., Xv, P. W., Yang, Z. Y., Jiang, X. H., & Jiang, B. (2021). Dim target detection in infrared image with complex background. Applied Optics, 42(4), 643-650.

Tan, F., Mu, P. A., & Ma, Z. X. (2021). Multi-target tracking algorithm based on YOLOv3 detection and feature point matching. Acta Metrologica Sinica, 42(2), 157-162.

Chen, Z., Guo, H., Yang, J., Jiao, H., Feng, Z., Chen, L., & Gao, T. (2022). Fast vehicle detection algorithm in traffic scene based on improved SSD. Measurement, 201, 111655.

Saxena, S., Dey, S., Shah, M., & Gupta, S. (2023). Traffic sign detection in unconstrained environment using improved YOLOv4. Expert Systems with Applications, 121836.

Ge, H., & Wu, Y. (2023). An Empirical Study of Adoption of ChatGPT for Bug Fixing among Professional Developers. Innovation & Technology Advances, 1(1), 21–29. https://doi.org/10.61187/ita.v1i1.19

Hu, M., Wu, Y., Yang, Y., Fan, J., & Jing, B. (2023). DAGL-Faster: Domain adaptive Faster R-CNN for vehicle object detection in rainy and foggy weather conditions. Displays, 79, 102484.

Sakaridis, C., Dai, D., & Van Gool, L. (2018). Semantic foggy scene understanding with synthetic data. International Journal of Computer Vision, 126, 973-992.

Hu, M., Wang, C., Yang, J., Wu, Y., Fan, J., & Jing, B. (2022). Rain rendering and construction of rain vehicle color-24 dataset. Mathematics, 10(17), 3210.

Lin, T. Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017). Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (pp. 2980-2988).

Meng, X., Liu, Y., Fan, L., & Fan, J. (2023). YOLOv5s-Fog: an improved model based on YOLOv5s for object detection in foggy weather scenarios. Sensors, 23(11), 5321.

Jocher, G., Chaurasia, A., Stoken, A., Borovec, J., Kwon, Y., Michael, K., ... & Mammana, L. (2022). ultralytics/yolov5: v6.2 - YOLOv5 classification models, Apple M1, reproducibility, ClearML and Deci.ai integrations. Zenodo.

Published

2024-08-15

How to Cite

Ren, Z. (2024). Enhanced YOLOv8 Infrared Image Object Detection Method with SPD Module. Journal of Theory and Practice in Engineering and Technology, 1(2), 1–7. Retrieved from https://woodyinternational.com/index.php/jtpet/article/view/42