Exploring Object Detection Algorithms and implementation of YOLOv7 and YOLOv8 based model for weapon detection

Authors

  • Divya Kumawat, Deepak Abhayankar, Sanjay Tanwani

Keywords:

anchor box, classifier, feature map, RCNN, YOLO

Abstract

This paper explores the working principles, performance metrics, and architectural nuances of various object detection techniques, focusing mainly on the YOLO (You Only Look Once) family of algorithms. A comprehensive comparative study is conducted, considering factors such as loss function, backbone network, and performance on standardized image sizes. After an introduction, the paper classifies object detection algorithms into one-stage and two-stage techniques. The literature review scrutinizes the operational mechanisms and constraints of existing techniques. The study then transitions into weapon detection using the YOLOv7 and YOLOv8 algorithms, leveraging a dataset sourced from the Roboflow website and pre-processed accordingly. The mean Average Precision (mAP) achieved by YOLOv7 and YOLOv8 after training for 50 epochs stands at 0.9289 and 0.9430, respectively. Furthermore, the paper elucidates how performance metrics vary with epoch count in YOLOv7. In conclusion, the paper outlines avenues for further research, highlighting areas that warrant attention within the realm of object detection methodologies.
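The abstract reports mean Average Precision (mAP) scores for the trained detectors. As an illustrative sketch only (not the paper's evaluation code, and using invented toy boxes and scores), the snippet below shows how intersection-over-union and a single-class average precision at an IoU threshold of 0.5 are typically computed; detection frameworks additionally average this over classes and, for COCO-style mAP, over IoU thresholds.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def average_precision(preds, gts, iou_thr=0.5):
    """Single-class AP for one image.

    preds: list of (confidence, box); gts: non-empty list of ground-truth boxes.
    Uses greedy matching by descending confidence and a simple step
    integration of the precision-recall curve (real implementations
    typically also apply precision interpolation).
    """
    preds = sorted(preds, key=lambda p: -p[0])
    matched = set()
    tps = []
    for _, box in preds:
        best, best_j = 0.0, -1
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(box, g)
            if v > best:
                best, best_j = v, j
        if best >= iou_thr:          # true positive: matched an unused GT box
            matched.add(best_j)
            tps.append(1)
        else:                        # false positive
            tps.append(0)
    ap, tp, fp, prev_recall = 0.0, 0, 0, 0.0
    for t in tps:
        tp += t
        fp += 1 - t
        recall = tp / len(gts)
        precision = tp / (tp + fp)
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap
```

For example, a single confident prediction that exactly overlaps the only ground-truth box yields an AP of 1.0, while a higher-scoring false detection ranked ahead of a correct one pulls the AP down.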

References

D. G. Lowe, “Object recognition from local scale-invariant features,” in Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999, pp. 1150–1157 vol.2. doi: 10.1109/ICCV.1999.790410.

P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, 2001, pp. I–I. doi: 10.1109/CVPR.2001.990517.

P. Felzenszwalb, D. McAllester, and D. Ramanan, “A discriminatively trained, multiscale, deformable part model,” in 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008, pp. 1–8. doi: 10.1109/CVPR.2008.4587597.

N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), 2005, pp. 886–893 vol. 1. doi: 10.1109/CVPR.2005.177.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Advances in Neural Information Processing Systems, F. Pereira, C. J. Burges, L. Bottou, and K. Q. Weinberger, Eds., Curran Associates, Inc., 2012. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf

R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Region-Based Convolutional Networks for Accurate Object Detection and Segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 1, pp. 142–158, Jan. 2016, doi: 10.1109/TPAMI.2015.2437384.

R. Girshick, “Fast R-CNN,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1440–1448. doi: 10.1109/ICCV.2015.169.

S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, Jun. 2017, doi: 10.1109/TPAMI.2016.2577031.

K. He, G. Gkioxari, P. Dollar, and R. Girshick, “Mask R-CNN,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 2, pp. 386–397, Feb. 2020, doi: 10.1109/TPAMI.2018.2844175.

Z. Cai and N. Vasconcelos, “Cascade R-CNN: High quality object detection and instance segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 5, pp. 1483–1498, 2021, doi: 10.1109/TPAMI.2019.2956516.

J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Jun. 2016, pp. 779–788. doi: 10.1109/CVPR.2016.91.

J. Redmon and A. Farhadi, “YOLO9000: Better, Faster, Stronger,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 6517–6525. doi: 10.1109/CVPR.2017.690.

J. Redmon and A. Farhadi, “YOLOv3: An Incremental Improvement,” arXiv preprint arXiv:1804.02767, Apr. 2018.

A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “YOLOv4: Optimal Speed and Accuracy of Object Detection,” arXiv preprint arXiv:2004.10934, 2020.

C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “Scaled-YOLOv4: Scaling Cross Stage Partial Network,” 2021, pp. 13029–13038.

Ultralytics, “Ultralytics YOLOv5 Architecture,” 2023. [Online]. Available: https://docs.ultralytics.com/yolov5/tutorials/architecture_description/#1-model-structure (accessed Jul. 19, 2023).

Z. Zheng, P. Wang, W. Liu, J. Li, R. Ye, and D. Ren, “Distance-IoU loss: Faster and better learning for bounding box regression,” AAAI 2020 - 34th AAAI Conf. Artif. Intell., no. 2, pp. 12993–13000, 2020, doi: 10.1609/aaai.v34i07.6999.

C. Li et al., “YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications,” arXiv preprint arXiv:2209.02976, 2022.

C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors,” 2022. doi: 10.48550/arXiv.2207.02696.

J. Woo, J. Baek, S. Jo, and S. Y. Kim, “A Study on Object Detection Performance of YOLOv4 for Autonomous Driving of Tram,” 2022.

J. Terven, D.-M. Córdova-Esparza, and J.-A. Romero-González, “A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS,” Mach. Learn. Knowl. Extr., vol. 5, no. 4, pp. 1680–1716, 2023, doi: 10.3390/make5040083.

H. Lou et al., “DC-YOLOv8: Small-Size Object Detection Algorithm Based on Camera Sensor,” Electron., vol. 12, no. 10, 2023, doi: 10.3390/electronics12102323.

R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 580–587, 2014, doi: 10.1109/CVPR.2014.81.

J. Terven and D. Cordova-Esparza, “A Comprehensive Review of YOLO: From YOLOv1 and Beyond,” pp. 1–33, 2023, [Online]. Available: http://arxiv.org/abs/2304.00501

C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “Scaled-YOLOv4: Scaling Cross Stage Partial Network,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 13024–13033. doi: 10.1109/CVPR46437.2021.01283.

C.-Y. Wang, I.-H. Yeh, and H. Liao, “You Only Learn One Representation: Unified Network for Multiple Tasks,” J. Inf. Sci. Eng., vol. 39, pp. 691–709, 2021.

Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, “YOLOX: Exceeding YOLO Series in 2021,” arXiv preprint arXiv:2107.08430, 2021.

Testing, “guns Computer Vision Project,” Roboflow, 2022. [Online]. Available: https://universe.roboflow.com/testing-kfsrv/guns-l4rap

M. Kisantal, Z. Wojna, J. Murawski, J. Naruniec, and K. Cho, “Augmentation for small object detection,” Feb. 2019.

M. Khodabandeh, A. Vahdat, M. Ranjbar, and W. G. Macready, “A Robust Learning Approach to Domain Adaptive Object Detection,” Apr. 2019.

A. Wang, Y. Sun, A. Kortylewski, and A. Yuille, “Robust Object Detection Under Occlusion With Context-Aware CompositionalNets,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Jun. 2020, pp. 12642–12651. doi: 10.1109/CVPR42600.2020.01266.

A. Serban, E. Poll, and J. Visser, “Adversarial Examples on Object Recognition: A Comprehensive Survey,” ACM Comput. Surv., vol. 53, no. 3, 2020, doi: 10.1145/3398394.

C. Shorten and T. M. Khoshgoftaar, “A Survey on Image Data Augmentation for Deep Learning,” J. Big Data, vol. 6, no. 1, 2019, doi: 10.1186/s40537-019-0197-0.

E. Arulprakash and M. Aruldoss, “A study on generic object detection with emphasis on future research directions,” J. King Saud Univ. - Comput. Inf. Sci., vol. 34, no. 9, pp. 7347–7365, 2022, doi: 10.1016/j.jksuci.2021.08.001.

Published

16.03.2024

How to Cite

Kumawat, D., Abhayankar, D., & Tanwani, S. (2024). Exploring Object Detection Algorithms and implementation of YOLOv7 and YOLOv8 based model for weapon detection. International Journal of Intelligent Systems and Applications in Engineering, 12(3), 877–886. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5367

Issue

Section

Research Article