Comparison of Transfer Learning Techniques for Object Detection

Authors

  • Ayush Majumdar, School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune
  • Siddharth Sarma, School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune
  • Sarvesh Satone, School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune
  • Atharv Mankar, School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune
  • Bhavana Tiple, School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune
  • Madhura Phatak, School of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune

Keywords:

Object Detection, YOLOv5, Transfer Learning, Deep Learning

Abstract

The YOLO series of models has been the industry standard for precise and practical object detection since 2015. The first YOLO model introduced the idea of treating object detection as a single task in which a neural network predicts bounding boxes and class probabilities in one evaluation. Since then, YOLO models have been improved for faster and more accurate detection and adapted to a variety of requirements in real-time environments and other relevant scenarios. In this paper, a novel, highly customised dataset of about 3,000 photos of Indian roads and vehicles, obtained from various sources, is used to compare and analyse YOLO models such as YOLOv5, YOLOv6, YOLOv7, and YOLOR in depth. The study's findings reveal that, in a similar testing environment, YOLOv5 performs better than the competition, making it the most accurate of the YOLO models evaluated.

Figure: The YOLOv6 architecture, with a detailed view of the EfficientRep backbone.

Published

01.07.2023

How to Cite

Majumdar, A., Sarma, S., Satone, S., Mankar, A., Tiple, B., & Phatak, M. (2023). Comparison of Transfer Learning Techniques for Object Detection. International Journal of Intelligent Systems and Applications in Engineering, 11(7s), 243–252. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/2950
