Comparison of Transfer Learning Techniques for Object Detection
Keywords:
Object Detection, YOLOv5, Transfer Learning, Deep Learning
Abstract
Since its introduction in 2015, the YOLO series of models has been an industry standard for accurate and practical object detection. Successive YOLO models have been improved for faster and more accurate detection and to meet a variety of requirements in real-time environments and other relevant scenarios. The first YOLO model introduced the idea of treating object detection as a single pass, in which one neural network predicts bounding boxes and class probabilities in one evaluation. In this paper, a novel, highly customised dataset of about 3,000 photos of Indian roads and vehicles, obtained from various sources, is used to compare and analyse YOLOv5, YOLOv6, YOLOv7, and YOLOR in depth. The study's findings show that, in a similar testing environment, YOLOv5 outperforms the other models, making it the most accurate of the YOLO models compared.
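The single-evaluation idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simplified YOLOv1-style output with one box per grid cell, laid out as [x, y, w, h, confidence, class probabilities]; the function names, the example grid, and the 0.5 threshold are hypothetical and are not taken from the paper or from any YOLO codebase.

```python
from typing import List, Tuple

# One detection: (x_center, y_center, width, height, class_index, score),
# with coordinates expressed as fractions of the image size.
Detection = Tuple[float, float, float, float, int, float]


def decode_grid(output: List[List[List[float]]], threshold: float = 0.5) -> List[Detection]:
    """Decode a simplified YOLOv1-style S x S grid with one box per cell.

    Assumed cell layout: [x, y, w, h, confidence, p_class0, p_class1, ...],
    where x and y are offsets inside the cell and w and h are fractions
    of the whole image.
    """
    grid_size = len(output)
    detections: List[Detection] = []
    for row in range(grid_size):
        for col in range(grid_size):
            x, y, w, h, confidence, *class_probs = output[row][col]
            # Pick the most likely class for this cell.
            class_index = max(range(len(class_probs)), key=class_probs.__getitem__)
            # Class-specific score = box confidence * conditional class probability.
            score = confidence * class_probs[class_index]
            if score < threshold:
                continue
            # Convert the cell-relative centre to image-relative coordinates.
            x_center = (col + x) / grid_size
            y_center = (row + y) / grid_size
            detections.append((x_center, y_center, w, h, class_index, score))
    return detections


def make_example_grid() -> List[List[List[float]]]:
    """Build a hypothetical 2 x 2 grid with two classes and one confident cell."""
    empty = [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5]
    grid = [[list(empty) for _ in range(2)] for _ in range(2)]
    grid[1][0] = [0.5, 0.5, 0.4, 0.2, 0.9, 0.1, 0.9]
    return grid
```

Calling `decode_grid(make_example_grid())` keeps only the cell whose combined score clears the threshold. A real detector predicts several boxes per cell and applies non-maximum suppression afterwards, which this sketch leaves out.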
License
Copyright (c) 2023 Ayush Majumdar, Siddharth Sarma, Sarvesh Satone, Atharv Mankar, Bhavana Tiple, Madhura Phatak
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license requires users to give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.