Machine Learning-Based Defect Prediction for Software Efficiency


  • Jayanti Goyal Computer Science Department Rajasthan Technical University (RTU) Kota
  • Ripu Ranjan Sinha Computer Science Department Rajasthan Technical University (RTU) Kota


Software Metrics, Software Defect Prediction, Software Failure Factors, Software Quality Assurance, Machine Learning


Defect prediction is a central topic in software engineering research, and successful software development requires better communication between data mining and software engineering. Software defect prediction is a pre-testing technique that estimates where bugs will show up in the code. Its purpose is to identify potentially flawed parts of a programme before it reaches the testing phase. The primary benefit of these prediction models is that testing time and money may be directed to the modules most prone to errors. However, only a few mobile app-specific software defect prediction algorithms currently exist. It is common practice to use defect prediction techniques (clustering, neural networks, statistical methods, and machine learning models) to probe the impact domain in software. This research aims to examine and compare various machine learning (ML) algorithms for software bug prediction. Despite the widespread availability of failure prediction methods, no single strategy is appropriate for every data set. Support Vector Machine, Random Forest, Naive Bayes, Logistic Regression, and Artificial Neural Network (ANN) were among the ML methods used to find the largest possible subset of faults. The study uses five data sets (JM1, KC1, KC2, PC1, and DS1) to identify defects. Compared with the other methods, ANN achieved the highest accuracy (93.8%).
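As a rough illustration of how a prediction model can direct testing effort to the most error-prone modules, the sketch below fits a small Gaussian Naive Bayes classifier (one of the methods named above) in plain Python and ranks modules by predicted defect probability. It is not the authors' implementation. The function names, the two metrics (lines of code and cyclomatic complexity), the module names, and all numeric values are invented for illustration and are not drawn from JM1, KC1, KC2, PC1, or DS1.

```python
import math
from statistics import mean, pvariance


def fit_gaussian_nb(rows, labels):
    """Estimate the prior, feature means, and feature variances per class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        columns = list(zip(*members))
        model[cls] = {
            "prior": len(members) / len(rows),
            "means": [mean(c) for c in columns],
            # A small floor keeps every variance strictly positive.
            "vars": [pvariance(c) + 1e-9 for c in columns],
        }
    return model


def log_gaussian(x, mu, var):
    """Log density of a univariate normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def defect_probability(model, row):
    """Posterior probability that a module belongs to class 1 (defective)."""
    log_scores = {}
    for cls, params in model.items():
        score = math.log(params["prior"])
        for x, mu, var in zip(row, params["means"], params["vars"]):
            score += log_gaussian(x, mu, var)
        log_scores[cls] = score
    # Normalise in log space so very unlikely classes do not cause underflow errors.
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    return weights[1] / sum(weights.values())


# Hypothetical training data: (lines of code, cyclomatic complexity) per module.
train_rows = [(120, 4), (90, 3), (150, 5), (110, 4),
              (800, 25), (950, 31), (700, 22), (1020, 35)]
train_labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = defective, 0 = clean

model = fit_gaussian_nb(train_rows, train_labels)

# Hypothetical modules awaiting testing, ranked from most to least defect-prone.
candidates = {"parser": (880, 28), "logger": (100, 3), "scheduler": (600, 18)}
ranked = sorted(candidates,
                key=lambda name: defect_probability(model, candidates[name]),
                reverse=True)
print(ranked)
```

In a comparison like the one the abstract describes, the same ranking step would be repeated for each classifier and data set, with accuracy measured on modules held out from training.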





Software defect prediction process




How to Cite

Goyal, J., & Sinha, R. R. (2023). Machine Learning-Based Defect Prediction for Software Efficiency. International Journal of Intelligent Systems and Applications in Engineering, 11(6s), 257–266. Retrieved from



Research Article