Optistroke: Harnessing Bat Algorithm-Driven Stacked Ensembles for Enhanced Stroke Prediction Using Machine Learning

Authors

  • Divya K. Assistant Professor/EEE, Karpagam Institute of Technology, Coimbatore
  • Sangeethapriya R. Assistant Professor/IT, Sona College of Technology, Salem
  • Gomathi S. Assistant Professor/CSE, Dr.N.G.P Institute of Technology, Coimbatore
  • Dhiyanesh B. Associate Professor/CSE, Dr.N.G.P Institute of Technology, Coimbatore
  • Kiruthika J. K. Assistant Professor/CSE, KPR Institute of Engineering and Technology, Coimbatore
  • Saraswathi P. Assistant Professor/IT, Velammal College of Engineering and Technology, Madurai

Keywords

Stacked Ensemble, Gradient Descent, Prediction, Optimization, Machine Learning, Healthcare, Stroke

Abstract

Stroke is one of the most serious medical conditions worldwide and must be diagnosed early to minimize the consequences for patients. The proposed BatOptiStroke is a novel technique that improves stroke prediction accuracy by using a stacked ensemble model driven by the Bat Algorithm (BA). To capture a wide range of predictive strengths, BatOptiStroke combines a diverse selection of base models, such as Extreme Gradient Boosting (XGBoost), K-Nearest Neighbors (KNN), and Support Vector Machines (SVM). By updating the positions and velocities of the bats, which represent the base models, the BA iteratively optimizes the ensemble's performance, which in turn increases stroke prediction accuracy. On a sizable dataset of stroke patients, the BatOptiStroke framework is evaluated against each of the base models and against alternative ensemble approaches. The evaluation metrics confirm BatOptiStroke's stroke prediction capability: the stacked ensemble consistently outperforms the individual base models, and BatOptiStroke achieves 97% accuracy, 89% precision, 95% recall, and a 93% F1 score.
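
The abstract does not spell out how the bats encode the ensemble, so the following is only a minimal sketch under stated assumptions: each bat holds one blending weight per base model, and its fitness is the error rate of the weighted-average prediction. This weighted soft vote is a simplified stand-in for a stacked meta-learner, not the authors' implementation. The names bat_minimize, blend, binary_metrics, and error_rate are hypothetical, and the arrays Y and P are toy values invented for illustration; in practice the probabilities would come from fitted XGBoost, KNN, and SVM models on held-out data.

```python
import math
import random


def bat_minimize(fitness, dim, n_bats=15, n_iter=60, f_min=0.0, f_max=2.0,
                 alpha=0.9, gamma=0.9, r0=0.5, seed=0):
    """Basic Bat Algorithm variant: minimise `fitness` over the box [0, 1]^dim."""
    rng = random.Random(seed)
    clip = lambda v: min(1.0, max(0.0, v))
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    loud = [1.0] * n_bats   # loudness A_i, decays each time a bat accepts a move
    pulse = [r0] * n_bats   # pulse emission rate r_i
    fit = [fitness(p) for p in pos]
    i_best = min(range(n_bats), key=lambda i: fit[i])
    best, best_fit = pos[i_best][:], fit[i_best]

    for t in range(1, n_iter + 1):
        mean_loud = sum(loud) / n_bats
        for i in range(n_bats):
            # Global move: a random frequency scales the offset from the best bat.
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = [v + (x - b) * freq for v, x, b in zip(vel[i], pos[i], best)]
            cand = [clip(x + v) for x, v in zip(pos[i], vel[i])]
            # Local move: random walk around the current best solution.
            if rng.random() > pulse[i]:
                cand = [clip(b + 0.1 * mean_loud * rng.uniform(-1, 1)) for b in best]
            cand_fit = fitness(cand)
            # Accept a non-worse candidate with probability equal to the loudness.
            if cand_fit <= fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha
                pulse[i] = r0 * (1.0 - math.exp(-gamma * t))
            if cand_fit < best_fit:
                best, best_fit = cand[:], cand_fit
    return best, best_fit


def blend(weights, probs):
    """Weighted average of base-model probabilities (weights normalised to sum to 1)."""
    total = sum(weights) or 1.0
    return [sum(w * p for w, p in zip(weights, row)) / total for row in probs]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for 0/1 labels (positive class = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Toy data, invented for illustration only (not from the paper's dataset).
# Each row of P holds assumed stroke probabilities from [XGBoost, KNN, SVM].
Y = [1, 0, 1, 1, 0, 0, 1, 0]
P = [[0.9, 0.6, 0.4], [0.2, 0.7, 0.3], [0.8, 0.4, 0.6], [0.4, 0.7, 0.8],
     [0.3, 0.2, 0.6], [0.6, 0.3, 0.2], [0.7, 0.8, 0.3], [0.1, 0.4, 0.7]]


def error_rate(weights):
    """Fitness for the bats: 1 - accuracy of the thresholded weighted blend."""
    preds = [int(p >= 0.5) for p in blend(weights, P)]
    return 1.0 - binary_metrics(Y, preds)["accuracy"]


weights, err = bat_minimize(error_rate, dim=3)
final_preds = [int(p >= 0.5) for p in blend(weights, P)]
print("blend weights:", [round(w, 3) for w in weights])
print("metrics:", binary_metrics(Y, final_preds))
```

Fixing the seed keeps the search reproducible. On real data the fitness should be computed on a validation split separate from the data used to fit the base models, so that the weights are not tuned to training noise.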

Published

24.03.2024

How to Cite

K., D., R., S., S., G., B., D., J. K., K., & P., S. (2024). Optistroke: Harnessing Bat Algorithm-Driven Stacked Ensembles for Enhanced Stroke Prediction Using Machine Learning. International Journal of Intelligent Systems and Applications in Engineering, 12(20s), 835–845. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5309

Section

Research Article